Package de.l3s.boilerpipe.filters.heuristics

The BoilerpipeFilters in this package are pure heuristics.


Class Summary
AddPrecedingLabelsFilter      Adds the labels of the preceding block to the current block, optionally adding a prefix.
ArticleMetadataFilter
BlockProximityFusion          Fuses adjacent blocks if their distance (in blocks) does not exceed a certain limit.
ContentFusion
DocumentTitleMatchClassifier  Marks TextBlocks which contain parts of the HTML <TITLE> tag, using heuristics that are quite specific to the news domain.
ExpandTitleToContentFilter    Marks as "content" all TextBlocks that lie between the headline and the part already marked as content, provided they are labeled DefaultLabels.MIGHT_BE_CONTENT.
KeepLargestBlockFilter        Keeps only the largest TextBlock (by number of words).
LabelFusion                   Fuses adjacent blocks if their labels are equal.
SimpleBlockFusionProcessor    Merges two consecutive blocks if their text densities are equal.
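
To make two of the one-line summaries above concrete, the following is a minimal, self-contained sketch of the ideas behind "fuse adjacent blocks if their labels are equal" and "keep the largest block only (by the number of words)". It deliberately does not use the boilerpipe API: the class HeuristicsSketch, its nested Block type, and the methods fuseEqualLabels and keepLargestBlock are hypothetical stand-ins invented for illustration, and details such as how the content flag is combined on merge or how ties are broken are assumptions, not statements about how LabelFusion or KeepLargestBlockFilter actually behave.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

public class HeuristicsSketch {

    /** Hypothetical stand-in for a text block; NOT boilerpipe's TextBlock. */
    static final class Block {
        final String text;
        final Set<String> labels;
        boolean isContent;

        Block(String text, Set<String> labels, boolean isContent) {
            this.text = text;
            this.labels = labels;
            this.isContent = isContent;
        }

        /** Number of whitespace-separated words in the block's text. */
        int numWords() {
            String t = text.trim();
            return t.isEmpty() ? 0 : t.split("\\s+").length;
        }
    }

    /**
     * Label-fusion idea: merge each block into its predecessor when both
     * carry exactly the same label set. Assumption: the merged block is
     * content if either part was content.
     */
    static List<Block> fuseEqualLabels(List<Block> blocks) {
        List<Block> fused = new ArrayList<>();
        for (Block b : blocks) {
            Block prev = fused.isEmpty() ? null : fused.get(fused.size() - 1);
            if (prev != null && prev.labels.equals(b.labels)) {
                Block merged = new Block(prev.text + "\n" + b.text,
                        prev.labels, prev.isContent || b.isContent);
                fused.set(fused.size() - 1, merged);
            } else {
                fused.add(b);
            }
        }
        return fused;
    }

    /**
     * Keep-largest idea: among the blocks currently marked as content, only
     * the one with the most words stays content. Assumption: on a tie, the
     * earliest block wins.
     */
    static void keepLargestBlock(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            if (b.isContent && (largest == null || b.numWords() > largest.numWords())) {
                largest = b;
            }
        }
        for (Block b : blocks) {
            if (b != largest) {
                b.isContent = false;
            }
        }
    }

    public static void main(String[] args) {
        List<Block> blocks = List.of(
                new Block("Home | News | Sport", Set.of("nav"), false),
                new Block("First paragraph of the story.", Set.of("body"), true),
                new Block("Second paragraph with a few more words in it.", Set.of("body"), true),
                new Block("Related links", Set.of("nav"), true));

        List<Block> fused = fuseEqualLabels(blocks);
        keepLargestBlock(fused);

        for (Block b : fused) {
            System.out.println((b.isContent ? "[content] " : "[boiler]  ")
                    + b.numWords() + " words, labels=" + b.labels);
        }
    }
}
```

In this sketch the two "body" paragraphs are fused first, so the keep-largest step compares the merged article text against the short "Related links" block rather than against each paragraph separately. That ordering is one plausible reason to chain heuristics of this kind, but the order in which the real filters are meant to be combined is not stated in this summary.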
 
