Uses of Package
de.l3s.boilerpipe.filters.english

Packages that use de.l3s.boilerpipe.filters.english
de.l3s.boilerpipe.filters.english The BoilerpipeFilters in this package have only been tested on English text. 
 

Classes in de.l3s.boilerpipe.filters.english used by de.l3s.boilerpipe.filters.english
DensityRulesClassifier
          Classifies TextBlocks as content/not-content using rules derived with the C4.8 machine learning algorithm, as described in the paper "Boilerplate Detection using Shallow Text Features" (WSDM 2010). The rules rely chiefly on text density and link density.
IgnoreBlocksAfterContentFilter
          Marks as "non-content" all blocks that occur after a block labeled DefaultLabels.INDICATES_END_OF_TEXT.
IgnoreBlocksAfterContentFromEndFilter
          Marks as "non-content" all blocks that occur after a block labeled DefaultLabels.INDICATES_END_OF_TEXT and after any content block.
KeepLargestFulltextBlockFilter
          Keeps only the largest TextBlock, measured by number of words.
MinFulltextWordsFilter
          Keeps only those content blocks that contain at least k full-text words (measured by HeuristicFilterBase.getNumFullTextWords(TextBlock)).
NumWordsRulesClassifier
          Classifies TextBlocks as content/not-content using rules derived with the C4.8 machine learning algorithm, as described in the paper "Boilerplate Detection using Shallow Text Features" (WSDM 2010). The rules rely chiefly on the number of words per block and the link density per block.
TerminatingBlocksFinder
          Finds blocks that potentially indicate the end of an article's text and marks them with DefaultLabels.INDICATES_END_OF_TEXT.
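
The behaviour described for three of the filters above can be illustrated with a small, self-contained sketch. It does not use the boilerpipe classes: Block, EnglishFilterSketch, and all three method names are hypothetical stand-ins invented for this example, word counting is a plain whitespace split rather than HeuristicFilterBase.getNumFullTextWords(TextBlock), and a boolean field takes the place of the DefaultLabels.INDICATES_END_OF_TEXT label. Whether the real filter also drops the labeled block itself is not stated above; this sketch assumes it does not.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not part of boilerpipe and not its implementation. */
final class EnglishFilterSketch {

    /** Hypothetical stand-in for a text block (not boilerpipe's TextBlock). */
    static final class Block {
        final String text;
        boolean isContent;
        final boolean endOfText; // plays the role of INDICATES_END_OF_TEXT

        Block(String text, boolean isContent, boolean endOfText) {
            this.text = text;
            this.isContent = isContent;
            this.endOfText = endOfText;
        }

        /** Counts whitespace-separated tokens (assumption: a simple word measure). */
        int numWords() {
            String trimmed = text.trim();
            return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
        }
    }

    /** Marks as non-content every block strictly after the first end-of-text block. */
    static boolean ignoreBlocksAfterEndOfText(List<Block> blocks) {
        boolean changed = false;
        boolean seenEnd = false;
        for (Block b : blocks) {
            if (seenEnd && b.isContent) {
                b.isContent = false;
                changed = true;
            }
            if (b.endOfText) {
                seenEnd = true;
            }
        }
        return changed;
    }

    /** Keeps only content blocks with at least k words. */
    static boolean keepMinWords(List<Block> blocks, int k) {
        boolean changed = false;
        for (Block b : blocks) {
            if (b.isContent && b.numWords() < k) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    /** Keeps only the content block with the most words (first one wins on ties). */
    static boolean keepLargestBlock(List<Block> blocks) {
        Block largest = null;
        for (Block b : blocks) {
            if (b.isContent && (largest == null || b.numWords() > largest.numWords())) {
                largest = b;
            }
        }
        boolean changed = false;
        for (Block b : blocks) {
            if (b.isContent && b != largest) {
                b.isContent = false;
                changed = true;
            }
        }
        return changed;
    }

    public static void main(String[] args) {
        List<Block> blocks = new ArrayList<>();
        blocks.add(new Block("Home News Sports", false, false));
        blocks.add(new Block("The council voted on Tuesday to approve the new budget after a long debate", true, false));
        blocks.add(new Block("Officials said the plan takes effect next year", true, false));
        blocks.add(new Block("Comments", true, true));
        blocks.add(new Block("Great article thanks for sharing", true, false));

        ignoreBlocksAfterEndOfText(blocks);
        keepMinWords(blocks, 5);
        for (Block b : blocks) {
            if (b.isContent) {
                System.out.println(b.text);
            }
        }
    }
}
```

Each method returns true when it changed at least one block, mirroring the common convention of filters that report whether they modified the document. With the sample data in main, the two article paragraphs remain marked as content, while the navigation line, the "Comments" marker, and the reader comment do not.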