de.l3s.boilerpipe.extractors
Class KeepEverythingWithMinKWordsExtractor

java.lang.Object
  extended by de.l3s.boilerpipe.extractors.ExtractorBase
      extended by de.l3s.boilerpipe.extractors.KeepEverythingWithMinKWordsExtractor
All Implemented Interfaces:
BoilerpipeExtractor, BoilerpipeFilter

public final class KeepEverythingWithMinKWordsExtractor
extends ExtractorBase

A full-text extractor that keeps every text block containing at least kMin words and discards shorter blocks.

Author:
Christian Kohlschütter

Constructor Summary
KeepEverythingWithMinKWordsExtractor(int kMin)
           
 
Method Summary
 boolean process(TextDocument doc)
          Processes the given document doc.
 
Methods inherited from class de.l3s.boilerpipe.extractors.ExtractorBase
getText, getText, getText, getText, getText
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

KeepEverythingWithMinKWordsExtractor

public KeepEverythingWithMinKWordsExtractor(int kMin)
Parameters:
kMin - The minimum number of words a text block must contain to be kept.

Method Detail

process

public boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.

Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException
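
Example:
The class name and the kMin constructor argument suggest that process keeps only text blocks with at least kMin words. The following is a minimal, library-independent sketch of that idea, not boilerpipe's implementation. It assumes that blocks are plain strings and that words are whitespace-separated tokens; MinWordsSketch, countWords and keepBlocks are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of boilerpipe.
public class MinWordsSketch {

    // Counts whitespace-separated tokens in a block of text (assumed word definition).
    public static int countWords(String block) {
        String trimmed = block.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Returns the blocks that contain at least kMin words, in their original order.
    public static List<String> keepBlocks(List<String> blocks, int kMin) {
        List<String> kept = new ArrayList<>();
        for (String block : blocks) {
            if (countWords(block) >= kMin) {
                kept.add(block);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> blocks = Arrays.asList(
                "Home About Contact",                                          // 3 words
                "The committee approved the new budget after a long debate.", // 10 words
                "Share");                                                      // 1 word
        // With kMin = 5, only the second block is kept.
        System.out.println(keepBlocks(blocks, 5));
    }
}
```

With the library on the classpath, the corresponding call would presumably be new KeepEverythingWithMinKWordsExtractor(5).getText(html), using one of the getText overloads inherited from ExtractorBase.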