de.l3s.boilerpipe.filters.simple
Class MinClauseWordsFilter

java.lang.Object
  extended by de.l3s.boilerpipe.filters.simple.MinClauseWordsFilter
All Implemented Interfaces:
BoilerpipeFilter

public final class MinClauseWordsFilter
extends java.lang.Object
implements BoilerpipeFilter

Keeps only blocks that contain at least one segment fragment ("clause") of at least k words, where k is the minWords constructor argument (default: 5). NOTE: Consider running the SplitParagraphBlocksFilter upstream of this filter.
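The clause test described above can be illustrated with a small standalone sketch. It is not the library's implementation: the class name ClauseSketch, the helper names, and the delimiter pattern (punctuation followed by whitespace or end of text) are assumptions made for illustration, and the handling of acceptClausesWithoutDelimiter is inferred from the parameter name alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; does not use or reproduce boilerpipe code.
final class ClauseSketch {
    // ASSUMPTION: a clause ends at punctuation followed by whitespace or end of text.
    private static final Pattern DELIMITER = Pattern.compile("[,.:;!?]+(\\s+|$)");

    // Counts whitespace-separated words in a fragment.
    public static int countWords(String fragment) {
        String trimmed = fragment.trim();
        return trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length;
    }

    // Returns true if some clause in the text has at least minWords words.
    public static boolean hasClause(String text, int minWords,
                                    boolean acceptClausesWithoutDelimiter) {
        Matcher m = DELIMITER.matcher(text);
        int start = 0;
        while (m.find()) {
            // Fragment between the previous delimiter and this one.
            if (countWords(text.substring(start, m.start())) >= minWords) {
                return true;
            }
            start = m.end();
        }
        // Trailing text with no closing delimiter counts only if explicitly accepted.
        return acceptClausesWithoutDelimiter
                && countWords(text.substring(start)) >= minWords;
    }

    public static void main(String[] args) {
        // Navigation-style text: every clause is a single word.
        System.out.println(hasClause("Home, About, Contact.", 5, false)); // false
        // A full sentence ending in a delimiter.
        System.out.println(hasClause("The quick brown fox jumps over the dog.", 5, false)); // true
        // Six words but no delimiter: result depends on the flag.
        System.out.println(hasClause("No delimiter at the end here", 5, false)); // false
        System.out.println(hasClause("No delimiter at the end here", 5, true));  // true
    }
}
```

Under these assumptions, short link-list text is rejected while running prose is kept, which matches the filter's stated purpose of keeping only blocks with a sufficiently long clause.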

Author:
Christian Kohlschütter
See Also:
SplitParagraphBlocksFilter

Field Summary
static MinClauseWordsFilter INSTANCE
           
 
Constructor Summary
MinClauseWordsFilter(int minWords)
           
MinClauseWordsFilter(int minWords, boolean acceptClausesWithoutDelimiter)
           
 
Method Summary
 boolean process(TextDocument doc)
          Processes the given document doc.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final MinClauseWordsFilter INSTANCE

Constructor Detail

MinClauseWordsFilter

public MinClauseWordsFilter(int minWords)

MinClauseWordsFilter

public MinClauseWordsFilter(int minWords,
                            boolean acceptClausesWithoutDelimiter)

Method Detail

process

public boolean process(TextDocument doc)
                throws BoilerpipeProcessingException
Description copied from interface: BoilerpipeFilter
Processes the given document doc.

Specified by:
process in interface BoilerpipeFilter
Parameters:
doc - The TextDocument that is to be processed.
Returns:
true if changes have been made to the TextDocument.
Throws:
BoilerpipeProcessingException