org.apache.lucene.analysis

Class StopFilter


public final class StopFilter
extends TokenFilter

Removes stop words from a token stream.

Field Summary

Fields inherited from class org.apache.lucene.analysis.TokenFilter

input

Constructor Summary

StopFilter(TokenStream in, Hashtable stopTable)
Deprecated. Use StopFilter(TokenStream,Set) instead.
StopFilter(TokenStream in, Hashtable stopTable, boolean ignoreCase)
Deprecated. Use StopFilter(TokenStream,Set) instead.
StopFilter(TokenStream in, Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set.
StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
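Constructs a filter which removes words from the input TokenStream that are named in the Set, optionally ignoring case.
StopFilter(TokenStream input, String[] stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.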
StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.

Method Summary

static Set
makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
static Set
makeStopSet(String[] stopWords, boolean ignoreCase)
Builds a Set from an array of stop words, lower casing each word first if ignoreCase is true.
static Hashtable
makeStopTable(String[] stopWords)
Deprecated. Use makeStopSet(String[]) instead.
static Hashtable
makeStopTable(String[] stopWords, boolean ignoreCase)
Deprecated. Use makeStopSet(java.lang.String[], boolean) instead.
Token
next()
Returns the next input Token whose termText() is not a stop word.

Methods inherited from class org.apache.lucene.analysis.TokenFilter

close

Methods inherited from class org.apache.lucene.analysis.TokenStream

close, next

Constructor Details

StopFilter

public StopFilter(TokenStream in,
                  Hashtable stopTable)

Deprecated. Use StopFilter(TokenStream,Set) instead.

Constructs a filter which removes words from the input TokenStream that are named in the Hashtable.

StopFilter

public StopFilter(TokenStream in,
                  Hashtable stopTable,
                  boolean ignoreCase)

Deprecated. Use StopFilter(TokenStream,Set) instead.

Constructs a filter which removes words from the input TokenStream that are named in the Hashtable. If ignoreCase is true, all keys in the stopTable should already be lowercased.

StopFilter

public StopFilter(TokenStream in,
                  Set stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the Set. Every token is looked up in this Set, so use an implementation with fast lookups (for example, a hash-based Set) for maximum performance.
See Also:
makeStopSet(java.lang.String[])

StopFilter

public StopFilter(TokenStream input,
                  Set stopWords,
                  boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the Set, optionally ignoring case.
Parameters:
input - The TokenStream to filter.
stopWords - The set of stop words, as Strings. If ignoreCase is true, all strings must be lower cased.
ignoreCase - Ignore case when stopping. If true, the stopWords set must be set up to contain only lower case words.
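
The lower-case requirement on stopWords can be illustrated with a minimal sketch. It assumes that, when ignoreCase is true, the filter lower cases each token's text before looking it up in the Set. The class StopLookupSketch and the method isStopWord are hypothetical names used only for illustration; they are not part of Lucene, and the use of Locale.ROOT is likewise an assumption.

```java
import java.util.Locale;
import java.util.Set;

// Illustrative stand-in, not Lucene source: models the documented lookup contract.
final class StopLookupSketch {
    private StopLookupSketch() {}

    // Hypothetical helper: returns true if termText would be treated as a stop word.
    static boolean isStopWord(String termText, Set<String> stopWords, boolean ignoreCase) {
        // Assumption: only the token text is lower cased, never the set entries,
        // which is why the set itself must already be lower case.
        String key = ignoreCase ? termText.toLowerCase(Locale.ROOT) : termText;
        return stopWords.contains(key);
    }
}
```

Under this assumption, a set containing "The" never matches when ignoreCase is true, because the lookup key is always "the".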

StopFilter

public StopFilter(TokenStream input,
                  String[] stopWords)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.

StopFilter

public StopFilter(TokenStream in,
                  String[] stopWords,
                  boolean ignoreCase)
Constructs a filter which removes words from the input TokenStream that are named in the array of words.

Method Details

makeStopSet

public static final Set makeStopSet(String[] stopWords)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the stop word set to be built once, when an Analyzer is constructed, and reused.
See Also:
makeStopSet(java.lang.String[], boolean), passing false to ignoreCase

makeStopSet

public static final Set makeStopSet(String[] stopWords,
                                    boolean ignoreCase)
Builds a Set from an array of stop words, appropriate for passing into the StopFilter constructor.
Parameters:
stopWords - The array of stop words.
ignoreCase - If true, all words are lower cased first.
Returns:
a Set containing the words
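
The behavior described for ignoreCase can be sketched with the standard library alone. This is a minimal analogue, not the Lucene implementation: the class StopSetSketch and the method buildStopSet are hypothetical names, and the choice of HashSet and Locale.ROOT is an assumption.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative stand-in, not Lucene source: shows the documented contract only.
final class StopSetSketch {
    private StopSetSketch() {}

    // Hypothetical analogue of makeStopSet(String[], boolean).
    static Set<String> buildStopSet(String[] stopWords, boolean ignoreCase) {
        Set<String> stopSet = new HashSet<String>(stopWords.length);
        for (String word : stopWords) {
            // If ignoreCase is true, each word is lower cased before it is stored.
            stopSet.add(ignoreCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return stopSet;
    }
}
```

A set built this way once, for example when an Analyzer is constructed, can be shared by every filter that Analyzer creates.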

makeStopTable

public static final Hashtable makeStopTable(String[] stopWords)

Deprecated. Use makeStopSet(String[]) instead.

Builds a Hashtable from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the table to be built once, when an Analyzer is constructed, and reused.

makeStopTable

public static final Hashtable makeStopTable(String[] stopWords,
                                            boolean ignoreCase)

Deprecated. Use makeStopSet(java.lang.String[], boolean) instead.

Builds a Hashtable from an array of stop words, appropriate for passing into the StopFilter constructor. This allows the table to be built once, when an Analyzer is constructed, and reused.

next

public final Token next()
            throws IOException
Returns the next input Token whose termText() is not a stop word.
Overrides:
next in class TokenStream
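
The skipping behavior of next() can be modelled as a loop over the wrapped input. The sketch below is a simplified stand-in that works on plain Strings rather than Token objects and matches case-sensitively. The class StopSkipSketch is a hypothetical name, and returning null once the input is exhausted is an assumption made for this sketch.

```java
import java.util.Iterator;
import java.util.Set;

// Illustrative stand-in, not Lucene source: filters Strings instead of Tokens.
final class StopSkipSketch {
    private final Iterator<String> input;
    private final Set<String> stopWords;

    StopSkipSketch(Iterator<String> input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }

    // Returns the next term that is not a stop word, or null when the input is exhausted.
    String next() {
        while (input.hasNext()) {
            String term = input.next();
            if (!stopWords.contains(term)) {
                return term; // first term that survives the filter
            }
            // Otherwise the term is a stop word: drop it and keep reading.
        }
        return null;
    }
}
```

A single call may consume several input terms, because consecutive stop words are skipped before a term is returned.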

Copyright © 2000-2006 Apache Software Foundation. All Rights Reserved.