org.apache.lucene.analysis
Class PorterStemFilter
java.lang.Object
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.PorterStemFilter
public final class PorterStemFilter extends TokenFilter
Transforms the token stream as per the Porter stemming algorithm.
Note: the input to the stemming filter must already be in lower case,
so a LowerCaseFilter or LowerCaseTokenizer must come earlier in the
TokenStream chain (that is, be wrapped by this filter) for stemming to
work properly.
To use this filter with other analyzers, you'll want to write an
Analyzer class that sets up the TokenStream chain as you want it.
To use this with LowerCaseTokenizer, for example, you'd write an
analyzer like this:
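    import java.io.Reader;
    import org.apache.lucene.analysis.*;

    class MyAnalyzer extends Analyzer {
        // Lower-case first, then stem: PorterStemFilter expects lower-case input.
        public final TokenStream tokenStream(String fieldName, Reader reader) {
            return new PorterStemFilter(new LowerCaseTokenizer(reader));
        }
    }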
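The ordering requirement above can be illustrated without Lucene. The sketch below is not Lucene's stemmer: it implements only step 1a of the Porter algorithm (plural suffix removal), and the class name StemOrderSketch and method step1a are hypothetical names chosen for this example. Because the rules match lower-case suffixes only, an upper-case word passes through unchanged unless it is lower-cased first.

```java
import java.util.Locale;

public class StemOrderSketch {
    // Step 1a of the Porter algorithm only (sses -> ss, ies -> i, ss -> ss, s -> "").
    // The suffix checks are case-sensitive and expect lower-case input.
    public static String step1a(String word) {
        if (word.endsWith("sses")) return word.substring(0, word.length() - 2);
        if (word.endsWith("ies")) return word.substring(0, word.length() - 2);
        if (word.endsWith("ss")) return word;
        if (word.endsWith("s")) return word.substring(0, word.length() - 1);
        return word;
    }

    public static void main(String[] args) {
        String raw = "PONIES";
        // Stemming without lower-casing: no rule matches, the word is unchanged.
        System.out.println(step1a(raw));                            // prints "PONIES"
        // Lower-casing first lets the "ies" rule apply.
        System.out.println(step1a(raw.toLowerCase(Locale.ROOT)));   // prints "poni"
    }
}
```

In the MyAnalyzer example, LowerCaseTokenizer plays the role of the toLowerCase call here: it runs before the stemming filter sees any token.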
Method Summary
Token next(Token result)
    Returns the next token in the stream, or null at EOS.
Methods inherited from class java.lang.Object
    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PorterStemFilter
public PorterStemFilter(TokenStream in)
next
public final Token next(Token result)
throws IOException
- Description copied from class: TokenStream
- Returns the next token in the stream, or null at EOS.
When possible, the input Token should be used as the
returned Token (this gives fastest tokenization
performance), but this is not required and a new Token
may be returned. Callers may re-use a single Token
instance for successive calls to this method.
This implicitly defines a "contract" between
consumers (callers of this method) and
producers (implementations of this method
that are the source for tokens):
- A consumer must fully consume the previously
returned Token before calling this method again.
- A producer must call Token.clear() before setting
the Token's fields and returning it.
Note that a TokenFilter is considered a consumer.
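The consumer/producer contract described above can be sketched with stand-in types. Everything in this example is hypothetical and independent of Lucene: Tok stands in for Token, WordSource for a producing TokenStream, and drain for a consumer. It shows one way the two rules might look in practice, assuming a single reusable instance is passed to every call.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenReuseSketch {
    /** Hypothetical stand-in for Token: a mutable holder that can be reused. */
    public static final class Tok {
        String text;
        void clear() { text = null; }
    }

    /** Hypothetical producer: fills the caller's Tok, returns null at end of stream. */
    public static final class WordSource {
        private final String[] words;
        private int pos = 0;

        public WordSource(String... words) { this.words = words; }

        public Tok next(Tok result) {
            if (pos == words.length) return null;
            result.clear();               // producer rule: clear before setting fields
            result.text = words[pos++];
            return result;                // reuse the caller's instance
        }
    }

    /** Consumer: takes what it needs from each Tok before requesting the next one. */
    public static List<String> drain(WordSource source) {
        List<String> out = new ArrayList<>();
        Tok reusable = new Tok();         // one instance for all calls
        for (Tok t = source.next(reusable); t != null; t = source.next(reusable)) {
            out.add(t.text);              // consumer rule: fully consume before the next call
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drain(new WordSource("quick", "brown", "fox")));
        // prints "[quick, brown, fox]"
    }
}
```

The consumer never keeps a reference to the Tok itself across calls, only to the value it read, which is why a single instance can safely be handed back to the producer each time.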
- Overrides: next in class TokenStream
- Parameters: result - a Token that may or may not be used as the returned Token
- Returns: the next token in the stream, or null if end-of-stream was hit
- Throws: IOException
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.