Class | Description |
---|---|
Analyzer | An Analyzer builds TokenStreams, which analyze text. |
CharTokenizer | An abstract base class for simple, character-oriented tokenizers. |
ISOLatin1AccentFilter | A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) with their unaccented equivalents. |
KeywordAnalyzer | "Tokenizes" the entire stream as a single token. |
KeywordTokenizer | Emits the entire input as a single token. |
LengthFilter | Removes words that are too long or too short from the stream. |
LetterTokenizer | A LetterTokenizer is a tokenizer that divides text at non-letters. |
LowerCaseFilter | Normalizes token text to lower case. |
LowerCaseTokenizer | LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together. |
PerFieldAnalyzerWrapper | This analyzer is used to facilitate scenarios where different fields require different analysis techniques. |
PorterStemFilter | Transforms the token stream as per the Porter stemming algorithm. |
SimpleAnalyzer | An Analyzer that filters LetterTokenizer with LowerCaseFilter. |
StopAnalyzer | Filters LetterTokenizer with LowerCaseFilter and StopFilter. |
StopFilter | Removes stop words from a token stream. |
Token | A Token is an occurrence of a term from the text of a field. |
TokenFilter | A TokenFilter is a TokenStream whose input is another token stream. |
Tokenizer | A Tokenizer is a TokenStream whose input is a Reader. |
TokenStream | A TokenStream enumerates the sequence of tokens, either from fields of a document or from query text. |
WhitespaceAnalyzer | An Analyzer that uses WhitespaceTokenizer. |
WhitespaceTokenizer | A WhitespaceTokenizer is a tokenizer that divides text at whitespace. |
WordlistLoader | Loader for text files that represent a list of stopwords. |
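
The StopAnalyzer entry describes a chain: divide text at non-letters, lower-case each token, then drop stop words. The sketch below illustrates that chain in plain Java only. It does not use or reproduce the classes listed above; the class name `AnalysisChainSketch`, its methods, and the sample stop-word set are all hypothetical and chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical illustration only: not part of the package documented above.
class AnalysisChainSketch {

    // Divides text at non-letters, as the LetterTokenizer entry describes.
    static List<String> letterTokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isLetter(c)) {
                current.append(c);
            } else if (current.length() > 0) {
                // A non-letter ends the token collected so far.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Normalizes token text to lower case, as the LowerCaseFilter entry describes.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    // Removes stop words from the token list, as the StopFilter entry describes.
    static List<String> removeStopWords(List<String> tokens, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            if (!stopWords.contains(token)) {
                out.add(token);
            }
        }
        return out;
    }

    // Composes the three steps in the order the StopAnalyzer entry gives.
    static List<String> analyze(String text, Set<String> stopWords) {
        return removeStopWords(lowerCase(letterTokenize(text)), stopWords);
    }

    public static void main(String[] args) {
        // Assumed stop-word set for the example.
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        System.out.println(analyze("The Quick-Brown Fox of 2013", stopWords));
        // prints [quick, brown, fox]
    }
}
```

Digits and punctuation act as separators here, so `2013` yields no token and `Quick-Brown` splits in two. Lower-casing runs before stop-word removal so that `The` matches the lower-case entry `the`.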
Copyright © 2000-2013 Apache Software Foundation. All Rights Reserved.