Uses of Class
org.apache.lucene.analysis.TokenStream

Packages that use TokenStream
org.apache.lucene.analysis API and code to convert text into indexable tokens. 
org.apache.lucene.analysis.br Analyzer for Brazilian. 
org.apache.lucene.analysis.cjk Analyzer for Chinese, Japanese and Korean. 
org.apache.lucene.analysis.cn Analyzer for Chinese. 
org.apache.lucene.analysis.cz Analyzer for Czech. 
org.apache.lucene.analysis.de Analyzer for German. 
org.apache.lucene.analysis.el Analyzer for Greek. 
org.apache.lucene.analysis.fr Analyzer for French. 
org.apache.lucene.analysis.ngram Tokenizers that break text into n-grams. 
org.apache.lucene.analysis.nl Analyzer for Dutch. 
org.apache.lucene.analysis.ru Analyzer for Russian. 
org.apache.lucene.analysis.standard A grammar-based tokenizer constructed with JavaCC. 
org.apache.lucene.analysis.th Analyzer for Thai. 
 

Uses of TokenStream in org.apache.lucene.analysis
 

Subclasses of TokenStream in org.apache.lucene.analysis
 class CharTokenizer
          An abstract base class for simple, character-oriented tokenizers.
 class ISOLatin1AccentFilter
          A filter that replaces accented characters in the ISO Latin 1 character set (ISO-8859-1) by their unaccented equivalent.
 class KeywordTokenizer
          Emits the entire input as a single token.
 class LengthFilter
          Removes words that are too long or too short from the stream.
 class LetterTokenizer
          A LetterTokenizer is a tokenizer that divides text at non-letters.
 class LowerCaseFilter
          Normalizes token text to lower case.
 class LowerCaseTokenizer
          LowerCaseTokenizer performs the function of LetterTokenizer and LowerCaseFilter together.
 class PorterStemFilter
          Transforms the token stream as per the Porter stemming algorithm.
 class StopFilter
          Removes stop words from a token stream.
 class TokenFilter
          A TokenFilter is a TokenStream whose input is another token stream.
 class Tokenizer
          A Tokenizer is a TokenStream whose input is a Reader.
 class WhitespaceTokenizer
          A WhitespaceTokenizer is a tokenizer that divides text at whitespace.
 

Fields in org.apache.lucene.analysis declared as TokenStream
protected  TokenStream TokenFilter.input
          The source of tokens for this filter.
 

Methods in org.apache.lucene.analysis that return TokenStream
abstract  TokenStream Analyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 TokenStream KeywordAnalyzer.tokenStream(String fieldName, Reader reader)
           
 TokenStream PerFieldAnalyzerWrapper.tokenStream(String fieldName, Reader reader)
           
 TokenStream SimpleAnalyzer.tokenStream(String fieldName, Reader reader)
           
 TokenStream StopAnalyzer.tokenStream(String fieldName, Reader reader)
          Filters LowerCaseTokenizer with StopFilter.
 TokenStream WhitespaceAnalyzer.tokenStream(String fieldName, Reader reader)
           
 

Constructors in org.apache.lucene.analysis with parameters of type TokenStream
ISOLatin1AccentFilter(TokenStream input)
           
LengthFilter(TokenStream in, int min, int max)
          Build a filter that removes words that are too long or too short from the text.
LowerCaseFilter(TokenStream in)
           
PorterStemFilter(TokenStream in)
           
StopFilter(TokenStream input, String[] stopWords)
          Construct a token stream filtering the given input.
StopFilter(TokenStream in, String[] stopWords, boolean ignoreCase)
          Constructs a filter which removes words from the input TokenStream that are named in the array of words.
StopFilter(TokenStream input, Set stopWords, boolean ignoreCase)
          Construct a token stream filtering the given input.
StopFilter(TokenStream in, Set stopWords)
          Constructs a filter which removes words from the input TokenStream that are named in the Set.
TokenFilter(TokenStream input)
          Construct a token stream filtering the given input.
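The constructors above all follow the same decorator pattern: each filter takes a TokenStream as input, so analysis chains are built by nesting constructor calls, e.g. StopFilter(LowerCaseFilter(tokenizer), stopWords). The sketch below illustrates this wiring with minimal stand-in classes (MyTokenStream, MyWhitespaceTokenizer, etc. are hypothetical names, not the real Lucene classes), using the Lucene 2.x convention that next() returns null when the stream is exhausted:

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Stand-in for the TokenStream contract: next() returns the next token's
// text, or null when the stream is exhausted (the Lucene 2.x convention).
abstract class MyTokenStream {
    abstract String next() throws IOException;
}

// Mirrors Tokenizer: a TokenStream whose input is a Reader.
class MyWhitespaceTokenizer extends MyTokenStream {
    private final Reader input;
    MyWhitespaceTokenizer(Reader input) { this.input = input; }
    String next() throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = input.read()) != -1) {
            if (Character.isWhitespace(c)) {
                if (sb.length() > 0) return sb.toString(); // token ended
            } else {
                sb.append((char) c);
            }
        }
        return sb.length() > 0 ? sb.toString() : null; // flush last token
    }
}

// Mirrors TokenFilter: a TokenStream whose input is another TokenStream.
class MyLowerCaseFilter extends MyTokenStream {
    private final MyTokenStream input;
    MyLowerCaseFilter(MyTokenStream input) { this.input = input; }
    String next() throws IOException {
        String t = input.next();
        return t == null ? null : t.toLowerCase();
    }
}

// Mirrors StopFilter: drops tokens found in the stop-word set.
class MyStopFilter extends MyTokenStream {
    private final MyTokenStream input;
    private final Set<String> stopWords;
    MyStopFilter(MyTokenStream input, Set<String> stopWords) {
        this.input = input;
        this.stopWords = stopWords;
    }
    String next() throws IOException {
        for (String t = input.next(); t != null; t = input.next()) {
            if (!stopWords.contains(t)) return t;
        }
        return null;
    }
}

public class Demo {
    static List<String> analyze(String text) throws IOException {
        // Compose the chain exactly as the constructors suggest:
        // StopFilter(LowerCaseFilter(Tokenizer(reader)), stopWords)
        MyTokenStream stream = new MyStopFilter(
            new MyLowerCaseFilter(
                new MyWhitespaceTokenizer(new StringReader(text))),
            Set.of("the", "a", "an"));
        List<String> tokens = new ArrayList<>();
        for (String t = stream.next(); t != null; t = stream.next()) {
            tokens.add(t);
        }
        return tokens;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(analyze("The QUICK fox and the Dog"));
    }
}
```

This is why every filter constructor in the table takes a TokenStream: stop filtering sees lowercased text only because LowerCaseFilter sits between it and the tokenizer, so the nesting order of the constructors determines the order of processing.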
 

Uses of TokenStream in org.apache.lucene.analysis.br
 

Subclasses of TokenStream in org.apache.lucene.analysis.br
 class BrazilianStemFilter
          A stem filter for Brazilian Portuguese, based on GermanStemFilter.
 

Methods in org.apache.lucene.analysis.br that return TokenStream
 TokenStream BrazilianAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.br with parameters of type TokenStream
BrazilianStemFilter(TokenStream in)
           
BrazilianStemFilter(TokenStream in, Set exclusiontable)
           
 

Uses of TokenStream in org.apache.lucene.analysis.cjk
 

Subclasses of TokenStream in org.apache.lucene.analysis.cjk
 class CJKTokenizer
          CJKTokenizer was modified from StopTokenizer, which does a decent job for most European languages.
 

Methods in org.apache.lucene.analysis.cjk that return TokenStream
 TokenStream CJKAnalyzer.tokenStream(String fieldName, Reader reader)
          Gets a token stream from the input.
 

Uses of TokenStream in org.apache.lucene.analysis.cn
 

Subclasses of TokenStream in org.apache.lucene.analysis.cn
 class ChineseFilter
          A filter with a stop word table; no digits are allowed as tokens.
 class ChineseTokenizer
          Extracts tokens from the stream using Character.getType(), treating each Chinese character as a single token. The ChineseTokenizer and the CJKTokenizer differ in their token parsing logic.
 

Methods in org.apache.lucene.analysis.cn that return TokenStream
 TokenStream ChineseAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.cn with parameters of type TokenStream
ChineseFilter(TokenStream in)
           
 

Uses of TokenStream in org.apache.lucene.analysis.cz
 

Methods in org.apache.lucene.analysis.cz that return TokenStream
 TokenStream CzechAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Uses of TokenStream in org.apache.lucene.analysis.de
 

Subclasses of TokenStream in org.apache.lucene.analysis.de
 class GermanStemFilter
          A filter that stems German words.
 

Methods in org.apache.lucene.analysis.de that return TokenStream
 TokenStream GermanAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.de with parameters of type TokenStream
GermanStemFilter(TokenStream in)
           
GermanStemFilter(TokenStream in, Set exclusionSet)
          Builds a GermanStemFilter that uses an exclusion table.
 

Uses of TokenStream in org.apache.lucene.analysis.el
 

Subclasses of TokenStream in org.apache.lucene.analysis.el
 class GreekLowerCaseFilter
          Normalizes token text to lower case using a given Greek charset.
 

Methods in org.apache.lucene.analysis.el that return TokenStream
 TokenStream GreekAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.el with parameters of type TokenStream
GreekLowerCaseFilter(TokenStream in, char[] charset)
           
 

Uses of TokenStream in org.apache.lucene.analysis.fr
 

Subclasses of TokenStream in org.apache.lucene.analysis.fr
 class FrenchStemFilter
          A filter that stems French words.
 

Methods in org.apache.lucene.analysis.fr that return TokenStream
 TokenStream FrenchAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.fr with parameters of type TokenStream
FrenchStemFilter(TokenStream in)
           
FrenchStemFilter(TokenStream in, Set exclusiontable)
           
 

Uses of TokenStream in org.apache.lucene.analysis.ngram
 

Subclasses of TokenStream in org.apache.lucene.analysis.ngram
 class EdgeNGramTokenizer
          Tokenizes the input into n-grams of the given size.
 class NGramTokenizer
          Tokenizes the input into n-grams of the given size(s).
 

Uses of TokenStream in org.apache.lucene.analysis.nl
 

Subclasses of TokenStream in org.apache.lucene.analysis.nl
 class DutchStemFilter
          A filter that stems Dutch words.
 

Methods in org.apache.lucene.analysis.nl that return TokenStream
 TokenStream DutchAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.nl with parameters of type TokenStream
DutchStemFilter(TokenStream _in)
           
DutchStemFilter(TokenStream _in, Set exclusiontable)
          Builds a DutchStemFilter that uses an exclusion table.
DutchStemFilter(TokenStream _in, Set exclusiontable, Map stemdictionary)
           
 

Uses of TokenStream in org.apache.lucene.analysis.ru
 

Subclasses of TokenStream in org.apache.lucene.analysis.ru
 class RussianLetterTokenizer
          A RussianLetterTokenizer is a tokenizer that extends LetterTokenizer by additionally looking up letters in a given "russian charset".
 class RussianLowerCaseFilter
          Normalizes token text to lower case using a given Russian charset.
 class RussianStemFilter
          A filter that stems Russian words.
 

Methods in org.apache.lucene.analysis.ru that return TokenStream
 TokenStream RussianAnalyzer.tokenStream(String fieldName, Reader reader)
          Creates a TokenStream which tokenizes all the text in the provided Reader.
 

Constructors in org.apache.lucene.analysis.ru with parameters of type TokenStream
RussianLowerCaseFilter(TokenStream in, char[] charset)
           
RussianStemFilter(TokenStream in, char[] charset)
           
 

Uses of TokenStream in org.apache.lucene.analysis.standard
 

Subclasses of TokenStream in org.apache.lucene.analysis.standard
 class StandardFilter
          Normalizes tokens extracted with StandardTokenizer.
 class StandardTokenizer
          A grammar-based tokenizer constructed with JavaCC.
 

Methods in org.apache.lucene.analysis.standard that return TokenStream
 TokenStream StandardAnalyzer.tokenStream(String fieldName, Reader reader)
          Constructs a StandardTokenizer filtered by a StandardFilter, a LowerCaseFilter and a StopFilter.
 

Constructors in org.apache.lucene.analysis.standard with parameters of type TokenStream
StandardFilter(TokenStream in)
          Constructs a filter over the given input stream.
 

Uses of TokenStream in org.apache.lucene.analysis.th
 

Subclasses of TokenStream in org.apache.lucene.analysis.th
 class ThaiWordFilter
          A TokenFilter that uses java.text.BreakIterator to break each Thai Token into separate Tokens, one per Thai word.
 

Methods in org.apache.lucene.analysis.th that return TokenStream
 TokenStream ThaiAnalyzer.tokenStream(String fieldName, Reader reader)
           
 

Constructors in org.apache.lucene.analysis.th with parameters of type TokenStream
ThaiWordFilter(TokenStream input)
           
 



Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.