org.apache.lucene.analysis.fr
Class ElisionFilter

java.lang.Object
  extended by org.apache.lucene.analysis.TokenStream
      extended by org.apache.lucene.analysis.TokenFilter
          extended by org.apache.lucene.analysis.fr.ElisionFilter

public class ElisionFilter
extends TokenFilter

Removes elisions from a token stream. For example, "l'avion" (the plane) will be tokenized as "avion" (plane).

Note that StandardTokenizer treats the apostrophe (') as a space and cuts it out.
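
The behaviour described above can be illustrated with a minimal stand-alone sketch. It is not the Lucene implementation: the class `ElisionSketch`, the method `stripElision`, and the sample article set are hypothetical names chosen for illustration, and the sketch assumes a term is stripped only when the text before its first apostrophe is one of the configured articles.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-alone sketch; this is NOT the Lucene ElisionFilter source.
class ElisionSketch {

    // Assumed sample articles, chosen for illustration only.
    static final Set<String> SAMPLE_ARTICLES =
            new HashSet<String>(Arrays.asList("l", "d", "j", "qu"));

    // Returns the term without its leading elided article,
    // or the term unchanged when no configured article precedes the apostrophe.
    static String stripElision(String term, Set<String> articles) {
        int apostrophe = term.indexOf('\'');
        if (apostrophe > 0) {
            String prefix = term.substring(0, apostrophe).toLowerCase(Locale.ROOT);
            if (articles.contains(prefix)) {
                return term.substring(apostrophe + 1);
            }
        }
        return term;
    }

    public static void main(String[] args) {
        System.out.println(stripElision("l'avion", SAMPLE_ARTICLES));
        System.out.println(stripElision("aujourd'hui", SAMPLE_ARTICLES));
    }
}
```

In this sketch "aujourd'hui" is left untouched because "aujourd" is not in the article set, which is why the set of articles, rather than the apostrophe alone, decides what is removed.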

See Also:
Elision in Wikipedia

Field Summary
 
Fields inherited from class org.apache.lucene.analysis.TokenFilter
input
 
Constructor Summary
protected ElisionFilter(TokenStream input)
          Constructs an elision filter with the standard stop words (the default articles to strip).
  ElisionFilter(TokenStream input, java.util.Set articles)
          Constructs an elision filter that uses the given Set of articles as its stop words.
  ElisionFilter(TokenStream input, java.lang.String[] articles)
          Constructs an elision filter that uses the given array of articles as its stop words.
 
Method Summary
 Token next(Token reusableToken)
          Returns the next input Token, with any leading elided article removed from its term().
 void setArticles(java.util.Set articles)
           
 
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, reset
 
Methods inherited from class org.apache.lucene.analysis.TokenStream
next
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ElisionFilter

protected ElisionFilter(TokenStream input)
Constructs an elision filter with the standard stop words (the default articles to strip).


ElisionFilter

public ElisionFilter(TokenStream input,
                     java.util.Set articles)
Constructs an elision filter that uses the given Set of articles as its stop words.


ElisionFilter

public ElisionFilter(TokenStream input,
                     java.lang.String[] articles)
Constructs an elision filter that uses the given array of articles as its stop words.

Method Detail

setArticles

public void setArticles(java.util.Set articles)

next

public Token next(Token reusableToken)
           throws java.io.IOException
Returns the next input Token, with any leading elided article removed from its term().

Overrides:
next in class TokenStream
Parameters:
reusableToken - a Token that may or may not be used as the return value; this parameter should never be null (the callee is not required to check for null before using it, but it is a good idea to assert that it is not null).
Returns:
the next token in the stream, or null if the end of the stream was reached
Throws:
java.io.IOException
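
The calling convention described for next(Token reusableToken) can be sketched with stand-in types. `SimpleToken`, `SimpleStream`, and `collectTerms` are hypothetical and are not Lucene classes. They only mirror the contract stated above: the caller passes a non-null token that may be reused as the return value, and a null return marks the end of the stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-ins; these are NOT Lucene classes.
class PullLoopSketch {

    // Minimal mutable token holder (stand-in for a reusable token).
    static final class SimpleToken {
        String term;
    }

    // Minimal stream following the contract described above.
    static final class SimpleStream {
        private final Iterator<String> terms;

        SimpleStream(List<String> terms) {
            this.terms = terms.iterator();
        }

        // Fills and returns the caller's token, or returns null at end of stream.
        SimpleToken next(SimpleToken reusableToken) {
            if (reusableToken == null) {
                throw new IllegalArgumentException("reusableToken must not be null");
            }
            if (!terms.hasNext()) {
                return null;
            }
            reusableToken.term = terms.next();
            return reusableToken;
        }
    }

    // Drains the stream, copying each term out before the token is reused.
    static List<String> collectTerms(SimpleStream stream) {
        List<String> out = new ArrayList<String>();
        SimpleToken reusable = new SimpleToken();
        for (SimpleToken t = stream.next(reusable); t != null; t = stream.next(reusable)) {
            out.add(t.term);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms(new SimpleStream(Arrays.asList("avion", "est", "ici"))));
    }
}
```

Because the same token object may be handed back on every call, the loop copies the term out immediately instead of keeping a reference to the token itself.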


Copyright © 2000-2011 Apache Software Foundation. All Rights Reserved.