org.apache.lucene.analysis.cn
Class ChineseFilter
java.lang.Object
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.cn.ChineseFilter
public final class ChineseFilter
extends TokenFilter
Title: ChineseFilter
Description: Filter with a stop word table
Rule: No digits are allowed.
English words/tokens must be longer than 1 character.
Each Chinese character is treated as one Chinese word.
TO DO:
1. Add Chinese stop words, such as
2. Dictionary based Chinese word extraction
3. Intelligent Chinese word extraction
Copyright: Copyright (c) 2001
Company:
Version: 1.0
Author: Yiyi Sun
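The rules in the class description can be illustrated with a small standalone sketch. It does not use the Lucene classes and is not the library's implementation: the class name `ChineseFilterRulesSketch`, the methods `keep` and `filter`, and the contents of the `STOP` set are all hypothetical, chosen only for illustration. It assumes that tokens are classified by the Unicode category of their first character, which is one plausible way to apply the three rules.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Standalone sketch of the filtering rules; not the Lucene implementation.
final class ChineseFilterRulesSketch {

    // Hypothetical stop word table; NOT the actual contents of STOP_WORDS.
    static final Set<String> STOP = new HashSet<>(Arrays.asList("and", "the", "of"));

    // Returns true if a token would pass the rules described above.
    static boolean keep(String text) {
        if (text == null || text.isEmpty()) {
            return false;
        }
        if (STOP.contains(text)) {
            return false; // dropped by the stop word table
        }
        // Assumption: classify the token by the category of its first character.
        switch (Character.getType(text.charAt(0))) {
            case Character.LOWERCASE_LETTER:
            case Character.UPPERCASE_LETTER:
                // English words/tokens must be longer than 1 character.
                return text.length() > 1;
            case Character.OTHER_LETTER:
                // Each Chinese character counts as one word.
                return true;
            default:
                // Digits and everything else are rejected.
                return false;
        }
    }

    // Applies keep() to every token, preserving order.
    static List<String> filter(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            if (keep(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // \u4e2d and \u6587 are two Chinese characters, given as separate tokens.
        List<String> input = Arrays.asList("the", "a", "lucene", "42", "\u4e2d", "\u6587");
        System.out.println(filter(input));
    }
}
```

In this sketch the stop word, the single letter, and the number are dropped, while the longer English word and the two Chinese characters are kept.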
Method Summary
Token next()
Returns the next token in the stream, or null at EOS.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
STOP_WORDS
public static final String[] STOP_WORDS
ChineseFilter
public ChineseFilter(TokenStream in)
next
public final Token next()
throws IOException
Description copied from class: TokenStream
Returns the next token in the stream, or null at EOS. The returned Token is a "full private copy" (not re-used across calls to next()), but this call is slower than calling TokenStream.next(Token) instead.
Overrides: next in class TokenStream
Throws: IOException
Copyright © 2000-2008 Apache Software Foundation. All Rights Reserved.