org.apache.lucene.analysis.cn

Class ChineseFilter


public final class ChineseFilter
extends TokenFilter

Title: ChineseFilter
Description: Filter with a stop word table.
Rules:
1. No digits are allowed.
2. An English word/token should be longer than 1 character.
3. One Chinese character is treated as one Chinese word.
TO DO:
1. Add Chinese stop words, such as \ue400
2. Dictionary-based Chinese word extraction
3. Intelligent Chinese word extraction
Copyright: Copyright (c) 2001
Company:
Version:
1.0
Author:
Yiyi Sun
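
The filtering rules in the class description can be restated as a small predicate. The following is a minimal sketch, not the library's source: the class `ChineseFilterRulesSketch`, the method `keep`, and the sample stop words are hypothetical, since the contents of STOP_WORDS are not listed on this page. It assumes the decision is made on the Unicode category of the token's first character.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Lucene: restates the documented rules as a predicate.
final class ChineseFilterRulesSketch {
    // Assumed sample stop words; the real STOP_WORDS contents are not shown on this page.
    static final String[] SAMPLE_STOP_WORDS = {"the", "and", "of"};

    // Stop word table built once from the array for constant-time lookups.
    private static final Set<String> STOP_TABLE =
            new HashSet<>(Arrays.asList(SAMPLE_STOP_WORDS));

    // Returns true if a token with this text would pass the documented rules.
    static boolean keep(String text) {
        if (text == null || text.isEmpty() || STOP_TABLE.contains(text)) {
            return false; // empty input or stop word
        }
        switch (Character.getType(text.charAt(0))) {
            case Character.LOWERCASE_LETTER:
            case Character.UPPERCASE_LETTER:
                // An English word/token should be longer than 1 character.
                return text.length() > 1;
            case Character.OTHER_LETTER:
                // One Chinese character counts as one Chinese word.
                return true;
            default:
                // Digits, punctuation, and everything else are dropped.
                return false;
        }
    }
}
```

Under these assumptions, `keep("lucene")` passes, while `keep("a")`, `keep("42")`, and `keep("the")` are rejected. A single CJK ideograph falls into the `OTHER_LETTER` category and is kept.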

Field Summary

static String[]
STOP_WORDS

Fields inherited from class org.apache.lucene.analysis.TokenFilter

input

Constructor Summary

ChineseFilter(TokenStream in)

Method Summary

Token
next()
Returns the next token in the stream, or null at EOS.

Methods inherited from class org.apache.lucene.analysis.TokenFilter

close

Methods inherited from class org.apache.lucene.analysis.TokenStream

close, next

Field Details

STOP_WORDS

public static final String[] STOP_WORDS

Constructor Details

ChineseFilter

public ChineseFilter(TokenStream in)

Method Details

next

public final Token next()
            throws IOException
Returns the next token in the stream, or null at EOS.
Overrides:
next in class TokenStream
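
The contract of next() (return a token, or null at the end of the stream) leads to a pull-style loop in both the filter and its caller. The sketch below illustrates that pattern with stand-in types so that it runs without Lucene on the classpath. `SketchTokenStream`, `ListTokenStream`, `NoDigitFilterSketch`, and `NextLoopSketch` are hypothetical names, tokens are simplified to plain strings, and only the "no digits" rule is applied.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for a token stream; not a Lucene type.
interface SketchTokenStream {
    String next() throws IOException; // null signals end of stream
}

// Source stream backed by an in-memory list.
final class ListTokenStream implements SketchTokenStream {
    private final Iterator<String> it;

    ListTokenStream(List<String> terms) {
        this.it = terms.iterator();
    }

    public String next() {
        return it.hasNext() ? it.next() : null;
    }
}

// Filter that wraps another stream and skips tokens starting with a digit.
final class NoDigitFilterSketch implements SketchTokenStream {
    private final SketchTokenStream input;

    NoDigitFilterSketch(SketchTokenStream in) {
        this.input = in;
    }

    public String next() throws IOException {
        // Pull from the wrapped stream until a token passes or the stream ends.
        for (String t = input.next(); t != null; t = input.next()) {
            if (!t.isEmpty() && !Character.isDigit(t.charAt(0))) {
                return t;
            }
        }
        return null; // end of stream
    }
}

final class NextLoopSketch {
    // Consumes a stream until next() returns null and collects the tokens.
    static List<String> drain(SketchTokenStream stream) {
        List<String> out = new ArrayList<>();
        try {
            for (String t = stream.next(); t != null; t = stream.next()) {
                out.add(t);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }
}
```

The filter never signals the end of the stream on its own: it returns null only after the wrapped stream has returned null, so skipped tokens are invisible to the caller.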

Copyright © 2000-2006 Apache Software Foundation. All Rights Reserved.