Class HHMMSegmenter
- java.lang.Object
  - org.apache.lucene.analysis.cn.smart.hhmm.HHMMSegmenter
public class HHMMSegmenter extends java.lang.Object
Finds the optimal segmentation of a sentence into Chinese words
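To make "optimal segmentation" concrete, here is a minimal sketch of the general idea: choose the split of a sentence whose words have the lowest total cost. It is not the HHMMSegmenter algorithm; it swaps in a plain dynamic program over a tiny hand-written dictionary. The class ToySegmenter, the cost table, and the constants UNKNOWN_CHAR_COST and MAX_WORD_LEN are all hypothetical names and values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

final class ToySegmenter {
    // Hypothetical word costs (lower is better); not taken from Lucene's dictionaries.
    private static final Map<String, Integer> COST = Map.of(
        "我", 3, "喜", 6, "欢", 6, "喜欢", 2, "北", 6, "京", 6, "北京", 2);
    // Assumed fallback cost for a single character missing from the table.
    private static final int UNKNOWN_CHAR_COST = 10;
    // Assumed upper bound on word length, in chars.
    private static final int MAX_WORD_LEN = 4;

    /** Returns the split of the sentence with the lowest total word cost. */
    static List<String> segment(String sentence) {
        int n = sentence.length();
        int[] best = new int[n + 1]; // best[i]: minimal cost of segmenting sentence[0, i)
        int[] prev = new int[n + 1]; // prev[i]: start index of the last word in that split
        for (int end = 1; end <= n; end++) {
            best[end] = Integer.MAX_VALUE;
            for (int start = Math.max(0, end - MAX_WORD_LEN); start < end; start++) {
                Integer cost = COST.get(sentence.substring(start, end));
                if (cost == null) {
                    if (end - start > 1) {
                        continue; // unknown multi-character strings are not words
                    }
                    cost = UNKNOWN_CHAR_COST; // an unknown single character is always allowed
                }
                if (best[start] + cost < best[end]) {
                    best[end] = best[start] + cost;
                    prev[end] = start;
                }
            }
        }
        // Walk the back-pointers from the end of the sentence, then restore word order.
        List<String> words = new ArrayList<>();
        for (int i = n; i > 0; i = prev[i]) {
            words.add(sentence.substring(prev[i], i));
        }
        Collections.reverse(words);
        return words;
    }
}
```

With this cost table, ToySegmenter.segment("我喜欢北京") prefers the two-character words over their single-character pieces. The real class returns SegToken objects from process(java.lang.String) rather than plain strings.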
Field Summary
Fields
Modifier and Type                Field       Description
private static WordDictionary    wordDict
Constructor Summary
Constructors
Constructor        Description
HHMMSegmenter()
Method Summary
Methods
Modifier and Type           Method                                       Description
private SegGraph            createSegGraph(java.lang.String sentence)    Create the SegGraph for a sentence.
private static int[]        getCharTypes(java.lang.String sentence)      Get the character types for every character in a sentence.
java.util.List<SegToken>    process(java.lang.String sentence)           Return a list of SegToken representing the best segmentation of a sentence.
Field Detail
wordDict
private static WordDictionary wordDict
Method Detail
createSegGraph
private SegGraph createSegGraph(java.lang.String sentence)
Create the SegGraph for a sentence.
Parameters:
sentence - input sentence, without start and end markers
Returns:
SegGraph corresponding to the input sentence.
getCharTypes
private static int[] getCharTypes(java.lang.String sentence)
Get the character types for every character in a sentence.
Parameters:
sentence - input sentence
Returns:
array of character types corresponding to character positions in the sentence
See Also:
Utility.getCharType(char)
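The shape of the returned array can be illustrated with a small standalone sketch: one type code per character position, so the result has the same length as the sentence. This is not Lucene's implementation. The class CharTypeSketch, its type codes, and the classification rules are hypothetical stand-ins for Utility.getCharType(char) and the library's own constants.

```java
final class CharTypeSketch {
    // Hypothetical type codes for illustration only; Lucene defines its own constants.
    static final int DIGIT = 0;
    static final int LETTER = 1;
    static final int HAN = 2;
    static final int OTHER = 3;

    /** Classifies a single char with simple, assumed rules. */
    static int charType(char ch) {
        if (Character.isDigit(ch)) {
            return DIGIT;
        }
        // Han characters are also letters, so test the script before isLetter.
        if (Character.UnicodeScript.of(ch) == Character.UnicodeScript.HAN) {
            return HAN;
        }
        if (Character.isLetter(ch)) {
            return LETTER;
        }
        return OTHER;
    }

    /** Returns one type code per char position in the sentence. */
    static int[] charTypes(String sentence) {
        int[] types = new int[sentence.length()];
        for (int i = 0; i < types.length; i++) {
            types[i] = charType(sentence.charAt(i));
        }
        return types;
    }
}
```

For example, CharTypeSketch.charTypes("A1中!") yields the codes for LETTER, DIGIT, HAN and OTHER, in that order.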