Class UAX29URLEmailAnalyzer

    • Field Detail

      • DEFAULT_MAX_TOKEN_LENGTH

        public static final int DEFAULT_MAX_TOKEN_LENGTH
        Default maximum allowed token length.
        See Also:
        Constant Field Values
      • maxTokenLength

        private int maxTokenLength
      • STOP_WORDS_SET

        public static final CharArraySet STOP_WORDS_SET
        An unmodifiable set containing some common English words that are usually not useful for searching.
    • Constructor Detail

      • UAX29URLEmailAnalyzer

        public UAX29URLEmailAnalyzer(CharArraySet stopWords)
        Builds an analyzer with the given stop words.
        Parameters:
        stopWords - the set of stop words to remove from the token stream
      • UAX29URLEmailAnalyzer

        public UAX29URLEmailAnalyzer()
        Builds an analyzer with the default stop words (STOP_WORDS_SET).
      • UAX29URLEmailAnalyzer

        public UAX29URLEmailAnalyzer(java.io.Reader stopwords)
                              throws java.io.IOException
        Builds an analyzer with the stop words from the given reader.
        Parameters:
        stopwords - the Reader from which the stop words are read
        Throws:
        java.io.IOException - if an I/O error occurs while reading the stop words
        See Also:
        WordlistLoader.getWordSet(java.io.Reader)
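
The Reader-based constructor above expects a word list as input. The following is a minimal sketch of that input format, using only the Java standard library. It assumes one stop word per line with surrounding whitespace trimmed. The class StopWordReaderSketch and the helper readStopWords are hypothetical names introduced for illustration; they are not part of Lucene and are not the implementation of WordlistLoader.getWordSet. Skipping blank lines is a choice made by this sketch.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Set;
import java.util.TreeSet;

class StopWordReaderSketch {

    // Hypothetical helper (not a Lucene API): reads one stop word per line,
    // trims surrounding whitespace, and skips blank lines.
    static Set<String> readStopWords(Reader reader) throws IOException {
        // TreeSet keeps the words sorted so the printed result is deterministic.
        Set<String> words = new TreeSet<>();
        try (BufferedReader br = new BufferedReader(reader)) {
            String line;
            while ((line = br.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word);
                }
            }
        }
        return words;
    }

    public static void main(String[] args) throws IOException {
        // Four input lines: "the", "  and ", an empty line, and "of".
        Reader input = new StringReader("the\n  and \n\nof\n");
        Set<String> stopWords = readStopWords(input);
        System.out.println(stopWords); // prints [and, of, the]
    }
}
```

In real use, the same kind of Reader (for example, one opened on a stop word file) would be passed directly to the UAX29URLEmailAnalyzer(java.io.Reader) constructor, which declares java.io.IOException for read failures.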