Class RussianAnalyzer

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    public final class RussianAnalyzer
    extends StopwordAnalyzerBase
    Analyzer for the Russian language.

    Supports an external list of stopwords (words that will not be indexed at all). A default set of stopwords is used unless an alternative list is specified.

    You must specify the required Version compatibility when creating RussianAnalyzer:

    • As of 3.1, StandardTokenizer is used, Snowball stemming is done with SnowballFilter, and Snowball stopwords are used by default.
    • Field Detail

      • DEFAULT_STOPWORD_FILE

        public static final java.lang.String DEFAULT_STOPWORD_FILE
        File containing default Russian stopwords.
        See Also:
        Constant Field Values
    • Constructor Detail

      • RussianAnalyzer

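        public RussianAnalyzer(Version matchVersion)
        Builds an analyzer with the default stop words (see getDefaultStopSet()).
        Parameters:
        matchVersion - Lucene compatibility version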
      • RussianAnalyzer

        public RussianAnalyzer(Version matchVersion,
                               CharArraySet stopwords)
        Builds an analyzer with the given stop words.
        Parameters:
        matchVersion - Lucene compatibility version
        stopwords - a stopword set
      • RussianAnalyzer

        public RussianAnalyzer(Version matchVersion,
                               CharArraySet stopwords,
                               CharArraySet stemExclusionSet)
        Builds an analyzer with the given stop words and a set of words that are excluded from stemming.
        Parameters:
        matchVersion - Lucene compatibility version
        stopwords - a stopword set
        stemExclusionSet - a set of words not to be stemmed
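
The interplay of the stopwords and stemExclusionSet parameters can be illustrated with a minimal plain-Java sketch. It is not Lucene code and does not use the Snowball stemmer: the class StopAndStemSketch, the methods analyze and toyStem, and the transliterated sample words are all hypothetical, chosen only to show the control flow. The sketch assumes that stopwords are dropped before stemming and that words in the exclusion set bypass the stemmer unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopAndStemSketch {

    // Hypothetical stand-in for a real stemmer: strips one trailing "y" or "i"
    // from tokens longer than three characters. This is NOT the Snowball algorithm.
    static String toyStem(String token) {
        if (token.length() > 3 && (token.endsWith("y") || token.endsWith("i"))) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Lowercases each whitespace-separated token, drops stopwords, and stems
    // every remaining token that is not listed in the stem exclusion set.
    static List<String> analyze(String text, Set<String> stopwords, Set<String> stemExclusionSet) {
        List<String> out = new ArrayList<>();
        for (String raw : text.split("\\s+")) {
            String token = raw.toLowerCase(Locale.ROOT);
            if (token.isEmpty() || stopwords.contains(token)) {
                continue; // stopwords are never emitted, so they would not be indexed
            }
            out.add(stemExclusionSet.contains(token) ? token : toyStem(token));
        }
        return out;
    }

    public static void main(String[] args) {
        // Transliterated sample input (assumption: real input would be Cyrillic).
        String text = "Knigi i zhurnaly na stole";
        Set<String> stopwords = new HashSet<>(Arrays.asList("i", "na"));

        // Without exclusions every surviving token is stemmed.
        System.out.println(analyze(text, stopwords, Collections.<String>emptySet()));
        // prints [knig, zhurnal, stole]

        // With "knigi" excluded from stemming it is kept verbatim.
        Set<String> keep = new HashSet<>(Arrays.asList("knigi"));
        System.out.println(analyze(text, stopwords, keep));
        // prints [knigi, zhurnal, stole]
    }
}
```

With the real class, the two sets would be passed as CharArraySet instances to the three-argument constructor documented above.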
    • Method Detail

      • getDefaultStopSet

        public static CharArraySet getDefaultStopSet()
        Returns an unmodifiable instance of the default stop-words set.
        Returns:
        an unmodifiable instance of the default stop-words set.
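
Because the returned set is unmodifiable, callers who want to extend the defaults typically copy the set first. The sketch below shows that copy-then-extend pattern with standard java.util collections as a stand-in for CharArraySet. The class DefaultStopSetSketch, its methods, and the transliterated sample words are hypothetical, and it is an assumption that the real set rejects mutation in the same way as Collections.unmodifiableSet.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class DefaultStopSetSketch {

    // Hypothetical stand-in for the analyzer's default stopword set.
    private static final Set<String> DEFAULT_STOP_SET =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("i", "v", "na")));

    // Mirrors the shape of getDefaultStopSet(): a shared, read-only view.
    static Set<String> getDefaultStopSet() {
        return DEFAULT_STOP_SET;
    }

    // Copies the defaults into a fresh mutable set and adds the extra words,
    // leaving the shared default set untouched.
    static Set<String> extendedStopSet(String... extra) {
        Set<String> copy = new HashSet<>(getDefaultStopSet());
        copy.addAll(Arrays.asList(extra));
        return copy;
    }

    public static void main(String[] args) {
        try {
            getDefaultStopSet().add("eto"); // direct mutation is rejected
        } catch (UnsupportedOperationException e) {
            System.out.println("default set is read-only");
        }
        Set<String> custom = extendedStopSet("eto");
        System.out.println(custom.contains("eto") && custom.contains("na"));
    }
}
```

With the real API, the extended copy would be supplied as the stopwords argument of RussianAnalyzer(Version, CharArraySet).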