Class ShingleAnalyzerWrapper

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    public final class ShingleAnalyzerWrapper
    extends AnalyzerWrapper
    A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.

    A shingle is another name for a token-based n-gram.
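
    To make the definition concrete, here is a minimal plain-Java sketch of token-based n-grams. It does not use Lucene; the class ShingleSketch and its shingles method are hypothetical names chosen for illustration, and the sketch only models the idea of joining adjacent tokens, not how ShingleFilter is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
class ShingleSketch {

    /**
     * Returns every run of minSize..maxSize adjacent tokens,
     * joined with the given separator.
     */
    static List<String> shingles(List<String> tokens, int minSize, int maxSize, String separator) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            // Grow the window from minSize up to maxSize while it still fits.
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        // Prints the 2-token and 3-token shingles of the input.
        System.out.println(shingles(tokens, 2, 3, " "));
    }
}
```

    With a minimum size of 2 and a maximum of 3, the four input tokens yield three 2-token shingles and two 3-token shingles.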

    • Constructor Detail

      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
                                      int maxShingleSize)
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer,
                                      int minShingleSize,
                                      int maxShingleSize)
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Analyzer delegate,
                                      int minShingleSize,
                                      int maxShingleSize,
                                      java.lang.String tokenSeparator,
                                      boolean outputUnigrams,
                                      boolean outputUnigramsIfNoShingles,
                                      java.lang.String fillerToken)
        Creates a new ShingleAnalyzerWrapper.
        Parameters:
        delegate - Analyzer whose TokenStream is to be filtered
        minShingleSize - Min shingle (token ngram) size
        maxShingleSize - Max shingle size
        tokenSeparator - Used to separate input stream tokens in output shingles
        outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
        outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false for those times when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). Note that if outputUnigrams==true, then unigrams are always output, regardless of whether any shingles are available.
        fillerToken - Filler token to use when positionIncrement is more than 1
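
        The interaction between outputUnigrams and outputUnigramsIfNoShingles described above can be sketched in plain Java. This is a simplified model of the documented rules, not Lucene code: the class UnigramRuleSketch and its analyze method are hypothetical, and the sketch assumes minShingleSize is at least 2 so that unigrams and shingles never coincide.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; models the parameter descriptions only.
class UnigramRuleSketch {

    static List<String> analyze(List<String> tokens, int minSize, int maxSize, String separator,
                                boolean outputUnigrams, boolean outputUnigramsIfNoShingles) {
        // No shingle can be formed when the input has fewer than minSize tokens.
        boolean noShingles = tokens.size() < minSize;
        // outputUnigrams==true always wins; the fallback flag only matters
        // when outputUnigrams==false and no shingles are available.
        boolean emitUnigrams = outputUnigrams || (outputUnigramsIfNoShingles && noShingles);

        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (emitUnigrams) {
                out.add(tokens.get(start));
            }
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> single = Arrays.asList("hello");
        // One token, bigrams requested: nothing to shingle.
        System.out.println(analyze(single, 2, 2, " ", false, false));
        System.out.println(analyze(single, 2, 2, " ", false, true));
    }
}
```

        In this model a single-token input produces no output when both flags are false, and produces the lone unigram when only outputUnigramsIfNoShingles is true. Once the input is long enough to form a shingle, the fallback flag has no effect.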
      • ShingleAnalyzerWrapper

        public ShingleAnalyzerWrapper(Version matchVersion,
                                      int minShingleSize,
                                      int maxShingleSize)
    • Method Detail

      • getMaxShingleSize

        public int getMaxShingleSize()
        The max shingle (token ngram) size
        Returns:
        The max shingle (token ngram) size
      • getMinShingleSize

        public int getMinShingleSize()
        The min shingle (token ngram) size
        Returns:
        The min shingle (token ngram) size
      • getTokenSeparator

        public java.lang.String getTokenSeparator()
      • isOutputUnigrams

        public boolean isOutputUnigrams()
      • isOutputUnigramsIfNoShingles

        public boolean isOutputUnigramsIfNoShingles()
      • getFillerToken

        public java.lang.String getFillerToken()
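
        The constructor documentation describes the filler token as the value used when positionIncrement is more than 1, for example when an upstream filter has removed a token and left a position gap. The plain-Java sketch below models that idea under stated assumptions: FillerTokenSketch, fillGaps, and bigrams are hypothetical names, the position increments are passed in as a plain array, and any limits ShingleFilter applies to the number of fillers are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; a simplified model, not Lucene code.
class FillerTokenSketch {

    /**
     * Inserts one filler token for each skipped position. An increment of 1
     * means "directly after the previous token"; an increment of 2 means one
     * position was skipped, and so on.
     */
    static List<String> fillGaps(List<String> tokens, int[] positionIncrements, String fillerToken) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            for (int gap = 1; gap < positionIncrements[i]; gap++) {
                out.add(fillerToken);
            }
            out.add(tokens.get(i));
        }
        return out;
    }

    /** Joins each pair of adjacent entries with the separator. */
    static List<String> bigrams(List<String> tokens, String separator) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i + 1 < tokens.size(); i++) {
            out.add(tokens.get(i) + separator + tokens.get(i + 1));
        }
        return out;
    }

    public static void main(String[] args) {
        // "this" was removed upstream, so "sentence" arrives with increment 2.
        List<String> tokens = Arrays.asList("please", "divide", "sentence");
        int[] increments = {1, 1, 2};
        System.out.println(bigrams(fillGaps(tokens, increments, "_"), " "));
    }
}
```

        In this model the gap left by the removed token shows up as the filler in the shingles on either side of it, so "divide" and "sentence" are not joined as if they had been adjacent.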
      • getWrappedAnalyzer

        public final Analyzer getWrappedAnalyzer(java.lang.String fieldName)