Class ShingleAnalyzerWrapper
- java.lang.Object
-
- org.apache.lucene.analysis.Analyzer
-
- org.apache.lucene.analysis.AnalyzerWrapper
-
- org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable
public final class ShingleAnalyzerWrapper extends AnalyzerWrapper
A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer.
A shingle is another name for a token-based n-gram.
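To make the term concrete, the following is a minimal plain-Java sketch of token-based n-grams. It does not call Lucene: the class name `ShingleSketch` and the method `shingles` are hypothetical names introduced only for illustration, and the output order (each unigram followed by the shingles that start at it) is an assumption chosen for readability, not a statement about ShingleFilter internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
public class ShingleSketch {

    /**
     * Builds token n-grams ("shingles") of every size from minSize to maxSize.
     * Assumes minSize >= 2, so unigrams are only emitted via outputUnigrams.
     */
    public static List<String> shingles(List<String> tokens, int minSize, int maxSize,
                                        String separator, boolean outputUnigrams) {
        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (outputUnigrams) {
                out.add(tokens.get(start)); // the original token
            }
            // Emit every shingle that starts at this token and still fits in the input.
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this", "sentence");
        // prints [please, please divide, divide, divide this, this, this sentence, sentence]
        System.out.println(shingles(tokens, 2, 2, " ", true));
    }
}
```

With `outputUnigrams` set to false, the same input yields only the two-token shingles.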
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer
Analyzer.GlobalReuseStrategy, Analyzer.PerFieldReuseStrategy, Analyzer.ReuseStrategy, Analyzer.TokenStreamComponents
-
-
Field Summary
-
Fields inherited from class org.apache.lucene.analysis.Analyzer
GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
-
-
Constructor Summary
Constructors
Constructor and Description
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)
ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, java.lang.String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, java.lang.String fillerToken)
Creates a new ShingleAnalyzerWrapper.
ShingleAnalyzerWrapper(Version matchVersion)
Wraps StandardAnalyzer.
ShingleAnalyzerWrapper(Version matchVersion, int minShingleSize, int maxShingleSize)
Wraps StandardAnalyzer.
-
Method Summary
All Methods, Instance Methods, Concrete Methods
Modifier and Type, Method, and Description
java.lang.String getFillerToken()
int getMaxShingleSize()
The max shingle (token ngram) size
int getMinShingleSize()
The min shingle (token ngram) size
java.lang.String getTokenSeparator()
Analyzer getWrappedAnalyzer(java.lang.String fieldName)
boolean isOutputUnigrams()
boolean isOutputUnigramsIfNoShingles()
-
Methods inherited from class org.apache.lucene.analysis.AnalyzerWrapper
getOffsetGap, getPositionIncrementGap, initReader
-
Methods inherited from class org.apache.lucene.analysis.Analyzer
close, getReuseStrategy, tokenStream, tokenStream
-
-
-
-
Constructor Detail
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize)
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, java.lang.String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, java.lang.String fillerToken)
Creates a new ShingleAnalyzerWrapper.
- Parameters:
delegate - Analyzer whose TokenStream is to be filtered
minShingleSize - Min shingle (token ngram) size
maxShingleSize - Max shingle (token ngram) size
tokenSeparator - Used to separate input stream tokens in output shingles
outputUnigrams - Whether or not the filter shall pass the original tokens to the output stream
outputUnigramsIfNoShingles - Overrides the behavior of outputUnigrams==false for those times when no shingles are available (because there are fewer than minShingleSize tokens in the input stream). Note that if outputUnigrams==true, then unigrams are always output, regardless of whether any shingles are available.
fillerToken - Filler token to use when positionIncrement is more than 1
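The interaction between outputUnigrams and outputUnigramsIfNoShingles described above can be sketched in plain Java. This is an illustration under stated assumptions, not Lucene code: `UnigramFallbackSketch` and `analyze` are hypothetical names, the fallback condition is taken directly from the parameter description (fewer than minShingleSize input tokens), and fillerToken handling is deliberately left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration class; not part of Lucene.
public class UnigramFallbackSketch {

    public static List<String> analyze(List<String> tokens, int minSize, int maxSize,
                                       String separator, boolean outputUnigrams,
                                       boolean outputUnigramsIfNoShingles) {
        // Unigrams are emitted when explicitly requested, or as a fallback when
        // the input is too short to form any shingle at all.
        boolean emitUnigrams = outputUnigrams
                || (outputUnigramsIfNoShingles && tokens.size() < minSize);

        List<String> out = new ArrayList<>();
        for (int start = 0; start < tokens.size(); start++) {
            if (emitUnigrams) {
                out.add(tokens.get(start));
            }
            for (int size = minSize; size <= maxSize && start + size <= tokens.size(); size++) {
                out.add(String.join(separator, tokens.subList(start, start + size)));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> single = Arrays.asList("hello");
        System.out.println(analyze(single, 2, 2, " ", false, false)); // prints []
        System.out.println(analyze(single, 2, 2, " ", false, true));  // prints [hello]
    }
}
```

When the input is long enough to produce shingles, the fallback flag has no effect in this sketch: only outputUnigrams decides whether the original tokens appear.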
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Version matchVersion)
Wraps StandardAnalyzer.
-
ShingleAnalyzerWrapper
public ShingleAnalyzerWrapper(Version matchVersion, int minShingleSize, int maxShingleSize)
Wraps StandardAnalyzer.
-
-
Method Detail
-
getMaxShingleSize
public int getMaxShingleSize()
The max shingle (token ngram) size
- Returns:
The max shingle (token ngram) size
-
getMinShingleSize
public int getMinShingleSize()
The min shingle (token ngram) size
- Returns:
The min shingle (token ngram) size
-
getTokenSeparator
public java.lang.String getTokenSeparator()
-
isOutputUnigrams
public boolean isOutputUnigrams()
-
isOutputUnigramsIfNoShingles
public boolean isOutputUnigramsIfNoShingles()
-
getFillerToken
public java.lang.String getFillerToken()
-
getWrappedAnalyzer
public final Analyzer getWrappedAnalyzer(java.lang.String fieldName)
-
-