Class SimilarityBase
- java.lang.Object
 - org.apache.lucene.search.similarities.Similarity
  - org.apache.lucene.search.similarities.SimilarityBase
- Direct Known Subclasses:
 DFRSimilarity, IBSimilarity, LMSimilarity
public abstract class SimilarityBase extends Similarity
A subclass of Similarity that provides a simplified API for its descendants. Subclasses are only required to implement the score(org.apache.lucene.search.similarities.BasicStats, float, float) and toString() methods. Implementing explain(Explanation, BasicStats, int, float, float) is optional, since SimilarityBase already provides a basic explanation of the score and the term frequency. However, implementers of a subclass are encouraged to include as much detail about the scoring method as possible.
Note: multi-word queries such as phrase queries are scored differently than in Lucene's default ranking algorithm. The default algorithm "fakes" an IDF value for the phrase as a whole (since it does not know the real value), whereas this class scores a phrase as the sum of the individual term scores.
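The note on phrase scoring can be illustrated with a small standalone sketch. It does not use Lucene: TermStatsSketch, scoreTerm and scorePhrase are hypothetical stand-ins, the per-term formula is invented for illustration, and scoring every term with the phrase's own frequency is an assumption rather than something this page states.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for per-term statistics; NOT Lucene's BasicStats. */
final class TermStatsSketch {
    final long docFreq;      // number of documents containing the term
    final long numberOfDocs; // number of documents in the collection

    TermStatsSketch(long docFreq, long numberOfDocs) {
        this.docFreq = docFreq;
        this.numberOfDocs = numberOfDocs;
    }
}

final class PhraseSumSketch {
    /** Illustrative per-term score: a smoothed IDF times a length-damped frequency. */
    static double scoreTerm(TermStatsSketch stats, double freq, double docLen) {
        double idf = Math.log((stats.numberOfDocs + 1.0) / (stats.docFreq + 0.5));
        return idf * freq / (freq + docLen);
    }

    /** Phrase score computed as the sum of the individual term scores. */
    static double scorePhrase(List<TermStatsSketch> terms, double phraseFreq, double docLen) {
        double sum = 0.0;
        for (TermStatsSketch term : terms) {
            // Assumption: each term is scored with the phrase's frequency.
            sum += scoreTerm(term, phraseFreq, docLen);
        }
        return sum;
    }

    public static void main(String[] args) {
        List<TermStatsSketch> phrase = Arrays.asList(
            new TermStatsSketch(10, 1000),   // rare term
            new TermStatsSketch(200, 1000)); // common term
        System.out.println(scorePhrase(phrase, 2.0, 50.0));
    }
}
```

No phrase-level IDF is computed anywhere in the sketch; the rare term simply contributes more to the sum than the common one.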
 
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity
Similarity.SimScorer, Similarity.SimWeight 
Constructor Summary
SimilarityBase(): Sole constructor.
Method Summary
- long computeNorm(FieldInvertState state): Encodes the document length in the same way as TFIDFSimilarity.
- Similarity.SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats, TermStatistics... termStats): Compute any collection-level weight (e.g. IDF, average document length, etc.) needed for scoring a query.
- boolean getDiscountOverlaps(): Returns true if overlap tokens are discounted from the document's length.
- static double log2(double x): Returns the base two logarithm of x.
- void setDiscountOverlaps(boolean v): Determines whether overlap tokens (tokens with 0 position increment) are ignored when computing norm.
- Similarity.SimScorer simScorer(Similarity.SimWeight stats, AtomicReaderContext context): Creates a new Similarity.SimScorer to score matching documents from a segment of the inverted index.
- abstract java.lang.String toString(): Subclasses must override this method to return the name of the Similarity and preferably the values of parameters (if any) as well.
Methods inherited from class org.apache.lucene.search.similarities.Similarity
coord, queryNorm 
Method Detail
setDiscountOverlaps
public void setDiscountOverlaps(boolean v)
Determines whether overlap tokens (tokens with a position increment of 0) are ignored when computing the norm. The default is true, meaning overlap tokens do not count when computing norms.
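As a rough illustration of what discounting overlaps means for the document length, the sketch below counts tokens from an array of position increments. OverlapLengthSketch and fieldLength are hypothetical names and the code is independent of Lucene; it only mirrors the behaviour described above, not the library's implementation.

```java
final class OverlapLengthSketch {
    /**
     * Counts the tokens that contribute to the field length.
     * positionIncrements[i] is the position increment of token i;
     * an increment of 0 marks an overlap token (for example a synonym
     * stacked on the previous token's position).
     */
    static int fieldLength(int[] positionIncrements, boolean discountOverlaps) {
        int length = 0;
        for (int increment : positionIncrements) {
            if (discountOverlaps && increment == 0) {
                continue; // overlap token: not counted when discounting is on
            }
            length++;
        }
        return length;
    }

    public static void main(String[] args) {
        // "quick", a synonym "fast" at the same position, then "fox".
        int[] increments = {1, 0, 1};
        System.out.println(fieldLength(increments, true));  // prints 2
        System.out.println(fieldLength(increments, false)); // prints 3
    }
}
```

With discounting enabled, injecting synonyms at index time does not make a document look longer to the scoring function.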
getDiscountOverlaps
public boolean getDiscountOverlaps()
Returns true if overlap tokens are discounted from the document's length.
- See Also:
 setDiscountOverlaps(boolean)
computeWeight
public final Similarity.SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats, TermStatistics... termStats)
Description copied from class: Similarity
Compute any collection-level weight (e.g. IDF, average document length, etc.) needed for scoring a query.
- Specified by:
 computeWeight in class Similarity
- Parameters:
 queryBoost - the query-time boost.
 collectionStats - collection-level statistics, such as the number of tokens in the collection.
 termStats - term-level statistics, such as the document frequency of a term across the collection.
- Returns:
 SimWeight object with the information this Similarity needs to score a query.
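To make "collection-level weight" concrete, here is a minimal standalone sketch of the two quantities the description names. CollectionWeightSketch and its methods are hypothetical, the IDF shown is the BM25-style smoothed form chosen purely for illustration, and none of this is claimed to be what computeWeight does internally.

```java
final class CollectionWeightSketch {
    /** Average document length: total tokens in the field divided by the document count. */
    static double averageFieldLength(long totalFieldTokens, long numberOfDocuments) {
        return (double) totalFieldTokens / numberOfDocuments;
    }

    /** BM25-style smoothed IDF; other scoring models use different formulas. */
    static double idf(long docFreq, long numberOfDocuments) {
        return Math.log(1.0 + (numberOfDocuments - docFreq + 0.5) / (docFreq + 0.5));
    }

    public static void main(String[] args) {
        // Illustrative numbers only: 100 documents, 5000 tokens in the field.
        System.out.println(averageFieldLength(5000, 100));
        System.out.println(idf(3, 100));  // rare term
        System.out.println(idf(80, 100)); // common term
    }
}
```

Both values depend only on collection and term statistics, not on any single document, which is why they can be computed once per query rather than once per match.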
 
 
simScorer
public Similarity.SimScorer simScorer(Similarity.SimWeight stats, AtomicReaderContext context) throws java.io.IOException
Description copied from class: Similarity
Creates a new Similarity.SimScorer to score matching documents from a segment of the inverted index.
- Specified by:
 simScorer in class Similarity
- Parameters:
 stats - collection information from Similarity.computeWeight(float, CollectionStatistics, TermStatistics...)
 context - segment of the inverted index to be scored.
- Returns:
 SimScorer for scoring documents across context
- Throws:
 java.io.IOException - if there is a low-level I/O error
 
toString
public abstract java.lang.String toString()
Subclasses must override this method to return the name of the Similarity and preferably the values of parameters (if any) as well.
- Overrides:
 toString in class java.lang.Object
 
computeNorm
public long computeNorm(FieldInvertState state)
Encodes the document length in the same way as TFIDFSimilarity.
- Specified by:
 computeNorm in class Similarity
- Parameters:
 state - current processing state for this field
- Returns:
 computed norm value
 
 
log2
public static double log2(double x)
Returns the base two logarithm of x.
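A base two logarithm is commonly derived from the natural logarithm by a change of base. The sketch below shows that approach with a hypothetical Log2Sketch class; it is assumed, not confirmed by this page, that the library method works the same way.

```java
final class Log2Sketch {
    /** Natural logarithm of 2, computed once for the change of base. */
    private static final double LOG_2 = Math.log(2.0);

    /** Returns log base 2 of x, using log2(x) = ln(x) / ln(2). */
    static double log2(double x) {
        return Math.log(x) / LOG_2;
    }

    public static void main(String[] args) {
        System.out.println(log2(8.0));
        System.out.println(log2(0.5));
    }
}
```

Because the result goes through floating-point division, compare it against expected values with a small tolerance rather than with ==.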