Class SimilarityBase
java.lang.Object
  org.apache.lucene.search.similarities.Similarity
    org.apache.lucene.search.similarities.SimilarityBase

Direct Known Subclasses:
DFRSimilarity, IBSimilarity, LMSimilarity
public abstract class SimilarityBase extends Similarity
A subclass of Similarity that provides a simplified API for its descendants. Subclasses are only required to implement the score(org.apache.lucene.search.similarities.BasicStats, float, float) and toString() methods. Implementing explain(Explanation, BasicStats, int, float, float) is optional, because SimilarityBase already provides a basic explanation of the score and the term frequency. However, implementers of a subclass are encouraged to include as much detail about the scoring method as possible.

Note: multi-word queries such as phrase queries are scored differently than in Lucene's default ranking algorithm. The default algorithm "fakes" an IDF value for the phrase as a whole (since it does not know the real one), whereas this class scores a phrase as the sum of its individual term scores.
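As a rough illustration of this contract and of the phrase-scoring note, the following self-contained sketch mirrors the shape of the API without depending on Lucene. `SketchStats`, `SketchSimilarityBase`, `LengthNormalizedTf`, and `scorePhrase` are hypothetical stand-ins, not Lucene classes or methods, and the scoring formula is an arbitrary placeholder chosen only to make the example concrete.

```java
import java.util.Locale;

/** Hypothetical stand-in for BasicStats: only the fields this sketch needs. */
final class SketchStats {
    final float avgFieldLength; // average field length in the collection
    final float totalBoost;     // query-time boost

    SketchStats(float avgFieldLength, float totalBoost) {
        this.avgFieldLength = avgFieldLength;
        this.totalBoost = totalBoost;
    }
}

/** Hypothetical stand-in mirroring the two methods a subclass must implement. */
abstract class SketchSimilarityBase {
    /** Scores one term from its statistics, frequency, and document length. */
    protected abstract float score(SketchStats stats, float freq, float docLen);

    /** Names the similarity and, preferably, its parameters. */
    @Override
    public abstract String toString();

    /** Scores a phrase as the sum of its individual term scores. */
    final float scorePhrase(SketchStats[] termStats, float freq, float docLen) {
        float sum = 0f;
        for (SketchStats stats : termStats) {
            sum += score(stats, freq, docLen);
        }
        return sum;
    }
}

/** Placeholder model: term frequency damped by relative document length. */
final class LengthNormalizedTf extends SketchSimilarityBase {
    private final float k; // assumed smoothing parameter, for illustration only

    LengthNormalizedTf(float k) {
        this.k = k;
    }

    @Override
    protected float score(SketchStats stats, float freq, float docLen) {
        float normalizedLen = docLen / stats.avgFieldLength;
        return stats.totalBoost * freq / (freq + k * normalizedLen);
    }

    @Override
    public String toString() {
        return String.format(Locale.ROOT, "LengthNormalizedTf(k=%.1f)", k);
    }
}
```

In this sketch every term of the phrase is scored with the same frequency and document length, so a two-term phrase whose terms share identical statistics scores exactly twice what a single term would.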
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.search.similarities.Similarity
Similarity.SimScorer, Similarity.SimWeight
-
-
Constructor Summary
Constructor         Description
SimilarityBase()    Sole constructor.
-
Method Summary
long computeNorm(FieldInvertState state)
    Encodes the document length in the same way as TFIDFSimilarity.
Similarity.SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats, TermStatistics... termStats)
    Compute any collection-level weight (e.g. IDF, average document length, etc.) needed for scoring a query.
boolean getDiscountOverlaps()
    Returns true if overlap tokens are discounted from the document's length.
static double log2(double x)
    Returns the base two logarithm of x.
void setDiscountOverlaps(boolean v)
    Determines whether overlap tokens (Tokens with 0 position increment) are ignored when computing norm.
Similarity.SimScorer simScorer(Similarity.SimWeight stats, AtomicReaderContext context)
    Creates a new Similarity.SimScorer to score matching documents from a segment of the inverted index.
abstract java.lang.String toString()
    Subclasses must override this method to return the name of the Similarity and preferably the values of parameters (if any) as well.
Methods inherited from class org.apache.lucene.search.similarities.Similarity
coord, queryNorm
-
-
-
-
Method Detail
-
setDiscountOverlaps
public void setDiscountOverlaps(boolean v)
Determines whether overlap tokens (Tokens with 0 position increment) are ignored when computing norm. By default this is true, meaning overlap tokens do not count when computing norms.
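The effect of this flag on the length that feeds the norm can be sketched as follows. This is a minimal, self-contained illustration under the assumption that the length is a plain token count. `OverlapLengthSketch` and `fieldLength` are hypothetical names, not part of Lucene.

```java
/** Sketch of how discounting overlaps changes the length used for a norm. */
final class OverlapLengthSketch {
    /**
     * Counts the tokens of a field, given each token's position increment.
     * A token with position increment 0 is an overlap token (for example,
     * a synonym stacked on the previous token).
     */
    static int fieldLength(int[] positionIncrements, boolean discountOverlaps) {
        int length = 0;
        for (int increment : positionIncrements) {
            boolean isOverlap = increment == 0;
            if (discountOverlaps && isOverlap) {
                continue; // overlap tokens do not count toward the length
            }
            length++;
        }
        return length;
    }
}
```

For the increments {1, 1, 0, 1}, where the third token overlaps the second, `fieldLength` returns 3 with discounting enabled and 4 with it disabled.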
-
getDiscountOverlaps
public boolean getDiscountOverlaps()
Returns true if overlap tokens are discounted from the document's length.
See Also:
setDiscountOverlaps(boolean)
-
computeWeight
public final Similarity.SimWeight computeWeight(float queryBoost, CollectionStatistics collectionStats, TermStatistics... termStats)
Description copied from class: Similarity
Compute any collection-level weight (e.g. IDF, average document length, etc.) needed for scoring a query.
Specified by:
computeWeight in class Similarity
Parameters:
queryBoost - the query-time boost.
collectionStats - collection-level statistics, such as the number of tokens in the collection.
termStats - term-level statistics, such as the document frequency of a term across the collection.
Returns:
SimWeight object with the information this Similarity needs to score a query.
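To make "collection-level weight" more concrete, here is a small sketch of the two examples named above, computed from raw counts. The class name, method names, and the particular IDF variant are assumptions for illustration. They are not claimed to be what computeWeight computes internally.

```java
/** Sketch of collection-level values that can be computed once per query. */
final class CollectionWeightSketch {
    /** Average field length: total tokens in the field divided by document count. */
    static float averageFieldLength(long totalFieldTokens, long numberOfDocuments) {
        return (float) totalFieldTokens / numberOfDocuments;
    }

    /** One classic IDF variant (an assumption here): 1 + ln(numDocs / (docFreq + 1)). */
    static float idf(long docFreq, long numberOfDocuments) {
        return (float) (1.0 + Math.log((double) numberOfDocuments / (docFreq + 1)));
    }
}
```

Both values depend only on collection and term statistics, not on any single document, which is why they can be computed once up front and reused while scoring each matching document.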
-
simScorer
public Similarity.SimScorer simScorer(Similarity.SimWeight stats, AtomicReaderContext context) throws java.io.IOException
Description copied from class: Similarity
Creates a new Similarity.SimScorer to score matching documents from a segment of the inverted index.
Specified by:
simScorer in class Similarity
Parameters:
stats - collection information from Similarity.computeWeight(float, CollectionStatistics, TermStatistics...)
context - segment of the inverted index to be scored.
Returns:
SimScorer for scoring documents across context
Throws:
java.io.IOException - if there is a low-level I/O error
-
toString
public abstract java.lang.String toString()
Subclasses must override this method to return the name of the Similarity and preferably the values of parameters (if any) as well.
Overrides:
toString in class java.lang.Object
-
computeNorm
public long computeNorm(FieldInvertState state)
Encodes the document length in the same way as TFIDFSimilarity.
Specified by:
computeNorm in class Similarity
Parameters:
state - current processing state for this field
Returns:
computed norm value
-
log2
public static double log2(double x)
Returns the base two logarithm of x.
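A base-two logarithm is commonly obtained from the natural logarithm through the change-of-base identity log2(x) = ln(x) / ln(2). The sketch below shows that approach. `Log2Sketch` is a hypothetical class, and this is not claimed to be the library's exact implementation.

```java
/** Sketch of a base-two logarithm using the change-of-base identity. */
final class Log2Sketch {
    private static final double LOG_2 = Math.log(2.0);

    /** Returns log2(x), computed as ln(x) / ln(2). */
    static double log2(double x) {
        return Math.log(x) / LOG_2;
    }
}
```

Because the result goes through floating-point division, compare it against expected values with a small tolerance rather than exact equality.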
-
-