Package org.apache.lucene.search.spell
Class NGramDistance
- java.lang.Object
  - org.apache.lucene.search.spell.NGramDistance
- All Implemented Interfaces:
StringDistance
public class NGramDistance extends java.lang.Object implements StringDistance
N-Gram version of edit distance, based on the paper by Grzegorz Kondrak, "N-gram similarity and distance". Proceedings of the Twelfth International Conference on String Processing and Information Retrieval (SPIRE 2005), pp. 115-126, Buenos Aires, Argentina, November 2005. http://www.cs.ualberta.ca/~kondrak/papers/spire05.pdf
This implementation uses the position-based optimization to compute partial matches of n-gram substrings. It adds a null-character prefix of size n-1 so that the first character is contained in the same number of n-grams as a middle character. Null-character prefix matches are discounted, so strings with no matching characters return a distance of 0.
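The effect of the null-character prefix can be illustrated with a small standalone sketch. It is not the Lucene implementation: the class `PaddedNGrams` and its methods `paddedNGrams` and `coverage` are hypothetical names introduced here only to show how prepending n-1 null characters changes the number of n-grams that contain the first character.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical helper, not part of Lucene.
final class PaddedNGrams {

    // Returns every n-gram of s after prepending n-1 null characters.
    static List<String> paddedNGrams(String s, int n) {
        StringBuilder padded = new StringBuilder();
        for (int i = 0; i < n - 1; i++) {
            padded.append('\0');
        }
        padded.append(s);

        List<String> grams = new ArrayList<>();
        for (int start = 0; start + n <= padded.length(); start++) {
            grams.add(padded.substring(start, start + n));
        }
        return grams;
    }

    // Counts how many padded n-grams contain the character at index pos of s.
    static int coverage(String s, int n, int pos) {
        int paddedPos = pos + n - 1;          // index of that character after padding
        int paddedLength = s.length() + n - 1;
        int count = 0;
        for (int start = 0; start + n <= paddedLength; start++) {
            if (start <= paddedPos && paddedPos < start + n) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // With n = 3, the first and second characters of "abcd" are each covered by 3 n-grams.
        System.out.println(coverage("abcd", 3, 0));
        System.out.println(coverage("abcd", 3, 1));
    }
}
```

Without the prefix, the first character of a string would appear in only one n-gram, so a mismatch there would weigh less than a mismatch in the middle of the string.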
-
Constructor Summary
Constructors
Constructor Description
NGramDistance()
Creates an N-Gram distance measure using n-grams of size 2.
NGramDistance(int size)
Creates an N-Gram distance measure using n-grams of the specified size.
-
Method Summary
Modifier and Type Method Description
boolean
equals(java.lang.Object obj)
float
getDistance(java.lang.String source, java.lang.String target)
Returns a float between 0 and 1 based on how similar the specified strings are to one another.
int
hashCode()
java.lang.String
toString()
-
Constructor Detail
-
NGramDistance
public NGramDistance(int size)
Creates an N-Gram distance measure using n-grams of the specified size.
- Parameters:
size - The size of the n-gram to be used to compute the string distance.
-
NGramDistance
public NGramDistance()
Creates an N-Gram distance measure using n-grams of size 2.
-
Method Detail
-
getDistance
public float getDistance(java.lang.String source, java.lang.String target)
Description copied from interface: StringDistance
Returns a float between 0 and 1 based on how similar the specified strings are to one another. A value of 1 means the specified strings are identical, and 0 means the strings are maximally different.
- Specified by:
getDistance in interface StringDistance
- Parameters:
source - The first string.
target - The second string.
- Returns:
a float between 0 and 1 based on how similar the specified strings are to one another.
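Because higher return values mean more similar strings, a caller typically keeps candidates whose score is at or above some threshold. The following is a minimal sketch of that convention. The `SuggestionFilter` class, its `Scorer` interface, and the `closeMatches` method are hypothetical stand-ins introduced here so the example runs without Lucene on the classpath; `Scorer` merely mirrors the `getDistance(String, String)` signature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical helper, not part of Lucene.
final class SuggestionFilter {

    // Stand-in with the same shape as getDistance(String, String):
    // returns a score in [0, 1], where 1 means identical.
    interface Scorer {
        float getDistance(String source, String target);
    }

    // Keeps the candidates whose score against word is at least minScore.
    static List<String> closeMatches(Scorer scorer, String word,
                                     List<String> candidates, float minScore) {
        List<String> kept = new ArrayList<>();
        for (String candidate : candidates) {
            if (scorer.getDistance(word, candidate) >= minScore) {
                kept.add(candidate);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Toy scorer (an assumption for the demo): 1 for equal strings, 0 otherwise.
        Scorer exact = (a, b) -> a.equals(b) ? 1.0f : 0.0f;
        List<String> candidates = Arrays.asList("lucene", "lucent");
        System.out.println(closeMatches(exact, "lucene", candidates, 0.5f)); // prints [lucene]
    }
}
```

With Lucene available, a method reference such as `new NGramDistance()::getDistance` should fit the `Scorer` shape, since it takes two strings and returns a `float`; the threshold value itself is application-specific.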
-
hashCode
public int hashCode()
- Overrides:
hashCode in class java.lang.Object
-
equals
public boolean equals(java.lang.Object obj)
- Overrides:
equals in class java.lang.Object
-
toString
public java.lang.String toString()
- Overrides:
toString in class java.lang.Object