Class LuceneLevenshteinDistance

  • All Implemented Interfaces:
    StringDistance

    public final class LuceneLevenshteinDistance
    extends java.lang.Object
    implements StringDistance
    Damerau-Levenshtein (optimal string alignment) distance, implemented consistently with Lucene's FuzzyTermsEnum when the transpositions option is enabled. Notes:
    • This metric treats full Unicode code points as characters.
    • This metric scales raw edit distances into a floating point score based upon the shorter of the two terms.
    • Transpositions of two adjacent code points are treated as primitive edits.
    • Edits are applied in parallel: for example, "ab" and "bca" have distance 3.
    NOTE: this class is not particularly efficient. It is only intended for merging results from multiple DirectSpellCheckers.
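The notes above can be illustrated with a minimal, self-contained sketch of the raw optimal string alignment distance, counted over code points. This is not the Lucene source: the class name OsaDistanceSketch and the method osaDistance are hypothetical, and the sketch returns the unscaled edit count rather than the score that getDistance returns.

```java
final class OsaDistanceSketch {

    /**
     * Raw optimal-string-alignment distance between two strings,
     * counted over Unicode code points (hypothetical helper, not Lucene API).
     */
    static int osaDistance(String a, String b) {
        // Compare code points, so a surrogate pair counts as one character.
        int[] s = a.codePoints().toArray();
        int[] t = b.codePoints().toArray();
        int n = s.length;
        int m = t.length;

        // d[i][j] = distance between the first i code points of s
        // and the first j code points of t.
        int[][] d = new int[n + 1][m + 1];
        for (int i = 0; i <= n; i++) d[i][0] = i;
        for (int j = 0; j <= m; j++) d[0][j] = j;

        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                int cost = (s[i - 1] == t[j - 1]) ? 0 : 1;
                int best = Math.min(
                        Math.min(d[i - 1][j] + 1,      // deletion
                                 d[i][j - 1] + 1),     // insertion
                        d[i - 1][j - 1] + cost);       // match or substitution
                // Transposition of two adjacent code points is one primitive edit.
                if (i > 1 && j > 1 && s[i - 1] == t[j - 2] && s[i - 2] == t[j - 1]) {
                    best = Math.min(best, d[i - 2][j - 2] + 1);
                }
                d[i][j] = best;
            }
        }
        return d[n][m];
    }

    public static void main(String[] args) {
        System.out.println(osaDistance("ab", "ba"));             // prints 1
        System.out.println(osaDistance("ab", "bca"));            // prints 3
        System.out.println(osaDistance("a\uD83D\uDE00", "a"));   // prints 1
    }
}
```

In this sketch, "ab" and "bca" come out at distance 3 because a transposed pair cannot be edited again afterwards. The last call shows that a character encoded as a surrogate pair counts as a single edit.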
    • Constructor Summary

      Constructors 
      Constructor Description
      LuceneLevenshteinDistance()
      Creates a new comparator, mimicking the behavior of Lucene's internal edit distance.
    • Method Summary

      Modifier and Type Method Description
      float getDistance(java.lang.String target, java.lang.String other)
      Returns a float between 0 and 1 based on how similar the specified strings are to one another.
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • LuceneLevenshteinDistance

        public LuceneLevenshteinDistance()
        Creates a new comparator, mimicking the behavior of Lucene's internal edit distance.
    • Method Detail

      • getDistance

        public float getDistance(java.lang.String target,
                                 java.lang.String other)
        Description copied from interface: StringDistance
        Returns a float between 0 and 1 based on how similar the specified strings are to one another. A value of 1 means the specified strings are identical, and 0 means the strings are maximally different.
        Specified by:
        getDistance in interface StringDistance
        Parameters:
        target - The first string.
        other - The second string.
        Returns:
        a float between 0 and 1 based on how similar the specified strings are to one another.
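
As a rough illustration of how a raw edit count could map onto the returned score, the sketch below assumes the formula 1 - distance / min(length of target, length of other), with lengths counted in code points. That formula is an assumption: this page states only that the score is scaled by the shorter of the two terms. The class name ScaledScoreSketch and the method scaledScore are hypothetical, and empty strings are left out of scope.

```java
final class ScaledScoreSketch {

    /**
     * Hypothetical scaling of a raw edit distance into a similarity score.
     * ASSUMPTION: score = 1 - rawDistance / min(lengthA, lengthB),
     * where both lengths are code point counts and both are positive.
     */
    static float scaledScore(int rawDistance, int lengthA, int lengthB) {
        if (lengthA <= 0 || lengthB <= 0) {
            throw new IllegalArgumentException("this sketch requires non-empty terms");
        }
        int shorter = Math.min(lengthA, lengthB);
        return 1.0f - ((float) rawDistance / shorter);
    }

    public static void main(String[] args) {
        // No edits: the terms are identical, so the score is 1.
        System.out.println(scaledScore(0, 6, 6));   // prints 1.0
        // Three edits against a six-code-point term.
        System.out.println(scaledScore(3, 6, 6));   // prints 0.5
    }
}
```

Under this assumption, the same number of edits lowers the score more for short terms than for long ones, since the divisor is the shorter term's length.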