Class JaroWinklerSimilarity
- java.lang.Object
  - org.apache.commons.text.similarity.JaroWinklerSimilarity
- All Implemented Interfaces:
SimilarityScore<java.lang.Double>
public class JaroWinklerSimilarity extends java.lang.Object implements SimilarityScore<java.lang.Double>
A similarity algorithm indicating the percentage of matched characters between two character sequences. The Jaro measure is the weighted sum of the percentage of matched characters from each sequence and the transposed characters. Winkler increased this measure for matching initial characters.
This implementation is based on the Jaro Winkler similarity algorithm from http://en.wikipedia.org/wiki/Jaro%E2%80%93Winkler_distance.
This code has been adapted from Apache Commons Lang 3.3.
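The description above can be made concrete with a small, self-contained sketch. It is not the Commons Text source: the class name JaroWinklerSketch, the method name similarity, and the constants (scaling factor 0.1, prefix cap of 4, boost applied only when the Jaro score is at least 0.7) are assumptions taken from the usual formulation of the algorithm, and edge cases may differ from the library.

```java
/** Illustrative sketch only; a simplified Jaro-Winkler, not the Commons Text implementation. */
final class JaroWinklerSketch {

    // Assumed constants from the common formulation of Winkler's adjustment.
    private static final double SCALING = 0.1;
    private static final int MAX_PREFIX = 4;
    private static final double BOOST_THRESHOLD = 0.7;

    static double similarity(CharSequence left, CharSequence right) {
        if (left == null || right == null) {
            throw new IllegalArgumentException("CharSequences must not be null");
        }
        String a = left.toString();
        String b = right.toString();
        if (a.equals(b)) {
            return 1.0; // also covers two empty sequences
        }
        if (a.isEmpty() || b.isEmpty()) {
            return 0.0;
        }

        // Characters match only if they are equal and lie within this distance of each other.
        int window = Math.max(Math.max(a.length(), b.length()) / 2 - 1, 0);
        boolean[] aMatched = new boolean[a.length()];
        boolean[] bMatched = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!bMatched[j] && a.charAt(i) == b.charAt(j)) {
                    aMatched[i] = true;
                    bMatched[j] = true;
                    matches++;
                    break;
                }
            }
        }
        if (matches == 0) {
            return 0.0;
        }

        // Walk both sets of matched characters in order and count positions that disagree.
        int outOfOrder = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!aMatched[i]) {
                continue;
            }
            while (!bMatched[j]) {
                j++;
            }
            if (a.charAt(i) != b.charAt(j)) {
                outOfOrder++;
            }
            j++;
        }

        double m = matches;
        double t = outOfOrder / 2.0; // transpositions
        double jaro = (m / a.length() + m / b.length() + (m - t) / m) / 3.0;

        // Winkler's adjustment: reward a shared prefix of up to MAX_PREFIX characters.
        int prefix = 0;
        int limit = Math.min(MAX_PREFIX, Math.min(a.length(), b.length()));
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return jaro < BOOST_THRESHOLD ? jaro : jaro + SCALING * prefix * (1.0 - jaro);
    }

    public static void main(String[] args) {
        System.out.printf("%.2f%n", similarity("hello", "hallo"));
        System.out.printf("%.2f%n", similarity("fly", "ant"));
    }
}
```

In this sketch, "hello" and "hallo" share four matched characters with no transpositions and a one-character common prefix, so the Jaro score of about 0.867 is raised to 0.88 by the prefix adjustment.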
- Since:
- 1.7
Constructor Summary
Constructors
Constructor                Description
JaroWinklerSimilarity()
-
Method Summary
Modifier and Type    Method and Description
java.lang.Double     apply(java.lang.CharSequence left, java.lang.CharSequence right)
                     Computes the Jaro Winkler Similarity between two character sequences.
Method Detail
-
apply
public java.lang.Double apply(java.lang.CharSequence left, java.lang.CharSequence right)
Computes the Jaro Winkler Similarity between two character sequences.
sim.apply(null, null)          = IllegalArgumentException
sim.apply("foo", null)         = IllegalArgumentException
sim.apply(null, "foo")         = IllegalArgumentException
sim.apply("", "")              = 1.0
sim.apply("foo", "foo")        = 1.0
sim.apply("foo", "foo ")       = 0.94
sim.apply("foo", "foo  ")      = 0.91
sim.apply("foo", " foo ")      = 0.87
sim.apply("foo", "  foo")      = 0.51
sim.apply("", "a")             = 0.0
sim.apply("aaapppp", "")       = 0.0
sim.apply("frog", "fog")       = 0.93
sim.apply("fly", "ant")        = 0.0
sim.apply("elephant", "hippo") = 0.44
sim.apply("hippo", "elephant") = 0.44
sim.apply("hippo", "zzzzzzzz") = 0.0
sim.apply("hello", "hallo")    = 0.88
sim.apply("ABC Corporation", "ABC Corp") = 0.91
sim.apply("D N H Enterprises Inc", "D & H Enterprises, Inc.") = 0.95
sim.apply("My Gym Children's Fitness Center", "My Gym. Childrens Fitness") = 0.92
sim.apply("PENNSYLVANIA", "PENNCISYLVNIA") = 0.88
- Specified by:
apply in interface SimilarityScore<java.lang.Double>
- Parameters:
left - the first CharSequence, must not be null
right - the second CharSequence, must not be null
- Returns:
- result similarity
- Throws:
java.lang.IllegalArgumentException - if either CharSequence input is null