Class Percentile
- java.lang.Object
- org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic
- org.apache.commons.math.stat.descriptive.rank.Percentile
- All Implemented Interfaces:
java.io.Serializable, UnivariateStatistic
- Direct Known Subclasses:
Median
public class Percentile extends AbstractUnivariateStatistic implements java.io.Serializable
Provides percentile computation.
There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:
- Let n be the length of the (sorted) array and 0 < p <= 100 be the desired percentile.
- If n = 1 return the unique array element (regardless of the value of p); otherwise
- Compute the estimated percentile position pos = p * (n + 1) / 100 and the difference, d, between pos and floor(pos) (i.e. the fractional part of pos). If pos >= n return the largest element in the array; otherwise
- Let lower be the element in position floor(pos) in the array and let upper be the next element in the array. Return lower + d * (upper - lower)
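The steps above can be sketched in plain Java as follows. This is a minimal standalone illustration, not the library's source: the class name PercentileSketch and the method estimate are hypothetical, the array is fully sorted with Arrays.sort for clarity, positions are read as 1-based, and returning the smallest element when pos < 1 is an assumption, since the steps above do not cover that case.

```java
import java.util.Arrays;

/** Standalone sketch of the estimation steps; hypothetical names, not the library's source. */
public class PercentileSketch {

    /** Estimates the p-th percentile of values, for 0 < p <= 100. */
    public static double estimate(double[] values, double p) {
        if (values == null || !(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("need non-null values and 0 < p <= 100");
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;                // empty input, as documented for evaluate
        }
        if (n == 1) {
            return values[0];                 // unique element, whatever p is
        }
        double[] sorted = values.clone();     // leave the caller's array untouched
        Arrays.sort(sorted);                  // assumption: full sort, for clarity only

        double pos = p * (n + 1) / 100;       // estimated position, read as 1-based
        if (pos < 1) {
            return sorted[0];                 // assumption: clamp to the smallest element
        }
        if (pos >= n) {
            return sorted[n - 1];             // largest element
        }
        int k = (int) Math.floor(pos);
        double d = pos - k;                   // fractional part of pos
        double lower = sorted[k - 1];         // element in position floor(pos)
        double upper = sorted[k];             // next element
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        System.out.println(estimate(new double[] {1, 2, 3, 4}, 50)); // 2.5
        System.out.println(estimate(new double[] {1, 2, 3, 4}, 25)); // 1.25
    }
}
```

For {1, 2, 3, 4} and p = 50, pos = 50 * 5 / 100 = 2.5, so lower = 2, upper = 3, d = 0.5, and the estimate is 2.5.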
To compute percentiles, the data must be at least partially ordered. Input arrays are copied and recursively partitioned using an ordering definition. The ordering used by Arrays.sort(double[]) is the one determined by Double.compareTo(Double). This ordering makes Double.NaN larger than any other value (including Double.POSITIVE_INFINITY). Therefore, for example, the median (50th percentile) of {0, 1, 2, 3, 4, Double.NaN} evaluates to 2.5.
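The ordering can be checked directly with the JDK. The following small sketch (the class name NaNOrderingDemo and the helper sortedCopy are hypothetical) sorts an array containing NaN and positive infinity and compares the two values with Double.compareTo.

```java
import java.util.Arrays;

/** Demonstrates where Arrays.sort(double[]) places NaN; hypothetical helper names. */
public class NaNOrderingDemo {

    /** Returns a sorted copy, leaving the input untouched. */
    public static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    public static void main(String[] args) {
        double[] sorted = sortedCopy(
            new double[] {Double.NaN, 4, Double.POSITIVE_INFINITY, 0});
        // NaN sorts after every other value, including +Infinity
        System.out.println(Arrays.toString(sorted)); // [0.0, 4.0, Infinity, NaN]

        Double nan = Double.NaN;
        System.out.println(nan.compareTo(Double.POSITIVE_INFINITY) > 0); // true
    }
}
```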
Since percentile estimation usually involves interpolation between array elements, arrays containing NaN or infinite values will often result in NaN or infinite values returned.
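The effect follows from IEEE 754 arithmetic applied to the interpolation formula lower + d * (upper - lower). The sketch below (class and method names are hypothetical) evaluates that formula on a few non-finite inputs.

```java
/** Shows how non-finite neighbours propagate through the interpolation formula. */
public class NonFiniteInterpolation {

    /** The interpolation step from the class description. */
    public static double interpolate(double lower, double upper, double d) {
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        System.out.println(interpolate(3, inf, 0.5));        // Infinity
        System.out.println(interpolate(-inf, inf, 0.5));     // NaN (-Infinity + Infinity)
        System.out.println(interpolate(inf, inf, 0.5));      // NaN (Infinity - Infinity)
        System.out.println(interpolate(4, Double.NaN, 0.5)); // NaN
        System.out.println(interpolate(3, inf, 0.0));        // NaN (0 * Infinity)
    }
}
```

The last case shows that a non-finite upper neighbour can affect the result even when the fractional part d is zero.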
Since 2.2, the Percentile implementation uses only selection instead of complete sorting, and caches the selection algorithm state between calls to the various evaluate methods when several percentiles are to be computed on the same data. This greatly improves efficiency, both for single-percentile and multiple-percentile computations. However, it also requires checking that the data passed to one call to evaluate is the same as the data associated with the algorithm state cached by previous calls. By default, Percentile does this by checking the array reference itself and a checksum of its content. If the user already knows that evaluate is called on an immutable array, the checking time can be saved by calling the evaluate methods that do not
Note that this implementation is not synchronized. If multiple threads access an instance of this class concurrently, and at least one of the threads invokes the increment() or clear() method, it must be synchronized externally.
- See Also:
- Serialized Form
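The cache check mentioned in the class description (array reference plus a checksum of the content) can be pictured with the sketch below. Everything in it is an assumption made for illustration: the class CachedSortedView, the use of Arrays.hashCode as the checksum, and a fully sorted copy standing in for the cached selection state. The library's actual checksum and cached state are not described here.

```java
import java.util.Arrays;

/** Hypothetical cache guard: reuse cached work only for the same, unmodified array. */
public class CachedSortedView {
    private double[] source;   // reference of the array the cache was built from
    private int checksum;      // content checksum at the time the cache was built
    private double[] sorted;   // cached work product (stand-in for selection state)
    private int rebuilds;      // number of cache rebuilds, kept for demonstration

    /** Returns the cached sorted copy, rebuilding it if the input changed. */
    public double[] sortedFor(double[] values) {
        int sum = Arrays.hashCode(values);        // assumption: hashCode as checksum
        if (values != source || sum != checksum) {
            source = values;
            checksum = sum;
            sorted = values.clone();
            Arrays.sort(sorted);
            rebuilds++;
        }
        return sorted;
    }

    public int rebuildCount() {
        return rebuilds;
    }

    public static void main(String[] args) {
        CachedSortedView cache = new CachedSortedView();
        double[] data = {3, 1, 2};
        cache.sortedFor(data);
        cache.sortedFor(data);                    // same reference, same content: reused
        System.out.println(cache.rebuildCount()); // 1
        data[0] = 99;                             // in-place change is caught by the checksum
        cache.sortedFor(data);
        System.out.println(cache.rebuildCount()); // 2
    }
}
```

The reference comparison alone would miss the in-place modification, which is why a content checksum is needed unless the array is known to be immutable.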
-
-
Constructor Summary
Constructors
Constructor
Description
Percentile()
Constructs a Percentile with a default quantile
value of 50.0.
Percentile(double p)
Constructs a Percentile with the specific quantile value.
Percentile(Percentile original)
Copy constructor, creates a new Percentile
identical
to the original
-
Method Summary
Modifier and Type
Method
Description
Percentile
copy()
Returns a copy of the statistic with the same internal state.
static void
copy(Percentile source,
Percentile dest)
Copies source to dest.
double
evaluate(double p)
Returns the result of evaluating the statistic over the stored data.
double
evaluate(double[] values,
double p)
Returns an estimate of the p-th percentile of the values in the values array.
double
evaluate(double[] values,
int start,
int length)
Returns an estimate of the quantile-th percentile of the designated values in the values array.
double
evaluate(double[] values,
int begin,
int length,
double p)
Returns an estimate of the p-th percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
double
getQuantile()
Returns the value of the quantile field (determines what percentile is
computed when evaluate() is called with no quantile argument).
void
setData(double[] values)
Set the data array.
void
setData(double[] values,
int begin,
int length)
Set the data array.
void
setQuantile(double p)
Sets the value of the quantile field (determines what percentile is
computed when evaluate() is called with no quantile argument).
-
Methods inherited from class org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic
evaluate, evaluate, getData
-
-
Constructor Detail
-
Percentile
public Percentile()
Constructs a Percentile with a default quantile
value of 50.0.
-
Percentile
public Percentile(double p)
Constructs a Percentile with the specific quantile value.
- Parameters:
p
- the quantile
- Throws:
java.lang.IllegalArgumentException
- if p is not greater than 0 and less
than or equal to 100
-
Percentile
public Percentile(Percentile original)
Copy constructor, creates a new Percentile
identical
to the original
- Parameters:
original
- the Percentile
instance to copy
-
Method Detail
-
setData
public void setData(double[] values)
Set the data array.
The stored value is a copy of the parameter array, not the array itself
- Overrides:
setData
in class AbstractUnivariateStatistic
- Parameters:
values
- data array to store (may be null to remove stored data)
- See Also:
AbstractUnivariateStatistic.evaluate()
-
setData
public void setData(double[] values,
int begin,
int length)
Set the data array.
- Overrides:
setData
in class AbstractUnivariateStatistic
- Parameters:
values
- data array to store
begin
- the index of the first element to include
length
- the number of elements to include
- See Also:
AbstractUnivariateStatistic.evaluate()
-
evaluate
public double evaluate(double p)
Returns the result of evaluating the statistic over the stored data.
The stored array is the one which was set by previous calls to
- Parameters:
p
- the percentile value to compute
- Returns:
- the value of the statistic applied to the stored data
-
evaluate
public double evaluate(double[] values,
double p)
Returns an estimate of the p-th percentile of the values in the values array.
Calls to this method do not modify the internal quantile state of this statistic.
- Returns Double.NaN if values has length 0
- Returns (for any value of p) values[0] if values has length 1
- Throws IllegalArgumentException if values is null or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation algorithm used.
- Parameters:
values
- input array of values
p
- the percentile value to compute
- Returns:
- the percentile value or Double.NaN if the array is empty
- Throws:
java.lang.IllegalArgumentException
- if values is null or p is invalid
-
evaluate
public double evaluate(double[] values,
int start,
int length)
Returns an estimate of the quantile-th percentile of the designated values in the values array. The quantile estimated is determined by the quantile property.
- Returns Double.NaN if length = 0
- Returns (for any value of quantile) values[start] if length = 1
- Throws IllegalArgumentException if values is null, or start or length is invalid
See Percentile for a description of the percentile estimation algorithm used.
- Specified by:
evaluate
in interface UnivariateStatistic
- Specified by:
evaluate
in class AbstractUnivariateStatistic
- Parameters:
values
- the input array
start
- index of the first array element to include
length
- the number of elements to include
- Returns:
- the percentile value
- Throws:
java.lang.IllegalArgumentException
- if the parameters are not valid
-
evaluate
public double evaluate(double[] values,
int begin,
int length,
double p)
Returns an estimate of the p-th percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
Calls to this method do not modify the internal quantile state of this statistic.
- Returns Double.NaN if length = 0
- Returns (for any value of p) values[begin] if length = 1
- Throws IllegalArgumentException if values is null, begin or length is invalid, or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)
See Percentile for a description of the percentile estimation algorithm used.
- Parameters:
values
- array of input values
p
- the percentile to compute
begin
- the first (0-based) element to include in the computation
length
- the number of array elements to include
- Returns:
- the percentile value
- Throws:
java.lang.IllegalArgumentException
- if the parameters are not valid or the
input array is null
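The argument contract listed for this method can be sketched as a standalone function. This is an illustration under stated assumptions, not the library's source: the class SubrangePercentileSketch and its estimate method are hypothetical, the subrange is copied and fully sorted rather than partially selected, the pos < 1 case is assumed to return the smallest element, and the range check shown is only one plausible reading of "begin or length is invalid".

```java
import java.util.Arrays;

/** Standalone sketch of the documented argument contract; hypothetical names. */
public class SubrangePercentileSketch {

    public static double estimate(double[] values, int begin, int length, double p) {
        if (values == null) {
            throw new IllegalArgumentException("values is null");
        }
        // assumption: the range must lie inside the array (written to avoid int overflow)
        if (begin < 0 || length < 0 || begin > values.length - length) {
            throw new IllegalArgumentException("begin or length is invalid");
        }
        if (!(p > 0 && p <= 100)) {   // this sketch also rejects NaN
            throw new IllegalArgumentException("p must satisfy 0 < p <= 100");
        }
        if (length == 0) {
            return Double.NaN;
        }
        if (length == 1) {
            return values[begin];     // for any value of p
        }
        double[] work = Arrays.copyOfRange(values, begin, begin + length);
        Arrays.sort(work);            // assumption: full sort, for clarity only

        double pos = p * (length + 1) / 100;
        if (pos < 1) {
            return work[0];           // assumption: clamp to the smallest element
        }
        if (pos >= length) {
            return work[length - 1];
        }
        int k = (int) Math.floor(pos);
        double d = pos - k;
        return work[k - 1] + d * (work[k] - work[k - 1]);
    }

    public static void main(String[] args) {
        double[] data = {9, 1, 2, 3, 4, 9};
        // Only the four elements starting at index 1 take part: {1, 2, 3, 4}
        System.out.println(estimate(data, 1, 4, 50)); // 2.5
    }
}
```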
-
getQuantile
public double getQuantile()
Returns the value of the quantile field (determines what percentile is
computed when evaluate() is called with no quantile argument).
- Returns:
- quantile
-
setQuantile
public void setQuantile(double p)
Sets the value of the quantile field (determines what percentile is
computed when evaluate() is called with no quantile argument).
- Parameters:
p
- the quantile value to set; must satisfy 0 < p <= 100
- Throws:
java.lang.IllegalArgumentException
- if p is not greater than 0 and less
than or equal to 100
-
copy
public Percentile copy()
Returns a copy of the statistic with the same internal state.
- Specified by:
copy
in interface UnivariateStatistic
- Specified by:
copy
in class AbstractUnivariateStatistic
- Returns:
- a copy of the statistic
-
copy
public static void copy(Percentile source,
Percentile dest)
Copies source to dest.
Neither source nor dest can be null.
- Parameters:
source
- Percentile to copy
dest
- Percentile to copy to
- Throws:
java.lang.NullPointerException
- if either source or dest is null