Class BloomFilter<T>

  • Type Parameters:
    T - the type of instances that the BloomFilter accepts
    All Implemented Interfaces:
    Predicate<T>, java.io.Serializable

    @Beta
    public final class BloomFilter<T>
    extends java.lang.Object
    implements Predicate<T>, java.io.Serializable
    A Bloom filter for instances of T. A Bloom filter offers an approximate containment test with one-sided error: if it claims that an element is contained in it, this might be in error, but if it claims that an element is not contained in it, then this is definitely true.

    If you are unfamiliar with Bloom filters, this nice tutorial may help you understand how they work.

    The false positive probability (FPP) of a bloom filter is defined as the probability that mightContain(Object) will erroneously return true for an object that has not actually been put in the BloomFilter.

    Since:
    11.0
    See Also:
    Serialized Form
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      boolean apply​(T input)
      Equivalent to mightContain(T); provided only to satisfy the Predicate interface.
      BloomFilter<T> copy()
      Creates a new BloomFilter that's a copy of this instance.
      static <T> BloomFilter<T> create​(Funnel<T> funnel, int expectedInsertions)
      Creates a BloomFilter with the expected number of insertions and a default expected false positive probability of 3%.
      static <T> BloomFilter<T> create​(Funnel<T> funnel, int expectedInsertions, double fpp)
      Creates a BloomFilter with the expected number of insertions and expected false positive probability.
      boolean equals​(java.lang.Object object)
      Indicates whether another object is equal to this predicate.
      double expectedFpp()
      Returns the probability that mightContain(Object) will erroneously return true for an object that has not actually been put in the BloomFilter.
      int hashCode()  
      boolean isCompatible​(BloomFilter<T> that)
      Determines whether a given bloom filter is compatible with this bloom filter.
      boolean mightContain​(T object)
      Returns true if the element might have been put in this Bloom filter, false if this is definitely not the case.
      boolean put​(T object)
      Puts an element into this BloomFilter.
      void putAll​(BloomFilter<T> that)
      Combines this bloom filter with another bloom filter by performing a bitwise OR of the underlying data.
      • Methods inherited from class java.lang.Object

        getClass, notify, notifyAll, toString, wait, wait, wait
    • Method Detail

      • copy

        public BloomFilter<T> copy()
        Creates a new BloomFilter that's a copy of this instance. The new instance is equal to this instance but shares no mutable state.
        Since:
        12.0
      • mightContain

        public boolean mightContain​(T object)
        Returns true if the element might have been put in this Bloom filter, false if this is definitely not the case.
      • put

        public boolean put​(T object)
        Puts an element into this BloomFilter. Ensures that subsequent invocations of mightContain(Object) with the same element will always return true.
        Returns:
        true if the bloom filter's bits changed as a result of this operation. If the bits changed, this is definitely the first time object has been added to the filter. If the bits haven't changed, this might be the first time object has been added to the filter. Note that put(t) always returns the opposite result to what mightContain(t) would have returned at the time it is called."
        Since:
        12.0 (present in 11.0 with void return type})
      • expectedFpp

        public double expectedFpp()
        Returns the probability that mightContain(Object) will erroneously return true for an object that has not actually been put in the BloomFilter.

        Ideally, this number should be close to the fpp parameter passed in create(Funnel, int, double), or smaller. If it is significantly higher, it is usually the case that too many elements (more than expected) have been put in the BloomFilter, degenerating it.

        Since:
        14.0 (since 11.0 as expectedFalsePositiveProbability())
      • isCompatible

        public boolean isCompatible​(BloomFilter<T> that)
        Determines whether a given bloom filter is compatible with this bloom filter. For two bloom filters to be compatible, they must:
        • not be the same instance
        • have the same number of hash functions
        • have the same bit size
        • have the same strategy
        • have equal funnels
          Parameters:
          that - The bloom filter to check for compatibility.
          Since:
          15.0
        • putAll

          public void putAll​(BloomFilter<T> that)
          Combines this bloom filter with another bloom filter by performing a bitwise OR of the underlying data. The mutations happen to this instance. Callers must ensure the bloom filters are appropriately sized to avoid saturating them.
          Parameters:
          that - The bloom filter to combine this bloom filter with. It is not mutated.
          Throws:
          java.lang.IllegalArgumentException - if isCompatible(that) == false
          Since:
          15.0
        • equals

          public boolean equals​(@Nullable
                                java.lang.Object object)
          Description copied from interface: Predicate
          Indicates whether another object is equal to this predicate.

          Most implementations will have no reason to override the behavior of Object.equals(java.lang.Object). However, an implementation may also choose to return true whenever object is a Predicate that it considers interchangeable with this one. "Interchangeable" typically means that this.apply(t) == that.apply(t) for all t of type T). Note that a false result from this method does not imply that the predicates are known not to be interchangeable.

          Specified by:
          equals in interface Predicate<T>
          Overrides:
          equals in class java.lang.Object
        • hashCode

          public int hashCode()
          Overrides:
          hashCode in class java.lang.Object
        • create

          public static <T> BloomFilter<T> create​(Funnel<T> funnel,
                                                  int expectedInsertions,
                                                  double fpp)
          Creates a BloomFilter with the expected number of insertions and expected false positive probability.

          Note that overflowing a BloomFilter with significantly more elements than specified, will result in its saturation, and a sharp deterioration of its false positive probability.

          The constructed BloomFilter<T> will be serializable if the provided Funnel<T> is.

          It is recommended that the funnel be implemented as a Java enum. This has the benefit of ensuring proper serialization and deserialization, which is important since equals(java.lang.Object) also relies on object identity of funnels.

          Parameters:
          funnel - the funnel of T's that the constructed BloomFilter<T> will use
          expectedInsertions - the number of expected insertions to the constructed BloomFilter<T>; must be positive
          fpp - the desired false positive probability (must be positive and less than 1.0)
          Returns:
          a BloomFilter
        • create

          public static <T> BloomFilter<T> create​(Funnel<T> funnel,
                                                  int expectedInsertions)
          Creates a BloomFilter with the expected number of insertions and a default expected false positive probability of 3%.

          Note that overflowing a BloomFilter with significantly more elements than specified, will result in its saturation, and a sharp deterioration of its false positive probability.

          The constructed BloomFilter<T> will be serializable if the provided Funnel<T> is.

          Parameters:
          funnel - the funnel of T's that the constructed BloomFilter<T> will use
          expectedInsertions - the number of expected insertions to the constructed BloomFilter<T>; must be positive
          Returns:
          a BloomFilter