Class HighFrequencyDictionary

  • All Implemented Interfaces:
    Dictionary

    public class HighFrequencyDictionary
    extends java.lang.Object
    implements Dictionary
    HighFrequencyDictionary: terms taken from the given field of a Lucene index whose document frequency meets a given threshold. The threshold is a value in [0..1] representing the minimum fraction of the total number of documents in which a term must appear. Based on LuceneDictionary.
    • Constructor Summary

      Constructors 
      Constructor Description
      HighFrequencyDictionary(IndexReader reader, java.lang.String field, float thresh)
      Creates a new Dictionary, pulling source terms from the specified field in the provided reader.
    • Constructor Detail

      • HighFrequencyDictionary

        public HighFrequencyDictionary(IndexReader reader,
                                       java.lang.String field,
                                       float thresh)
        Creates a new Dictionary, pulling source terms from the specified field in the provided reader.

        Terms that appear in fewer than thresh (a fraction in [0..1], not a percentage) of the documents are excluded.
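
        The cutoff described above can be sketched without any Lucene dependency. The class and method names below (ThresholdSketch, frequentTerms) are hypothetical and exist only for illustration. The sketch also assumes that the fraction is converted to a whole number of documents by truncation and that a term is kept when its document frequency is at least that number; the page itself does not state the exact rounding or whether the comparison is inclusive.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: ThresholdSketch and frequentTerms are hypothetical names,
// not part of the Lucene API.
class ThresholdSketch {

    // Returns the terms whose document frequency reaches the cutoff.
    // Assumption: the cutoff is (int) (thresh * numDocs) and the test is >=.
    static List<String> frequentTerms(Map<String, Integer> docFreqs, int numDocs, float thresh) {
        int minNumDocs = (int) (thresh * (float) numDocs);
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : docFreqs.entrySet()) {
            if (entry.getValue() >= minNumDocs) {
                kept.add(entry.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Term -> number of documents containing it, in an index of 10 documents.
        Map<String, Integer> docFreqs = new LinkedHashMap<>();
        docFreqs.put("lucene", 8);
        docFreqs.put("index", 5);
        docFreqs.put("typo", 4);
        docFreqs.put("rare", 1);

        // thresh = 0.5 means "at least half of the documents", i.e. 5 of 10.
        System.out.println(frequentTerms(docFreqs, 10, 0.5f)); // prints [lucene, index]
    }
}
```

        Under these assumptions, a thresh of 0 keeps every term, and a thresh of 1 keeps only terms that occur in every document.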

    • Method Detail

      • getEntryIterator

        public final InputIterator getEntryIterator()
                                             throws java.io.IOException
        Description copied from interface: Dictionary
        Returns an iterator over all the entries
        Specified by:
        getEntryIterator in interface Dictionary
        Returns:
        an InputIterator over the entries
        Throws:
        java.io.IOException