Class SimpleFragmenter

  • All Implemented Interfaces:
    Fragmenter

    public class SimpleFragmenter
    extends java.lang.Object
    implements Fragmenter
    A Fragmenter implementation that breaks text into same-size fragments without regard for sentence boundaries.
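    The following dependency-free sketch illustrates what size-based fragmenting without regard for sentence boundaries can look like. It does not use the library: the class FixedSizeSplitSketch, its split method, the whitespace tokenization, and the rule "start a new fragment at the first token whose end offset reaches the next multiple of fragmentSize" are all assumptions made for illustration, not something this page confirms about SimpleFragmenter.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of the documented API.
class FixedSizeSplitSketch {

    /**
     * Splits text into fragments of roughly fragmentSize characters.
     * Assumed rule: a new fragment begins at the first whitespace-delimited
     * token whose end offset reaches the next multiple of fragmentSize.
     * Sentence punctuation is ignored entirely.
     */
    static List<String> split(String text, int fragmentSize) {
        List<String> fragments = new ArrayList<>();
        int fragStart = 0;   // offset where the current fragment begins
        int fragCount = 1;   // number of fragments opened so far
        int pos = 0;
        int n = text.length();

        while (pos < n) {
            // Skip whitespace between tokens.
            while (pos < n && Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            if (pos >= n) {
                break;
            }
            int tokenStart = pos;
            while (pos < n && !Character.isWhitespace(text.charAt(pos))) {
                pos++;
            }
            int tokenEnd = pos;

            // The token reaches the size threshold: it opens a new fragment.
            if (tokenEnd >= fragmentSize * fragCount) {
                if (tokenStart > fragStart) {
                    fragments.add(text.substring(fragStart, tokenStart).trim());
                }
                fragStart = tokenStart;
                fragCount++;
            }
        }

        // Emit whatever remains after the last boundary.
        String tail = text.substring(fragStart).trim();
        if (!tail.isEmpty()) {
            fragments.add(tail);
        }
        return fragments;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox. It jumps over the lazy dog.";
        for (String fragment : split(text, 20)) {
            System.out.println("[" + fragment + "]");
        }
    }
}
```

    With a fragment size of 20, the sample text splits into "The quick brown", "fox. It jumps over the", and "lazy dog.". The first sentence is cut before its last word, which is the trade-off the class description refers to.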
    • Constructor Detail

      • SimpleFragmenter

        public SimpleFragmenter()
      • SimpleFragmenter

        public SimpleFragmenter(int fragmentSize)
        Parameters:
        fragmentSize - size of each fragment, in characters
    • Method Detail

      • start

        public void start(java.lang.String originalText,
                          TokenStream stream)
        Description copied from interface: Fragmenter
        Initializes the Fragmenter. You can grab references to the Attributes you are interested in from the TokenStream passed as stream and then access their values in Fragmenter.isNewFragment().
        Specified by:
        start in interface Fragmenter
        Parameters:
        originalText - the original source text
        stream - the TokenStream to be fragmented
      • isNewFragment

        public boolean isNewFragment()
        Description copied from interface: Fragmenter
        Test to see if this token from the stream should be held in a new TextFragment. Every time this is called, the TokenStream passed to start(String, TokenStream) will have been incremented.
        Specified by:
        isNewFragment in interface Fragmenter
      • getFragmentSize

        public int getFragmentSize()
        Returns:
        size of each fragment, in characters
      • setFragmentSize

        public void setFragmentSize(int size)
        Parameters:
        size - size of each fragment, in characters
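
    The calling protocol described under start and isNewFragment (initialize once, then query after each stream increment) can be sketched without the library as follows. Everything here is a hypothetical stand-in: OffsetCursor replaces TokenStream and exposes only an end offset, SketchFragmenter mirrors the documented method shapes, and the threshold rule inside isNewFragment is an assumption rather than the confirmed behavior of SimpleFragmenter.

```java
// Hypothetical illustration only; none of these types belong to the documented API.
class FragmenterProtocolSketch {

    /** Stand-in for a token stream: exposes only the current token's end offset. */
    static final class OffsetCursor {
        private final int[] endOffsets;
        private int index = -1;

        OffsetCursor(int... endOffsets) {
            this.endOffsets = endOffsets;
        }

        /** Moves to the next token; returns false when the stream is exhausted. */
        boolean advance() {
            return ++index < endOffsets.length;
        }

        int endOffset() {
            return endOffsets[index];
        }
    }

    /** Stand-in fragmenter mirroring the method shapes documented above. */
    static final class SketchFragmenter {
        private int fragmentSize;
        private int fragmentsSoFar;
        private OffsetCursor cursor;

        SketchFragmenter(int fragmentSize) {
            this.fragmentSize = fragmentSize;
        }

        /** Keeps a reference to the stream and resets state so the instance is reusable. */
        void start(String originalText, OffsetCursor stream) {
            this.cursor = stream;
            this.fragmentsSoFar = 1;
        }

        /** Assumed rule: a token opens a new fragment once its end offset reaches the threshold. */
        boolean isNewFragment() {
            boolean isNew = cursor.endOffset() >= fragmentSize * fragmentsSoFar;
            if (isNew) {
                fragmentsSoFar++;
            }
            return isNew;
        }

        int getFragmentSize() {
            return fragmentSize;
        }

        void setFragmentSize(int size) {
            this.fragmentSize = size;
        }
    }

    /** Calls start once, then advances the cursor and asks isNewFragment for every token. */
    static int countNewFragmentSignals(SketchFragmenter fragmenter, String text, int... endOffsets) {
        OffsetCursor cursor = new OffsetCursor(endOffsets);
        fragmenter.start(text, cursor);
        int signals = 0;
        while (cursor.advance()) {
            if (fragmenter.isNewFragment()) {
                signals++;
            }
        }
        return signals;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox. It jumps over the lazy dog.";
        int[] ends = {3, 9, 15, 20, 23, 29, 34, 38, 43, 48}; // end offset of each word in text
        SketchFragmenter fragmenter = new SketchFragmenter(20);
        System.out.println(countNewFragmentSignals(fragmenter, text, ends));
        fragmenter.setFragmentSize(10); // smaller fragments produce more boundaries
        System.out.println(countNewFragmentSignals(fragmenter, text, ends));
    }
}
```

    Because start resets the per-document counter in this sketch, the same instance can be reused for a second pass after setFragmentSize changes the size.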