Class TokenSampleStream

  • All Implemented Interfaces:
    java.lang.AutoCloseable, ObjectStream<TokenSample>

    public class TokenSampleStream
    extends FilterObjectStream<java.lang.String,TokenSample>
    This class is a stream filter that reads string-encoded samples and creates TokenSamples from them. Each input string is split into tokens wherever whitespace or the special separator character sequence occurs.

    Sample:
    "token1 token2 token3<SPLIT>token4"
    The tokens token1 and token2 are separated by whitespace; token3 and token4 are separated by the special character sequence, in this case the default split sequence.

    The separator sequence is not escaped, so it must not otherwise occur in the input string.
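
    The splitting rule described above can be illustrated in plain Java. This is a minimal sketch, not the library's implementation: the class SplitSketch and its tokens method are hypothetical names introduced here, and the value "<SPLIT>" is assumed from the sample string above.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: mimics the splitting rule described above, not the library code. */
class SplitSketch {

    // Assumption: the default split sequence, taken from the sample string.
    static final String SEPARATOR = "<SPLIT>";

    /** Splits on whitespace first, then on the separator sequence inside each chunk. */
    static List<String> tokens(String sample, String separator) {
        List<String> result = new ArrayList<>();
        for (String chunk : sample.trim().split("\\s+")) {
            int start = 0;
            int hit;
            // Cut the chunk at every occurrence of the separator sequence.
            while ((hit = chunk.indexOf(separator, start)) >= 0) {
                if (hit > start) {
                    result.add(chunk.substring(start, hit));
                }
                start = hit + separator.length();
            }
            // Keep whatever follows the last separator (or the whole chunk).
            if (start < chunk.length()) {
                result.add(chunk.substring(start));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [token1, token2, token3, token4]
        System.out.println(tokens("token1 token2 token3<SPLIT>token4", SEPARATOR));
    }
}
```

    Because the sequence is not escaped, a literal "<SPLIT>" in the text would also be treated as a token boundary in this sketch.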

    • Constructor Detail

      • TokenSampleStream

        public TokenSampleStream(ObjectStream<java.lang.String> sampleStrings,
                                 java.lang.String separatorChars)
      • TokenSampleStream

        public TokenSampleStream(ObjectStream<java.lang.String> sentences)
    • Method Detail

      • read

        public TokenSample read()
                         throws java.io.IOException
        Description copied from interface: ObjectStream
        Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
        Returns:
        the next object or null to signal that the stream is exhausted
        Throws:
        java.io.IOException - if there is an error during reading
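
        The read-until-null contract described above is typically consumed with a loop like the one below. This is a hedged sketch: Source, fromList, and drain are hypothetical stand-ins defined here so the example is self-contained; they are not part of the documented API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ReadLoopSketch {

    /** Hypothetical stand-in for the read() contract; not the library interface itself. */
    interface Source<T> {
        T read() throws IOException;
    }

    /** Wraps a list so that read() returns each element once, then null. */
    static <T> Source<T> fromList(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Drains a source by calling read() until it returns null. */
    static <T> List<T> drain(Source<T> source) {
        List<T> out = new ArrayList<>();
        try {
            T item;
            while ((item = source.read()) != null) {
                out.add(item);
            }
        } catch (IOException e) {
            // Rethrown unchecked only to keep this sketch short.
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [first sample, second sample]
        System.out.println(drain(fromList(List.of("first sample", "second sample"))));
    }
}
```

        With the real class, the same loop shape would apply to a TokenSampleStream, with each non-null result being a TokenSample.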