Package opennlp.tools.tokenize
Class TokenSampleStream
- java.lang.Object
  - opennlp.tools.util.FilterObjectStream<java.lang.String,TokenSample>
    - opennlp.tools.tokenize.TokenSampleStream
All Implemented Interfaces:
java.lang.AutoCloseable, ObjectStream<TokenSample>
public class TokenSampleStream extends FilterObjectStream<java.lang.String,TokenSample>
This class is a stream filter that reads string-encoded samples and creates TokenSample objects from them. The input string is split into tokens wherever whitespace or the special separator sequence occurs.
Sample:
"token1 token2 token3<SPLIT>token4"
The tokens token1 and token2 are separated by whitespace; token3 and token4 are separated by the special character sequence, in this case the default split sequence. The separator sequence must be unique in the input string and is not escaped.
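The splitting rule described above can be illustrated with a small standalone sketch. This is not the OpenNLP implementation: the class SeparatorSplitSketch and its methods tokens and plainText are hypothetical names introduced here, and the sketch only assumes that tokens are delimited by whitespace and by the literal separator sequence, as the class description states.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the splitting rule; not part of OpenNLP.
final class SeparatorSplitSketch {

    // Assumed default separator, taken from the sample string in the class description.
    static final String DEFAULT_SEPARATOR = "<SPLIT>";

    /** Splits a sample on whitespace and on every occurrence of the separator. */
    static List<String> tokens(String sample, String separator) {
        if (separator.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        List<String> result = new ArrayList<>();
        // First cut the sample at whitespace, then cut each chunk at the separator.
        for (String chunk : sample.trim().split("\\s+")) {
            int start = 0;
            int hit;
            while ((hit = chunk.indexOf(separator, start)) >= 0) {
                if (hit > start) {
                    result.add(chunk.substring(start, hit));
                }
                start = hit + separator.length();
            }
            if (start < chunk.length()) {
                result.add(chunk.substring(start));
            }
        }
        return result;
    }

    /** Recovers the unannotated text by dropping the separator markers. */
    static String plainText(String sample, String separator) {
        return sample.replace(separator, "");
    }

    public static void main(String[] args) {
        String sample = "token1 token2 token3<SPLIT>token4";
        System.out.println(tokens(sample, DEFAULT_SEPARATOR));    // [token1, token2, token3, token4]
        System.out.println(plainText(sample, DEFAULT_SEPARATOR)); // token1 token2 token3token4
    }
}
```

Because the separator is matched literally and cannot be escaped, any occurrence of it in the sample is treated as a token boundary, which is why it should not appear in the text for any other reason.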
Constructor Summary
Constructors
TokenSampleStream(ObjectStream<java.lang.String> sentences)
TokenSampleStream(ObjectStream<java.lang.String> sampleStrings, java.lang.String separatorChars)
Method Summary
Modifier and Type | Method | Description
TokenSample | read() | Returns the next object.
Methods inherited from class opennlp.tools.util.FilterObjectStream
close, reset
Constructor Detail
TokenSampleStream
public TokenSampleStream(ObjectStream<java.lang.String> sampleStrings, java.lang.String separatorChars)
TokenSampleStream
public TokenSampleStream(ObjectStream<java.lang.String> sentences)
Method Detail
read
public TokenSample read() throws java.io.IOException
Description copied from interface: ObjectStream
Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.
Returns:
the next object, or null to signal that the stream is exhausted
Throws:
java.io.IOException - if there is an error during reading
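The read-until-null contract described above can be sketched with a standalone stand-in. The nested interface Source and the helpers of and drain are hypothetical names introduced for illustration only; they mirror the documented behavior of read() but are not the OpenNLP ObjectStream interface.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the read() contract; not part of OpenNLP.
final class ReadLoopSketch {

    /** Stand-in for a stream whose read() returns null once it is exhausted. */
    interface Source<T> {
        T read() throws IOException;
    }

    /** Wraps a list so that read() yields each element once, then null. */
    static <T> Source<T> of(List<T> items) {
        Iterator<T> it = items.iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    /** Calls read() until it returns null and collects every object seen. */
    static <T> List<T> drain(Source<T> source) throws IOException {
        List<T> out = new ArrayList<>();
        T next;
        while ((next = source.read()) != null) {
            out.add(next);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        Source<String> samples = of(List.of("a b", "c<SPLIT>d"));
        System.out.println(drain(samples)); // [a b, c<SPLIT>d]
        System.out.println(samples.read()); // null
    }
}
```

With the real class, the same loop shape would apply to a TokenSampleStream, ideally inside a try-with-resources block, since the class implements java.lang.AutoCloseable.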