Class TextBreakIterator


  • public abstract class TextBreakIterator
    extends java.lang.Object

    Modeled loosely on the ICU break iterator class, the text break iterator finds significant break points in text. The main difference from ICU is that this class does not require an array of characters; instead, the caller supplies a character property iterator.

    Currently the caller can request one of two implementation types through static factory methods: one finds grapheme cluster breaks and the other finds word breaks. Note that word breaking is algorithmic only and does not extend to languages such as Thai, which require dictionary-based breaking.

    • Constructor Detail

      • TextBreakIterator

        public TextBreakIterator()
    • Method Detail

      • first

        public abstract int first()
        Find the first break (typically the start-of-text).
        Returns:
        Index value, in the context of the given character property iterator, of the first break in the text.
      • next

        public abstract int next()
        Find the next break.
        Returns:
        Index value, in the context of the given character property iterator, of the next break in the text. The special value DONE indicates that there are no more breaks.
      • createGraphemeInstance

        public static TextBreakIterator createGraphemeInstance(TextCharPropIterator poTextCharPropIterator)
        Create a grapheme cluster break iterator.
        Parameters:
        poTextCharPropIterator - Pointer to the underlying character property iterator, which the grapheme cluster break iterator uses to obtain character properties for analysis.
        Returns:
        Pointer to a break iterator implementation that performs grapheme cluster analysis. This object uses the AXTE reference-counting model, and the caller must release its reference when done.
      • createWordInstance

        public static TextBreakIterator createWordInstance(TextCharPropIterator poTextCharPropIterator)
        Create a word break iterator.
        Parameters:
        poTextCharPropIterator - Pointer to the underlying character property iterator, which the word break iterator uses to obtain character properties for analysis.
        Returns:
        Pointer to a break iterator implementation that performs word analysis. This object uses the AXTE reference-counting model, and the caller must release its reference when done.
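
The first()/next()/DONE protocol described under Method Detail has the same shape as the loop used with the JDK's java.text.BreakIterator. The sketch below uses that JDK class as a stand-in, because this page does not show how to construct a TextCharPropIterator or how an AXTE reference is released. The class name BreakLoopSketch and the helper collectBreaks are hypothetical names chosen for illustration. With a TextBreakIterator, the loop body would presumably look the same, with the returned indices interpreted in the context of the supplied character property iterator.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class BreakLoopSketch {

    // Collects every break offset by calling first() once and then next()
    // repeatedly until the iterator reports DONE.
    // Assumption: java.text.BreakIterator stands in for TextBreakIterator here.
    static List<Integer> collectBreaks(BreakIterator it, String text) {
        it.setText(text);
        List<Integer> breaks = new ArrayList<>();
        for (int pos = it.first(); pos != BreakIterator.DONE; pos = it.next()) {
            breaks.add(pos);
        }
        return breaks;
    }

    public static void main(String[] args) {
        String text = "Hello, world";

        // Word breaks: algorithmic boundaries between words, punctuation and spaces.
        System.out.println("word breaks:     "
                + collectBreaks(BreakIterator.getWordInstance(Locale.ROOT), text));

        // Character (grapheme) breaks: one boundary per user-perceived character.
        System.out.println("grapheme breaks: "
                + collectBreaks(BreakIterator.getCharacterInstance(Locale.ROOT), text));
    }
}
```

The first reported break is typically the start of the text, so a caller that wants segments rather than offsets can pair each offset with the one that follows it.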