Class CodepointCountFilter

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    public final class CodepointCountFilter
    extends FilteringTokenFilter
    Removes words that are too long or too short from the stream.

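    Note: Length is calculated as the number of Unicode codepoints, not the number of UTF-16 char values, so a supplementary character stored as a surrogate pair counts as one.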

    • Constructor Detail

      • CodepointCountFilter

        public CodepointCountFilter(Version version,
                                    TokenStream in,
                                    int min,
                                    int max)
        Creates a new CodepointCountFilter. It filters out tokens whose CharTermAttribute is either too short (Character.codePointCount(char[], int, int) < min) or too long (Character.codePointCount(char[], int, int) > max). Tokens whose codepoint count is exactly min or exactly max are kept.
        Parameters:
        version - the Lucene match version
        in - the TokenStream to consume
        min - the minimum length, in codepoints (inclusive)
        max - the maximum length, in codepoints (inclusive)
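
        The length check described above can be illustrated with a minimal, self-contained sketch that uses only the JDK. The class CodepointLengthSketch and its withinBounds helper are hypothetical names introduced for illustration and are not part of Lucene; the sketch assumes the term occupies the first `length` chars of the buffer and only mirrors the documented comparison against min and max.

```java
public class CodepointLengthSketch {

    // Hypothetical helper (not a Lucene API): returns true when the number of
    // Unicode codepoints in buffer[0..length) lies within [min, max].
    public static boolean withinBounds(char[] buffer, int length, int min, int max) {
        int count = Character.codePointCount(buffer, 0, length);
        return count >= min && count <= max;
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (encoded as a surrogate pair), 'b'
        String term = "a\uD83D\uDE00b";
        char[] buf = term.toCharArray();

        // Four UTF-16 char values, but only three codepoints.
        System.out.println(term.length());                                 // 4
        System.out.println(Character.codePointCount(buf, 0, buf.length));  // 3

        // Kept when max is 3, because the surrogate pair counts once.
        System.out.println(withinBounds(buf, buf.length, 1, 3));           // true
        // Rejected when min is 4, even though the term has four chars.
        System.out.println(withinBounds(buf, buf.length, 4, 10));          // false
    }
}
```

        A filter that measured length in char values would treat the same term as four units long, which is the distinction the note on codepoints is drawing.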
    • Method Detail

      • accept

        public boolean accept()