Class UCharacterProperty


  • public final class UCharacterProperty
    extends java.lang.Object

    Internal class used for Unicode character property database.

    This class stores binary data read from uprops.icu. It cannot parse the data into higher-level information; it only returns bytes of information when required.

    Because of the form most commonly used for retrieval, an array of char is used to store the binary data.

    UCharacterPropertyDB also contains information on accessing indexes to significant points in the binary data.

    Responsibility for molding the binary data into a more meaningful form lies with UCharacter.

    Since:
    release 2.1, February 1st 2002
    • Field Detail

      • m_trie_

        public CharTrie m_trie_
        Trie data
      • m_trieIndex_

        public char[] m_trieIndex_
        Optimization CharTrie index array
      • m_trieData_

        public char[] m_trieData_
        Optimization CharTrie data array
      • m_trieInitialValue_

        public int m_trieInitialValue_
        Optimization CharTrie data offset
      • m_unicodeVersion_

        public VersionInfo m_unicodeVersion_
        Unicode version
      • LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_

        public static final char LATIN_CAPITAL_LETTER_I_WITH_DOT_ABOVE_
        Latin capital letter I with dot above
        See Also:
        Constant Field Values
      • LATIN_SMALL_LETTER_DOTLESS_I_

        public static final char LATIN_SMALL_LETTER_DOTLESS_I_
        Latin small letter dotless i
        See Also:
        Constant Field Values
      • LATIN_SMALL_LETTER_I_

        public static final char LATIN_SMALL_LETTER_I_
        Latin lowercase i
        See Also:
        Constant Field Values
      • SRC_NONE

        public static final int SRC_NONE
        No source, not a supported property.
        See Also:
        Constant Field Values
      • SRC_CHAR

        public static final int SRC_CHAR
        From uchar.c/uprops.icu main trie
        See Also:
        Constant Field Values
      • SRC_PROPSVEC

        public static final int SRC_PROPSVEC
        From uchar.c/uprops.icu properties vectors trie
        See Also:
        Constant Field Values
      • SRC_HST

        public static final int SRC_HST
        Hangul_Syllable_Type, from uchar.c/uprops.icu
        See Also:
        Constant Field Values
      • SRC_BIDI

        public static final int SRC_BIDI
        From ubidi_props.c/ubidi.icu
        See Also:
        Constant Field Values
      • SRC_CHAR_AND_PROPSVEC

        public static final int SRC_CHAR_AND_PROPSVEC
        From uchar.c/uprops.icu main trie as well as properties vectors trie
        See Also:
        Constant Field Values
      • SRC_COUNT

        public static final int SRC_COUNT
        One more than the highest UPropertySource (SRC_) constant.
        See Also:
        Constant Field Values
    • Method Detail

      • setIndexData

        public void setIndexData​(CharTrie.FriendAgent friendagent)
        Java friends implementation
      • getProperty

        public final int getProperty​(int ch)
        Gets the property value at the index. This lookup is optimized. Note that it differs slightly from CharTrie: the index into m_trieData_ is never negative.
        Parameters:
        ch - code point whose property value is to be retrieved
        Returns:
        property value of code point
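        As an illustration of the kind of index/data lookup described above, here is a minimal sketch of a generic two-stage table. It is not ICU's actual trie layout: the class name, the block size (SHIFT) and the array contents are all assumptions made for the example. Because char is unsigned in Java, an offset read from a char[] can never be negative.

```java
// Hypothetical two-stage lookup table; NOT the real CharTrie layout.
final class TwoStageLookupSketch {
    // Assumed block size for this sketch: 32 code points per block.
    static final int SHIFT = 5;
    static final int MASK = (1 << SHIFT) - 1;

    private final char[] index; // block number -> offset of that block in data
    private final char[] data;  // property values, stored block by block

    TwoStageLookupSketch(char[] index, char[] data) {
        this.index = index;
        this.data = data;
    }

    // Stage 1 selects the block, stage 2 selects the entry within the block.
    int getProperty(int ch) {
        int blockOffset = index[ch >> SHIFT]; // char is unsigned, so never negative
        return data[blockOffset + (ch & MASK)];
    }

    public static void main(String[] args) {
        // Two data blocks: an all-zero default block at offset 0,
        // and a second block at offset 32 that holds a value for 'A'.
        char[] data = new char[64];
        data[32 + ('A' & MASK)] = 7;
        // Blocks 0 and 1 (U+0000..U+003F) share the default block;
        // block 2 (U+0040..U+005F) has its own data.
        char[] index = {0, 0, 32};

        TwoStageLookupSketch table = new TwoStageLookupSketch(index, data);
        System.out.println(table.getProperty('A'));
        System.out.println(table.getProperty(' '));
    }
}
```

        Sharing one default block between many index entries is what keeps such a table compact; only ranges with distinct values need their own data block.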
      • getUnsignedValue

        public static int getUnsignedValue​(int prop)
        Gets the unsigned numeric value of a character embedded in the property argument.
        Parameters:
        prop - the property value of the character
        Returns:
        unsigned numeric value
      • getAdditional

        public int getAdditional​(int codepoint,
                                 int column)
        Gets the Unicode additional properties. Corresponds to getUnicodeProperties in the C version.
        Parameters:
        codepoint - code point whose additional properties are to be retrieved
        column -
        Returns:
        unicode properties
      • getAge

        public VersionInfo getAge​(int codepoint)

        Get the "age" of the code point.

        The "age" is the Unicode version when the code point was first designated (as a non-character or for Private Use) or assigned a character.

        This can be useful to avoid emitting code points to receiving processes that do not accept newer characters.

        The data is from the UCD file DerivedAge.txt.

        This API does not check the validity of the codepoint.

        Parameters:
        codepoint - The code point.
        Returns:
        the Unicode version number
      • getSource

        public final int getSource​(int which)
      • getRawSupplementary

        public static int getRawSupplementary​(char lead,
                                              char trail)
        Forms a supplementary code point from the argument characters.
        Note that this is for internal use, so the validity of the surrogate characters is not checked.
        Parameters:
        lead - lead surrogate character
        trail - trailing surrogate character
        Returns:
        code point of the supplementary character
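        The combination of a lead and a trail surrogate follows the standard UTF-16 arithmetic. The sketch below shows that arithmetic under assumed names (SurrogateSketch, rawSupplementary); it is not taken from the class itself, and like the method above it performs no validity checks. The result is compared with java.lang.Character.toCodePoint as a cross-check.

```java
// Hypothetical helper showing standard UTF-16 surrogate arithmetic.
final class SurrogateSketch {
    private static final int LEAD_START = 0xD800;
    private static final int TRAIL_START = 0xDC00;
    private static final int SUPPLEMENTARY_START = 0x10000;

    // Precomputed so that the combination is a shift and two additions.
    private static final int SURROGATE_OFFSET =
            SUPPLEMENTARY_START - (LEAD_START << 10) - TRAIL_START;

    // No validation: callers must pass a real lead/trail surrogate pair.
    static int rawSupplementary(char lead, char trail) {
        return (lead << 10) + trail + SURROGATE_OFFSET;
    }

    public static void main(String[] args) {
        char lead = '\uD83D';
        char trail = '\uDE00';
        int codePoint = rawSupplementary(lead, trail);
        System.out.println(Integer.toHexString(codePoint)); // prints "1f600"
        // The JDK's checked-free equivalent gives the same result.
        System.out.println(codePoint == Character.toCodePoint(lead, trail)); // prints "true"
    }
}
```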
      • getInstance

        public static UCharacterProperty getInstance()
        Loads the property data and initializes the UCharacterProperty instance.
        Throws:
        java.util.MissingResourceException - when data is missing or data has been corrupted
      • isRuleWhiteSpace

        public static boolean isRuleWhiteSpace​(int c)
        Checks whether the argument c is to be treated as white space in ICU rules. ICU rule white space is usually ignored unless quoted. Equivalent to testing for the Pattern_White_Space Unicode property. The set of characters is stable and will not change. See UAX #31 Identifier and Pattern Syntax: http://www.unicode.org/reports/tr31/
        Parameters:
        c - codepoint to check
        Returns:
        true if c is an ICU white space
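        Since the Pattern_White_Space set is stable, an equivalent test can be written as a fixed list of code points. The following is a minimal sketch based on the set defined in UAX #31; the class and method names are assumptions, and this is not necessarily how the method above is implemented.

```java
// Hypothetical stand-alone check for the Pattern_White_Space property (UAX #31).
final class PatternWhiteSpaceSketch {
    static boolean isPatternWhiteSpace(int c) {
        return (c >= 0x0009 && c <= 0x000D) // TAB, LF, VT, FF, CR
                || c == 0x0020              // SPACE
                || c == 0x0085              // NEXT LINE
                || c == 0x200E              // LEFT-TO-RIGHT MARK
                || c == 0x200F              // RIGHT-TO-LEFT MARK
                || c == 0x2028              // LINE SEPARATOR
                || c == 0x2029;             // PARAGRAPH SEPARATOR
    }

    public static void main(String[] args) {
        System.out.println(isPatternWhiteSpace(' '));    // prints "true"
        System.out.println(isPatternWhiteSpace(0x00A0)); // prints "false"
    }
}
```

        Note that this set is narrower than general Unicode white space: NO-BREAK SPACE (U+00A0), for example, is not part of it.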
      • getMaxValues

        public int getMaxValues​(int column)
        Gets the maximum values for some enum/int properties.
        Returns:
        maximum values for the integer properties.
      • getMask

        public static final int getMask​(int type)
        Gets the type mask
        Parameters:
        type - character type
        Returns:
        mask
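        A type mask is commonly a single bit at the position given by the type value, so that several types can be tested with one bitwise AND. The sketch below assumes that convention (1 << type), which the description above does not confirm, and uses the general category constants from java.lang.Character rather than this class. All names in it are hypothetical.

```java
// Hypothetical illustration of one-bit-per-type masks; assumes mask == 1 << type.
final class TypeMaskSketch {
    static int mask(int type) {
        return 1 << type;
    }

    // Combined mask for the three cased-letter general categories.
    static final int CASED_LETTER_MASK =
            mask(Character.UPPERCASE_LETTER)
            | mask(Character.LOWERCASE_LETTER)
            | mask(Character.TITLECASE_LETTER);

    // One AND replaces three separate equality comparisons.
    static boolean isCasedLetter(int codePoint) {
        return (mask(Character.getType(codePoint)) & CASED_LETTER_MASK) != 0;
    }

    public static void main(String[] args) {
        System.out.println(isCasedLetter('A')); // prints "true"
        System.out.println(isCasedLetter('1')); // prints "false"
    }
}
```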