Class ICUResourceBundleReader

  • All Implemented Interfaces:
    ICUBinary.Authenticate

    public final class ICUResourceBundleReader
    extends java.lang.Object
    implements ICUBinary.Authenticate
    This class reads the *.res resource bundle format. (For the latest version of the file format documentation, see ICU4C's source/common/uresdata.h file.)

    File format for .res resource bundle files (formatVersion=1.2)

    An ICU4C resource bundle file (.res) is a binary, memory-mappable file with nested, hierarchical data structures. It physically contains the following:

      Resource root;
          32-bit Resource item, root item for this bundle's tree; currently, the root item must be a table or table32 resource item
      int32_t indexes[indexes[0]];
          array of indexes for friendly reading and swapping; see URES_INDEX_* in uresdata.h; new in formatVersion 1.1 (ICU 2.8)
      char keys[];
          characters for key strings (formatVersion 1.0: up to 65k of characters; 1.1: <2G), minus the space for root and indexes[]; the key strings consist of invariant characters (ASCII/EBCDIC) and are NUL-terminated; padded to a multiple of 4 bytes for 4-alignment of the following data
      data;
          data directly and indirectly indexed by the root item; the structure is determined by walking the tree

    Each resource bundle item has a 32-bit Resource handle (see the typedef in uresdata.h) which contains the item type number in its upper 4 bits (31..28) and either an offset or a direct value in its lower 28 bits (27..0). The order of items is undefined and only determined by walking the tree. Leaves of the tree may be stored first or last or anywhere in between, and it is in theory possible to have unreferenced holes in the file.

    Direct values:
      - Empty Unicode strings have an offset value of 0 in the Resource handle itself.
      - Integer values are 28-bit values stored in the Resource handle itself; the interpretation of unsigned vs. signed integers is up to the application.

    All other types and values use 28-bit offsets to point to the item's data. The offset is an index to the first 32-bit word of the value, relative to the start of the resource data (i.e., the root item handle is at offset 0).
    To get byte offsets, the offset is multiplied by 4 (or shifted left by 2 bits). All resource item values are 4-aligned. The structures (memory layouts) for the values of each item type are listed in the resource types table below.

    Nested, hierarchical structures:

    Table items contain key-value pairs where the keys are 16-bit offsets to char * key strings. Key string offsets are also relative to the start of the resource data (of the root handle), i.e., the first string has an offset of 4 (after the 4-byte root handle). The values of these pairs are Resource handles.

    Array items are simple vectors of Resource handles.

    An alias item is special (and new in ICU 2.4):

    Its memory layout is just like for a UnicodeString, but at runtime it resolves to another resource bundle's item according to the path in the string. This is used to share items across bundles that are in different lookup/fallback chains (e.g., large collation data among zh_TW and zh_HK). This saves space (for large items) and maintenance effort (less duplication of data).

    Resource types:

    Most resources have their values stored at four-byte offsets from the start of the resource data. These values are at least 4-aligned. Some resource values are stored directly in the offset field of the Resource itself. See UResType in unicode/ures.h for enumeration constants for Resource types.
    Type  Name            Memory layout of values
                          (in parentheses: scalar, non-offset values)

    0     Unicode String  int32_t length, UChar[length], (UChar)0, (padding)
                          or (empty string ("") if offset==0)
    1     Binary          int32_t length, uint8_t[length], (padding)
                          - this value should be 32-aligned -
    2     Table           uint16_t count, uint16_t keyStringOffsets[count], (uint16_t padding), Resource[count]
    3     Alias           (physically same value layout as string, new in ICU 2.4)
    4     Table32         int32_t count, int32_t keyStringOffsets[count], Resource[count]
                          (new in formatVersion 1.1/ICU 2.8)
    7     Integer         (28-bit offset is integer value)
    8     Array           int32_t count, Resource[count]
    14    Integer Vector  int32_t length, int32_t[length]
    15    Reserved        This value denotes special purpose resources and is for internal use.

    Note that there are 3 types with data vector values:
      - Vectors of 8-bit bytes stored as type Binary.
      - Vectors of 16-bit words stored as type Unicode String (no value restrictions, all values 0..ffff allowed!).
      - Vectors of 32-bit words stored as type Integer Vector.
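
    The handle layout and the Table (type 2) layout described above can be illustrated with a short sketch. This is not the ICUResourceBundleReader implementation: the class ResHandleSketch and all of its method names are hypothetical, chosen only for illustration. The sketch assumes a ByteBuffer whose index 0 is the root item handle (so every offset is relative to the start of the resource data) and whose byte order matches the file. It uses a plain linear search over the table keys for simplicity.

```java
import java.nio.ByteBuffer;
import java.util.OptionalInt;

/** Hypothetical helper, not part of ICU4J: decodes 32-bit Resource handles. */
public final class ResHandleSketch {
    public static final int TYPE_STRING = 0;
    public static final int TYPE_TABLE = 2;

    /** Item type number: upper 4 bits (31..28) of the handle. */
    public static int type(int handle) {
        return handle >>> 28;
    }

    /** Offset or direct value: lower 28 bits (27..0) of the handle. */
    public static int offset(int handle) {
        return handle & 0x0FFFFFFF;
    }

    /** Byte offset from the start of the resource data: word offset shifted left by 2. */
    public static int byteOffset(int handle) {
        return offset(handle) << 2;
    }

    /** Direct 28-bit integer read as signed; the format leaves this choice to the application. */
    public static int signedInt(int handle) {
        return (handle << 4) >> 4; // arithmetic shift sign-extends bit 27
    }

    /** Direct 28-bit integer read as unsigned. */
    public static int unsignedInt(int handle) {
        return offset(handle);
    }

    /** Reads a type-0 string: int32_t length, then UChar[length]. Offset 0 means "". */
    public static String readString(ByteBuffer data, int handle) {
        if (type(handle) != TYPE_STRING) {
            throw new IllegalArgumentException("not a string handle");
        }
        if (offset(handle) == 0) {
            return "";
        }
        int pos = byteOffset(handle);
        int length = data.getInt(pos);
        char[] chars = new char[length];
        for (int i = 0; i < length; i++) {
            chars[i] = data.getChar(pos + 4 + 2 * i);
        }
        return new String(chars);
    }

    /** Reads a NUL-terminated key string starting at the given byte offset. */
    public static String readKey(ByteBuffer data, int keyByteOffset) {
        StringBuilder sb = new StringBuilder();
        for (int p = keyByteOffset; data.get(p) != 0; p++) {
            sb.append((char) (data.get(p) & 0xFF));
        }
        return sb.toString();
    }

    /**
     * Looks up a key in a type-2 table: uint16_t count, uint16_t keyStringOffsets[count],
     * optional uint16_t padding, Resource[count]. Returns the value handle if the key exists.
     */
    public static OptionalInt findInTable(ByteBuffer data, int tableHandle, String key) {
        if (type(tableHandle) != TYPE_TABLE) {
            throw new IllegalArgumentException("not a table handle");
        }
        int pos = byteOffset(tableHandle);
        int count = data.getShort(pos) & 0xFFFF;
        int keysPos = pos + 2;
        // Resource[count] starts at the next 4-aligned position after the key offsets.
        int valuesPos = (keysPos + 2 * count + 3) & ~3;
        for (int i = 0; i < count; i++) {
            int keyOffset = data.getShort(keysPos + 2 * i) & 0xFFFF;
            if (readKey(data, keyOffset).equals(key)) {
                return OptionalInt.of(data.getInt(valuesPos + 4 * i));
            }
        }
        return OptionalInt.empty();
    }
}
```

    For example, under these assumptions a handle of 0x70000005 decodes to type 7 (Integer) with the direct value 5, and a handle of 0xE0000010 decodes to type 14 (Integer Vector) at byte offset 64 from the start of the resource data.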
    • Method Detail

      • getReader

        public static ICUResourceBundleReader getReader​(java.lang.String baseName,
                                                        java.lang.String localeName,
                                                        java.lang.ClassLoader root)
      • getFullName

        public static java.lang.String getFullName​(java.lang.String baseName,
                                                   java.lang.String localeName)
        Gets the full name of the resource with suffix.
      • isDataVersionAcceptable

        public boolean isDataVersionAcceptable​(byte[] version)
        Description copied from interface: ICUBinary.Authenticate
        Method used in ICUBinary.readHeader() to provide data format authentication.
        Specified by:
        isDataVersionAcceptable in interface ICUBinary.Authenticate
        Parameters:
        version - version of the current data
        Returns:
        true if the data format is an acceptable version, false otherwise
      • getData

        public byte[] getData()
      • getRootResource

        public int getRootResource()
      • getNoFallback

        public boolean getNoFallback()