Class Lucene40StoredFieldsFormat


  • public class Lucene40StoredFieldsFormat
    extends StoredFieldsFormat
    Lucene 4.0 Stored Fields Format.

    Stored fields are represented by two files:

    1. The field index, or .fdx file.

      This file locates each document's fields within the field data file. Because its entries are fixed-length, it supports simple random access: the position of document n's field data is the Uint64 stored at offset n*8 in this file, counted from the end of the Header.

      This contains, for each document, a pointer to its field data, as follows:

      • FieldIndex (.fdx) --> <Header>, <FieldValuesPosition>^SegSize
      • Header --> CodecHeader
      • FieldValuesPosition --> Uint64
    2. The field data, or .fdt file.

      This contains the stored fields of each document, as follows:

      • FieldData (.fdt) --> <Header>, <DocFieldData>^SegSize
      • Header --> CodecHeader
      • DocFieldData --> FieldCount, <FieldNum, Bits, Value>^FieldCount
      • FieldCount --> VInt
      • FieldNum --> VInt
      • Bits --> Byte
        • low order bit reserved.
        • second bit is one for fields containing binary data
        • third bit reserved.
        • 4th to 6th bits (mask: 0x7<<3) define the type of a numeric field:
          • all bits in the mask are cleared if the field is not numeric
          • 1<<3: Value is Int
          • 2<<3: Value is Long
          • 3<<3: Value is Int as Float (as of Float.intBitsToFloat(int))
          • 4<<3: Value is Long as Double (as of Double.longBitsToDouble(long))
      • Value --> String | BinaryValue | Int | Long (depending on Bits)
      • BinaryValue --> ValueSize, <Byte>^ValueSize
      • ValueSize --> VInt
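
    The layout above can be illustrated with a minimal Java sketch. It is not Lucene's reader code: the class name StoredFieldsSketch, its method names, and the headerLength parameter (the size in bytes of the CodecHeader, which the caller is assumed to know) are hypothetical and exist only for this example. It shows how the .fdx entry for a document could be located and how a Bits byte selects the Value type.

```java
// Hypothetical helper for illustration only; not part of the Lucene API.
final class StoredFieldsSketch {
    // Bit layout of the per-field Bits byte, as described in the format above.
    static final int BINARY_FLAG    = 1 << 1;    // second bit: field holds binary data
    static final int NUMERIC_MASK   = 0x7 << 3;  // 4th to 6th bits: numeric type
    static final int NUMERIC_INT    = 1 << 3;
    static final int NUMERIC_LONG   = 2 << 3;
    static final int NUMERIC_FLOAT  = 3 << 3;
    static final int NUMERIC_DOUBLE = 4 << 3;

    private StoredFieldsSketch() {}

    /**
     * Byte offset in the .fdx file of the Uint64 pointer for document docId.
     * headerLength is an assumed input: the byte length of the CodecHeader.
     */
    static long fdxEntryOffset(long headerLength, int docId) {
        return headerLength + 8L * docId;
    }

    /** Names the Value alternative that a Bits byte selects. */
    static String valueType(int bits) {
        switch (bits & NUMERIC_MASK) {
            case 0:
                // Not numeric: the binary flag distinguishes BinaryValue from String.
                return (bits & BINARY_FLAG) != 0 ? "BinaryValue" : "String";
            case NUMERIC_INT:
                return "Int";
            case NUMERIC_LONG:
                return "Long";
            case NUMERIC_FLOAT:
                return "Float";
            case NUMERIC_DOUBLE:
                return "Double";
            default:
                throw new IllegalArgumentException("unknown numeric type bits: " + bits);
        }
    }

    /** A Float is stored as an Int and recovered with Float.intBitsToFloat(int). */
    static float asFloat(int raw) {
        return Float.intBitsToFloat(raw);
    }

    /** A Double is stored as a Long and recovered with Double.longBitsToDouble(long). */
    static double asDouble(long raw) {
        return Double.longBitsToDouble(raw);
    }
}
```

    For example, with a header length of zero, fdxEntryOffset(0, 3) gives 24, and valueType(3 << 3) reports a Float, whose raw Int would then be passed through asFloat.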