Package org.apache.lucene.codecs
Class BlockTreeTermsReader
java.lang.Object
  org.apache.lucene.index.Fields
    org.apache.lucene.codecs.FieldsProducer
      org.apache.lucene.codecs.BlockTreeTermsReader

All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, java.lang.Iterable<java.lang.String>
public class BlockTreeTermsReader extends FieldsProducer
A block-based terms index and dictionary that assigns terms to variable-length blocks according to how they share prefixes. The terms index is a prefix trie whose leaves are term blocks. The advantage of this approach is that seekExact can often determine that a term cannot exist without doing any I/O, and intersection with Automata is very fast. Note that this terms dictionary has its own fixed terms index (i.e., it does not support a pluggable terms index implementation).
NOTE: this terms dictionary does not support index divisor when opening an IndexReader. Instead, you can change the min/maxItemsPerBlock during indexing.
The data structure used by this implementation is very similar to a burst trie (http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.3499), but with added logic to break up too-large blocks of all terms sharing a given prefix into smaller ones.
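The early-rejection behavior described above can be illustrated with a deliberately simplified toy. This is a sketch under stated assumptions, not Lucene's implementation: the class `PrefixBlockSketch`, its fixed `prefixLen`, and the `blockReads` counter are all hypothetical, and the toy groups terms by a fixed-length prefix instead of building a real trie with variable-length blocks. It only shows how an in-memory prefix index can answer "this term cannot exist" before any block is read.

```java
import java.util.*;

// Toy illustration only: hypothetical names, not part of Lucene.
class PrefixBlockSketch {
    // In-memory "index": prefix -> sorted block of terms sharing that prefix.
    private final Map<String, List<String>> blocks = new TreeMap<>();
    private final int prefixLen;
    // Counts how often a block had to be consulted (a stand-in for disk I/O).
    int blockReads = 0;

    PrefixBlockSketch(Collection<String> terms, int prefixLen) {
        this.prefixLen = prefixLen;
        // TreeSet sorts and de-duplicates, so each block ends up sorted.
        for (String term : new TreeSet<>(terms)) {
            blocks.computeIfAbsent(prefixOf(term), k -> new ArrayList<>()).add(term);
        }
    }

    private String prefixOf(String term) {
        return term.substring(0, Math.min(prefixLen, term.length()));
    }

    boolean seekExact(String term) {
        List<String> block = blocks.get(prefixOf(term));
        if (block == null) {
            return false; // rejected by the index alone, no block read
        }
        blockReads++;     // only now is the block itself consulted
        return Collections.binarySearch(block, term) >= 0;
    }

    public static void main(String[] args) {
        PrefixBlockSketch dict =
            new PrefixBlockSketch(Arrays.asList("apple", "apply", "banana"), 2);
        System.out.println(dict.seekExact("apple"));  // true
        System.out.println(dict.seekExact("cherry")); // false
        System.out.println(dict.blockReads);          // 1
    }
}
```

In the toy, the lookup for "cherry" never increments `blockReads` because no block carries the prefix "ch". The real reader additionally splits oversized prefix groups into smaller blocks, which this sketch does not attempt.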
Use CheckIndex with the -verbose option to see summary statistics on the blocks in the dictionary. See BlockTreeTermsWriter.

Nested Class Summary
Modifier and Type    Class                               Description
class                BlockTreeTermsReader.FieldReader    BlockTree's implementation of Terms.
static class         BlockTreeTermsReader.Stats          BlockTree statistics for a single field returned by BlockTreeTermsReader.FieldReader.computeStats().

Field Summary
Fields inherited from class org.apache.lucene.index.Fields
EMPTY_ARRAY

Constructor Summary
Constructor    Description
BlockTreeTermsReader(Directory dir, FieldInfos fieldInfos, SegmentInfo info, PostingsReaderBase postingsReader, IOContext ioContext, java.lang.String segmentSuffix, int indexDivisor)    Sole constructor.

Method Summary
Modifier and Type                       Method                           Description
void                                    close()
java.util.Iterator<java.lang.String>    iterator()                       Returns an iterator that will step through all field names.
long                                    ramBytesUsed()                   Returns approximate RAM bytes used.
int                                     size()                           Returns the number of fields or -1 if the number of distinct field names is unknown.
Terms                                   terms(java.lang.String field)    Get the Terms for this field.

Methods inherited from class org.apache.lucene.index.Fields
getUniqueTermCount

Constructor Detail

BlockTreeTermsReader
public BlockTreeTermsReader(Directory dir, FieldInfos fieldInfos, SegmentInfo info, PostingsReaderBase postingsReader, IOContext ioContext, java.lang.String segmentSuffix, int indexDivisor) throws java.io.IOException
Sole constructor.
Throws:
java.io.IOException

Method Detail

close
public void close() throws java.io.IOException
Specified by:
close in interface java.lang.AutoCloseable
Specified by:
close in interface java.io.Closeable
Specified by:
close in class FieldsProducer
Throws:
java.io.IOException

iterator
public java.util.Iterator<java.lang.String> iterator()
Description copied from class: Fields
Returns an iterator that will step through all field names. This will not return null.

terms
public Terms terms(java.lang.String field) throws java.io.IOException
Description copied from class: Fields
Get the Terms for this field. This will return null if the field does not exist.

size
public int size()
Description copied from class: Fields
Returns the number of fields, or -1 if the number of distinct field names is unknown. If the result is >= 0, Fields.iterator() will return that many field names.
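
The contracts stated above for iterator(), terms() and size() (a non-null iterator, a null result for a missing field, and -1 for an unknown count) suggest some defensive handling on the caller's side. The following is a minimal sketch of that handling, not Lucene code: `FieldSource`, `FieldsContractSketch`, and the `inMemory` helper are hypothetical stand-ins that only mirror the three method shapes so that the example stays self-contained.

```java
import java.util.*;

// Hypothetical stand-in mirroring three methods of Fields; not a Lucene type.
interface FieldSource extends Iterable<String> {
    Object terms(String field); // null if the field does not exist
    int size();                 // -1 if the number of fields is unknown
}

class FieldsContractSketch {
    // Uses size() when it is known, otherwise falls back to counting
    // the field names produced by the iterator.
    static int countFields(FieldSource fields) {
        int n = fields.size();
        if (n >= 0) {
            return n;
        }
        n = 0;
        for (String ignored : fields) {
            n++;
        }
        return n;
    }

    // A missing field is signalled by null, so check before using the result.
    static boolean hasField(FieldSource fields, String name) {
        return fields.terms(name) != null;
    }

    // Small in-memory implementation used only to exercise the sketch.
    static FieldSource inMemory(final Map<String, Object> byField, final boolean sizeKnown) {
        return new FieldSource() {
            public Iterator<String> iterator() { return byField.keySet().iterator(); }
            public Object terms(String field) { return byField.get(field); }
            public int size() { return sizeKnown ? byField.size() : -1; }
        };
    }
}
```

With real Lucene types the same two checks apply: treat a null return from terms(field) as "no such field", and do not rely on size() without handling -1.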

ramBytesUsed
public long ramBytesUsed()
Description copied from class: FieldsProducer
Returns approximate RAM bytes used.
Specified by:
ramBytesUsed in class FieldsProducer