Utility functions for text analysis.
Interface Summary

MultiTermAwareComponent: Add to any analysis factory component to allow returning an analysis component factory for use with partial terms in prefix queries, wildcard queries, range query endpoints, regex queries, etc.
ResourceLoader: Abstraction for loading resources (streams, files, and classes).
ResourceLoaderAware: Interface for a component that needs to be initialized by an implementation of
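
The relationship between a resource-loading abstraction and a component that is initialized by one can be sketched as follows. This is a minimal illustration, not the library's own code: `SketchResourceLoader`, `SketchLoaderAware`, `InMemoryLoader`, and `StopwordComponent` are hypothetical names invented for this example, and their method signatures are assumptions rather than the interfaces listed above.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a resource-loading abstraction (streams only).
interface SketchResourceLoader {
    InputStream openResource(String name);
}

// Hypothetical stand-in for a component that a loader must initialize.
interface SketchLoaderAware {
    void inform(SketchResourceLoader loader);
}

// In-memory loader so the example needs no files on disk.
final class InMemoryLoader implements SketchResourceLoader {
    private final Map<String, String> resources = new HashMap<>();

    InMemoryLoader put(String name, String content) {
        resources.put(name, content);
        return this;
    }

    @Override
    public InputStream openResource(String name) {
        String content = resources.get(name);
        if (content == null) {
            throw new IllegalArgumentException("Resource not found: " + name);
        }
        return new ByteArrayInputStream(content.getBytes(StandardCharsets.UTF_8));
    }
}

// Knows only a resource name at construction; reads the data in inform().
final class StopwordComponent implements SketchLoaderAware {
    private final String resourceName;
    private final Set<String> words = new LinkedHashSet<>();

    StopwordComponent(String resourceName) {
        this.resourceName = resourceName;
    }

    @Override
    public void inform(SketchResourceLoader loader) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                loader.openResource(resourceName), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                String word = line.trim();
                if (!word.isEmpty()) {
                    words.add(word); // one word per non-blank line
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    boolean isStopword(String term) {
        return words.contains(term);
    }
}

final class LoaderAwareDemo {
    public static void main(String[] args) {
        SketchResourceLoader loader =
                new InMemoryLoader().put("stopwords.txt", "the\nand\nof\n");
        StopwordComponent component = new StopwordComponent("stopwords.txt");
        component.inform(loader);
        System.out.println(component.isStopword("the"));    // true
        System.out.println(component.isStopword("lucene")); // false
    }
}
```

Separating construction from `inform` lets the same component be configured by name and then wired to whichever loader (classpath, file system, in-memory) the surrounding application provides.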
Class Summary

AbstractAnalysisFactory
CharacterUtils
CharacterUtils.CharacterBuffer: A simple IO buffer to use with
CharArrayIterator: A CharacterIterator used internally for use with
CharArrayMap<V>: A simple class that stores key Strings as char[]'s in a hash table.
CharArraySet: A simple class that stores Strings as char[]'s in a hash table.
CharFilterFactory: Abstract parent class for analysis factories that create
CharTokenizer: An abstract base class for simple, character-oriented tokenizers.
ClasspathResourceLoader
ElisionFilter: Removes elisions from a
ResourceLoader that opens resource files from the local file system, optionally resolving against a base directory.
FilteringTokenFilter: Abstract base class for TokenFilters that may remove tokens.
OpenStringBuilder: A StringBuilder that allows one to access the array.
RollingCharBuffer: Acts like a forever growing char[] as you read characters into it from the provided reader, but internally it uses a circular buffer to only hold the characters that haven't been freed yet.
StemmerUtil: Some commonly-used stemming functions.
StopwordAnalyzerBase: Base class for Analyzers that need to make use of stopword sets.
TokenFilterFactory: Abstract parent class for analysis factories that create
TokenizerFactory: Abstract parent class for analysis factories that create
WordlistLoader: Loader for text files that represent a list of stopwords.
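
The circular-buffer idea described for RollingCharBuffer can be illustrated with a small sketch. It is not the library's implementation: the class `RollingBufferSketch` and its methods `get`, `freeBefore`, and `capacity` are hypothetical names chosen for this example, and the initial capacity of 4 is an arbitrary assumption. Callers address characters by absolute position, while the backing array holds only the characters that have not yet been freed.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: absolute positions map into a circular char array.
final class RollingBufferSketch {
    private final Reader reader;
    private char[] buffer = new char[4]; // small on purpose, to show growth
    private int start;   // absolute position of the oldest retained char
    private int end;     // absolute position one past the last char read
    private boolean eof;

    RollingBufferSketch(Reader reader) {
        this.reader = reader;
    }

    // Returns the char at absolute position pos, or -1 past end of input.
    int get(int pos) {
        if (pos < start) {
            throw new IllegalArgumentException("position " + pos + " was freed");
        }
        while (pos >= end && !eof) {
            readOne();
        }
        return pos < end ? buffer[pos % buffer.length] : -1;
    }

    // Declares that positions before pos will never be requested again.
    void freeBefore(int pos) {
        if (pos < start || pos > end) {
            throw new IllegalArgumentException("cannot free before " + pos);
        }
        start = pos;
    }

    int capacity() {
        return buffer.length;
    }

    private void readOne() {
        int ch;
        try {
            ch = reader.read();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (ch == -1) {
            eof = true;
            return;
        }
        if (end - start == buffer.length) {
            grow(); // every slot holds a retained char, so enlarge
        }
        buffer[end % buffer.length] = (char) ch;
        end++;
    }

    // Doubles the array and re-places retained chars under the new modulus.
    private void grow() {
        char[] bigger = new char[buffer.length * 2];
        for (int p = start; p < end; p++) {
            bigger[p % bigger.length] = buffer[p % buffer.length];
        }
        buffer = bigger;
    }

    public static void main(String[] args) {
        RollingBufferSketch buf = new RollingBufferSketch(new StringReader("abcdefghij"));
        System.out.println((char) buf.get(3)); // d
        buf.freeBefore(3);                     // positions 0..2 can be overwritten
        System.out.println((char) buf.get(6)); // g
        System.out.println(buf.capacity());    // 4
    }
}
```

Because freed slots are reused, the array grows only when the window of retained characters exceeds its capacity, not when the total input does.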