Class HyphenationTree
- java.lang.Object
-
- org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
-
- All Implemented Interfaces:
java.lang.Cloneable, PatternConsumer
public class HyphenationTree extends TernaryTree implements PatternConsumer
This tree structure stores the hyphenation patterns in an efficient way for fast lookup. It provides the method to hyphenate a word. This class was taken from the Apache FOP project (http://xmlgraphics.apache.org/fop/) and has been slightly modified.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
TernaryTree.Iterator
-
-
Constructor Summary
Constructors
HyphenationTree()
-
Method Summary
Modifier and Type, Method: Description
void addClass(java.lang.String chargroup): Add a character class to the tree.
void addException(java.lang.String word, java.util.ArrayList<java.lang.Object> hyphenatedword): Add an exception to the tree.
void addPattern(java.lang.String pattern, java.lang.String ivalue): Add a pattern to the tree.
java.lang.String findPattern(java.lang.String pat)
Hyphenation hyphenate(char[] w, int offset, int len, int remainCharCount, int pushCharCount): Hyphenate word and return an array of hyphenation points.
Hyphenation hyphenate(java.lang.String word, int remainCharCount, int pushCharCount): Hyphenate word and return a Hyphenation object.
void loadPatterns(java.io.File f): Read hyphenation patterns from an XML file.
void loadPatterns(org.xml.sax.InputSource source): Read hyphenation patterns from an XML file.
void printStats(java.io.PrintStream out)
-
-
-
Method Detail
-
loadPatterns
public void loadPatterns(java.io.File f) throws java.io.IOException
Read hyphenation patterns from an XML file.
- Parameters:
f - the filename
- Throws:
java.io.IOException - In case the parsing fails
-
loadPatterns
public void loadPatterns(org.xml.sax.InputSource source) throws java.io.IOException
Read hyphenation patterns from an XML file.
- Parameters:
source - the InputSource for the file
- Throws:
java.io.IOException - In case the parsing fails
-
findPattern
public java.lang.String findPattern(java.lang.String pat)
-
hyphenate
public Hyphenation hyphenate(java.lang.String word, int remainCharCount, int pushCharCount)
Hyphenate word and return a Hyphenation object.
- Parameters:
word - the word to be hyphenated
remainCharCount - Minimum number of characters allowed before the hyphenation point.
pushCharCount - Minimum number of characters allowed after the hyphenation point.
- Returns:
a Hyphenation object representing the hyphenated word, or null if the word is not hyphenated.
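To make the two minimum-length parameters concrete, here is a minimal sketch of how candidate hyphenation points could be filtered by them. The class `MinCharsSketch` and its `filter` method are hypothetical names used only for illustration; they are not part of Lucene, and the real implementation may apply the limits differently.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: illustrates the meaning of
// remainCharCount and pushCharCount as described in the parameter list.
class MinCharsSketch {

    // Keeps only the candidate points that leave at least remainCharCount
    // characters before the break and at least pushCharCount after it.
    // A point p means "break after the first p characters of the word".
    static List<Integer> filter(List<Integer> candidates, int wordLength,
                                int remainCharCount, int pushCharCount) {
        List<Integer> kept = new ArrayList<>();
        for (int p : candidates) {
            boolean enoughBefore = p >= remainCharCount;
            boolean enoughAfter = wordLength - p >= pushCharCount;
            if (enoughBefore && enoughAfter) {
                kept.add(p);
            }
        }
        return kept;
    }
}
```

With an 8-character word, candidates 1, 2, 5 and 7, and both limits set to 2, only the points 2 and 5 survive in this sketch.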
-
hyphenate
public Hyphenation hyphenate(char[] w, int offset, int len, int remainCharCount, int pushCharCount)
Hyphenate word and return a Hyphenation object holding the hyphenation points.
- Parameters:
w - char array that contains the word
offset - Offset to first character in word
len - Length of word
remainCharCount - Minimum number of characters allowed before the hyphenation point.
pushCharCount - Minimum number of characters allowed after the hyphenation point.
- Returns:
a Hyphenation object representing the hyphenated word, or null if the word is not hyphenated.
-
addClass
public void addClass(java.lang.String chargroup)
Add a character class to the tree. It is used by PatternParser as a callback to add character classes. Character classes define the valid word characters for hyphenation. If a word contains a character not defined in any of the classes, it is not hyphenated. A class also defines how to normalize characters so that they can be compared with the stored patterns. Pattern files usually use only lower case characters; in that case a class for the letter 'a', for example, should be defined as "aA", the first character being the normalization char.
- Specified by:
addClass in interface PatternConsumer
- Parameters:
chargroup - character group
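The normalization rule described for character classes can be sketched as follows. This is an illustration only: `CharClassSketch` is a hypothetical class that uses a plain `HashMap`, whereas the actual class stores its data in a ternary tree.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene: maps every member of a character
// group to the group's first character (the normalization char).
class CharClassSketch {
    private final Map<Character, Character> classMap = new HashMap<>();

    // Register a group such as "aA"; the first char is the normalization char.
    void addClass(String chargroup) {
        if (chargroup.isEmpty()) {
            return;
        }
        char normalized = chargroup.charAt(0);
        for (int i = 0; i < chargroup.length(); i++) {
            classMap.put(chargroup.charAt(i), normalized);
        }
    }

    // Returns the normalized word, or null if some character belongs to no
    // class, in which case the word would not be hyphenated.
    String normalize(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            Character c = classMap.get(word.charAt(i));
            if (c == null) {
                return null;
            }
            sb.append(c.charValue());
        }
        return sb.toString();
    }
}
```

After registering "aA" and "bB", this sketch normalizes "AbBa" to "abba" and rejects "ab1" because '1' is in no class.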
-
addException
public void addException(java.lang.String word, java.util.ArrayList<java.lang.Object> hyphenatedword)
Add an exception to the tree. It is used by the PatternParser class as a callback to store the hyphenation exceptions.
- Specified by:
addException in interface PatternConsumer
- Parameters:
word - normalized word
hyphenatedword - a list of alternating strings and Hyphen objects.
-
addPattern
public void addPattern(java.lang.String pattern, java.lang.String ivalue)
Add a pattern to the tree. Mainly to be used by the PatternParser class as a callback to add a pattern to the tree.
- Specified by:
addPattern in interface PatternConsumer
- Parameters:
pattern - the hyphenation pattern
ivalue - interletter weight values indicating the desirability and priority of hyphenating at a given point within the pattern. It should contain only digit characters (i.e. '0' to '9').
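The role of the interletter values can be illustrated with a simplified version of the pattern-based scheme used in TeX-style hyphenation, where the highest value at each gap wins and an odd result permits a break. The sketch below rests on several assumptions: `InterletterSketch` is a hypothetical class, it assumes `ivalue` holds one digit per gap (one more digit than the pattern has letters), and it uses a brute-force substring lookup in a `HashMap` instead of a ternary tree.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: a simplified pattern matcher
// that combines interletter values and reports the odd-valued gaps.
class InterletterSketch {
    private final Map<String, String> patterns = new HashMap<>();

    // Assumption: ivalue has pattern.length() + 1 digits, one per gap.
    void addPattern(String pattern, String ivalue) {
        patterns.put(pattern, ivalue);
    }

    // Returns break points, where p means "break after p characters".
    List<Integer> breakPoints(String word) {
        String w = "." + word + ".";            // '.' marks the word boundaries
        int[] score = new int[w.length() + 1];  // score[i] is the gap before w[i]
        for (int start = 0; start < w.length(); start++) {
            for (int end = start + 1; end <= w.length(); end++) {
                String iv = patterns.get(w.substring(start, end));
                if (iv == null) {
                    continue;
                }
                // Keep the highest value seen at each gap.
                for (int k = 0; k < iv.length() && start + k < score.length; k++) {
                    int v = iv.charAt(k) - '0';
                    score[start + k] = Math.max(score[start + k], v);
                }
            }
        }
        List<Integer> points = new ArrayList<>();
        // Only gaps strictly inside the word are candidates.
        for (int i = 2; i <= w.length() - 2; i++) {
            if (score[i] % 2 == 1) {
                points.add(i - 1);
            }
        }
        return points;
    }
}
```

For example, registering the pattern "hyph" with values "00300" and "hen" with "0020" gives a single break point after "hy" in "hyphen"; adding a pattern "yp" with "040" raises that gap to an even value and suppresses the break.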
-
printStats
public void printStats(java.io.PrintStream out)
- Overrides:
printStats in class TernaryTree
-
-