Class HyphenationTree
- java.lang.Object
-
- org.apache.lucene.analysis.compound.hyphenation.TernaryTree
-
- org.apache.lucene.analysis.compound.hyphenation.HyphenationTree
-
- All Implemented Interfaces:
java.lang.Cloneable, PatternConsumer
public class HyphenationTree extends TernaryTree implements PatternConsumer
This tree structure stores the hyphenation patterns in an efficient way for fast lookup. It provides the method to hyphenate a word. This class has been taken from the Apache FOP project (http://xmlgraphics.apache.org/fop/) and has been slightly modified.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class org.apache.lucene.analysis.compound.hyphenation.TernaryTree
TernaryTree.Iterator
-
-
Constructor Summary
Constructors
Constructor Description
HyphenationTree()
-
Method Summary
Modifier and Type, Method, Description
void addClass(java.lang.String chargroup)
Add a character class to the tree.
void addException(java.lang.String word, java.util.ArrayList<java.lang.Object> hyphenatedword)
Add an exception to the tree.
void addPattern(java.lang.String pattern, java.lang.String ivalue)
Add a pattern to the tree.
java.lang.String findPattern(java.lang.String pat)
Hyphenation hyphenate(char[] w, int offset, int len, int remainCharCount, int pushCharCount)
Hyphenate word and return a Hyphenation object.
Hyphenation hyphenate(java.lang.String word, int remainCharCount, int pushCharCount)
Hyphenate word and return a Hyphenation object.
void loadPatterns(java.io.File f)
Read hyphenation patterns from an XML file.
void loadPatterns(org.xml.sax.InputSource source)
Read hyphenation patterns from an XML file.
void printStats(java.io.PrintStream out)
-
-
-
Method Detail
-
loadPatterns
public void loadPatterns(java.io.File f) throws java.io.IOException
Read hyphenation patterns from an XML file.
- Parameters:
f - the filename
- Throws:
java.io.IOException - In case the parsing fails
-
loadPatterns
public void loadPatterns(org.xml.sax.InputSource source) throws java.io.IOException
Read hyphenation patterns from an XML file.
- Parameters:
source - the InputSource for the file
- Throws:
java.io.IOException - In case the parsing fails
-
findPattern
public java.lang.String findPattern(java.lang.String pat)
-
hyphenate
public Hyphenation hyphenate(java.lang.String word, int remainCharCount, int pushCharCount)
Hyphenate word and return a Hyphenation object.
- Parameters:
word - the word to be hyphenated
remainCharCount - Minimum number of characters allowed before the hyphenation point.
pushCharCount - Minimum number of characters allowed after the hyphenation point.
- Returns:
a Hyphenation object representing the hyphenated word, or null if the word is not hyphenated.
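The effect of remainCharCount and pushCharCount can be illustrated with a small standalone sketch. It is not the Lucene implementation: the class HyphenPointFilter, its filter method, and the candidate positions are hypothetical, and the sketch only assumes that a break position counts the characters to its left.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene: shows how the two minimum
// counts described above restrict which break positions are usable.
final class HyphenPointFilter {

    /**
     * Keeps the candidate positions that leave at least remainCharCount
     * characters before the break and at least pushCharCount after it.
     */
    static List<Integer> filter(int wordLength, int[] candidates,
                                int remainCharCount, int pushCharCount) {
        List<Integer> kept = new ArrayList<>();
        for (int pos : candidates) {
            int before = pos;              // characters left of the break
            int after = wordLength - pos;  // characters right of the break
            if (before >= remainCharCount && after >= pushCharCount) {
                kept.add(pos);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // An 11-character word with made-up candidate positions 2, 6 and 10.
        // Position 10 leaves only one character after the break.
        System.out.println(filter(11, new int[] {2, 6, 10}, 2, 2)); // prints [2, 6]
    }
}
```

With both minimums set to 2, a break that would leave a single trailing character is dropped while the others survive.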
-
hyphenate
public Hyphenation hyphenate(char[] w, int offset, int len, int remainCharCount, int pushCharCount)
Hyphenate word and return a Hyphenation object.
- Parameters:
w - char array that contains the word
offset - Offset to first character in word
len - Length of word
remainCharCount - Minimum number of characters allowed before the hyphenation point.
pushCharCount - Minimum number of characters allowed after the hyphenation point.
- Returns:
a Hyphenation object representing the hyphenated word, or null if the word is not hyphenated.
-
addClass
public void addClass(java.lang.String chargroup)
Add a character class to the tree. It is used by PatternParser as a callback to add character classes. Character classes define the valid word characters for hyphenation. If a word contains a character not defined in any of the classes, it is not hyphenated. A character class also defines how to normalize characters so that they can be compared with the stored patterns. Pattern files usually use only lower case characters. In that case a class for the letter 'a', for example, should be defined as "aA", the first character being the normalization char.
- Specified by:
addClass in interface PatternConsumer
- Parameters:
chargroup - character group
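The normalization rule described for character classes can be sketched as follows. This is a minimal standalone illustration, not the HyphenationTree code: the class CharClassSketch and its normalize method are hypothetical, and only the "first character is the normalization char" rule and the "unknown character means no hyphenation" rule come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in, not part of Lucene: maps every character of a
// group such as "aA" to the group's first character.
final class CharClassSketch {
    private final Map<Character, Character> normalized = new HashMap<>();

    /** Registers a character group; its first character is the normalization char. */
    void addClass(String chargroup) {
        if (chargroup.isEmpty()) {
            return;
        }
        char canonical = chargroup.charAt(0);
        for (int i = 0; i < chargroup.length(); i++) {
            normalized.put(chargroup.charAt(i), canonical);
        }
    }

    /** Returns the normalized word, or null if a character belongs to no class. */
    String normalize(String word) {
        StringBuilder sb = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            Character c = normalized.get(word.charAt(i));
            if (c == null) {
                return null; // word would not be hyphenated
            }
            sb.append(c.charValue());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        CharClassSketch classes = new CharClassSketch();
        classes.addClass("aA");
        classes.addClass("bB");
        System.out.println(classes.normalize("AbBa")); // prints abba
        System.out.println(classes.normalize("abc"));  // prints null ('c' is in no class)
    }
}
```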
-
addException
public void addException(java.lang.String word, java.util.ArrayList<java.lang.Object> hyphenatedword)
Add an exception to the tree. It is used by the PatternParser class as a callback to store the hyphenation exceptions.
- Specified by:
addException in interface PatternConsumer
- Parameters:
word - normalized word
hyphenatedword - a vector of alternating strings and hyphen objects.
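The shape of the hyphenatedword argument, a list that alternates between string fragments and hyphen objects, can be sketched like this. The class ExceptionListSketch, the HyphenMarker type, and the "ta-ble" input syntax are assumptions made for illustration; they are not the types or the parsing logic used by PatternParser.

```java
import java.util.ArrayList;

// Hypothetical illustration, not part of Lucene: builds a list of
// alternating String fragments and hyphen marker objects.
final class ExceptionListSketch {

    /** Hypothetical stand-in for a hyphen object; carries no data here. */
    static final class HyphenMarker {
        @Override
        public String toString() {
            return "-";
        }
    }

    /** Splits a word written with '-' at its break points into fragments and markers. */
    static ArrayList<Object> parse(String hyphenated) {
        ArrayList<Object> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < hyphenated.length(); i++) {
            char c = hyphenated.charAt(i);
            if (c == '-') {
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
                parts.add(new HyphenMarker());
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(parse("ta-ble")); // prints [ta, -, ble]
    }
}
```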
-
addPattern
public void addPattern(java.lang.String pattern, java.lang.String ivalue)
Add a pattern to the tree. Mainly used by the PatternParser class as a callback to add a pattern to the tree.
- Specified by:
addPattern in interface PatternConsumer
- Parameters:
pattern - the hyphenation pattern
ivalue - interletter weight values indicating the desirability and priority of hyphenating at a given point within the pattern. It should contain only digit characters ('0' to '9').
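The relation between a pattern and its interletter values can be sketched as below. The sketch assumes TeX-style patterns in the manner of Liang's algorithm, where a digit sits in the gap between letters, a higher value overrides a lower one, and an odd final value permits a break while an even one inhibits it. The class InterletterSketch, its methods, and the sample patterns "hy3ph" and "y4p" are hypothetical, and at most one digit per gap is assumed. This is not the code used by PatternParser or HyphenationTree.

```java
// Hypothetical illustration, not part of Lucene: separates a TeX-style
// pattern into letters and per-gap digit values, then overlays values.
final class InterletterSketch {

    /** Strips the digits from a pattern such as "hy3ph", leaving "hyph". */
    static String letters(String texPattern) {
        StringBuilder sb = new StringBuilder();
        for (char c : texPattern.toCharArray()) {
            if (!Character.isDigit(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Returns one digit per gap (before, between and after letters), '0' where none is given. */
    static String values(String texPattern) {
        StringBuilder sb = new StringBuilder();
        boolean gapHasValue = false;
        for (char c : texPattern.toCharArray()) {
            if (Character.isDigit(c)) {
                sb.append(c);
                gapHasValue = true;
            } else {
                if (!gapHasValue) {
                    sb.append('0'); // gap before this letter had no digit
                }
                gapHasValue = false;
            }
        }
        if (!gapHasValue) {
            sb.append('0'); // gap after the last letter
        }
        return sb.toString();
    }

    /** Overlays a pattern's values onto the per-gap scores, keeping the maximum. */
    static void apply(int[] scores, int start, String ivalue) {
        for (int i = 0; i < ivalue.length(); i++) {
            int v = ivalue.charAt(i) - '0';
            if (v > scores[start + i]) {
                scores[start + i] = v;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(letters("hy3ph")); // prints hyph
        System.out.println(values("hy3ph"));  // prints 00300

        int[] scores = new int[7];            // gaps of a 6-letter word
        apply(scores, 0, values("hy3ph"));    // pattern matched at letter 0
        System.out.println(scores[2] % 2 == 1); // prints true (odd: break allowed)
        apply(scores, 1, values("y4p"));      // made-up higher-priority pattern
        System.out.println(scores[2] % 2 == 1); // prints false (even 4 overrides 3)
    }
}
```

In this scheme the letters and the digit string would correspond to the pattern and ivalue arguments respectively.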
-
printStats
public void printStats(java.io.PrintStream out)
- Overrides:
printStats in class TernaryTree
-
-