Class WordBreakSpellChecker


  • public class WordBreakSpellChecker
    extends java.lang.Object

    A spell checker whose sole function is to offer suggestions by combining multiple terms into one word and/or breaking terms into multiple words.

    • Field Detail

      • SEPARATOR_TERM

        public static final Term SEPARATOR_TERM
        Term that can be used to prohibit adjacent terms from being combined.
    • Method Detail

      • suggestWordCombinations

        public CombineSuggestion[] suggestWordCombinations(Term[] terms,
                                                           int maxSuggestions,
                                                           IndexReader ir,
                                                           SuggestMode suggestMode)
                                                    throws java.io.IOException

        Generates suggestions by combining adjacent passed-in terms into single words. Each returned CombineSuggestion contains a SuggestWord and an array identifying which passed-in terms were combined to create it. Each score equals the number of word combinations needed, which is one less than the length of the array CombineSuggestion.originalTermIndexes. Generally, a suggestion with a lower score is preferred over one with a higher score.

        To prevent two adjacent terms from being combined (for instance, if one is mandatory and the other is prohibited), separate the two terms with SEPARATOR_TERM.
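
        The scoring rule and the effect of a separator can be illustrated with a small plain-Java model. This is only a sketch and not Lucene's implementation: it works on strings rather than Term objects, it ignores index frequencies and every configured limit, and the names CombineSketch, Candidate, SEPARATOR and combine are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; this is not Lucene's implementation.
final class CombineSketch {
    // Hypothetical stand-in for SEPARATOR_TERM: an empty string marks a boundary.
    static final String SEPARATOR = "";

    // One candidate: the combined text, the indexes of the terms used, and the score.
    static final class Candidate {
        final String text;
        final int[] originalTermIndexes;
        final int score;

        Candidate(String text, int[] originalTermIndexes) {
            this.text = text;
            this.originalTermIndexes = originalTermIndexes;
            // Joining n terms takes n - 1 combinations.
            this.score = originalTermIndexes.length - 1;
        }
    }

    // Joins every run of two or more adjacent terms, never crossing a separator.
    static List<Candidate> combine(String[] terms) {
        List<Candidate> out = new ArrayList<>();
        for (int start = 0; start < terms.length; start++) {
            if (terms[start].equals(SEPARATOR)) {
                continue; // a separator cannot begin a combination
            }
            StringBuilder joined = new StringBuilder(terms[start]);
            for (int end = start + 1; end < terms.length; end++) {
                if (terms[end].equals(SEPARATOR)) {
                    break; // boundary reached: stop extending this run
                }
                joined.append(terms[end]);
                int[] indexes = new int[end - start + 1];
                for (int k = 0; k < indexes.length; k++) {
                    indexes[k] = start + k;
                }
                out.add(new Candidate(joined.toString(), indexes));
            }
        }
        return out;
    }
}
```

        In this model, the input "spell", "check", "er" yields "spellcheck" (score 1), "spellchecker" (score 2) and "checker" (score 1). Placing the separator between "spell" and "check" leaves only "checker".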

        When suggestMode equals SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX, each suggestion will include at least one term not in the index.

        When suggestMode equals SuggestMode.SUGGEST_MORE_POPULAR, each suggestion will have a frequency equal to or greater than that of the most popular included term.
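
        The two mode rules above can be read as a simple acceptance test on term frequencies. The following plain-Java sketch models only those two rules; it is not Lucene's code, it takes frequencies as plain integers instead of reading them from an IndexReader, and the names SuggestModeSketch, Mode and accept are hypothetical.

```java
// Illustrative model only; names and frequency inputs are hypothetical.
final class SuggestModeSketch {
    // Stand-ins for the two SuggestMode values described in the text.
    enum Mode { WHEN_NOT_IN_INDEX, MORE_POPULAR }

    // originalFreqs: index frequency of each term taking part in the combination.
    // combinedFreq: index frequency of the combined word.
    static boolean accept(Mode mode, int[] originalFreqs, int combinedFreq) {
        int mostPopular = 0;
        boolean anyMissing = false;
        for (int freq : originalFreqs) {
            mostPopular = Math.max(mostPopular, freq);
            if (freq == 0) {
                anyMissing = true; // this original term is not in the index
            }
        }
        if (mode == Mode.WHEN_NOT_IN_INDEX) {
            // Keep the suggestion only if it involves a term missing from the index.
            return anyMissing;
        }
        // MORE_POPULAR: the combined word must be at least as frequent as
        // the most popular original term.
        return combinedFreq >= mostPopular;
    }
}
```

        For example, with original frequencies 3 and 12, a combined word of frequency 12 passes the MORE_POPULAR rule and one of frequency 11 does not; neither passes the WHEN_NOT_IN_INDEX rule because both original terms are in the index.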

        Returns:
        an array of words generated by combining original terms
        Throws:
        java.io.IOException - If there is a low-level I/O error.
      • getMinSuggestionFrequency

        public int getMinSuggestionFrequency()
        Returns the minimum frequency a term must have to be part of a suggestion.
        See Also:
        setMinSuggestionFrequency(int)
      • getMaxCombineWordLength

        public int getMaxCombineWordLength()
        Returns the maximum length of a combined suggestion.
        See Also:
        setMaxCombineWordLength(int)
      • getMinBreakWordLength

        public int getMinBreakWordLength()
        Returns the minimum length of each piece produced when breaking a word.
        See Also:
        setMinBreakWordLength(int)
      • getMaxChanges

        public int getMaxChanges()
        Returns the maximum number of changes (word breaks or combinations) to perform on the input.
        See Also:
        setMaxChanges(int)
      • getMaxEvaluations

        public int getMaxEvaluations()
        Returns the maximum number of word combinations to evaluate.
        See Also:
        setMaxEvaluations(int)
      • setMaxCombineWordLength

        public void setMaxCombineWordLength(int maxCombineWordLength)

        The maximum length of a suggestion made by combining original terms. Default=20

        See Also:
        getMaxCombineWordLength()
      • setMinBreakWordLength

        public void setMinBreakWordLength(int minBreakWordLength)

        The minimum length of each piece produced when breaking a word. Default=1

        See Also:
        getMinBreakWordLength()
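
        One way to picture the effect of this setting is to list the two-way splits of a word in which both pieces meet the minimum length. The plain-Java sketch below is an assumption about how the limit can be read, not Lucene's implementation; the names BreakSketch and splits are hypothetical, and lengths are counted in Unicode code points.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; this is not Lucene's implementation.
final class BreakSketch {
    // Returns every two-way split of word in which both pieces contain
    // at least minBreakWordLength code points.
    static List<String[]> splits(String word, int minBreakWordLength) {
        List<String[]> out = new ArrayList<>();
        int min = Math.max(1, minBreakWordLength); // never produce an empty piece
        int length = word.codePointCount(0, word.length());
        for (int i = min; i <= length - min; i++) {
            // Convert the code point position into a char offset before cutting.
            int cut = word.offsetByCodePoints(0, i);
            out.add(new String[] {word.substring(0, cut), word.substring(cut)});
        }
        return out;
    }
}
```

        With a minimum of 4, "notebook" has a single split, "note" and "book". With the default of 1 it has seven, and with a minimum of 5 it has none.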
      • setMaxChanges

        public void setMaxChanges(int maxChanges)

        The maximum number of changes (word breaks or combinations) to make on the original term(s). Default=1

        See Also:
        getMaxChanges()
      • setMaxEvaluations

        public void setMaxEvaluations(int maxEvaluations)

        The maximum number of word combinations to evaluate. Default=1000. A higher value might improve result quality. A lower value might improve performance.

        See Also:
        getMaxEvaluations()