public class Lang extends Object
Language guessing utility.
This class encapsulates rules used to guess the possible languages that a word originates from. This is done by reference to a whole series of rules distributed in resource files.
Instances of this class are typically obtained through the static factory method instance(NameType). Unless you are developing your own language guessing rules, you will not need to interact with this class directly.
This class is intended to be immutable and thread-safe.
Language guessing rules are typically loaded from resource files. These are UTF-8 encoded text files. They are systematically named following the pattern:
org/apache/commons/codec/language/bm/lang.txt
The format of these resources is the following:
- Rules: whitespace-separated strings. There should be 3 columns to each row, and these will be interpreted as:
  - pattern: a regular expression.
  - languages: a '+'-separated list of languages.
  - acceptOnMatch: 'true' or 'false', indicating whether a match rules in or rules out the listed languages.
- End-of-line comments: Any occurrence of '//' will cause all text following on that line to be discarded as a comment.
- Multi-line comments: Any line starting with '/*' will start multi-line commenting mode. This will skip all content until a line ending in '*/' is found.
- Blank lines: All blank lines will be skipped.
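The resource format described above can be sketched as a small stand-alone parser. This is only an illustration under stated assumptions, not the Commons Codec implementation: the class `LangRuleParserSketch`, its nested `Rule` type, and the `parse` method are hypothetical names, and the input is taken as already-read lines rather than a classpath resource.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical sketch of the rule-file format; not the library's own parser. */
final class LangRuleParserSketch {

    /** One parsed row: pattern, languages, acceptOnMatch. */
    static final class Rule {
        final Pattern pattern;
        final Set<String> languages;
        final boolean acceptOnMatch;

        Rule(Pattern pattern, Set<String> languages, boolean acceptOnMatch) {
            this.pattern = pattern;
            this.languages = languages;
            this.acceptOnMatch = acceptOnMatch;
        }
    }

    /** Parses lines that follow the documented format into rules. */
    static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        boolean inMultiLineComment = false;
        for (String raw : lines) {
            String line = raw;
            if (inMultiLineComment) {
                // Skip everything until a line ending in "*/" is found.
                if (line.endsWith("*/")) {
                    inMultiLineComment = false;
                }
                continue;
            }
            if (line.startsWith("/*")) {
                inMultiLineComment = true;
                continue;
            }
            // Discard an end-of-line comment, if any.
            int commentStart = line.indexOf("//");
            if (commentStart >= 0) {
                line = line.substring(0, commentStart);
            }
            line = line.trim();
            if (line.isEmpty()) {
                continue; // blank lines are skipped
            }
            String[] columns = line.split("\\s+");
            if (columns.length != 3) {
                throw new IllegalArgumentException(
                        "Expected 3 columns but found " + columns.length + ": " + raw);
            }
            Set<String> languages =
                    new LinkedHashSet<>(Arrays.asList(columns[1].split("\\+")));
            rules.add(new Rule(Pattern.compile(columns[0]), languages,
                    columns[2].equals("true")));
        }
        return rules;
    }
}
```

For example, the row `zh polish+russian false` would yield one rule whose pattern is `zh`, whose languages are `polish` and `russian`, and whose acceptOnMatch flag is false.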
Port of lang.php
guessLanguage(String text): Guesses the language of a word.
guessLanguages(String input): Guesses the languages of a word.
instance(NameType nameType): Gets a Lang instance for one of the supported NameTypes.
loadFromResource(String languageRulesResourceName, Languages languages): Loads language rules from a resource.
instance
Gets a Lang instance for one of the supported NameTypes.
Parameters: nameType - the NameType to look up
Returns: a Lang encapsulating the language guessing rules for that name type
loadFromResource
Loads language rules from a resource.
In normal use, you will obtain instances of Lang through the instance(NameType) method. You will only need to call this yourself if you are developing custom language mapping rules.
Parameters: languageRulesResourceName - the fully-qualified resource name to load
languages - the languages that these rules will support
Returns: a Lang encapsulating the loaded language-guessing rules
guessLanguage
Guesses the language of a word.
Parameters: text - the word
Returns: the language that the word originates from, or Languages.ANY if there was no unique match
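How acceptOnMatch rules could narrow a candidate set down to a single language, with a fallback when no unique match remains, can be sketched as follows. This is a minimal stand-alone illustration, not the library's code: `LangGuessSketch`, its `Rule` type, and the `"any"` marker string are hypothetical stand-ins, languages are modelled as plain strings, and lower-casing the input before matching is an assumption.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical sketch of rule-based language narrowing; not the Lang class itself. */
final class LangGuessSketch {

    /** Stand-in marker for "no unique match" (plays the role of Languages.ANY). */
    static final String ANY = "any";

    static final class Rule {
        final Pattern pattern;
        final Set<String> languages;
        final boolean acceptOnMatch;

        Rule(String regex, Set<String> languages, boolean acceptOnMatch) {
            this.pattern = Pattern.compile(regex);
            this.languages = languages;
            this.acceptOnMatch = acceptOnMatch;
        }
    }

    /** Returns every language still possible after all matching rules are applied. */
    static Set<String> guessLanguages(String input, Set<String> allLanguages, List<Rule> rules) {
        // Assumption: matching is done against a lower-cased copy of the input.
        String text = input.toLowerCase(Locale.ENGLISH);
        Set<String> candidates = new HashSet<>(allLanguages);
        for (Rule rule : rules) {
            if (rule.pattern.matcher(text).find()) {
                if (rule.acceptOnMatch) {
                    candidates.retainAll(rule.languages); // a match rules these languages in
                } else {
                    candidates.removeAll(rule.languages); // a match rules these languages out
                }
            }
        }
        return candidates;
    }

    /** Returns the single remaining language, or ANY if the match is not unique. */
    static String guessLanguage(String input, Set<String> allLanguages, List<Rule> rules) {
        Set<String> candidates = guessLanguages(input, allLanguages, rules);
        return candidates.size() == 1 ? candidates.iterator().next() : ANY;
    }
}
```

In this sketch, a rule with acceptOnMatch set to true intersects the candidates with its languages, while a rule set to false subtracts them, so the order of rules does not change the final set.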