Package org.apache.tika.language
Class LanguageIdentifier
java.lang.Object
    org.apache.tika.language.LanguageIdentifier
@Deprecated
public class LanguageIdentifier extends java.lang.Object
Deprecated. Use a concrete class of LanguageDetector.
Identifier of the language that best matches a given content profile. The content profile is compared to generic language profiles based on material from various sources.
Since:
Apache Tika 0.5
See Also:
Europarl: A Parallel Corpus for Statistical Machine Translation, ISO 639 Language Codes
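To make the comparison step concrete, here is a minimal sketch of matching a content profile against per-language profiles. It assumes, purely for illustration, that a profile is a map of character trigram frequencies and that closeness is measured by Euclidean distance; this page specifies neither. The class ProfileMatchSketch, its methods, and the sample sentences are hypothetical and are not part of Tika.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not part of the Tika API.
class ProfileMatchSketch {

    // Assumption: a profile is the relative frequency of each character trigram.
    static Map<String, Double> profile(String text) {
        Map<String, Double> freq = new HashMap<>();
        int total = Math.max(text.length() - 2, 0);
        for (int i = 0; i + 3 <= text.length(); i++) {
            freq.merge(text.substring(i, i + 3), 1.0 / total, Double::sum);
        }
        return freq;
    }

    // Assumption: Euclidean distance over the union of trigrams in both profiles.
    static double distance(Map<String, Double> a, Map<String, Double> b) {
        Set<String> keys = new HashSet<>(a.keySet());
        keys.addAll(b.keySet());
        double sum = 0.0;
        for (String k : keys) {
            double d = a.getOrDefault(k, 0.0) - b.getOrDefault(k, 0.0);
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    // Returns the language code whose profile is closest to the content profile.
    static String closest(Map<String, Double> content,
                          Map<String, Map<String, Double>> languages) {
        String best = null;
        double bestDistance = Double.MAX_VALUE;
        for (Map.Entry<String, Map<String, Double>> e : languages.entrySet()) {
            double d = distance(content, e.getValue());
            if (d < bestDistance) {
                bestDistance = d;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy language profiles built from single sentences; real profiles would
        // be built from much larger corpora.
        Map<String, Map<String, Double>> languages = new HashMap<>();
        languages.put("en", profile("the quick brown fox jumps over the lazy dog"));
        languages.put("de", profile("der schnelle braune fuchs springt ueber den faulen hund"));
        System.out.println(closest(profile("the dog and the fox"), languages));
    }
}
```

The smaller the distance, the better the match, which is why the sketch keeps the profile with the minimum distance.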
-
-
Constructor Summary
Constructors
LanguageIdentifier(java.lang.String content)
    Deprecated. Constructs a language identifier based on a String of text content.
LanguageIdentifier(LanguageProfile profile)
    Deprecated. Constructs a language identifier based on a LanguageProfile.
-
Method Summary
Methods
static void addProfile(java.lang.String language, LanguageProfile profile)
    Deprecated. Adds a single language profile.
static void clearProfiles()
    Deprecated. Clears the current map of language profiles.
static java.lang.String getErrors()
    Deprecated. Returns a string of error messages related to initializing language profiles.
java.lang.String getLanguage()
    Deprecated. Gets the identified language.
static java.util.Set<java.lang.String> getSupportedLanguages()
    Deprecated. Returns what languages are supported for language identification.
static boolean hasErrors()
    Deprecated. Tests whether there were errors initializing language config.
static void initProfiles()
    Deprecated. Builds the language profiles.
static void initProfiles(java.util.Map<java.lang.String,LanguageProfile> profilesMap)
    Deprecated. Initializes the language profiles from a user supplied initialized Map.
boolean isReasonablyCertain()
    Deprecated. Tries to judge whether the identification is certain enough to be trusted.
java.lang.String toString()
    Deprecated.
-
-
-
Constructor Detail
-
LanguageIdentifier
public LanguageIdentifier(LanguageProfile profile)
Deprecated. Constructs a language identifier based on a LanguageProfile.
Parameters:
profile - the language profile
-
LanguageIdentifier
public LanguageIdentifier(java.lang.String content)
Deprecated. Constructs a language identifier based on a String of text content.
Parameters:
content - the text
-
-
Method Detail
-
addProfile
public static void addProfile(java.lang.String language, LanguageProfile profile)
Deprecated. Adds a single language profile.
Parameters:
language - an ISO 639 code representing the language
profile - the language profile
-
getLanguage
public java.lang.String getLanguage()
Deprecated. Gets the identified language.
Returns:
an ISO 639 code representing the detected language
-
isReasonablyCertain
public boolean isReasonablyCertain()
Deprecated. Tries to judge whether the identification is certain enough to be trusted. WARNING: Never returns true for small amounts of input text.
Returns:
true if the distance is smaller than 0.022, false otherwise
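The return condition can be read as a plain threshold test. The sketch below assumes a hypothetical distance value between the content profile and the best-matching language profile; how that distance is computed is not described on this page, and CertaintySketch is not part of Tika.

```java
// Hypothetical sketch; not part of the Tika API.
class CertaintySketch {

    // Threshold taken from the Returns description of isReasonablyCertain().
    static final double CERTAINTY_LIMIT = 0.022;

    // True only when the distance is strictly smaller than the threshold.
    static boolean isReasonablyCertain(double distance) {
        return distance < CERTAINTY_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(isReasonablyCertain(0.010)); // below the threshold
        System.out.println(isReasonablyCertain(0.022)); // equal to the threshold
        System.out.println(isReasonablyCertain(0.150)); // well above the threshold
    }
}
```

Because the check is strict, a distance of exactly 0.022 does not count as certain. A caller would typically treat a false result as "language unknown" rather than trusting getLanguage().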
-
initProfiles
public static void initProfiles()
Deprecated. Builds the language profiles. The list of languages is fetched from a property file named "tika.language.properties". If a file called "tika.language.override.properties" is found on the classpath, it is used instead. The property file contains a key "languages" whose value is a comma-separated list of language codes.
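As an illustration of the property file format described here, the following sketch reads a "languages" key and splits its value into language codes. It uses only java.util.Properties and an in-memory string in place of the real file; the class LanguageListSketch, the sample codes, and the whitespace trimming are assumptions, and this is not how Tika itself loads the file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical sketch; not part of the Tika API.
class LanguageListSketch {

    // Reads the "languages" key and splits its comma-separated value into codes.
    static List<String> parseLanguages(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> codes = new ArrayList<>();
        String value = props.getProperty("languages", "");
        for (String code : value.split(",")) {
            // Assumption: whitespace around each code is not significant.
            String trimmed = code.trim();
            if (!trimmed.isEmpty()) {
                codes.add(trimmed);
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        // In-memory stand-in for tika.language.properties; the codes are examples only.
        System.out.println(parseLanguages("languages=en, fr,de\n"));
    }
}
```

A missing "languages" key yields an empty list in this sketch; what the real initProfiles() does in that case is not stated on this page.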
-
initProfiles
public static void initProfiles(java.util.Map<java.lang.String,LanguageProfile> profilesMap)
Deprecated. Initializes the language profiles from a user-supplied, initialized Map. This overrides the default set of profiles initialized at startup and provides an alternative to configuring profiles through a property file.
Parameters:
profilesMap - map of language profiles
-
clearProfiles
public static void clearProfiles()
Deprecated. Clears the current map of language profiles.
-
hasErrors
public static boolean hasErrors()
Deprecated. Tests whether there were errors initializing the language configuration.
Returns:
true if there are errors; use getErrors() to retrieve them
-
getErrors
public static java.lang.String getErrors()
Deprecated. Returns a string of error messages related to initializing language profiles.
Returns:
the String containing the error messages
-
getSupportedLanguages
public static java.util.Set<java.lang.String> getSupportedLanguages()
Deprecated. Returns the languages that are supported for language identification.
Returns:
a set of Strings holding the ISO 639 language codes
-
toString
public java.lang.String toString()
Deprecated.
Overrides:
toString in class java.lang.Object
-
-