public class RegexNERecogniser extends java.lang.Object implements NERecogniser
An NERecogniser implementation based on regular expressions.
The default configuration file "ner-regex.txt" is used when the no-argument
constructor is used to instantiate this class. The regex file is
loaded via Class.getResourceAsStream(String), so the file should be
placed in the same package path as this class.
Each line of the regex file maps an entity type to a regular expression:
ENTITY_TYPE1=REGEX1
ENTITY_TYPE2=REGEX2
For example, to extract the day of the week from text:
WEEK_DAY=(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?
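The ENTITY_TYPE=REGEX format described above can be read with a few lines of standard Java. The following is a minimal sketch, not the class's actual loader: the class name RegexConfigSketch and the method parse are hypothetical, and skipping blank lines and lines starting with '#' is an assumption that this page does not confirm.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the documented API.
public class RegexConfigSketch {

    // Reads ENTITY_TYPE=REGEX lines and compiles each regex.
    // Assumption: blank lines and lines starting with '#' are ignored.
    static Map<String, Pattern> parse(InputStream stream) throws IOException {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(stream, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty() || line.startsWith("#")) {
                    continue;
                }
                // Split on the first '=' only, since the regex itself may contain '='.
                int eq = line.indexOf('=');
                if (eq <= 0) {
                    continue; // no entity type before '=', skip the line
                }
                String type = line.substring(0, eq).trim();
                String regex = line.substring(eq + 1).trim();
                patterns.put(type, Pattern.compile(regex));
            }
        }
        return patterns;
    }

    public static void main(String[] args) throws IOException {
        String config =
            "WEEK_DAY=(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?\n";
        Map<String, Pattern> patterns = parse(
            new ByteArrayInputStream(config.getBytes(StandardCharsets.UTF_8)));
        System.out.println(patterns.keySet());
        System.out.println(patterns.get("WEEK_DAY").matcher("Friday").matches());
    }
}
```

Splitting on the first '=' matters because constructs such as lookaheads put '=' inside the regular expression itself.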
Modifier and Type | Field and Description |
---|---|
java.util.Set<java.lang.String> | entityTypes |
static java.lang.String | NER_REGEX_FILE |
java.util.Map<java.lang.String,java.util.regex.Pattern> | patterns |
Fields inherited from interface NERecogniser: DATE, LOCATION, MISCELLANEOUS, MONEY, ORGANIZATION, PERCENT, PERSON, TIME
Constructor and Description |
---|
RegexNERecogniser() |
RegexNERecogniser(java.io.InputStream stream) |
Modifier and Type | Method and Description |
---|---|
java.util.Set<java.lang.String> | findMatches(java.lang.String text, java.util.regex.Pattern pattern): finds matching substrings in text |
java.util.Set<java.lang.String> | getEntityTypes(): gets the set of entity types whose names are recognisable by this recogniser |
static RegexNERecogniser | getInstance() |
boolean | isAvailable(): checks if this named entity recogniser is available for service |
java.util.Map<java.lang.String,java.util.Set<java.lang.String>> | recognise(java.lang.String text): recognises names in the given text |
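The signatures above suggest that recognise applies every configured pattern to the text and groups the matches by entity type. The following is a minimal sketch of that behaviour using only java.util.regex. It is an illustration, not the class's actual implementation. The class name RegexMatchSketch is hypothetical, the patterns map is passed in explicitly rather than read from a field, and returning the whole match (rather than individual capture groups) and omitting entity types with no matches are both assumptions.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for the documented methods, for illustration only.
public class RegexMatchSketch {

    // Collects every distinct substring of text that the pattern matches.
    // Assumption: the whole match is returned, not individual capture groups.
    static Set<String> findMatches(String text, Pattern pattern) {
        Set<String> results = new HashSet<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            results.add(matcher.group());
        }
        return results;
    }

    // Maps each entity type to the set of names found for it.
    // Assumption: entity types with no matches are left out of the result.
    static Map<String, Set<String>> recognise(String text, Map<String, Pattern> patterns) {
        Map<String, Set<String>> result = new HashMap<>();
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Set<String> names = findMatches(text, entry.getValue());
            if (!names.isEmpty()) {
                result.put(entry.getKey(), names);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Pattern> patterns = new HashMap<>();
        patterns.put("WEEK_DAY", Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?"));
        System.out.println(recognise("Meet on Friday or saturday", patterns));
    }
}
```

Because the example pattern is not anchored to word boundaries, it also matches fragments inside longer words (for instance "mon" in "money"), which is worth keeping in mind when writing entries for the regex file.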
public static final java.lang.String NER_REGEX_FILE
public java.util.Set<java.lang.String> entityTypes
public java.util.Map<java.lang.String,java.util.regex.Pattern> patterns
public RegexNERecogniser()
public RegexNERecogniser(java.io.InputStream stream)
public static RegexNERecogniser getInstance()
public boolean isAvailable()
Specified by: isAvailable in interface NERecogniser
public java.util.Set<java.lang.String> getEntityTypes()
Specified by: getEntityTypes in interface NERecogniser
public java.util.Set<java.lang.String> findMatches(java.lang.String text, java.util.regex.Pattern pattern)
Parameters:
text - text containing interesting substrings
pattern - pattern to find substrings
public java.util.Map<java.lang.String,java.util.Set<java.lang.String>> recognise(java.lang.String text)
Specified by: recognise in interface NERecogniser
Parameters:
text - text that possibly contains names
Copyright © 2010 - 2020 Adobe Systems Incorporated. All Rights Reserved