Package opennlp.tools.namefind
Class NameFinderME
- java.lang.Object
  - opennlp.tools.namefind.NameFinderME
- All Implemented Interfaces: TokenNameFinder
public class NameFinderME extends java.lang.Object implements TokenNameFinder
Class for creating a maximum-entropy-based name finder.
-
-
Field Summary
Fields (Modifier and Type, Field):
static java.lang.String CONTINUE
static int DEFAULT_BEAM_SIZE
static java.lang.String OTHER
static java.lang.String START
-
Constructor Summary
Constructors (Constructor):
NameFinderME(TokenNameFinderModel model)
-
Method Summary
Methods (Modifier and Type, Method, Description):
void clearAdaptiveData()
Forgets all adaptive data that was collected during previous calls to one of the find methods.
static Span[] dropOverlappingSpans(Span[] spans)
Removes spans that intersect or cross in any way.
Span[] find(java.lang.String[] tokens)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
Span[] find(java.lang.String[] tokens, java.lang.String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
double[] probs()
Returns an array with the probabilities of the last decoded sequence.
void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence.
double[] probs(Span[] spans)
Returns an array of probabilities, one for each of the specified spans; each is the arithmetic mean of the probabilities of the outcomes that make up the span.
static TokenNameFinderModel train(java.lang.String languageCode, java.lang.String type, ObjectStream<NameSample> samples, TrainingParameters trainParams, TokenNameFinderFactory factory)
-
-
-
Field Detail
-
DEFAULT_BEAM_SIZE
public static final int DEFAULT_BEAM_SIZE
- See Also:
- Constant Field Values
-
START
public static final java.lang.String START
- See Also:
- Constant Field Values
-
CONTINUE
public static final java.lang.String CONTINUE
- See Also:
- Constant Field Values
-
OTHER
public static final java.lang.String OTHER
- See Also:
- Constant Field Values
-
-
Constructor Detail
-
NameFinderME
public NameFinderME(TokenNameFinderModel model)
-
-
Method Detail
-
find
public Span[] find(java.lang.String[] tokens)
Description copied from interface: TokenNameFinder
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
- Specified by:
find in interface TokenNameFinder
- Parameters:
tokens - an array of the tokens or words of the sequence, typically a sentence.
- Returns:
an array of spans for each of the names identified.
-
find
public Span[] find(java.lang.String[] tokens, java.lang.String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
- Parameters:
tokens - an array of the tokens or words of the sequence, typically a sentence.
additionalContext - features which are based on context outside of the sentence but which should also be used.
- Returns:
an array of spans for each of the names identified.
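The find methods return token spans, not per-token tags. As a rough illustration of how a per-token outcome sequence could be collapsed into such spans, here is a minimal, self-contained sketch. It does not call OpenNLP. The outcome format ("type-start", "type-cont", "other"), the string values given to START, CONTINUE and OTHER, and the NameSpan class are assumptions made for this example, not something this page specifies.

```java
import java.util.ArrayList;
import java.util.List;

final class TagDecodingSketch {

    // Assumed values for the START, CONTINUE and OTHER constants.
    static final String START = "start";
    static final String CONTINUE = "cont";
    static final String OTHER = "other";

    /** Hypothetical stand-in for a typed token span covering [start, end). */
    static final class NameSpan {
        final int start;
        final int end; // exclusive
        final String type;

        NameSpan(int start, int end, String type) {
            this.start = start;
            this.end = end;
            this.type = type;
        }

        @Override
        public String toString() {
            return type + "[" + start + ".." + end + ")";
        }
    }

    /** Collapses per-token outcomes such as "person-start" into token spans. */
    static List<NameSpan> decode(String[] outcomes) {
        List<NameSpan> spans = new ArrayList<>();
        int open = -1;      // index where the currently open span begins, or -1
        String type = null; // type of the currently open span
        for (int i = 0; i < outcomes.length; i++) {
            String outcome = outcomes[i];
            if (outcome.endsWith("-" + START)) {
                // A new name begins; close any span that is still open.
                if (open != -1) {
                    spans.add(new NameSpan(open, i, type));
                }
                open = i;
                type = outcome.substring(0, outcome.length() - START.length() - 1);
            } else if (!outcome.endsWith("-" + CONTINUE)) {
                // OTHER (or any unrecognised outcome) ends the open span.
                if (open != -1) {
                    spans.add(new NameSpan(open, i, type));
                    open = -1;
                }
            }
            // A CONTINUE outcome leaves the open span as it is, extending it.
        }
        if (open != -1) {
            spans.add(new NameSpan(open, outcomes.length, type));
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] outcomes = {
            OTHER, "person-" + START, "person-" + CONTINUE, OTHER, "location-" + START
        };
        System.out.println(decode(outcomes));
    }
}
```

In this sketch a CONTINUE outcome with no open span is ignored; a real decoder may handle that case differently.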
-
clearAdaptiveData
public void clearAdaptiveData()
Forgets all adaptive data that was collected during previous calls to one of the find methods. This method is typically called at the end of a document.
- Specified by:
clearAdaptiveData in interface TokenNameFinder
-
probs
public void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined by the previous call to find. The specified array should be at least as large as the number of tokens in the previous call to find.
- Parameters:
probs - An array used to hold the probabilities of the last decoded sequence.
-
probs
public double[] probs()
Returns an array with the probabilities of the last decoded sequence. The sequence was determined by the previous call to find.
- Returns:
An array with the same number of probabilities as tokens were sent to find when it was last called.
-
probs
public double[] probs(Span[] spans)
Returns an array of probabilities, one for each of the specified spans. Each probability is the arithmetic mean of the probabilities of the outcomes that make up the span.
- Parameters:
spans - The spans of the names for which probabilities are desired.
- Returns:
an array of probabilities for each of the specified spans.
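The averaging described above can be sketched in a few lines. This is an illustration only, not the library's implementation: TokenSpan is a hypothetical stand-in for Span, and the example assumes that a span covers the token range from start (inclusive) to end (exclusive) and that one probability is available per token.

```java
import java.util.Arrays;

final class SpanProbabilitySketch {

    /** Hypothetical stand-in for a span over tokens [start, end). */
    static final class TokenSpan {
        final int start;
        final int end; // exclusive (assumption)

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }
    }

    /** Returns, for each span, the arithmetic mean of the token probabilities it covers. */
    static double[] spanProbs(double[] tokenProbs, TokenSpan[] spans) {
        double[] result = new double[spans.length];
        for (int i = 0; i < spans.length; i++) {
            double sum = 0.0;
            for (int t = spans[i].start; t < spans[i].end; t++) {
                sum += tokenProbs[t];
            }
            // An empty span (start == end) would yield NaN here.
            result[i] = sum / (spans[i].end - spans[i].start);
        }
        return result;
    }

    public static void main(String[] args) {
        double[] tokenProbs = {0.5, 0.25, 1.0, 0.75};
        TokenSpan[] spans = {new TokenSpan(0, 2), new TokenSpan(3, 4)};
        System.out.println(Arrays.toString(spanProbs(tokenProbs, spans)));
    }
}
```

Because the mean weights every token equally, a single low-confidence token lowers the score of a long span less than it would lower the score of a short one.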
-
train
public static TokenNameFinderModel train(java.lang.String languageCode, java.lang.String type, ObjectStream<NameSample> samples, TrainingParameters trainParams, TokenNameFinderFactory factory) throws java.io.IOException
- Throws:
java.io.IOException
-
dropOverlappingSpans
public static Span[] dropOverlappingSpans(Span[] spans)
Removes spans that intersect or cross in any way. The following rules are used to remove the spans:
Identical spans: the first span in the array after sorting remains.
Intersecting spans: the first span after sorting remains.
Contained spans: all spans that are contained by another span are removed.
- Parameters:
spans
- Returns:
non-overlapping spans
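The three rules above can be approximated with a sort followed by a single greedy pass. The sketch below is a self-contained illustration and not the library's code. TokenSpan is a hypothetical stand-in for Span, and the sort order used here (ascending start, then longer spans first) is an assumption chosen so that a containing span precedes the spans it contains.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DropOverlapSketch {

    /** Hypothetical stand-in for a span over tokens [start, end). */
    static final class TokenSpan {
        final int start;
        final int end; // exclusive (assumption)

        TokenSpan(int start, int end) {
            this.start = start;
            this.end = end;
        }

        /** True if the two ranges share at least one token. */
        boolean intersects(TokenSpan other) {
            return start < other.end && other.start < end;
        }

        @Override
        public String toString() {
            return "[" + start + ".." + end + ")";
        }
    }

    /** Keeps the first span of every overlapping group after sorting. */
    static List<TokenSpan> dropOverlapping(TokenSpan[] spans) {
        TokenSpan[] sorted = spans.clone();
        // Assumed order: by start ascending, then by end descending.
        // Arrays.sort is stable for objects, so identical spans keep input order.
        Arrays.sort(sorted, (a, b) -> a.start != b.start
                ? Integer.compare(a.start, b.start)
                : Integer.compare(b.end, a.end));

        List<TokenSpan> kept = new ArrayList<>();
        TokenSpan last = null; // most recently kept span
        for (TokenSpan span : sorted) {
            // Identical, contained and crossing spans all intersect the kept span.
            if (last == null || !last.intersects(span)) {
                kept.add(span);
                last = span;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        TokenSpan[] spans = {
            new TokenSpan(2, 5), new TokenSpan(0, 3), new TokenSpan(1, 2),
            new TokenSpan(5, 6), new TokenSpan(0, 3)
        };
        System.out.println(dropOverlapping(spans));
    }
}
```

Comparing each span only with the most recently kept span is enough here: the spans are sorted by start, so the kept span with the largest end is always the last one kept.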
-
-