Class POSTaggerME

  • All Implemented Interfaces:
    POSTagger

    public class POSTaggerME
    extends java.lang.Object
    implements POSTagger
    A part-of-speech tagger based on a maximum entropy model. It predicts whether each word is a noun, a verb, or one of 70 other POS tags, depending on the word's surrounding context.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      static int DEFAULT_BEAM_SIZE  
    • Constructor Summary

      Constructors 
      Constructor Description
      POSTaggerME​(POSModel model)
      Initializes the current instance with the provided model.
    • Method Summary

      Methods 
      Modifier and Type Method Description
      static Dictionary buildNGramDictionary​(ObjectStream<POSSample> samples, int cutoff)  
      java.lang.String[] getAllPosTags()
      Retrieves an array of all possible part-of-speech tags from the tagger.
      java.lang.String[] getOrderedTags​(java.util.List<java.lang.String> words, java.util.List<java.lang.String> tags, int index)  
      java.lang.String[] getOrderedTags​(java.util.List<java.lang.String> words, java.util.List<java.lang.String> tags, int index, double[] tprobs)  
      static void populatePOSDictionary​(ObjectStream<POSSample> samples, MutableTagDictionary dict, int cutoff)  
      double[] probs()
      Returns an array with the probabilities for each tag of the last tagged sentence.
      void probs​(double[] probs)
      Populates the specified array with the probabilities for each tag of the last tagged sentence.
      java.lang.String[][] tag​(int numTaggings, java.lang.String[] sentence)
      Returns at most the specified number of taggings for the specified sentence.
      java.lang.String[] tag​(java.lang.String[] sentence)
      Assigns POS tags to the tokens of the sentence.
      java.lang.String[] tag​(java.lang.String[] sentence, java.lang.Object[] additionaContext)  
      Sequence[] topKSequences​(java.lang.String[] sentence)  
      Sequence[] topKSequences​(java.lang.String[] sentence, java.lang.Object[] additionaContext)  
      static POSModel train​(java.lang.String languageCode, ObjectStream<POSSample> samples, TrainingParameters trainParams, POSTaggerFactory posFactory)  
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • POSTaggerME

        public POSTaggerME​(POSModel model)
        Initializes the current instance with the provided model.
        Parameters:
        model - The POSModel to initialize this tagger with.
    • Method Detail

      • getAllPosTags

        public java.lang.String[] getAllPosTags()
        Retrieves an array of all possible part-of-speech tags from the tagger.
        Returns:
        an array of all possible part-of-speech tags.
      • tag

        public java.lang.String[] tag​(java.lang.String[] sentence)
        Description copied from interface: POSTagger
        Assigns POS tags to the tokens of the sentence.
        Specified by:
        tag in interface POSTagger
        Parameters:
        sentence - The sentence of tokens to be tagged.
        Returns:
        an array of POS tags, one for each token provided in sentence.
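        Because tag returns one tag per token, the result can be read side by side with the input array. The following is a minimal sketch of that pairing, not part of the API: the class name TagPairingSketch, the helper pair, and the sample tags are hypothetical, and the tags are hard-coded for illustration rather than produced by a model. With OpenNLP on the classpath, the tags array would instead come from calling tag(tokens) on a POSTaggerME instance.

```java
import java.util.StringJoiner;

class TagPairingSketch {
    // Joins each token with its tag as "token/TAG".
    // Assumes tags[i] is the tag assigned to tokens[i].
    static String pair(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("expected one tag per token");
        }
        StringJoiner out = new StringJoiner(" ");
        for (int i = 0; i < tokens.length; i++) {
            out.add(tokens[i] + "/" + tags[i]);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] tokens = {"The", "dog", "barks"};
        // Illustrative tags only; a real run would use the array returned by tag(tokens).
        String[] tags = {"DT", "NN", "VBZ"};
        System.out.println(pair(tokens, tags)); // prints "The/DT dog/NN barks/VBZ"
    }
}
```

        The length check is a defensive choice of this sketch, so that a mismatch between tokens and tags surfaces immediately instead of as an index error.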
      • tag

        public java.lang.String[] tag​(java.lang.String[] sentence,
                                      java.lang.Object[] additionaContext)
        Specified by:
        tag in interface POSTagger
      • tag

        public java.lang.String[][] tag​(int numTaggings,
                                        java.lang.String[] sentence)
        Returns at most the specified number of taggings for the specified sentence.
        Parameters:
        numTaggings - The maximum number of taggings to be returned.
        sentence - An array of tokens which make up a sentence.
        Returns:
        At most the specified number of taggings for the specified sentence.
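        One possible use of several taggings is to see where the candidates differ. The sketch below assumes that each row of the returned String[][] is a complete tagging with one tag per token, in token order; the class name AlternativeTaggingsSketch, the helper disagreements, and the sample rows are hypothetical and hard-coded for illustration. With OpenNLP on the classpath, the rows would come from tag(numTaggings, sentence).

```java
import java.util.ArrayList;
import java.util.List;

class AlternativeTaggingsSketch {
    // Returns the token positions at which the candidate taggings do not all agree.
    // Assumes every row holds one tag per token, in token order.
    static List<Integer> disagreements(String[][] taggings) {
        List<Integer> positions = new ArrayList<>();
        if (taggings.length == 0) {
            return positions;
        }
        for (int i = 0; i < taggings[0].length; i++) {
            for (int k = 1; k < taggings.length; k++) {
                if (!taggings[k][i].equals(taggings[0][i])) {
                    positions.add(i);
                    break; // one differing row is enough to flag this position
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        // Illustrative rows only; a real run would use the result of tag(2, tokens).
        String[][] taggings = {
            {"NN", "VBZ", "IN", "DT", "NN"},
            {"NN", "NNS", "VBP", "DT", "NN"}
        };
        System.out.println(disagreements(taggings)); // prints "[1, 2]"
    }
}
```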
      • topKSequences

        public Sequence[] topKSequences​(java.lang.String[] sentence,
                                        java.lang.Object[] additionaContext)
        Specified by:
        topKSequences in interface POSTagger
      • probs

        public void probs​(double[] probs)
        Populates the specified array with the probabilities for each tag of the last tagged sentence.
        Parameters:
        probs - An array to put the probabilities into.
      • probs

        public double[] probs()
        Returns an array with the probabilities for each tag of the last tagged sentence.
        Returns:
        an array with the probabilities for each tag of the last tagged sentence.
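        The probabilities can be used to flag the tag the tagger was least sure about. The sketch below assumes that probs[i] is the probability of the tag assigned to token i in the last tagged sentence; the class name TagConfidenceSketch, the helper leastConfidentIndex, and the sample values are hypothetical and hard-coded for illustration. With OpenNLP on the classpath, the array would come from calling probs() right after tag(sentence).

```java
class TagConfidenceSketch {
    // Returns the index of the smallest probability, or -1 for an empty array.
    // Ties resolve to the earliest index.
    static int leastConfidentIndex(double[] probs) {
        if (probs.length == 0) {
            return -1;
        }
        int min = 0;
        for (int i = 1; i < probs.length; i++) {
            if (probs[i] < probs[min]) {
                min = i;
            }
        }
        return min;
    }

    public static void main(String[] args) {
        String[] tokens = {"Time", "flies", "like", "an", "arrow"};
        // Illustrative values only; a real run would use the array returned by probs().
        double[] probs = {0.91, 0.58, 0.62, 0.99, 0.97};
        int i = leastConfidentIndex(probs);
        System.out.println("least confident token: " + tokens[i]); // prints "least confident token: flies"
    }
}
```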
      • getOrderedTags

        public java.lang.String[] getOrderedTags​(java.util.List<java.lang.String> words,
                                                 java.util.List<java.lang.String> tags,
                                                 int index)
      • getOrderedTags

        public java.lang.String[] getOrderedTags​(java.util.List<java.lang.String> words,
                                                 java.util.List<java.lang.String> tags,
                                                 int index,
                                                 double[] tprobs)
      • buildNGramDictionary

        public static Dictionary buildNGramDictionary​(ObjectStream<POSSample> samples,
                                                      int cutoff)
                                               throws java.io.IOException
        Throws:
        java.io.IOException