Class CTAKESParser

  • All Implemented Interfaces:
    java.io.Serializable, Parser

    public class CTAKESParser
    extends ParserDecorator
    CTAKESParser decorates a Parser and uses CTAKESContentHandler to extract biomedical information from clinical text with Apache cTAKES.

    It is normally used by passing an instance to AutoDetectParser, for example:

        AutoDetectParser parser = new AutoDetectParser(new CTAKESParser());

    It can also be used by giving a Tika Config file similar to:

    Because this is a Parser Decorator and not a normal Parser in its own right, it isn't normally selected via the Parser Service Loader.

    See Also:
    Serialized Form
    • Constructor Detail

      • CTAKESParser

        public CTAKESParser()
        Wraps the default Parser
      • CTAKESParser

        public CTAKESParser(TikaConfig config)
        Wraps the default Parser for this Config
      • CTAKESParser

        public CTAKESParser(Parser parser)
        Wraps the specified Parser
    • Method Detail

      • parse

        public void parse(java.io.InputStream stream,
                          org.xml.sax.ContentHandler handler,
                          Metadata metadata,
                          ParseContext context)
                   throws java.io.IOException,
                          org.xml.sax.SAXException,
                          TikaException
        Description copied from class: ParserDecorator
        Delegates the method call to the decorated parser. Subclasses should override this method (and use super.parse() to invoke the decorated parser) to implement extra decoration.
        Specified by:
        parse in interface Parser
        Overrides:
        parse in class ParserDecorator
        Parameters:
        stream - the document stream (input)
        handler - handler for the XHTML SAX events (output)
        metadata - document metadata (input and output)
        context - parse context
        Throws:
        java.io.IOException - if the document stream could not be read
        org.xml.sax.SAXException - if the SAX events could not be processed
        TikaException - if the document could not be parsed
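
The delegation contract described for parse (a subclass overrides the method and calls super.parse() to reach the decorated parser) can be illustrated with a minimal, dependency-free sketch. Every type below (SimpleParser, PlainTextParser, SimpleParserDecorator, AnnotatingParser, DecoratorSketch) is a hypothetical stand-in invented for illustration. None of them is part of Tika, and the sketch does not claim to reproduce how CTAKESParser itself is implemented.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a parser interface; NOT the Tika Parser API.
interface SimpleParser {
    void parse(String text, List<String> events);
}

// A trivial "real" parser that records the text it was given.
class PlainTextParser implements SimpleParser {
    @Override
    public void parse(String text, List<String> events) {
        events.add("text:" + text);
    }
}

// Hypothetical base decorator: forwards every call to the wrapped parser.
class SimpleParserDecorator implements SimpleParser {
    private final SimpleParser wrapped;

    SimpleParserDecorator(SimpleParser wrapped) {
        this.wrapped = wrapped;
    }

    @Override
    public void parse(String text, List<String> events) {
        wrapped.parse(text, events);
    }

    // Returns null when no name/description of the decoration is available.
    public String getDecorationName() {
        return null;
    }
}

// A concrete decoration: delegates via super.parse(), then adds extra output.
class AnnotatingParser extends SimpleParserDecorator {
    AnnotatingParser(SimpleParser wrapped) {
        super(wrapped);
    }

    @Override
    public void parse(String text, List<String> events) {
        super.parse(text, events);                        // invoke the decorated parser
        events.add("annotated-length:" + text.length());  // the extra decoration
    }

    @Override
    public String getDecorationName() {
        return "Annotating";
    }
}

class DecoratorSketch {
    // Runs the decorated parser and returns the recorded events in order.
    static List<String> run(String text) {
        List<String> events = new ArrayList<>();
        new AnnotatingParser(new PlainTextParser()).parse(text, events);
        return events;
    }

    public static void main(String[] args) {
        System.out.println(run("fever")); // prints [text:fever, annotated-length:5]
    }
}
```

The point of the sketch is the ordering: the wrapped parser's output is produced by the super.parse() call, and the subclass contributes its own output around that call without reimplementing the wrapped parser.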
      • getDecorationName

        public java.lang.String getDecorationName()
        Overrides:
        getDecorationName in class ParserDecorator
        Returns:
        A name/description of the decoration, or null if none available