Class ParserPostProcessor

  • All Implemented Interfaces: Serializable, Parser

    public class ParserPostProcessor
    extends ParserDecorator
    Parser decorator that post-processes the results from a decorated parser. The post-processing takes care of filling in the "fulltext", "summary", and "outlinks" metadata entries based on the full text content returned by the decorated parser.
    See Also:
    Serialized Form
    • Constructor Detail

      • ParserPostProcessor

        public ParserPostProcessor(Parser parser)
        Creates a post-processing decorator for the given parser.
        Parameters:
        parser - the parser to be decorated
    • Method Detail

      • parse

        public void parse(InputStream stream,
                          org.xml.sax.ContentHandler handler,
                          Metadata metadata,
                          ParseContext context)
        Forwards the call to the delegated parser and post-processes the results as described above.
        Specified by:
        parse in interface Parser
        Overrides:
        parse in class ParserDecorator
        Parameters:
        stream - the document stream (input)
        handler - handler for the XHTML SAX events (output)
        metadata - document metadata (input and output)
        context - parse context
        Throws:
        IOException - if the document stream could not be read
        org.xml.sax.SAXException - if the SAX events could not be processed
        TikaException - if the document could not be parsed
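
        The forwarding and post-processing behaviour described above can be illustrated with a small, self-contained sketch of the decorator pattern. It does not use Tika classes and is not the library's implementation: TextParser, PostProcessingParser, the 500-character summary cutoff, and the link pattern are all hypothetical stand-ins chosen for illustration. Only the three metadata keys ("fulltext", "summary", "outlinks") come from the description above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Self-contained illustration; none of these types are Tika classes.
class PostProcessSketch {

    /** Hypothetical stand-in for a parser: appends extracted text to a sink. */
    interface TextParser {
        void parse(String input, StringBuilder sink, Map<String, List<String>> metadata);
    }

    /** Decorator that forwards to the wrapped parser, then fills in metadata. */
    static class PostProcessingParser implements TextParser {
        // Assumed values for this sketch only.
        private static final int SUMMARY_LENGTH = 500;
        private static final Pattern LINK = Pattern.compile("https?://[^\\s<>\"]+");

        private final TextParser delegate;

        PostProcessingParser(TextParser delegate) {
            this.delegate = delegate;
        }

        @Override
        public void parse(String input, StringBuilder sink, Map<String, List<String>> metadata) {
            // Capture the delegate's text privately, then pass it on to the caller's sink.
            StringBuilder captured = new StringBuilder();
            delegate.parse(input, captured, metadata);
            sink.append(captured);

            // Post-process: derive the three metadata entries from the captured text.
            String content = captured.toString();
            put(metadata, "fulltext", content);
            put(metadata, "summary",
                    content.substring(0, Math.min(content.length(), SUMMARY_LENGTH)));

            Matcher matcher = LINK.matcher(content);
            while (matcher.find()) {
                metadata.computeIfAbsent("outlinks", k -> new ArrayList<>()).add(matcher.group());
            }
        }

        // Stores a single-valued entry, replacing any previous value.
        private static void put(Map<String, List<String>> metadata, String key, String value) {
            List<String> values = new ArrayList<>();
            values.add(value);
            metadata.put(key, values);
        }
    }

    /** Runs the sketch on a fixed input and returns the resulting metadata. */
    static Map<String, List<String>> demo() {
        TextParser plain = (input, sink, metadata) -> sink.append(input.trim());
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        new PostProcessingParser(plain).parse(
                "  See https://tika.apache.org/ for details.  ", new StringBuilder(), metadata);
        return metadata;
    }

    public static void main(String[] args) {
        demo().forEach((key, values) -> System.out.println(key + " = " + values));
    }
}
```

        In this sketch the caller's sink still receives the full text, so the decorator stays transparent to code that only consumes the parser output; the metadata entries are an additional side effect of the wrapped call.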