Class ToHTMLContentHandler

  • All Implemented Interfaces:
    org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler

    public class ToHTMLContentHandler
    extends ToXMLContentHandler
    SAX event handler that serializes an HTML document to a character stream. The incoming SAX events are expected to be well-formed (properly nested, etc.) and to describe valid HTML.
    Since:
    Apache Tika 0.10
    • Constructor Detail

      • ToHTMLContentHandler

        public ToHTMLContentHandler(java.io.OutputStream stream,
                                    java.lang.String encoding)
                             throws java.io.UnsupportedEncodingException
        Throws:
        java.io.UnsupportedEncodingException - if the given encoding is not supported
      • ToHTMLContentHandler

        public ToHTMLContentHandler()
    • Method Detail

      • startDocument

        public void startDocument()
                           throws org.xml.sax.SAXException
        Description copied from class: ToXMLContentHandler
        Writes the XML prefix.
        Specified by:
        startDocument in interface org.xml.sax.ContentHandler
        Overrides:
        startDocument in class ToXMLContentHandler
        Throws:
        org.xml.sax.SAXException
      • endElement

        public void endElement(java.lang.String uri,
                               java.lang.String localName,
                               java.lang.String qName)
                        throws org.xml.sax.SAXException
        Specified by:
        endElement in interface org.xml.sax.ContentHandler
        Overrides:
        endElement in class ToXMLContentHandler
        Throws:
        org.xml.sax.SAXException
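
The class description above says the handler turns a stream of well-formed SAX events into HTML text. The following is a minimal sketch of that pattern using only the JDK's SAX classes. `SketchHtmlHandler` and `SaxToHtmlDemo` are hypothetical names introduced for illustration; the stand-in handler is not Tika's implementation, and its treatment of void elements such as `br` is an assumption about why an HTML serializer might override `endElement`. With Tika on the classpath, a `ToHTMLContentHandler` created with the no-argument constructor could presumably be passed to `render` instead, though its exact output may differ.

```java
import java.io.StringWriter;
import java.util.Set;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical stand-in (NOT Tika's code): writes SAX events as HTML text. */
class SketchHtmlHandler extends DefaultHandler {
    // Assumption for illustration: a few HTML void elements that take no end tag.
    private static final Set<String> VOID_ELEMENTS = Set.of("br", "hr", "img", "meta");

    private final StringWriter out = new StringWriter();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        out.write('<');
        out.write(localName);
        for (int i = 0; i < atts.getLength(); i++) {
            out.write(" " + atts.getLocalName(i) + "=\"" + escape(atts.getValue(i)) + "\"");
        }
        out.write('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Void elements are written without a closing tag.
        if (!VOID_ELEMENTS.contains(localName)) {
            out.write("</" + localName + ">");
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.write(escape(new String(ch, start, length)));
    }

    /** Returns everything serialized so far. */
    @Override
    public String toString() {
        return out.toString();
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }
}

class SaxToHtmlDemo {
    /** Fires a small, properly nested event sequence at the given handler. */
    static String render(ContentHandler handler) {
        String xhtml = "http://www.w3.org/1999/xhtml";
        AttributesImpl noAttributes = new AttributesImpl();
        char[] text = "a < b".toCharArray();
        try {
            handler.startDocument();
            handler.startElement(xhtml, "p", "p", noAttributes);
            handler.characters(text, 0, text.length);
            handler.startElement(xhtml, "br", "br", noAttributes);
            handler.endElement(xhtml, "br", "br");
            handler.endElement(xhtml, "p", "p");
            handler.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.toString();
    }

    public static void main(String[] args) {
        System.out.println(render(new SketchHtmlHandler()));
    }
}
```

The driver depends only on the `org.xml.sax.ContentHandler` interface, so the same event sequence can be replayed against any handler whose `toString()` returns the serialized text.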