public class BoilerpipeContentHandler
extends BoilerpipeHTMLContentHandler
ContentHandler object passed to
 HtmlParser.parse(java.io.InputStream, ContentHandler, Metadata, org.apache.tika.parser.ParseContext)| Constructor and Description | 
|---|
BoilerpipeContentHandler(org.xml.sax.ContentHandler delegate)
Creates a new boilerpipe-based content extractor, using the
  
DefaultExtractor extraction rules and "delegate" as the content handler. | 
BoilerpipeContentHandler(org.xml.sax.ContentHandler delegate,
                        BoilerpipeExtractor extractor)
Creates a new boilerpipe-based content extractor, using the given
 extraction rules. 
 | 
BoilerpipeContentHandler(java.io.Writer writer)
Creates a content handler that writes XHTML body character events to
 the given writer. 
 | 
| Modifier and Type | Method and Description | 
|---|---|
void | 
characters(char[] chars,
          int offset,
          int length)  | 
void | 
endDocument()  | 
void | 
endElement(java.lang.String uri,
          java.lang.String localName,
          java.lang.String qName)  | 
TextDocument | 
getTextDocument()
Retrieves the built TextDocument 
 | 
boolean | 
isIncludeMarkup()  | 
void | 
setIncludeMarkup(boolean includeMarkup)  | 
void | 
startDocument()  | 
void | 
startElement(java.lang.String uri,
            java.lang.String localName,
            java.lang.String qName,
            org.xml.sax.Attributes atts)  | 
void | 
startPrefixMapping(java.lang.String prefix,
                  java.lang.String uri)  | 
public BoilerpipeContentHandler(org.xml.sax.ContentHandler delegate)
DefaultExtractor extraction rules and "delegate" as the content handler.delegate - The ContentHandler objectpublic BoilerpipeContentHandler(java.io.Writer writer)
writer - writerpublic BoilerpipeContentHandler(org.xml.sax.ContentHandler delegate,
                                BoilerpipeExtractor extractor)
delegate - The ContentHandler objectextractor - Extraction rules to use, e.g. ArticleExtractorpublic boolean isIncludeMarkup()
public void setIncludeMarkup(boolean includeMarkup)
public TextDocument getTextDocument()
public void startDocument()
                   throws org.xml.sax.SAXException
org.xml.sax.SAXExceptionpublic void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
                        throws org.xml.sax.SAXException
org.xml.sax.SAXExceptionpublic void startElement(java.lang.String uri,
                         java.lang.String localName,
                         java.lang.String qName,
                         org.xml.sax.Attributes atts)
                  throws org.xml.sax.SAXException
org.xml.sax.SAXExceptionpublic void characters(char[] chars,
                       int offset,
                       int length)
                throws org.xml.sax.SAXException
org.xml.sax.SAXExceptionpublic void endElement(java.lang.String uri,
                       java.lang.String localName,
                       java.lang.String qName)
                throws org.xml.sax.SAXException
org.xml.sax.SAXExceptionpublic void endDocument()
                 throws org.xml.sax.SAXException
org.xml.sax.SAXException"Copyright © 2010 - 2020 Adobe Systems Incorporated. All Rights Reserved"