public class ToTextContentHandler
extends org.xml.sax.helpers.DefaultHandler
As of Tika 1.20, this handler ignores content within <script> and <style> tags.
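The skipping behaviour described above can be pictured with a small stand-in handler. This is only a sketch under stated assumptions: `SkippingTextHandler` and `SkippingTextDemo` are hypothetical names built on the JDK's SAX classes, not Tika's implementation, and the depth-counter approach is an illustrative guess at how such filtering might be done.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, NOT Tika's ToTextContentHandler: it only mimics the
// documented behaviour of dropping text inside <script> and <style>.
class SkippingTextHandler extends DefaultHandler {
    private final StringWriter out = new StringWriter();
    private int skipDepth = 0; // greater than 0 while inside <script> or <style>

    private static boolean isSkipped(String localName, String qName) {
        // localName is empty when the parser is not namespace-aware, so fall back to qName
        String name = (localName == null || localName.isEmpty()) ? qName : localName;
        return "script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (isSkipped(localName, qName)) {
            skipDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (isSkipped(localName, qName)) {
            skipDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (skipDepth == 0) {
            out.write(ch, start, length); // keep only text outside skipped elements
        }
    }

    @Override
    public String toString() {
        return out.toString();
    }
}

class SkippingTextDemo {
    // Parses the XML string and returns the text the stand-in handler collected.
    static String extract(String xml) {
        try {
            SkippingTextHandler handler = new SkippingTextHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<html><head><style>p{color:red}</style></head>"
                + "<body><p>Hello</p><script>alert(1)</script></body></html>";
        System.out.println(extract(xml)); // prints "Hello"
    }
}
```

With the real class, the same input would be expected to yield only the paragraph text, but that should be confirmed against the Tika version in use.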
Constructor | Description |
---|---|
ToTextContentHandler() | Creates a content handler that writes character events to an internal string buffer. |
ToTextContentHandler(java.io.OutputStream stream) | Creates a content handler that writes character events to the given output stream using the platform default encoding. |
ToTextContentHandler(java.io.OutputStream stream, java.lang.String encoding) | Creates a content handler that writes character events to the given output stream using the given encoding. |
ToTextContentHandler(java.io.Writer writer) | Creates a content handler that writes character events to the given writer. |
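One way to read the four constructors is that each one ends up with a character `Writer`, and the stream-based forms differ only in how bytes are encoded. The sketch below illustrates that reading with JDK classes only. `TextSink` and `TextSinkDemo` are hypothetical names, and the delegation shown is an assumption about a plausible design, not Tika's source.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;
import java.nio.charset.Charset;

// Hypothetical stand-in that mirrors the four documented constructor shapes.
class TextSink {
    private final Writer writer;

    TextSink(Writer writer) {
        this.writer = writer;
    }

    TextSink(OutputStream stream) {
        // platform default encoding
        this(new OutputStreamWriter(stream, Charset.defaultCharset()));
    }

    TextSink(OutputStream stream, String encoding) throws UnsupportedEncodingException {
        // explicit encoding; an unknown name raises UnsupportedEncodingException
        this(new OutputStreamWriter(stream, encoding));
    }

    TextSink() {
        // internal string buffer, read back through toString()
        this(new StringWriter());
    }

    void write(String text) throws IOException {
        writer.write(text);
        writer.flush();
    }

    @Override
    public String toString() {
        return writer.toString();
    }
}

class TextSinkDemo {
    // Encodes text through the (stream, encoding) constructor and returns the bytes.
    static byte[] encode(String text, String encoding) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            new TextSink(bytes, encoding).write(text);
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Collects text through the no-argument constructor.
    static String collect(String text) {
        try {
            TextSink sink = new TextSink();
            sink.write(text);
            return sink.toString();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(encode("\u00e9", "UTF-8").length);      // prints 2
        System.out.println(encode("\u00e9", "ISO-8859-1").length); // prints 1
        System.out.println(collect("Hello"));                      // prints "Hello"
    }
}
```

Passing an explicit encoding avoids output that varies with the platform default, which is the usual reason to prefer the two-argument stream constructor.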
Modifier and Type | Method | Description |
---|---|---|
void | characters(char[] ch, int start, int length) | Writes the given characters to the given character stream. |
void | endDocument() | Flushes the character stream so that no characters are forgotten in internal buffers. |
void | endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName) | |
void | ignorableWhitespace(char[] ch, int start, int length) | Writes the given ignorable characters to the given character stream. |
void | startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts) | |
java.lang.String | toString() | Returns the contents of the internal string buffer where all the received characters have been collected. |
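The flush in endDocument() matters when the target writer buffers its output. The following sketch drives SAX events by hand against a stand-in handler to show the effect. `FlushingTextHandler` and `FlushDemo` are hypothetical names, and wrapping `IOException` in `SAXException` is an assumption about how such a handler would report write failures.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, not Tika's class: writes character events to a Writer
// and flushes it when the document ends.
class FlushingTextHandler extends DefaultHandler {
    private final Writer writer;

    FlushingTextHandler(Writer writer) {
        this.writer = writer;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            writer.write(ch, start, length);
        } catch (IOException e) {
            throw new SAXException("Error writing characters", e);
        }
    }

    @Override
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        characters(ch, start, length); // treated like ordinary text
    }

    @Override
    public void endDocument() throws SAXException {
        try {
            writer.flush(); // push anything still held in buffers to the stream
        } catch (IOException e) {
            throw new SAXException("Error flushing character output", e);
        }
    }
}

class FlushDemo {
    // Returns {bytes visible before endDocument(), bytes visible after it}.
    static int[] run() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            Writer buffered = new BufferedWriter(
                    new OutputStreamWriter(bytes, StandardCharsets.UTF_8));
            FlushingTextHandler handler = new FlushingTextHandler(buffered);
            handler.startDocument();
            handler.characters("Hello".toCharArray(), 0, 5);
            handler.ignorableWhitespace(" ".toCharArray(), 0, 1);
            handler.characters("world".toCharArray(), 0, 5);
            int before = bytes.size();
            handler.endDocument();
            int after = bytes.size();
            return new int[] {before, after};
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = run();
        System.out.println(sizes[0] + " -> " + sizes[1]); // prints "0 -> 11"
    }
}
```

Until endDocument() runs, the eleven characters sit in the `BufferedWriter`, so a caller that reads the stream early would see nothing.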
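public ToTextContentHandler(java.io.Writer writer)
Creates a content handler that writes character events to the given writer.
Parameters:
writer - writer

public ToTextContentHandler(java.io.OutputStream stream)
Creates a content handler that writes character events to the given output stream using the platform default encoding.
Parameters:
stream - output stream

public ToTextContentHandler(java.io.OutputStream stream, java.lang.String encoding) throws java.io.UnsupportedEncodingException
Creates a content handler that writes character events to the given output stream using the given encoding.
Parameters:
stream - output stream
encoding - output encoding
Throws:
java.io.UnsupportedEncodingException - if the encoding is unsupported

public ToTextContentHandler()
Creates a content handler that writes character events to an internal string buffer. Use the toString() method to access the collected character content.

public void characters(char[] ch, int start, int length) throws org.xml.sax.SAXException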
Writes the given characters to the given character stream.
Specified by:
characters in interface org.xml.sax.ContentHandler
Overrides:
characters in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException
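
public void ignorableWhitespace(char[] ch, int start, int length) throws org.xml.sax.SAXException
Writes the given ignorable characters to the given character stream. The default implementation simply forwards the call to the characters(char[], int, int) method.
Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Overrides:
ignorableWhitespace in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException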

public void endDocument() throws org.xml.sax.SAXException
Flushes the character stream so that no characters are forgotten in internal buffers.
Specified by:
endDocument in interface org.xml.sax.ContentHandler
Overrides:
endDocument in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException - if the stream cannot be flushed

public void startElement(java.lang.String uri, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes atts) throws org.xml.sax.SAXException
Specified by:
startElement in interface org.xml.sax.ContentHandler
Overrides:
startElement in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException

public void endElement(java.lang.String uri, java.lang.String localName, java.lang.String qName) throws org.xml.sax.SAXException
Specified by:
endElement in interface org.xml.sax.ContentHandler
Overrides:
endElement in class org.xml.sax.helpers.DefaultHandler
Throws:
org.xml.sax.SAXException
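
public java.lang.String toString()
Returns the contents of the internal string buffer where all the received characters have been collected. Only works when this object was constructed using the empty default constructor or by passing a StringWriter to the ToTextContentHandler(java.io.Writer) constructor.
Overrides:
toString in class java.lang.Object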
"Copyright © 2010 - 2020 Adobe Systems Incorporated. All Rights Reserved"