public abstract class AbstractRecursiveParserWrapperHandler
extends org.xml.sax.helpers.DefaultHandler
implements java.io.Serializable
This is a special handler to be used only with the RecursiveParserWrapper.
It allows for finer-grained processing of embedded documents than the legacy handlers do.
Subclasses can choose how to process individual embedded documents.

Modifier and Type | Field and Description |
---|---|
static Property | EMBEDDED_EXCEPTION |
static Property | EMBEDDED_RESOURCE_LIMIT_REACHED |
static Property | EMBEDDED_RESOURCE_PATH |
static Property | PARSE_TIME_MILLIS |
static Property | TIKA_CONTENT |
static Property | TIKA_CONTENT_HANDLER: Simple class name of the content handler |
static Property | WRITE_LIMIT_REACHED |

Constructor and Description |
---|
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory) |
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources) |

Modifier and Type | Method and Description |
---|---|
void | endDocument(org.xml.sax.ContentHandler contentHandler, Metadata metadata): This is called after the full parse has completed. |
void | endEmbeddedDocument(org.xml.sax.ContentHandler contentHandler, Metadata metadata): This is called after parsing each embedded document. |
ContentHandlerFactory | getContentHandlerFactory() |
org.xml.sax.ContentHandler | getNewContentHandler() |
org.xml.sax.ContentHandler | getNewContentHandler(java.io.OutputStream os, java.nio.charset.Charset charset) |
boolean | hasHitMaximumEmbeddedResources() |
void | startEmbeddedDocument(org.xml.sax.ContentHandler contentHandler, Metadata metadata): This is called before parsing each embedded document. |

Methods inherited from class org.xml.sax.helpers.DefaultHandler: characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning

public static final Property TIKA_CONTENT
public static final Property TIKA_CONTENT_HANDLER
public static final Property PARSE_TIME_MILLIS
public static final Property WRITE_LIMIT_REACHED
public static final Property EMBEDDED_RESOURCE_LIMIT_REACHED
public static final Property EMBEDDED_EXCEPTION
public static final Property EMBEDDED_RESOURCE_PATH
public AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
public AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources)
public org.xml.sax.ContentHandler getNewContentHandler()
public org.xml.sax.ContentHandler getNewContentHandler(java.io.OutputStream os, java.nio.charset.Charset charset)
public void startEmbeddedDocument(org.xml.sax.ContentHandler contentHandler, Metadata metadata) throws org.xml.sax.SAXException
Parameters:
contentHandler - local handler to be used on this embedded document
metadata - embedded document's metadata
Throws:
org.xml.sax.SAXException
public void endEmbeddedDocument(org.xml.sax.ContentHandler contentHandler, Metadata metadata) throws org.xml.sax.SAXException
Parameters:
contentHandler - content handler that was used on this embedded document
metadata - metadata for this embedded document
Throws:
org.xml.sax.SAXException
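
The startEmbeddedDocument and endEmbeddedDocument callbacks bracket the parse of each embedded document, so a subclass can capture per-document output in the end callback. The following is a minimal sketch of that pattern using only JDK classes. It is not the library's implementation: `Map<String, String>` stands in for Metadata, and `TextCollector`, `CollectingHandler`, `simulate`, and the `sketch:content` key are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Sketch only: Map<String, String> is a stand-in for the Metadata class. */
class EmbeddedCallbackSketch {

    /** Hypothetical local handler that buffers the character data of one document. */
    static class TextCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public String toString() {
            return text.toString();
        }
    }

    /** Hypothetical handler that mirrors the start/end callback pair. */
    static class CollectingHandler {
        final List<Map<String, String>> embedded = new ArrayList<>();
        int started = 0;

        void startEmbeddedDocument(ContentHandler contentHandler, Map<String, String> metadata)
                throws SAXException {
            // Called before the embedded document is parsed: nothing is buffered yet.
            started++;
        }

        void endEmbeddedDocument(ContentHandler contentHandler, Map<String, String> metadata)
                throws SAXException {
            // Called after the parse: the local handler now holds this document's text.
            metadata.put("sketch:content", contentHandler.toString()); // hypothetical key
            embedded.add(metadata);
        }
    }

    /** Plays the role of the wrapper: one fresh handler and metadata per embedded document. */
    static CollectingHandler simulate(String... embeddedTexts) {
        CollectingHandler handler = new CollectingHandler();
        try {
            for (String t : embeddedTexts) {
                TextCollector local = new TextCollector();
                Map<String, String> metadata = new LinkedHashMap<>();
                handler.startEmbeddedDocument(local, metadata);
                local.characters(t.toCharArray(), 0, t.length());
                handler.endEmbeddedDocument(local, metadata);
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler;
    }
}
```

Calling `EmbeddedCallbackSketch.simulate("first", "second")` leaves two metadata maps in `embedded`, one per embedded document, each carrying the text its own local handler collected.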
public void endDocument(org.xml.sax.ContentHandler contentHandler, Metadata metadata) throws org.xml.sax.SAXException
This is called after the full parse has completed. Make sure to call super.endDocument(...) in subclasses, because it records in the metadata whether the embedded resource maximum has been hit.
Parameters:
contentHandler - content handler that was used on the main document
metadata - metadata that was gathered for the main document
Throws:
org.xml.sax.SAXException
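
To illustrate why an overriding endDocument should delegate to the superclass, here is a minimal sketch with JDK-only stand-ins. It is not the library's code: `BaseHandler`, `GoodSubclass`, `ForgetfulSubclass`, `run`, and the `sketch:limit_reached` key are hypothetical, `Map<String, String>` stands in for Metadata, and the convention that a negative maximum means "no limit" is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch only: models a base handler that records a limit flag in endDocument. */
class EndDocumentSketch {

    static final String LIMIT_KEY = "sketch:limit_reached"; // hypothetical metadata key

    /** Hypothetical base class that counts embedded documents. */
    static class BaseHandler {
        private final int maxEmbeddedResources; // assumption: negative means "no limit"
        private int embeddedResources = 0;

        BaseHandler(int maxEmbeddedResources) {
            this.maxEmbeddedResources = maxEmbeddedResources;
        }

        void startEmbeddedDocument(Map<String, String> metadata) {
            embeddedResources++;
        }

        boolean hasHitMaximumEmbeddedResources() {
            return maxEmbeddedResources >= 0 && embeddedResources >= maxEmbeddedResources;
        }

        void endDocument(Map<String, String> metadata) {
            // The base class is the only place that writes the flag.
            if (hasHitMaximumEmbeddedResources()) {
                metadata.put(LIMIT_KEY, "true");
            }
        }
    }

    /** Delegates to the base class, so the flag reaches the metadata. */
    static class GoodSubclass extends BaseHandler {
        GoodSubclass(int max) {
            super(max);
        }

        @Override
        void endDocument(Map<String, String> metadata) {
            super.endDocument(metadata);
            metadata.put("sketch:done", "true"); // custom post-processing
        }
    }

    /** Skips the super call, so the flag is silently lost. */
    static class ForgetfulSubclass extends BaseHandler {
        ForgetfulSubclass(int max) {
            super(max);
        }

        @Override
        void endDocument(Map<String, String> metadata) {
            metadata.put("sketch:done", "true");
        }
    }

    /** Simulates a parse with the given number of embedded documents. */
    static Map<String, String> run(BaseHandler handler, int embeddedCount) {
        for (int i = 0; i < embeddedCount; i++) {
            handler.startEmbeddedDocument(new LinkedHashMap<>());
        }
        Map<String, String> mainMetadata = new LinkedHashMap<>();
        handler.endDocument(mainMetadata);
        return mainMetadata;
    }
}
```

With a maximum of 2 and three embedded documents, `run(new GoodSubclass(2), 3)` yields metadata containing the limit key, while `run(new ForgetfulSubclass(2), 3)` does not, even though `hasHitMaximumEmbeddedResources()` would return true in both cases.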
public boolean hasHitMaximumEmbeddedResources()
public ContentHandlerFactory getContentHandlerFactory()
"Copyright © 2010 - 2020 Adobe Systems Incorporated. All Rights Reserved"