public class OfficeParser extends AbstractOfficeParser
Modifier and Type | Class and Description |
---|---|
static class | OfficeParser.POIFSDocumentType |
Constructor and Description |
---|
OfficeParser() |
Modifier and Type | Method and Description |
---|---|
static void | extractMacros(POIFSFileSystem fs, org.xml.sax.ContentHandler xhtml, EmbeddedDocumentExtractor embeddedDocumentExtractor) Helper to extract macros from an NPOIFS/vbaProject.bin. As of POI-3.15-final, there are still some bugs in VBAMacroReader. |
java.util.Set<MediaType> | getSupportedTypes(ParseContext context) Returns the set of media types supported by this parser when used with the given parse context. |
void | parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context) Extracts properties and text from an MS Document input stream. |
configure, getExtractAllAlternativesFromMSG, getExtractMacros, getIncludeDeletedContent, getIncludeMoveFromContent, getUseSAXDocxExtractor, setConcatenatePhoneticRuns, setExtractAllAlternativesFromMSG, setExtractMacros, setIncludeDeletedContent, setIncludeMoveFromContent, setIncludeShapeBasedContent, setUseSAXDocxExtractor, setUseSAXPptxExtractor
parse
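The `parse` method summarized above reports what it extracts as XHTML SAX events to the `org.xml.sax.ContentHandler` it is given. The following is a minimal sketch of such a handler, using only the SAX classes bundled with the JDK. The names `TextCollectingHandler`, `HandlerDemo`, and `simulateParagraph` are hypothetical and not part of this API, and the events are fired by hand on the assumption that the parser emits namespaced XHTML `p` elements.

```java
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of this API): collects character data from
// SAX events and appends a newline whenever a <p> element is closed.
final class TextCollectingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(localName)) {
            text.append('\n');
        }
    }

    String getText() {
        return text.toString();
    }
}

class HandlerDemo {
    // Fires, by hand, the events a parser might emit for a single paragraph.
    // This stands in for a real parse call so the sketch runs without a document.
    static String simulateParagraph(String content) throws SAXException {
        TextCollectingHandler handler = new TextCollectingHandler();
        String xhtml = "http://www.w3.org/1999/xhtml";
        handler.startDocument();
        handler.startElement(xhtml, "p", "p", new AttributesImpl());
        char[] chars = content.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endElement(xhtml, "p", "p");
        handler.endDocument();
        return handler.getText();
    }

    public static void main(String[] args) throws SAXException {
        System.out.print(simulateParagraph("Hello from a simulated parser"));
        // With the library on the classpath, the same handler could be passed as
        // the handler argument of parse(stream, handler, metadata, context).
    }
}
```

Collecting text in a handler keeps the caller independent of how the parser structures its XHTML output; only the `characters` and `endElement` callbacks matter for plain-text extraction.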
public java.util.Set<MediaType> getSupportedTypes(ParseContext context)
Parser
context - parse context
public void parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context) throws java.io.IOException, org.xml.sax.SAXException, TikaException
stream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
java.io.IOException - if the document stream could not be read
org.xml.sax.SAXException - if the SAX events could not be processed
TikaException - if the document could not be parsed
public static void extractMacros(POIFSFileSystem fs, org.xml.sax.ContentHandler xhtml, EmbeddedDocumentExtractor embeddedDocumentExtractor) throws java.io.IOException, org.xml.sax.SAXException
fs - NPOIFS to extract from
xhtml - SAX writer
embeddedDocumentExtractor - extractor for embedded documents
java.io.IOException - if an I/O error occurs while extracting the embedded document
org.xml.sax.SAXException - if a SAX error occurs while writing to xhtml
Copyright © 2010 - 2020 Adobe Systems Incorporated. All Rights Reserved