Interface OOXMLExtractor
All Known Implementing Classes:
AbstractOOXMLExtractor, POIXMLTextExtractorDecorator, SXSLFPowerPointExtractorDecorator, SXWPFWordExtractorDecorator, XPSExtractorDecorator, XSLFPowerPointExtractorDecorator, XSSFBExcelExtractorDecorator, XSSFExcelExtractorDecorator, XWPFWordExtractorDecorator
public interface OOXMLExtractor
Interface implemented by all Tika OOXML extractors.
See Also:
POIXMLTextExtractor
Method Summary
POIXMLDocument getDocument()
Returns the opened document.
MetadataExtractor getMetadataExtractor()
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
void getXHTML(org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context)
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Method Detail
getDocument
POIXMLDocument getDocument()
Returns the opened document.
See Also:
POIXMLTextExtractor.getDocument()
getMetadataExtractor
MetadataExtractor getMetadataExtractor()
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
getXHTML
void getXHTML(org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context) throws org.xml.sax.SAXException, XmlException, java.io.IOException, TikaException
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Throws:
org.xml.sax.SAXException
XmlException
java.io.IOException
TikaException
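The following is a minimal sketch, not part of Tika, of a content handler that could be supplied as the handler argument of getXHTML. The class name ParagraphCollector and its demo() method are hypothetical. It also assumes that paragraphs arrive as p elements. The SAX events are fired by hand to stand in for whatever an extractor would actually send, and only JDK SAX classes are used, so the sketch runs without Tika or POI on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects the text of each p element received as SAX events. */
class ParagraphCollector extends DefaultHandler {
    private static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    private final List<String> paragraphs = new ArrayList<>();
    private StringBuilder current; // non-null only while inside a p element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("p".equals(localName)) {
            current = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than assign.
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(localName) && current != null) {
            paragraphs.add(current.toString());
            current = null;
        }
    }

    List<String> getParagraphs() {
        return paragraphs;
    }

    /**
     * Fires a few element events by hand (document-level events omitted for brevity).
     * In real use the events would come from an extractor's getXHTML call instead.
     */
    static List<String> demo() {
        ParagraphCollector handler = new ParagraphCollector();
        handler.startElement(XHTML_NS, "body", "body", new AttributesImpl());
        for (String text : new String[] {"First paragraph", "Second paragraph"}) {
            handler.startElement(XHTML_NS, "p", "p", new AttributesImpl());
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endElement(XHTML_NS, "p", "p");
        }
        handler.endElement(XHTML_NS, "body", "body");
        return handler.getParagraphs();
    }
}
```

With a real extractor, the same handler instance would be passed to getXHTML together with a Metadata and a ParseContext, and getParagraphs() would be read once the call returns.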