Interface OOXMLExtractor
All Known Implementing Classes:
 AbstractOOXMLExtractor, POIXMLTextExtractorDecorator, SXSLFPowerPointExtractorDecorator, SXWPFWordExtractorDecorator, XPSExtractorDecorator, XSLFPowerPointExtractorDecorator, XSSFBExcelExtractorDecorator, XSSFExcelExtractorDecorator, XWPFWordExtractorDecorator
public interface OOXMLExtractor
Interface implemented by all Tika OOXML extractors.
See Also:
 POIXMLTextExtractor
 
Method Summary
Modifier and Type    Method                    Description
POIXMLDocument       getDocument()             Returns the opened document.
MetadataExtractor    getMetadataExtractor()    POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
void                 getXHTML(org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context)
                                               Parses the document into a sequence of XHTML SAX events sent to the given content handler.
 
Method Detail
getDocument
POIXMLDocument getDocument()
Returns the opened document.
See Also:
 POIXMLTextExtractor.getDocument()
 
getMetadataExtractor
MetadataExtractor getMetadataExtractor()
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
getXHTML
void getXHTML(org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context) throws org.xml.sax.SAXException, XmlException, java.io.IOException, TikaException
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Throws:
 org.xml.sax.SAXException
 XmlException
 java.io.IOException
 TikaException
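
Because getXHTML reports content as SAX events rather than returning a string, the caller supplies the org.xml.sax.ContentHandler that decides what to do with them. The following is a minimal sketch of such a handler, using only JDK classes. The names TextCollectingHandler, XhtmlEventDemo and emitSampleEvents are hypothetical, and the events are written by hand as a stand-in for what an extractor might send; they are an assumption for illustration, not actual Tika output.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: gathers character data and ends each <p> with a newline.
class TextCollectingHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(localName)) {
            text.append('\n');
        }
    }

    String getText() {
        return text.toString();
    }
}

class XhtmlEventDemo {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Hand-written stand-in for an extractor's event stream (assumed shape, not real output).
    static void emitSampleEvents(ContentHandler handler) throws SAXException {
        Attributes none = new AttributesImpl();
        handler.startDocument();
        handler.startElement(XHTML_NS, "html", "html", none);
        handler.startElement(XHTML_NS, "body", "body", none);
        for (String para : new String[] {"First paragraph", "Second paragraph"}) {
            handler.startElement(XHTML_NS, "p", "p", none);
            char[] chars = para.toCharArray();
            handler.characters(chars, 0, chars.length);
            handler.endElement(XHTML_NS, "p", "p");
        }
        handler.endElement(XHTML_NS, "body", "body");
        handler.endElement(XHTML_NS, "html", "html");
        handler.endDocument();
    }

    // Feeds the sample events into the collecting handler and returns the gathered text.
    static String collectSampleText() {
        TextCollectingHandler handler = new TextCollectingHandler();
        try {
            emitSampleEvents(handler);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.getText();
    }

    public static void main(String[] args) {
        System.out.print(collectSampleText());
    }
}
```

With a real extractor, the same handler would be passed as the first argument of getXHTML, together with a Metadata and a ParseContext; obtaining the extractor instance is not shown here.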
 