Class XWPFEventBasedWordExtractor
- java.lang.Object
  - org.apache.poi.extractor.POITextExtractor
    - org.apache.poi.ooxml.extractor.POIXMLTextExtractor
      - org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
- All Implemented Interfaces: java.io.Closeable, java.lang.AutoCloseable
public class XWPFEventBasedWordExtractor extends POIXMLTextExtractor
Experimental class based on POI's XSSFEventBasedExcelExtractor.
-
Constructor Summary
Constructors:
- XWPFEventBasedWordExtractor(java.lang.String path)
- XWPFEventBasedWordExtractor(OPCPackage container)
-
Method Summary
Modifier and Type, Method, Description:
- POIXMLProperties.CoreProperties getCoreProperties(): Returns the core document properties
- POIXMLProperties.CustomProperties getCustomProperties(): Returns the custom document properties
- POIXMLProperties.ExtendedProperties getExtendedProperties(): Returns the extended document properties
- OPCPackage getPackage(): Returns the opened OPCPackage that contains the document
- java.lang.String getText(): Retrieves all the text from the document.
- static void main(java.lang.String[] args)
-
Methods inherited from class org.apache.poi.ooxml.extractor.POIXMLTextExtractor
close, getDocument, getMetadataTextExtractor
-
Methods inherited from class org.apache.poi.extractor.POITextExtractor
setFilesystem
-
Constructor Detail
-
XWPFEventBasedWordExtractor
public XWPFEventBasedWordExtractor(java.lang.String path) throws XmlException, OpenXML4JException, java.io.IOException
- Throws: XmlException, OpenXML4JException, java.io.IOException
-
XWPFEventBasedWordExtractor
public XWPFEventBasedWordExtractor(OPCPackage container) throws XmlException, OpenXML4JException, java.io.IOException
- Throws: XmlException, OpenXML4JException, java.io.IOException
-
Method Detail
-
main
public static void main(java.lang.String[] args) throws java.lang.Exception
- Throws: java.lang.Exception
-
getPackage
public OPCPackage getPackage()
Description copied from class: POIXMLTextExtractor
Returns the opened OPCPackage that contains the document
- Overrides: getPackage in class POIXMLTextExtractor
- Returns: the opened OPCPackage
-
getCoreProperties
public POIXMLProperties.CoreProperties getCoreProperties()
Description copied from class: POIXMLTextExtractor
Returns the core document properties
- Overrides: getCoreProperties in class POIXMLTextExtractor
- Returns: the core document properties
-
getExtendedProperties
public POIXMLProperties.ExtendedProperties getExtendedProperties()
Description copied from class: POIXMLTextExtractor
Returns the extended document properties
- Overrides: getExtendedProperties in class POIXMLTextExtractor
- Returns: the extended document properties
-
getCustomProperties
public POIXMLProperties.CustomProperties getCustomProperties()
Description copied from class: POIXMLTextExtractor
Returns the custom document properties
- Overrides: getCustomProperties in class POIXMLTextExtractor
- Returns: the custom document properties
-
getText
public java.lang.String getText()
Description copied from class: POITextExtractor
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadoc for a specific project for details.
- Specified by: getText in class POITextExtractor
- Returns: all the text from the document
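
Because the separation of paragraphs in the returned text is implementation-specific, it may help to see what event-based extraction looks like in general. The following is a minimal sketch using only the JDK (java.util.zip and SAX); it is not the code of XWPFEventBasedWordExtractor. The class name EventBasedDocxTextSketch, its helper methods, and the choice of one newline per paragraph are assumptions made for illustration. It streams word/document.xml, appends the characters found inside w:t elements, and emits a newline when a w:p element ends.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Tika or POI.
public class EventBasedDocxTextSketch {

    // WordprocessingML main namespace used by w:p and w:t elements.
    static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** SAX handler: collects w:t text and ends each w:p with a newline (an assumed convention). */
    static class TextHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        private boolean inText = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (W_NS.equals(uri) && "t".equals(localName)) {
                inText = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (!W_NS.equals(uri)) {
                return;
            }
            if ("t".equals(localName)) {
                inText = false;
            } else if ("p".equals(localName)) {
                text.append('\n');
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (inText) {
                text.append(ch, start, length);
            }
        }
    }

    /** Streams the main document part out of a .docx (zip) stream and returns its text. */
    static String extractText(InputStream docx) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if ("word/document.xml".equals(entry.getName())) {
                    TextHandler handler = new TextHandler();
                    factory.newSAXParser().parse(new InputSource(zip), handler);
                    return handler.text.toString();
                }
            }
        }
        throw new IOException("word/document.xml not found");
    }

    /** Builds a tiny in-memory package containing only the main document part. */
    static byte[] sampleDocx() throws IOException {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<w:document xmlns:w=\"" + W_NS + "\"><w:body>"
                + "<w:p><w:r><w:t>Hello</w:t></w:r>"
                + "<w:r><w:t xml:space=\"preserve\"> world</w:t></w:r></w:p>"
                + "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
                + "</w:body></w:document>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("word/document.xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws Exception {
        System.out.print(extractText(new ByteArrayInputStream(sampleDocx())));
    }
}
```

The sample package here is deliberately incomplete (it has no content types or relationships part), so it is only suitable for this sketch. A real extractor also has to deal with headers, footers, tables, footnotes, and similar parts, which this sketch ignores.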
-