Package org.apache.poi.extractor
Class POITextExtractor
- java.lang.Object
  - org.apache.poi.extractor.POITextExtractor
- All Implemented Interfaces: java.io.Closeable, java.lang.AutoCloseable
- Direct Known Subclasses: POIOLE2TextExtractor, POIXMLTextExtractor
public abstract class POITextExtractor extends java.lang.Object implements java.io.Closeable
Common parent for text extractors of POI documents. You will typically find the implementation of a given format's text extractor under org.apache.poi.[format].extractor.
- See Also: ExcelExtractor, PowerPointExtractor, VisioTextExtractor, WordExtractor
-
Constructor Summary
Constructors
- POITextExtractor()
-
Method Summary
Modifier and Type, Method, Description
- void close(): Allows the resources of the Extractor to be freed as soon as it is no longer needed.
- abstract java.lang.Object getDocument()
- abstract POITextExtractor getMetadataTextExtractor(): Returns another text extractor that outputs the textual content of the document's metadata and properties, such as author and title.
- abstract java.lang.String getText(): Retrieves all the text from the document.
- void setFilesystem(java.io.Closeable fs): Used to ensure file handle cleanup.
-
Method Detail
-
getText
public abstract java.lang.String getText()
Retrieves all the text from the document. How cells, paragraphs, etc. are separated in the text is implementation-specific; see the Javadocs for the specific project for details.
- Returns: all the text from the document
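Because separators are implementation-specific, callers should not assume a fixed delimiter. The following is a minimal sketch of that idea only: `TextSourceSketch`, `TabbedRowsSketch`, and `ParagraphsSketch` are hypothetical stand-ins that mirror the `getText()` signature, and the tab/newline conventions are assumptions for illustration, not the documented behaviour of any POI extractor.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in that mirrors only the documented getText() signature.
// It is not the POI class and does not depend on POI.
abstract class TextSourceSketch {
    public abstract String getText();
}

// Assumed spreadsheet-style behaviour: tab between cells, newline after each row.
class TabbedRowsSketch extends TextSourceSketch {
    private final List<List<String>> rows;

    TabbedRowsSketch(List<List<String>> rows) {
        this.rows = rows;
    }

    @Override
    public String getText() {
        StringBuilder sb = new StringBuilder();
        for (List<String> row : rows) {
            sb.append(String.join("\t", row)).append('\n');
        }
        return sb.toString();
    }
}

// Assumed document-style behaviour: one paragraph per line, no trailing newline.
class ParagraphsSketch extends TextSourceSketch {
    private final List<String> paragraphs;

    ParagraphsSketch(List<String> paragraphs) {
        this.paragraphs = paragraphs;
    }

    @Override
    public String getText() {
        return String.join("\n", paragraphs);
    }
}

class GetTextDemo {
    public static void main(String[] args) {
        TextSourceSketch sheet = new TabbedRowsSketch(Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c", "d")));
        TextSourceSketch doc = new ParagraphsSketch(Arrays.asList("first", "second"));
        // Same method, different separator conventions per implementation.
        System.out.println(sheet.getText());
        System.out.println(doc.getText());
    }
}
```

Code that post-processes extracted text is safer if it checks the concrete extractor's own documentation before splitting on a particular character.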
-
getMetadataTextExtractor
public abstract POITextExtractor getMetadataTextExtractor()
Returns another text extractor that outputs the textual content of the document's metadata and properties, such as author and title.
- Returns: the metadata text extractor
-
setFilesystem
public void setFilesystem(java.io.Closeable fs)
Used to ensure file handle cleanup.
- Parameters: fs - filesystem to close
-
close
public void close() throws java.io.IOException
Allows the resources of the Extractor to be freed as soon as it is no longer needed. This may include closing open file handles and freeing memory. The Extractor cannot be used after close has been called.
- Specified by: close in interface java.lang.AutoCloseable
- Specified by: close in interface java.io.Closeable
- Throws: java.io.IOException
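Since the class implements java.io.Closeable, an extractor can be managed with try-with-resources so that close is called even when extraction fails. The sketch below shows the pattern without depending on POI: `ExtractorSketch`, `FixedTextSketch`, and `CloseDemo` are hypothetical stand-ins that mirror the documented signatures, and the assumption that close also closes the resource passed to setFilesystem is an illustration, not a statement from this page.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in mirroring the documented signatures; it is not the POI class.
abstract class ExtractorSketch implements Closeable {
    private Closeable fsToClose; // resource registered through setFilesystem

    public abstract String getText();

    public void setFilesystem(Closeable fs) {
        this.fsToClose = fs;
    }

    @Override
    public void close() throws IOException {
        // Assumption: closing the extractor also closes the registered resource.
        if (fsToClose != null) {
            fsToClose.close();
        }
    }
}

// Trivial subclass that returns a fixed string instead of parsing a document.
class FixedTextSketch extends ExtractorSketch {
    @Override
    public String getText() {
        return "hello";
    }
}

class CloseDemo {
    // Extracts text inside try-with-resources; closed[0] is set once the backing resource is closed.
    static String extractAndClose(boolean[] closed) {
        Closeable backingFile = () -> closed[0] = true; // stands in for an open file handle
        try (ExtractorSketch extractor = new FixedTextSketch()) {
            extractor.setFilesystem(backingFile);
            return extractor.getText();
        } catch (IOException e) {
            // close() may throw; rethrow unchecked to keep the sketch short.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] closed = {false};
        String text = extractAndClose(closed);
        System.out.println(text + " closed=" + closed[0]);
    }
}
```

With a real extractor the same shape applies: read the text inside the try block and do not touch the extractor afterwards, since it cannot be used once close has been called.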
-
getDocument
public abstract java.lang.Object getDocument()
- Returns: the processed document
-