Package org.apache.tika.extractor
Extraction of component documents.
Interface Summary

Interface                    Description
ContainerExtractor           Tika container extractor interface.
DocumentSelector             Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
EmbeddedDocumentExtractor
EmbeddedResourceHandler      Tika container extractor callback interface.

Class Summary

Class                              Description
EmbeddedDocumentUtil               Utility class to handle common issues with embedded documents.
ParserContainerExtractor           An implementation of ContainerExtractor powered by the regular Parser API.
ParsingEmbeddedDocumentExtractor   Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
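
Example (illustrative sketch only). The summaries above describe two cooperating roles: a selection strategy that decides which embedded documents to process, and a callback that receives each extracted resource. The sketch below shows that pattern in plain Java. It does not use the Tika API: the types `Selector`, `ResourceHandler`, and `Embedded`, the methods `extract` and `selectedNames`, and the metadata keys are hypothetical stand-ins chosen for this example. Consult the individual interface pages for the real signatures.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Self-contained sketch of the selector + callback pattern.
 * None of these types are Tika classes; they are hypothetical stand-ins.
 */
public class EmbeddedSelectionSketch {

    /** Stand-in for a document selection strategy (the role of DocumentSelector). */
    interface Selector {
        boolean select(Map<String, String> metadata);
    }

    /** Stand-in for an extraction callback (the role of EmbeddedResourceHandler). */
    interface ResourceHandler {
        void handle(String filename, String mediaType, byte[] content);
    }

    /** Hypothetical embedded component: a name, a media type, and raw bytes. */
    static final class Embedded {
        final String filename;
        final String mediaType;
        final byte[] content;

        Embedded(String filename, String mediaType, byte[] content) {
            this.filename = filename;
            this.mediaType = mediaType;
            this.content = content;
        }
    }

    /** Walks the components and passes each selected one to the handler. */
    static void extract(List<Embedded> container, Selector selector, ResourceHandler handler) {
        for (Embedded e : container) {
            // Illustrative metadata keys, assumed for this sketch.
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("resourceName", e.filename);
            metadata.put("Content-Type", e.mediaType);
            if (selector.select(metadata)) {
                handler.handle(e.filename, e.mediaType, e.content);
            }
        }
    }

    /** Convenience wrapper that collects the filenames of the selected components. */
    static List<String> selectedNames(List<Embedded> container, Selector selector) {
        List<String> names = new ArrayList<>();
        extract(container, selector, (filename, mediaType, content) -> names.add(filename));
        return names;
    }

    public static void main(String[] args) {
        List<Embedded> container = new ArrayList<>();
        container.add(new Embedded("logo.png", "image/png", new byte[] {1, 2, 3}));
        container.add(new Embedded("notes.txt", "text/plain", new byte[] {104, 105}));
        container.add(new Embedded("chart.png", "image/png", new byte[] {4}));

        // Strategy: keep only components whose media type is an image.
        Selector imagesOnly = md -> md.get("Content-Type").startsWith("image/");
        System.out.println(selectedNames(container, imagesOnly)); // prints [logo.png, chart.png]
    }
}
```

Keeping the selection strategy separate from the handler means the same traversal can be reused with different filters (by media type, by name, or accept-all) without changing the code that consumes the extracted resources.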