Package org.apache.tika.metadata
Interface TikaCoreProperties
- 
public interface TikaCoreProperties
Contains a core set of basic Tika metadata properties, which all parsers will attempt to supply (where the file format permits). These are all defined in terms of other standard namespaces. Users of Tika who wish to have consistent metadata across file formats can make use of these Properties, knowing that where present they will have consistent semantic meaning between different file formats. (No matter if one file format calls it Title, another Long-Title and another Long-Name, if they all mean the same thing as defined by DublinCore.TITLE then they will all be present as such.) For now, most of these properties are composite ones including the deprecated non-prefixed String properties from the Metadata class. In Tika 2.0, most of these will revert to simple assignments.
- Since:
 - Apache Tika 1.2
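
The consistency guarantee described above can be pictured as a lookup from format-specific field names to one canonical property. The following is only a rough sketch and does not use Tika's own classes: a plain java.util map stands in for the Metadata object, the alias names are the ones from the example in the description, and the canonical key "dc:title" is an assumption about the name behind DublinCore.TITLE.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in, not part of Tika: maps format-specific title
// fields onto one canonical key so callers read a single property.
final class TitleNormalizer {
    // Assumed name of the property behind DublinCore.TITLE.
    static final String CANONICAL_TITLE = "dc:title";

    // Hypothetical aliases, taken from the example in the description.
    private static final Map<String, String> ALIASES = Map.of(
            "Title", CANONICAL_TITLE,
            "Long-Title", CANONICAL_TITLE,
            "Long-Name", CANONICAL_TITLE);

    // Returns a copy of raw in which known aliases are renamed to the
    // canonical key; unknown keys are kept unchanged.
    static Map<String, String> normalize(Map<String, String> raw) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            out.put(ALIASES.getOrDefault(e.getKey(), e.getKey()), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(normalize(Map.of("Long-Title", "Annual Report")));
    }
}
```

A caller that only ever looks up the canonical key does not need to know which alias the source format used.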
 
 
- 
- 
Nested Class Summary
Nested Classes
Modifier and Type | Interface | Description
static class | TikaCoreProperties.EmbeddedResourceType | A file might contain different types of embedded documents.
- 
Field Summary
Fields
Modifier and Type | Field | Description
static Property | ALTITUDE
static Property | COMMENTS
static Property | CONTENT_TYPE_HINT | This is currently used to identify Content-Type that may be included within a document, such as in html documents.
static Property | CONTENT_TYPE_OVERRIDE
static Property | CONTRIBUTOR
static Property | COVERAGE
static Property | CREATED
static Property | CREATOR
static Property | CREATOR_TOOL
static Property | DESCRIPTION
static Property | EMBEDDED_RESOURCE_TYPE | Embedded resource type property
static java.lang.String | EMBEDDED_RESOURCE_TYPE_KEY
static Property | FORMAT
static Property | HAS_SIGNATURE
static Property | IDENTIFIER
static Property | KEYWORDS | DublinCore.SUBJECT; should include both subject and keywords if a document format has both.
static Property | LANGUAGE
static Property | LATITUDE
static Property | LONGITUDE
static Property | METADATA_DATE
static Property | MODIFIED
static Property | MODIFIER
static Property | ORIGINAL_RESOURCE_NAME | Some file formats can store information about their original file name/location or about their attachment's original file name/location.
static Property | PRINT_DATE
static Property | PUBLISHER
static Property | RATING
static Property | RELATION
static Property | RIGHTS
static Property | SOURCE
static Property | TIKA_META_EXCEPTION_EMBEDDED_STREAM | Use this to store exceptions caught while trying to read the stream of an embedded resource.
static java.lang.String | TIKA_META_EXCEPTION_PREFIX | Use this to store parse exception information in the Metadata object.
static Property | TIKA_META_EXCEPTION_WARNING | Use this to store exceptions caught during a parse that are non-fatal.
static java.lang.String | TIKA_META_PREFIX | Use this to prefix metadata properties that store information about the parsing process.
static Property | TITLE
static Property | TRANSITION_KEYWORDS_TO_DC_SUBJECT | Deprecated. Use TikaCoreProperties#KEYWORDS
static Property | TRANSITION_SUBJECT_TO_DC_DESCRIPTION | Deprecated. Use TikaCoreProperties#DESCRIPTION
static Property | TRANSITION_SUBJECT_TO_DC_TITLE | Deprecated. Use TikaCoreProperties#TITLE
static Property | TRANSITION_SUBJECT_TO_OO_SUBJECT | Deprecated. Use OfficeOpenXMLCore#SUBJECT
static Property | TYPE
 - 
 
- 
- 
Field Detail
- 
TIKA_META_PREFIX
static final java.lang.String TIKA_META_PREFIX
Use this to prefix metadata properties that store information about the parsing process. Users should be able to distinguish between metadata that was contained within the document and metadata about the parsing process. In Tika 2.0 (or earlier?), let's change X-ParsedBy to X-TIKA-Parsed-By.
- See Also:
 - Constant Field Values
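
As a rough illustration of why a shared prefix helps, the sketch below splits a set of metadata keys into document metadata and parse-process metadata. It does not use Tika's classes: a plain map stands in for the Metadata object, the prefix value "X-TIKA:" is an assumption (the actual value is listed under Constant Field Values), and the key "X-TIKA:example" is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in, not part of Tika: splits metadata keys into
// document metadata and parse-process metadata by key prefix.
final class MetadataPartitioner {
    // Assumed value of TIKA_META_PREFIX; check Constant Field Values.
    static final String ASSUMED_TIKA_META_PREFIX = "X-TIKA:";

    // Returns only the entries whose key starts with the prefix
    // (parseProcess == true) or only those that do not (false).
    static Map<String, String> select(Map<String, String> all, boolean parseProcess) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : all.entrySet()) {
            if (e.getKey().startsWith(ASSUMED_TIKA_META_PREFIX) == parseProcess) {
                out.put(e.getKey(), e.getValue());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> all = new LinkedHashMap<>();
        all.put("dc:title", "Annual Report");                  // came from the document
        all.put(ASSUMED_TIKA_META_PREFIX + "example", "value"); // hypothetical process key
        System.out.println("document: " + select(all, false));
        System.out.println("process:  " + select(all, true));
    }
}
```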
 
 
- 
TIKA_META_EXCEPTION_PREFIX
static final java.lang.String TIKA_META_EXCEPTION_PREFIX
Use this to store parse exception information in the Metadata object.
- See Also:
 - Constant Field Values
 
 
- 
TIKA_META_EXCEPTION_WARNING
static final Property TIKA_META_EXCEPTION_WARNING
Use this to store exceptions caught during a parse that are non-fatal, e.g. if a parser is in lenient mode and more content can be extracted if we ignore an exception thrown by a dependency. 
- 
TIKA_META_EXCEPTION_EMBEDDED_STREAM
static final Property TIKA_META_EXCEPTION_EMBEDDED_STREAM
Use this to store exceptions caught while trying to read the stream of an embedded resource. Do not use this if there is a parse exception on the embedded resource. 
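
The distinction drawn above, between failing to read an embedded stream and failing to parse it, might be sketched as follows. This is a minimal illustration under stated assumptions, not Tika's implementation: a plain map stands in for the Metadata object, both key names are assumed for the example, and parse is a hypothetical parser that rejects empty content.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative stand-in, not part of Tika: records a failure to read an
// embedded stream under a different key than a failure to parse it.
final class EmbeddedFailureRecorder {
    // Both key names are assumptions made for this sketch.
    static final String STREAM_KEY = "X-TIKA:EXCEPTION:embedded_stream_exception";
    static final String PARSE_KEY = "X-TIKA:EXCEPTION:embedded_exception";

    // Hypothetical parser: rejects empty content.
    static void parse(byte[] content) {
        if (content.length == 0) {
            throw new IllegalStateException("nothing to parse");
        }
    }

    // Test helper: a stream that fails on the first read.
    static InputStream brokenStream() {
        return new InputStream() {
            @Override
            public int read() throws IOException {
                throw new IOException("truncated attachment");
            }
        };
    }

    static Map<String, List<String>> process(InputStream embedded) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        byte[] content;
        try {
            content = embedded.readAllBytes();
        } catch (IOException e) {
            // The stream itself could not be read.
            metadata.computeIfAbsent(STREAM_KEY, k -> new ArrayList<>()).add(e.toString());
            return metadata;
        }
        try {
            parse(content);
        } catch (RuntimeException e) {
            // The stream was read, but parsing failed: not a stream error.
            metadata.computeIfAbsent(PARSE_KEY, k -> new ArrayList<>()).add(e.toString());
        }
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(process(brokenStream()).keySet());
        System.out.println(process(new ByteArrayInputStream(new byte[0])).keySet());
    }
}
```

Keeping the two failure kinds under separate keys lets a consumer tell a damaged container apart from a readable attachment that a parser could not handle.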
- 
EMBEDDED_RESOURCE_TYPE_KEY
static final java.lang.String EMBEDDED_RESOURCE_TYPE_KEY
- See Also:
 - Constant Field Values
 
 
- 
ORIGINAL_RESOURCE_NAME
static final Property ORIGINAL_RESOURCE_NAME
Some file formats can store information about their original file name/location or about their attachment's original file name/location. 
- 
CONTENT_TYPE_HINT
static final Property CONTENT_TYPE_HINT
This is currently used to identify a Content-Type that may be included within a document, such as in HTML documents (e.g. in a meta http-equiv tag), or the value might come from outside the document. This information may be faulty and should be treated only as a hint.
- 
CONTENT_TYPE_OVERRIDE
static final Property CONTENT_TYPE_OVERRIDE
 
- 
FORMAT
static final Property FORMAT
- See Also:
 DublinCore.FORMAT
 
- 
IDENTIFIER
static final Property IDENTIFIER
- See Also:
 DublinCore.IDENTIFIER
 
- 
CONTRIBUTOR
static final Property CONTRIBUTOR
- See Also:
 DublinCore.CONTRIBUTOR
 
- 
COVERAGE
static final Property COVERAGE
- See Also:
 DublinCore.COVERAGE
 
- 
CREATOR
static final Property CREATOR
- See Also:
 DublinCore.CREATOR
 
- 
MODIFIER
static final Property MODIFIER
- See Also:
 Office.LAST_AUTHOR
 
- 
CREATOR_TOOL
static final Property CREATOR_TOOL
- See Also:
 XMP.CREATOR_TOOL
 
- 
LANGUAGE
static final Property LANGUAGE
- See Also:
 DublinCore.LANGUAGE
 
- 
PUBLISHER
static final Property PUBLISHER
- See Also:
 DublinCore.PUBLISHER
 
- 
RELATION
static final Property RELATION
- See Also:
 DublinCore.RELATION
 
- 
RIGHTS
static final Property RIGHTS
- See Also:
 DublinCore.RIGHTS
 
- 
SOURCE
static final Property SOURCE
- See Also:
 DublinCore.SOURCE
 
- 
TYPE
static final Property TYPE
- See Also:
 DublinCore.TYPE
 
- 
TITLE
static final Property TITLE
- See Also:
 DublinCore.TITLE
 
- 
DESCRIPTION
static final Property DESCRIPTION
- See Also:
 DublinCore.DESCRIPTION
 
- 
KEYWORDS
static final Property KEYWORDS
DublinCore.SUBJECT; should include both subject and keywords if a document format has both. See also Office.KEYWORDS and OfficeOpenXMLCore.SUBJECT.
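
For a format that carries both a subject field and a keywords field, combining them into one multi-valued list could look roughly like the sketch below. It is illustrative only and does not use Tika's classes; trimming values and dropping blanks and exact duplicates are design choices made for the example, not behaviour stated in this documentation.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative stand-in, not part of Tika: merges subject and keyword
// values into one list, as a parser might before setting KEYWORDS.
final class KeywordMerger {
    // Subjects come first, then keywords; first-seen order is kept.
    static List<String> merge(List<String> subjects, List<String> keywords) {
        Set<String> merged = new LinkedHashSet<>();
        addAllNonBlank(merged, subjects);
        addAllNonBlank(merged, keywords);
        return new ArrayList<>(merged);
    }

    // Trims each value and skips blanks; the set drops exact duplicates.
    private static void addAllNonBlank(Set<String> target, List<String> values) {
        for (String value : values) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                target.add(trimmed);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(merge(List.of("finance"), List.of("budget", "finance")));
    }
}
```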
- 
CREATED
static final Property CREATED
- See Also:
 DublinCore.DATE, Office.CREATION_DATE
 
- 
MODIFIED
static final Property MODIFIED
- See Also:
 DublinCore.MODIFIED, Metadata.DATE, Office.SAVE_DATE
 
- 
PRINT_DATE
static final Property PRINT_DATE
- See Also:
 Office.PRINT_DATE
 
- 
METADATA_DATE
static final Property METADATA_DATE
- See Also:
 XMP.METADATA_DATE
 
- 
LATITUDE
static final Property LATITUDE
- See Also:
 Geographic.LATITUDE
 
- 
LONGITUDE
static final Property LONGITUDE
- See Also:
 Geographic.LONGITUDE
 
- 
ALTITUDE
static final Property ALTITUDE
- See Also:
 Geographic.ALTITUDE
 
- 
RATING
static final Property RATING
- See Also:
 XMP.RATING
 
- 
COMMENTS
static final Property COMMENTS
- See Also:
 OfficeOpenXMLExtended.COMMENTS
 
- 
TRANSITION_KEYWORDS_TO_DC_SUBJECT
@Deprecated static final Property TRANSITION_KEYWORDS_TO_DC_SUBJECT
Deprecated. Use TikaCoreProperties#KEYWORDS
- See Also:
 DublinCore.SUBJECT
 
- 
TRANSITION_SUBJECT_TO_DC_DESCRIPTION
@Deprecated static final Property TRANSITION_SUBJECT_TO_DC_DESCRIPTION
Deprecated. Use TikaCoreProperties#DESCRIPTION
- See Also:
 OfficeOpenXMLExtended.COMMENTS
 
- 
TRANSITION_SUBJECT_TO_DC_TITLE
@Deprecated static final Property TRANSITION_SUBJECT_TO_DC_TITLE
Deprecated. Use TikaCoreProperties#TITLE
- See Also:
 DublinCore.TITLE
 
- 
TRANSITION_SUBJECT_TO_OO_SUBJECT
@Deprecated static final Property TRANSITION_SUBJECT_TO_OO_SUBJECT
Deprecated. Use OfficeOpenXMLCore#SUBJECT
- See Also:
 OfficeOpenXMLCore.SUBJECT
 
- 
EMBEDDED_RESOURCE_TYPE
static final Property EMBEDDED_RESOURCE_TYPE
Embedded resource type property 
- 
HAS_SIGNATURE
static final Property HAS_SIGNATURE
 