public class TikaInputStream extends TaggedInputStream
An input stream that allows files and other resources to be associated with the InputStream instance passed through the Parser interface and other similar APIs.
TikaInputStream instances can be created using the various static get() factory methods. Most of these methods take an optional Metadata argument that is then filled with the available input metadata from the given resource. The created TikaInputStream instance keeps track of the original resource used to create it, while behaving otherwise just like a normal, buffered InputStream.
A TikaInputStream instance is also guaranteed to support the mark(int) feature.
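Guaranteed mark support is what makes it safe to look at the first bytes of a stream and then rewind. The following is a minimal JDK-only sketch of that idea, not Tika's implementation: a BufferedInputStream stands in for a TikaInputStream, and the class name LookAheadSketch and the method lookAhead are hypothetical names chosen for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tika: shows mark/reset based look-ahead.
final class LookAheadSketch {

    // Reads up to buffer.length upcoming bytes, then rewinds so the caller
    // still sees the stream at its original position.
    static int lookAhead(InputStream in, byte[] buffer) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(buffer.length);
        int total = 0;
        try {
            while (total < buffer.length) {
                int n = in.read(buffer, total, buffer.length - total);
                if (n == -1) {
                    break; // end of stream reached before the buffer was full
                }
                total += n;
            }
        } finally {
            in.reset(); // rewind to the marked position
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.7 rest of file".getBytes(StandardCharsets.US_ASCII);
        try (InputStream in = new BufferedInputStream(new ByteArrayInputStream(data))) {
            byte[] header = new byte[4];
            int n = lookAhead(in, header);
            System.out.println(n + " bytes: " + new String(header, 0, n, StandardCharsets.US_ASCII));
            System.out.println("next byte read: " + (char) in.read());
        }
    }
}
```

With a real TikaInputStream the same effect is available through peek(byte[]), which this class documents as filling a buffer without advancing the stream position.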
Code that wants to access the underlying file or other resources associated with a TikaInputStream should first use the get(InputStream) factory method to cast or wrap a given InputStream into a TikaInputStream instance.
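The "cast or wrap" behaviour can be pictured with a small JDK-only sketch. TrackedStream below is a hypothetical stand-in, not a Tika class, and it only illustrates the pattern under the assumption that wrapping is a plain delegation: get reuses an instance that already has the right type and wraps anything else, while cast never wraps and returns null instead.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical stand-in for a resource-tracking stream; not a Tika class.
final class TrackedStream extends FilterInputStream {

    private TrackedStream(InputStream in) {
        super(in);
    }

    // "Cast or wrap": reuse the instance if it already has the right type,
    // otherwise wrap it so callers always get a TrackedStream back.
    static TrackedStream get(InputStream stream) {
        if (stream instanceof TrackedStream) {
            return (TrackedStream) stream;
        }
        return new TrackedStream(stream);
    }

    // Plain cast: null signals "not a TrackedStream"; nothing is wrapped.
    static TrackedStream cast(InputStream stream) {
        return stream instanceof TrackedStream ? (TrackedStream) stream : null;
    }
}
```

Because an already wrapped stream is returned unchanged, calling get repeatedly along a processing chain does not stack wrappers, and any resources attached to the first wrapper stay reachable.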
Modifier and Type | Method and Description |
---|---|
static TikaInputStream | cast(java.io.InputStream stream): Returns the given stream cast to a TikaInputStream, or null if the stream is not a TikaInputStream. |
void | close(): Invokes the delegate's close() method. |
static TikaInputStream | get(java.sql.Blob blob): Creates a TikaInputStream from the given database BLOB. |
static TikaInputStream | get(java.sql.Blob blob, Metadata metadata): Creates a TikaInputStream from the given database BLOB. |
static TikaInputStream | get(byte[] data): Creates a TikaInputStream from the given array of bytes. |
static TikaInputStream | get(byte[] data, Metadata metadata): Creates a TikaInputStream from the given array of bytes. |
static TikaInputStream | get(java.io.File file): Deprecated. Use get(Path). In Tika 2.0, this will be removed or modified to throw an IOException. |
static TikaInputStream | get(java.io.File file, Metadata metadata): Deprecated. Use get(Path, Metadata). In Tika 2.0, this will be removed or modified to throw an IOException. |
static TikaInputStream | get(java.io.InputStream stream): Casts or wraps the given stream to a TikaInputStream instance. |
static TikaInputStream | get(java.io.InputStream stream, TemporaryResources tmp): Casts or wraps the given stream to a TikaInputStream instance. |
static TikaInputStream | get(java.nio.file.Path path): Creates a TikaInputStream from the file at the given path. |
static TikaInputStream | get(java.nio.file.Path path, Metadata metadata): Creates a TikaInputStream from the file at the given path. |
static TikaInputStream | get(java.net.URI uri): Creates a TikaInputStream from the resource at the given URI. |
static TikaInputStream | get(java.net.URI uri, Metadata metadata): Creates a TikaInputStream from the resource at the given URI. |
static TikaInputStream | get(java.net.URL url): Creates a TikaInputStream from the resource at the given URL. |
static TikaInputStream | get(java.net.URL url, Metadata metadata): Creates a TikaInputStream from the resource at the given URL. |
java.io.File | getFile() |
java.nio.channels.FileChannel | getFileChannel() |
long | getLength(): Returns the length (in bytes) of this stream. |
java.lang.Object | getOpenContainer(): Returns the open container object, such as a POIFS FileSystem in the event of an OLE2 document being detected and processed by the OLE2 detector. |
java.nio.file.Path | getPath(): If the user created this TikaInputStream with a file, the original file will be returned. |
java.nio.file.Path | getPath(int maxBytes) |
long | getPosition(): Returns the current position within the stream. |
boolean | hasFile() |
boolean | hasLength() |
static boolean | isTikaInputStream(java.io.InputStream stream): Checks whether the given stream is a TikaInputStream instance. |
void | mark(int readlimit): Invokes the delegate's mark(int) method. |
boolean | markSupported(): Invokes the delegate's markSupported() method. |
int | peek(byte[] buffer): Fills the given buffer with upcoming bytes from this stream without advancing the current stream position. |
void | reset(): Invokes the delegate's reset() method. |
void | setOpenContainer(java.lang.Object container): Stores the open container object against the stream, e.g. after a Zip contents detector has loaded the file to decide what it contains. |
long | skip(long ln): Invokes the delegate's skip(long) method. |
java.lang.String | toString() |
Methods inherited from class TaggedInputStream: isCauseOf, throwIfCauseOf
Methods inherited from class ProxyInputStream: available, read, read, read
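
public static boolean isTikaInputStream(java.io.InputStream stream)
Checks whether the given stream is a TikaInputStream instance. The given stream can be null, in which case the return value is false.
Parameters:
stream - input stream, possibly null
Returns:
true if the stream is a TikaInputStream instance, false otherwise

public static TikaInputStream get(java.io.InputStream stream, TemporaryResources tmp)
Casts or wraps the given stream to a TikaInputStream instance.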
The given temporary file provider is used for any temporary files, and should be disposed when the returned stream is no longer used.
Use this method instead of the get(InputStream) alternative when you don't explicitly close the returned stream. The recommended access pattern is:
    try (TemporaryResources tmp = new TemporaryResources()) {
        TikaInputStream stream = TikaInputStream.get(..., tmp);
        // process stream but don't close it
    }
The given stream instance will not be closed when the TemporaryResources.close() method is called by the try-with-resources statement. The caller is expected to explicitly close the original stream when it's no longer used.
Parameters:
stream - normal input stream

public static TikaInputStream get(java.io.InputStream stream)
Casts or wraps the given stream to a TikaInputStream instance.
Use this method instead of the get(InputStream, TemporaryResources) alternative when you do explicitly close the returned stream. The recommended access pattern is:
    try (TikaInputStream stream = TikaInputStream.get(...)) {
        // process stream
    }
The given stream instance will be closed along with any other resources associated with the returned TikaInputStream instance when the close() method is called by the try-with-resources statement.
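Parameters:
stream - normal input stream

public static TikaInputStream cast(java.io.InputStream stream)
Returns the given stream cast to a TikaInputStream, or null if the stream is not a TikaInputStream.
Parameters:
stream - normal input stream

public static TikaInputStream get(byte[] data)
Creates a TikaInputStream from the given array of bytes.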
Note that you must always explicitly close the returned stream as in some cases it may end up writing the given data to a temporary file.
Parameters:
data - input data

public static TikaInputStream get(byte[] data, Metadata metadata)
Creates a TikaInputStream from the given array of bytes.
Note that you must always explicitly close the returned stream as in some cases it may end up writing the given data to a temporary file.
Parameters:
data - input data
metadata - metadata instance
Throws:
java.io.IOException

public static TikaInputStream get(java.nio.file.Path path) throws java.io.IOException
Creates a TikaInputStream from the file at the given path.
Note that you must always explicitly close the returned stream to prevent leaking open file handles.
Parameters:
path - input file
Throws:
java.io.IOException - if an I/O error occurs

public static TikaInputStream get(java.nio.file.Path path, Metadata metadata) throws java.io.IOException
Creates a TikaInputStream from the file at the given path.
Note that you must always explicitly close the returned stream to prevent leaking open file handles.
Parameters:
path - input file
metadata - metadata instance
Throws:
java.io.IOException - if an I/O error occurs

@Deprecated public static TikaInputStream get(java.io.File file) throws java.io.FileNotFoundException
Deprecated. Use get(Path). In Tika 2.0, this will be removed or modified to throw an IOException.
Note that you must always explicitly close the returned stream to prevent leaking open file handles.
Parameters:
file - input file
Throws:
java.io.FileNotFoundException - if the file does not exist

@Deprecated public static TikaInputStream get(java.io.File file, Metadata metadata) throws java.io.FileNotFoundException
Deprecated. Use get(Path, Metadata). In Tika 2.0, this will be removed or modified to throw an IOException.
Note that you must always explicitly close the returned stream to prevent leaking open file handles.
Parameters:
file - input file
metadata - metadata instance
Throws:
java.io.FileNotFoundException - if the file does not exist or cannot be opened for reading

public static TikaInputStream get(java.sql.Blob blob) throws java.sql.SQLException
Creates a TikaInputStream from the given database BLOB.
Note that the result set containing the BLOB may need to be kept open until the returned TikaInputStream has been processed and closed. You must also always explicitly close the returned stream as in some cases it may end up writing the blob data to a temporary file.
Parameters:
blob - database BLOB
Throws:
java.sql.SQLException - if BLOB data cannot be accessed

public static TikaInputStream get(java.sql.Blob blob, Metadata metadata) throws java.sql.SQLException
Creates a TikaInputStream from the given database BLOB.
Note that the result set containing the BLOB may need to be kept open until the returned TikaInputStream has been processed and closed. You must also always explicitly close the returned stream as in some cases it may end up writing the blob data to a temporary file.
Parameters:
blob - database BLOB
metadata - metadata instance
Throws:
java.sql.SQLException - if BLOB data cannot be accessed

public static TikaInputStream get(java.net.URI uri) throws java.io.IOException
Creates a TikaInputStream from the resource at the given URI.
Note that you must always explicitly close the returned stream as in some cases it may end up writing the resource to a temporary file.
Parameters:
uri - resource URI
Throws:
java.io.IOException - if the resource cannot be accessed

public static TikaInputStream get(java.net.URI uri, Metadata metadata) throws java.io.IOException
Creates a TikaInputStream from the resource at the given URI.
Note that you must always explicitly close the returned stream as in some cases it may end up writing the resource to a temporary file.
Parameters:
uri - resource URI
metadata - metadata instance
Throws:
java.io.IOException - if the resource cannot be accessed

public static TikaInputStream get(java.net.URL url) throws java.io.IOException
Creates a TikaInputStream from the resource at the given URL.
Note that you must always explicitly close the returned stream as in some cases it may end up writing the resource to a temporary file.
Parameters:
url - resource URL
Throws:
java.io.IOException - if the resource cannot be accessed

public static TikaInputStream get(java.net.URL url, Metadata metadata) throws java.io.IOException
Creates a TikaInputStream from the resource at the given URL.
Note that you must always explicitly close the returned stream as in some cases it may end up writing the resource to a temporary file.
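Parameters:
url - resource URL
metadata - metadata instance
Throws:
java.io.IOException - if the resource cannot be accessed

public int peek(byte[] buffer) throws java.io.IOException
Fills the given buffer with upcoming bytes from this stream without advancing the current stream position.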
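Parameters:
buffer - byte buffer
Throws:
java.io.IOException - if the stream cannot be read

public java.lang.Object getOpenContainer()
Returns the open container object, such as a POIFS FileSystem in the event of an OLE2 document being detected and processed by the OLE2 detector.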
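
public void setOpenContainer(java.lang.Object container)
Stores the open container object against the stream, e.g. after a Zip contents detector has loaded the file to decide what it contains.

public boolean hasFile()

public java.nio.file.Path getPath() throws java.io.IOException
If the user created this TikaInputStream with a file, the original file will be returned.
Throws:
java.io.IOException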

public java.nio.file.Path getPath(int maxBytes) throws java.io.IOException
Parameters:
maxBytes - if this is less than 0 and if an underlying file doesn't already exist, the full file will be spooled to disk
Returns:
the original path used to create this TikaInputStream, a temporary file if the stream was shorter than maxBytes, or null if the underlying stream was longer than maxBytes
Throws:
java.io.IOException
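The maxBytes contract (spool everything when the limit is negative, give up with null when the stream is longer than the limit) can be sketched with the JDK alone. This is an illustration under stated assumptions, not Tika's implementation: SpoolSketch and spool are hypothetical names, the sketch consumes the input stream without trying to rewind it, and it deletes the partial temporary file when the limit is exceeded.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Tika: spools a stream to a temporary file.
final class SpoolSketch {

    // A negative maxBytes means "no limit". Returns null, and removes the
    // partial file, if the stream holds more than maxBytes bytes.
    static Path spool(InputStream in, int maxBytes) throws IOException {
        Path tmp = Files.createTempFile("spool-sketch-", ".tmp");
        boolean keep = false;
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] chunk = new byte[8192];
            long seen = 0;
            int n;
            while ((n = in.read(chunk)) != -1) {
                seen += n;
                if (maxBytes >= 0 && seen > maxBytes) {
                    return null; // stream is longer than the caller allows
                }
                out.write(chunk, 0, n);
            }
            keep = true;
            return tmp;
        } finally {
            // Runs after the output stream has been closed.
            if (!keep) {
                Files.deleteIfExists(tmp);
            }
        }
    }
}
```

Once the content is on disk, Files.size(path) reports its length, which is why spooling to a file is also a way to learn the length of a stream that did not declare one. The caller owns the returned file and should delete it when done.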

public java.io.File getFile() throws java.io.IOException
Throws:
java.io.IOException
See Also:
getPath()

public java.nio.channels.FileChannel getFileChannel() throws java.io.IOException
Throws:
java.io.IOException

public boolean hasLength()
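
public long getLength() throws java.io.IOException
Returns the length (in bytes) of this stream. If the length is not already known, this method uses the getPath() method to buffer the entire stream to a temporary file in order to calculate the stream length. This case will only work if the stream has not yet been consumed.
Throws:
java.io.IOException - if the length cannot be determined

public long getPosition()
Returns the current position within the stream.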
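
public long skip(long ln) throws java.io.IOException
Description copied from class: ProxyInputStream
Invokes the delegate's skip(long) method.
Overrides:
skip in class ProxyInputStream
Parameters:
ln - the number of bytes to skip
Throws:
java.io.IOException - if an I/O error occurs

public void mark(int readlimit)
Description copied from class: ProxyInputStream
Invokes the delegate's mark(int) method.
Overrides:
mark in class ProxyInputStream
Parameters:
readlimit - read ahead limit

public boolean markSupported()
Description copied from class: ProxyInputStream
Invokes the delegate's markSupported() method.
Overrides:
markSupported in class ProxyInputStream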

public void reset() throws java.io.IOException
Description copied from class: ProxyInputStream
Invokes the delegate's reset() method.
Overrides:
reset in class ProxyInputStream
Throws:
java.io.IOException - if an I/O error occurs

public void close() throws java.io.IOException
Description copied from class: ProxyInputStream
Invokes the delegate's close() method.
Specified by:
close in interface java.io.Closeable
Specified by:
close in interface java.lang.AutoCloseable
Overrides:
close in class ProxyInputStream
Throws:
java.io.IOException - if an I/O error occurs

public java.lang.String toString()
Overrides:
toString in class TaggedInputStream
"Copyright © 2010 - 2020 Adobe Systems Incorporated. All Rights Reserved"