Class BOMInputStream

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable

    public class BOMInputStream
    extends ProxyInputStream
    This class is used to wrap a stream that includes an encoded ByteOrderMark as its first bytes.

    This class detects these bytes and, if required, can automatically skip them and return the subsequent byte as the first byte in the stream.
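
    The detect-and-skip behaviour described above can be illustrated with plain JDK classes. This is only a sketch under stated assumptions, not the BOMInputStream implementation: the class Utf8BomSkipSketch and its methods wrap and readAll are hypothetical names invented for illustration, and only the UTF-8 BOM (bytes EF BB BF) is handled.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;

// Illustrative sketch only: plain JDK code, not the BOMInputStream implementation.
final class Utf8BomSkipSketch {

    // The UTF-8 encoding of U+FEFF.
    private static final int[] UTF8_BOM = {0xEF, 0xBB, 0xBF};

    /**
     * Returns a stream positioned after a leading UTF-8 BOM when include is false,
     * or at the very first byte when include is true or no BOM is present.
     */
    static InputStream wrap(InputStream in, boolean include) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, UTF8_BOM.length);
        byte[] head = new byte[UTF8_BOM.length];
        int n = pushback.readNBytes(head, 0, head.length); // may be short at end of stream

        boolean hasBom = n == UTF8_BOM.length;
        for (int i = 0; hasBom && i < n; i++) {
            hasBom = (head[i] & 0xFF) == UTF8_BOM[i];
        }
        if (include || !hasBom) {
            pushback.unread(head, 0, n); // put back everything we peeked at
        }
        return pushback;
    }

    /** Convenience wrapper for in-memory data, so callers avoid checked exceptions. */
    static byte[] readAll(byte[] data, boolean include) {
        try (InputStream in = wrap(new ByteArrayInputStream(data), include)) {
            return in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

    With include set to false the first byte returned is the one after the BOM; with include set to true the BOM bytes are returned like any other data.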

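    The ByteOrderMark implementation has the following pre-defined BOMs:

      • UTF-8 - ByteOrderMark.UTF_8
      • UTF-16BE - ByteOrderMark.UTF_16BE
      • UTF-16LE - ByteOrderMark.UTF_16LE
      • UTF-32BE - ByteOrderMark.UTF_32BE
      • UTF-32LE - ByteOrderMark.UTF_32LE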

    Example 1 - Detect and exclude a UTF-8 BOM

     BOMInputStream bomIn = new BOMInputStream(in);
     if (bomIn.hasBOM()) {
         // has a UTF-8 BOM
     }
     

    Example 2 - Detect a UTF-8 BOM (but don't exclude it)

     boolean include = true;
     BOMInputStream bomIn = new BOMInputStream(in, include);
     if (bomIn.hasBOM()) {
         // has a UTF-8 BOM
     }
     

    Example 3 - Detect Multiple BOMs

     BOMInputStream bomIn = new BOMInputStream(in,
       ByteOrderMark.UTF_16LE, ByteOrderMark.UTF_16BE,
       ByteOrderMark.UTF_32LE, ByteOrderMark.UTF_32BE
       );
     if (!bomIn.hasBOM()) {
         // No BOM found
     } else if (bomIn.hasBOM(ByteOrderMark.UTF_16LE)) {
         // has a UTF-16LE BOM
     } else if (bomIn.hasBOM(ByteOrderMark.UTF_16BE)) {
         // has a UTF-16BE BOM
     } else if (bomIn.hasBOM(ByteOrderMark.UTF_32LE)) {
         // has a UTF-32LE BOM
     } else if (bomIn.hasBOM(ByteOrderMark.UTF_32BE)) {
         // has a UTF-32BE BOM
     }
     
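
    When several BOMs are candidates, the order of comparison matters, because the UTF-32LE BOM (FF FE 00 00) begins with the UTF-16LE BOM (FF FE). The following is a minimal sketch of longest-first matching in plain JDK code. It is not the library's detection logic; BomDetectSketch and detect are hypothetical names used only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: plain JDK code, not the library's detection logic.
final class BomDetectSketch {

    // Ordered longest-first so UTF-32LE (FF FE 00 00) is tested before UTF-16LE (FF FE).
    private static final Map<String, int[]> BOMS = new LinkedHashMap<>();
    static {
        BOMS.put("UTF-32BE", new int[] {0x00, 0x00, 0xFE, 0xFF});
        BOMS.put("UTF-32LE", new int[] {0xFF, 0xFE, 0x00, 0x00});
        BOMS.put("UTF-8", new int[] {0xEF, 0xBB, 0xBF});
        BOMS.put("UTF-16BE", new int[] {0xFE, 0xFF});
        BOMS.put("UTF-16LE", new int[] {0xFF, 0xFE});
    }

    /** Returns the charset name of the BOM that starts head, or null if none matches. */
    static String detect(byte[] head) {
        for (Map.Entry<String, int[]> entry : BOMS.entrySet()) {
            if (startsWith(head, entry.getValue())) {
                return entry.getKey();
            }
        }
        return null;
    }

    // True when data begins with the unsigned byte values in prefix.
    private static boolean startsWith(byte[] data, int[] prefix) {
        if (data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

    Testing the two-byte UTF-16LE pattern first would classify every UTF-32LE stream as UTF-16LE, which is why the sketch keeps the four-byte patterns at the front of the map.
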
    Since:
    2.0
    See Also:
    ByteOrderMark, Wikipedia - Byte Order Mark
    • Constructor Summary

      Constructors 
      Constructor Description
      BOMInputStream​(java.io.InputStream delegate)
      Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.
      BOMInputStream​(java.io.InputStream delegate, boolean include)
      Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.
      BOMInputStream​(java.io.InputStream delegate, boolean include, ByteOrderMark... boms)
      Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.
      BOMInputStream​(java.io.InputStream delegate, ByteOrderMark... boms)
      Constructs a new BOM InputStream that excludes the specified BOMs.
    • Method Summary

      Modifier and Type Method Description
      ByteOrderMark getBOM()
      Returns the BOM (Byte Order Mark).
      java.lang.String getBOMCharsetName()
      Returns the BOM charset name, as given by ByteOrderMark.getCharsetName().
      boolean hasBOM()
      Indicates whether the stream contains one of the specified BOMs.
      boolean hasBOM​(ByteOrderMark bom)
      Indicates whether the stream contains the specified BOM.
      void mark​(int readlimit)
      Invokes the delegate's mark(int) method.
      int read()
      Invokes the delegate's read() method, detecting and optionally skipping the BOM.
      int read​(byte[] buf)
      Invokes the delegate's read(byte[]) method, detecting and optionally skipping the BOM.
      int read​(byte[] buf, int off, int len)
      Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping the BOM.
      void reset()
      Invokes the delegate's reset() method.
      long skip​(long n)
      Invokes the delegate's skip(long) method, detecting and optionally skipping the BOM.
      • Methods inherited from class java.io.InputStream

        nullInputStream, readAllBytes, readNBytes, readNBytes, transferTo
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • BOMInputStream

        public BOMInputStream​(java.io.InputStream delegate)
        Constructs a new BOM InputStream that excludes a ByteOrderMark.UTF_8 BOM.
        Parameters:
        delegate - the InputStream to delegate to
      • BOMInputStream

        public BOMInputStream​(java.io.InputStream delegate,
                              boolean include)
        Constructs a new BOM InputStream that detects a ByteOrderMark.UTF_8 BOM and optionally includes it.
        Parameters:
        delegate - the InputStream to delegate to
        include - true to include the UTF-8 BOM or false to exclude it
      • BOMInputStream

        public BOMInputStream​(java.io.InputStream delegate,
                              ByteOrderMark... boms)
        Constructs a new BOM InputStream that excludes the specified BOMs.
        Parameters:
        delegate - the InputStream to delegate to
        boms - The BOMs to detect and exclude
      • BOMInputStream

        public BOMInputStream​(java.io.InputStream delegate,
                              boolean include,
                              ByteOrderMark... boms)
        Constructs a new BOM InputStream that detects the specified BOMs and optionally includes them.
        Parameters:
        delegate - the InputStream to delegate to
        include - true to include the specified BOMs or false to exclude them
        boms - The BOMs to detect and optionally exclude
    • Method Detail

      • hasBOM

        public boolean hasBOM()
                       throws java.io.IOException
        Indicates whether the stream contains one of the specified BOMs.
        Returns:
        true if the stream has one of the specified BOMs, otherwise false
        Throws:
        java.io.IOException - if an error occurs while reading the first bytes of the stream
      • hasBOM

        public boolean hasBOM​(ByteOrderMark bom)
                       throws java.io.IOException
        Indicates whether the stream contains the specified BOM.
        Parameters:
        bom - The BOM to check for
        Returns:
        true if the stream has the specified BOM, otherwise false
        Throws:
        java.lang.IllegalArgumentException - if the BOM is not one that the stream is configured to detect
        java.io.IOException - if an error occurs while reading the first bytes of the stream
      • getBOM

        public ByteOrderMark getBOM()
                             throws java.io.IOException
        Returns the BOM (Byte Order Mark).
        Returns:
        The BOM, or null if none was found
        Throws:
        java.io.IOException - if an error occurs while reading the first bytes of the stream
      • getBOMCharsetName

        public java.lang.String getBOMCharsetName()
                                           throws java.io.IOException
        Returns the BOM charset name, as given by ByteOrderMark.getCharsetName().
        Returns:
        The BOM charset name, or null if no BOM was found
        Throws:
        java.io.IOException - if an error occurs while reading the first bytes of the stream
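
        Because the charset name is null when no BOM was found, callers typically need a fallback before decoding. The following is a minimal sketch in plain JDK code: BomCharsetSketch, charsetOrDefault and readerFor are hypothetical helpers invented for illustration, and the bomCharsetName argument stands in for the value that getBOMCharsetName() would return.

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.Charset;

// Illustrative sketch only: hypothetical helpers, not part of the library.
final class BomCharsetSketch {

    /** Maps a possibly-null BOM charset name to a Charset, using fallback when no BOM was found. */
    static Charset charsetOrDefault(String bomCharsetName, Charset fallback) {
        return bomCharsetName == null ? fallback : Charset.forName(bomCharsetName);
    }

    /** Builds a Reader that decodes in with the BOM's charset, or with fallback if there was no BOM. */
    static Reader readerFor(InputStream in, String bomCharsetName, Charset fallback) {
        return new InputStreamReader(in, charsetOrDefault(bomCharsetName, fallback));
    }
}
```

        The fallback is passed in explicitly rather than taken from the platform default, so the decoding of BOM-less input does not vary between machines.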
      • read

        public int read()
                 throws java.io.IOException
        Invokes the delegate's read() method, detecting and optionally skipping the BOM.
        Overrides:
        read in class ProxyInputStream
        Returns:
        the byte read (excluding the BOM), or -1 at the end of the stream
        Throws:
        java.io.IOException - if an I/O error occurs
      • read

        public int read​(byte[] buf,
                        int off,
                        int len)
                 throws java.io.IOException
        Invokes the delegate's read(byte[], int, int) method, detecting and optionally skipping the BOM.
        Overrides:
        read in class ProxyInputStream
        Parameters:
        buf - the buffer to read the bytes into
        off - The start offset
        len - The number of bytes to read (excluding the BOM)
        Returns:
        the number of bytes read, or -1 at the end of the stream
        Throws:
        java.io.IOException - if an I/O error occurs
      • read

        public int read​(byte[] buf)
                 throws java.io.IOException
        Invokes the delegate's read(byte[]) method, detecting and optionally skipping the BOM.
        Overrides:
        read in class ProxyInputStream
        Parameters:
        buf - the buffer to read the bytes into
        Returns:
        the number of bytes read (excluding the BOM), or -1 at the end of the stream
        Throws:
        java.io.IOException - if an I/O error occurs
      • mark

        public void mark​(int readlimit)
        Invokes the delegate's mark(int) method.
        Overrides:
        mark in class ProxyInputStream
        Parameters:
        readlimit - read ahead limit
      • reset

        public void reset()
                   throws java.io.IOException
        Invokes the delegate's reset() method.
        Overrides:
        reset in class ProxyInputStream
        Throws:
        java.io.IOException - if an I/O error occurs
      • skip

        public long skip​(long n)
                  throws java.io.IOException
        Invokes the delegate's skip(long) method, detecting and optionally skipping the BOM.
        Overrides:
        skip in class ProxyInputStream
        Parameters:
        n - the number of bytes to skip
        Returns:
        the number of bytes skipped, or -1 at the end of the stream
        Throws:
        java.io.IOException - if an I/O error occurs