Class SafeContentHandler

  • All Implemented Interfaces:
    org.xml.sax.ContentHandler, org.xml.sax.DTDHandler, org.xml.sax.EntityResolver, org.xml.sax.ErrorHandler
    Direct Known Subclasses:
    XHTMLContentHandler, XMPContentHandler

    public class SafeContentHandler
    extends ContentHandlerDecorator
    Content handler decorator that makes sure that the character events (characters(char[], int, int) or ignorableWhitespace(char[], int, int)) passed to the decorated content handler contain only valid XML characters. All invalid characters are replaced with the Unicode replacement character U+FFFD (though a subclass may change this by overriding the writeReplacement(Output) method).

    The XML standard defines the following Unicode character ranges as valid XML characters:

     #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
     

    Note that currently this class only detects those invalid characters whose UTF-16 representation fits a single char. Also, this class does not ensure that the UTF-16 encoding of incoming characters is correct.

    • Constructor Detail

      • SafeContentHandler

        public SafeContentHandler​(org.xml.sax.ContentHandler handler)
    • Method Detail

      • startElement

        public void startElement​(java.lang.String uri,
                                 java.lang.String localName,
                                 java.lang.String name,
                                 org.xml.sax.Attributes atts)
                          throws org.xml.sax.SAXException
        Specified by:
        startElement in interface org.xml.sax.ContentHandler
        Overrides:
        startElement in class ContentHandlerDecorator
        Throws:
        org.xml.sax.SAXException
      • endElement

        public void endElement​(java.lang.String uri,
                               java.lang.String localName,
                               java.lang.String name)
                        throws org.xml.sax.SAXException
        Specified by:
        endElement in interface org.xml.sax.ContentHandler
        Overrides:
        endElement in class ContentHandlerDecorator
        Throws:
        org.xml.sax.SAXException
      • endDocument

        public void endDocument()
                         throws org.xml.sax.SAXException
        Specified by:
        endDocument in interface org.xml.sax.ContentHandler
        Overrides:
        endDocument in class ContentHandlerDecorator
        Throws:
        org.xml.sax.SAXException
      • characters

        public void characters​(char[] ch,
                               int start,
                               int length)
                        throws org.xml.sax.SAXException
        Specified by:
        characters in interface org.xml.sax.ContentHandler
        Overrides:
        characters in class ContentHandlerDecorator
        Throws:
        org.xml.sax.SAXException
      • ignorableWhitespace

        public void ignorableWhitespace​(char[] ch,
                                        int start,
                                        int length)
                                 throws org.xml.sax.SAXException
        Specified by:
        ignorableWhitespace in interface org.xml.sax.ContentHandler
        Overrides:
        ignorableWhitespace in class ContentHandlerDecorator
        Throws:
        org.xml.sax.SAXException