Class XHTMLWhitespaceXMLFilter

  • All Implemented Interfaces:
    ContentHandler, DTDHandler, EntityResolver, ErrorHandler, LexicalHandler, XMLFilter, XMLReader

    public class XHTMLWhitespaceXMLFilter
    extends DefaultXMLFilter
    Removes non-semantic whitespace from XML elements. See http://www.w3.org/TR/html4/struct/text.html#h-9.1 for more details. Possible use cases:
    • UC1: Any whitespace group is removed if it comes before a non-inline element (see NONINLINE_ELEMENTS) or at the beginning of the document.
    • UC2: Any whitespace group is removed if it comes after a non-inline element (see NONINLINE_ELEMENTS) or at the end of the document.
    • UC3: Inside inline content, any whitespace group becomes a single space.
    • UC4: Non-visible elements (comments, CDATA and NONVISIBLE_ELEMENTS) are invisible and do not split a whitespace group. For example, text(sp)(sp)text becomes text(sp)text.
    • UC5: Visible empty elements such as img count as text when grouping whitespace.
    • UC6: Semantic comments count as text when grouping whitespace.
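
    The grouping rules above can be illustrated with a minimal, standalone sketch. It is not the filter's actual implementation: the class name WhitespaceGroupSketch and both methods are hypothetical, it works on plain strings rather than SAX events, and it only covers UC3 plus the document-boundary part of UC1 and UC2. The set of characters treated as whitespace (space, tab, form feed, CR, LF) is an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the documented filter class.
public class WhitespaceGroupSketch {
    // Assumed whitespace set: space, tab, form feed, carriage return, line feed.
    private static final Pattern WHITESPACE_GROUP = Pattern.compile("[ \\t\\f\\r\\n]+");

    /** UC3: every whitespace group inside inline content becomes a single space. */
    public static String collapseInline(String text) {
        return WHITESPACE_GROUP.matcher(text).replaceAll(" ");
    }

    /**
     * UC1/UC2 restricted to document boundaries: after collapsing,
     * drop the group at the very beginning and at the very end.
     */
    public static String cleanDocumentText(String text) {
        String collapsed = collapseInline(text);
        int start = collapsed.startsWith(" ") ? 1 : 0;
        int end = collapsed.length();
        if (end > start && collapsed.endsWith(" ")) {
            end--;
        }
        return collapsed.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println("[" + collapseInline("one \n\t two") + "]");         // prints "[one two]"
        System.out.println("[" + cleanDocumentText("  \n one   two \n") + "]"); // prints "[one two]"
    }
}
```

    Handling non-inline elements, non-visible elements and semantic comments (UC1, UC2, UC4 to UC6) requires tracking the surrounding SAX events, which this string-only sketch deliberately leaves out.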
    Since:
    4.0M1
    Version:
    $Id: 0a43fce9f6d05413910b06619b5016fcf89262e4 $