Class HTMLUtils


  • public final class HTMLUtils
    extends Object
    HTML Utility methods.
    Since:
    1.8.3
    Version:
    $Id: 075360fb08626be92ca752b149f96032c06b4953 $
    • Method Detail

      • toString

        public static String toString(Document document)
        Parameters:
        document - the W3C Document to transform into a String
        Returns:
        the XML as a String
      • toString

        public static String toString(Document document,
                                      boolean omitDeclaration,
                                      boolean omitDoctype)
        Parameters:
        document - the W3C Document to transform into a String
        omitDeclaration - whether to omit the XML declaration from the output (true omits it)
        omitDoctype - whether to omit the document type from the output (true omits it)
        Returns:
        the XML as a String
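
        As an illustration of what serializing a W3C Document with or without the XML declaration involves, here is a minimal sketch that uses only the JDK's javax.xml.transform API. It is not the HTMLUtils implementation: the class ToStringSketch and its serialize and sample helpers are hypothetical names, and doctype handling is left out.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch, not the HTMLUtils implementation.
final class ToStringSketch {
    /** Serializes a DOM document as XML, optionally without the XML declaration. */
    static String serialize(Document document, boolean omitDeclaration) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            // Force XML output; otherwise a root element named "html" may switch
            // the serializer to HTML mode.
            transformer.setOutputProperty(OutputKeys.METHOD, "xml");
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION,
                omitDeclaration ? "yes" : "no");
            StringWriter writer = new StringWriter();
            transformer.transform(new DOMSource(document), new StreamResult(writer));
            return writer.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Could not serialize document", e);
        }
    }

    /** Builds a tiny html/body document to serialize. */
    static Document sample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element html = doc.createElement("html");
            Element body = doc.createElement("body");
            body.setTextContent("hello");
            html.appendChild(body);
            doc.appendChild(html);
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serialize(sample(), false));
        System.out.println(serialize(sample(), true));
    }
}
```

        With the real class, the equivalent calls would be HTMLUtils.toString(document) and HTMLUtils.toString(document, omitDeclaration, omitDoctype).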
      • stripHTMLEnvelope

        public static void stripHTMLEnvelope(Document document)
        Strips the HTML envelope if it exists. Precisely, this means removing the head tag and moving all tags inside the body tag directly under the html element. This is useful, for example, if you wish to insert an HTML fragment into an existing HTML page.
        Parameters:
        document - the W3C Document to strip
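
        The transformation described above can be sketched with plain DOM calls. This is an approximation under stated assumptions, not the HTMLUtils source: the class StripEnvelopeSketch and its helpers firstChildNamed and parse are hypothetical, and the sketch assumes the body element itself is removed once its children have been moved.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the HTMLUtils implementation.
final class StripEnvelopeSketch {
    /** Returns the first direct child element with the given tag name, or null. */
    static Node firstChildNamed(Node parent, String tagName) {
        for (Node child = parent.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE
                && tagName.equalsIgnoreCase(child.getNodeName())) {
                return child;
            }
        }
        return null;
    }

    /** Removes head and hoists the children of body directly under html. */
    static void stripEnvelope(Document document) {
        Element root = document.getDocumentElement();
        if (root == null || !"html".equalsIgnoreCase(root.getNodeName())) {
            return; // no HTML envelope to strip
        }
        Node head = firstChildNamed(root, "head");
        if (head != null) {
            root.removeChild(head);
        }
        Node body = firstChildNamed(root, "body");
        if (body != null) {
            // insertBefore detaches the node from body before re-inserting it,
            // so this loop ends once body is empty.
            while (body.getFirstChild() != null) {
                root.insertBefore(body.getFirstChild(), body);
            }
            root.removeChild(body); // assumption: the empty body is dropped
        }
    }

    /** Parses a well-formed XML/XHTML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

        Applied to <html><head><title>t</title></head><body><p>a</p><p>b</p></body></html>, the sketch leaves the two p elements as direct children of html.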
      • stripFirstElementInside

        public static void stripFirstElementInside(Document document,
                                                   String parentTagName,
                                                   String elementTagName)
        Removes the first element inside a parent element and copies the element's children into the parent.
        Parameters:
        document - the W3C Document from which to remove the element
        parentTagName - the name of the parent tag to look under
        elementTagName - the name of the first element to remove
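
        A possible reading of this operation, sketched with plain DOM calls: take the first element named parentTagName, and if its first child is named elementTagName, move that child's children up into the parent and drop the now-empty child. The class StripFirstElementSketch and its parse helper are hypothetical, and the choice to act only when the element is the parent's first child is an assumption, not something the description above confirms.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the HTMLUtils implementation.
final class StripFirstElementSketch {
    static void stripFirstElementInside(Document document, String parentTagName,
                                        String elementTagName) {
        NodeList parents = document.getElementsByTagName(parentTagName);
        if (parents.getLength() == 0) {
            return; // no such parent in the document
        }
        Node parent = parents.item(0);
        Node first = parent.getFirstChild();
        // Assumption: only unwrap when the very first child matches.
        if (first == null || !elementTagName.equalsIgnoreCase(first.getNodeName())) {
            return;
        }
        // Move the children in front of the element to preserve their order.
        while (first.getFirstChild() != null) {
            parent.insertBefore(first.getFirstChild(), first);
        }
        parent.removeChild(first);
    }

    /** Parses a well-formed XML/XHTML string into a DOM document. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

        For example, with parentTagName "body" and elementTagName "p", <body><p>Hello <b>world</b></p><p>kept</p></body> becomes <body>Hello <b>world</b><p>kept</p></body> in this sketch.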
      • escapeElementText

        public static String escapeElementText(String content)
        Escapes HTML special characters in a String using numerical HTML entities, so that the resulting string can safely be used as an HTML content text value. For instance, Jim & John will be escaped and can thus be put inside an HTML tag, such as the p tag, as in <p>Jim &amp; John</p>. Specifically, escapes < to &lt;, and & to &amp;.
        Parameters:
        content - the text to escape, may be null.
        Returns:
        a new escaped String, null if null input
        Since:
        12.8RC1, 12.6.3, 11.10.11
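
        The escaping rule can be sketched in a few lines. The description mentions numerical entities while its examples show the named forms &lt; and &amp;; the sketch below emits the numeric references &#60; and &#38;, which browsers render identically. Which form the library actually emits is an assumption here, and the class name EscapeElementTextSketch is hypothetical.

```java
// Hypothetical sketch, not the HTMLUtils implementation.
final class EscapeElementTextSketch {
    /** Escapes '&' and '<' so the text can be placed inside an HTML element. */
    static String escapeElementText(String content) {
        if (content == null) {
            return null; // documented behavior: null in, null out
        }
        StringBuilder result = new StringBuilder(content.length() + 16);
        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            switch (c) {
                case '&':
                    result.append("&#38;"); // numeric form of &amp; (assumed)
                    break;
                case '<':
                    result.append("&#60;"); // numeric form of &lt; (assumed)
                    break;
                default:
                    result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println("<p>" + escapeElementText("Jim & John") + "</p>");
    }
}
```

        Only & and < are touched, which is enough for element content; attribute values would also need quote characters escaped, so this rule should not be reused there.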
      • containsElementText

        public static boolean containsElementText(CharSequence content)
        Applies the same logic as escapeElementText(String), but only indicates whether there is something to escape.
        Parameters:
        content - the content to parse
        Returns:
        true if the passed content contains content that can be interpreted as HTML syntax
        Since:
        12.10, 12.6.5
        See Also:
        escapeElementText(String)
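
        A check of this kind can be sketched as a single scan for the two characters that escapeElementText(String) is documented to escape. The class name ContainsElementTextSketch is hypothetical, and returning false for null input is an assumption, since the description does not say how null is handled.

```java
// Hypothetical sketch, not the HTMLUtils implementation.
final class ContainsElementTextSketch {
    /** Returns true if the content holds a character that would need escaping. */
    static boolean containsElementText(CharSequence content) {
        if (content == null) {
            return false; // assumption: null has nothing to escape
        }
        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            if (c == '&' || c == '<') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(containsElementText("Jim & John"));
        System.out.println(containsElementText("Jim and John"));
    }
}
```

        A caller can use such a check to skip allocating an escaped copy when the text is already safe.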