Package org.xwiki.xml.html
Interface HTMLCleaner
@Role public interface HTMLCleaner
Transforms any HTML content into valid XHTML that can be fed, for example, to the XHTML Parser.
Since: 1.6M1
Version: $Id: 7a5aea04496574c79f20aa94ecac9c1efc07d527 $
Method Summary
Document clean(Reader originalHtmlContent)
Transforms any HTML content into valid XHTML that can be fed, for example, to the XHTML Parser.
Document clean(Reader originalHtmlContent, HTMLCleanerConfiguration configuration)
Transforms any HTML content into valid XHTML.
HTMLCleanerConfiguration getDefaultConfiguration()
Returns the default configuration used for cleaning, so that callers can customize it, for example by adding filters before or after the default ones, or by removing filters, to fully control which filters are executed.
Method Detail
clean
Document clean(Reader originalHtmlContent)
Transforms any HTML content into valid XHTML that can be fed, for example, to the XHTML Parser. A default configuration is applied when cleaning the original HTML (see getDefaultConfiguration()).
Parameters:
originalHtmlContent - the original HTML content to clean
Returns:
the cleaned HTML as a W3C DOM (this allows further transformations if needed)
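Because the result is a standard org.w3c.dom.Document, any JDK DOM code can post-process it. The following is a minimal sketch of such a "further transformation": it collects the href of every link in the document. The class and method names (CleanedDomExample, collectLinks, parseWellFormed) are hypothetical, and since the XWiki cleaner is not on the classpath of this sketch, a document parsed from an already well-formed string stands in for the value clean(Reader) would return.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CleanedDomExample {

    /** Collects the href attribute of every a element, in document order. */
    static List<String> collectLinks(Document document) {
        List<String> links = new ArrayList<>();
        NodeList anchors = document.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            // Anchors without an href (for example named anchors) are skipped.
            if (anchor.hasAttribute("href")) {
                links.add(anchor.getAttribute("href"));
            }
        }
        return links;
    }

    /**
     * Stand-in for the cleaner output (assumption of this sketch).
     * With an injected HTMLCleaner the equivalent call would be:
     *   Document document = cleaner.clean(new StringReader(html));
     */
    static Document parseWellFormed(String xhtml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xhtml)));
        } catch (Exception e) {
            throw new IllegalStateException("Stand-in input must be well-formed", e);
        }
    }

    public static void main(String[] args) {
        Document document = parseWellFormed(
            "<html><body><p>See <a href=\"https://example.org/a\">A</a>"
            + " and <a name=\"top\">B</a></p></body></html>");
        System.out.println(collectLinks(document)); // prints [https://example.org/a]
    }
}
```

Working on the DOM rather than on a serialized string avoids re-parsing the markup for each additional processing step.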
clean
Document clean(Reader originalHtmlContent, HTMLCleanerConfiguration configuration)
Transforms any HTML content into valid XHTML. A specific cleaning configuration can be passed to control the cleaning process.
Parameters:
originalHtmlContent - the original HTML content to clean
configuration - the configuration to use for cleaning the HTML content
Returns:
the cleaned HTML as a W3C DOM
Since:
1.8.1
getDefaultConfiguration
HTMLCleanerConfiguration getDefaultConfiguration()
Returns the default configuration used for cleaning. Callers can customize it, for example by adding filters before or after the default ones, or by removing filters, to fully control which filters are executed. This is intended for very specific use cases; in most cases, use the clean method that does not require a configuration.
Returns:
the default configuration that will be used to clean the original HTML
Since:
1.8.1
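The customization pattern described above (take the default configuration, adjust its filter list, then pass it to the two-argument clean) can be sketched as follows. This page does not show the members of HTMLCleanerConfiguration, so the sketch does not use the XWiki types at all: SketchConfiguration, its getFilters/setFilters accessors, the two filters, and the local clean and getDefaultConfiguration methods are all hypothetical stand-ins that only mirror the shape of the workflow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class FilterConfigurationSketch {

    /** Hypothetical stand-in for a cleaning configuration: an ordered filter list. */
    static final class SketchConfiguration {
        private List<Consumer<Document>> filters = new ArrayList<>();
        List<Consumer<Document>> getFilters() { return filters; }
        void setFilters(List<Consumer<Document>> filters) { this.filters = filters; }
    }

    /** Hypothetical default filter: gives every img element an alt attribute. */
    static void addMissingAlt(Document document) {
        NodeList images = document.getElementsByTagName("img");
        for (int i = 0; i < images.getLength(); i++) {
            Element image = (Element) images.item(i);
            if (!image.hasAttribute("alt")) {
                image.setAttribute("alt", "");
            }
        }
    }

    /** Hypothetical extra filter: marks every a element with rel="nofollow". */
    static void markLinks(Document document) {
        NodeList anchors = document.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            ((Element) anchors.item(i)).setAttribute("rel", "nofollow");
        }
    }

    /** Stand-in for getDefaultConfiguration(): one default filter. */
    static SketchConfiguration getDefaultConfiguration() {
        SketchConfiguration configuration = new SketchConfiguration();
        configuration.getFilters().add(FilterConfigurationSketch::addMissingAlt);
        return configuration;
    }

    /** Stand-in for clean(..., configuration): runs each filter in list order. */
    static Document clean(Document document, SketchConfiguration configuration) {
        for (Consumer<Document> filter : configuration.getFilters()) {
            filter.accept(document);
        }
        return document;
    }

    /** Builds a tiny DOM: a body element holding one img and one a element. */
    static Document sampleDocument() {
        try {
            Document document =
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element body = document.createElement("body");
            document.appendChild(body);
            body.appendChild(document.createElement("img"));
            Element anchor = document.createElement("a");
            anchor.setAttribute("href", "/x");
            body.appendChild(anchor);
            return document;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Start from the defaults, then append one extra filter after them.
        SketchConfiguration configuration = getDefaultConfiguration();
        List<Consumer<Document>> filters = new ArrayList<>(configuration.getFilters());
        filters.add(FilterConfigurationSketch::markLinks);
        configuration.setFilters(filters);

        Document cleaned = clean(sampleDocument(), configuration);
        Element anchor = (Element) cleaned.getElementsByTagName("a").item(0);
        System.out.println("rel=" + anchor.getAttribute("rel"));
    }
}
```

Copying the default list before modifying it keeps the extra filter local to this one call, which fits the guidance that a custom configuration is for specific use cases only.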