com.meterware.httpunit.parsing

Interface HTMLParser

public interface HTMLParser

A front end to a DOM parser that can handle HTML.

Since: 1.5.2

Author: Russell Gold, Bernhard Wagner

Method Summary
String getCleanedText(String string)
Removes any string artifacts placed in the text by the parser.
void parse(URL baseURL, String pageText, DocumentAdapter adapter)
Parses the specified text string as a Document, registering it in the HTMLPage.
boolean supportsParserWarnings()
Returns true if this parser can display parser warnings.
boolean supportsPreserveTagCase()
Returns true if this parser supports preservation of the case of tag and attribute names.
boolean supportsReturnHTMLDocument()
Returns true if this parser can return an HTMLDocument object.

Method Detail

getCleanedText

public String getCleanedText(String string)
Removes any string artifacts placed in the text by the parser. For example, a parser may choose to encode an HTML entity as a special character; this method should convert any such character back to normal text.
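To make the kind of cleanup described above concrete, here is a minimal sketch. It assumes a parser that represents the &nbsp; entity as the non-breaking space character U+00A0 and wants it turned back into an ordinary space. The class name CleanedTextSketch, the choice of artifact, and the handling of null input are all assumptions for illustration, not part of this interface's contract.

```java
public class CleanedTextSketch {
    // Assumption: the underlying parser emits U+00A0 for the &nbsp; entity.
    private static final char NBSP = '\u00A0';

    /**
     * Hypothetical cleanup in the spirit of getCleanedText: replaces the
     * parser's special character with a plain space. Treating null as the
     * empty string is an assumption, not documented behavior.
     */
    static String getCleanedText(String string) {
        return (string == null) ? "" : string.replace(NBSP, ' ');
    }

    public static void main(String[] args) {
        String parsed = "Total:" + NBSP + "42";
        System.out.println(getCleanedText(parsed));
    }
}
```

A real implementation would undo whatever artifacts its own parser introduces, which may differ from the single character handled here.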

parse

public void parse(URL baseURL, String pageText, DocumentAdapter adapter)
Parses the specified text string as a Document, registering it in the HTMLPage. Any errors reported are annotated with the specified URL.

supportsParserWarnings

public boolean supportsParserWarnings()
Returns true if this parser can display parser warnings.

supportsPreserveTagCase

public boolean supportsPreserveTagCase()
Returns true if this parser supports preservation of the case of tag and attribute names.

supportsReturnHTMLDocument

public boolean supportsReturnHTMLDocument()
Returns true if this parser can return an HTMLDocument object.
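The three supports... methods let a caller find out which optional features a parser offers before relying on them. The sketch below shows one way a caller might gate requested options on those answers. To stay self-contained it uses a hypothetical stand-in interface, ParserCapabilities, that mirrors only the three capability methods; the names CapabilityCheckSketch, fixed, and enableIfSupported are likewise assumptions and not part of HttpUnit.

```java
public class CapabilityCheckSketch {
    /** Hypothetical stand-in mirroring the three capability queries of HTMLParser. */
    interface ParserCapabilities {
        boolean supportsParserWarnings();
        boolean supportsPreserveTagCase();
        boolean supportsReturnHTMLDocument();
    }

    /** Builds a stand-in parser that reports fixed capabilities (for illustration only). */
    static ParserCapabilities fixed(final boolean warnings,
                                    final boolean tagCase,
                                    final boolean htmlDocument) {
        return new ParserCapabilities() {
            public boolean supportsParserWarnings() { return warnings; }
            public boolean supportsPreserveTagCase() { return tagCase; }
            public boolean supportsReturnHTMLDocument() { return htmlDocument; }
        };
    }

    /** An option takes effect only if it was requested and the parser supports it. */
    static boolean enableIfSupported(boolean requested, boolean supported) {
        return requested && supported;
    }

    public static void main(String[] args) {
        // A stand-in parser that supports warnings and HTMLDocument, but not tag-case preservation.
        ParserCapabilities parser = fixed(true, false, true);

        boolean showWarnings = enableIfSupported(true, parser.supportsParserWarnings());
        boolean preserveCase = enableIfSupported(true, parser.supportsPreserveTagCase());

        System.out.println("showWarnings=" + showWarnings);
        System.out.println("preserveCase=" + preserveCase);
    }
}
```

Checking the capability first lets the caller fall back quietly when a requested option is not available, rather than assuming every parser behaves the same way.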