public interface HTMLParser
Modifier and Type | Method and Description |
---|---|
String | getCleanedText(String string)<br>Removes any string artifacts placed in the text by the parser. |
void | parse(URL baseURL, String pageText, DocumentAdapter adapter)<br>Parses the specified text string as a Document, registering it in the HTMLPage. |
boolean | supportsForceTagCase()<br>Returns true if this parser supports forcing the upper/lower case of tag and attribute names. |
boolean | supportsParserWarnings()<br>Returns true if this parser can display parser warnings. |
boolean | supportsPreserveTagCase()<br>Returns true if this parser supports preservation of the case of tag and attribute names. |
boolean | supportsReturnHTMLDocument()<br>Returns true if this parser can return an HTMLDocument object. |
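
The four `supports...` methods act as capability flags, so a caller can query them before relying on an optional feature. The following is a minimal sketch of that idea, not the library's own code. It declares local stand-in copies of `HTMLParser` and `DocumentAdapter` so that it compiles without the library, and `StubParser`, `ParserCapabilities`, `supported`, and the returned feature labels are hypothetical names introduced only for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.SAXException;

// Placeholder for the library's adapter type; its members are not listed on this page.
interface DocumentAdapter {}

// Local copy of the documented signatures so the sketch compiles on its own.
interface HTMLParser {
    void parse(URL baseURL, String pageText, DocumentAdapter adapter) throws IOException, SAXException;
    String getCleanedText(String string);
    boolean supportsPreserveTagCase();
    boolean supportsForceTagCase();
    boolean supportsReturnHTMLDocument();
    boolean supportsParserWarnings();
}

// Hypothetical parser whose flag values are chosen arbitrarily for the example.
class StubParser implements HTMLParser {
    public void parse(URL baseURL, String pageText, DocumentAdapter adapter) {}
    public String getCleanedText(String string) { return string; }
    public boolean supportsPreserveTagCase() { return true; }
    public boolean supportsForceTagCase() { return false; }
    public boolean supportsReturnHTMLDocument() { return true; }
    public boolean supportsParserWarnings() { return false; }
}

class ParserCapabilities {
    // Collects a label for each optional feature the given parser reports as supported.
    static List<String> supported(HTMLParser parser) {
        List<String> features = new ArrayList<>();
        if (parser.supportsPreserveTagCase()) features.add("preserveTagCase");
        if (parser.supportsForceTagCase()) features.add("forceTagCase");
        if (parser.supportsReturnHTMLDocument()) features.add("returnHTMLDocument");
        if (parser.supportsParserWarnings()) features.add("parserWarnings");
        return features;
    }
}
```

Calling `ParserCapabilities.supported(new StubParser())` yields the labels for the two flags the stub returns `true` for. A caller could use the same checks to skip an option, such as forcing tag case, when the active parser does not support it.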
void parse(URL baseURL, String pageText, DocumentAdapter adapter) throws IOException, SAXException
Throws: IOException, SAXException
String getCleanedText(String string)
boolean supportsPreserveTagCase()
boolean supportsForceTagCase()
boolean supportsReturnHTMLDocument()
boolean supportsParserWarnings()
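
Because `parse` declares both `IOException` and `SAXException`, callers have to handle or propagate the two checked exceptions. The sketch below shows one way to wrap that call, under stated assumptions. The interfaces are local stand-ins that mirror the signatures above, and `RejectingParser`, `ParseHelper`, and `tryParse` are hypothetical names. The behavior of the stub (rejecting empty input, replacing non-breaking spaces in `getCleanedText`) is assumed for illustration and is not taken from any real parser.

```java
import java.io.IOException;
import java.net.URL;
import org.xml.sax.SAXException;

// Placeholder for the library's adapter type; its members are not listed on this page.
interface DocumentAdapter {}

// Local copy of the documented signatures so the sketch compiles on its own.
interface HTMLParser {
    void parse(URL baseURL, String pageText, DocumentAdapter adapter) throws IOException, SAXException;
    String getCleanedText(String string);
    boolean supportsPreserveTagCase();
    boolean supportsForceTagCase();
    boolean supportsReturnHTMLDocument();
    boolean supportsParserWarnings();
}

// Hypothetical implementation used only to exercise the helper below.
class RejectingParser implements HTMLParser {
    public void parse(URL baseURL, String pageText, DocumentAdapter adapter) throws SAXException {
        // Assumed behavior: treat missing or empty input as a parse failure.
        if (pageText == null || pageText.isEmpty()) {
            throw new SAXException("no page text to parse");
        }
    }
    public String getCleanedText(String string) {
        // Assumed artifact: non-breaking spaces left in the text are turned into plain spaces.
        return string == null ? "" : string.replace('\u00A0', ' ');
    }
    public boolean supportsPreserveTagCase() { return false; }
    public boolean supportsForceTagCase() { return false; }
    public boolean supportsReturnHTMLDocument() { return false; }
    public boolean supportsParserWarnings() { return false; }
}

class ParseHelper {
    // Returns true if parsing completed, false if the parser threw either checked exception.
    static boolean tryParse(HTMLParser parser, URL baseURL, String pageText, DocumentAdapter adapter) {
        try {
            parser.parse(baseURL, pageText, adapter);
            return true;
        } catch (IOException | SAXException e) {
            return false;
        }
    }
}
```

Returning a boolean keeps the example short. Code that needs the failure reason would more likely rethrow or log the caught exception instead of discarding it.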
Copyright © 2012. All Rights Reserved.