java.lang.Object
  gnu.xml.pipeline.EventFilter
    gnu.xml.pipeline.LinkFilter
Pipeline filter to remember XHTML links found in a document, so they can later be crawled. Fragments are not counted, and duplicates are ignored. Callers are responsible for filtering out URLs they aren't interested in. Events are passed through unmodified.
Input MUST include a setDocumentLocator() call, as it's used to resolve relative links in the absence of a "base" element. Input MUST also include namespace identifiers, since it is the XHTML namespace identifier which is used to identify the relevant elements.
FIXME: handle xml:base attribute ... in association with a stack of base URIs. Similarly, recognize/support XLink data.
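The behavior described above can be sketched with nothing but the JDK's SAX classes. This is an illustrative stand-in, not the GNU implementation: the class name LinkCollectorSketch, the helper collect(), and the choice to look only at the href attribute of XHTML "a" elements are all assumptions made for the example. It shows the Locator supplying the base URI, the XHTML namespace identifier selecting elements, fragments being dropped, and duplicates being ignored.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the link-collecting idea; not gnu.xml.pipeline.LinkFilter.
class LinkCollectorSketch extends DefaultHandler {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    // Insertion-ordered set, so duplicates are ignored.
    private final Set<String> links = new LinkedHashSet<>();
    private Locator locator;

    @Override
    public void setDocumentLocator(Locator l) {
        locator = l;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Assumption: only <a href="..."> in the XHTML namespace is examined here.
        if (!XHTML_NS.equals(uri) || !"a".equals(localName)) {
            return;
        }
        String href = atts.getValue("", "href");
        if (href == null || locator == null || locator.getSystemId() == null) {
            return;
        }
        try {
            // Resolve relative links against the document's system ID.
            String resolved = URI.create(locator.getSystemId()).resolve(href).toString();
            // Drop the fragment so "a.html#top" and "a.html" count as one link.
            int hash = resolved.indexOf('#');
            links.add(hash >= 0 ? resolved.substring(0, hash) : resolved);
        } catch (IllegalArgumentException e) {
            // Malformed URI: skipped in this sketch.
        }
    }

    List<String> links() {
        return new ArrayList<>(links);
    }

    // Parses xml with a namespace-aware parser, using baseUri as the system ID.
    static List<String> collect(String xml, String baseUri) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        InputSource in = new InputSource(new StringReader(xml));
        in.setSystemId(baseUri);
        LinkCollectorSketch handler = new LinkCollectorSketch();
        factory.newSAXParser().parse(in, handler);
        return handler.links();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<html xmlns='http://www.w3.org/1999/xhtml'><body>"
                + "<a href='a.html#top'>A</a><a href='a.html'>A again</a>"
                + "<a href='/b.html'>B</a>"
                + "<x:a xmlns:x='urn:other' href='skip.html'/>"
                + "</body></html>";
        for (String link : collect(xml, "http://example.org/docs/index.html")) {
            System.out.println(link);
        }
    }
}
```

The parser is made namespace-aware because elements are matched on the namespace identifier rather than on a prefix, which mirrors the requirement stated above that input include namespace identifiers.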
Field Summary
Fields inherited from class gnu.xml.pipeline.EventFilter
DECL_HANDLER, FEATURE_URI, LEXICAL_HANDLER, PROPERTY_URI
Constructor Summary
LinkFilter()
Constructs a new event filter, which collects links in a private data structure for later enumeration.
LinkFilter(EventConsumer next)
Constructs a new event filter, which collects links in a private data structure for later enumeration and passes all events, unmodified, to the next consumer.
Method Summary
void endDocument()
Forgets about any base URI information that may be recorded.
Enumeration getLinks()
Returns an enumeration of the links found since the filter was constructed, or since removeAllLinks() was called.
void removeAllLinks()
Removes records about all links reported to the event stream, as if the filter were newly created.
void startDocument()
Reports an error if no Locator has been made available.
void startElement(String uri, String localName, String qName, Attributes atts)
Collects URIs for (X)HTML content from elements which hold them.
Methods inherited from class gnu.xml.pipeline.EventFilter
attributeDecl, bind, chainTo, characters, comment, elementDecl, endCDATA, endDTD, endElement, endEntity, endPrefixMapping, externalEntityDecl, getContentHandler, getDocumentLocator, getDTDHandler, getErrorHandler, getNext, getProperty, ignorableWhitespace, internalEntityDecl, notationDecl, processingInstruction, setContentHandler, setDocumentLocator, setDTDHandler, setErrorHandler, setProperty, skippedEntity, startCDATA, startDTD, startEntity, startPrefixMapping, unparsedEntityDecl
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Constructor Detail
public LinkFilter()
public LinkFilter(EventConsumer next)
Method Detail
public Enumeration getLinks()
public void removeAllLinks()
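The interplay of getLinks() and removeAllLinks() can be illustrated with a small stand-in store. This is only a sketch: LinkStoreSketch and its record() method are hypothetical names, and the use of String elements is an assumption, since the page does not state what type the real Enumeration yields.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in showing enumerate-then-reset; not the GNU class.
class LinkStoreSketch {
    private final Set<String> links = new LinkedHashSet<>();

    // Hypothetical hook standing in for what startElement would record.
    void record(String uri) {
        links.add(uri); // a set, so duplicates are ignored
    }

    // Enumerates a snapshot of everything recorded since construction or the last reset.
    Enumeration<String> getLinks() {
        return Collections.enumeration(new ArrayList<>(links));
    }

    // Forgets all recorded links, as if the store were newly created.
    void removeAllLinks() {
        links.clear();
    }

    public static void main(String[] args) {
        LinkStoreSketch store = new LinkStoreSketch();
        store.record("http://example.org/a.html");
        store.record("http://example.org/a.html");
        for (Enumeration<String> e = store.getLinks(); e.hasMoreElements();) {
            System.out.println(e.nextElement());
        }
        store.removeAllLinks();
        System.out.println(store.getLinks().hasMoreElements());
    }
}
```

A crawler built on this pattern would typically drain the enumeration after each document and then reset the store before feeding in the next one.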
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException
Specified by: startElement in interface ContentHandler
Overrides: startElement in class EventFilter
Throws: SAXException
public void startDocument() throws SAXException
Specified by: startDocument in interface ContentHandler
Overrides: startDocument in class EventFilter
Throws: SAXException
public void endDocument() throws SAXException
Specified by: endDocument in interface ContentHandler
Overrides: endDocument in class EventFilter
Throws: SAXException
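The document lifecycle implied by startDocument() and endDocument() can be sketched as follows. The class LocatorCheckSketch and its currentBase() accessor are hypothetical, and throwing a SAXException is an assumption about what "reports an error" means; the real filter may report through an ErrorHandler instead.

```java
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.LocatorImpl;

// Hypothetical stand-in for the locator check and base URI reset; not the GNU class.
class LocatorCheckSketch extends DefaultHandler {
    private Locator locator;
    private String baseUri;

    @Override
    public void setDocumentLocator(Locator l) {
        locator = l;
    }

    @Override
    public void startDocument() throws SAXException {
        // Without a Locator there is nothing to resolve relative links against.
        if (locator == null) {
            throw new SAXException("no Locator available; cannot resolve relative links");
        }
        baseUri = locator.getSystemId();
    }

    @Override
    public void endDocument() {
        // Forget the base URI so it cannot leak into the next document.
        baseUri = null;
    }

    // Hypothetical accessor, present only so the sketch can be inspected.
    String currentBase() {
        return baseUri;
    }

    public static void main(String[] args) throws SAXException {
        LocatorCheckSketch handler = new LocatorCheckSketch();
        LocatorImpl loc = new LocatorImpl();
        loc.setSystemId("http://example.org/docs/index.html");
        handler.setDocumentLocator(loc);
        handler.startDocument();
        System.out.println(handler.currentBase());
        handler.endDocument();
    }
}
```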
Source code is under GPL (with library exception) in the JAXP project at http://www.gnu.org/software/classpathx/jaxp
This documentation was derived from that source code on 2004-06-11.