org.apache.xml.serialize

Class BaseMarkupSerializer

public abstract class BaseMarkupSerializer extends Object implements ContentHandler, DocumentHandler, LexicalHandler, DTDHandler, DeclHandler, DOMSerializer, Serializer

Base class for a serializer supporting both DOM and SAX pretty serializing of XML/HTML/XHTML documents. Derived classes perform the method-specific serializing; this class provides the common serializing mechanisms.

Before it can be used, the serializer must be initialized with the proper writer and output format: call {@link #setOutputCharStream} or {@link #setOutputByteStream} to set the writer and {@link #setOutputFormat} to set the output format.

The serializer can be reused any number of times, but cannot be used concurrently by two threads.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.
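The encoding requirement can be illustrated without the serializer itself. The sketch below uses only java.io and shows that the same character produces different bytes under different writer encodings, which is why a writer handed to the serializer should agree with the encoding declared in the output format. The class and method names are hypothetical and are not part of this package.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of org.apache.xml.serialize.
final class WriterEncodingSketch {
    // Encodes text through a Writer bound to the given charset and returns the bytes.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // prints 2
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // prints 1
    }
}
```

If the output format declared UTF-8 but the writer encoded as ISO-8859-1, the XML declaration and the actual bytes would disagree.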

The serializer supports both DOM and SAX. DOM serializing is done by calling {@link #serialize(Document)} and SAX serializing is done by firing SAX events and using the serializer as a document handler. This also applies to derived classes.

If an I/O exception occurs while serializing, the serializer does not throw it immediately; it is thrown only at the end of serializing (either DOM or SAX's {@link org.xml.sax.DocumentHandler#endDocument}).
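The deferred-error behaviour described here can be sketched as follows. This is a minimal illustration of the pattern and not the serializer's actual code; the class `DeferredIOErrorSketch` and its methods are hypothetical names.

```java
import java.io.IOException;
import java.io.Writer;

// Hypothetical illustration of "remember the failure, report it at the end".
final class DeferredIOErrorSketch {
    private final Writer out;
    private IOException firstError; // remembered until finish() is called

    DeferredIOErrorSketch(Writer out) {
        this.out = out;
    }

    // Writes text; a failure is recorded instead of thrown, and later writes are skipped.
    void print(String text) {
        if (firstError != null) return;
        try {
            out.write(text);
        } catch (IOException e) {
            firstError = e;
        }
    }

    // Flushes the writer and reports the first recorded failure, if any.
    void finish() throws IOException {
        if (firstError == null) {
            try {
                out.flush();
            } catch (IOException e) {
                firstError = e;
            }
        }
        if (firstError != null) throw firstError;
    }

    // Demonstration: every write fails, yet nothing is thrown until finish().
    static String demo() {
        Writer broken = new Writer() {
            @Override public void write(char[] buf, int off, int len) throws IOException {
                throw new IOException("disk full");
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        DeferredIOErrorSketch sketch = new DeferredIOErrorSketch(broken);
        sketch.print("<a>");  // no exception surfaces here
        sketch.print("</a>"); // nor here
        try {
            sketch.finish();
            return "no error";
        } catch (IOException e) {
            return "reported at end: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "reported at end: disk full"
    }
}
```

A caller therefore only needs to handle the exception in one place, at the end of the DOM or SAX serialization.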

For elements that are not specified as whitespace preserving, the serializer may break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators are regarded as spaces, and spaces at the beginning of a line are stripped.
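A rough sketch of breaking text at space boundaries is shown below. It is a simplification under stated assumptions: it collapses every run of whitespace into a single space, which the real serializer is not documented to do, and the class name and the `width` parameter are hypothetical.

```java
// Hypothetical illustration of line breaking at space boundaries.
final class LineBreakSketch {
    // Wraps text so that no line exceeds width, unless a single word is longer than width.
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        int lineLength = 0;
        // Line terminators count as spaces; trim() drops leading spaces.
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) continue; // only happens for empty input
            if (lineLength > 0 && lineLength + 1 + word.length() > width) {
                out.append('\n'); // break at the space boundary
                lineLength = 0;
            }
            if (lineLength > 0) {
                out.append(' ');
                lineLength++;
            }
            out.append(word);
            lineLength += word.length();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints "the quick" on one line and "brown fox" on the next.
        System.out.println(wrap("the quick brown fox", 10));
    }
}
```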

When indenting, the serializer is capable of detecting what appears to be element content and serializing these elements indented on separate lines. An element is serialized indented when it is the first or last child of an element, or when it immediately follows or precedes another element.
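Read literally, the indentation rule can be expressed as a predicate over DOM siblings. The sketch below is one interpretation of the sentence above and makes no claim about how the serializer implements it; `IndentRuleSketch`, `isIndented` and `parseRoot` are hypothetical names, and the JDK's built-in parser is used only to build test input.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical reading of the indentation rule, not the serializer's code.
final class IndentRuleSketch {
    // True when elem is a first or last child, or sits directly beside another element.
    static boolean isIndented(Element elem) {
        Node prev = elem.getPreviousSibling();
        Node next = elem.getNextSibling();
        boolean firstOrLast = prev == null || next == null;
        boolean besideElement = (prev instanceof Element) || (next instanceof Element);
        return firstOrLast || besideElement;
    }

    // Parses an XML string and returns its root element.
    static Element parseRoot(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parseRoot("<p>one<b/>two<i/><u/>three</p>");
        for (String name : new String[] {"b", "i", "u"}) {
            Element e = (Element) root.getElementsByTagName(name).item(0);
            System.out.println(name + ": " + isIndented(e));
        }
    }
}
```

In the sample, `b` is surrounded by text on both sides and is therefore treated as inline, while `i` and `u` are adjacent elements and qualify for indentation.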

Version: $Revision: 1.57 $ $Date: 2005/05/02 21:58:58 $

Author: Assaf Arkin, Rahul Srivastava, Elena Litani (IBM)

See Also: Serializer, LSSerializer

Field Summary
protected Node fCurrentNode
Current node that is being processed
protected DOMErrorImpl fDOMError
protected DOMErrorHandler fDOMErrorHandler
protected LSSerializerFilter fDOMFilter
protected short features
protected StringBuffer fStrBuffer
Temporary buffer to store character data
protected String _docTypePublicId
The public identifier of the document type, if known.
protected String _docTypeSystemId
The system identifier of the document type, if known.
protected EncodingInfo _encodingInfo
protected OutputFormat _format
The output format associated with this serializer.
protected boolean _indenting
True if indenting printer.
protected Hashtable _prefixes
Association between namespace URIs (keys) and prefixes (values).
protected Printer _printer
The printer used for printing text parts.
protected boolean _started
If the document has been started (header serialized), this flag is set to true so it's not started twice.
Constructor Summary
protected BaseMarkupSerializer(OutputFormat format)
Protected constructor can only be used by derived class.
Method Summary
ContentHandler asContentHandler()
DocumentHandler asDocumentHandler()
DOMSerializer asDOMSerializer()
void attributeDecl(String eName, String aName, String type, String valueDefault, String value)
void characters(char[] chars, int start, int length)
protected void characters(String text)
Called to print the text contents in the prevailing element format.
protected void checkUnboundNamespacePrefixedNode(Node node)
DOM level 3: Check a node to determine if it contains unbound namespace prefixes.
void comment(char[] chars, int start, int length)
void comment(String text)
protected ElementState content()
Must be called by a method about to print any type of content.
void elementDecl(String name, String model)
void endCDATA()
void endDocument()
Called at the end of the document to wrap it up.
void endDTD()
void endEntity(String name)
void endNonEscaping()
void endPrefixMapping(String prefix)
void endPreserving()
protected ElementState enterElementState(String namespaceURI, String localName, String rawName, boolean preserveSpace)
Enter a new element state for the specified element.
void externalEntityDecl(String name, String publicId, String systemId)
protected void fatalError(String message)
protected ElementState getElementState()
Return the state of the current element.
protected abstract String getEntityRef(int ch)
Returns the suitable entity reference for this character value, or null if no such entity exists.
protected String getPrefix(String namespaceURI)
Returns the namespace prefix for the specified URI.
void ignorableWhitespace(char[] chars, int start, int length)
void internalEntityDecl(String name, String value)
protected boolean isDocumentState()
Returns true if in the state of the document.
protected ElementState leaveElementState()
Leave the current element state and return to the state of the parent element.
protected DOMError modifyDOMError(String message, short severity, String type, Node node)
Modifies the global DOM error object.
void notationDecl(String name, String publicId, String systemId)
protected void prepare()
protected void printCDATAText(String text)
protected void printDoctypeURL(String url)
Print a document type public or system identifier URL.
protected void printEscaped(int ch)
protected void printEscaped(String source)
Escapes a string so it may be printed as text content or attribute value.
protected void printText(char[] chars, int start, int length, boolean preserveSpace, boolean unescaped)
Called to print additional text with whitespace handling.
protected void printText(String text, boolean preserveSpace, boolean unescaped)
void processingInstruction(String target, String code)
void processingInstructionIO(String target, String code)
boolean reset()
void serialize(Element elem)
Serializes the DOM element using the previously specified writer and output format.
void serialize(DocumentFragment frag)
Serializes the DOM document fragment using the previously specified writer and output format.
void serialize(Document doc)
Serializes the DOM document using the previously specified writer and output format.
protected abstract void serializeElement(Element elem)
Called to serialize the DOM element.
protected void serializeNode(Node node)
Serialize the DOM node.
protected void serializePreRoot()
Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first.
void setDocumentLocator(Locator locator)
void setOutputByteStream(OutputStream output)
void setOutputCharStream(Writer writer)
void setOutputFormat(OutputFormat format)
void skippedEntity(String name)
void startCDATA()
void startDocument()
void startDTD(String name, String publicId, String systemId)
void startEntity(String name)
void startNonEscaping()
void startPrefixMapping(String prefix, String uri)
void startPreserving()
protected void surrogates(int high, int low)
void unparsedEntityDecl(String name, String publicId, String systemId, String notationName)

Field Detail

fCurrentNode

protected Node fCurrentNode
Current node that is being processed

fDOMError

protected final DOMErrorImpl fDOMError

fDOMErrorHandler

protected DOMErrorHandler fDOMErrorHandler

fDOMFilter

protected LSSerializerFilter fDOMFilter

features

protected short features

fStrBuffer

protected final StringBuffer fStrBuffer
Temporary buffer to store character data

_docTypePublicId

protected String _docTypePublicId
The public identifier of the document type, if known.

_docTypeSystemId

protected String _docTypeSystemId
The system identifier of the document type, if known.

_encodingInfo

protected EncodingInfo _encodingInfo

_format

protected OutputFormat _format
The output format associated with this serializer. This will never be a null reference. If no format was passed to the constructor, the default one for this document type will be used. The format object is never changed by the serializer.

_indenting

protected boolean _indenting
True if indenting printer.

_prefixes

protected Hashtable _prefixes
Association between namespace URIs (keys) and prefixes (values). Accumulated here prior to starting an element and placing this list in the element state.

_printer

protected Printer _printer
The printer used for printing text parts.

_started

protected boolean _started
If the document has been started (header serialized), this flag is set to true so it's not started twice.

Constructor Detail

BaseMarkupSerializer

protected BaseMarkupSerializer(OutputFormat format)
Protected constructor; can only be used by derived classes. The serializer must be initialized before serializing any document, by calling {@link #setOutputCharStream} or {@link #setOutputByteStream} first.

Method Detail

asContentHandler

public ContentHandler asContentHandler()

asDocumentHandler

public DocumentHandler asDocumentHandler()

asDOMSerializer

public DOMSerializer asDOMSerializer()

attributeDecl

public void attributeDecl(String eName, String aName, String type, String valueDefault, String value)

characters

public void characters(char[] chars, int start, int length)

characters

protected void characters(String text)
Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.

Parameters: text The text to print unescaped True if the text should be printed unescaped

Throws: IOException An I/O exception occurred while serializing

checkUnboundNamespacePrefixedNode

protected void checkUnboundNamespacePrefixedNode(Node node)
DOM level 3: Check a node to determine if it contains unbound namespace prefixes.

Parameters: node The node to check for unbound namespace prefixes

comment

public void comment(char[] chars, int start, int length)

comment

public void comment(String text)

content

protected ElementState content()
Must be called by a method about to print any type of content. If the element was just opened, the opening tag is closed and will be matched to a closing tag. Returns the current element state with empty and afterElement set to false.

Returns: The current element state

Throws: IOException An I/O exception occurred while serializing

elementDecl

public void elementDecl(String name, String model)

endCDATA

public void endCDATA()

endDocument

public void endDocument()
Called at the end of the document to wrap it up. Will flush the output stream and throw an exception if any I/O error occurred while serializing.

Throws: SAXException An I/O exception occurred during serializing

endDTD

public void endDTD()

endEntity

public void endEntity(String name)

endNonEscaping

public void endNonEscaping()

endPrefixMapping

public void endPrefixMapping(String prefix)

endPreserving

public void endPreserving()

enterElementState

protected ElementState enterElementState(String namespaceURI, String localName, String rawName, boolean preserveSpace)
Enter a new element state for the specified element. The tag name and space preserving flag are specified; the element state is initially empty.

Returns: Current element state, or null

externalEntityDecl

public void externalEntityDecl(String name, String publicId, String systemId)

fatalError

protected void fatalError(String message)

getElementState

protected ElementState getElementState()
Return the state of the current element.

Returns: Current element state

getEntityRef

protected abstract String getEntityRef(int ch)
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".

Parameters: ch Character value

Returns: Character entity name, or null

getPrefix

protected String getPrefix(String namespaceURI)
Returns the namespace prefix for the specified URI. If the URI has been mapped to a prefix, returns the prefix, otherwise returns null.

Parameters: namespaceURI The namespace URI

Returns: The namespace prefix if known, or null
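The lookup contract (prefix if mapped, otherwise null) can be sketched with a plain map keyed by namespace URI, following the description of the _prefixes field. This is a simplified illustration that ignores any per-element scoping; `PrefixLookupSketch` and `declare` are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a URI-to-prefix lookup.
final class PrefixLookupSketch {
    // Keys are namespace URIs, values are prefixes.
    private final Map<String, String> prefixes = new HashMap<>();

    // Records a mapping, as a prefix-mapping event would.
    void declare(String prefix, String namespaceURI) {
        prefixes.put(namespaceURI, prefix);
    }

    // Returns the prefix mapped to the URI, or null when the URI is unknown.
    String getPrefix(String namespaceURI) {
        return prefixes.get(namespaceURI);
    }

    public static void main(String[] args) {
        PrefixLookupSketch sketch = new PrefixLookupSketch();
        sketch.declare("xsl", "http://www.w3.org/1999/XSL/Transform");
        System.out.println(sketch.getPrefix("http://www.w3.org/1999/XSL/Transform")); // prints xsl
        System.out.println(sketch.getPrefix("urn:unknown"));                          // prints null
    }
}
```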

ignorableWhitespace

public void ignorableWhitespace(char[] chars, int start, int length)

internalEntityDecl

public void internalEntityDecl(String name, String value)

isDocumentState

protected boolean isDocumentState()
Returns true if in the state of the document. Returns true before entering any element and after leaving the root element.

Returns: True if in the state of the document

leaveElementState

protected ElementState leaveElementState()
Leave the current element state and return to the state of the parent element. If this was the root element, return to the state of the document.

Returns: Previous element state
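The enter/leave/document-state relationship resembles a stack whose empty condition stands for the document state. The sketch below illustrates that idea only; it is not the serializer's data structure, and `ElementStateStackSketch`, `State`, `enter`, `leave` and `current` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of element states kept on a stack.
final class ElementStateStackSketch {
    static final class State {
        final String rawName;        // null for the document state
        final boolean preserveSpace;
        boolean empty = true;        // no content printed yet

        State(String rawName, boolean preserveSpace) {
            this.rawName = rawName;
            this.preserveSpace = preserveSpace;
        }
    }

    private final State documentState = new State(null, false);
    private final Deque<State> stack = new ArrayDeque<>();

    // Pushes a new, initially empty state for the element being opened.
    State enter(String rawName, boolean preserveSpace) {
        State state = new State(rawName, preserveSpace);
        stack.push(state);
        return state;
    }

    // Pops the current element and returns the parent's state,
    // which is the document state once the root element is left.
    State leave() {
        if (stack.isEmpty()) throw new IllegalStateException("no element to leave");
        stack.pop();
        return current();
    }

    State current() {
        return stack.isEmpty() ? documentState : stack.peek();
    }

    // True before entering any element and after leaving the root element.
    boolean isDocumentState() {
        return stack.isEmpty();
    }

    public static void main(String[] args) {
        ElementStateStackSketch states = new ElementStateStackSketch();
        states.enter("root", false);
        states.enter("child", true);
        System.out.println(states.leave().rawName);   // prints root
        states.leave();
        System.out.println(states.isDocumentState()); // prints true
    }
}
```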

modifyDOMError

protected DOMError modifyDOMError(String message, short severity, String type, Node node)
The method modifies global DOM error object

Parameters: message severity type

Returns: a DOMError

notationDecl

public void notationDecl(String name, String publicId, String systemId)

prepare

protected void prepare()

printCDATAText

protected void printCDATAText(String text)

printDoctypeURL

protected void printDoctypeURL(String url)
Print a document type public or system identifier URL. Encapsulates the URL in double quotes, escapes non-printing characters and prints it equivalent to {@link #printText}.

Parameters: url The document type url to print

printEscaped

protected void printEscaped(int ch)

printEscaped

protected void printEscaped(String source)
Escapes a string so it may be printed as text content or attribute value. Non-printable characters are escaped using character references. Where the format specifies a default entity reference, that reference is used (e.g. &lt;).

Parameters: source The string to escape
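The escaping strategy can be sketched as follows: use a named entity where one exists, otherwise fall back to a numeric character reference for characters treated as non-printable. This is an illustration under assumptions, not the serializer's code. The `entityName` method is a hypothetical stand-in for getEntityRef that returns the bare entity name, and treating every character outside printable ASCII as non-printable is a simplification (a real serializer would consult the output encoding).

```java
// Hypothetical illustration of entity and character-reference escaping.
final class EscapeSketch {
    // Stand-in for getEntityRef: the entity name for ch, or null if none exists.
    static String entityName(int ch) {
        switch (ch) {
            case '<': return "lt";
            case '>': return "gt";
            case '&': return "amp";
            case '"': return "quot";
            default:  return null;
        }
    }

    static String escape(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); ) {
            int ch = source.codePointAt(i);
            i += Character.charCount(ch);
            String name = entityName(ch);
            if (name != null) {
                // A named entity exists: emit &name;
                out.append('&').append(name).append(';');
            } else if ((ch >= 0x20 && ch < 0x7F) || ch == '\n' || ch == '\r' || ch == '\t') {
                // Printable ASCII and common whitespace pass through unchanged.
                out.appendCodePoint(ch);
            } else {
                // Everything else becomes a hexadecimal character reference.
                out.append("&#x").append(Integer.toHexString(ch).toUpperCase()).append(';');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("a<b & \"c\"")); // prints a&lt;b &amp; &quot;c&quot;
        System.out.println(escape("\u00e9"));      // prints &#xE9;
    }
}
```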

printText

protected void printText(char[] chars, int start, int length, boolean preserveSpace, boolean unescaped)
Called to print additional text with whitespace handling. If spaces are preserved, the text is printed as if by calling {@link #printText(String,boolean,boolean)} with a call to {@link Printer#breakLine} for each new line. If spaces are not preserved, the text is broken at space boundaries if longer than the line width; multiple spaces are printed as such, but spaces at the beginning of a line are removed.

Parameters: text The text to print preserveSpace Space preserving flag unescaped Print unescaped

printText

protected void printText(String text, boolean preserveSpace, boolean unescaped)

processingInstruction

public final void processingInstruction(String target, String code)

processingInstructionIO

public void processingInstructionIO(String target, String code)

reset

public boolean reset()

serialize

public void serialize(Element elem)
Serializes the DOM element using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.

Parameters: elem The element to serialize

Throws: IOException An I/O exception occurred while serializing

serialize

public void serialize(DocumentFragment frag)
Serializes the DOM document fragment using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.

Parameters: frag The document fragment to serialize

Throws: IOException An I/O exception occurred while serializing

serialize

public void serialize(Document doc)
Serializes the DOM document using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.

Parameters: doc The document to serialize

Throws: IOException An I/O exception occurred while serializing

serializeElement

protected abstract void serializeElement(Element elem)
Called to serialize the DOM element. The element is serialized based on the serializer's method (XML, HTML, XHTML).

Parameters: elem The element to serialize

Throws: IOException An I/O exception occurred while serializing

serializeNode

protected void serializeNode(Node node)
Serialize the DOM node. This method is shared across XML, HTML and XHTML serializers and the differences are masked out in a separate {@link #serializeElement}.

Parameters: node The node to serialize

Throws: IOException An I/O exception occurred while serializing

See Also: BaseMarkupSerializer

serializePreRoot

protected void serializePreRoot()
Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first. Instead, such PIs and comments are accumulated inside a vector and serialized by calling this method. Will be called when the root element is serialized and when the document has finished serializing.

Throws: IOException An I/O exception occurred while serializing
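The buffering described above can be sketched as a list that holds early comments until the document type has been written. This is an illustration of the ordering problem only, with a deliberately minimal output model; `PreRootSketch` and all of its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of deferring pre-root comments until after the DOCTYPE.
final class PreRootSketch {
    private final StringBuilder out = new StringBuilder();
    private final List<String> preRoot = new ArrayList<>();
    private boolean rootStarted;

    // Before the root element, comments are buffered; afterwards they are written directly.
    void comment(String text) {
        String markup = "<!--" + text + "-->";
        if (rootStarted) {
            out.append(markup);
        } else {
            preRoot.add(markup);
        }
    }

    // The root element writes the document type first, then the buffered items.
    void startRoot(String name, String systemId) {
        out.append("<!DOCTYPE ").append(name)
           .append(" SYSTEM \"").append(systemId).append("\">\n");
        flushPreRoot();
        rootStarted = true;
        out.append('<').append(name).append('>');
    }

    void endRoot(String name) {
        out.append("</").append(name).append('>');
    }

    // Also flushes at the end, in case no root element was ever serialized.
    String endDocument() {
        flushPreRoot();
        return out.toString();
    }

    private void flushPreRoot() {
        for (String item : preRoot) {
            out.append(item).append('\n');
        }
        preRoot.clear();
    }

    public static void main(String[] args) {
        PreRootSketch sketch = new PreRootSketch();
        sketch.comment("early");
        sketch.startRoot("doc", "doc.dtd");
        sketch.comment("inside");
        sketch.endRoot("doc");
        System.out.println(sketch.endDocument());
    }
}
```

In the sample run, the comment reported before the root element ends up after the DOCTYPE line and before the opening tag of `doc`.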

setDocumentLocator

public void setDocumentLocator(Locator locator)

setOutputByteStream

public void setOutputByteStream(OutputStream output)

setOutputCharStream

public void setOutputCharStream(Writer writer)

setOutputFormat

public void setOutputFormat(OutputFormat format)

skippedEntity

public void skippedEntity(String name)

startCDATA

public void startCDATA()

startDocument

public void startDocument()

startDTD

public final void startDTD(String name, String publicId, String systemId)

startEntity

public void startEntity(String name)

startNonEscaping

public void startNonEscaping()

startPrefixMapping

public void startPrefixMapping(String prefix, String uri)

startPreserving

public void startPreserving()

surrogates

protected void surrogates(int high, int low)

unparsedEntityDecl

public void unparsedEntityDecl(String name, String publicId, String systemId, String notationName)
Copyright © 1999-2005 Apache XML Project. All Rights Reserved.