com.lowagie.text.pdf
Class SimpleXMLParser

java.lang.Object
  extended by com.lowagie.text.pdf.SimpleXMLParser

public class SimpleXMLParser
extends Object

A simple XML and HTML parser. Like a SAX parser, it is event-based, but it offers much less functionality.

The parser can:

The code is based on http://www.javaworld.com/javatips/jw-javatip128_p.html, with some extra code from Xerces to recognize the encoding.


Field Summary
private static int ATTRIBUTE_EQUAL
           
private static int ATTRIBUTE_LVALUE
           
private static int ATTRIBUTE_RVALUE
           
private static int CDATA
           
private static int CLOSE_TAG
           
private static int COMMENT
           
private static int DOCTYPE
           
private static int DONE
           
private static int ENTITY
           
private static HashMap entityMap
           
private static HashMap fIANA2JavaMap
           
private static int IN_TAG
           
private static int OPEN_TAG
           
private static int PRE
           
private static int QUOTE
           
private static int SINGLE_TAG
           
private static int START_TAG
           
private static int TEXT
           
 
Constructor Summary
private SimpleXMLParser()
           
 
Method Summary
static char decodeEntity(String s)
           
static String escapeXML(String s, boolean onlyASCII)
          Escapes a string with the appropriate XML codes.
private static void exc(String s, int line, int col)
           
private static String getDeclaredEncoding(String decl)
           
private static String getEncodingName(byte[] b4)
           
static String getJavaEncoding(String iana)
          Gets the Java encoding from the IANA encoding.
static void parse(SimpleXMLDocHandler doc, InputStream in)
          Parses the XML document firing the events to the handler.
static void parse(SimpleXMLDocHandler doc, Reader r)
           
static void parse(SimpleXMLDocHandler doc, SimpleXMLDocHandlerComment comment, Reader r, boolean html)
          Parses the XML document firing the events to the handler.
private static int popMode(Stack st)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

fIANA2JavaMap

private static final HashMap fIANA2JavaMap

entityMap

private static final HashMap entityMap

TEXT

private static final int TEXT
See Also:
Constant Field Values

ENTITY

private static final int ENTITY
See Also:
Constant Field Values

OPEN_TAG

private static final int OPEN_TAG
See Also:
Constant Field Values

CLOSE_TAG

private static final int CLOSE_TAG
See Also:
Constant Field Values

START_TAG

private static final int START_TAG
See Also:
Constant Field Values

ATTRIBUTE_LVALUE

private static final int ATTRIBUTE_LVALUE
See Also:
Constant Field Values

ATTRIBUTE_EQUAL

private static final int ATTRIBUTE_EQUAL
See Also:
Constant Field Values

ATTRIBUTE_RVALUE

private static final int ATTRIBUTE_RVALUE
See Also:
Constant Field Values

QUOTE

private static final int QUOTE
See Also:
Constant Field Values

IN_TAG

private static final int IN_TAG
See Also:
Constant Field Values

SINGLE_TAG

private static final int SINGLE_TAG
See Also:
Constant Field Values

COMMENT

private static final int COMMENT
See Also:
Constant Field Values

DONE

private static final int DONE
See Also:
Constant Field Values

DOCTYPE

private static final int DOCTYPE
See Also:
Constant Field Values

PRE

private static final int PRE
See Also:
Constant Field Values

CDATA

private static final int CDATA
See Also:
Constant Field Values

Constructor Detail

SimpleXMLParser

private SimpleXMLParser()

Method Detail

popMode

private static int popMode(Stack st)

parse

public static void parse(SimpleXMLDocHandler doc,
                         InputStream in)
                  throws IOException
Parses the XML document firing the events to the handler.

Parameters:
doc - the document handler
in - the document. The encoding is deduced from the stream. The stream is not closed.
Throws:
IOException - on error
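
The note that the encoding is deduced from the stream, together with the private helper getEncodingName(byte[] b4), suggests that the first four bytes are inspected. The following is a minimal sketch of that general technique (byte-order marks and the byte pattern of a leading "<?"), not the library's actual code. The class name, the method name, the UTF-8 fallback, and the omission of UTF-32 and EBCDIC are all assumptions of this sketch.

```java
// Illustrative sketch only; not the iText or Xerces implementation.
final class EncodingSniffSketch {

    // Guesses a Java charset name from the first four bytes of an XML stream.
    static String guessEncoding(byte[] b4) {
        if (b4 == null || b4.length != 4) {
            throw new IllegalArgumentException("exactly four bytes are required");
        }
        int b0 = b4[0] & 0xFF;
        int b1 = b4[1] & 0xFF;
        int b2 = b4[2] & 0xFF;
        int b3 = b4[3] & 0xFF;

        // Byte-order marks.
        if (b0 == 0xFE && b1 == 0xFF) return "UTF-16BE";
        if (b0 == 0xFF && b1 == 0xFE) return "UTF-16LE";
        if (b0 == 0xEF && b1 == 0xBB && b2 == 0xBF) return "UTF-8";

        // No BOM: "<?" encoded as two-byte units.
        if (b0 == 0x00 && b1 == 0x3C && b2 == 0x00 && b3 == 0x3F) return "UTF-16BE";
        if (b0 == 0x3C && b1 == 0x00 && b2 == 0x3F && b3 == 0x00) return "UTF-16LE";

        // Fallback chosen for this sketch: assume UTF-8 until an XML
        // declaration says otherwise.
        return "UTF-8";
    }
}
```

A caller would typically read the four bytes, call the guess function, and then reread the XML declaration with the guessed charset to pick up an explicit encoding attribute.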

getDeclaredEncoding

private static String getDeclaredEncoding(String decl)

getJavaEncoding

public static String getJavaEncoding(String iana)
Gets the Java encoding from the IANA encoding. If the encoding cannot be found, the input is returned unchanged.

Parameters:
iana - the IANA encoding
Returns:
the Java encoding
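
The documented fallback (return the input when no mapping exists) can be sketched as a plain map lookup. This is an illustration, not the library's table: the class and method names are hypothetical, only three well-known pairs are listed, and the case-insensitive lookup is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; not the iText implementation.
final class EncodingNameSketch {

    // A few IANA-to-Java name pairs; a real table would be much larger.
    private static final Map<String, String> IANA_TO_JAVA = new HashMap<>();
    static {
        IANA_TO_JAVA.put("UTF-8", "UTF8");
        IANA_TO_JAVA.put("US-ASCII", "ASCII");
        IANA_TO_JAVA.put("ISO-8859-1", "ISO8859_1");
    }

    // Returns the mapped Java name, or the input unchanged when no mapping exists.
    // Upper-casing the key is an assumption of this sketch.
    static String toJavaEncoding(String iana) {
        String mapped = IANA_TO_JAVA.get(iana.toUpperCase(Locale.ROOT));
        return mapped != null ? mapped : iana;
    }
}
```

Returning the input on a miss lets callers pass the result on as a charset name without a null check.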

parse

public static void parse(SimpleXMLDocHandler doc,
                         Reader r)
                  throws IOException
Throws:
IOException

parse

public static void parse(SimpleXMLDocHandler doc,
                         SimpleXMLDocHandlerComment comment,
                         Reader r,
                         boolean html)
                  throws IOException
Parses the XML document firing the events to the handler.

Parameters:
doc - the document handler
r - the document. The encoding is already resolved. The reader is not closed.
Throws:
IOException - on error
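
To illustrate what "firing the events to the handler" means, here is a minimal event-based scanner. It is a sketch under stated assumptions, not the parser documented here: the TagEvents interface is hypothetical (it is not SimpleXMLDocHandler), and the scanner handles only plain start tags, end tags, and text, with no attributes, comments, entities, or CDATA.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical handler interface for illustration only.
interface TagEvents {
    void startElement(String tag);
    void endElement(String tag);
    void text(String str);
}

final class EventScanSketch {

    // Reads one character at a time and fires an event as soon as a token is complete.
    static void scan(TagEvents handler, Reader r) throws IOException {
        StringBuilder buf = new StringBuilder();
        boolean inTag = false;
        int ch;
        while ((ch = r.read()) != -1) {
            if (!inTag && ch == '<') {
                // A tag begins: flush any pending text first.
                if (buf.length() > 0) handler.text(buf.toString());
                buf.setLength(0);
                inTag = true;
            } else if (inTag && ch == '>') {
                // A tag ends: decide whether it was a start or an end tag.
                String name = buf.toString();
                if (name.startsWith("/")) handler.endElement(name.substring(1));
                else handler.startElement(name);
                buf.setLength(0);
                inTag = false;
            } else {
                buf.append((char) ch);
            }
        }
        if (!inTag && buf.length() > 0) handler.text(buf.toString());
        // The reader is deliberately left open; closing it is the caller's job.
    }

    // Records events as strings so the firing order is easy to inspect.
    static List<String> trace(String xml) {
        List<String> events = new ArrayList<>();
        try {
            scan(new TagEvents() {
                public void startElement(String tag) { events.add("start:" + tag); }
                public void endElement(String tag) { events.add("end:" + tag); }
                public void text(String str) { events.add("text:" + str); }
            }, new StringReader(xml));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return events;
    }
}
```

For the input `<a>hi<b></b></a>`, trace records start:a, text:hi, start:b, end:b, end:a in that order. The handler never sees a document tree; it only sees the events as they occur.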

exc

private static void exc(String s,
                        int line,
                        int col)
                 throws IOException
Throws:
IOException

escapeXML

public static String escapeXML(String s,
                               boolean onlyASCII)
Escapes a string with the appropriate XML codes.

Parameters:
s - the string to be escaped
onlyASCII - if true, characters with codes above 127 are always escaped as &#nn;
Returns:
the escaped string
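
The escaping described above can be sketched as follows. This is an illustration, not the library's code: the class and method names are hypothetical, and the choice to escape all five predefined XML entities and to iterate by code point (so that characters outside the Basic Multilingual Plane produce a single numeric reference) is an assumption of this sketch.

```java
// Illustrative sketch only; not the iText implementation.
final class EscapeSketch {

    // Replaces XML markup characters with entity references. When onlyAscii is
    // true, every code point above 127 becomes a decimal reference (&#nn;).
    static String escape(String s, boolean onlyAscii) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            switch (cp) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                case '\'': sb.append("&apos;"); break;
                default:
                    if (onlyAscii && cp > 127) {
                        sb.append("&#").append(cp).append(';');
                    } else {
                        sb.appendCodePoint(cp);
                    }
            }
        }
        return sb.toString();
    }
}
```

With onlyAscii set to true, the result contains only ASCII characters, so it can be written safely regardless of the output encoding.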

decodeEntity

public static char decodeEntity(String s)
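
This page gives no description for decodeEntity. Judging only by its signature and the entityMap field, it plausibly looks up an entity name and returns the matching character. The sketch below shows that kind of lookup for the five predefined XML entities. The class and method names are hypothetical, and returning '\0' for an unknown name is an assumption of this sketch, not documented behavior.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only; not the iText implementation.
final class EntitySketch {

    // The five entities predefined by XML, keyed by name without '&' and ';'.
    private static final Map<String, Character> ENTITIES = new HashMap<>();
    static {
        ENTITIES.put("amp", '&');
        ENTITIES.put("lt", '<');
        ENTITIES.put("gt", '>');
        ENTITIES.put("quot", '"');
        ENTITIES.put("apos", '\'');
    }

    // Returns the character for a known entity name, or '\0' when the name is
    // unknown (the '\0' sentinel is an assumption of this sketch).
    static char decode(String name) {
        Character c = ENTITIES.get(name);
        return c == null ? '\0' : c;
    }
}
```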

getEncodingName

private static String getEncodingName(byte[] b4)