java.lang.Object
  com.google.gdata.util.common.base.UnicodeEscaper
public abstract class UnicodeEscaper
An Escaper that converts literal text into a format safe for inclusion in a particular context (such as an XML document). Typically (but not always), the inverse process of "unescaping" the text is performed automatically by the relevant parser.
For example, an XML escaper would convert the literal string "Foo<Bar>" into "Foo&lt;Bar&gt;" to prevent "<Bar>" from being confused with an XML tag. When the resulting XML document is parsed, the parser API will return this text as the original literal string "Foo<Bar>".
Note: This class is similar to CharEscaper but with one very important difference. A CharEscaper can only process Java UTF-16 characters in isolation and may not cope when it encounters surrogate pairs. This class facilitates the correct escaping of all Unicode characters.
There are important reasons, including potential security issues, to handle Unicode correctly. If you are considering implementing a new escaper, you should favor using UnicodeEscaper wherever possible.
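To illustrate why processing UTF-16 characters in isolation is a problem, the following minimal sketch uses only standard JDK calls to show that a character outside the Basic Multilingual Plane occupies two Java chars but is a single Unicode code point. The class name SurrogatePairDemo is made up for this illustration and is not part of the library.

```java
final class SurrogatePairDemo {
    /** Counts UTF-16 code units, which is what char-by-char processing sees. */
    static int codeUnits(String s) {
        return s.length();
    }

    /** Counts Unicode code points; a valid surrogate pair counts as one. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+1F600 lies outside the BMP, so Java stores it as the pair D83D DE00.
        String s = "a\uD83D\uDE00";
        System.out.println(codeUnits(s));  // 3
        System.out.println(codePoints(s)); // 2
        // Iterating by code point keeps the surrogate pair together.
        s.codePoints().forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```

An escaper that looks at each char separately would see the two halves of the pair as unrelated values, which is the situation the note above warns about.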
A UnicodeEscaper instance is required to be stateless, and safe when used concurrently by multiple threads.
Several popular escapers are defined as constants in the class CharEscapers. To create your own escapers, extend this class and implement the escape(int) method.
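The following is a rough sketch of what extending a code-point-based escaper might look like. Because this page does not show the signature of escape(int), the sketch assumes it receives a Unicode code point and returns the replacement characters, or null when no escaping is needed. CodePointEscaperSketch is a hypothetical stand-in for UnicodeEscaper so the example compiles on its own, and XmlEscaperSketch is likewise an invented name.

```java
/** Hypothetical stand-in for UnicodeEscaper so the sketch is self-contained. */
abstract class CodePointEscaperSketch {
    /** Assumed contract: return replacement chars, or null to keep cp as-is. */
    protected abstract char[] escape(int cp);

    /** Escapes a whole string one code point at a time. */
    public String escape(String string) {
        StringBuilder out = new StringBuilder(string.length());
        int i = 0;
        while (i < string.length()) {
            int cp = string.codePointAt(i);
            // codePointAt returns the bare surrogate value when it is unpaired.
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                throw new IllegalArgumentException("Unpaired surrogate at index " + i);
            }
            char[] replacement = escape(cp);
            if (replacement == null) {
                out.appendCodePoint(cp);
            } else {
                out.append(replacement);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }
}

/** Stateless escaper for XML text content (illustrative rule set only). */
final class XmlEscaperSketch extends CodePointEscaperSketch {
    @Override
    protected char[] escape(int cp) {
        switch (cp) {
            case '<': return "&lt;".toCharArray();
            case '>': return "&gt;".toCharArray();
            case '&': return "&amp;".toCharArray();
            default:  return null; // no escaping needed
        }
    }

    public static void main(String[] args) {
        System.out.println(new XmlEscaperSketch().escape("Foo<Bar>")); // Foo&lt;Bar&gt;
    }
}
```

The subclass holds no fields, so it satisfies the requirement that an escaper be stateless and safe for concurrent use.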
Constructor Summary | |
---|---|
UnicodeEscaper() | |
Method Summary | |
---|---|
java.lang.Appendable | escape(java.lang.Appendable out): Returns an Appendable instance which automatically escapes all text appended to it before passing the resulting text to an underlying Appendable. |
java.lang.String | escape(java.lang.String string): Returns the escaped form of a given literal string. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public UnicodeEscaper()
Method Detail |
---|
public java.lang.String escape(java.lang.String string)
Returns the escaped form of a given literal string.
If you are escaping input in arbitrary successive chunks, then it is not generally safe to use this method. If an input string ends with an unmatched high surrogate character, then this method will throw IllegalArgumentException. You should either ensure your input is valid UTF-16 before calling this method, or use an escaped Appendable (as returned by escape(Appendable)), which can cope with arbitrarily split input.
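One way to ensure input is valid UTF-16 before calling escape(String) is to scan it for unpaired surrogates first. The helper below is a minimal sketch using only standard JDK methods. The names Utf16Check and isWellFormedUtf16 are invented for this illustration and are not part of the library.

```java
final class Utf16Check {
    /** Returns true if every surrogate in s belongs to a high-then-low pair. */
    static boolean isWellFormedUtf16(CharSequence s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed immediately by a low one.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i += 2;
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is invalid.
                return false;
            } else {
                i++;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(isWellFormedUtf16("abc\uD83D\uDE00")); // true
        System.out.println(isWellFormedUtf16("abc\uD83D"));       // false
    }
}
```

A chunk that fails this check because it ends in a high surrogate is the case where the Appendable form is the safer choice, since the matching low surrogate may arrive in the next chunk.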
Note: When implementing an escaper, it is a good idea to override this method for efficiency by inlining the implementation of nextEscapeIndex(CharSequence, int, int) directly. Doing this for PercentEscaper more than doubled the performance for unescaped strings (as measured by CharEscapersBenchmark).
Specified by: escape in interface Escaper
Parameters: string - the literal string to be escaped
Returns: the escaped form of string
Throws:
java.lang.NullPointerException - if string is null
java.lang.IllegalArgumentException - if invalid surrogate characters are encountered
public java.lang.Appendable escape(java.lang.Appendable out)
Returns an Appendable instance which automatically escapes all text appended to it before passing the resulting text to an underlying Appendable.
Unlike escape(String), it is permitted to append arbitrarily split input to this Appendable, including input that is split over a surrogate pair. In this case the pending high surrogate character will not be processed until the corresponding low surrogate is appended. This means that a trailing high surrogate character at the end of the input cannot be detected and will be silently ignored. This is unavoidable since the Appendable interface has no close() method, and it is impossible to determine when the last characters have been appended.
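The buffering behavior described above can be sketched as a small Appendable wrapper that holds back a high surrogate until its low surrogate arrives. This is an illustration under stated assumptions, not the library's implementation: BufferingEscaperSketch, escapeCodePoint and escapeInChunks are invented names, and the XML-style escaping rule is only a placeholder.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

final class BufferingEscaperSketch implements Appendable {
    private final Appendable out;
    private int pendingHigh = -1; // -1 means no high surrogate is waiting

    BufferingEscaperSketch(Appendable out) {
        if (out == null) throw new NullPointerException("out");
        this.out = out;
    }

    /** Placeholder rule (an assumption): escape '<', '>' and '&' only. */
    private static String escapeCodePoint(int cp) {
        switch (cp) {
            case '<': return "&lt;";
            case '>': return "&gt;";
            case '&': return "&amp;";
            default:  return null;
        }
    }

    @Override
    public Appendable append(CharSequence csq) throws IOException {
        CharSequence s = (csq == null) ? "null" : csq; // per the Appendable contract
        return append(s, 0, s.length());
    }

    @Override
    public Appendable append(CharSequence csq, int start, int end) throws IOException {
        CharSequence s = (csq == null) ? "null" : csq;
        for (int i = start; i < end; i++) append(s.charAt(i));
        return this;
    }

    @Override
    public Appendable append(char c) throws IOException {
        if (pendingHigh != -1) {
            if (!Character.isLowSurrogate(c)) {
                throw new IllegalArgumentException("Expected low surrogate");
            }
            int cp = Character.toCodePoint((char) pendingHigh, c);
            pendingHigh = -1;
            emit(cp);
        } else if (Character.isHighSurrogate(c)) {
            pendingHigh = c; // hold until the matching low surrogate is appended
        } else if (Character.isLowSurrogate(c)) {
            throw new IllegalArgumentException("Unexpected low surrogate");
        } else {
            emit(c);
        }
        return this;
    }

    private void emit(int cp) throws IOException {
        String replacement = escapeCodePoint(cp);
        out.append(replacement != null ? replacement : new String(Character.toChars(cp)));
    }

    /** Convenience helper: feeds the chunks through the wrapper in order. */
    static String escapeInChunks(String... chunks) {
        StringBuilder sb = new StringBuilder();
        Appendable escaped = new BufferingEscaperSketch(sb);
        try {
            for (String chunk : chunks) escaped.append(chunk);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The surrogate pair is split across two chunks and is still emitted intact.
        System.out.println(escapeInChunks("<a\uD83D", "\uDE00>"));
        // A trailing high surrogate is never flushed, so only "x" is written.
        System.out.println(escapeInChunks("x\uD83D"));
    }
}
```

Note that the wrapper object itself carries state (the pending surrogate), so in this sketch each wrapper should be confined to a single stream of appends even though the escaping rule is stateless.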
The methods of the returned object will propagate any exceptions thrown by the underlying Appendable.
For well-formed UTF-16 the escaping behavior is identical to that of escape(String), and the following code is equivalent to (but much slower than) escaper.escape(string):
StringBuilder sb = new StringBuilder();
escaper.escape(sb).append(string);
return sb.toString();
Specified by: escape in interface Escaper
Parameters: out - the underlying Appendable to append escaped output to
Returns: an Appendable which passes text to out after escaping it
Throws:
java.lang.NullPointerException - if out is null
java.lang.IllegalArgumentException - if invalid surrogate characters are encountered