com.google.gdata.util.common.base
Class PercentEscaper

java.lang.Object
  extended by com.google.gdata.util.common.base.UnicodeEscaper
      extended by com.google.gdata.util.common.base.PercentEscaper
All Implemented Interfaces:
Escaper

public class PercentEscaper
extends UnicodeEscaper

A UnicodeEscaper that escapes some set of Java characters using the URI percent encoding scheme. The set of safe characters (those which remain unescaped) can be specified on construction.

For details on escaping URIs for use in web pages, see section 2.4 of RFC 3986.

In most cases you should not need to use this class directly. If you have no special requirements for escaping your URIs, use either CharEscapers.uriEscaper() or CharEscapers.uriEscaper(boolean).

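When encoding a String, the following rules apply:

The alphanumeric characters "a" through "z", "A" through "Z" and "0" through "9" remain the same.
Any additionally specified safe characters remain the same.
If plusForSpace was specified, the space character " " is converted into a plus sign "+".
All other characters are converted into one or more bytes using UTF-8 encoding, and each byte is then represented by the 3-character string "%XY", where "XY" is the two-digit, uppercase, hexadecimal representation of the byte value.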

RFC 2396 specifies the set of unreserved characters as the alphanumerics plus the marks "-", "_", ".", "!", "~", "*", "'", "(" and ")". It goes on to state:

Unreserved characters can be escaped without changing the semantics of the URI, but this should not be done unless the URI is being used in a context that does not allow the unescaped character to appear.

For performance reasons the only currently supported character encoding of this class is UTF-8.
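
The encoding behavior described for this class can be sketched in plain Java. The following is a minimal illustration, not the PercentEscaper source: the class name PercentEncodingSketch and the extraSafe parameter (standing in for the safeChars constructor argument) are hypothetical. It assumes that alphanumerics and the caller's safe characters pass through unchanged, that a space optionally becomes "+", and that everything else is encoded as UTF-8 and written as uppercase %XY triples.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only; this is not the PercentEscaper source.
final class PercentEncodingSketch {
    private static final String UPPER_HEX = "0123456789ABCDEF";

    // extraSafe plays the role of the safeChars constructor argument (hypothetical name).
    static String escape(String s, String extraSafe, boolean plusForSpace) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp);
            // codePointAt returns a lone surrogate as-is; reject it as invalid UTF-16.
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                throw new IllegalArgumentException("unpaired surrogate in input");
            }
            if (isAsciiAlphanumeric(cp) || extraSafe.indexOf(cp) >= 0) {
                out.appendCodePoint(cp);            // safe: copied unchanged
            } else if (plusForSpace && cp == ' ') {
                out.append('+');                    // optional form-style space handling
            } else {
                // Everything else: UTF-8 bytes, each written as %XY in uppercase hex.
                byte[] utf8 = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
                for (byte b : utf8) {
                    out.append('%')
                       .append(UPPER_HEX.charAt((b >> 4) & 0xF))
                       .append(UPPER_HEX.charAt(b & 0xF));
                }
            }
        }
        return out.toString();
    }

    private static boolean isAsciiAlphanumeric(int cp) {
        return (cp >= '0' && cp <= '9') || (cp >= 'a' && cp <= 'z') || (cp >= 'A' && cp <= 'Z');
    }
}
```

With this sketch, escape("caf\u00e9 au lait", "-_.*", true) yields caf%C3%A9+au+lait, and the same call with plusForSpace set to false yields caf%C3%A9%20au%20lait.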

Note: This escaper produces uppercase hexadecimal sequences. From RFC 3986:
"URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings."


Field Summary
static java.lang.String SAFECHARS_URLENCODER
          A string of safe characters that mimics the behavior of URLEncoder.
static java.lang.String SAFEPATHCHARS_URLENCODER
          A string of characters that do not need to be encoded when used in URI path segments, as specified in RFC 3986.
static java.lang.String SAFEQUERYSTRINGCHARS_URLENCODER
          A string of characters that do not need to be encoded when used in URI query strings, as specified in RFC 3986.
 
Constructor Summary
PercentEscaper(java.lang.String safeChars, boolean plusForSpace)
          Constructs a URI escaper with the specified safe characters and optional handling of the space character.
 
Method Summary
 java.lang.String escape(java.lang.String s)
          Returns the escaped form of a given literal string.
 
Methods inherited from class com.google.gdata.util.common.base.UnicodeEscaper
escape
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SAFECHARS_URLENCODER

public static final java.lang.String SAFECHARS_URLENCODER
A string of safe characters that mimics the behavior of URLEncoder.

See Also:
Constant Field Values
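
For reference, the behavior being mimicked is that of the JDK's java.net.URLEncoder, which leaves alphanumerics and the characters ".", "-", "*" and "_" unescaped and converts a space into "+". The small wrapper below (the class name UrlEncoderBehavior is hypothetical) shows that JDK behavior directly; it does not use this constant or PercentEscaper.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Demonstrates the JDK behavior that SAFECHARS_URLENCODER is documented to mimic.
final class UrlEncoderBehavior {
    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

Here encode("a b") returns a+b, encode("x-y_z.w*") returns the input unchanged, and encode("a~b") returns a%7Eb.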

SAFEPATHCHARS_URLENCODER

public static final java.lang.String SAFEPATHCHARS_URLENCODER
A string of characters that do not need to be encoded when used in URI path segments, as specified in RFC 3986. Note that some of these characters do need to be escaped when used in other parts of the URI.

See Also:
Constant Field Values

SAFEQUERYSTRINGCHARS_URLENCODER

public static final java.lang.String SAFEQUERYSTRINGCHARS_URLENCODER
A string of characters that do not need to be encoded when used in URI query strings, as specified in RFC 3986. Note that some of these characters do need to be escaped when used in other parts of the URI.

See Also:
Constant Field Values
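
The point that a character safe in one URI component may need escaping in another can be seen with the JDK's own java.net.URI, used here only as a stand-in. The sketch below does not use the constants above, and the class name UriComponentQuoting and the host example.com are hypothetical. It shows that "?" is quoted when it appears in a path but left alone in a query string.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Uses the multi-argument java.net.URI constructor, which quotes each component separately.
final class UriComponentQuoting {
    static String rawPath(String path) {
        return build(path, null).getRawPath();
    }

    static String rawQuery(String query) {
        return build("/", query).getRawQuery();
    }

    private static URI build(String path, String query) {
        try {
            return new URI("http", "example.com", path, query, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Here rawPath("/a?b") returns /a%3Fb, while rawQuery("x=a?b") returns x=a?b.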

Constructor Detail

PercentEscaper

public PercentEscaper(java.lang.String safeChars,
                      boolean plusForSpace)
Constructs a URI escaper with the specified safe characters and optional handling of the space character.

Parameters:
safeChars - a non-null string specifying additional safe characters for this escaper (the ranges 0..9, a..z and A..Z are always safe and should not be specified here)
plusForSpace - true if ASCII space should be escaped to + rather than %20
Throws:
java.lang.IllegalArgumentException - if any of the parameters are invalid

Method Detail

escape

public java.lang.String escape(java.lang.String s)
Description copied from class: UnicodeEscaper
Returns the escaped form of a given literal string.

If you are escaping input in arbitrary successive chunks, then it is not generally safe to use this method. If an input string ends with an unmatched high surrogate character, then this method will throw IllegalArgumentException. You should either ensure your input is valid UTF-16 before calling this method or use an escaped Appendable (as returned by UnicodeEscaper.escape(Appendable)) which can cope with arbitrarily split input.
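
One way to check for the chunking hazard before calling this method is sketched below. The helper class SurrogateChunkCheck and its method names are hypothetical and not part of this library; they rely only on java.lang.Character. The first method detects a chunk that ends in an unmatched high surrogate, and the second moves a proposed split point back by one char when it would fall between the two halves of a surrogate pair.

```java
// Hypothetical helpers for splitting input without breaking a surrogate pair.
final class SurrogateChunkCheck {
    // True if s ends with a high surrogate that has no low surrogate after it.
    static boolean endsWithUnmatchedHighSurrogate(String s) {
        return !s.isEmpty() && Character.isHighSurrogate(s.charAt(s.length() - 1));
    }

    // Returns index, or index - 1 if splitting at index would separate a surrogate pair.
    static int safeSplitIndex(String s, int index) {
        if (index > 0 && index < s.length()
                && Character.isHighSurrogate(s.charAt(index - 1))
                && Character.isLowSurrogate(s.charAt(index))) {
            return index - 1;
        }
        return index;
    }
}
```

For the string "a\uD83D\uDE00" (the letter a followed by U+1F600), the chunk substring(0, 2) ends with an unmatched high surrogate, and safeSplitIndex(text, 2) returns 1 so that the pair stays together in the next chunk.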

Note: When implementing an escaper it is a good idea to override this method for efficiency by inlining the implementation of UnicodeEscaper.nextEscapeIndex(CharSequence, int, int) directly. Doing this for PercentEscaper more than doubled the performance for unescaped strings (as measured by CharEscapersBenchmark).
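
The kind of inlining this note describes might look like the sketch below. It is an assumption-laden illustration, not the actual PercentEscaper code: the class InlinedScanSketch is hypothetical, its safe set is limited to ASCII letters and digits, and unlike the real method it does not reject unpaired surrogates. The scan for the first unsafe character happens directly inside escape, so a string that needs no escaping is returned as the same instance without any copying.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of an escape method with the "next escape index" scan inlined.
final class InlinedScanSketch {
    private static final String UPPER_HEX = "0123456789ABCDEF";

    // Sketch-only safe set: ASCII letters and digits.
    private static boolean isSafe(int c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    static String escape(String s) {
        // Inlined scan: find the first char that needs escaping.
        for (int i = 0; i < s.length(); i++) {
            if (!isSafe(s.charAt(i))) {
                return escapeFrom(s, i);   // slow path only when needed
            }
        }
        return s;                          // fast path: nothing to escape, no allocation
    }

    private static String escapeFrom(String s, int firstUnsafe) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        out.append(s, 0, firstUnsafe);     // copy the already-verified safe prefix
        // Bytes of multi-byte UTF-8 sequences are negative, so isSafe rejects them.
        for (byte b : s.substring(firstUnsafe).getBytes(StandardCharsets.UTF_8)) {
            if (isSafe(b)) {
                out.append((char) b);
            } else {
                out.append('%')
                   .append(UPPER_HEX.charAt((b >> 4) & 0xF))
                   .append(UPPER_HEX.charAt(b & 0xF));
            }
        }
        return out.toString();
    }
}
```

The benefit of the fast path comes from skipping both the per-call indirection and the output buffer for inputs that are already safe.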

Specified by:
escape in interface Escaper
Overrides:
escape in class UnicodeEscaper
Parameters:
s - the literal string to be escaped
Returns:
the escaped form of the string