com.google.gdata.util.common.base
Class StringUtil

java.lang.Object
  extended by com.google.gdata.util.common.base.StringUtil

public class StringUtil
extends java.lang.Object

Some common string manipulation utilities.


Field Summary
static java.lang.String EMPTY_STRING
           
static java.lang.String LINE_BREAKS
           
static java.lang.String WHITE_SPACES
           
 
Method Summary
static boolean allAscii(java.lang.String s)
          Determines if a string contains only ascii characters
static void appendHexJavaScriptRepresentation(java.lang.StringBuilder sb, char c)
          Appends a JavaScript representation of the character, in a hex-escaped format, to the given buffer.
static java.lang.String arrayMap2String(java.util.Map<java.lang.String,java.lang.String[]> map, java.lang.String keyValueDelim, java.lang.String entryDelim)
          Serializes a map
static java.lang.String bytesToHexString(byte[] bytes)
          Convert a byte array to a hex-encoding string: "a33bff00..."
static java.lang.String bytesToHexString(byte[] bytes, java.lang.Character delimiter)
          Convert a byte array to a hex-encoding string with the specified delimiter: "a3<delimiter>3b<delimiter>ff..."
static java.lang.String bytesToLatin1(byte[] ba)
          Convert a byte array to a String using Latin-1 (aka ISO-8859-1) encoding.
static java.util.List<java.lang.String> bytesToStringList(byte[] bytes)
          Convert an array of bytes into a List of Strings using UTF-8.
static java.lang.String bytesToUtf8(byte[] ba)
          Convert a byte array to a String using UTF-8 encoding.
static java.lang.String capitalize(java.lang.String s)
          Returns a string that is equivalent to the specified string with its first character converted to uppercase as by String.toUpperCase(java.util.Locale).
static java.lang.String collapse(java.lang.String str, java.lang.String chars, java.lang.String replacement)
          Replaces any string of matched characters with the supplied string.
static java.lang.String collapseWhitespace(java.lang.String str)
          Replaces any string of adjacent whitespace characters with the whitespace character " ".
static java.lang.String Collection2String(java.util.Collection<?> in, java.lang.String separator)
          Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.
static boolean containsCharRef(java.lang.String s)
          Determines if a string contains what looks like an html character reference.
static java.lang.String convertEOLToCRLF(java.lang.String input)
          Deprecated. Please inline this method.
static java.lang.String convertEOLToLF(java.lang.String input)
          Converts any instances of "\r" or "\r\n" style EOLs into "\n" (Line Feed).
static void copyStreams(java.io.InputStream in, java.io.OutputStream out)
          Copy all data from in to out in 4096 byte chunks.
static java.lang.String cropBetween(java.lang.String in, char limit)
          The old interface to cropBetween - using a single char limit
static java.lang.String cropBetween(java.lang.String in, java.lang.String limit)
          This removes characters between matching charLimit chars.
static int displayWidth(char ch)
          Returns the approximate display width of the character, measured in units of ascii characters.
static int displayWidth(java.lang.String s)
          Returns the approximate display width of the string, measured in units of ascii characters.
static byte[] encodingToBytes(java.lang.String str, java.lang.String encoding)
          Convert a String to a byte array using the specified encoding.
static boolean equals(java.lang.String s1, java.lang.String s2)
          Compares two strings, guarding against nulls. If both Strings are null, we return true.
static java.lang.String expandShardNames(java.lang.String dbSpecComponent)
           
static java.lang.String fixedWidth(java.lang.String[] lines, int width)
          Reformats the given array of lines to a fixed width by inserting carriage returns and trimming unnecessary whitespace.
static java.lang.String fixedWidth(java.lang.String str, int width)
          Reformats the given string to a fixed width by inserting carriage returns and trimming unnecessary whitespace.
static byte[] hexToBytes(java.lang.String str)
          Convert a string of hex digits to a byte array, with the first byte in the array being the MSB.
static java.lang.String htmlEscape(java.lang.String s)
          Escapes special characters (& < > ") from a string so it can safely be included in an HTML document.
static java.lang.String indent(java.lang.String iString, int iIndentDepth)
          Indents the given String per line.
static int indexOfChars(java.lang.String str, java.lang.String chars)
          Like String.indexOf() except that it will look for any of the characters in 'chars' (similar to C's strpbrk)
static int indexOfChars(java.lang.String str, java.lang.String chars, int fromIndex)
          Like String.indexOf() except that it will look for any of the characters in 'chars' (similar to C's strpbrk)
static java.lang.String insertBreakingWhitespace(int lineLen, java.lang.String original)
          Inserts spaces every lineLen characters so that the string will wrap.
static boolean isCjk(char ch)
          Determines if a character is a CJK ideograph or a character typically used only in CJK text.
static boolean isCjk(int codePoint)
          Determines if a character is a CJK ideograph or a character typically used only in CJK text.
static boolean isCjk(java.lang.String s)
          Determines if a string is a CJK word.
static boolean isEmpty(java.lang.String s)
          Helper function for null and empty string testing.
static boolean isEmptyOrWhitespace(java.lang.String s)
          Helper function for null, empty, and whitespace string testing.
static boolean isHebrew(int codePoint)
          Determines if a character is a Hebrew character.
static boolean isHebrew(java.lang.String s)
          Determines if a string is a Hebrew word.
static java.lang.String Iterator2String(java.util.Iterator<?> it, java.lang.String separator)
          Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.
static java.lang.String javaEscape(java.lang.String s)
          We escape some characters in s to be able to insert strings into Java code
static java.lang.String javaEscapeWithinAttribute(java.lang.String s)
          Escape a string so that it can be safely placed as value of an attribute.
static java.lang.String javaScriptEscape(java.lang.String s)
          We escape some characters in s to be able to insert strings into JavaScript code.
static java.lang.String javaScriptEscapeToAscii(java.lang.String s)
          We escape some characters in s to be able to insert strings into JavaScript code.
static java.lang.String javaScriptUnescape(java.lang.String s)
          Undoes the escaping performed by javaScriptEscape(String). Throws an IllegalArgumentException if the string contains bad escaping.
static java.lang.String javaUtilRegexEscape(java.lang.String s)
          Escapes the special characters from a string so it can be used as part of a regex pattern.
static java.lang.String join(java.util.Collection tokens, java.lang.String delimiter)
          Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.
static java.lang.String join(java.lang.Object[] tokens, java.lang.String delimiter)
          Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.
static java.lang.String joinInts(int[] tokens, java.lang.String delimiter)
          Concatenates the given int[] array into one String, inserting a delimiter between each pair of elements.
static java.lang.String joinLongs(long[] tokens, java.lang.String delimiter)
          Concatenates the given long[] array into one String, inserting a delimiter between each pair of elements.
static int lastIndexNotOf(java.lang.String str, java.lang.String chars, int fromIndex)
          Finds the last index in str of a character not in the characters in 'chars' (similar to ANSI string.find_last_not_of).
static java.lang.String lastToken(java.lang.String s, java.lang.String delimiter)
          Splits s with delimiters in delimiter and returns the last token
static byte[] latin1ToBytes(java.lang.String str)
          Convert a String to a byte array using Latin-1 (aka ISO-8859-1) encoding.
static java.lang.String list2String(java.util.Collection<?> in, java.lang.String separator)
          Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.
static <V> java.util.Map lowercaseKeys(java.util.Map<java.lang.String,V> map)
          Given a map, creates and returns a new map in which all keys are the lower-cased version of each key.
static java.lang.String lstrip(java.lang.String str)
          Deprecated. ensure the string is not null and use CharMatcher.LEGACY_WHITESPACE.trimLeadingFrom(str); also consider whether you really want the legacy whitespace definition, or something more standard like CharMatcher.WHITESPACE.
static java.lang.String makeSafe(java.lang.String s)
          Helper function for making null strings safe for comparisons, etc.
static <K,V> java.lang.String map2String(java.util.Map<K,V> in, java.lang.String sepKey, java.lang.String sepEntry)
          This function concatenates the elements of a Map in a string with form "..."
static java.lang.String maskLeft(java.lang.String s, int len, char mask_ch)
          Returns a string consisting of "s", with each of the first "len" characters replaced by "mask_ch" character.
static java.lang.String maskRight(java.lang.String s, int len, char mask_ch)
          Returns a string consisting of "s", with each of the last "len" characters replaced by "mask_ch" character.
static java.lang.String megastrip(java.lang.String str, boolean left, boolean right, java.lang.String what)
          Deprecated. ensure the string is not null and use
  • CharMatcher.anyOf(what).trimFrom(str) if left == true and right == true
  • CharMatcher.anyOf(what).trimLeadingFrom(str) if left == true and right == false
  • CharMatcher.anyOf(what).trimTrailingFrom(str) if left == false and right == true
static int numSharedChars(java.lang.String str, java.lang.String chars)
          Counts the number of (not necessarily distinct) characters in the string that also happen to be in 'chars'
static java.lang.String padLeft(java.lang.String s, int len, char pad_ch)
          Returns a string consisting of "s", plus enough copies of "pad_ch" on the left hand side to make the length of "s" equal to or greater than len (if "s" is already longer than "len", then "s" is returned).
static java.lang.String padRight(java.lang.String s, int len, char pad_ch)
          Returns a string consisting of "s", plus enough copies of "pad_ch" on the right hand side to make the length of "s" equal to or greater than len (if "s" is already longer than "len", then "s" is returned).
static java.lang.String[] parseDelimitedList(java.lang.String list, char delimiter)
          Parse a list of substrings separated by a given delimiter.
static java.lang.String pythonEscape(java.lang.String s)
          We escape some characters in s to be able to make the string executable from a python string
static java.lang.String regexEscape(java.lang.String s)
          Escapes the special characters from a string so it can be used as part of a regex pattern.
static java.lang.String regexReplacementEscape(java.lang.String s)
          Escapes the '\' and '$' characters, which comprise the subset of regex characters that has special meaning in methods such as:
static java.lang.String removeChars(java.lang.String str, java.lang.String oldchars)
          Remove any occurrences of 'oldchars' in 'str'.
static java.lang.String repeat(java.lang.String sourceString, int factor)
          Returns sourceString concatenated together 'factor' times.
static java.lang.String replace(java.lang.String str, java.lang.String what, java.lang.String with)
          Deprecated. Please use String.replace(CharSequence, CharSequence).
static java.lang.String replaceChars(java.lang.String str, java.lang.String oldchars, char newchar)
          Like String.replace() except that it accepts any number of old chars.
static java.lang.String replaceSmartQuotes(java.lang.String str)
          Replaces microsoft "smart quotes" (curly " and ') with their ascii counterparts.
static java.lang.String retainAllChars(java.lang.String str, java.lang.String retainChars)
          Removes all characters from 'str' that are not in 'retainChars'.
static java.lang.String rstrip(java.lang.String str)
          Deprecated. ensure the string is not null and use CharMatcher.LEGACY_WHITESPACE.trimTrailingFrom(str); also consider whether you really want the legacy whitespace definition, or something more standard like CharMatcher.WHITESPACE.
static java.lang.String[] split(java.lang.String str, java.lang.String delims)
          Split "str" by run of delimiters and return.
static java.lang.String[] split(java.lang.String str, java.lang.String delims, boolean trimTokens)
          Split "str" into tokens by delimiters and optionally remove white spaces from the split tokens.
static java.lang.String[] splitAndTrim(java.lang.String str, java.lang.String delims)
          Short hand for split(str, delims, true)
static int[] splitInts(java.lang.String str)
          Parse comma-separated list of ints and return as array.
static long[] splitLongs(java.lang.String str)
          Parse comma-separated list of longs and return as array.
static java.lang.String stream2String(java.io.InputStream is, int maxLength)
          Read a String of up to maxLength bytes from an InputStream
static java.util.Collection<java.lang.String> string2Collection(java.lang.String in, java.lang.String delimiter, boolean doStrip, java.util.Collection<java.lang.String> collection)
          Converts a delimited string to a collection of strings.
static java.util.LinkedList<java.lang.String> string2List(java.lang.String in, java.lang.String delimiter, boolean doStrip)
          This converts a String to a list of strings by extracting the substrings between delimiter
static java.util.HashMap<java.lang.String,java.lang.String> string2Map(java.lang.String in, java.lang.String delimEntry, java.lang.String delimKey, boolean doStripEntry)
          This converts a string to a Map.
static java.util.Set string2Set(java.lang.String in, java.lang.String delimiter, boolean doStrip)
          This converts a String to a Set of strings by extracting the substrings between delimiter
static java.lang.String strip(java.lang.String str)
          strip - strips both ways
static java.lang.String stripAndCollapse(java.lang.String str)
          Strips white space from both ends, and collapses white space in the middle.
static java.lang.String stripHtmlTags(java.lang.String string)
          Given a String, returns an equivalent String with all HTML tags stripped.
static java.lang.String stripNonDigits(java.lang.String str)
          Strips all non-digit characters from a string.
static java.lang.String stripPrefix(java.lang.String str, java.lang.String prefix)
          Returns the part of str that follows prefix if str starts with prefix, else null.
static java.lang.String stripPrefixIgnoreCase(java.lang.String str, java.lang.String prefix)
          Case insensitive version of stripPrefix.
static java.lang.String toNullIfEmpty(java.lang.String s)
          Helper function for making empty strings into a null.
static java.lang.String toNullIfEmptyOrWhitespace(java.lang.String s)
          Helper function for turning empty or whitespace strings into a null.
static java.lang.String toString(float[] iArray)
           
static java.lang.String toString(int[] iArray)
           
static java.lang.String toString(int[][] iArray)
           
static java.lang.String toString(long[] iArray)
           
static java.lang.String toString(long[][] iArray)
           
static java.lang.String toString(java.lang.Object[] obj)
           
static java.lang.String toString(java.lang.String s)
          Returns the string, in single quotes, or "NULL".
static java.lang.String toString(java.lang.String[] iArray)
           
static java.lang.String toUpperCase(java.lang.String src)
          Safely convert the string to uppercase.
static java.io.InputStream toUTF8InputStream(java.lang.String str)
          Replacement for deprecated StringBufferInputStream().
static java.lang.String unescapeCString(java.lang.String s)
          Unescape any C escape sequences (\n, \r, \\, \ooo, etc) and return the resulting string.
static java.lang.String unescapeHTML(java.lang.String s)
          Replace all the occurrences of HTML escape strings with the respective characters.
static java.lang.String unescapeMySQLString(java.lang.String s)
          Unescape any MySQL escape sequences.
static java.lang.String unicodeEscape(java.lang.String s)
          Replaces each non-ascii character in s with its Unicode escape sequence \\uxxxx where xxxx is a hex number.
static byte[] utf8ToBytes(java.lang.String str)
          Convert a String to a byte array using UTF-8 encoding.
static java.lang.String xmlCDataEscape(java.lang.String s)
          Escape a string that is meant to be embedded in a CDATA section.
static java.lang.String xmlContentEscape(java.lang.String s)
          Escape a string for use as XML element content.
static java.lang.String xmlEscape(java.lang.String s)
          Returns a form of "s" appropriate for including in an XML document, after escaping certain special characters (e.g.
static java.lang.String xmlSingleQuotedEscape(java.lang.String s)
          Escape a string for use inside XML single-quoted attributes.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

EMPTY_STRING

public static final java.lang.String EMPTY_STRING
See Also:
Constant Field Values

WHITE_SPACES

public static final java.lang.String WHITE_SPACES
See Also:
Constant Field Values

LINE_BREAKS

public static final java.lang.String LINE_BREAKS
See Also:
Constant Field Values
Method Detail

split

public static java.lang.String[] split(java.lang.String str,
                                       java.lang.String delims)
Split "str" by run of delimiters and return.


split

public static java.lang.String[] split(java.lang.String str,
                                       java.lang.String delims,
                                       boolean trimTokens)
Split "str" into tokens by delimiters and optionally remove white spaces from the split tokens.

Parameters:
trimTokens - if true, then trim the tokens
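
The splitting behaviour described above can be sketched with java.util.StringTokenizer. This is only an illustration, not the library's code: the class name SplitSketch is hypothetical, and it assumes that every character of delims is a delimiter, that runs of adjacent delimiters produce no empty tokens, and that trimming means String.trim(). The real method may differ on any of these points.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

final class SplitSketch {
    // Hypothetical stand-in for split(str, delims, trimTokens).
    static String[] split(String str, String delims, boolean trimTokens) {
        // StringTokenizer treats each character of delims as a delimiter
        // and skips over runs of adjacent delimiters.
        StringTokenizer tokenizer = new StringTokenizer(str, delims);
        List<String> tokens = new ArrayList<String>();
        while (tokenizer.hasMoreTokens()) {
            String token = tokenizer.nextToken();
            tokens.add(trimTokens ? token.trim() : token);
        }
        return tokens.toArray(new String[0]);
    }
}
```

Under these assumptions, split("a, b,,c", ",", true) yields the three tokens "a", "b" and "c".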

splitAndTrim

public static java.lang.String[] splitAndTrim(java.lang.String str,
                                              java.lang.String delims)
Short hand for split(str, delims, true)


splitInts

public static int[] splitInts(java.lang.String str)
                       throws java.lang.IllegalArgumentException
Parse comma-separated list of ints and return as array.

Throws:
java.lang.IllegalArgumentException
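
A minimal sketch of parsing a comma-separated list of ints, assuming that whitespace around each number is ignored and that an empty string yields an empty array (neither is stated above). The class name SplitIntsSketch is hypothetical. Integer.parseInt throws NumberFormatException, which is a subclass of IllegalArgumentException, so malformed input surfaces as the documented exception type.

```java
final class SplitIntsSketch {
    // Hypothetical stand-in for splitInts(str).
    static int[] splitInts(String str) {
        if (str.isEmpty()) {
            return new int[0]; // assumption: empty input means no values
        }
        String[] parts = str.split(",");
        int[] result = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // NumberFormatException extends IllegalArgumentException.
            result[i] = Integer.parseInt(parts[i].trim());
        }
        return result;
    }
}
```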

splitLongs

public static long[] splitLongs(java.lang.String str)
                         throws java.lang.IllegalArgumentException
Parse comma-separated list of longs and return as array.

Throws:
java.lang.IllegalArgumentException

joinInts

public static java.lang.String joinInts(int[] tokens,
                                        java.lang.String delimiter)
Concatenates the given int[] array into one String, inserting a delimiter between each pair of elements.
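
The delimiter goes between each pair of elements, so there is no leading or trailing delimiter. A small sketch of that behaviour (the class name JoinIntsSketch is hypothetical, and this is not the library's implementation):

```java
final class JoinIntsSketch {
    // Hypothetical stand-in for joinInts(tokens, delimiter).
    static String joinInts(int[] tokens, String delimiter) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < tokens.length; i++) {
            if (i > 0) {
                sb.append(delimiter); // only between elements
            }
            sb.append(tokens[i]);
        }
        return sb.toString();
    }
}
```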


joinLongs

public static java.lang.String joinLongs(long[] tokens,
                                         java.lang.String delimiter)
Concatenates the given long[] array into one String, inserting a delimiter between each pair of elements.


join

@Deprecated
public static java.lang.String join(java.lang.Object[] tokens,
                                               java.lang.String delimiter)
Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.

Concatenates the String representations of the elements of an Object[] array into one String, and inserts a delimiter between each pair of elements.

This includes the String[] case, because if s is a String, then s.toString() returns s.


join

@Deprecated
public static java.lang.String join(java.util.Collection tokens,
                                               java.lang.String delimiter)
Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.

Same as join(Object[],String), but takes a Collection instead.


replace

@Deprecated
public static java.lang.String replace(java.lang.String str,
                                                  java.lang.String what,
                                                  java.lang.String with)
Deprecated. Please use String.replace(CharSequence, CharSequence).

Replaces the occurrences of 'what' in 'str' with 'with'.

Parameters:
str - the string to process
what - the substring to replace
with - the replacement string
Returns:
str with each occurrence of 'what' replaced by 'with'

fixedWidth

public static java.lang.String fixedWidth(java.lang.String str,
                                          int width)
Reformats the given string to a fixed width by inserting carriage returns and trimming unnecessary whitespace.

Parameters:
str - the string to format
width - the fixed width (in characters)

fixedWidth

public static java.lang.String fixedWidth(java.lang.String[] lines,
                                          int width)
Reformats the given array of lines to a fixed width by inserting carriage returns and trimming unnecessary whitespace.

Parameters:
lines - array of lines to format
width - the fixed width (in characters)

insertBreakingWhitespace

public static java.lang.String insertBreakingWhitespace(int lineLen,
                                                        java.lang.String original)
Inserts spaces every lineLen characters so that the string will wrap.

Parameters:
lineLen - the length of the substrings to separate with spaces.
original - the original String
Returns:
original String with spaces inserted every lineLen characters.
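
As a rough sketch of the idea, the loop below copies the input in chunks of lineLen characters and puts a single space between chunks. It assumes that existing whitespace in the input is not taken into account, which the real method may well handle differently. The class name BreakingWhitespaceSketch is hypothetical.

```java
final class BreakingWhitespaceSketch {
    // Hypothetical stand-in for insertBreakingWhitespace(lineLen, original).
    static String insertBreakingWhitespace(int lineLen, String original) {
        if (lineLen <= 0) {
            throw new IllegalArgumentException("lineLen must be positive");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < original.length(); i += lineLen) {
            if (i > 0) {
                sb.append(' '); // break opportunity between chunks
            }
            sb.append(original, i, Math.min(i + lineLen, original.length()));
        }
        return sb.toString();
    }
}
```

With lineLen 3, the input "abcdefgh" becomes "abc def gh".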

indent

public static java.lang.String indent(java.lang.String iString,
                                      int iIndentDepth)
Indents the given String per line.

Parameters:
iString - The string to indent.
iIndentDepth - The depth of the indentation.
Returns:
The indented string.

megastrip

@Deprecated
public static java.lang.String megastrip(java.lang.String str,
                                                    boolean left,
                                                    boolean right,
                                                    java.lang.String what)
Deprecated. ensure the string is not null and use
  • CharMatcher.anyOf(what).trimFrom(str) if left == true and right == true
  • CharMatcher.anyOf(what).trimLeadingFrom(str) if left == true and right == false
  • CharMatcher.anyOf(what).trimTrailingFrom(str) if left == false and right == true

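Strips the characters in 'what' from the left end, the right end, or both ends of the string, as selected by 'left' and 'right'.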

Parameters:
str - the string to strip
left - strip from left
right - strip from right
what - character(s) to strip
Returns:
the stripped string

lstrip

@Deprecated
public static java.lang.String lstrip(java.lang.String str)
Deprecated. ensure the string is not null and use CharMatcher.LEGACY_WHITESPACE.trimLeadingFrom(str); also consider whether you really want the legacy whitespace definition, or something more standard like CharMatcher.WHITESPACE.

lstrip - strips spaces from left

Parameters:
str - what to strip
Returns:
the stripped string

rstrip

@Deprecated
public static java.lang.String rstrip(java.lang.String str)
Deprecated. ensure the string is not null and use CharMatcher.LEGACY_WHITESPACE.trimTrailingFrom(str); also consider whether you really want the legacy whitespace definition, or something more standard like CharMatcher.WHITESPACE.

rstrip - strips spaces from right

Parameters:
str - what to strip
Returns:
the stripped string

strip

public static java.lang.String strip(java.lang.String str)
strip - strips both ways

Parameters:
str - what to strip
Returns:
the stripped string

stripAndCollapse

public static java.lang.String stripAndCollapse(java.lang.String str)
Strips white space from both ends, and collapses white space in the middle.

Parameters:
str - what to strip
Returns:
the stripped and collapsed string

stripPrefix

public static java.lang.String stripPrefix(java.lang.String str,
                                           java.lang.String prefix)
Returns the part of str that follows prefix if str starts with prefix, else null. Analogous to the C++ functions strprefix and var_strprefix.


stripPrefixIgnoreCase

public static java.lang.String stripPrefixIgnoreCase(java.lang.String str,
                                                     java.lang.String prefix)
Case-insensitive version of stripPrefix. Analogous to the C++ functions strcaseprefix and var_strcaseprefix.
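
The contract of the two prefix methods (remainder on a match, null otherwise) can be sketched with plain JDK calls. The class name StripPrefixSketch is hypothetical, and the case-insensitive comparison here is whatever String.regionMatches provides, which may not match the library exactly.

```java
final class StripPrefixSketch {
    // Hypothetical stand-in for stripPrefix(str, prefix).
    static String stripPrefix(String str, String prefix) {
        return str.startsWith(prefix) ? str.substring(prefix.length()) : null;
    }

    // Hypothetical stand-in for stripPrefixIgnoreCase(str, prefix).
    static String stripPrefixIgnoreCase(String str, String prefix) {
        // regionMatches returns false if prefix is longer than str.
        boolean matches = str.regionMatches(true, 0, prefix, 0, prefix.length());
        return matches ? str.substring(prefix.length()) : null;
    }
}
```

Callers need to check for null before using the result, since a non-matching prefix is not reported as an empty string.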


stripNonDigits

public static java.lang.String stripNonDigits(java.lang.String str)
Strips all non-digit characters from a string. The resulting string will only contain characters for which isDigit() returns true.

Parameters:
str - the string to strip
Returns:
a string consisting of digits only, or an empty string
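
A short sketch of digit filtering, assuming that "isDigit()" above refers to java.lang.Character.isDigit(char). The class name StripNonDigitsSketch is hypothetical.

```java
final class StripNonDigitsSketch {
    // Hypothetical stand-in for stripNonDigits(str).
    static String stripNonDigits(String str) {
        StringBuilder sb = new StringBuilder(str.length());
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (Character.isDigit(c)) {
                sb.append(c); // keep digits, drop everything else
            }
        }
        return sb.toString();
    }
}
```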

numSharedChars

public static int numSharedChars(java.lang.String str,
                                 java.lang.String chars)
Counts the number of (not necessarily distinct) characters in the string that also happen to be in 'chars'


indexOfChars

public static int indexOfChars(java.lang.String str,
                               java.lang.String chars,
                               int fromIndex)
Like String.indexOf() except that it will look for any of the characters in 'chars' (similar to C's strpbrk)


indexOfChars

public static int indexOfChars(java.lang.String str,
                               java.lang.String chars)
Like String.indexOf() except that it will look for any of the characters in 'chars' (similar to C's strpbrk)


lastIndexNotOf

public static int lastIndexNotOf(java.lang.String str,
                                 java.lang.String chars,
                                 int fromIndex)
Finds the last index in str of a character not in the characters in 'chars' (similar to ANSI string.find_last_not_of). Returns -1 if no such character can be found.
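
The sketch below assumes the search runs backwards starting at fromIndex, as find_last_not_of does, and that a fromIndex beyond the end of the string is clamped to the last character. The class name LastIndexNotOfSketch is hypothetical.

```java
final class LastIndexNotOfSketch {
    // Hypothetical stand-in for lastIndexNotOf(str, chars, fromIndex).
    static int lastIndexNotOf(String str, String chars, int fromIndex) {
        int start = Math.min(fromIndex, str.length() - 1);
        for (int i = start; i >= 0; i--) {
            if (chars.indexOf(str.charAt(i)) < 0) {
                return i; // first character, scanning backwards, not in chars
            }
        }
        return -1; // every character scanned was in chars
    }
}
```

For example, scanning "hello   " backwards from index 7 with chars " " skips the three trailing spaces and stops at the 'o' at index 4.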


replaceChars

public static java.lang.String replaceChars(java.lang.String str,
                                            java.lang.String oldchars,
                                            char newchar)
Like String.replace() except that it accepts any number of old chars. Replaces any occurrences of 'oldchars' in 'str' with 'newchar'. Example: replaceChars("Hello, world!", "H,!", ' ') returns " ello  world " (the comma becomes a space next to the existing space).


removeChars

public static java.lang.String removeChars(java.lang.String str,
                                           java.lang.String oldchars)
Remove any occurrences of 'oldchars' in 'str'. Example: removeChars("Hello, world!", ",!") returns "Hello world"


retainAllChars

public static java.lang.String retainAllChars(java.lang.String str,
                                              java.lang.String retainChars)
Removes all characters from 'str' that are not in 'retainChars'. Example: retainAllChars("Hello, world!", "lo") returns "llool"
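
replaceChars, removeChars and retainAllChars all treat their second argument as a set of characters rather than as a substring. The following sketch reproduces the documented examples with plain JDK code. The class name CharSetOpsSketch is hypothetical and this is not the library's implementation.

```java
final class CharSetOpsSketch {
    // Replace every character of str that occurs in oldchars with newchar.
    static String replaceChars(String str, String oldchars, char newchar) {
        StringBuilder sb = new StringBuilder(str.length());
        for (char c : str.toCharArray()) {
            sb.append(oldchars.indexOf(c) >= 0 ? newchar : c);
        }
        return sb.toString();
    }

    // Drop every character of str that occurs in oldchars.
    static String removeChars(String str, String oldchars) {
        StringBuilder sb = new StringBuilder(str.length());
        for (char c : str.toCharArray()) {
            if (oldchars.indexOf(c) < 0) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Keep only the characters of str that occur in retainChars.
    static String retainAllChars(String str, String retainChars) {
        StringBuilder sb = new StringBuilder(str.length());
        for (char c : str.toCharArray()) {
            if (retainChars.indexOf(c) >= 0) {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Note that replacing both ',' and the neighbouring characters one-for-one preserves length, so replaceChars("Hello, world!", "H,!", ' ') contains two consecutive spaces after "ello".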


replaceSmartQuotes

public static java.lang.String replaceSmartQuotes(java.lang.String str)
Replaces microsoft "smart quotes" (curly " and ') with their ascii counterparts.


hexToBytes

public static byte[] hexToBytes(java.lang.String str)
Convert a string of hex digits to a byte array, with the first byte in the array being the MSB. The string passed in should be just the raw digits (upper or lower case), with no leading or trailing characters (like '0x' or 'h'). An odd number of characters is supported. If the string is empty, an empty array will be returned.

This is significantly faster than using new BigInteger(str, 16).toByteArray(), especially with larger strings. Here are the results of some microbenchmarks done on a P4 2.8GHz 2GB RAM running linux 2.4.22-gg11 and JDK 1.5 with an optimized build:

String length    hexToBytes (usec)    BigInteger
------------------------------------------------
16               0.570                1.43
256              8.21                 44.4
1024             32.8                 526
16384            546                  121000
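
A hedged sketch of the conversion, not the benchmarked implementation. It assumes that an odd-length string is read as if it had a leading zero digit (so "fff" gives two bytes, 0x0f and 0xff) and that a non-hex character is rejected with IllegalArgumentException; neither detail is stated above. The class name HexToBytesSketch is hypothetical.

```java
final class HexToBytesSketch {
    // Hypothetical stand-in for hexToBytes(str).
    static byte[] hexToBytes(String str) {
        int len = str.length();
        byte[] bytes = new byte[(len + 1) / 2];
        int strIdx = 0;
        int byteIdx = 0;
        if (len % 2 == 1) {
            // Assumption: an odd digit count means an implicit leading zero.
            bytes[byteIdx++] = (byte) hexDigit(str.charAt(strIdx++));
        }
        while (strIdx < len) {
            int hi = hexDigit(str.charAt(strIdx++));
            int lo = hexDigit(str.charAt(strIdx++));
            bytes[byteIdx++] = (byte) ((hi << 4) | lo);
        }
        return bytes;
    }

    private static int hexDigit(char c) {
        int value = Character.digit(c, 16); // accepts upper and lower case
        if (value < 0) {
            throw new IllegalArgumentException("not a hex digit: " + c);
        }
        return value;
    }
}
```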


convertEOLToLF

public static java.lang.String convertEOLToLF(java.lang.String input)
Converts any instances of "\r" or "\r\n" style EOLs into "\n" (Line Feed).
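
One way to get this effect with the JDK alone is to rewrite "\r\n" pairs first and then any remaining lone "\r". This is a sketch of the described behaviour under a hypothetical class name, EolSketch, not the library's code.

```java
final class EolSketch {
    // Hypothetical stand-in for convertEOLToLF(input).
    static String convertEOLToLF(String input) {
        // Handle CRLF before lone CR so that "\r\n" does not become "\n\n".
        return input.replace("\r\n", "\n").replace('\r', '\n');
    }
}
```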


convertEOLToCRLF

@Deprecated
public static java.lang.String convertEOLToCRLF(java.lang.String input)
Deprecated. Please inline this method.


padLeft

public static java.lang.String padLeft(java.lang.String s,
                                       int len,
                                       char pad_ch)
Returns a string consisting of "s", plus enough copies of "pad_ch" on the left hand side to make the length of "s" equal to or greater than len (if "s" is already longer than "len", then "s" is returned).


padRight

public static java.lang.String padRight(java.lang.String s,
                                        int len,
                                        char pad_ch)
Returns a string consisting of "s", plus enough copies of "pad_ch" on the right hand side to make the length of "s" equal to or greater than len (if "s" is already longer than "len", then "s" is returned).
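
The padding contract of padLeft and padRight can be sketched as follows. The class name PadSketch is hypothetical and the code is an illustration of the description, not the library source.

```java
final class PadSketch {
    // Hypothetical stand-in for padLeft(s, len, pad_ch).
    static String padLeft(String s, int len, char padCh) {
        if (s.length() >= len) {
            return s; // already long enough: returned unchanged
        }
        StringBuilder sb = new StringBuilder(len);
        for (int i = s.length(); i < len; i++) {
            sb.append(padCh);
        }
        return sb.append(s).toString();
    }

    // Hypothetical stand-in for padRight(s, len, pad_ch).
    static String padRight(String s, int len, char padCh) {
        if (s.length() >= len) {
            return s;
        }
        StringBuilder sb = new StringBuilder(len).append(s);
        for (int i = s.length(); i < len; i++) {
            sb.append(padCh);
        }
        return sb.toString();
    }
}
```

For instance, padLeft("42", 5, '0') gives "00042", while padLeft("hello", 3, ' ') returns "hello" as is.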


maskLeft

public static java.lang.String maskLeft(java.lang.String s,
                                        int len,
                                        char mask_ch)
Returns a string consisting of "s", with each of the first "len" characters replaced by "mask_ch" character.


maskRight

public static java.lang.String maskRight(java.lang.String s,
                                         int len,
                                         char mask_ch)
Returns a string consisting of "s", with each of the last "len" characters replaced by "mask_ch" character.
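
A sketch of both masking methods, assuming that "len" is clamped to the range from 0 to the length of "s" (how the real methods treat an out-of-range "len" is not documented above). The class name MaskSketch is hypothetical.

```java
final class MaskSketch {
    // Hypothetical stand-in for maskLeft(s, len, mask_ch).
    static String maskLeft(String s, int len, char maskCh) {
        int n = Math.max(0, Math.min(len, s.length())); // assumed clamping
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < n; i++) {
            sb.append(maskCh);
        }
        return sb.append(s, n, s.length()).toString();
    }

    // Hypothetical stand-in for maskRight(s, len, mask_ch).
    static String maskRight(String s, int len, char maskCh) {
        int n = Math.max(0, Math.min(len, s.length())); // assumed clamping
        StringBuilder sb = new StringBuilder(s.length());
        sb.append(s, 0, s.length() - n);
        for (int i = 0; i < n; i++) {
            sb.append(maskCh);
        }
        return sb.toString();
    }
}
```

The result always has the same length as the input, for example maskLeft("1234567890", 6, '*') gives "******7890".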


unescapeCString

public static java.lang.String unescapeCString(java.lang.String s)
Unescape any C escape sequences (\n, \r, \\, \ooo, etc) and return the resulting string.


unescapeMySQLString

public static java.lang.String unescapeMySQLString(java.lang.String s)
                                            throws java.lang.IllegalArgumentException
Unescape any MySQL escape sequences. See MySQL language reference Chapter 6 at http://www.mysql.com/doc/. This function will not work for other SQL-like dialects.

Parameters:
s - string to unescape, with the surrounding quotes.
Returns:
unescaped string, without the surrounding quotes.
Throws:
java.lang.IllegalArgumentException - if s is not a valid MySQL string.

unescapeHTML

public static final java.lang.String unescapeHTML(java.lang.String s)
Replace all the occurrences of HTML escape strings with the respective characters.

Parameters:
s - a String value
Returns:
a String value

stripHtmlTags

public static java.lang.String stripHtmlTags(java.lang.String string)
Given a String, returns an equivalent String with all HTML tags stripped. Note that HTML entities, such as "&amp;" will still be preserved.


pythonEscape

public static java.lang.String pythonEscape(java.lang.String s)
We escape some characters in s to be able to make the string executable from a python string


javaScriptEscape

public static java.lang.String javaScriptEscape(java.lang.String s)
We escape some characters in s to be able to insert strings into JavaScript code. Also, make sure that we don't write out --> or </scrip, which may close a script tag.

javaScriptEscapeToAscii

public static java.lang.String javaScriptEscapeToAscii(java.lang.String s)
We escape some characters in s to be able to insert strings into JavaScript code. Also, make sure that we don't write out --> or </scrip, which may close a script tag. Turns all non-ASCII characters into ASCII JavaScript escape sequences (e.g. \uXXXX).


appendHexJavaScriptRepresentation

public static void appendHexJavaScriptRepresentation(java.lang.StringBuilder sb,
                                                     char c)
Returns a javascript representation of the character in a hex escaped format. Although this is a rather specific method, it is made public because it is also used by the JSCompiler.

Parameters:
sb - The buffer to which the hex representation should be appended.
c - The character to be appended.

javaScriptUnescape

public static java.lang.String javaScriptUnescape(java.lang.String s)
Undoes the escaping performed by javaScriptEscape(java.lang.String). Throws an IllegalArgumentException if the string contains bad escaping.


xmlContentEscape

public static java.lang.String xmlContentEscape(java.lang.String s)
Escape a string for use as XML element content. This escapes only less-than and ampersand.


xmlSingleQuotedEscape

public static java.lang.String xmlSingleQuotedEscape(java.lang.String s)
Escape a string for use inside XML single-quoted attributes. This escapes less-than, single-quote, ampersand, and (not strictly necessary) newlines.


xmlCDataEscape

public static java.lang.String xmlCDataEscape(java.lang.String s)
Escape a string that is meant to be embedded in a CDATA section. The returned string is guaranteed to be valid CDATA content. The syntax of CDATA sections is the following:
<![CDATA[...]]>
The only invalid character sequence in a CDATA section is "]]>". If this sequence is present in the input string, we replace it by closing the current CDATA section, writing ']]&gt;', and then reopening a new CDATA section.
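
The close, write, reopen replacement described above can be sketched as a single literal substitution. This is a minimal sketch using only the JDK, not the library's code; the class name CDataEscapeSketch and the wrap helper are hypothetical and exist only for illustration.

```java
final class CDataEscapeSketch {
    // Every "]]>" in the input becomes: close the section ("]]>"),
    // emit the escaped text "]]&gt;", then reopen a section ("<![CDATA[").
    static String xmlCDataEscape(String s) {
        return s.replace("]]>", "]]>" + "]]&gt;" + "<![CDATA[");
    }

    // Hypothetical helper: wraps the escaped content in a complete CDATA section.
    static String wrap(String s) {
        return "<![CDATA[" + xmlCDataEscape(s) + "]]>";
    }

    public static void main(String[] args) {
        System.out.println(wrap("a]]>b"));
    }
}
```

An XML parser reading the wrapped result sees a CDATA section containing "a", the character data "]]>", and a second CDATA section containing "b", which together yield the original text.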


javaEscape

public static java.lang.String javaEscape(java.lang.String s)
We escape some characters in s to be able to insert strings into Java code


javaEscapeWithinAttribute

public static java.lang.String javaEscapeWithinAttribute(java.lang.String s)
Escape a string so that it can be safely placed as value of an attribute. This is essentially similar to the javaEscape(java.lang.String) except that it escapes double quote to the HTML literal &quot;. This is to prevent the double quote from being interpreted as the character closing the attribute.


xmlEscape

public static java.lang.String xmlEscape(java.lang.String s)
Returns a form of "s" appropriate for including in an XML document, after escaping certain special characters (e.g. '&' => '&amp;', etc.)


htmlEscape

public static java.lang.String htmlEscape(java.lang.String s)
Escapes special characters (& < > ") from a string so it can safely be included in an HTML document. (same as xmlEscape except that htmlEscape does not escape the apostrophe character).
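
A minimal sketch of escaping the four characters listed for htmlEscape, using only the JDK. The class name HtmlEscapeSketch is hypothetical, and the choice of named entities (&amp;amp;, &amp;lt;, &amp;gt;, &amp;quot;) is an assumption, since the documentation does not state which replacement strings the library emits.

```java
final class HtmlEscapeSketch {
    // Escapes & < > " and leaves every other character, including the apostrophe, untouched.
    static String htmlEscape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(htmlEscape("<a href=\"x\">Tom & Jerry's</a>"));
    }
}
```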


regexEscape

public static java.lang.String regexEscape(java.lang.String s)
Escapes the special characters from a string so it can be used as part of a regex pattern. This method is for use on gnu.regexp style regular expressions.


javaUtilRegexEscape

public static java.lang.String javaUtilRegexEscape(java.lang.String s)
Escapes the special characters from a string so it can be used as part of a regex pattern. This method is for use on regexes in the flavor of the java.util.regex package. This method should be removed when we move to the java version 1.5 (Tiger) release, since that release gives us a literal regex flag as well as a quote method to produce literal regexes.
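
The literal regex flag and quote method mentioned above are available in the JDK as Pattern.LITERAL and Pattern.quote. The sketch below shows both; the class name RegexQuoteSketch and its two helper methods are hypothetical and only serve the illustration.

```java
import java.util.regex.Pattern;

final class RegexQuoteSketch {
    // Treats needle as literal text by quoting it before compiling.
    static boolean containsQuoted(String haystack, String needle) {
        return Pattern.compile(Pattern.quote(needle)).matcher(haystack).find();
    }

    // Same effect using the LITERAL flag instead of quoting.
    static boolean containsLiteralFlag(String haystack, String needle) {
        return Pattern.compile(needle, Pattern.LITERAL).matcher(haystack).find();
    }

    public static void main(String[] args) {
        System.out.println(containsQuoted("cost (a.b)", "a.b"));
        System.out.println(containsQuoted("cost axb", "a.b"));
    }
}
```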


regexReplacementEscape

public static java.lang.String regexReplacementEscape(java.lang.String s)
Escapes the '\' and '$' characters, which comprise the subset of regex characters that has special meaning in methods such as:
java.util.regex.Matcher.appendReplacement(sb, replacement);
java.lang.String.replaceAll(str, replacement);
Note that this method is offered in java version 1.5 as the method
java.util.regex.Matcher.quoteReplacement(String);
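
As a brief illustration of the JDK method named above, the sketch below passes a replacement through Matcher.quoteReplacement so that '$' and '\' are inserted literally. The class name ReplacementEscapeSketch and the helper replaceAllLiterally are hypothetical.

```java
import java.util.regex.Matcher;

final class ReplacementEscapeSketch {
    // Replaces every match of regex with the replacement text taken literally,
    // so "$1" is inserted as-is instead of being read as a group reference.
    static String replaceAllLiterally(String input, String regex, String replacement) {
        return input.replaceAll(regex, Matcher.quoteReplacement(replacement));
    }

    public static void main(String[] args) {
        System.out.println(replaceAllLiterally("price: X", "X", "$5"));
    }
}
```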


cropBetween

public static java.lang.String cropBetween(java.lang.String in,
                                           char limit)
The old interface to cropBetween - using a single char limit


cropBetween

public static java.lang.String cropBetween(java.lang.String in,
                                           java.lang.String limit)
This removes characters between matching charLimit chars. For example, cropBetween("ab^cd^ef^gh^hi", '^') will return "abefhi". It will consider sequences of 2 charLimit as one charLimit in the output.

Parameters:
in - the string to process
limit - the limit of the string(s) to remove
Returns:
String - the cropped string
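
The cropping rule can be sketched for the single-character case as follows. This is a sketch under stated assumptions, not the library's implementation: the class name CropBetweenSketch is hypothetical, a doubled limit character is assumed to act as an escaped literal, and a doubled limit found inside a cropped region is assumed to be dropped along with the rest of that region.

```java
final class CropBetweenSketch {
    static String cropBetween(String in, char limit) {
        StringBuilder out = new StringBuilder();
        boolean inside = false; // true while between a pair of limit characters
        for (int i = 0; i < in.length(); i++) {
            char c = in.charAt(i);
            if (c == limit) {
                if (i + 1 < in.length() && in.charAt(i + 1) == limit) {
                    // Doubled limit: one literal limit character (assumption: kept only outside a region).
                    if (!inside) {
                        out.append(limit);
                    }
                    i++; // skip the second limit character
                } else {
                    inside = !inside; // a single limit opens or closes a cropped region
                }
            } else if (!inside) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(cropBetween("ab^cd^ef^gh^hi", '^'));
    }
}
```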

string2List

public static java.util.LinkedList<java.lang.String> string2List(java.lang.String in,
                                                                 java.lang.String delimiter,
                                                                 boolean doStrip)
This converts a String to a list of strings by extracting the substrings between delimiter

Parameters:
in - what to process
delimiter - the delimiting string
doStrip - to strip the substrings before adding to the list
Returns:
LinkedList

string2Set

public static java.util.Set string2Set(java.lang.String in,
                                       java.lang.String delimiter,
                                       boolean doStrip)
This converts a String to a Set of strings by extracting the substrings between delimiter

Parameters:
in - what to process
delimiter - the delimiting string
doStrip - to strip the substrings before adding to the set
Returns:
Set

string2Collection

public static java.util.Collection<java.lang.String> string2Collection(java.lang.String in,
                                                                       java.lang.String delimiter,
                                                                       boolean doStrip,
                                                                       java.util.Collection<java.lang.String> collection)
Converts a delimited string to a collection of strings. Substrings between delimiters are extracted from the string and added to a collection that is provided by the caller.

Parameters:
in - The delimited input string to process
delimiter - The string delimiting entries in the input string.
doStrip - Whether to strip the substrings before adding to the collection
collection - The collection to which the strings will be added. If null, a new List will be created.
Returns:
The collection to which the substrings were added. This is syntactic sugar to allow call chaining.

list2String

@Deprecated
public static java.lang.String list2String(java.util.Collection<?> in,
                                                      java.lang.String separator)
Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.

Lots of people called list2String when in fact it was implemented as Collection2String. I added Collection2String as a new function and am leaving the list2String function signature here so it can continue to be


Collection2String

@Deprecated
public static java.lang.String Collection2String(java.util.Collection<?> in,
                                                            java.lang.String separator)
Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.

This concatenates the elements of a collection in a string

Parameters:
in - the collection that has to be concatenated
separator - a string to separate the elements from the list
Returns:
String

Iterator2String

@Deprecated
public static java.lang.String Iterator2String(java.util.Iterator<?> it,
                                                          java.lang.String separator)
Deprecated. Please use Join instead. But note that Join does not consider null elements to be equivalent to the empty string, as this method does.


string2Map

public static java.util.HashMap<java.lang.String,java.lang.String> string2Map(java.lang.String in,
                                                                              java.lang.String delimEntry,
                                                                              java.lang.String delimKey,
                                                                              boolean doStripEntry)
This converts a string to a Map. It will first split the string into entries using delimEntry. Then each entry is split into a key and a value using delimKey. By default we strip the keys. Use doStripEntry to also strip the entries.

Parameters:
in - the string to be processed
delimEntry - delimiter for the entries
delimKey - delimiter between keys and values
doStripEntry - strip entries before inserting in the map
Returns:
HashMap

map2String

public static <K,V> java.lang.String map2String(java.util.Map<K,V> in,
                                                java.lang.String sepKey,
                                                java.lang.String sepEntry)
This function concatenates the elements of a Map in a string with form "..."

Parameters:
in - the map to be converted
sepKey - the separator to put between key and value
sepEntry - the separator to put between map entries
Returns:
String

lowercaseKeys

public static <V> java.util.Map lowercaseKeys(java.util.Map<java.lang.String,V> map)
Given a map, creates and returns a new map in which all keys are the lower-cased version of each key.

Parameters:
map - A map containing String keys to be lowercased
Throws:
java.lang.IllegalArgumentException - if the map contains duplicate string keys after lower casing
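
The contract above (new map, lower-cased keys, IllegalArgumentException on collisions) can be sketched with the JDK as follows. The class name LowercaseKeysSketch is hypothetical, and both the use of Locale.ROOT and the choice of LinkedHashMap are assumptions; the documentation does not say which locale or map type the library uses.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class LowercaseKeysSketch {
    static <V> Map<String, V> lowercaseKeys(Map<String, V> map) {
        Map<String, V> result = new LinkedHashMap<>();
        for (Map.Entry<String, V> e : map.entrySet()) {
            String key = e.getKey().toLowerCase(Locale.ROOT); // assumption: locale-independent lower-casing
            if (result.containsKey(key)) {
                // Two distinct keys collapsed to the same lower-cased key.
                throw new IllegalArgumentException("Duplicate key after lower-casing: " + key);
            }
            result.put(key, e.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> in = new LinkedHashMap<>();
        in.put("Foo", 1);
        in.put("BAR", 2);
        System.out.println(lowercaseKeys(in));
    }
}
```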

collapseWhitespace

public static java.lang.String collapseWhitespace(java.lang.String str)
Replaces any string of adjacent whitespace characters with the whitespace character " ".

Parameters:
str - the string you want to munge
Returns:
String with no more excessive whitespace!
See Also:
collapse

collapse

public static java.lang.String collapse(java.lang.String str,
                                        java.lang.String chars,
                                        java.lang.String replacement)
Replaces any string of matched characters with the supplied string.

This is a more general version of collapseWhitespace.

   E.g. collapse("hello     world", " ", "::")
   will return the following string: "hello::world"
 

Parameters:
str - the string you want to munge
chars - all of the characters to be considered for munge
replacement - the replacement string
Returns:
String munged and replaced string.
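
A minimal JDK-only sketch of the collapsing rule, reproducing the example given above. The class name CollapseSketch is hypothetical and this is not the library's implementation.

```java
final class CollapseSketch {
    // Replaces each run of characters drawn from chars with a single copy of replacement.
    static String collapse(String str, String chars, String replacement) {
        StringBuilder sb = new StringBuilder(str.length());
        boolean inRun = false; // true while the previous character was a matched character
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (chars.indexOf(c) >= 0) {
                if (!inRun) {
                    sb.append(replacement); // emit the replacement once per run
                    inRun = true;
                }
            } else {
                sb.append(c);
                inRun = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("hello     world", " ", "::"));
    }
}
```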

stream2String

public static java.lang.String stream2String(java.io.InputStream is,
                                             int maxLength)
                                      throws java.io.IOException
Read a String of up to maxLength bytes from an InputStream

Parameters:
is - input stream
maxLength - max number of bytes to read from "is". If this is -1, we read everything.
Returns:
String up to maxLength bytes, read from "is"
Throws:
java.io.IOException

parseDelimitedList

public static java.lang.String[] parseDelimitedList(java.lang.String list,
                                                    char delimiter)
Parse a list of substrings separated by a given delimiter. The delimiter can also appear in substrings (just double it):
parseDelimitedList("this|is", '|') returns ["this","is"]
parseDelimitedList("this||is", '|') returns ["this|is"]

Parameters:
list - String containing delimited substrings
delimiter - Delimiter (anything except ' ' is allowed)
Returns:
String[] A String array of parsed substrings
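
The doubled-delimiter rule can be sketched as a single pass over the input. This is an illustrative sketch, not the library's code: the class name ParseDelimitedListSketch is hypothetical, and keeping empty substrings and not trimming whitespace are assumptions the documentation does not settle.

```java
import java.util.ArrayList;
import java.util.List;

final class ParseDelimitedListSketch {
    static String[] parseDelimitedList(String list, char delimiter) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < list.length(); i++) {
            char c = list.charAt(i);
            if (c != delimiter) {
                current.append(c);
            } else if (i + 1 < list.length() && list.charAt(i + 1) == delimiter) {
                current.append(delimiter); // doubled delimiter: one literal delimiter
                i++;
            } else {
                out.add(current.toString()); // single delimiter: end of substring
                current.setLength(0);
            }
        }
        out.add(current.toString());
        return out.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(String.join(",", parseDelimitedList("this|is", '|')));
    }
}
```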

isEmpty

public static boolean isEmpty(java.lang.String s)
Helper function for null and empty string testing.

Returns:
true iff s == null or s.equals("");

isEmptyOrWhitespace

public static boolean isEmptyOrWhitespace(java.lang.String s)
Helper function for null, empty, and whitespace string testing.

Returns:
true if s == null or s.equals("") or s contains only whitespace characters.

makeSafe

public static java.lang.String makeSafe(java.lang.String s)
Helper function for making null strings safe for comparisons, etc.

Returns:
(s == null) ? "" : s;

toNullIfEmpty

public static java.lang.String toNullIfEmpty(java.lang.String s)
Helper function for making empty strings into a null.

Returns:
null if s is zero length; otherwise, returns s.

toNullIfEmptyOrWhitespace

public static java.lang.String toNullIfEmptyOrWhitespace(java.lang.String s)
Helper function for turning empty or whitespace strings into a null.

Returns:
null if s is zero length or if s contains only whitespace characters; otherwise, returns s.
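
The five null and empty helpers documented above (isEmpty, isEmptyOrWhitespace, makeSafe, toNullIfEmpty, toNullIfEmptyOrWhitespace) can be sketched in a few lines. The class name EmptyHelpersSketch is hypothetical, and using String.trim() to decide what counts as whitespace is an assumption; the library may use its own WHITE_SPACES definition.

```java
final class EmptyHelpersSketch {
    static boolean isEmpty(String s) {
        return s == null || s.isEmpty();
    }

    // Assumption: whitespace is whatever String.trim() removes.
    static boolean isEmptyOrWhitespace(String s) {
        return s == null || s.trim().isEmpty();
    }

    static String makeSafe(String s) {
        return (s == null) ? "" : s;
    }

    static String toNullIfEmpty(String s) {
        return isEmpty(s) ? null : s;
    }

    static String toNullIfEmptyOrWhitespace(String s) {
        return isEmptyOrWhitespace(s) ? null : s;
    }

    public static void main(String[] args) {
        System.out.println(isEmptyOrWhitespace("  \t"));
        System.out.println(makeSafe(null).length());
    }
}
```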

arrayMap2String

public static java.lang.String arrayMap2String(java.util.Map<java.lang.String,java.lang.String[]> map,
                                               java.lang.String keyValueDelim,
                                               java.lang.String entryDelim)
Serializes a map

Parameters:
map - A map of String keys to arrays of String values
keyValueDelim - Delimiter between keys and values
entryDelim - Delimiter between entries
Returns:
String A string containing a serialized representation of the contents of the map. e.g. arrayMap2String({"foo":["bar","bar2"],"foo1":["bar1"]}, "=", "&") returns "foo=bar&foo=bar2&foo1=bar1"
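
The serialization shown in the example above can be sketched as a nested loop over keys and their values. The class name ArrayMapSketch is hypothetical, and this is not the library's implementation. Output order follows the map's iteration order, so the usage below relies on a LinkedHashMap to get a predictable result.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ArrayMapSketch {
    // Emits one key/value pair per array element, joined by entryDelim.
    static String arrayMap2String(Map<String, String[]> map, String keyValueDelim, String entryDelim) {
        StringBuilder sb = new StringBuilder();
        boolean first = true;
        for (Map.Entry<String, String[]> e : map.entrySet()) {
            for (String value : e.getValue()) {
                if (!first) {
                    sb.append(entryDelim);
                }
                sb.append(e.getKey()).append(keyValueDelim).append(value);
                first = false;
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String[]> m = new LinkedHashMap<>();
        m.put("foo", new String[] {"bar", "bar2"});
        m.put("foo1", new String[] {"bar1"});
        System.out.println(arrayMap2String(m, "=", "&"));
    }
}
```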

equals

public static boolean equals(java.lang.String s1,
                             java.lang.String s2)
Compares two strings, guarding against nulls. If both Strings are null, we return true.


lastToken

public static java.lang.String lastToken(java.lang.String s,
                                         java.lang.String delimiter)
Splits s with delimiters in delimiter and returns the last token


allAscii

public static boolean allAscii(java.lang.String s)
Determines if a string contains only ascii characters


containsCharRef

public static boolean containsCharRef(java.lang.String s)
Determines if a string contains what looks like an html character reference. Useful for deciding whether unescaping is necessary.


isHebrew

public static boolean isHebrew(java.lang.String s)
Determines if a string is a Hebrew word. A string is considered to be a Hebrew word if isHebrew(int) is true for any of its characters.


isHebrew

public static boolean isHebrew(int codePoint)
Determines if a character is a Hebrew character.


isCjk

public static boolean isCjk(java.lang.String s)
Determines if a string is a CJK word. A string is considered to be CJK if isCjk(char) is true for any of its characters.


isCjk

public static boolean isCjk(char ch)
Determines if a character is a CJK ideograph or a character typically used only in CJK text. Note: This function cannot handle supplementary characters. To handle all Unicode characters, including supplementary characters, use the function isCjk(int).


isCjk

public static boolean isCjk(int codePoint)
Determines if a character is a CJK ideograph or a character typically used only in CJK text.


unicodeEscape

public static java.lang.String unicodeEscape(java.lang.String s)
Replaces each non-ascii character in s with its Unicode escape sequence \uxxxx where xxxx is a hex number. Existing escape sequences won't be affected.
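
A minimal sketch of the replacement rule, using only the JDK. The class name UnicodeEscapeSketch is hypothetical, and emitting lowercase hex digits is an assumption. Existing escape sequences consist only of ASCII characters, so they pass through unchanged.

```java
final class UnicodeEscapeSketch {
    static String unicodeEscape(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < 0x80) {
                sb.append(c); // ASCII is copied as-is
            } else {
                // Backslash, the letter u, then four hex digits (assumption: lowercase).
                sb.append(String.format("\\u%04x", (int) c));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(unicodeEscape("caf" + (char) 0xE9));
    }
}
```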


displayWidth

public static int displayWidth(java.lang.String s)
Returns the approximate display width of the string, measured in units of ascii characters.

See Also:
displayWidth(char)

displayWidth

public static int displayWidth(char ch)
Returns the approximate display width of the character, measured in units of ascii characters. This method should err on the side of caution. By default, characters are assumed to have width 2; this covers CJK ideographs, various symbols and miscellaneous weird scripts. Given below are some Unicode ranges for which it seems safe to assume that no character is substantially wider than an ascii character:
- Latin, extended Latin, even more extended Latin.
- Greek, extended Greek, Cyrillic.
- Some symbols (including currency symbols) and punctuation.
- Half-width Katakana and Hangul.
- Hebrew
- Thai
Characters in these ranges are given a width of 1.

IMPORTANT: this function has an analog in strutil.cc named UnicodeCharWidth, which needs to be updated if you change the implementation here.


toString

public static java.lang.String toString(float[] iArray)
Returns:
a string representation of the given native array.

toString

public static java.lang.String toString(long[] iArray)
Returns:
a string representation of the given native array.

toString

public static java.lang.String toString(int[] iArray)
Returns:
a string representation of the given native array

toString

public static java.lang.String toString(java.lang.String[] iArray)
Returns:
a string representation of the given array.

toString

public static java.lang.String toString(java.lang.String s)
Returns the string, in single quotes, or "NULL". Intended only for logging.

Parameters:
s - the string
Returns:
the string, in single quotes, or the string "null" if it's null.

toString

public static java.lang.String toString(int[][] iArray)
Returns:
a string representation of the given native array

toString

public static java.lang.String toString(long[][] iArray)
Returns:
a string representation of the given native array.

toString

public static java.lang.String toString(java.lang.Object[] obj)
Returns:
a String representation of the given object array. The strings are obtained by calling toString() on the underlying objects.

toUTF8InputStream

public static java.io.InputStream toUTF8InputStream(java.lang.String str)
Replacement for the deprecated StringBufferInputStream. Instead of:
InputStream is = new StringBufferInputStream(str);
do:
InputStream is = StringUtil.toUTF8InputStream(str);
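
One way such a replacement could be built from the JDK is to encode the string as UTF-8 and wrap the bytes in a ByteArrayInputStream. This is a sketch of that idea, not necessarily how the library does it; the class name Utf8InputStreamSketch is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class Utf8InputStreamSketch {
    // Encodes str as UTF-8 and exposes the bytes as an InputStream.
    static InputStream toUTF8InputStream(String str) {
        return new ByteArrayInputStream(str.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws Exception {
        InputStream is = toUTF8InputStream("hello");
        System.out.println(is.available());
    }
}
```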


copyStreams

public static void copyStreams(java.io.InputStream in,
                               java.io.OutputStream out)
                        throws java.io.IOException
Copy all data from in to out in 4096 byte chunks.

Throws:
java.io.IOException
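
The chunked copy described above follows the usual read/write loop. A minimal sketch with a 4096-byte buffer; the class name CopyStreamsSketch is hypothetical, and leaving both streams open for the caller to close is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

final class CopyStreamsSketch {
    // Copies everything from in to out, reading at most 4096 bytes at a time.
    // Assumption: neither stream is closed here; the caller owns them.
    static void copyStreams(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        copyStreams(new ByteArrayInputStream(new byte[10000]), out);
        System.out.println(out.size());
    }
}
```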

bytesToLatin1

public static java.lang.String bytesToLatin1(byte[] ba)
Convert a byte array to a String using Latin-1 (aka ISO-8859-1) encoding. Note: something is probably wrong if you're using this method. Either you're dealing with legacy code that doesn't support i18n or you're using a third-party library that only deals with Latin-1. New code should (almost) always use UTF-8 encoding.

Returns:
the decoded String or null if ba is null

bytesToHexString

public static java.lang.String bytesToHexString(byte[] bytes)
Convert a byte array to a hex-encoding string: "a33bff00..."


bytesToHexString

public static java.lang.String bytesToHexString(byte[] bytes,
                                                java.lang.Character delimiter)
Convert a byte array to a hex-encoding string with the specified delimiter: "a3<delimiter>3b<delimiter>ff..."
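
Both hex-encoding overloads can be sketched with one loop. The class name HexSketch is hypothetical; lowercase digits follow the "a33bff00..." example, while treating a null delimiter as "no delimiter" is an assumption.

```java
final class HexSketch {
    static String bytesToHexString(byte[] bytes) {
        return bytesToHexString(bytes, null);
    }

    // Assumption: a null delimiter means the hex pairs are written back to back.
    static String bytesToHexString(byte[] bytes, Character delimiter) {
        StringBuilder sb = new StringBuilder(bytes.length * 3);
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0 && delimiter != null) {
                sb.append(delimiter.charValue());
            }
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", bytes[i] & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xa3, 0x3b, (byte) 0xff, 0x00};
        System.out.println(bytesToHexString(bytes));
        System.out.println(bytesToHexString(bytes, ':'));
    }
}
```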


latin1ToBytes

public static byte[] latin1ToBytes(java.lang.String str)
Convert a String to a byte array using Latin-1 (aka ISO-8859-1) encoding. If any character in the String is not Latin-1 (meaning its high 8 bits are not all zero), then the returned byte array will contain garbage. Therefore, only use this if you know all your characters are within Latin-1. Note: something is probably wrong if you're using this method. Either you're dealing with legacy code that doesn't support i18n or you're using a third-party library that only deals with Latin-1. New code should (almost) always use UTF-8 encoding.

Returns:
the encoded byte array or null if str is null

bytesToUtf8

public static java.lang.String bytesToUtf8(byte[] ba)
Convert a byte array to a String using UTF-8 encoding.

Returns:
the decoded String or null if ba is null

utf8ToBytes

public static byte[] utf8ToBytes(java.lang.String str)
Convert a String to a byte array using UTF-8 encoding.

Returns:
the encoded byte array or null if str is null

encodingToBytes

public static byte[] encodingToBytes(java.lang.String str,
                                     java.lang.String encoding)
Convert a String to a byte array using the specified encoding.

Parameters:
encoding - the encoding to use
Returns:
the encoded byte array or null if str is null

bytesToStringList

public static java.util.List<java.lang.String> bytesToStringList(byte[] bytes)
Convert an array of bytes into a List of Strings using UTF-8. A line is considered to be terminated by any one of a line feed ('\n'), a carriage return ('\r'), or a carriage return followed immediately by a linefeed.

Can be used to parse the output of

Parameters:
bytes - the array to convert
Returns:
A new mutable list containing the Strings in the input array. The list will be empty if bytes is empty or if it is null.

toUpperCase

public static java.lang.String toUpperCase(java.lang.String src)
Safely convert the string to uppercase.

Returns:
upper case representation of the String; or null if the input string is null.

expandShardNames

public static java.lang.String expandShardNames(java.lang.String dbSpecComponent)
                                         throws java.lang.IllegalArgumentException,
                                                java.lang.IllegalStateException
Parameters:
dbSpecComponent - a single component of a DBDescriptor spec (e.g. the host or database component). The expected format of the string is:
(prefix){(digits),(digits)}(suffix)
Returns:
a shard expansion of the given String. Note that unless the pattern is matched exactly, no expansion is performed and the original string is returned unaltered. For example, 'db{0,1}.adz' is expanded into 'db0.adz, db1.adz'. Note that this method is added to StringUtil instead of DBDescriptor to better encapsulate the choice of regexp implementation.
Throws:
java.lang.IllegalArgumentException - if the string does not parse.
java.lang.IllegalStateException

repeat

public static java.lang.String repeat(java.lang.String sourceString,
                                      int factor)
Returns sourceString concatenated together 'factor' times.

Parameters:
sourceString - The string to repeat
factor - The number of times to repeat it.
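
A minimal sketch of the repetition, using only the JDK. The class name RepeatSketch is hypothetical, and returning the empty string for a factor of zero or less is an assumption, since the documentation does not cover that case.

```java
final class RepeatSketch {
    static String repeat(String sourceString, int factor) {
        // Assumption: a non-positive factor yields the empty string.
        int times = Math.max(0, factor);
        StringBuilder sb = new StringBuilder(times * sourceString.length());
        for (int i = 0; i < times; i++) {
            sb.append(sourceString);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(repeat("ab", 3));
    }
}
```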

capitalize

public static java.lang.String capitalize(java.lang.String s)
Returns a string that is equivalent to the specified string with its first character converted to uppercase as by String.toUpperCase(java.util.Locale). The returned string will have the same value as the specified string if its first character is non-alphabetic, if its first character is already uppercase, or if the specified string is of length 0.

For example:

    capitalize("foo bar").equals("Foo bar");
    capitalize("2b or not 2b").equals("2b or not 2b");
    capitalize("Foo bar").equals("Foo bar");
    capitalize("").equals("");
 

Parameters:
s - the string whose first character is to be uppercased
Returns:
a string equivalent to s with its first character converted to uppercase
Throws:
java.lang.NullPointerException - if s is null
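
The contract above can be sketched as follows. The class name CapitalizeSketch is hypothetical, the use of Locale.ROOT is an assumption (the documentation only says "as by String.toUpperCase(java.util.Locale)"), and the sketch works on the first char rather than the first code point.

```java
import java.util.Locale;

final class CapitalizeSketch {
    static String capitalize(String s) {
        if (s.isEmpty()) { // throws NullPointerException when s is null
            return s;
        }
        String first = s.substring(0, 1);
        String upper = first.toUpperCase(Locale.ROOT); // assumption: locale-independent
        // Return the input unchanged when the first character has no distinct uppercase form.
        return upper.equals(first) ? s : upper + s.substring(1);
    }

    public static void main(String[] args) {
        System.out.println(capitalize("foo bar"));
    }
}
```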