net.sf.saxon.regex
Class JDK15RegexTranslator

java.lang.Object
  extended by net.sf.saxon.regex.JDK15RegexTranslator

public class JDK15RegexTranslator
extends java.lang.Object

This class translates XML Schema regex syntax into JDK 1.5 regex syntax. It differs from the JDK 1.4 translator because JDK 1.5 handles non-BMP characters (wide characters, represented in Java strings as surrogate pairs) in places where JDK 1.4 does not, for example in a range such as [X-Y]. This allows most of the non-BMP handling code in the 1.4 translator to be removed.

Author: James Clark. Modified by Michael Kay (a) to integrate the code into Saxon, and (b) to support XPath additions to the XML Schema regex syntax.
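
As a rough illustration of the native non-BMP support mentioned above, the following sketch uses only java.util.regex (not Saxon) to match a character against a range whose endpoints lie outside the BMP. The class name NonBmpRangeDemo, the helper methods, and the chosen code points are assumptions made for this example; they are not part of the translator.

```java
import java.util.regex.Pattern;

public class NonBmpRangeDemo {

    // Builds a string holding a single code point. Supplementary code points
    // (above U+FFFF) become a surrogate pair of two chars.
    static String cp(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Returns true if the input consists of exactly one code point in the
    // range [low-high]. Assumes low and high are not regex metacharacters.
    static boolean inRange(int low, int high, String input) {
        Pattern p = Pattern.compile("[" + cp(low) + "-" + cp(high) + "]");
        return p.matcher(input).matches();
    }

    public static void main(String[] args) {
        // U+1D400 lies outside the BMP and inside the range U+10000..U+10FFFF.
        System.out.println(inRange(0x10000, 0x10FFFF, cp(0x1D400)));
        // "A" is a BMP character and lies outside that range.
        System.out.println(inRange(0x10000, 0x10FFFF, "A"));
    }
}
```

Because the regex engine treats each surrogate pair in the pattern as one code point, the range endpoints can be written directly, without expanding the range into surrogate-level alternatives.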


Nested Class Summary
(package private) static class JDK15RegexTranslator.BackReference
           
(package private) static class JDK15RegexTranslator.CharClass
           
(package private) static class JDK15RegexTranslator.CharRange
           
(package private) static class JDK15RegexTranslator.Complement
           
(package private) static class JDK15RegexTranslator.Empty
           
(package private) static class JDK15RegexTranslator.Property
           
(package private) static class JDK15RegexTranslator.Range
           
(package private) static class JDK15RegexTranslator.SimpleCharClass
           
(package private) static class JDK15RegexTranslator.SingleChar
           
(package private) static class JDK15RegexTranslator.Subtraction
           
(package private) static class JDK15RegexTranslator.Union
           
 
Field Summary
(package private) static int ALL
           
(package private) static java.lang.String CATEGORY_NAMES
           
(package private) static int[][] CATEGORY_RANGES
           
(package private) static java.lang.String NMCHAR_CATEGORIES
           
(package private) static java.lang.String NMCHAR_EXCLUDE_RANGES
           
(package private) static java.lang.String NMCHAR_INCLUDES
           
(package private) static java.lang.String NMSTRT_CATEGORIES
           
(package private) static java.lang.String NMSTRT_EXCLUDE_RANGES
           
(package private) static java.lang.String NMSTRT_INCLUDES
           
(package private) static int NONE
           
(package private) static java.lang.String NOT_ALLOWED_CLASS
           
(package private) static int SOME
           
(package private) static java.lang.String SURROGATES1_CLASS
           
(package private) static java.lang.String SURROGATES2_CLASS
           
 
Method Summary
static void main(java.lang.String[] args)
           
static java.lang.String translate(java.lang.CharSequence regexp, boolean xpath)
          Translates a regular expression in the syntax of XML Schema Part 2 into a regular expression in the syntax of java.util.regex.Pattern.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

CATEGORY_NAMES

static final java.lang.String CATEGORY_NAMES
See Also:
Constant Field Values

CATEGORY_RANGES

static final int[][] CATEGORY_RANGES

NMSTRT_INCLUDES

static final java.lang.String NMSTRT_INCLUDES
See Also:
Constant Field Values

NMSTRT_EXCLUDE_RANGES

static final java.lang.String NMSTRT_EXCLUDE_RANGES
See Also:
Constant Field Values

NMSTRT_CATEGORIES

static final java.lang.String NMSTRT_CATEGORIES
See Also:
Constant Field Values

NMCHAR_INCLUDES

static final java.lang.String NMCHAR_INCLUDES
See Also:
Constant Field Values

NMCHAR_EXCLUDE_RANGES

static final java.lang.String NMCHAR_EXCLUDE_RANGES
See Also:
Constant Field Values

NMCHAR_CATEGORIES

static final java.lang.String NMCHAR_CATEGORIES
See Also:
Constant Field Values

NONE

static final int NONE
See Also:
Constant Field Values

SOME

static final int SOME
See Also:
Constant Field Values

ALL

static final int ALL
See Also:
Constant Field Values

SURROGATES1_CLASS

static final java.lang.String SURROGATES1_CLASS
See Also:
Constant Field Values

SURROGATES2_CLASS

static final java.lang.String SURROGATES2_CLASS
See Also:
Constant Field Values

NOT_ALLOWED_CLASS

static final java.lang.String NOT_ALLOWED_CLASS
See Also:
Constant Field Values

Method Detail

translate

public static java.lang.String translate(java.lang.CharSequence regexp,
                                         boolean xpath)
                                  throws RegexSyntaxException
Translates a regular expression in the syntax of XML Schema Part 2 into a regular expression in the syntax of java.util.regex.Pattern. The translation assumes that the string to be matched against the regex uses surrogate pairs correctly. If the string comes from XML content, a conforming XML parser will check this automatically; if the string comes from elsewhere, it may be necessary to check surrogate usage before matching.

Parameters:
regexp - a CharSequence containing a regular expression in the syntax of XML Schema Part 2
xpath - true if the XPath 2.0 Functions and Operators (F+O) extensions to the schema regex syntax are permitted
Throws:
RegexSyntaxException - if regexp is not a regular expression in the syntax of XML Schema Part 2 or XPath 2.0, as appropriate
See Also:
Pattern, XML Schema Part 2
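
The description above notes that input from outside XML content may need its surrogate usage checked before matching. A minimal sketch of such a check, using only java.lang.Character, might look like the following. The class name SurrogateCheck and the method hasWellFormedSurrogates are hypothetical names introduced for this example; Saxon may perform or expose this check differently.

```java
public class SurrogateCheck {

    // Returns true if every high surrogate is immediately followed by a low
    // surrogate and no low surrogate appears without a preceding high one.
    static boolean hasWellFormedSurrogates(CharSequence s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= s.length() || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i += 2; // skip the complete pair
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate is invalid.
                return false;
            } else {
                i++;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String paired = new String(Character.toChars(0x1D400));
        String unpaired = String.valueOf((char) 0xD835);
        System.out.println(hasWellFormedSurrogates("abc"));
        System.out.println(hasWellFormedSurrogates(paired));
        System.out.println(hasWellFormedSurrogates(unpaired));
    }
}
```

A caller could run a check of this kind on the input string before matching it against a Pattern compiled from the result of translate.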

main

public static void main(java.lang.String[] args)
                 throws RegexSyntaxException
Throws:
RegexSyntaxException