net.sf.saxon.regex
Class JDK14RegexTranslator
java.lang.Object
net.sf.saxon.regex.JDK14RegexTranslator
public class JDK14RegexTranslator
- extends java.lang.Object
This class translates XML Schema regex syntax into JDK 1.4 regex syntax.
Author: James Clark
Modified by Michael Kay (a) to integrate the code into Saxon, and (b) to support XPath additions
to the XML Schema regex syntax.
This version of the regular expression translator treats each half of a surrogate pair as a separate
character, translating anything in an XPath regex that can match a non-BMP character into a Java
regex that matches the two halves of a surrogate pair independently. This approach doesn't work
under JDK 1.5, whose regex engine treats a surrogate pair as a single character.
The same translator is currently used for Saxon on .NET 1.1.
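The surrogate-pair behaviour described above can be illustrated with a small sketch. It is not part of Saxon: the class name SurrogatePairDemo and the helper fromCodePoint are hypothetical, and it relies on java.lang.Character methods that are only available from Java 5 onwards. It shows only that a non-BMP character occupies two char values in a Java string, which is why a translator working at the char level has to match the two halves separately.

```java
public class SurrogatePairDemo {

    // Hypothetical helper: builds a string holding the single
    // Unicode character with the given code point.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        // U+1D538 lies outside the Basic Multilingual Plane (BMP).
        String s = fromCodePoint(0x1D538);

        // Number of UTF-16 code units (char values) in the string.
        System.out.println(s.length());                        // prints 2

        // Number of Unicode characters (code points) in the string.
        System.out.println(s.codePointCount(0, s.length()));   // prints 1

        // The first half is a high surrogate, the second a low surrogate.
        System.out.println(Character.isHighSurrogate(s.charAt(0))); // prints true
        System.out.println(Character.isLowSurrogate(s.charAt(1)));  // prints true
    }
}
```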
Method Summary
int getNumberOfCapturedGroups()
static void main(java.lang.String[] args)
java.lang.String translate(java.lang.CharSequence regExp, boolean xpath)
Translates a regular expression in the syntax of XML Schemas Part 2 into a regular
expression in the syntax of java.util.regex.Pattern.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
CATEGORY_NAMES
static final java.lang.String CATEGORY_NAMES
- See Also:
- Constant Field Values
CATEGORY_RANGES
static final int[][] CATEGORY_RANGES
NMSTRT_INCLUDES
static final java.lang.String NMSTRT_INCLUDES
- See Also:
- Constant Field Values
NMSTRT_EXCLUDE_RANGES
static final java.lang.String NMSTRT_EXCLUDE_RANGES
- See Also:
- Constant Field Values
NMSTRT_CATEGORIES
static final java.lang.String NMSTRT_CATEGORIES
- See Also:
- Constant Field Values
NMCHAR_INCLUDES
static final java.lang.String NMCHAR_INCLUDES
- See Also:
- Constant Field Values
NMCHAR_EXCLUDE_RANGES
static final java.lang.String NMCHAR_EXCLUDE_RANGES
- See Also:
- Constant Field Values
NMCHAR_CATEGORIES
static final java.lang.String NMCHAR_CATEGORIES
- See Also:
- Constant Field Values
NONE
static final int NONE
- See Also:
- Constant Field Values
SOME
static final int SOME
- See Also:
- Constant Field Values
ALL
static final int ALL
- See Also:
- Constant Field Values
SURROGATES1_CLASS
static final java.lang.String SURROGATES1_CLASS
- See Also:
- Constant Field Values
SURROGATES2_CLASS
static final java.lang.String SURROGATES2_CLASS
- See Also:
- Constant Field Values
NOT_ALLOWED_CLASS
static final java.lang.String NOT_ALLOWED_CLASS
- See Also:
- Constant Field Values
JDK14RegexTranslator
public JDK14RegexTranslator()
translate
public java.lang.String translate(java.lang.CharSequence regExp,
boolean xpath)
throws RegexSyntaxException
- Translates a regular expression in the syntax of XML Schemas Part 2 into a regular
expression in the syntax of java.util.regex.Pattern. The translation
assumes that the string to be matched against the regex uses surrogate pairs correctly.
If the string comes from XML content, a conforming XML parser will automatically
check this; if the string comes from elsewhere, it may be necessary to check
surrogate usage before matching.
- Parameters:
regExp - a String containing a regular expression in the syntax of XML Schemas Part 2
xpath - a boolean indicating whether the XPath 2.0 F+O extensions to the schema
regex syntax are permitted
- Returns:
- a String containing a regular expression in the syntax of java.util.regex.Pattern
- Throws:
RegexSyntaxException - if regExp is not a regular expression in the
syntax of XML Schemas Part 2, or XPath 2.0, as appropriate
- See Also:
Pattern, XML Schema Part 2
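The description of translate notes that input not taken from XML content may need its surrogate usage checked before matching. One possible way to do that check is sketched below. The class SurrogateCheck and the method hasWellFormedSurrogates are hypothetical names introduced for illustration and are not part of Saxon; the sketch also assumes Java 5 or later for the Character surrogate tests.

```java
public class SurrogateCheck {

    // Hypothetical helper: returns true if every high surrogate is
    // immediately followed by a low surrogate and no low surrogate
    // appears on its own.
    static boolean hasWellFormedSurrogates(CharSequence s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate must be followed by a low surrogate.
                if (i + 1 >= s.length()
                        || !Character.isLowSurrogate(s.charAt(i + 1))) {
                    return false;
                }
                i += 2; // skip the complete pair
            } else if (Character.isLowSurrogate(c)) {
                // A low surrogate with no preceding high surrogate.
                return false;
            } else {
                i++; // ordinary BMP character
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(hasWellFormedSurrogates("abc"));          // prints true
        System.out.println(hasWellFormedSurrogates("\uD835\uDD38")); // prints true
        System.out.println(hasWellFormedSurrogates("a\uD835b"));     // prints false
    }
}
```

A caller could run such a check on the input string and only apply the translated pattern when it returns true.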
getNumberOfCapturedGroups
public int getNumberOfCapturedGroups()
main
public static void main(java.lang.String[] args)
throws RegexSyntaxException
- Throws:
RegexSyntaxException