jfun.parsec
Class Lexers

java.lang.Object
  extended by jfun.parsec.Lexers

public final class Lexers
extends java.lang.Object

Provides some predefined basic lexer objects. A lexer is a character level parser that returns a token based on the recognized character range.

Author:
Ben Yu Dec 19, 2004

Constructor Summary
Lexers()
           
 
Method Summary
static Parser<Tok> allInteger()
          Deprecated. Use lexLong().
static Parser<Tok> allInteger(java.lang.String name)
          Deprecated. Use lexLong(String).
static Parser<Tok> charLiteral()
          Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts the character to a Character.
static Parser<Tok> charLiteral(java.lang.String name)
          Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts the character to a Character.
static Parser<Tok> decimal()
          Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the string to a decimal typed token.
static Parser<Tok> decimal(java.lang.String name)
          Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the string to a decimal typed token.
static Parser<Tok> decInteger()
          Deprecated. Use lexDecLong().
static Parser<Tok> decInteger(java.lang.String name)
          Deprecated. Use lexDecLong(String).
static Words getCaseInsensitive(Parser<?> wscanner, java.lang.String[] ops, java.lang.String[] keywords)
          Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case insensitively.
static Words getCaseInsensitive(Parser<?> wscanner, java.lang.String[] ops, java.lang.String[] keywords, FromString<?> toWord)
          Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case insensitively.
static Words getCaseInsensitive(java.lang.String[] ops, java.lang.String[] keywords)
          Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case insensitively.
static Words getCaseSensitive(Parser<?> wscanner, java.lang.String[] ops, java.lang.String[] keywords)
          Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case sensitively.
static Words getCaseSensitive(Parser<?> wscanner, java.lang.String[] ops, java.lang.String[] keywords, FromString<?> toWord)
          Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case sensitively.
static Words getCaseSensitive(java.lang.String[] ops, java.lang.String[] keywords)
          Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case sensitively.
static Words getOperators(java.lang.String... ops)
          Creates a Words object for lexing the operators with names specified in ops.
static Parser<Tok> hexInteger()
          Deprecated. Use lexHexLong().
static Parser<Tok> hexInteger(java.lang.String name)
          Deprecated. Use lexHexLong(String).
static Parser<Tok> integer()
          Returns a lexer that parses an integer number (valid patterns are: 0, 00, 1, 10) and converts the string to an integer typed token.
static Parser<Tok> integer(java.lang.String name)
          Returns a lexer that parses an integer number (valid patterns are: 0, 00, 1, 10) and converts the string to an integer typed token.
static Parser<Tok> lexDecLong()
          Returns a lexer that parses a decimal integer number (valid patterns are: 1, 10, 123) and converts the string to a Long token.
static Parser<Tok> lexDecLong(java.lang.String name)
          Returns a lexer that parses a decimal integer number (valid patterns are: 1, 10, 123) and converts the string to a Long token.
static Parser<Tok[]> lexeme(Parser<?> delim, Parser<Tok> s)
          Greedily runs Parser s repeatedly, and ignores the pattern recognized by Parser delim before and after each s.
static Parser<Tok[]> lexeme(java.lang.String name, Parser<?> delim, Parser<Tok> s)
          Greedily runs Parser s repeatedly, and ignores the pattern recognized by Parser delim before and after each s.
static Parser<Tok> lexer(Parser<?> s, Tokenizer tn)
          Transform the recognized character range of scanner s to a token object with a Tokenizer.
static Parser<Tok> lexer(Parser<?> s, Tokenizer tn, java.lang.String err)
          Transform the recognized character range of scanner s to a token object with a Tokenizer.
static Parser<Tok> lexer(java.lang.String name, Parser<?> s, Tokenizer tn)
          Transform the recognized character range of scanner s to a token object with a Tokenizer.
static Parser<Tok> lexer(java.lang.String name, Parser<?> s, Tokenizer tn, java.lang.String err)
          Transform the recognized character range of scanner s to a token object with a Tokenizer.
static Parser<Tok> lexHexLong()
          Returns a lexer that parses a hex integer number (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the string to a Long token.
static Parser<Tok> lexHexLong(java.lang.String name)
          Returns a lexer that parses a hex integer number (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the string to a Long token.
static Parser<Tok> lexLong()
          Returns a lexer that parses decimal, hex, and octal numbers and converts the string to a Long token.
static Parser<Tok> lexLong(java.lang.String name)
          Returns a lexer that parses decimal, hex, and octal numbers and converts the string to a Long token.
static Parser<Tok> lexOctLong()
          Returns a lexer that parses an octal integer number (valid patterns are: 0, 07, 017, 0371, etc.) and converts the string to a Long token.
static Parser<Tok> lexOctLong(java.lang.String name)
          Returns a lexer that parses an octal integer number (valid patterns are: 0, 07, 017, 0371, etc.) and converts the string to a Long token.
static Parser<Tok> lexSimpleStringLiteral()
          Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts the string to a String token.
static Parser<Tok> lexSimpleStringLiteral(java.lang.String name)
          Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts the string to a String token.
static Parser<Tok> octInteger()
          Deprecated. Use lexOctLong().
static Parser<Tok> octInteger(java.lang.String name)
          Deprecated. Use lexOctLong(String).
static Parser<Tok> quoted(char open, char close)
          Creates a lexer that parses a string literal quoted by open and close, and then converts it to a TokenQuoted token instance.
static Parser<Tok> quoted(java.lang.String name, char open, char close)
          Creates a lexer that parses a string literal quoted by open and close, and then converts it to a TokenQuoted token instance.
static Parser<Tok> sqlStringLiteral()
          Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts the string to a String token.
static Parser<Tok> sqlStringLiteral(java.lang.String name)
          Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts the string to a String token.
static Parser<Tok> stringLiteral()
          Deprecated. Use lexSimpleStringLiteral()
static Parser<Tok> stringLiteral(java.lang.String name)
          Deprecated. Use lexSimpleStringLiteral(String)
static Parser<Tok> word()
          Returns a lexer that parses any word.
static Parser<Tok> word(java.lang.String name)
          Returns a lexer that parses any word.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Lexers

public Lexers()

Method Detail

charLiteral

public static Parser<Tok> charLiteral()
Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts the character to a Character.

Returns:
the lexer.

charLiteral

public static Parser<Tok> charLiteral(java.lang.String name)
Returns a lexer that parses a single-quoted character literal (escaped by '\') and converts the character to a Character.

Parameters:
name - the lexer name.
Returns:
the lexer.

stringLiteral

public static Parser<Tok> stringLiteral()
Deprecated. Use lexSimpleStringLiteral()

Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts the string to a String token.

Returns:
the lexer.

lexSimpleStringLiteral

public static Parser<Tok> lexSimpleStringLiteral()
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts the string to a String token.

Returns:
the lexer.

stringLiteral

public static Parser<Tok> stringLiteral(java.lang.String name)
Deprecated. Use lexSimpleStringLiteral(String)

Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts the string to a String token.

Parameters:
name - the lexer name.
Returns:
the lexer.

lexSimpleStringLiteral

public static Parser<Tok> lexSimpleStringLiteral(java.lang.String name)
Returns a lexer that parses a double-quoted string literal (escaped by '\') and converts the string to a String token.

Parameters:
name - the lexer name.
Returns:
the lexer.

sqlStringLiteral

public static Parser<Tok> sqlStringLiteral()
Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts the string to a String token.

Returns:
the lexer.

sqlStringLiteral

public static Parser<Tok> sqlStringLiteral(java.lang.String name)
Returns a lexer that parses a single-quoted string literal (a single quote is escaped with another single quote) and converts the string to a String token.

Parameters:
name - the lexer name.
Returns:
the lexer.
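
The escaping rule described for sqlStringLiteral can be illustrated with plain Java. This is only a sketch of the convention (a doubled single quote stands for one quote inside the literal), not the library's implementation; the class and method names are hypothetical.

```java
class SqlLiteralSketch {
    /**
     * Strips the surrounding single quotes and collapses each doubled
     * quote ('') into one quote. Hypothetical helper, not part of jfun.parsec.
     * It does not verify that every inner quote is properly doubled.
     */
    static String unquote(String literal) {
        int n = literal.length();
        if (n < 2 || literal.charAt(0) != '\'' || literal.charAt(n - 1) != '\'') {
            throw new IllegalArgumentException("not a single-quoted literal: " + literal);
        }
        String body = literal.substring(1, n - 1);
        return body.replace("''", "'");
    }

    public static void main(String[] args) {
        System.out.println(unquote("'it''s'"));
        System.out.println(unquote("''").isEmpty());
    }
}
```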

decimal

public static Parser<Tok> decimal()
Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the string to a decimal typed token.

Returns:
the lexer.

decimal

public static Parser<Tok> decimal(java.lang.String name)
Returns a lexer that parses a decimal number (valid patterns are: 1, 2.3, 000, 0., .23) and converts the string to a decimal typed token.

Parameters:
name - the lexer name.
Returns:
the lexer.
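
The valid patterns listed for decimal() can be approximated with a regular expression. The sketch below uses only java.util.regex and is an assumption about the accepted shape, inferred from the examples above. In particular, rejecting a lone "." is a guess, and the class name is hypothetical.

```java
import java.util.regex.Pattern;

class DecimalPatternSketch {
    // Either digits with an optional fraction part ("1", "2.3", "000", "0.")
    // or a bare fraction (".23"). A lone "." matches neither branch.
    private static final Pattern DECIMAL = Pattern.compile("\\d+(\\.\\d*)?|\\.\\d+");

    static boolean isDecimal(String s) {
        return DECIMAL.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"1", "2.3", "000", "0.", ".23", ".", "1.2.3"}) {
            System.out.println(s + " -> " + isDecimal(s));
        }
    }
}
```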

integer

public static Parser<Tok> integer()
Returns a lexer that parses an integer number (valid patterns are: 0, 00, 1, 10) and converts the string to an integer typed token. The difference between integer() and decInteger() is that decInteger() does not allow a number starting with 0.

Returns:
the lexer.

integer

public static Parser<Tok> integer(java.lang.String name)
Returns a lexer that parses an integer number (valid patterns are: 0, 00, 1, 10) and converts the string to an integer typed token. The difference between integer() and decInteger() is that decInteger() does not allow a number starting with 0.

Parameters:
name - the lexer name.
Returns:
the lexer.

decInteger

public static Parser<Tok> decInteger()
Deprecated. Use lexDecLong().

Returns a lexer that parses a decimal integer number (valid patterns are: 1, 10, 123) and converts the string to a Long token. The difference between integer() and decInteger() is that decInteger() does not allow a number starting with 0.

Returns:
the lexer.

decInteger

public static Parser<Tok> decInteger(java.lang.String name)
Deprecated. Use lexDecLong(String).

Returns a lexer that parses a decimal integer number (valid patterns are: 1, 10, 123) and converts the string to a Long token. The difference between integer() and decInteger() is that decInteger() does not allow a number starting with 0.

Parameters:
name - the lexer name.
Returns:
the lexer.
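
The difference between integer() and decInteger() can be shown with two regular expressions. This is a sketch of the rule as documented (a leading 0 is accepted by one and rejected by the other), not the library's scanner; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

class IntegerPatternSketch {
    // integer(): any run of digits, so "0" and "00" are accepted.
    private static final Pattern INTEGER = Pattern.compile("\\d+");
    // decInteger(): must not start with 0, so "0" and "00" are rejected.
    private static final Pattern DEC_INTEGER = Pattern.compile("[1-9]\\d*");

    static boolean matchesInteger(String s) {
        return INTEGER.matcher(s).matches();
    }

    static boolean matchesDecInteger(String s) {
        return DEC_INTEGER.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"0", "00", "1", "10", "123"}) {
            System.out.println(s + ": integer=" + matchesInteger(s)
                    + ", decInteger=" + matchesDecInteger(s));
        }
    }
}
```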

octInteger

public static Parser<Tok> octInteger()
Deprecated. Use lexOctLong().

Returns a lexer that parses an octal integer number (valid patterns are: 0, 07, 017, 0371, etc.) and converts the string to a Long token. An octal number has to start with 0.

Returns:
the lexer.

octInteger

public static Parser<Tok> octInteger(java.lang.String name)
Deprecated. Use lexOctLong(String).

Returns a lexer that parses an octal integer number (valid patterns are: 0, 07, 017, 0371, etc.) and converts the string to a Long token. An octal number has to start with 0.

Parameters:
name - the lexer name.
Returns:
the lexer.

hexInteger

public static Parser<Tok> hexInteger()
Deprecated. Use lexHexLong().

Returns a lexer that parses a hex integer number (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the string to a Long token. A hex number has to start with either 0x or 0X.

Returns:
the lexer.

hexInteger

public static Parser<Tok> hexInteger(java.lang.String name)
Deprecated. Use lexHexLong(String).

Returns a lexer that parses a hex integer number (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the string to a Long token. A hex number has to start with either 0x or 0X.

Parameters:
name - the lexer name.
Returns:
the lexer.

allInteger

public static Parser<Tok> allInteger()
Deprecated. Use lexLong().

Returns a lexer that parses decimal, hex, and octal numbers and converts the string to a Long token.

Returns:
the lexer.

allInteger

public static Parser<Tok> allInteger(java.lang.String name)
Deprecated. Use lexLong(String).

Returns a lexer that parses decimal, hex, and octal numbers and converts the string to a Long token.

Parameters:
name - the lexer name.
Returns:
the lexer.

lexDecLong

public static Parser<Tok> lexDecLong()
Returns a lexer that parses a decimal integer number (valid patterns are: 1, 10, 123) and converts the string to a Long token. The difference between integer() and decInteger() is that decInteger() does not allow a number starting with 0.

Returns:
the lexer.

lexDecLong

public static Parser<Tok> lexDecLong(java.lang.String name)
Returns a lexer that parses a decimal integer number (valid patterns are: 1, 10, 123) and converts the string to a Long token. The difference between integer() and decInteger() is that decInteger() does not allow a number starting with 0.

Parameters:
name - the lexer name.
Returns:
the lexer.

lexOctLong

public static Parser<Tok> lexOctLong()
Returns a lexer that parses an octal integer number (valid patterns are: 0, 07, 017, 0371, etc.) and converts the string to a Long token. An octal number has to start with 0.

Returns:
the lexer.

lexOctLong

public static Parser<Tok> lexOctLong(java.lang.String name)
Returns a lexer that parses an octal integer number (valid patterns are: 0, 07, 017, 0371, etc.) and converts the string to a Long token. An octal number has to start with 0.

Parameters:
name - the lexer name.
Returns:
the lexer.

lexHexLong

public static Parser<Tok> lexHexLong()
Returns a lexer that parses a hex integer number (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the string to a Long token. A hex number has to start with either 0x or 0X.

Returns:
the lexer.

lexHexLong

public static Parser<Tok> lexHexLong(java.lang.String name)
Returns a lexer that parses a hex integer number (valid patterns are: 0x1, 0Xff, 0xFe1, etc.) and converts the string to a Long token. A hex number has to start with either 0x or 0X.

Parameters:
name - the lexer name.
Returns:
the lexer.

lexLong

public static Parser<Tok> lexLong()
Returns a lexer that parses decimal, hex, and octal numbers and converts the string to a Long token.

Returns:
the lexer.

lexLong

public static Parser<Tok> lexLong(java.lang.String name)
Returns a lexer that parses decimal, hex, and octal numbers and converts the string to a Long token.

Parameters:
name - the lexer name.
Returns:
the lexer.
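
The prefix conventions described for the Long lexers (0x or 0X for hex, a leading 0 for octal, otherwise decimal) are the same ones java.lang.Long.decode applies. The sketch below uses that standard method to show which value each documented pattern denotes. Whether the library delegates to Long.decode is not stated here, and Long.decode also accepts forms (a sign, a # prefix) that this page does not mention.

```java
class LongDecodeSketch {
    /** Decodes a literal using the JDK's radix-prefix rules. */
    static long decode(String literal) {
        return Long.decode(literal);
    }

    public static void main(String[] args) {
        String[] samples = {"123", "0", "07", "017", "0371", "0x1", "0Xff", "0xFe1"};
        for (String s : samples) {
            System.out.println(s + " -> " + decode(s));
        }
    }
}
```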

word

public static Parser<Tok> word()
Returns a lexer that parses any word and converts the string to a TokenWord. A word starts with an alphabetic character, followed by 0 or more alphanumeric characters.

Returns:
the lexer.

word

public static Parser<Tok> word(java.lang.String name)
Returns a lexer that parses any word and converts the string to a TokenWord. A word starts with an alphabetic character, followed by 0 or more alphanumeric characters.

Parameters:
name - the lexer name.
Returns:
the lexer.

quoted

public static Parser<Tok> quoted(java.lang.String name,
                                 char open,
                                 char close)
Creates a lexer that parses a string literal quoted by open and close, and then converts it to a TokenQuoted token instance.

Parameters:
name - the lexer name.
open - the opening character.
close - the closing character.
Returns:
the lexer.

quoted

public static Parser<Tok> quoted(char open,
                                 char close)
Creates a lexer that parses a string literal quoted by open and close, and then converts it to a TokenQuoted token instance.

Parameters:
open - the opening character.
close - the closing character.
Returns:
the lexer.

getOperators

public static Words getOperators(java.lang.String... ops)
Creates a Words object for lexing the operators with names specified in ops. Operators are lexed as TokenReserved.

Parameters:
ops - the operator names.
Returns:
the Words instance.

getCaseInsensitive

public static Words getCaseInsensitive(java.lang.String[] ops,
                                       java.lang.String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case insensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord. A word is defined as an alphanumeric string that starts with [_a-zA-Z], followed by 0 or more [0-9_a-zA-Z].

Parameters:
ops - the operator names.
keywords - the keyword names.
Returns:
the Words instance.

getCaseSensitive

public static Words getCaseSensitive(java.lang.String[] ops,
                                     java.lang.String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case sensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord. A word is defined as an alphanumeric string that starts with [_a-zA-Z], followed by 0 or more [0-9_a-zA-Z].

Parameters:
ops - the operator names.
keywords - the keyword names.
Returns:
the Words instance.
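
The keyword rule shared by getCaseInsensitive and getCaseSensitive can be sketched in plain Java: a word matching [_a-zA-Z][0-9_a-zA-Z]* is reserved if it appears in the keyword list (compared with or without regard to case), otherwise it is an ordinary word. The class, method, and the string labels it returns are hypothetical stand-ins for the TokenReserved and TokenWord results. This is not the Words implementation.

```java
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Pattern;

class KeywordSketch {
    // The word shape given in the documentation above.
    private static final Pattern WORD = Pattern.compile("[_a-zA-Z][0-9_a-zA-Z]*");

    /** Returns a label naming the kind of token the text would become. */
    static String classify(String text, String[] keywords, boolean caseSensitive) {
        if (!WORD.matcher(text).matches()) {
            return "not a word";
        }
        Set<String> reserved;
        if (caseSensitive) {
            reserved = new TreeSet<String>();
        } else {
            // Membership tests on this set ignore case.
            reserved = new TreeSet<String>(String.CASE_INSENSITIVE_ORDER);
        }
        Collections.addAll(reserved, keywords);
        return reserved.contains(text) ? "TokenReserved" : "TokenWord";
    }

    public static void main(String[] args) {
        String[] keywords = {"select", "from"};
        System.out.println(classify("SELECT", keywords, false));
        System.out.println(classify("SELECT", keywords, true));
        System.out.println(classify("_tmp1", keywords, true));
        System.out.println(classify("9lives", keywords, true));
    }
}
```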

getCaseInsensitive

public static Words getCaseInsensitive(Parser<?> wscanner,
                                       java.lang.String[] ops,
                                       java.lang.String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case insensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters:
wscanner - the scanner for a word in the language.
ops - the operator names.
keywords - the keyword names.
Returns:
the Words instance.

getCaseSensitive

public static Words getCaseSensitive(Parser<?> wscanner,
                                     java.lang.String[] ops,
                                     java.lang.String[] keywords)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case sensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters:
wscanner - the scanner for a word in the language.
ops - the operator names.
keywords - the keyword names.
Returns:
the Words instance.

getCaseInsensitive

public static Words getCaseInsensitive(Parser<?> wscanner,
                                       java.lang.String[] ops,
                                       java.lang.String[] keywords,
                                       FromString<?> toWord)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case insensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters:
wscanner - the scanner for a word in the language.
ops - the operator names.
keywords - the keyword names.
toWord - the FromString object used to create a token for words recognized by wscanner that are not keywords.
Returns:
the Words instance.

getCaseSensitive

public static Words getCaseSensitive(Parser<?> wscanner,
                                     java.lang.String[] ops,
                                     java.lang.String[] keywords,
                                     FromString<?> toWord)
Creates a Words object for lexing the operators with names specified in ops, and for lexing the keywords case sensitively. Keywords and operators are lexed as TokenReserved. Words that are not among the keywords are lexed as TokenWord.

Parameters:
wscanner - the scanner for a word in the language.
ops - the operator names.
keywords - the keyword names.
toWord - the FromString object used to create a token for words recognized by wscanner that are not keywords.
Returns:
the Words instance.

lexer

public static Parser<Tok> lexer(java.lang.String name,
                                Parser<?> s,
                                Tokenizer tn)
Transforms the recognized character range of scanner s to a token object with a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters:
name - the name of the new Scanner.
s - the scanner to transform.
tn - the Tokenizer object.
Returns:
the new Scanner.

lexer

public static Parser<Tok> lexer(Parser<?> s,
                                Tokenizer tn)
Transforms the recognized character range of scanner s to a token object with a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters:
s - the scanner to transform.
tn - the Tokenizer object.
Returns:
the new Scanner.

lexer

public static Parser<Tok> lexer(Parser<?> s,
                                Tokenizer tn,
                                java.lang.String err)
Transforms the recognized character range of scanner s to a token object with a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters:
s - the scanner to transform.
tn - the Tokenizer object.
err - the error message when the tokenizer returns null.
Returns:
the new Scanner.

lexer

public static Parser<Tok> lexer(java.lang.String name,
                                Parser<?> s,
                                Tokenizer tn,
                                java.lang.String err)
Transforms the recognized character range of scanner s to a token object with a Tokenizer. If Tokenizer.toToken() returns null, the scan fails.

Parameters:
name - the name of the new Scanner.
s - the scanner to transform.
tn - the Tokenizer object.
err - the error message when the tokenizer returns null.
Returns:
the new Scanner.
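
The contract described for lexer (scan a character range, hand it to a tokenizer, and fail when the tokenizer returns null) can be sketched without the library. Everything below is a hypothetical stand-in: RangeTokenizer is not the real Tokenizer interface, its toToken signature is assumed, and a regular expression plays the role of the scanner s.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LexerSketch {
    /** Hypothetical tokenizer: maps a character range to a token, or null to reject it. */
    interface RangeTokenizer {
        Object toToken(CharSequence src, int from, int len);
    }

    /** Example tokenizer that accepts only even numbers. */
    static final RangeTokenizer EVEN_ONLY = (src, from, len) -> {
        long v = Long.parseLong(src.subSequence(from, from + len).toString());
        return v % 2 == 0 ? Long.valueOf(v) : null;
    };

    /** Scans a prefix of src, tokenizes it, and fails with err on a null token. */
    static Object lex(Pattern scanner, RangeTokenizer tn, String err, CharSequence src) {
        Matcher m = scanner.matcher(src);
        if (!m.lookingAt()) {
            throw new IllegalArgumentException("scanner did not match");
        }
        Object token = tn.toToken(src, m.start(), m.end() - m.start());
        if (token == null) {
            throw new IllegalArgumentException(err);
        }
        return token;
    }

    public static void main(String[] args) {
        Pattern digits = Pattern.compile("\\d+");
        System.out.println(lex(digits, EVEN_ONLY, "even number expected", "42 rest"));
        try {
            lex(digits, EVEN_ONLY, "even number expected", "7");
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```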

lexeme

public static Parser<Tok[]> lexeme(java.lang.String name,
                                   Parser<?> delim,
                                   Parser<Tok> s)
Greedily runs Parser s repeatedly, and ignores the pattern recognized by Parser delim before and after each s. Parser s has to be a lexer object that returns a Tok object. The resulting Tok objects are collected and returned in a Tok[] array.

Parameters:
name - the name of the new Parser object.
delim - the delimiter Parser object.
s - the Parser object.
Returns:
the new Parser object.

lexeme

public static Parser<Tok[]> lexeme(Parser<?> delim,
                                   Parser<Tok> s)
Greedily runs Parser s repeatedly, and ignores the pattern recognized by Parser delim before and after each s. Parser s has to be a lexer object that returns a Tok object. The resulting Tok objects are collected and returned in a Tok[] array.

Parameters:
delim - the delimiter Parser object.
s - the Parser object.
Returns:
the new Parser object.
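
The behavior described for lexeme (skip the delimiter, run the token lexer, repeat greedily, collect the results in an array) can be sketched with regular expressions standing in for the two parsers. This is an illustration of the loop only: the class and method names are hypothetical, the tokens are plain strings rather than Tok objects, and stopping silently at the first unrecognized character is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LexemeSketch {
    /** Collects every token found in input, ignoring delimiters around each one. */
    static String[] lexeme(Pattern delim, Pattern token, String input) {
        List<String> tokens = new ArrayList<String>();
        Matcher d = delim.matcher(input);
        Matcher t = token.matcher(input);
        int pos = 0;
        while (true) {
            // Skip one delimiter match, if any, at the current position.
            d.region(pos, input.length());
            if (d.lookingAt()) {
                pos = d.end();
            }
            // Try to read a token; stop when none matches or it is empty.
            t.region(pos, input.length());
            if (!t.lookingAt() || t.end() == pos) {
                break;
            }
            tokens.add(t.group());
            pos = t.end();
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        Pattern whitespace = Pattern.compile("\\s*");
        Pattern numberOrWord = Pattern.compile("\\d+|[a-z]+");
        String[] result = lexeme(whitespace, numberOrWord, "  12 ab  7 ");
        System.out.println(String.join(",", result));
    }
}
```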