This class implements a scanner (aka lexical analyzer or
lexer) for IDL. The scanner reads characters from a global input
stream and returns integers corresponding to the terminal number
of the next token. Once the end of input is reached the EOF token
is returned on every subsequent call.
All symbol constants are defined in sym.java which is generated by
JavaCup from parser.cup.
In addition to the scanner proper (initialized once via init(), then
called via next_token() to get each token), this class provides simple
error and warning routines and keeps publicly accessible counts of
errors and warnings. It also provides basic preprocessing
facilities: it handles preprocessor directives such as #define,
#undef, #include, etc., although it does not provide full C++
preprocessing.
This class is "static" (i.e., it has only static members and methods).
EOF_CHAR
protected static final int EOF_CHAR
EOF constant.
Value: -1
char_symbols
protected static Hashtable char_symbols
Table of single character symbols. For ease of implementation, we
store all unambiguous single character tokens in this table of Integer
objects keyed by Integer objects with the numerical value of the
appropriate char (currently Character objects have a bug which precludes
their use in tables).
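The lookup described above can be sketched as follows. This is a minimal illustration, not the lexer's actual code: the class name CharSymbolTable and the symbol numbers SEMI and COMMA are hypothetical stand-ins for constants from sym.java, and a HashMap is used in place of Hashtable.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the char_symbols table; not the actual lexer code.
class CharSymbolTable {
    // Placeholder terminal numbers; the real values come from sym.java.
    static final int SEMI = 1;
    static final int COMMA = 2;

    // Keys are the numeric value of the character, values are symbol numbers.
    private static final Map<Integer, Integer> CHAR_SYMBOLS = new HashMap<>();

    static {
        CHAR_SYMBOLS.put((int) ';', SEMI);
        CHAR_SYMBOLS.put((int) ',', COMMA);
    }

    // Returns the symbol number for ch, or -1 if ch is not a single character symbol.
    static int findSingleChar(int ch) {
        Integer result = CHAR_SYMBOLS.get(ch);
        return result == null ? -1 : result;
    }
}
```

Keying by the numeric value mirrors the Integer-keyed design described above and keeps the lookup independent of Character objects.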
conditionalCompilation
protected static boolean conditionalCompilation
currentFile
public static String currentFile
current file name
currentPragmaPrefix
public static String currentPragmaPrefix
currently active pragma prefix
current_line
protected static int current_line
Current line number for use in error messages.
current_position
protected static int current_position
Character position in current line.
defines
protected static Hashtable defines
Defined symbols (preprocessor)
in_string
protected static boolean in_string
Have we already read a '"' ?
java_keywords
protected static Hashtable java_keywords
Table of Java reserved names.
keywords
protected static Hashtable keywords
Table of keywords. Keywords are initially treated as
identifiers. Just before they are returned we look them up in
this table to see if they match one of the keywords. The
string of the name is the key here, which indexes Integer
objects holding the symbol number.
keywords_lower_case
protected static Hashtable keywords_lower_case
Table of keywords, stored in lower case. Keys are the
lower case version of the keywords used as keys for the keywords
hash above, and the values are the case sensitive versions of
the keywords. This table is used for detecting collisions of
identifiers with keywords.
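A minimal sketch of how such a lower case table can detect collisions. It assumes that a collision means an identifier that differs from a keyword only in case; the class name KeywordCollision, the method collidesWith, and the four-keyword subset are hypothetical and chosen only for illustration.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration of the keywords_lower_case idea.
class KeywordCollision {
    // Lower case keyword -> keyword in its case sensitive spelling.
    private static final Map<String, String> KEYWORDS_LOWER_CASE = new HashMap<>();

    static {
        // Small illustrative subset of IDL keywords.
        for (String kw : new String[] {"interface", "module", "Object", "TRUE"}) {
            KEYWORDS_LOWER_CASE.put(kw.toLowerCase(Locale.ROOT), kw);
        }
    }

    // Returns the keyword that id collides with, or null if there is none.
    // An exact match is the keyword itself, so it is not reported here.
    static String collidesWith(String id) {
        String kw = KEYWORDS_LOWER_CASE.get(id.toLowerCase(Locale.ROOT));
        return (kw != null && !kw.equals(id)) ? kw : null;
    }
}
```

Storing the case sensitive spelling as the value lets an error message name the keyword the identifier clashes with.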
line
protected static StringBuffer line
Current line for use in error messages.
next_char
protected static int next_char
First character of lookahead.
next_char2
protected static int next_char2
Second character of lookahead.
warning_count
public static int warning_count
Count of warnings issued so far
wide
protected static boolean wide
Are we processing a wide char or string ?
advance
protected static void advance()
throws java.io.IOException
Advance the scanner one character in the input stream. This moves
next_char2 to next_char and then reads a new next_char2.
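The two-character lookahead can be sketched as below. This is an illustration under assumptions: the class Lookahead is hypothetical, and it uses instance fields and a Reader passed to the constructor, whereas the scanner itself is static and reads from a global input stream.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical instance-based illustration of two-character lookahead.
class Lookahead {
    static final int EOF_CHAR = -1;

    private final Reader in;
    int nextChar;   // first character of lookahead
    int nextChar2;  // second character of lookahead

    // Fills both lookahead slots, as init() is described to do.
    Lookahead(Reader in) throws IOException {
        this.in = in;
        nextChar = in.read();
        nextChar2 = (nextChar == EOF_CHAR) ? EOF_CHAR : in.read();
    }

    // Shifts the second slot into the first, then reads a new second slot.
    // Once the end of input has been seen, no further reads are attempted.
    void advance() throws IOException {
        nextChar = nextChar2;
        nextChar2 = (nextChar == EOF_CHAR) ? EOF_CHAR : in.read();
    }
}
```

Two characters of lookahead are enough to tell apart tokens that share a first character, for example the start of a comment versus a division operator.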
checkIdentifier
public static String checkIdentifier(String str)
Checks whether the identifier str is legal and returns it. If the
identifier is escaped with a leading underscore, that
underscore is removed. If the legal IDL identifier clashes
with a Java reserved word, an underscore is prepended.
str
- the IDL identifier
Prints an error message if the identifier collides with an IDL
keyword.
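The two underscore rules can be sketched as follows. This is a simplified illustration, not the actual method: the class IdentifierCheck and the method mapIdentifier are hypothetical, the reserved word set is a small subset, the order of the two steps is an assumption, and the IDL keyword collision check is omitted.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the underscore handling only.
class IdentifierCheck {
    // Small illustrative subset of Java reserved words.
    private static final Set<String> JAVA_KEYWORDS =
        new HashSet<>(Arrays.asList("class", "package", "int", "final"));

    static String mapIdentifier(String str) {
        // An escaped IDL identifier such as "_interface" loses its leading underscore.
        String id = str.startsWith("_") ? str.substring(1) : str;
        // A clash with a Java reserved word is resolved by prepending an underscore.
        return JAVA_KEYWORDS.contains(id) ? "_" + id : id;
    }
}
```

With these assumptions, "_interface" maps to "interface", "class" maps to "_class", and an ordinary name such as "foo" is returned unchanged.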
currentLine
public static int currentLine()
record information about the last lexical scope so that it can be
restored later
define
public static void define(String symbol,
String value)
defined
public static String defined(String symbol)
do_symbol
protected static token do_symbol()
throws java.io.IOException
Process an identifier.
Identifiers begin with a letter, underscore, or dollar sign,
followed by zero or more letters, digits, underscores, or
dollar signs. This routine returns a str_token suitable for
return by the scanner, or null if the string that was read is
a symbol that was #defined. In that case, the symbol is
expanded in place.
emit_error
public static void emit_error(String message)
Emit an error message. The message will be marked with both the
current line number and the position in the line. Error messages
are printed on standard error (System.err).
message
- the message to print.
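A minimal sketch of an error routine that tags each message with line and position and keeps a count. The class Diagnostics, its method names, and the message layout are hypothetical; the real output format is not specified here, and incrementing the counter inside the emit routine is an assumption.

```java
// Hypothetical illustration of position-tagged error reporting.
class Diagnostics {
    // Publicly readable count of errors issued so far.
    static int errorCount = 0;

    // Hypothetical message layout; the real lexer's format may differ.
    static String format(String kind, String file, int line, int position, String message) {
        return file + ", line " + line + "(" + position + "): " + kind + ": " + message;
    }

    // Prints the tagged message on standard error and bumps the counter.
    static void emitError(String file, int line, int position, String message) {
        System.err.println(format("error", file, line, position, message));
        errorCount++;
    }
}
```

Keeping the formatting in a separate method makes the layout easy to check without capturing System.err.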
emit_error
public static void emit_error(String message,
str_token t)
emit_warn
public static void emit_warn(String message)
Emit a warning message. The message will be marked with both the
current line number and the position in the line. Messages are
printed on standard error (System.err).
message
- the message to print.
emit_warn
public static void emit_warn(String message,
str_token t)
find_single_char
protected static int find_single_char(int ch)
Try to look up a single character symbol; returns -1 if not found.
ch
- the character in question.
getPosition
public static PositionInfo getPosition()
return the current reading position
id_char
protected static boolean id_char(int ch)
Determine if a character is ok for the middle of an id.
ch
- the character in question.
id_start_char
protected static boolean id_start_char(int ch)
Determine if a character is ok to start an id.
ch
- the character in question.
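The two character tests can be sketched from the identifier rule given under do_symbol (a letter, underscore, or dollar sign first, then letters, digits, underscores, or dollar signs). The class IdChars and the helper isIdentifier are hypothetical, and restricting letters to ASCII is an assumption.

```java
// Hypothetical illustration of the identifier character tests.
class IdChars {
    // True if ch may start an identifier (ASCII letters only; an assumption).
    static boolean idStartChar(int ch) {
        return (ch >= 'a' && ch <= 'z') || (ch >= 'A' && ch <= 'Z')
            || ch == '_' || ch == '$';
    }

    // True if ch may appear after the first character of an identifier.
    static boolean idChar(int ch) {
        return idStartChar(ch) || (ch >= '0' && ch <= '9');
    }

    // Convenience helper: applies both tests to a whole string.
    static boolean isIdentifier(String s) {
        if (s.isEmpty() || !idStartChar(s.charAt(0))) {
            return false;
        }
        for (int i = 1; i < s.length(); i++) {
            if (!idChar(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }
}
```

Taking an int rather than a char lets the tests accept the EOF value -1 directly, which simply fails both checks.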
init
public static void init()
throws java.io.IOException
Initialize the scanner. This sets up the keywords and char_symbols
tables and reads the first two characters of lookahead.
"Object" is listed as reserved in the OMG spec.
"int" is not, but I reserved it to bar its usage as a legal integer
type.
needsJavaEscape
public static boolean needsJavaEscape(Module m)
next_token
public static token next_token()
throws java.io.IOException
Return one token. This is the main external interface to the scanner.
It consumes sufficient characters to determine the next input token
and returns it.
preprocess
protected static void preprocess()
throws java.io.IOException
Preprocessor directives are handled here.
real_next_token
protected static token real_next_token()
throws java.io.IOException
The actual routine to return one token.
Returns: token
reset
public static void reset()
reset the scanner state
restorePosition
public static void restorePosition(PositionInfo p)
strictJavaEscapeCheck
public static boolean strictJavaEscapeCheck(String s)
Called during the parse phase to catch clashes with
Java reserved words.
swallow_comment
protected static void swallow_comment()
throws java.io.IOException
Handle swallowing up a comment. Both old style C and new style C++
comments are handled.
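Skipping both comment styles can be sketched as follows. This is an illustration only: the class CommentSkipper and the method skipComment are hypothetical, and the sketch works on a String with an index, whereas the scanner consumes characters from its input stream via the lookahead.

```java
// Hypothetical string-based illustration of comment skipping.
class CommentSkipper {
    // Returns the index of the first character after the comment that starts
    // at 'start', or 'start' itself if no comment begins there.
    static int skipComment(String src, int start) {
        if (src.startsWith("//", start)) {
            // C++ style: runs to the end of the line (or end of input).
            int nl = src.indexOf('\n', start);
            return nl < 0 ? src.length() : nl + 1;
        }
        if (src.startsWith("/*", start)) {
            // C style: runs to the closing "*/"; unterminated comments
            // swallow the rest of the input.
            int end = src.indexOf("*/", start + 2);
            return end < 0 ? src.length() : end + 2;
        }
        return start;
    }
}
```

Searching for the closing delimiter from start + 2 ensures that the opening "/*" is never mistaken for part of the terminator, as in the input "/*/".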
undefine
public static void undefine(String symbol)