Package org.antlr.codegen
Class CodeGenerator
- java.lang.Object
-
- org.antlr.codegen.CodeGenerator
-
public class CodeGenerator extends Object
ANTLR's code generator. It generates recognizers derived from grammars. Language independence is achieved through the use of StringTemplateGroup objects: all output strings are completely encapsulated in group files such as Java.stg. Some computations are done whose results a particular language does not use. This generator just computes the values and sets them in the templates; the templates are free to use the information or not.
To make a new code generation target, define X.stg for language X by copying the existing Y.stg most closely related to your language; e.g., to do CSharp.stg, copy Java.stg. The template group file contains the templates that the code generator needs. You can add a new target without even recompiling ANTLR itself. The language=X option in a grammar file dictates which templates get loaded and used.
Some languages, like C, need both parser files and header files. Java needs a separate file for the cyclic DFA, as ANTLR generates bytecodes directly (which cannot be in the generated parser Java file). To facilitate this, the cyclic DFA can be in the same file, but the header and output must be separate. The recognizer is in the output file.
-
-
Field Summary
Fields
Modifier and Type Field Description
ACyclicDFACodeGenerator
acyclicDFAGenerator
I have factored out the generation of acyclic DFAs to separate class
protected org.antlr.stringtemplate.StringTemplateGroup
baseTemplates
The basic output templates without AST or templates stuff; this will be the templates loaded for the language such as Java.stg *and* the Dbg stuff if turned on.
String
classpathTemplateRootDirectoryName
protected boolean
debug
Generate debugging event method calls
static boolean
EMIT_TEMPLATE_DELIMITERS
boolean
GENERATE_SWITCHES_WHEN_POSSIBLE
Grammar
grammar
Which grammar are we generating code for? Each generator is attached to a specific grammar.
protected org.antlr.stringtemplate.StringTemplate
headerFileST
protected String
language
What language are we generating?
protected int
lineWidth
static int
MADSI_DEFAULT
static int
MAX_ACYCLIC_DFA_STATES_INLINE
static int
MAX_SWITCH_CASE_LABELS
static int
MIN_SWITCH_ALTS
static int
MSA_DEFAULT
static int
MSCL_DEFAULT
When generating SWITCH statements, some targets might need to limit the size (based upon the number of case labels).
protected org.antlr.stringtemplate.StringTemplate
outputFileST
protected boolean
profile
Track runtime parsing information about decisions etc...
protected org.antlr.stringtemplate.StringTemplate
recognizerST
Target
target
The target specifies how to write out files and do other language specific actions.
protected org.antlr.stringtemplate.StringTemplateGroup
templates
Where are the templates this generator should use to generate code?
protected Tool
tool
A reference to the ANTLR tool so we can learn about output directories and such.
protected boolean
trace
Create a Tracer object and make the recognizer invoke this.
protected int
uniqueLabelNumber
Used to create unique labels
static String
VOCAB_FILE_EXTENSION
I have factored out the generation of cyclic DFAs to separate class
protected static String
vocabFilePattern
-
Constructor Summary
Constructors
Constructor Description
CodeGenerator(Tool tool, Grammar grammar, String language)
-
Method Summary
Modifier and Type Method Description
protected boolean
canGenerateSwitch(DFAState s)
You can generate a switch rather than if-then-else for a DFA state if there are no semantic predicates and the number of edge label values is small enough; e.g., don't generate a switch for a state containing an edge label such as 20..52330 (the resulting byte codes would overflow the method 65k limit probably).
String
createUniqueLabel(String name)
Create a label to track a token / rule reference's result.
void
generateLocalFOLLOW(GrammarAST referencedElementNode, String referencedElementName, String enclosingRuleName, int elementIndex)
Error recovery in ANTLR recognizers.
org.antlr.stringtemplate.StringTemplate
generateSpecialState(DFAState s)
A special state is huge (too big for state tables) or has a predicated edge.
protected org.antlr.stringtemplate.StringTemplate
genLabelExpr(org.antlr.stringtemplate.StringTemplateGroup templates, Transition edge, int k)
Generate an expression for traversing an edge.
org.antlr.stringtemplate.StringTemplate
genLookaheadDecision(org.antlr.stringtemplate.StringTemplate recognizerST, DFA dfa)
Generate code that computes the predicted alt given a DFA.
org.antlr.stringtemplate.StringTemplate
genRecognizer()
Given the grammar to which we are attached, walk the AST associated with that grammar to create NFAs.
protected org.antlr.stringtemplate.StringTemplate
genSemanticPredicateExpr(org.antlr.stringtemplate.StringTemplateGroup templates, Transition edge)
org.antlr.stringtemplate.StringTemplate
genSetExpr(org.antlr.stringtemplate.StringTemplateGroup templates, IntSet set, int k, boolean partOfDFA)
For intervals such as [3..3, 30..35], generate an expression that tests the lookahead similar to LA(1)==3 || (LA(1)>=30&&LA(1)<=35)
protected void
genTokenTypeConstants(org.antlr.stringtemplate.StringTemplate code)
Set attributes tokens and literals attributes in the incoming code template.
protected void
genTokenTypeNames(org.antlr.stringtemplate.StringTemplate code)
Generate a token names table that maps token type to a printable name: either the label like INT or the literal like "begin".
protected org.antlr.stringtemplate.StringTemplate
genTokenVocabOutput()
Generate a token vocab file with all the token names/types.
org.antlr.stringtemplate.StringTemplateGroup
getBaseTemplates()
static List<String>
getListOfArgumentsFromAction(String actionText, int separatorChar)
static int
getListOfArgumentsFromAction(String actionText, int start, int targetChar, int separatorChar, List<String> args)
Given an arg action like [x, (*a).foo(21,33), 3.2+1, '\n', "a,oo\nick", {bl, "fdkj"eck}, ["cat\n,", x, 43]] convert to a list of arguments.
String
getRecognizerFileName(String name, int type)
Generate TParser.java and TLexer.java from T.g if combined, else just use T.java as output regardless of type.
org.antlr.stringtemplate.StringTemplate
getRecognizerST()
org.antlr.stringtemplate.StringTemplateGroup
getTemplates()
String
getTokenTypeAsTargetLabel(int ttype)
Get a meaningful name for a token type useful during code generation.
String
getVocabFileName()
What is the name of the vocab file generated for this grammar? Returns null if no .tokens file should be generated.
void
issueInvalidAttributeError(String x, String y, Rule enclosingRule, antlr.Token actionToken, int outerAltNum)
void
issueInvalidAttributeError(String x, Rule enclosingRule, antlr.Token actionToken, int outerAltNum)
void
issueInvalidScopeError(String x, String y, Rule enclosingRule, antlr.Token actionToken, int outerAltNum)
protected void
loadLanguageTarget(String language)
void
loadTemplates(String language)
load the main language.stg template group file
void
setDebug(boolean debug)
void
setProfile(boolean profile)
void
setTrace(boolean trace)
List
translateAction(String ruleName, GrammarAST actionTree)
protected void
translateActionAttributeReferences(Map actions)
Actions may reference $x::y attributes, call translateAction on each action and replace that action in the Map.
void
translateActionAttributeReferencesForSingleScope(Rule r, Map scopeActions)
Use for translating rule @init{...} actions that have no scope
List<org.antlr.stringtemplate.StringTemplate>
translateArgAction(String ruleName, GrammarAST actionTree)
Translate an action like [3,"foo",a[3]] and return a List of the translated actions.
org.antlr.stringtemplate.StringTemplate
translateTemplateConstructor(String ruleName, int outerAltNum, antlr.Token actionToken, String templateActionText)
Given a template constructor action like %foo(a={...}) in an action, translate it to the appropriate template constructor from the templateLib.
protected void
verifyActionScopesOkForTarget(Map actions)
Some targets will have some extra scopes like C++ may have '@headerfile:name {action}' or something.
void
write(org.antlr.stringtemplate.StringTemplate code, String fileName)
-
-
-
Field Detail
-
MSCL_DEFAULT
public static final int MSCL_DEFAULT
When generating SWITCH statements, some targets might need to limit the size (based upon the number of case labels). Generally, this limit will be hit only for lexers where wildcard in a UNICODE vocabulary environment would generate a SWITCH with 65000 labels.
- See Also:
- Constant Field Values
-
MAX_SWITCH_CASE_LABELS
public static int MAX_SWITCH_CASE_LABELS
-
MSA_DEFAULT
public static final int MSA_DEFAULT
- See Also:
- Constant Field Values
-
MIN_SWITCH_ALTS
public static int MIN_SWITCH_ALTS
-
GENERATE_SWITCHES_WHEN_POSSIBLE
public boolean GENERATE_SWITCHES_WHEN_POSSIBLE
-
EMIT_TEMPLATE_DELIMITERS
public static boolean EMIT_TEMPLATE_DELIMITERS
-
MADSI_DEFAULT
public static final int MADSI_DEFAULT
- See Also:
- Constant Field Values
-
MAX_ACYCLIC_DFA_STATES_INLINE
public static int MAX_ACYCLIC_DFA_STATES_INLINE
-
classpathTemplateRootDirectoryName
public String classpathTemplateRootDirectoryName
-
grammar
public Grammar grammar
Which grammar are we generating code for? Each generator is attached to a specific grammar.
-
language
protected String language
What language are we generating?
-
target
public Target target
The target specifies how to write out files and do other language specific actions.
-
templates
protected org.antlr.stringtemplate.StringTemplateGroup templates
Where are the templates this generator should use to generate code?
-
baseTemplates
protected org.antlr.stringtemplate.StringTemplateGroup baseTemplates
The basic output templates without AST or templates stuff; this will be the templates loaded for the language such as Java.stg *and* the Dbg stuff if turned on. This is used for generating syntactic predicates.
-
recognizerST
protected org.antlr.stringtemplate.StringTemplate recognizerST
-
outputFileST
protected org.antlr.stringtemplate.StringTemplate outputFileST
-
headerFileST
protected org.antlr.stringtemplate.StringTemplate headerFileST
-
uniqueLabelNumber
protected int uniqueLabelNumber
Used to create unique labels
-
tool
protected Tool tool
A reference to the ANTLR tool so we can learn about output directories and such.
-
debug
protected boolean debug
Generate debugging event method calls
-
trace
protected boolean trace
Create a Tracer object and make the recognizer invoke this.
-
profile
protected boolean profile
Track runtime parsing information about decisions etc... This requires the debugging event mechanism to work.
-
lineWidth
protected int lineWidth
-
acyclicDFAGenerator
public ACyclicDFACodeGenerator acyclicDFAGenerator
I have factored out the generation of acyclic DFAs to separate class
-
VOCAB_FILE_EXTENSION
public static final String VOCAB_FILE_EXTENSION
I have factored out the generation of cyclic DFAs to separate class
- See Also:
- Constant Field Values
-
vocabFilePattern
protected static final String vocabFilePattern
- See Also:
- Constant Field Values
-
-
Method Detail
-
loadLanguageTarget
protected void loadLanguageTarget(String language)
-
loadTemplates
public void loadTemplates(String language)
load the main language.stg template group file
-
genRecognizer
public org.antlr.stringtemplate.StringTemplate genRecognizer()
Given the grammar to which we are attached, walk the AST associated with that grammar to create NFAs. Then create the DFAs for all decision points in the grammar by converting the NFAs to DFAs. Finally, walk the AST again to generate code. Either 1 or 2 files are written:
recognizer: the main parser/lexer/treewalker item
header file: languages like C/C++ need extern definitions
The target, such as JavaTarget, dictates which files get written.
-
verifyActionScopesOkForTarget
protected void verifyActionScopesOkForTarget(Map actions)
Some targets will have extra scopes; for example, C++ may have '@headerfile:name {action}' or something similar. Make sure the target likes the scopes in the action table.
-
translateActionAttributeReferences
protected void translateActionAttributeReferences(Map actions)
Actions may reference $x::y attributes; call translateAction on each action and replace that action in the Map.
-
translateActionAttributeReferencesForSingleScope
public void translateActionAttributeReferencesForSingleScope(Rule r, Map scopeActions)
Use for translating rule @init{...} actions that have no scope
-
generateLocalFOLLOW
public void generateLocalFOLLOW(GrammarAST referencedElementNode, String referencedElementName, String enclosingRuleName, int elementIndex)
Error recovery in ANTLR recognizers. Based upon original ideas:
Algorithms + Data Structures = Programs by Niklaus Wirth
and
A note on error recovery in recursive descent parsers: http://portal.acm.org/citation.cfm?id=947902.947905
Later, Josef Grosch had some good ideas:
Efficient and Comfortable Error Recovery in Recursive Descent Parsers: ftp://www.cocolab.com/products/cocktail/doca4.ps/ell.ps.zip
Like Grosch I implemented local FOLLOW sets that are combined at run-time upon error to avoid parsing overhead.
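The idea of local FOLLOW sets that are only combined when an error occurs can be sketched as follows. This is a minimal illustration, not ANTLR's implementation: the class LocalFollowSketch and its methods are hypothetical names, and token types are modeled as plain bit positions in a java.util.BitSet.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

/** Hypothetical sketch: keeps one local FOLLOW set per active rule reference. */
final class LocalFollowSketch {
    private final Deque<BitSet> followStack = new ArrayDeque<>();

    /** Called when a rule reference is entered; cheap, no set arithmetic. */
    void pushFollow(BitSet localFollow) {
        followStack.push(localFollow);
    }

    /** Called when the rule reference returns. */
    void popFollow() {
        followStack.pop();
    }

    /** Called only upon error: union of all local FOLLOW sets on the stack. */
    BitSet combinedFollow() {
        BitSet combined = new BitSet();
        for (BitSet local : followStack) {
            combined.or(local);
        }
        return combined;
    }

    public static void main(String[] args) {
        LocalFollowSketch parser = new LocalFollowSketch();
        BitSet outer = new BitSet();
        outer.set(5);
        BitSet inner = new BitSet();
        inner.set(7);
        inner.set(9);
        parser.pushFollow(outer);
        parser.pushFollow(inner);
        // The union is computed lazily, only when recovery needs it.
        System.out.println(parser.combinedFollow());
    }
}
```

The normal parsing path only pushes and pops references; the set union is deferred to the error path, which is the overhead saving the description refers to.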
-
genLookaheadDecision
public org.antlr.stringtemplate.StringTemplate genLookaheadDecision(org.antlr.stringtemplate.StringTemplate recognizerST, DFA dfa)
Generate code that computes the predicted alt given a DFA. The recognizerST can be either the main generated recognizerTemplate for storage in the main parser file or a separate file. It's up to the code that ultimately invokes the codegen.g grammar rule. Regardless, the output file and header file get a copy of the DFAs.
-
generateSpecialState
public org.antlr.stringtemplate.StringTemplate generateSpecialState(DFAState s)
A special state is huge (too big for state tables) or has a predicated edge. Generate a simple if-then-else. Cannot be an accept state as they have no emanating edges. Don't worry about switch vs if-then-else because if you get here, the state is super complicated and needs an if-then-else. This is used by the new DFA scheme created June 2006.
-
genLabelExpr
protected org.antlr.stringtemplate.StringTemplate genLabelExpr(org.antlr.stringtemplate.StringTemplateGroup templates, Transition edge, int k)
Generate an expression for traversing an edge.
-
genSemanticPredicateExpr
protected org.antlr.stringtemplate.StringTemplate genSemanticPredicateExpr(org.antlr.stringtemplate.StringTemplateGroup templates, Transition edge)
-
genSetExpr
public org.antlr.stringtemplate.StringTemplate genSetExpr(org.antlr.stringtemplate.StringTemplateGroup templates, IntSet set, int k, boolean partOfDFA)
For intervals such as [3..3, 30..35], generate an expression that tests the lookahead similar to LA(1)==3 || (LA(1)>=30&&LA(1)<=35)
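A minimal sketch of the kind of expression described above, assuming intervals are given as inclusive {low, high} pairs. The class SetExprSketch is a hypothetical stand-in; the real method works on an IntSet and emits the expression through target templates rather than string concatenation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of ANTLR: renders interval sets as lookahead tests. */
final class SetExprSketch {
    /** Each interval is {low, high}, inclusive; k is the lookahead depth. */
    static String setExpr(int[][] intervals, int k) {
        String la = "LA(" + k + ")";
        List<String> tests = new ArrayList<>();
        for (int[] iv : intervals) {
            if (iv[0] == iv[1]) {
                // Single-value interval: plain equality test.
                tests.add(la + "==" + iv[0]);
            } else {
                // Range interval: bounded comparison.
                tests.add("(" + la + ">=" + iv[0] + "&&" + la + "<=" + iv[1] + ")");
            }
        }
        return String.join(" || ", tests);
    }

    public static void main(String[] args) {
        // Prints: LA(1)==3 || (LA(1)>=30&&LA(1)<=35)
        System.out.println(setExpr(new int[][] {{3, 3}, {30, 35}}, 1));
    }
}
```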
-
genTokenTypeConstants
protected void genTokenTypeConstants(org.antlr.stringtemplate.StringTemplate code)
Set the tokens and literals attributes in the incoming code template. This is not the token vocab interchange file, but rather a list of token type IDs needed by the recognizer.
-
genTokenTypeNames
protected void genTokenTypeNames(org.antlr.stringtemplate.StringTemplate code)
Generate a token names table that maps token type to a printable name: either the label like INT or the literal like "begin".
-
getTokenTypeAsTargetLabel
public String getTokenTypeAsTargetLabel(int ttype)
Get a meaningful name for a token type useful during code generation. Literals without associated names are converted to the string equivalent of their integer values. Used to generate x==ID and x==34 type comparisons etc... Essentially we are looking for the most obvious way to refer to a token type in the generated code. If in the lexer, return the char literal translated to the target language. For example, ttype=10 will yield '\n' from the getTokenDisplayName method. That must be converted to the target language's literals. For most C-derived languages no translation is needed.
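The parser-side rule (named tokens keep their name, unnamed literals fall back to their integer value) can be sketched as below. This is an illustration under stated assumptions: TokenLabelSketch and its map of display names are hypothetical, literals are assumed to be stored with surrounding single quotes, and the lexer case (char literal translation) is left out.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of choosing a label for a token type in generated code. */
final class TokenLabelSketch {
    /**
     * names maps token type to its display name; a name that starts with a
     * single quote is treated as a literal without an associated label.
     */
    static String targetLabel(int ttype, Map<Integer, String> names) {
        String name = names.get(ttype);
        if (name == null || name.startsWith("'")) {
            // No usable identifier: refer to the token by its integer value.
            return String.valueOf(ttype);
        }
        return name;
    }

    public static void main(String[] args) {
        Map<Integer, String> names = new HashMap<>();
        names.put(4, "ID");
        names.put(34, "'begin'");
        System.out.println("x==" + targetLabel(4, names));
        System.out.println("x==" + targetLabel(34, names));
    }
}
```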
-
genTokenVocabOutput
protected org.antlr.stringtemplate.StringTemplate genTokenVocabOutput()
Generate a token vocab file with all the token names/types. For example:
ID=7
FOR=8
'for'=8
This is independent of the target language; used by antlr internally
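The name=type line format of the vocab file can be sketched as follows. VocabOutputSketch is a hypothetical helper that writes lines directly; the real method returns a StringTemplate and takes its data from the grammar.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: renders token names and literals as name=type lines. */
final class VocabOutputSketch {
    static String render(Map<String, Integer> tokens) {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, Integer> e : tokens.entrySet()) {
            out.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order so the output is stable.
        Map<String, Integer> tokens = new LinkedHashMap<>();
        tokens.put("ID", 7);
        tokens.put("FOR", 8);
        tokens.put("'for'", 8); // a literal shares the type of its label
        System.out.print(render(tokens));
    }
}
```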
-
translateAction
public List translateAction(String ruleName, GrammarAST actionTree)
-
translateArgAction
public List<org.antlr.stringtemplate.StringTemplate> translateArgAction(String ruleName, GrammarAST actionTree)
Translate an action like [3,"foo",a[3]] and return a List of the translated actions. Because actions are themselves translated to a list of chunks, must cat together into a StringTemplate. Don't translate to strings early as we need to eval templates in context.
-
getListOfArgumentsFromAction
public static List<String> getListOfArgumentsFromAction(String actionText, int separatorChar)
-
getListOfArgumentsFromAction
public static int getListOfArgumentsFromAction(String actionText, int start, int targetChar, int separatorChar, List<String> args)
Given an arg action like [x, (*a).foo(21,33), 3.2+1, '\n', "a,oo\nick", {bl, "fdkj"eck}, ["cat\n,", x, 43]] convert to a list of arguments. Allow nested square brackets etc... Set separatorChar to ';' or ',' or whatever you want.
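A minimal sketch of splitting such an argument action on a separator while respecting nesting and quoting. It is not ANTLR's algorithm: ArgSplitSketch is a hypothetical class, it takes the text without the outer square brackets, and it assumes brackets are balanced and that backslash escapes a character inside quotes.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: split an argument list at top-level separators only. */
final class ArgSplitSketch {
    static List<String> splitArgs(String text, char separator) {
        List<String> args = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        int depth = 0;   // nesting depth of (), [] and {}
        char quote = 0;  // active quote character, or 0 when outside quotes
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (quote != 0) {
                cur.append(c);
                if (c == '\\' && i + 1 < text.length()) {
                    cur.append(text.charAt(++i)); // keep the escaped char
                } else if (c == quote) {
                    quote = 0;
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
                cur.append(c);
            } else if (c == '(' || c == '[' || c == '{') {
                depth++;
                cur.append(c);
            } else if (c == ')' || c == ']' || c == '}') {
                depth--;
                cur.append(c);
            } else if (c == separator && depth == 0) {
                args.add(cur.toString().trim());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        String last = cur.toString().trim();
        if (!last.isEmpty()) {
            args.add(last);
        }
        return args;
    }

    public static void main(String[] args) {
        String action = "x, (*a).foo(21,33), 3.2+1, '\\n', \"a,oo\\nick\", "
                + "{bl, \"fdkj\"eck}, [\"cat\\n,\", x, 43]";
        for (String arg : splitArgs(action, ',')) {
            System.out.println(arg);
        }
    }
}
```

Passing ';' instead of ',' splits on semicolons with the same nesting rules.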
-
translateTemplateConstructor
public org.antlr.stringtemplate.StringTemplate translateTemplateConstructor(String ruleName, int outerAltNum, antlr.Token actionToken, String templateActionText)
Given a template constructor action like %foo(a={...}) in an action, translate it to the appropriate template constructor from the templateLib. This translates a *piece* of the action.
-
issueInvalidScopeError
public void issueInvalidScopeError(String x, String y, Rule enclosingRule, antlr.Token actionToken, int outerAltNum)
-
issueInvalidAttributeError
public void issueInvalidAttributeError(String x, String y, Rule enclosingRule, antlr.Token actionToken, int outerAltNum)
-
issueInvalidAttributeError
public void issueInvalidAttributeError(String x, Rule enclosingRule, antlr.Token actionToken, int outerAltNum)
-
getTemplates
public org.antlr.stringtemplate.StringTemplateGroup getTemplates()
-
getBaseTemplates
public org.antlr.stringtemplate.StringTemplateGroup getBaseTemplates()
-
setDebug
public void setDebug(boolean debug)
-
setTrace
public void setTrace(boolean trace)
-
setProfile
public void setProfile(boolean profile)
-
getRecognizerST
public org.antlr.stringtemplate.StringTemplate getRecognizerST()
-
getRecognizerFileName
public String getRecognizerFileName(String name, int type)
Generate TParser.java and TLexer.java from T.g if combined, else just use T.java as output regardless of type.
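The naming rule can be sketched as below. The class, the Kind enum, the combined flag and the extension parameter are all hypothetical; the real method takes an int grammar type and presumably obtains the file extension from the target's templates.

```java
/** Hypothetical sketch of the recognizer file naming rule. */
final class RecognizerFileNameSketch {
    enum Kind { LEXER, PARSER }

    static String fileName(String grammarName, boolean combined, Kind kind, String extension) {
        if (!combined) {
            // Non-combined grammars use the grammar name regardless of type.
            return grammarName + extension;
        }
        String suffix = (kind == Kind.LEXER) ? "Lexer" : "Parser";
        return grammarName + suffix + extension;
    }

    public static void main(String[] args) {
        System.out.println(fileName("T", true, Kind.PARSER, ".java"));
        System.out.println(fileName("T", true, Kind.LEXER, ".java"));
        System.out.println(fileName("T", false, Kind.PARSER, ".java"));
    }
}
```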
-
getVocabFileName
public String getVocabFileName()
What is the name of the vocab file generated for this grammar? Returns null if no .tokens file should be generated.
-
write
public void write(org.antlr.stringtemplate.StringTemplate code, String fileName) throws IOException
- Throws:
IOException
-
canGenerateSwitch
protected boolean canGenerateSwitch(DFAState s)
You can generate a switch rather than if-then-else for a DFA state if there are no semantic predicates and the number of edge label values is small enough; e.g., don't generate a switch for a state containing an edge label such as 20..52330 (the resulting bytecode would probably overflow the 65k method size limit).
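The decision can be sketched as a simple check over a state's outgoing edges. This is an illustration only: SwitchDecisionSketch and its Edge type are hypothetical, the thresholds are passed in as parameters, and treating a minimum number of edges as a second condition is an assumption based on the MIN_SWITCH_ALTS and MAX_SWITCH_CASE_LABELS fields.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the switch versus if-then-else decision. */
final class SwitchDecisionSketch {
    /** One outgoing edge: an inclusive label range plus a predicate flag. */
    static final class Edge {
        final int low;
        final int high;
        final boolean predicated;

        Edge(int low, int high, boolean predicated) {
            this.low = low;
            this.high = high;
            this.predicated = predicated;
        }
    }

    static boolean canGenerateSwitch(List<Edge> edges, int minAlts, int maxCaseLabels) {
        long labels = 0;
        for (Edge e : edges) {
            if (e.predicated) {
                return false; // semantic predicates need if-then-else
            }
            labels += (long) e.high - e.low + 1; // one case label per value
        }
        return edges.size() >= minAlts && labels <= maxCaseLabels;
    }

    public static void main(String[] args) {
        List<Edge> small = Arrays.asList(
                new Edge(1, 1, false), new Edge(2, 2, false), new Edge(3, 5, false));
        List<Edge> huge = Arrays.asList(
                new Edge(1, 1, false), new Edge(2, 2, false), new Edge(20, 52330, false));
        System.out.println(canGenerateSwitch(small, 3, 300));
        System.out.println(canGenerateSwitch(huge, 3, 300));
    }
}
```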
-
createUniqueLabel
public String createUniqueLabel(String name)
Create a label to track a token / rule reference's result. Technically, this is a place where I break model-view separation as I am creating a variable name that could be invalid in a target language, however, label ::= <ID><INT> is probably ok in all languages we care about.
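A minimal sketch of unique label creation, assuming a label is the given name followed by an increasing integer taken from a counter such as the uniqueLabelNumber field. The class name and the counter's starting value are assumptions.

```java
/** Hypothetical sketch: unique labels built from a name and a running counter. */
final class UniqueLabelSketch {
    private int uniqueLabelNumber = 1; // starting value is an assumption

    String createUniqueLabel(String name) {
        // The counter is shared across names, so no two labels collide.
        return name + uniqueLabelNumber++;
    }

    public static void main(String[] args) {
        UniqueLabelSketch gen = new UniqueLabelSketch();
        System.out.println(gen.createUniqueLabel("ID"));
        System.out.println(gen.createUniqueLabel("expr"));
    }
}
```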
-
-