-
-
Notifications
You must be signed in to change notification settings - Fork 9
Writing Semantic action rules
The Earley Algorithm parser builds an Abstract Syntax Tree either after
it fully parses the grammar or when directed by custom reduction check
rules that may be specified. This is done in instance method
reduce_ast()
.
Later in semantic analysis this tree is traversed to accomplish
whatever action is needed. For example in
uncompyle6 the action is
to reproduce Python source code from a parse of (modified) CPython
bytecode instructions. In our examples
expr
and
expr2
,
the example programs, semantic analysis evaluates arithmetic
expressions.
The semantic actions class is usually derived from the
GenericASTTraversal
class. That class provides methods for pre- and
post-order tree traversal. Most of the time pre-order traversal is
used. Since pre-order traversal also allows for hooks to be run after
all nodes complete, it can do a more than a strict pre-order
traversal.
The way to indicate a semantic action to take on an GenericASTTraversal
nonterminal is to create a method in the subclass of
GenericaASTTraversal
and preface the nonterminal name with n_
. For
example to have a semantic action for nonterminal stmt
, you would
define method n_stmt
.
After a while, you'll probably find many routines which do similar sorts of things, and soon you'll find you want a short notation to describe rules and not have to create methods at all.
For example in the Python 2 grammar, you could write for the pass
statement rule:
def n_pass(self, n):
self.write(self.indent) # start a new line
self.writeln('pass') # write "pass" and finish line
and do the corresponding thing for continue
and break
.
To reduce tedium, what's generally done instead is to put rules into a table and have C-style sprintf format specifiers to indicate these common kinds of actions to take. Doing this in the Python2 grammar we instead have rules:
TABLE_DIRECT = {
'break_stmt': ( '%|break\n', ),
'continue_stmt': ( '%|continue\n', ),
'pass_stmt': ( '%|pass\n', ),
...
}
The %|
format specifier indicates starting on a new line.
The kinds of specifiers is very much tied to how one write the grammar and what kind of activity is needed. For example in uncompyle6 I often need to copy attributes from one node over to one of its siblings when I need be able to give a deparse by program offset. That kind of operation however is not needed when deparsing the entire program, so no offset information needs to be stored.
The method template_engine
in the semantic rules is where we define such table-driven rules.
See also Table-driven Semantic Actions from the uncompyle6 documentation for a more description full-blown system of table-directed semantic actions.