Adding Integers
Interpreters, Iteration 5
User Story
User Story #5: Interpreter evaluates a two-value sum.
A Hobbes program can be the addition of two values (any combination of integers and strings).
Examples: 2 + 3
and "hi " + "there"
.
These evaluate to 5
and hi there
,
respectively.
For this particular iteration, you're only going to worry about adding two integers together.
Exploring ASTs and TIRs
Now the structure and workings of ASTs and TIRs become more important. A single integer or single string can be represented as one object. But a sum has to be represented as three objects (at least): the two operands and the operator.
Storing these three objects should reflect the structure of the original program. The structure of the program tells us how to interpret the code, so presumably your interpreters should know and use the structure.
The front end of an interpreter (or compiler) is the portion of the interpreter that reads the input (a stream of characters) and turns it into a useful data structure. The back end (the part you're writing) does the actual interpreting using the useful data structure.
And the most useful data structure to represent a program: a tree.
Abstract Syntax Trees
An abstract syntax tree (AST) is a tree data structure representing the syntax (i.e., original structure) of the program. ANTLR, a library used in the front end, will generate its own version of an AST from the parser (one of the phases of the front end).
For ANTLR, an AST consists of tokens (i.e., individual elements of a program like a single integer or a single string or a single operator) organized into general tree with an arbitrary number of children.
A single integer would be just a single token wrapped in a single-node tree. You've seen this in testing the interpretation of an integer. Similarly, you've seen how a string is wrapped up the same way. These are hardly trees, of course, because these are hardly programs! However, take two single integers and add them together, and you end up with a more interesting tree.
The expression 5 + 11
is parsed as this AST:

ANTLR's ASTs come with a variety of methods to help navigate
through an AST: getChild(int)
,
addChild(Tree)
, getParent()
. You've
already used the getType()
and getText()
methods.
If you wanted to build this tree yourself, here's some code to do that:
Tree root = new CommonTree(new CommonToken(HobbesParser.OPERATOR, "+")); root.addChild(new CommonTree(new CommonToken(HobbesParser.INTEGER, "5"))); root.addChild(new CommonTree(new CommonToken(HobbesParser.INTEGER, "11")));
Tree Intermediate Representations
ASTs are a procedural way of processing information. You have methods to navigate through the structure of an AST, but the AST itself doesn't really do much. ASTs often only indicate the structure of a program. Sometimes you want to annotate the tree in some ways to provide more information to your interpreter.
Many interpreters and compilers use a tree intermediate representation (TIR). It's a little more powerful than an AST, and sits between the front end and the ultimate result (hence calling this data structure an "intermediate representation").
ANTLR also provides a tree parser which will parse its own ASTs. Some people use a tree parser to do the actual interpretation or compilation or other static analysis. I prefer to use a tree parser to build a TIR for me. The TIR hierarchy of classes is something I put together, and it's provided to you in the CITkit library.
Instead of providing generic building and navigation methods,
each TIR class has its own particular structure and methods. So
while the picture of 5 + 11
above is not a bad
approximation of its TIR, the code to build the equivalent TIR is
more telling:
OperatorETIR operator = new OperatorETIR( new IntegerETIR(5), OperatorETIR.Operator.ADD, new IntegerETIR(11));
These objects know exactly what kind of a TIR they are, so
there's no extra "token type" to carry around. Also, data is
carried around in its native form (e.g., integers as
int
s, not String
s).
Implementing
Let's assume for this iteration that the operands to an addition are always integers. The user story itself says otherwise, but stories are often broken down into smaller implementation steps.
The user story itself tells us to assume that there's only two
operands to and addition, so there's no need (yet) to worry about
things like 5 + 11 + 4
. Similarly, you don't have to
worry about multiplication or subtraction or division or modulus or
exponentiation or etc...
Procedural
Reuse createIntegerAST(String)
. Also,
String.valueOf(int)
turns an int
into a
String
. Be sure to try at least one negative number
and zero.
Write a createAdditionAST(int, int)
method in
ProceduralHobbesInterpreterTest
to generate an AST for
an addition. Write a new test method
shouldInterpretAddition()
, and make three assertions
with three different trees. Compile, and red bar.
The problem should be that the switch
statement in your interpret(Tree)
method doesn't
handle an OPERATOR
.
Add another case
to handle OPERATOR
s.
Have it throw an exception or maybe return null
.
Compile, and red bar.
Equipped with this iteration's assumptions, you can assume that
an OPERATOR
has two children, and they're both
INTEGER
s. So just grab the two children
(tree.getChild(0)
and tree.getChild(1)
),
turn their text (tree.getChild(0).getText()
) into
int
s. Integer.decode(String)
does the
best job turning String
s into int
s.
Write working code for the OPERATOR
case. Compile,
and green bar.
Write a few new Hobbes scripts, and try your driver on them. (The object-oriented interpreter will have problems with them, of course.)
Object Oriented
This is going to require a bit more work. So far, all your
interpret(ExpressionTIR)
does is turn the expression
into a String
with toString()
. That won't
work well for this iteration, at least not for an addition.
Write a new shouldInterpretAddition()
test method
in OOHobbesInterpreterTest
. Make three assertions
about three different additions. Compile, and red bar.
You should get an error about the strings: "16"
is
not the same as "5+11"
.
The parameter of interpret(ExpressionTIR)
is
originally of little help because it only thinks of itself as a
generic expression. But this is an inheritance game.
ExpressionTIR
is really an interface, so while it's
perfectly fine and good to declare variables and parameters of this
type, the actual object is an IntegerTIR
, a
StringETIR
, or an OperatorETIR
.
You could ask it: (tir instanceof IntegerTIR)
. This
is almost always followed by a cast: IntegerETIR i =
(IntegerETIR) tir;
. Whenever you use Java's
instanceof
operator or an object cast, you're not
doing object-oriented programming. So let's avoid this as much
as possible.
The object-oriented solution you'll use is a pretty significant overkill, but it'll set the OO framework for this interpreter until you're finished with it.
You're going to use what's known as the visitor
pattern. If you had the ability to change the
IntegerETIR
and OperatorETIR
classes, you
could add in interpret()
methods to those classes to
do the interpreting. However, you do not have the ability to change
these classes. Even if you did, there are many different
interpreters (and compilers and static analyzers) that you could
implement, and would you really want to implement methods in the
class hierarchy for each one of them? (Hint: no!)
The visitor pattern is a design pattern that
abstracts an algorithm over a class hierarchy. The classes
in the hierarchy (like IntegerETIR
and
OperatorETIR
) accept a visitor and call the
appropriate method for their type. Your algorithm-implementing
class only has to implement a visitor interface and the
methods it requires.
This is most easily seen from an example, and you're ready to make the example yourself.
First, do yourself a favor, and change a code template in Eclipse. The example on that page explains what you should change and what to change it to.
To implement an algorithm with the visitor pattern, your class
must implement a visitor interface. You're interested in the
ExpressionTIRVisitor<T>
.
Eclipse quick fix: click on the warning symbol in the editor's gutter. Or, click on the error, and press Ctrl-1. Your options change depending on the error; for this error, choose to implement the missing methods.
Change OOHobbesInterpreter
to implement
ExpressionTIRVisitor<String>
. Take care of the
18 errors this produces with a quick fix on the class,
implementing the missing methods.
You should now have eighteen more methods defined in the class. You'll only need three of them for this iteration (and for the near future).
Compile, and get the same red bar.
Turning OOHobbesInterpreter
into a Hobbes visitor
is only part of the solution. interpret(ExpressionTIR)
is still handling the interpretation. You need that method to use
its own object as the visitor. This will look a little odd:
public String interpret(ExpressionTIR expression) { return expression.accept(this); }
accept(ExpressionTIRVisitor<T>)
is a method
specified by the visitor pattern. It's job is to take the visitor
and use one of its methods to process the expression. All of that
is implemented for you in the TIR hierarchy. What you've provided
in OOHobbesInterpreter
is the visitor itself. When
expression
is an IntegerETIR
, then
visitIntegerETIR(IntegerETIR)
will be called.
Change interpret(ExpressionTIR)
. Compile, and red
bar.
The red bar should now include failures for
IntegerETIR
and StringETIR
. But the code
that made them work was nothing more than calling
toString()
on the expressions. It's just that with the
visitor, where you call toString()
has
changed.
Change visitIntegerETIR(IntegerETIR)
and
visitStringETIR(StringETIR)
to compute correctly.
Compile, and red bar.
You should have just the one error for an
OperatorETIR
assertion.
Care to guess which method you have to change in order to handle it?
Try a little trick.
Comment out two of your OperatorETIR
assertions,
and hard code the result of the remaining assertion in the
proper visit
method of
OOHobbesInterpreter
. Compile, and green
bar.
The green bar is a cheat, of course, because you've hard coded the answer to just the one assertion. But at least you've confirmed where you have to change the code.
Uncomment the assertions, compile, and red bar.
So now the question is how to process the
OperatorETIR
. Suppose I call this value
operator
. If I want the first operand (the one on the
left in the TIR), I can use operator.getLeft()
.
Unfortunately, that method returns an ExpressionTIR
.
Again, because of the inheritance hierarchy, an
IntegerTIR
is an ExpressionTIR
, so
creating the TIR isn't a problem. The problem is that it's harder
for you to get the integer value out!
The answer for this iteration is a cast.
IntegerETIR left = (IntegerETIR) operator.getLeft();
Then you can call left.getValue()
to get the actual
int
value out for an addition.
You should be puzzled and confused (and angry?) by the use of a cast here since I summarily dismissed them a few paragraphs ago. Using a cast means that you're not doing object-oriented programming. You're going to cheat on this iteration to keep things relatively simple. Soon (i.e., by the time you finish the current user story), you'll clean this up.
Don't forget String.valueOf(int)
.
Write good code for the visit
method you're working
on. Compile, and green bar.
Try out both drivers on some addition programs.