Abstracting Operator Algorithms
Interpreters, Iteration 10
This is going to be a refactoring iteration.
Make sure all your unit and CIAT tests green bar.
Visiting an Operator Expression
Here's what my
HobbesInterpreter#visitOperatorETIR()
looks like:
public String visitOperatorETIR(IOperatorETIR expression) { OperatorAlgorithm algorithm; switch (expression.getOperator()) { case ADD: algorithm = PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm()); break; case MULTIPLY: algorithm = PolyD.build(OperatorAlgorithm.class, new MultiplicationAlgorithm()); break; default: throw new IllegalArgumentException(expression.getOperator() + " is an unrecognized operator"); } return algorithm.compute(expression.getLeft(), expression.getRight()); }
The key observation for this iteration is the role of the
switch
statement. Since both operators are implemented
using the OperatorAlgorithm
interface, you can set a
variable (algorithm
in my code) to the appropriate
object and then use that to compute the expression.
Yes, you could copy-and-paste, but it's good practice to refactor (with Eclipse's Refactor menu) when you have a clear target for the refactoring.
Refactor your code into this format.
The purpose of that switch
statement is to
determine which operator algorithm to apply. The method finishes by
actually invoking the operator algorithm on the two
subexpressions.
Note that this description of the method's algorithm says
nothing about what addition really means or multiplication
or any other operator. That's not the purpose of this
method. It calls another method to do the computation. So why
are you testing for the actual addition in
HobbesInterpreterTest
? Why don't you have tests for
AdditionAlgorithm
?
These are great questions. You'll answer the latter question in this iteration. The former question will take more work, and so we'll save it for the next iteration.
Writing the Missing Tests
Strictly speaking, AdditionAlgorithm
and
MultiplicationAlgorithm
do not have unit tests. Yes,
they're being tested through HobbesInterpreterTest
,
but it's much better to test them directly. But first, I want to
invoke the Rule of Three since you have three interfaces/classes
related to operators: OperatorAlgorithm
,
AdditionAlgorithm
, and
MultiplicationAlgorithm
.
The moving can be done as a refactoring in Eclipse. When you've
made the moves, you may want to commit to git so that the renaming
is detected. Use git add -u .
to recognize that some
files have been deleted.
Create a new package
...interpreters.hobbes.operators
in both the
src
and tests
folders. Move the three
classes listed above into the src
version. Run unit
and CIAT tests for green bars.
In the tests
folder, you can create an
AdditionAlgorithmTest
which starts out like this:
public class AdditionAlgorithmTest { private OperatorAlgorithm myAlgorithm; @Before public void setUp() { myAlgorithm = PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm()); } }
The setUp()
method uses the same allocation you
have in HobbesInterpreter
.
It'll take a good deal of manipulation to modify the tests. It might be easier to rewrite them. It's up to you. I would use the original tests as inspiration.
Create this test-case class with this code. Copy over the
addition-testing methods of HobbesInterpreterTest
into
AdditionAlgorithmTest
. Modify the assertions to use
myAlgorithm
. Green bars all around.
Don't remove the tests from HobbesInterpreterTest
yet since they're still ensuring that the Hobbes interpreter can
compute operator expressions.
Follow the same process to create
MultiplicationAlgorithmTest
and to get tests into it.
Green bars all around.
So What Are the Alternatives?
Eclipse: Quick-fix on the switch
!
Why use a switch
to determine which algorithm to
use? It helps to consider the alternatives when asking a question
like this.
I could write a multibranch if
in place of the
switch
:
public String visitOperatorETIR(IOperatorETIR expression) { OperatorAlgorithm algorithm; if (expression.getOperator() == Operator.ADD) { algorithm = PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm()); } else if (expression.getOperator() == Operator.MULTIPLY) { algorithm = PolyD.build(OperatorAlgorithm.class, new MultiplicationAlgorithm()); } else { throw new IllegalArgumentException(expression.getOperator() + " is an unrecognized operator"); } return algorithm.compute(expression.getLeft(), expression.getRight()); }
The switch
is more readable, and there's a good
chance that it's more efficient. (Operator
is an
enumerated data type, and the word is that they're supposed to be
efficient in Java when you want them to be, including in a
switch
.)
However, both the switch
and multibranch
if
have a fatal flaw: they're static code. You have no
flexibility outside of the HobbesInterpreter
class to
use different algorithms. Or add new ones! What if the client wants
to be able to add new operators on the fly?
Before we get too carried away though, those grandiose ideas aren't enough to get us to change this code yet. The reasons to change now is much, much simpler: responsibility and abstraction.
Again, what is the purpose of the switch
statement
in this method? Answer: to select the appropriate algorithm. This
is a data lookup for an algorithm. The operator
is a key, and the corresponding algorithm is the value. Data lookup
for an algorithm can be done at least four different ways:
- as a
switch
, - as a multibranch
if
, - as a lookup table (i.e., a binary search tree or a hash table), or
- as object-oriented polymorphism.
Each of these solutions has their benefits and drawbacks. We're presently looking for some abstraction, and this is best obtained through a lookup table or polymorphism.
Unfortunately, polymorphism is not an option for you (without a
lot of extra work). I've written the Operator
enumerated types, and a polymorphic solution would have to be
provided by me. I would have to write
Operator.ADD.compute(ExpressionTIR, ExpressionTIR)
for
you. Or you'd have to implement the whole TIR hierarchy (and
visitors) for yourself. It also turns out that this isn't
particularly flexible either.
So to get the flexibility that I'm demanding for you, the remaining solution is to use a lookup table.
Switching to a Hash Table (Pardon the Pun)
A lookup table, like the one you need for operator algorithms, is usually implemented as a hash table in a compiler or interpreter. In practice, hash tables perform just fine, and it's not worth the hassle of implementing binary search trees (let alone AVL or red-black trees).
In Java we can maintain some abstraction about the lookup table:
use a Map<K,V>
. This has two generic types with
it, one for the type of data used for keys and another for the type
of values. In our case, the keys are of type
OperatorETIR.Operator
, and the values are of type
OperatorAlgorithm
.
Define a new instance variable in HobbesInterpreter
named myOperators
of type
Map<OperatorETIR.Operator, OperatorAlgorithm>
.
In the constructor, initialize it to be a new
HashMap<OperatorETIR.Operator,
OperatorAlgorithm>
. Green bars all around.
You might not have a HobbesInterpreter
constructor.
Create one with no parameters. You should not have more than one
constructor.
If I want to put an algorithm into the lookup table, it might look (exactly) like this:
myOperators.put( Operator.ADD, PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm()) );
It's all of the same elements that went into the
switch
. I'm just storing it in a lookup table now
instead of "storing" it in the switch
.
Add that line of code to the HobbesInterpreter
constructor. Green bars all around.
No, you're not actually using the lookup table yet. Let's do
that in the ADD
case:
algorithm = myOperators.get(Operator.ADD);
Use this assignment in the ADD
case. Green bars all
around.
What's good for the ADD
is good for the
MULTIPLY
!
Follow those last two steps for MULTIPLY
, tweaking
the code as necessary. Green bars all around.
Look at your switch
(or multibranch
if
). It's almost unnecessary!
The switch does have an error case that's kind of helpful for debugging, but it shouldn't come up in normal use by your client. So I wouldn't worry about this. If you have any tests that do test that default case, go ahead and delete it. (The client hasn't made any stories that tell us what to do with an error like this, and it's actually counter productive to make some assertions now.)
So with that observation, the switch
is
unnecessary.
Just set algorithm
directly from the lookup table
when that variable is declared. Get rid of the switch
.
Green bars all around.
visitOperatorETIR()
should now be just two lines
long!
Think About It
So consider the final product:
- The code is simpler.
- Responsibilities are better separated.
- The code is more flexible.
- The code is easier to test.
Keep in mind that these are all comparatives, not superlatives. So things are better, not the best—yet!
Simpler?
visitOperatorETIR()
is definitely shorter, and that
often means simpler. (Read some obfuscated C or
Perl for counterexamples.) I believe it applies here. Creating the
lookup table takes a good deal of code, but it's better patterned
which will make it easy to read as more operators are added.
visitOperatorETIR()
never changes as more operators
are added!
Better separated responsibilities?
visitOperatorETIR()
worries about applying an
operator, not about applying addition or multiplication.
It's the responsibility of the lookup table to know what algorithm
to apply for which operator. Here, you're one step away from being
best if only you could get the initialization of the lookup table
out of the HobbesInterpreter
class. (That sounds like
a future iteration!)
More flexible?
This is a direct result of separating the responsibilities.
Changing an operator's algorithm on the fly is now
possible. It's not a trivial change (e.g., how would the
Hobbes programmer specify the new algorithm for
+
?).
Easier to test?
You know this one is wrong, right? At the beginning of this
iteration, I told you to duplicate some of the tests you
had in HobbesInterpreterTest
into
AdditionAlgorithmTest
and
MultiplicationAlgorithmTest
. How can duplicated tests
be easier?
The truth is, those tests belong in the
AdditionAlgorithmTest
and
MultiplicationAlgorithmTest
, not
HobbesInterpreterTest
.
Again, this comes down to the issue of responsibility.
AdditionAlgorithm
is responsible for implementing the
addition algorithm, so AdditionAlgorithmTest
should be
responsible for testing that algorithm. So that's where you test
each and every variation of data types and idempotent values
(zero and the empty string)
HobbesInterpreter
is just responsible for looking
up an operator and applying the right algorithm. If
HobbesInterpreterTest
tests different variations of
data types and idempotent values, it's testing too much. It's
duplicating tests that AdditionAlgorithmTest
and
MultiplicationAlgorithmTest
should be covering.
Let's simplify the testing!
Replace the addition and multiplication test methods in
HobbesInterpreterTest
with one test method,
shouldInterpreterOperatorExpressions()
. Use two
assertions, one involving an addition, the other a
multiplication.
There is a question of whether or not the right addition and multiplication algorithms are being called, but that's not entirely a question for the unit tests to address. You should have CIAT tests that assure the client (and yourself) that the right algorithms are in place. The fact that the algorithm is found in a lookup table is a task for another iteration.