Abstracting Operator Algorithms
Interpreters, Iteration 10

This is going to be a refactoring iteration.

Make sure all your unit and CIAT tests green bar.

Visiting an Operator Expression

Here's what my HobbesInterpreter#visitOperatorETIR() looks like:

public String visitOperatorETIR(IOperatorETIR expression) {
  OperatorAlgorithm algorithm;
  switch (expression.getOperator()) {
  case ADD:
    algorithm = PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm());
    break;
  case MULTIPLY:
    algorithm = PolyD.build(OperatorAlgorithm.class, new MultiplicationAlgorithm());
    break;
  default:
    throw new IllegalArgumentException(expression.getOperator()
        + " is an unrecognized operator");
  }
  return algorithm.compute(expression.getLeft(), expression.getRight());
}

The key observation for this iteration is the role of the switch statement. Since both operators are implemented using the OperatorAlgorithm interface, you can set a variable (algorithm in my code) to the appropriate object and then use that to compute the expression.

Yes, you could copy-and-paste, but it's good practice to refactor (with Eclipse's Refactor menu) when you have a clear target for the refactoring.

Refactor your code into this format.

The purpose of that switch statement is to determine which operator algorithm to apply. The method finishes by actually invoking the operator algorithm on the two subexpressions.

Note that this description of the method's algorithm says nothing about what addition really means or multiplication or any other operator. That's not the purpose of this method. It calls another method to do the computation. So why are you testing for the actual addition in HobbesInterpreterTest? Why don't you have tests for AdditionAlgorithm?

These are great questions. You'll answer the latter question in this iteration. The former question will take more work, and so we'll save it for the next iteration.

Writing the Missing Tests

Strictly speaking, AdditionAlgorithm and MultiplicationAlgorithm do not have unit tests. Yes, they're being tested through HobbesInterpreterTest, but it's much better to test them directly. But first, I want to invoke the Rule of Three since you have three interfaces/classes related to operators: OperatorAlgorithm, AdditionAlgorithm, and MultiplicationAlgorithm.

The moving can be done as a refactoring in Eclipse. When you've made the moves, you may want to commit to git so that the renaming is detected. Use git add -u . to recognize that some files have been deleted.

Create a new package ...interpreters.hobbes.operators in both the src and tests folders. Move the three classes listed above into the src version. Run unit and CIAT tests for green bars.

In the tests folder, you can create an AdditionAlgorithmTest which starts out like this:

public class AdditionAlgorithmTest {

  private OperatorAlgorithm myAlgorithm;

  @Before
  public void setUp() {
    myAlgorithm = PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm());
  }
}

The setUp() method uses the same allocation you have in HobbesInterpreter.

It'll take a good deal of manipulation to modify the tests. It might be easier to rewrite them. It's up to you. I would use the original tests as inspiration.

Create this test-case class with this code. Copy over the addition-testing methods of HobbesInterpreterTest into AdditionAlgorithmTest. Modify the assertions to use myAlgorithm. Green bars all around.

Don't remove the tests from HobbesInterpreterTest yet since they're still ensuring that the Hobbes interpreter can compute operator expressions.

Follow the same process to create MultiplicationAlgorithmTest and to get tests into it. Green bars all around.

So What Are the Alternatives?

Eclipse: Quick-fix on the switch!

Why use a switch to determine which algorithm to use? It helps to consider the alternatives when asking a question like this.

I could write a multibranch if in place of the switch:

public String visitOperatorETIR(IOperatorETIR expression) {
  OperatorAlgorithm algorithm;
  if (expression.getOperator() == Operator.ADD) {
    algorithm = PolyD.build(OperatorAlgorithm.class,
        new AdditionAlgorithm());
  } else if (expression.getOperator() == Operator.MULTIPLY) {
    algorithm = PolyD.build(OperatorAlgorithm.class,
        new MultiplicationAlgorithm());
  } else {
    throw new IllegalArgumentException(expression.getOperator()
        + " is an unrecognized operator");
  }
  return algorithm.compute(expression.getLeft(), expression.getRight());
}

The switch is more readable, and there's a good chance that it's more efficient. (Operator is an enumerated data type, and the word is that they're supposed to be efficient in Java when you want them to be, including in a switch.)

However, both the switch and multibranch if have a fatal flaw: they're static code. You have no flexibility outside of the HobbesInterpreter class to use different algorithms. Or add new ones! What if the client wants to be able to add new operators on the fly?

Before we get too carried away though, those grandiose ideas aren't enough to get us to change this code yet. The reasons to change now is much, much simpler: responsibility and abstraction.

Again, what is the purpose of the switch statement in this method? Answer: to select the appropriate algorithm. This is a data lookup for an algorithm. The operator is a key, and the corresponding algorithm is the value. Data lookup for an algorithm can be done at least four different ways:

as a switch,
as a multibranch if,
as a lookup table (i.e., a binary search tree or a hash table), or
as object-oriented polymorphism.

Each of these solutions has their benefits and drawbacks. We're presently looking for some abstraction, and this is best obtained through a lookup table or polymorphism.

Unfortunately, polymorphism is not an option for you (without a lot of extra work). I've written the Operator enumerated types, and a polymorphic solution would have to be provided by me. I would have to write Operator.ADD.compute(ExpressionTIR, ExpressionTIR) for you. Or you'd have to implement the whole TIR hierarchy (and visitors) for yourself. It also turns out that this isn't particularly flexible either.

So to get the flexibility that I'm demanding for you, the remaining solution is to use a lookup table.

Switching to a Hash Table (Pardon the Pun)

A lookup table, like the one you need for operator algorithms, is usually implemented as a hash table in a compiler or interpreter. In practice, hash tables perform just fine, and it's not worth the hassle of implementing binary search trees (let alone AVL or red-black trees).

In Java we can maintain some abstraction about the lookup table: use a Map<K,V>. This has two generic types with it, one for the type of data used for keys and another for the type of values. In our case, the keys are of type OperatorETIR.Operator, and the values are of type OperatorAlgorithm.

Define a new instance variable in HobbesInterpreter named myOperators of type Map<OperatorETIR.Operator, OperatorAlgorithm>. In the constructor, initialize it to be a new HashMap<OperatorETIR.Operator, OperatorAlgorithm>. Green bars all around.

You might not have a HobbesInterpreter constructor. Create one with no parameters. You should not have more than one constructor.

If I want to put an algorithm into the lookup table, it might look (exactly) like this:

myOperators.put(
  Operator.ADD,
  PolyD.build(OperatorAlgorithm.class, new AdditionAlgorithm())
  );

It's all of the same elements that went into the switch. I'm just storing it in a lookup table now instead of "storing" it in the switch.

Add that line of code to the HobbesInterpreter constructor. Green bars all around.

No, you're not actually using the lookup table yet. Let's do that in the ADD case:

algorithm = myOperators.get(Operator.ADD);

Use this assignment in the ADD case. Green bars all around.

What's good for the ADD is good for the MULTIPLY!

Follow those last two steps for MULTIPLY, tweaking the code as necessary. Green bars all around.

Look at your switch (or multibranch if). It's almost unnecessary!

The switch does have an error case that's kind of helpful for debugging, but it shouldn't come up in normal use by your client. So I wouldn't worry about this. If you have any tests that do test that default case, go ahead and delete it. (The client hasn't made any stories that tell us what to do with an error like this, and it's actually counter productive to make some assertions now.)

So with that observation, the switch is unnecessary.

Just set algorithm directly from the lookup table when that variable is declared. Get rid of the switch. Green bars all around.

visitOperatorETIR() should now be just two lines long!

Think About It

So consider the final product:

The code is simpler.
Responsibilities are better separated.
The code is more flexible.
The code is easier to test.

Keep in mind that these are all comparatives, not superlatives. So things are better, not the best—yet!

Simpler?

visitOperatorETIR() is definitely shorter, and that often means simpler. (Read some obfuscated C or Perl for counterexamples.) I believe it applies here. Creating the lookup table takes a good deal of code, but it's better patterned which will make it easy to read as more operators are added. visitOperatorETIR() never changes as more operators are added!

Better separated responsibilities?

visitOperatorETIR() worries about applying an operator, not about applying addition or multiplication. It's the responsibility of the lookup table to know what algorithm to apply for which operator. Here, you're one step away from being best if only you could get the initialization of the lookup table out of the HobbesInterpreter class. (That sounds like a future iteration!)

More flexible?

This is a direct result of separating the responsibilities. Changing an operator's algorithm on the fly is now possible. It's not a trivial change (e.g., how would the Hobbes programmer specify the new algorithm for +?).

Easier to test?

You know this one is wrong, right? At the beginning of this iteration, I told you to duplicate some of the tests you had in HobbesInterpreterTest into AdditionAlgorithmTest and MultiplicationAlgorithmTest. How can duplicated tests be easier?

The truth is, those tests belong in the AdditionAlgorithmTest and MultiplicationAlgorithmTest, not HobbesInterpreterTest.

Again, this comes down to the issue of responsibility. AdditionAlgorithm is responsible for implementing the addition algorithm, so AdditionAlgorithmTest should be responsible for testing that algorithm. So that's where you test each and every variation of data types and idempotent values (zero and the empty string)

HobbesInterpreter is just responsible for looking up an operator and applying the right algorithm. If HobbesInterpreterTest tests different variations of data types and idempotent values, it's testing too much. It's duplicating tests that AdditionAlgorithmTest and MultiplicationAlgorithmTest should be covering.

Let's simplify the testing!

Replace the addition and multiplication test methods in HobbesInterpreterTest with one test method, shouldInterpreterOperatorExpressions(). Use two assertions, one involving an addition, the other a multiplication.

There is a question of whether or not the right addition and multiplication algorithms are being called, but that's not entirely a question for the unit tests to address. You should have CIAT tests that assure the client (and yourself) that the right algorithms are in place. The fact that the algorithm is found in a lookup table is a task for another iteration.

CS 214: Programming Languages Spring 2009

Abstracting Operator Algorithms Interpreters, Iteration 10