14 A First Look at Interpretation
Once we have a representation of programs, there are many ways in
which we might want to manipulate them. We might want to display a
program in an attractive way (“pretty-print”), convert into code in
some other format (“compilation”), ask whether it obeys certain
properties (“verification”), and so on. For now, we’re going to
focus on asking what value it corresponds to (“evaluation”—
We will write an evaluator, in the form of an interpreter, for our arithmetic language. We choose arithmetic first for three reasons: (a) you already know how it works, so we can focus on the mechanics of writing evaluators; (b) it’s contained in every language we will encounter later, so we can build upwards and outwards from it; and (c) it’s (surprisingly) sophisticated enough to convey some important points.
14.1 Representing Arithmetic
Let’s first agree on how we will represent arithmetic expressions.
Let’s say we want to support only two operations—
Why did we not include division? What impact does it have on the remarks above?
data ArithC:
| numC (n :: Number)
| plusC (l :: ArithC, r :: ArithC)
| multC (l :: ArithC, r :: ArithC)
end
14.2 Writing an Interpreter
Now let’s write an interpreter for this arithmetic language. First,
we should think about what its type is. It clearly consumes a
ArithC value. What does it produce? Well, an interpreter
evaluates—
Write your test cases for the interpreter.
Templates are explained in detail earlier in this book [REF].
fun interp(e :: ArithC) -> Number:
cases (ArithC) e:
| numC(n) => ...
| plusC(l, r) => ...
| multC(l, r) => ...
end
end
fun interp(e :: ArithC) -> Number:
cases (ArithC) e:
| numC(n) => n
| plusC(l, r) => l + r
| multC(l, r) => l * r
end
where:
interp(numC(3)) is 3
end
Do you spot the errors?
fun interp(e :: ArithC) -> Number:
cases (ArithC) e:
| numC(n) => ...
| plusC(l, r) => ... interp(l) ... interp(r) ...
| multC(l, r) => ... interp(l) ... interp(r) ...
end
end
fun interp(e :: ArithC) -> Number:
cases (ArithC) e:
| numC(n) => n
| plusC(l, r) => interp(l) + interp(r)
| multC(l, r) => interp(l) * interp(r)
end
end
Later on [REF], we’re going to wish we had returned a more complex datatype than just numbers. But for now, this will do.
Congratulations: you’ve written your first interpreter! I know, it’s
very nearly an anticlimax. But they’ll get harder—
14.3 Did You Notice?
I just slipped something by you:
What is the “meaning” of addition and multiplication in this new language?
That’s a pretty abstract question, isn’t it. Let’s make it concrete. I’ll pose the problem as follows.
1 + 2
1 + 2
’1’ + ’2’
’1’ + ’2’
First of all, there are many different kinds of numbers: fixed-width (e.g., 32-bit) integers, signed fixed-width (e.g., 31-bits plus a sign-bit) integers, arbitrary precision integers; in some languages, rationals; various formats of fixed- and floating-point numbers; in some languages, complex numbers; and so on. After the numbers have been chosen, addition may support only some combinations of them.
In addition, some languages permit the addition of datatypes such as matrices.
Furthermore, many languages support “addition” of strings (we use scare-quotes because we don’t really mean the mathematical concept of addition, but rather the operation performed by an operator with the syntax +). In some languages this always means concatenation; in some others, it can result in numeric results (or numbers stored in strings).
Returning to our interpreter, what semantics do we have? We’ve adopted whatever semantics Pyret provides, because we map + to Pyret’s +. In fact that’s not even quite true: Pyret may, for all we know, also enable + to apply to strings (which in fact it does), so we’ve chosen the restriction of Pyret’s semantics to numbers.
In what way have we restricted + to apply only to numbers? Where exactly is this restriction?
If we wanted a different semantics, we’d have to implement it explicitly.
What all would you have to change so that the number had signed 32-bit arithmetic?
In general, we have to be careful about too readily borrowing from the host language. We’ll return to this topic later [REF].
14.4 Enlarging the Language Without Growing It
We’ve picked a very restricted first language, so there are many ways we can grow it. Some, such as representing data structures and functions, will clearly force us to add new features to the interpreter itself. Others, such as adding more of arithmetic itself, can possibly be done without disturbing the core language and hence its interpreter. Let’s investigate.
14.4.1 Extension: Binary Subtraction
First, we’ll add subtraction. Because our language already has numbers, addition, and multiplication, it’s easy to define subtraction: \(a - b = a + -1 \times b\).
Okay, that was easy! But now we should turn this into concrete code. To do so, we face a decision: where does this new subtraction operator reside? It is tempting, and perhaps seems natural, to just add one more case to our existing ArithC datatype.
What are the negative consequences of modifying ArithC?
The first, obvious, one is that we now have to modify all programs that process ArithC. So far that’s only our interpreter, which is pretty simple, but in a more complex implementation, there could be many programs built around the datatype—
a type-checker, compiler, etc.— which must all be changed, creating a heavy burden. Second, we were trying to add new constructs that we can define in terms of existing ones; it feels slightly self-defeating to do this in a way that isn’t modular.
Third, and most subtly, there’s something conceptually unnecessary about modifying ArithC. That’s because ArithC represents a perfectly good core language. Atop this, we might want to include any number of additional operations that make the user’s life more convenient, but there’s no need to put these in the core. Rather, it’s wise to record conceptually different ideas in distinct datatypes, rather than shoehorn them into one. The separation can look a little unwieldy sometimes, but it makes the program much easier for future developers to read and maintain. Besides, for different purposes you might want to layer on different extensions, and separating the core from the surface enables that.
data ArithExt: |
| numExt (n :: Number) |
| plusExt (l :: ArithExt, r :: ArithExt) |
| multExt (l :: ArithExt, r :: ArithExt) |
| bminusExt (l :: ArithExt, r :: ArithExt) |
end |
Given this datatype, we should do two things. First, we should modify our parser to also parse - expressions, and always construct ArithExt terms (rather than any ArithC ones). Second, we should implement a desugar function that translates ArithExt values into ArithC ones.These conveniences, which exist only in the surface syntax but not in the core, are called syntactic sugar. Hence, removing them is called desugaring.
Let’s write the obvious part of desugar:
fun desugar(s :: ArithExt) -> ArithC: |
cases (ArithExt) s: |
| numExt(n) => numC(n) |
| plusExt(l, r) => plusC(desugar(l), desugar(r)) |
| multExt(l, r) => multC(desugar(l), desugar(r)) |
end |
end |
| bminusExt(l, r) => |
plusC(desugar(l), multC(numC(-1), desugar(r))) |
It’s a common mistake to forget the recursive calls to desugar on l and r. What happens when you forget them? Try for yourself and see.
14.4.2 Extension: Unary Negation
Now let’s consider another extension, which is a little more interesting: unary negation. This forces you to do a little more work in the parser because, depending on your surface syntax, you may need to look ahead to determine whether you’re in the unary or binary case. But that’s not even the interesting part!
Modify parse to handle unary subtraction.
There are many ways we can desugar unary negation. We can define it naturally as \(-b = 0 - b\), or we could abstract over the desugaring of binary subtraction with this expansion: \(-b = 0 + -1 \times b\).
Which one do you prefer? Why?
| uminusExt (e :: ArithExtU) |
| uminusExt(e) => desugar(bminusExt(numExt(0), e)) |
- The first is that the recursion is generative, which forces us to take extra care.If you haven’t heard of generative recursion before, read the section on it in How to Design Programs. Essentially, in generative recursion the sub-problem is a computed function of the input, rather than a structural piece of it. This is an especially simple case of generative recursion, because the “function” is simple: it’s just the bminusExt constructor. We might be tempted to fix this by using a different rewrite:
| uminusExt(e) => bminusExt(numExt(0), desugar(e))
which does indeed eliminate the generativity.Do Now!Unfortunately, this desugaring transformation won’t work at all! Do you see why? If you don’t, try to run it.
The second is that we are implicitly depending on exactly what bminusS means; if its meaning changes, so will that of uminusS, even if we don’t want it to. In contrast, defining a functional abstraction that consumes two terms and generates one representing the addition of the first to -1 times the second, and using this to define the desugaring of both uminusExt and bminusExt, is a little more fault-tolerant.
You might say that the meaning of subtraction is never going to change, so why bother? Yes and no. Yes, it’s meaning is unlikely to change; but no, its implementation might. For instance, the developer may decide to log all uses of binary subtraction. In the macro expansion, all uses of unary negation would also get logged, but they would not in the second expansion.
Fortunately, in this particular case we have a much simpler option, which is to define \(-b = -1 \times b\). This expansion works with the primitives we have, and follows structural recursion. The reason we took the above detour, however, is to alert you to these problems, and warn that you might not always be so fortunate.