On this page:
16.1 Representing Arithmetic
16.2 Writing an Interpreter
16.3 A First Taste of “Semantics”
16.4 Desugaring:   Growing the Language Without Enlarging It
16.4.1 Extension:   Binary Subtraction
16.4.2 Extension:   Unary Negation
16.5 A Three-Stage Pipeline

16 Processing Programs: A First Look at Interpretation

    16.1 Representing Arithmetic

    16.2 Writing an Interpreter

    16.3 A First Taste of “Semantics”

    16.4 Desugaring: Growing the Language Without Enlarging It

      16.4.1 Extension: Binary Subtraction

      16.4.2 Extension: Unary Negation

    16.5 A Three-Stage Pipeline

Now we’re ready to write an evaluatora program that turns programs into answers—in the form of an interpreter, for our arithmetic language.The term “evaluate” means “to reduce to a value”. We choose arithmetic first for three reasons: (a) you already know how it works, so we can focus on the mechanics of writing evaluators; (b) it’s contained in every language we will encounter later, so we can build upwards and outwards from it; and (c) it’s (surprisingly) sophisticated enough to convey some important points.

16.1 Representing Arithmetic

Let’s first agree on how we will represent arithmetic expressions. Let’s say we want to support only two operations—addition and multiplication—in addition to primitive numbers. We need to represent arithmetic expressions. What are the rules that govern the nesting of arithmetic expressions? We’re actually free to nest any expression inside another.

Do Now!

Why did we not include division? What impact does it have on the remarks above?

We’ve ignored division because it forces us into a discussion of what expressions we might consider legal: clearly the representation of 1/2 ought to be legal; the representation of 1/0 is much more debatable; and that of 1/(1-1) seems even more controversial. We’d like to sidestep this controversy for now and return to it later [REF].

Thus, we want a representation for numbers and arbitrarily nestable addition and multiplication. Here’s one we can use:

data ArithC: | numC(n :: Number) | plusC(l :: ArithC, r :: ArithC) | multC(l :: ArithC, r :: ArithC) end

16.2 Writing an Interpreter

Now let’s write an interpreter for this arithmetic language. First, we should think about what its type is. It clearly consumes a ArithC value. What does it produce? Well, an interpreter evaluates—and what kind of value might arithmetic expressions reduce to? Numbers, of course. So the interpreter is going to be a function from arithmetic expressions to numbers.


Write your examples for the interpreter.

Because we have a recursive datatype, it is natural to structure our interpreter as a recursive function over it. Here’s a first template:

Templates are explained in detail in How to Design Programs.

fun interp(e :: ArithC) -> Number: cases (ArithC) e: | numC(n) => ... | plusC(l, r) => ... | multC(l, r) => ... end end

You’re probably tempted to jump straight to code, which you can:

fun interp(e :: ArithC) -> Number: cases (ArithC) e: | numC(n) => n | plusC(l, r) => l + r | multC(l, r) => l * r end where: interp(numC(3)) is 3 end

which works just fine, passing its test.

Do Now!

Do you spot the errors?

Instead, let’s expand the template out a step:

fun interp(e :: ArithC) -> Number: cases (ArithC) e: | numC(n) => ... | plusC(l, r) => ... interp(l) ... interp(r) ... | multC(l, r) => ... interp(l) ... interp(r) ... end end

and now we can fill in the blanks:

fun interp(e :: ArithC) -> Number: cases (ArithC) e: | numC(n) => n | plusC(l, r) => interp(l) + interp(r) | multC(l, r) => interp(l) * interp(r) end end

Later on (Functions Anywhere), we’re going to wish we had returned a more complex datatype than just numbers. But for now, this will do.

Congratulations: you’ve written your first interpreter! I know, it’s very nearly an anticlimax. But they’ll get harder—much harder—pretty soon, I promise.

16.3 A First Taste of “Semantics”

I just slipped something by you:

Do Now!

What is the “meaning” of addition and multiplication in this new language?

That’s a pretty abstract question, isn’t it. Let’s make it concrete. I’ll pose the problem as follows.

Which of these is the same?
  • 1 + 2

  • 1 + 2

  • 1 + 2

  • 1 + 2

What we’re driving at is that there are many kinds of addition in computer science:
  • First of all, there are many different kinds of numbers: fixed-width (e.g., 32-bit) integers, signed fixed-width (e.g., 31-bits plus a sign-bit) integers, arbitrary precision integers; in some languages, rationals; various formats of fixed- and floating-point numbers; in some languages, complex numbers; and so on. After the numbers have been chosen, addition may support only some combinations of them.

  • In addition, some languages permit the addition of datatypes such as matrices.

  • Furthermore, many languages support “addition” of strings (we use scare-quotes because we don’t really mean the mathematical concept of addition, but rather the operation performed by an operator with the syntax +). In some languages this always means concatenation; in some others, it can result in numeric results (or numbers stored in strings).

These are all different “meanings for addition”. Semantics is the mapping of syntax (e.g., +) to meaning (e.g., some or all of the above).

Returning to our interpreter, what semantics do we have? We’ve adopted whatever semantics Pyret provides, because we map + to Pyret’s +. In fact that’s not even quite true: Pyret may, for all we know, also enable + to apply to strings (which in fact it does), so we’ve chosen the restriction of Pyret’s semantics to numbers.


In what way have we restricted + to apply only to numbers? Where exactly is this restriction?

If we wanted a different semantics, we’d have to implement it explicitly.


What all would you have to change so that the number had signed 32-bit arithmetic?

In general, we have to be careful about too readily borrowing from the host language. We’ll return to this topic later [REF]. However, because we have lots of interesting things to study already, we will adopt Pyret’s numbers as our numbers for now.

16.4 Desugaring: Growing the Language Without Enlarging It

We’ve picked a very restricted first language, so there are many ways we can grow it. Some, such as representing data structures and functions, will clearly force us to add new features to the interpreter itself. Others, such as adding more of arithmetic itself, can possibly be done without disturbing the core language and hence its interpreter: this is known as adding syntactic sugar, or “sugar” for short. Let’s investigate.

16.4.1 Extension: Binary Subtraction

First, we’ll add subtraction. Because our language already has numbers, addition, and multiplication, it’s easy to define subtraction: \(a - b = a + -1 \times b\).

Okay, that was easy! But now we should turn this into concrete code. To do so, we face a decision: where does this new subtraction operator reside? It is tempting, and perhaps seems natural, to just add one more case to our existing ArithC datatype.

Do Now!

What are the negative consequences of modifying ArithC?

This creates a few problems:
  1. The first, obvious, one is that we now have to modify all programs that process ArithC. So far that’s only our interpreter, which is pretty simple, but in a more complex implementation, there could be many programs built around the datatype—a type-checker, compiler, etc.—which must all be changed, creating a heavy burden.

  2. Second, we were trying to add new constructs that we can define in terms of existing ones; it feels slightly self-defeating to do this in a way that isn’t modular.

  3. Third, and most subtly, there’s something conceptually unnecessary about modifying ArithC. That’s because ArithC represents a perfectly good core language. Atop this, we might want to include any number of additional operations that make the user’s life more convenient, but there’s no need to put these in the core. Rather, it’s wise to record conceptually different ideas in distinct datatypes, rather than shoehorn them into one. The separation can look a little unwieldy sometimes, but it makes the program much easier for future developers to read and maintain. Besides, for different purposes you might want to layer on different extensions, and separating the core from the surface enables that.

Therefore, we’ll define a new datatype to reflect our intended surface syntax terms:
<arith-dt> ::=

    data ArithExt:

      | numExt (n :: Number)

      | plusExt (l :: ArithExt, r :: ArithExt)

      | multExt (l :: ArithExt, r :: ArithExt)

      | bminusExt (l :: ArithExt, r :: ArithExt)



This looks almost exactly like ArithC, other than the added case, which follows the familiar recursive pattern. Note that the children of each node refer to ArithExt, not ArithC.
Do Now!

What happens if the children are declared to be ArithC rather than ArithExt?

If we did this, then we would be able to use sugar only at the top-level, not in any sub-expressions.

Given this datatype, we should do two things. First, we should modify our parser to also parse - expressions, and always construct ArithExt terms (rather than any ArithC ones). Second, we should implement a desugar function that translates ArithExt values into ArithC ones.Desugaring is the act of removing syntactic sugar.

Let’s write the obvious part of desugar:
<main> ::=

    fun desugar(s :: ArithExt) -> ArithC:

      cases (ArithExt) s:

        | numExt(n) => numC(n)

        | plusExt(l, r) => plusC(desugar(l), desugar(r))

        | multExt(l, r) => multC(desugar(l), desugar(r))





Now let’s convert the mathematical description of subtraction above into code:
<bminus> ::=

    | bminusExt(l, r) =>

      plusC(desugar(l), multC(numC(-1), desugar(r)))

Do Now!

It’s a common mistake to forget the recursive calls to desugar on l and r. What happens when you forget them? Try for yourself and see.

16.4.2 Extension: Unary Negation

Now let’s consider another extension, which is a little more interesting: unary negation. This forces you to do a little more work in the parser because, depending on your surface syntax, you may need to look ahead to determine whether you’re in the unary or binary case. But that’s not even the interesting part!


Modify parse to handle unary subtraction.

There are many ways we can desugar unary negation. We can define it naturally as \(-b = 0 - b\), or we could abstract over the desugaring of binary subtraction with this expansion: \(-b = 0 + -1 \times b\).

Do Now!

Which one do you prefer? Why?

It’s tempting to pick the first expansion, because it’s much simpler. Imagine we’ve extended the ArithExt datatype with a representation of unary negation:
<uminus-dt> ::=

    | uminusExt (e :: ArithExtU)

Now the implementation in desugar is straightforward:
<uminus> ::=

    | uminusExt(e) => desugar(bminusExt(numExt(0), e))

Let’s make sure the types match up. Observe that e is a ArithExt term, so it is valid to use as an argument to bminusExt, and the entire term can legally be passed to desugar. It is therefore important to not desugar e but rather embed it directly in the generated term. This embedding of an input term in another one and recursively calling desugar is a common pattern in desugaring tools; it is called a macro (specifically, the “macro” here is this definition of uminusExt).

However, there are two problems with the definition above:
  1. The first is that the recursion is generative, which forces us to take extra care.If you haven’t heard of generative recursion before, read the section on it in How to Design Programs. Essentially, in generative recursion the sub-problem is a computed function of the input, rather than a structural piece of it. This is an especially simple case of generative recursion, because the “function” is simple: it’s just the bminusExt constructor. We might be tempted to fix this by using a different rewrite:
    <uminus/alt> ::=

        | uminusExt(e) => bminusExt(numExt(0), desugar(e))

    which does indeed eliminate the generativity.

    Do Now!

    Unfortunately, this desugaring transformation won’t work at all! Do you see why? If you don’t, try to run it.

  2. The second is that we are implicitly depending on exactly what bminusExt means; if its meaning changes, so will that of uminusExt, even if we don’t want it to. In contrast, defining a functional abstraction that consumes two terms and generates one representing the addition of the first to -1 times the second, and using this to define the desugaring of both uminusExt and bminusExt, is a little more fault-tolerant.

    You might say that the meaning of subtraction is never going to change, so why bother? Yes and no. Yes, it’s meaning is unlikely to change; but no, its implementation might. For instance, the developer may decide to log all uses of binary subtraction. In the first expansion all uses of unary negation would also get logged, but they would not in the second expansion.

Fortunately, in this particular case we have a much simpler option, which is to define \(-b = -1 \times b\). This expansion works with the primitives we have, and follows structural recursion. The reason we took the above detour, however, is to alert you to these problems, and warn that you might not always be so fortunate.

16.5 A Three-Stage Pipeline

This concludes our first look at the standard pipeline we’re going to use. We will first parse programs to convert them to abstract syntax; we will then desugar them to eliminate unnecessary constructs. From now on, we will usually focus just on the resulting core language, which will be subject to not only interpretation but also type-checking and other actions.