25 Interpreting Conditionals
Now that we have the first stirrings of a programming language, let’s
grow it out a little. The heart of a programming language consists of
control—
25.1 The Design Space of Conditionals
(if test-exp then-part else-part) |
What kind of values can the test-exp be? In some languages they must be Boolean values (two values, one representing truth and the other falsehood). In other languages this expression can evaluate to just about any value, with some set—
colloquially called truthy— representing truth (i.e., they result in execution of the then-part) while the remaining ones are falsy, meaning they cause else-part to run. Initially, it may seem attractive to design a language with several truthy and falsy values: after all, this appears to give the programmer more convenience, permitting non-Boolean-valued functions and expressions to be used in conditionals. However, this can lead to bewildering inconsistencies across languages:Value
JS
Perl
PHP
Python
Ruby
0
falsy
falsy
falsy
falsy
truthy
""
falsy
falsy
falsy
falsy
truthy
NaN
falsy
truthy
truthy
truthy
truthy
nil/null/None/undef
falsy
falsy
falsy
falsy
falsy
"0"
truthy
falsy
falsy
truthy
truthy
-1
truthy
truthy
truthy
truthy
truthy
[]
truthy
truthy
falsy
falsy
truthy
empty map/object
truthy
falsy
falsy
falsy
truthy
Of course, it need not be so complex. Scheme, for instance, has only one value that is falsy: false itself (written as #f). Every other value is truthy. For those who value allowing non-Boolean values in conditionals, this represents an elegant trade-off: it means a function need not worry that a type-consistent value resulting from a computation might cause a conditional to reverse itself. (For instance, if a function returns strings, it need not worry that the empty string might be treated differently from every other string.)While writing this chapter, I stumbled on a strange bug in Pyret: all numeric s-expressions parsed as s-num values except 0, which parsed as a s-sym. Eventually Justin Pombrio reported: “It’s a silly bug with a if in JavaScript that’s getting 0 and thinking it’s false.” Note that Ruby and Lua have relatively few falsy values; it may not be coincidental that their creators were deeply influenced by Scheme.
What kind of terms are the branches? Some languages make a distinction between statements and expressions; in such languages, designers need to decide which of these are permitted. In some languages, there are even two syntactic forms of conditional to reflect these two choices: e.g., in C, if uses statements (and does not return any value) while the “ternary operator” ((...?...:...)) permits expressions and returns a value.
If the branches are expressions and hence allowed to evaluate to values, how do the values relate? Many (but not all) languages with static type systems expect the two branches to have the same type [REF]. Languages without static type systems usually place no restrictions.
For now, we will assume that the conditional expression can only be a Boolean value; the branches are expressions (because that is all we have in our language anyway); and the two branches can return values of different types.
25.2 The Game Plan for Conditionals
- First, we need to define syntax. We’ll use
true
false
(if test-exp then-exp else-exp)
to represent the two Boolean constants and the conditional expression. - We need to modify the representation of programs to handle these new constructs. Here’s our new expression language (with the name adjusted to signal its growth beyond pure arithmetic):
data ExprC: | trueC | falseC | numC(n :: Number) | plusC(l :: ExprC, r :: ExprC) | multC(l :: ExprC, r :: ExprC) | ifC(c :: ExprC, t :: ExprC, e :: ExprC) end
We need to adjust the pre-desugaring language (ExprExt) as well to account for the new constructs. - We need to modify the parser and desugarer.
Exercise
Modify parse and desugar to work with the extended language. Adjust the datatypes as needed. Be sure to write tests.
Do Now!
There’s one more big change needed. Do you see what it is?
25.2.1 The Interpreter’s Type
trueC
It precludes being able to have a language with pure Booleans. This will have consequences when we get to types [REF].
It means you can perform arithmetic on truth values. This might not sound so surprising: after all, conjunction (and) and disjunction (or) can, after all, be thought in terms of arithmetic. But once you you say truth values are numbers, you can no longer detect if a programmer accidentally subtracts one truth value from another, divides them, and so on.
It isn’t even clear which numbers should represent which truth values. Historically, some languages have made zero represent truth; others have even chosen to use non-negative numbers for truth and negative numbers for falsity. None of these choices is more clearly “correct” than other ones, which suggests we’re really just guessing our way around here.
Most of all, we can’t keep hacking our way out of this situation. How are we going to represent strings or lists? With Gödel numbering? What happens when we get to functions, coroutines, continuations?
The consequence of this decisionOf course, you’re welcome to experiment with different decisions. The beauty of writing little interpreters is you can change what you want and explore the consequences of those changes. is that we will need a way to represent all the possible outcomes from the interpreter.
Do Now!
Try to sketch a representation for yourself.
data Value: | numV(n :: Number) | boolV(b :: Boolean) end
25.2.2 Updating Arithmetic
Finally, we’re ready to augment our interpreter. We can ignore the arithmetic lines, which should be unchanged (because we haven’t changed anything about how we will perform arithmetic), and focus on the new parts of the language.
Do Now!
Right?
Wrong. Because we’ve changed the type of value the interpreter produces, we have to update the rules for arithmetic, too, to reflect that. We can do this quickly, but we’ll do it in a few steps to illustrate a point.
fun interp(e :: ExprC) -> Value: |
cases (ExprC) e: |
| numC(n) => numV(n) |
end |
end |
| plusC(l, r) => numV(interp(l).n + interp(r).n)
25.2.3 Defensive Programming
interp(l).n
fun arith-binop(op :: (Number, Number -> Number), l :: ExprC, r :: ExprC) -> Value: l-v = interp(l) r-v = interp(r) if is-numV(l-v) and is-numV(r-v): numV(op(l-v.n, r-v.n)) else: raise('argument not a number') end end
| plusC(l, r) => arith-binop(lam(x, y): x + y end, l, r) |
| multC(l, r) => arith-binop(lam(x, y): x * y end, l, r) |
Do Now!
Before we move on, let’s ponder one more question. Suppose we could be certain that no other variant would have a field named n. Then, is there any difference between the version that checks is-numV for each of the values and the version that does not?
The version that performs a check in arith-binop is providing the error at the level of the language being implemented. It does not depend on Pyret to perform any checks; furthermore, it can give an error in terms of the interpreted language, using terminology that makes sense to the programmer in that language.
In contrast, the version that delegates the check to Pyret is allowing a meta-error to percolate through. This requires being very certain of how Pyret works, whether it will perform the check at the right time and in the right place, and then halt program execution in the appropriate way. Furthermore, the error message it produces might make no sense to the programmer: Pyret might say “Field n not found”, but to a person using a language with only arithmetic and conditionals, the very term “field” might mean nothing.
25.2.4 Interpreting Conditionals
| trueC => boolV(true) |
| falseC => boolV(false) |
| ifC(cnd, thn, els) => |
ic = interp(cnd) |
if is-boolV(ic): |
if ic.b: |
interp(thn) |
else: |
interp(els) |
end |
else: |
raise('not a boolean') |
end |
25.3 Growing the Conditional Language
A way to compute Boolean values, not just write them as constants. For instance, we should add operations on numbers (such as numeric comparison). This is relatively easy, especially given that we already have arith-binop parameterized over the operation to perform and returning a Value (rather than a number). The bigger nuisance is pushing this through parsing and desugaring. It would instead be better to create generic unary and binary operations and look them up in a table.
It would also be useful to have a way to combine conditionals (negation, disjunction, conjunction).
Exercise
Generalize the parser and desugarer to look up a table of unary and binary operations and represent them uniformly, instead of having a different variant for each one.
(and (not (= x 0)) (/ 1 x)) |
(if (not (= x 0)) |
false |
(/ 1 0)) |
Exercise
Implement negation, conjunction, and disjunction.
Exercise
Define a multi-armed conditional expression that desugars into nested ifs.