15 Evaluating Functions
15.1 Adding Functions to the Language
Let’s start creating a real programming language. We could add intermediate features such as conditionals, but to do almost anything interesting we’re going to need functions or their moral equivalent, so let’s get to it.
Add conditionals to your language. You can either add boolean datatypes or, if you want to do something quicker, add a conditional that treats 0 as false and everything else as true.
What are the important test cases you should write?
15.1.1 Defining Data Representations
A set of definitions suggests no ordering, which means, presumably, any definition can refer to any other. That’s what I intend here, but when you are designing your own language, be sure to think about this.
fun double(x): x + x end
fun quadruple(x): double(double(x)) end
fun const5(_): 5 end
When a function has multiple arguments, what simple but important criterion governs the names of those arguments?
data FunDefC:
 fdC (name :: String, arg :: String, body :: ExprC)
end
What is the body? Clearly, it has the form of an arithmetic expression, and sometimes it can even be represented using the existing ArithC language: for instance, the body of const5 can be represented as numC(5). But representing the body of double requires something more: not just addition (which we have), but also “x”. You are probably used to calling this a variable, but we will not use that term for now. Instead, we will call it an identifier.I promise we’ll return to this issue of nomenclature later [REF].
Anything else?
Finally, let’s look at the body of quadruple. It has yet another new construct: a function application. Be very careful to distinguish between a function definition, which describes what the function is, and an application, which uses it. The argument (or actual parameter) in the inner application of double is x; the argument in the outer application is double(x). Thus, the argument can be any complex expression.
data ExprC: 
 numC (n :: Number) 
 plusC (l :: ExprC, r :: ExprC) 
 multC (l :: ExprC, r :: ExprC) 
 <idCdt> 
end 
 idC (s :: String) 
 appC (f :: String, a :: ExprC) 
fdC("double", "x", plusC(idC("x"), idC("x")))
fdC("quadruple", "x", appC("double", appC("double", idC("x"))))
fdC("const5", "_", numC(5))
Look out! Did you notice that we spoke of a set of function definitions, but chose a list representation? That means we’re using an ordered collection of data to represent an unordered entity. At the very least, then, when testing, we should use any and all permutations of definitions to ensure we haven’t subtly built in a dependence on the order.
15.1.2 Growing the Interpreter
fun interp(e :: ExprC, fds :: List<FunDefC>) > Number: 
cases (ExprC) e: 
 numC(n) => n 
 plusC(l, r) => interp(l, fds) + interp(r, fds) 
 multC(l, r) => interp(l, fds) * interp(r, fds) 
 idC(s) => <idCinterpsubst> 
 appC(f, a) => <appCinterpsubst> 
fun getfundef(name :: String, fds :: List<FunDefC>) 
> FunDefC: 
end 
15.1.3 Substitution
fun subst(with :: ExprC, at :: String, in :: ExprC) 
> ExprC: 
end 
Suppose we want to substitute 3 for the identifier x in the bodies of the three example functions above. What should it produce?
A common mistake is to assume that the result of substituting, e.g., 3 for x in double is fun double(x): 3 + 3 end. This is incorrect. We only substitute at the point when we apply the function, at which point the function’s invocation is replaced by its body. The header enables us to find the function and ascertain the name of its parameter; but only its body participates in evaluation. Examine the use of substitution in the interpreter to see how returning a function definition would result in a type error.
These examples already tell us what to do in almost all the cases. Given a number, there’s nothing to substitute. If it’s an identifier, we haven’t seen an example with a different identifier but you’ve guessed what should happen: it stays unchanged. In the other cases, descend into the subexpressions, performing substitution.
Before we turn this into code, there’s an important case to consider. Suppose the name we are substituting happens to be the name of a function. Then what should happen?
What, indeed, should happen?
There are many ways to approach this question. One is from a design perspective: function names live in their own “world”, distinct from ordinary program identifiers. Some languages (such as C and Common Lisp, in slightly different ways) take this perspective, and partition identifiers into different namespaces depending on how they are used. In other languages, there is no such distinction; indeed, we will examine such languages soon [REF].
For now, we will take a pragmatic viewpoint. Because expressions evaluate to numbers, that means a function name could turn into a number. However, numbers cannot name functions, only symbols can. Therefore, it makes no sense to substitute in that position, and we should leave the function name unmolested irrespective of its relationship to the variable being substituted. (Thus, a function could have a parameter named x as well as refer to another function called x, and these would be kept distinct.)
cases (ExprC) in: 
 numC(n) => in 
 plusC(l, r) => plusC(subst(with, at, l), subst(with, at, r)) 
 multC(l, r) => multC(subst(with, at, l), subst(with, at, r)) 
 appC(f, a) => appC(f, subst(with, at, a)) 
 idC(s) => 
if s == at: 
with 
else: 
in 
end 
end 
Observe that, whereas in the numC case the interpreter returned n, substitution returns in (i.e., the original expression, equivalent at that point to writing numC(n)). Why?
15.1.4 The Interpreter, Resumed
 appC(f, a) => 
fd = getfundef(f, fds) 
subst(a, fd.arg, fd.body) 
Tempting, but wrong.
Do you see why?
 appC(f, a) => 
fd = getfundef(f, fds) 
interp(subst(a, fd.arg, fd.body), fds) 

Okay, that leaves only one case: identifiers. What could possibly be complicated about them? They should be just about as simple as numbers! And yet we’ve put them off to the very end, suggesting something subtle or complex is afoot.
Work through some examples to understand what the interpreter should do in the identifier case.
fun double(x): x + y end
When we substitute 5 for x, this produces the expression 5 + y. So far so good, but what is left to substitute y? As a matter of fact, it should be clear from the very outset that this definition of double is erroneous. The identifier y is said to be free, an adjective that in this setting has negative connotations.
 idC(s) => raise("unbound identifier") 
And that’s it!
cases (List<FunDefC>) fds: 
 empty => raise("couldn't find function") 
 link(f, r) => 
if f.name == name: 
f 
else: 
getfundef(name, r) 
end 
end 
15.1.5 Oh Wait, There’s More!
fun subst(with :: ExprC, at :: String, in :: ExprC)
> ExprC:
...
end
Sticking to surface syntax for brevity, suppose we apply double
to 1 + 2. This would substitute 1 + 2 for each
x, resulting in the following
expression—
fun subst(with :: Number, at :: String, in :: ExprC)
> ExprC:
...
end
In fact, we don’t even have substitution quite right! The version of substitution we have doesn’t scale past this language due to a subtle problem known as “name capture”. Fixing substitution is complex, subtle, and an exciting intellectual endeavor, but it’s not the direction I want to go in here. We’ll instead sidestep this problem in this book. If you’re interested, however, read about the lambda calculus [CITE], which provides the tools for defining substitution correctly.
Modify your interpreter to substitute names with answers, not expressions.
We’ve actually stumbled on a profound distinction in programming
languages. The act of evaluating arguments before substituting them
in functions is called eager application, while that of
deferring evaluation is called lazy—
15.2 From Substitution to Environments
Though we have a working definition of functions, you may feel a slight unease about it. When the interpreter sees an identifier, you might have had a sense that it needs to “look it up”. Not only did it not look up anything, we defined its behavior to be an error! While absolutely correct, this is also a little surprising. More importantly, we write interpreters to understand and explain languages, and this implementation might strike you as not doing that, because it doesn’t match our intuition.
There’s another difficulty with using substitution, which is the
number of times we traverse the source program. It would be nice to
have to traverse only those parts of the program that are actually
evaluated, and then, only when necessary. But substitution traverses
everything—
Does substitution have implications for the time complexity of evaluation?
There’s yet another problem with substitution, which is that it is
defined in terms of representations of the program source. Obviously,
our interpreter has and needs access to the source, to interpret it.
However, other implementations—
15.2.1 Introducing the Environment
The intuition that addresses the first concern is to have the
interpreter “look up” an identifier in some sort of directory. The
intuition that addresses the second concern is to defer the
substitution. Fortunately, these converge nicely in a way that also
addresses the third. The directory records the intent to
substitute, without actually rewriting the program source; by
recording the intent, rather than substituting immediately, we can
defer substitution; and the resulting data structure, which is called
an environment, avoids the need for sourcetosource rewriting
and maps nicely to lowlevel machine representations. Each name
association in the environment is called a
binding.This does not mean our study of
substitution was useless; to the contrary, many tools that work over
programs—
One subtlety is in defining precisely what “the same” means, especially with regards to failure.
Let’s first define our environment data structure. An environment is a collection of names associated with...what?
A natural question to ask here might be what the environment maps names to. But a better, more fundamental, question is: How to determine the answer to the “natural” question?
Remember that our environment was created to defer substitutions. Therefore, the answer lies in substitution. We discussed earlier (Oh Wait, There’s More!) that we want substitution to map names to answers, corresponding to an eager function application strategy. Therefore, the environment should map names to answers.
data Binding:
 bind (name :: String, value :: Number)
end
# An Environment is a List<Binding>
mtenv = []
xtndenv = link
15.2.2 Interpreting with Environments
Now we can tackle the interpreter. One case is easy, but we should revisit all the others:
fun interp(e :: ExprC, nv :: List<Binding>, fds :: List<FunDefC>) 
> Number: 
cases (ExprC) e: 
 numC(n) => n 
end 
end 
 plusC(l, r) => interp(l, nv, fds) + interp(r, nv, fds) 
 multC(l, r) => interp(l, nv, fds) * interp(r, nv, fds) 
 idC(s) => lookup(s, nv) 
Implement lookup.
 appC(f, a) => 
fd = getfundef(f, fds) 
argval = interp(a, nv, fds) 
interp(fd.body, <appCinterpfofenv>, fds) 
xtndenv(bind(fd.arg, argval), nv) 
fun lookup(s :: String, nv :: List<Binding>) > Number: 
cases (List<Binding>) nv: 
 empty => raise("unbound identifier: " + s) 
 link(f, r) => 
if s == f.name: 
f.value 
else: 
lookup(s, r) 
end 
end 
end 
Observe that looking up a free identifier still produces an error, but
it has moved from the interpreter—
check: 
f1 = fdC("double", "x", plusC(idC("x"), idC("x"))) 
f2 = fdC("quadruple", "x", 
appC("double", appC("double", idC("x")))) 
f3 = fdC("const5", "_", numC(5)) 
funs = [f1, f2, f3] 
fun i(e): interp(e, mtenv, funs) end 
i(plusC(numC(5), appC("quadruple", numC(3)))) is 17 
i(multC(appC("const5", numC(3)), numC(4))) is 20 
i(plusC(numC(10), appC("const5", numC(10)))) is (10 + 5) 
i(plusC(numC(10), appC("double", plusC(numC(1), numC(2))))) 
is (10 + 3 + 3) 
i(plusC(numC(10), appC("quadruple", plusC(numC(1), numC(2))))) 
is (10 + 3 + 3 + 3 + 3) 
Spot the bug.
15.2.3 Deferring Correctly
interp(appC("f1", numC(3)), mtenv, 
[fdC("f1", "x", appC("f2", numC(4))), 
fdC("f2", "y", plusC(idC("x"), idC("y")))]) 
raises "unbound identifier: y" 
fun f1(x): f2(4) end
fun f2(y): x + y end
f1(3)
In fact, so will our substitutionbased interpreter!
Why does the substitution process result in an error? It’s because, when we replace the representation of x with the representation of 3 in the representation of f1, we do so in f1 only.This “the representation of” is getting a little annoying, isn’t it? Therefore, I’ll stop saying that, but do make sure you understand why I had to say it. It’s an important bit of pedantry. (Obviously: x is f1’s parameter; even if another function had a parameter named x, that’s a different x.) Thus, when we get to evaluating the body of f2, its x hasn’t been substituted, resulting in the error.
What went wrong when we switched to environments? Watch carefully: this is subtle. We can focus on applications, because only they affect the environment. When we substituted the formal for the value of the actual, we did so by extending the current environment. In terms of our example, we asked the interpreter to substitute not only f2’s substitution in f2’s body, but also the current ones (those for the caller, f1), and indeed all past ones as well. That is, the environment only grows; it never shrinks.
xtndenv(bind(fd.arg, argval), mtenv) 
15.2.4 Scope
The broken environment interpreter above implements what is known as dynamic scope. This means the environment accumulates bindings as the program executes. As a result, whether an identifier is even bound depends on the history of program execution. We should regard this unambiguously as a flaw of programming language design. It adversely affects all tools that read and process programs: compilers, IDEs, and humans.
In contrast, substitution—
15.2.5 How Bad Is It?
To understand the binding structure of your program, you may need to look at the whole program. No matter how much you’ve decomposed your program into small, understandable fragments, it doesn’t matter if you have a free identifier anywhere.
Understanding the binding structure is not only a function of the size of the program but also of the complexity of its control flow. Imagine an interactive program with numerous callbacks; you’d have to track through every one of them, too, to know which binding governs an identifier.
if moonvisible():
f1(10)
else:
f2(10)
end
What happens on cloudy nights?
15.2.6 The TopLevel Scope
(define y 1) (define (f x) (+ x y))
(define y 1) (define (f x) (+ x y)) (define y 2)
(define y 1) (define f (let ((z y)) (lambda (x) (+ x y z)))) (define y 2)
15.2.7 Exposing the Environment
If we were building the implementation for others to use, it would be wise and a courtesy for the exported interpreter to take only an expression and list of function definitions, and invoke our defined interp with the empty environment. This both spares users an implementation detail, and avoids the use of an interpreter with an incorrect environment. In some contexts, however, it can be useful to expose the environment parameter. For instance, the environment can represent a set of predefined bindings: e.g., if the language wishes to provide pi automatically bound to 3.2 (in Indiana).
15.3 Functions Anywhere
Programming languages should be designed not by piling feature on top of feature, but by removing the weaknesses and restrictions that make additional features appear necessary. [REF]
One of the things we stayed coy about when introducing functions (Adding Functions to the Language) is exactly where functions go. We may have suggested we’re following the model of an idealized programming environment, with definitions and their uses kept separate. But, inspired by the Scheme design principle, let’s examine how necessary that is.
Why can’t functions definitions be expressions? In our current arithmeticcentric language we face the uncomfortable question “What value does a function definition represent?”, to which we don’t really have a good answer. But a real programming language obviously computes more than numbers, so we no longer need to confront the question in this form; indeed, the answer to the above can just as well be, “A function value”. Let’s see how that might work out.
(+ 2 ([define (f x) (* x 3)] 4))
15.3.1 Functions as Expressions and Values
data ExprC: 
 numC (n :: Number) 
 plusC (l :: ExprC, r :: ExprC) 
 multC (l :: ExprC, r :: ExprC) 
 idC (s :: String) 
end 
 fdC (name :: String, arg :: String, body :: ExprC) 
We might consider more refined datatypes that split function definitions apart from other kinds of expressions. This amounts to trying to classify different kinds of expressions, which we will return to when we study types Checking Program Invariants Statically: Types.
 appC (f :: ExprC, a :: ExprC) 
fun interp(e :: ExprC, nv :: List<Binding>): 
cases (ExprC) e: 
 numC(n) => n 
 plusC(l, r) => interp(l, nv) + interp(r, nv) 
 multC(l, r) => interp(l, nv) * interp(r, nv) 
 idC(s) => lookup(s, nv) 
end 
end 
 fdC(_, _, _) => e 
What impact does this have on the interpreter’s return type?
 appC(f, a) => 
fd = interp(f, nv) 
argval = interp(a, nv) 
interp(fd.body, xtndenv(bind(fd.arg, argval), mtenv)) 
check:
dbl = fdC("dbl", "x", plusC(idC("x"), idC("x")))
quad = fdC("quad", "x", appC(dbl, appC(dbl, idC("x"))))
c5 = fdC("c5", "_", numC(5))
fun i(e): interp(e, mtenv) end
i(plusC(numC(5), appC(quad, numC(3)))) is 17
i(multC(appC(c5, numC(3)), numC(4))) is 20
i(plusC(numC(10), appC(c5, numC(10)))) is 15
i(plusC(numC(10), appC(dbl, plusC(numC(1), numC(2)))))
is 16
i(plusC(numC(10), appC(quad, plusC(numC(1), numC(2)))))
is 22
end
15.3.2 A Small Improvement
Is there any part of our interpreter definition that we never use?
 fdC (arg :: String, body :: ExprC) 
Do you see what else you need to change?
 fdC(_, _) => e 
The type of our environment is wrong, as is that of lookup. Construct an example that demonstrates this.
15.3.3 Nesting Functions
appC(fdC("x", fdC("x", plusC(idC("x"), idC("x")))), numC(4))
fdC("x", plusC(idC("x"), idC("x")))
appC(fdC("x", fdC("y", plusC(idC("x"), idC("y")))), numC(4))
fdC("y", plusC(idC("x"), idC("y")))
15.3.4 Nested Functions and Substitution
appC(fdC("x", fdC("x", plusC(idC("x"), idC("x")))), numC(4))
fdC("x", plusC(idC("x"), idC("x")))
appC(fdC("x", fdC("y", plusC(idC("x"), idC("y")))), numC(4))
fdC("y", plusC(numC(4), idC("y")))
In other words, we’re again failing to faithfully capture what substitution would have done. A function value needs to remember the substitutions that have already been applied to it. Because we’re representing substitutions using an environment, a function value therefore needs to be bundled with an environment. This resulting data structure is called a closure.
15.3.5 An Answer Type
data Value:
 numV (n :: Number)
 closV (f :: ExprC, e :: List<Binding>) # ExprC must be an fdC
end
fun interp(e :: ExprC, nv :: List<Binding>) > Value: 
cases (ExprC) e: 
end 
end 
 numC(n) => numV(n) 
 plusC(l, r) => plusv(interp(l, nv), interp(r, nv)) 
 multC(l, r) => multv(interp(l, nv), interp(r, nv)) 
fun plusv(v1, v2): numV(v1.n + v2.n) end
fun multv(v1, v2): numV(v1.n * v2.n) end
 idC(s) => lookup(s, nv) 
 fdC(_, _) => closV(e, nv) 
Write the interpreter for function application.
 appC(f, a) => 
clos = interp(f, nv) 
argval = interp(a, nv) 
interp(clos.f.body, 
xtndenv(bind(clos.f.arg, argval), 
clos.e)) 
Observe that the argument to interp is clos.e rather than mtenv. Write a program that illustrates the difference.
If we now switch back to using substitution, will we encounter any problems?
Yes, we will. We’ve defined substitution to replace program text in other program text. Strictly speaking we can no longer do this, because Value terms cannot be contained inside ExprC ones. That is, substitution is predicated on the assumption that the type of answers is a form of syntax. It is actually possible to carry through a study of programming under this assumption, but we won’t take that path here.
15.3.6 Sugaring Over Anonymity
Now let’s get back to the idea of naming functions, which has evident value for program understanding. Observe that we do have a way of naming things: by passing them to functions, where they acquire a local name (that of the formal parameter). Anywhere within that function’s body, we can refer to that entity using the formal parameter name.
fun double(x): x + x end
double(10)
double = fun (x): x + x end
double(10)
(let ([double (lambda (x) (+ x x))]) (double 10))
fun something():
double = fun (x): x + x end
double(10)
end
(define (double x) (+ x x)) (define (quadruple x) (double (double x))) (quadruple 10)
(let ([double (lambda (x) (+ x x))]) (let ([quadruple (lambda (x) (double (double x)))]) (quadruple 10)))
(let ([quadruple (lambda (x) (double (double x)))]) (let ([double (lambda (x) (+ x x))]) (quadruple 10)))
fun loopforever(): loopforever() end
loopforever()
loopforever = fun(): loopforever() end loopforever()
(fun (loopforever): loopforever() end)(fun (): loopforever() end)
Therefore, Pyret’s = is clearly doing something more than just textual substitution: it is also “tying the loop” for recursive definitions. We can understand this either in terms of the Ycombinator (Shrinking the Language [EMPTY]) or study recursion directly (Recursion and Cycles).
15.4 Functions and Predictability
We began (Adding Functions to the Language) with a language where at all application points, we knew exactly which function was going to be invoked (because we knew its name, and the name referred to one of a fixed global set). These are known as firstorder functions. In contrast, we later moved to a language (Functions Anywhere) with firstclass functions: those that had the same status as any other value in the language.
This transition gave us a great deal of new flexiblity. For instance, we saw (Sugaring Over Anonymity) that some seemingly necessary language features could instead be implemented just as syntactic sugar; indeed, with true firstclass functions, we can define all of computation (Shrinking the Language [EMPTY]). So what’s not to like?
The subtle problem is that whenever we increase our expressive power, we correspondingly weaken our predictive power. In particular, when confronted with a particular function application in a program, the question is, can we tell precisely which function is going to be invoked at this point? With firstorder functions, yes; with higherorder functions, this is undecidable. Having this predictive power has many important consequences: a compiler can choose to inline (almost) every function application; a programming environment can give substantial help about which function is being called at that point; a security analyzer can definitively rule out known bad functions, thereby reducing the number of useless alerts it generates. Of course, with higherorder functions, all these operations are still sometimes possible; but they are not always possible, and how possible they are depends on the structure of the program and the cleverness of tools.
With higherorder functions, why is determining the precise function at an application undecidable?
Why does the above reference to inlining say “almost”?