On this page:
2.1 Values
2.2 Expressions
2.3 Recipe for Primitive Values
2.3.1 Abstracting Common Parts
2.4 Evaluating by Reducing Expressions
2.5 Recipe for Simple Data
2.5.1 Functions Over Mixed, Related Data
2.5.2 More Refined Comparisons
2.6 Recipe for Recursive Data
2.7 Parameterizing Data Definitions
2.7.1 List and the list Abbreviation
2.8 Abstracting Common Parts of List Functions
2.9 Functions that Return Functions
2.10 More and Mutually Recursive Data
2.10.1 Other Recursive Patterns
2.10.2 Mutually Recursive Data
2.10.3 A Note on Parametric Data Definitions
2.11 Naming Intermediate Values
2.12 More to Learn

2 Programming in Pyret

    2.1 Values

    2.2 Expressions

    2.3 Recipe for Primitive Values

      2.3.1 Abstracting Common Parts

    2.4 Evaluating by Reducing Expressions

    2.5 Recipe for Simple Data

      2.5.1 Functions Over Mixed, Related Data

      2.5.2 More Refined Comparisons

    2.6 Recipe for Recursive Data

    2.7 Parameterizing Data Definitions

      2.7.1 List and the list Abbreviation

    2.8 Abstracting Common Parts of List Functions

    2.9 Functions that Return Functions

    2.10 More and Mutually Recursive Data

      2.10.1 Other Recursive Patterns

      2.10.2 Mutually Recursive Data

      2.10.3 A Note on Parametric Data Definitions

    2.11 Naming Intermediate Values

    2.12 More to Learn

Programs exist to compute answers, which we will call values. These values will represent all sorts of interesting phenomena: they may be a frame from the next hit movie, a list of Web pages matching your query, a confirmation code for an upcoming trip, or a prediction of tomorrow’s temperature. Ultimately, they’re all values and our goal here will be to learn how to write programs that compute them.

You can find the book, which uses a variant of Scheme rather than Pyret, for free at http://htdp.org/.

Here, we’ll introduce values and ways of constructing them, and then move on to the problem of starting from a blank file and building up to a program that creates the answers we want. While there’s no silver bullet answer to the problem, the Design Recipe is a methodology for writing programs, described in the book How to Design Programs (HtDP), that tackles this problem for a number of programming patterns. After addressing some basics of values and expressions, we’ll explore using the Design Recipe to program in Pyret.

2.1 Values

The simplest programs are ones whose value we already know.What use is it if we already know it? You’ll find out later: (part "values-for-tests"). Here, then, is the simplest program you can write:

3

If you ask Pyret for the value of this program, it will unsurprisingly tell you that it’s 3. Programs can also refer to text, called a string:

"Hello, Tara!"

The value of this program, likewise, is "Hello, Tara!".

Okay, so we’ve seen numbers and strings. There are many more kinds of values in Pyret, including images. We’ll get to them soon.

2.2 Expressions

Obviously, we’re not very interested in writing programs whose answers we already know. Therefore, let’s start the first real step of our journey: writing expressions that take one or more steps to produce an answer.

Do Now!

What is this program’s value?Yes, you must put spaces around +.

1 + 2

As you might have guessed, this program’s value is 3. In your head, you probably applied the rules of algebra to compute this program’s value; in fact, Pyret follows the same rules. Everything you know about how to compute the answer of an algebraic expression applies here! Similarly, once I tell you that using + on string concatenates them, you can easily see that

"will." + "i." + "am"

evaluates to "will.i.am".

2.3 Recipe for Primitive Values

The first step in the design recipe is to analyze the values the program is working with, both for its input and for its answers. The recipe works through the problem starting from the definition of the input and finishes with the implementation. The easiest recipe to start with is the one for primitive values, like the numbers and strings we just saw.

For our first example, we’ll take something simple enough that we could understand it in just about any context: calculating hourly wage with overtime.

To keep things simple, we’ll assume a constant hourly rate of $10/hour, so the only parameter is the number of hours worked. Then we proceed through several steps.

  1. Understand the Data Definition

    The incoming hours worked are Numbers, and there are two interesting kinds of numbers: those greater than 40 and those less than or equal to 40.

  2. Function Header and Purpose

    Next, to get started with the function, we write down the name of the function, the types of the arguments, and the type of value that will be returned. We also write a short English sentence, using the doc: keyword, describing the function:

    fun hours-to-wages(hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime" end

    This says that hours-to-wages will consume a Number and return another Number. The description informs the reader that the interpretations of the numbers for argument and return are hours and wages, respectively.

  3. Write Examples of Calling the Function (Test Cases)

    Next, we write a few test cases. This helps us think through what the implementation will look like, lets us know of any problematic cases that might come up, and gives us something to run to confirm that our implementation does what we expect when we’re done.

    These examples of calls go in a where: block at the end of the function, and the simplest ones use is to check that the output is an expected value:

    fun hours-to-wages(hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime, at $10/hr base" where: hours-to-wages(40) is 400 hours-to-wages(40.5) is 407.5 hours-to-wages(41) is 415 hours-to-wages(0) is 0 hours-to-wages(45) is 475 hours-to-wages(20) is 200 end

    Examples should cover at least all the different cases mentioned in the data definition, and also to test common extremes and base cases. In this case, for example, it’s useful to note that the 40th hour doesn’t count towards overtime, but the 41st does.

  4. Implement the Body

    Now all that’s left is to fill a Pyret expression into the body of the function that computes the correct answer based on the inputs.

    In the case of this running example, the data definition has a condition in it: whether or not the hours have exceeded 40. Conditional data usually implies a conditional expression in code. That means an if expression to test which case of the data definition the function is handling. Then, the bodies of the branches can be filled in with the appropriate answers.

    fun hours-to-wages(hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime, at $10/hr base" if hours < 41: hours * 10 else if hours >= 41: (40 * 10) + ((hours - 40) * (10 * 1.5)) end where: hours-to-wages(40) is 400 hours-to-wages(40.5) is 407.5 hours-to-wages(41) is 415 hours-to-wages(0) is 0 hours-to-wages(45) is 475 hours-to-wages(20) is 200 end

  5. Run the Tests, and Fix Mistakes

    Now we can click Run, and get feedback on if there are any mistakes in the definition we wrote. If any tests fail, we have to figure out if we misunderstood the test, or mis-implemented the program. It can require many rounds of back-and-forth before settling on a good set of tests and an agreeing implementation that does precisely what we intend.

    Do Now!

    Run this example and read the output. Did all the tests pass?

    In this case, there’s a mismatch between the implementation and the tests: the second test fails. When this happens, we have to ask: did we get the test wrong or the implementation? In this case, it happens to be the implementation; the first condition shouldn’t include the case where hours is between 40 and 41, which should be handled by the second case.

    Since we got something wrong around a boundary condition, it’s probably a good idea to add one more test to make sure we didn’t screw up in the other direction. This is the correct implementation, with a new test to double-check things:

    fun hours-to-wages(hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime, at $10/hr base" if hours <= 40: # this line changed hours * 10 else if hours > 40: # this line changed (40 * 10) + ((hours - 40) * (10 * 1.5)) end where: hours-to-wages(40) is 400 hours-to-wages(39.5) is 395 # this test was added hours-to-wages(40.5) is 407.5 # this test is correct now hours-to-wages(41) is 415 hours-to-wages(0) is 0 hours-to-wages(45) is 475 hours-to-wages(20) is 200 end

2.3.1 Abstracting Common Parts

The hours-to-wages function always assumes an hourly rate of $10/hour. We can change it to accommodate a different hourly rate, say $20/hour, by changing the constant 10 where it appears representing the hourly rate:

fun hours-to-wages-20(hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime, at $20/hr base" if hours <= 40: hours * 20 else if hours > 40: (40 * 20) + ((hours - 40) * (20 * 1.5)) end end

We could make another copy of the function for $30/hour workers, and so on. However, it’s also possible, and quite straightforward, to change the function to work for any hourly wage. We note the shared parts across the implementation and lift them out, adding a new parameter to the function.

fun hours-to-wages-at-rate(rate :: Number, hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime, at the given rate" if hours <= 40: hours * rate else if hours > 40: (40 * rate) + ((hours - 40) * (rate * 1.5)) end end

Note that we’ll take the convention of adding new parameters at the beginning of the argument list. We simply add the new parameter (with an appropriate annotation), and replace all instances of the constant with it.

Exercise

Write a function called has-overtime that takes a number of hours and returns true if the number of hours is greater than 40 and false otherwise.

Exercise

Working negative hours is nonsense. Write a version of hours-to-wages that uses the raise function to throw an error if fewer than 0 hours are reported. Use the raises form to test for it (read about raises in the Pyret documentation).

Exercise

Write a function called hours-to-wages-ot that takes a number of hours, an hourly rate, and an overtime threshold, and produces the total pay. Any hours worked beyond the overtime threshold should be credited at 1.5 times the normal rate of pay, as before.

2.4 Evaluating by Reducing Expressions

There’s an imporant note about Pyret going on in this example: There are no return statements in Pyret. The function body (the if expression here) evaluates when called, and the result of the body’s evaluation is the result of the function call.

We can show this evaluation visually, with an example of a call:

The first step is to substitute the provided value (45) in for the argument (in this case the identifier hours) everywhere it appears.

hours-to-wages(45) => if 45 <= 40: hours * 10 else if 45 > 40: (40 * 10) + ((45 - 40) * 15) end

Next, the conditional part of the if expression is evaluated, which in this case is false.

=> if false: hours * 10 else if 45 > 40: (40 * 10) + ((45 - 40) * 15) end

Since the condition is false, the next branch is tried.

=> if false: hours * 10 else if true: (40 * 10) + ((45 - 40) * 15) end

Since the condition is true, the expression reduces to the body of that branch. After that, it’s just arithmetic.

=> (40 * 10) + ((45 - 40) * 15)

=> 400 + (5 * 15) => 475

This style of reduction is the best way to think about the evaluation of Pyret function calls and expressions. The whole body of the function takes steps that simplify it, much like simplifying expressions in algebra. We’ll use this reduction-style example later in the notes, and you can use it yourself if you want to try and work through the evaluation of a Pyret function by hand (or in your head).

Do Now!

Write out the evaluation steps for

hours-to-wages-at-rate(15, 45)

for the definition of hours-to-wages-at-rate from Abstracting Common Parts.

When you look at at a test case in Pyret, like

hours-to-wages(45) is 475

the same idea of algebraic simplification applies. This will reduce to

475 is 475

which is clearly true, and is causes it to be registered as a passing test. When a check or where block runs, it evaluates each is test and compares it to the answer in this same way.

2.5 Recipe for Simple Data

This definition of primitive versus compound data has a little bit of a fuzzy border: we could think of strings as lists of characters, for instance. Usually the kind of program we’re writing makes it clear what data we are treating as “primitive” values and what data we are breaking down further.

Lots of interesting programs work over far more than primitive data like Numbers and Strings. They combine pieces of primitive data into richer datatypes, and the recipe has a natural extension to dealing with those cases.

Our example will be calculating the distance from the origin of a 2-dimensional point, and we’ll adapt the initial recipe as needed for it.

  1. Understand the Data Definition and Write Examples

    A 2D point has an x and y coordinate, which are both numbers. We’ll use a Pyret data definition to represent points that has both of these fields:

    data Point: | point(x :: Number, y :: Number) end

    When working with more complicated data, it’s useful to write down examples of the data itself along with the definition. These examples can be useful for thinking through what kinds of values can be represented by the data definition, and also serve as useful inputs to tests later. Some examples of points are:

    point(0, 0) point(3, 4) point(-3, -4)

  2. Write Down the Function Header, Contract, and Purpose

    This step is basically unchanged from primitive data, just adapted for the new definition. We’re calculating the distance from the origin to a given point, so the function will consume a point as input and produce a Number as output. There aren’t any extra arguments:

    fun distance-to-origin(p :: Point) -> Number: doc: "Produces the distance from the origin to the given point." end

  3. Write Example Calls to the Function

    Next, we write a few test cases. This helps us think through what the implementation will look like, lets us know of any problematic cases that might come up, and gives us something to run to confirm that our implementation does what we expect when we’re done. Our examples of the data definition from earlier can come in handy here as test inputs.

    fun distance-to-origin(p :: Point) -> Number: doc: "Produces the distance from the origin to the given point." where: distance-to-origin(point(0, 0)) is 0 distance-to-origin(point(3, 4)) is 5 distance-to-origin(point(-3, -4)) is 5 end

  4. Write Down a Template for the Data Definition

    This is a new step beyond what’s needed for primitive data. When values get more structure, it’s useful to have a structure to program them with. Functions that work over Points will need access to the x and y fields. For breaking apart instances of data defined by data, Pyret provides an expression called cases. Here’s what a cases expression looks like over one of our example Points:

    cases (Point) point(0, 1): | point(x, y) => x + y end => 0 + 1

    You may be wondering why this is called cases when there’s only one case here. A more general version will be shown soon. The cases expression matches the constructor of the given value (in this case, point(0, 1)) with the case (a case is everything after a |) that has the same constructor name. Then, the fields of the value are substituted for the names in the case pattern (in this case, x and y), and the whole cases expression reduces to the body of that case.

    The annotation in parentheses, (Point), checks that the value you provide is correct, and dictates which cases make sense to use—only those from the data definition.

    Whenever we work with Points (or other compound data), a cases expression is a good way to get at the data inside it in a principled way. Most functions over Points will have this structure:

    #| fun point-function(p :: Point) -> ???: cases (Point) p: | point(x, y) => ... x ... ... y ... end end |#

    That is, we don’t know what it will produce, but the function will consume a point, use cases to inspect the point value, and then do something with the x and y fields. The precise operation isn’t clear until we look at the particular function we’re trying to implement, but one or both of those values will almost certainly be involved.

    So, the next step is to copy the template into the definition we’re building:

    fun distance-to-origin(p :: Point) -> Number: doc: "Produces the distance from the origin to the given point." cases (Point) p: | point(x, y) => ... x ... ... y ... end where: distance-to-origin(point(0, 0)) is 0 distance-to-origin(point(3, 4)) is 5 distance-to-origin(point(-3, -4)) is 5 end

    Now to fill in the function body, we just need to figure out what the ... parts should be.

  5. Fill in the Function Body

    With a solid understanding of the function, we can now fill in the body with an implementation that gives the answers our tests expect. Here, we just use the standard definition of distance, computed by Pyret’s num-sqrt function and the * and + operators.

    fun distance-to-origin(p :: Point) -> Number: doc: "Produces the distance from the origin to the given point." cases (Point) p: | point(x, y) => num-sqrt((x * x) + (y * x)) end where: distance-to-origin(point(0, 0)) is 0 distance-to-origin(point(3, 4)) is 5 distance-to-origin(point(-3, -4)) is 5 end

  6. Run the Tests and Fix Mistakes

    As before, we run the tests to figure out any mistakes we may have made, both in our understanding of the examples and in the implementation itself.

    Do Now!

    Run the example above and fix any mistakes.

2.5.1 Functions Over Mixed, Related Data

Data definitions can define more than one kind of constructor: Point just has one (point). We could extend the definition to handle polar points:

data Point: | point(x :: Number, y :: Number) | polar(theta :: Number, magnitude :: Number) end

Now, the use of cases is a little more obvious, in distinguishing between these multiple constructors, which we often refer to as variants:

cases (Point) polar(0, 5): | point(x, y) => x | polar(theta, magnitude) => magnitude end => 5

cases (Point) point(1, 2): | point(x, y) => x | polar(theta, magnitude) => magnitude end => 1

2.5.2 More Refined Comparisons

Sometimes, a direct comparison via is isn’t enough for testing. We saw raises in the last section for testing errors. However, when doing some computations, especially involving math with approximations, we want to ask a different question. For example, consider these tests for distance-to-origin:

check: distance-to-origin(point(1, 1)) is ??? end

What can we check here? Typing this into the REPL, we can find that the answer prints as 1.4142135623730951. That’s an approximation of the real answer, which Pyret cannot represent exactly. But it’s hard to know that this precise answer, to this decimal place, and no more, is the one we should expect up front, and thinking through the answers is supposed to be the first thing we do!

Since we know we’re getting an approximation, we can really only check that the answer is roughly correct, not exactly correct. If we can check that the answer to distance-to-origin(point(1, 1)) is around, say, 1.41, and can do the same for some similar cases, that’s probably good enough for many applications, and for our purposes here. If we were calculating orbital dynamics, we might demand higher precision, but note that we’d still need to pick a cutoff! Testing for inexact results is a necessary task.

Let’s first define what we mean by “around” with one of the most precise ways we can, a function:

fun around(actual :: Number, expected :: Number) -> Boolean: doc: "Return whether actual is within 0.01 of expected" num-abs(actual - expected) < 0.01 where: around(5, 5.01) is true around(5.01, 5) is true around(5.02, 5) is false around(num-sqrt(2), 1.41) is true end

The is form now helps us out. There is special syntax for supplying a user-defined function to use to compare the two values, instead of just checking if they are equal:

check: 5 is%(around) 5.01 num-sqrt(2) is%(around) 1.41 distance-to-origin(point(1, 1)) is%(around) 1.41 end

Adding %(something) after is changes the behavior of is. Normally, it would compare the left and right values for equality. If something is provided with %, however, it instead passes the left and right values to the provided function (in this example around). If the provided function produces true, the test passes, if it produces false, the test fails. This gives us the control we need to test functions with predictable approximate results.

Exercise

Extend the definition of distance-to-origin to include polar points.

Exercise

This might save you a Google search: polar conversions. Use the design recipe to write x-component and y-component, which return the x and y Cartesian parts of the point (which you would need, for example, if you were plotting them on a graph). Read about num-sin and other functions you’ll need at the Pyret number documentation.

Exercise

Write a data definition called Pay for pay types that includes both hourly employees, whose pay type includes an hourly rate, and salaried employees, whose pay type includes a total salary for the year. Use the design recipe to write a function called expected-weekly-wages that takes a Pay, and returns the expected weekly salary: the expected weekly salary for an hourly employee assumes they work 40 hours, and the expected weekly salary for a salaried employee is 1/52 of their salary.

2.6 Recipe for Recursive Data

Sometimes, a data definition has a piece that refers back to itself. For example, a linked list of numbers:

data NumList: | nl-empty | nl-link(first :: Number, rest :: NumList) end

Moving on to defining examples, we can talk about empty lists:

The nl- stands for NumList. This avoids using empty, which we’ll see later is already taken by Pyret.

nl-empty

We can represent short lists, like a sequence of two 4’s:

nl-link(4, nl-link(4, nl-empty))

Since these are created with constructors from data, we can use cases with them:

cases (NumList) nl-empty: | nl-empty => "empty!" | nl-link(first, rest) => "not empty" end => "empty!"

cases (NumList) nl-link(1, nl-link(2, nl-empty)): | nl-empty => "empty!" | nl-link(first, rest) => first end => 1

This describes a qualitatively different kind of data than something like a Point, which is only made up of other kinds of simple data. In fact, in contrast to something like a Point, the size of NumLists is unbounded or arbitrarily-sized. Given a NumList, there is an easy way to make a new, larger one: just use nl-link. So, we need to consider larger lists:

nl-link(1, nl-link(2, nl-link(3, nl-link(4, nl-link(5, nl-link(6, nl-link(7, nl-link(8, nl-empty))))

Let’s try to write a function contains-3, which returns true if the NumList contains the value 3, and false otherwise.

First, our header:

fun contains-3(nl :: NumList) -> Boolean: doc: "Produces true if the list contains 3, false otherwise" end

Next, some tests:

fun contains-3(nl :: NumList) -> Boolean: doc: "Produces true if the list contains 3, false otherwise" where: contains-3(nl-empty) is false contains-3(nl-link(3, nl-empty)) is true contains-3(nl-link(1, nl-link(3, nl-empty))) is true contains-3(nl-link(1, nl-link(2, nl-link(3, nl-link(4, nl-empty))))) is true contains-3(nl-link(1, nl-link(2, nl-link(5, nl-link(4, nl-empty))))) is false end

Next, we need to fill in the body with the template for a function over NumLists. We can start with the analogous template using cases we had before:

fun contains-3(nl :: NumList) -> Boolean: doc: "Produces true if the list contains 3, false otherwise" cases (NumList) nl: | nl-empty => ... | nl-link(first, rest) => ... first ... ... rest ... end end

An empty list doesn’t contain the number 3, surely, so the answer must be false in the nl-empty case. In the nl-link case, if the first element is 3, we’ve successfully answered the question. That only leaves the case where the argument is an nl-link and the first element does not equal 3:

fun contains-3(nl :: NumList) -> Boolean: cases (NumList) nl: | nl-empty => false | nl-link(first, rest) => if first == 3: true else: # handle rest here end end end

Since we know rest is a NumList (based on the data definition), we can use a cases expression to work with it. This is sort of like filling in a part of the template again:

fun contains-3(nl :: NumList) -> Boolean: cases (NumList) nl: | nl-empty => false | nl-link(first, rest) => if first == 3: true else: cases (NumList) rest: | nl-empty => ... | nl-link(first-of-rest, rest-of-rest) => ... first-of-rest ... ... rest-of-rest ... end end end end

If the rest was empty, then we haven’t found 3 (just like when we checked the original argument, nl). If the rest was a nl-link, then we need to check if the first thing in the rest of the list is 3 or not:

fun contains-3(nl :: NumList) -> Boolean: cases (NumList) nl: | nl-empty => false | nl-link(first, rest) => if first == 3: true else: cases (NumList) rest: | nl-empty => false | nl-link(first-of-rest, rest-of-rest) => if first-of-rest == 3: true else: # fill in here ... end end end end end

Since rest-of-rest is a NumList, we can fill in a cases for it again:

fun contains-3(nl :: NumList) -> Boolean: cases (NumList) nl: | nl-empty => false | nl-link(first, rest) => if first == 3: true else: cases (NumList) rest: | nl-empty => false | nl-link(first-of-rest, rest-of-rest) => if first-of-rest == 3: true else: cases (NumList) rest-of-rest: | nl-empty => ... | nl-link(first-of-rest-of-rest, rest-of-rest-of-rest) => ... first-of-rest-of-rest ... ... rest-of-rest-of-rest ... end end end end end end

See where this is going? Not anywhere good. We can copy this cases expression as many times as we want, but we can never answer the question for a list that is just one element longer than the number of times we copy the code.

So what to do? We tried this approach of using another copy of cases based on the observation that rest is a NumList, and cases provides a meaningful way to break apart a NumList; in fact, it’s what the recipe seems to lead to naturally.

Let’s go back to the step where the problem began, after filling in the template with the first check for 3:

fun contains-3(nl :: NumList) -> Boolean: cases (NumList) nl: | nl-empty => false | nl-link(first, rest) => if first == 3: true else: # what to do with rest? end end end

We need a way to compute whether or not the value 3 is contained in rest. Looking back at the data definition, we see that rest is a perfectly valid NumList, simply by the definition of nl-link. And we have a function (or, most of one) whose job is to figure out if a NumList contains 3 or not: contains-3. That ought to be something we can call with rest as an argument, and get back the value we want:

fun contains-3(nl :: NumList) -> Boolean: cases (NumList) nl: | nl-empty => false | nl-link(first, rest) => if first == 3: true else: contains-3(rest) end end end

And lo and behold, all of the tests defined above pass. It’s useful to step through what’s happening when this function is called. Let’s look at an example:

contains-3(nl-link(1, nl-link(3, nl-empty)))

First, we substitute the argument value in place of nl everywhere it appears; that’s just the usual rule for function calls.

=> cases (NumList) nl-link(1, nl-link(3, nl-empty)): | nl-empty => false | nl-link(first, rest) => if first == 3: true else: contains-3(rest) end end

Next, we find the case that matches the constructor nl-link, and substitute the corresponding pieces of the nl-link value for the first and rest identifiers.

=> if 1 == 3: true else: contains-3(nl-link(3, nl-empty)) end

Since 1 isn’t 3, the comparison evaluates to false, and this whole expression evaluates to the contents of the else branch.

=> if false: true else: contains-3(nl-link(3, nl-empty)) end => contains-3(nl-link(3, nl-empty))

This is another function call, so we substitute the value nl-link(3, nl-empty), which was the rest field of the original input, into the body of contains-3 this time.

=> cases (NumList) nl-link(3, nl-empty): | nl-empty => false | nl-link(first, rest) => if first == 3: true else: contains-3(rest) end end

Again, we substitute into the nl-link branch.

=> if 3 == 3: true else: contains-3(nl-empty) end

This time, since 3 is 3, we take the first branch of the if expression, and the whole original call evaluates to true.

=> if true: true else: contains-3(nl-empty) end => true

An interesting feature of this trace through the evaluation is that we reached the expression contains-3(nl-link(3, nl-empty)), which is a normal call to contains-3; it could even be a test case on its own. The implementation works by doing something (checking for equality with 3) with the non-recursive parts of the datum, and combining that result with the result of the same operation (contains-3) on the recursive part of the datum. This idea of doing recursion with the same function on self-recursive parts of the datatype lets us extend our template to handle recursive positions.

The simple design recipe dictated this as the template:

#| fun num-list-fun(nl :: NumList) -> ???: cases (NumList) nl: | nl-empty => ... | nl-link(first, rest) => ... first ... ... rest ... end end |#

However, this template doesn’t give much guidance with what to do with the rest field. We will extend the template with the following rule: each self-recursive position in the data definition becomes a self-recursive function call in the template. So the new template looks like:

#| fun num-list-fun(nl :: NumList) -> ???: cases (NumList) nl: | nl-empty => ... | nl-link(first, rest) => ... first ... ... num-list-fun(rest) ... end end |#

To handle recursive data, the design recipe just needs to be modified to have this extended template. When you see a recursive data definition (of which there will be many when programming in Pyret), you should naturally start thinking about what the recursive calls will return and how to combine their results with the other, non-recursive pieces of the datatype.

Exercise

Use the design recipe to write a function contains-n that takes a NumList and a Number, and returns whether that number is in the NumList.

Exercise

Use the design recipe to write a function sum that takes a NumList, and returns the sum of all the numbers in it. The sum of the empty list is 0.

Exercise

Use the design recipe to write a function remove-3 that takes a NumList, and returns a new NumList with any 3’s removed. The remaining elements should all be in the list in the same order they were in the input.

Exercise

Write a data definition called NumListList that represents a list of NumLists, and use the design recipe to write a function sum-of-lists that takes a NumListList and produces a NumList containing the sums of the sub-lists.

2.7 Parameterizing Data Definitions

If we wanted to define a list of Strings in the same style we defined a list of numbers, we might write:

data StringList: | sl-empty | sl-link(first :: String, rest :: StringList) end

Similarly, the exercises from Recipe for Recursive Data had you define a representation of lists of NumLists. It would be straightfoward, if tedious, to do the same for lists of Booleans, lists of Pay objects, and so on.

Each of these forms has the same structure:

data ??List: | xx-empty | xx-link(first :: ??, rest :: ??List) end

Where the ?? and xx are filled in with application-specific names like Num and nl, or String and sl. It would be useful to not require a new definition for each kind of data; repeating ourselves is generally good to avoid in programming.

Pyret has special syntax for this kind of data definition:

data AList<a>: | a-empty | a-link(first :: a, rest :: AList<a>) end

We say that the header, data AList<a>, creates a data definition that is parametric or polymorphic over a [to learn more about this, see Parametric Polymorphism]. At the time of defining AList, we have not yet decided what kind of elements it will hold, and defer that decision until we create a particular AList whose contents we know. At that point, we can instantiate different kinds of lists by using <> to fill in an appropriate annotation for the contents of a particular AList:

ls :: AList<String> = a-link("a", a-link("b", a-link("c", a-empty))) lls :: AList<AList<String>> = a-link(a-link("a", a-empty), a-link(a-link("b", a-link("c", a-empty)), a-empty)) ln :: AList<Number> = a-link(1, a-link(2, a-empty)) lp :: AList<Point> = a-link(point(1, 2), a-link(polar(3, 45), a-empty))

We can call types like AList<Number> instantiations of the polymorphic AList type. We can rewrite the NumList functions we defined before in terms of ALists, like contains-3:

fun contains-3(nl :: AList<Number>) -> Boolean: doc: "Returns true if 3 is in the list, false otherwise" cases (AList<Number>) nl: | a-empty => false | a-link(first, rest) => if first == 3: true else: contains-3(rest) end end where: contains-3(a-empty) is false contains-3(a-link(3, a-empty)) is true contains-3(a-link(1, a-link(3, a-empty))) is true contains-3(a-link(1, a-link(2, a-link(3, a-link(4, a-empty))))) is true contains-3(a-link(1, a-link(2, a-link(5, a-link(4, a-empty))))) is false end

The only difference is the use of AList<Number> instead of NumList in the header and the cases expression (and the renaming of nl- to a-).

Let’s consider another example; calculating the length of a list of numbers. We start with the header as always:

fun length(nl :: AList<Number>) -> Number: doc: "Returns the number of elements in the list." end

Next we add examples:

fun length(nl :: AList<Number>) -> Number: doc: "Returns the number of elements in the list." where: length(a-empty) is 0 length(a-link(8, a-empty)) is 1 length(a-link(22, a-link(43, a-empty))) is 1 end

Next, we add the template:

fun length(nl :: AList<Number>) -> Number: doc: "Returns the number of elements in the list." cases (AList<Number>) nl: | a-empty => ... | a-link(first, rest) => ... first ... ... length(rest) ... end where: length(a-empty) is 0 length(a-link(8, a-empty)) is 1 length(a-link(22, a-link(43, a-empty))) is 2 end

The length of an empty list is 0, and the length of a link is one larger than the length of the rest of the list:

fun length(nl :: AList<Number>) -> Number: doc: "Returns the number of elements in the list." cases (AList<Number>) nl: | a-empty => 0 | a-link(first, rest) => 1 + length(rest) end where: length(a-empty) is 0 length(a-link(8, a-empty)) is 1 length(a-link(22, a-link(43, a-empty))) is 2 end

This should be a pretty easy exercise to understand by now. What would happen if we wanted to write the same definition for AList<String>? It would look like this:

fun length(sl :: AList<String>) -> Number: doc: "Returns the number of elements in the list." cases (AList<String>) sl: | a-empty => 0 | a-link(first, rest) => 1 + length(rest) end end

What about a AList<AList<Point>>?

fun length(pll :: AList<AList<Point>>) -> Number: doc: "Returns the number of elements in the list." cases (AList<AList<Point>>) pll: | a-empty => 0 | a-link(first, rest) => 1 + length(rest) end end

Aside from the annotations and the name of the argument, these functions are exactly the same! Mirroring the way we unified the various ??List data definitions, we can lift out the common part of the type:

fun length<a>(al :: AList<a>) -> Number: doc: "Returns the number of elements in the list." cases (AList<a>) al: | a-empty => 0 | a-link(first, rest) => 1 + length(rest) end end

This works (and makes sense) for all the different kinds of examples:

check: length(a-empty) is 0 length(a-link(8, a-empty)) is 1 length(a-link(22, a-link(43, a-empty))) is 2 length(a-empty) is 0 length(a-link("a", a-empty)) is 1 length(a-link("a", a-link(43, a-empty))) is 2 length(a-empty) is 0 length(a-link(a-link(point(1, 2), a-empty), a-empty)) is 1 end

There’s (much) more to talk about in actually type-checking parametric types later in Parametric Polymorphism. We call length a parametric or polymorphic function, just as we called AList a parametric data definition. It means that the function can work over all possible instantiations of AList. We’ll talk about more instances of interesting functions like this in Abstracting Common Parts of List Functions.

2.7.1 List and the list Abbreviation

Lists in this style are so useful that they are built into Pyret. The definition in Pyret’s library is:

data List<a>: | empty | link(first :: a, rest :: List<a>) end

The names List, link, and empty are all available anywhere in Pyret (to avoid confusion, Pyret will also stop any other programs from redefining them).

Since lists are constructed often, and writing nested link and empty calls is tedious, Pyret supports a special syntax for constructing lists more concisely:

check: [list: 1, 2] is link(1, link(2, empty)) [list:] is empty [list: [list:], [list: 1]] is link(empty, link(link(1, empty), empty)) end

We’ll switch to using this syntax for lists for the rest of the notes and examples, and to using the built-in List with problem-appropriate instantiations.

2.8 Abstracting Common Parts of List Functions

In Abstracting Common Parts, we saw how to pull out common parts of functions over primitive data. What happens if we apply the same rules to functions over lists?

Let’s consider two functions over List<Number>s, sqr-nums and abs-nums:

fun sqr-nums(nl :: List<Number>) -> List<Number>: doc: "Return a list with the squares of the given numbers" cases (List<Number>) nl: | empty => empty | link(first, rest) => link(num-sqr(first), sqr-nums(rest)) end end fun abs-nums(nl :: List<Number>) -> List<Number>: doc: "Return a list with the absolute values of the given numbers" cases (List<Number>) nl: | empty => empty | link(first, rest) => link(num-abs(first), abs-nums(rest)) end end

We can identify the common parts: the num-abs function and num-sqr functions appear in the same spot. By following the rules of extraction, we add a new parameter name at the beginning, replace the common part with that identifier in the function body, and put that argument in the beginning of the argument lists for each call.

fun do-to-nums(operation :: ???, nl :: List<Number>) -> List<Number>: doc: "Return a new list made of the elements of nl with operation done to them" cases (List<Number>) nl: | empty => empty | link(first, rest) => # replaced num-abs/num-sqr with operation in the line below, and added it # in the recursive call to do-to-nums, according to the rules of # extracting common parts link(operation(first), do-to-nums(operation, rest)) end end

What about the annotation for operation? What goes there?

To describe operation, we’re going to introduce a new kind of annotation, the function or arrow annotation. Its syntax is

(Ann1, Ann2, ..., AnnN -> AnnR)

where Ann1 through AnnN are annotations representing arguments to a function, and AnnR is an annotation representing the return type of a function. So we can write the annotation for num-sqr and num-abs as:

(Number -> Number)

That is, they are both functions that consume and produce Numbers. We can use one to describe the operation argument of do-to-nums:

fun do-to-nums(operation :: (Number -> Number), nl :: List<Number>) -> List<Number>: doc: "Return a list containing operation applied to each element of nl" cases (List<Number>) nl: | empty => empty | link(first, rest) => link(operation(first), do-to-nums(operation, rest)) end where: do-to-nums(num-sqr, [list:]) is [list:] do-to-nums(num-sqr, [list: 1, 2]) is [list: 1, 4] do-to-nums(num-abs, [list:]) is [list:] do-to-nums(num-abs, [list: -1, 1]) is [list: 1, 1] end

In the examples for do-to-nums, we see that, following the rules from extracting common parts, we provide num-abs and num-sqr as arguments. We can run the program and its examples and see that it works.

In Pyret (and many other languages), all functions are values, just like 5 or point(1, 2) are values. This goes for user-defined functions, too, so we can use do-to-nums with, say, an hours-to-wages function:

fun hours-to-wages(hours :: Number) -> Number: if hours <= 40: hours * 10 else if hours > 40: (40 * 10) + ((hours - 40) * (10 * 1.5)) end end check: do-to-nums(hours-to-wages, [list: 40, 39, 41]) is [list: 400, 390, 415] end

It’s useful to see how this impacts the evaluation of do-to-nums for a small example:

do-to-nums(hours-to-wages, link(41, link(39, empty)))

We aren’t calling hours-to-wages yet, so no substitution happens. The function value (which we’ll represent with its name) is simply passed in as a value to the body of do-to-nums.

=> cases (List<Number>) link(41, link(39, empty)): | empty => empty | link(first, rest) => link(hours-to-wages(first), do-to-nums(hours-to-wages, rest)) end

=> link(hours-to-wages(41), do-to-nums(hours-to-wages, link(39, empty)))

The body of hours-to-wages is evaluated next, with 41 substituted for hours.

=> link( if 41 <= 40: 41 * 10 else if 41 > 40: (40 * 10) + ((41 - 40) * (10 * 1.5)) end, do-to-nums(hours-to-wages, link(39, empty)))

Now we evaluate the if and perform arithmetic.

=> link( 415, do-to-nums(hours-to-wages, link(39, empty)))

Now we have another call to do-to-nums. Note that we are still processing the argument list of the call to link. We’re done processing the first element, but we’re processing the rest in a place where it will become the rest field of the new list that’s being created.

=> link( 415, cases (List<Number>) link(39, empty): | empty => empty | link(first, rest) => link(hours-to-wages(first), do-to-nums(hours-to-wages, rest)) end)

=> link( 415, link( hours-to-wages(39), do-to-nums(hours-to-wages, empty)))

The hours-to-wages call again happens, this time on 39.

=> link( 415, link( if 39 <= 40: 39 * 10 else if 39 > 40: (40 * 10) + ((39 - 40) * (10 * 1.5)) end, do-to-nums(hours-to-wages, empty)))

More arithmetic.

=> link( 415, link( 390, do-to-nums(hours-to-wages, empty)))

And to finish, we get to the base case of empty, and the list traversal is completed.

=> link( 415, link( 390, cases (List<Number>) empty: | empty => empty | link(first, rest) => link(hours-to-wages(first), do-to-nums(hours-to-wages, rest)) end))

=> link( 415, link( 390, empty))

When we pass functions as values, we will continue to represent them with just their name in evaluation sequences. One thing to keep in mind is that a function’s body is not evaluated until a function is called, and when the call occurs happens irrespective of when the function is passed as an argument. In fact, this must be the case: consider hours-to-wages above. Its body cannot be evaluated without supplying a value for hours, which happens only when the function is called. You can learn more about how functions are handled later [Adding Functions to the Language], but for now, other than this rule, you can treat functions like any other value we’ve seen so far.

Functions as values are also called first-class functions, and when they return a function, they are called higher-order functions. Lots of languages have first-class and higher-order functions, from Python and JavaScript to Ocaml and Haskell, and they have numerous uses, many of which we’ll study. (For some elegant and perhaps surprising ones, see [Functions as Data].)

The do-to-nums function is a pretty nice abstraction: we can now apply a function to all the elements of a list of numbers, and get back a new list with the outputs.

But we can do one better. Let’s look at what do-to-strings, a version of do-to-nums for Strings, would look like:

fun do-to-strings(operation :: (String -> String), sl :: List<String>) -> List<String>: doc: "Return a list containing operation applied to each element of ns" cases (List<String>) sl: | empty => empty | link(first, rest) => link(operation(first), do-to-strings(operation, rest)) end end

Like when we saw similar versions of length for different types, do-to-strings and do-to-nums only differ in the declared types: String in place of Number everywhere. This suggests that we can use a parametric type again:

fun do-to-a<a>(operation :: (a -> a), al :: List<a>) -> List<a>:

If we fill in the definition of do-to-a, we could redefine:

fun do-to-nums(operation :: (Number -> Number), nl :: List<Number>) -> List<Number>: do-to-a(operation, nl) end fun do-to-strs(operation :: (String -> String), sl :: List<String>) -> List<String>: do-to-a(operation, sl) end

But before we declare ourselves done, there’s one last part of this definition worth generalizing. Do you see it? What if we wanted to write nums-to-strings, which takes a list of Numbers and produces a list of Strings:

fun nums-to-strings(nums :: List<Number>) -> List<String>: ... end

Can we use do-to-a to do it? The definition would look like:

fun nums-to-strings(nl :: List<Number>) -> List<String>: do-to-a(num-tostring, nl) end

We could actually run this version of nums-to-strings without error: Pyret doesn’t currently check that we don’t mis-use parametric types. That doesn’t change the fact that we would be violating the stated contract of do-to-a, though.

But, this violates the contract of do-to-a, because do-to-a must return a list of the same type it consumes. In this example, it would produce a list of a different type. So there’s one final change to make to do-to-a to make it handle this last case:

fun turn-as-into-bs<a, b>(a-to-b :: (a -> b), al :: List<a>) -> List<b>: doc: "Return a the b's returned from a-to-b on each a in the input" cases (List<a>) al: | empty => empty | link(first, rest) => link(a-to-b(first), turn-as-into-bs(a-to-b, rest)) end end

This new function, turn-as-into-bs, is parameterized over two types, the type of the elements of the input list (a), and the type of the elements of the output list (b). With this final tweak, all of these examples now make sense with the same function:

check: turn-as-into-bs(num-sqr, [list:]) is [list:] turn-as-into-bs(num-sqr, [list: 1, 2]) is [list: 1, 4] turn-as-into-bs(num-abs, [list:]) is [list:] turn-as-into-bs(num-abs, [list: -1, 1]) is [list: 1, 1] turn-as-into-bs(string-toupper, [list: "a", "b", "c"]) is [list: "A", "B", "C"] turn-as-into-bs(string-tolower, [list: "A", "B", "C"]) is [list: "a", "b", "c"] turn-as-into-bs(num-tostring, [list: 1, 2]) is [list: "1", "2"] end

The function turn-as-into-bs is typically called map, and is a major building block in programming with functions, just like the for loop is a major building block in C++ and Java. It’s important enough that map is a default library function in many programming languages with higher-order functions.

2.9 Functions that Return Functions

We saw in Abstracting Common Parts of List Functions that the turn-as-into-bs function, more commonly known as map, covered a number of use cases for transforming lists. Recall that we can use it, for instance, to calculate the hourly wage for a list of numbers representing hours:

# This function is just as before fun hours-to-wages(hours :: Number) -> Number: doc: "Compute total wage from hours, accounting for overtime, at $10/hr base" if hours < 41: hours * 10 else if hours >= 41: (40 * 10) + ((hours - 40) * (10 * 1.5)) end end # Now we could use check: map(hours-to-wages, [list: 30, 40]) is [list: 300, 400] end

It’s easy to define versions of this that for hourly rates of 20, or 30, and use those with map. But we already saw that in Abstracting Common Parts, it’s a good idea to not write many copies of essentially the same function, when we can abstract out the common parts.

But, if we have a modified version of hours-to-wages that takes the hourly rate as an argument:

fun hours-to-wages-rated(hours :: Number, rate :: Number) -> Number:

How do we use this with map? Let’s say we have two lists of hours:

at-rate-10 = [list: 35, 40, 30] at-rate-20 = [list: 25, 45, 40]

How can we use map to calculate the first list’s wages at a rate of $10/hr, and the second list’s wages at $20/hr? If we just try calling it directly:

check: map(hours-to-wages-rated, at-rate-10) is [list: 350, 400, 300] map(hours-to-wages-rated, at-rate-20) is [list: 500, 950, 800] end

We get an arity mismatch error from inside the map function. Why? Recall that what the map function does is apply the function given to it as its first argument to each list element. So it tries to apply, in the first example:

hours-to-wages-rated(35)

This is clearly an error, since hours-to-wages-rated’s contract requires two arguments, and only one is provided here. What would solve the problem is a function that takes an hourly rate, and returns a function like hours-to-wages that only takes one argument, but keeps track of the given hourly rate.

Let’s first think about what the header to such a function would look like:

fun make-rated-hours-to-wages(rate :: Number) -> (Number -> Number):

That’s the type-based description of what we just said: make-rated-hours-to-wages takes a Number and returns a function that takes a Number and produces one. We can write some examples of what it would look like to use such a function:

fun make-rated-hours-to-wages(rate :: Number) -> (Number -> Number): doc: "Create a function that computes wages at the given rate" where: make-rated-hours-to-wages(10)(40) is 400 make-rated-hours-to-wages(10)(35) is 350 make-rated-hours-to-wages(20)(35) is 700 end

Note how the tests look different from anything we’ve seen: There is a function invocation followed by another, because make-rated-hours-to-wages returns a function, which we immediately call. Note also that this matches the return type in the contract, which specifies that the function should return a (Number -> Number) function. As further examples, the returned functions ought to be of the right shape to use with map:

check: map(make-rated-hours-to-wages(10), at-rate-10) is [list: 350, 400, 300] map(make-rated-hours-to-wages(20), at-rate-20) is [list: 500, 950, 800] end

lam is short for lambda, which is a traditional name for anonymous functions that comes from the λ-calculus. So how to fill in the body of make-rated-hours-to-wages? There are a few ways, here we’re going to add to our inventory of Pyret concepts. In Pyret, we can use the lam keyword to create a function value directly, anywhere an expression is allowed. A use of lam looks like a use of fun but without the name. The resulting value is just like a function created with fun; they can be called, passed to map, bound to identifiers, and so on:

check: f = lam(x :: Number) -> Number: x + 1 end f(1) is 2 f(2) is 3 map(lam(x): x + 1 end, [list: 1, 2, 3]) is [list: 2, 3, 4] map(f, [list: 1, 2, 3]) is [list: 2, 3, 4] end

So, we can make the body of make-rated-hours-to-wages be a single lam expression, that takes in a number of hours and does the call to hours-to-wages-rated with both arguments.

fun make-rated-hours-to-wages(rate :: Number) -> (Number -> Number): doc: "Create a function that computes wages at the given rate" lam(hours :: Number) -> Number: hours-to-wages-rated(hours, rate) end end

Let’s step through a substitution, which has an interesting new consequence:

When substituting into a lam expression, we do what we always have: replace names with values. So the lam value quite literally stores the rate value of 10 inside itself, as part of an expression that will evaluate when the function is called.

make-rated-hours-to-wages(10)(30) => lam(hours :: Number) -> Number: hours-to-wages-rated(hours, 10) end(30) => hours-to-wages-rated(30, 10) => 300 # ... after some arithmetic

In a call to map, the lam value is not called immediately, and is passed through. As a reminder, here is the definition of map:

fun map<a, b>(a-to-b :: (a -> b), al :: List<a>) -> List<b>: doc: "Return a the b's returned from a-to-b on each a in the input" cases (List<a>) al: | empty => empty | link(first, rest) => link(a-to-b(first), map(a-to-b, rest)) end end

Let’s look at a call to map with an anonymous function created from make-rated-hours-to-wages:

We’ll use list abbreviations in this example.

map(make-rated-hours-to-wages(10), [list: 35, 40, 30]) => map(lam(hours): hours-to-wages(hours, 10) end, [list: 35, 40, 30])

Here, we’ve substituted the lam value for the identifier a-to-b. So the function is appearing twice; once immediately applied to first, and once passed to the recursive call to map.

=> cases (List<Number>) [list: 35, 40, 30]: | empty => empty | link(first, rest) => link((lam(hours): hours-to-wages(hours, 10) end(first)), map(lam(hours): hours-to-wages(hours, 10) end, rest)) end => link(lam(hours): hours-to-wages(hours, 10) end(35), map(lam(hours): hours-to-wages(hours, 10) end, [list: 40, 30]))

Now we apply the lam to the value 35, substituting it for hours.

=> link(hours-to-wages(35, 10), map(lam(hours): hours-to-wages(hours, 10) end, [list: 40, 30])) => link(350, # skipping arithmetic again map(lam(hours): hours-to-wages(hours, 10) end, [list: 40, 30]))

This continues through the list. The important thing to notice is that the function value is the thing “keeping track” of the original 10 that was passed into make-rated-hours-to-wages.

=> link(350, cases (List<Number>) [list: 40, 30]: | empty => empty | link(first, rest) => link((lam(hours): hours-to-wages(hours, 10) end(first)), map(lam(hours): hours-to-wages(hours, 10) end, rest)) end)

Anonymous functions are a useful tool when working with functions like map, because they provide a way to manage multi-argument functions, and also can create functions to use in map without creating verbose new fun declarations.

As far as definitions go, we call lam expressions anonymous function expressions. When an anonymous function expression’s body has a reference to an identifier that isn’t one of its arguments, we say that it closes over that identifier. Similarly, when a value is substituted for a closed over identifier, we say that the value has been closed over. The resulting function that is closed over other values is called a closure.

Closures will come up many times in the rest of this book, and they are one of the first things we’ll learn to implement when we focus on the study of languages themselves.

2.10 More and Mutually Recursive Data

Linked lists are just one instance of recursive data. There are many other patterns of recursive data to consider, and they have interesting consequences for constructing functions over them.

2.10.1 Other Recursive Patterns

We’ll say much more about binary trees later. Binary trees are one kind of recursive data that have more than one recursive piece:

data BinTree<a>: | leaf | node(value :: a, left :: BinTree<a>, right :: BinTree<a>) end

Extending the recipe for this kind of recursive data is straightforward. We modify the template to include recursive calls for each recursive piece:

fun bin-tree-fun<a>(t :: BinTree<a>) -> ???: cases (BinTree<a>) t: | leaf => ... | node(value, left, right) => ... value ... ... bin-tree-fun(left) ... ... bin-tree-fun(right) ... end end

We can use this template to fill in, for example, tree-sum, which adds up the numbers in a BinTree<Number>:

fun tree-sum(t :: BinTree<Number>) -> Number: cases (BinTree<Number>) t: | leaf => 0 | node(value, left, right) => value + tree-sum(left) + tree-sum(right) end where: tree-sum(leaf) is 0 tree-sum(node(2, leaf, leaf)) is 2 tree-sum(node(3, node(4, leaf, leaf), leaf)) is 7 tree-sum(node(3, node(4, leaf, leaf), node(6, leaf, leaf))) is 13 end

What’s the time complexity of tree-inorder, assuming that + is linear time in the length of the left-hand list. Is there a better solution using an accumulator? Or, as another example, tree-inorder, which returns a List of the elements in order (note that + is suitable to use as a way to append lists):

fun tree-inorder<a>(t :: BinTree<a>) -> List<a>: cases (BinTree<a>) t: | leaf => empty | node(value, left, right) => tree-inorder(left) + [list: value] + tree-inorder(right) end end

This is a straightforward extension of handling recursive datatypes, but is useful to note on its own.

2.10.2 Mutually Recursive Data

The last section defined binary trees, but what about general trees with lists of children? In this case, we might define it as:

data TList<b>: | t-empty | t-link(first :: Tree<b>, rest :: TList) end data Tree<b>: | node(value :: b, children :: TList<b>) end

Note that the type variable b in TList differs from the one in List: it is parametrizing the kind of Tree the first field contains, not the type of the first field itself. What would it look like to define Tree with plain Lists for children?

Here, there is no direct recursion in the body of Tree: its variants of Tree make no reference back to itself. However, they do make reference to TList, which refers back to Tree in the first field of t-link. (We would notice the cycle if we started from TList, as well).

For mutually-recursive datatypes, we cannot consider one without the other: Any function that works over Trees may need to consider all the nuances of TLists (or List<Tree>s) as well.

This is true in any examples of the datatype (can you write an example, other than t-empty, that uses one without the other?), and in the code that we write. Let’s try to develop tree-sum with this definition, starting with a template:

fun tree-sum(t :: Tree<Number>) -> Number: doc: "Add up the elements of the tree" cases (Tree<Number>) t: | node(val, children) => ... val ... ... children ... end where: tree-sum(node(4, t-empty)) is 4 tree-sum(node(5, t-link(node(4, t-empty), t-link(node(6, t-empty), t-empty))) is 15 end

We want to add val (which we know to be a number), to the sum of the list of children. That’s a big enough job that it should be deferred to a helper, so:

fun list-sum(l :: TList<Number>) -> Number: doc: "Add up the elements of the list" end fun tree-sum(t :: Tree<Number>) -> Number: doc: "Add up the elements of the tree" cases (Tree<Number>) t: | node(val, children) => val + list-sum(children) end end

Now, we can insert the template for list-sum, which contains a recursive call for the rest field, and no guidance (yet) on the first field:

fun list-sum(l :: TList<Number>) -> Number: doc: "Add up the elements of the list" cases (TList<Number>) l: | t-empty => 0 | t-link(first, rest) => ... first ... ... list-sum(rest) ... end end fun tree-sum(t :: Tree<Number>) -> Number: doc: "Add up the elements of the tree" cases (Tree<Number>) t: | node(val, children) => val + list-sum(children) end end

If we follow the same rule we followed for tree-sum, we ought to be able to just call back into tree-sum for first:

fun list-sum(l :: TList<Number>) -> Number: doc: "Add up the elements of the list" cases (TList<Number>) l: | t-empty => 0 | t-link(first, rest) => tree-sum(first) + list-sum(rest) end end fun tree-sum(t :: Tree<Number>) -> Number: doc: "Add up the elements of the tree" cases (Tree<Number>) t: | node(val, children) => val + list-sum(children) end end

This completes the definition, and the tests defined above will now pass. As usual, it’s useful to step through an evaluation of an example to see the pattern in action:

tree-sum(node(5, t-link(node(4, t-empty), node(6, t-empty))))

We start with substitution, as always

=> cases (Tree<Number>) node(5, t-link(node(4, t-empty), t-link(node(6, t-empty), t-empty))): | node(val, children) => val + list-sum(children) end => 5 + list-sum(t-link(node(4, t-empty), node(6, t-empty))) => 5 + cases (TList<Number>) t-link(node(4, t-empty), t-link(node(6, t-empty), t-empty)): | t-empty => 0 | t-link(first, rest) => tree-sum(first) + list-sum(rest) end

This expression is interesting, because it will be filled in with two different function bodies in order to be processed. The first one to resolve is the call to tree-sum, since it appears first.

=> 5 + (tree-sum(node(4, t-empty)) + list-sum(t-link(node(6, t-empty), t-empty))) => 5 + (cases (Tree<Number>) node(4, t-empty): | node(val, children) => val + list-sum(children) end + list-sum(t-link(node(6, t-empty), t-empty))) => 5 + ((4 + list-sum(t-empty)) + list-sum(t-link(node(6, t-empty), t-empty)))

Note that these calls are quite deeply nested; now another call to list-sum needs to be resolved before we can go on.

=> 5 + ((4 + cases (TList<Number>) t-empty: | t-empty => 0 | t-link(first, rest) => tree-sum(first) + list-sum(rest) end) + list-sum(t-link(node(6, t-empty), t-empty))) => 5 + ((4 + 0) + list-sum(t-link(node(6, t-empty), t-empty)))

Up to here corresponds to processing the first sub-tree of the input. The 4 corresponds to the tree-sum call in list-sum.

=> 5 + (4 + list-sum(t-link(node(6, t-empty), t-empty)))

Do Now!

Fill in the rest of the reduction down to 5 + (4 + 6), corresponding to the three value fields in the original tree.

We can extract some new rules from this: if we notice a cycle in the datatypes used in a data definition, we need to make sure that we handle the whole cycle when we think through examples, and through the template. When designing the template, we should consider a template for each datatype involved in the cycle. In this example, that would mean a template like:

fun tlist-fun<a>(l :: TList<a>) -> ???: cases (TList<a>) l: | t-empty => ... | t-link(first, rest) => # call to tree-fun, which will be handled below ... tree-fun(first) ... ... tlist-fun(rest) ... end end fun tree-fun<a>(t :: Tree<a>) -> a: cases (Tree<a>) t: | node(val, children) => ... val ... # call to tlist-fun, to handle children above ... tlist-fun(children) ... end end

This generalizes to mutual recursion between any number of data definitions. If you notice that your data definitions have a cycle of reference, be aware that you’ll need to work through examples and an implementation that handles all of them.

2.10.3 A Note on Parametric Data Definitions

The definition in the last section made up a new datatype, TList, for representing lists of trees. That was useful for an example, but most of the time, it would be preferable to just use the built-in List datatype rather than defining something totally new. So a Tree definition like this would make better use of Pyret’s builtins:

data Tree<a>: | node(value :: a, children :: List<Tree<a>>) end

The template wouldn’t change much; it would simply refer to List<Tree<a>> instead of TList<a>, and list-sum would use List<Tree<Number>> instead of TList<Number>. But we saw other functions that used Lists, and conceivably the lists that they work over could have sub-elements that are Trees. Does that mean that map, and length, need to be concerned with Trees? Hopefully the answer is no, because otherwise we’d need to rethink the whole discussion about map in Abstracting Common Parts of List Functions to account for Trees!

This brings up an important distinction between the data that a specific problem works with, and a single Pyret data declaration. Problems that work over Tree<Number>s are concerned with specifically List<Tree<Number>>s, not any other kind of list. When writing functions over List<Tree<Number>>s, it’s natural that we have to take Trees into account, and consider Tree-handling functions in the template. The data definition of a Tree problem specifies the type of data that the list works with, and it becomes part of the examples and the problem template.

However, other instantiations of List, like List<Number>, have no relationship with Trees, and those problems have no reason to include templates for Trees. The problem we’re trying to solve dictates the types of data to use, which may include some specific instantiations, and that guides us towards a template.

Finally, describing some problems’ data doesn’t require a fully instantiated datatype. For example, taking the length of a list, or calculating the number of nodes in a tree, work for any instantiation of a List or Tree. Since these problems can be solved without referring to particular instantiations, they don’t need to consider the contents at all.

2.11 Naming Intermediate Values

Sometimes, we perform a computation and want to use the result more than once. For example, in this implementation of max-in-list, we get the max value from the rest of the list in two places:

fun max-in-list(l :: List<Number>) -> Number: doc: "Return the largest number in the list" cases (List<Number>) l: | empty => 0 | link(first, rest) => if first > max-in-list(rest): first else: max-in-list(rest) end end where: max-in-list([list: 1, 5, 4]) is 5 max-in-list([list:]) is 0 end

The expression max-in-list(rest) appears twice in this example. To avoid duplicating it, we can give it a name, like this:

fun max-in-list(l :: List<Number>) -> Number: doc: "Return the largest number in the list" cases (List<Number>) l: | empty => 0 | link(first, rest) => max-rest = max-in-list(rest) if first > max-rest: first else: max-rest end end end

We call the expression max-rest = max-in-list(rest) an identifier binding or let binding expression. We need to add a new rule for reducing this kind of expression. It’s a little different than the rules we’ve seen so far, because it takes multiple expressions into account. When handling an identifier binding, we look at the binding expression itself and the expression(s) immediately after it in the same block of code. So in the example above, we’d be considering:

max-rest = max-in-list(rest) if first > max-rest: first else: max-rest end

An identifier binding evaluates by evaluating the expression to the right of the = sign, and then substituting the resulting value into the rest of the expressions after it. Before stepping through all of max-in-list, here’s a short example that has a similar structure:

x = 5 + 5 if x > 9 x else: x + 11 end => x = 10 if x > 9 x else: x + 11 end

This is where x in the rest of the expressions (the if expression) gets substituted with 10.

=> if 10 > 9: 10 else: 10 + 11 end => if true: 10 else: 10 + 11 end => 10

So, a full substitution for a call to max-in-list would look like:

max-in-list([list: 1, 5]) => cases (List<Number>) [list: 1, 5]: | empty => 0 | link(first, rest) => max-rest = max-in-list(rest) if first > max-rest: first else: max-rest end end => max-rest = max-in-list([list: 5]) if 1 > max-rest: 1 else: max-rest end

This step is interesting, because we have to process the max-in-list call to find the answer to store in max-rest.

=> max-rest = cases (List<Number>) [list: 5]: | empty => 0 | link(first, rest) => max-rest = max-in-list(rest) if first > max-rest: first else: max-rest end end if 1 > max-rest: 1 else: max-rest end

The block: ... end notation is used to explicit indicate that there is a sequence of expressions that will be evaluated just like any other. Note that we have two max-rests, each of which will be substituted into its own block.

=> max-rest = block: max-rest = max-in-list([list:]) if 5 > max-rest: 5 else: max-rest end end if 1 > max-rest: 1 else: max-rest end => max-rest = block: max-rest = cases (List<Number>) [list:]: | empty => 0 | link(first, rest) => if first > max-rest: first else: max-rest end end if 5 > max-rest: 5 else: max-rest end end if 1 > max-rest: 1 else: max-rest end

Now there is a value, 0, to substitute for the max-rest that is further to the right, so we can do the substitution for max-rest where it appears.

=> max-rest = block: max-rest = 0 if 5 > max-rest: 5 else: max-rest end end if 1 > max-rest: 1 else: max-rest end => max-rest = block: if 5 > 0: 5 else: 0 end end if 1 > max-rest: 1 else: max-rest end => max-rest = block: if true: 5 else: 0 end end if 1 > max-rest: 1 else: max-rest end

When a block: contains only a single value expression, it evaluates to that value.

=> max-rest = block: 5 end if 1 > max-rest: 1 else: max-rest end => max-rest = 5 if 1 > max-rest: 1 else: max-rest end

Now we can substitute for the first max-rest, and finish the if evaluation to get 5.

=> if 1 > 5: 1 else: 5 end => if false: 1 else: 5 end => 5

If there are multiple identifier bindings in a row, they are processed in order, and get substituted into the right-hand-sides of any other bindings that happen later in the same block:

=> x = 4 + 5 y = x + 10 z = y + x z - 3 => x = 9 y = x + 10 z = y + x z - 3 => y = 9 + 10 z = y + 9 z - 3 => y = 19 z = y + 9 z - 3 => z = 19 + 9 z - 3 => z = 28 z - 3 => 28 - 3 => 25

Standalone Bindings: A let-binding expression doesn’t make much sense on its own; in this binding, what expression can y be substituted into?

fun add-5(x :: Number) -> Number: y = x + 5 end

Pyret cannot do anything meaningful with the body of add-5, so Pyret reports an error upon running this program, that the block ended in a let-binding and thus cannot be evaluated. This version of add-5 would make sense:

fun add-5(x :: Number) -> Number: y = x + 5 y end

2.12 More to Learn

This chapter gives you a foundation for programming in Pyret, and is enough to get through all the programs through Halloween Analysis. There are several more features you’ll need to learn for later chapters, and they’ll be introduced as they come up, since the new features relate directly to the programming concepts taught in those chapters.