6 Processing Lists
some list functions are generic and operate on any kind of list: e.g., the length of a list is the same irrespective of what kind of values it contains;
some are specific at least to the type of data: e.g., the sum assumes that all the values are numbers (though they may be ages or prices or other information represented by numbers); and
some are somewhere in-between: e.g., a maximum function applies to any list of comparable values, such as numbers or strings.
This seems like a great varity, and we might worry about how we can handle this many different kinds of functions. Fortunately, and perhaps surprisingly, there is one standard way in which we can think about writing all these functions! Our mission is to understand and internalize this process.
6.1 Making Lists and Taking Them Apart
So far we’ve seen one way to make a list: by writing [list: …]. While useful, writing lists this way actually hides their true nature. Every list actually has two parts: a first element and the rest of the list. The rest of the list is itself a list, so it too has two parts…and so on.
Do Now!
Take apart this third list.
l1 = [list: 1, 2, 3] h1 = l1.first l2 = l1.rest h2 = l2.first l3 = l2.rest h3 = l3.first l4 = l3.rest check: h1 is 1 h2 is 2 h3 is 3 l2 is [list: 2, 3] l3 is [list: 3] l4 is empty end
check: [list: 1, 2, 3] is link(1, link(2, link(3, empty))) end
empty (written empty or [list: ]), or
non-empty (written link(…, …) or [list: ] with at least one value inside the brackets), where the rest is also a list (and hence may in turn be empty or non-empty, …).
6.2 Some Example Exercises
- Compute the length of a list:
my-len :: List<Any> -> Number
- Compute the sum of a list (of numbers):
my-sum :: List<Number> -> Number
- Compute the maximum of a list (of numbers or strings):
my-max :: List<Any> -> Any
- Given a list of strings, convert each string to a number representing its length:
my-str-len :: List<String> -> List<Number>
- Given a list of numbers, generate a list of its positive numbers:If you want to be pedantic: its positive numbers with the same count and in the same order.
my-pos-nums :: List<Number> -> List<Number>
- Given a list of numbers, replace every element with the running sum, i.e., the sum of all the elements from the beginning of the list until that element (inclusive):
my-running-sum :: List<Number> -> List<Number>
- Given a list, keep every alternate element in it, starting from the first:
my-alternating :: List<Any> -> List<Any>
- Given a list of numbers, compute the average of the numbers:
my-avg :: List<Number> -> List<Number>
Construct examples of the function’s behavior.
Employ the template that suggests possible solutions.
6.3 Structural Problems with Scalar Answers
Let’s write out examples for a few of the functions described above. We’ll approach writing examples in a very specific, stylized way. First of all, we should always construct at least two examples: one with empty and the other with at least one link, so that we’ve covered the two very broad kinds of lists. Then, we should have more examples specific to the kind of list stated in the problem. Finally, we should have even more examples to illustrate how we think about solving the problem.
6.3.1 my-len: Examples
We have’t precisely defined what it means to be “the length” of a list. We confront this right away when trying to write an example. What is the length of the list empty?
Do Now!
What do you think?
my-len(empty) is 0
my-len([list: 7]) is 1
my-len([list: 7, 8, 9]) is 3
my-len([list: 8, 9]) is 2
my-len([list: 9]) is 1
my-len([list: 7, 8, 9]) is 3 my-len([list: 8, 9]) is 2 my-len([list: 9]) is 1 my-len([list: ]) is 0
my-len([list: 7, 8, 9]) is 1 + 2 my-len([list: 8, 9]) is 1 + 1 my-len([list: 9]) is 1 + 0 my-len([list: ]) is 0
my-len([list: 7, 8, 9]) is 1 + my-len([list: 8, 9]) my-len([list: 8, 9]) is 1 + my-len([list: 9]) my-len([list: 9]) is 1 + my-len([list: ]) my-len([list: ]) is 0
Double-check all these and make sure you understand the calculations. It’ll prove central to how we write the program later!
6.3.2 my-sum: Examples
A similar logic applies to how we treat a function like my-sum. What do we want the sum of the empty list to be? Well, it may be entirely clear, so let’s move on for a moment. What is the sum of the list [list: 7, 8, 9]? Well, clearly we intend for this to be 24. Let’s see how that works out.
my-sum([list: 7, 8, 9]) is 7 + 8 + 9 my-sum([list: 8, 9]) is 8 + 9 my-sum([list: 9]) is 9
my-sum([list: 7, 8, 9]) is 7 + my-sum([list: 8, 9]) my-sum([list: 8, 9]) is 8 + my-sum([list: 9]) my-sum([list: 9]) is 9 + my-sum([list: ])
my-sum(empty) is 0
Observe, again, how we can use the result of computing my-sum of the rest of the list to compute its result for the whole list.
6.3.3 From Examples to Code
Given these examples, we can now turn them into code. We introduce the construct cases, which lets us tell apart different kinds of lists, and use it to provide answers for each kind of list.
The grammar for cases is as follows: [FILL] [FILL TEMPLATES]
fun my-len(l): cases (List) l: | empty => 0 | link(f, r) => 1 + my-len(r) end end
fun my-sum(l): cases (List) l: | empty => 0 | link(f, r) => f + my-sum(r) end end
6.4 Structural Problems with List Answers
Now let’s tackle the functions that produce a list as the answer.
6.4.1 my-str-len: Examples and Code
my-str-len([list: "hi", "there", "mateys"]) is [list: 2, 5, 6]
my-str-len([list: "there", "mateys"]) is [list: 5, 6] my-str-len([list: "mateys"]) is [list: 6]
my-str-len([list: "hi", "there", "mateys"]) is link(2, [list: 5, 6]) my-str-len([list: "there", "mateys"]) is link(5, [list: 6]) my-str-len([list: "mateys"]) is link(6, [list: ])
my-str-len(empty) is empty
fun my-str-len(l): cases (List) l: | empty => empty | link(f, r) => link(string-length(f), my-str-len(r)) end end
6.4.2 my-pos-nums: Examples and Code
Do Now!
Construct the sequence of examples that we obtain from the input [list: 1, -2, 3, -4].
my-pos-nums([list: 1, -2, 3, -4]) is [list: 1, 3] my-pos-nums([list: -2, 3, -4]) is [list: 3] my-pos-nums([list: 3, -4]) is [list: 3] my-pos-nums([list: -4]) is [list: ] my-pos-nums([list: ]) is [list: ]
my-pos-nums([list: 1, -2, 3, -4]) is link(1, [list: 3]) my-pos-nums([list: -2, 3, -4]) is [list: 3] my-pos-nums([list: 3, -4]) is link(3, [list: ]) my-pos-nums([list: -4]) is [list: ] my-pos-nums([list: ]) is [list: ]
my-pos-nums([list: 1, -2, 3, -4]) is link(1, my-pos-nums([list: -2, 3, -4])) my-pos-nums([list: -2, 3, -4]) is my-pos-nums([list: 3, -4]) my-pos-nums([list: 3, -4]) is link(3, my-pos-nums([list: -4])) my-pos-nums([list: -4]) is my-pos-nums([list: ]) my-pos-nums([list: ]) is [list: ]
fun my-pos-nums(l): cases (List) l: | empty => empty | link(f, r) => if f > 0: link(f, my-pos-nums(r)) else: my-pos-nums(r) end end end
Do Now!
Is our set of examples comprehensive?
Not really. There are many examples we haven’t considered, such as lists that end with positive numbers and lists with 0.
Exercise
Work through these examples and see how they affect the program!
6.4.3 my-alternating: First Attempt
Do Now!
Work out the results for my-alternating starting from the list [list: 1, 2, 3, 4, 5, 6].
check: my-alternating([list: 1, 2, 3, 4, 5, 6]) is [list: 1, 3, 5] my-alternating([list: 2, 3, 4, 5, 6]) is [list: 2, 4, 6] my-alternating([list: 3, 4, 5, 6]) is [list: 3, 5] my-alternating([list: 4, 5, 6]) is [list: 4, 6] end
6.4.4 my-running-sum: First Attempt
One more time, we’ll begin with an example.
Do Now!
Work out the results for my-running-sum starting from the list [list: 1, 2, 3, 4, 5].
check: my-running-sum([list: 1, 2, 3, 4, 5]) is [list: 1, 3, 6, 10, 15] my-running-sum([list: 2, 3, 4, 5]) is [list: 2, 5, 9, 14] my-running-sum([list: 3, 4, 5]) is [list: 3, 7, 12] end
my-running-sum([list: 1, 2, 3, 4, 5]) is [list: 1, 3, 6, 10, 15] my-running-sum([list: 2, 3, 4, 5]) is [list: 2, 5, 9, 14] my-running-sum([list: 3, 4, 5]) is [list: 3, 7, 12]
We’ll return to this function later, too [my-running-sum: Examples and Code].
6.5 Structural Problems with Sub-Domains
6.5.1 my-max: Examples
my-max([list: 1, 2, 3]) is 3 my-max([list: 3, 2, 1]) is 3 my-max([list: 2, 3, 1]) is 3 my-max([list: 2, 3, 1, 3, 2]) is 3 my-max([list: 2, 1, 4, 3, 2]) is 4 my-max([list: -2, -1, -3]) is -1
Do Now!
Could we define my-max(empty) to be 0? Returning 0 for the empty list has worked well twice already!
num-max(1, 2) is 2 num-max(-1, -2) is -1
Exercise
Suppose num-max were not already built in. Can you define it? You will find what you learned about Booleans handy. Remember to write some tests!
my-max([list: 1, 2, 3]) is 3 my-max([list: 2, 3]) is 3 my-max([list: 3]) is 3
my-max([list: 3, 2, 1]) is 3 my-max([list: 2, 1]) is 2 my-max([list: 1]) is 1
my-max([list: 2, 1, 4, 3, 2]) is 4 my-max([list: 1, 4, 3, 2]) is 4 my-max([list: 4, 3, 2]) is 4 my-max([list: 3, 2]) is 3 my-max([list: 2]) is 2
my-max([list: 2, 1, 4, 3, 2]) is num-max(2, 4) my-max([list: 1, 4, 3, 2]) is num-max(1, 4) my-max([list: 4, 3, 2]) is num-max(4, 3) my-max([list: 3, 2]) is num-max(3, 2) my-max([list: 2]) is …
my-max([list: 2]) is num-max(2, …)
my-max([list: 2, 1, 4, 3, 2]) is num-max(2, my-max([list: 1, 4, 3, 2])) my-max([list: 1, 4, 3, 2]) is num-max(1, my-max([list: 4, 3, 2])) my-max([list: 4, 3, 2]) is num-max(4, my-max([list: 3, 2])) my-max([list: 3, 2]) is num-max(3, my-max([list: 2]))
my-max(empty) raises ""
6.5.2 my-max: From Examples to Code
fun my-max(l): cases (List) l: | empty => raise("not defined for empty lists") | link(f, r) => num-max(f, my-max(r)) end end
Do Now!
What’s wrong with this?
num-max(2, my-max([list: ]))
That’s because we need to make sure aren’t trying to compute the maximum of the empty list. There are two ways to do this. We’ll see one now, and return to the other way later [REF: accumulators].
fun my-max(l): cases (List) l: | empty => raise("not defined for empty lists") | link(f, r) => cases (List) r: | empty => … | … end end end
fun my-max(l): cases (List) l: | empty => raise("not defined for empty lists") | link(f, r) => cases (List) r: | empty => f | … end end end
fun my-max(l): cases (List) l: | empty => raise("not defined for empty lists") | link(f, r) => cases (List) r: | empty => f | else => num-max(f, my-max(r)) end end end
6.5.3 my-alternating: Examples and Code
my-alternating([list: 1, 2, 3, 4, 5, 6]) is [list: 1, 3, 5] my-alternating([list: 3, 4, 5, 6]) is [list: 3, 5] my-alternating([list: 5, 6]) is [list: 5] my-alternating([list: ]) is [list: ]
fun my-alternating(l): cases (List) l: | empty => empty | link(f, r) => link(f, … r …) end end
Do Now!
Think about how to complete this definition.
my-alternating([list: 1, 2, 3, 4, 5]) is [list: 1, 3, 5] my-alternating([list: 3, 4, 5]) is [list: 3, 5] my-alternating([list: 5]) is [list: 5]
fun my-alternating(l): cases (List) l: | empty => empty | link(f, r) => cases (List) r: # note: deconstructing r, not l | empty => # the list has an odd number of elements [list: f] | link(fr, rr) => # fr = first of rest, rr = rest of rest link(f, my-alternating(rr)) end end end
In my-alternating: Examples and Code we’ll see another way of approaching this problem.
6.6 More Structural Problems with Scalar Answers
6.6.1 my-avg: Examples
Let’s now try to compute the average of a list of numbers. Let’s start with the example list [list: 1, 2, 3, 4] and work out more examples from it. The average of numbers in this list is clearly (1 + 2 + 3 + 4)/4, or 10/4.
my-avg([list: 1, 2, 3, 4]) is 10/4 my-avg([list: 2, 3, 4]) is 9/3 my-avg([list: 3, 4]) is 7/2 my-avg([list: 4]) is 4/1
The average of the remainder of the list is 9/3, i.e., 3.
The first number in the list is 1.
[list: 1, 2] # the rest has one element with sum 2 [list: 1, 4, 0] # the rest has two elements with sum 4
That is, to compute the average of a whole list, it’s not even useful to know the average of the rest of the list. Rather, we need to know the sum and the length of the rest of the list. With these two, we can add the first to the sum, and 1 to the length, and compute the new average.
fun my-avg(l): my-sum(l) / my-len(l) end
Do Now!
What should be the average of the empty list? Does the above code produce what you would expect?
Just as we argued earlier about the maximum [Structural Problems with Sub-Domains], the average of the empty list isn’t a well-defined concept. Therefore, it would be appropriate to signal an error. The implementation above does this, but poorly: it reports an error on division. A better programming practice would be to catch this situation and report the error right away, rather than hoping some other function will report the error.
Exercise
Alter my-avg above to signal an error when given the empty list.
Therefore, we see that the process we’ve used—
6.7 Structural Problems with Accumulators
Now we are ready to tackle the problems we’ve left unfinished. They will require a new technique to solve.
6.7.1 my-running-sum: Examples and Code
Recall how we began in my-running-sum: First Attempt. Our examples [<running-sum-egs-1>] showed the following problem. When we process the rest of the list, we have forgotten everything about what preceded it. That is, when processing the list starting at 2 we forget that we’ve seen a 1 earlier; when starting from 3, we forget that we’ve seen both 1 and 2 earlier; and so on. In other words, we keep forgetting the past. We need some way of avoiding that.
Do Now!
What should the initial sum be?
my-rs :: Number, List<Number> -> List<Number>
my-rs( 0, [list: 1, 2, 3, 4, 5]) is [list: 0 + 1] + my-rs( 0 + 1, [list: 2, 3, 4, 5]) my-rs( 1, [list: 2, 3, 4, 5]) is [list: 1 + 2] + my-rs( 1 + 2, [list: 3, 4, 5]) my-rs( 3, [list: 3, 4, 5]) is [list: 3 + 3] + my-rs( 3 + 3, [list: 4, 5]) my-rs( 6, [list: 4, 5]) is [list: 6 + 4] + my-rs( 6 + 4, [list: 5]) my-rs(10, [list: 5]) is [list: 10 + 5] + my-rs(10 + 5, [list: ]) my-rs(15, [list: ]) is empty
fun my-rs(acc, l): cases (List) l: | empty => empty | link(f, r) => new-sum = acc + f link(new-sum, my-rs(new-sum, r)) end end
fun my-running-sum(l): my-rs(0, l) end
Observe that we do not change my-running-sum itself to take extra arguments. There are multiple reasons for this. [FILL]
6.7.2 my-alternating: Examples and Code
Recall our effort in my-alternating: First Attempt, which we tackled in my-alternating: Examples and Code. There, we solved the problem by thinking of the list a little differently: we try, as much as possible, to skip two elements of the list at a time, so the first element we see is one we always want to keep as part of the answer. Here we will see another way to think about the same problem.
my-alt :: List<Any>, Boolean -> List<Any>
fun my-alt(l, keep): cases (List) l: | empty => empty | link(f, r) => if keep: link(f, my-alt(r, false)) else: my-alt(r, true) end end
fun my-alternating(l): my-alt(l, true) end
6.8 Dealing with Multiple Answers
Our discussion above has assumed there is only one answer for a given input. This is often true, but it also depends on how the problem is worded and how we choose to generate examples. We will study this in some detail now.
6.8.1 uniq: Problem Setup
Consider the task of writing uniq:uniq is the name of a Unix utility with similar behavior; hence the spelling of the name. given a list of values, it produces a collection of the same elements while avoiding any duplicates (hence uniq, short for “unique”).
Consider the following input: [list: 1, 2, 1, 3, 1, 2, 4, 1].
Do Now!
What is the sequence of examples this input generates? It’s really important you stop and try to do this by hand. As we will see there are multiple solutions, and it’s useful for you to consider what you generate. Even if you can’t generate a sequence, trying to do so will better prepare you for what you read next.
How did you obtain your example? If you just “thought about it for a moment and wrote something down”, you may or may not have gotten something you can turn into a program. Programs can only proceed systematically; they can’t “think”. So, hopefully you took a well-defined path to computing the answer.
6.8.2 uniq: Examples
It turns out there are several possible answers, because we have (intentionally) left the problem unspecified. Suppose there are two instances of a value in the list; which one do we keep, the first or the second? On the one hand, since the two instances must be equivalent it doesn’t matter, but it does for writing concrete examples and deriving a solution.
examples: uniq([list: 1, 2, 1, 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1] uniq([list: 2, 1, 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1] uniq([list: 1, 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1] uniq([list: 3, 1, 2, 4, 1]) is [list: 3, 2, 4, 1] uniq([list: 1, 2, 4, 1]) is [list: 2, 4, 1] uniq([list: 2, 4, 1]) is [list: 2, 4, 1] uniq([list: 4, 1]) is [list: 4, 1] uniq([list: 1]) is [list: 1] uniq([list: ]) is [list: ] end
uniq([list: 1, 2, 1, 3, 1, 2, 4, 1]) is [list: 1, 2, 3, 4]
uniq([list: 1, 2, 1, 3, 1, 2, 4, 1]) is [list: 4, 3, 2, 1]
6.8.3 uniq: Code
What is the systematic approach that gets us to this answer? When given a non-empty list, we split it into its first element and the rest of the list. Suppose we have the answer to uniq applied to the rest of the list. Now we can ask: is the first element in the rest of the list? If it is, then we can ignore it, since it is certain to be in the uniq of the rest of the list. If, however, it is not in the rest of the list, it’s critical that we link it to the answer.
This translates into the following program. For the empty list, we return the empty list. If the list is non-empty, we check whether the first is in the rest of the list. If it is not, we include it; otherwise we can ignore it for now.
fun uniq-rec(l :: List<Any>) -> List<Any>: cases (List) l: | empty => empty | link(f, r) => if r.member(f): uniq-rec(r) else: link(f, uniq-rec(r)) end end end
Exercise
Note that we’re using .member to check whether an element is a member of the list. Write a function member that consumes an element and a list, and tells us whether the element is a member of the list.
6.8.4 uniq: Reducing Computation
fun uniq-rec2(l :: List<Any>) -> List<Any>: cases (List) l: | empty => empty | link(f, r) => ur = uniq-rec(r) if r.member(f): ur else: link(f, ur) end end end
You might think, because we replaced two function calls with one, that we’ve reduced the amount of computation the program does. It does not! The two function calls are both in the two branches of the same conditional; therefore, for any given list element, only one or the other call to uniq happens. In fact, in both cases, there was one call to uniq before, and there is one now. So we have reduced the number of calls in the source program, but not the number that take place when the program runs. In that sense, the name of this section was intentionally misleading!
fun uniq-rec3(l :: List<Any>) -> List<Any>: cases (List) l: | empty => empty | link(f, r) => ur = uniq-rec(r) if ur.member(f): ur else: link(f, ur) end end end
Exercise
Later [Predicting Growth] we will study how to formally study how long a program takes to run. By the measure introduced in that section, does the change we just made make any difference? Be careful with your answer: it depends on how we count “the length” of the list.
Observe that if the list never contained duplicates in the first
place, then it wouldn’t matter which list we check membership in—
6.8.5 uniq: Example and Code Variations
Start with the entire given list and with the empty answer (so far).
For each list element, check whether it’s already in the answer so far. If it is, ignore it, otherwise extend the answer with it.
When there are no more elements in the list, the answer so far is the answer for the whole list.
6.8.6 uniq: Why Produce a List?
If you go back to the original statement of the uniq problem [uniq: Problem Setup], you’ll notice it said nothing about what order the output should have; in fact, it didn’t even say the output needs to be a list (and hence have an order). In that case, we should think about whether a list even makes sense for this problem. In fact, if we don’t care about order and don’t want duplicates (by definition of uniq), then there is a much simpler solution, which is to produce a set. Pyret already has sets built in, and converting the list to a set automatically takes care of duplicates. This is of course cheating from the perspective of learning how to write uniq, but it is worth remembering that sometimes the right data structure to produce isn’t necessarily the same as the one we were given. Also, later [(part "sets")], we will see how to build sets for ourselves (at which point, uniq will look familiar, since it is at the heart of set-ness).
6.9 Monomorphic Lists and Polymorphic Types
my-len :: List<Any> -> Number my-max :: List<Any> -> Any
Exercise
What happens if we run 1 > "two" or "two" > 3?
Rather, what we mean is a list where all the elements are of the same kind,Technically, elements that are also comparable. and the contract has not captured that. Furthermore, we don’t mean that my-max might return any old type: if we supply it with a list of numbers, we will not get a string as the maximum element! Rather, it will only return the kind of element that is in the provided list.
In short, we mean that all elements of the list are of the same type, but they can be of any type. We call the former monomorphic: “mono” meaning one, and “morphic” meaning shape, i.e., all values have one type. But the function my-max itself can operate over many of these kinds of lists, so we call it polymorphic (“poly” meaning many).
fun my-max<T>(l :: List<T>) -> T: … end
fun my-len<T>(l :: List<T>) -> Number: … end