2 Basic Data and Expressions
Let’s think about some of the programs we use. We might use Google, a
search engine: it consumes a search query and produces
search results. We may use Facebook, a social network: it
consumes information about our friendships and produces
updates from our friends. We could use Amazon, a store: it
consumes descriptions of things we’re interested in and
produces lists of products that match our descriptions. We
sometimes use Weather.com, a weather site: it consumes our
location and produces a weather forecast for that location. In
fact these systems consume and produce even more: they consume our
history of past preferences, and produce ads, recommendations for
related products [REF collaborative filtering], and so on. In short, programs consume and
produce information.
Information is a fuzzy term; what computers actually consume and
produce are what we call data. The difference is subtle but
important. “Our location” is a vague concept, but what a weather
site actually consumes is a very concrete representation of it,
such as the name of a city (like “Providence, RI, USA”) or its GPS
coordinate (like “41.8236 N, 71.4222 W”). There might even be
multiple choices for how to represent that information. For instance,
some systems might represent Providence’s GPS coordinates as we have
shown above, with a letter for each hemisphere (“41.8236 N, 71.4222 W”).
Others might choose to use positive and negative numbers to represent
the hemisphere (such as 41.8236 and -71.4222, treating north and east as positive).
You could even imagine combining the two into a single number through
some clever numeric trick [REF gödel encoding], and so on. In short, these are
different kinds of data that represent the same information.
Therefore, to write programs, we have to understand their data. We
will begin by understanding the most basic kinds of data, and writing
some simple programs with them. As our programs get more
sophisticated, we will need to represent more interesting kinds of
information.
Much of our presentation is directly derived
from the book How to Design Programs, which you can find
at http://htdp.org/.
2.1 Numbers
Some of the most common data in programs are numeric. They might
represent a GPS coordinate as above, or a person’s age, or the size of
a picture on a screen, or the number of results to a search
query. Numbers are easy to represent in Pyret; for instance, here’s
the year Brown University was founded:
1764
Unsurprisingly, if you ask Pyret for the value of a number, Pyret gives you
back the same value you entered:
> 1764
1764
What use are programs whose values we already know? You’ll find
out later [REF values-for-tests].
Of course, you can combine numbers in ways you will
recognize; we call these expressions, just as in
algebra (and yes, you must put spaces around operators like
+):
> 1764 + 1729
3493
> 1764 - 1729
35
> 1764 * 1729
3049956
If you have prior programming experience, you may worry a little about
multiplying big numbers. Don’t worry, go wild:
> 1764 * 1729 * 1764 * 1729 * 1764 * 1729 * 1764 * 1729
86531512776056800758948096
The examples use only one operator per expression. What happens if we
try to combine multiple different operators?
In Pyret, if you combine operators, the language doesn’t want to guess
what you might have meant (because it might guess wrong): if you write
3 - 2 + 1,
did you mean 3 - (2 + 1) or (3 - 2) + 1? Rather than
guess, it asks that you write parentheses (as we just did) to make
your intent clear.
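As a small illustration (a sketch of an interactions session, not taken from the original examples), here are the two parenthesized readings of the same three numbers. Each is accepted because the grouping is explicit, and they produce different values, which is exactly why Pyret declines to guess:

```pyret
> 3 - (2 + 1)
0
> (3 - 2) + 1
2
```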
Of course, there are many operations we can perform on numbers,
and we don’t have special symbols for all of them. Pyret provides
many more numeric operators, which you use with the syntax you recognize from math
classes. For example, the operator for numeric square root is called
num-sqrt, and you apply it to a number by writing num-sqrt(4).
If you have multiple parameters to the operator, separate them with
commas, as in num-expt(2, 3),
where num-expt raises its first parameter to the power of its second.
A natural question you might have is, “Do I need to put spaces after
the commas between parameters?” Good question! Go ahead and check for
yourself.
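Here is a short sketch of how such named operators look at the interactions prompt. The particular inputs (4, and 2 raised to the power 3) are chosen only for illustration:

```pyret
> num-sqrt(4)
2
> num-expt(2, 3)
8
```

Both answers are exact because the inputs were chosen to have whole-number results.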
If you have experience with other programming languages, you might now
start to wonder about different kinds of numbers. For instance, in
many languages, the expression
(1 / 3) * 3
unfortunately results in zero. (Don’t ask. Or, well, ask
later.) Fortunately, in Pyret this produces 1. That’s because
Pyret represents the result of 1 / 3 exactly as a ratio, just
as you’d expect. In fact, Pyret tries to preserve ratios at all times:
if you compute
(1764 * 1729 * 1764) / (1764 * 1729 * 1764 * 1729)
Pyret will produce an answer that is equivalent to
1/1729. (In fact, if you click on the output produced
by Pyret, you can switch between different ways of presenting
the same datum, one of which is 1/1729.)
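One way to convince yourself of this without relying on how the answer is displayed is to have Pyret compare the results for you. The sketch below uses a check block, Pyret's form for writing tests, which this text has not introduced yet; read each line as "the expression on the left should produce the value on the right":

```pyret
check "ratios are kept exact":
  # Dividing and then multiplying by 3 gets back exactly 1
  (1 / 3) * 3 is 1
  # The common factors cancel, leaving exactly 1/1729
  (1764 * 1729 * 1764) / (1764 * 1729 * 1764 * 1729) is 1/1729
end
```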
What happens if you try to divide by zero?
What do you expect?
What does Pyret do?
2.2 Expressions Versus Values
Before we proceed, we have to introduce one piece of terminology. A
value is the result of an expression: we are done computing with
it and can’t do any more. Thus, 1764 * 1729 is not a
value, because we can still perform the multiplication; but
3049956 is a value because it has no more operations waiting to
be done.
Let’s look at another example of expressions. Consider a space
traveler visiting the Moon. On the Moon, this astronaut weighs only
one-sixth their weight on Earth. Thus, if their Earth weight is 50kg,
their Moon weight (in kilograms) is
50 * 1/6
If their Earth weight is 150lb, their Moon weight (in pounds) is
150 * 1/6
In general, given their Earth weight we can write an expression to
calculate their Moon weight. We’ll return to this example soon
[REF].
Note that Pyret does not care whether 50 represents
kilograms or pounds. These are units, and units are not part of
the programming language. This makes it easy to create unit errors,
which have an infamous history. Later [REF] we will see how we can write some
unit processing into our programs to reduce such errors.
Do you see something curious in the fractions we’ve written in the
previous section and this one?
This is subtle, so don’t worry if you didn’t catch it: we wrote
1/1729 instead of 1 / 1729, and 1/6 instead of
1 / 6. Weren’t all arithmetic operations (like /)
supposed to be surrounded by spaces?
That’s because 1/1729 is not an expression, it’s a
value—specifically, a number—in itself. That is, 1 / 1729
is an expression; there is still work that needs to be done (the
division) to obtain a value. In contrast, Pyret’s way of writing
numbers has a special allowance for writing rational numbers directly:
you can write 1/1729 just as well as you can write 1 and
1729, and it’s just as much of a value.
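The distinction can be checked directly. This sketch again borrows the check block (Pyret's testing form, covered later) to confirm that the expression 1 / 6 evaluates to the value written 1/6, and that the Moon-weight expressions from this section reduce to exact rationals:

```pyret
check "rational literals are values":
  # An expression (with spaces) on the left, a literal value on the right
  1 / 6 is 1/6
  # Moon weight for an Earth weight of 50 (kilograms)
  50 * 1/6 is 25/3
  # Moon weight for an Earth weight of 150 (pounds)
  150 * 1/6 is 25
end
```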
2.3 Variables to Name Values
It’s often more convenient to refer to the name of something than its
value. For instance, six months from now if you saw 1764 in a
program, you might have no recollection of what that means; but if you
instead saw brown-founding you’d have a pretty good
sense. Pyret lets us give names to values:
brown-founding = 1764
this-year = 2016
We can then use the names in expressions:
this-year - brown-founding
Indeed, we don’t have to name only values; we can even name
expressions:
brown-age = this-year - brown-founding
We call these names variables, just as in algebra. We’ll use the
word bound to refer
to the association of names with values: that is,
brown-founding is bound to 1764. (“Bound” here is meant in the sense of “tied
down”, because the variable is now tied down to that value.)
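To see the bindings at work, here is a sketch of an interactions session that uses the same names as above. Asking for a name gives back the value it is bound to:

```pyret
> brown-founding = 1764
> this-year = 2016
> brown-age = this-year - brown-founding
> brown-age
252
```

Note that brown-age is bound to 252, the value of the expression, not to the expression itself.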
2.4 Strings
Of course, computers process much more than just numbers. We might
want to write down the name of a city, the name of a person, parts of
a document, and so on. In Pyret, strings are used to represent
such text:
"Providence, RI"
"Bangalore, India"
"the quick brown fox jumped over the lazy dog"
Strings begin and end in double-quotes; note that you have to use the
double-quotes on your keyboard ("), not opening- and
closing-quotes (“ and ”) as generated by word processors. You can
also use single-quotes (') instead:
'Providence, RI'
'Bangalore, India'
'the quick brown fox jumped over the lazy dog'
Naturally, you might wonder how Pyret can tell where a string ends if
the string itself contains a double-quote:
"The book was called "Structure and Interpretation""
What happens when you enter this?
Is that two strings ("The book was called " and "") with
some variables (like Structure) and other stuff in-between?
Needless to say, Pyret gets confused. So how do you enter this?
There are two easy ways. First, we can use the other quotes to avoid
confusion:
'The book was called "Structure and Interpretation"'
Or (as you may have noticed when Pyret prints the above value), you
can tell Pyret “this quote is part of the string, not the end of
it”:
"The book was called \"Structure and Interpretation\""
Either will work fine; pick whichever is more readable and less likely
to cause confusion.
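If you would like reassurance that the two spellings really denote the same string, you can ask Pyret to compare them. This sketch uses the == comparison, which is introduced in the Booleans section below:

```pyret
> 'The book was called "Structure and Interpretation"' == "The book was called \"Structure and Interpretation\""
true
```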
2.4.1 Multi-Line Strings
It’s sometimes convenient to have a string that spans
multiple lines. Usually, strings that go beyond one line represent an
error in the program (because the programmer forgot to close the
string). Therefore, if you try to write a multi-line string in your
program, Pyret will produce an error:
"Let us go then, you and I,
When the evening is spread out against the sky"
Try this out and see the resulting error. Become familiar with it!
Pyret instead uses a very different syntax for multi-line strings:
```Let us go then, you and I,
When the evening is spread out against the sky```
Using the triple-back-tick is your way of signaling to Pyret, “I
really do want
this to span multiple lines”. Notice that when Pyret prints this
string, it replaces the newline with \n. Indeed, you could have
written this multi-line string all on one line as:
"Let us go then, you and I,\nWhen the evening is spread out against the sky"
Again, the two are equivalent, and you can choose whichever is more
convenient.
2.4.2 Operations on Strings
Of course, we can not only create strings, we can also perform
computations with them—essentially, “arithmetic with strings”. For
instance, we can concatenate them:
> "will." + "i." + "am"
"will.i.am"
> "The Lovesong " + "of J. Alfred Prufrock"
"The Lovesong of J. Alfred Prufrock"
Observe the space at the end of the first string.
We can also measure the length of strings, or take them apart:
> s = "The Lovesong"
> string-length(s)
12
> string-substring(s, 0, 3)
"The"
2.5 Booleans
We have now talked several times about two values being
equivalent. Right now we’re trusting our eyes to tell that they’re
equal, but we can do better: we can ask Pyret to check for us.
Before we do so, we should ask ourselves how Pyret can report back
what it finds. For instance, if we ask whether two strings are
equal, the answer might be yes and it might be no. What data
should Pyret use to indicate the answer?
Explain why numbers and strings are not good ways to express the
answer.
Pyret offers two values,
true and
false, to represent
such answers. For historical reasons, these are called Boolean
values, named for George Boole. We can
compare two values for equality with
==:
> 1 == 1
true
> 1 == 2
false
There is
much more we can and should say about equality, which we will do later
[REF equality].
Thus, returning to our two earlier examples (note that we
don’t need to create variables to perform a comparison, just as we
didn’t above; the variables are only there to make the examples easier to
read):
n1 = (1764 * 1729 * 1764) / (1764 * 1729 * 1764 * 1729)
n2 = 1/1729
n1 == n2
s1 = ```Let us go then, you and I,
When the evening is spread out against the sky```
s2 = "Let us go then, you and I,\nWhen the evening is spread out against the sky"
s1 == s2
We expect both to result in true, whereas this does not:
```Let us go then, you and I,
When the evening is spread out against the sky``` ==
"Let us go then, you and I, When the evening is spread out against the sky"
(because we replaced the \n with a space).
In particular, we are using the language to determine when things are
equal, rather than using our (perhaps wobbly) eyesight. This is a
really important idea, which we will return to in much more detail
later [REF testing].
2.5.1 Other Comparisons
Of course, we can do many more kinds of comparisons. For instance,
numbers can be compared with <, <=, >, and >=.
However, it may not be obvious how to compare strings. Some
comparisons are unsurprising:
> "a" < "b"
true
> "a" >= "c"
false
> "that" < "this"
true
> "alpha" < "beta"
true
which is the alphabetical order we’re used to;
but others need some explaining:
> "a" >= "C"
true
> "a" >= "A"
true
This is because Pyret strings are compared using an ordering called
the
ASCII order, in which every uppercase letter comes before every lowercase letter.
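To see where this ordering comes from, it can help to look at the number that stands behind each character. The sketch below assumes Pyret's string-to-code-point operator, which this text does not otherwise introduce; it takes a one-character string and produces that character's numeric code:

```pyret
> string-to-code-point("A")
65
> string-to-code-point("C")
67
> string-to-code-point("a")
97
```

Because 97 is larger than both 65 and 67, "a" comes after both "A" and "C".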
Can you compare true and false? Try comparing them for
equality, then for inequality (such as <).
In general, you can compare any two values for equality, even values of
different kinds (but read more at [REF equality]).
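A few comparisons one might try, given here as an illustrative sketch rather than as the text's original examples. The last one compares a string with a number: == does not complain, it simply reports that they are not equal:

```pyret
> "a" == "a"
true
> "a" == "b"
false
> "a" == 1
false
```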
If you want to compare values of a specific kind, you can use more
specific operators:
> num-equal(1, 1)
true
> num-equal(1, 2)
false
> string-equal("a", "a")
true
> string-equal("a", "b")
false
However, these operators will not let you mix two different kinds of
values.
Try
num-equal("a", 1)
string-equal("a", 1)
and understand how these operators relate to ==.
2.5.2 Other Boolean-Producing Operations
There are even more Boolean-producing operators, such as:
> wm = "will.i.am"
> string-contains(wm, "will")
true
> string-contains(wm, "Will")
false
Note the capital W in the second query.
In fact, just about every kind of data will have some Boolean-valued
operators to enable comparisons.
2.5.3 Combining Booleans
Often, we want to base decisions on more than one Boolean value. For
instance, you are allowed to vote if you’re a citizen of a country
and you are above a certain age. You’re allowed to board a bus
if you have a ticket or the bus is having a free-ride day. We
can even combine conditions: you’re allowed to drive if you are above
a certain age and have good eyesight and either
pass a test or have a temporary license. Also, you’re allowed
to drive if you are not inebriated.
Corresponding to these forms of combinations, Pyret offers three main
operations: and, or, and not. Here are some
examples of their use:
> (1 < 2) and (2 < 3)
true
> (1 < 2) and (3 < 2)
false
> (1 < 2) or (2 < 3)
true
> (3 < 2) or (1 < 2)
true
> not(1 < 2)
false
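The driving rule described above can be written with the same three operations. This is only a sketch: the variable names and the age threshold of 18 are assumptions made up for illustration. Notice the parentheses around the or, which are needed because it is mixed with and:

```pyret
> age = 20
> has-good-eyesight = true
> passed-test = false
> has-temporary-license = true
> is-inebriated = false
> (age >= 18) and has-good-eyesight and (passed-test or has-temporary-license) and not(is-inebriated)
true
```

Changing is-inebriated to true, or both passed-test and has-temporary-license to false, would make the whole expression false.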
2.5.4 Using Booleans
One way to use Boolean values is to combine them to produce other
Boolean values. Ultimately, however, we’d like to produce other kinds
of data as well depending on a Boolean. There are several ways to do
this in Pyret, but for now we’ll focus on just one: the
conditional expression (so called because the value it produces is
conditional on the value of a Boolean).
The structure of a conditional in Pyret is as follows:
if <Boolean expression to check>:
  <expression if true>
else:
  <expression if false>
end
For instance,
brown-founding = 1764
rice-founding = 1912
if brown-founding < rice-founding:
  "Brown is older"
else:
  "Rice is older"
end
Actually, this program contains a small logical error. Suppose Brown
and Rice were founded in the same year; then, because
brown-founding < rice-founding would be false, it would
declare Rice older despite them being the same age. So a more accurate
program would check whether they’re equal, and only report one as
older if it really is. We can do this with an expanded version of the
conditional, whose syntax is:
if <Boolean expression to check>:
  <expression if first expression is true>
else if <another Boolean expression to check>:
  <expression if first expression is false and second expression is true>
else:
  <expression if both expressions are false>
end
For instance:
if brown-founding < rice-founding:
  "Brown is older"
else if brown-founding == rice-founding:
  "Both are the same age"
else:
  "Rice is older"
end
2.6 Evaluating by Reducing Expressions
Finally, let us briefly talk about how Pyret produces values, i.e.,
the process of evaluation (reducing to values). Suppose we want
to compute the wages of a worker. The worker is paid $10 for every
hour up to the first 40 hours, and is paid $15 for every extra
hour. Let’s say hours contains the number of hours they work,
and suppose it’s 45:
hours = 45
Suppose the formula for computing the wage is
if hours <= 40:
  hours * 10
else if hours > 40:
  (40 * 10) + ((hours - 40) * 15)
end
Let’s now see how this results in an answer, using a step-by-step
process that should match what you’ve seen in algebra
classes. The first step is to substitute
45 for hours.
if 45 <= 40:
  45 * 10
else if 45 > 40:
  (40 * 10) + ((45 - 40) * 15)
end
Next, the conditional part of the if expression is evaluated,
which in this case is false.
=> if false:
     45 * 10
   else if 45 > 40:
     (40 * 10) + ((45 - 40) * 15)
   end
Since the condition is false, the next branch is tried.
=> if false:
     45 * 10
   else if true:
     (40 * 10) + ((45 - 40) * 15)
   end
Since the condition is true, the expression reduces to the body
of that branch. After that, it’s just arithmetic.
=> (40 * 10) + ((45 - 40) * 15)
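For completeness, the remaining arithmetic can be spelled out in the same style, one operation per step. This sketch assumes the operations are carried out from left to right, innermost parentheses first:

```
=> 400 + ((45 - 40) * 15)
=> 400 + (5 * 15)
=> 400 + 75
=> 475
```

So a worker who puts in 45 hours earns $475: $400 for the first 40 hours and $75 for the 5 extra hours.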
This style of reduction is the best way to think about the evaluation
of Pyret expressions (and later, functions). The whole expression
takes steps that simplify it, proceeding by simple rules. You can use
this style yourself if you want to try and work through the evaluation
of a Pyret program by hand (or in your head).
2.7 Images
Pyret doesn’t limit you to numbers and strings; you can also treat
images as data. To use images, we should ask Pyret for the image
operations:
include image
We can draw a red circle:
red-circ = circle(60, "solid", "red")
We can also draw a white rectangle:
white-rect = rectangle(300, 200, "solid", "white")
Just as we can combine numbers, strings, and Booleans, we can also
combine images (i.e., perform “arithmetic on images”). For instance,
overlay will lay one image (the first one) atop the second:
overlay(red-circ, white-rect)
to obtain the
Japanese flag.
Similarly, we can place three circles one above the other—
sm-circ = circle(50, "outline", "black")
above(sm-circ, above(sm-circ, sm-circ))
—to get the first stage of a snowperson (have fun drawing in the
rest!).
Revisiting the Japanese flag for a moment, there are actually rules
about the ratio of the different dimensions of the flag: the numbers
above were not chosen mindlessly. For instance, the width and height
must be in a 3:2 ratio (hence 300 and 200), and the red
circle must have a diameter 3/5 of the overall height. If we
now want to draw a bigger flag, we would have to carefully change many
things to preserve these rules! Alternatively, we could use a variable
to represent, say, one “unit” of size, and calculate
everything else from there:
unit = 100
bg-width = unit * 3
bg-height = unit * 2
circ-rad = 3/5 * 1/2 * bg-height
red-circ = circle(circ-rad, "solid", "red")
white-rect = rectangle(bg-width, bg-height, "solid", "white")
overlay(red-circ, white-rect)
This now makes it easy to change just one thing and have everything
else automatically change: each time we run the program the size of
the flag depends on the value of unit. If we make it 200,
we obtain a flag twice as large; if we make it 50, we obtain
one half as big.
Of course, each time we have to keep running the program to compute
these values afresh, a problem we’d like to avoid; we’ll return to
this later [FILL]. Later we will also see how we can combine images
with other computation to make movies, animations, and videogames [REF
world].
2.8 Roughnums
Before we conclude looking at basic data, we have to say a little more
about numbers. The numeric examples we’ve picked above have been
chosen conveniently to not reveal a certain ugliness about how
computers treat numbers. Though Pyret masks this ugliness for the most
part, for various reasons (mainly having to do with efficiency), it
does not mask it entirely.
The best way to see this problem is to ask for the square root of
2: num-sqrt(2). This number does not have a precise
rational representation. Pyret still computes an answer, but prints it
in a curious way:
> num-sqrt(2)
~1.4142135623730951
What is going on here? Pyret wants to make sure you understand that
what it has printed, 1.4142135623730951, cannot possibly be
the exact answer (because \(\sqrt{2}\) has no rational
representation). The ~ prefix means this is a
roughnum: an approximate answer that you should treat with
caution.
In practice, roughnums are represented using
floating-point numbers,
which are approximate but implemented efficiently in modern computers.
Roughnums have a pervasive property: once a computation
involves a roughnum, its answer will also be a roughnum.
Sometimes, this might be surprising:
> num-sqrt(2) - num-sqrt(2)
~0
You might think “Any number minus itself must be exactly zero”, but
Pyret doesn’t “know” as much as you do, and hedges its bets.
This turns out to be rather wise. For instance, consider this
calculation:
> num-sqrt(2) + 1
~2.414213562373095
Look more closely at the digits after the decimal. You would expect
them to be exactly the same as those for num-sqrt(2),
but the first one seems to have an extra digit:
~1.4142135623730951
~2.414213562373095
In fact, subtracting 1 from the second number produces
something slightly different from the first number:
> (num-sqrt(2) + 1) - 1
~1.414213562373095
So is this or isn’t this exactly the same as num-sqrt(2)? Let’s
ask Pyret:
> ((num-sqrt(2) + 1) - 1) - num-sqrt(2)
~-2.220446049250313e-16
The notation e means an exponential representation, so this is
the same as roughly \(2.2 \times 10^{-16}\)—i.e., a very small
number, but not exactly zero! Thus, starting with \(\sqrt{2}\),
adding one, subtracting one, and then subtracting \(\sqrt{2}\)
produces an answer that is not precisely zero.
Do roughnums obey the axioms of arithmetic operators like (where
appropriate) distributivity, associativity, and commutativity?
Because roughnums are so brittle, Pyret doesn’t let you get misled by
equality operations. If you ask whether 1 is equal to ~1,
Pyret is not sure: the ~1 represents some value that is
approximately 1, but it could be exactly that or something
close to but not exactly it (just like we saw a value that was close
to but not exactly zero even though in reality we knew it was
zero). Therefore, Pyret regards this as neither true nor
false but rather as an error. Later [REF] we will see
how to compare roughnums.
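As a brief preview, and only as a sketch: Pyret's check blocks (its form for writing tests, covered later) offer is-roughly, which accepts two numbers that agree up to a small tolerance instead of demanding exact equality. Assuming that form, the troublesome computations above can be compared like this:

```pyret
check "comparing roughnums approximately":
  # Squaring the approximate square root lands very close to 2
  num-sqrt(2) * num-sqrt(2) is-roughly 2
  # Adding and then subtracting 1 lands very close to where we started
  (num-sqrt(2) + 1) - 1 is-roughly num-sqrt(2)
end
```

The tolerance is chosen by Pyret here; the later discussion explains how to pick one yourself.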
A full explanation of floating-point arithmetic is well beyond the scope of this
document; this section just warns you of what you must
beware. The interested reader should find a copy of “What
Every Computer Scientist Should Know About Floating-Point Arithmetic”
by David Goldberg.