4 State and Change
4.1 A Canonical Mutable Structure
As we have motivated ((part "union-find-functional")), sometimes it’s nice to be able to change the value of a datum rather than merely construct a new one with an updated value. The main advantage to changing it is that every value that refers to it can now see this change. The main disadvantage to changing it is that every value that refers to it can now see this change. Using this power responsibly is therefore an important programming challenge.
box consumes a value and creates a mutable box containing that value.
unbox consumes a box and returns the value contained in the box.
set-box consumes a box, a new value, and changes the box to contain the value. All subsequent unboxes of that box will now return the new value.
class Box<T> { |
private T the_value; |
Box(T v) { |
this.the_value = v; |
} |
T get() { |
return this.the_value; |
} |
void set(T v) { |
this.the_value = v; |
} |
} |
data Box: | box(ref v) where: n1 = box(1) n2 = box(2) n1!{v : 3} n2!{v : 4} n1!v is 3 n2!v is 4 end
Why do we say “type-consistent” above, rather than “the same type”?
The values could be related by subtyping ((part "subtyping")).
What does the comment about longevity mean? How does this apply to reusing values extracted from fields?
4.2 What it Means to be Identical
b0 = box("a value") |
b1 = box("a value") |
b2 = b1 |
check: b0!v == b1!v is true identical(b0, b1) is false identical(b1, b2) is true b1!v is b2!v end
hold-b1-value = b1!v |
b1!{v: "a different value"} |
check: b0!v == b1!v is false identical(b0, b1) is false identical(b1, b2) is true b1!v is b2!v end
b1!{v: hold-b1-value}
hold-b2-value = b2!v b2!{v: "yet another value"} check: b0!v == b1!v is false identical(b0, b1) is false identical(b1, b2) is true b1!v is b2!v b2!{v: hold-b2-value} end
b0!{v: "yet another value"} check: b0!v == b1!v is false end
Now, why did we bother holding on to and restoring the value? It’s
because at the end of each of these sequences, the values have all been
restored, but by making the change, we have been able to observe which
other names detect the change and which ones do not—
In practice, identical does not behave this way: it would be
too disruptive—
4.3 From Identifiers to Variables
As we have just seen, mutable values can be aliased, which means references can inadvertently have their values changed. Because these values can be passed around, it can be difficult to track all the aliases that might exist (because it would be infeasible for a value to retain “backward references”).
var x = 0 x := 1
x = 1; |
x = 3; |
Now, we also use the term “variable” in mathematics to refer to function parameters. For instance, in f(y) = y+3 we say that y is a “variable”. That is called a variable because it varies across invocations; however, within each invocation, it has the same value in its scope. Our identifiers until now have corresponded to this mathematical notion of a variable.If the identifier was bound to a box, then it remained bound to the same box value. It’s the content of the box that changed, not which box the identifier was bound to. In contrast, programming variables can vary even within each invocation, like the Java x above.
Henceforth, we will use variable when we mean an identifier whose value can change within its scope, and identifier when this cannot happen. If in doubt, we might play it safe and use “variable”; if the difference doesn’t really matter, we might use either one. It is less important to get caught up in these specific terms than to understand that they represent a distinction that matters (Mutation: Structures and Variables).
4.4 Interaction of Mutation with Closures: Counters
check: l1 = mk-counter() l1() is 1 l1() is 2 l2 = mk-counter() l2() is 1 l1() is 3 l2() is 2 end
We now see how we can implement this using both mutable structures (specifically, boxes) and variables.
4.4.1 Implementation Using Boxes
fun mk-counter(): ctr = box(0) lam(): ctr!{v : (ctr!v + 1)} ctr!v end end
fun mk-broken-counter(): lam(): ctr = box(0) ctr!{v : (ctr!v + 1)} ctr!v end where: l1 = mk-broken-counter() l1() is 1 l1() is 1 l2 = mk-broken-counter() l2() is 1 l1() is 1 l2() is 1 end
The examples above hint at an implementation necessity. Clearly, whatever the environment closes over in the procedure returned by mk-counter must refer to the same box each time. Yet something also needs to make sure that the value in that box is different each time! Look at it more carefully: it must be lexically the same, but dynamically different. This distinction will be at the heart of a strategy for implementing state (Mutation: Structures and Variables).
4.4.2 Implementation Using Variables
fun mk-counter(): var ctr = 0 lam(): ctr := ctr + 1 ctr end where: l1 = mk-counter() l1() is 1 l1() is 2 l2 = mk-counter() l2() is 1 l1() is 3 l2() is 2 end
fun mk-broken-counter(): lam(): var ctr = 0 ctr := ctr + 1 ctr end where: l1 = mk-broken-counter() l1() is 1 l1() is 1 l2 = mk-broken-counter() l2() is 1 l1() is 1 l2() is 1 end
4.5 Stateful but Equal
The binary operator ==, which is also used as the equality comparison by in when testing.
identical, also written as <=>.
check: E1(b0, b1) is true E1(b1, b2) is true end
> b0 |
box("a value") |
> b1 |
box("a different value") |
> b2 |
box("a different value") |
check: E1(b0, b1) is false E1(b1, b2) is true end
Confirm that equal-now does indeed have the properties ascribed to E1 above.
check: (b0 == b1) is false (b1 == b2) is true identical(b0, b1) is false identical(b1, b2) is true end
b0 = box("a value") b1 = box("a value") b2 = b1 l0 = [list: b0] l1 = [list: b1] l2 = [list: b2]
check: identical(l0, l1) is false identical(l1, l2) is false end
check: equal-now(l0, l1) is true equal-now(l1, l2) is true end
check: (l0 == l1) is false (l1 == l2) is true end
What might == represent that is interestingly different from both identical and equal-now? When it returns true, it is that the two values will “print the same” now and forever. How is this possible? It is because == recursively checks that the two arguments are structural until it gets to a mutable field; at that point, it checks that they are identical. If they are identical, then any change made to one will be reflected in the other (because they are in fact the same mutable field). That means their content, too, will always “print the same”. Therefore, we can now reveal the name given to ==: it is equal-always.
4.5.1 A Hierarchy of Equality
Observe that if two values v1 and v2 are equal-now, they are not necessarily equal-always; if they are equal-always, they are not necessarily identical. We have seen examples of both these cases above.
In contrast, if two values are identical, then they are
certainly going to be equal-always. That is because their
mutable fields reduce to identical, while the immutable
parts—
In most languages, it is common to have two equality operators, corresponding to identical (known as reference equality) and equal-now (known as structural equality). Pyret is rare in having a third operator, equal-always. For most programs, this is in fact the most useful equality operator: it is not overly bothered with details of aliasing, which can be difficult to predict; at the same time it makes decisions that stand the test of time, thereby forming a useful basis for various optimizations (which may not even be conscious of their temporal assumptions). This is why is in testing uses equal-always by default, and forces users to explicitly pick a different primitive if they want it.
4.5.2 Space and Time Complexity
identical always takes constant time. Indeed, some programs use identical precisely because they want constant-time equality, carefully structuring their program so that values that should be considered equal are aliases to the same value. Of course, maintaining this programming discipline is tricky.
equal-always and equal-now both must traverse at least the immutable part of data. Therefore, they take time proportional to the smaller datum (because if the two data are of different size, they must not be equal anyway, so there is no need to visit the extra data). The difference is that equal-always reduces to identical at references, thereby performing less computation than equal-now would.
- Use a quick check first, use the slower check only if the quick one fails. This corresponds to the following strategy:
fun my-eq(v1, v2) -> Boolean: identical(v1, v2) or equal-always(v1, v2) end
which has the following behavior: { check: my-eq(b0, b1) is false my-eq(b1, b2) is true my-eq(l0, l1) is false my-eq(l1, l2) is true end } It is worth noting that the above behavior is exactly the same as that of equal-always, which is the point: it is just often faster, when it can discharge the equality using identical without having to traverse the data. (Observe that this is a safe optimization because identical implies equal-always.) Use a different equality strategy entirely, if possible: Set Membership Again.
4.5.2.1 Comparing Functions
identical
b1!{v: "a different value"}
We have discussed (The Dawn of a New Equality) the difference between these two operators. However, we have not exhausted the kinds of equality we need, and that Pyret provides.