10 Sharing and Equality
10.1 Re-Examining Equality
data BinTree: | leaf | node(v, l :: BinTree, r :: BinTree) end a-tree = node(5, node(4, leaf, leaf), node(4, leaf, leaf)) b-tree = block: four-node = node(4, leaf, leaf) node(5, four-node, four-node) end
check: |
a-tree is b-tree |
a-tree.l is a-tree.l |
a-tree.l is a-tree.r |
b-tree.l is b-tree.r |
end |
However, there is another sense in which these trees are not equivalent. concretely, a-tree constructs a distinct node for each child, while b-tree uses the same node for both children. Surely this difference should show up somehow, but we have not yet seen a way to write a program that will tell these apart.
check: identical(a-tree, b-tree) is false identical(a-tree.l, a-tree.l) is true identical(a-tree.l, a-tree.r) is false identical(b-tree.l, b-tree.r) is true end
check: a-tree is-not%(identical) b-tree a-tree.l is%(identical) a-tree.l a-tree.l is-not%(identical) a-tree.r b-tree.l is%(identical) b-tree.r end
check: a-tree is b-tree a-tree is-not%(identical) b-tree a-tree.l is a-tree.r a-tree.l is-not%(identical) a-tree.r end
10.2 The Cost of Evaluating References
From a complexity viewpoint, it’s important for us to understand how these references work. As we have hinted, four-node is computed only once, and each use of it refers to the same value: if, instead, it was evaluated each time we referred to four-node, there would be no real difference between a-tree and b-tree, and the above tests would not distinguish between them.
L = range(0, 100)
L1 = link(1, L) L2 = link(-1, L)
check: L1.rest is%(identical) L L2.rest is%(identical) L L1.rest is%(identical) L2.rest end
fun check-for-no-copy(another-l): identical(another-l, L) end check: check-for-no-copy(L) is true end
check: L satisfies check-for-no-copy end
10.3 On the Internet, Nobody Knows You’re a DAG
Despite the name we’ve given it, b-tree is not actually a tree. In a tree, by definition, there are no shared nodes, whereas in b-tree the node named by four-node is shared by two parts of the tree. Despite this, traversing b-tree will still terminate, because there are no cyclic references in it: if you start from any node and visit its “children”, you cannot end up back at that node. There is a special name for a value with such a shape: directed acyclic graph (DAG).
Many important data structures are actually a DAG underneath. For instance, consider Web sites. It is common to think of a site as a tree of pages: the top-level refers to several sections, each of which refers to sub-sections, and so on. However, sometimes an entry needs to be cataloged under multiple sections. For instance, an academic department might organize pages by people, teaching, and research. In the first of these pages it lists the people who work there; in the second, the list of courses; and in the third, the list of research groups. In turn, the courses might have references to the people teaching them, and the research groups are populated by these same people. Since we want only one page per person (for both maintenance and search indexing purposes), all these personnel links refer back to the same page for people.
data Content: | page(s :: String) | section(title :: String, sub :: List<Content>) end
people-pages :: Content = section("People", [list: page("Church"), page("Dijkstra"), page("Haberman") ])
fun get-person(n): index(people-pages.sub, n) end
theory-pages :: Content = section("Theory", [list: get-person(0), get-person(1)]) systems-pages :: Content = section("Systems", [list: get-person(1), get-person(2)])
site :: Content = section("Computing Sciences", [list: theory-pages, systems-pages])
check: theory = index(site.sub, 0) systems = index(site.sub, 1) theory-dijkstra = index(theory.sub, 1) systems-dijkstra = index(systems.sub, 0) theory-dijkstra is systems-dijkstra theory-dijkstra is%(identical) systems-dijkstra end
10.4 From Acyclicity to Cycles
web-colors = link("white", link("grey", web-colors))
map2(color-table-row, table-row-content, web-colors)
Unfortunately, there are many things wrong with this attempted definition.
Do you see what they are?
This will not even parse. The identifier web-colors is not bound on the right of the =.
- Earlier, we saw a solution to such a problem: use rec [Streams From Functions]. What happens if we write
rec web-colors = link("white", link("grey", web-colors))
instead?ExerciseWhy does rec work in the definition of ones but not above?
Assuming we have fixed the above problem, one of two things will happen. It depends on what the initial value of web-colors is. Because it is a dummy value, we do not get an arbitrarily long list of colors but rather a list of two colors followed by the dummy value. Indeed, this program will not even type-check.
Suppose, however, that web-colors were written instead as a function definition to delay its creation:fun web-colors(): link("white", link("grey", web-colors())) end
On its own this just defines a function. If, however, we use it—web-colors()— it goes into an infinite loop constructing links. Even if all that were to work, map2 would either (a) not terminate because its second argument is indefinitely long, or (b) report an error because the two arguments aren’t the same length.
When you get to cycles, even defining the datum becomes difficult because its definition depends on itself so it (seemingly) needs to already be defined in the process of being defined. We will return to cyclic data later: Recursion and Cycles from Mutation.