@ -78,3 +78,77 @@ each element to be a root and make it its own parent (`_parent[i] == i` for all
`i`).

The `_rank` field is an optimization that we will touch on in a later section.

### Find
A naive implementation of `find(...)` is simple enough to write:

```python
def find(self, elem: int) -> int:
    # If `elem` is its own parent, then it is the root of the tree
    if (parent := self._parent[elem]) == elem:
        return elem
    # Otherwise, recurse on the parent
    return self.find(parent)
```
However, walking back up the chain of parents every time we want to find the
root node (an `O(n)` operation in the worst case) would make for disastrous
performance. Instead, we can apply a small optimization called _path splitting_.

```python
def find(self, elem: int) -> int:
    while (parent := self._parent[elem]) != elem:
        # Replace each parent link by a link to the grand-parent. Targets are
        # assigned left to right, so the link must be updated before `elem`.
        self._parent[elem], elem = self._parent[parent], parent
    return elem
```
This flattens the chain so that each node links more directly to the root (the
length of the traversed path is roughly halved), making each subsequent
`find(...)` faster.

Other compression schemes exist, trading off how aggressively they shorten the
chain against how many `_parent` updates each `find(...)` performs.
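As a rough illustration of that trade-off, here is a sketch of two other well-known schemes, path halving and full path compression, next to path splitting. The free functions and the bare `parent` list are assumptions made to keep the example self-contained; they are not part of the class described in this post.

```python
def find_splitting(parent: list[int], elem: int) -> int:
    # Path splitting: every node on the path is re-pointed to its grand-parent
    while (p := parent[elem]) != elem:
        parent[elem], elem = parent[p], p
    return elem


def find_halving(parent: list[int], elem: int) -> int:
    # Path halving: every other node on the path is re-pointed to its
    # grand-parent, so fewer writes happen per call
    while parent[elem] != elem:
        parent[elem] = parent[parent[elem]]
        elem = parent[elem]
    return elem


def find_compression(parent: list[int], elem: int) -> int:
    # Full path compression: first locate the root...
    root = elem
    while parent[root] != root:
        root = parent[root]
    # ...then re-point every node on the path directly to it
    while parent[elem] != root:
        parent[elem], elem = root, parent[elem]
    return root


# A single chain: 5 -> 4 -> 3 -> 2 -> 1 -> 0, with 0 as the root
chain = [0, 0, 1, 2, 3, 4]
for find in (find_splitting, find_halving, find_compression):
    parent = chain.copy()
    root = find(parent, 5)
    print(f"{find.__name__}: root={root}, parent={parent}")
```

On this chain, splitting and halving both leave a tree of roughly half the original height, while full compression attaches every visited node directly to the root at the cost of a second pass over the path.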
### Union
A naive implementation of `union(...)` is simple enough to write:

```python
def union(self, lhs: int, rhs: int) -> int:
    # Replace both elements by their respective roots
    lhs = self.find(lhs)
    rhs = self.find(rhs)
    # Arbitrarily merge one into the other
    self._parent[rhs] = lhs
    # Return the new root
    return lhs
```
Once again, improvements can be made. Depending on the order in which we call
`union(...)`, we might end up creating a long chain from a leaf of the tree to
the root node, leading to slower `find(...)` operations. If at all possible, we
would like to keep the trees as shallow as possible.

To do so, we want to avoid merging taller trees into shorter ones, so as to keep
them as balanced as possible. Since a taller tree results in a slower
`find(...)`, keeping the trees balanced leads to better performance.

This is where the `_rank` field we mentioned earlier comes in: the _rank_ of an
element is an upper bound on its height in the tree. By keeping track of this
_approximate_ height, we can keep the trees balanced when merging them.
```python
def union(self, lhs: int, rhs: int) -> int:
    lhs = self.find(lhs)
    rhs = self.find(rhs)
    # Nothing to merge if both elements already share a root
    if lhs == rhs:
        return lhs
    # Always keep `lhs` as the taller tree
    if self._rank[lhs] < self._rank[rhs]:
        lhs, rhs = rhs, lhs
    # Merge the shorter tree into the taller one
    self._parent[rhs] = lhs
    # Update the rank when merging trees of the same rank
    if self._rank[lhs] == self._rank[rhs]:
        self._rank[lhs] += 1
    return lhs
```
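To get a feel for what the rank buys us, the following sketch compares tree heights under both merging strategies. The class name `DisjointSet`, its constructor, the `union_naive` method, and the `height` helper are assumptions made for this example, and its `find(...)` deliberately skips compression so that the effect of the rank alone is visible.

```python
class DisjointSet:
    def __init__(self, size: int) -> None:
        # Every element starts as the root of its own single-element tree
        self._parent = list(range(size))
        self._rank = [0] * size

    def find(self, elem: int) -> int:
        # No compression here, so tree shapes depend only on the merges
        while (parent := self._parent[elem]) != elem:
            elem = parent
        return elem

    def union_naive(self, lhs: int, rhs: int) -> int:
        lhs, rhs = self.find(lhs), self.find(rhs)
        self._parent[rhs] = lhs
        return lhs

    def union(self, lhs: int, rhs: int) -> int:
        lhs, rhs = self.find(lhs), self.find(rhs)
        if lhs == rhs:
            return lhs
        if self._rank[lhs] < self._rank[rhs]:
            lhs, rhs = rhs, lhs
        self._parent[rhs] = lhs
        if self._rank[lhs] == self._rank[rhs]:
            self._rank[lhs] += 1
        return lhs

    def height(self) -> int:
        # Longest chain of parent links from any element up to its root
        def depth(elem: int) -> int:
            hops = 0
            while (parent := self._parent[elem]) != elem:
                elem, hops = parent, hops + 1
            return hops

        return max(depth(elem) for elem in range(len(self._parent)))


naive, ranked = DisjointSet(8), DisjointSet(8)
# An unlucky call order: the growing tree is always merged into a fresh element
for i in range(1, 8):
    naive.union_naive(i, i - 1)
    ranked.union(i, i - 1)

print(naive.height())   # 7: one long chain
print(ranked.height())  # 1: every element hangs directly off a single root
```

With the naive merge, each call pushes the whole existing tree one level deeper. With the rank check, the fresh element is attached under the existing root instead, so the height stays at 1.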