From 5d6a9d4ec275c7383858f52e0862b1617334ba9e Mon Sep 17 00:00:00 2001
From: Bruno BELANYI
Date: Mon, 24 Jun 2024 23:03:09 +0100
Subject: [PATCH] posts: union-find: add 'find'

---
 content/posts/2024-06-24-union-find/index.md | 31 ++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/content/posts/2024-06-24-union-find/index.md b/content/posts/2024-06-24-union-find/index.md
index dfb8797..c9699b0 100644
--- a/content/posts/2024-06-24-union-find/index.md
+++ b/content/posts/2024-06-24-union-find/index.md
@@ -78,3 +78,34 @@ each element to be a root and make it its own parent (`_parent[i] == i` for all
 `i`).
 
 The `_rank` field is an optimization which we will touch on in a later section.
+
+### Find
+
+A naive implementation of `find(...)` is simple enough to write:
+
+```python
+def find(self, elem: int) -> int:
+    # If `elem` is its own parent, then it is the root of the tree
+    if (parent := self._parent[elem]) == elem:
+        return elem
+    # Otherwise, recurse on the parent
+    return self.find(parent)
+```
+
+However, going back up the chain of parents each time we want to find the root
+node (an `O(n)` operation) would make for disastrous performance. Instead, we can
+do a small optimization called _path splitting_.
+
+```python
+def find(self, elem: int) -> int:
+    while (parent := self._parent[elem]) != elem:
+        # Replace each parent link by a link to the grand-parent, then move up
+        self._parent[elem], elem = self._parent[parent], parent
+    return elem
+```
+
+This roughly halves the length of the traversed path, so repeated calls flatten
+the tree and make subsequent `find(...)` calls close to constant time.
+
+Other compression schemes exist, along the spectrum between shortening the
+chain faster and earlier, or updating `_parent` fewer times per `find(...)`.
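
The path-splitting loop can be tried in isolation with a small sketch such as the following. The `UnionFind` class name, its constructor, and the hand-built chain are assumptions made for illustration; the post itself only shows the `_parent` and `_rank` fields. Note that Python assigns tuple targets left to right, so this sketch writes the grand-parent link before advancing `elem`.

```python
class UnionFind:
    """Minimal sketch: only the parts needed to exercise `find`."""

    def __init__(self, size: int) -> None:
        # Every element starts as the root of its own tree
        self._parent = list(range(size))

    def find(self, elem: int) -> int:
        while (parent := self._parent[elem]) != elem:
            # Targets are assigned left to right: write the grand-parent link
            # for the current `elem` first, then step up to `parent`
            self._parent[elem], elem = self._parent[parent], parent
        return elem


# Hypothetical hand-built chain 0 -> 1 -> 2 -> 3 -> 4, with 4 as the root
uf = UnionFind(5)
uf._parent = [1, 2, 3, 4, 4]
root = uf.find(0)
```

After the call, `root` is `4` and `uf._parent` is `[2, 3, 4, 4, 4]`: every node on the path now points to its former grand-parent, so the chain starting at `0` is about half as long as before.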