title: Union Find
date: 2024-06-24T21:07:49+01:00
draft: false
description: My favorite data structure
tags and categories: algorithms, data structures, python, programming
series: Cool algorithms
favorite: false
disable_feed: false

To kick off the [series]({{< ref "/series/cool-algorithms/">}}) of posts about algorithms and data structures I find interesting, I will be talking about my favorite one: the Disjoint Set. It is also known as the Union-Find data structure, so named because of its two main operations: `ds.union(lhs, rhs)` and `ds.find(elem)`.

What does it do?

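The Union-Find data structure stores a collection of disjoint sets of elements, with operations for adding new sets, merging two sets into one, and finding the representative member of a set. Not only does it do all that, but it does it in almost constant (amortized) time: with path compression and union by rank or size, each operation costs O(α(n)) amortized, where α is the inverse Ackermann function, which stays below 5 for any practical n.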
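To make those operations concrete, here is a minimal sketch of what such a structure could look like. It is only an illustration, not necessarily the implementation this series settles on: the `add` method, the `_parent` and `_size` attributes, and the choice of union by size with path compression are all assumptions made for the example.

```python
from collections.abc import Hashable, Iterable
from typing import Generic, TypeVar

T = TypeVar("T", bound=Hashable)


class DisjointSet(Generic[T]):
    """Illustrative sketch: a parent-pointer forest with union by size."""

    def __init__(self, elems: Iterable[T] = ()) -> None:
        self._parent: dict[T, T] = {}
        self._size: dict[T, int] = {}
        for elem in elems:
            self.add(elem)

    def add(self, elem: T) -> None:
        # A new element starts as the root of its own singleton set
        if elem not in self._parent:
            self._parent[elem] = elem
            self._size[elem] = 1

    def find(self, elem: T) -> T:
        # Walk up to the root, which acts as the set's representative
        root = elem
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited element directly at the root
        while self._parent[elem] != root:
            next_elem = self._parent[elem]
            self._parent[elem] = root
            elem = next_elem
        return root

    def union(self, lhs: T, rhs: T) -> T:
        lhs_root, rhs_root = self.find(lhs), self.find(rhs)
        if lhs_root == rhs_root:
            # Already in the same set, nothing to merge
            return lhs_root
        # Union by size: attach the smaller tree under the larger one
        if self._size[lhs_root] < self._size[rhs_root]:
            lhs_root, rhs_root = rhs_root, lhs_root
        self._parent[rhs_root] = lhs_root
        self._size[lhs_root] += self._size[rhs_root]
        return lhs_root


ds = DisjointSet(["a", "b", "c", "d"])
ds.union("a", "b")
ds.union("c", "d")
assert ds.find("a") == ds.find("b")  # merged into the same set
assert ds.find("a") != ds.find("c")  # still two separate sets
ds.union("b", "d")
assert ds.find("a") == ds.find("c")  # now everything is connected
```

Which element ends up as the representative is an implementation detail; callers should only rely on two elements of the same set sharing the same one.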

Here is a small motivating example for using the Disjoint Set data structure:

from collections import defaultdict


def connected_components(graph: Graph) -> list[set[Node]]:
    # Initialize the disjoint set so that each node is in its own set
    ds: DisjointSet[Node] = DisjointSet(graph.nodes)
    # Each edge is a connection, merge both sides into the same set
    for start, dest in graph.edges:
        ds.union(start, dest)
    # Connected components share the same (arbitrary) root
    components: dict[Node, set[Node]] = defaultdict(set)
    for n in graph.nodes:
        components[ds.find(n)].add(n)
    # Return a list of disjoint sets corresponding to each connected component
    return list(components.values())