Compare commits

...

7 commits

3 changed files with 132 additions and 2 deletions
content/posts
2024-06-24-union-find
2024-07-06-gap-buffer
2024-07-20-treap

View file

@ -15,7 +15,7 @@ favorite: false
disable_feed: false disable_feed: false
--- ---
To kickoff the [series]({{< ref "/series/cool-algorithms/">}}) of posts about To kickoff the [series]({{< ref "/series/cool-algorithms/" >}}) of posts about
algorithms and data structures I find interesting, I will be talking about my algorithms and data structures I find interesting, I will be talking about my
favorite one: the [_Disjoint Set_][wiki]. Also known as the _Union-Find_ data favorite one: the [_Disjoint Set_][wiki]. Also known as the _Union-Find_ data
structure, so named because of its two main operations: `ds.union(lhs, rhs)` and structure, so named because of its two main operations: `ds.union(lhs, rhs)` and

View file

@ -37,7 +37,7 @@ shorter/longer as required.
## Implementation ## Implementation
I'll be writing a sample implementation in Python, as with the rest of the I'll be writing a sample implementation in Python, as with the rest of the
[series]({{< ref "/series/cool-algorithms/">}}). I don't think it showcases the [series]({{< ref "/series/cool-algorithms/" >}}). I don't think it showcases the
elegance of the _Gap Buffer_ in action like a C implementation full of elegance of the _Gap Buffer_ in action like a C implementation full of
`memmove`s would, but it does makes it short and sweet. `memmove`s would, but it does makes it short and sweet.

View file

@ -27,3 +27,133 @@ parent's priority is always higher than any of its children.
[wiki]: https://en.wikipedia.org/wiki/Treap [wiki]: https://en.wikipedia.org/wiki/Treap
<!--more--> <!--more-->
## What does it do?
By randomizing the priority value of each key at insertion time, we ensure a
high likelihook that the tree stays _roughly_ balanced, avoiding degenerating to
unbalanced O(N) height.
Here's a sample tree created by inserting integers from 0 to 250 into the tree:
{{< graphviz file="treap.gv" />}}
## Implementation
I'll be keeping the theme for this [series] by using Python to implement the
_Treap_. This leads to somewhat annoying code to handle the `left`/`right` nodes
which is easier to do in C, using pointers.
[series]: {{< ref "/series/cool-algorithms/" >}}
### Representation
Creating a new `Treap` is easy: the tree starts off empty, waiting for new nodes
to insert.
Each `Node` must keep track of the `key`, the mapped `value`, and the node's
`priority` (which is assigned randomly). Finally it must also allow for storing
two children (`left` and `right`).
```python
class Node[K, V]:
key: K
value: V
priority: float
left: Node[K, V] | None
righg: Node[K, V] | None
def __init__(self, key: K, value: V):
# Store key and value, like a normal BST node
self.key = key
self.value = value
# Priority is derived randomly
self.priority = random()
self.left = None
self.right = None
class Treap[K, V]:
_root: Node[K, V] | None
def __init__(self):
# The tree starts out empty
self._root = None
```
### Search
Searching the tree is the same as in any other _Binary Search Tree_.
```python
def get(self, key: K) -> T | None:
node = self._root
# The usual BST traversal
while node is not None:
if node.key == key:
return node.value
elif node.key < key:
node = node.right
else:
node = node.left
return None
```
### Insertion
To insert a new `key` into the tree, we identify which leaf position it should
be inserted at. We then generate the node's priority, insert it at this
position, and rotate the node upwards until the heap property is respected.
```python
type ChildField = Literal["left, right"]
def insert(self, key: K, value: V) -> bool:
# Empty treap base-case
if self._root is None:
self._root = Node(key, value)
# Signal that we're not overwriting the value
return False
# Keep track of the parent chain for rotation after insertion
parents = []
node = self._root
while node is not None:
# Insert a pre-existing key
if node.key == key:
node.value = value
return True
# Go down the tree, keep track of the path through the tree
field = "left" if key < node.key else "right"
parents.append((node, field))
node = getattr(node, field)
# Key wasn't found, we're inserting a new node
child = Node(key, value)
parent, field = parents[-1]
setattr(parent, field, child)
# Rotate the new node up until we respect the decreasing priority property
self._rotate_up(child, parents)
# Key wasn't found, signal that we inserted a new node
return False
def _rotate_up(
self,
node: Node[K, V],
parents: list[tuple[Node[K, V], ChildField]],
) -> None:
while parents:
parent, field = parents.pop()
# If the parent has higher priority, we're done rotating
if parent.priority >= node.priority:
break
# Check for grand-parent/root of tree edge-case
if parents:
# Update grand-parent to point to the new rotated node
grand_parent, field = parents[-1]
setattr(grand_parent, field, node)
else:
# Point the root to the new rotated node
self._root = node
other_field = "left" if field == "right" else "right"
# Rotate the node up
setattr(parent, field, getattr(node, other_field))
setattr(node, other_field, parent)
```