nix: bump inputs

Add Reservoir Sampling post
Add Treap Revisited post
2024-08-10 11:56:04 +01:00
16 changed files with 1857 additions and 21 deletions
--- a/.markdownlint.yaml
+++ b/.markdownlint.yaml
@ -0,0 +1,3 @@
+# MD024/no-duplicate-heading/no-duplicate-header
+MD024:
+  siblings_only: true
--- a/config.yaml
+++ b/config.yaml
@ -67,6 +67,7 @@ params:
  webmentions:
    login: belanyi.fr
    pingback: true
+  mathjax: true

 taxonomies:
  category: "categories"
--- a/content/posts/2020-07-14-hello-world/index.md
+++ b/content/posts/2020-07-14-hello-world/index.md
@ -8,6 +8,8 @@ tags:
 categories:
 favorite: false
 tikz: true
+graphviz: true
+mermaid: true
 ---

 ## Test post please ignore
@ -40,6 +42,29 @@ echo hello world | cut -d' ' -f 1
  \end{tikzpicture}
 {{% /tikz %}}

+### Graphviz support
+
+{{% graphviz %}}
+  graph {
+    a -- b
+    b -- c
+    c -- a
+  }
+{{% /graphviz %}}
+
+### Mermaid support
+
+{{% mermaid %}}
+  graph TD
+  A[Enter Chart Definition] --> B(Preview)
+  B --> C{decide}
+  C --> D[Keep]
+  C --> E[Edit Definition]
+  E --> B
+  D --> F[Save Image and Code]
+  F --> B
+{{% /mermaid %}}
+
 ### Spoilers

 {{% spoiler "Don't open me" %}}
--- a/content/posts/2024-07-06-gap-buffer/index.md
+++ b/content/posts/2024-07-06-gap-buffer/index.md
@ -0,0 +1,191 @@
+---
+title: "Gap Buffer"
+date: 2024-07-06T21:27:19+01:00
+draft: false # I don't care for draft mode, git has branches for that
+description: "As featured in GNU Emacs"
+tags:
+  - algorithms
+  - data structures
+  - python
+categories:
+  - programming
+series:
+  - Cool algorithms
+favorite: false
+disable_feed: false
+---
+
+The [_Gap Buffer_][wiki] is a popular data structure for text editors to
+represent files and editable buffers, the most famous of them probably being
+[GNU Emacs][emacs].
+
+[wiki]: https://en.wikipedia.org/wiki/Gap_buffer
+[emacs]: https://www.gnu.org/software/emacs/manual/html_node/elisp/Buffer-Gap.html
+
+<!--more-->
+
+## What does it do?
+
+A _Gap Buffer_ is simply a list of characters, similar to a normal string, with
+the added twist of splitting it into two sides: the prefix and suffix, on either
+side of the cursor. In between them, a gap is left to allow for quick
+insertion at the cursor.
+
+Moving the cursor moves the gap around the buffer, the prefix and suffix getting
+shorter/longer as required.
+
+## Implementation
+
+I'll be writing a sample implementation in Python, as with the rest of the
+[series]({{< ref "/series/cool-algorithms/" >}}). I don't think it showcases the
+elegance of the _Gap Buffer_ in action like a C implementation full of
+`memmove`s would, but it does make it short and sweet.
+
+### Representation
+
+We'll be representing the gap buffer as an actual list of characters.
+
+Given that Python doesn't _have_ characters, let's settle for a list of strings,
+each representing a single character...
+
+```python
+Char = str
+
+class GapBuffer:
+    # List of characters, contains prefix and suffix of string with gap in the middle
+    _buf: list[Char]
+    # The gap is contained between [start, end) (i.e: buf[start:end])
+    _gap_start: int
+    _gap_end: int
+
+    # Visual representation of the gap buffer:
+    # This is a very  [                     ]long string.
+    # |<----------------------------------------------->| capacity
+    # |<------------>|                       |<-------->| string
+    #                 |<------------------->|             gap
+    # |<------------>|                                    prefix
+    #                                        |<-------->| suffix
+    def __init__(self, initial_capacity: int = 16) -> None:
+        assert initial_capacity > 0
+        # Initialize an empty gap buffer
+        self._buf = [""] * initial_capacity
+        self._gap_start = 0
+        self._gap_end = initial_capacity
+```
+
+### Accessors
+
+I'm mostly adding these for exposition, and making it easier to write `assert`s
+later.
+
+```python
+@property
+def capacity(self) -> int:
+  return len(self._buf)
+
+@property
+def gap_length(self) -> int:
+  return self._gap_end - self._gap_start
+
+@property
+def string_length(self) -> int:
+  return self.capacity - self.gap_length
+
+@property
+def prefix_length(self) -> int:
+  return self._gap_start
+
+@property
+def suffix_length(self) -> int:
+  return self.capacity - self._gap_end
+```
+
+### Growing the buffer
+
+I've written this method in a somewhat non-idiomatic manner, to make it closer
+to how it would look in C using `realloc` instead.
+
+It would be more efficient to use slicing to insert the needed extra capacity
+directly, instead of making a new buffer and copying characters over.
+
+```python
+def grow(self, capacity: int) -> None:
+    assert capacity >= self.capacity
+    # Create a new buffer with the new capacity
+    new_buf = [""] * capacity
+    # Move the prefix/suffix to their place in the new buffer
+    added_capacity = capacity - len(self._buf)
+    new_buf[: self._gap_start] = self._buf[: self._gap_start]
+    new_buf[self._gap_end + added_capacity :] = self._buf[self._gap_end :]
+    # Use the new buffer, account for added capacity
+    self._buf = new_buf
+    self._gap_end += added_capacity
+```
+
+### Insertion
+
+Inserting text at the cursor's position means filling up the gap in the middle
+of the buffer. To do so we must first make sure that the gap is big enough, or
+grow the buffer accordingly.
+
+Then inserting the text is simply a matter of copying its characters in place,
+and moving the start of the gap further right.
+
+```python
+def insert(self, val: str) -> None:
+    # Ensure we have enough space to insert the whole string
+    if len(val) > self.gap_length:
+        self.grow(max(self.capacity * 2, self.string_length + len(val)))
+    # Fill the gap with the given string
+    self._buf[self._gap_start : self._gap_start + len(val)] = val
+    self._gap_start += len(val)
+```
+
+### Deletion
+
+Removing text from the buffer simply expands the gap in the corresponding
+direction, shortening the string's prefix/suffix. This makes it very cheap.
+
+The methods are named after the `backspace` and `delete` keys on the keyboard.
+
+```python
+def backspace(self, dist: int = 1) -> None:
+    assert dist <= self.prefix_length
+    # Extend gap to the left
+    self._gap_start -= dist
+
+def delete(self, dist: int = 1) -> None:
+    assert dist <= self.suffix_length
+    # Extend gap to the right
+    self._gap_end += dist
+```
+
+### Moving the cursor
+
+Moving the cursor along the buffer will shift letters from one side of the gap
+to the other, moving them across from prefix to suffix and back.
+
+I find Python's list slicing not quite as elegant to read as a `memmove`, though
+it does make for a very small and efficient implementation.
+
+```python
+def left(self, dist: int = 1) -> None:
+    assert dist <= self.prefix_length
+    # Shift the needed number of characters from end of prefix to start of suffix
+    self._buf[self._gap_end - dist : self._gap_end] = self._buf[
+        self._gap_start - dist : self._gap_start
+    ]
+    # Adjust indices accordingly
+    self._gap_start -= dist
+    self._gap_end -= dist
+
+def right(self, dist: int = 1) -> None:
+    assert dist <= self.suffix_length
+    # Shift the needed number of characters from start of suffix to end of prefix
+    self._buf[self._gap_start : self._gap_start + dist] = self._buf[
+        self._gap_end : self._gap_end + dist
+    ]
+    # Adjust indices accordingly
+    self._gap_start += dist
+    self._gap_end += dist
+```
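A minimal usage sketch for the gap buffer post above, assuming a condensed copy of its class. The `text` helper is hypothetical (it is not part of the post), and `grow` here uses the slice-insertion variant that the post mentions as the more efficient alternative, rather than the `realloc`-style copy it actually shows.

```python
class GapBuffer:
    """Condensed copy of the post's class, plus a hypothetical `text` helper."""

    def __init__(self, initial_capacity: int = 16) -> None:
        assert initial_capacity > 0
        self._buf = [""] * initial_capacity
        self._gap_start = 0
        self._gap_end = initial_capacity

    def text(self) -> str:
        # Hypothetical helper: the string is the prefix plus the suffix, gap skipped
        return "".join(self._buf[: self._gap_start] + self._buf[self._gap_end :])

    def grow(self, capacity: int) -> None:
        added = capacity - len(self._buf)
        assert added >= 0
        # Slice-insertion variant: open `added` extra slots at the end of the gap
        self._buf[self._gap_end : self._gap_end] = [""] * added
        self._gap_end += added

    def insert(self, val: str) -> None:
        gap_length = self._gap_end - self._gap_start
        if len(val) > gap_length:
            string_length = len(self._buf) - gap_length
            self.grow(max(len(self._buf) * 2, string_length + len(val)))
        # Fill the gap with the characters of `val`
        self._buf[self._gap_start : self._gap_start + len(val)] = val
        self._gap_start += len(val)

    def backspace(self, dist: int = 1) -> None:
        assert dist <= self._gap_start
        self._gap_start -= dist

    def left(self, dist: int = 1) -> None:
        assert dist <= self._gap_start
        # Shift characters from the end of the prefix to the start of the suffix
        self._buf[self._gap_end - dist : self._gap_end] = self._buf[
            self._gap_start - dist : self._gap_start
        ]
        self._gap_start -= dist
        self._gap_end -= dist


buf = GapBuffer(initial_capacity=4)
buf.insert("hello world")  # Larger than the gap, forces a grow
buf.left(5)  # Cursor now sits just before "world"
buf.backspace()  # Remove the space before the cursor
buf.insert(", ")  # Gap is too small again, grows a second time
result = buf.text()
```

The edit sequence exercises growth, cursor movement, deletion and insertion in the middle of the string, which are the cases where the gap indices are easiest to get wrong.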
--- a/content/posts/2024-07-14-bloom-filter/index.md
+++ b/content/posts/2024-07-14-bloom-filter/index.md
@ -0,0 +1,97 @@
+---
+title: "Bloom Filter"
+date: 2024-07-14T17:46:40+01:00
+draft: false # I don't care for draft mode, git has branches for that
+description: "Probably cool"
+tags:
+  - algorithms
+  - data structures
+  - python
+categories:
+  - programming
+series:
+  - Cool algorithms
+favorite: false
+disable_feed: false
+---
+
+The [_Bloom Filter_][wiki] is a probabilistic data structure for set membership.
+
+The filter can be used as an inexpensive first step when querying the actual
+data is quite costly (e.g: as a first check for expensive cache lookups or large
+data seeks).
+
+[wiki]: https://en.wikipedia.org/wiki/Bloom_filter
+
+<!--more-->
+
+## What does it do?
+
+A _Bloom Filter_ can be understood as a hash-set which can either tell you:
+
+* An element is _not_ part of the set.
+* An element _may be_ part of the set.
+
+More specifically, one can tweak the parameters of the filter to make it so that
+the _false positive_ rate of membership is quite low.
+
+I won't be going into those calculations here, but they are quite trivial to
+compute, or one can just look up appropriate values for their use case.
+
+## Implementation
+
+I'll be using Python, which has the nifty ability of representing bitsets
+through its built-in big integers quite easily.
+
+We'll be assuming a `BIT_COUNT` of 64 here, but the implementation can easily be
+tweaked to use a different number, or even change it at construction time.
+
+### Representation
+
+A `BloomFilter` is just a set of bits and a list of hash functions.
+
+```python
+from collections.abc import Callable
+
+BIT_COUNT = 64
+
+class BloomFilter[T]:
+    _bits: int
+    _hash_functions: list[Callable[[T], int]]
+
+    def __init__(self, hash_functions: list[Callable[[T], int]]) -> None:
+        # Filter is initially empty
+        self._bits = 0
+        self._hash_functions = hash_functions
+```
+
+### Inserting a key
+
+To add an element to the filter, we take the output from each hash function and
+use that to set a bit in the filter. This combination of bits will identify the
+element, which we can use for lookup later.
+
+```python
+def insert(self, val: T) -> None:
+    # Iterate over each hash
+    for f in self._hash_functions:
+        n = f(val) % BIT_COUNT
+        # Set the corresponding bit
+        self._bits |= 1 << n
+```
+
+### Querying a key
+
+Because the _Bloom Filter_ does not actually store its elements, but some
+derived data from hashing them, it can only definitely say if an element _does
+not_ belong to it. Otherwise, it _may_ be part of the set, and should be checked
+against the actual underlying store.
+
+```python
+def may_contain(self, val: T) -> bool:
+    for f in self._hash_functions:
+        n = f(val) % BIT_COUNT
+        # If one of the bits is unset, the value is definitely not present
+        if not (self._bits & (1 << n)):
+            return False
+    # All bits were matched, `val` is likely to be part of the set
+    return True
+```
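The Bloom filter post above leaves the choice of hash functions and parameters open. A minimal sketch of one way to fill that in, assuming string keys: `make_hash` is a hypothetical helper that derives several hash functions from SHA-256 by prefixing a seed byte, and `false_positive_rate` is the usual textbook approximation, not something taken from the post.

```python
import hashlib
import math
from typing import Callable, List

BIT_COUNT = 64


def make_hash(seed: int) -> Callable[[str], int]:
    # Hypothetical helper: one SHA-256 based hash function per seed byte
    def hash_function(val: str) -> int:
        digest = hashlib.sha256(bytes([seed]) + val.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return hash_function


class BloomFilter:
    # Condensed, non-generic copy of the post's class
    def __init__(self, hash_functions: List[Callable[[str], int]]) -> None:
        self._bits = 0
        self._hash_functions = hash_functions

    def insert(self, val: str) -> None:
        for f in self._hash_functions:
            self._bits |= 1 << (f(val) % BIT_COUNT)

    def may_contain(self, val: str) -> bool:
        # Every bit must be set for the value to possibly be present
        return all(
            self._bits & (1 << (f(val) % BIT_COUNT)) for f in self._hash_functions
        )


def false_positive_rate(bits: int, hashes: int, items: int) -> float:
    # Standard approximation: (1 - e^(-k * n / m)) ** k
    return (1 - math.exp(-hashes * items / bits)) ** hashes


bloom = BloomFilter([make_hash(seed) for seed in range(3)])
for word in ("treap", "gap buffer"):
    bloom.insert(word)
estimate = false_positive_rate(BIT_COUNT, hashes=3, items=2)
```

Inserted keys always report `may_contain(...) == True`. Only absent keys can be misreported, at roughly the estimated rate.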
--- a/content/posts/2024-07-20-treap/index.md
+++ b/content/posts/2024-07-20-treap/index.md
@ -0,0 +1,159 @@
+---
+title: "Treap"
+date: 2024-07-20T14:12:27+01:00
+draft: false # I don't care for draft mode, git has branches for that
+description: "A simpler BST"
+tags:
+  - algorithms
+  - data structures
+  - python
+categories:
+  - programming
+series:
+  - Cool algorithms
+favorite: false
+disable_feed: false
+graphviz: true
+---
+
+The [_Treap_][wiki] is a mix between a _Binary Search Tree_ and a _Heap_.
+
+Like a _Binary Search Tree_, it keeps an ordered set of keys in the shape of a
+tree, allowing for binary search traversal.
+
+Like a _Heap_, it associates each node with a priority, making sure that a
+parent's priority is always higher than any of its children.
+
+[wiki]: https://en.wikipedia.org/wiki/Treap
+
+<!--more-->
+
+## What does it do?
+
+By randomizing the priority value of each key at insertion time, we ensure a
+high likelihood that the tree stays _roughly_ balanced, avoiding degenerating to
+unbalanced O(N) height.
+
+Here's a sample tree created by inserting integers from 0 to 250 into the tree:
+
+{{< graphviz file="treap.gv" />}}
+
+## Implementation
+
+I'll be keeping the theme for this [series] by using Python to implement the
+_Treap_. This leads to somewhat annoying code to handle the rotation process,
+which is easier to do in C using pointers.
+
+[series]: {{< ref "/series/cool-algorithms/" >}}
+
+### Representation
+
+Creating a new `Treap` is easy: the tree starts off empty, waiting for new nodes
+to insert.
+
+Each `Node` must keep track of the `key`, the mapped `value`, and the node's
+`priority` (which is assigned randomly). Finally it must also allow for storing
+two children (`left` and `right`).
+
+```python
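+# Defer annotation evaluation so `Node` can reference itself
+from __future__ import annotations
+
+from random import random
+
+class Node[K, V]: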
+    key: K
+    value: V
+    priority: float
+    left: Node[K, V] | None
+    right: Node[K, V] | None
+
+    def __init__(self, key: K, value: V):
+        # Store key and value, like a normal BST node
+        self.key = key
+        self.value = value
+        # Priority is derived randomly
+        self.priority = random()
+        self.left = None
+        self.right = None
+
+class Treap[K, V]:
+    _root: Node[K, V] | None
+
+    def __init__(self):
+        # The tree starts out empty
+        self._root = None
+```
+
+### Search
+
+Searching the tree is the same as in any other _Binary Search Tree_.
+
+```python
+def get(self, key: K) -> V | None:
+    node = self._root
+    # The usual BST traversal
+    while node is not None:
+        if node.key == key:
+            return node.value
+        elif node.key < key:
+            node = node.right
+        else:
+            node = node.left
+    return None
+```
+
+### Insertion
+
+To insert a new `key` into the tree, we identify which leaf position it should
+be inserted at. We then generate the node's priority, insert it at this
+position, and rotate the node upwards until the heap property is respected.
+
+```python
+type ChildField = Literal["left", "right"]
+
+def insert(self, key: K, value: V) -> bool:
+    # Empty treap base-case
+    if self._root is None:
+        self._root = Node(key, value)
+        # Signal that we're not overwriting the value
+        return False
+    # Keep track of the parent chain for rotation after insertion
+    parents = []
+    node = self._root
+    while node is not None:
+        # Insert a pre-existing key
+        if node.key == key:
+            node.value = value
+            return True
+        #  Go down the tree, keep track of the path through the tree
+        field = "left" if key < node.key else "right"
+        parents.append((node, field))
+        node = getattr(node, field)
+    #  Key wasn't found, we're inserting a new node
+    child = Node(key, value)
+    parent, field = parents[-1]
+    setattr(parent, field, child)
+    # Rotate the new node up until we respect the decreasing priority property
+    self._rotate_up(child, parents)
+    # Key wasn't found, signal that we inserted a new node
+    return False
+
+def _rotate_up(
+    self,
+    node: Node[K, V],
+    parents: list[tuple[Node[K, V], ChildField]],
+) -> None:
+    while parents:
+        parent, field = parents.pop()
+        # If the parent has higher priority, we're done rotating
+        if parent.priority >= node.priority:
+            break
+        # Check for grand-parent/root of tree edge-case
+        if parents:
+            # Update grand-parent to point to the new rotated node
+            # (keep `field` intact, it is still needed for the rotation below)
+            grand_parent, grand_field = parents[-1]
+            setattr(grand_parent, grand_field, node)
+        else:
+            # Point the root to the new rotated node
+            self._root = node
+        other_field = "left" if field == "right" else "right"
+        # Rotate the node up
+        setattr(parent, field, getattr(node, other_field))
+        setattr(node, other_field, parent)
+```
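The rotation code in the treap post above is easy to get subtly wrong, so a small invariant checker can help when testing it. This is a sketch under assumptions: the `Node` below is a simplified stand-in with integer keys and explicit priorities, and `is_treap` is a hypothetical helper, not part of the post.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    # Simplified stand-in for the post's node: no value, explicit priority
    key: int
    priority: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def is_treap(
    node: Optional[Node],
    lo: Optional[int] = None,
    hi: Optional[int] = None,
) -> bool:
    if node is None:
        return True
    # BST property: the key must lie strictly between the inherited bounds
    if lo is not None and node.key <= lo:
        return False
    if hi is not None and node.key >= hi:
        return False
    # Heap property: no child may have a higher priority than its parent
    for child in (node.left, node.right):
        if child is not None and child.priority > node.priority:
            return False
    return is_treap(node.left, lo, node.key) and is_treap(node.right, node.key, hi)


valid = Node(5, 0.9, left=Node(2, 0.4), right=Node(8, 0.7, left=Node(6, 0.1)))
bad_heap = Node(5, 0.2, left=Node(2, 0.4))
bad_order = Node(5, 0.9, left=Node(7, 0.4))
```

Calling such a checker after every insertion in a randomized test is a cheap way to catch a rotation that attaches a subtree on the wrong side.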
--- a/content/posts/2024-07-20-treap/treap.gv
+++ b/content/posts/2024-07-20-treap/treap.gv
--- a/content/posts/2024-07-27-treap-revisited/index.md
+++ b/content/posts/2024-07-27-treap-revisited/index.md
@ -0,0 +1,146 @@
+---
+title: "Treap, revisited"
+date: 2024-07-27T14:12:27+01:00
+draft: false # I don't care for draft mode, git has branches for that
+description: "An even simpler BST"
+tags:
+  - algorithms
+  - data structures
+  - python
+categories:
+  - programming
+series:
+  - Cool algorithms
+favorite: false
+disable_feed: false
+---
+
+My [last post]({{< relref "../2024-07-20-treap/index.md" >}}) about the _Treap_
+showed an implementation using tree rotations, as is commonly done with [AVL
+Trees][avl] and [Red Black Trees][rb].
+
+But the _Treap_ lends itself well to a simple and elegant implementation with no
+tree rotations. This makes it especially easy to implement the removal of a key,
+rather than the fiddly process of deletion using tree rotations.
+
+[avl]: https://en.wikipedia.org/wiki/AVL_tree
+[rb]: https://en.wikipedia.org/wiki/Red%E2%80%93black_tree
+
+<!--more-->
+
+## Implementation
+
+All operations on the tree will be implemented in terms of two fundamental
+operations: `split` and `merge`.
+
+We'll be reusing the same structures as in the last post, so let's skip straight
+to implementing those fundamentals, and building on them for `insert` and
+`delete`.
+
+### Split
+
+Splitting a tree means taking a key, and getting the following output:
+
+* a `left` node, root of the tree of all keys lower than the input.
+* an extracted `node` which corresponds to the input `key`.
+* a `right` node, root of the tree of all keys higher than the input.
+
+```python
+type OptionalNode[K, V] = Node[K, V] | None
+
+class SplitResult(NamedTuple):
+    left: OptionalNode
+    node: OptionalNode
+    right: OptionalNode
+
+def split(root: OptionalNode[K, V], key: K) -> SplitResult:
+    # Base case, empty tree
+    if root is None:
+        return SplitResult(None, None, None)
+    # If we found the key, simply extract left and right
+    if root.key == key:
+        left, right = root.left, root.right
+        root.left, root.right = None, None
+        return SplitResult(left, root, right)
+    # Otherwise, recurse on the corresponding side of the tree
+    if root.key < key:
+        left, node, right = split(root.right, key)
+        root.right = left
+        return SplitResult(root, node, right)
+    if key < root.key:
+        left, node, right = split(root.left, key)
+        root.left = right
+        return SplitResult(left, node, root)
+    raise RuntimeError("Unreachable")
+```
+
+### Merge
+
+Merging a `left` and `right` tree means (cheaply) building a new tree containing
+both of them. A pre-condition for merging is that the `left` tree is composed
+entirely of nodes that are lower than any key in `right` (i.e: as in `left` and
+`right` after a `split`).
+
+```python
+def merge(
+    left: OptionalNode[K, V],
+    right: OptionalNode[K, V],
+) -> OptionalNode[K, V]:
+    # Base cases, left or right being empty
+    if left is None:
+        return right
+    if right is None:
+        return left
+    # Left has higher priority, it must become the root node
+    if left.priority >= right.priority:
+        # We recursively reconstruct its right sub-tree
+        left.right = merge(left.right, right)
+        return left
+    # Right has higher priority, it must become the root node
+    if left.priority < right.priority:
+        # We recursively reconstruct its left sub-tree
+        right.left = merge(left, right.left)
+        return right
+    raise RuntimeError("Unreachable")
+```
+
+### Insertion
+
+Inserting a node into the tree is done in two steps:
+
+1. `split` the tree to isolate the middle insertion point
+2. `merge` it back up to form a full tree with the inserted key
+
+```python
+def insert(self, key: K, value: V) -> bool:
+    # `left` and `right` come before/after the key
+    left, node, right = split(self._root, key)
+    was_updated: bool
+    # Create the node, or update its value, if the key was already in the tree
+    if node is None:
+        node = Node(key, value)
+        was_updated = False
+    else:
+        node.value = value
+        was_updated = True
+    # Rebuild the tree with a couple of merge operations
+    self._root = merge(left, merge(node, right))
+    # Signal whether the key was already in the tree
+    return was_updated
+```
+
+### Removal
+
+Removing a key from the tree is similar to inserting a new key, and forgetting
+to insert it back: simply `split` the tree and `merge` it back without the
+extracted middle node.
+
+```python
+def remove(self, key: K) -> bool:
+    # `node` contains the key, or `None` if the key wasn't in the tree
+    left, node, right = split(self._root, key)
+    # Put the tree back together, without the extracted node
+    self._root = merge(left, right)
+    # Signal whether `key` was mapped in the tree
+    return node is not None
+```
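A self-contained sketch of the `split`/`merge` approach from the post above, assuming integer keys, plain tuples instead of `SplitResult`, and hand-picked priorities so that the run is deterministic. The `keys` helper is hypothetical and only used to inspect the result.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    # Simplified stand-in for the post's node: no value, explicit priority
    key: int
    priority: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


Split = Tuple[Optional[Node], Optional[Node], Optional[Node]]


def split(root: Optional[Node], key: int) -> Split:
    if root is None:
        return None, None, None
    if root.key == key:
        # Found the key: detach and return both sub-trees
        left, right = root.left, root.right
        root.left = root.right = None
        return left, root, right
    if root.key < key:
        left, node, right = split(root.right, key)
        root.right = left
        return root, node, right
    left, node, right = split(root.left, key)
    root.left = right
    return left, node, root


def merge(left: Optional[Node], right: Optional[Node]) -> Optional[Node]:
    if left is None:
        return right
    if right is None:
        return left
    # The higher priority node becomes the root of the merged tree
    if left.priority >= right.priority:
        left.right = merge(left.right, right)
        return left
    right.left = merge(left, right.left)
    return right


def keys(root: Optional[Node]) -> List[int]:
    # Hypothetical helper: in-order traversal
    if root is None:
        return []
    return keys(root.left) + [root.key] + keys(root.right)


root: Optional[Node] = None
for key, priority in [(4, 0.3), (1, 0.8), (7, 0.5), (3, 0.1)]:
    left, _, right = split(root, key)
    root = merge(left, merge(Node(key, priority), right))
before = keys(root)

# Removal: split around the key, then merge without the extracted node
left, removed, right = split(root, 4)
root = merge(left, right)
after = keys(root)
```

Both insertion and removal reduce to one `split` followed by one or two `merge` calls, with no rotation bookkeeping.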
--- a/content/posts/2024-08-02-reservoir-sampling/index.md
+++ b/content/posts/2024-08-02-reservoir-sampling/index.md
@ -0,0 +1,145 @@
+---
+title: "Reservoir Sampling"
+date: 2024-08-02T18:30:56+01:00
+draft: false # I don't care for draft mode, git has branches for that
+description: "Elegantly sampling a stream"
+tags:
+  - algorithms
+  - python
+categories:
+  - programming
+series:
+  - Cool algorithms
+favorite: false
+disable_feed: false
+mathjax: true
+---
+
+[_Reservoir Sampling_][reservoir] is an [online][online], probabilistic
+algorithm to uniformly sample $k$ random elements out of a stream of values.
+
+It's a particularly elegant and small algorithm, only requiring $\Theta(k)$
+amount of space and a single pass through the stream.
+
+[reservoir]: https://en.wikipedia.org/wiki/Reservoir_sampling
+[online]: https://en.wikipedia.org/wiki/Online_algorithm
+
+<!--more-->
+
+## Sampling one element
+
+As an introduction, we'll first focus on fairly sampling one element from the
+stream.
+
+```python
+def sample_one[T](stream: Iterable[T]) -> T:
+    stream_iter = iter(stream)
+    # Sample the first element
+    res = next(stream_iter)
+    for i, val in enumerate(stream_iter, start=1):
+        j = random.randint(0, i)
+        # Replace the sampled element with probability 1/(i + 1)
+        if j == 0:
+            res = val
+    # Return the randomly sampled element
+    return res
+```
+
+### Proof
+
+Let's now prove that this algorithm leads to a fair sampling of the stream.
+
+We'll be doing proof by induction.
+
+#### Hypothesis $H_N$
+
+After iterating through the first $N$ items in the stream,
+each of them has had an equal $\frac{1}{N}$ probability of being selected as
+`res`.
+
+#### Base Case $H_1$
+
+We can trivially observe that the first element is always assigned to `res`
+with probability $\frac{1}{1} = 1$, so $H_1$ holds.
+
+#### Inductive Case
+
+For a given $N$, let us assume that $H_N$ holds. Let us now look at the events
+of the loop iteration where `i = N` (i.e: observation of the $N + 1$-th item in the
+stream).
+
+`j = random.randint(0, i)` uniformly selects a value in the range $[0, i]$,
+a.k.a $[0, N]$. We then have two cases:
+
+* `j == 0`, with probability $\frac{1}{N + 1}$: we select `val` as the new
+reservoir element `res`.
+
+* `j != 0`, with probability $\frac{N}{N + 1}$: we keep the previous value of
+`res`. By $H_N$, each of the first $N$ elements had a $\frac{1}{N}$ probability
+of being `res` at the start of the loop, so each of them now has a
+probability $\frac{1}{N} \cdot \frac{N}{N + 1} = \frac{1}{N + 1}$ of being the
+selected element.
+
+And thus, we have proven $H_{N + 1}$ at the end of the loop.
+
+## Sampling $k$ elements
+
+The code for sampling $k$ elements is very similar to the one-element case.
+
+```python
+def sample[T](stream: Iterable[T], k: int = 1) -> list[T]:
+    stream_iter = iter(stream)
+    # Retain the first 'k' elements in the reservoir
+    res = list(itertools.islice(stream_iter, k))
+    for i, val in enumerate(stream_iter, start=k):
+        j = random.randint(0, i)
+        # Replace one element at random with probability k/(i + 1)
+        if j < k:
+            res[j] = val
+    # Return 'k' randomly sampled elements
+    return res
+```
+
+### Proof
+
+Let us once again do a proof by induction, assuming the stream contains at least
+$k$ items.
+
+#### Hypothesis $H_N$
+
+After iterating through the first $N$ items in the stream, each of them has had
+an equal $\frac{k}{N}$ probability of being sampled from the stream.
+
+#### Base Case $H_k$
+
+We can trivially observe that the first $k$ elements are sampled at the start of
+the algorithm with probability $\frac{k}{k} = 1$, so $H_k$ holds.
+
+#### Inductive Case
+
+For a given $N$, let us assume that $H_N$ holds. Let us now look at the events
+of the loop iteration where `i = N`, in order to prove $H_{N + 1}$.
+
+`j = random.randint(0, i)` uniformly selects a value in the range $[0, i]$,
+a.k.a $[0, N]$. We then have two cases:
+
+* `j >= k`, with probability $1 - \frac{k}{N + 1}$: we do not modify the
+sampled reservoir at all.
+
+* `j < k`, with probability $\frac{k}{N + 1}$: we sample the new element to
+replace the `j`-th element of the reservoir. Therefore for any element
+$e \in [0, k[$ we can either have:
+  * $j = e$: the element _is_ replaced, probability $\frac{1}{k}$.
+  * $j \neq e$: the element is _not_ replaced, probability $\frac{k - 1}{k}$.
+
+We can now compute the probability that a previously sampled element is kept in
+the reservoir:
+$1 - \frac{k}{N + 1} + \frac{k}{N + 1} \cdot \frac{k - 1}{k} = \frac{(N + 1) - k + (k - 1)}{N + 1} = \frac{N}{N + 1}$.
+
+By $H_N$, each of the first $N$ elements had a $\frac{k}{N}$ probability
+of being in the reservoir at the start of the loop, so each of them now has a
+probability $\frac{k}{N} \cdot \frac{N}{N + 1} = \frac{k}{N + 1}$ of being
+sampled.
+
+We have now proven that all elements have a probability $\frac{k}{N + 1}$ of
+being sampled at the end of the loop, therefore $H_{N + 1}$ has been verified.
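The induction proofs in the reservoir sampling post above can be sanity-checked empirically. A rough sketch, assuming a stream of 10 integers, $k = 3$ and an arbitrary fixed seed: each value should end up in the reservoir in about $\frac{k}{N} = 0.3$ of the trials. The trial count and seed are illustrative choices, not values from the post.

```python
import itertools
import random
from collections import Counter
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def sample(stream: Iterable[T], k: int = 1) -> List[T]:
    # Same algorithm as the post, written without the 3.12 generic syntax
    stream_iter = iter(stream)
    # Retain the first 'k' elements in the reservoir
    res = list(itertools.islice(stream_iter, k))
    for i, val in enumerate(stream_iter, start=k):
        j = random.randint(0, i)
        # Replace one element at random with probability k/(i + 1)
        if j < k:
            res[j] = val
    return res


random.seed(2024)  # Arbitrary seed, only to make the run repeatable
TRIALS = 20_000
STREAM_LENGTH = 10
K = 3

counts: Counter = Counter()
for _ in range(TRIALS):
    counts.update(sample(range(STREAM_LENGTH), K))

# Observed fraction of trials in which each value was sampled
frequencies = {value: counts[value] / TRIALS for value in range(STREAM_LENGTH)}
```

A simulation like this is no substitute for the proof, but it quickly exposes off-by-one mistakes in the bounds passed to `random.randint`.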
--- a/flake.lock
+++ b/flake.lock
@ -3,11 +3,11 @@
    "flake-compat": {
      "flake": false,
      "locked": {
-        "lastModified": 1673956053,
-        "narHash": "sha256-4gtG9iQuiKITOjNQQeQIpoIB6b16fm+504Ch3sNKLd8=",
+        "lastModified": 1696426674,
+        "narHash": "sha256-kvjfFW7WAETZlt09AgDn1MrtKzP7t90Vf7vypd3OL1U=",
        "owner": "edolstra",
        "repo": "flake-compat",
-        "rev": "35bb57c0c8d8b62bbfd284272c928ceb64ddbde9",
+        "rev": "0f9255e01c2351cc7d116c072cb317785dd33b33",
        "type": "github"
      },
      "original": {
@ -21,11 +21,11 @@
        "systems": "systems"
      },
      "locked": {
-        "lastModified": 1689068808,
-        "narHash": "sha256-6ixXo3wt24N/melDWjq70UuHQLxGV8jZvooRanIHXw0=",
+        "lastModified": 1710146030,
+        "narHash": "sha256-SZ5L6eA7HJ/nmkzGG7/ISclqe6oZdOZTNoesiInkXPQ=",
        "owner": "numtide",
        "repo": "flake-utils",
-        "rev": "919d646de7be200f3bf08cb76ae1f09402b6f9b4",
+        "rev": "b1d9ab70662946ef0850d488da1c9019f3a9752a",
        "type": "github"
      },
      "original": {
@ -43,11 +43,11 @@
        ]
      },
      "locked": {
-        "lastModified": 1660459072,
-        "narHash": "sha256-8DFJjXG8zqoONA1vXtgeKXy68KdJL5UaXR8NtVMUbx8=",
+        "lastModified": 1709087332,
+        "narHash": "sha256-HG2cCnktfHsKV0s4XW83gU3F57gaTljL9KNSuG6bnQs=",
        "owner": "hercules-ci",
        "repo": "gitignore.nix",
-        "rev": "a20de23b925fd8264fd7fad6454652e142fd7f73",
+        "rev": "637db329424fd7e46cf4185293b9cc8c88c95394",
        "type": "github"
      },
      "original": {
@ -58,11 +58,11 @@
    },
    "nixpkgs": {
      "locked": {
-        "lastModified": 1691155369,
-        "narHash": "sha256-CIuJO5pgwCMsZM8flIU2OiZ79QfDCesXPsAiokCzlNM=",
+        "lastModified": 1722415718,
+        "narHash": "sha256-5US0/pgxbMksF92k1+eOa8arJTJiPvsdZj9Dl+vJkM4=",
        "owner": "NixOS",
        "repo": "nixpkgs",
-        "rev": "7d050b98e51cdbdd88ad960152d398d41c7ff5b4",
+        "rev": "c3392ad349a5227f4a3464dce87bcc5046692fce",
        "type": "github"
      },
      "original": {
@ -75,9 +75,6 @@
    "pre-commit-hooks": {
      "inputs": {
        "flake-compat": "flake-compat",
-        "flake-utils": [
-          "futils"
-        ],
        "gitignore": "gitignore",
        "nixpkgs": [
          "nixpkgs"
@ -87,11 +84,11 @@
        ]
      },
      "locked": {
-        "lastModified": 1691093055,
-        "narHash": "sha256-sjNWYpDHc6vx+/M0WbBZKltR0Avh2S43UiDbmYtfHt0=",
+        "lastModified": 1721042469,
+        "narHash": "sha256-6FPUl7HVtvRHCCBQne7Ylp4p+dpP3P/OYuzjztZ4s70=",
        "owner": "cachix",
        "repo": "pre-commit-hooks.nix",
-        "rev": "ebb43bdacd1af8954d04869c77bc3b61fde515e4",
+        "rev": "f451c19376071a90d8c58ab1a953c6e9840527fd",
        "type": "github"
      },
      "original": {
--- a/flake.nix
+++ b/flake.nix
@ -22,7 +22,6 @@
      repo = "pre-commit-hooks.nix";
      ref = "master";
      inputs = {
-        flake-utils.follows = "futils";
        nixpkgs.follows = "nixpkgs";
        nixpkgs-stable.follows = "nixpkgs";
      };
--- a/layouts/partials/head-extra.html
+++ b/layouts/partials/head-extra.html
@ -3,6 +3,30 @@
    <link rel="stylesheet" type="text/css" href="https://tikzjax.com/v1/fonts.css">
    <script async src="https://tikzjax.com/v1/tikzjax.js"></script>
 {{ end }}
+<!-- Graphviz support -->
+{{ if (.Params.graphviz) }}
+    <script src="https://cdn.jsdelivr.net/npm/@viz-js/viz@3.7.0/lib/viz-standalone.min.js"></script>
+    <script type="text/javascript">
+    (function() {
+        Viz.instance().then(function(viz) {
+            Array.prototype.forEach.call(document.querySelectorAll("pre.graphviz"), function(x) {
+                var svg = viz.renderSVGElement(x.innerText);
+                // Let CSS take care of the SVG size
+                svg.removeAttribute("width")
+                svg.setAttribute("height", "auto")
+                x.replaceChildren(svg)
+            })
+        })
+    })();
+    </script>
+{{ end }}
+<!-- Mermaid support -->
+{{ if (.Params.mermaid) }}
+    <script type="module" async>
+        import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@latest/dist/mermaid.esm.min.mjs";
+        mermaid.initialize({ startOnLoad: true });
+    </script>
+{{ end }}
 {{ with .OutputFormats.Get "atom" -}}
    {{ printf `<link rel="%s" type="%s" href="%s" title="%s" />` .Rel .MediaType.Type .Permalink $.Site.Title | safeHTML }}
 {{ end -}}
--- a/layouts/shortcodes/graphviz.html
+++ b/layouts/shortcodes/graphviz.html
@ -0,0 +1,16 @@
+<pre class="graphviz">
+    {{ with .Get "file" }}
+        {{ if eq (. | printf "%.1s") "/" }}
+            {{/* Absolute paths are from root of site. */}}
+            {{ $.Scratch.Set "filepath" . }}
+        {{ else }}
+            {{/* Relative paths are from page directory. */}}
+            {{ $.Scratch.Set "filepath" $.Page.File.Dir }}
+            {{ $.Scratch.Add "filepath" . }}
+        {{ end }}
+
+        {{ $.Scratch.Get "filepath" | readFile }}
+    {{ else }}
+        {{.Inner}}
+    {{ end }}
+</pre>
--- a/layouts/shortcodes/mermaid.html
+++ b/layouts/shortcodes/mermaid.html
@ -0,0 +1,16 @@
+<pre class="mermaid">
+    {{ with .Get "file" }}
+        {{ if eq (. | printf "%.1s") "/" }}
+            {{/* Absolute path are from root of site. */}}
+            {{ $.Scratch.Set "filepath" . }}
+        {{ else }}
+            {{/* Relative paths are from page directory. */}}
+            {{ $.Scratch.Set "filepath" $.Page.File.Dir }}
+            {{ $.Scratch.Add "filepath" . }}
+        {{ end }}
+
+        {{ $.Scratch.Get "filepath" | readFile }}
+    {{ else }}
+        {{.Inner}}
+    {{ end }}
+</pre>
--- a/layouts/shortcodes/tikz.html
+++ b/layouts/shortcodes/tikz.html
@ -1,3 +1,16 @@
 <script type="text/tikz">
+    {{ with .Get "file" }}
+        {{ if eq (. | printf "%.1s") "/" }}
+            {{/* Absolute path are from root of site. */}}
+            {{ $.Scratch.Set "filepath" . }}
+        {{ else }}
+            {{/* Relative paths are from page directory. */}}
+            {{ $.Scratch.Set "filepath" $.Page.File.Dir }}
+            {{ $.Scratch.Add "filepath" . }}
+        {{ end }}
+
+        {{ $.Scratch.Get "filepath" | readFile }}
+    {{ else }}
        {{.Inner}}
+    {{ end }}
 </script>
Author	SHA1	Message	Date
Bruno BELANYI	9208b4b874	nix: bump inputs Some checks failed ci/woodpecker/push/deploy/2 Pipeline failed Details	2024-08-10 11:56:04 +01:00
Bruno BELANYI	11db5a27b9	Add Reservoir Sampling post	2024-08-10 11:56:04 +01:00
Bruno BELANYI	cc4440c946	Add Treap Revisited post	2024-08-10 11:56:04 +01:00
Bruno BELANYI	7bc3d5c18f	posts: reservoir-sampling: add k-element sampling	2024-08-10 11:56:04 +01:00
Bruno BELANYI	806772d883	posts: gap-buffer: fix typo	2024-08-10 11:56:04 +01:00
Bruno BELANYI	883f0e7e9b	posts: treap: add removal	2024-08-10 11:56:04 +01:00
Bruno BELANYI	9ff4a07c9b	posts: reservoir-sampling: add one-element sample	2024-08-10 11:56:04 +01:00
Bruno BELANYI	eff8152307	posts: union-find: fix typo	2024-08-10 11:56:04 +01:00
Bruno BELANYI	652fe81c41	posts: treap-revisited: add insertion	2024-08-10 11:56:04 +01:00
Bruno BELANYI	3605445bcf	posts: add 'reservoir-sampling'	2024-08-10 11:56:04 +01:00
Bruno BELANYI	476322a627	Add Treap post	2024-08-10 11:56:04 +01:00
Bruno BELANYI	0798812f86	posts: treap: add merge	2024-08-10 11:56:04 +01:00
Bruno BELANYI	cd24e9692a	markdownlint: relax duplicate header check	2024-08-10 11:56:04 +01:00
Bruno BELANYI	5a233e7384	layouts: add Mermaid support Similar to Graphviz and TikZ support.	2024-08-10 11:56:04 +01:00
Bruno BELANYI	dea81f1859	posts: treap: add insertion	2024-08-10 11:56:04 +01:00
Bruno BELANYI	d33247b786	posts: treap-revisited: add split	2024-08-10 11:56:04 +01:00
Bruno BELANYI	62cd0759cf	config: enable MathJax	2024-08-10 11:56:04 +01:00
Bruno BELANYI	87ef9dd38c	layouts: add Graphviz support Similar to TikZ support.	2024-08-10 11:56:04 +01:00
Bruno BELANYI	2eaa9c4329	posts: treap: add search	2024-08-10 11:56:04 +01:00
Bruno BELANYI	19b535ce49	posts: treap-revisited: add implementation	2024-08-10 11:56:04 +01:00
Bruno BELANYI	a6bbb10098	layouts: tikz: allow using file input Makes it easier to handle big diagrams.	2024-08-10 11:56:04 +01:00
Bruno BELANYI	e842737cb6	posts: treap: add construction	2024-08-10 11:56:04 +01:00
Bruno BELANYI	21fbc24e02	posts: add 'treap-revisited'	2024-08-10 11:56:04 +01:00
Bruno BELANYI	879b671332	Add Bloom Filter post	2024-08-10 11:56:04 +01:00
Bruno BELANYI	9ff51fe82e	posts: treap: add presentation	2024-08-10 11:56:04 +01:00
Bruno BELANYI	9ef33b7ff8	Add Gap Buffer post	2024-08-10 11:56:04 +01:00
Bruno BELANYI	c97d83d883	posts: bloom-filter: add lookup	2024-08-10 11:56:04 +01:00
Bruno BELANYI	768acac4ae	posts: add treap	2024-08-10 11:56:04 +01:00
Bruno BELANYI	e8acb49b53	posts: gap-buffer: add movement	2024-08-10 11:56:04 +01:00
Bruno BELANYI	114ca1de50	posts: bloom-filter: add insertion	2024-08-10 11:56:04 +01:00
Bruno BELANYI	11138dafd1	posts: gap-buffer: add deletion	2024-08-10 11:56:04 +01:00
Bruno BELANYI	84ce6ea494	posts: bloom-filter: add construction	2024-08-10 11:56:04 +01:00
Bruno BELANYI	dbbcd528c3	posts: gap-buffer: add insertion	2024-08-10 11:56:04 +01:00
Bruno BELANYI	4abcd27ee7	posts: bloom-filter: add presentation	2024-08-10 11:56:04 +01:00
Bruno BELANYI	06c4a03a42	posts: gap-buffer: add growth	2024-08-10 11:56:04 +01:00
Bruno BELANYI	4da83c9716	posts: add bloom-filter	2024-08-10 11:56:04 +01:00
Bruno BELANYI	408b74daf7	posts: gap-buffer: add accessors	2024-08-10 11:56:04 +01:00
Bruno BELANYI	a9f003f4ee	posts: gap-buffer: add construction	2024-08-10 11:56:04 +01:00
Bruno BELANYI	51a1bd01cd	posts: gap-buffer: add presentation	2024-08-10 11:56:04 +01:00
Bruno BELANYI	f2fa93ad8b	posts: add gap-buffer	2024-08-10 11:56:04 +01:00