From 798116716f528a5a439d1bc490ec1a955d548e04 Mon Sep 17 00:00:00 2001
From: Bruno BELANYI
Date: Sun, 14 Jul 2024 17:55:15 +0100
Subject: [PATCH] posts: bloom-filter: add construction

---
 .../posts/2024-07-14-bloom-filter/index.md | 25 +++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/content/posts/2024-07-14-bloom-filter/index.md b/content/posts/2024-07-14-bloom-filter/index.md
index 0a82882..547d50f 100644
--- a/content/posts/2024-07-14-bloom-filter/index.md
+++ b/content/posts/2024-07-14-bloom-filter/index.md
@@ -37,3 +37,28 @@ the _false positive_ rate of membership is quite low.
 
 I won't be going into those calculations here, but they are quite trivial to
 compute, or one can just look up appropriate values for their use case.
+
+## Implementation
+
+I'll be using Python, which has the nifty ability of representing bitsets
+through its built-in big integers quite easily.
+
+We'll be assuming a `BIT_COUNT` of 64 here, but the implementation can easily be
+tweaked to use a different number, or even change it at construction time.
+
+### Representation
+
+A `BloomFilter` is just a set of bits and a list of hash functions.
+
+```python
+BIT_COUNT = 64
+
+class BloomFilter[T]:
+    _bits: int
+    _hash_functions: list[Callable[[T], int]]
+
+    def __init__(self, hash_functions: list[Callable[[T], int]]) -> None:
+        # Filter is initially empty
+        self._bits = 0
+        self._hash_functions = hash_functions
+```
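
To make the "big integer as a bitset" idea in the post concrete, here is a minimal sketch of setting and testing individual bits on a plain Python `int`. The helper names `set_bit` and `has_bit` are hypothetical and are not part of the post's `BloomFilter`. Reducing an index modulo `BIT_COUNT` is likewise an assumption about how a hash value might be mapped to a bit position.

```python
BIT_COUNT = 64  # same value the post assumes


def set_bit(bits: int, index: int) -> int:
    """Return `bits` with the bit at `index` set (hypothetical helper)."""
    # Assumption: indices wrap around modulo BIT_COUNT.
    return bits | (1 << (index % BIT_COUNT))


def has_bit(bits: int, index: int) -> bool:
    """Report whether the bit at `index` is set (hypothetical helper)."""
    return bool((bits >> (index % BIT_COUNT)) & 1)


bits = 0  # an empty bitset, like a freshly constructed filter
bits = set_bit(bits, 3)
bits = set_bit(bits, 10)
bits = set_bit(bits, 67)  # 67 % 64 == 3, so this bit is already set

print(has_bit(bits, 3), has_bit(bits, 4))  # True False
print(bin(bits))  # 0b10000001000
```

Because Python integers are immutable, each update produces a new `int`, which is why the helpers return the updated value instead of mutating in place.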