2.9 KiB
title | date | draft | description | tags | categories | series | favorite | disable_feed | |||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Trie | 2024-06-30T11:07:49+01:00 | false | A cool map |
|
|
|
false | false |
This time, let's talk about the Trie, which is a tree-based mapping structure most often used for string keys.
What does it do?
A Trie can be used to map a set of string keys to their corresponding values, without the need for a hash function. This also means you won't suffer from hash collisions, though the tree-based structure will probably translate to slower performance than a good hash table.
A Trie is especially useful to represent a dictionary of words in the case of spell correction, as it can easily be used to fuzzy match words under a given edit distance (think Levenshtein distance)
Implementation
This implementation will be in Python for exposition purposes, even though
it already has a built-in dict
.
Representation
Creating a new Trie
is easy: the root node starts off empty and without any
mapped values.
class Trie[T]:
_children: dict[str, Trie[T]]
_value: T | None
def __init__(self):
# Each letter is mapped to a Trie
self._children = defaultdict(Trie)
# If we match a full string, we store the mapped value
self._value = None
We're using a defaultdict
for the children for ease of implementation in this
post. In reality, I would encourage you exit early when you can't match a given
character.
The string key will be implicit by the position of a node in the tree: the empty string at the root, one-character strings as its direct children, etc...
Search
An exact match look-up is easily done: we go down the tree until we've exhausted the key. At that point we've either found a mapped value or not.
def get(self, key: str) -> T | None:
# Have we matched the full key?
if not key:
# Store the `T` if mapped, `None` otherwise
return self._value
# Otherwise, recurse on the child corresponding to the first letter
return self._children[key[0]].get(key[1:])
Addition
Adding a new value to the Trie is similar to a key lookup, only this time we store the new value instead of returning it.
def insert(self, key: str, value: T) -> bool:
# Have we matched the full key?
if not key:
# Check whether we're overwriting a previous mapping
was_mapped = self._value is None
# Store the corresponding value
self._value = value
# Return whether we've performed an overwrite
return was_mapped
# Otherwise, recurse on the child corresponding to the first letter
return self._children[key[0]].insert(key[1:], value)