Do or Do Not, There is No Trie.


Decided I would use this time in writing a blog post to be productive and review data structures. Especially one that wasn’t covered in CS 261.

What is a Trie? This is also called a prefix tree, and this is a tree data structure that is used for retrieval of a key in a dataset of strings. This data structure can be used in various applications such as:

  1. Autocomplete
  2. Spell Checker
  3. IP Routing (Longest prefix matching)
  4. T9 Predictive Text
  5. Solving Word Games

But why a trie over another data structure like a hash table? A hash table has O(1) time complexity looking for a key, but it’s not efficient in 1) finding keys with a common prefix, and 2) enumerating a dataset of strings in lexicographical order. There can also be hash collisions with a hash table. A trie can have O(m) time complexity, where m is the key length, and searching for a key in a balanced tree costs O(mlogn) time complexity.

A trie is a rooted tree, and its nodes generally have two fields: 1) Links to its children and 2) Boolean field that specifies whether the node corresponds to the end of the key or is just a prefix.

The following is an implementation of a Trie in Python, with the following conditions (source: https://leetcode.com/problems/implement-trie-prefix-tree/description/).

  • Trie() Initializes the trie object.
  • void insert(String word) Inserts the string word into the trie.
  • boolean search(String word) Returns true if the string word is in the trie (i.e., was inserted before), and false otherwise.
  • boolean startsWith(String prefix) Returns true if there is a previously inserted string word that has the prefix prefix, and false otherwise.
class Trie:
    def __init__(self):
        """Initialize a TrieNode."""
        self.root = TrieNode()
        

    def insert(self, word: str) -> None:
        """Inserts a word into the trie."""
        cur_node = self.root
        for char in word:
            if char not in cur_node.children:
                cur_node.children[char] = TrieNode()
            cur_node = cur_node.children[char]
        
        cur_node.is_end = True
        

    def search(self, word: str) -> bool:
        """Returns True if word is in the trie, else False."""
        cur_node = self.root
        for char in word:
            if char not in cur_node.children:
                return False
            cur_node = cur_node.children[char]
        
        return cur_node.is_end
        

    def startsWith(self, prefix: str) -> bool:
        """Returns True if word exists in the trie that starts with the given prefix."""
        cur_node = self.root
        for char in prefix:
            if char not in cur_node.children:
                return False
            cur_node = cur_node.children[char]
            
        return True
        
    
class TrieNode:
    def __init__(self):
        self.children = {}
        self.is_end = False
    
    
# Your Trie object will be instantiated and called as such:
# obj = Trie()
# obj.insert(word)
# param_2 = obj.search(word)
# param_3 = obj.startsWith(prefix)

In this case, the insertion of a key to the trie is O(m) time complexity, where m is the key length, and O(m) space complexity.

Searching for a key in the trie is O(m) time complexity, since in the worst case, we perform m operations. The space complexity is O(1).

Searching for a prefix in the trie is also O(m) time complexity and O(1) space complexity. This is one benefit of a trie over a hash map.

Print Friendly, PDF & Email

Leave a Reply

Your email address will not be published. Required fields are marked *