Binary Search Tree – Deep Dive

Continuing my interview prep journey i’d like to tackle a specific data structure in the tree family. Tree’s ordinarily are data structures that can have between 0-N children and all generate from a specific root. (Tree’s honestly look more like a trees root structure than the actual tree, but tree sounds better). Trees have a variety of uses and applications (A specific example is the DNS system used for addressing domain names) and are an important data structure to learn.

There are a few subclasses of the tree, but one important one is the binary search tree. A binary search tree is a tree where each child has at most two children. The left child is always less than the parent node and the right child is always greater than the parent node. If you traverse the tree any node within the tree will also be a binary search tree. What makes this structure so powerful is its ability to make a search for a specific value O(log N) time.

Construction

Similar to a linked list, a binary search tree is constructed of nodes that reference one another. The base level is called the root which is contained in the tree code. (The following examples and methods will be done in Python3)

class Node:
    
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

class BinarySearchTree:
   
    def __init__(self):
        self.root = None

These are the bare essentials for the binary search tree. The Node class for holding the node values and the BinarySearchTree class for performing all of the operations of the binary search tree.

Insertion

The first operation we want to perform with a binary search tree is to insert new nodes.

class BinarySearchTree:
 
    def __init__(self):
        self.root

    def insert(self, key):
        if self.root is None:
            self.root = Node(key)
        else:
            current = root
            parent = None

            while True:
                parent = current

                if key < parent.key:
                    current = current.left
                    
                    if current is None:
                        parent.left = Node(key)
                        return
                   
                else:
                    current = current.right
    
                    if current is None:
                        parent.right = Node(key)
                        return
 

What happens in the code. We first check the root to ensure that there is a Node assigned. If there is we can traverse down the tree by comparing the value of our new key against the parent node until we find an open spot.

Search

Search is very simple. You compare the values from left and right until you either find the value or find None.

def search(self, target):
    current = self.root

    while current is not None:
        if current.key == target:
            return True
        else:
            if current.key > target:
                current = current.left
            else:
                current = current.right
    return False

Delete

Deleting is a little more complex depending on which node is deleted.

def delete(self, target):
    if root is None:
        return

    self.root = self.delete_helper(self.root, target)

def getMinimumKey(self, current):
    while current.left:
        current = current.left
    return current

def delete_helper(self, root, target):
    
    parent = None
    current = root

    while current and current.key != target:
        parent = current
        
        if target < current.key:
            current= current.left
        else:
            current = current.right
        
        current is None:
            return root

     # Case 1: No Children
     if current.left is None and current.right is None:
            
         if current != root:
             if parent.left == current:
                 parent.left = None
             else:
                 parent.right = None
         else:
             root = None

     # Case 2: node to be deleted has two children
     elif current.left and current.right:
             
         successor = self.getMinimumKey(current.right)
             
         val = successor.key
             
         self.delete_helper(root, successor.key)

         current.key = val

     # Case 3: node to be deleted has only one child
     else:
           
         if current.left:
             child = current.left
         else:
             child = current.right

         if current != root:
             if current = parent.left:
                 parent.left = child
             else:
                 parent.right = child
         else:
             root = child

      return root

There are three cases to look out for when deleting a node. If the node has no children, the node has one child, and the node has two children. In case 1, where the node has no children such as a leaf node or a root node with no children. We simply set the side that the child is on to None or the root to None if we are deleting a root with no children. If the node has one child we figure out where the child is (left or right) and then we assign that child to the now open position on the current nodes parent to the child, or set the root to the child. For two children it gets slightly more complicated. The child that needs to replace a two child parent is the parents in order successor or the right child’s left most child. I added another helper method that finds this child and then once it is found it will apply the deletion and replacement.

Traversals

Traversals involve how to navigate through the tree. They also come up in interview questions (there is a famous one where you need to build a correct tree with a pre-order and in-order array). There are three types, pre-order, in-order, and post-order. Each of the traversals deals with which node gets processed first when navigating through the tree. All operations are performed in O(N) time (because you need to find every node).

For my design of the traversals I’m focusing on their utility in the testing process. I’ll create an array that will be returned and test that array to ensure the traversal gives the right result.

In Order Traversal:
In order traversal is the easiest to comprehend. When you finish with this traversal it should return the values from least to greatest as it is processing the nodes in order. (For the sake of space, all of these methods are within the BinarySearchTree, but I will show them on their own.)

def in_order_traversal(self):
    traversal_list = []
    return self.in_order_traversal_helper(self.root, traversal_list

def in_order_traversal_helper(self, root, traversal_list):
    if root is None:
        return
    self.in_order_traversal_helper(root.left, traversal_list)
    traversal_list.append(root.key)
    self.in_order_traversal_helper(root.right, traversal_list)
    
    return traversal_list

You’ll notice that in_order_traversal_helper() is just a depth first search that appends the values to an array and returns that array. If I had a tree with 2 as the root, 1 as the left node, and 3 as the right node if I perform an in-order traversal, I will get an array that will look like [1, 2, 3]

Pre-order Traversal
Pre-order traversal processes a node as soon as it is reached and then moves along the left and right sub trees. Since the root is always the first node in the list. Pre-order traversal’s can be useful for constructing a correct tree from an array. Note that the traversals are all very similar, what separates them is the order in processing nodes.

def pre_order_traversal(self):
    traversal_list = []
    return pre_order_traversal_helper(root, traversal_list)

def pre_order_traversal_helper(self, root, traversal_list)
    if root is None:
       return

    traversal_list.append(root.key)
    self.pre_order_traversal_helper(root.left, traversal_list)
    self.pre_order_traversal_helper(root.right, traversal_list)

    return traversal_list

This will add the node to the array as soon as it is reached, and then move on to the left then right sub trees. Given the same tree as before (root=2, left=1, right=3) this will return an array [2, 1, 3]

Post-order traversal
Post order traversal involves going all the way to the furthest leaf within the tree then processing it. The array will be filled with all the leaf nodes starting from left to right, then the parents, all the way until the root.

def post_order_traversal(self):
    traversal_list = []
    return self.post_order_traversal_helper(root, traversal_list)

def post_order_traversal_helper(self, root, traversal_list):
    if root is None:
        return
    
    self.post_order_traversal_helper(root.left, traversal_list)
    self.post_order_traversal_helper(root.right, traversal_list)
    traversal_list.append(root.key)

    return traversal_list

Given the same tree (root=2, left=1, right=3) we get an array returned [1, 3, 2]

Those are the basic operations of a binary search tree. There are more advanced BST’s that also self balance (which I may or may not cover in a future blog post) that cover for disadvantages of the binary search tree (what if all the inserted nodes are on the right or left of the tree?). With the basics of the tree down we can get a better idea of how to navigate BST problems and possible ways to use a BST to solve common problems.

Leave a Reply