IT 327 Data Structures - Week 8 - Binary Search Trees

Today's lesson is Chapter 9: Binary Search Trees and AVL Trees.

Binary Search Trees Explained

A binary search tree (or BST) is structured such that, for any given node, the left child is "less than" the node and the right child is "greater than" the node. This will be the case not just for the immediate child but for all the descendants. That is, all the nodes in the left subtree of a given node will be "less" and all the nodes in the right subtree will be greater. This property should hold true for all nodes in the tree, at any level, not just the root node. In other wordes, every subtree in a binary search tree is also a binary search tree.

We've seen various approaches to searching, so here's how a BST compares to some others:

  • Like a linked-list a BST is dynamic. Unlike a linked-list, the BST is non-linear, which greatly improves search times.
  • Lika a hash table, a BST is optomized for storage and lookup. Unlike a hash table: (a) the BST does not consume unnecessary memory, and (b) it is very easy to display data in sorted order.
  • Like the array-based binary search (and to a lesser extent the interpolation search), searching for nodes in a binary tree runs very fast (O(lg n)) because we cut down the search space by half every time. Unlike the array-based binary search: (a) there is no time-consuming sorting phase that must be done beforehand (nodes are added in inserted order), and (b) you do not need to store items in an array.
           50
          /  \
         /    \
        /      \
      40       70
      / \     /  \
     /  50   60  80
   20
  /  \
 10  30

Balanced Search Trees: AVL Trees

One problem with BSTs is that they can become unbalanced (i.e. left-heavy or right-heavy), depending on how values are added to the tree. In the worst-case, nodes are added in order (forward or reverse) and the tree degrades to the performance (and appearance) of a linked-list.

To cure this problem, you must rotate the tree (or more likely, its subtrees) to make the tree balanced again. To see an example of the four rotations that you must do to keep a tree balanced, please see this page. The other pages in the "links" section (below) contain excellent examples.

Links

Tree Links - Links to some articles and explanations of trees on the web. Many include animtions and / or illustrations that are worth a thousand words.

Labs

Two labs today.

Unix Makefile: Makefile: Use this if you're working on your labs on a Unix / Linux system with gcc & make.

Lab #1: Binary Search Tree (52 fill-ins)

Several files are needed for this lab.

  1. Header: bstree.h: Note: Do not manually add bstree.h to the project, it will automatically be added for you when you add the other files.
  2. Implementation: bstree.cpp: This is the only file you will edit, print up, and turn in.
  3. Driver: bstree_test.cpp: Run this program to test your lab.

    Note: You must manually add bstree_test.cpp to your project (using "Add files to project") or it will complain about your program having no main() function and will completely fail to run.

  4. Output: bstree_test.out: Compare the output you get with the contents of this file to be sure you did it right.

    Note: You do not have to include the output file in your project, just look at it to see if your program is working right.

Lab #2: AVL Tree (37 fill-ins)

Several files are needed for this lab.

  1. Header: avltree.h: Note: Do not manually add avltree.h to the project, it will automatically be added for you when you add the other files.
  2. Implementation: avltree.cpp: This is the only file you will edit, print up, and turn in.
  3. Base Class: bstree.cpp: This is your binary search tree from Lab #1. Because the AVL tree is derived from it, you must manually include it in your project with "Add files to project". This also means you must complete Lab #1 before starting on Lab #2.
  4. Driver: avltree_test.cpp: Run this program to test your lab.

    Note: You must manually add avltree_test.cpp to your project (using "Add files to project") or it will complain about your program having no main() function and will completely fail to run.

  5. Output: avltree_test.out: Compare the output you get with the contents of this file to be sure you did it right. (Note: you will probably want to set the debugging variable at the top of bstree.cpp to 'false' to de-clutter the output.) If you want to see some massive tests, uncomment the lines at the bottom of avltree_test.cpp and you can compare it to this output: avltree_test_massive.out.

    Note: You do not have to include the output file in your project, just look at it to see if your program is working right.

Homework

For homework, add a method to the binary search tree (the plain one, not the AVL tree) that will delete a given node.

Adding the function header

First, add the following prototype as a public function in the bstree.h file:

	bool del(string del_me);

(I've put a commented-out one at the bottom of the file for you.)

Second, implement it as follows in bstree.cpp

	bool BinarySearchTree::del(string del_me)

Hints

  1. First, test if the node does not exist in the tree. If it doesn't, return false (because we can't delete a node that isn't there). (Have a look at the top of the insert(string addme) function for an idea of what I'm talking about.)
  2. Next, you will have is a while loop very similar to the one in exists(), with a few important differences:
    1. We will need to keep track of the parent of the node we will delete. Declare a pointer to a BinarySearchNode before the while loop called 'parent' and set it to null initially.
    2. If we find the node we're looking for, break out of the loop instead of returning from the function.
    3. Every time we need to traverse right or left, we need to set the parent to the current node before setting the node to it's left or right child.
  3. After the while loop, we need to see which child the parent lost (if any).
    1. If we have no parent (it's null), it means we're going to delete the root node, so set 'root' to null. (Important: This test must be done before the following two tests.)
    2. Else if the node we found is equal to the parent's left child, we need to set the parent's left child to null.
    3. Else if the node we found is equal to the parent's right child, we need to set the parent's right child to null.
  4. Before we can delete the node, we need to re-add it's orphaned children to the tree.
    1. If the node we're going to delete has a left child, re-add it to the tree by calling the insert() method that takes a BinarySearchNode, passing it the node's left child.
    2. Do the same thing for the right child. Note: this step will not be an "else if" clause, just another, standalone "if" statement. (It is possible that a node could have both right and left children, not just one or the other.)
  5. At the end if it all, you can finally delete the node, decrement the 'size' member (there's one less node in the tree), and return true.

Testing your code

Uncomment the "HOMEWORK" section in bstree_test.cpp (the driver program you used in Lab #1) to test the del() method you wrote.

Output: bstree_test_hwk.out: Shows an example of the the output from the same driver program as before, but with the homework tests uncommented. This exercise exposes a shortcoming of the plain-vanilla BST: the deletions we make cause it to degrade to a linear, linked-list.