Binary Tree


  1. Abstract idea of a tree:

    A tree is another data structure that you can use to store information. Unlike stacks and queues, which are linear data structures, trees are hierarchical data structures.

    Saying that the structure of a tree is hierarchical means that things are ordered above or below other things. For example, the army is hierarchical, with generals above colonels, and colonels above lieutenants, etc.

    Here is an example of a tree holding letters:

          tree
          ----
           j    <-- root
         /   \
        f      k  
      /   \      \
     a     h      z    <-- leaves
    

    Tree Vocabulary

    Let's now introduce some vocabulary with our sample tree... The element at the top of the tree is called the root. The elements that are directly under an element are called its children. The element directly above something is called its parent. For example, a is a child of f and f is the parent of a. Finally, elements with no children are called leaves.

    Is k a leaf? Who is its parent?

    How many parents can elements have?


    Aside: If you were to draw the picture above upside down, it would look like a real tree, with the leaves at the top and the root at the bottom...However, we usually draw tree data structures as we've done above.

    Uses

    There are many reasons to use a tree to store information. One reason might be because you want to store information that naturally forms a hierarchy. For example, the file system on a computer:
    file system
    -----------
         /    <-- root
      /     \
    ...    home
          /    \
       ugrad    course
        /      /   |   \
      ...  cs101 cs112 cs113
    

    Despite the hierarchical order of the structure of the tree, the order enforced on objects in the tree will depend on how we use the tree. This just means that unlike a stack whose operations are usually limited to push and pop, there are many different kinds of trees and ways to use them. Thus, this flexibility makes them more akin to linked lists.

    Another reason to use a tree is because trees make some operations more efficient (we'll discuss that at some later time).

    Recursive Data Structure

    A tree can be viewed as a recursive data structure. Why? Remember that recursive means that something is defined in terms of itself. Here, this means that trees are made up of subtrees.

    For example, let's look at our tree of letters and examine the part starting at f and everything under it...

          tree
          ----
           j
         /   \
        f      k  
      /   \      \
     a     h      z
    

    Doesn't it look like a tree itself? In this subtree, what is f (recall our vocabulary)?

    What about just z? Doesn't it look like a subtree?

    Binary Trees

    We can talk about trees where the number of children that any element has is limited. In the tree above, no element has more than 2 children. For the rest of this example, we will enforce this to be the case.

    A tree whose elements have at most 2 children is called a binary tree.

    Since each element in a binary tree can have only 2 children, we typically name them the left and right child.

  2. Tree operations:

    As mentioned, there are different kinds of trees (e.g., binary search trees, 2-3 trees, AVL trees, tries, just to name a few).

    What operations we will need for a tree, and how they work, depends on what kind of tree we use. However, there are some common operations we can mention:

    Other operations may be necessary, depending on the kind of tree we use.

  3. Tree representation in C++:

    Since we want to be able to represent a tree in C++, how are we going to store this hierarchical structure?

    Can we use an array?

    Answer: Certainly! There are times when we can use an array to represent a tree.

    However, we can also do something along the lines of a linked list. For example, just as linked list nodes hold one element and point to the next node...

     |
     v
    ---------     ---------     ---------
    | a | --+---> | b | --+---> | c | 0 |
    ---------     ---------     ---------
    

    we could have tree nodes that hold one element and point to their children...

     -----
     | j |
     |---|
     | | |
     /---\
    v     v
    

    So, our first example tree of letters would look something like:

               |
               v
             -----
             | j |
             |---|
             | | |
             /---\
            v     v
        -----     -----
        | f |     | k |
        |---|     |---|
        | | |     |0| |
        /---\     ----\
       v     v         v
    -----   -----      -----
    | a |   | h |      | z |
    |---|   |---|      |---|
    |0|0|   |0|0|      |0|0|
    -----   -----      -----
    

    Note that some nodes don't have a left and/or right child, so those pointers are NULL.

    Also, just as we need a pointer to the first node to keep track of a linked list; here, we need a pointer to the root node to keep track of a tree.

    Special case: Empty Tree

    What about when the tree is empty. How do we represent an empty tree?

    Answer: There are no nodes, so the pointer to the root should be NULL.

  4. Data types for a tree:

    First, we need to decide what types (e.g., classes) we'll need to implement the above representation of a tree of characters. We'll want to use classes to hide the internal details of this tree data structure.

    As usual, we'll want our tree data structure in its own module.

    People will use our tree something like the following:

    #include "binarytree.h"
    
    char ch;
    
    // Setup trees.
    BinaryTree t1;
    BinaryTree t2;
    
    t1.insert('a');
    t2.insert(ch);
    
    ...
    

    That means, as part of the interface to the binary tree, we need to provide:

    In addition, to fully implement the binary tree, we'll need:


    Now we can fill in these types. Let's start bottom-up from the simplest type and work our way up through types that use the simpler types.

    The type-of-an-element has already been determined:

    typedef char ItemType;
    

    Next, elements of the tree are being stored in nodes. For our binary tree, nodes must contain an element and pointers to 2 possible children, the left one and the right one.

    How do we define the type for a node with these 3 parts?

    Answer: We can combine them into one type with a class:

    class BTNode {
    public:
      ...
    
    private:
      ItemType data;
      BTNode *lChild;
      BTNode *rChild;
    };
    

    Next, we need something that holds all the information needed to keep track of the tree. Since the nodes already hold any elements, the only thing that is missing is the pointer to the root node.

    class BinaryTree {
    public:
      BinaryTree();
      bool insert(ItemType);
      ...
    
    private:
      BTNode* root;
    };
    

  5. Tree node implementation

    Let's look at a more complete class definition for the BTNode class:

    class BTNode {
    public:
      // Constructors.
      BTNode();
      BTNode(ItemType);
    
      // Get fields of node.
      ItemType getData() const;
      BTNode *getLeftChild() const;
      BTNode *getRightChild() const;
    
      // Set fields of node.
      void setData(ItemType);
      void setLeftChild(BTNode *);
      void setRightChild(BTNode *);
    
    private:
      ItemType data;
      BTNode *lChild;
      BTNode *rChild;
    };
    

    Note that it has methods to get and set all its fields.

    Now, let's look at skeletons for the method definitions of BTNode.

  6. Exercise

    1. Download btnode.h, btnode.cpp, and itemtype.h.

    2. Implement BTNode's methods.

    3. Write a driver program containing a main() function to test your implementation.


BU CAS CS - Binary Tree
This page created by Jisook Youn <jisook@cs.bu.edu>.
Material adapted for C++ from Binary Tree (C version by Robert I. Pitts <rip@bu.edu>).