Breadth-First Traversal of a Tree


  1. Helper data structure:

    Certain programming problems are easier to solve using multiple data structures.

    For example, testing a sequence of characters to determine if it is a palindrome (i.e., reads the same forward and backward, like "radar") can be accomplished easily with one stack and one queue. The solution is to enter the sequence of characters into both data structures, then remove letters from each data structure one at a time and compare them, making sure that the letters match.
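
    As a rough sketch of that palindrome idea, the function below uses two small arrays as the stack and the queue. The name IsPalindrome() and the capacity MAX_CHARS are assumptions made for illustration only; they are not part of any code from these notes.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CHARS 100  /* Assumed capacity for this sketch. */

/*
 * Returns true if text reads the same forward and backward.
 * Every character is entered into both a stack and a queue;
 * the stack then gives the characters back in reverse order
 * and the queue gives them back in their original order.
 */
bool IsPalindrome(const char *text)
{
  char stack[MAX_CHARS];  /* Array-based stack; top is the count. */
  char queue[MAX_CHARS];  /* Array-based queue; front is an index. */
  size_t top = 0, rear = 0, front = 0;
  size_t length = strlen(text);
  size_t i;

  assert(length <= MAX_CHARS);

  /* Enter the sequence into both data structures. */
  for (i = 0; i < length; i++) {
    stack[top++] = text[i];   /* Push. */
    queue[rear++] = text[i];  /* Enter. */
  }

  /* Remove one character from each and compare them. */
  while (top > 0) {
    char fromStack = stack[--top];    /* Last in, first out. */
    char fromQueue = queue[front++];  /* First in, first out. */
    if (fromStack != fromQueue)
      return false;
  }
  return true;
}
```

    Calling IsPalindrome("radar") should report a palindrome, while a word like "radio" should not.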

    In this palindrome example, the user (person writing the main program) has access to both data structures to solve the problem. Another way that 2 data structures can be used in concert is to use one data structure to help implement another.

    We will examine how a common data structure can be used to help traverse a tree in breadth-first order.

  2. Depth-first traversal:

    We have already seen a few ways to traverse the elements of a tree. For example, given the following tree:

          tree
          ----
           j    <-- root
         /   \
        f      k
      /   \      \
     a     h      z
      \
       d
    

    A preorder traversal would visit the elements in the order: j, f, a, d, h, k, z.

    This type of traversal is called a depth-first traversal. Why? Because it tries to go deeper in the tree before exploring siblings. For example, the traversal visits all the descendants of f (i.e., keeps going deeper) before visiting f's sibling k (and any of k's descendants).

    As we've seen, this kind of traversal can be achieved by a simple recursive algorithm:

    PREORDER-TRAVERSE(tree)
    
    if (tree not empty)
      visit root of tree
      PREORDER-TRAVERSE(left subtree)
      PREORDER-TRAVERSE(right subtree)
    

    The 2 other traversal orders we know are inorder and postorder. An inorder traversal would give us: a, d, f, h, j, k, z. A postorder traversal would give us: d, a, h, f, z, k, j.

    Well, inorder and postorder traversals, like a preorder traversal, also try to go deeper first...

    For example, the inorder traversal visits a and d before it explores a's sibling h. Likewise, it visits all of j's left subtree (i.e., "a, d, f, h") before exploring j's right subtree (i.e., "k, z"). The same is true for the postorder traversal. It visits all of j's left subtree (i.e., "d, a, h, f") before exploring any part of the right subtree (i.e., "z, k").
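
    The three depth-first orders can be checked with a small sketch. The node type follows the treeNodeT layout used later in these notes; Visit(), the out buffer, and BuildExampleTree() are hypothetical helpers added only so the visiting order can be recorded as a string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct treeNodeTag {
  char element;
  struct treeNodeTag *left, *right;
} treeNodeT;

/* Visiting a node here just appends its element to out. */
static void Visit(const treeNodeT *node, char *out)
{
  size_t length = strlen(out);
  out[length] = node->element;
  out[length + 1] = '\0';
}

void Preorder(const treeNodeT *root, char *out)
{
  if (root != NULL) {
    Visit(root, out);
    Preorder(root->left, out);
    Preorder(root->right, out);
  }
}

void Inorder(const treeNodeT *root, char *out)
{
  if (root != NULL) {
    Inorder(root->left, out);
    Visit(root, out);
    Inorder(root->right, out);
  }
}

void Postorder(const treeNodeT *root, char *out)
{
  if (root != NULL) {
    Postorder(root->left, out);
    Postorder(root->right, out);
    Visit(root, out);
  }
}

/* Fills nodes with the example tree and returns its root. */
treeNodeT *BuildExampleTree(treeNodeT nodes[7])
{
  const char *elements = "jfkahzd";
  int i;

  for (i = 0; i < 7; i++) {
    nodes[i].element = elements[i];
    nodes[i].left = nodes[i].right = NULL;
  }
  nodes[0].left = &nodes[1];   /* j -> f */
  nodes[0].right = &nodes[2];  /* j -> k */
  nodes[1].left = &nodes[3];   /* f -> a */
  nodes[1].right = &nodes[4];  /* f -> h */
  nodes[2].right = &nodes[5];  /* k -> z */
  nodes[3].right = &nodes[6];  /* a -> d */
  return &nodes[0];
}
```

    Starting each call with an empty string in out (with room for 8 characters), the three functions should leave "jfadhkz", "adfhjkz" and "dahfzkj" in it, matching the orders listed above.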

  3. Breadth-first traversal:

    Depth-first is not the only way to go through the elements of a tree. Another way is to go through them level-by-level.

    For example, each element exists at a certain level (or depth) in the tree:

          tree
          ----
           j         <-- level 0
         /   \
        f      k     <-- level 1
      /   \      \
     a     h      z  <-- level 2
      \
       d             <-- level 3
    

    (Computer people like to number things starting with 0.)

    So, if we want to visit the elements level-by-level (and left-to-right, as usual), we would start at level 0 with j, then go to level 1 for f and k, then go to level 2 for a, h and z, and finally go to level 3 for d.

    This level-by-level traversal is called a breadth-first traversal because we explore the breadth, i.e., full width of the tree at a given level, before going deeper.
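
    To make the idea of a level concrete, here is a small sketch that finds the level of a given element. LevelOf() is a hypothetical helper, not something from these notes, and note that it searches depth-first; it only illustrates how levels are counted, not how to visit elements level-by-level.

```c
#include <assert.h>
#include <stddef.h>

typedef struct treeNodeTag {
  char element;
  struct treeNodeTag *left, *right;
} treeNodeT;

/*
 * Returns the level of element in the tree rooted at root,
 * where root itself sits at the level passed in. Returns -1
 * if the element is not in the tree.
 */
int LevelOf(const treeNodeT *root, char element, int level)
{
  int found;

  if (root == NULL)
    return -1;
  if (root->element == element)
    return level;

  /* Children are one level deeper than their parent. */
  found = LevelOf(root->left, element, level + 1);
  if (found != -1)
    return found;
  return LevelOf(root->right, element, level + 1);
}
```

    For the tree above, a call such as LevelOf(root, 'd', 0) should give 3, and LevelOf(root, 'k', 0) should give 1.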

    Now, how might we traverse a tree breadth-first? We'll need a mechanism other than the ones we've already used, since preorder, inorder and postorder traversals don't produce breadth-first order.

  4. Why breadth-first:

    You may be thinking: "Why would we ever want to traverse a tree breadth-first?" Well, there are many reasons....

    Tree of Officers

    Suppose you have a tree representing some command structure:
                   Captain Picard
                 /                \
        Commander Riker       Commander Data
          /         \               |
     Lt. Cmdr.   Lt. Cmdr.      Lt. Cmdr.
     Worf        LaForge        Crusher
         |                          |
    Lieutenant                  Lieutenant
    Cameo-Appearance            Selar
    

    This tree is meant to represent who is in charge of lower-ranking officers. For example, Commander Riker is directly responsible for Worf and LaForge. People of the same rank are at the same level in the tree. However, to distinguish between people of the same rank, those with more experience are on the left and those with less on the right (i.e., experience decreases from left to right).

    Suppose a fierce battle with an enemy ensues. If officers start dropping like flies, we need to know who is the next person to take over command. One way to trace the path that command will follow is to list the officers in the tree in breadth-first order. This would give:

    1. Captain Picard
    2. Commander Riker
    3. Commander Data
    4. Lt. Cmdr. Worf
    5. Lt. Cmdr. LaForge
    6. Lt. Cmdr. Crusher
    7. Lieutenant Cameo-Appearance
    8. Lieutenant Selar

    Game Tree

    Another time when breadth-first traversal comes in handy is with game trees. Suppose we have a tree for a game of chess. In other words, levels of the tree alternately represent possible moves by you, and then by your opponent, and then by you...
                   current state of game
            /                |             \
       move                move     ...   move    <-- your moves
       queen's             king's         queen
       bishop              knight
     /     |    \         /    |   \        |
    move  move   ...   move  move  ...     ...    <-- opponent's moves
    king  queen's      king  king's
          rook               knight
            |                  |
            .                  .
            .                  .
            .                  .
    

    You have exactly 1 minute to decide on a move. Now, which would be a better use of that time: exploring the branch where you "move your queen's bishop" to its fullest extent (going deep), or exploring each of your possible next moves first, and then your opponent's responses (breadth-first)?

    In this case, traversing the game tree breadth-first makes more sense than following one move as deep as it goes (depth-first) before exploring another move.

  5. Help for breadth-first traversing:

    Let's return to example trees that are binary and that just hold characters.

    As we've seen, the recursive tree traversals go deeper in the tree first. So, if we are going to implement a breadth-first traversal of a tree, we'll need some help. Perhaps one of the data structures we already know can be of assistance?

    How to use the helper data structure: What we'll do is store each element in the tree in a data structure and process (or visit) them as we remove them from the data structure.

    We can best determine what data structure we need by looking at an example:

       f
     /   \
    a     h
     \
      d
    

    When we are at element f, that is the only time we have access to its 2 immediate children, a and h. So, when we are at f, we'd better put its children in the data structure. Obviously then, f must have been in the data structure before them (i.e., first), since we'd have put f in when we were at f's parent.

    So, if we put the parent in the data structure before its children, what data structure will give us the order we need? In other words, to explore the tree breadth-first, do we want the children to be removed from the data structure first or the parent to be removed first?

    Answer: A queue will give us the order we want! A queue enforces first-in-first-out order, and we want to process the first thing in the data structure, the parent, before its descendants.

  6. Using one data structure to implement another:

    The organization of a program that uses a breadth-first traversal might look like:

    main program        tree.h              tree.c
    ------------        ------              ------
    
    call                TreeBreadthFirst    TreeBreadthFirst
    TreeBreadthFirst    prototype           definition
    

    In other words, the main program needs to call some function that performs a breadth-first traversal, like TreeBreadthFirst(). This means that the interface (tree.h) must provide a prototype for such a function. Also, the implementation (tree.c), which the main program has no direct access to, must define the function TreeBreadthFirst().

    If we use a queue to help us implement the breadth-first traversal, then we must extend the picture...

    ...     tree.c             queue.h         queue.c
            ------             -------         -------
    
    ...     TreeBreadthFirst   queue funcs.    queue funcs.
            definition (uses   prototypes      definitions
            a queue)
    

    The tree implementation will use types and functions from the queue interface. The queue implementation will provide definitions for those functions, but they are hidden from the user of the queue--here, the user of the queue is the tree implementation!

    Finally, since the main program cannot see the implementation of the tree, it won't even know that a queue is involved and won't have any access to that queue.

  7. Connecting the queue to the tree:

    Previously, we've seen that the following organization of types can be used for a tree of characters:

    tree.h                          tree.c
    ------                          ------
                                    #include "tree.h"
                                    #include "queue.h"
    
    typedef char treeElementT;      typedef struct treeNodeTag {
                                      treeElementT element;
                                      struct treeNodeTag *left,
                                                         *right;
                                    } treeNodeT;
    
    typedef struct treeCDT          typedef struct treeCDT {
            *treeADT;                 treeNodeT *root;
                                    } treeCDT;
    

    The one adjustment needed is that the tree implementation will have to include queue.h, since it uses a queue.

    Now, the types for a queue have similar organization:

    queue.h                         queue.c
    -------                         -------
                                    #include "queue.h"
    
    type-of-an-element
    
    abstract-type-of-a-queue        concrete-type-of-a-queue
    
    Of course, we know what the abstract type will be and that the concrete type will be a structure:

    queue.h                         queue.c
    -------                         -------
                                    #include "queue.h"
    
    typedef ??
            queueElementT;
    
    typedef struct queueCDT         typedef struct queueCDT {
            *queueADT;                ??
                                    } queueCDT;
    

    It's actually irrelevant (to those writing the tree) what the internals of the queue are, i.e., what a queueCDT really holds and what other types it may need for the implementation (e.g., queueNodeTs?).

    However, we still must determine what a queueElementT is! This is what will make the queue useful to help traverse a tree.

    So, we are really asking: What type of thing should the queue store?

    We can answer that question by asking: What type is present throughout the tree that we can use to refer both to the top-level tree and each and every subtree?

    an ADT
      |     --------
      +---> | root | a CDT
            |  |   |
            ---+----
               |
               v
             -----
             | j |
             |---|
             | | |
             /---\
            v     v
        -----     -----
        | f |     | k |
        |---|     |---|
        | | |     |0| |
        /---\     ----\
       v     v         v
     ...     ...        ...
    

    Answer: treeNodeT *! It is the only type that allows us to refer to both the top-level and all subtrees.

    Therefore, the queue must be able to store things of type treeNodeT *.

    There is more than one way we can achieve this. We'll choose to have the queue store generic pointers (void *), because that produces the most general queue.

    So, we can complete the queue types as:

    queue.h                         queue.c
    -------                         -------
                                    #include "queue.h"
    
    typedef void *
            queueElementT;
    
    typedef struct queueCDT         typedef struct queueCDT {
            *queueADT;                ??  /* Not important to tree! */
                                    } queueCDT;
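
    Although the internals don't matter to the tree, it may help to see one way the ?? could be filled in. The sketch below assumes a singly linked list with front and rear pointers; queueNodeT and its fields are our own guesses. The function names match the calls used in TreeBreadthFirst(), but their exact signatures are assumed here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* As in queue.h above. */
typedef void *queueElementT;
typedef struct queueCDT *queueADT;

/* Assumed internals: a linked list of elements. */
typedef struct queueNodeTag {
  queueElementT element;
  struct queueNodeTag *next;
} queueNodeT;

typedef struct queueCDT {
  queueNodeT *front, *rear;
} queueCDT;

queueADT QueueCreate(void)
{
  queueADT queue = malloc(sizeof(queueCDT));

  assert(queue != NULL);
  queue->front = queue->rear = NULL;
  return queue;
}

bool QueueIsEmpty(queueADT queue)
{
  return queue->front == NULL;
}

/* Adds element at the rear of the queue. */
void QueueEnter(queueADT queue, queueElementT element)
{
  queueNodeT *node = malloc(sizeof(queueNodeT));

  assert(node != NULL);
  node->element = element;
  node->next = NULL;
  if (queue->rear == NULL)
    queue->front = node;       /* Queue was empty. */
  else
    queue->rear->next = node;
  queue->rear = node;
}

/* Removes and returns the element at the front of the queue. */
queueElementT QueueDelete(queueADT queue)
{
  queueNodeT *node = queue->front;
  queueElementT element;

  assert(node != NULL);  /* Must not delete from an empty queue. */
  element = node->element;
  queue->front = node->next;
  if (queue->front == NULL)
    queue->rear = NULL;
  free(node);
  return element;
}

/* Frees any nodes still in the queue, then the queue itself. */
void QueueDestroy(queueADT queue)
{
  while (!QueueIsEmpty(queue))
    QueueDelete(queue);
  free(queue);
}
```

    Because queueElementT is void *, a treeNodeT * can be passed to QueueEnter() and assigned back from QueueDelete() without a cast in C.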
    

  8. TreeBreadthFirst() function:

    Finally, we can implement the function TreeBreadthFirst(), which traverses a tree using a queue to achieve breadth-first order. Right now, we don't care what it does when it visits an element.

    This function will have to receive a tree via an ADT...and doesn't need to return anything, so its prototype looks like:

    void TreeBreadthFirst(treeADT tree);
    

    Now, the essence of the algorithm is to use a queue, in other words, to process nodes while there are node pointers left in the queue still to be processed.

    So, the core of the function will be looping through the contents of the queue:

    while (!QueueIsEmpty(queue)) {
      ...
    }
    

    However, pointers to all the nodes in the tree won't be in the queue at once. We'll have to place them in the queue when we have access to them (i.e., we only get access to a child when we are at its parent).

    Here's one solution to the problem:

    void TreeBreadthFirst(treeADT tree)
    {
      /* Temporary queue. */
      queueADT queue;
      /* Points to node we are processing. */
      treeNodeT *traverse;
    
      if (tree->root == NULL)
        return;  /* Nothing to traverse. */
    
      /* Create a queue to hold node pointers. */
      queue = QueueCreate();
    
      /*
       * Gotta put something in the queue initially,
       * so that we enter the body of the loop.
       */
      QueueEnter(queue, tree->root);
    
      while (!QueueIsEmpty(queue)) {
        traverse = QueueDelete(queue);
    
        /* Visit the node pointed to by traverse. */
    
        /*
         * If there is a left child, add it
         * for later processing.
         */
        if (traverse->left != NULL)
          QueueEnter(queue, traverse->left);
    
        /*
         * If there is a right child, add it
         * for later processing.
         */
        if (traverse->right != NULL)
          QueueEnter(queue, traverse->right);
      }
    
      /* Clean up the queue. */
      QueueDestroy(queue);
    }
    

    Notice that, because we can solve this problem with iteration (instead of recursion), we do not need a wrapper function that pulls out the top-level node pointer from the CDT.
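
    To check the order this produces, here is a self-contained sketch of the same loop. It is not the function above: the queue ADT is replaced by a small array (MAX_NODES is an assumed bound), the function takes a node pointer directly, and "visiting" just records the element in a string. BreadthFirstOrder() is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16  /* Assumed upper bound on tree size. */

typedef struct treeNodeTag {
  char element;
  struct treeNodeTag *left, *right;
} treeNodeT;

/*
 * Writes the elements of the tree into out in breadth-first
 * order and null-terminates it. out needs room for every
 * element plus the terminator.
 */
void BreadthFirstOrder(treeNodeT *root, char *out)
{
  treeNodeT *queue[MAX_NODES];  /* Stand-in for the queue ADT. */
  size_t front = 0, rear = 0;   /* Delete at front, enter at rear. */
  size_t count = 0;
  treeNodeT *traverse;

  if (root != NULL)
    queue[rear++] = root;

  while (front != rear) {  /* While the queue is not empty. */
    traverse = queue[front++];
    out[count++] = traverse->element;  /* Visit the node. */

    /* Children go to the rear, behind nodes already waiting. */
    if (traverse->left != NULL) {
      assert(rear < MAX_NODES);
      queue[rear++] = traverse->left;
    }
    if (traverse->right != NULL) {
      assert(rear < MAX_NODES);
      queue[rear++] = traverse->right;
    }
  }
  out[count] = '\0';
}
```

    For the example tree rooted at j, this should record j, f, k, a, h, z, d, which is the level-by-level order described earlier.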

  9. Using a stack instead:

    Suppose we replaced the use of a queue with a stack...

    void TreeDifferentTraversal(treeADT tree)
    {
      /* Temporary stack. */
      stackADT stack;
      /* Points to node we are processing. */
      treeNodeT *traverse;
    
      if (tree->root == NULL)
        return;  /* Nothing to traverse. */
    
      /* Create a stack to hold node pointers. */
      stack = StackCreate();
    
      /*
       * Gotta put something in the stack initially,
       * so that we enter the body of the loop.
       */
      StackPush(stack, tree->root);
    
      while (!StackIsEmpty(stack)) {
        traverse = StackPop(stack);
    
        /* Visit the node pointed to by traverse. */
    
        /*
         * If there is a left child, add it
         * for later processing.
         */
        if (traverse->left != NULL)
          StackPush(stack, traverse->left);
    
        /*
         * If there is a right child, add it
         * for later processing.
         */
        if (traverse->right != NULL)
          StackPush(stack, traverse->right);
      }
    
      /* Clean up the stack. */
      StackDestroy(stack);
    }
    

    What kind of traversal does this new function produce?

    Answer: A preorder traversal! However, it gives us right-to-left order (j, k, z, f, h, a, d for our example tree rooted at j), since the right child is pushed last and so comes out of the stack before the left child.
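
    That claim can be checked with a sketch along the same lines. Again, this is not the function above: the stack ADT is replaced by a small array, the elements are recorded in a string, and the rightFirst parameter is our own addition. When it is nonzero, the right child is pushed before the left one, which should turn the result into the usual left-to-right preorder.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16  /* Assumed upper bound on tree size. */

typedef struct treeNodeTag {
  char element;
  struct treeNodeTag *left, *right;
} treeNodeT;

/*
 * Writes the elements of the tree into out in the order a
 * stack-driven traversal visits them, then null-terminates out.
 * If rightFirst is nonzero, the right child is pushed before
 * the left one (the reverse of the function in the text).
 */
void StackOrder(treeNodeT *root, int rightFirst, char *out)
{
  treeNodeT *stack[MAX_NODES];  /* Stand-in for the stack ADT. */
  size_t top = 0;               /* Number of pointers on the stack. */
  size_t count = 0;
  treeNodeT *traverse, *first, *second;

  if (root != NULL)
    stack[top++] = root;

  while (top > 0) {
    traverse = stack[--top];           /* Pop. */
    out[count++] = traverse->element;  /* Visit the node. */

    first = rightFirst ? traverse->right : traverse->left;
    second = rightFirst ? traverse->left : traverse->right;

    /* Whatever is pushed second comes off the stack first. */
    if (first != NULL) {
      assert(top < MAX_NODES);
      stack[top++] = first;
    }
    if (second != NULL) {
      assert(top < MAX_NODES);
      stack[top++] = second;
    }
  }
  out[count] = '\0';
}
```

    With rightFirst set to 0 the example tree should come out as j, k, z, f, h, a, d; with rightFirst set to 1 it should come out as j, f, a, d, h, k, z.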


    Why does a stack (using iteration) give us a preorder traversal? Well, suppose we had a recursive function Traverse(), which performed a preorder traversal. Suppose you used this recursive function on the following tree:
          tree
          ----
           j
         /   \
        f      k
      /   \      \
     a     h      z
      \
       d
    

    Within each call of Traverse() on a particular element (or whatever an element is held in, e.g., a node, an array element, etc.), there would be recursive calls to traverse its children. (Below, indentation indicates another level of recursion.)

    Traverse(j)
      Traverse(f)
        Traverse(a)
          Traverse(d)
        Traverse(h)
    

    Each of these recursive calls is a function call. When you call a function, the parameter(s) of the function actually get pushed onto something called the call stack.

    So, the sequence of recursive calls above has the following effect on the call stack:

                             j
    call                ----------
    Traverse(j)         call stack
    
                             f
                             j
    call                ----------
    Traverse(f)         call stack
    
    
                             a
                             f
                             j
    call                ----------
    Traverse(a)         call stack
    
    
                             d
                             a
                             f
                             j
    call                ----------
    Traverse(d)         call stack
    
    
                             a
                             f
                             j
    return from         ----------
    Traverse(d)         call stack
    
                             f
                             j
    return from         ----------
    Traverse(a)         call stack
    
    
                             h
                             f
                             j
    call                ----------
    Traverse(h)         call stack
    
    ...
    

    When the function is called, its parameter(s) go on the top of the call stack, and when the function returns, its parameter(s) get popped off of the call stack.

    In other words, with recursive traversal functions, there is really a stack helping with the traversal.
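
    One way to see this hidden stack at work is to count how many calls to Traverse() are active at once. The sketch below is only an illustration: the depth and maxDepth parameters are our own additions, and calls made on empty subtrees are not counted, so the numbers match the pictures above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct treeNodeTag {
  char element;
  struct treeNodeTag *left, *right;
} treeNodeT;

/*
 * Preorder traversal that tracks how many calls on non-empty
 * nodes are active at the same time. depth is the number of
 * such calls already active when this one starts; *maxDepth
 * ends up holding the largest number seen.
 */
void Traverse(const treeNodeT *node, int depth, int *maxDepth)
{
  if (node == NULL)
    return;

  depth++;  /* This call is now on the call stack. */
  if (depth > *maxDepth)
    *maxDepth = depth;

  Traverse(node->left, depth, maxDepth);
  Traverse(node->right, depth, maxDepth);
  /* Returning pops this call off the call stack. */
}
```

    Starting with depth 0 and *maxDepth 0 on the example tree, *maxDepth should end up as 4, which corresponds to the moment when j, f, a and d are all on the call stack.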


BU CAS CS - Breadth-First Traversal of a Tree
Copyright © 1993-2000 by Robert I. Pitts <rip at bu dot edu>. All Rights Reserved.