CS 112 -- Homework Four

Due Monday, June 30th@ noon


Programming Assignment (100 pts)

In this assignment we will implement a program that uses graph search to solve a game called the Eight Puzzle. Your program will take an initial board position, and then try to find a sequence of moves from the initial position to a fixed final position. If, after examining some number of positions (e.g., 10,000), you can not find the final position, your program will terminate (rather than running on forever) and print a suitable message. If it succeeds in finding a path through the search space, it will print out that sequence of board positions.

The point of this final program is to apply some data structures used this semester (hash tables and heaps) and to learn about graph search. It will also introduce you Artificial Intelligence programming, in which we try to create programs that solve problems (like games) that are typically associated with intelligent behavior in humans.

The Eight-Puzzle Game

The Eight Puzzle board consists of eight sliding tiles in a 3x3 frame, with one tile slot empty. The puzzle is constructed in such a way that you can slides the tiles up and down into the empty slot, but not remove them. We may represent this puzzle by a 3x3 matrix of numbers from 0 -- 8, e.g.,

     -------------
     | 1 | 4 | 2 |
     -------------
     | 3 | 5 | 8 |
     -------------
     | 6 | 7 | 0 |
     -------------

(The 0 represents the empty slot.) A move consists of rearranging the matrix by exchanging the 0 with a number above, below, to the left, or to the right (some of these possibilities may not exist if the 0 is on an outer edge of the board). For example, from the previous board, we could do either of the following:

     -------------              -------------                       
     | 1 | 4 | 2 |              | 1 | 4 | 2 |                     
     -------------              -------------                      
     | 3 | 5 | 8 |              | 3 | 5 | 0 |                      
     -------------              -------------                     
     | 6 | 0 | 7 |              | 6 | 7 | 8 |                    
     -------------              -------------                   
The goal of the game is to go from a start configuration into a goal configuration by sliding individual tiles one at a time. or simplicity, we shall assume that the final goal position is always this one:

     -------------
     | 0 | 1 | 2 |
     -------------
     | 3 | 4 | 5 |
     -------------
     | 6 | 7 | 8 |
     -------------

You can verify that the initial board position given above can be solved by moving the blank (0) up, left, up, and then left once more.

Your program should read the initial configuration from input files called "input01.txt" through "input04.txt" or "error01.txt" through "error03.txt" and use the goal position just given. The input file will consist of three lines of three integers each. You should do some error checking to verify that the inputs are a permutation of the integers 0 .. 8 (see the sample output given below).

State Space Search

If we think of a board configurations as the "state" of the game, then the space of all possible states can be thought of as a graph with a node for each state, and an edge for each possible move. Since you can always move a tile back to the previous position, note that this graph is undirected. Solving the puzzle is equivalent to searching fora path between two nodes in the graph. For example, here is a fragment of the state space graph showing the solution for the initial configuration given above:

Unfortunately, the complete state space graph has 9! = 362,880 nodes, and so it is not very practical to construct the graph, and then search it. Instead, we will search a virtual state space graph by starting with a given initial board position, and then generate nodes as needed. We will store edges which record the paths back to the initial position, to facilitate printing out the solution. Instead of marking nodes as visited, as we did in lecture, we will store visited nodes in a hash table so that we can quickly find out after generating a new node if it has been visited before.

Sample Session

Here is what I have in mind for the behavior of the program. NOTE: Please do not take every detail of this sample session literally, it is simply an example; your program may vary somewhat, as long as it solves the problem correctly. As usual, my comments are not part of the printout of the program.

csa% java Eight

Welcome to the Eight-Puzzle Solver. 
Here is the initial position read
from the input file:

1 4 2        
3 5 8
6 7 0

Ok, hm... I'm thinking...

Ok, I found a solution with 4 moves:

1 4 2
3 5 8
6 7 0

1 4 2
3 5 0
6 7 8

1 4 2
3 0 5
6 7 8

1 0 2
3 4 5
6 7 8

0 1 2
3 4 5
6 7 8

Bye!

csa% java Eight

Welcome to the Eight-Puzzle Solver. 
Here is the initial position read
from the input file:

0 2 3
3 4 5 
6 7 88     

Error!  Input must consist of permutation of digits 0 -- 8.

csa% java Eight

Welcome to the Eight-Puzzle Solver. 
Here is the initial position read
from the input file:

0 1 2 
2 3 4
5 6 7

Error! Input must consist of permutation of digits 0 -- 8. 

csa% java Eight

Welcome to the Eight-Puzzle Solver. 
Here is the initial position read
from the input file:

0 2 1
3 4 5
6 7 8

Ok, hm... I'm thinking...

Sorry, I could not find a solution after looking at 10000 nodes.

The Algorithm

We will search the graph using "best-first search," in which nodes about to be visited are stored in a priority queue PQueue ordered by a heuristic ordering which we will describe below. You will also have a hash table Visited. Each board position is a class containing a 3x3 array of integers (the actual board), a next pointer (since we will put these nodes into a hash table after they are processed) and a pointer to its predecessor (as in Dikjstra's algorithm); this last will be necessary in order to print out the moves when a solution is found. You might use the following class definition:
    class Node {
        int[] board = new int[3][3];
        int priority;     // The Manhatten Distance, described below
        Node next;        // Pointer used in buckets of hash table
        Node pred;        // Predecessor in state space, used to print out solution at end
    }
The algorithm you use should be something similar to the following pseudo-code:
  
   Initialize Visited and PQueue;
   Node goalPos = node containing final board position (fixed); 
   Node initPos  = node containing initial board position read in from the user;
   int count = 0; 
   Add initPos to PQueue;

   while( PQueue not empty ) 
   {
      Remove a node N from PQueue;

      if ( N is in Visited )
         ;   // do nothing, already saw this node
      else if ( N is the same as goalPos )
      {
         Report success and print pred list from N to initPos in reverse (i.e., initPos first and N last);
         Exit;
      }
      else 
      {
         ++count;
         if(count > 10000)        // Count unique board positions examined
         {
            Tell user that can't find board within limit;
            Exit; 
         }

         Insert N into Visited; 
         Generate a list L of all children of N; 
         forall M in L
            if( M is not in Visited )
            {
               Set the pred pointer of M to point to N; // Record path back to initial position
               Add M to the PQueue; 
            }
            
      }  
   }         

Note that the count variable counts nodes that are examined and expanded, NOT the number of nodes inserted into the heap. It is possible that your heap could overflow before you reach the limit set for nodes examined. I never in fact had a heap overflow in my implementation, but if you want to avoid this, you can put the count increment inside the while loop, or make your heap much larger. Your call.

Hash Table

You may reuse code for the hash table, as long as you credit the source (even if it is yourself). As the size of your table, pick some relatively large prime number, say 991.

For the hash function, you should consider the digits in the board to be a 9-digit integer, and then use an appropriate modulus mapping integers into the range [0..990].

Priority Queue

You should design your program so that it uses the heap data structure. Your queue should be able to hold 10000 items, so that there is no possibility of overflow. You can try larger sizes if necessary. The heuristic ordering used to organize the priority queue will be a conservative measure of the distance from the goal, called the Manhatten Distance heuristic. This is commonly used in games involving moving pieces around a grid. It represents a naive lower bound on how many moves it would take to get to the goal position from the current position and may be calculated as follows. For each digit N in a board position B, let MD(N) be the number of moves it would take to get from the position of N in B to the position of N in the goal position, if N were unencumbered by any other tiles. This can be quite easily calculated if you know the rows and columns involved; for example, if 8 were in the upper middle slot in B (row 0, column 1), then it needs to be in row 2 and column 2 in the goal, and MD(8) in B is (2-1)+(2-0) = 3. The MD for a board B is simply the sum of the MDs for all digits 1 -- 8. Clearly, digits in correct position have a MD of 0 and the MD of the goal node is 0. Thus, the MD of the board

6 8 2
3 4 5
0 7 1
based on the goal position
0 1 2
3 4 5
6 7 8
is found by counting 2 (to shift 6 two moves down) + 2 (to shift 0 two moves up) + 3 ( to move 8 once to the right and twice down) + 3 (to move 1 up twice and once left) = 10. This is a naive estimator of the distance of the current position to the final position, but it is hard to see what other simple estimate would work better (let me know if you think of one!).

The PQueue must be a MinHeap using the Manhattan Distance measure of the board.. You should naturally store this metric in the node itself.

What to Submit

Test your program on suitable inputs to verify that it works properly. We will post some sample outputs in a day or two. Make sure your program reads from the input file "input01.txt". Please submit a README.txt file if there is anything you need us to notice before grading your program.

Be sure to remove all debugging statements before submission. Your program should follow all the stylistic guidelines as specified in the course web page.

You should design your program using good object-oriented design in the following files:

    Eight.java 
    Heap.java         // This should be very similar to that used in HW 4, except with size 10000
    Hashtable.java    // This should be very similar to that used in HW 5, with a different hash function

As usual, do not submit any class files, and call your file by these names EXACTLY.

Grading Criteria

Refer to the grading criteria file on the web page.

You are responsible for thoroughly testing your program to make sure that it works. Use the sample session above as a guide. In grading your program, we will test your program on the initial positions shown on the sample outputs (posted soon).

 


Academic Honesty and Collaboration

Cooperation is recommended in understanding programming concepts and system features. But the actual solution of the assignments including all programming must be your individual work. For example, copying without attribution any part of someone else's program is plagiarism, even if you have modified the code. The University takes acts of cheating and plagiarism very seriously; first time violators are routinely suspended for a semester.