CS 113: Introduction to Computer Science II with Introduction to C

Steve Homer---Spring 1999

Homework 7---due Friday, May 7

Note: The late penalty will be halved for this assignment. However, the last day to turn this assignment in is Wednesday, May 12. This is a hard deadline !

Reading: None.


The files you must submit for this homework are exactly: h7main.c, h7hash.h, h7hash.c, and h7.scr.

In the Standish book on page 482 is a table that compares (theoretically) the performance of three hashing methods: open addressing, double hashing and separate chaining.

You are to write a program that experimentally tests this theory for two of these hashing methods, open addressing with linear probing and open addressing with double hashing. That is, you are to write a program which implements these two methods and which counts the average number of probes they use on hash tables with different load factors.

You will experiment with these hashing methods on a hash table of size 512. The hash table will only hold keys (there are no associated values to store).

Hashing and Probing

You should implement the following hash function to hash into your table:
h(K) = K % 512

This h(K) addresses the open addressing part of the methods and will be used with both probing methods. As for the probing methods, they are the 2 common collision resolution strategies, each of which is determined by a probe decrement p(K). The first, a linear probe, uses p(K) = 1. The second, a double hashing method, uses p(K) = K % 512, if K is odd, and p(K) = (K+1) % 512, if K is even.

In other words, all h(K) does is, given a key to store in the table, it tells what location to first try. If that location is full, a search of the table will be done until a free location is found (or the table is full). To determine how to search, one uses p(K), which, given the same key, tells how much to decrement the location as you search the table (i.e., p(K) does not give a location, but a decrement).

Your main program will print statistics about the number of probes performed for a set of searches. Thus, your hash table must count (and return) the number of probes used for a single key search (since only it has the code that searches the table).

When counting probes:

For this homework, count the number of successful probes (when the key is found in the table) and unsuccessful probes (when the key is not in the table) together. Do not count them separately as in the book.

Main Program

The keys (K's) entered in the table (or searched for) should be numbers between 0 and 100000.

Your program should test the table with two load factors, 50% and 80%. That means run tests where you hash 256 values and also where you hash 408 values into the table.

We have provided 2 sets of files, namely: /cs/course/cs113/current/hw7/keyfile1, /cs/course/cs113/current/hw7/searchfile1, /cs/course/cs113/current/hw7/keyfile2, and /cs/course/cs113/current/hw7/searchfile2. The files ending in 1 are only for the 50% load factor and the files ending in 2 are only for the 80% load factor. The files beginning with "key" contain the keys to add to a hash table. The files beginning with "search" contain the keys to search for.

Note that for each of the 2 load factors, you will:

  1. Start with an empty hash table.
  2. Add the keys in the key file (accumulating how many there are-- don't hardcode the number of keys).
  3. Search for the keys in the search file, keeping track of the total number of probes.

You must repeat the above for both probing methods. Thus, you will have statistics for 2 load factors x 1 hash function x 2 probing decrements = 4 experiments.

The main program should output tables that show: the number of keys searched for, the total number of probes done in the search, and the average number of probes per search for each experiment. Thus, your single main program will contain all 4 experiments and output 2 tables: one for the 50% load factor and one for the 80% load factor. Rows of those tables should be the probing methods, columns should be the 3 statistics we've asked for concerning each experiment.

This main program should be submitted as a file h7main.c.

Hash Module

Your implementations are expected to show what you have learned about data structures. In particular, you should implement a hash table as a data structure. The operations on the table should include creating a table, storing a key, and finding a key. The function that finds a key should take pointers to 2 functions: one that points to the hashing function to use (h(K)) and one that points to the probe decrement function to use (p(K)). Remember, all h(K) does is take a key and return a location. All p(K) does is take a key and return a decrement. It is the storing and finding functions that do the actual searching.

See Chapter 11 of Standish where a variety of different h and p functions are considered. Also see the Pointers to Functions Lab for some useful notes on pointers to functions.

Where to Define Hash and Probe Functions

The main program should contain definitions for the hash function and the 2 probe decrement functions that are to be passed on to the store and find functions.

Make File

We have provided you with a make file for this assignment as /cs/course/cs113/current/hw7/h7makefile.

Test Runs

In the script you submit (h7.scr), display all source code (with cat), show a compilation (using our make file), and a run of your program (in that order).