BU CAS CS 113 - FAQ

A.
Hash is sometimes used as short for hash table. Of course, it also refers to the method of taking a key and determining what location is the first to try in the table.

Q.
Should we use pointers to hash/probe functions with our store key function?

A.
If you need to hash and decrement in the store function, then pass in pointers to those functions there too.

Q.
Is the storing function just like the finding function?

A.
They could be similar, but counting the number of probes is for finding and not for storing.

Q.
How are the h(K) and p(K) functions used in the main program?

A.
The h's take a key and give the location to place it. The p's take a key and give the decrement value. The main program obviously won't put keys in the table itself since that would violate the data structure interface. Instead, it only defines those h and p functions and passes them on to the hash table functions.

Q.
Do we need several hash tables for this program since there are several experiments to run?

A.
Yes, or reuse the same hash several times (you'll have to figure out how to clear the hash after each use).

Q.
Must we use ADTs and CDTs for this homework?

A.
That is not required, although you must choose a good design. You may use ADTs/CDTs if you like.

Q.
Must we include a function to delete keys from the table?

Q.
When looking for a key that was placed in the hash, what if the key looked for is stepped over during the probe decrement part and never found?

A.
That can't happen, since you use the same method to enter the key. I.e., you'll travel along the same path when you "enter" and "search" for the key, so either you find it or it's not there.

I'm trying to find a key. If I hit an empty location, is there not a possibility that the key may be in another location in the table? Why would I want to stop looking?

No, that can't happen. Remember, what you do to place a key in the table (i.e., the path you take) is exactly the same as what you do to find it. So, if you run into an empty location, you KNOW it's not there, because otherwise it would have been put in that location.

This is actually nice because it means we don't have to search the whole table for a non-existent key (unless the table is full).

Q.
Is the number returned from the probe decrement function added or subtracted when moving through the table?

Probably the best way to do it is to be consistent with the book and subtract the decrement value (that's why the linear probing decrement is 1, not -1).

Q.
Is it alright if the pointers to the H and P functions are passed to the hash table create function instead of the store and find functions as specified on the web?

A.
That would be fine. Then, you would store the pointers as part of the data structure.

Q.
Does "creating the table" involve anything more than declaring an array?

A.
Like always, there should either be a hash table function that creates and gives back a table (ADT/CDT design) OR that takes a table and initializes it (this 2nd method was what we used with Stacks--before we did ADTs/CDTs). In any case, there should be a type for a hash table.

Q.
You said we shouldn't use an array in the main program?

A.
What some students are doing is creating an array in the main program (i.e., this array is not the hash table), reading all the keys into that array, then going through that array and storing each of those into the hash table. (Perhaps they are doing this to avoid reading the file more than once.) That's not the best design since it assume that the file will have no more than a certain number of keys. So, even if the hash table is implemented using an array, all those details should be hidden from the main program.

Q.
How should we deal with counting the number of probes done in the find function? The three options I see are 1) having the find function return the number of probes, which is illogical because it should return the found entry itself, 2) passing a count variable into the search function as a parameter, which makes for an un-generic and impractical function OR 3) use global variables, which is too sloppy to even consider?

A.
1)
Well, it doesn't necessary have to return the entry found. Since this is a hash table that only stores keys, we know what the value is if its found (its the key we asked for). It might be reasonable to return the number of probes. You can think of ways to combine that answer with the yes/no one (i.e., found it or not). It may not seem generic, but we are essentially implementing a "hash table that provides probe counts" rather than a plain hash table.

2)
True. Unlike the return value, which can be ignored in C, we'd be forcing people to provide the count parameter (by reference) in this design.

3)
Yes, a global variable would be sloppy, plus we could never guess how many hash tables they will want (and thus, how many global variables we'll need).

There is a related solution, however, that is similar. Remember, we've always said that we can add extra fields to our data structures (e.g., like a "count" for a stack or queue). Could you not add an extra field to the hash table data structure to help keep the statistic you need? That way it wouldn't interfere with the parameters or return value of the find function. The only thing you'd have to be careful to do is provide another hash table function to access that field (you don't want to peek into the data structure after all).

If I use an ADT/CDT design, and I want to store pointers to the hash and decrement functions in the CDT, why won't the following work:

hash = HashCreate();
hash->hashFuncP = Hash1;

A.
Since you call Create in the above code, that implies this code is in the main program. Remember that you cannot access the fields in a CDT via an ADT in the main program. You have to assign the address of "Hash1" to the hashFuncP field inside the implementation (the same as for any field in the CDT). So, create a hash table function to which you can pass the address of "Hash1"--that function should set up the field.

In the Destroy function, is it necessary to free the ADT as well as the CDT?

A.
The ADT is not allocated, so you cannot free it. Besides, from the main program you pass in a copy of that pointer, the ADT, to the destroy function. The CDT, of course, should be freed (using that ADT).

Can we assume that all the keys in the searchfile are in the keyfile?

A.
No. Recall that searches for keys that are not in the table (i.e., unsuccessful searches), still require a certain amount of probes until you determine the key is not present--count those too (as the write-up suggests).

Can we just print the table to the screen using the normal printf command? Second, on the average probes per run, should we print out the average to 2 decimal places or something like that?

Should we prototype our hash and probe functions in the main program or should they be prototyped in the hash header file (with a comment that they should be defined in the main program)?

A.
They should be prototyped where they are used. Used, in this case, means "where they are passed along" to other functions.

I'm not sure how to use the pointers to functions when they are stored as part of the CDT?

Then, you can use those types to create variables (i.e., pointers of those types). If you are storing those pointers in the CDT, these will be fields in the CDT.

Then, you need to get the address of the functions (hash or decrement) into these pointers. You'll probably pass them along to a hash table function (Aha! We have a typedef we can use for the type of those parameters). You'll have to call that hash table function with the address of the proper hash and decrement function (remember, the name of a function behaves like its address). Once those addresses are passed in, copy them into the CDT fields.

Later, when you need to call the hash or decrement functions, you access the pointers in the CDT, and use them (as in lab) to call the functions, passing a key parameter (after all, the hashing and decrement functions both take keys).

Frequently Asked Questions about HW7