CS 112 Summer 2, 2020 -- Homework Six
Due Friday, August 14th @ midnight with 24-hour grace period
The submission time for the entire assignment is the submission time of the last file submitted.
Introduction
You must observe the following requirements (for all homework submitted in this course):
- All programs should be literate, i.e., easily understandable by a human (say, your grader) and follow the Java Style Guidelines for CS112 posted on the class web site;
- You may not use data structure libraries such as ArrayList, since we are learning to write Java from the ground up and you must learn how these libraries are built; however, you are free to use (unless specifically directed otherwise) the basic libraries String, Character, Scanner, and Math;
- You may freely use code from the class web site, the textbook, or lecture (unless specifically directed otherwise) as long as you cite the source in your comments; but you may NEVER use code from the web or other students' programs---this will be considered plagiarism and penalized accordingly.
- There is only ONE problem for this homework, worth 100 points.
Problem: Hash Table Concordance -- 100 pts
For this problem, you will develop a concordance which gives a list of all words occurring
in a text, along with the line number(s) where the word occurs. The input will be single String,
with newline characters \n
at the end of each line. Although a concordance for a book
would give page numbers, in our program we will count line numbers for simplicity.
For example, if you wanted to build a concordance of our Haiku from HW 01, you would
have a string "Arms folded\nTo the moon\nAmong the cows.\n")
and
the concordance produced by your code would be as follows:
among: 2
arms: 0 // note that line numbers start at 0
cows: 2
folded: 0
moon: 1
the: 1,2
to: 1
The interface for your program is here: Concordable.java and the template is here:
Concordance.java. The template is empty except for the Unit Test at the end.
This is the last homework, so you will be expected to think through what is required here and to
use appropriate resources from your previous homeworks, and from lectures and code examples.
The REQUIREMENTS are as follows:
- You must pass all the unit tests;
- Your code should be neat and readable (remove your debugging code unless it is well-designed and useable in the future);
- You must use (a refactoring of) your solution to
Index.java
from HW 05 to put the entries in order when you print out the concordance (I will post my solution after the deadline has passed for HW 05);
- As usual you may not use any Java libraries except as noted at the top of this document.
My SUGGESTIONS are as follows:
- Do as much as possible (i.e., build the whole hash table from the string) in the constructor;
- When inserting a new entry
(word,lineNumber)
into the hash table bucket, insert the word in order, but
do NOT add a duplicate entry if
the word occurs multiple times on the same line, but DO add entries for the same word on separate lines (the first test should make this clear), and add them in order of increasing line numbers;
- Use step-wise refinement: Build the hash table first, test it to make sure it is working, then
create the index, make sure it works, and then put them together, finally making sure
everything is working properly and all tests are satisfied;
- Use the industrial-strength hash function
sfold
for your table; the version given
here will work fine if
you change the type long
to int
;
- Keep the buckets in the hash table ordered by the key/word and then by the line number; since
you are going through the lines of the text in order, you can simply insert entries for
the same word ONLY when you find a larger key/word (so new entries for the same word are
added after earlier entries);
- In order to print out the concordance, since the hash table itself is not in order, you must create a slightly different version of your
Index.java
program from the last homework, which will store the words and the line numbers; one difference is that you want to keep the line numbers in order, so when you insert in the values linked
list, use the "insert in order" method we discussed several times.
Submission:
Files to submit:
- Concordance.java
- Index.java (modified suitably from the last version for hw 05)