C307 / I308 Data Representation

C307 I308 Homework 10

Due date: Monday, April 4, 2022.

Ex. 1. Hash Table

a. Project

Create a project called hw10. Add a class called Node that is part of a package called hashTable. Add the following classes to this package:

WordCounter.java
Porter.java
ListWordC.java
HashTable.java
Main.java

In addition, download the following file and store it in the project folder (in explorer):

stopWords.txt

The program reads a list of words from the user, terminated by "||". It then indexes these words using two hash tables and outputs the result.
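
As an illustration of the input format only, here is a minimal sketch of reading words until the "||" terminator. The class name WordInputSketch and the method readWords are hypothetical and are not part of the provided files; the provided Main class may read its input differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

// Hypothetical helper, not one of the provided homework classes.
class WordInputSketch {
    // Collects whitespace-separated tokens until "||" or the end of the input.
    static List<String> readWords(Scanner in) {
        List<String> words = new ArrayList<>();
        while (in.hasNext()) {
            String token = in.next();
            if (token.equals("||")) {
                break; // terminator reached: stop reading
            }
            words.add(token);
        }
        return words;
    }

    public static void main(String[] args) {
        // A fixed string stands in for the user's keyboard input.
        Scanner in = new Scanner("hash tables store words || ignored");
        System.out.println(readWords(in)); // prints [hash, tables, store, words]
    }
}
```

Note that this sketch assumes the terminator is typed as a separate token, with whitespace around it.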

First, the words that are too common in the English language, listed in the file stopWords.txt, are discarded. One hash table is used for this purpose.
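
The idea of this step can be sketched as follows. This is only an illustration: it uses java.util.HashSet as a stand-in for the homework's own HashTable class, and a small hard-coded set instead of the contents of stopWords.txt. The names StopWordSketch and removeStopWords are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; HashSet stands in for the homework's HashTable.
class StopWordSketch {
    // Keeps only the words that are not found in the stop-word set.
    static List<String> removeStopWords(List<String> words, Set<String> stopWords) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            if (!stopWords.contains(word)) { // expected constant-time lookup
                kept.add(word);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Assumption: a tiny sample set instead of the real stopWords.txt.
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        List<String> words = Arrays.asList("the", "size", "of", "a", "table");
        System.out.println(removeStopWords(words, stopWords)); // prints [size, table]
    }
}
```

A hash table suits this step because each entered word needs only a membership test, which does not require scanning the whole list of stop words.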

Second, each entered word is processed with the Porter Stemmer, a transformation that extracts the root, or stem, of the word. Here is some information about this algorithm:
Porter Stemmer (https://tartarus.org/martin/PorterStemmer/def.txt)

Third, the stems thus obtained are counted and output. A second hash table is used for this purpose.
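
The counting step can be sketched as follows, assuming the stems have already been produced by the stemmer. This is not the homework's implementation: java.util.HashMap stands in for the HashTable class, and the names StemCountSketch and countStems are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; HashMap stands in for the homework's HashTable.
class StemCountSketch {
    // Maps each stem to the number of times it occurs in the list.
    static Map<String, Integer> countStems(List<String> stems) {
        Map<String, Integer> counts = new HashMap<>();
        for (String stem : stems) {
            // Insert the stem with count 1, or add 1 to its existing count.
            counts.merge(stem, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Assumption: these strings are already stems.
        List<String> stems = Arrays.asList("run", "hash", "run", "run");
        Map<String, Integer> counts = countStems(stems);
        System.out.println("run: " + counts.get("run"));   // prints run: 3
        System.out.println("hash: " + counts.get("hash")); // prints hash: 1
    }
}
```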

The program should compile and provide some output, but the indexing result will be empty. This is because some functionality is missing from the HashTable and the Main classes.

b. Hash Table Methods

Complete the implementation of the class HashTable with the 4 functions that don't have a proper implementation yet: hashing, access, remove, and statistic. Inside the body of these functions you'll find a comment saying that the code must be supplied by the student. Read the comments in front of each function carefully for details about their implementation.

The hashing function contains some primitive code that indexes all the words by their first letter, for testing purposes. You must replace that with a better function. You can choose any function you want. I suggest selecting one of the functions recommended as "good" by the notes (chapter 5).
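
As one example of a function that uses every character of the word, here is a minimal sketch of a polynomial hash with multiplier 31. It is a common general-purpose choice, but it may or may not be one of the functions discussed in the notes. The signature (a key and a table size) is an assumption and may differ from the hashing function in the provided HashTable class.

```java
// Hypothetical sketch; the signature is an assumption, not the provided one.
class HashFunctionSketch {
    // Polynomial hash: treats the characters as digits in base 31.
    // Reducing modulo tableSize at each step keeps the value non-negative
    // and avoids int overflow for table sizes up to a few million.
    static int polynomialHash(String key, int tableSize) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = (31 * h + key.charAt(i)) % tableSize;
        }
        return h;
    }

    public static void main(String[] args) {
        System.out.println(polynomialHash("ab", 101)); // prints 75
        System.out.println(polynomialHash("ba", 101)); // prints 4
    }
}
```

Unlike a first-letter index, this function sends "ab" and "ba" to different positions, because the order of the characters affects the result.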

c. The Main Class

Complete the function indexWords according to the comment starting with "For the student to add". Implement the function increment in the same class.

d. Hash Function Comparison (* 3 extra credit points)

Run some experiments with 3 different hashing functions and decide which one is the best. Write a short paragraph reporting the results of your experiments and explaining why you think the function you chose is the best one. Include the code of all 3 functions, with two of them commented out; leave the best one uncommented. Either create a text file with your comments on the experiments, or write them as a comment on the homework submission using the Canvas text box.
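
One possible way to set up such an experiment is to hash the same words with each function and compare how evenly they spread over the table. The sketch below is an assumption about how this could be measured, not a required method: the interface StringHash, the two sample functions, and the measures (longest chain, empty buckets) are all hypothetical and independent of the statistic function in the provided HashTable class.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical experiment harness, not one of the provided homework classes.
class HashComparisonSketch {
    // Assumed shape of a hash function: key and table size to bucket index.
    interface StringHash {
        int hash(String key, int tableSize);
    }

    // Stand-in for a first-letter index: same initial, same bucket.
    static int firstLetterHash(String key, int tableSize) {
        return key.isEmpty() ? 0 : key.charAt(0) % tableSize;
    }

    // Polynomial hash with multiplier 31, reduced at every step.
    static int polynomialHash(String key, int tableSize) {
        int h = 0;
        for (int i = 0; i < key.length(); i++) {
            h = (31 * h + key.charAt(i)) % tableSize;
        }
        return h;
    }

    // Counts how many keys fall into each bucket.
    static int[] bucketSizes(List<String> keys, int tableSize, StringHash f) {
        int[] sizes = new int[tableSize];
        for (String key : keys) {
            sizes[f.hash(key, tableSize)]++;
        }
        return sizes;
    }

    // Size of the fullest bucket; smaller is better.
    static int longestChain(int[] sizes) {
        int max = 0;
        for (int s : sizes) {
            max = Math.max(max, s);
        }
        return max;
    }

    // Number of unused buckets.
    static int emptyBuckets(int[] sizes) {
        int empty = 0;
        for (int s : sizes) {
            if (s == 0) {
                empty++;
            }
        }
        return empty;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("apple", "avocado", "apricot", "banana");
        int[] first = bucketSizes(keys, 7, HashComparisonSketch::firstLetterHash);
        int[] poly = bucketSizes(keys, 7, HashComparisonSketch::polynomialHash);
        System.out.println("first letter: longest chain " + longestChain(first)
                + ", empty buckets " + emptyBuckets(first));
        System.out.println("polynomial:   longest chain " + longestChain(poly)
                + ", empty buckets " + emptyBuckets(poly));
    }
}
```

A real experiment would use a much larger word list than these four sample keys, so that the differences between the functions become visible.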

Homework Submission

Upload the files HashTable.java and Main.java, as well as the text file for part d) if applicable, to Canvas - Assignments - Homework 10.