Comp 271 lab 0x0A - hash maps
Goals
- implement a hash map
- optimization through caching
Overview
Start with the hashmap.zip
project. You are to implement the by-now-usual TMap interface using the hashtable approach:
- get(K key)
- put(K key, D data)
- containsKey(K key)
- clear
- size
With these implemented, you should be able to run the
WordCount.wordCounter() method and (with the definition of WHITESPACE
that I included) get 375 unique words in the Wirth essay. I've also
provided you the file small.text, for debugging; it has 10 unique
words and 20 words total.
When using a hashtable and you want to look up a key, you do the following:
-
calculate int hashcode = key.hashCode()
- calculate the position in the table, hashindex = hashcode % hashTable.length;
- Search the list of Nodes pointed to by hashTable[hashindex];
Due to a problem with the possibility that hashcode could be
negative, I've given you a hindex(K key) method that calculates the
table index. Note that the Node class definition is a simple
inner-class defined in
HashMap.java. It represents a singly linked list; new entries are
always to be inserted at the front of the list: hashTable[hashindex] =
new Node(key, data, hashTable[hashindex]). Lists are searched using the
searchKey(K key) method, also provided.
Additional features
You are also to implement an iterator for the hashmap class; you proceed linearly along each list in sequence. The hasNext() is a little tricky; you have to find
the next entry to tell if it's there. (It's also possible to do this by
keeping careful count of how many entries have been returned, versus
size()).
The trickiest part is what to do when you've reached the end of one of
the linked lists; I've given you findNonNull() that moves to the next
nonempty "bucket", or to the end of the table.
Finally, for extra credit you can implement table resizing.
If the table is too big, space is wasted; if it is too small, the lists
are too long. I've provided a resize() method as a "hook"; it is called
when size() exceeds 50% of hashTable.length. This 50% ratio is
sometimes called the load factor;
the java.util.HashMap implementation resizes when the load factor is
75% by default but you can set this in one variant of the constructor.
Note that when resizing, you simply create a new table, and re-enter
everything in the old table. You can recalculate the hashCode of every
key value, but caching this value in the Node makes for faster resizing.
To submit your project, create a zipfile and email it to me at
pld@cs.luc.edu.