Comp 271 lab 9 - tree maps
Goals
- use trees to implement a Map interface
- optimization through caching
Overview
Start with the treemap.zip
project. You are to:
- Rewrite the DataTree class so that it implements my TMap interface.
- Modify the WordCount class so that it uses the new interface (and in particular no longer uses DataTreeNode at all).
- Attempt to improve the performance of a failed lookup followed by an insertion by caching the value of the appropriate node at the end of the lookup.
- Test whether your program runs noticeably faster with the cache enabled.
Parts 1 and 2 should be straightforward. Note that part of the goal is
to make it possible for DataTreeNode to be private to DataTree; class
WordCount should no longer refer to DataTreeNode. Note also that you
should be able to write containsKey(), get(), and put() in terms of the
existing search() and insert() (which might then be made private!)
The difficulty referred to in #3 is that the typical pattern when
dealing with a previously unseen word will be to lookup the word with
get(), and then insert it with put() when the get() fails (returns
null). This means we have to work down through the tree twice:
once in the failed search, and once in the subsequent insert. One
approach to solving this is to note that when this happens, the failed
get() and the following put() are almost always consecutive. So what
you might do, after a failed search, is to record
the failed key and the appropriate DataTreeNode at which the key would
have been inserted, and then on insert check the key first. You could
improve this with a boolean variable prevGetFailed, as that would be a faster test.
For #4, the question is whether this caching is helping any. It might
not! But in order to get a meaningful comparision, be sure you read all
the words from the file into memory (say into an ArrayList) before starting the timing runs. That is, do something like
while (true) {
String s = wordstream.getWord().toLowerCase();
if (s==null) break; // done
wList.add(s);
}
Do at least five tree-building runs in the same session, using the same saved data.
To submit your project, create a zipfile and email it to me at
pld@cs.luc.edu.