Comp 271 lab 9 - tree maps

Goals

Overview

Start with the treemap.zip project. You are to:
  1. Rewrite the DataTree class so that it implements my TMap interface.
  2. Modify the WordCount class so that it uses the new interface (and in particular no longer uses DataTreeNode at all).
  3. Attempt to improve the performance of a failed lookup followed by an insertion by caching the value of the appropriate node at the end of the lookup.
  4. Test whether your program runs noticeably faster with the cache enabled.
Parts 1 and 2 should be straightforward. Note that part of the goal is to make it possible for DataTreeNode to be private to DataTree; class WordCount should no longer refer to DataTreeNode. Note also that you should be able to write containsKey(), get(), and put() in terms of the existing search() and insert() (which might then be made private!)

The difficulty referred to in #3 is that the typical pattern when dealing with a previously unseen word will be to lookup the word with get(), and then insert it with put() when the get() fails (returns null). This means we have to work down through the tree twice: once in the failed search, and once in the subsequent insert. One approach to solving this is to note that when this happens, the failed get() and the following put() are almost always consecutive. So what you might do, after a failed search, is to record the failed key and the appropriate DataTreeNode at which the key would have been inserted, and then on insert check the key first. You could improve this with a boolean variable prevGetFailed, as that would be a faster test.

For #4, the question is whether this caching is helping any. It might not! But in order to get a meaningful comparision, be sure you read all the words from the file into memory (say into an ArrayList) before starting the timing runs. That is, do something like
        while (true) {
            String s = wordstream.getWord().toLowerCase();
            if (s==null) break; // done
            wList.add(s);
        }
Do at least five tree-building runs in the same session, using the same saved data.

To submit your project, create a zipfile and email it to me at pld@cs.luc.edu.