Comp 271 Week 11

Lab 7: some problems with the case of inserting into an empty list
A simple manual test (can be automated, but note that it can be hard to tell if the problem is with add() or equals())

Trees in BlueJ

Height of a random tree of N nodes: Let a = 4.311....
Then the expected height of a random binary search tree with N nodes is approximately a*ln(N) for large N. A "full" (perfectly balanced) tree has height about log2(N), and ln(N) = log2(N)*ln(2), so the height of a random tree is about a*ln(2) = 2.988... times the height of a perfectly balanced tree.

The factor of three here we can definitely live with. What we cannot live with is the worst-case height of a binary search tree, which is N.
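
To make the contrast concrete, here is a small sketch (not one of the course demos; the class name BstHeightDemo and the seed are made up) that builds an unbalanced binary search tree twice, once from keys in sorted order and once from the same keys shuffled, and reports the height of each. Height is counted in nodes, so the sorted case should come out as N. The a*ln(N) figure is an asymptotic one, so the random height measured for a modest N need not match it closely.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical demo class; not part of the course demos.
class BstHeightDemo {
    // Minimal node type for an unbalanced binary search tree of ints.
    static final class Node {
        final int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Plain unbalanced BST insert, written iteratively so that a
    // degenerate (list-like) tree cannot overflow the call stack here.
    static Node insert(Node root, int key) {
        Node fresh = new Node(key);
        if (root == null) return fresh;
        Node p = root;
        while (true) {
            if (key < p.key) {
                if (p.left == null) { p.left = fresh; return root; }
                p = p.left;
            } else {
                if (p.right == null) { p.right = fresh; return root; }
                p = p.right;
            }
        }
    }

    // Height counted in nodes: empty tree = 0, single node = 1.
    static int height(Node p) {
        if (p == null) return 0;
        return 1 + Math.max(height(p.left), height(p.right));
    }

    // Inserts the keys in the given order and returns the resulting height.
    static int heightAfterInserting(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return height(root);
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) keys.add(i);

        System.out.println("sorted insertion order: height " + heightAfterInserting(keys));

        Collections.shuffle(keys, new Random(271));  // arbitrary fixed seed
        System.out.println("random insertion order: height " + heightAfterInserting(keys));
        System.out.println("a*ln(N) = " + 4.311 * Math.log(n)
                + ", log2(N) = " + Math.log(n) / Math.log(2));
    }
}
```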

Heaps

Adding new elements to a complete heap while retaining completeness.

Recall that Heapsort makes use of the following three observations:
  1. We can make an array of data into a complete heap in time n log n
  2. We can remove the root element (replacing it by promoting the smaller of its two children, and continuing this process down the branch) in time log n
  3. In light of 2, we can remove all n elements in sorted order in time n log n
Last week I omitted the word "complete" from the first claim.
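
As a rough illustration of how the three observations fit together, the sketch below uses java.util.PriorityQueue as a stand-in for the heap. That is a substitution: it is the library's heap, not the one we build in class, and the class name HeapsortSketch is made up. Each add() and each poll() costs O(log n), so both loops take time n log n.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical class name; PriorityQueue stands in for our own heap.
class HeapsortSketch {
    static List<Integer> heapsort(List<Integer> input) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        // Observation 1: n insertions at O(log n) each.
        for (int x : input) heap.add(x);

        // Observations 2 and 3: remove the smallest element n times.
        List<Integer> sorted = new ArrayList<>();
        while (!heap.isEmpty()) sorted.add(heap.poll());
        return sorted;
    }

    public static void main(String[] args) {
        List<Integer> data = new ArrayList<>();
        for (int x : new int[] {5, 3, 8, 1, 9, 2}) data.add(x);
        System.out.println(heapsort(data));
    }
}
```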

The idea of the algorithm for #1 is to store the heap in "array format", that is, with the left and right children of the node at position n stored at positions 2n+1 and 2n+2 respectively (so the parent of position i is at (i-1)/2, rounded down). To add a new element, we put it at the end of the array, which keeps the tree complete, and look at the path from the root to that element. The new element might be smaller than its parent; if it is, we simply swap the two (in the array). This makes the value in the parent's position even smaller, so nothing needs to be done with the other child. We then keep swapping up the path until heap-ness is restored.
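
Here is a minimal sketch of that add step for a min-heap of ints, assuming the array layout just described. The class name ArrayMinHeap and the helper isHeap() are made up for illustration; only add() corresponds to the algorithm in the text.

```java
import java.util.Arrays;

// Hypothetical class name; a sketch of the add step only.
class ArrayMinHeap {
    private int[] data = new int[8];
    private int size = 0;

    // Parent of position i when the children of n live at 2n+1 and 2n+2.
    private static int parent(int i) { return (i - 1) / 2; }

    // Append x at the end (this keeps the tree complete), then swap it
    // upward for as long as it is smaller than its parent.
    void add(int x) {
        if (size == data.length) data = Arrays.copyOf(data, 2 * size);
        data[size] = x;
        int i = size++;
        while (i > 0 && data[i] < data[parent(i)]) {
            int p = parent(i);
            int tmp = data[i];
            data[i] = data[p];
            data[p] = tmp;
            i = p;
        }
    }

    // Smallest element, which sits at the root (position 0).
    int min() {
        if (size == 0) throw new IllegalStateException("empty heap");
        return data[0];
    }

    int size() { return size; }

    // Checks the heap property: no child is smaller than its parent.
    boolean isHeap() {
        for (int i = 1; i < size; i++)
            if (data[i] < data[parent(i)]) return false;
        return true;
    }

    // Copy of the occupied part of the array, for inspection.
    int[] toArray() { return Arrays.copyOf(data, size); }
}
```

Adding 5, 3, 8, 1, 9, 2 in that order leaves the array as 1, 3, 2, 5, 9, 8. The 1 needed two swaps to reach the root, and the 2 needed one.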

Associative Lookup

A classic compilers problem is writing a symbol table that allows lookup of each identifier, with its attributes (type, size, allocated location, etc). More generally, the problem is associative lookup: given some Key value, find the corresponding Data.

Demos: in demo/bintree2, the full word-count program.

Problems: too much exposure of DataTreeNode, which really should be private to DataTree?
    How would we make it private?

Look at the Map interface, and HashMap.

Do the word-count program with HashMap (demo/wordcount/HMWordCount). Compare them.
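
For reference while comparing, here is a rough sketch of word counting with a HashMap. It is not the code in demo/wordcount/HMWordCount; the class name WordCountSketch and the way words are split are assumptions. The first method does a get() followed by a put(), so each word is searched for twice. The second uses Map.merge() (Java 8 and later), which is one library answer to combining the two steps.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical class; not the HMWordCount demo itself.
class WordCountSketch {
    // Splits on anything that is not a letter a-z (an assumption about
    // what counts as a word) and counts with a lookup followed by an insert.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (w.isEmpty()) continue;               // split can yield a leading ""
            Integer old = counts.get(w);             // first search for w
            counts.put(w, old == null ? 1 : old + 1); // second search for w
        }
        return counts;
    }

    // Same result, with lookup and insert combined into one call.
    static Map<String, Integer> countWordsWithMerge(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (w.isEmpty()) continue;
            counts.merge(w, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countWords("The cat and the hat. The end"));
    }
}
```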

Can we optimize the lookup/insert sequence in some more natural way?

Lab 9: try to do this!

Thursday

Word class

Discuss the following:

static vs. nonstatic

There's a problem here: some methods that should be static, logically, cannot be, because they refer to the type parameters. A good example would be the wordcount DataTreeNode.recsize:
    private int recsize(DataTreeNode<K,D> p) {
        if (p==null) return 0;
        return recsize(p.left()) + 1 + recsize(p.right());
    }
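This method has no ties to any class fields, and so should normally be declared static. But we can't simply add the static keyword, because a static method cannot refer to the class's type parameters K and D; it would have to be a generic method that declares type parameters of its own.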
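
One possible way around this, sketched below, is to give the static method its own type parameters. The DataTreeNode and DataTree classes here are stripped-down stand-ins, not the wordcount classes; only left() and right() are assumed from the original. The parameters are named K2 and D2 just to stress that they are not the class's K and D.

```java
// Hypothetical stand-in for the course's DataTreeNode; only left()
// and right() are assumed to match the real class.
class DataTreeNode<K, D> {
    private final K key;
    private final D data;
    private final DataTreeNode<K, D> left, right;

    DataTreeNode(K key, D data, DataTreeNode<K, D> left, DataTreeNode<K, D> right) {
        this.key = key;
        this.data = data;
        this.left = left;
        this.right = right;
    }

    DataTreeNode<K, D> left()  { return left; }
    DataTreeNode<K, D> right() { return right; }
}

// Hypothetical stand-in for DataTree.
class DataTree<K, D> {
    private final DataTreeNode<K, D> root;

    DataTree(DataTreeNode<K, D> root) { this.root = root; }

    // Static, with type parameters declared by the method itself
    // rather than borrowed from the enclosing class.
    private static <K2, D2> int recsize(DataTreeNode<K2, D2> p) {
        if (p == null) return 0;
        return recsize(p.left()) + 1 + recsize(p.right());
    }

    int size() { return recsize(root); }
}
```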

Ordered trees

Are we taking any advantage of the fact that the key type K in our TMap is ordered, and that the tree is too?
Look at java.util.SortedMap. How does it differ from java.util.Map?

MapDemo and order

Demo: the mapdemo example. What is going on with the HashMap order?
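
A small sketch for experimenting with this question (this is not the mapdemo code; the class name MapOrderSketch and the sample keys are made up). It puts the same keys into a HashMap and a TreeMap and prints each map's keys in iteration order. The TreeMap keys come out in sorted order. HashMap makes no promise about its order, so no particular output is shown for it.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class; not the mapdemo example itself.
class MapOrderSketch {
    // Inserts a few arbitrary sample keys, deliberately not in sorted order.
    static void fill(Map<String, Integer> m) {
        String[] words = {"pear", "apple", "fig", "kiwi"};
        for (int i = 0; i < words.length; i++) m.put(words[i], i);
    }

    // Copies the keys into a list in whatever order the map iterates them.
    static List<String> keysInIterationOrder(Map<String, Integer> m) {
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> hashed = new HashMap<>();
        Map<String, Integer> sorted = new TreeMap<>();
        fill(hashed);
        fill(sorted);
        System.out.println("HashMap: " + keysInIterationOrder(hashed));
        System.out.println("TreeMap: " + keysInIterationOrder(sorted));
    }
}
```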

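Note that I cannot find any formal documentation guaranteeing that the collections returned by a SortedMap's keySet() and values() are sorted! The declared return types (Set and Collection) certainly do not say so. One could instead declare the following: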

    interface SortedSet extends Set ....

    interface Map {
          Set keySet();
          ...
    }

    interface SortedMap extends Map {
          SortedSet keySet();
          ...
    }

This should work; see my mapdemo() class. Dunno why Sun didn't take this route....
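
A compilable sketch of that idea follows. It relies on covariant return types (Java 5 and later), which let an overriding method narrow its return type. The interfaces SimpleMap and SimpleSortedMap and the class TreeBackedMap are made-up names chosen to avoid clashing with java.util, and this is not the mapdemo code.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;

// Hypothetical cut-down Map interface.
interface SimpleMap<K, V> {
    V get(K key);
    void put(K key, V value);
    Set<K> keySet();
}

// Hypothetical sorted variant: same method, narrower return type.
interface SimpleSortedMap<K, V> extends SimpleMap<K, V> {
    @Override
    SortedSet<K> keySet();   // covariant return: Set<K> narrowed to SortedSet<K>
}

// Hypothetical implementation that delegates to a TreeMap.
class TreeBackedMap<K extends Comparable<K>, V> implements SimpleSortedMap<K, V> {
    private final TreeMap<K, V> map = new TreeMap<>();

    @Override
    public V get(K key) { return map.get(key); }

    @Override
    public void put(K key, V value) { map.put(key, value); }

    // TreeMap.navigableKeySet() returns a NavigableSet, which is a SortedSet.
    @Override
    public SortedSet<K> keySet() { return map.navigableKeySet(); }
}
```

A caller holding a SimpleSortedMap gets a SortedSet and can use first() and last() without a cast, while a caller holding only a SimpleMap still sees an ordinary Set.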

Demo: using inspect to figure out the structure of a HashMap

Comparators

java.util.Collections:
public static <T> void sort(List<T> list, Comparator<? super T> c)
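
A short usage sketch for this method (the class name ComparatorSketch and the choice of ordering are made up). The comparator is declared on CharSequence rather than String, which is allowed because the parameter type is Comparator<? super T>: a comparator for a supertype of String can sort a List<String>.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical class name; the ordering is just an example.
class ComparatorSketch {
    // Compares by length only. It is written for CharSequence, a
    // supertype of String, to exercise the "? super T" bound.
    static final Comparator<CharSequence> BY_LENGTH = new Comparator<CharSequence>() {
        @Override
        public int compare(CharSequence a, CharSequence b) {
            return Integer.compare(a.length(), b.length());
        }
    };

    // Returns a sorted copy and leaves the caller's list untouched.
    static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, BY_LENGTH);   // T = String, comparator on CharSequence
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortedByLength(Arrays.asList("pear", "fig", "banana", "kiwi")));
    }
}
```

Collections.sort is documented to be stable, so "pear" stays ahead of "kiwi" here: they have the same length and "pear" came first in the input.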