Comp 271 Week 11
Lab 7: some problems with the case of inserting into an empty list
A simple manual test (can be automated, but note that it can be hard to tell if the problem is with add() or equals())
Trees in BlueJ
Height of a random tree of N nodes: Let a = 4.311....
Then the height of a random binary search tree with N nodes is approximately a*ln(N) for large N. A "full" tree has height about log2(N), and ln(N) = log2(N)*ln(2), so the height of a random tree is about a*ln(2) = 2.988... times the height of a perfectly balanced tree.
The factor of three here we can definitely live with. What we cannot
live with is the worst-case height of a binary search tree, which is N.
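To see the difference between the random case and the worst case, here is a rough sketch. HeightDemo and its Node class are made up for this illustration (they are not the course's tree classes), and height is counted in nodes along the longest path. Keys inserted in sorted order give the worst case, height N; the same keys shuffled give something much closer to log2(N). The a*ln(N) formula is asymptotic, so do not expect an exact match for small N.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical minimal BST of ints, used only to measure heights.
class HeightDemo {
    static class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    // Iterative insert: walk down from the root until an empty slot is found.
    static Node insert(Node root, int key) {
        Node fresh = new Node(key);
        if (root == null) return fresh;
        Node p = root;
        while (true) {
            if (key < p.key) {
                if (p.left == null) { p.left = fresh; break; }
                p = p.left;
            } else {
                if (p.right == null) { p.right = fresh; break; }
                p = p.right;
            }
        }
        return root;
    }

    // Height = number of nodes on the longest root-to-leaf path (empty tree = 0).
    static int height(Node p) {
        if (p == null) return 0;
        return 1 + Math.max(height(p.left), height(p.right));
    }

    // Builds a tree by inserting the keys in the order given.
    static Node build(List<Integer> keys) {
        Node root = null;
        for (int k : keys) root = insert(root, k);
        return root;
    }

    public static void main(String[] args) {
        int n = 1000;
        List<Integer> keys = new ArrayList<>();
        for (int i = 0; i < n; i++) keys.add(i);

        int sortedHeight = height(build(keys));      // sorted input: worst case
        Collections.shuffle(keys, new Random(271));  // fixed seed, repeatable run
        int randomHeight = height(build(keys));

        System.out.println("sorted input height: " + sortedHeight);
        System.out.println("random input height: " + randomHeight);
        System.out.println("log2(N):             " + Math.log(n) / Math.log(2));
        System.out.println("a*ln(N), a = 4.311:  " + 4.311 * Math.log(n));
    }
}
```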
Heaps
Adding new elements to a complete heap while retaining completeness.
Recall that Heapsort makes use of the following three observations:
- We can make an array of data into a complete heap in time n log n
- We can remove the root element (replacing it by promoting the smaller of its two children, and continuing this process down the branch) in time log n
- In light of 2, we can remove all n elements in sorted order in time n log n
Last week I omitted the word "complete" from the first claim.
The idea of the algorithm for #1 is to store the heap in "array
format", that is, with the left and right children of the node at
position n at positions 2n+1 and 2n+2 respectively (so the parent of
position i is at (i-1)/2, rounded down). The algorithm then is to add the
new element to the end of the array, and look at the path
from the root to that element. The new element might be smaller than
its parent; if it is, we simply swap the two (in the array). This makes
the value in the parent's position even smaller, so nothing needs to be
done with the other child. However, we keep swapping on the path up until
heap-ness is restored. Each add follows one root-to-leaf path, so it takes
time log n, and n adds take time n log n.
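A minimal sketch of the add step, assuming a min-heap of ints stored in an ArrayList. The class name MinHeapSketch and its methods are invented here and are not from the lab code. The isHeap() method is only a sanity check that every element is at least as large as its parent.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical min-heap in "array format": children of i are at 2i+1 and 2i+2.
class MinHeapSketch {
    private final List<Integer> a = new ArrayList<>();

    // Add at the end (keeps the tree complete), then swap upward while
    // the new element is smaller than its parent.
    void add(int x) {
        a.add(x);
        int i = a.size() - 1;
        while (i > 0) {
            int parent = (i - 1) / 2;
            if (a.get(i) >= a.get(parent)) break;  // heap-ness restored
            int tmp = a.get(i);
            a.set(i, a.get(parent));
            a.set(parent, tmp);
            i = parent;
        }
    }

    // Smallest element; assumes the heap is not empty.
    int min() { return a.get(0); }

    int size() { return a.size(); }

    // True if no element is smaller than its parent.
    boolean isHeap() {
        for (int i = 1; i < a.size(); i++) {
            if (a.get(i) < a.get((i - 1) / 2)) return false;
        }
        return true;
    }
}
```

Because the array never has gaps, the tree is complete by construction; the only work is the swaps along one path.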
Associative Lookup
A classic compilers problem is writing a symbol table that allows
lookup of each identifier, with its attributes (type, size, allocated
location, etc). More generally, the problem is associative lookup: given some Key value, find the corresponding Data.
Demos: in demo/bintree2, the full word-count program.
problems: too much exposure of DataTreeNode, which really should be private to DataTree?
How would we make it private?
Look at the Map interface, and HashMap.
Do the word-count program with HashMap (demo/wordcount/HMWordCount). Compare them.
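For orientation, a word count with HashMap might look roughly like the sketch below. This is not the HMWordCount demo itself; the class name, the lowercasing, and the split on whitespace are all assumptions made for the example. Note the separate get() and put() calls: each word is looked up twice.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical word counter; not the course's HMWordCount.
class HashMapWordCountSketch {
    static Map<String, Integer> count(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : text.toLowerCase().split("\\s+")) {
            if (w.isEmpty()) continue;       // split() can yield a leading ""
            Integer old = counts.get(w);     // lookup: null if the word is new
            counts.put(w, old == null ? 1 : old + 1);  // insert or update
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(count("the cat and the hat"));
    }
}
```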
Can we optimize the lookup/insert sequence in some more natural way?
Lab 9: try to do this!
Thursday
Word class
Discuss the following:
- file I/O using FileReader/BufferedReader
- WHITESPACE options; redoing the count after adding ( ) " to WHITESPACE
- how I handle lines
- on-demand lookahead
- EOF handling
- wirth.text: content, what to do if it is not found, exception handler
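A sketch of the reading loop, EOF handling, and the file-not-found handler, under some assumptions: the class and method names are invented, words are split on whitespace only, and the on-demand lookahead of the Word class is not shown. readLine() returns null at end of file, and the FileReader constructor throws FileNotFoundException if the file cannot be opened.

```java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the course's Word class.
class WordFileSketch {
    // Reads every whitespace-separated word from the given Reader.
    static List<String> readWords(Reader in) {
        List<String> words = new ArrayList<>();
        BufferedReader br = new BufferedReader(in);
        try {
            String line;
            while ((line = br.readLine()) != null) {   // null means EOF
                for (String w : line.trim().split("\\s+")) {
                    if (!w.isEmpty()) words.add(w);     // skip blank lines
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return words;
    }

    // Returns an empty list, with a message, if the file cannot be opened.
    static List<String> readWordsFromFile(String filename) {
        try (FileReader fr = new FileReader(filename)) {
            return readWords(fr);
        } catch (FileNotFoundException e) {
            System.err.println("cannot open " + filename);
            return new ArrayList<>();
        } catch (IOException e) {                       // failure on close()
            System.err.println("error closing " + filename);
            return new ArrayList<>();
        }
    }
}
```

Returning an empty list on a missing file is just one choice; the program could equally well print the message and exit.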
static v nonstatic
There's a problem here: some methods that should
be static, logically, cannot simply be declared static, because they
refer to the class's type parameters. A good example would be the wordcount DataTreeNode.recsize:
    private int recsize(DataTreeNode<K,D> p) {
        if (p==null) return 0;
        return recsize(p.left()) + 1 + recsize(p.right());
    }
This method has no ties to any class fields, and so should normally be declared static. But we can't just add the static keyword, because the method uses the class's type parameters K and D, and those are not in scope in a static method. A static version would have to declare type parameters of its own.
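One way around this is a static generic method, which declares its own type parameters in front of the return type. A sketch, using a made-up stand-in GNode for DataTreeNode (public fields instead of left() and right(), purely to keep the example short):

```java
// Hypothetical stand-in for DataTreeNode<K,D>.
class GNode<K, D> {
    K key;
    D data;
    GNode<K, D> left, right;

    GNode(K key, D data) {
        this.key = key;
        this.data = data;
    }
}

class StaticSizeSketch {
    // The <K, D> before "int" are the method's own type parameters;
    // they are unrelated to the type parameters of any enclosing class.
    static <K, D> int recsize(GNode<K, D> p) {
        if (p == null) return 0;
        return recsize(p.left) + 1 + recsize(p.right);
    }
}
```

The body is unchanged from the nonstatic version; only the declaration differs.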
Ordered trees
Are we taking any advantage of the fact that the key type K in our TMap is ordered, and that the tree is too?
Look at java.util.SortedMap. How it differs from java.util.Map:
- The iterator-like methods entrySet(), keySet(), and values() are in order.
- firstKey, lastKey
- headMap, subMap, tailMap: the map restricted to a key "interval"
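A quick sketch of these methods, using java.util.TreeMap as the SortedMap implementation; the keys and values are arbitrary sample data. headMap(k) is the part strictly below k, tailMap(k) is the part from k up, and subMap(lo, hi) includes lo but excludes hi.

```java
import java.util.SortedMap;
import java.util.TreeMap;

class SortedMapSketch {
    // Arbitrary sample data, inserted out of order on purpose.
    static SortedMap<String, Integer> sample() {
        SortedMap<String, Integer> m = new TreeMap<>();
        m.put("pear", 3);
        m.put("apple", 1);
        m.put("kiwi", 5);
        m.put("fig", 2);
        return m;
    }

    public static void main(String[] args) {
        SortedMap<String, Integer> m = sample();
        System.out.println("keys in order: " + m.keySet());
        System.out.println("firstKey: " + m.firstKey() + ", lastKey: " + m.lastKey());
        System.out.println("headMap(kiwi): " + m.headMap("kiwi"));       // keys < kiwi
        System.out.println("subMap(fig, pear): " + m.subMap("fig", "pear")); // fig <= key < pear
        System.out.println("tailMap(kiwi): " + m.tailMap("kiwi"));       // keys >= kiwi
    }
}
```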
MapDemo and order
Demo: the mapdemo example. What is going on with the HashMap order?
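A small stand-in for the demo (this is not the mapdemo code, and the keys are arbitrary): the same entries go into a HashMap and a TreeMap. The TreeMap lists its keys in sorted order. HashMap makes no promise about order; what you see depends on the hash codes and the table size, so no expected output is given for it here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MapOrderSketch {
    // Puts the same arbitrary sample entries into whatever map is passed in.
    static void fill(Map<String, Integer> m) {
        String[] words = {"pear", "apple", "kiwi", "fig", "banana"};
        for (int i = 0; i < words.length; i++) m.put(words[i], i);
    }

    // Copies the keys out in the map's own iteration order.
    static List<String> keysOf(Map<String, Integer> m) {
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        Map<String, Integer> hash = new HashMap<>();
        Map<String, Integer> tree = new TreeMap<>();
        fill(hash);
        fill(tree);
        System.out.println("HashMap order: " + keysOf(hash));  // unspecified order
        System.out.println("TreeMap order: " + keysOf(tree));  // sorted by key
    }
}
```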
Note that I cannot find any formal documentation guaranteeing that the sets returned by keySet() and values() are sorted! One can do the following:
    interface SortedSet extends Set ....
    interface Map {
        Set keySet();
        ...
    }
    interface SortedMap extends Map {
        SortedSet keySet();
        ...
    }
This should work; see my mapdemo() class. Dunno why Sun didn't take this route....
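Here is a sketch of that idea in compilable form. SimpleMap, SimpleSortedMap, and TreeBackedMap are invented names for the illustration, not java.util types, and only keySet() is modelled. Java (since version 5) lets an overriding method narrow its return type, which is what makes the redeclaration legal.

```java
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeMap;

// Hypothetical cut-down "Map": only the keySet() method.
interface SimpleMap<K> {
    Set<K> keySet();
}

// The sorted version redeclares keySet() with a narrower return type.
interface SimpleSortedMap<K> extends SimpleMap<K> {
    @Override
    SortedSet<K> keySet();   // covariant return: SortedSet is a Set
}

// Hypothetical implementation that borrows a TreeMap to keep keys in order.
class TreeBackedMap<K> implements SimpleSortedMap<K> {
    private final TreeMap<K, Boolean> inner = new TreeMap<>();

    void add(K key) {
        inner.put(key, Boolean.TRUE);
    }

    @Override
    public SortedSet<K> keySet() {
        return inner.navigableKeySet();  // a NavigableSet, which is a SortedSet
    }
}
```

A caller holding a SimpleSortedMap gets a SortedSet with no cast; a caller holding only a SimpleMap still sees a plain Set.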
Demo: using inspect to figure out the structure of a HashMap
Comparators
java.util.Collections:
public static <T> void sort(List<T> list, Comparator<? super T> c)
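A sketch of a call, with made-up names and sample words. The comparator here is declared on CharSequence rather than String; the "? super T" in the signature is what allows a comparator for a supertype to sort a List<String>. Collections.sort is documented to be stable, so words of equal length keep their original relative order.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

class ComparatorSketch {
    // Compares by length only; works for any CharSequence, not just String.
    static final Comparator<CharSequence> BY_LENGTH = new Comparator<CharSequence>() {
        @Override
        public int compare(CharSequence a, CharSequence b) {
            return Integer.compare(a.length(), b.length());
        }
    };

    // Returns a sorted copy, leaving the caller's list untouched.
    static List<String> sortedByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, BY_LENGTH);  // T = String, comparator is for a supertype
        return copy;
    }
}
```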