Comp 271 Week 3
Vector
Bailey in §4.2.2 does a similar example to MyList; he calls it Vector. Some differences:
- Vector uses an additive capacityIncr; that is, we add
a certain amount whenever we need to grow. Note though that if
capacityIncr = 0, then we fall back to doubling each time we need to
grow.
- remove(int index) and remove(EltType element) are defined. The
first removes the element by position, the second by the element's own
identity.
- contains() replaces our indexOf; it is simpler.
- Vector extends AbstractList. What would be involved in getting MyList to do that?
- Note how much of Bailey's library has to be pulled in just to get Vector to work. There is a simpler way: Java packages, together with precompiled Java ARchive files, called jar files, whose names end in ".jar".
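The growth rule in the first bullet can be sketched as follows. This is a minimal sketch, not Bailey's actual code: the class GrowthPolicy and the method newCapacity are hypothetical names chosen for illustration, and capacityIncr is assumed to be non-negative.

```java
// Hypothetical helper illustrating the growth rule described above;
// the names are illustrative and are not taken from Bailey's Vector.
class GrowthPolicy {
    // Returns a capacity >= needed, growing from current either by a fixed
    // increment (capacityIncr > 0) or by doubling (capacityIncr == 0).
    // pre: capacityIncr >= 0
    static int newCapacity(int current, int needed, int capacityIncr) {
        int cap = Math.max(current, 1);          // doubling 0 would never grow
        while (cap < needed) {
            if (capacityIncr == 0) cap *= 2;     // doubling
            else cap += capacityIncr;            // additive growth
        }
        return cap;
    }
}
```

For instance, newCapacity(10, 11, 0) doubles to 20, while newCapacity(10, 11, 5) grows additively to 15. With additive growth, n appends copy O(n^2) elements in total. With doubling, the total copying is O(n).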
SetVector: §3.7
SetVector uses vectors (or MyLists) to implement an abstract Set. Note the more limited set of operations; there is no get() and no set(). (There is still an iterator, but there are no guarantees about the order in which elements are returned.) Note how many other things I had to pull in to get this to work.
add() now works very differently. Could we improve addAll()? On the face of it, to form the union of two sets A and B, each of size N, we need N^2 equality comparisons: each element of A has to be compared with each element of B to determine if it is already there. This cost is said to be O(N^2) when we don't care whether it's N^2, or N^2/2, or 3N^2. Later we'll make this faster with hashing.
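To make the N^2 count concrete, here is a minimal sketch of the straightforward union. It uses java.util lists rather than SetVector, and the class NaiveUnion and its comparisons counter are hypothetical names added purely for illustration. It assumes neither input contains duplicates or nulls.

```java
import java.util.ArrayList;
import java.util.List;

class NaiveUnion {
    static long comparisons = 0;   // counts calls to equals(), for illustration only

    // Linear search: up to set.size() equality comparisons.
    static <E> boolean contains(List<E> set, E x) {
        for (E e : set) {
            comparisons++;
            if (e.equals(x)) return true;
        }
        return false;
    }

    // Returns A union B, assuming a and b each hold no duplicates.
    static <E> List<E> union(List<E> a, List<E> b) {
        List<E> result = new ArrayList<>(a);
        for (E x : b) {
            if (!contains(a, x)) result.add(x);   // only A needs to be searched
        }
        return result;
    }
}
```

When A and B are disjoint and each has N elements, every element of B is compared against all of A, for exactly N^2 comparisons.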
Brief summary: choose a relatively large M, maybe quite a bit larger than N. Define h(obj) = hashcode(obj) % M. Now choose a big array ht (for hash table) of size M, initially all nulls. For each a in A, do something with ht[h(a)] to mark the table. Then, for each b in B, if ht[h(b)] is still null, put b in; it's not a duplicate! If ht[h(b)] is already marked, then we have to check "the long way", but in general we save a great deal.
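The summary above can be sketched roughly as follows. This is one possible reading of "do something to mark the table", using a boolean array in place of ht. The class HashedUnion and its method names are hypothetical, and the sketch assumes no nulls, no duplicates within either input, and the usual Java contract that equal objects have equal hash codes.

```java
import java.util.ArrayList;
import java.util.List;

class HashedUnion {
    // h(obj) = hashCode(obj) % M, adjusted so the result is never negative
    static int h(Object obj, int m) {
        return Math.floorMod(obj.hashCode(), m);
    }

    // Returns A union B. pre: m > 0
    static <E> List<E> union(List<E> a, List<E> b, int m) {
        boolean[] marked = new boolean[m];     // plays the role of ht
        for (E x : a) marked[h(x, m)] = true;

        List<E> result = new ArrayList<>(a);
        for (E y : b) {
            // Unmarked slot: y cannot equal any element of A, so no search is needed.
            // Marked slot: a possible duplicate (or a collision), so check "the long way".
            if (!marked[h(y, m)] || !a.contains(y)) result.add(y);
        }
        return result;
    }
}
```

The larger M is relative to N, the fewer elements of B land on a marked slot by accident, and the fewer long searches are needed.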
Is there an intersect option?
Pre- and Post-conditions
A precondition is something that must
be true before a method is
executed, or else we're not guaranteed sensible results. Virtually all
API documentation is filled with preconditions and postconditions.
Bailey's Assert library has methods for this. When an error occurs, an exception is thrown. Note how this would be a simpler way to handle array-bounds checking!
Example: see the use of pre- and post-conditions in the Vector class of Chapter 3, pp 51-52. They're not set in code! However, some preconditions appear in the actual library, commented out.
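As a rough illustration of precondition checks doing the work of bounds checking, consider the sketch below. The helper pre is a hypothetical stand-in written for this example, not the API of Bailey's Assert class, and BoundsDemo is likewise a made-up class.

```java
class BoundsDemo {
    // Hypothetical stand-in for a precondition check: throws if the condition fails.
    static void pre(boolean condition, String message) {
        if (!condition) throw new IllegalArgumentException("precondition failed: " + message);
    }

    private final Object[] data = new Object[10];
    private int size = 0;

    // pre: the list is not full
    void add(Object x) {
        pre(size < data.length, "list is not full");
        data[size++] = x;
    }

    // pre: 0 <= index < size
    // post: returns the element at position index
    Object get(int index) {
        pre(0 <= index && index < size, "0 <= index < size");
        return data[index];
    }
}
```

Note that get(1) on a one-element list is caught here even though data[1] is a legal array position. Java's own array-bounds check would not notice the problem.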
Matrix
Things to note:
- We're using ArrayList to build a 2-D structure.
- There's no analogue to ArrayList.add(E e); we have to add entire rows or columns or else the Matrix will no longer be neatly rectangular. Note that we add new rows and columns "empty", that is, populated with nulls.
- Not having an add() is a problem for being a Collection. In fact, if you look at the javadoc page for java.util.Collection, add() is listed there as optional. This is why. There is no corresponding issue with Iterator. (BTW, if you declare Matrix to implement Collection, and don't want to include an add() method, your best bet is to include it, but have it throw the UnsupportedOperationException; see the Collection javadoc for more detail.)
- Because the generic class uses ArrayLists, not arrays, we don't
have any problem using the element type E directly throughout. When we
created the Vector and MyList classes, we had that annoying need to use
Object when creating arrays even when we wanted EltType.
- Matrix.print(int fieldwidth) is a handy way of generating output. Note the parameter. Because of the parameter, making this into toString() is tricky.
- Note how Matrix.print() actually works.
- java.util does not in fact include a Matrix type.
- addRow() versus addColumn()
- How do we know all the rows are the same length?
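Several of these points can be seen together in a small sketch. SimpleMatrix is a hypothetical class written for illustration, not the course's Matrix class, and it does not actually implement Collection. It suggests one answer to the last question: if addRow() and addColumn() are the only operations that change the shape, every row always has the same length.

```java
import java.util.ArrayList;

// A minimal sketch; SimpleMatrix is a hypothetical name, not the course's Matrix class.
class SimpleMatrix<E> {
    private final ArrayList<ArrayList<E>> rows = new ArrayList<>();
    private int width = 0;   // invariant: every row has exactly width entries

    int height() { return rows.size(); }
    int width()  { return width; }

    // Appends a row of width nulls.
    void addRow() {
        ArrayList<E> row = new ArrayList<>();
        for (int c = 0; c < width; c++) row.add(null);
        rows.add(row);
    }

    // Appends one null to every existing row.
    void addColumn() {
        for (ArrayList<E> row : rows) row.add(null);
        width++;
    }

    E get(int r, int c) { return rows.get(r).get(c); }
    void set(int r, int c, E value) { rows.get(r).set(c, value); }

    // A single-element add() has no sensible meaning for a rectangular grid.
    boolean add(E e) {
        throw new UnsupportedOperationException("use addRow() or addColumn()");
    }
}
```

Because the rows are ArrayList<E> objects rather than arrays, the element type E is used directly, with no Object arrays or casts.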
Thursday
Matrix questions
Iterator note: iterators seldom involve loops internally. They are used in loops, but they do not generally contain them.
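A minimal sketch of a loop-free iterator follows. ArrayIterator is a hypothetical class for illustration; it walks the first size entries of an array, and all of its state is a single index.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical iterator over the first size entries of an array.
class ArrayIterator<E> implements Iterator<E> {
    private final E[] data;
    private final int size;
    private int next = 0;   // index of the element that next() will return

    ArrayIterator(E[] data, int size) {
        this.data = data;
        this.size = size;
    }

    public boolean hasNext() { return next < size; }

    public E next() {
        if (!hasNext()) throw new NoSuchElementException();
        return data[next++];
    }
}
```

The loop lives in the caller, for example: while (it.hasNext()) System.out.println(it.next());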
A simple example of a precondition is that the function Math.sqrt(double x) requires that x>=0. (Java's Math.sqrt returns NaN for negative x rather than failing outright.) The postcondition is something that is true afterwards, on the assumption that the precondition held (in this case, that the value returned is a "good" floating-point approximation to the square root of x). Note that sometimes precondition X is replaced in Java with the statement that "an exception is thrown if X is false"; this is probably best thought of as amounting to the same thing.
An invariant is a statement that is both pre and post: if it holds at the start, then it still holds at the end. The classic example is a loop invariant.
int sum = 0;
int n=0;
while (n<=100) {    // invariant: sum = 1+2+...+n
    n += 1;
    sum += n;
}
// on exit n = 101, so by the invariant sum = 1+2+...+101
We're not going to obsess about these, but they're good to be familiar
with. Most loop invariants are either not helpful or are hard to write
down; sometimes, however, they can really help clear up what is going
on.
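An invariant like this can also be checked mechanically while the loop runs. The sketch below is a variant of the summing loop with a hypothetical limit parameter. It relies on the closed form 1+2+...+n = n(n+1)/2, and the class and method names are made up for illustration.

```java
class SumInvariant {
    // Sums 1..limit, checking the invariant sum == n(n+1)/2 at the top of each pass.
    // pre: limit >= 0
    static int sumTo(int limit) {
        int sum = 0;
        int n = 0;
        while (n < limit) {
            if (sum != n * (n + 1) / 2) throw new IllegalStateException("invariant broken");
            n += 1;
            sum += n;
        }
        // On exit n == limit, so the invariant gives sum == limit(limit+1)/2.
        return sum;
    }
}
```

The invariant together with the exit condition tells us what the loop computed, without tracing any individual pass.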
Back to the Ratio class
The gcd() method is recursive: it calls itself. How could we create an iterative (looping) version? Here's one possibility.
// pre: a>=0, b>=0
int gcd(int a, int b) {
    while (a>0 && b>0) {
        if (a>=b) a = a % b;
        else b = b % a;
    }
    if (a==0) return b; else return a;
}
Is there an invariant we can use here? Basically, the gcd of a and b never changes. How do we write that?
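One way to write it: if a0 and b0 are the original arguments, then gcd(a, b) = gcd(a0, b0) holds after every pass, because replacing a by a % b (or b by b % a) does not change the set of common divisors. The sketch below checks this at run time. The recursive gcdRec is an assumed stand-in for the Ratio class's recursive method, and the class name is hypothetical.

```java
class GcdInvariant {
    // Recursive reference version (Euclid's algorithm). pre: a >= 0, b >= 0
    static int gcdRec(int a, int b) {
        return (b == 0) ? a : gcdRec(b, a % b);
    }

    // Iterative version with the invariant checked on every pass.
    // pre: a >= 0, b >= 0
    static int gcd(int a, int b) {
        final int g = gcdRec(a, b);   // gcd of the original arguments
        while (a > 0 && b > 0) {
            if (a >= b) a = a % b;
            else b = b % a;
            // invariant: gcd(a, b) is still the gcd of the original arguments
            if (gcdRec(a, b) != g) throw new IllegalStateException("invariant broken");
        }
        return (a == 0) ? b : a;
    }
}
```

When the loop exits, one of a and b is 0, and the gcd of 0 and x is x, so the invariant says the nonzero value is the answer.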
Chapter 5: analysis
Big-O notation: a function f(n) is said to be O(g(n)) for some other function g(n) if, eventually, f(n) <= C * g(n); that is, if there are constants C and n0 such that f(n) <= C * g(n) for all n >= n0. See Figure 5.1 on page 83.
O(n), O(n^2). Some examples:
O(1): appending to the end of an ArrayList (on average; an append that triggers a resize costs O(n))
O(n): inserting in the "middle" of an ArrayList requires O(n) moves. (What is the "middle"?)
Recall that 1+2+...+n is O(n^2).
Searching an ArrayList for a randomly located value requires O(n) comparisons.
Adding an element to a SetVector takes O(n) comparisons.
Taking the union or intersection of two sets is O(n^2).
Building a list up by inserting each element at the front (or inserting each element at random) is O(n^2). (This is the last example on page 87.)
Finding if a number n is prime by checking every k <= sqrt(n) is O(n^(1/2)).
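The last example can be sketched as follows. TrialDivision and its divisions counter are hypothetical names used only to make the operation count visible; the sketch ignores overflow of k*k for n close to the largest long value.

```java
class TrialDivision {
    static long divisions = 0;   // counts trial divisions, for illustration only

    // Returns true if n is prime, testing divisors k with k*k <= n.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        for (long k = 2; k * k <= n; k++) {
            divisions++;
            if (n % k == 0) return false;
        }
        return true;
    }
}
```

For n = 97 the loop tries k = 2 through 9, eight divisions in all, which is about sqrt(97).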
Binary Search
Suppose we're searching for something in a sorted array A with N elements, say, an array of String in alphabetical order. How can we use the fact that things are in order? It's more or less the same way we look things up in the dictionary.
// searching for the i for which A[i] == x; returns -1 if x is not found
// pre: A is sorted in ascending order and N >= 1
int search(int x) {
    int lo = 0, hi = N-1;    // we know that either A[i]==x and lo<=i<=hi, or else x is not found.
    do {
        int mid = (lo+hi)/2;    // lo <= mid <= hi, and mid < hi whenever lo < hi
        if (A[mid] == x) return mid;
        if (A[mid] < x) lo = mid+1;    // upper half; lo <= hi
        else hi = mid-1;               // lower half; could result in hi == lo-1!
    } while (lo < hi);
    if (lo == hi && A[lo] == x) return lo;
    return -1;
}
There are two issues here: one is the invariant for the loop, which is the comment on the line that initializes lo and hi, and the other is that the number of times we can divide N in half before the loop terminates is about log N. (Strictly speaking, that would be log_2(N), but it doesn't much matter: logarithms to different bases differ only by a constant factor.)
We will frequently encounter algorithms with running time O(log N) or O(N log N).
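For the array-of-String case mentioned earlier, a runnable variant might look like the sketch below. It uses a plain while loop in place of the do-while form, compares with compareTo, and counts probes so the log N behavior can be observed. StringSearch and probes are hypothetical names introduced for this example.

```java
class StringSearch {
    static int probes = 0;   // counts array elements examined, for illustration only

    // Returns an index i with a[i].equals(x), or -1 if x is absent.
    // pre: a is sorted in ascending (compareTo) order
    static int search(String[] a, String x) {
        int lo = 0, hi = a.length - 1;
        // invariant: if x is in a at all, it is at some index i with lo <= i <= hi
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;   // same midpoint, but cannot overflow
            probes++;
            int cmp = a[mid].compareTo(x);
            if (cmp == 0) return mid;
            if (cmp < 0) lo = mid + 1;      // x can only be in the upper half
            else hi = mid - 1;              // x can only be in the lower half
        }
        return -1;
    }
}
```

On an 8-element array no search examines more than 4 elements, since the range lo..hi at least halves on every pass.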
Table of Factors
This is the example on page 88. Let us construct a table of all the k<=n and a list of all the factors (prime or not) of k, and ask how much space is needed. This turns out to be O(n log n). The running time to construct the table varies with how clever the algorithm is: it can be O(n^2) [check all i<k for divisibility], O(n^(3/2)) [check all i<=sqrt(k)], or O(n log n) [Sieve of Eratosthenes].
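A sieve-style construction might look like the following sketch. It is not necessarily the algorithm Bailey has in mind: instead of testing each k for divisors, it visits each i and appends it to the lists of its multiples. FactorTable and its method names are hypothetical, and the table here counts 1 and k themselves as factors of k.

```java
import java.util.ArrayList;
import java.util.List;

class FactorTable {
    // Returns a table t where t.get(k) lists every factor of k in increasing
    // order, for 1 <= k <= n. Index 0 is an unused empty list.
    static List<List<Integer>> build(int n) {
        List<List<Integer>> table = new ArrayList<>();
        for (int k = 0; k <= n; k++) table.add(new ArrayList<>());
        for (int i = 1; i <= n; i++) {
            // i is a factor of exactly the multiples i, 2i, 3i, ... up to n
            for (int k = i; k <= n; k += i) table.get(k).add(i);
        }
        return table;
    }

    // Total entries stored: n/1 + n/2 + ... + n/n (each rounded down).
    static int totalEntries(List<List<Integer>> table) {
        int total = 0;
        for (List<Integer> factors : table) total += factors.size();
        return total;
    }
}
```

The inner loop runs about n/i times for each i, so both the time and the space are about n(1 + 1/2 + ... + 1/n), which is O(n log n).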
Space in a string
The answer depends on whether we're concerned with the worst case or
the average case (we are almost never interested in the best case). If
the average case, then the answer typically depends on the probability
distribution of the data.
More complexity
A function is said to be polynomial if it is O(n^k) for some fixed k; quadratic growth is a special case.
So far we've been looking mainly at running time. We can also consider space needs.