Bailey, Chapter 6, p 119
Morin, Chapter 11, p 225 (O(n log n) sorts only)
Suppose you have an array A of size N that is sorted, so i<j => A[i]<A[j]. Then finding an element can be done in time O(log N).
Generally speaking, O(log N) means "growing only very slowly with N". Casually, O(log N) can be seen as "almost constant".
    N             | log2(N)
    100           |  7
    1,000         | 10
    100,000       | 17
    1,000,000     | 20
    1,000,000,000 | 30
It doesn't really matter what base we use; a change of base just introduces a constant of proportionality. It is often convenient for visualization, however, to use base 2.
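The table above can be checked directly; this small sketch (class name is my own) computes each entry with the two-argument overload Math.Log(x, base), rounding up to the next integer:

```csharp
using System;

class LogDemo {
    static void Main() {
        // log2 grows very slowly: each value here rounds up to the table entry
        foreach (long n in new long[] {100, 1000, 100000, 1000000, 1000000000}) {
            Console.WriteLine("{0,13} | {1,2}", n, (int)Math.Ceiling(Math.Log(n, 2)));
        }
    }
}
```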
To search for value X in log(N) time, we keep dividing the array in half:
    lo = 0; hi = N-1;
    while (lo < hi) {
        mid = (lo+hi)/2;
        if (X < A[mid]) hi = mid-1;        // search A[lo]..A[mid-1]
        else if (X > A[mid]) lo = mid+1;   // search A[mid+1]..A[hi]
        else lo = hi = mid;                // found
    }
Suppose this is the array, and we are searching for X=11.
    index: 0 | 1 | 2 | 3 |  4 |  5 |  6 |  7
    value: 2 | 3 | 5 | 8 | 13 | 21 | 34 | 55
Initially we have lo=0 and hi=7, so mid=3. X>A[3], so we set lo=4.
Now lo=4 and hi=7, so mid=5. We have X<A[5], so hi=4, at which point the loop stops with lo==hi==4. Since A[4]=13 is not 11, X is not in the array.
Now let us search for X=25:
lo=0, hi=7: mid=3, X>A[mid] so lo=mid+1
lo=4, hi=7: mid=5, X>A[mid] so lo=mid+1
lo=6, hi=7: mid=6, X<A[mid] so hi=mid-1
lo=6, hi=5
Note in this case the loop terminates with lo>hi.
It is important to understand why the number of times through the loop here is log2(N): each pass cuts the length of the range lo..hi at least in half, so after about log2(N) passes the range is down to a single element.
Sometimes it is helpful to introduce a loop invariant here: the statement that either X is in the range A[lo]..A[hi], or else X is not present in A at all. With this in mind, we can get by with a single comparison per iteration, dropping the test X<A[mid] and the explicit equality case; we simply compare A[lo] with X once the loop ends.
This is one of relatively few elementary examples of a loop that is hard to write correctly without an invariant.
Another thing to keep in mind here is that lo ≤ mid < hi whenever lo < hi. In particular, mid == lo occurs whenever hi == lo+1. So in the following loop we arrange for the search alternatives to be lo..mid and mid+1..hi, both strictly smaller than lo..hi. If we instead arranged the alternatives to be lo..mid-1 and mid..hi, the loop could fail to terminate: once hi == lo+1 we have mid == lo, so searching mid..hi is the same as searching lo..hi, and we would keep searching lo..hi forever.
    lo = 0; hi = N-1;
    while (lo < hi) {
        mid = (lo+hi)/2;
        // ranges to be searched are lo..mid and mid+1..hi,
        // both of which are SMALLER than lo..hi
        if (X > A[mid]) lo = mid+1;   // search A[mid+1]..A[hi]
        else hi = mid;                // search A[lo]..A[mid]
    }
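The invariant version above can be packaged as a complete method. This is a minimal sketch (the class name and the -1 "not found" convention are my own); it uses the same example array as before:

```csharp
using System;

class BSearchDemo {
    // Invariant version: the search ranges are lo..mid and mid+1..hi,
    // so the range shrinks on every pass and the loop must terminate.
    public static int Search(int[] A, int X) {
        if (A.Length == 0) return -1;
        int lo = 0, hi = A.Length - 1;
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (X > A[mid]) lo = mid + 1;   // search A[mid+1..hi]
            else hi = mid;                  // search A[lo..mid]
        }
        return (A[lo] == X) ? lo : -1;      // final membership check
    }

    static void Main() {
        int[] A = {2, 3, 5, 8, 13, 21, 34, 55};
        Console.WriteLine(Search(A, 13));   // 4
        Console.WriteLine(Search(A, 25));   // -1
    }
}
```

Note that the equality test happens only once, after the loop, rather than on every pass.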
    public void ssort() {
        for (int i = 0; i < currsize-1; i++) {
            // find smallest of elements[i]..elements[currsize-1] and swap to position i
            int index_min = i;
            string curr_min_val = elements[i];
            for (int j = i+1; j < currsize; j++) {
                if (string.Compare(elements[j], curr_min_val) < 0) {
                    curr_min_val = elements[j];
                    index_min = j;
                }
            }
            swap(i, index_min);
        }
    }

For the TList<T> version, we need two things. First, the TList<T> class must require that T implement the IComparable interface:
    public void ssort() {
        // requires: class TList<T> where T : IComparable<T>
        for (int i = 0; i < currsize-1; i++) {
            // find smallest of elements[i]..elements[currsize-1] and swap to position i
            int index_min = i;
            T curr_min_val = elements[i];
            for (int j = i+1; j < currsize; j++) {
                if (elements[j].CompareTo(curr_min_val) < 0) {
                    curr_min_val = elements[j];
                    index_min = j;
                }
            }
            swap(i, index_min);
        }
    }
    public static void Main(string[] args) {
        if (args.Length > 0) {
            LISTSIZE = Convert.ToInt32(args[0]);
            Console.WriteLine("List size is {0}", LISTSIZE);
        }
        nums = new IntList(LISTSIZE);
        nums.RandomFill();
        nums.ssort();
    }

In the ssort() method we use Stopwatch to record the time:
    // selection sort
    public void ssort() {
        Stopwatch s = new Stopwatch();
        s.Start();
        for (int i = 0; i < currsize-1; i++) {
            // find smallest of elements[i]..elements[currsize-1] and swap to position i
            int index_min = i;
            int curr_min_val = elements[i];
            for (int j = i+1; j < currsize; j++) {
                if (elements[j] < curr_min_val) {
                    curr_min_val = elements[j];
                    index_min = j;
                }
            }
            swap(i, index_min);
        }
        s.Stop();
        Console.WriteLine("sorting took {0} milliseconds", s.ElapsedMilliseconds);
    }

Does the time appear to be quadratic?
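Timing is noisy, so another way to see the quadratic behavior is to count comparisons: selection sort on N elements makes exactly N(N-1)/2 element comparisons, regardless of input order. A small self-contained sketch (the class name and counting driver are my own):

```csharp
using System;

class SsortCount {
    // selection sort that counts element comparisons
    public static long SsortComparisons(int[] elements) {
        long count = 0;
        for (int i = 0; i < elements.Length - 1; i++) {
            int index_min = i;
            for (int j = i + 1; j < elements.Length; j++) {
                count++;                       // one comparison per inner pass
                if (elements[j] < elements[index_min]) index_min = j;
            }
            int t = elements[i]; elements[i] = elements[index_min]; elements[index_min] = t;
        }
        return count;
    }

    static void Main() {
        var rand = new Random(1);
        foreach (int n in new int[] {1000, 2000, 4000}) {
            int[] a = new int[n];
            for (int i = 0; i < n; i++) a[i] = rand.Next();
            // doubling N roughly quadruples the count: N(N-1)/2
            Console.WriteLine("{0,5}: {1}", n, SsortComparisons(a));
        }
    }
}
```

Doubling N quadruples the count, which is the signature of quadratic growth.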
Here is the "simplest" partition strategy. It has a flaw. We take a number called the pivotvalue, and divide A[left]...A[right] into two sections, A[left]..A[mid] and A[mid+1]..A[right], so that the first section contains values less than the pivotvalue and the second section contains values greater than or equal to the pivotvalue. Here is the code:
    private static int simple_partition(int[] A, int pivotvalue, int left, int right) {
        while (true) {
            while (left < right && pivotvalue <= A[right]) right--;
            // now left == right or A[right] < pivotvalue
            while (left < right && A[left] < pivotvalue) left++;
            // now left == right or pivotvalue <= A[left]
            if (left < right) swap(A, left, right);
            else if (A[right] < pivotvalue) return right+1;
            else return right;   // left == right
        }
    }
This and other code can be found in qsort.cs.
One pass through the loop decrements right until it finds a value < pivotvalue, and increments left until it finds a value >= pivotvalue; the two values are then swapped. When left and right finally meet, say at position mid, then if A[mid] < pivotvalue the first section extends through mid and the function returns mid+1; otherwise it returns mid. There is some slight trickiness when left and right meet: if they meet at the end of the first inner while loop, we might have pivotvalue <= A[right] or might not, and this is reflected in the test at the end.
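Here is a small self-contained sketch of simple_partition in action (the driver and class name are my own). Note the last call: on an all-equal segment the function returns left, which is exactly the case that breaks quicksort below:

```csharp
using System;

class SimplePartitionDemo {
    static void swap(int[] A, int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

    public static int simple_partition(int[] A, int pivotvalue, int left, int right) {
        while (true) {
            while (left < right && pivotvalue <= A[right]) right--;
            while (left < right && A[left] < pivotvalue) left++;
            if (left < right) swap(A, left, right);
            else if (A[right] < pivotvalue) return right + 1;
            else return right;
        }
    }

    static void Main() {
        int[] A = {3, 11, 8, 18, 7, 14, 5, 13};
        int mid = simple_partition(A, 11, 0, 7);
        // A[0..mid-1] < 11 <= A[mid..7]
        Console.WriteLine("boundary = {0}: [{1}]", mid, string.Join(",", A));  // boundary = 4

        int[] B = {5, 5, 5, 5};
        Console.WriteLine(simple_partition(B, 5, 0, 3));  // 0 == left: no progress!
    }
}
```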
Quicksort now looks like this:
    private static void quickSortRecursive1(int[] A, int left, int right)
    // pre: left <= right
    // post: A[left..right] in ascending order
    {
        if (left >= right) return;
        int pval = (A[left]+A[right])/2;
        int pivotindex = simple_partition(A, pval, left, right);  /* 1 - place pivot */
        quickSortRecursive1(A, left, pivotindex-1);               /* 2 - sort small  */
        quickSortRecursive1(A, pivotindex, right);                /* 3 - sort large  */
        /* done! */
    }
If simple_partition() returns left, then the second recursive call is quickSortRecursive1(A, left, right); that is, we have infinite-depth recursion. This will happen if, for example, pivotvalue equals the minimum value in the array segment. If all the values from A[left] to A[right] are equal, this happens for any reasonable choice of pivotvalue.
Note that if simple_partition() returns right instead, the two calls are qSR(A,left,right-1) and qSR(A,right,right); both recursive subcalls are strictly shorter than the original range.
The usual way of fixing this is to pick a specific value known to be in the array as the pivotvalue, and also to make sure that, at the end, if the return value is mid, then A[mid] = pivotvalue. This means the recursive calls are qSR(A,left,mid-1) and qSR(A,mid+1,right); these are "safe" even if mid==left or mid==right.
The most common choice of pivotvalue is A[left]. However, if we want to choose some other A[i] as the pivotvalue, for left < i <= right, we can simply swap A[i] with A[left] first; the commented-out lines at the start of bailey_partition below do this with a randomly chosen index.
    private static int bailey_partition(int[] A, int left, int right) {
        // pre: left <= right
        // post: A[left] placed in the correct (returned) location
        // Random r = new Random();            // needs "using System;"
        // int index = r.Next(left, right+1);
        // swap(A, index, left);
        int pivotvalue = A[left];
        while (true) {
            // move right "pointer" toward left
            while (left < right && pivotvalue < A[right]) right--;
            if (left < right) swap(A, left++, right);
            else return left;                  // left == right == pivot location
            // now pivotvalue == A[right]
            // move left pointer toward right
            while (left < right && A[left] < pivotvalue) left++;
            if (left < right) swap(A, left, right--);   // after, A[left] == pivotvalue again
            else return right;                 // left == right == pivot location
        }
    }
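A quick sanity check of bailey_partition (the driver and example array are my own): partitioning {5, 3, 8, 1, 9, 2} around A[0] = 5 should leave 5 at its final sorted position, smaller values to its left and larger ones to its right:

```csharp
using System;

class BaileyDemo {
    static void swap(int[] A, int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

    public static int bailey_partition(int[] A, int left, int right) {
        int pivotvalue = A[left];
        while (true) {
            while (left < right && pivotvalue < A[right]) right--;
            if (left < right) swap(A, left++, right);
            else return left;
            while (left < right && A[left] < pivotvalue) left++;
            if (left < right) swap(A, left, right--);
            else return right;
        }
    }

    static void Main() {
        int[] A = {5, 3, 8, 1, 9, 2};
        int p = bailey_partition(A, 0, A.Length - 1);
        // 5 is the 3rd-smallest, so it lands at index 3
        Console.WriteLine("pivot index = {0}, A[p] = {1}", p, A[p]);  // pivot index = 3, A[p] = 5
    }
}
```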
Horvick, part 1 p 111 (final page of part 1) has an even simpler strategy. The pivotIndex value is chosen randomly between left and right, inclusive, by the caller. (Horvick's code is for a generic type T; I've replaced that with int. To compare generic values one uses .CompareTo(); I've replaced that with <.)
    private int horvick_partition(int[] items, int left, int right, int pivotIndex) {
        int pivotValue = items[pivotIndex];
        Swap(items, pivotIndex, right);        // park the pivot at the far right
        int storeIndex = left;
        for (int i = left; i < right; i++) {
            if (items[i] < pivotValue) {
                Swap(items, i, storeIndex);
                storeIndex += 1;
            }
        }
        Swap(items, storeIndex, right);        // put the pivot in its final place
        return storeIndex;
    }
This partitions the array in a single upwards pass; however, there is a lot more swapping.
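To see it work, here is a small driver of my own, using the example array from the tables below with pivotIndex = 4, so pivotValue = 7:

```csharp
using System;

class HorvickDemo {
    static void Swap(int[] A, int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

    public static int horvick_partition(int[] items, int left, int right, int pivotIndex) {
        int pivotValue = items[pivotIndex];
        Swap(items, pivotIndex, right);        // park the pivot at the far right
        int storeIndex = left;
        for (int i = left; i < right; i++) {
            if (items[i] < pivotValue) {
                Swap(items, i, storeIndex);
                storeIndex += 1;
            }
        }
        Swap(items, storeIndex, right);        // put the pivot in its final place
        return storeIndex;
    }

    static void Main() {
        int[] A = {3, 11, 8, 18, 7, 14, 5, 13};
        int p = horvick_partition(A, 0, 7, 4);     // pivotValue = A[4] = 7
        // only 3 and 5 are smaller than 7, so 7 lands at index 2
        Console.WriteLine("{0}: [{1}]", p, string.Join(",", A));  // 2: [3,5,7,...]
    }
}
```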
Example arrays for tracing the partition step by step:

    index: 0 |  1 | 2 |  3 | 4 |  5 | 6 |  7
           3 | 11 | 8 | 18 | 7 | 14 | 5 | 13

and, with duplicates,

    index:  0 | 1 |  2 |  3 | 4 |  5 |  6 |  7
           11 | 3 | 11 | 18 | 7 | 14 | 11 | 13
    private static IntPair morin_partition(int[] A, int left, int right) {
        int pivot = A[left];   // Morin actually chose A[left + rand.Next(right-left+1)]
        int lo = left-1, j = left, hi = right+1;
        // invariant: A[left..lo] < pivot, A[lo+1..j-1] == pivot, A[hi..right] > pivot
        while (j < hi) {
            if (A[j] < pivot) {          // move to beginning of array
                lo += 1; swap(A, j, lo);
                j += 1;
            } else if (A[j] > pivot) {   // move to end of array
                hi -= 1; swap(A, j, hi);
            } else {
                j++;                     // keep in the middle
            }
        }
        return new IntPair(lo, hi);
    }

Try this on an array with some duplicates. How about {11,3,11,18,7,14,11,13}?
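Trying morin_partition on that duplicate-heavy array gives the sketch below (the minimal IntPair struct and driver are my own; the notes' IntPair may differ). All three copies of the pivot 11 end up together in the middle section A[lo+1..hi-1]:

```csharp
using System;

struct IntPair { public int lo, hi; public IntPair(int a, int b) { lo = a; hi = b; } }

class MorinDemo {
    static void swap(int[] A, int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

    public static IntPair morin_partition(int[] A, int left, int right) {
        int pivot = A[left];
        int lo = left - 1, j = left, hi = right + 1;
        // invariant: A[left..lo] < pivot, A[lo+1..j-1] == pivot, A[hi..right] > pivot
        while (j < hi) {
            if (A[j] < pivot)      { lo += 1; swap(A, j, lo); j += 1; }
            else if (A[j] > pivot) { hi -= 1; swap(A, j, hi); }
            else                   { j++; }
        }
        return new IntPair(lo, hi);
    }

    static void Main() {
        int[] A = {11, 3, 11, 18, 7, 14, 11, 13};
        IntPair p = morin_partition(A, 0, 7);
        Console.WriteLine("lo={0} hi={1}: [{2}]", p.lo, p.hi, string.Join(",", A));
        // lo=1 hi=5: [3,7,11,11,11,14,13,18]
    }
}
```

Neither of the equal-value recursive calls can loop forever, because the middle section A[lo+1..hi-1] is excluded from both.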
Partitioning can also be used to find the Kth-smallest element of an array without sorting it completely:

    private static int findKth(int[] A, int left, int right, int K) {
        int pivot = bailey_partition(A, left, right);
        // now A[pivot] is the (pivot-left)th smallest of A[left..right]
        if (K == pivot-left) return A[pivot];
        else if (K < pivot-left) {
            // Kth-smallest must be among A[left]..A[pivot-1]
            return findKth(A, left, pivot-1, K);
        } else {
            return findKth(A, pivot+1, right, K-(pivot-left)-1);
        }
    }

See median.cs.
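Combining findKth() with bailey_partition() gives a complete, self-contained sketch (the class name and driver are my own). The 3rd-smallest element of the example array, counting from 0, should be 8:

```csharp
using System;

class MedianDemo {
    static void swap(int[] A, int i, int j) { int t = A[i]; A[i] = A[j]; A[j] = t; }

    static int bailey_partition(int[] A, int left, int right) {
        int pivotvalue = A[left];
        while (true) {
            while (left < right && pivotvalue < A[right]) right--;
            if (left < right) swap(A, left++, right); else return left;
            while (left < right && A[left] < pivotvalue) left++;
            if (left < right) swap(A, left, right--); else return right;
        }
    }

    // Kth-smallest (0-based) of A[left..right]
    public static int findKth(int[] A, int left, int right, int K) {
        int pivot = bailey_partition(A, left, right);
        if (K == pivot - left) return A[pivot];
        else if (K < pivot - left) return findKth(A, left, pivot - 1, K);
        else return findKth(A, pivot + 1, right, K - (pivot - left) - 1);
    }

    static void Main() {
        int[] A = {3, 11, 8, 18, 7, 14, 5, 13};
        Console.WriteLine(findKth(A, 0, 7, 3));   // 8 (sorted: 3,5,7,8,11,13,14,18)
    }
}
```

Unlike a full sort, each recursive call descends into only one side of the partition, which is why the expected running time is O(N) rather than O(N log N).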