Recursion

starts at Bailey page 94


We saw a recursive version of the gcd(m,n) function earlier.

        private static int rgcd(int a, int b) {
                if (a<0) return rgcd(-a, b);
                if (a==0) {
                    if (b==0) return 1;
                    return b;
                }
                if (b<a) return rgcd(b,a);
                return rgcd(b%a, a);
        }

How about the following recursive version of factorial(n)?

    int factorial(int n) {
       if (n==0) return 1;
       return n*factorial(n-1);
    }

See factorial.cs

(Bailey uses a recursive function sum3, p 95, to add the numbers 1..n instead of multiplying them)

Note this mirrors the following definition of n!:
    0! = 1
    n! = n*(n-1)! for n>0

Demo: factorial(10),
    factorial(11) = 39916800
    factorial(12) = 479001600
    factorial(13) = 1932053504
    factorial(14) = 1278945280
    factorial(15) = 2004310016
    factorial(16) = 2004189184
    factorial(17) = -288522240

For that matter, try factorial(34).

What is wrong? Which of these are correct?

We can do a little better by replacing int with long (including as the type of retval). Then we see factorial(17) = 355687428096000, and factorial(13) = 6227020800.
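
Here is the long version (a sketch of the obvious change; the actual code in factorial.cs may differ slightly):

    // long version: correct up through factorial(20); factorial(21) overflows even a long
    public static long factorial(int n) {
        if (n==0) return 1;
        return n*factorial(n-1);      // int * long is computed as a long
    }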

Then there is factorial(-1). What goes wrong here? factorial2() is an attempted fix: it may be important to make sure library methods never get stuck in infinite recursion, but there is only so much that can be done to deal with library users who do not follow the preconditions.
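
factorial2 in factorial.cs presumably looks something like the following; this is a guess at its shape, not the actual code:

    // hypothetical factorial2: reject negative n up front, so a bad call fails
    // immediately rather than recursing until the stack overflows
    public static long factorial2(int n) {
        if (n < 0) throw new ArgumentException("factorial2: negative argument");
        if (n == 0) return 1;
        return n*factorial2(n-1);
    }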

Stack-trace demo from Console.WriteLine(factorial3(4));

    public static long factorial3(int N) {
        Console.WriteLine("factorial3 called with N={0}", N);
        long retval;
        if (N==0) retval = 1;
        else {
/*A*/       long fact = factorial3(N-1);
            retval = N * fact;
        }
        Console.WriteLine("factorial3({0}) returns {1}", N, retval);
        return retval;
    }

Calling factorial3(4) basically amounts to the following:
push(4)
push(3)
push(2)
push(1)
prod = 1;
prod *= pop()
prod *= pop()
prod *= pop()
prod *= pop()
return prod

Stack frames for the above:

    N = 4
    retval =
    return to: WriteLine
    --------------------
    N = 3
    retval =
    return to: /*A*/
    --------------------
    N = 2
    retval =
    return to: /*A*/
    --------------------
    N = 1
    retval =
    return to: /*A*/
    --------------------
    N = 0
    retval = 1
    (atomic case)

This is indeed a stack of "frames": at any one moment, only the topmost frame is accessible (where the top is at the bottom here). On return, we pop() the topmost frame, exposing the one below.

The pattern above works for any recursive function of the form

    f(n) = A if n=0; = F(n,f(n-1)) otherwise

In this case, f(4) becomes

push(4)
push(3)
push(2)
push(1)
res = A;
res = F(pop(), res)
res = F(pop(), res)
res = F(pop(), res)
res = F(pop(), res)
return res
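
In C# we can carry out this conversion literally with an explicit Stack<int>. The following is a sketch (not part of factorial.cs, and the name stack_factorial is made up), specialized to factorial, where A = 1 and F(n, res) = n*res; it needs using System.Collections.Generic:

    public static long stack_factorial(int n) {
        Stack<int> stack = new Stack<int>();
        while (n > 0) {                  // the push phase
            stack.Push(n);
            n = n - 1;
        }
        long res = 1;                    // res = A, the atomic case
        while (stack.Count > 0)          // the pop phase
            res = stack.Pop() * res;     // res = F(pop(), res)
        return res;
    }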

factorial tail-recursion

Tail-recursion means that, in the recursive case, it is the value of the recursive call that is returned directly, with no further work done to it. Such recursion can be converted mechanically to an iterative form.

Here is a tail-recursive version of factorial. tr_factorial(n, prod) returns n!*prod. Therefore, tr_factorial(n,1) returns n!:

public static long tr_factorial(int n, long prod) {
    if (n==0) return prod;
    return tr_factorial(n-1, n*prod);
}

That second parameter, prod, acts as "scratch paper". Note that when the recursive call returns, its value is itself returned, with absolutely no further multiplication.
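
Because of this, the conversion to a loop is mechanical: each tail call tr_factorial(n-1, n*prod) simply becomes an update of n and prod. A sketch (the name loop_factorial is mine):

    public static long loop_factorial(int n) {
        long prod = 1;
        while (n != 0) {
            prod = n * prod;             // the n*prod from the tail call
            n = n - 1;                   // the n-1 from the tail call
        }
        return prod;                     // the n==0 case: return prod
    }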


Every recursive function has a base case, or atomic case, that does not lead to further recursion.

Empty base case
    example: printing a vector, Bailey p 97.
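
    A sketch of the idea (not Bailey's actual code): print the element at position pos, then recursively print the rest. The base case, reached when pos hits the end of the array, has nothing at all to do:

        // empty base case: when pos == data.Length the method simply returns
        public static void printFrom(int[] data, int pos) {
            if (pos == data.Length) return;      // empty base case
            Console.WriteLine(data[pos]);
            printFrom(data, pos+1);              // print the rest
        }

    Calling printFrom(data, 0) prints the entire array.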


Fibonacci problem
   
The Fibonacci numbers follow the rule F(n) = F(n-1) + F(n-2), for n>=2; F(0) = F(1) = 1.
The recursive version is in fibonacci.cs.

Note what happens with fibonacci(45). It takes about 13 seconds! Why so slow? The result here is 1836311903, which is correct.

Problem: each call results in two recursive subcalls.

But this happens with recursive mergesort, too! Except in that case the two subcalls are each half the size. Here, for N not close to 1, the two recursive subcalls are each nearly as large as the original call.

Trace demo: fibonacci3(8)

The number of subcalls for fib(n), in fact, is O(phi^n), where phi = (sqrt(5)+1)/2 (demo with callcount).
(It turns out that the number of subcalls, NC(n), follows the same rule as fib(n):
    fib(n) = fib(n-1) + fib(n-2)
    NC(n) = NC(n-1) + NC(n-2)
)

How do we fix this? Demo: fibonacci2(91). (What happens at fibonacci2(92)?)
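
fibonacci2 (in fibonacci.cs) presumably computes the values bottom-up, so that each one is computed only once. A sketch, not necessarily the actual code:

    // iterative fibonacci: n steps instead of ~phi^n recursive calls
    public static long fibonacci2(int n) {
        long prev = 1, curr = 1;             // F(0) = F(1) = 1
        for (int i = 2; i <= n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return curr;
    }

With this indexing, fibonacci2(91) should be the largest value that still fits in a long; fibonacci2(92) overflows.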


Mathematical Induction

The principle of mathematical induction is closely related to recursion. In recursion, we use F(n-1) to calculate F(n). In induction, we prove statement P(n) using statement P(n-1) as a hypothesis.

Suppose we want to prove 1+2+...+n = n*(n+1)/2 = P(n).
Or 1^2 + 2^2 + 3^2 + ... + n^2 = n*(n+1)*(2n+1)/6 = Q(n) = n^3/3 + n^2/2 + n/6.
(P(n) and Q(n) are defined to be shorthands for the righthand sides here)

Step 1: verify for n=0, or n=1.
Step 2: assuming the claim for n, prove it for n+1. That is, assuming

    1^2 + 2^2 + 3^2 + ... + n^2 = Q(n),       CLAIM(n)
prove
    1^2 + 2^2 + 3^2 + ... + n^2 + (n+1)^2 = Q(n+1).       CLAIM(n+1)

If we start from our CLAIM(n) and add (n+1)^2 to both sides, we get

    1^2 + 2^2 + 3^2 + ... + n^2 + (n+1)^2 = Q(n) + n^2 + 2n + 1.

Similarly, Q(n+1) = (n+1)^3/3 + (n+1)^2/2 + (n+1)/6 = (n^3/3 + n^2 + n + 1/3) + (n^2/2 + n + 1/2) + (n/6 + 1/6)
          = n^3/3 + n^2/2 + n/6 + n^2 + 2n + 1 = Q(n) + n^2 + 2n + 1.

Since both sides of CLAIM(n+1) equal Q(n) + n^2 + 2n + 1, CLAIM(n+1) follows.

Question: where did this formula come from? The 1+2+3+...+n formula at least has a natural derivation: add it to itself in reverse order, and you have n columns where each column adds up to n+1. The doubled total is thus n*(n+1), and the original sum is thus P(n) above.
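
In symbols, that derivation is:

    \begin{aligned}
      S  &= 1 + 2 + \cdots + (n-1) + n \\
      S  &= n + (n-1) + \cdots + 2 + 1 \\
      2S &= \underbrace{(n+1) + (n+1) + \cdots + (n+1)}_{n\ \text{columns}} = n(n+1)
    \end{aligned}

so S = n*(n+1)/2 = P(n).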

Induction is tricky: note exactly where we assume the n-stage hypothesis at the start.


Postage-stamp problem, p 98

Suppose you want change in stamps: either 44 cents, 28 cents, or 1 cent. For a given amount, what is the smallest number of stamps you can receive as change?

Note that the obvious approach of getting as many 44's as possible, then as many 28's as possible, etc, is not optimal in all (or even very many) cases. For example, if you need 60 cents back, that is either 1 44-cent stamp and 16 1-cent stamps (total of 17) or two 28's and four 1's (total of 6).

Strategy: if we have N cents to make change for, then the optimum is one of the following:

    a 44-cent stamp plus the optimal change for N-44
    a 28-cent stamp plus the optimal change for N-28
    a 1-cent stamp plus the optimal change for N-1

So we calculate stampCount(N-44), stampCount(N-28), and stampCount(N-1), take the minimum, and add one more stamp (which stamp depends on which subproblem was the minimum).

simple recursive version: each call generates three subcalls
efficient version: creating an array of all values so far: an answer cache, Bailey p 99
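
Here are sketches of both versions; my actual code is in stampchange.cs and may differ in the details (Math.Min is System.Math.Min):

    // simple recursive version: up to three subcalls per call
    public static int stampCount(int amount) {
        if (amount == 0) return 0;
        int best = stampCount(amount-1);                          // use a 1-cent stamp
        if (amount >= 28) best = Math.Min(best, stampCount(amount-28));
        if (amount >= 44) best = Math.Min(best, stampCount(amount-44));
        return best + 1;                                          // plus the stamp just used
    }

    // efficient version: an answer cache of all values so far,
    // so each smaller amount is solved only once
    public static int stampCount2(int amount) {
        int[] cache = new int[amount+1];
        cache[0] = 0;
        for (int n = 1; n <= amount; n++) {
            int best = cache[n-1];
            if (n >= 28) best = Math.Min(best, cache[n-28]);
            if (n >= 44) best = Math.Min(best, cache[n-44]);
            cache[n] = best + 1;
        }
        return cache[amount];
    }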

Some sample results from the first, inefficient, version:
stampCount(200) = 6:     404061476 calls, 1811 ms
stampCount(201) = 7:     443663991 calls, 1976 ms
stampCount(202) = 8:     487148470 calls, 2196 ms
stampCount(204) = 5:     587322736 calls, 2614 ms
stampCount(206) = 7:     708097261 calls, 3189 ms
stampCount(208) = 9:     853706337 calls, 3856 ms
stampCount(210) = 11: 1029255633 calls, 4699 ms

Note the time/space tradeoff; it is rather decidedly in favor of spending a little space to save all that time. The answers are small, but the simple recursion takes a long time to get there!

Which version is this?
    stampCount(1000) = 26:  2931 calls, 0 ms.

My source code is in stampchange.cs.


Expressions

Expressions involving unary and binary operators can be expressed neatly as trees:

2*(3+5)

          *
         / \
        2   +
           / \
          3   5

This is a binary tree. Expressions like 2+3+5 can always be made binary with appropriate grouping: (2+3)+5. Sometimes, however, it's convenient to think of this as a multiway tree. Here's such a tree for 2*(3+5+7):

            expr
           /  |  \
          2   *   expr
                    |
            ┌───┬───┼───┬───┐
            3   +   5   +   7

For the moment, let's treat these kinds of trees just as notation, and not worry about how we'd actually implement them.

In the binary case, a leaf node holds just a numeric value. A non-leaf node holds an operator, and two subtrees representing the left and right operands. In the multiway case, a node contains a list of alternating values and operators, with an operator between each pair of consecutive values. Values are either numbers or subtrees (corresponding to subexpressions).

These are recursive definitions of trees. So far we are not defining this as an actual C# data structure.

expr_eval

Let us write a program to evaluate integer expressions. Expressions consist of punctuation and numbers; collectively these are called tokens and will be implemented as strings.

The first step is the Tokenizer class, which in this case takes a one-line string and returns the tokens after skipping over any whitespace characters. When the end of the line is reached, null is returned (this turns out to be a little awkward, because if tok is null, then a call like tok.Equals("+") is illegal). The Tokenizer class contains a method

       public bool isNumber(string t)

Conceptually this could be static, but it is only ever used in reference to an existing tokenizer, and so this doesn't matter; it's easier to write t.isNumber(s) than Tokenizer.isNumber(s). The Tokenizer object in the main program will be called t, so t.token() returns the next token.

The syntax for expressions will be defined by the following grammar, in which '::=' separates the item defined from its definition, '|' separates alternatives and '{' and '}' enclose parts that can be repeated 0 or more times:

    expr   ::= term { addop term }
    term   ::= factor { mulop factor }
    factor ::= number | '(' expr ')'
    addop  ::= '+' | '-'
    mulop  ::= '*' | '/' | '%'

This is so-called "extended Backus-Naur form", or EBNF, where the "extended" refers to the introduction of { } as a repetition mechanism and [ ] as an option mechanism. A slightly more natural grammar is below (in "plain" Backus-Naur form), but it makes it harder to tell what is going on by looking only at the very first token:

    expr   ::= term | term addop expr
    term   ::= factor | factor mulop term
    factor ::= number | '(' expr ')'

Demo: building parse trees for 2*(3+5).

The EBNF grammar, with the { }, is better suited to use with the multiway trees earlier. Alternatively, we can make all the operators binary by enforcing a rule for grouping, though this gets more complicated than we'd like.

The assumption in parsing is that the next token is always available in variable theToken; this is known as one-symbol lookahead. The points where the parser must make a decision are (for the first grammar):

    in expr:   whether to continue the { addop term } loop
    in term:   whether to continue the { mulop factor } loop
    in factor: whether to expect a number or a parenthesized expr

In each case, we can make the decision based only on theToken: in the first case, if theToken is an addition operator, we parse "addop term"; the second case is similar except with multiplication operators. In the third case, we decide based on whether the token is a number or the '(' string; anything else is an error.

When parsing something that represents a concrete token, eg a number or ')', we simply advance with theToken = t.token(). When we know exactly what token we are looking for, it is handy to use accept(tok), which advances to the next token if the current token matches tok, and otherwise reports an error.

The actual parsing is done by one method corresponding to each grammar symbol above: exprEval(), termEval() and factorEval(). The idea is that any grammar symbol on the right side of ::= is replaced with a call to its method; anything enclosed in { and } is replaced with a while loop that runs so long as theToken is a legitimate first symbol of whatever is in the { and }. For the first grammar rule, this is the set of addition operators. In brief outline, exprEval does this:

    termEval()
    while (theToken in addops) {
        accept(addop);            // not literally right
        termEval()
    }
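
To give the flavor, here is a rough, self-contained sketch of such an evaluator. The real code is in expr_eval.cs and oneline_tokenizer.cs; the tiny tokenizer stand-in and the other details below are my assumptions, not the actual code:

    using System;
    using System.Collections.Generic;

    class ExprSketch {
        static Queue<string> tokens;      // stand-in for the Tokenizer t
        static string theToken;           // the one-symbol lookahead

        static string nextToken() {       // stand-in for t.token(); null at end of line
            return tokens.Count > 0 ? tokens.Dequeue() : null;
        }

        static bool isNumber(string s) {  // stand-in for t.isNumber()
            int dummy;
            return s != null && int.TryParse(s, out dummy);
        }

        // accept(tok): the current token must be tok; advance past it
        // (the real code throws its own SyntaxException here)
        static void accept(string tok) {
            if (theToken == null || !theToken.Equals(tok))
                throw new Exception("syntax error: expected " + tok);
            theToken = nextToken();
        }

        // expr ::= term { addop term }
        static int exprEval() {
            int val = termEval();
            while (theToken != null && (theToken.Equals("+") || theToken.Equals("-"))) {
                string op = theToken;
                theToken = nextToken();             // move past the addop
                int rhs = termEval();
                val = op.Equals("+") ? val + rhs : val - rhs;
            }
            return val;
        }

        // term ::= factor { mulop factor }
        static int termEval() {
            int val = factorEval();
            while (theToken != null &&
                   (theToken.Equals("*") || theToken.Equals("/") || theToken.Equals("%"))) {
                string op = theToken;
                theToken = nextToken();             // move past the mulop
                int rhs = factorEval();
                if (op.Equals("*")) val = val * rhs;
                else if (op.Equals("/")) val = val / rhs;
                else val = val % rhs;
            }
            return val;
        }

        // factor ::= number | '(' expr ')'
        static int factorEval() {
            if (isNumber(theToken)) {
                int num = int.Parse(theToken);
                theToken = nextToken();             // move past the number
                return num;
            }
            accept("(");
            int val = exprEval();
            accept(")");
            return val;
        }

        static void Main() {
            // tokens are pre-split here; the real Tokenizer produces them from a one-line string
            tokens = new Queue<string>(new string[] {"2", "*", "(", "3", "+", "5", ")"});
            theToken = nextToken();
            Console.WriteLine(exprEval());          // prints 16
        }
    }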

Now let's look at the actual code for each one.

The Point: this would be exceedingly difficult to write without using recursion!

Syntax errors

What if there is a syntax error? We will throw an exception: SyntaxException. Note that we also catch the pre-declared exception System.DivideByZeroException.

There are several advantages to throwing exceptions rather than using Boolean flags or other parameters to carry error information. Mainly, the code is simpler.

The source code is in oneline_tokenizer.cs and expr_eval.cs.

expr_assign

Now let's extend this to allow variables in factors. Variables are set with

    stmt  ::= 'set' ident '=' expr | expr

That is, a line can be either

    set foo = 3+2*6
or
    foo+1

So stmtEval() can either return a value (the second case) or not (the first). We have it return something of type int? (that is, Nullable<int>), which holds either an ordinary int or the value null. Look at the code to see how this works. Such types are called nullable types.
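
As a reminder of how an int? behaves (an illustrative snippet, not from expr_assign.cs):

    int? result = null;                      // "no value", as for a set line
    Console.WriteLine(result.HasValue);      // False
    result = 13;                             // an ordinary int value
    if (result != null)                      // or: if (result.HasValue)
        Console.WriteLine(result.Value);     // 13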

Without the set keyword, parsing lines like these is much more difficult; we have to look ahead to see if the second token is "=" or not. This is trickier than one might think.

This code is in oneline_tokenizer.cs and expr_assign.cs.

The next step is the mini-Java compiler. See compiler.html.

Virtual machine, stack architecture
mini-java syntax
Extended Backus Normal Form (EBNF)
EBNF => recursive-descent parser code
    accept()
global variables
local variables (including in main())
expressions
scope
symbol tables
CompileExprN
common-subexpression handling