Recursion
starts at Bailey page 94
We saw a recursive version of the gcd(m,n) function earlier.
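private static int rgcd(int a, int b) {
    if (a<0) return rgcd(-a, b);    // make a nonnegative
    if (a==0) {
        if (b==0) return 1;         // convention used here for gcd(0,0)
        return b;
    }
    if (b<a) return rgcd(b, a);     // swap so that a <= b
    return rgcd(b%a, a);            // here 0 < a <= b
}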
How about the following recursive version of factorial(n)?
int factorial(int n) {
    if (n==0) return 1;
    return n*factorial(n-1);
}
See factorial.cs
(Bailey uses a recursive function sum3, p 95, to add the numbers 1..n
instead of multiplying them)
Note this mirrors the following definition
of n!:
0! = 1
n! = n*(n-1)! for n>0
Demo: factorial(10),
factorial(11) = 39916800
factorial(12) = 479001600
factorial(13) = 1932053504
factorial(14) = 1278945280
factorial(15) = 2004310016
factorial(16) = 2004189184
factorial(17) = -288522240
For that matter, try factorial(34).
What is wrong? Which of these are correct?
We can do a little better by replacing int with long
(including as the type of retval). Then we see factorial(17) =
355687428096000, and factorial(13) = 6227020800.
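As a rough illustration of the int-to-long change, here is a minimal sketch. The names FactorialLong and FactorialChecked are made up for this example and are not taken from factorial.cs. The second method uses C#'s checked(...) expression, so an overflow throws an exception instead of silently wrapping around.

```csharp
using System;

public static class FactorialOverflowDemo {
    // Same recursion as factorial(n), but the arithmetic is done in long.
    public static long FactorialLong(int n) {
        if (n == 0) return 1;
        return n * FactorialLong(n - 1);
    }

    // Hypothetical variant: checked(...) turns a silent wraparound
    // into an OverflowException.
    public static long FactorialChecked(int n) {
        if (n == 0) return 1;
        return checked(n * FactorialChecked(n - 1));
    }

    public static void Main() {
        Console.WriteLine(FactorialLong(13));   // 6227020800
        Console.WriteLine(FactorialLong(17));   // 355687428096000
        try {
            Console.WriteLine(FactorialChecked(21));
        } catch (OverflowException) {
            Console.WriteLine("21! does not fit in a long");
        }
    }
}
```

A long only postpones the problem: 20! still fits, but 21! does not.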
Then there is factorial(-1). What goes wrong here? factorial2() is an
attempted fix: it may be important to make sure library methods never
loop (or recurse) forever, but there is only so much that can be done
to deal with library users who do not follow the preconditions.
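One possible way to enforce the precondition n >= 0 is sketched below. This is an assumption about what such a fix could look like, not the actual factorial2() from the course files; the name FactorialGuarded is invented for the example.

```csharp
using System;

public static class FactorialGuardDemo {
    // Hypothetical guarded version: rejects negative arguments up front
    // instead of recursing through -1, -2, -3, ... until the stack runs out.
    public static long FactorialGuarded(int n) {
        if (n < 0) {
            throw new ArgumentOutOfRangeException(nameof(n), "n must be >= 0");
        }
        if (n == 0) return 1;
        return n * FactorialGuarded(n - 1);
    }

    public static void Main() {
        Console.WriteLine(FactorialGuarded(5));   // 120
        try {
            FactorialGuarded(-1);
        } catch (ArgumentOutOfRangeException) {
            Console.WriteLine("factorial is not defined for negative n");
        }
    }
}
```

The check runs on every recursive call, which is slightly wasteful; a common alternative is to check once in a public wrapper and recurse in a private helper.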
Stack-trace demo from Console.WriteLine(factorial3(4));
public static long factorial3(int N) {
    Console.WriteLine("factorial3 called with N={0}", N);
    long retval;
    if (N==0) retval = 1;
    else {
        /*A*/ long fact = factorial3(N-1);
        retval = N * fact;
    }
    Console.WriteLine("factorial3({0}) returns {1}", N, retval);
    return retval;
}
Calling factorial3(4) basically amounts to the following:
push(4)
push(3)
push(2)
push(1)
prod = 1;
prod *= pop()
prod *= pop()
prod *= pop()
prod *= pop()
return prod
Stack frames for the above:
N = 4 | retval =   | return to: WriteLine
N = 3 | retval =   | return to /*A*/
N = 2 | retval =   | return to /*A*/
N = 1 | retval =   | return to /*A*/
N = 0 | retval = 1 | Atomic case
This is indeed a stack of "frames": at any one moment,
only the topmost frame is accessible (where the top is at the bottom here).
On return, we pop() the
topmost frame, exposing the one below.
The pattern above works for any recursive function of the form
f(n) = A if n=0; = F(n,f(n-1)) otherwise
In this case, f(4) becomes
push(4)
push(3)
push(2)
push(1)
res = A;
res = F(pop(), res)
res = F(pop(), res)
res = F(pop(), res)
res = F(pop(), res)
return res
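The push/pop outline above can be written out directly with an explicit stack. The following is only a sketch: the method name Evaluate and its parameters (the base value a and the combining function combine, standing in for A and F) are chosen for this example.

```csharp
using System;
using System.Collections.Generic;

public static class ExplicitStackDemo {
    // Computes f(n), where f(0) = a and f(n) = combine(n, f(n-1)) for n > 0,
    // using an explicit stack in place of recursive calls.
    public static long Evaluate(int n, long a, Func<int, long, long> combine) {
        if (n < 0) throw new ArgumentOutOfRangeException(nameof(n));
        var stack = new Stack<int>();
        for (int k = n; k >= 1; k--) {
            stack.Push(k);                      // push(n), push(n-1), ..., push(1)
        }
        long res = a;                           // value of the atomic case
        while (stack.Count > 0) {
            res = combine(stack.Pop(), res);    // res = F(pop(), res)
        }
        return res;
    }

    public static void Main() {
        Console.WriteLine(Evaluate(4, 1, (k, r) => k * r));   // 4! = 24
        Console.WriteLine(Evaluate(4, 0, (k, r) => k + r));   // 1+2+3+4 = 10
    }
}
```

With multiplication as F and A = 1 this is factorial; with addition and A = 0 it is the sum 1..n.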
factorial tail-recursion
Tail recursion means that, in the recursive case, it is the value of the
recursive call that is returned directly, with no further work done to it.
Such recursion can be converted mechanically to an iterative form.
Here is a tail-recursive version of factorial. tr_factorial(n, prod) returns
n!*prod. Therefore, tr_factorial(n,1) returns n!:
public static long tr_factorial(int n, long prod) {
    if (n==0) return prod;
    return tr_factorial(n-1, n*prod);
}
That second parameter, prod, acts as "scratch paper". Note
that when the recursive call returns, its value is itself returned, with
absolutely no further multiplication.
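To make the "mechanical conversion" concrete, here is a sketch of the same function as a loop. The name IterFactorial is made up for this example. The recursive call tr_factorial(n-1, n*prod) becomes a pair of assignments to the parameters, followed by a jump back to the top.

```csharp
using System;

public static class TailRecursionDemo {
    // The tail-recursive version, as in the text.
    public static long TrFactorial(int n, long prod) {
        if (n == 0) return prod;
        return TrFactorial(n - 1, n * prod);
    }

    // Hypothetical loop version: each pass through the loop plays the role
    // of one recursive call, with the parameters updated in place.
    public static long IterFactorial(int n, long prod) {
        while (n != 0) {
            prod = n * prod;   // new second argument (uses the old n)
            n = n - 1;         // new first argument
        }
        return prod;           // the base case: return prod
    }

    public static void Main() {
        Console.WriteLine(TrFactorial(10, 1));     // 3628800
        Console.WriteLine(IterFactorial(10, 1));   // 3628800
    }
}
```

Note the order of the two assignments: prod must be updated before n, because the recursive call computes both new arguments from the old value of n.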
Every recursive function has a base case,
or atomic case, that does not lead to further recursion.
Empty base case
example: printing a vector, Bailey p 97.
Fibonacci problem
The Fibonacci numbers follow the rule F(n) = F(n-1) + F(n-2)
for n>=2, with F(0) = F(1) = 1.
The recursive version is in fibonacci.cs.
Note what happens with fibonacci(45). It takes about 13 seconds! Why so
slow? The result here is 1836311903, which is correct.
Problem: each call results in two recursive subcalls.
But this happens with recursive mergesort, too! Except in that case the two
subcalls are each half the size. Here, for N not close to 1, the recursive
subcalls are more comparably sized.
Trace demo: fibonacci3(8)
The number of subcalls for fib(n), in fact, is O(phi^n), where phi
= (sqrt(5)+1)/2 (demo with callcount).
(It turns out that the number of subcalls, NC(n), follows the same rule as
fib(n):
fib(n) = fib(n-1) + fib(n-2)
NC(n) = NC(n-1) + NC(n-2)
)
How do we fix this? fibonacci2(91) (what happens at fibonacci2(92)?)
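One common fix is to compute the values bottom-up, keeping only the last two. The sketch below is an assumption about the general approach, not the contents of fibonacci.cs; the names FibRecursive, FibIterative and CallCount are invented here. It uses the convention F(0) = F(1) = 1 from above.

```csharp
using System;

public static class FibonacciDemo {
    public static long CallCount = 0;

    // Direct translation of the recursive rule; counts every call made.
    public static int FibRecursive(int n) {
        CallCount++;
        if (n < 2) return 1;
        return FibRecursive(n - 1) + FibRecursive(n - 2);
    }

    // Hypothetical iterative version: one pass, no repeated subproblems.
    public static long FibIterative(int n) {
        long prev = 1, cur = 1;             // F(0) and F(1)
        for (int i = 2; i <= n; i++) {
            long next = prev + cur;
            prev = cur;
            cur = next;
        }
        return cur;
    }

    public static void Main() {
        Console.WriteLine(FibRecursive(30));
        Console.WriteLine("calls: {0}", CallCount);
        Console.WriteLine(FibIterative(45));   // 1836311903
        Console.WriteLine(FibIterative(91));
    }
}
```

The iterative version does n-1 additions, so the only remaining limit is the size of a long.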
Mathematical Induction
The principle of mathematical induction is closely related to recursion. In
recursion, we use F(n-1) to calculate F(n). In induction, we prove statement
P(n) using statement P(n-1) as a hypothesis.
Suppose we want to prove 1+2+...+n = n*(n+1)/2 = P(n).
Or 1^2 + 2^2 + 3^2 + ... + n^2 =
n*(n+1)*(2n+1)/6 = Q(n) = n^3/3 + n^2/2 + n/6.
(P(n) and Q(n) are defined to be shorthands for the righthand sides here)
Step 1: verify for n=0, or n=1.
Step 2: assuming the claim for n,
prove it for n+1. That is, assuming
1^2 + 2^2 + 3^2 + ... + n^2 = Q(n),     CLAIM(n)
prove
1^2 + 2^2 + 3^2 + ... + n^2 + (n+1)^2 = Q(n+1).     CLAIM(n+1)
If we start from our CLAIM(n) and add (n+1)^2 to both sides, we get
1^2 + 2^2 + 3^2 + ... + n^2 + (n+1)^2 = Q(n) + n^2 + 2n + 1.
Similarly,
Q(n+1) = (n+1)^3/3 + (n+1)^2/2 + (n+1)/6
       = (n^3/3 + n^2 + n + 1/3) + (n^2/2 + n + 1/2) + (n/6 + 1/6)
       = n^3/3 + n^2/2 + n/6 + n^2 + 2n + 1
       = Q(n) + n^2 + 2n + 1.
The two right-hand sides agree, which establishes CLAIM(n+1).
Question: where did this formula come
from? The 1+2+3+...+n formula at least has a natural derivation: add it to
itself in reverse order, and you have n columns where each column adds up to
n+1. The doubled total is thus n*(n+1), and the original sum is thus P(n)
above.
Induction is tricky: note exactly where we assume
the n-stage hypothesis at the start.
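The link between recursion and induction can be seen by writing the sum of squares recursively and comparing it with the closed form. This is a sketch with invented names (SumSquares, Q); agreement for the first thousand values is a sanity check on the formula, not a proof.

```csharp
using System;

public static class SumOfSquaresCheck {
    // Recursive definition: S(0) = 0, S(n) = S(n-1) + n^2.
    // The recursive step has the same shape as the induction step.
    public static long SumSquares(int n) {
        if (n == 0) return 0;
        return SumSquares(n - 1) + (long)n * n;
    }

    // Closed form Q(n) = n*(n+1)*(2n+1)/6.
    public static long Q(long n) {
        return n * (n + 1) * (2 * n + 1) / 6;
    }

    public static void Main() {
        bool allMatch = true;
        for (int n = 0; n <= 1000; n++) {
            if (SumSquares(n) != Q(n)) allMatch = false;
        }
        Console.WriteLine(allMatch);   // prints True
    }
}
```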
Postage-stamp problem, p 98
Suppose you want change in stamps, either 44 cents or 28 cents or 1 cent.
For a given amount, what is the fewest
number of stamps you can receive as change?
Note that the obvious approach of getting as many 44's as possible, then as
many 28's as possible, etc, is not optimal in all (or even very many) cases.
For example, if you need 60 cents back, that is either one 44-cent stamp and
16 1-cent stamps (total of 17) or two 28's and four 1's (total of 6).
Strategy: if we have N cents to make change for, then the optimum is one of
the following:
- optimum for N-44, plus one additional 44-cent stamp
- optimum for N-28, plus one additional 28-cent stamp
- optimum for N-1, plus one additional 1-cent stamp
So we calculate stampCount(N-44), stampCount(N-28), and stampCount(N-1),
take the minimum, and add one more stamp (which stamp depends on
which subproblem was the minimum).
- simple recursive version: each call generates three subcalls
- efficient version: creating an array of all values so far: an answer
cache, Bailey p 99
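Both versions can be sketched as follows. This is not the code in stampchange.cs; the names StampCountNaive, StampCountCached and Calls are assumptions made for the example, and the cache uses 0 to mean "not computed yet" (safe here because only N = 0 has the answer 0, and that case is handled separately).

```csharp
using System;

public static class StampDemo {
    static readonly int[] Denominations = { 44, 28, 1 };
    public static long Calls = 0;

    // Simple recursive version: up to three subcalls per call.
    public static int StampCountNaive(int n) {
        Calls++;
        if (n == 0) return 0;
        int best = int.MaxValue;
        foreach (int d in Denominations) {
            if (d <= n) best = Math.Min(best, StampCountNaive(n - d) + 1);
        }
        return best;
    }

    // Cached version: each amount is solved once and then looked up.
    public static int StampCountCached(int n) {
        return Cached(n, new int[n + 1]);
    }

    private static int Cached(int n, int[] cache) {
        Calls++;
        if (n == 0) return 0;
        if (cache[n] != 0) return cache[n];      // answer already known
        int best = int.MaxValue;
        foreach (int d in Denominations) {
            if (d <= n) best = Math.Min(best, Cached(n - d, cache) + 1);
        }
        cache[n] = best;
        return best;
    }

    public static void Main() {
        Console.WriteLine(StampCountNaive(60));      // 6
        Console.WriteLine("naive calls: {0}", Calls);
        Calls = 0;
        Console.WriteLine(StampCountCached(1000));   // 26
        Console.WriteLine("cached calls: {0}", Calls);
    }
}
```

Because a 1-cent stamp always fits, best is always assigned for n >= 1. The cache costs one int per amount up to N, in exchange for a call count that grows linearly rather than exponentially.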
Some sample results from the first, inefficient, version:
stampCount(200) = 6: 404061476 calls, 1811 ms
stampCount(201) = 7: 443663991 calls, 1976 ms
stampCount(202) = 8: 487148470 calls, 2196 ms
stampCount(204) = 5: 587322736 calls, 2614 ms
stampCount(206) = 7: 708097261 calls, 3189 ms
stampCount(208) = 9: 853706337 calls, 3856 ms
stampCount(210) = 11: 1029255633 calls, 4699 ms
Note the time/space tradeoff, rather decidedly in favor of less time. The
answers are small, but it takes a long time to get there!
Which version is this?
stampCount(1000) = 26: 2931 calls, 0 ms.
My source code is in stampchange.cs.
Expressions
Expressions involving unary and binary operators can be expressed neatly as
trees:
2*(3+5)
    *
   / \
  2   +
     / \
    3   5
This is a binary tree. Expressions like 2+3+5 can always be made binary with appropriate grouping: (2+3)+5. Sometimes, however, it's convenient to think of this as a multiway tree. Here's such a tree for 2*(3+5+7):
      expr
    /  |  \
   2   *   expr
            |
      ┌──┬──┼──┬──┐
      |  |  |  |  |
      3  +  5  +  7
For the moment, let's treat these kinds of trees just as notation, and not worry about how
we'd actually implement them.
In the binary case, a leaf node holds just a numeric value. A non-leaf node holds an operator,
and two subtrees representing the left and right operands. In the multiway case, a node contains a list of alternating values and operators, with an operator between each pair of consecutive values. Values are either numbers or subtrees (corresponding to subexpressions).
These are recursive definitions of trees. So far we
are not defining this as an actual C# data structure.
expr_eval
Let us write a program to evaluate integer expressions.
Expressions consist of punctuation and numbers; collectively these are
called tokens and will be implemented as strings.
The first step is the Tokenizer class, which in this case takes a one-line
string and returns the tokens after skipping over any whitespace characters.
When the end of the line is reached, null is returned (this turns out to be
a little awkward, because if tok is null, then a call like tok.Equals("+")
is illegal). The Tokenizer class contains a method
public bool isNumber(string t)
It should conceptually be static, but it's only used in reference to the
existing tokenizer and so this doesn't matter. The Tokenizer object in the
main program will be called t, so t.token() returns the next token. It's
easier to write t.isNumber(s) than Tokenizer.isNumber(s).
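A tokenizer along these lines might look like the sketch below. It is not the code in oneline_tokenizer.cs: the class name SketchTokenizer is invented, and the rule that every non-digit, non-space character is its own one-character token is an assumption that happens to suffice for the expressions here.

```csharp
using System;

public class SketchTokenizer {
    private readonly string line;
    private int pos = 0;

    public SketchTokenizer(string line) { this.line = line; }

    private static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    // Returns the next token, or null when the end of the line is reached.
    public string token() {
        while (pos < line.Length && char.IsWhiteSpace(line[pos])) pos++;
        if (pos >= line.Length) return null;
        int start = pos;
        if (IsDigit(line[pos])) {
            while (pos < line.Length && IsDigit(line[pos])) pos++;   // a whole number
        } else {
            pos++;                                                   // one punctuation character
        }
        return line.Substring(start, pos - start);
    }

    // Instance method, so it can be called as t.isNumber(s).
    public bool isNumber(string s) {
        if (string.IsNullOrEmpty(s)) return false;
        foreach (char c in s) {
            if (!IsDigit(c)) return false;
        }
        return true;
    }

    public static void Main() {
        var t = new SketchTokenizer(" 12 * ( 3+5 )");
        for (string tok = t.token(); tok != null; tok = t.token()) {
            Console.WriteLine("{0}  number? {1}", tok, t.isNumber(tok));
        }
    }
}
```

One way around the null awkwardness: in C#, the comparison tok == "+" is safe even when tok is null, whereas tok.Equals("+") throws.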
The syntax for expressions will be defined by the following grammar,
in which '::=' separates the item defined from its definition, '|' separates
alternatives and '{' and '}' enclose parts that can be repeated 0 or more
times:
expr ::= term { addop term }
term ::= factor { mulop factor }
factor ::= number | '(' expr ')'
addop ::= '+' | '-'
mulop ::= '*' | '/' | '%'
This is so-called "extended Backus-Naur form", or EBNF, where the "extended" refers to the introduction of { } as a repetition mechanism and [ ] as an option mechanism. A slightly more natural grammar is below (in "plain" Backus-Naur form), but it makes it harder to tell
what is going on by looking only at the very first token:
expr ::= term | term addop expr
term ::= factor | factor mulop term
factor ::= number | '(' expr ')'
Demo: building parse trees for 2*(3+5).
The EBNF grammar, with the { }, is better suited to use with the multiway trees earlier. Alternatively, we can make all the operators binary by enforcing a rule for grouping, though this gets more complicated than we'd like.
The assumption in parsing is that the next token is always available in
variable theToken; this is known as one-symbol
lookahead. The points where the parser must make a decision
are (for the first grammar):
- expr: deciding whether there is another occurrence of "addop term"
- term: deciding whether there is another occurrence of "mulop factor"
- factor: deciding between the two alternatives
In each case, we can make the decision based only on theToken:
in the first case, if theToken is an addition operator,
we parse "addop term"; the second case is similar except with
multiplication operators. In the third case, we decide based on whether
the token is a number or the '(' string; anything else is an error.
When parsing something that represents a concrete token, eg number
or ), we simply call theToken = t.token(). When we know
exactly what token we are looking for, it is handy to use accept(tok),
which advances to the next token if the current token matches tok, and
otherwise throws an exception.
The actual parsing is done by one method corresponding to each grammar
symbol above: exprEval(), termEval() and factorEval(). The idea is that
any grammar symbol on the right side of ::= is replaced with a call to its
method; anything enclosed in { and } is replaced with a while loop that
runs so long as theToken is a legitimate first symbol of
whatever is in the { and }. For the first grammar rule, this is the set of
addition operators. In brief outline, exprEval does this:
termEval()
while (theToken in addops) {
    accept(addop);    // not literally right
    termEval()
}
Now let's look at the actual code for each one.
The Point: this would be exceedingly difficult to write
without using recursion!
Syntax errors
What if there is a syntax error? We will throw an exception:
SyntaxException. Note:
- Definition of class SyntaxException
- try-catch clause in Main()
- throw in accept() and in factorEval()
We also catch the pre-declared exception
System.DivideByZeroException.
There are several advantages to throwing exceptions rather than using
Boolean flags or other parameters to carry error information. Mainly, the
code is simpler.
The source code is in oneline_tokenizer.cs
and expr_eval.cs.
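For readers without the source files at hand, here is a compact sketch of a recursive-descent evaluator for the first grammar. It is an independent illustration, not the code in expr_eval.cs: the class name ExprEvalSketch, the advance() and Evaluate() helpers, and the built-in tokenizing loop are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public class SyntaxException : Exception {
    public SyntaxException(string msg) : base(msg) { }
}

public class ExprEvalSketch {
    private readonly List<string> tokens = new List<string>();
    private int next = 0;
    private string theToken;    // one-symbol lookahead; null at end of line

    public ExprEvalSketch(string line) {
        int i = 0;
        while (i < line.Length) {
            if (char.IsWhiteSpace(line[i])) { i++; continue; }
            int start = i;
            if (IsDigit(line[i])) { while (i < line.Length && IsDigit(line[i])) i++; }
            else i++;
            tokens.Add(line.Substring(start, i - start));
        }
        advance();
    }

    private static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    private void advance() { theToken = next < tokens.Count ? tokens[next++] : null; }

    private void accept(string tok) {
        if (theToken != tok)
            throw new SyntaxException("expected " + tok + ", found " + (theToken ?? "end of line"));
        advance();
    }

    // expr ::= term { addop term }
    public int exprEval() {
        int val = termEval();
        while (theToken == "+" || theToken == "-") {
            string op = theToken;
            advance();
            int rhs = termEval();
            val = (op == "+") ? val + rhs : val - rhs;
        }
        return val;
    }

    // term ::= factor { mulop factor }
    private int termEval() {
        int val = factorEval();
        while (theToken == "*" || theToken == "/" || theToken == "%") {
            string op = theToken;
            advance();
            int rhs = factorEval();
            if (op == "*") val *= rhs;
            else if (op == "/") val /= rhs;    // may throw DivideByZeroException
            else val %= rhs;
        }
        return val;
    }

    // factor ::= number | '(' expr ')'
    private int factorEval() {
        if (theToken == "(") {
            advance();
            int val = exprEval();              // the recursion: expr inside factor
            accept(")");
            return val;
        }
        if (theToken != null && IsDigit(theToken[0])) {
            int val = int.Parse(theToken);
            advance();
            return val;
        }
        throw new SyntaxException("unexpected " + (theToken ?? "end of line"));
    }

    public static int Evaluate(string line) {
        var p = new ExprEvalSketch(line);
        int val = p.exprEval();
        if (p.theToken != null) throw new SyntaxException("extra input: " + p.theToken);
        return val;
    }

    public static void Main() {
        foreach (string line in new[] { "2*(3+5)", "2+3*5", "10-4-3", "2*(3+", "7/0" }) {
            try { Console.WriteLine("{0} = {1}", line, Evaluate(line)); }
            catch (SyntaxException e) { Console.WriteLine("{0}: syntax error: {1}", line, e.Message); }
            catch (DivideByZeroException) { Console.WriteLine("{0}: division by zero", line); }
        }
    }
}
```

The while loops make the operators left-associative, so 10-4-3 evaluates as (10-4)-3. The exceptions let the three evaluation methods return plain int values, with all error reporting gathered in one try-catch.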
expr_assign
Now let's extend this to allowing variables in factors, which are set with
stmt ::= 'set' ident '=' expr | expr
That is, a line can be either
set foo = 3+2*6
or
foo+1
So stmtEval() can either return a value (the second case) or not (the
first). We have it return something of type int?, which is
an object that is basically an int but which can also be null. Look at the
code to see how this works. Such types are sometimes called Nullable.
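A small sketch of how an int? return value behaves is below. The StmtEval here is a toy stand-in invented for this example (it understands only "set NAME = NUMBER" and a bare NAME, with tokens separated by spaces), not the stmtEval() in expr_assign.cs.

```csharp
using System;
using System.Collections.Generic;

public static class NullableSketch {
    static readonly Dictionary<string, int> vars = new Dictionary<string, int>();

    // Toy stand-in: returns null for an assignment, a value otherwise.
    public static int? StmtEval(string line) {
        string[] parts = line.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
        if (parts.Length == 4 && parts[0] == "set" && parts[2] == "=") {
            vars[parts[1]] = int.Parse(parts[3]);
            return null;               // an assignment produces no value
        }
        return vars[parts[0]];         // an int converts implicitly to int?
    }

    public static void Main() {
        int? r1 = StmtEval("set foo = 15");
        Console.WriteLine(r1.HasValue);    // False
        int? r2 = StmtEval("foo");
        if (r2.HasValue) {
            Console.WriteLine(r2.Value);   // 15
        }
    }
}
```

The caller tests HasValue (or compares with null) before using Value; looking up an undefined name in this toy version simply throws.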
Without the set keyword, parsing lines like
these is much more difficult; we have to look ahead to see if the second
token is "=" or not. This is trickier than one might think.
This code is in oneline_tokenizer.cs
and expr_assign.cs.
The next step is the mini-Java compiler. See compiler.html.
Virtual machine, stack architecture
mini-java syntax
Extended Backus Normal Form (EBNF)
EBNF => recursive-descent parser code
accept()
global variables
local variables (including in main())
expressions
scope
symbol tables
CompileExprN
common-subexpression handling