Week of April 18
("machine" == "automaton", by the way)
Inputs, by the way, can be:
From last week:
Regular expression: * means repeat 0 or more times, ? means either 0 or 1 times
What strings match these?
What does a finite-state recognizer for these look like?
More examples of regular expressions:
These use slightly extended regexes (The google example does not support + or *)
\d matches any digit 0-9, same as [0-9]
\W matches anything other than a letter, digit or underscore, same as [^a-zA-Z0-9_]
\s matches a whitespace character (space, tab, newline)
^ matches the start of the line; $ matches the end of the line
{3,6} means that whatever single-character thing precedes this can match between 3 and 6 times
What does varname\W*=[^=] match?
Warning: there are quite a few different standards for regular expressions. Always read the documentation.
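As one concrete dialect, here is a small sketch of how the pieces above behave in Python's re module. The sample strings are made up for illustration, and other regex engines may differ. Note in particular that \W also matches '=', which affects what varname\W*=[^=] accepts.

```python
import re

# \d is any digit; {3,6} repeats the preceding single item 3 to 6 times;
# ^ and $ pin the match to the whole line.
digits = re.compile(r"^\d{3,6}$")

# * is "0 or more", ? is "0 or 1".
abc = re.compile(r"ab*c?")

# varname, a run of non-word characters, then '=' followed by a non-'='.
assign = re.compile(r"varname\W*=[^=]")

def matches(regex, text):
    """True if the regex matches somewhere in text."""
    return regex.search(text) is not None

print(matches(digits, "12345"))          # within 3..6 digits
print(matches(digits, "12"))             # too short
print(abc.fullmatch("abbb") is not None)
print(abc.fullmatch("acc") is not None)
print(matches(assign, "varname = 3"))    # plain assignment
print(matches(assign, "varname == 3"))   # \W* can swallow the first '='
print(matches(assign, "varname="))       # nothing after the '='
```

The "varname == 3" case is worth tracing by hand: \W* backs off to " =", the literal = matches the second '=', and [^=] matches the following space.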
Let's call the finite-state recognizers finite automata. So far the finite-state recognizers have all been deterministic: we never have a state with two outgoing edges, going two different directions, that are labeled with the same input. A deterministic finite automaton is abbreviated DFA.
How about b (ab)* a? There's a difference here. Now we do have a state with two different edges labeled 'a'. Such an automaton is known as nondeterministic, that is, as an NFA. We can still use an NFA to match inputs, but now what do we do if we're at a vertex and there are multiple edges that match the current input?
There are two primary approaches. The first is to try one of the edges first, and see if that works. If it does not, we backtrack to the vertex in question and at that point try the next edge. This approach does work, but with a poorly chosen regular expression it may be extremely slow. Consider the regular expression (a?)^{n} a^{n}. This means up to n optional a's, followed by n a's. Let us match against a^{n}, meaning all the optional a's must not be used. The usual strategy when matching "a?" is to try the "a" branch first, and only if that fails do we try the empty branch. But that now means that we will have 2^{n} - 1 false branches before we finally succeed.
Example: (a?)^{3} a^{3}.
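To see the blowup concretely, here is a minimal backtracking sketch. It is not how any particular regex library is implemented; the pattern representation (a list of ("opt", ch) and ("lit", ch) items) and the function names are assumptions made for this example. It tries the "a" branch of each a? first and counts recursive calls.

```python
def backtrack_match(pattern, text):
    """Return (matched, steps). pattern is a list of ("opt", ch) or
    ("lit", ch) items; for "opt" the non-empty branch is tried first."""
    steps = 0

    def go(i, j):
        nonlocal steps
        steps += 1
        if i == len(pattern):
            return j == len(text)
        kind, ch = pattern[i]
        has_ch = j < len(text) and text[j] == ch
        if kind == "lit":
            return has_ch and go(i + 1, j + 1)
        # kind == "opt": consume ch first, fall back to the empty branch
        if has_ch and go(i + 1, j + 1):
            return True
        return go(i + 1, j)

    return go(0, 0), steps

def pathological(n):
    """Build (a?)^n a^n in the list representation above."""
    return [("opt", "a")] * n + [("lit", "a")] * n

for n in (3, 10, 15):
    matched, steps = backtrack_match(pathological(n), "a" * n)
    print(n, matched, steps)
```

The step count roughly doubles each time n goes up by one, since the only successful path (every a? left empty) is the last one tried.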
A much faster approach is to use the NFA with state sets, rather than individual states. That is, when we are in state S1 and the next input can lead either to state S2 or state S3, we record the new state as {S2,S3}. If, for the next input, S2 can go to S4 and S3 can go to either S5 or S6, the next state set is {S4,S5,S6}. This approach might look exponential, but the number of states is fixed.
Example: (a?)^{3} a^{3}.
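The state-set idea can be sketched for the same pattern. Here a "state" is a position in the pattern list (an assumed representation, as are the function names), and the current state set is the set of all positions we could be at. Skipping an optional item costs no input, so each set is closed under those skips.

```python
def closure(pattern, positions):
    """Add every position reachable by skipping optional items."""
    result = set()
    for i in positions:
        result.add(i)
        while i < len(pattern) and pattern[i][0] == "opt":
            i += 1
            result.add(i)
    return result

def stateset_match(pattern, text):
    """Return (matched, work), where work counts (state, input) lookups."""
    current = closure(pattern, {0})
    work = 0
    for c in text:
        nxt = set()
        for i in current:
            work += 1
            if i < len(pattern) and pattern[i][1] == c:
                nxt.add(i + 1)
        current = closure(pattern, nxt)
    return len(pattern) in current, work

def pathological(n):
    """Build (a?)^n a^n as a list of ("opt", ch) / ("lit", ch) items."""
    return [("opt", "a")] * n + [("lit", "a")] * n

for n in (3, 10, 30):
    print(n, stateset_match(pathological(n), "a" * n))
```

A state set never holds more than 2n + 1 positions, so the work per input character is bounded by the pattern size rather than by the number of paths.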
See also https://swtch.com/~rsc/regexp/regexp1.html, "Regular expression
search algorithms", the paragraph beginning "A more efficient ...."
By the way, a much better regular expression for between n and 2n a's in a row is a^{n} (a?)^{n}. We parse n a's at the beginning, and the optional a's are all following.
The implementation of an NFA/DFA recognizer does literally use the graph approach: for each current state, and each next-input symbol, we look up what next states are possible with that input symbol. The code to drive the NFA/DFA does not need to be changed for different NFA/DFAs. This is a big win from a software-engineering perspective.
TCP state diagram: intronetworks.cs.luc.edu/current2/html/tcpA.html#tcpstatediagram. Note the additional software-engineering issue of this being a distributed system.
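A table-driven driver might look like the following sketch. The state numbers and the two transition tables are assumptions chosen for illustration; the point is that accepts() is the same code for both automata and only the table changes.

```python
def accepts(transitions, start, accepting, text):
    """Generic driver. transitions maps (state, symbol) to a set of next
    states; a DFA is just the case where every such set has one element."""
    current = {start}
    for symbol in text:
        current = {t for s in current
                     for t in transitions.get((s, symbol), ())}
    return bool(current & accepting)

# NFA for b (ab)* a: state 1 has two different edges labeled 'a'.
NFA_B_AB_STAR_A = {
    (0, "b"): {1},
    (1, "a"): {2, 3},   # 2 = inside an (ab) group, 3 = the final a
    (2, "b"): {1},
}

# DFA for a b*: at most one target per (state, symbol) pair.
DFA_A_B_STAR = {
    (0, "a"): {1},
    (1, "b"): {1},
}

print(accepts(NFA_B_AB_STAR_A, 0, {3}, "baba"))
print(accepts(NFA_B_AB_STAR_A, 0, {3}, "bab"))
print(accepts(DFA_A_B_STAR, 0, {1}, "abbb"))
```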
Study guide
Some people, when confronted with a problem, think “I know, I'll use regular expressions.” Now they have two problems.
- Jamie Zawinski (regex.info/blog/20060915/247)
Stop Validating Email Addresses with Regex: davidcel.is/posts/stopvalidatingemailaddresseswithregex.
How about even more problems? jimbly.github.io/regexcrossword.
TCP kernel implementation: tcp_ipv4.c tcp_v4_do_rcv(), tcp_seq_next(), tcp_seq_stop(), tcp_v4_err(),
tcp_input.c: tcp_rcv_state_process
Also regex option in gedit search box and eclipse search box
One more example of NFA state-set recognizer: aaa|aab|aac|aad
            +--a-->(3)--a-->(7)
            |
            |
            /--a-->(4)--b-->(8)
(1)--a-->(2)
            \--a-->(5)--c-->(9)
            |
            +--a-->(6)--d-->(10)
NFA to DFA
It is also possible to convert any NFA to a DFA. The catch is that if there are n states in the NFA, there might be 2^{n} states in the DFA.
Subset construction: DFA states are all sets of NFA states. Given such a set, and an input, we form the set of all states reachable on that input from any of the NFA states in the set.
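Here is a minimal sketch of the subset construction, applied to the ten-state NFA from the state-set example above (states 1 through 10, accepting states 7 through 10). It only builds the subsets actually reachable from the start state and assumes an NFA without empty transitions; the function and variable names are invented for this example.

```python
from collections import deque

def subset_construction(nfa, start, accepting):
    """nfa maps (state, symbol) to a set of next states.
    Returns (dfa, dfa_start, dfa_accepting) with frozensets as DFA states."""
    start_set = frozenset({start})
    dfa, dfa_accepting = {}, set()
    seen, queue = {start_set}, deque([start_set])
    while queue:
        state_set = queue.popleft()
        if state_set & accepting:
            dfa_accepting.add(state_set)
        symbols = {sym for (s, sym) in nfa if s in state_set}
        for sym in sorted(symbols):
            # all NFA states reachable on sym from any state in the set
            target = frozenset(t for s in state_set
                                 for t in nfa.get((s, sym), ()))
            dfa[(state_set, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa, start_set, dfa_accepting

NFA = {
    (1, "a"): {2},
    (2, "a"): {3, 4, 5, 6},
    (3, "a"): {7}, (4, "b"): {8}, (5, "c"): {9}, (6, "d"): {10},
}
dfa, dfa_start, dfa_accepting = subset_construction(NFA, 1, {7, 8, 9, 10})
for (src, sym), dst in dfa.items():
    print(sorted(src), sym, sorted(dst))
```

For this NFA only seven subsets are reachable, far fewer than the 2^{10} possible, which is typical: the exponential bound is a worst case.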
Graph of y^{2} = x^{3} + Ax + B (the (short) Weierstrass form)
What does this have to do with an ellipse?
Elliptic product a⊕b: the graphical construction over R
Adding a point at infinity
See Boneh & Shoup p 614 (of version 0.5): "The Addition Law" (toc.cryptobook.us, chapter 14 "Elliptic curve cryptography")
Note that if you have two roots r_{1} and r_{2} of a cubic ax^{3} + bx^{2} + cx + d, then the product of all three roots is -d/a, and so r_{3} = -d/(a r_{1} r_{2}).
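As a worked instance of the root trick, the sketch below finds the third intersection of a line with a curve over the rationals. The curve y^{2} = x^{3} - 7x + 10 and the points (1,2) and (3,4) are sample values chosen for this example, and the function name is made up. It assumes the two points have different x-coordinates and that neither x-coordinate is zero.

```python
from fractions import Fraction

def third_intersection(A, B, P, Q):
    """Third point where the line through P and Q meets y^2 = x^3 + A x + B."""
    (x1, y1), (x2, y2) = P, Q
    lam = Fraction(y2 - y1, x2 - x1)   # slope; assumes x1 != x2
    nu = y1 - lam * x1                 # intercept: y = lam*x + nu
    # Substituting the line into the curve gives
    #   x^3 - lam^2 x^2 + (A - 2 lam nu) x + (B - nu^2) = 0,
    # so a = 1 and d = B - nu^2, and the product of the roots is -d/a.
    d = B - nu * nu
    x3 = -d / (x1 * x2)                # assumes x1 * x2 != 0
    return x3, lam * x3 + nu

x3, y3 = third_intersection(-7, 10, (1, 2), (3, 4))
print(x3, y3)          # third intersection point
print((x3, -y3))       # reflect in the x-axis to get P ⊕ Q
```

Here the line is y = x + 1, the cubic is x^{3} - x^{2} - 9x + 9, and the roots are 1, 3 and -3, so the elliptic product of the two points is (-3, 2).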
Finite fields: graui.de/code/elliptic2.
Find the finite-field generator g (or base b)
Taking multiples of g: kg = g⊕g⊕...⊕g, k times. Repeated-squaring algorithm
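In the additive notation used here, repeated squaring becomes "double and add". The following is a minimal sketch over a toy curve, y^{2} = x^{3} + 2x + 2 over F_{17} with g = (5,1); these parameters are an assumption for illustration and are far too small for real use. None stands for the point at infinity.

```python
def ec_add(P, Q, A, p):
    """Add points on y^2 = x^3 + A x + B over F_p (B is not needed)."""
    if P is None:
        return Q
    if Q is None:
        return P
    x1, y1 = P
    x2, y2 = Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                    # P + (-P) is the point at infinity
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, p) % p      # tangent slope
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p       # chord slope
    x3 = (lam * lam - x1 - x2) % p
    y3 = (lam * (x1 - x3) - y1) % p
    return (x3, y3)

def ec_mul(k, P, A, p):
    """Compute kP by double-and-add, scanning the bits of k from the low end."""
    result, addend = None, P
    while k > 0:
        if k & 1:
            result = ec_add(result, addend, A, p)
        addend = ec_add(addend, addend, A, p)
        k >>= 1
    return result

g = (5, 1)
for k in range(1, 20):
    print(k, ec_mul(k, g, 2, 17))
```

This needs about log2(k) doublings rather than k additions. The modular inverse via pow(x, -1, p) requires Python 3.8 or later.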
Size of E(F_{p}) solution set: roughly p
Basically, for each x, half the time there are no solutions for y and half the time there are two (+y, -y). On average there is one, so total number of solutions is ~p.
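The "roughly p" estimate can be checked by brute force for small primes. This sketch counts the affine solutions of y^{2} = x^{3} + 2x + 2 (sample coefficients, an assumption) and compares the count n with p. The comparison uses Hasse's theorem, which these notes do not state: the count differs from p by at most 2√p.

```python
def count_points(A, B, p):
    """Count affine solutions of y^2 = x^3 + A x + B over F_p by brute force."""
    # roots[v] = number of y with y^2 = v: one for v = 0, two for other squares
    roots = {}
    for y in range(p):
        v = y * y % p
        roots[v] = roots.get(v, 0) + 1
    return sum(roots.get((x**3 + A * x + B) % p, 0) for x in range(p))

for p in (17, 101, 1009, 10007):
    n = count_points(2, 2, p)
    print(p, n, abs(n - p) <= 2 * p ** 0.5)
```

Adding the point at infinity gives the group size, which is n + 1.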
Montgomery form: y^{2} = x^{3} + Ax^{2} + x
Diffie-Hellman-Merkle for basic elliptic curve
For classic Diffie-Hellman-Merkle, Alice chooses an integer a<p, and Bob chooses b<p. Alice and Bob publish g^{a} and g^{b} respectively, where g is the chosen generator. If Alice wants to create a key to use for encrypting a message to Bob, she calculates (g^{b})^{a} = g^{ab}. Similarly, Bob can calculate (g^{a})^{b} = g^{ab} to decrypt. Nobody else can; you have to know either a or b.
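The classic exchange is a few lines with Python's three-argument pow. The prime 2^{127} - 1 and the base 3 below are illustrative assumptions only (3 is not checked to be a generator), and the function name is made up; real deployments use standardized groups.

```python
import secrets

def dh_demo(p, g):
    """Run one classic Diffie-Hellman-Merkle exchange; return both keys."""
    a = secrets.randbelow(p - 2) + 1      # Alice's secret, 1 <= a <= p-2
    b = secrets.randbelow(p - 2) + 1      # Bob's secret
    a_pub = pow(g, a, p)                  # Alice publishes g^a mod p
    b_pub = pow(g, b, p)                  # Bob publishes g^b mod p
    alice_key = pow(b_pub, a, p)          # (g^b)^a
    bob_key = pow(a_pub, b, p)            # (g^a)^b
    return alice_key, bob_key

alice_key, bob_key = dh_demo(2**127 - 1, 3)
print(alice_key == bob_key)
```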
For elliptic curves, Alice again chooses an integer a<p, and Bob chooses b<p. Alice and Bob publish a*g and b*g, respectively. Again, knowing g and knowing a*g does not give you a reasonable method for finding a. The rest of the mechanism works exactly as in the classic case: the shared secret is a*(b*g) = b*(a*g) = (ab)*g.
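The elliptic-curve version has the same shape, with scalar multiples in place of powers. This sketch uses a toy curve, y^{2} = x^{3} + 2x + 2 over F_{17} with g = (5,1) of order 19; all of these parameters and function names are assumptions for illustration, and the secrets are drawn below the order of g rather than below p.

```python
import secrets

P_MOD, A = 17, 2           # toy curve y^2 = x^3 + 2x + 2 over F_17 (assumed)
G, ORDER = (5, 1), 19      # base point and its order on that curve

def add(P, Q):
    """Affine point addition; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A) * pow(2 * y1, -1, P_MOD)
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    x3 = (lam * lam - x1 - x2) % P_MOD
    return (x3, (lam * (x1 - x3) - y1) % P_MOD)

def mul(k, P):
    """Double-and-add scalar multiple k*P."""
    result = None
    while k:
        if k & 1:
            result = add(result, P)
        P = add(P, P)
        k >>= 1
    return result

def ecdh_demo():
    a = secrets.randbelow(ORDER - 1) + 1     # Alice's secret, 1..18
    b = secrets.randbelow(ORDER - 1) + 1     # Bob's secret, 1..18
    a_pub, b_pub = mul(a, G), mul(b, G)      # published values a*g and b*g
    return mul(a, b_pub), mul(b, a_pub)      # a*(b*g) and b*(a*g)

print(ecdh_demo())
```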
Edwards form: x^{2} + y^{2} = 1 + Dx^{2}y^{2}. The elliptic product here does not involve cases.
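To illustrate the "no cases" remark, here is a sketch of the standard Edwards addition formula over a toy field. The parameters p = 13 and D = 2 are assumptions; D is chosen to be a non-square mod p, which is the condition under which the single formula works for every pair of points, doubling included. The identity element is (0, 1).

```python
def edwards_add(P, Q, d, p):
    """Add points on x^2 + y^2 = 1 + d x^2 y^2 over F_p with one formula."""
    x1, y1 = P
    x2, y2 = Q
    t = d * x1 * x2 * y1 * y2 % p
    x3 = (x1 * y2 + y1 * x2) * pow((1 + t) % p, -1, p) % p
    y3 = (y1 * y2 - x1 * x2) * pow((1 - t) % p, -1, p) % p
    return (x3, y3)

def on_curve(P, d, p):
    x, y = P
    return (x * x + y * y - 1 - d * x * x * y * y) % p == 0

p, d = 13, 2     # toy parameters; 2 is not a square mod 13
points = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y), d, p)]
print(points)
print(edwards_add((4, 4), (4, 4), d, p))   # doubling uses the same formula
```

Compare this with the Weierstrass addition code, which needs separate branches for the point at infinity, for P = Q, and for P = -Q.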
The prime here is p = 2^{255} - 19, which is easy to find in Python. The curve is y^{2} = x^{3} + 486662x^{2} + x.
Size of E(F_{p}) = 8q, where q is prime; q = 2**252 + 27742317777372353535851937790883648493
Basic Encryption
Use Diffie-Hellman-Merkle to choose a common secret, and then use a hash of that secret as a conventional encryption key.
Base point: (9, 14781619447589544791020593568409986887264606134616475288964881837755586237401). This has order q, above, in the group.
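The numbers above can be sanity-checked in Python. This sketch uses a Miller-Rabin test with fixed bases (a probabilistic check, not a proof of primality) and plain affine arithmetic on the Montgomery curve. It is for checking the stated facts only and is not how production code implements this curve; the function names are made up.

```python
p = 2**255 - 19
A = 486662
q = 2**252 + 27742317777372353535851937790883648493
G = (9, 14781619447589544791020593568409986887264606134616475288964881837755586237401)

def is_probable_prime(n, bases=(2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37)):
    """Miller-Rabin with fixed bases."""
    if n < 2:
        return False
    for b in bases:
        if n % b == 0:
            return n == b
    d, s = n - 1, 0
    while d % 2 == 0:
        d, s = d // 2, s + 1
    for b in bases:
        x = pow(b, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = x * x % n
            if x == n - 1:
                break
        else:
            return False
    return True

def on_curve(P):
    u, v = P
    return (v * v - (u**3 + A * u * u + u)) % p == 0

def mont_add(P, Q):
    """Affine addition on v^2 = u^3 + A u^2 + u; None is the point at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (u1, v1), (u2, v2) = P, Q
    if u1 == u2 and (v1 + v2) % p == 0:
        return None
    if P == Q:
        lam = (3 * u1 * u1 + 2 * A * u1 + 1) * pow(2 * v1, -1, p) % p
    else:
        lam = (v2 - v1) * pow((u2 - u1) % p, -1, p) % p
    u3 = (lam * lam - A - u1 - u2) % p
    return (u3, (lam * (u1 - u3) - v1) % p)

def mont_mul(k, P):
    """Double-and-add scalar multiple k*P."""
    result, addend = None, P
    while k:
        if k & 1:
            result = mont_add(result, addend)
        addend = mont_add(addend, addend)
        k >>= 1
    return result

print(is_probable_prime(p), is_probable_prime(q))
print(on_curve(G))
print(mont_mul(q, G) is None)     # q*G is the point at infinity
```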
How did I get this? RFC8032 page 21