Project 3: cracking WEP encryption

Comp [34]49 Project 3: Cracking WEP encryption

I've created a website, cs.luc.edu/pld/courses/449/fall08/wep, that will give you the first byte of the keystream for all Initialization Vectors of the form <N, -1, X>.

You are to use this information to find the key, for two different keys. You are also to include a detailed explanation of how you went about finding the key.

When you go to the site, you provide (a) your initials, (b) the desired magic (a shorthand for choosing one of eight individual keys), and (c) an integer N. You then get a table of all first-keystream-byte values for IV's <N, -1, X>, 0≤X<256.

The strategy is to use this table for N=3...7 successively, and the Fluhrer, Mantin & Shamir method, to figure out each key byte Key[3]...Key[7] in sequence. (Recall that key bytes 3-7 are the "secret part"; bytes 0-2 are the IV.) When working on Key[j], you do not need the earlier Key[i] values with i<j for getting the table, but you do need them for computing your best guess. In other words, you do need to find the key bytes sequentially.

The Fluhrer, Mantin, and Shamir weakness

The first part of the RC4 cipher is the "key-scheduling algorithm" or "KSA": 256 transpositions of the sequence of all byte values. This is represented by an array S; the algorithm is as follows:

            for (i = 0; i<256; i++) S[i] = i;
            j = 0;
            for (i=0; i<256; i++) {
                j = (j + S[i] + Key[i % keylength]);    // keylength = 8
                swap(S[i], S[j]);                                    // another word for "swap" is "transposition"
            }

We will denote by S_n the result of the first n full iterations of the second, transposing for loop (in the notes, S_n was represented by S_n). The final version of S, S₂₅₆, we will often refer to as S, with no subscript. Note that in S_n we have swapped each of S[0] ... S[n-1], each with some corresponding S[j].

It is straightforward to observe that the first byte of the keystream is S[S[1]+S[S[1]]]; we will refer to this as F(S) (F for First; note that S is an entire array of length 256); when we're looking at an actual keystream, this first byte will be denoted Out or Out₀. The heart of the FMS strategy is to note that in many cases this value never changes after stage N+1, 3≤N<7, that is, F(S) = F(S_N+1), and that we can often establish some linear relationship between F(S) and Key[N]. This allows us to use this first keystream byte to, in effect, solve for Key[N].

There are two steps here. The first is to establish that, with a suitable IV of the form <N,-1,X>, in most cases F(S_N+1) has a clear relationship with Key[N] (eg F(S_N+1) = Key[N] + constant). The second is to establish that, about 5% of the time, F(S_N+1) = F(S₂₅₆) = F(S); that is, none of the subsequent transpositions in the KSA have any effect on F(S) and so this clear relationship is preserved after the full KSA phase.

The net result of these two steps is that we can apply our relationship to form a guess at Key[N] for each of the 256 values of X in the IV, and about 5% of the time we will be right. This may not seem like many, but 5% of 256 is about 12, and typically none of the wrong guesses occur more than 3 or 4 times. Thus, the right guess tends to stick out very visibly when we look at the guess frequencies.

The second step is shorter, so we'll argue that first. F(S) = S[S[1]+S[S[1]]] depends on S[i] for i = 1, S[1], and (S[1]+S[S[1]]). With the right IV, all these index values will be ≤ N. During the remainder of the KSA phase, we will have i>N and so the only way a further transposition can alter one of our special index values is that, randomly, j will end up equal to one of them. For a single i, the chance of such a collision involving j is about 3/256 and the chance of a non-collision is (1 - 3/256); for 256 i (actually, there are 256-Ns values of i remaining) the chance of no collisions at all, and thus no changes in F(S), is (1 - 3/256)²⁵⁶, or about 5%.

Now for the first step. Let's do a specific example with IV = <3,-1,12>; remember this means these values are Key[0]...Key[2] respectively. We have S₀:

The KSA main loop starts with i=0, j=0+S[0]+Key[0] = 0+0+3, and so we swap S[0] and S[3] to get S₁:

Next, we have i=1 and j=3+S[1]+Key[1]=3+1+(-1) = 3, and so we swap S[1] and S[3] to get S₂:

Finally we have i=2 and j=3+S[2]+Key[2]=3+2+12=17; we swap S[2] and S[17] to get S₃:

So far all this has depended only on the IV (the known part of the Key). But the next KSA step involves K = Key[3], the first unknown byte. We increment i to 3, and j to some (probably far) value j₃, and swap S[3] and S[j₃]. Assuming S[j₃] was not previously altered, it is likely that before the swap S[j₃]=j₃. This swap gives us S₄, and it is straightforward to verify that F(S₄) = S₄[S₄[1] + S₄[S₄[1]]] = S₄[0+S₄[0]] = S₄[3]. We now note that this value, j₃, is 17+1+K = 18+K, and so K = Out - 18 (or, more generally, K = Out - X - 6 for the IV <3,-1,X>).

With N=3 we found a formula relating K and Out algebraically, but even for N=4 this is tedious. So instead we let the computer do the work for us. For a given <N,-1,X>, when we know Key[0]...Key[N-1] (for N>3 this means we now know some of the secret part of the key) we can compute S_N. Here are the steps:

Compute S_N and j_N, the value of j after N rounds of the KSA loop. Using the Java rc4 library provided, we can do this with
rc4 r = new rc4(); r.ksa(N,Key);
We have j_N as r.the_j(). At this point we have changed S_N[0] through S_N[N-1]; it is likely that S_N[N] has not yet been changed; ie it is still equal to N (though this does not in fact matter). It will also likely be the case that S_N[0]=N and S_N[1]=0, and so F(S_N)=S_N[N].
The next step will exchange S_N[N] and S_N[j], for j = j_N + S_N[N] + Key[N]. Note that j_N = r.the_j() and S_N[N] = r.S(N).
It is now 5% likely that all the remaining changes in S will not affect S[0], S[1], or S[N], and thus not change F(S). If we "win" this 5% bet -- that is, if we're lucky for this particular X -- we have
F(S) = Out = S[N] = S_N+1[N] = S_N[j]
It is likely that S_N[j] = j, but we don't have to assume this; the permutation S_N has an inverse and so we can write j = S_N^-1[Out], or, using the Java notation, j = r.inverse(Out).
Now we have
S_N^-1[Out] = j = j_N + S_N[N] + Key[N]
ie
Key[N] = S_N^-1[Out] - j_N - S_N[N]
This is your guess, in terms of N, Out, X, and your calculations. The value of X here is embedded in the Key.
Note that all three terms on the righthand side above are easily expressible in terms of the Java object r.

What you actually have to do

Collect the IV tables for each value of N
Write a program to generate guess(N,Out,X), as above. Note that the guess will also depend on the key up to that point, and as X will be a part of that key it may not appear as an explicit parameter.
The rc4.java file contains a method readvalues(String filename) that takes a filename of a file such as my IV calculator generates, and returns a byte [] consisting of the IV values (second entries of each line). Thus, java I/O should not be a problem!
Figure out which value of guess occurs most often. It shouldn't even be close. Use that as Key[N], and go on to N+1.
Write up an analysis, including what you did, and just how many times your guess value appeared. Submit it along with all programs you wrote for this.