Comp [34]49 Project 3: Cracking WEP encryption


I've created a website, cs.luc.edu/pld/courses/449/fall08/wep, that will give you the first byte of the keystream for all Initialization Vectors of the form <N, -1, X>.

You are to use this information to find the key, for two different keys. You are also to include a detailed explanation of how you went about finding the key.

When you go to the site, you provide (a) your initials, (b) the desired magic (a shorthand for choosing one of eight individual keys), and (c) an integer N. You then get a table of all first-keystream-byte values for IV's <N, -1, X>, 0≤X<256.

The strategy is to use this table for N=3...7 successively, and the Fluhrer, Mantin & Shamir method, to figure out each key byte Key[3]...Key[7] in sequence. (Recall that key bytes 3-7 are the "secret part"; bytes 0-2 are the IV.) When working on Key[j], you do not need the earlier Key[i] values with i<j for getting the table, but you do need them for computing your best guess. In other words, you do need to find the key bytes sequentially.

The Fluhrer, Mantin, and Shamir weakness

The first part of the RC4 cipher is the "key-scheduling algorithm" or "KSA": 256 transpositions of the sequence of all byte values. This is represented by an array S; the algorithm is as follows:

            for (i = 0; i<256; i++) S[i] = i;
            j = 0;
            for (i=0; i<256; i++) {
                j = (j + S[i] + Key[i % keylength]);       // keylength = 8
                swap(S[i], S[j]);                                    // another word for "swap" is "transposition"
            }

We will denote by Sn the result of the first n full iterations of the second, transposing for loop (in the notes, Sn was represented by S_n). The final version of S, S256, we will often refer to as S, with no subscript. Note that in Sn we have swapped each of S[0] ... S[n-1], each with some corresponding S[j].

It is straightforward to observe that the first byte of the keystream is S[S[1]+S[S[1]]]; we will refer to this as F(S) (F for First; note that S is an entire array of length 256); when we're looking at an actual keystream, this first byte will be denoted Out or Out0. The heart of the FMS strategy is to note that in many cases this value never changes after stage N+1, 3≤N<7, that is, F(S) = F(SN+1), and that we can often establish some linear relationship between F(S) and Key[N]. This allows us to use this first keystream byte to, in effect, solve for Key[N].

There are two steps here. The first is to establish that, with a suitable IV of the form <N,-1,X>, in most cases F(SN+1) has a clear relationship with Key[N] (eg F(SN+1) = Key[N] + constant). The second is to establish that, about 5% of the time, F(SN+1) = F(S256) = F(S); that is, none of the subsequent transpositions in the KSA have any effect on F(S) and so this clear relationship is preserved after the full KSA phase.

The net result of these two steps is that we can apply our relationship to form a guess at Key[N] for each of the 256 values of X in the IV, and about 5% of the time we will be right. This may not seem like many, but 5% of 256 is about 12, and typically none of the wrong guesses occur more than 3 or 4 times. Thus, the right guess tends to stick out very visibly when we look at the guess frequencies.

The second step is shorter, so we'll argue that first. F(S) = S[S[1]+S[S[1]]] depends on S[i] for i = 1, S[1], and (S[1]+S[S[1]]). With the right IV, all these index values will be ≤ N. During the remainder of the KSA phase, we will have i>N and so the only way a further transposition can alter one of our special index values is that, randomly, j will end up equal to one of them. For a single i, the chance of such a collision involving j is about 3/256 and the chance of a non-collision is (1 - 3/256); for 256 i (actually, there are 256-Ns values of i remaining) the chance of no collisions at all, and thus no changes in F(S), is (1 - 3/256)256, or about 5%.

Now for the first step. Let's do a specific example with IV = <3,-1,12>; remember this means these values are Key[0]...Key[2] respectively. We have S0:

012345678910111213141516171819

The KSA main loop starts with i=0, j=0+S[0]+Key[0] = 0+0+3, and so we swap S[0] and S[3] to get S1:
312045678910111213141516171819

Next, we have i=1 and j=3+S[1]+Key[1]=3+1+(-1) = 3, and so we swap S[1] and S[3] to get S2:
302145678910111213141516171819

Finally we have i=2 and j=3+S[2]+Key[2]=3+2+12=17; we swap S[2] and S[17] to get S3:
301714567891011121314151621819

So far all this has depended only on the IV (the known part of the Key). But the next KSA step involves K = Key[3], the first unknown byte. We increment i to 3, and j to some (probably far) value j3, and swap S[3] and S[j3]. Assuming S[j3] was not previously altered, it is likely that before the swap S[j3]=j3. This swap gives us S4, and it is straightforward to verify that F(S4) = S4[S4[1] + S4[S4[1]]] = S4[0+S4[0]] = S4[3]. We now note that this value, j3, is 17+1+K = 18+K, and so K = Out - 18 (or, more generally, K = Out - X - 6 for the IV <3,-1,X>).

With N=3 we found a formula relating K and Out algebraically, but even for N=4 this is tedious. So instead we let the computer do the work for us. For a given <N,-1,X>, when we know Key[0]...Key[N-1] (for N>3 this means we now know some of the secret part of the key) we can compute SN. Here are the steps:

What you actually have to do

  1. Collect the IV tables for each value of N
  2. Write a program to generate guess(N,Out,X), as above. Note that the guess will also depend on the key up to that point, and as X will be a part of that key it may not appear as an explicit parameter.

    The rc4.java file contains a method readvalues(String filename) that takes a filename of a file such as my IV calculator generates, and returns a byte [] consisting of the IV values (second entries of each line). Thus, java I/O should not be a problem!

  3. Figure out which value of guess occurs most often. It shouldn't even be close. Use that as Key[N], and go on to N+1.
  4. Write up an analysis, including what you did, and just how many times your guess value appeared. Submit it along with all programs you wrote for this.