Week 11, Comp [34]49, Nov 6

finish DSSS
brief CDMA example
Barker codes
LLC header
safety
What Went Wrong With WEP

DSSS Autocorrelation	(just started at end of week 10)
====================

	Shift the m-sequence by amount tau, cyclically
	(so if i >= N-tau, where N = 2^n-1, we shift the ith element to position i-(N-tau)).
	
	Count how many places the original and the shifted sequence are the same.
	Count how many places they're different.
	For tau != 0, the two counts differ by exactly 1 (one more disagreement than agreement).

	replace 0's with -1's
	Now calculate R(tau) as:
		1/N * (B dot (B shifted by tau))
	This gives R(0) = 1, and R(tau) = -1/N for every other shift.
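The calculation above can be tried on a small case. This is a minimal sketch, assuming a 3-bit LFSR with recurrence s[k+3] = s[k+1] XOR s[k] and seed 1,0,0 (so N = 7); the function names are made up for illustration.

```python
def m_sequence():
    """One period (N = 7) of an m-sequence from an assumed 3-bit LFSR."""
    s = [1, 0, 0]                      # assumed seed; any nonzero seed works
    while len(s) < 7:
        s.append(s[-2] ^ s[-3])        # s[k+3] = s[k+1] XOR s[k]
    return s

def autocorrelation(bits, tau):
    """R(tau): replace 0 by -1, shift cyclically by tau, dot, divide by N."""
    n_chips = len(bits)
    b = [1 if x else -1 for x in bits]
    shifted = [b[(i - tau) % n_chips] for i in range(n_chips)]
    return sum(x * y for x, y in zip(b, shifted)) / n_chips

seq = m_sequence()
for tau in range(7):
    print(tau, autocorrelation(seq, tau))
```

R(0) comes out as 1 and every other shift gives -1/7, which is the "counts differ by 1" property divided by N.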
		
What correlation means: a measure of relative randomness.
This is good news for CDMA-type sequences: everyone's sequence is shifted by 1 place
from everyone else's. Cross-correlation is *close* to zero.

Actually, one usually tries to do better; Walsh codes give correlations
of zero.

For DSSS, what we're really interested in is low correlation with noise.
m-sequences do a good job of giving us that. 


======================================================

CDMA: everyone transmits overlapping signals, 
BUT we can separate them out with linear algebra.

we take a bit, and spread it with a spreading code.
Each user's code is "orthogonal" when 0 is replaced by -1:

Stallings example:

A: 100101	+--+-+		
B: 110011	++--++
C: 110110	++-++-

Note that A and B are orthogonal in this sense (A dot B = 0), as are A and C,
but *not* quite B and C (B dot C = 2).

How about a fully-orthogonal example?

A	1 1 1 1		(+1 +1 +1 +1)		Examples omitted Week 10
B	1 1 0 0		(+1 +1 -1 -1)
C	1 0 1 0		(+1 -1 +1 -1)

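Let a, b, c be +/- 1, the bits that each sends. A sends aA, B sends bB, C sends cC.

Signal as received is aA + bB + cC.
Since A dot A = 4 and A dot B = A dot C = 0, the receiver recovers a as (signal dot A)/4;
b and c are recovered the same way with B and C.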
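Here is a small sketch of the separation step using the fully-orthogonal codes A, B, C above. The helper names (dot, transmit, decode) are invented for this example.

```python
A = [+1, +1, +1, +1]
B = [+1, +1, -1, -1]
C = [+1, -1, +1, -1]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def transmit(a, b, c):
    """Superimposed signal aA + bB + cC, chip by chip."""
    return [a * x + b * y + c * z for x, y, z in zip(A, B, C)]

def decode(signal, code):
    """Recover one sender's bit (+1 or -1) by dotting with that sender's code."""
    return dot(signal, code) // len(code)

signal = transmit(+1, -1, +1)
print(signal)                                           # the overlapped chips
print(decode(signal, A), decode(signal, B), decode(signal, C))
```

The other two senders drop out of each dot product because their codes are orthogonal to the one being used.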

=========================================================================
=========================================================================

Stallings Section 14.4

Table 14.5: note 802.11b uses DSSS; 802.11a uses OFDM; 
802.11g uses OFDM except in backward-compatibility mode with 802.11b

Barker sequence: "all autocorrelation values |R(tau)| <= 1".
Except that is trivial; what Stallings really means is that if you 
do the shift NONcyclically and then take the dot-product of the overlap,
WITHOUT DIVIDING BY N, you get -1, 0, or 1.

	N=7: 	+++--+-
	shift	 +++--+-
	corr	 ++-+--
	
		+++--+-
		  +++--+-
		  +---+
		
		+++--+-
		   +++--+-
		   --++
		   
		+++--+-
		    +++--+-
		    -+-
		
		etc  
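The shift-and-overlap calculation above can be done for every shift at once. A minimal sketch for the N=7 sequence; the function name is made up.

```python
BARKER7 = [+1, +1, +1, -1, -1, +1, -1]    # +++--+-

def aperiodic_correlation(seq, shift):
    """NONcyclic shift: dot product over the overlap only, not divided by N."""
    overlap = zip(seq[shift:], seq[:len(seq) - shift])
    return sum(x * y for x, y in overlap)

for shift in range(len(BARKER7)):
    print(shift, aperiodic_correlation(BARKER7, shift))
```

Shift 0 gives N = 7; every other shift gives 0 or -1, which is the Barker property.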

See http://mathworld.wolfram.com/BarkerCode.html

Barker property is preserved by negation, alternation, and reversal.

If a Barker sequence multipath-overlaps itself, the overlap zone is well-behaved.

Barker sequences are used for DSSS in 802.11b (the 11-chip Barker sequence, at the 1 and 2 Mbps rates).

802.11 FHSS: defined, but not used by 802.11a, b, g, or n.
(may be used by InfraRed)

802.11a: p 447, Table 14.6
 -- OFDM
 -- different modulation techniques
 -- convolutional codes
 

Received: 
	N-bit Barker sequence for main bit
	*shifted* piece of sequence, as multipath distortion
	
Now dot-product the Barker sequence with the input.
sequence dot itself = N
|sequence dot shifted_piece| <= 1


==============================================================================
==============================================================================

Between the MAC header and the IP header

	Ethernet: TYPE field
	802.11: no type field; what comes next? Why do we think it is IP?

The missing piece is the "LLC" header; see Stallings 14.1

Three alternative services:
	unacknowledged connectionless service
	connection-mode service (mini-tcp)
	acknowledged connectionless service
	
IP only uses the first of these. Pretty much EVERYBODY only uses the first of these.

Header:

	+-------+-------+-------+-------+
	| DSAP  | SSAP  |  LLC control  |
	+-------+-------+-------+-------+
	
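LLC control can be one byte (as it is for the unacknowledged connectionless service).
Common value for DSAP, SSAP, control: hex AA AA 03, which means a SNAP extension follows.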

Total space: typically 8 bytes, including SNAP extension (SubNetwork Access Protocol)

The issue: we know the first 2-3 bytes of every wireless packet body!
If there were no LLC header, the first two bytes would be hex 45 00,
from IPv4 (version 4, header length 5 words, then a zero type-of-service byte)
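As a sketch of what those 8 bytes look like, here is a parser for the LLC + SNAP header. The layout assumed for the SNAP part (3-byte OUI, then a 2-byte Ethernet-style type field) is not spelled out above, and parse_llc_snap is a made-up name.

```python
import struct

def parse_llc_snap(body):
    """Return (oui, ethertype) if body starts with an AA AA 03 LLC/SNAP header."""
    dsap, ssap, control = struct.unpack("!BBB", body[:3])
    if (dsap, ssap, control) != (0xAA, 0xAA, 0x03):
        return None                       # not the common SNAP case
    oui = body[3:6]                       # assumed: 3-byte organization code
    (ethertype,) = struct.unpack("!H", body[6:8])
    return oui, ethertype

header = bytes.fromhex("aaaa030000000800")    # 0x0800 is the type for IPv4
print(parse_llc_snap(header + b"\x45\x00"))
```

This is where the "type field" that 802.11 seems to lack actually lives, and it is also why the first bytes of every packet body are predictable.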

==============================================================================

FCC and wi-fi radiation safety

Existing FCC radiation-safety levels for the ISM band are 1 mW/cm^2.
("FCC limits for maximum permissible exposure (MPE)", OET Bulletin 56, 1999)

A wi-fi router is limited to 1 watt total output power.

At distance r from the antenna, the power per unit area is 1 watt / (4*pi*r^2),
where 4*pi*r^2 is the area of a sphere of radius r (we are assuming an
isotropic radiation distribution).

Solving 1 mW/cm^2 = 1000 mW/(4*pi*r^2), we get r^2 = 80 cm^2, or r = 9 cm.

At a more typical distance of 1 yard (90 cm), the power density is
10 microwatts/cm^2 = 0.01 mW/cm^2
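The arithmetic above, as a quick check. This assumes the same isotropic 1-watt source; the function names are invented here.

```python
import math

def power_density_mw_per_cm2(power_mw, r_cm):
    """Power spread evenly over a sphere of radius r_cm."""
    return power_mw / (4 * math.pi * r_cm ** 2)

def distance_for_density_cm(power_mw, limit_mw_per_cm2):
    """Radius at which the density falls to the given limit."""
    return math.sqrt(power_mw / (4 * math.pi * limit_mw_per_cm2))

print(distance_for_density_cm(1000, 1.0))      # distance where we hit 1 mW/cm^2
print(power_density_mw_per_cm2(1000, 90))      # density at about 1 yard
```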
 
==============================================================================
==============================================================================
==============================================================================

What Went Wrong With WEP

3 main problems:
	brute-force attacks (including weak-key attacks)
	poor-implementation attacks
	fluhrer-mantin-shamir attack

A big part of the story appears to be the selection of good cryptographic
algorithms, followed by poor implementation choices, probably due to a lack
of understanding of the IMPORTANCE of implementation details.

	see postel.org/postel.html

Basics of "symmetric-key cipher": each side has a secret key

Block ciphers

Basics of xor-style "stream cipher": Key is used to seed a PRNG, 
	whose output is the "keystream". Messages are simply XORed with the keystream.
	
	Critical issue: NEVER REUSE A KEYSTREAM!
	
	If we have ciphertexts c1 and c2, each encrypted with keystream ks,
	from messages m1 and m2, so c1 = ks xor m1 and c2 = ks xor m2,
	then c1 xor c2 = (ks xor m1) xor (ks xor m2) = (ks xor ks) xor (m1 xor m2)
	= m1 xor m2
	Now, there are standard attack strategies that often let us guess
	m1 or m2 (and thus both), particularly if we KNOW something about either.
	
	Of course, if we KNOW a message and its ciphertext (ie m1 and c1), 
	this completely gives the keystream away!
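	A short demonstration of both points. The keystream here is just an
	arbitrary fixed byte string standing in for PRNG output, and the two
	messages are made up.

```python
def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

ks = bytes.fromhex("3a91c47e05b8d26f19e0")   # stand-in keystream, 10 bytes
m1 = b"attack now"
m2 = b"retreat!!!"

c1 = xor_bytes(m1, ks)
c2 = xor_bytes(m2, ks)          # same keystream reused: the mistake

# The keystream cancels out of c1 xor c2.
print(xor_bytes(c1, c2) == xor_bytes(m1, m2))

# Knowing m1 and c1 gives the keystream, and then m2.
recovered_ks = xor_bytes(c1, m1)
print(xor_bytes(c2, recovered_ks))
```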

	LFSR from Chapter 8: PRNG is not really "strong" enough 
	(though variants on this idea *are* used in cryptography)

==============
	
Key management: weakness of all symmetric-key ciphers.
How do you change to a new key?

	Diffie-Hellman Key Exchange: 
	Both sides agree publicly to a prime p and a base g (typically g=2 or 5)
	p must have more bits than the desired keylength.
	Alice picks a, and sends Bob g^a mod p. Alice does *not* share a.
	Bob picks b, and sends Alice g^b mod p. Bob does *not* share b.
	Both sides can now compute k = g^(ab) mod p = (g^a)^b mod p = (g^b)^a mod p
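	A toy run of the exchange. The numbers (p=23, g=5, a=4, b=3) are chosen
	only so the arithmetic is easy to follow; a real p would be far larger.

```python
p, g = 23, 5          # public: toy-sized prime and base
a = 4                 # Alice's secret
b = 3                 # Bob's secret

A = pow(g, a, p)      # Alice sends g^a mod p
B = pow(g, b, p)      # Bob sends g^b mod p

k_alice = pow(B, a, p)    # (g^b)^a mod p
k_bob = pow(A, b, p)      # (g^a)^b mod p
print(A, B, k_alice, k_bob)
```

	An eavesdropper sees p, g, A and B, but neither a nor b.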
	
However, Diffie-Hellman is vulnerable to a man-in-the-middle attack.

What about negotiating a SESSION KEY? So that the primary key is exposed
as little as possible? 
	
	Typical strategy: 
		* both sides do DH key exchange, using the secret key SKEY
		  to "sign" messages: append md5(msg ^ SKEY)
		* now use the DH key as the "session key"
		
	Alternatively, simply use SKEY to encrypt the session key and send 
	to the other party.

WEP DID NOT DO THIS

In fact, under most implementations of WEP, the secret key is GLOBAL:
used by ALL stations (and all traffic).

Updating the key is a MESS.

Perhaps the biggest reason for this is that the IEEE didn't want to
get involved in key-management algorithms. 

================

Another problem:

Every WEP packet is XORed with the START of the keystream;
that is, the keystream is "restarted" with each packet.

One reason for not continuing the keystream is that some packets may be lost;
how would we know where in the keystream a given packet was?

	One fix would have been to specify a "keystream position" field
	in the unencrypted part of the header. This does NOT give anything away.

What WEP *did* do was to divide the 8-byte key into:
	the IV (the "nonce") (3 bytes)
	the secret part (5 bytes/ 40 bits)

The theory is that the IV changes "often"; supposedly it is different for each packet.
Problems:
	* some implementations don't do this well.
	  One implementation alternates between two different IVs.
	  Only someone who had NO idea of the crypto consequences of that
	  could have made that choice!
	* Most PCMCIA cards restart the IV counter from 0 each time the card
	  is reinitialized (eg on each reboot, or each wakening from sleep)
	* Even if we choose random IVs, duplicates are likely after ~5000 packets.
	* There's only 24 bits anyway; it wraps around SOON ENOUGH.
	  500 packets/sec => wrap in 2^24/500 ~ 33,500 sec ~ 560 min ~ 9 hours
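The last two numbers can be checked with the usual birthday approximation, P(duplicate) ~ 1 - exp(-k(k-1)/(2N)) for k random draws from N values. A sketch; the function names are made up.

```python
import math

IV_SPACE = 2 ** 24

def collision_probability(packets, space=IV_SPACE):
    """Approximate chance that some IV repeats among `packets` random IVs."""
    return 1 - math.exp(-packets * (packets - 1) / (2 * space))

def wrap_time_hours(packets_per_sec, space=IV_SPACE):
    """Time for a sequential IV counter to run through every value."""
    return space / packets_per_sec / 3600

print(collision_probability(5000))
print(wrap_time_hours(500))
```

With random IVs the chance of a repeat passes one half at around 5000 packets, long before the counter would wrap.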

Given that WEP encryption is in use, here are some other ways to get the plaintext
(from Borisov)

1. Look for packets that probably contain "Password:", or "HTTP GET"

2. Send the station an email; we know the plaintext! The AP will encrypt it.
Or entice someone to visit our web page.

3. Some APs send out broadcast packets in both encrypted and unencrypted form,
if authentication is not mandatory. (Think about why.)
We just use the wired part of the network to generate the packets

4. It is sometimes possible to modify packets. If we can eavesdrop on some
packets, and then MODIFY them so they're sent to us, and then transmit
the modified packet, we'll get the AP to decrypt for us.

5. Building a "rainbow table" of all IV/keystream pairs (2^24 * 1500 bytes,
about 25 GB) is feasible.

=====================================================

WEP details: IV + key choice
LLC header starts with 0xAA, so 1st byte of plaintext is known;
XORing it with the 1st byte of ciphertext gives the 1st byte of keystream

WEP authentication:
	STA: 	send request
	AP:	send random message m1
	STA:	send c1 = e(m1) = m1 XOR ks(KEY), ks(KEY) = keystream of KEY

In theory, STA has now authenticated to AP that it *has* KEY.

HOWEVER:
Suppose we eavesdrop on this. We know m1 and c1. 
But then ks(KEY) = c1 XOR m1!

Now we try to authenticate ourself. AP sends us m2.
We send back c2 = c1 XOR (m1 XOR m2) = (c1 XOR m1) XOR m2 = ks(KEY) XOR m2

We're authenticated!
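The exchange above, simulated. This is a simplified sketch: the keystream is a random stand-in for ks(KEY), and the IV and checksum carried in a real response are left out. All names are invented for the example.

```python
import random

def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

rng = random.Random(0)

def rand_bytes(n):
    return bytes(rng.randrange(256) for _ in range(n))

keystream = rand_bytes(16)        # stand-in for ks(KEY); the attacker never reads this

def ap_verify(challenge, response):
    """AP accepts if the response is the challenge XORed with ks(KEY)."""
    return response == xor_bytes(challenge, keystream)

# Eavesdropped exchange between AP and a legitimate STA.
m1 = rand_bytes(16)
c1 = xor_bytes(m1, keystream)

# Attacker: recover the keystream, then answer a fresh challenge m2.
recovered = xor_bytes(c1, m1)
m2 = rand_bytes(16)
c2 = xor_bytes(recovered, m2)
print(ap_verify(m2, c2))
```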

Note that we SHOULD NOT be allowed to reuse the IV like that.
But will the AP stop us? 

No.

Because if it did, it would break 802.11-compatible clients,
and users would complain.

How could you fix this? Have station send back md5(m1 ^ c1) or something.

=====================================================
	

WEP is based on RC-4 (also used by SSL)

History of RC-4

In the beginning, there was DES.
Designed by IBM, and approved by the NSA, in 1976.
The National Bureau of Standards (now NIST) is the agency that actually approved it,
but the NSA pulled the strings. 
The NSA *did* make some suggestions to IBM regarding the so-called "S-boxes",
but it does NOT appear likely that they used this to introduce a weakness.

56-bit-key
block cipher (though it can also be used in "stream mode")

Basic discussion of these two strategies

RC-2: Rivest Cipher 2: cooked up by Ron Rivest of RSA Security Inc in 1987
for Lotus Notes, so they could get an export license. RC-2 has a variable-length
key; the export license was for a 32-bit key.
Officially it is a block cipher

RC4: Also 1987; more explicitly a stream cipher. In principle, 
RC-4 is faster in software. RC-4 has *no* bit-tweaking operations.

RC-4 is used by SSL

outline of RC4
==============

	key-scheduling step
	    for (i = 0; i < 256; i++) S[i] = i;
	    j = 0;
	    for (i = 0; i < 256; i++) {
	        j = (j + S[i] + key[i % keylength]) % 256;
	        swap(S[i], S[j]);
	    }
	
	byte-generation step (start with i=j=0)
	    i = (i + 1) % 256;
	    j = (j + S[i]) % 256;
	    swap(S[i], S[j]);
	    return S[ (S[i] + S[j]) % 256 ];
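For experimenting, here is the same outline as a runnable sketch. The function names are made up; key and data are byte strings, and all arithmetic is mod 256.

```python
def rc4_keystream(key, length):
    # key-scheduling step
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # byte-generation step, starting with i = j = 0
    i = j = 0
    out = bytearray()
    for _ in range(length):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return bytes(out)

def rc4_crypt(key, data):
    """Encrypt or decrypt: XOR the data with the keystream."""
    ks = rc4_keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

ciphertext = rc4_crypt(b"Key", b"Plaintext")
print(ciphertext.hex())
print(rc4_crypt(b"Key", ciphertext))
```

Encryption and decryption are the same operation, as with any XOR stream cipher.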

Do example, 1st 5 steps with KEY = 3, -1, 5, 1, 2 (here -1 means 255, i.e. -1 mod 256)
	
Fluhrer, Mantin, & Shamir WEP vulnerability: 
The very first byte of the keystream gives us some information!
Each packet is encrypted "from scratch" in WEP, and the first byte of
data payload is known, so we get LOTS of first bytes of the keystream!


==============================================================================

FMS attack

	What the initial three steps of the key-scheduler do
	
	IV: <3,-1,X>: if 1st byte of keystream is Y, then we guess Y-X-6 (mod 256)
	for the first byte of the secret part of the key.
	What are the odds that the packet won't be further messed up beyond 
	the first 3 stages?
		255 chances
		odds of one chance messing things up: 3/255
		P(no messup) = (1-3/255)^255 ~ e^-3 ~ 0.05
	
	4-step version

	General version: precompute S_(n-1)
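	The guess Y-X-6 can be tested by simulation. This is a sketch only: it
	assumes an 8-byte key made of the IV <3,255,X> followed by 5 random secret
	bytes, and the function names are invented. The `steps` parameter lets us
	stop the key-scheduler early to see the "undisturbed" case.

```python
import random

def ksa(key, steps=256):
    """RC4 key-scheduling, optionally stopped after `steps` iterations."""
    S = list(range(256))
    j = 0
    for i in range(steps):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    return S

def first_keystream_byte(S):
    S = list(S)                 # work on a copy
    i = 1
    j = S[1]                    # j = 0 + S[1]
    S[i], S[j] = S[j], S[i]
    return S[(S[i] + S[j]) % 256]

def fms_guess(x, y):
    return (y - x - 6) % 256

def success_rate(trials, seed=0):
    """Fraction of random (X, secret) pairs where the guess is right."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        secret = [rng.randrange(256) for _ in range(5)]
        x = rng.randrange(256)
        y = first_keystream_byte(ksa([3, 255, x] + secret))
        hits += fms_guess(x, y) == secret[0]
    return hits / trials

# Stopping after 4 key-scheduling steps, the guess is exact (here: 66).
print(fms_guess(10, first_keystream_byte(ksa([3, 255, 10, 66, 1, 2, 3, 4], steps=4))))
# With the full key-scheduler, it survives only around 5% of the time.
print(success_rate(2000))
```

	A 5% hit rate is enough: wrong guesses scatter over all 256 values, so
	with enough packets the right byte gets the most votes.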
	
==============================================================================

Message modification (BGW)

The theory is that there's nothing a bad guy can do to modify the message
and preserve the CRC checksum. 
But CRC isn't strong enough; this is in fact trivial!
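Why it is trivial: for equal-length strings, crc(a xor b) = crc(a) xor crc(b) xor crc(zeros), so the checksum can be patched without knowing the keystream. A sketch using Python's zlib.crc32 as the CRC; the packet layout (message followed by a 4-byte little-endian checksum, all XORed with a stand-in keystream) and the names are simplifications for illustration.

```python
import struct
import zlib

def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))

def icv(data):
    """4-byte CRC-32 checksum of data."""
    return struct.pack("<I", zlib.crc32(data))

def wep_like_encrypt(msg, ks):
    return xor_bytes(msg + icv(msg), ks)

def flip(ciphertext, delta):
    """Flip the message bits in delta and patch the encrypted checksum to match."""
    zeros = bytes(len(delta))
    patch = delta + xor_bytes(icv(delta), icv(zeros))
    return xor_bytes(ciphertext, patch)

ks = bytes((17 * i + 5) % 256 for i in range(20))    # stand-in keystream
msg = b"pay 100 to alice"
target = b"pay 900 to alice"

c = wep_like_encrypt(msg, ks)
c_mod = flip(c, xor_bytes(msg, target))    # flip() never sees ks

plain = xor_bytes(c_mod, ks)               # what the receiver decrypts
print(plain[:-4], plain[-4:] == icv(plain[:-4]))
```

The receiver's checksum test passes on the altered message.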

IP-address modification
XORing to modify the address is trivial.
What turns out to be nontrivial is updating the IP-header checksum.
We know what fudge factor to ADD, but we don't know what to XOR.