Comp 343/443    

Fall 2011, LT 412, Tuesday 4:15-6:45
Week 9, Nov 8

Read:

    Ch 3, sections 1, 2, 3
    Ch 5, sections 1 (UDP) and 2 (TCP)




TCP issues

(common to any transport protocol, actually)
    
Final-ACK problem: what if the final ACK is lost? The other side will resend its final FIN, but there will be no one left to answer! This is solved with the TIMEWAIT state.

Old late duplicates problem: Suppose a connection between the same pair of ports is closed and promptly reopened. Sometime during the first connection, a packet is delayed (and retransmitted). It finally arrives during the second connection, at just the right moment that its sequence number fits into the receive window of the receiver. (Example: ISN1 = 0, delayed packet seq number = 8000, ISN2 = 5000, receiver is expecting relative sequence number of 3000 when the old packet arrives.)
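
The example numbers above can be checked with a small sketch. The window size of 4096 bytes and the helper name `in_window` are assumptions made only for illustration; the point is that the receiver's test is purely arithmetic on sequence numbers, so an old packet that happens to land in the window is indistinguishable from a new one.

```python
MOD = 2**32  # TCP sequence numbers are 32-bit and wrap around

def in_window(seq, next_expected, window):
    """True if seq lies in [next_expected, next_expected + window), mod 2**32."""
    return (seq - next_expected) % MOD < window

ISN1 = 0
ISN2 = 5000
old_seq = ISN1 + 8000          # delayed packet from the first connection
next_expected = ISN2 + 3000    # second connection: relative seq 3000

# The old packet fits the new connection's receive window (window size assumed).
print(in_window(old_seq, next_expected, 4096))
```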

TIMEWAIT to the rescue again!
 
What a connection is: machine state at each endpoint
TCP should handle:
 
     
ISN rationale 1: old late duplicates
       
ISN rationale 2: distinguishing new SYN from dup SYN
 
From Dalal & Sunshine's original paper on the TCP 3-way handshake:
     
2-way handshake: can't confirm both ISNs
             
4-way handshake:
            1    --SYN->
            2    <-ACK--
            3    <-SYN--
            4    --ACK->
This FAILS if the first SYN is a very old duplicate! The ACK on line 2 is ignored by its receiver. The LHS thinks the SYN on line 3 is a new request, and so it ACKs it. It would then send its own SYN (on what would be line 5), but that SYN would be ignored. At this point A and B have different notions of ISNA.
         
3-way handshake
 


TCP state diagram

         
 
 TCP State Diagram
 
                                    
                              +---------+ ---------\      active OPEN  
                              |  CLOSED |            \    -----------  
                              +---------+<---------\   \   create TCB  
                                |     ^              \   \  snd SYN    
                   passive OPEN |     |   CLOSE        \   \           
                   ------------ |     | ----------       \   \         
                    create TCB  |     | delete TCB         \   \       
                                V     |                      \   \     
                              +---------+            CLOSE    |    \   
                              |  LISTEN |          ---------- |     |  
                              +---------+          delete TCB |     |  
                   rcv SYN      |     |     SEND              |     |  
                  -----------   |     |    -------            |     V  
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------                  
   |                  x         |     |     snd ACK                    
   |                            V     V                                
   |  CLOSE                   +---------+                              
   | -------                  |  ESTAB  |                              
   | snd FIN                  +---------+                              
   |                   CLOSE    |     |    rcv FIN                     
   V                  -------   |     |    -------                     
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |  
   | --------------   snd ACK   |                           ------- |  
   V        x                   V                           snd FIN V  
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |  
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |  
   |  -------              x       V    ------------        x       V  
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+
 
                      TCP Connection State Diagram
                               Figure 6.
 
    half open
    simultaneous open
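
One way to get a feel for the diagram is to transcribe its arrows into a table and walk through it. The sketch below is only that transcription; the event strings, the underscore state names and the `run` helper are naming choices made here, not part of any TCP API. It traces a simultaneous open and an active close.

```python
# (state, event) -> (next state, action), copied from the arrows in the diagram
TRANSITIONS = {
    ("CLOSED", "passive OPEN"): ("LISTEN", None),
    ("CLOSED", "active OPEN"): ("SYN_SENT", "snd SYN"),
    ("LISTEN", "rcv SYN"): ("SYN_RCVD", "snd SYN,ACK"),
    ("LISTEN", "SEND"): ("SYN_SENT", "snd SYN"),
    ("LISTEN", "CLOSE"): ("CLOSED", None),
    ("SYN_SENT", "rcv SYN"): ("SYN_RCVD", "snd ACK"),
    ("SYN_SENT", "rcv SYN,ACK"): ("ESTAB", "snd ACK"),
    ("SYN_SENT", "CLOSE"): ("CLOSED", None),
    ("SYN_RCVD", "rcv ACK of SYN"): ("ESTAB", None),
    ("SYN_RCVD", "CLOSE"): ("FIN_WAIT_1", "snd FIN"),
    ("ESTAB", "CLOSE"): ("FIN_WAIT_1", "snd FIN"),
    ("ESTAB", "rcv FIN"): ("CLOSE_WAIT", "snd ACK"),
    ("FIN_WAIT_1", "rcv ACK of FIN"): ("FIN_WAIT_2", None),
    ("FIN_WAIT_1", "rcv FIN"): ("CLOSING", "snd ACK"),
    ("FIN_WAIT_2", "rcv FIN"): ("TIME_WAIT", "snd ACK"),
    ("CLOSING", "rcv ACK of FIN"): ("TIME_WAIT", None),
    ("CLOSE_WAIT", "CLOSE"): ("LAST_ACK", "snd FIN"),
    ("LAST_ACK", "rcv ACK of FIN"): ("CLOSED", None),
    ("TIME_WAIT", "timeout 2MSL"): ("CLOSED", None),
}

def run(state, events):
    """Return the list of states visited, starting from state."""
    path = [state]
    for ev in events:
        state, _action = TRANSITIONS[(state, ev)]
        path.append(state)
    return path

# Simultaneous open: both sides send a SYN before seeing the other's.
print(run("CLOSED", ["active OPEN", "rcv SYN", "rcv ACK of SYN"]))
# Active close: this side passes through TIME_WAIT.
print(run("ESTAB", ["CLOSE", "rcv ACK of FIN", "rcv FIN", "timeout 2MSL"]))
```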
     
                               

   

Anomalous TCP scenarios

Duplicate SYN           (cf Duplicate RRQ in the TFTP protocol)
                recognized because of same ISN

Loss of final ACK       (cf TFTP)
                during TIMEWAIT, a resent FIN is simply ACKed again; once TIMEWAIT has expired, any resent FIN will receive a RST in response

Old segments arriving for new connection
                solved by TIMEWAIT

Sequence number wraparound (WRAPTIME < MSL)
                Note WRAPTIME = time to send 4 GB (the full 32-bit sequence-number space)
                WRAPTIME = 100 sec => 40 MBytes/sec => >300 Mbits/sec
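
The WRAPTIME arithmetic can be sketched as follows. The function names are made up for this example, and 4 GB is taken as exactly 2^32 bytes, so the numbers come out slightly above the rounded figures in the notes.

```python
SEQ_SPACE_BYTES = 2**32   # one byte per sequence number, 32-bit sequence numbers

def wrap_time(bits_per_sec):
    """Seconds needed to send the entire sequence space at the given rate."""
    return SEQ_SPACE_BYTES * 8 / bits_per_sec

def rate_for_wrap_time(seconds):
    """Bit rate at which the sequence space wraps in the given time."""
    return SEQ_SPACE_BYTES * 8 / seconds

print(rate_for_wrap_time(100) / 1e6, "Mbits/sec for WRAPTIME = 100 sec")
print(wrap_time(1e9), "sec to wrap at 1 Gbit/sec")
```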

Client reboots (application restart isn't an issue)
                Could an old connection be accepted as new?
                
 


Demo of tcp_stalk


1. Note ServerSocket v Socket

2. accept() loop

3. connection semantics: what if one client is connected, and a second one tries to connect and send?

4. Python version of the client

5. Note the DNS-failure handling. My home ISP never has DNS lookup failures; all failed lookups resolve to the IP address of a host that has a search engine as a web page. Alas, this is not helpful if we're not looking for web pages.
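
For points 4 and 5, here is a minimal sketch of what a Python client with DNS-failure handling might look like. It is not the actual tcp_stalk client from the demo; the function name `talk_client` and its return convention are assumptions. Note that it only catches lookups that really fail; if the ISP resolves every bad name to its search-engine host, the lookup "succeeds" and the failure shows up later, at connect time.

```python
import socket

def talk_client(host, port, lines):
    """Send each line to host:port over one TCP connection.

    Returns False if the DNS lookup fails, True once all lines are sent.
    """
    try:
        addr = socket.gethostbyname(host)
    except socket.gaierror as err:
        # A genuine lookup failure; a "helpful" ISP may never trigger this.
        print("DNS lookup failed for", host, ":", err)
        return False
    # create_connection does the socket() and connect() steps for us
    with socket.create_connection((addr, port)) as s:
        for line in lines:
            s.sendall((line + "\n").encode("utf-8"))
    return True
```

Regarding point 3: the connect() succeeds as soon as the server's kernel completes the handshake, even if the server application has not yet called accept(), and data sent in the meantime is queued on the server side.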



6.2.5: TCP timeout & retransmission

original adaptive retransmission: TimeOut = 2*EstRTT, where
EstRTT = α*EstRTT + (1-α)*SampleRTT, for fixed α, 0<α<1 (e.g. α=1/2 or α=7/8)
For α≃1 this is very conservative (EstRTT is slow to change). For α≃0, EstRTT is very volatile.
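
A small sketch of the original update rule, to see how α controls responsiveness. The function names and the sample values (in ms) are made up for illustration.

```python
def update_est_rtt(est_rtt, sample_rtt, alpha=7/8):
    """Exponentially weighted moving average of the RTT samples."""
    return alpha * est_rtt + (1 - alpha) * sample_rtt

def timeout(est_rtt):
    """Original rule: TimeOut = 2 * EstRTT."""
    return 2 * est_rtt

# One sample of 200 ms against an estimate of 100 ms:
print(update_est_rtt(100, 200, alpha=7/8))  # moves only 1/8 of the way
print(update_est_rtt(100, 200, alpha=1/2))  # moves halfway
```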
        
RTT measurement ambiguity: if a packet is sent twice, is the ACK in response to the first transmission or the second?
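Karn/Partridge algorithm: on packet loss (and retransmission), do not use the ACK of the retransmitted packet to update EstRTT, and double TimeOut for each retransmission (exponential backoff).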
 
Jacobson/Karels algorithm for calculating the TimeOut value:
EstRTT = α*EstRTT + (1-α)*SampleRTT
EstDev = α*EstDev + (1-α)*SampleDev, where SampleDev = |SampleRTT - EstRTT|
TimeOut = EstRTT + 4*EstDev
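
The three Jacobson/Karels formulas can be sketched as one update step. This assumes SampleDev = |SampleRTT - EstRTT| measured against the estimate before it is updated, and uses the same α for both averages as the formulas above do; the function name and the numbers are illustrative.

```python
def jk_update(est_rtt, est_dev, sample_rtt, alpha=7/8):
    """One Jacobson/Karels step; returns (EstRTT, EstDev, TimeOut)."""
    # Assumption: deviation is measured against the old estimate.
    sample_dev = abs(sample_rtt - est_rtt)
    est_rtt = alpha * est_rtt + (1 - alpha) * sample_rtt
    est_dev = alpha * est_dev + (1 - alpha) * sample_dev
    return est_rtt, est_dev, est_rtt + 4 * est_dev

# A sample far from the estimate raises TimeOut mostly through EstDev.
print(jk_update(100, 10, 200, alpha=1/2))
```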
                
TCP timers: 
 

Path MTU Discovery
    Covered Week 8; uses some ICMP features
    Routinely part of TCP implementations now
    Uses IP DONT_FRAG bit, and ICMP Frag Needed / DF Set response


 

Simple packet-based sliding-windows algorithm


receiver-side

window size W
Next packet expected N; window is N ... N+W-1
   
Generic strategy
   
We have a pool EA of "early arrivals": packets buffered for future use.
When packet M arrives:
if M<N or M≥N+W, ignore
if M>N, put the packet into EA.

if M=N,
        output the packet (packet N)
        K=N+1
        slide window forward by 1 (to start at K)
        while (packet K is in EA)
               output packet K and remove it from EA
               K=K+1
               slide window forward by 1 (to start at K)
       
There are a couple details left out.
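
The generic strategy can be sketched in Python with the pool EA as a dictionary keyed by packet number. The class and method names are made up for this example, and "output" is modeled as appending to a list; acknowledgements are among the details still left out.

```python
class Receiver:
    def __init__(self, window):
        self.W = window
        self.N = 0          # next packet expected
        self.EA = {}        # early arrivals: packet number -> data
        self.output = []    # stands in for delivering data to the application

    def arrive(self, M, data):
        if M < self.N or M >= self.N + self.W:
            return                      # outside the window: ignore
        if M > self.N:
            self.EA[M] = data           # early arrival: save for later
            return
        # M == N: output it, then drain any consecutive early arrivals
        self.output.append(data)
        self.N += 1
        while self.N in self.EA:
            self.output.append(self.EA.pop(self.N))
            self.N += 1

r = Receiver(4)
for num, data in [(1, "b"), (2, "c"), (7, "x"), (0, "a")]:
    r.arrive(num, data)
print(r.output, r.N)
```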

Specific implementation:
   
bufs[]: array of size W. We always put packet M into position M % W
As before, N represents the next packet we are waiting for.

At any point between packet arrivals, packet slot N is empty, but some or all of N+1 .. N+W-1 may be full.
   
Suppose packet M arrives.
   
1. M<N or M≥N+W: ignore
2. otherwise, put packet M into bufs[M%W]
3. while (bufs[N % W] has a valid packet) {
           write it
           mark bufs[N % W] as empty
           N++
       }
       If M!=N, this loop will do nothing.
       But if M==N, we will write packet N and any further saved packets, and slide the window forward.
4. Send a cumulative acknowledgement of all packets up to but not including (the current value of) N; this is either ACK[N] or ACK[N-1] depending on protocol wording
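
Steps 1-4 might look like this in Python. This is a sketch under a few assumptions: an empty slot is represented by None (so packet data itself must not be None), "write it" appends to a list, and the method returns the ACK number using the ACK[N-1] convention, or None when the packet is ignored.

```python
class BufReceiver:
    def __init__(self, W):
        self.W = W
        self.N = 0
        self.bufs = [None] * W   # packet M always goes in slot M % W
        self.written = []

    def arrive(self, M, data):
        # Step 1: outside the window, ignore (no ACK in this simple version)
        if M < self.N or M >= self.N + self.W:
            return None
        # Step 2: save the packet
        self.bufs[M % self.W] = data
        # Step 3: write packet N and any saved successors, sliding the window
        while self.bufs[self.N % self.W] is not None:
            self.written.append(self.bufs[self.N % self.W])
            self.bufs[self.N % self.W] = None   # slot is free again
            self.N += 1
        # Step 4: cumulative ACK of everything below N, i.e. ACK[N-1]
        return self.N - 1

r = BufReceiver(4)
print(r.arrive(2, "c"), r.arrive(1, "b"), r.arrive(0, "a"))
```

An ACK value of -1 here simply means nothing has been received in order yet.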
      
sender side:
    We will assume ACK[M] means all packets <=M have been received (second option immediately above)
   
    W = window size, N = bottom of window
    window is N, N+1, ..., N+W-1
   
    init: N=0. Send full windowful of packets 0, 1, ..., W-1
   
Arrival of ACK[M]:
   
    if M < N or M≥N+W, ignore
    otherwise:
        set Last = N+W-1 (last packet sent)
        set N = M+1.
        for (i=Last+1; i<N+W; i++) send packet i
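
The sender side can be sketched the same way. The `send` callback is a stand-in for actually transmitting packet i, and the sketch assumes an unlimited supply of packets; timeouts and retransmission are not handled.

```python
class Sender:
    def __init__(self, W, send):
        self.W = W
        self.N = 0            # bottom of window
        self.send = send      # hypothetical callback: send(i) transmits packet i
        for i in range(W):    # init: send a full windowful 0 .. W-1
            send(i)

    def on_ack(self, M):
        """ACK[M] means all packets <= M have been received."""
        if M < self.N or M >= self.N + self.W:
            return                        # old or impossible ACK: ignore
        last = self.N + self.W - 1        # last packet sent so far
        self.N = M + 1                    # slide window forward
        for i in range(last + 1, self.N + self.W):
            self.send(i)                  # send the newly opened slots

sent = []
s = Sender(4, sent.append)
s.on_ack(1)    # window becomes 2..5; packets 4 and 5 go out
s.on_ack(0)    # stale ACK, ignored
print(sent)
```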
       
   
Some TCP notes

First, if a TCP packet arrives outside the receiver window, the receiver sends back its current ACK. This is required behavior. We discard the packet, but we don't completely ignore it.

Second, the TCP window size fluctuates. Thus, the pool EA must be more abstract than simply keeping track of positions modulo W.

Third, TCP senders do not start out by sending a full window all at once, as the sender above does. TCP has something called "slow start" to prevent this.