Comp 343/443
Fall 2011, LT 412, Tuesday 4:15-6:45
Week 9, Nov 8
Read:
Ch 3, sections 1, 2, 3
Ch 5, sections 1 (UDP) and 2 (TCP)
TCP issues
(common to any transport protocol, actually)
Final-ACK problem:
What if the final ACK is lost? The other side will resend its final
FIN, but there will be no one left to answer! This is solved with the
TIMEWAIT state: the side that sent the final ACK lingers for a while so
that it can re-ACK a retransmitted FIN.
Old late duplicates problem:
Suppose a connection between the same pair of ports is closed and
promptly reopened. Sometime during the first connection, a packet is
delayed (and retransmitted). It finally arrives during the second
connection, at just the right moment that its sequence number fits into
the receive window of the receiver. (Example: ISN1 = 0, delayed packet
seq number = 8000, ISN2 = 5000, receiver is expecting relative sequence
number of 3000 when the old packet arrives.)
TIMEWAIT to the rescue again!
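The arithmetic in the example can be checked with a short sketch. The function name and the window size of 4000 bytes are assumptions made for illustration; the notes give only the sequence numbers.

```python
SEQ_SPACE = 2**32  # TCP sequence numbers are 32 bits and wrap around

def in_receive_window(seq, next_expected, window):
    """True if absolute sequence number seq falls in
    [next_expected, next_expected + window), computed mod 2**32."""
    return (seq - next_expected) % SEQ_SPACE < window

isn1, isn2 = 0, 5000
old_seq = isn1 + 8000          # delayed packet from the first connection
next_expected = isn2 + 3000    # second connection expects relative seq 3000

# The old packet's absolute sequence number is exactly what the new
# connection is waiting for, so it would be accepted as valid data.
print(in_receive_window(old_seq, next_expected, window=4000))
```

Without TIMEWAIT (or some other way of keeping old packets out), nothing in the sequence number itself tells the receiver that this packet belongs to the earlier connection.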
What a connection is: machine state at each endpoint
TCP should handle:
- lost packets
- damaged packets
- reordered packets
- duplicated packets
- widely varying delay
ISN rationale 1: old late duplicates
ISN rationale 2: distinguishing new SYN from dup SYN
From Dalal & Sunshine's original paper on the TCP 3-way handshake:
2-way handshake: can't confirm both ISNs
4-way handshake:
1 --SYN->
2 <-ACK--
3 <-SYN--
4 --ACK->
This FAILS if first SYN is very very old! The ack at line 2 is ignored
by its receiver. LHS thinks the SYN on line 3 is a new request, and so
it acks it. It would then send its own SYN (on what would be line 5),
but it would be ignored. At this point A and B have different notions
of ISNA.
3-way handshake
TCP state diagram

                              +---------+ ---------\      active OPEN
                              |  CLOSED |            \    -----------
                              +---------+<---------\   \   create TCB
                                |     ^              \   \  snd SYN
                   passive OPEN |     |   CLOSE        \   \
                   ------------ |     | ----------       \   \
                    create TCB  |     | delete TCB         \   \
                                V     |                      \   \
                              +---------+            CLOSE    |    \
                              |  LISTEN |          ---------- |     |
                              +---------+          delete TCB |     |
                   rcv SYN      |     |     SEND              |     |
                  -----------   |     |    -------            |     V
 +---------+      snd SYN,ACK  /       \   snd SYN          +---------+
 |         |<-----------------           ------------------>|         |
 |   SYN   |                    rcv SYN                     |   SYN   |
 |   RCVD  |<-----------------------------------------------|   SENT  |
 |         |                    snd ACK                     |         |
 |         |------------------           -------------------|         |
 +---------+   rcv ACK of SYN  \       /  rcv SYN,ACK       +---------+
   |           --------------   |     |   -----------
   |                  x         |     |     snd ACK
   |                            V     V
   |  CLOSE                   +---------+
   | -------                  |  ESTAB  |
   | snd FIN                  +---------+
   |                   CLOSE    |     |    rcv FIN
   V                  -------   |     |    -------
 +---------+          snd FIN  /       \   snd ACK          +---------+
 |  FIN    |<-----------------           ------------------>|  CLOSE  |
 | WAIT-1  |------------------                              |   WAIT  |
 +---------+          rcv FIN  \                            +---------+
   | rcv ACK of FIN   -------   |                            CLOSE  |
   | --------------   snd ACK   |                           ------- |
   V        x                   V                           snd FIN V
 +---------+                  +---------+                   +---------+
 |FINWAIT-2|                  | CLOSING |                   | LAST-ACK|
 +---------+                  +---------+                   +---------+
   |                rcv ACK of FIN |                 rcv ACK of FIN |
   |  rcv FIN       -------------- |    Timeout=2MSL -------------- |
   |  -------              x       V    ------------        x       V
    \ snd ACK                 +---------+delete TCB         +---------+
     ------------------------>|TIME WAIT|------------------>| CLOSED  |
                              +---------+                   +---------+

                      TCP Connection State Diagram
                               Figure 6.
half open
simultaneous open
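One way to work with the state diagram is as a lookup table. The sketch below is illustrative only: the underscore state names, the TRANSITIONS table and the run() helper are assumptions of this example, and error cases (RST, unexpected segments) are omitted. Events and actions use the labels from the diagram.

```python
# (state, event) -> (action, next state), read off the state diagram
TRANSITIONS = {
    ("CLOSED", "passive OPEN"): ("create TCB", "LISTEN"),
    ("CLOSED", "active OPEN"): ("create TCB, snd SYN", "SYN_SENT"),
    ("LISTEN", "rcv SYN"): ("snd SYN,ACK", "SYN_RCVD"),
    ("LISTEN", "SEND"): ("snd SYN", "SYN_SENT"),
    ("SYN_SENT", "rcv SYN"): ("snd ACK", "SYN_RCVD"),
    ("SYN_SENT", "rcv SYN,ACK"): ("snd ACK", "ESTAB"),
    ("SYN_RCVD", "rcv ACK of SYN"): ("", "ESTAB"),
    ("ESTAB", "CLOSE"): ("snd FIN", "FIN_WAIT_1"),
    ("ESTAB", "rcv FIN"): ("snd ACK", "CLOSE_WAIT"),
    ("FIN_WAIT_1", "rcv ACK of FIN"): ("", "FIN_WAIT_2"),
    ("FIN_WAIT_1", "rcv FIN"): ("snd ACK", "CLOSING"),
    ("FIN_WAIT_2", "rcv FIN"): ("snd ACK", "TIME_WAIT"),
    ("CLOSING", "rcv ACK of FIN"): ("", "TIME_WAIT"),
    ("CLOSE_WAIT", "CLOSE"): ("snd FIN", "LAST_ACK"),
    ("LAST_ACK", "rcv ACK of FIN"): ("", "CLOSED"),
    ("TIME_WAIT", "Timeout=2MSL"): ("delete TCB", "CLOSED"),
}

def run(state, events):
    """Follow a sequence of events from a starting state; return the final state."""
    for event in events:
        _action, state = TRANSITIONS[(state, event)]
    return state

# Simultaneous open: each side has sent a SYN and then receives the other's SYN.
print(run("CLOSED", ["active OPEN", "rcv SYN", "rcv ACK of SYN"]))
# Active close: the side that closes first passes through TIME_WAIT.
print(run("ESTAB", ["CLOSE", "rcv ACK of FIN", "rcv FIN", "Timeout=2MSL"]))
```

Note that only the side that closes first reaches TIME_WAIT; the passive closer goes through CLOSE_WAIT and LAST_ACK directly to CLOSED.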
Anomalous TCP scenarios
Duplicate SYN (cf Duplicate RRQ in the TFTP protocol)
recognized because of same ISN
Loss of final ACK (cf TFTP)
once TIMEWAIT has expired, any resent FIN will receive a RST in response
Old segments arriving for new connection
solved by TIMEWAIT
Sequence number wraparound (WRAPTIME < MSL)
Note WRAPTIME = time to send 4 GB
WRAPTIME = 100 sec => 40 MBytes/sec => >300 Mbits/sec
Client reboots (application restart isn't an issue)
Could an old connection be accepted as new?
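The WRAPTIME figures above can be reproduced directly. This is a small worked calculation; the function names are made up for the example, and "4 GB" is taken to mean the full 32-bit sequence space of 2**32 bytes.

```python
SEQ_SPACE_BYTES = 2**32   # 32-bit sequence numbers: about 4 GB before wrapping

def wraptime_seconds(bits_per_second):
    """Time to send one full sequence space at the given rate."""
    return SEQ_SPACE_BYTES * 8 / bits_per_second

def rate_for_wraptime(seconds):
    """Rate in bits/sec at which the sequence space wraps in the given time."""
    return SEQ_SPACE_BYTES * 8 / seconds

rate = rate_for_wraptime(100)
print(rate / 8 / 1e6, "MBytes/sec")   # roughly 40
print(rate / 1e6, "Mbits/sec")        # somewhat over 300
```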
Demo of tcp_stalk
1. Note ServerSocket v Socket
2. accept() loop
3. connection semantics: what if one client is connected, and a second one tries to connect and send?
4. Python version of the client
5. Note the DNS-failure handling. My home ISP never has DNS lookup failures; all
failed lookups resolve to the IP address of a host that has a search
engine as a web page. Alas, this is not helpful if we're not looking
for web pages.
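For reference, here is a minimal Python sketch of a sequential accept() loop. It is not the tcp_stalk source: the function names, the uppercase-echo behavior and the use of the loopback address are all assumptions made so that the example is self-contained. The listening socket plays the role of Java's ServerSocket; each accept() returns a new connected socket (Java's Socket).

```python
import socket
import threading

def serve(listener, max_clients):
    """Sequential accept() loop: one client is served to completion at a time."""
    for _ in range(max_clients):
        conn, _addr = listener.accept()      # new connected socket per client
        with conn:
            while True:
                data = conn.recv(1024)
                if not data:                 # empty read: client closed its side
                    break
                conn.sendall(data.upper())   # hypothetical service: echo in caps

def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))          # port 0: let the OS pick a free port
    listener.listen(5)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve, args=(listener, 2), daemon=True)
    server.start()
    replies = []
    for msg in (b"hello", b"world"):
        with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
            client.sendall(msg)
            replies.append(client.recv(1024))
    server.join(timeout=5)
    listener.close()
    return replies
```

Calling demo() runs two clients one after the other against the loop. With a loop like this, a second client that connects while the first is still being served completes the TCP handshake (the kernel queues it in the listen backlog), but nothing it sends is read until the first client disconnects.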
6.2.5: TCP timeout & retransmission
original adaptive retransmission: TimeOut = 2*EstRTT,
EstRTT = α*EstRTT + (1-α)*SampleRTT, for fixed α, 0<α<1 (eg
α=1/2 or α=7/8)
For α≃1 this is very conservative
(EstRTT is slow to change). For α≃0, EstRTT is very volatile.
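A short sketch of the original scheme shows the effect of α. The function names and the sample values (an estimate of 100 ms followed by a sample of 200 ms) are made up for illustration.

```python
def update_est_rtt(est_rtt, sample_rtt, alpha):
    """EstRTT = alpha*EstRTT + (1-alpha)*SampleRTT"""
    return alpha * est_rtt + (1 - alpha) * sample_rtt

def timeout(est_rtt):
    """Original rule: TimeOut = 2*EstRTT"""
    return 2 * est_rtt

# EstRTT is 100 ms and a sample of 200 ms arrives.
slow = update_est_rtt(100, 200, alpha=7/8)   # alpha near 1: moves only 1/8 of the way
fast = update_est_rtt(100, 200, alpha=1/8)   # alpha near 0: jumps most of the way
print(slow, timeout(slow))
print(fast, timeout(fast))
```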
RTT measurement ambiguity: if a packet is sent twice, is the ACK in
response to the first transmission or the second?
Karn/Partridge algorithm:
on packet loss (and retransmission)
- Double Timeout
- Stop recording SampleRTT
- Use doubled Timeout as EstRTT when things resume
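The three steps can be sketched as a small class. This is one possible reading of the notes, not a reference implementation: the class and method names are hypothetical, and in particular the way the doubled Timeout seeds EstRTT on the first unambiguous sample is an assumption.

```python
class KarnPartridgeTimer:
    """Sketch of Karn/Partridge on top of the original TimeOut = 2*EstRTT rule."""

    def __init__(self, est_rtt, alpha=7/8):
        self.est_rtt = est_rtt
        self.alpha = alpha
        self.timeout = 2 * est_rtt
        self.backed_off = False

    def on_timeout(self):
        """Packet loss (and retransmission): double the Timeout."""
        self.timeout *= 2
        self.backed_off = True

    def on_ack(self, sample_rtt, was_retransmitted):
        if was_retransmitted:
            return                        # ambiguous sample: do not record it
        if self.backed_off:
            self.est_rtt = self.timeout   # assumption: resume from the doubled Timeout
            self.backed_off = False
        self.est_rtt = self.alpha * self.est_rtt + (1 - self.alpha) * sample_rtt
        self.timeout = 2 * self.est_rtt

t = KarnPartridgeTimer(est_rtt=100)
t.on_timeout()                            # Timeout 200 -> 400
t.on_ack(50, was_retransmitted=True)      # ignored: ACK might be for either copy
print(t.est_rtt, t.timeout)
```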
Jacobson/Karels algorithm
for calculating the TimeOut value:
EstRTT = α*EstRTT + (1-α)*SampleRTT
EstDev = α*EstDev + (1-α)*SampleDev, where SampleDev = |SampleRTT - EstRTT|
TimeOut = EstRTT + 4*EstDev
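A one-step sketch of the update, taking SampleDev to be |SampleRTT - EstRTT| measured against the estimate before it is updated. The function name, the single shared α and the numbers are assumptions for illustration; real implementations typically use different gains for the two averages.

```python
def jacobson_karels_update(est_rtt, est_dev, sample_rtt, alpha=7/8):
    """Return the new (EstRTT, EstDev, TimeOut) after one RTT sample."""
    sample_dev = abs(sample_rtt - est_rtt)   # deviation from the old estimate
    est_rtt = alpha * est_rtt + (1 - alpha) * sample_rtt
    est_dev = alpha * est_dev + (1 - alpha) * sample_dev
    time_out = est_rtt + 4 * est_dev
    return est_rtt, est_dev, time_out

# EstRTT = 100, EstDev = 10, and a sample of 140 arrives.
print(jacobson_karels_update(100, 10, 140))
```

When samples are steady, EstDev shrinks and TimeOut stays close to EstRTT; when samples vary widely, the 4*EstDev term adds headroom that the fixed 2*EstRTT rule lacks.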
TCP timers:
- TimeOut
- 2*MSL Timewait
- persist: sender polls receiver when windowsize = 0
- keepalive
Path MTU Discovery
Covered Week 8; uses some ICMP features
Routinely part of TCP implementations now
Uses IP DONT_FRAG bit, and ICMP Frag Needed / DF Set response
Simple packet-based sliding-windows algorithm
receiver-side
window size W
Next packet expected N; window is N ... N+W-1
Generic strategy
We have a pool EA of "early arrivals": packets buffered for future use.
When packet M arrives:
    if M<N or M≥N+W, ignore
    if M>N, put the packet into EA.
    if M=N,
        output the packet (packet N)
        K=N+1
        slide window forward by 1
        while (packet K is in EA)
            output packet K (and remove it from EA)
            slide window forward by 1 (to start at K+1)
            K=K+1
There are a couple details left out.
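The generic strategy can be sketched as follows, with EA as a dictionary keyed by packet number. The class and attribute names are made up for the example; acknowledgements are among the details left out.

```python
class Receiver:
    """Generic sliding-windows receiver with a pool EA of early arrivals."""

    def __init__(self, window):
        self.W = window
        self.N = 0            # next packet expected; window is N ... N+W-1
        self.EA = {}          # early arrivals: packet number -> data
        self.output = []      # packets delivered in order

    def arrive(self, M, data):
        if M < self.N or M >= self.N + self.W:
            return                        # outside the window: ignore
        if M > self.N:
            self.EA[M] = data             # early: buffer for future use
            return
        self.output.append(data)          # M == N: output it
        self.N += 1                       # slide window forward by 1
        while self.N in self.EA:          # drain consecutive early arrivals
            self.output.append(self.EA.pop(self.N))
            self.N += 1

r = Receiver(window=4)
r.arrive(1, "b")
r.arrive(2, "c")
r.arrive(5, "f")      # outside window 0..3: ignored
r.arrive(0, "a")      # fills the gap; b and c follow from EA
print(r.output, r.N)
```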
Specific implementation:
bufs[]: array of size W. We always put packet M into position M % W
As before, N represents the next packet we are waiting for.
At any point between packet arrivals, packet slot N is empty, but some
or all of N+1 .. N+W-1 may be full.
Suppose packet M arrives.
1. M<N or M≥N+W: ignore
2. otherwise, put packet M into bufs[M%W]
3. while (bufs[N % W] has a valid packet) {
       write it
       mark bufs[N % W] as empty
       N++
   }
   If M!=N, this loop will do nothing.
   But if M==N, we will write packet N and any further saved packets, and slide the window forward.
4. Send a cumulative acknowledgement of all packets up to but not
   including (the current value of) N; this is either ACK[N] or ACK[N-1]
   depending on protocol wording
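A sketch of the bufs[] implementation follows. The names are hypothetical, None is used to mark an empty slot (so packet data itself must not be None), and the ACK[N-1] wording is assumed, so a return value of -1 means nothing has been delivered yet. Out-of-window packets are ignored here, with no ACK, as in step 1.

```python
class RingReceiver:
    """Receiver that stores packet M in bufs[M % W]."""

    def __init__(self, window):
        self.W = window
        self.N = 0                        # next packet we are waiting for
        self.bufs = [None] * window       # None marks an empty slot
        self.output = []

    def arrive(self, M, data):
        if M < self.N or M >= self.N + self.W:
            return None                   # step 1: ignore
        self.bufs[M % self.W] = data      # step 2
        while self.bufs[self.N % self.W] is not None:   # step 3
            self.output.append(self.bufs[self.N % self.W])
            self.bufs[self.N % self.W] = None           # free the slot for reuse
            self.N += 1
        return self.N - 1                 # step 4: cumulative ACK[N-1]

r = RingReceiver(window=4)
acks = [r.arrive(0, "a"), r.arrive(2, "c"), r.arrive(1, "b"),
        r.arrive(4, "e"), r.arrive(3, "d")]
print(acks, r.output)
```

Freeing the slot after writing matters: packet 4 lands in the same slot that packet 0 used, and a stale entry there would make the loop deliver old data.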
sender side:
We will assume ACK[M] means all packets <=M have been received (second option immediately above)
W = window size, N = bottom of window
window is N, N+1, ..., N+W-1
init: N=0. Send full windowful of packets 0, 1, ..., W-1
Arrival of ACK[M]:
if M < N or M≥N+W, ignore
otherwise:
set Last = N+W-1 (last packet sent)
set N = M+1.
for (i=Last+1; i<N+W; i++) send packet i
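The sender side can be sketched the same way. The class name and the send callback are assumptions of this example; timeouts, retransmission and running out of data to send are not modeled.

```python
class Sender:
    """Sliding-windows sender; ACK[M] means all packets <= M were received."""

    def __init__(self, window, send):
        self.W = window
        self.N = 0                        # bottom of window
        self.send = send                  # callback that transmits packet i
        for i in range(self.W):           # init: send a full windowful
            self.send(i)

    def on_ack(self, M):
        if M < self.N or M >= self.N + self.W:
            return                        # old or impossible ACK: ignore
        last = self.N + self.W - 1        # last packet sent so far
        self.N = M + 1                    # slide the window
        for i in range(last + 1, self.N + self.W):
            self.send(i)                  # send the packets newly in the window

sent = []
s = Sender(window=4, send=sent.append)    # sends 0, 1, 2, 3
s.on_ack(1)                               # window becomes 2..5: sends 4, 5
s.on_ack(0)                               # duplicate of an old ACK: ignored
print(sent, s.N)
```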
Some TCP notes
First, if a TCP packet arrives outside the receiver window, the
receiver sends back its current ACK. This is required behavior. We
discard the packet, but we don't completely ignore it.
Second, the TCP window size fluctuates. Thus, the pool EA must be more abstract than simply keeping track of positions modulo W.
Third, TCP senders do not send a full window, ever. TCP has something called "slow start" to prevent this.