Computer Networks Spring 2010 Corboy Law 522
sorcerer's apprentice bug
sliding windows
Sliding Windows; §2.5, P&D
2.5: stop-and-wait versus sliding windows
stop-and-wait lost-packet recovery
sorcerer's apprentice bug (both sides retransmit on duplicate)
http://www.youtube.com/watch?v=XChxLGnIwCU, around T= 5:35
first/last packet: special handling!
lost final packet: note that this one has no ACK!
duplicated first packet
"Ladder" diagram of SWS=4, with RTT = 4
sliding windows basic ideas:
SWS, RWS, LAR, LFS, LCFR, LFR, LAF, NFE
cumulative ACKs
sender:
SWS: Sender Window Size
LAR: Last ACK received; window is LAR+1 ... LAR+SWS
LFS: Last Frame Sent, usually close to LAR+SWS
receiver:
LCFR: Last Cumulative Frame Received; all frames <= LCFR have been received.
LFR (Last Frame Received): LCFR <= LFR <= LAF
LAF: Last (highest) Acceptable Frame: LAF = LCFR+RWS
receive window: LCFR+1 ... LAF
NFE: Next Frame Expected: = LCFR+1
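The sender-side bookkeeping above can be sketched in a few lines; this is a minimal illustration using the variable names SWS, LAR, LFS (the class and method names are made up, and actual transmission is stubbed out):

```python
# Sketch of sliding-window sender bookkeeping, using the variable
# names above. The class and method names are hypothetical; actual
# transmission is stubbed out.
class SlidingWindowSender:
    def __init__(self, sws):
        self.SWS = sws
        self.LAR = 0       # Last ACK Received
        self.LFS = 0       # Last Frame Sent

    def can_send(self):
        return self.LFS < self.LAR + self.SWS   # window is LAR+1 ... LAR+SWS

    def send_next(self):
        assert self.can_send()
        self.LFS += 1
        return self.LFS    # frame number to transmit

    def ack_received(self, n):
        if n > self.LAR:   # cumulative ACK: everything <= n is acknowledged
            self.LAR = n   # window slides forward

s = SlidingWindowSender(sws=4)
sent = [s.send_next() for _ in range(4)]    # frames 1-4; window now full
s.ack_received(2)                           # frames 1 and 2 acknowledged
sent += [s.send_next(), s.send_next()]      # frames 5 and 6 now allowed
```

Note that with cumulative ACKs, the single ACK(2) slides the window forward by two.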
slow sender v slow router
bandwidth×delay
Four regions of sender line:
x <= LAR, LAR < x <= LFS, LFS < x <= LAR+SWS, LAR+SWS < x
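The same partition, written as a classifier (a hypothetical helper; LAR=3, LFS=5, SWS=4 are example values only):

```python
# The four regions of the sender's number line as a function
# (hypothetical helper; LAR, LFS, SWS as defined above).
def region(x, LAR, LFS, SWS):
    if x <= LAR:
        return "acknowledged"           # x <= LAR
    if x <= LFS:
        return "sent, awaiting ACK"     # LAR < x <= LFS
    if x <= LAR + SWS:
        return "may be sent now"        # LFS < x <= LAR+SWS
    return "outside window"             # LAR+SWS < x

# with LAR=3, LFS=5, SWS=4, the usable window is 4..7:
regions = [region(x, 3, 5, 4) for x in (2, 4, 6, 9)]
# -> ['acknowledged', 'sent, awaiting ACK', 'may be sent now', 'outside window']
```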
cumulative ACKs
lost packets
Selective ACKs (SACKs)
Finite sequence numbers
Flow control
slow receiver
Loss recovery under sliding windows
Diagrams when a single packet is lost. Note that, if SWS=4 and packet 5
is lost, then packets 6, 7, and 8 may nonetheless have been received. For this
reason (and because losses are usually associated with congestion, when
we do not wish to overburden the network), we retransmit only
the first lost packet, e.g. packet 5. If packets 6, 7, and 8 were also
lost, then we will receive ACK(5). However, if packets 6-8 made it
through after all, then we will receive back ACK(8); the sender then knows
6-8 need no retransmission and that the next packet to send is packet 9.
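The receiving side of this scenario can be sketched as follows: out-of-order frames within the window are buffered, and the cumulative ACK jumps forward once the gap is filled. The class and variable names here are illustrative only.

```python
# Receiver-side sketch: buffer out-of-order frames, advance the
# cumulative ACK only when the gap is filled. Names are illustrative.
class SlidingWindowReceiver:
    def __init__(self, rws):
        self.RWS = rws
        self.LCFR = 0           # Last Cumulative Frame Received
        self.buffer = set()     # frames received above a gap

    def frame_arrived(self, n):
        # accept only frames in LCFR+1 ... LAF = LCFR+RWS
        if self.LCFR < n <= self.LCFR + self.RWS:
            self.buffer.add(n)
            while self.LCFR + 1 in self.buffer:   # gap filled: slide forward
                self.buffer.discard(self.LCFR + 1)
                self.LCFR += 1
        return self.LCFR        # cumulative ACK sent back

r = SlidingWindowReceiver(rws=4)
acks = [r.frame_arrived(n) for n in (1, 2, 3, 4)]   # ACKs 1, 2, 3, 4
acks += [r.frame_arrived(n) for n in (6, 7, 8)]     # 5 lost: ACK stays at 4
acks.append(r.frame_arrived(5))                     # retransmitted 5: ACK jumps to 8
```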
Simple fixed-window-size analysis.
My example:
A----R1----R2----R3----R4----B
In the backward B->A direction, all connections
are infinitely fast. In the A->B direction, the A->R1 link is
infinitely fast, but the other four each have a bandwidth of 1
packet/second. This makes the R1->R2 link the "bottleneck
link": it has the minimum bandwidth, and although all four
slow links tie for that minimum, this is the first one
encountered.
Alternative example:
C----S1----S2----D
Assumptions: the C--S1 link is infinitely fast (zero delay);
the S1->S2 and S2->D links each have a bandwidth delay of 1.0 sec
per packet (so two packets take 2.0 sec per link, etc); ACKs have
the same delay in the reverse direction.
In both scenarios:
no-load RTT = 4.0 sec
Bandwidth = 1.0 packet/sec (= min link bandwidth)
We assume a SINGLE CONNECTION is made; ie there is no competition.
Bandwidth × Delay here is 4 packets (1 packet/sec × 4 sec RTT)
Case 1: SWS = 2
so SWS < bandwidth×delay (delay = RTT):
less than 100% utilization
delay is constant as SWS changes (eg to 1 or 3); this is the "base" or "no-load" RTT
throughput is proportional to SWS
When SWS= 2, throughput = 2 packets /4 sec = 2/4 = 1/2 packet/sec
During each second, two of the routers R1-R4 are idle
RTT_actual = 4 sec
T | A sends | R1 queues | R1 sends | R2 sends | R3 sends | R4 sends | B ACKs
--+---------+-----------+----------+----------+----------+----------+-------
0 |   1,2   |     2     |    1     |          |          |          |
1 |         |           |    2     |    1     |          |          |
2 |         |           |          |    2     |    1     |          |
3 |         |           |          |          |    2     |    1     |
4 |    3    |           |    3     |          |          |    2     |   1
5 |    4    |           |    4     |    3     |          |          |   2
6 |         |           |          |    4     |    3     |          |
7 |         |           |          |          |    4     |    3     |
8 |    5    |           |    5     |          |          |    4     |   3
Note the brief pile-up at R1 (the bottleneck link!) on startup.
However, in the steady state, there is no queuing. (That changes below
in case 3.) Real sliding-windows protocols generally have some way of
minimizing this "initial pileup".
Case 2: SWS = 4
When SWS=4, throughput = 1 packet/sec, RTT_actual = 4,
and each second all four slow links are busy.
Note that throughput = bottleneck-link bandwidth, so
this is the best possible throughput.
Case 3: SWS = 6
SWS > bandwidth×delay:
What happens is that the extra packets pile up at a router somewhere
(specifically, at the router in front of the bottleneck link)
Delay rises (artificially); bandwidth is that of bottleneck link
example: SWS=6. Then the actual RTT rises to 6.0 sec.
Each second, there are two packets in the queue at R1.
avg_queue + bandwidth×RTT_noload = 2+4 = 6 = SWS
Now, however, RTT_actual is 6, and to the sender it appears that SWS = bandwidth × RTT_actual.
Note that in all three cases, in the steady state the sender never
sends faster than the bottleneck bandwidth. This is because the
bottleneck bandwidth determines the rate of packets arriving at B,
which in turn determines the rate of ACKs arriving back at A, which in
turn determines A's continued sending rate. This aspect of sliding
windows is called self-clocking.
Graphs of SWS versus:
throughput
delay
queue utilization
The critical SWS value is equal to bandwidth×RTT (where "bandwidth" is
the bandwidth of the bottleneck link). Below this, we have:
throughput is proportional to SWS
delay is constant
queue utilization in the steady state is zero
For SWS larger than the critical value, we have
throughput is constant (equal to the bottleneck bandwidth)
delay increases linearly with SWS
queue utilization increases linearly with SWS
We can actually say a little more about the queue utilization. RTT_noload is the "propagation" time; any time in excess of that is spent waiting in a queue somewhere. Thus, we have queue_time = RTT_actual - RTT_noload.
The number of bytes in the queue is just bandwidth×queue_time; this
represents the number of packets if we measure bandwidth in units of
packets.
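The steady-state analysis of the three cases can be collected into one formula; this is a sketch assuming a fixed window size, the 1 packet/sec bottleneck bandwidth, and the 4.0 sec no-load RTT from the examples above (the function name is made up):

```python
# Steady-state throughput, RTT, and queue backlog for a fixed window,
# with bandwidth in packets/sec (assumptions as in the examples above).
def steady_state(sws, bandwidth=1.0, rtt_noload=4.0):
    transit = bandwidth * rtt_noload        # bandwidth x delay product = 4
    if sws <= transit:
        throughput = sws / rtt_noload       # proportional to SWS
        rtt = rtt_noload                    # no queuing delay
        queue = 0.0
    else:
        throughput = bandwidth              # capped at the bottleneck
        rtt = sws / bandwidth               # SWS = bandwidth x RTT_actual
        queue = sws - transit               # backlog in front of the bottleneck
    return throughput, rtt, queue

steady_state(2)   # Case 1 -> (0.5, 4.0, 0.0)
steady_state(4)   # Case 2 -> (1.0, 4.0, 0.0)
steady_state(6)   # Case 3 -> (1.0, 6.0, 2.0)
```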
Normally we think of RWS = SWS. However, it is possible to have RWS < SWS; in the extreme case RWS = 1, out-of-order packets are discarded on arrival and must be retransmitted.
2.4: error detection
parity
1 parity bit catches all 1-bit errors
No generalization to N!
That is, there is no N-bit code that catches all N-bit errors.
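A quick demonstration of both claims, using a single even-parity bit over an 8-bit block (illustrative code):

```python
# One even-parity bit: every 1-bit error is detected, but a second
# flip cancels the first and goes undetected.
def parity(bits):
    return sum(bits) % 2

data = [1, 0, 0, 1, 1, 0, 1, 1]
p = parity(data)                          # sent alongside the data

flipped1 = data[:]; flipped1[3] ^= 1      # one bit flipped in transit
flipped2 = flipped1[:]; flipped2[5] ^= 1  # a second bit flipped

detected1 = parity(flipped1) != p         # True: 1-bit error caught
detected2 = parity(flipped2) != p         # False: 2-bit error missed
```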
IP/UDP/TCP checksums (the "Internet" checksum)
ones-complement sum of 16-bit words A and B:
form the twos-complement sum A+B
if there is an overflow bit, add it back in as low-order bit
Fact: ones-complement sum is never 0000 unless all bits are 0.
Rule of 9's: the Internet checksum is in fact the remainder upon dividing by 2^16 - 1. This is perhaps the real reason for using ones-complement arithmetic in a twos-complement world.
weakness: transposing words leads to the same checksum.
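The fold-the-carry rule, the rule of 9's, and the transposition weakness can all be seen in a short sketch (function names are made up):

```python
# Ones-complement sum of 16-bit words: compute the ordinary sum,
# folding any overflow bit back in as the low-order bit.
def ones_complement_sum(words):
    s = 0
    for w in words:
        s += w
        if s > 0xFFFF:                    # overflow: wrap the carry around
            s = (s & 0xFFFF) + 1
    return s

def internet_checksum(words):
    return (~ones_complement_sum(words)) & 0xFFFF   # complement of the sum

a = [0x1234, 0xF00D, 0x00FF]
b = [0xF00D, 0x1234, 0x00FF]              # same words, transposed
internet_checksum(a) == internet_checksum(b)   # True: transposition invisible

# rule of 9's: the sum equals the ordinary total mod 2^16 - 1 (here)
ones_complement_sum(a) == sum(a) % 0xFFFF      # True
```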
CRC
Based on long division of polynomials. We treat the message, in binary,
as a giant polynomial m(x), using the bits of the message as successive
coefficients (eg 10011011 = x^7 + x^4 + x^3 + x + 1). We standardize a
divisor polynomial p(x) of degree 32. We divide m(x) by p(x); our
"checksum" is the remainder r(x), of max degree 31 (so it fits in a
32-bit word).
This is a reasonably secure hash against real-world network corruption,
in that it is very hard for systematic errors to result in the same hash
code. (For checksums, byte transposition or "matching" bit errors leave
the sum unchanged.)
CRC is not secure against intentional
corruption; given msg1, there are straightforward mathematical means
for tweaking the last bytes of msg2 so that crc(msg1) == crc(msg2)
quickie example of "mod2-polynomial" long division: addition = subtraction = XOR
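Here is a sketch of that division over bit-strings, using a toy degree-3 divisor (x^3 + x + 1 = 1011) rather than a real degree-32 CRC polynomial; we append three zero bits to the message first, just as a real CRC appends 32:

```python
# Mod-2 polynomial long division: addition = subtraction = XOR.
# Toy divisor of degree 3; a real CRC-32 divisor has degree 32.
def mod2_remainder(message, divisor):
    msg = list(message)
    n = len(divisor)
    for i in range(len(msg) - n + 1):
        if msg[i] == '1':                 # leading bit set: "subtract" divisor
            for j in range(n):
                msg[i + j] = str(int(msg[i + j]) ^ int(divisor[j]))
    return ''.join(msg[-(n - 1):])        # remainder has n-1 bits

# divide 10011011 (the x^7+x^4+x^3+x+1 example) shifted left 3 bits:
mod2_remainder('10011011' + '000', '1011')   # -> '010'
# appending that remainder yields an evenly divisible codeword:
mod2_remainder('10011011' + '010', '1011')   # -> '000'
```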
secure hashes (md5, etc)
For a good secure hash, nobody knows how to produce two messages with the same hash. (For md5, however, collisions have in fact been found.)
error-correcting codes
2-D parity (corrects 1-bit errors)
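A sketch of 2-D parity in action: the mismatched row and column parities together pinpoint a single flipped bit, which can then be corrected (illustrative code):

```python
# 2-D parity: one parity bit per row and per column. The mismatched
# row and column parities locate a single-bit error exactly.
def row_col_parity(block):
    rows = [sum(r) % 2 for r in block]
    cols = [sum(c) % 2 for c in zip(*block)]
    return rows, cols

original = [[1, 0, 1, 0],
            [0, 1, 1, 1],
            [1, 1, 0, 0]]
rows0, cols0 = row_col_parity(original)    # sent along with the block

received = [row[:] for row in original]
received[1][2] ^= 1                        # one bit flipped in transit
rows1, cols1 = row_col_parity(received)

i = [k for k in range(len(rows0)) if rows0[k] != rows1[k]][0]   # bad row
j = [k for k in range(len(cols0)) if cols0[k] != cols1[k]][0]   # bad column
received[i][j] ^= 1                        # flip it back: error corrected
```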
fundamental role of error-correcting codes
(= "forward error correction")