Computer Networks Week 5   Corboy Law 522


Read:
Chapter 1: 1.1-1.3, 1.5
Chapter 2: 2.1-2.6, 2.8.2 (wi-fi)


 

3.1.2: virtual circuit switching

The road not taken by IP.

In VC switching, routers know about end-to-end connections. To send a packet, a connection needs to be established first. For that connection, each link is assigned a "connection ID" (traditionally called the VCI, for Virtual Circuit Identifier). To send a packet, the host marks the packet with the VCI assigned to the host--router1 link.

Packets arrive (and depart) routers via one of several ports, which we will assume are numbered beginning at 0. Routers maintain a connection table indexed by <VCI,port> pairs. As a packet arrives, its inbound VCI and inbound port are looked up, and this produces an outbound <VCIout, portout> pair. The VCI field is then rewritten to VCIout, and the packet is sent via portout.

Note that typically there is no source address information included in the packet (although the sender can be identified from the connection, which can be identified from the VCI at any point along the connection). Packets are identified by connection, not destination. Any node along the path (including the endpoints) can look up the connection and figure out the endpoints.

Note also that each switch must rewrite the VCI. Datagram switches never rewrite addresses (though they do rewrite hopcount/TTL fields).

Example: construct VC connections between:
    A and F
    A and E
    A and C
    B and D

        A--S1-----S2--D
           |       |
           |       |
        B--S3-----S4----S5---F
           |       |
           C       E            

I will use the following VCIs. They are chosen more or less randomly here, but the requirement is that they be unique to each link. Because links are generally taken to be bidirectional, a VCI used from S1 to S3 cannot be reused from S3 to S1 until the first connection closes.

A to F:  A--4--S1--6--S2--3--S4--8--S5--1--F    This path goes from S1 to S4 via S2
A to E:  A--5--S1--6--S3--3--S4--8--E            Note that this path went via S3, the opposite corner of the square
A to C: A--6--S1--7--S3--3--C
B to D: B--4--S3--8--S1--7--S2--8--D           

Demo: construct the <VCI,port> tables from the above.

The namespace for VCIs is small, and compact (eg contiguous). Typically the VCI and port bitfields can be concatenated to produce a <VCI,Port> index suitable for use as an array index. VCIs are local identifiers. (IP addresses are global identifiers.)
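
One possible answer to the demo above, as a minimal Python sketch. The port numbers are an assumption (the notes do not assign any): each switch's ports are numbered from 0 in the order its neighbors appear in NEIGHBORS. The A-to-F path is taken to run S1, S2, S4, S5, as the diagram requires. The names NEIGHBORS, PATHS, build_tables, forward and table_index, and the 4-bit VCI width, are made up for illustration.

```python
# Assumed port numbering: port k of a switch leads to NEIGHBORS[switch][k].
NEIGHBORS = {
    "S1": ["A", "S2", "S3"],
    "S2": ["S1", "D", "S4"],
    "S3": ["S1", "B", "S4", "C"],
    "S4": ["S2", "S3", "S5", "E"],
    "S5": ["S4", "F"],
}

# Each path alternates node, VCI, node, VCI, ..., node.
PATHS = [
    ["A", 4, "S1", 6, "S2", 3, "S4", 8, "S5", 1, "F"],
    ["A", 5, "S1", 6, "S3", 3, "S4", 8, "E"],
    ["A", 6, "S1", 7, "S3", 3, "C"],
    ["B", 4, "S3", 8, "S1", 7, "S2", 8, "D"],
]

def build_tables(paths, neighbors):
    """Return {switch: {(vci_in, port_in): (vci_out, port_out)}}."""
    tables = {sw: {} for sw in neighbors}
    for path in paths:
        # Visit each switch on the path: prev --vci_in-- sw --vci_out-- nxt
        for i in range(2, len(path) - 2, 2):
            prev, vci_in, sw, vci_out, nxt = path[i - 2:i + 3]
            p_in = neighbors[sw].index(prev)
            p_out = neighbors[sw].index(nxt)
            # Connections are bidirectional, so install both directions.
            for key, val in (((vci_in, p_in), (vci_out, p_out)),
                             ((vci_out, p_out), (vci_in, p_in))):
                if key in tables[sw]:
                    raise ValueError("VCI reused on a link at " + sw)
                tables[sw][key] = val
    return tables

def forward(tables, neighbors, switch, port_in, vci):
    """Follow a packet from its first switch; return (end host, final VCI)."""
    while switch in tables:
        vci, port_out = tables[switch][(vci, port_in)]   # rewrite the VCI
        nxt = neighbors[switch][port_out]
        if nxt in neighbors:
            port_in = neighbors[nxt].index(switch)
        switch = nxt
    return switch, vci

def table_index(vci, port, vci_bits=4):
    """Concatenate the port and VCI bitfields into one array index."""
    return (port << vci_bits) | vci

TABLES = build_tables(PATHS, NEIGHBORS)
```

With these assumed ports, forward(TABLES, NEIGHBORS, "S1", 0, 4) follows the A-to-F connection hop by hop, looking up and rewriting the VCI at each switch.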
             
IP advantages: 
VC advantages:
                 
3.1.3: source routing
Never used in the real world, but a conceptual possibility.



3.3: ATM cell basics (defer?)

rationale for small fixed-size cells:
       
ATM (and cell networks in general):
                small cells (typically 5 bytes header + 48 bytes data)
                virtual circuits; connection-oriented
                   (28-bit addresses after connection is established)
                Switched point-to-point links; some rings
                Note ATM mandates no cell reordering!  This is bad for parallelism
                No physical b'cast
                Forwarding delay & packet size; cut through
                loss of 1 cell destroys packet; need reliable medium

Error correction of Shacham & McKenney [1990]
         send N cells and then one of all N XOR'ed together
         allows recovery from any one lost cell
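
The XOR idea can be sketched in a few lines of Python. This assumes all cells have the same length, that at most one cell of each group of N+1 is lost, and that the receiver knows which position is missing (say from a sequence number, which is not modeled here). The function names are illustrative only.

```python
from functools import reduce

def xor_cells(cells):
    """Bytewise XOR of a list of equal-length cells."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*cells))

def add_parity(cells):
    """Sender side: append one parity cell to a group of N data cells."""
    return cells + [xor_cells(cells)]

def recover(received):
    """Receiver side: 'received' holds N+1 cells, a lost one marked None.

    Returns the N data cells, rebuilding a single lost cell by XORing
    all the cells that did arrive.
    """
    missing = [i for i, cell in enumerate(received) if cell is None]
    if len(missing) > 1:
        raise ValueError("cannot recover from more than one lost cell")
    cells = list(received)
    if missing:
        cells[missing[0]] = xor_cells([c for c in received if c is not None])
    return cells[:-1]          # drop the parity cell
```

The recovery works because XORing a value with itself cancels: the XOR of the surviving cells and the parity cell leaves exactly the missing cell.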

3.3.2: Skim. Segmentation/reassembly and AAL 3/4, AAL 5. 
        SAR/AAL. AAL 1, 2, 3/4, 5
AAL 3/4: we first define a high-level "wrapper" for an IP packet, called the CS-PDU.
We then chop this into as many 44-byte chunks as are needed; each chunk goes into a 48-byte ATM payload, along with 4 bytes of per-cell AAL 3/4 header and trailer.
Counting the 5-byte ATM header: 9 bytes overhead / 44 bytes data: > 20% overhead

AAL 5: CRC is moved to the CS-PDU and promoted to 32-bits.
MID field is discarded (no one used it, anyway)
A bit from the ATM header is used to indicate the last cell of a CS-PDU.
The CS-PDU is chopped into 48-byte chunks, which are then used as the entire body of each ATM cell. 5 bytes overhead / 48 bytes data: about 10% overhead. Errors are detected by the CS-PDU CRC-32. This also detects lost cells (impossible with a per-cell CRC!)
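
The two overhead figures can be checked with a little arithmetic. This is only a sketch: the CS-PDU length is taken as given (the wrapper and trailer sizes are not modeled), and cells_needed and per_cell_overhead are made-up helper names.

```python
import math

ATM_HEADER = 5      # bytes of ATM header per cell
ATM_PAYLOAD = 48    # bytes of ATM payload per cell

def cells_needed(cs_pdu_len, data_per_cell):
    """Number of cells needed to carry a CS-PDU of the given length."""
    return math.ceil(cs_pdu_len / data_per_cell)

def per_cell_overhead(data_per_cell):
    """Overhead bytes per cell divided by data bytes per cell."""
    overhead = ATM_HEADER + (ATM_PAYLOAD - data_per_cell)
    return overhead / data_per_cell

if __name__ == "__main__":
    # 44 data bytes per cell for AAL 3/4, 48 for AAL 5
    for name, chunk in (("AAL 3/4", 44), ("AAL 5", 48)):
        n = cells_needed(1500, chunk)
        print(name, n, "cells,", n * (ATM_HEADER + ATM_PAYLOAD),
              "bytes on the wire,", f"{per_cell_overhead(chunk):.1%}", "overhead")
```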

Addressing: VPI/VCI  VCI: local use only?

3.3.3: virtual circuits as applied to ATM
                store-and-forward of cells v. cut-through for packets
 
 
    3.4: switching: cut-through v. store-and-forward (not done)
        Crossbar switch, other switching fabrics


Chapter 4: IP

4.1: the goal of IP is to connect all the different LANs into one large "virtual" LAN. To this end, the primary feature offered by the IP layer is routing and addressing, which go hand-in-hand.

In terms of the "protocol graph", there are several LAN models that lie below IP, and several end-to-end transport models above. However, there are in practice no competitors to IP.

The IP network service model is to act like a LAN. That is, there are no acknowledgements; delivery is generally described as best-effort.
       
IP routing is based on the idea that, at any given host or router, an IP address can be divided into the network portion and the host portion. Classically, this IP address division is as follows:

          1st bits  1st byte  byte 1  byte 2  byte 3  byte 4  # nets  # hosts
Class A   0         0-127     net     host    host    host    128     2^24
Class B   10        128-191   net     net     host    host    16384   65536
Class C   110       192-223   net     net     net     host    2^21    256

(The underlying idea here was that there would be a small number of very large networks, a medium number of institution-sized networks, and a large number of small networks.)

Routing is then based only on the net portion of the address. A class-B site would represent only a single routing-table entry in the outside world; only inside the site would the host bits be taken into account.
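
The classful division can be expressed as a short Python sketch. The function name classify is made up, and classes D and E (first byte 224 and up) are simply rejected here.

```python
def classify(addr):
    """Split a dotted-quad IPv4 address by the classful rules.

    Returns (class letter, net bytes, host bytes).
    """
    octets = [int(part) for part in addr.split(".")]
    if len(octets) != 4 or not all(0 <= o <= 255 for o in octets):
        raise ValueError("not a dotted-quad IPv4 address: " + addr)
    first = octets[0]
    if first < 128:        # leading bit 0
        cls, net_len = "A", 1
    elif first < 192:      # leading bits 10
        cls, net_len = "B", 2
    elif first < 224:      # leading bits 110
        cls, net_len = "C", 3
    else:
        raise ValueError("class D/E address, no net/host split: " + addr)
    return cls, octets[:net_len], octets[net_len:]
```

A router outside the site would use only the net bytes as its routing-table key; the host bytes matter only once the packet reaches that network.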

This feature is what gives IP routing its great scalability. While Ethernet also uses datagram routing, Ethernet routing tables must be much larger, and are not practical for wide-area routing.

Unlike virtual-circuit switches, IP routers do not rewrite addresses (although we will come later to Network Address Translation, or NAT, where this is done). However, IP routers do need to perform some header updates.
       
IP header fields and what they do
        
        3.1.  Internet Header Format (from RFC 791)
 
        A summary of the contents of the internet header follows:
 
 
        0                   1                   2                   3   
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |Version|  IHL  |Type of Service|          Total Length         |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |         Identification        |Flags|      Fragment Offset    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Time to Live |    Protocol   |         Header Checksum       |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                       Source Address                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    Destination Address                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                    Options                    |    Padding    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
 
                Example Internet Datagram Header
 
                       Figure 4.
 
        
                TOS
                fragmentation
                TTL, protocol, checksum
                options

Ethernet headers have no TTL field. Routing cycles are a calamity, as a result; one-dimensional routing loops (A->B->A) are banned by the forwarding algorithm. IP needs a way of catching badly addressed packets and packets caught in routing loops; this is done by decrementing the TTL by 1 at each router, and then discarding the packet if the TTL reaches 0. Making any change in the header requires updating the header checksum; this can be done "algebraically" but it is not hard simply to re-sum the ten 16-bit halfwords of the typical 20-byte header.
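
A minimal Python sketch of the TTL update, using the simple re-sum approach rather than the "algebraic" one. It assumes a 20-byte header with no options; build_header and decrement_ttl are made-up names, and the default field values in build_header are arbitrary.

```python
import struct

def checksum(header):
    """Ones'-complement sum of the header's 16-bit words, complemented."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:                      # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src, dst, ttl=64, proto=17, total_len=20, ident=0):
    """Build a 20-byte IPv4 header (version 4, IHL 5) with a valid checksum."""
    hdr = struct.pack("!BBHHHBBH4s4s", 0x45, 0, total_len, ident, 0,
                      ttl, proto, 0, src, dst)
    return hdr[:10] + struct.pack("!H", checksum(hdr)) + hdr[12:]

def decrement_ttl(header):
    """Router step: return the updated header, or None to discard."""
    ttl = header[8]                         # TTL is byte 8 of the header
    if ttl <= 1:
        return None                         # TTL would reach 0
    hdr = bytearray(header)
    hdr[8] = ttl - 1
    hdr[10:12] = b"\x00\x00"                # zero the old checksum first
    hdr[10:12] = struct.pack("!H", checksum(bytes(hdr)))
    return bytes(hdr)
```

A receiver can verify a header by running checksum over all 20 bytes, stored checksum included; a valid header gives 0.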


Fragmentation and Reassembly

If you are trying to interconnect two LANs (as IP does), what else might be needed besides Routing and Addressing? IP assumes both LANs are based on 8-bit bytes (something not universally true in the early days of IP; to this day the RFCs refer to "octets" to emphasize this requirement). IP also defines bit-order within a byte, and it is left to the networking hardware to translate properly. Data bytes are completely transparent.

There is one more feature IP must provide, however: it must accommodate networks for which the maximum packet size, or maximum transmission unit, MTU, is smaller. Otherwise, if we were using IP to join Token Ring (MTU = 4k) to Ethernet (MTU = 1500), the token-ring packets might be too large to deliver. (They might not, if the endpoints had been able to negotiate an appropriate MTU, but this cannot be guaranteed).

So, IP must support fragmentation, and also reassembly. There are a couple major strategies here: per-link fragmentation and reassembly, where the reassembly is done at the opposite end of the link (like ATM), and path fragmentation and reassembly, where reassembly is done at the far end of the path. The latter approach is what is taken by IP, partly because intermediate routers are too busy to do reassembly (this is as true today as it was thirty years ago), and partly because IP fragmentation is seen as the strategy of last resort.

When an IP datagram is fragmented, the IDENT field marks fragments of the same packet, and the Fragment Offset field marks the start position of this fragment. Note that the start position can be a number up to 2^16, the maximum IP packet length, but the FragOffset field has only 13 bits. This is handled by requiring the data in every fragment except the last to have a size that is a multiple of 8 (three bits), and left-shifting the FragOffset value by 3 bits before using it.

Example (where MTUs are excluding the LAN header)

A------MTU 1500---- R1 -------  MTU 1000 -------- R2 ----------MTU 400 ------ B

A sends a packet of 1500 bytes to R1: 20 bytes of IP header and 1480 of data.

R1 fragments into two packets of sizes 20+976 = 996 and 20+504 = 524. Having 980 bytes of payload in the first fragment would fit, but violates the divisible-by-eight rule. The first has FragOffset = 0; the second starts at byte offset 976 (so the 13-bit field holds 976/8 = 122).

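R2 refragments the first fragment into three packets as follows:
    20+376 = 396, byte offset 0
    20+376 = 396, byte offset 376
    20+224 = 244, byte offset 752
R2 refragments the second fragment into two:
    20+376 = 396, byte offset 976
    20+128 = 148, byte offset 1352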
Note that it would have been more efficient to have fragmented into four fragments of sizes 376, 376, 376, and 352 in the beginning. Note also that the packet format is designed to handle fragments of different sizes easily. The algorithm is based on multiple fragmentation with single reassembly.
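
The example can be reproduced with a short Python sketch. It assumes a 20-byte IP header with no options and tracks only the byte offset, payload length and More Fragments flag of each piece; the name fragment and the tuple layout are made up for illustration.

```python
IP_HEADER = 20   # assumed: no IP options

def fragment(offset, length, more, mtu):
    """Split one datagram (or fragment) so that each piece fits the MTU.

    offset: byte offset of this piece's data within the original datagram
    length: bytes of data in this piece
    more:   the piece's current More Fragments flag
    Returns a list of (byte offset, data length, more flag) tuples.
    """
    if IP_HEADER + length <= mtu:
        return [(offset, length, more)]            # fits as is
    max_data = (mtu - IP_HEADER) // 8 * 8          # divisible-by-eight rule
    pieces = []
    pos = 0
    while pos < length:
        size = min(max_data, length - pos)
        is_last = pos + size == length
        # Only the final piece inherits the original More Fragments flag.
        pieces.append((offset + pos, size, more if is_last else True))
        pos += size
    return pieces

def frag_offset_field(byte_offset):
    """Value stored in the 13-bit FragOffset field."""
    return byte_offset >> 3

at_r1 = fragment(0, 1480, False, 1000)
at_r2 = [piece for f in at_r1 for piece in fragment(*f, 400)]
```

Here at_r1 holds the two fragments produced for the MTU-1000 link and at_r2 the five that result after refragmenting each of those for the MTU-400 link.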

An Example Reassembly Procedure (RFC 791)
 
For each datagram the buffer identifier is computed as the concatenation of the source, destination, protocol, and identification fields.  If this is a whole datagram (that is both the fragment offset and the more fragments  fields are zero), then any reassembly resources associated with this buffer identifier are released and the datagram is forwarded to the next step in datagram processing.
 
If no other fragment with this buffer identifier is on hand then reassembly resources are allocated.  The reassembly resources consist of a data buffer, a header buffer, a fragment block bit table, a total data length field, and a timer.  The data from the fragment is placed in the data buffer according to its fragment offset and length, and bits are set in the fragment block bit table corresponding to the fragment blocks received.
 
If this is the first fragment (that is the fragment offset is zero)  this header is placed in the header buffer.  If this is the last fragment ( that is the more fragments field is zero) the total data length is computed.  If this fragment completes the datagram (tested by checking the bits set in the fragment block table), then the datagram is sent to the next step in datagram processing; otherwise the timer is set to the maximum of the current timer value and the value of the time to live field from this fragment; and the reassembly routine gives up control.
 
If the timer runs out, the all reassembly resources for this buffer identifier are released.  The initial setting of the timer is a lower bound on the reassembly waiting time.  This is because the waiting time will be increased if the Time to Live in the arriving fragment is greater than the current timer value but will not be decreased if it is less.  The maximum this timer value could reach is the maximum time to live (approximately 4.25 minutes).  The current recommendation for the initial timer setting is 15 seconds.  This may be changed as experience with this protocol accumulates.  Note that the choice of this parameter value is related to the buffer capacity available and the data rate of the transmission medium; that is, data rate times timer value equals buffer size (e.g., 10Kb/s X 15s = 150Kb).
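
A simplified Python sketch of the procedure quoted above. It keeps the data buffer, the fragment block table (as a set of 8-byte block numbers) and the total data length, but omits the header buffer and the timer entirely; the class and method names are made up.

```python
class Reassembler:
    def __init__(self):
        self.buffers = {}    # buffer identifier -> reassembly state

    def receive(self, bufid, offset, more, data):
        """Process one fragment.

        bufid:  (source, destination, protocol, identification)
        offset: byte offset of the fragment's data
        more:   the More Fragments flag
        Returns the complete payload once all blocks are present, else None.
        """
        if offset == 0 and not more:
            # A whole datagram: release any resources and pass it on.
            self.buffers.pop(bufid, None)
            return data
        state = self.buffers.setdefault(
            bufid, {"data": bytearray(), "blocks": set(), "total": None})
        end = offset + len(data)
        if len(state["data"]) < end:
            state["data"].extend(bytes(end - len(state["data"])))
        state["data"][offset:end] = data
        # Mark the 8-byte blocks covered by this fragment as received.
        state["blocks"].update(range(offset // 8, (end + 7) // 8))
        if not more:
            state["total"] = end          # last fragment fixes the length
        total = state["total"]
        if total is not None and len(state["blocks"]) == (total + 7) // 8:
            del self.buffers[bufid]
            return bytes(state["data"][:total])
        return None
```

Fragments may arrive in any order; the payload is returned only when the last fragment has been seen and every block before it is marked.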


Finally, any given IP link may provide its own link-layer fragmentation and reassembly (as ATM links do). This can be done transparently by the LAN layer (ATM again), or (less often) with some kind of negotiation by the IP layers.
 


4.1.4: IP routing
host algorithm: Check IPnet. If it matches our own network, deliver directly via the LAN. Otherwise, send to our designated router.
router:

Default routes are hugely important in keeping leaf router tables small.

How a packet traverses layers, with headers; routing
 
        A       net 200.3.9      router   net 201.4.6    B
        |_________________________|  |___________________|
   200.3.9.5            200.3.9.254  201.4.6.1         201.4.6.7
 
Actual routers might try in order: host-specific, local, net, default
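
The host algorithm and the router lookup order can be sketched in Python. This assumes class C addresses as in the diagram, so the network is just the first three bytes; the function names and the way the tables are passed in are assumptions for illustration.

```python
def net_of(addr):
    """Network part of a class C address: the first three bytes."""
    return addr.rsplit(".", 1)[0]

def host_next_hop(dest, my_addr, default_router):
    """Host algorithm: deliver directly on our own net, else use the router."""
    if net_of(dest) == net_of(my_addr):
        return dest
    return default_router

def router_next_hop(dest, host_routes, local_nets, net_routes, default=None):
    """Try, in order: host-specific, local, net, default.

    host_routes: {destination address: next hop}
    local_nets:  set of directly connected networks
    net_routes:  {network: next hop}
    """
    if dest in host_routes:
        return host_routes[dest]
    if net_of(dest) in local_nets:
        return dest                      # deliver directly via the LAN
    if net_of(dest) in net_routes:
        return net_routes[net_of(dest)]
    return default
```

For the diagram above, A (200.3.9.5) sending to B (201.4.6.7) hands the packet to 200.3.9.254, and the router, which has net 201.4.6 as a local network, delivers it directly to B.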
 


ARP and DHCP

How does a router (or host) find the physical address of a neighbor on the same network? The Address Resolution Protocol (ARP) is generally used. Alternatives: polling, link-layer notice, IP-layer notice
 
Basic ideas: broadcast request/unicast reply, cache
 
Timeout: used to be ~10 minutes (20 min for early linux),
now is much less (linux 2.4 arp timeout is ~60 seconds)

    see: http://www.cs.helsinki.fi/linux/linux-kernel/2002-07/0179.html
    ip -s neigh
    
finer points:
If A arps "where is B?"
        1. B always puts A in its cache
        2. All hosts with A in their cache update the entry
Self-arp, or gratuitous arp: send an arp request for yourself (and hope you don't get answers!). This detects duplicate IP addresses and announces ethernet address changes.
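
The cache behavior can be sketched in Python: entries time out, and an incoming broadcast request is handled by the two rules above. The class ArpCache, its method names and the injectable clock are assumptions for illustration; the 60-second default follows the linux figure mentioned earlier.

```python
import time

class ArpCache:
    def __init__(self, timeout=60.0, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock
        self.entries = {}                 # IP address -> (MAC, time learned)

    def learn(self, ip, mac):
        self.entries[ip] = (mac, self.clock())

    def lookup(self, ip):
        """Return the cached MAC, or None if absent or timed out."""
        entry = self.entries.get(ip)
        if entry is None:
            return None
        mac, learned = entry
        if self.clock() - learned > self.timeout:
            del self.entries[ip]
            return None
        return mac

    def on_request(self, my_ip, sender_ip, sender_mac, target_ip):
        """Handle a broadcast "where is target_ip?" sent by sender_ip.

        Returns True if this host should send a unicast reply.
        """
        if target_ip == my_ip:
            self.learn(sender_ip, sender_mac)     # rule 1: target always caches
            return True
        if sender_ip in self.entries:
            self.learn(sender_ip, sender_mac)     # rule 2: refresh if known
        return False
```

A gratuitous arp is the case where the sender and target IP addresses are the same: no host should reply, but every host already holding an entry for that address picks up the new ethernet address through rule 2.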
 
Flooding: 
    what if A tries to send 100 packets to B; how many ARPs? 
    A b'casts, everyone replies & needs to ARP to get A's addr
ARP and networks without b'cast [eg ATM]
        Failure in presence of looping
security implications of ARP
proxy arp

detecting sniffers:
To find out if host A is in promiscuous mode, send an ARP "who-has A?" query. Address it not to the broadcast Ethernet address, though, but to some nonexistent Ethernet address.

If promiscuous mode is off, A's NI will reject the packet.
If promiscuous mode is on, A's NI will pass the arp request to A itself, which will probably answer it.

Alas, linux kernels reject at the software level arp queries to physical ethernet addresses other than their own.
BUT: they do respond to faked Ethernet multicast addresses.

Windows: try Ethernet addresses ff:ff:ff:00:00:00 or ff:ff:ff:ff:ff:fe

You can ping A's actual IP address (which requires a separate ping for each host, and a prior scan to find all the hosts), or try pinging the IP b'cast address (all 1's in the host part).
        


 
DHCP; successor to BOOTP and, before that, Reverse ARP (RARP)
    You b'cast your Ethernet addr, and hope a DHCP server finds it and
    sends you your IP address. Also other helpful startup options!!!
        subnet mask
        default router
        DNS Server

DHCP and servers: who's going to update the DNS entries?

Minimal network config:
    ip addr
    subnet mask
    default router
    DNS server
   

4.3.1: subnets

Just outlined