The main topic here is how IP works; 4.1 contains an overview of IP itself, 4.2 discusses how IP routers communicate and build their routing tables, and 4.3 extends these ideas to, as the section is titled, the "Global Internet."Chapter 4: IP and Routing
We'll omit section 4.4 on multicast.
To do this, IP must provide a global mechanism for addressing and routing, so that packets can actually be delivered from any host to any other host. IP addresses are 32 bits, and space for them is shown in the header diagram on page 252 (Fig 4.3). A more detailed discussion of IP addresses begins on page 262, in section 4.1.3. An essential feature of IP addresses is that they consist of a "network" part and a "host" part. 4.1.3 lists the "legacy" mechanism for network/host allocation; the current strategy today is discussed later in 4.3.2. The entire point of this network/host separation is that routers list only the network parts of the destination addresses, which saves space. There are currently around 5 million hosts on the Internet, but only 50,000 or so networks. A routing table of 50,000 entries is (just barely) feasible; a table a hundred times larger is not.
In addition to routing and addressing, IP must also support the transmission of packets that may be too large for some intermediate physical LAN. For example, two token rings with a 4K max packet size can send and receive packets of size 4KB, but if they are connected by an intervening Ethernet that can only forward packets of size 1.5KB, something has to give. IP handles this by supporting fragmentation. This is summarized starting on page 253. The IP approach is awkward and inefficient, and IP fragmentation is seldom used (instead, IP tends to choose rather small packet sizes (eg 512 bytes) to avoid it). However, fragmentation is essential conceptually, in order for IP to be able to support large packets without knowing anything about the intervening networks.
Note that IP is a "best effort" system; there are no acknowledgements or retransmissions or any of that stuff. We ship it off, and hope it gets there. Most of the time, it does. 4.1.2 discusses this "service model" at more length.
In section 4.1.4 the text returns to the details of how IP forwarding works.
4.1.5 discusses ARP, the protocol used if a host knows the desired local IP address and needs to find out the Ethernet address.
4.1.6, which you can skip entirely, discusses automatic IP address assignment to hosts, by a DHCP server.
4.1.7 discusses ICMP, the Internet Control Message Protocol. You should at least skim this, because new ICMP messages will be introduced later in the text.
Skip 4.1.8.
How does IP forwarding (section 4.1.4) differ from Ethernet switch forwarding? Both are instances of the datagram-forwarding strategy, presented in 3.1.1.
Suppose you wanted to connect two Ethernet LANs, Eth1 and Eth2. You have two routers, one on each LAN, and a Token Ring link joining them. What would you have to do to get a packet from host A on Eth1 to host B on Eth2? Assume everything has to be done "manually". To where does host A's Ethernet actually deliver the packet, initially?
4.1.2 discusses the primary "small-scale" algorithm/protocol used today: distance-vector, or RIP (Routing Information Protocol). The normal case proceeds straightforwardly; some abberant cases (and their fixes) are discussed beginning with the last paragraph on page 287. There is some debate today whether or not these fixes have, effectively anyway, completely solved the problem.
Skip the subsections titled "Implementation" (p 288) and "Routing Information Protocol" (p 290).
The Link State algorithm/protocol discussed in 4.2.3 is the primary "large-scale" or "large-site" algorithm used today; it is used internally at most large Internet Providers. Read up to the green text on page 298, through the example; skip the subsection beginning on page 298 titled "The Open Shortest Path First Protocol (OSPF)".
Internet providers share routing information between each other via BGP, discussed in 4.3.3.
Skip 4.2.4 and 4.2.5.
Here is the Loyola situation. Loyola has a class-B IP network address, 147.126.x.y. The x.y are the host bits, officially. However, within Loyola, the third byte, x, is used as a subnet number; x=2 and x=3 are used by the Computer Science department, x=1 is used by the computer center, and x=36 is used by the dorm room computers. Internal Loyola routers use the first three bytes in their forwarding tables. Hosts attempt direct Ethernet delivery only if the first three bytes match. Outside Loyola, only the first two bytes are used in routing tables. Another way to look at the situation is that, within Loyola, it looks like Loyola has 255 class-C networks. Outside, it looks like Loyola has a single, class-B, network.
Subnetting does not make more addresses available; bits used for the subnet number come out of the host bits. Subnetting does, however, make network addressing more flexible.
Subnetting, in effect, moves the network/host division point to the right of where it would be based only on the official class-A/B/C rule. A subnet mask is used to keep track of the new position.
4.3.2 In the mid-90's the Internet was running out of IP addresses. Class C allows only 255 hosts; most organizations want more than a class C. There are only 16,000 possible class-B addresses. There are more than 16,000 organizations that want such addresses! An organization can take multiple class-C's instead of a class B, but then the organization appears multiple times in the backbone tables, and the backbone tables become larger. In the mid-90s the backbone tables reached 40-50,000 entries and a crisis loomed. The tables were about as large as they could get.
Around this time, CIDR was introduced. This stands for "Classless internet domain routing", with the emphasis on "classless". The old address classes A, B, and C were (selectively) abandoned. Instead, multiple class C's could be bundled together into a single routing-table entry, via CIDR or "supernets". This strategy is just like subnets, except the network/host division point is being moved to the left. This means that an organization can buy an address block consisting of a bundle of class C's, of exactly the right size (well, block sizes are always a power of 2).
Another use of CIDR has been the rise of provider-based addressing. Loyola is 147.126.0.0; suppose that 147.0.0.0 meant the Chicago area (it doesnt). In this picture, 147.124 might be Northwestern, 147.125 might be DePaul, and 147.127 might be IIT. Real provider-based addressing is done by assigning to internet service providers large (very large) chunks of Class C addresses. The provider then subdivides the address blocks and allocates them to customers; outside the providers' routers, only a single entry (for that provider) need be made in any other organization's routing table. This has proven to be a huge savings, allowing much smaller routing tables. In theory, one only needs one entry in the backbone routing table for each top-level internet service provider.
4.3.3 Within an organization, we saw in 4.2 that one could use either distance-vector or link-state routing. Between organizations, however, neither is practical: the organizations would have to agree to use the same protocol, and these routing protocols just aren't universal. Enter BGP, which is a universal routing protocol, of a sorts.
The tradeoff with BGP is that it does not have any way to specify distance information at all. All it can convey is reachabilitiy. Optimal routes are not even attempted.
RIP uses optimal routes, and optimality serves as a guarantee that there are no routing loops: a route with a loop in it cannot be the shortest one possible! BGP can't guarantee optimality, but loops would still be a problem. So BGP deals with them another way: by exchanging full path information. The paths aren't lists of each router traversed, though, but just of each organization (autonomous system).