Network Management
Summer 2016, Corboy 710, TTh 5:30-8:45 pm
Class 1: July 5
Exams, ground rules
Management: the choices we make.
The following are the five basic "official OSI" areas of network management
(see also IntroNetworks, Network Management and SNMP):
- fault detection
- configuration
- accounting (eg user accounts)
- performance
- security -- a topic unto itself
Some people add:
- maintaining reliability: "five 9's" (99.999% availability allows about 5
minutes of downtime per year). Reliability is related to fault detection; for
example, redundant hardware helps with reliability, but only if faults can be
detected quickly so that "failover" can be initiated promptly.
- helpdesk support
- compliance monitoring
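The "five 9's" figure above is simple arithmetic, and the same calculation works for any availability target. The following is a minimal Python sketch; the function name downtime_minutes_per_year and the use of a 365-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; assumes a non-leap year


def downtime_minutes_per_year(availability: float) -> float:
    """Return the yearly downtime budget, in minutes, for an availability fraction."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    targets = [("three 9's", 0.999), ("four 9's", 0.9999), ("five 9's", 0.99999)]
    for label, availability in targets:
        minutes = downtime_minutes_per_year(availability)
        print(f"{label}: {minutes:.2f} minutes/year")
```

Each additional 9 cuts the downtime budget by a factor of ten, which is why fast fault detection matters so much at the high end.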
Sometimes we look at network management as managing the network hardware and
software. Lots of traditional network management focuses almost entirely on
this. However, we can also talk about managing bandwidth,
which ultimately boils down to doing something other than giving everyone
(or every connection!) a roughly equal share of what is available.
Fault detection might not seem to be tied directly to our choices, but we do make choices that affect how readily
faults are detected. And anyone with the title "Network Manager" is
expected to detect and repair problems promptly!
A classic configuration decision is whether a medium-sized network should
use Ethernet switching exclusively, or should be divided into subnets so as
to make use of IP routing. The rise of Software-Defined Networking has
further complicated this choice.
SNMP (Simple Network Management Protocol) is a protocol associated with
retrieving network statistics from various "agents". Management
is the art of making initial configuration decisions, and then later
decisions based on SNMP data and other data to keep everything running
smoothly.
(For completeness, the OSI alternative to SNMP is also an option: it is
called Common Management Information Protocol, or CMIP. It is decades behind
schedule, and so may never be widely supported, but it is possibly a better
solution technically.)
Another form of network management is change
management. Is your site changing its IP address prefix, due to a
provider change? Are you migrating to use of private 10.0.0.0/8 IP
addresses, along with Network Address Translation (NAT) to reach the outside
world? Are you upgrading from Windows 10 to Xenial Xerus? There is a fair
bit of material in chapter 1 of Mauro & Schmidt devoted to the nuts and
bolts of change management: administration, testing, support, software
distribution, etc. There is also emergency
change management, usually initiated by the discovery of malware
(and usually, though not always, focused on distribution of service patches
or updates).
Other examples of management:
BGP policy-based routing: what can we do
with creative routing?
Linux Advanced Routing Toolkit: what tools do we have
for bandwidth allocation?
There is some conflict in the network-management world as to whether the main
focus is hardware (the physical
network at your site) or software services (web, servers, etc). Managing
bandwidth through allocation is something that many "network managers" do
not do at all.
How do you tell when a server is down??? When it's not responding? How long?
What if it responds to simple queries, but not complex ones?
Here are four rough sizes of networks:
- Building-sized (or campus-sized) single-business networks
- Multi-campus networks (eg Loyola's)
long links, sophisticated internal routing
- Internet Service providers
very long links, internal & external routing
- Data Centers, which may have ~100,000 servers and ~4,000 switches
Layers
7-layer, 5-layer models
Physical
LAN
Internetwork (IP)
IP addresses have a Net part and a Host
part. The division point is constant per LAN.
Transport (TCP, UDP)
Session
Presentation
Application
OSI 7-layer model:
wishful thinking from self-important bureaucrats trying
to justify their existence?
Not exactly, but not far off
Comments on Session & Presentation layers
Session: ssh ControlMaster connection! But we don't need this as a special
layer.
Presentation: ASN.1, BER: these are very important for SNMP!
Some synonyms: packet/frame/PDU/segment/??
Review of network building blocks
Workstations & Servers: endpoints
Software services live on these devices! Also, these devices speak IP
(Internet Protocol), and so you might want to collect stats on IP addresses
assigned, subnet masks, routers, DNS, etc.
Workstations have a 6-byte physical Ethernet address burned into the card
(occasionally there are problems with duplicate addresses; these are rare,
but pretty frustrating). On bootup, workstations acquire a 4-byte IP
address, usually via DHCP but occasionally by static configuration. They
also acquire, at a minimum,
- a subnet mask, which defines how the IP address splits into the net portion and host
portion
- a preferred router
- a DNS server, to translate, say, "ulam2.cs.luc.edu" to 147.126.65.47
The way DHCP works is that clients broadcast a DHCP query that contains
their physical address; the DHCP server on the same subnet answers it.
(Actually, usually the local-subnet router plays a role as a "forwarder" to
the real DHCP server, typically not
on the same subnet). The DHCP response includes the assigned IP address as
well as the information above, and sometimes a lot more information as well.
A subnet is defined as all hosts with a common IP net address, as determined
by the subnet mask. Two nodes with the same IP net address reach each other
directly, by sending to each other's physical Ethernet address (as discovered
by the ARP protocol). Two nodes on different subnets send to each other via
routers.
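The same-subnet test described above can be sketched with Python's standard ipaddress module. This is only an illustration: the helper name same_subnet is hypothetical, and the 255.255.255.0 mask is an assumed value, not necessarily the mask actually used with 147.126.65.47.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, netmask: str) -> bool:
    """True if both addresses have the same net portion under the given mask."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{netmask}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{netmask}").network
    return net_a == net_b


if __name__ == "__main__":
    mask = "255.255.255.0"  # assumed mask, for illustration only
    # The net portion of the address, as determined by the mask
    print(ipaddress.ip_interface(f"147.126.65.47/{mask}").network)
    # Same net portion: deliver directly via the physical address (ARP)
    print(same_subnet("147.126.65.47", "147.126.65.9", mask))
    # Different net portion: send via a router
    print(same_subnet("147.126.65.47", "147.126.2.9", mask))
```

A host makes essentially this comparison for every outgoing packet in order to decide between direct delivery and handing the packet to its router.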
Note that in order for the network to work, we need
- routers
- DHCP servers
- DNS servers
Repeaters/Hubs (or linear coax-based Ethernet)
Brief view of Ethernet packet format:
6 bytes destination address
6 bytes source address
2 bytes type (eg IP, IPX, ARP)
Data
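The header layout above (6 + 6 + 2 bytes, then data) can be unpacked in a few lines of Python. This is a sketch under assumptions: the function names are hypothetical, only three type codes are listed, and the frame is assumed to arrive without preamble or trailing checksum.

```python
import struct

# A few well-known Ethernet type codes
ETHERTYPES = {0x0800: "IP", 0x0806: "ARP", 0x8137: "IPX"}


def format_mac(raw: bytes) -> str:
    """Render a 6-byte physical address as colon-separated hex."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_ethernet_header(frame: bytes) -> dict:
    """Split a frame into destination, source, type, and data."""
    if len(frame) < 14:
        raise ValueError("frame shorter than the 14-byte Ethernet header")
    # "!" = network byte order; two 6-byte strings and one 16-bit integer
    dest, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return {
        "dest": format_mac(dest),
        "src": format_mac(src),
        "ethertype": ethertype,
        "protocol": ETHERTYPES.get(ethertype, "unknown"),
        "data": frame[14:],
    }


if __name__ == "__main__":
    # A made-up broadcast ARP frame with 28 bytes of zeroed data
    frame = bytes.fromhex("ffffffffffff" "001122334455" "0806") + bytes(28)
    print(parse_ethernet_header(frame))
```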
Linear coax had nothing to fail, except the cable itself. You noticed a
fault when you couldn't reach the other end. Repeaters in some sense are
simply an active replacement for coax; they retransmit the arriving bits on
all other interfaces, as they arrive; collisions are passed on. Some
repeaters do speak SNMP; they can report on the following:
- collision rates
- per-host traffic
- per-host/per-destination traffic
- total available bandwidth consumed
- Ethernet errors: packets too small, too large, insufficient gap,
corrupted packets
- Hardware errors within the repeater itself: interface errors, dropped
packets, temperature, OS faults, etc
Bridges/Switches
These devices shield segments from collisions. They construct tables of the
form ⟨dest,interface⟩.
If a packet arrives for destination D, and
there's an entry for ⟨D,i⟩, then the packet is forwarded only on interface
i; otherwise, it is forwarded on all interfaces except for the arrival
interface (that is, broadcast).
If a packet arrives on interface i from
origin D, then ⟨D,i⟩ is inserted into the table.
Thus, initially all packets are broadcast, but quickly the bridge builds its
table to route packets more efficiently, and soon each packet takes only the
direct path to its destination.
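The learning rule above is short enough to sketch directly. The class below is a toy model in Python, not any vendor's implementation; the class and method names are hypothetical. It also drops a packet whose destination is known to lie on the arrival interface, a detail the description above leaves out.

```python
class LearningSwitch:
    """Toy model of a learning bridge: a table of <address, interface> pairs."""

    def __init__(self, interfaces):
        self.interfaces = set(interfaces)
        self.table = {}  # physical address -> interface

    def receive(self, src, dest, arrival_interface):
        """Learn the source, then return the interfaces to forward on."""
        # A packet from src arrived on this interface, so src is reachable there
        self.table[src] = arrival_interface
        out = self.table.get(dest)
        if out is None:
            # Unknown destination: forward on every interface except the arrival one
            return sorted(self.interfaces - {arrival_interface})
        if out == arrival_interface:
            # Destination is on the arrival segment already; nothing to forward
            return []
        return [out]


if __name__ == "__main__":
    sw = LearningSwitch([1, 2, 3])
    print(sw.receive("A", "B", 1))  # B unknown: broadcast
    print(sw.receive("B", "A", 2))  # A was learned on interface 1
    print(sw.receive("A", "B", 1))  # B was learned on interface 2
```

After one packet in each direction, traffic between A and B no longer reaches interface 3 at all.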
Switches read in full packets; that is, each interface is a full Ethernet
interface. Thus, there is a full set of Ethernet data for each
interface. Additionally, most switches are capable of sophisticated
configuration, in which certain sets of ports (interfaces) are linked
together into virtual networks. Switch ports may not all run at the same
speed (eg there may be a mix of 100 Mbps Ethernet and gigabit Ethernet); the
switch's statistics can be used to help decide whether you're using the
different port capabilities optimally. Finally, switches may be able to
report information about the size of the forwarding tables and how many
non-broadcast packets arrive for which the destination is not found in the
table (forwarding errors).
Routers
IP routers work like switches, except that traffic is forwarded from one IP
network to another only by
arrangement. There is no analogue to "learning switches".
Routers do have IP addresses. They have information on rate of packets
routed, rate of routing-table modifications, etc.
Here's an important router question. What if I bring my home laptop into
work, and plug it into my office computer jack? Will this be detected? If
so, how? The DHCP server on the network might
notice that it has handed out an IP address to a physical address never
before seen, but I could bypass this by configuring my home laptop to use my
office machine's IP address. At that point, the router might
notice that my Ethernet address is different. Will it actually catch this?
How can it report some statistics that would let management notice what is
going on? Can routers be configured so as to attempt to prevent
this? (Many high-end wireless routers do attempt to block any traffic from
wifi physical addresses that haven't been authorized.)
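One way a device could produce a statistic that answers the question above is to remember which physical address was last seen with each IP address, and to count changes. The Python sketch below is purely illustrative: BindingMonitor and its methods are hypothetical names, the addresses are made up, and this is not a description of how any particular router behaves.

```python
class BindingMonitor:
    """Track IP-to-physical-address bindings and count the changes."""

    def __init__(self):
        self.ip_to_mac = {}
        self.changes = 0  # a counter that management could poll

    def observe(self, ip, mac):
        """Record one sighting; return 'new', 'same', or 'changed'."""
        mac = mac.lower()
        known = self.ip_to_mac.get(ip)
        self.ip_to_mac[ip] = mac
        if known is None:
            return "new"
        if known != mac:
            self.changes += 1
            return "changed"
        return "same"


if __name__ == "__main__":
    mon = BindingMonitor()
    print(mon.observe("10.0.0.5", "00:11:22:33:44:55"))  # office machine
    print(mon.observe("10.0.0.5", "00:11:22:33:44:55"))  # same machine again
    print(mon.observe("10.0.0.5", "66:77:88:99:AA:BB"))  # laptop reusing the IP
    print(mon.changes)
```

A nonzero change count is not proof of misuse (network cards do get replaced), but it is the kind of number that would let management notice that something is going on.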
Switches are considered "Layer 2" in the 7-layer and 5-layer models; routers
are "Layer 3". Sometimes one speaks of "layer-2 switching" versus "layer-3
switching".
A typical configuration decision is whether to have your site be one giant
subnet, where switched Ethernet is used to route packets from one
workstation to another, or whether to subdivide internally (eg by floor, or
department, or building) into IP subnets. Routers would then be needed to
move traffic from one subnet to the other. Routers serve to limit the scope
of broadcast traffic (such as ARP and DHCP requests). Routers are smarter
and more flexible, able to implement internal firewalls and other traffic
restrictions. However, routers are also slower, formerly an order of
magnitude slower.
In the lecture I'm working on how basic IP routing works, and how it works
with subnets. Here are some references to IntroNetworks:
Proxy Transport
At many sites, connection to the web is made not by direct connection to
remote webservers on port 80, but by connecting to a proxy server at
your site, which in turn makes the actual connections. The proxy server is
thus able to filter out some malicious material, and also can cache sites
for better bandwidth utilization. Proxy servers can be transparent,
like at Loyola, where you appear to be connecting directly to the remote
server's port 80 but in fact your connection has been intercepted, or else explicit,
in which case the address and port of the proxy server has to be configured
in your browser.
Concept of NMS: Network Management System
We will look (some) at OpenNMS; see opennms.org.
Agents: every device on the network that reports to the NMS is called an agent. Agents can report via SNMP
(below) or via some other mechanism.
The management station, or manager,
is the node to which agents report, either directly or indirectly. Indirect
reporting means that there is a "submanager" out there, collecting data from
a pool of agents and forwarding it up to the master manager.
Agent reporting may be initiated by the agent or, more commonly, by the
manager, through polling.
Some sort of PROTOCOL is used. Most common is SNMP, although application
software is often polled by "direct contact"; eg, we can verify that a
server is successfully running SMTP (email) by connecting to port 25 and
verifying that we see the expected responses. At some point we will look at
some of the java applets used by OpenNMS to attempt to contact various
servers to verify that services are running appropriately.
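The "direct contact" check for SMTP described above can be sketched in Python as follows. This is a minimal sketch, not how OpenNMS does it: the function names and the 5-second timeout are assumptions, and the check looks only at the greeting line, which for a working SMTP server begins with the reply code 220.

```python
import socket


def looks_like_smtp_greeting(banner: bytes) -> bool:
    """An SMTP server opens the conversation with a 220 reply."""
    return banner.startswith(b"220")


def smtp_is_up(host: str, port: int = 25, timeout: float = 5.0) -> bool:
    """Connect, read the first bytes sent by the server, and check them."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            banner = conn.recv(512)
    except OSError:
        # Connection refused, unreachable host, timeout, etc all count as "down"
        return False
    return looks_like_smtp_greeting(banner)
```

A call such as smtp_is_up("mail.example.com") (a placeholder hostname) returns True only if the server both accepts the connection and greets correctly. That is a stronger test than a ping, though it still says nothing about whether mail is actually being delivered.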
The following SNMP data is stored by the manager (possibly in a distributed
fashion):
MIB (Mgmt Information Base): the table
of attribute names and "lookup keys"
MDB (mgmt database): actual data
values
An NMS constantly monitors devices for function, operation, and
configuration, and reports problems in real time. The NMS can answer
questions about:
- Fault Management / Reliability
- Help-Desk management
- Configuration Management
- Security
- Performance
- Compliance
Is everyone running WinXP? Is everyone running the company version?
Does everyone have Service Update 09-31804 installed?
Is anyone plugging in devices that IT doesn't know about?
- Accounting
Brief intro to SNMP
SNMP, for Simple Network Management Protocol, is a way to get information from
each node on your network. Each device must run an SNMP "agent" module; for
example, workstations must run an SNMP software package in order to respond.
SNMP can be used readonly to poll the agents and retrieve data, or in
readwrite mode to update and configure the devices via their agents.
SNMP started as SGMP: Simple Gateway Monitoring Protocol,
in 1987 ("gateway" is an old term for "router"). It conflicted with the OSI
approach known as CMIP (Common Management Information Protocol). At the time
CMIP was too large and complex for practical implementation.
In 1988 the Internet Activities Board decided to pursue both SGMP and CMOT:
CMIP over TCP/IP. This dual approach failed within a year: CMOT was dropped,
and by then SGMP had evolved into SNMPv1.
Perhaps the first issue for SNMP is how we are going to NAME all the
possible attributes. Remember that many devices will have
manufacturer-specific attributes.
One important manufacturer-specific attribute is the device temperature.
SNMP defines an enormous tree-structured naming hierarchy, using strings of
digits known as Object IDentifiers, or OIDs. A diagram appears in Mauro
& Schmidt, page 24. Here are some upper levels:
1 iso
  3 org (identified organization)
    6 dod
      1 internet
        2 mgmt
          1 mib-2
        4 private
          1 enterprises
Thus, 1.3.6.1.2.1 is the OID prefix for the mib-2 data;
mib-2 was an early standardization of the SNMP data that would "usually" be
available. The prefix 1.3.6.1.4.1 is for "private", or
manufacturer-specific, data.
Here are some of the next mib-2 levels; we will use "mib2" to represent
"1.3.6.1.2.1"; thus mib2.5 denotes
"1.3.6.1.2.1.5"
mib2.1 system
mib2.2 interfaces
mib2.3 at (address translation, eg ARP tables; deprecated)
mib2.4 ip
mib2.5 icmp
mib2.6 tcp
mib2.7 udp
mib2.8 egp (obsolete)
mib2.9 cmot (placeholder for CMIP over TCP/IP; unused)
mib2.10 transmission
mib2.11 snmp
mib2.25 host resources
There are more.
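The "mib2" shorthand and the idea of an OID prefix can be made concrete with a few helper functions. This is a Python sketch with hypothetical names (expand, parse_oid, is_under). Note that OIDs are compared number by number, not as strings: 1.3.6.1.2.11 is not under 1.3.6.1.2.1, even though the strings share a prefix.

```python
MIB2_PREFIX = "1.3.6.1.2.1"
PRIVATE_PREFIX = "1.3.6.1.4.1"


def expand(shorthand: str) -> str:
    """Expand 'mib2.5' to '1.3.6.1.2.1.5'; other OIDs lose any leading dot."""
    if shorthand == "mib2":
        return MIB2_PREFIX
    if shorthand.startswith("mib2."):
        return MIB2_PREFIX + shorthand[len("mib2"):]
    return shorthand.lstrip(".")


def parse_oid(oid: str) -> tuple:
    """Convert a dotted OID string into a tuple of integers."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def is_under(oid: str, prefix: str) -> bool:
    """True if oid lies in the subtree rooted at prefix."""
    o, p = parse_oid(oid), parse_oid(prefix)
    return o[:len(p)] == p


if __name__ == "__main__":
    print(expand("mib2.5"))
    print(is_under("1.3.6.1.4.1.42", PRIVATE_PREFIX))
    print(is_under("1.3.6.1.2.11", MIB2_PREFIX))
```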
SET GET GET-NEXT, response, TRAP
atomic values only! Note use of GET-NEXT
The "base" MIB is MIB-2
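GET-NEXT is what makes it possible to walk a subtree without knowing in advance which OIDs an agent supports: the manager repeatedly asks for "the first OID after this one" until the answer leaves the subtree. The Python sketch below simulates this against an in-memory table. ToyAgent and walk are hypothetical names, the stored values are made up, and no real SNMP traffic is involved.

```python
import bisect


def parse_oid(oid):
    """Convert a dotted OID string into a tuple of integers."""
    return tuple(int(p) for p in oid.strip(".").split("."))


class ToyAgent:
    """Holds atomic values keyed by OID, kept in lexicographic OID order."""

    def __init__(self, values):
        self.data = {parse_oid(k): v for k, v in values.items()}
        self.keys = sorted(self.data)

    def get(self, oid):
        """GET: exact match only; None if there is no value at that OID."""
        return self.data.get(parse_oid(oid))

    def get_next(self, oid):
        """GET-NEXT: the first (oid, value) strictly after the given OID."""
        i = bisect.bisect_right(self.keys, parse_oid(oid))
        if i == len(self.keys):
            return None  # ran off the end of the agent's data
        key = self.keys[i]
        return ".".join(map(str, key)), self.data[key]


def walk(agent, root):
    """Repeat GET-NEXT from root until the reply leaves the subtree."""
    root_t = parse_oid(root)
    results, current = [], root
    while True:
        reply = agent.get_next(current)
        if reply is None or parse_oid(reply[0])[:len(root_t)] != root_t:
            return results
        results.append(reply)
        current = reply[0]


if __name__ == "__main__":
    agent = ToyAgent({
        "1.3.6.1.2.1.1.1.0": "toy system description",
        "1.3.6.1.2.1.1.5.0": "toy system name",
        "1.3.6.1.2.1.2.1.0": 2,
    })
    for oid, value in walk(agent, "1.3.6.1.2.1.1"):
        print(oid, "=", value)
```

Note that a GET on the subtree root 1.3.6.1.2.1.1 itself returns nothing, because only the leaves hold atomic values; the walk still finds them because GET-NEXT accepts any starting OID.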
Issues:
data presentation (eg byte order, but much more)
NAMING for all those possible attributes!
ASN.1/BER data representation: defer
data can be subdivided into fields, though it is not for SNMP.
A MIB assigns a specific attribute name and type to each of a set of OIDs.
(MIBs also define tabular data forms.) The OIDs
name the general attributes, not a specific instance. In that sense, OIDs
are like Java class definitions, not class instances.
Questions:
- given an OID, how do we find a MIB file that defines it?
- given a piece of hardware, how do we find a MIB that defines its SNMP
responses?
The first case corresponds to our seeing 1.3.6.1.2.1.1.9 in the output of
the system snmp walk; we did not,
however, know how to interpret the responses.
The second case is probably more common: you have a new switch, and need to
find out what kinds of SNMP data it submits in the private
(1.3.6.1.4.1) subtree.
If we run a MIB browser such as iReasoning, we can see the OIDs. Sometimes
googling for the OID will turn something up. Sometimes searching the MIB
files for, say, the string "system 9", to find OIDs of the form
system.9, will turn up what we want.
Demos using iReasoning tool and snmpwalk
We will use host ulam3 (10.38.2.42) for these demos
(/etc/default/snmpd by default binds snmpd only to localhost!)
snmpwalk -v 2c -c public ulam3 .1.3.6.1.2.1.1
snmpwalk -v 2c -c public ulam3 1.3.6.1.4.1
    End of MIB
snmpwalk -v 2c -c tengwar ulam3 1.3.6.1.4.1
    gads of data
snmpwalk -v 1 -c tengwar ulam3 1.3.6.1.4.1.42
    gads of data
As of 1 July 2016, the ulam3 SNMP community strings are "public", "futhark"
and "tengwar".
You can put .1.3.6.1.4.1.42 into the upper-right box of the iReasoning tool [at
least for ulam3]
Other ways of polling devices:
ssh: limitations:
- lack of "universal" account
- lack of "limited" account
- doesn't work for most hubs/switches/non-hosts