Network Management

Week 11, Apr 11
LT-412, Mon 4:15-6:45 pm

midterm resubmissions due today

SNMP v3
LARTC
BGP

Privacy:

The DES key is 64 bits (8 bytes); the other 8 bytes of the 16-byte key are used as the pre-initialization value (pre-IV). (Actually, DES keys have only 56 effective bits; the remaining 8 bits of the 64 are parity bits.)

How does this prevent the four risks:

DISCLOSURE: eavesdropping
MODIFICATION of messages or information. "man-in-the-middle" attacks
MASQUERADE: an endpoint pretending to be someone else.
MESSAGE STREAM MODIFICATION: reordering, delaying, or replaying messages.

Basic key management

An authoritative engine keeps a secret local key for each user (manager).

The nonAuthoritative side first generates a user key ku, based on the password:

    password => repeated to 1MB => take hash => ku

(The repetition to 1MB is intended to make the calculation slower, to make it harder to guess passwords if the ku is compromised)

Now take the 16-byte (for md5) ku, and form the concatenation ku^engineID^ku. Run that through MD5 to get kul, the localized user key (localized to a specific agent).

This calculation is done by the manager, which sends kul to the agent without revealing the password. (Typically, the actual ku is available to the agent at some initial point in the configuration process.)

This kul key cannot be used on another SNMP node, as its engineID will be different.

Summary:
    password => repeated to 1MB => take MD5 hash => ku
    MD5(ku^engineID^ku) => kul
Either side can do this calculation. However, normally the agent side keeps only kul.

Note that kul is never sent in the clear.
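The two-step calculation in the summary can be sketched in Python as follows. This is only a minimal illustration of the MD5 case as described above; the function names and the second engineID are made up for the example, and a real implementation should follow rfc3414 directly.

```python
import hashlib

ONE_MEGABYTE = 1_048_576  # the password is repeated out to 2^20 bytes


def password_to_ku(password: bytes) -> bytes:
    """Repeat the password out to 1 MB and take the MD5 hash, giving ku."""
    if not password:
        raise ValueError("password must be non-empty")
    reps, rem = divmod(ONE_MEGABYTE, len(password))
    return hashlib.md5(password * reps + password[:rem]).digest()


def localize_key(ku: bytes, engine_id: bytes) -> bytes:
    """Compute kul = MD5(ku ^ engineID ^ ku), where ^ is concatenation."""
    return hashlib.md5(ku + engine_id + ku).digest()


# Illustrative engineIDs for two different agents (assumed values).
engine_a = bytes.fromhex("000000000000000000000002")
engine_b = bytes.fromhex("000000000000000000000003")

ku = password_to_ku(b"maplesyrup")
kul_a = localize_key(ku, engine_a)
kul_b = localize_key(ku, engine_b)
print(kul_a.hex())
print(kul_b.hex())
```

The same ku yields a different kul for each engineID, which is why a kul stolen from one agent cannot be used on another.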

KEY Update, abstract version

How do you create new accounts on a linux system? By using adduser, over an encrypted ssh session. You just create each user from scratch.

But in SNMP we don't have shell sessions, and we may not even have encryption. The interesting case is creating authenticated new user accounts without risking the chance that an eavesdropper will learn the key, and without using encryption.

A simpler example is key update. Again, if we're doing this with encryption then we could simply send the new password. But without encryption we instead proceed as follows:

1. The user generates a new password, eg "zanzibar"
2. User creates localized key newkul for zanzibar + specific agent (kul = local user key)
3. We generate 16-byte string rand, cryptographically random
4. Calculate delta as follows:
    temp := md5(oldkul^rand)
    delta := temp XOR newkul
5. Send rand and delta to the authoritative engine (agent) end.
    why can't temp be inferred?
    why can't oldkul be inferred?

The agent can use rand and delta and known oldkul to compute temp and thus newkul.

(This is from page 35++ of rfc3414, slightly simplified.)
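Here is a small Python sketch of the simplified (rand, delta) exchange above, for 16-byte MD5 keys. The function names are invented for illustration, and the random keys stand in for real localized keys.

```python
import hashlib
import os


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_keychange(old_kul: bytes, new_kul: bytes, rand: bytes = None):
    """Manager side: produce (rand, delta) from the old and new keys."""
    if rand is None:
        rand = os.urandom(16)  # cryptographically random 16-byte string
    temp = hashlib.md5(old_kul + rand).digest()
    delta = xor_bytes(temp, new_kul)
    return rand, delta


def apply_keychange(old_kul: bytes, rand: bytes, delta: bytes) -> bytes:
    """Agent side: recompute temp from the known oldkul, then recover newkul."""
    temp = hashlib.md5(old_kul + rand).digest()
    return xor_bytes(temp, delta)


# Stand-ins for real localized keys (assumed values, for the demo only).
old_kul = os.urandom(16)
new_kul = os.urandom(16)

rand, delta = make_keychange(old_kul, new_kul)
recovered = apply_keychange(old_kul, rand, delta)
print(recovered == new_kul)
```

An eavesdropper sees rand and delta but not oldkul, so cannot compute temp; and since temp is the output of a hash, knowing it would still not directly reveal oldkul.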

When we do a key update, we actually send some encoding of (rand, delta) as a "keychange object". From rfc 3414:

The 'random' and 'delta' components are then concatenated as described above, and the resulting octet string is sent to the recipient as the new value of an instance of this object [ie a "keychange" object -- pld]

At the receiver side, when an instance of this [keychange] object is set to a new value, then a new value of K is computed as follows: ...

Note that this is yet another object-specific modification of the SET semantics: when you appear to be doing a simple assignment, you are in fact doing something rather more complex. (Other examples involved setting a testAndIncr semaphore, or a rowStatus object.)

usmUserTable

Agents keep usmUserTable with auth/priv info for each user. The index is user name (and local SNMP engineID, for proxy cases)

Agent starts out with one row in this table, configured manually. New users are created by cloning the row, and then updating the keys.

The original row is the "master" key, which may never be updated.

usmUserTable values
Note: some fields cannot be read!

engineID
userName	string
securityName	string, may be same as userName
cloneFrom	a row-pointer referring to the original row from which this row was originally cloned; used for creating new users
userAuthProtocol	none / md5 / sha1
userAuthKeyChange	key-change string as above, used for the original clone key change. It encodes the (rand, delta) pair above. Note that setting a value to this field doesn't actually set the stored kul to the value sent; rather, it triggers the kul + (rand,delta) -> newkul update process above.
userOwnAuthKeyChange	like the preceding entry, except this one can only be changed if the userName in the message security header matches the userName object for this row. This is used for users making further changes to their own keys.
userPrivProtocol	none / DES
userPrivKeyChange	like userAuthKeyChange
userOwnPrivKeyChange	like userOwnAuthKeyChange
userPublic	How do we find out if a key change worked, if we can't read the new key value? Set this field in the same request; either both were set or neither, so just check whether this one was set. This helps if the Response to the keychange request was what got lost.
userRowStatus	RowStatus column
userStorageType	volatile v nonvolatile storage

There is also a userSpinLock (a testAndIncr semaphore) for the entire table; it is not a column entry.

When doing key updates:

GET(userSpinLock.0) and save in sValue.
Generate the keyChange value based on the old (existing) secret key and the new secret key, let us call this kcValue. (This is the (rand, delta) data described above).
SET(userSpinLock.0=sValue, userAuthKeyChange=kcValue, userPublic=randomValue)

We verify that this worked with a GET request to see if userPublic changed to the randomValue specified.
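The GET/SET sequence above can be modeled with a toy in-memory agent. This is only a sketch of the semantics (an all-or-nothing SET guarded by a testAndIncr spin lock); ToyAgent and its methods are hypothetical and do not correspond to any real SNMP library.

```python
import hashlib
import os

SPIN_LOCK_WRAP = 2**31  # testAndIncr values run from 0 to 2147483647


def make_keychange(old_kul: bytes, new_kul: bytes) -> bytes:
    """Encode (rand, delta) as one string: rand followed by delta."""
    rand = os.urandom(16)
    temp = hashlib.md5(old_kul + rand).digest()
    return rand + bytes(t ^ n for t, n in zip(temp, new_kul))


class ToyAgent:
    """Hypothetical agent holding one user's key, the spin lock and userPublic."""

    def __init__(self, kul: bytes):
        self._kul = kul          # not readable by managers
        self.spin_lock = 0       # readable via GET
        self.public = b""        # readable via GET

    def set_key(self, spin_value: int, kc_value: bytes, public_value: bytes) -> bool:
        """All-or-nothing SET: fails if the spin lock value is stale."""
        if spin_value != self.spin_lock:
            return False
        rand, delta = kc_value[:16], kc_value[16:]
        temp = hashlib.md5(self._kul + rand).digest()
        self._kul = bytes(t ^ d for t, d in zip(temp, delta))
        self.public = public_value
        self.spin_lock = (self.spin_lock + 1) % SPIN_LOCK_WRAP
        return True


old_kul, new_kul = os.urandom(16), os.urandom(16)
agent = ToyAgent(old_kul)

s_value = agent.spin_lock                      # GET(userSpinLock.0)
marker = os.urandom(8)                         # the randomValue for userPublic
agent.set_key(s_value, make_keychange(old_kul, new_kul), marker)
key_changed = (agent.public == marker)         # GET(userPublic) to verify

# A second manager still holding the old spin lock value is refused.
stale_ok = agent.set_key(s_value, make_keychange(new_kul, old_kul), b"x")
print(key_changed, stale_ok)
```

Checking userPublic afterwards tells the manager whether the SET took effect even if the Response was lost.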

Adding a new user:
Params: remote addr, new username, passwords, row to clone from

1. See if there already is a row for that user. If it exists but is not ACTIVE, then maybe another mgr is adding this user.

2. Create new row, by cloning, and put into CreateAndWait state.

3. Use master user to update auth/privacy keys. Note that in the future the user's own new password will work.

What if a user/password USR1 is compromised? This would have to be at the manager end. If so, we hopefully have the MASTER account to disable USR1 and then create credentials for a new USR2.

Creating and Cloning Users

Agent initialization with master account <root,ARTICHOKE>:

1. new "agent" (no longer called that!) is configured with the master <userid,passwd> that is, <root,ARTICHOKE>. The agent does the passwd=>kul conversion (ARTICHOKE=>kul_arti) immediately, and stores only kul_arti. (note that kul_arti is specific to that agent, by virtue of the use of the agent's engineID; the "l" in "kul" means "local", as in "local to that agent".)

(technically you have to TRUST the agent software not to keep ARTICHOKE around. NetSNMP is not so well-behaved here!)

All but the most secure sites will use the same master password for large classes of devices (perhaps one master password per site, or per subnet, or per scope-of-management). Somebody has to keep a list of each agent and its master <userid,password>!

Compromise of one agent does not reveal ARTICHOKE, assuming only kul_arti was stored.

Now let us clone a new user account <ivan,ZANZIBAR>:

To set up this specific account, someone manually enters <root,ARTICHOKE> into the NMS or other software (hereinafter called the "manager"). The manager takes ARTICHOKE and generates the same kul_arti. This is used to authenticate messages to the agent. (kul_arti is used either as the DES encryption key or as the appended "fingerprint" for MD-5/SHA1 authentication; either way, it is not sent in the clear).

The manager issues a kul_arti-authenticated command to "clone" the master <root, kul_arti> entry to create a new entry <ivan, kul_arti>, and then immediately changes the key, eg to <ivan,kul_zan>, using the normal key-change protocol. The clone and keychange can be done in one atomic step, or else the new row's rowStatus could remain createAndWait until the key change has been completed. (There is nothing actually wrong about having a new account with the old key.)

Cloning means to create the new row, using the new username as the key:
    SET(newuser.userName='ivan', newuser.userCloneFrom = root, newuser.userStatus = createAndWait)

    SET(newuser.userAuthKeyChange = authKeyDelta, newuser.userPublic = random)

After verifying that the update "took" (ie that the newuser.userPublic field did in fact change), one does
    SET (newuser.userStatus = active)

Note that cloning an entry means creating a new row in a table. All this requires a trusted admin to enter the master password ARTICHOKE into the manager, once. Just once, if credentials are persistent.

4. New user ivan can initiate a personal key change at any time. This starts with the choice of a new password, say ZAKUSKI, and the communication of this to the agent which will update <ivan,kul_zan> to <ivan,kul_zak>. Ivan can do this to some or all of the agents he communicates with, BUT I cannot imagine a reason for doing it to just some.

5. The master account password can also be updated, under the direction of the root administrator, either for individual agents or whole scads of them.

Even if we use the same root password throughout the site, it sure beats using an SNMPv1 community string (ie a password) that cannot be changed remotely (and thus likely will not be changed in practice at all), and that is sent in the clear.

Password update can be done because of concerns about human-user leakage (too many people know the password ZAKUSKI), or codebreaker leakage (a codebreaking algorithm is applied to a large set of packets). The latter is probably less of a concern than the former.

Initial NetSNMP USM configuration

See http://www.net-snmp.org/docs/README.snmpv3.html.

The official way is to run net-snmp-config --create-snmpv3-user, which generates the following dialog:

Enter a SNMPv3 user name to create:
master
Enter authentication pass-phrase:
saskatchewan
Enter encryption pass-phrase:
[press return to reuse the authentication pass-phrase]
novosibirsk
adding the following line to /var/net-snmp/snmpd.conf:
   createUser master MD5 "saskatchewan" DES novosibirsk
adding the following line to /usr/local/share/snmp/snmpd.conf:
   rwuser master

This did create the /var/net-snmp/snmpd.conf entry. However, when I ran
    snmpget -v 3 -u master -l authNoPriv -a MD5 -A saskatchewan localhost sysUpTime.0
I got
    authorizationError (access denied to that object)

It turned out that when I first did this, the entry rwuser master was in the wrong place; it needed to be put in /etc/snmp/snmpd.conf. And then it worked. (Actually I used rouser). Now, the snmpd.conf file is in /usr/local/share/snmp.

Note that use of the rwuser/rouser directives is not part of VACM, at least not apparently.

Also note that the version, username, security_level (-l), hash protocol (-a), and password (-A) can all be set in your per-user .snmp/snmp.conf file; having done this (except for the version) I can also just use

    snmpget -v 3 localhost sysUpTime.0

One can create multiple snmp users this way, bypassing the SNMPv3 clone-from process, or just one. To clone users, net-SNMP provides the snmpusm shell command; a typical example to clone user "pld" from the "master" account created above would be

snmpusm -v3 -u master -l authNoPriv -a MD5 -A saskatchewan localhost create pld master

snmpusm -v3 -u pld -l authNoPriv -a MD5 -A saskatchewan localhost passwd saskatchewan ramblers

The first command creates pld as a clone of master, still with password "saskatchewan", and the second sets pld's password to "ramblers".

Note that replacing localhost with another hostname allows you to use this command to clone a user on any SNMPv3 agent for which you have credentials.

Note also that NetSNMP used to be bad about hanging on to the original password. The snmpd.conf manual page states that the /var/net-snmp/snmpd.conf entry will be replaced with one containing just kul, but this does not seem to work:

This directive [the createUser line above] should be placed into the /var/net-snmp/snmpd.conf file instead of the other normal locations. The reason is that the information is read from the file and then the line is removed (eliminating the storage of the master password for that user) and replaced with the key that is derived from it. This key is a localized key, so that if it is stolen it can not be used to access other agents. If the password is stolen, however, it can be.

Earlier the passwords remained in the file, but on my current version, 5.4.2, the contents of the file are

############################################################################
# STOP STOP STOP STOP STOP STOP STOP STOP STOP
#
# **** DO NOT EDIT THIS FILE ****
#
# STOP STOP STOP STOP STOP STOP STOP STOP STOP
############################################################################
#
# DO NOT STORE CONFIGURATION ENTRIES HERE.
# Please save normal configuration tokens for snmpd in SNMPCONFPATH/snmpd.conf.
# Only "createUser" tokens should be placed here by snmpd administrators.
# (Did I mention: do not edit this file?)

usmUser 1 3 0x80001f888091ab263b3c8be84a 0x6d617374657200 0x6d617374657200 NULL .1.3.6.1.6.3.10.1.1.2 0xd60568f337ab44fde4fc36903979b254 .1.3.6.1.6.3.10.1.2.2 0x68c41efe21d540a6427c7528091dbb97 ""
setserialno 897769326
##############################################################

Much safer!

See also /home/pld/.snmp/snmp.conf. This is what enables
snmpget -v 3 localhost sysUpTime.0

OpenNMS use of SNMPv3:
see http://opennms.org/index.php/SNMPv3_protocol_configuration

So far we've been managing devices (or links). Now it's time to think about managing traffic.

Methods for managing traffic

Queuing disciplines
policing
traffic control
reservations
prioritizing
firewalls

Many of these are handled through the LARTC package: Linux Advanced Routing & Traffic Control

leaf-node zones: here we can regulate who has what share of bandwidth
Core problem: we likely can't regulate inbound traffic directly, as it's already been sent!

Notes on using tc to implement traffic control

Goal: introduce some notion of "state" to stateless routing

LARTC HOWTO        -- Bert Hubert, et al
    http://lartc.org/howto

A Practical Guide to Linux Traffic Control    -- Jason Boxman
http://blog.edseek.com/~jasonb/articles/traffic_shaping
    good diagrams

Traffic Control HOWTO, v 1.0.2     -- Martin Brown: local copy in pdf format

Policy Routing with Linux, Matthew Marsh (PRwL)

Good sites:

http://linux-ip.net/articles/Traffic-Control-HOWTO/classful-qdiscs.html

http://tldp.org/HOWTO/Adv-Routing-HOWTO/lartc.adv-filter.u32.html:
good article on u32 classifier

Good stuff on "real-world scenarios":
http://www.trekweb.com/~jasonb/articles/traffic_shaping/scenarios.html

The linux packages we'll be looking at include:

iptables: for basic firewall management, including marking packets. Dates from 1998

iproute2: for actually routing packets based on more than their destination address. Also provides the ip command for maintaining all the system network state. Dates from ~2001?

tc: traffic control, for creating various qdiscs and bandwidth limits

inbound, outbound: these terms are always relative to your entire site

Policing: dropping or delaying naughty packets. Delaying inbound packets may not really do much! The inbound bandwidth has already been consumed. If you are lucky, your inbound delay will mean that the traffic endpoint will delay its ACK, which in turn will mean that the sender will slow down in the future.

Shaping: (some qdiscs, like TBF, do this) shaping means delaying packets, on egress. Typically done for outbound packets. Typically this means non-work-conserving queuing

scheduling (all qdiscs do this). This is prioritizing packets. No explicit delay is introduced, but there may be implicit delay as your packets get shuffled to the bottom.
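To make the shaping idea concrete, here is a toy token-bucket calculation in Python, in the spirit of TBF. It is only a sketch under simplifying assumptions (time is passed in explicitly, every packet is no larger than the burst size, and packets leave in arrival order); the class and parameter names are invented and are not tc's.

```python
class TokenBucket:
    """Toy shaper: tokens are bytes, refilled at `rate` bytes per second."""

    def __init__(self, rate: float, burst: float):
        self.rate = rate
        self.burst = burst      # bucket capacity, in bytes
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # time of the last token update

    def release_time(self, arrival: float, size: float) -> float:
        """Return the time at which a packet of `size` bytes may be sent."""
        # Packets are released in order, never before the previous release.
        now = max(arrival, self.last)
        # Credit the tokens accumulated since the last update.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens < size:
            # Not enough tokens: the packet is delayed until there are.
            now += (size - self.tokens) / self.rate
            self.tokens = size
            self.last = now
        self.tokens -= size
        return now


# Three 1000-byte packets arrive together at t=0; rate 1000 B/s, burst 1500 B.
bucket = TokenBucket(rate=1000.0, burst=1500.0)
times = [bucket.release_time(0.0, 1000.0) for _ in range(3)]
print(times)  # [0.0, 0.5, 1.5]
```

The link sits idle between releases even though packets are waiting, which is what makes this non-work-conserving.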

Queuing:
    bandwidth: what packets get sent
    promptness: when will they be sent
    who gets dropped (inverse of who gets sent, sort of)

Queuing Disciplines (qdisc): does scheduling. Some also support shaping/policing.
A qdisc determines how packets are enqueued and dequeued. Some options:

fifo + taildrop
longest-queue taildrop (LQTD): a router with multiple queues shares memory between them; when a packet arrives, the longest queue is always the one that experiences the drop
fifo + random drop
RED: introduces random drops not for policing, but to encourage good behavior by senders. Used in core networks, not leaf networks
stochastic fair queuing (each TCP connection is a flow). SFQ gives each flow a guaranteed fraction of bandwidth, when needed. Other SFQ flavors: flows are subnets, etc. However, if we're doing scheduling of inbound traffic, it doesn't do much good to do SFQ based on destination (unless we can do it at the upstream router at our ISP)
pfifo_fast (or, generically, pfifo): priority fifo

Some reasons for an application to open multiple TCP connections:

cheating SFQ limits
much improved high-bandwidth performance
the nature of the data naturally divides into multiple connections

tc's pfifo_fast qdisc has three priority bands built-in: 0, 1, and 2.
        enqueuing: figure out which band the packet goes into (based on any packet info; eg is it VOIP?)
        dequeuing: take from band 0 if nonempty, else 1 if nonempty, else 2
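The enqueue/dequeue rule for three bands can be sketched in a few lines of Python. This is a toy model only; the real pfifo_fast chooses the band from the packet's TOS/priority field, whereas here the caller supplies the band directly.

```python
from collections import deque


class ThreeBandQueue:
    """Toy strict-priority queue with bands 0 (highest), 1 and 2."""

    def __init__(self):
        self.bands = [deque(), deque(), deque()]

    def enqueue(self, packet, band: int) -> None:
        """Place the packet at the tail of its band's FIFO."""
        self.bands[band].append(packet)

    def dequeue(self):
        """Take from band 0 if nonempty, else band 1, else band 2."""
        for band in self.bands:
            if band:
                return band.popleft()
        return None  # all bands empty


q = ThreeBandQueue()
q.enqueue("bulk-1", 2)
q.enqueue("web-1", 1)
q.enqueue("voip-1", 0)
q.enqueue("bulk-2", 2)
q.enqueue("voip-2", 0)

order = []
while (pkt := q.dequeue()) is not None:
    order.append(pkt)
print(order)  # ['voip-1', 'voip-2', 'web-1', 'bulk-1', 'bulk-2']
```

Note that a steady stream of band-0 traffic would starve bands 1 and 2 entirely.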

ipTables

IPtables should probably be thought of as a firewall tool: it can filter traffic, mark/edit headers, and implement NAT.
How to direct traffic (at least without destaddr rewriting) was somewhat limited.

Iptables has 5 builtin chains, representing specific points in packet processing (and implemented as five specific hooks in the kernel). Chains are lists of rules. The basic predefined chains are:

PREROUTING: for before a routing decision is made; includes packets from the outside world and locally generated packets.
INPUT: for packets to be delivered locally to this host.
FORWARD: for packets that will be routed, and which are not for local delivery.
OUTPUT: locally generated packets; either remote or local destination.
POSTROUTING: outbound traffic

You can define your own chains, but they are pretty esoteric unless you're using them as "chain subroutines", called by one of the builtin chains.

Rules contain packet-matching patterns and actions. Typically, a packet traverses a chain until a rule matches; sometimes the corresponding action causes a jump to another chain or a continuation along the same chain, but the most common case is that we're then done with the packet.
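The traversal logic can be modeled in Python as below. This is a simplified sketch, not how the kernel implements it: rules are (match-function, target) pairs, only ACCEPT, DROP, RETURN and jumps to user-defined chains are modeled, and the "tcpcheck" chain and packet fields are made up for the example.

```python
def run_chain(chains, name, packet):
    """Walk one chain; return ACCEPT/DROP, or None if we fall off the end."""
    for match, target in chains[name]:
        if not match(packet):
            continue
        if target in ("ACCEPT", "DROP"):
            return target                    # done with the packet
        if target == "RETURN":
            return None                      # back to the calling chain
        verdict = run_chain(chains, target, packet)  # jump to a user chain
        if verdict is not None:
            return verdict
        # The user chain did not decide; continue along this chain.
    return None


def filter_packet(chains, builtin, packet, policy="ACCEPT"):
    """Builtin chains fall back to their policy if no rule decides."""
    verdict = run_chain(chains, builtin, packet)
    return verdict if verdict is not None else policy


chains = {
    "INPUT": [
        (lambda p: p["proto"] == "icmp", "DROP"),
        (lambda p: p["proto"] == "tcp", "tcpcheck"),
    ],
    "tcpcheck": [  # hypothetical user-defined "chain subroutine"
        (lambda p: p.get("dport") == 22, "ACCEPT"),
        (lambda p: p.get("dport") == 23, "DROP"),
    ],
}

print(filter_packet(chains, "INPUT", {"proto": "icmp"}))              # DROP
print(filter_packet(chains, "INPUT", {"proto": "tcp", "dport": 22}))  # ACCEPT
print(filter_packet(chains, "INPUT", {"proto": "tcp", "dport": 80}))  # ACCEPT (policy)
```

The last packet matches nothing in tcpcheck, returns to INPUT, runs off the end of INPUT, and so gets the chain's default policy.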

Tables are probably best thought of as sets of chains, except that the same chain can appear in multiple tables. In some contexts it is more accurate to think of the tables as data structures referred to by the chains.

Specifically, in iptables the tables are

FILTER contains INPUT, OUTPUT and FORWARD chains (below)
NAT contains PREROUTING, OUTPUT and POSTROUTING.
MANGLE contains PREROUTING and OUTPUT (and, since kernel 2.4.18, also INPUT, FORWARD and POSTROUTING, as in the diagram below)
RAW, for various low-level packet updating

Targets: ACCEPT, DROP, REJECT, MASQUERADE, REDIRECT, RETURN

The FILTER table is where we would do packet filtering. The MANGLE table is where we would do packet-header rewriting. The MANGLE table has targets TOS, TTL, and MARK.

Obvious application: blocking certain categories of traffic
Not-so-obvious: differential routing (so different traffic takes different paths), and actually tweaking traffic (with MANGLE; can be done either before or after routing)

Here is a diagram from http://ornellas.apanela.com/dokuwiki/pub:firewall_and_adv_routing indicating the relationship of the chains to one another and to routing.

Note that the Local Machine is a sink for all packets entering, and a source for other packets. Packets do not flow through it. The second Routing Decision is for packets created on the local machine which are sent outwards (or possibly back to the local machine).

      Incoming
       Traffic
          |
          |
          V
     +----------+
     |PREROUTING|
     +----------+
     |   raw    |  <--------------+
     |  mangle  |                 |
     |   nat    |                 |
     +----------+                 |
          |                       |
          |                       |
       Routing                    |
    +- Decision -+                |
    |            |                |
    |            |                |
    V            V                |
  Local        Remote             |
Destination   Destination         |
    |            |                |
    |            |                |
    V            V                |
+--------+  +---------+           |
| INPUT  |  | FORWARD |           |
+--------+  +---------+           |
| mangle |  | mangle  |           |
| filter |  | filter  |           |
+--------+  +---------+           |
    |            |                |
    V            |                |
  Local          |                |
 Machine         |                |
                 |                |
                 |                |
  Local          |                |
 Machine         |                |
    |            |                |
    V            |                |
 Routing         |                |
 Decision        |                |
    |            |                |
    V            |                |
+--------+       |                |
| OUTPUT |       |                |
+--------+       |                |
|  raw   |       |                |
| mangle |       |                |
| filter |       |                |
+--------+       |                |
    |            |                |
    |      +-------------+        |
    |      | POSTROUTING |      Local
    +----> +-------------+ --> Traffic
           |   mangle    |
           |     nat     |
           +-------------+
                 |
                 |
                 V
              Outgoing
              Traffic

From the iptables man page:

filter This is the default table.   It contains the built-in chains
      INPUT (for packets coming into the box itself), FORWARD (for
      packets being routed through the box), and OUTPUT (for locally-
      generated packets).

pld: generally, users add things to the forward chain. If the box is acting as a router, that's the only one that makes sense.

nat    This table is consulted when a packet that creates a new connection
      is encountered. It consists of three built-ins: PREROUTING
      (for altering packets as soon as they come in), OUTPUT (for
      altering locally-generated   packets   before   routing),   and
      POSTROUTING (for altering packets as they are about to go out).

pld: The NAT table is very specific: it's there for implementing network address translation. Note that the kernel must keep track of the TCP state of every connection it has seen, and also at least something about UDP state. For UDP, the kernel pretty much has to guess when the connection is ended. Even for TCP, if the connection was between hosts A and B, and host A was turned off, and host B eventually just timed out and deleted the connection (as most servers do, though it isn't really in the TCP spec), then the NAT router won't know this.

Part of NAT is to reuse the same port, if it is available; port translation is only done when another host inside NAT-world is already using that port.
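The port-reuse rule can be illustrated with a toy mapping table in Python. This is a deliberate simplification: a real NAT implementation also keys on protocol and remote endpoint and expires idle entries, and the fallback port range used here is an assumption chosen for the example.

```python
from itertools import chain


class PortMapper:
    """Toy NAT table mapping (inside host, inside port) to an outside port."""

    def __init__(self):
        self.by_inside = {}    # (inside_ip, inside_port) -> outside_port
        self.by_outside = {}   # outside_port -> (inside_ip, inside_port)

    def outbound(self, inside_ip: str, inside_port: int) -> int:
        """Return the outside port for this inside endpoint, allocating if new."""
        key = (inside_ip, inside_port)
        if key in self.by_inside:
            return self.by_inside[key]
        # Prefer the same port; translate only if another host already has it.
        for candidate in chain([inside_port], range(1024, 65536)):
            if candidate not in self.by_outside:
                self.by_inside[key] = candidate
                self.by_outside[candidate] = key
                return candidate
        raise RuntimeError("no free ports")

    def inbound(self, outside_port: int):
        """Find the inside endpoint for a returning packet, or None."""
        return self.by_outside.get(outside_port)


nat = PortMapper()
p1 = nat.outbound("10.0.0.1", 5000)   # port is free, so it is reused
p2 = nat.outbound("10.0.0.2", 5000)   # collision, so a different port is chosen
print(p1, p2, nat.inbound(p2))
```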

mangle This table is used for specialized packet alteration.   It has
      two built-in chains: PREROUTING (for altering incoming packets
      before routing) and OUTPUT (for altering locally-generated packets
      before routing).

pld: classic uses include tweaking the Type-Of-Service (TOS) bits. Note that it's actually kind of hard to tell if an ssh connection is interactive or bulk; see the example from Boxman below.

A second application is to set the fw_mark value based on fields the iproute2 RPDB (Routing Policy DataBase) cannot otherwise see. (RPDB can see the fw_mark). This is often used as an alternative to "tc filter".

An extension of this is the CLASSIFY option:
        iptables -t mangle -A POSTROUTING -o eth2 -p tcp --sport 80 -j CLASSIFY --set-class 1:10

The CLASSIFY option is used with the tc package; it allows us to place packets in a given tc queue by using iptables.


potential inconsistencies:
    traffic to port 21 gets "ICMP message 1",
    traffic to port 53 gets "ICMP message 2"
    traffic to port 70 gets blackholed
But in practice this is not such an issue.

Here are ulam3's iptables entries for enabling NAT. Ethernet interface eth0 is the "internal" interface; eth1 connects to the outside world. In the NAT setting, the internal tables are in charge of keeping track of connection mapping; each outbound connection from the inside (any host/port) is assigned a unique port on the iptables host.

OUT_IFACE=eth1
IN_IFACE=eth0 # the interface to which the private subnet is connected
iptables --table nat --append POSTROUTING --out-interface $OUT_IFACE -j MASQUERADE

# the next entry is for the "PRIVATE NETWORK" interface
iptables --append FORWARD --in-interface $IN_IFACE -j ACCEPT

echo 1 > /proc/sys/net/ipv4/ip_forward

These examples are from http://netfilter.org/documentation/HOWTO//packet-filtering-HOWTO.html.

Here is an example of how to block responses to pings:

 iptables --table filter -A INPUT -p icmp -j DROP

To remove: iptables --delete INPUT 1 (where 1 is the rule number). An even better command is

iptables -A INPUT -p icmp --icmp-type echo-request -j DROP

The icmp-type options can be obtained with the command iptables -p icmp --help.
Demo on linux1. The idea is that we are zapping all icmp packets as they arrive in the INPUT chain.

We are appending (-A) to the INPUT chain; no source address is given, so packets from any source match. The protocol is icmp, and if a packet matches we jump (-j) to the DROP target. Note that we are dropping the inbound requests, so no response is ever generated.

The above rule is in the INPUT chain because we are blocking pings to this host. If we want to block pings through this host, we add the rule to the FORWARD chain.

Here is a set of commands to block all inbound tcp connections.

First, we create a new chain named block.

## Create chain which blocks new connections, except if coming from inside.
# iptables -N block

# create the block chain
# iptables -A block -m state --state ESTABLISHED,RELATED -j ACCEPT
# iptables -A block -m state --state NEW -i ! ppp0 -j ACCEPT
# iptables -A block -j DROP

## Jump to that chain from INPUT and FORWARD chains.
# iptables -A INPUT -j block
# iptables -A FORWARD -j block

The -j option means to jump to the block chain, but then return if nothing matches. However, as the last rule always matches, this doesn't actually happen.

The interface is specified with -i; the second block entry states that the interface is anything not ppp0. (Newer versions of iptables expect the negation before the option, as in ! -i ppp0.)

The -m means to load the specific "matching" module; -m state --state NEW means that we are loading the connection-state matcher, and that we want to match packets starting a NEW connection.

Here is an example of how to block all traffic to port 80. It does not use new chains.

iptables --table filter -A INPUT -p tcp --dport 80 -j DROP

The option --sport (source port) is also available, as is --tcp-flags. Also, -p (protocol) works for icmp, udp, all

Here is how to allow traffic only to ports 80 and 22 (and, here, 31337):

iptables -A INPUT -p tcp --dport 80 -j ACCEPT
iptables -A INPUT -p tcp --dport 22 -j ACCEPT
iptables -A INPUT -p tcp --dport 31337 -j ACCEPT
iptables -A INPUT -p tcp -j DROP

On my home router, I have the command blockhost, that does the following:

iptables --table filter --append FORWARD --source $HOST --protocol tcp --destination-port 80 --jump DROP
iptables --table filter --append FORWARD --destination $HOST --protocol tcp --source-port 80 --jump DROP

Note that the router lies between the host and the outside world; that is, the traffic is forwarded rather than delivered locally, so I must use the FORWARD chain. Also, I block inbound traffic from port 80, and also outbound traffic to port 80. (I also have a command to block all traffic, when necessary.)

What if we want to throttle traffic of a particular user? We have the iptables owner module. It applies only to locally generated packets. If we're throttling on the same machine that the user is using, we can use this module directly.

If the throttling needs to be done on a router different from the user machine, then we need a two-step approach. First, we can use this module to mangle the packets in some way (eg set some otherwise-unused header bits, or forward the packet down a tunnel). Then, at the router, we restore the packets and send them into the appropriate queue.

Iptables can also base decisions on the TCP connection state using the state module and the --state state option, where state is a comma-separated list of the connection states to match. Possible states are:

INVALID: the packet could not be identified for some reason, which includes running out of memory and ICMP errors which don't correspond to any known connection
ESTABLISHED: the packet is associated with a connection which has seen packets in both directions
NEW: the packet has started a new connection, or is otherwise associated with a connection which has not seen packets in both directions
RELATED: the packet is starting a new connection, but is associated with an existing connection, such as an FTP data transfer, or an ICMP error

[from man iptables]
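The distinction between NEW and ESTABLISHED can be sketched with a toy connection tracker in Python. It models only those two states, using the "seen packets in both directions" definition quoted above; INVALID and RELATED are not modeled, there are no timeouts, and the class is hypothetical rather than a description of the kernel's conntrack code.

```python
class ToyConntrack:
    """Toy tracker: a flow is ESTABLISHED once both directions have been seen."""

    def __init__(self):
        # unordered endpoint pair -> {"origin": first sender, "replied": bool}
        self.flows = {}

    def classify(self, src, dst) -> str:
        """src and dst are (address, port) tuples; returns NEW or ESTABLISHED."""
        key = frozenset((src, dst))
        flow = self.flows.get(key)
        if flow is None:
            self.flows[key] = {"origin": src, "replied": False}
            return "NEW"
        if src != flow["origin"]:
            flow["replied"] = True   # traffic now seen in both directions
        return "ESTABLISHED" if flow["replied"] else "NEW"


inside = ("10.0.0.5", 40000)     # illustrative endpoints
server = ("192.0.2.10", 80)

ct = ToyConntrack()
states = [
    ct.classify(inside, server),   # first packet of the connection
    ct.classify(inside, server),   # retransmission, still no reply seen
    ct.classify(server, inside),   # the reply
    ct.classify(inside, server),   # subsequent traffic
]
print(states)  # ['NEW', 'NEW', 'ESTABLISHED', 'ESTABLISHED']
```

A rule accepting ESTABLISHED while dropping NEW from the outside interface therefore lets replies in but refuses connections initiated from outside.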