NetMgmt Week 7

Network Management Week 7

Fall 2009; LT-412, Wed 4:15-6:45 pm

DISMAN
private mibs
cisco mibs
CMIP
java
RMON
OpenNMS

Notes on the midterm, set for October 14:
It will be open book, but not open-note.
If I ask a question about a mib file, you will have access to the file.
You should know something about the creation and layout of mib files.
You should know about how SNMP stores and manages tables, including all the ways of fetching tables.
There will be no new MIBs presented (though there might be "hypothetical" mibs)

DISMAN

Notes on DISMAN-EVENT-MIB.txt (for DIStributed MANagement; this is an important group)

Its OID root is { mib-2 88 }.

The point is to define triggers and actions.

mteTriggerTable
mteTriggerValueID
mteTriggerTest: outlines the three types of triggers.

For 'existence', the specific test is as selected by mteTriggerExistenceTest. When an object appears, vanishes or changes value, the trigger fires. If the object's appearance caused the trigger firing, the object MUST vanish before the trigger can be fired again for it, and vice versa. If the trigger fired due to a change in the object's value, it will be fired again on every successive value change for that object.

For 'boolean', the specific test is as selected by mteTriggerBooleanTest. If the test result is true the trigger fires. The trigger will not fire again until the value has become false and come back to true.
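
The re-arming rule for 'boolean' triggers amounts to edge detection: fire on a false-to-true transition only. A minimal sketch follows; the class name is made up for illustration, and it assumes the test result starts out false (the real MIB has a separate startup setting that is ignored here).

```java
// Hypothetical helper (not from any SNMP library): models the re-arming rule
// for 'boolean' triggers described above.
class BooleanTrigger {
    // Assumption: the test result starts out false, so a first true sample fires.
    private boolean lastResult = false;

    // Feed one sampled test result; returns true if the trigger fires on this sample.
    boolean sample(boolean testResult) {
        boolean fires = testResult && !lastResult;   // only a false -> true edge fires
        lastResult = testResult;
        return fires;
    }

    public static void main(String[] args) {
        BooleanTrigger trigger = new BooleanTrigger();
        boolean[] samples = { false, true, true, false, true };
        for (boolean s : samples) {
            System.out.println(s + " -> " + (trigger.sample(s) ? "fires" : "quiet"));
        }
    }
}
```

With the samples in main, the trigger fires on the second and fifth samples only; the third (still true) is quiet.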

Trigger Delta Table
    for delta sampling
Trigger Existence Table
    specifies information about an OID where the triggering occurs when something starts or ceases to exist
Trigger Boolean Table
    specifies whether we expect a given value to be true or false
Trigger Threshold Table
    information about "threshold" triggers
Objects Table

    mteObjectsName                      SnmpAdminString,
    mteObjectsIndex                     Unsigned32,
    mteObjectsID                        OBJECT IDENTIFIER,
    mteObjectsIDWildcard                TruthValue,
    mteObjectsEntryStatus               RowStatus

Event Table

Enterprise (private) mibs

I have a large directory of various mib files; one can search in it for "enterprises"
egrep '{ *enterprises *[0-9]+' * # note the pattern here, between single quotes

2	ibm
9	cisco
11	hp
16	hpnr
23	novell
36	dec
77	lanmanager
119	nec
224	lanOptics
232	compaq
311	microsoft
353	atmForum
494	madge
711	lightstream
795	adaptec
1123	symbios
1608	mylex
2021	ucd-snmp
2636	juniper
8072	net-snmp

Getting MIB files: Lots of these are available from RFCs, sometimes in need of minor patches. There are software tools available to extract the MIB file from a published RFC. Private ones are usually available from the company involved, though often with a surprising amount of difficulty.

A few cisco mibs

cisco BRIDGE-MIB.MIB (1994; you know this is old because it says "bridge" rather than "switch")
    spanning-tree info (dot1dStp group)
          lots of scalars
          dot1dStpPortTable
    forwarding table ?

Nothing on queue overflows!


cisco Catalyst 2900 ethernet switch
     CISCO-C2900-MIB-V1SMI.MIB

This is for this specific series of switch
c2900SysInfo:
    BoardVersion
    PeakBuffersUsed
    TotalBufferDepth
    AddrCapacity (forwarding table size)
    SNMP versions of the various LED displays

c2900SysConfig:
what to do for:
    security issues
    broadcast-packet storms

c2900ModuleTable: table of all included hardware modules

c2900Port:
    table of port entries. Each module may have several ports.
    address-learning stuff
    PortBufferCongestionControl

c2900BandwidthUsage:

Hardware environment:
OLD-CISCO-ENV-MIB.MIB
    sample data for monitoring temperature
Test points:

temperature of entering air
temperature of air leaving the router
+5 v power-supply voltage
+12 v power-supply voltage
-12 v power-supply voltage
-5 v power-supply voltage

Also shutdown thresholds for the above. #1: 43° C, #2: 58° C (typical).
Also information on the stats as of the last shutdown (more important than last reboot!)
No table of history values!
Info on the environment-monitoring card itself.

Cisco TCP: CISCO-TCP-MIB-V1SMI.MIB

Notice that the intent here is to extend the existing TCP table. The additional fields:

TcpConnInBytes: how many input bytes on this connection
TcpConnOutBytes: similar
TcpConnInPkts: inbound packet counter. How are they counting overlapping retransmits?
TcpConnOutPkts: similar.
TcpConnElapsed: elapsed time
TcpConnSRTT: "smoothed" RTT estimate; used by Jacobson-Karels algorithms
TcpConnRetransPkts: how many packets did we retransmit due to timeout?
TcpConnFastRetransPkts: how many resent due to Fast Retransmit (3 dupACKs)?
TcpConnRto: retransmit timeout (calculated)
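
For reference, here is a sketch of how a smoothed RTT and the retransmit timeout are conventionally calculated, using the constants from RFC 6298 (gain 1/8 for SRTT, 1/4 for the deviation, RTO = SRTT + 4 * deviation). The class and method names are made up, and this is not claimed to be what the Cisco agent does internally; clock granularity and the minimum-RTO floor are left out.

```java
// Illustrative only: the standard SRTT / RTO update rules with RFC 6298 constants.
class RttEstimator {
    private boolean haveSample = false;
    private double srtt;      // smoothed RTT, in seconds
    private double rttvar;    // smoothed mean deviation of the RTT, in seconds

    // Fold one new RTT measurement into the estimates.
    void addSample(double rtt) {
        if (!haveSample) {
            srtt = rtt;               // first measurement initializes both values
            rttvar = rtt / 2;
            haveSample = true;
        } else {
            // deviation is updated first, using the old SRTT
            rttvar = 0.75 * rttvar + 0.25 * Math.abs(srtt - rtt);
            srtt = 0.875 * srtt + 0.125 * rtt;
        }
    }

    double srtt() { return srtt; }

    // Retransmit timeout (floor and clock granularity omitted in this sketch).
    double rto() { return srtt + 4 * rttvar; }

    public static void main(String[] args) {
        RttEstimator est = new RttEstimator();
        for (double rtt : new double[] { 0.100, 0.200, 0.120 }) {
            est.addSample(rtt);
            System.out.printf("sample=%.3f srtt=%.4f rto=%.4f%n", rtt, est.srtt(), est.rto());
        }
    }
}
```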

Nothing on the current CongestionWindow size, though, or even the flavor of TCP congestion management (Tahoe, Reno, NewReno).

Cisco queue management: CISCO-QUEUE-MIB

Queue type and subqueues
for subqueues:

    cQStatsQNumber	Integer32 (0..2147483647),
    cQStatsDepth	Gauge32,
    cQStatsMaxDepth	Integer32,
    cQStatsDiscards	Counter32

Is this all you need? There is no history table for recent queue bursts, for example.

Here is a table of all the MIBs supported by the cisco 7200-series routers. Note that some are cisco-specific, and some are generic extensions of mib-2.

ATM-MIB (RFC 1695)	CISCO-SYSLOG-MIB
BGP4-MIB (RFC 1657)	CISCO-TC
CISCO-AAA-SERVER-MIB	CISCO-TCP-MIB
CISCO-AAL5-MIB	CISCO-VLAN-IFTABLE-RELATIONSHIP-MIB
CISCO-ACCESS-ENVMON-MIB	CISCO-VPDN-MGMT-MIB
CISCO-ATM-EXT-MIB	CISCO-VPDN-MGMT-EXT-MIB
CISCO-BULK-FILE-MIB	ENTITY-MIB (RFC 2737)
CISCO-CDP-MIB	ETHERLIKE-MIB (RFC 2665)
CISCO-CLASS-BASED-QOS-MIB	EVENT-MIB (RFC 2981)
CISCO-CONFIG-COPY-MIB	EXPRESSION-MIB
CISCO-CONFIG-MAN-MIB	IF-MIB (RFC 2233)
CISCO-ENTITY-ALARM-MIB	IP-LOCALPOOL-MIB
CISCO-ENTITY-ASSET-MIB	MPLS-LDP-MIB
CISCO-ENTITY-EXT-MIB	MPLS-LSR-MIB
CISCO-ENTITY-FRU-CONTROL-MIB	MPLS-TE-MIB
CISCO-ENTITY-PFE-MIB	MPLS-VPN-MIB
CISCO-ENTITY-SENSOR-MIB	NOTIFICATION-LOG-MIB (RFC 3014)
CISCO-ENTITY-VENDORTYPE-OID-MIB	OLD-CISCO-CHASSIS-MIB
CISCO-ENVMON-MIB	OLD-CISCO-CPU-MIB
CISCO-FLASH-MIB	OLD-CISCO-INTERFACES-MIB
CISCO-FRAME-RELAY-MIB	OLD-CISCO-IP-MIB
CISCO-FTP-CLIENT-MIB	OLD-CISCO-MEMORY-MIB
CISCO-HSRP-EXT-MIB	PIM-MIB (RFC 2934)
CISCO-HSRP-MIB	RFC1213-MIB (MIB II)
CISCO-IETF-IP-MIB	RFC1243-MIB (AppleTalk)
CISCO-IMAGE-MIB	RFC1253-MIB (OSPF)
CISCO-IPMROUTE-MIB	RFC1315-MIB (FRAME RELAY MIB)
CISCO-IP-STAT-MIB	RFC2495-MIB (DS1)
CISCO-MEMORY-POOL-MIB	RFC2496-MIB (DS3)
CISCO-NBAR-PROTOCOL-DISCOVERY-MIB	RMON-MIB (RFC 1757)
CISCO-PING-MIB	SNMP-FRAMEWORK-MIB (RFC 2571)
CISCO-PPPOE-MIB	SNMPv2-MIB (RFC 1907)
CISCO-PROCESS-MIB	SNMP-NOTIFICATION-MIB (RFC 2573)
CISCO-PRODUCTS-MIB	SNMP-TARGET-MIB (RFC 2573)
CISCO-QUEUE-MIB	SNMP-USM-MIB (RFC 2574)
CISCO-RTTMON-MIB	SNMP-VACM-MIB (RFC 2575)
CISCO-SMI	SONET MIB
CISCO-SSG-MIB	TCP-MIB (RFC 2012)
CISCO-NETFLOW-MIB	UDP-MIB (RFC 2013)
CISCO-AAA-SESSION-MIB	---

OSI mgmt scheme: CMIP: Common Mgmt Info Protocol
    contains CMIS (CMI Service) as a subset
    addresses all 7 layers of OSI model, including the 2 that do not exist

    CMIP is rather large; needs lots of resources.
    Object-oriented (later versions of SNMP have some support here)

SNMP uses OIDs to name everything; CMIP uses Distinguished Names (DNs) similar to X.500. Here is a possible DN format for me at Loyola; the DC values are "domain components", and CN is the "common name":

dn: cn=Peter Dordal,dc=luc,dc=edu

The OU (Organizational Unit) component is also used; in the above, it might be ou=Computer Science Department. Also, sometimes the "dotted" syntax is used to join DC's.
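
To see the component structure of a DN, here is a small sketch that uses the JDK's own javax.naming.ldap.LdapName parser to pull out the DC components and join them in the "dotted" form. LdapName handles LDAP-style DNs like the one above; whether a given CMIP implementation uses exactly this string syntax is not something these notes establish. The class and method names (DnDemo, dottedDomain) are made up.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

class DnDemo {
    // Returns the dotted domain built from the dc= components, e.g. "luc.edu".
    static String dottedDomain(String dn) {
        LdapName name;
        try {
            name = new LdapName(dn);
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("not a valid DN: " + dn, e);
        }
        StringBuilder sb = new StringBuilder();
        // LdapName numbers components from the right (index 0 is the rightmost),
        // so walk from the highest index down to read the DN left to right.
        for (int i = name.size() - 1; i >= 0; i--) {
            Rdn rdn = name.getRdn(i);
            if (rdn.getType().equalsIgnoreCase("dc")) {
                if (sb.length() > 0) sb.append('.');
                sb.append(rdn.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(dottedDomain("cn=Peter Dordal,dc=luc,dc=edu"));
    }
}
```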

Distinguished Names are a little verbose, but they do serve the purpose of "labeling every attribute" at least as well as OIDs, and they provide for slightly more natural naming of table attributes.

Throughout the development of SNMP, most participants seemed to believe that CMIP was better and was coming Real Soon Now. That didn't happen, and basically still has not.

SNMP & CMIP use a combination of polling and async notification (SNMP "traps")

CMIP view of the world:

Network Mgmt:

Organization model: how NMS is to be organized
Information model (SNMP mostly here): the data itself
Communication model: how components communicate
Functional model:
5 functional areas: fault, configuration, performance, security, accounting

To compare:

    SNMP focuses mostly on the Information Model part of this.

Why specify an organization model? Shouldn't the software focus on mechanism?
Consider SAP software.
It contains modules for ERP, Customer mgmt, accounting mgmt, etc. Customers who buy SAP are buying a way of ORGANIZING their business, not just a way of implementing their existing organization!

Organization model:
    simple:
    manager, MDB, agents, unmanaged objects

    3-tier:
        MoM: manager of managers
        Local Managers

These kinds of models are used in the SNMP world as well, sometimes on a more ad hoc basis.

Information model:

ASN.1 naming: hierarchical, numbered levels (why weren't strings used?)

MIB: naming info and TYPE info

MIB v MDB:
    MIB contains schema for each device; MDB contains actual results
    Adding a new kind of device requires updating the MIB

Implementation of an agent: you know the names of everything, but how to get the statistics is an entirely separate issue.

Much of MIB info is about device statistics, but some is about people, or about software.

Specifications:
    OID & descriptor
    Syntax
    Access
    Status
    Definition

GET/SET apply only to atomic (scalar) values; structures and arrays need GET-NEXT.
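
A sketch of why GET-NEXT is enough to fetch a whole column without knowing the index values in advance. The "agent" here is just a sorted map from instance OID to value, standing in for a real agent; nothing below is the Drexel package or net-snmp API, and the class and sample values are invented.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GetNextWalk {
    // Compare dotted OIDs component by component, numerically (so .10 sorts after .9).
    static final Comparator<String> OID_ORDER = (a, b) -> {
        String[] x = a.split("\\."), y = b.split("\\.");
        for (int i = 0; i < Math.min(x.length, y.length); i++) {
            int c = Integer.compare(Integer.parseInt(x[i]), Integer.parseInt(y[i]));
            if (c != 0) return c;
        }
        return Integer.compare(x.length, y.length);   // a prefix sorts before its extensions
    };

    // Stand-in for an agent's data: instance OID -> value, in OID order.
    final TreeMap<String, String> mib = new TreeMap<>(OID_ORDER);

    // GET-NEXT: the first instance strictly after the given OID, or null at end of MIB.
    Map.Entry<String, String> getNext(String oid) {
        return mib.higherEntry(oid);
    }

    // Walk a column (or any subtree) by repeated GET-NEXT until we leave the prefix.
    List<String> walk(String prefix) {
        List<String> values = new ArrayList<>();
        Map.Entry<String, String> e = getNext(prefix);
        while (e != null && e.getKey().startsWith(prefix + ".")) {
            values.add(e.getValue());
            e = getNext(e.getKey());
        }
        return values;
    }

    public static void main(String[] args) {
        GetNextWalk agent = new GetNextWalk();
        agent.mib.put("1.3.6.1.2.1.2.2.1.2.1", "lo");      // ifDescr.1 (made-up values)
        agent.mib.put("1.3.6.1.2.1.2.2.1.2.2", "eth0");    // ifDescr.2
        agent.mib.put("1.3.6.1.2.1.2.2.1.2.10", "eth1");   // ifDescr.10
        agent.mib.put("1.3.6.1.2.1.2.2.1.3.1", "24");      // ifType.1
        System.out.println(agent.walk("1.3.6.1.2.1.2.2.1.2"));
    }
}
```

The manager never needs the row count: it stops when the returned OID no longer begins with the column's prefix.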

Communication model:
    Largely this is the SNMP protocol itself, including UDP transport. However, sometimes the architectural model and administrative models are lumped in here, and also module-block-diagrams for agents and managers.

Functional model:
What it does.

configuration: Configuration can be monitored through SNMP, but also can be implemented. Sort of.
fault detection: SNMP fault detection is widely distributed over a great variety of MIBs (and traps)
performance: ditto
security: varied
accounting: SNMP does not really address this at all.

SNMP OID trees versus true objects

    cisco_router EXTENDS router

    cisco_7400 EXTENDS cisco_router

Object inheritance makes a lot of sense here. Note that SNMP cisco-specific information is under 1.3.6.1.4.1.9, and general router info is under 1.3.6.1.2.1.4 (IP)

Still, OID trees do offer a relatively natural perspective on inheritance, and CMIP is not much more natural.

sysObjectID

Last week we discussed that the sysObjectID value returned by an agent can be used by an NMS to look up something of that agent's capabilities, especially (though not necessarily exclusively) under the private 1.3.6.1.4.1 OID tree.

However, note that netSNMP returns private.8072.3.2.10, described in the following table:

1.3.6.1.4.1	private.enterprises (abbreviated here as "private")
private.8072	net-snmp
net-snmp.3	netSnmpEnumerations
netSnmpEnumerations.2	netSNMPAgentOIDs
netSNMPAgentOIDs.10	not specifically mentioned

If we look this up in the NET-SNMP mibs, however, there is nothing to be found. If the NET-SNMP project does not define what this last OID level, the 10, stands for, then there is precious little chance that an off-the-shelf NMS will figure it out.

Also, note that if we try to do data retrieval from a net-snmp agent below private.8072, we get entries ranging from net-snmp.1.2.1.1.4 to net-snmp.1.9.1.1.5. Not net-snmp.3! (Although this in and of itself is not unexpected; it simply means that agents identify themselves with one OID, and keep their agent-specific data collections under another.)

Brief look at NET-SNMP-AGENT-MIB.txt, defining
nsVersion              OBJECT IDENTIFIER ::= {netSnmpObjects 1}
nsMibRegistry          OBJECT IDENTIFIER ::= {netSnmpObjects 2}
nsExtensions           OBJECT IDENTIFIER ::= {netSnmpObjects 3}
nsDLMod                OBJECT IDENTIFIER ::= {netSnmpObjects 4}
nsCache                OBJECT IDENTIFIER ::= {netSnmpObjects 5}
nsErrorHistory         OBJECT IDENTIFIER ::= {netSnmpObjects 6}
nsConfiguration        OBJECT IDENTIFIER ::= {netSnmpObjects 7}
nsTransactions         OBJECT IDENTIFIER ::= {netSnmpObjects 8}

NET-SNMP-SYSTEM-MIB.txt defines:
nsMemory                    OBJECT IDENTIFIER ::= {netSnmpObjects 31}
nsSwap                      OBJECT IDENTIFIER ::= {netSnmpObjects 32}
nsCPU                       OBJECT IDENTIFIER ::= {netSnmpObjects 33}
nsLoad                      OBJECT IDENTIFIER ::= {netSnmpObjects 34}
nsDiskIO                    OBJECT IDENTIFIER ::= {netSnmpObjects 35}

Similarly, there is no clear-cut OID (either key or value) that identifies "ethernet switches" as a class. However, there isn't much of anything that identifies them as a class; there are all sorts of special cases (VPNs and "bridge-routers" come to mind). On the other hand, it's probably enough for an NMS to match on the proper prefix 1.3.6.1.4.1.8072. Bottom line, an NMS had best be prepared to do a lot of prefix matching on the sysObjectID value.
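
A sketch of the kind of prefix matching an NMS might do on sysObjectID: longest registered prefix wins, matching on whole OID components. The enterprise numbers come from the list earlier in these notes; the class name, the labels, and the finer-grained net-snmp entry are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class SysObjectIdMatcher {
    static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("1.3.6.1.4.1.9", "cisco");
        PREFIXES.put("1.3.6.1.4.1.2021", "ucd-snmp");
        PREFIXES.put("1.3.6.1.4.1.8072", "net-snmp");
        PREFIXES.put("1.3.6.1.4.1.8072.3.2", "net-snmp agent");   // hypothetical finer entry
    }

    // Returns the label of the longest prefix that matches on whole components.
    static String classify(String sysObjectId) {
        String best = null;
        String bestLabel = "unknown";
        for (Map.Entry<String, String> e : PREFIXES.entrySet()) {
            String p = e.getKey();
            // the "." guard keeps 1.3.6.1.4.1.9 from matching 1.3.6.1.4.1.90...
            boolean matches = sysObjectId.equals(p) || sysObjectId.startsWith(p + ".");
            if (matches && (best == null || p.length() > best.length())) {
                best = p;
                bestLabel = e.getValue();
            }
        }
        return bestLabel;
    }

    public static void main(String[] args) {
        System.out.println(classify("1.3.6.1.4.1.8072.3.2.10"));
        System.out.println(classify("1.3.6.1.4.1.9.1.222"));
        System.out.println(classify("1.3.6.1.4.1.90.1"));
    }
}
```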

Java

We will use the Drexel University Java SNMP package.

snmpget.java

(demo version, not the full net-snmp snmpget! Assumes localhost, for one thing)
Review the code:

Create the SNMPv1CommunicationInterface object
get the OID for which we are looking
getMIBEntry (this performs the actual request)
extract the value

Note that this doesn't let us do anything with the value. So at the tail end we try to convert to int, byte[], or long[]s. Note that although the documentation states that an OID is to convert to an int[], this in fact does not work; it must be long[].

Compiling and running:
    CLASSPATH=".:snmp.jar:$CLASSPATH"
    export CLASSPATH
    javac snmpget.java
    java snmpget arg

Demo on system group 1.3.6.1.2.1.1.x.0, 1.3.6.1.2.1.2.2.1.6.2

snmpgetnext.java
1.3.6.1.2.1.4.20 (should be an IP address)

tableget1.java

Demo of table retrieval. Note makeArrayList(), not actually used.

What would you have to do to retrieve one row of this table?
What would you have to do to retrieve row i, column j?
What would you have to do to get a column? Assuming you did not know the dimensions of the table?
What would you have to do to get all the OID index values?

RMON

RMON here means RMON 1, containing Ethernet information only. Later, RMON2 added monitoring at the IP and TCP (port) layers.

0. Basic idea: remote agents do some limited monitoring of subnets. They become "mini-managers", freeing the manager from probing every host on the subnet. Managers can create rows to specify what monitoring is to be done.

1. OwnerString: in RMON-MIB

2. EntryStatus: for row creation
    how row creation works
    timeouts


    Look at transitions table in MIB

To:	valid	createRequest	underCreation	invalid
From:
valid	OK	NO	OK	OK
createRequest	N/A	N/A	N/A	N/A
underCreation	OK	NO	OK	OK
invalid	NO	NO	NO	OK
nonExistent	NO	OK	NO	OK
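
The transition table above can be written as a lookup. This is only a sketch with invented names; treating the createRequest row's N/A as "a row never rests in that state" is my reading of the table, so the code rejects the question rather than answering NO.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

class EntryStatusTable {
    enum State { VALID, CREATE_REQUEST, UNDER_CREATION, INVALID, NON_EXISTENT }

    // Allowed target states for each starting state, copied from the table above.
    static final Map<State, EnumSet<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.VALID,          EnumSet.of(State.VALID, State.UNDER_CREATION, State.INVALID));
        ALLOWED.put(State.UNDER_CREATION, EnumSet.of(State.VALID, State.UNDER_CREATION, State.INVALID));
        ALLOWED.put(State.INVALID,        EnumSet.of(State.INVALID));
        ALLOWED.put(State.NON_EXISTENT,   EnumSet.of(State.CREATE_REQUEST, State.INVALID));
        // No entry for CREATE_REQUEST: the table marks that whole row N/A.
    }

    // True if a SET may move the status column from 'from' to 'to'.
    // NON_EXISTENT is not a column of the table, so it is never an allowed target.
    static boolean allowed(State from, State to) {
        EnumSet<State> targets = ALLOWED.get(from);
        if (targets == null) {
            throw new IllegalArgumentException("N/A: no transitions defined from " + from);
        }
        return targets.contains(to);
    }

    public static void main(String[] args) {
        System.out.println(allowed(State.NON_EXISTENT, State.CREATE_REQUEST));
        System.out.println(allowed(State.VALID, State.CREATE_REQUEST));
    }
}
```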

createRequest supplies a value for etherStatsIndex. In the unlikely event that this value is now in use (unlikely at least if the management station checked first that it was unused, which would mean another manager created the row at about the same time), then we try again.

3. etherStats: it's not indexed by interface! etherStatsIndex is arbitrary (pseudorandom); it is the second column etherStatsDataSource that contains the OID of the actual interface to be measured. Note that we can have more than one row for the same interface! (perhaps started at different times?)

eSIndex, eSDataSource, data, eSOwner, eSStatsStatus

In order to identify a particular interface, this object shall identify the instance of the ifIndex object... for the desired interface. For example, if an entry were to receive data from interface #1, this object would be set to ifIndex.1.

Note two managers might create entries for the same interface! (Though this is not supposed to happen) A manager chooses an etherStatsIndex value pseudorandomly, and creates the entry.

Also note that we have a coarse size histogram available: etherStatsPkts65to127Octets to etherStatsPkts1024to1518Octets

History group: two tables: control & data

The control entry specifies the interface (historyControlDataSource), the bucket count, and the interval. The data table then creates one bucket for each time interval, containing a summary of the usage during that interval. Actual history stats: packet counts, byte counts, error counts of various types, etc.

We create a control entry, and that directs how the data table will be built. Note how the data table is indexed, and how rows get deleted (recycled?) as new buckets are created.

Note that we might create a 30-second history table and a 30-minute history table, with possibly different bucket counts for each.

The etherHistorySampleIndex is initialized to 1 and is incremented each interval. Old bucket entries are deleted as the buckets are recycled: from historyControlBucketsGranted:

                  When the number of buckets reaches the value of
                  this object and a new bucket is to be added to the
                  media-specific table, the oldest bucket associated
                  with this historyControlEntry shall be deleted by
                  the agent so that the new bucket can be added.
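
The bucket recycling described above behaves like a bounded queue: the sample index keeps counting up, while the oldest bucket is dropped once the granted number is reached. A minimal sketch, with invented class and field names and a single statistic per bucket:

```java
import java.util.ArrayDeque;
import java.util.Deque;

class HistoryBuckets {
    // One row of the data table: its sample index plus one summary statistic.
    static final class Bucket {
        final int sampleIndex;
        final long packets;
        Bucket(int sampleIndex, long packets) {
            this.sampleIndex = sampleIndex;
            this.packets = packets;
        }
    }

    private final int bucketsGranted;                  // role of historyControlBucketsGranted
    private final Deque<Bucket> buckets = new ArrayDeque<>();
    private int nextSampleIndex = 1;                   // sample index starts at 1

    HistoryBuckets(int bucketsGranted) { this.bucketsGranted = bucketsGranted; }

    // Called once per interval with that interval's packet count.
    void endOfInterval(long packets) {
        if (buckets.size() == bucketsGranted) {
            buckets.removeFirst();                     // delete the oldest bucket
        }
        buckets.addLast(new Bucket(nextSampleIndex++, packets));
    }

    int size() { return buckets.size(); }
    int oldestIndex() { return buckets.peekFirst().sampleIndex; }
    int newestIndex() { return buckets.peekLast().sampleIndex; }

    public static void main(String[] args) {
        HistoryBuckets history = new HistoryBuckets(3);
        for (long pkts : new long[] { 10, 20, 30, 40, 50 }) {
            history.endOfInterval(pkts);
        }
        System.out.println("buckets kept: " + history.oldestIndex() + ".." + history.newestIndex());
    }
}
```

After five intervals with three buckets granted, the table holds sample indexes 3 through 5; indexes are never reused.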

Hosts group, with three tables hostControlTable, hostTable, and hostTimeTable.
The hostControlTable specifies the interface. Note hostControlTableSize

The hostTable is indexed by the host physical address, and contains per-host statistics such as in and out packets, bytes, and errors.

The hostTimeTable is the same except indexed serially by discovery time. See "The hostTimeTable has two important uses."

TopN group
hostTopNGroup
The N is supposed to be the hostTopNRequestedSize.
The hostTopNRateBase specifies what variable we are to use for the ranking.

In the actual data table, the hostTopNIndex specifies the 1..N rank of a particular host (the hostTopNAddress).
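
A sketch of what the agent has to compute for a TopN report: sort hosts by the chosen rate, keep the first N, and number them 1..N. The input map (address to rate over the sampling period) and the names are invented; a real agent would compute the rates from hostTable counter differences.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class HostTopN {
    // Returns host addresses ranked by rate, highest first, at most n of them.
    // Position i in the result corresponds to rank (hostTopNIndex) i + 1.
    static List<String> topN(Map<String, Long> rateByAddress, int n) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>(rateByAddress.entrySet());
        entries.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));   // descending by rate
        List<String> ranked = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            ranked.add(entries.get(i).getKey());
        }
        return ranked;
    }

    public static void main(String[] args) {
        Map<String, Long> rates = new LinkedHashMap<>();   // made-up addresses and rates
        rates.put("00:11:22:33:44:01", 500L);
        rates.put("00:11:22:33:44:02", 9000L);
        rates.put("00:11:22:33:44:03", 1200L);
        List<String> top = topN(rates, 2);
        for (int i = 0; i < top.size(); i++) {
            System.out.println((i + 1) + " " + top.get(i));
        }
    }
}
```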

Matrix Group

MatrixControlTable
SD and DS tables: why both? Because one is indexed by source and the other by destination.

Use of TestAndIncr as a semaphore

Create a column in a row (or a scalar) of type TestAndIncr. As defined in SNMPv2-TC, this is an integer in the range 0..2147483647.
GET the TestAndIncr object. GET always returns the current value.
SET the object to the value we just read:
        if the supplied value matches the current value, the SET succeeds and the agent increments the object by one.
        if it does not match, someone else has done a SET in the meantime; our SET fails, and all our other SETs in the command fail as well.
        Same syntax as normal SET; different semantics

Note that to update a value in a row that is subject to contention, we must:

GET the TestAndIncr value and the old value
SET the TestAndIncr object (to the value we read) and the new value
if the SET fails, start over

The two SETs go in the same request, so they succeed or fail together.

OpenNMS

openNMS basics:
    discovery
    capability detection
    monitoring modules


OpenNMS
Tarus Balog - chief developer

OpenNMS focuses on NODES and also SERVICES.
traditional SNMP view: just NODES (routers, switches) and whether they are working.

The three main functional areas of OpenNMS are:

service polling, which monitors services on the network and reports on their "service level";
data collection from the remote systems via SNMP in order to measure the performance of the network;
a system for event management, statistics, and notifications.

Note that that last category is tricky: the NMS must know what events are "important", and what SNMP data from each device means, and when to sound the alarm. False positives and false negatives are both bad.

Data collection scenario:

To mitigate the problem OpenNMS was modified to collect 200,000 data points from approximately 24,000 interfaces every five minutes, or 2.4 million data points an hour from a single instance of OpenNMS. The limitation turned out to be the speed at which the disk controller could write the data, not OpenNMS itself.

discovery => capabilities detection => poller monitor

discovery

    ping (individual pings to listed ranges of IP addrs (with exceptions))
    manual configuration
<discovery-configuration threads="1" packets-per-sec
        initial-sleep-time="300000" restart-sleep-time="86400000"
        retries="3" timeout="800">

        <include-range retries="2" timeout="3000">
                <begin>192.168.0.1</begin>
                <end>192.168.0.254</end>
        </include-range>

        <include-url>file:/opt/OpenNMS/etc/include</include-url>

</discovery-configuration>

You can do this manually, in the file above, or else using the web interface,
    admin=> Configure Discovery => Include Ranges => Add New

Demo of the result in ulam3:/opt/OpenNMS

capabilities detection

core list
    probe for the following:
    # Citrix    central LAN software mgmt
    # DHCP
    # DNS
    # Domino IIOP    Lotus workgroup solution
    # FTP
    # HTTP
    # HTTPS
    # ICMP
    # IMAP        email inbound
    # LDAP
    # Microsoft Exchange
    # Notes HTTP
    # POP3        email inbound
    # SMB
    # SMTP        email server
    # SNMP
    # TCP        port is specifiable; generic service detection

Much newer (and longer) list at http://www.opennms.org/index.php/Discovery_Configuration_How-To;
JDBC, radius, ssh added

Review the network services involved.

SMB: checked, but file-sharing features are NOT continually monitored

SNMP: special case, used for further data collection not polling
    query: system, interfaces, IP->ipAddrTable (other IP addrs for this host)

    discovery information: discovery-configuration.xml

capabilities detection: capsd-configuration.xml
Part of the discovery phase: figuring out what each node is capable of.
Which nodes are SMTP servers? Web servers?

Sample plugin config for ICMP from capsd-configuration.xml:
<protocol-plugin protocol="ICMP" class-name="org.opennms.netmgt.capsd.IcmpPlugin" scan="on" user-defined="false">
     <protocol-configuration scan="on" user-defined="false">
          <range begin="192.168.10.0" end="192.168.10.254"/>
          <property key="timeout" value="4000"/>
          <property key="retry" value="3"/>
     </protocol-configuration>

     <protocol-configuration scan="off" user-defined="false">
          <range begin="192.168.20.0" end="192.168.20.254"/>
     </protocol-configuration>

     <protocol-configuration scan="enable" user-defined="false">
          <specific>192.168.30.1</specific>
     </protocol-configuration>

     <property key="timeout" value="2000"/>
     <property key="retry" value="2"/>
</protocol-plugin>

scan: can be turned on/off dynamically
user-defined: did user add this dynamically from console?

POLLING & "COLLECTING" (the latter applies only to SNMP)

Group net devices/services into PACKAGES, with package-specific polling instructions (eg protocol, what to look for, frequency)

Special provision for "30-second outages". These are real outages, but "can be annoying yet hard to correct"

Sample poller service entry from poller-configuration.xml:

           <service name="DNS" interval="300000" user-defined="false" status="on">
                   <parameter key="retry" value="3"/>
                   <parameter key="timeout" value="5000"/>
                   <parameter key="port" value="53"/>
                   <parameter key="lookup" value="localhost"/>
           </service>

Question: how do services GET polled? What constitutes a "response"?
OpenNMS contains "poller monitors": drop-in packages that do these checks on a per-service basis.

OpenNMS uses "adaptive polling": once an outage is discovered through NMS, the service is polled more frequently. Typical generic polling interval: 300 sec; typical polling interval for down service: 30 sec.

Data collection

configured for each SNMP "PACKAGE"

data put into RRDTool, round-robin DB.

HTTP:

GET /index.html HTTP/1.1
HOST: xenon.cs.luc.edu
            --> returns Apache startup page
    HTTP/1.1 200 OK
    Date: Tue, 18 Mar 2008 19:47:59 GMT
    Server: Apache
    Last-Modified: Wed, 01 Mar 2006 01:04:31 GMT
    ETag: "2f3e2a-5a3-86bbc5c0"
    Accept-Ranges: bytes
    Content-Length: 1443
    Content-Type: text/html; charset=ISO-8859-1

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
            "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
        <head>
            <title>Test Page for Apache Installation</title>
        </head>

        <body>
            <p>If you can see this, it means that the installation of the
            <a href="http://www.apache.org/foundation/preFAQ.html">Apache web server</a>
            software on this system was successful. You may now add content to this
            directory and replace this page.</p>
        ....


GET /foobar.html HTTP/1.1
HOST: xenon.cs.luc.edu
            --> returns 404


OPTIONS * HTTP/1.1
HOST: xenon.cs.luc.edu

-->    HTTP/1.1 200 OK
    Date: Tue, 18 Mar 2008 19:49:39 GMT
    Server: Apache/2.2.2 (Unix)
    Allow: GET,HEAD,POST,OPTIONS,TRACE
    Content-Length: 0
    Content-Type: text/plain

Note codes (200, 404, etc), matching "response", and the body, which we can match with a regular expression.
See http://www.opennms.org/index.php/HTTP_monitor
Question: at what point do we decide the server is working ok?

Next go through http://www.opennms.org/index.php/Testing_Filtering_Proxies_With_HTTPMonitor and see how "negative" polling can be implemented.
(We simply expect a "response" of 400-599)

Of course, if you simply blackhole requests for facebook.com,
this won't work.

Now look at HttpMonitor.java:

Look at
    int response = ParameterMap.getKeyedInteger(parameters, "response", -1);
    String responseText = ParameterMap.getKeyedString(parameters, "response text", null);

Look at how poller-monitor goes through the response

Look at: if (line.startsWith("HTTP/")) {
    parse out the response numeric code

responsetext:
        int responseIndex = line.indexOf(responseText);
        if (responseIndex != -1)
            bResponseTextFound = true;

Conclusion: source code I'm looking at does NOT do regexp matching!
(It's older than the online docos.)
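
A sketch of the check described above: pull the numeric code out of the status line, compare it against an expected range, and optionally look for a plain substring in the body. This is not the OpenNMS source; the class, the method names, and the lo/hi range parameters are assumptions for illustration. "Negative" polling is then just a matter of passing 400 and 599 as the range.

```java
class HttpStatusCheck {
    // Extract the code from a status line such as "HTTP/1.1 200 OK"; -1 if malformed.
    static int parseStatus(String line) {
        if (line == null || !line.startsWith("HTTP/")) return -1;
        String[] parts = line.split(" ");
        if (parts.length < 2) return -1;
        try {
            return Integer.parseInt(parts[1]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    // "Up" means: code within [lo, hi] and, if responseText is non-null,
    // the body contains it (plain substring match, no regular expressions).
    static boolean isUp(String statusLine, String body, int lo, int hi, String responseText) {
        int code = parseStatus(statusLine);
        boolean codeOk = code >= lo && code <= hi;
        boolean textOk = responseText == null
                || (body != null && body.indexOf(responseText) != -1);
        return codeOk && textOk;
    }

    public static void main(String[] args) {
        System.out.println(isUp("HTTP/1.1 200 OK", "<title>Test Page</title>", 100, 399, "Test Page"));
        System.out.println(isUp("HTTP/1.1 404 Not Found", "", 100, 399, null));
        System.out.println(isUp("HTTP/1.1 403 Forbidden", "", 400, 599, null));   // negative polling
    }
}
```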

=========================================

DnsMonitor.java

lookup: line 141
build packet: line 158
line 186: request.verifyResponse(incoming.getData(), incoming.getLength())
no actual verification that DNS value is correct, but we don't NEED that!
We do presumably verify that dns response is for the requested machine.
From the source for DNSAddressRequest.java:
     * This method only goes so far as to decode the flags in the response
     * byte array to verify that a DNS server sent the response.
Verifies request ID (sequence # of request)
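
A sketch of the kind of check described here, based only on the DNS header layout (12-byte header, 16-bit ID in the first two bytes, QR flag in the top bit of the third byte). It is not the OpenNMS verifyResponse code; the class and method names are made up.

```java
class DnsResponseCheck {
    // True if data looks like a DNS response to the query with the given 16-bit ID.
    static boolean looksLikeResponse(byte[] data, int length, int requestId) {
        if (data == null || length < 12 || data.length < length) {
            return false;                                  // a DNS header is 12 bytes
        }
        int id = ((data[0] & 0xFF) << 8) | (data[1] & 0xFF);
        boolean isResponse = (data[2] & 0x80) != 0;        // QR bit: 1 means response
        return isResponse && id == (requestId & 0xFFFF);
    }

    public static void main(String[] args) {
        byte[] reply = new byte[12];
        reply[0] = 0x12;
        reply[1] = 0x34;
        reply[2] = (byte) 0x81;                            // QR set, recursion desired
        System.out.println(looksLikeResponse(reply, reply.length, 0x1234));
        System.out.println(looksLikeResponse(reply, reply.length, 0x4321));
    }
}
```

Note that nothing here looks at the answer section, which matches the point above: the monitor only needs to know that a DNS server answered our request.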

=========================================

SmtpMonitor.java

When we connect, other end should send:
    220 ulam2.cs.luc.edu ESMTP Postfix

214:    read banner
218-240    multiline banner handler
247:    check for the 220
251:    respond HELO myname
    response should be
        250 ulam2.cs.luc.edu
    Note that EHLO myname would produce something like:
        250-ulam2.cs.luc.edu
        250-PIPELINING
        250-SIZE 10240000
        250-VRFY
        250-ETRN
        250-STARTTLS
        250-AUTH PLAIN
        250 8BITMIME
289:    check for 250
290    send QUIT
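
A sketch of the multiline-reply handling: continuation lines have a hyphen after the code ("250-PIPELINING"), and the final line has a space ("250 8BITMIME"). This is written from the SMTP reply format, not copied from SmtpMonitor.java; the class and method names are made up, and the input is a string rather than a live socket.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class SmtpReply {
    // Read one (possibly multiline) reply and return the code from its last line.
    // Any line without a space in position 3 is treated as a continuation line.
    static int readReplyCode(BufferedReader in) throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            if (line.length() < 3) throw new IOException("malformed reply line: " + line);
            boolean lastLine = line.length() == 3 || line.charAt(3) == ' ';
            if (lastLine) return Integer.parseInt(line.substring(0, 3));
        }
        throw new IOException("connection closed in mid-reply");
    }

    // Convenience wrapper for canned text; returns -1 on any parse problem.
    static int replyCode(String serverText) {
        try {
            return readReplyCode(new BufferedReader(new StringReader(serverText)));
        } catch (IOException | NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        System.out.println(replyCode("220 ulam2.cs.luc.edu ESMTP Postfix\r\n"));
        System.out.println(replyCode("250-ulam2.cs.luc.edu\r\n250-PIPELINING\r\n250 8BITMIME\r\n"));
    }
}
```

A monitor along these lines would expect 220 for the banner, send HELO, expect 250, and then send QUIT.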

=========================================

SshMonitor.java

144:    String strBannerMatch = (String) parameters.get("banner");
185:    read a line
195:     check for match with banner line
199-200    send our response line
205:    see if we get any further response, but don't parse it

=========================================

SmbMonitor.java:

doesn't do anything!!

=========================================

Some notes: this version doesn't have a rich set of
expect/send methods. Everything is done "by hand".

=========================================
=========================================

Non-snmp managing tools

ping

both individual and subnet pings are widely used for new-node detection, although subnet pings are not necessary.

        use of individual pings
        use of "subnet ping"

Other methods of host discovery
        ARP (or IP physaddr) table
        TCP Connection table
        Packet sniffing
        Port scanning

traceroute

mostly used for manual analysis of problems

tcpdump/wireshark

often used for manual analysis. Some problems:

switched Ethernets often mean you can't see more than one host!
massive amounts of data

Note that wireshark "understands" a whole lot of different packet formats.

nmap (port scanner)

Good tool for scanning subnets to determine device type
Excellent port-scanning features
Excellent at "OS fingerprinting"

spong

spong.sourceforge.net
"spong is not quite dead yet"; recent work in 2005

spong is a non-snmp-based Network Management System. It consists of a collection of monitor scripts, typically in perl. The spong.hosts file lists hosts and services that need monitoring. It works best for modest networks. OpenNMS is also a collection of monitor scripts, but note spong's different discovery method.

            spong.interfaces:     snmp interfaces check
                        checks for any that are "down"

        client-side installation, too

mon

        mon service monitoring daemon
        www.kernel.org/software/mon

Recently, SNMP-based checks have been added: uptime, HP printers, processes, disk free space, disk quotas


like spong, mon is based on manual configuration of a number of "probe" programs. SNMP discovery is *not* used. Instead there are "hostgroup" entries.

        smtp: get quickie response from server
            <220
            >QUIT
            <221

        http: actually get a page
        mysql: gets list of tables

        email probing: what can go wrong?

not enough server disk space (sendmail pauses automatically)
too high server processor load
authentication tool failure
insufficient randomness for SSL
dns trouble
deferred connection

Neither of these has explicit monitors for checking actual transfers: sending data, seeing if it is delivered ok.

openNMS: auto-detects new hosts, monitoring profile is based on sysOID; also supports manual config (probably the only feasible approach!) for servers & services (can probe for services)