NMIS

NMIS, or Network Management Information System, is a project which provides detailed traffic information about routers, switches and servers, via SNMP.

Limitations

Although NMIS offers many other functions that overlap with the rest of Opsview, only the parts specifically for interface data collection and display are integrated into Opsview.

NMIS only supports SNMP v1 and SNMP v2c. There is no support for SNMP v3.

NMIS is only available to users with the ADMINACCESS role.

If you move a host between slave servers, or between a slave and the master, then the NMIS historical data will be lost.

Architecture

In a distributed setup, the Opsview server monitoring a device collects extra information via the NMIS utilities. This data is stored locally and aggregated data is sent back to the master server.

In this way, the master server holds only aggregated information about all NMIS-monitored devices, plus the data for devices it monitors itself. To get more detailed information, the user can 'drill down' into the data and the browser session will be redirected to the relevant slave (the user's desktop must therefore have access to all slave servers on port 80).

There are 2 main data gathering modes:

  • collect - this is run every 5 minutes, getting device information and populating local RRDs
  • update - queries devices for interface configuration changes. This is set in Opsview to run every 4 hours

These are called using:

/usr/local/nagios/bin/call_nmis type=collect

You can also add the parameters:

  • debug=true to get debug output
  • node={name} to only run against that specific host
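As an illustration of how these parameters combine, here is a minimal sketch. The helper name build_nmis_cmd and the host name router1 are hypothetical and not part of Opsview; the helper only prints a command line and does not run it.

```shell
#!/bin/sh
# Path to the wrapper script, as given in this page.
CALL_NMIS=/usr/local/nagios/bin/call_nmis

# build_nmis_cmd is a hypothetical helper, not shipped with Opsview.
# $1 = type (collect or update)
# $2 = optional node name
# $3 = "true" to request debug output
build_nmis_cmd() {
    cmd="$CALL_NMIS type=$1"
    if [ -n "${2:-}" ]; then
        cmd="$cmd node=$2"
    fi
    if [ "${3:-}" = "true" ]; then
        cmd="$cmd debug=true"
    fi
    printf '%s\n' "$cmd"
}

# Print the command for a debug collect against one host.
# "router1" is a placeholder host name.
build_nmis_cmd collect router1 true
```

Printing the command instead of running it makes it easy to review before executing it by hand on the Opsview server.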

In addition, Opsview also runs an rsync job on slaves every hour. This synchronizes the RRDs across all nodes in a slave cluster. If you want to change the frequency, you have to amend the file /usr/local/nagios/installer/crontab-slave.nagios on the master server (as this is sent out as part of send2slaves).

Note: The rsync keeps the servers in a slave cluster up to date with each other, so in the event of a node failure, data collected between the last rsync and the failure will be lost. When the failed node recovers, the RRDs on the takeover node will be rsync'd across.
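To get a feel for the size of that window, the arithmetic can be sketched as below. It assumes the default intervals described on this page (hourly rsync, collect every 5 minutes); max_lost_polls is a hypothetical helper name.

```shell
#!/bin/sh
# max_lost_polls is a hypothetical helper: given the rsync interval and the
# collect interval (both in minutes), print the worst-case number of collect
# runs missing from the takeover node after a node failure.
max_lost_polls() {
    echo $(( $1 / $2 ))
}

# Hourly rsync, 5 minute collect interval.
max_lost_polls 60 5
```

If the rsync frequency is changed in crontab-slave.nagios, the first argument changes accordingly.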

Distributed setup

Opsview master is also the NMIS central master.

An Opsview slave will have an NMIS slave instance running if there is a host on that slave with NMIS running against it.

The status screen for NMIS contains summary information for all slaves and any hosts that the master is collecting statistics for. The master pulls information from the slaves when a type=collect is run. NMIS usually does this transfer via HTTP, but Opsview uses its ssh connection instead.

Web UI on slaves

While the Opsview master holds summarised information about each host in NMIS, detailed interface statistics are on the Opsview server that is actually polling the device, so you will need web interface access to each slave to view those graphs.

To set up the slave UI for NMIS, follow the slave web UI setup. The key parts of the Apache configuration are:

ScriptAlias /cgi-nmis/ /usr/local/nagios/nmis/cgi-bin/
<Directory "/usr/local/nagios/nmis/cgi-bin/">
        SetEnv PERL5LIB /usr/local/nagios/perl/lib
        AllowOverride None
        Options ExecCGI
        Order allow,deny
        Allow from all

        AuthType Basic
        AuthName "Opsview Login (Admin users only)"
        AuthUserFile /usr/local/nagios/etc/htpasswd.admin
        Require valid-user
</Directory>

Alias /static/nmis/ /usr/local/nagios/nmis/htdocs/
<Directory "/usr/local/nagios/nmis/htdocs">
        Options None
        AllowOverride None
        Order allow,deny
        Allow from all
</Directory>

Configuration files

NMIS has several configuration files. These reside in /usr/local/nagios/nmis/conf.

For slave setups, the configuration files in /usr/local/nagios/nmis/conf will be sent to the slaves, with the exception of the following:

  • nmis.conf (copied from master and amended for slave use)
  • nodes.csv (generated from scratch)
  • master.csv (generated from scratch)
  • slave.csv (generated from scratch)
  • slaves.csv (generated from scratch)

These are created as part of the Opsview reload process.

If changes need to be made to nmis.conf, this should be done on the master and then an Opsview reload initiated to send this file out to all slaves.

If changes need to be made to binary files, do this on the master and a send2slaves will then send out the new binary files to the slaves.

Note: changes made to any of these files will be lost in an upgrade. If you believe your changes are useful for the wider community, please let us know and we'll look into incorporating them into the codebase.

Additional device support

Additional devices are automatically discovered within the NMIS code. The basic process to add extra devices is:

  • Find the NodeModel:SystemName for the device on the summary page. If this is in an OID format, you will need to get the specific OID and create a new device.oid file with the necessary translations. See example files in /usr/local/nagios/nmis/mibs. The OID values can be generated manually using snmptranslate. A tool is shipped with NMIS called mib2oid.pl, though there have been issues getting this to work correctly
  • Amend nmis.pl and nmiscgi.pl to treat your device appropriately. This would include adding what data points are gathered, as well as what to display on screens

Troubleshooting

The NMIS status page shows a lot of devices not scanned

NMIS data collection scans run every 5 minutes. If a scan is already running, it is killed and the new scan continues. The most likely cause of 'dataless devices' is that a scan of your system takes more than 5 minutes and hence does not complete correctly.

Confirm scans take too long

To check if scans take too long, run a scan by hand as follows:

time /usr/local/nagios/bin/call_nmis nmis.pl type=collect mthread=true debug=true

Check that the time returned is under the 5 minute limit.
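The same check can be scripted. The sketch below times a collect run and compares it with the 5 minute interval; timed_run and within_limit are hypothetical helper names, and the collect only runs if call_nmis exists on the machine.

```shell
#!/bin/sh
# The collect interval described above, in seconds.
LIMIT_SECONDS=300

# within_limit (hypothetical helper): succeeds when elapsed < limit.
within_limit() {
    [ "$1" -lt "$2" ]
}

# timed_run (hypothetical helper): runs the given command, discards its
# output, and prints the elapsed time in whole seconds.
timed_run() {
    start=$(date +%s)
    "$@" >/dev/null 2>&1
    end=$(date +%s)
    echo $(( end - start ))
}

CALL_NMIS=/usr/local/nagios/bin/call_nmis
if [ -x "$CALL_NMIS" ]; then
    elapsed=$(timed_run "$CALL_NMIS" nmis.pl type=collect mthread=true)
    if within_limit "$elapsed" "$LIMIT_SECONDS"; then
        echo "collect finished in ${elapsed}s (under ${LIMIT_SECONDS}s)"
    else
        echo "collect took ${elapsed}s (limit is ${LIMIT_SECONDS}s)"
    fi
fi
```

Debug output is left off here because the script discards the command's output anyway; add debug=true when running interactively.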

Resolution step one

Examine the NMIS log file (/usr/local/nagios/nmis/logs/nmis.log) and look for SNMP devices that have timeouts. Look for entries such as:

D-MMM-YYYY HH:MM:SS, updateUptime, xxx.xxx.xxx.xxx, SNMP error: No answer from xxx.xxx.xxx.xxx for ifNumber. SNMP_Simple->SNMPv1_Session ......

Confirm that the device responds to SNMP queries and that the SNMP configuration tab on the Host Edit page contains the correct details.
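The log search can be sketched as follows. This assumes log lines follow the comma-separated layout of the entry shown above (timestamp, routine, host, message); snmp_timeouts is a hypothetical helper name, not an NMIS tool.

```shell
#!/bin/sh
# snmp_timeouts (hypothetical helper): count "No answer" SNMP errors per host
# in an NMIS log file. Assumes fields are separated by ", " and that the host
# is the third field, as in the sample entry on this page.
snmp_timeouts() {
    awk -F', ' '/SNMP error: No answer/ { count[$3]++ }
                END { for (h in count) print count[h], h }' "$1" |
        sort -k1,1nr -k2,2
}

NMIS_LOG=/usr/local/nagios/nmis/logs/nmis.log
if [ -r "$NMIS_LOG" ]; then
    snmp_timeouts "$NMIS_LOG"
fi
```

Each output line is a count followed by a host, with the most frequently timing-out hosts first. Those hosts are the first candidates for checking SNMP settings.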

Resolution step two

By default, NMIS scans 2 devices at a time. This can be increased, but doing so puts extra load on the system.

Before Opsview 3.3, amend /usr/local/nagios/bin/call_nmis from

exec /usr/local/nagios/nmis/bin/$cmd $@

to read

exec /usr/local/nagios/nmis/bin/$cmd maxthreads=X $@

where X is a multiple of the number of CPUs or cores in your system (the default value is 2).

From Opsview 3.3 onwards this value is stored within /usr/local/nagios/etc/opsview.conf as

$nmis_maxthreads=X;
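One way to pick a starting value for X is to derive it from the core count. The sketch below only prints a candidate line and does not edit opsview.conf; maxthreads_line is a hypothetical helper, and the multiplier of 1 is an assumption rather than an Opsview recommendation.

```shell
#!/bin/sh
# maxthreads_line (hypothetical helper): print an opsview.conf setting for a
# given core count ($1) and multiplier ($2). It does not modify any file.
maxthreads_line() {
    printf '$nmis_maxthreads=%d;\n' $(( $1 * $2 ))
}

# getconf reports the number of online processors on most Linux systems;
# fall back to 1 if it is unavailable.
cores=$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)

# Assumption: one thread per core as a conservative starting point.
maxthreads_line "$cores" 1
```

Re-run the timed collect after changing the value to see whether scans now finish inside the 5 minute window, and watch system load while doing so.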

NOTE: If running in a distributed setup you will need to run

send2slaves

as the nagios user on the master server to update the slaves with the changes made.

NOTE: Multiple NMIS processes are only started when mthread=true, which is only set on the collect process within cron.
