SNMP Interfaces

The SNMP interfaces page lists all the interfaces on this host so that you can set up monitoring for them in a simple manner.

Note: Only interfaces listed in the MIB-II section of SNMP will be queried.

Host Parameters

These options are specific to the host:

Extended Throughput Data

If this option is enabled then the Interface service check will also return unicast, multicast and broadcast performance data. This will be in the form of bits per second based on the interface speed.

SNMP Message Size

Some SNMP devices can return a significant amount of data which fills the standard SNMP buffer size of around 500 octets.

Many devices cannot cope with the maximum buffer size, so this option allows the size to be tailored to each device. The units are Kio (kibioctets), which are multiples of 1024 octets.

SNMP ifDescr Level

Some SNMP devices can have very long descriptions (ifDescr) for each interface on a device, mostly made up from common words. Opsview limits this description to 52 characters. If a description is longer, monitoring the interface will not work as expected (a 'duplicate interface' error may be shown at the bottom of the screen).

Setting this option can remove common words to reduce the length of each interface ifDescr and help to avoid duplicate interfaces.

The settings are as follows:

Setting          Words Removed
Off (default)    None
Level 1          'Nortel Ethernet', 'Nortel', 'Routing', 'Module'
Level 2          Trailing spaces removed
Level 3          'PCI Express', 'Quad Port', 'Gigabit', 'Server'
Level 4          'Corrigent systems', ', , '
Level 5          'Ethernet', 'Frontpanel', 'RJ45', '1000BASE-T', '- no sfp inserted'

Levels are cumulative. Further levels may be added in the future. The level should not be changed once monitoring is working to prevent loss of historical data.
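
The cumulative behaviour of the levels can be illustrated with a minimal Python sketch. The word lists are copied from the table above, but the function name `shorten_ifdescr` and the order in which removals are applied are assumptions made for illustration, not Opsview's actual implementation.

```python
# Words removed at each level, copied from the settings table.
# How Opsview applies them internally is an assumption of this sketch.
LEVEL_WORDS = {
    1: ["Nortel Ethernet", "Nortel", "Routing", "Module"],
    2: [],  # level 2 removes trailing spaces rather than words
    3: ["PCI Express", "Quad Port", "Gigabit", "Server"],
    4: ["Corrigent systems", ", , "],
    5: ["Ethernet", "Frontpanel", "RJ45", "1000BASE-T", "- no sfp inserted"],
}


def shorten_ifdescr(descr: str, level: int) -> str:
    """Apply every level from 1 up to `level`, since levels are cumulative."""
    for current in range(1, level + 1):
        for word in LEVEL_WORDS[current]:
            descr = descr.replace(word, "")
    # Assumption: trailing spaces are stripped last, once level 2 is active.
    if level >= 2:
        descr = descr.rstrip()
    return descr


long_name = "Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 1"
print(repr(shorten_ifdescr(long_name, 1)))
```

With level 1 the port number at the end of the description survives and the result is well under 52 characters, so the three example ports no longer collide.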

Usage

Click on Query host to check if the interfaces have changed. Any new interfaces will be merged into this screen. Any interfaces that no longer exist on the host will be removed. However, the configuration is only saved when the Submit button is pressed.

If you select an interface using the checkbox beside the name, Opsview will create a service for each interface after a reload. This will monitor throughput, errors and discards. Use the checkbox beside Interfaces to poll to toggle all interfaces.

You can set values for each interface you are monitoring.

Default Thresholds

For any selected interface, if the cell is empty, the threshold value will be taken from the default line.

If a cell is set to -, then no threshold will be set. This is equivalent to saying “I do not want to set a warning threshold”.
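
The two rules above can be sketched as a small lookup. The function `effective_threshold` is a hypothetical name, and treating an empty default line as "no threshold" is an assumption of this sketch.

```python
from typing import Optional


def effective_threshold(cell: str, default: str) -> Optional[str]:
    """Return the threshold to apply for one cell, or None for no threshold."""
    value = cell.strip()
    if value == "":
        # Empty cell: fall back to the value on the default line.
        value = default.strip()
    if value in ("", "-"):
        # A dash means no threshold. An empty default is treated the
        # same way here, which is an assumption of this sketch.
        return None
    return value


print(effective_threshold("", "75%"))    # inherits the default
print(effective_threshold("-", "75%"))   # no threshold at all
```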

Throughput

Throughput is monitored from the multiple service check called Interface (this is the historical name). This calculates the rate of throughput between checks and returns the input and output information. If the rate is above the threshold value, then an alert will be raised at the appropriate level.

Performance data will be returned based on the input and output rate in octets per second. If the threshold is specified as a percentage value, the performance data returned will be a percentage value instead.

If there is a host on which you do not want to monitor throughput, you can remove the service check from the host.

Note: If you specify a percentage threshold and it is not possible to work out the interface speed (e.g. VLANs), then the plugin will return a WARNING with the message:

INTERFACENAME throughput (in/out) X bps/Y bps but has an interface speed of 0, so cannot check a percentage threshold

You should set the threshold to be based on bits per second for this interface, rather than using a percentage threshold.
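
As a rough illustration of the calculation described in this section, the following Python sketch derives a rate from two counter readings and converts it to a percentage of the interface speed. The function names are hypothetical, counter wrap is ignored, and this is not the plugin's actual code.

```python
from typing import Optional


def throughput_bps(prev_octets: int, curr_octets: int, seconds: float) -> float:
    """Average bits per second between two readings of an octet counter."""
    return (curr_octets - prev_octets) * 8 / seconds


def utilisation_percent(bps: float, if_speed_bps: int) -> Optional[float]:
    """Percentage of ifSpeed, or None when the reported speed is 0."""
    if if_speed_bps == 0:
        # A percentage threshold cannot be checked without a speed.
        return None
    return bps / if_speed_bps * 100


# 37,500,000 octets in 300 seconds is 1,000,000 bits per second.
rate = throughput_bps(0, 37_500_000, 300)
print(rate, utilisation_percent(rate, 100_000_000), utilisation_percent(rate, 0))
```

The `None` result corresponds to the case where the plugin cannot evaluate a percentage threshold, which is why an absolute threshold in bits per second is the safer choice for such interfaces.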

Advanced Syntax

This feature was committed on 2012-02-28.

It is possible to use advanced syntax for more complicated threshold checking. For example:

  • IN 10:50% - alert if input throughput is below 10% or above 50%
  • OUT 30000:50000 - alert if output throughput is below 30,000 bits/sec or above 50,000 bits/sec
  • IN 10:50% and OUT 30:55% - alert if both input throughput is below 10% or above 50% and output throughput is below 30% or above 55%
  • IN 10:50% or OUT 30:55% - alert if either input throughput is below 10% or above 50% or output throughput is below 30% or above 55%
  • 40:60% - this is the same as IN 40:60% or OUT 40:60%
  • 75% - this is the same as 0:75% which was the old behaviour

Most whitespace is ignored. Note that you cannot mix percentage and bits per second values in the same threshold.
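
To make the semantics of the examples above concrete, here is a minimal Python sketch of a parser and evaluator for this syntax. It is written only from the examples in this section: the names `parse_clause` and `breaches` are hypothetical, chaining more than two clauses left to right is an assumption, and the real plugin may accept or reject inputs differently.

```python
import re

# Optional direction, optional "low:" bound, upper bound, optional "%".
CLAUSE = re.compile(
    r"^(?:(IN|OUT)\s*)?(?:(\d+(?:\.\d+)?)\s*:\s*)?(\d+(?:\.\d+)?)\s*(%?)$",
    re.IGNORECASE,
)


def parse_clause(text):
    """Return (direction, low, high, is_percent) for one clause."""
    match = CLAUSE.match(text.strip())
    if match is None:
        raise ValueError(f"cannot parse threshold clause: {text!r}")
    direction, low, high, percent = match.groups()
    return (
        direction.upper() if direction else None,
        float(low) if low else 0.0,  # "75%" is the same as "0:75%"
        float(high),
        percent == "%",
    )


def outside(value, low, high):
    return value < low or value > high


def clause_breached(clause, in_value, out_value):
    direction, low, high, _ = clause
    if direction == "IN":
        return outside(in_value, low, high)
    if direction == "OUT":
        return outside(out_value, low, high)
    # No direction: same as "IN low:high or OUT low:high".
    return outside(in_value, low, high) or outside(out_value, low, high)


def breaches(threshold, in_value, out_value):
    """True if the threshold is breached.

    Values must be in the threshold's own unit: percentages for "%"
    thresholds, bits per second otherwise.
    """
    parts = re.split(r"\s+(and|or)\s+", threshold.strip(), flags=re.IGNORECASE)
    clauses = [parse_clause(part) for part in parts[0::2]]
    operators = [op.lower() for op in parts[1::2]]
    if len({clause[3] for clause in clauses}) > 1:
        raise ValueError("cannot mix percentage and bits per second values")
    result = clause_breached(clauses[0], in_value, out_value)
    # Assumption: further clauses are combined left to right.
    for op, clause in zip(operators, clauses[1:]):
        hit = clause_breached(clause, in_value, out_value)
        result = (result and hit) if op == "and" else (result or hit)
    return result


print(breaches("IN 10:50% and OUT 30:55%", in_value=5, out_value=40))
print(breaches("IN 10:50% or OUT 30:55%", in_value=5, out_value=40))
```

With input at 5% and output at 40%, only the input range is violated, so the "and" form does not alert while the "or" form does.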

Errors

Errors are monitored from the multiple service check called Errors. This calculates the average number of errors per minute between checks, and returns the input and output errors per minute. If the rate is above the threshold, then an alert will be raised at the appropriate level. Performance data will be returned based on the input and output errors per minute.

If there is a host on which you do not want to monitor errors, you can remove the service check from the host.

If the interface is down, then the state of Errors will be set to OK and the output will say Interface NAME is down.
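
The per-minute calculation and the resulting state can be sketched as follows. The function names are hypothetical, and alerting on the larger of the input and output rates is an assumption of this sketch rather than documented plugin behaviour. The same arithmetic applies to discards.

```python
def rate_per_minute(prev_count: int, curr_count: int, seconds: float) -> float:
    """Average events per minute between two counter readings."""
    return (curr_count - prev_count) * 60 / seconds


def classify(rate_in: float, rate_out: float, warning: float, critical: float,
             interface_up: bool = True) -> str:
    """Map input and output rates to a service state."""
    if not interface_up:
        # A down interface is reported as OK, as described above.
        return "OK"
    # Assumption: the worse of the two directions decides the state.
    worst = max(rate_in, rate_out)
    if worst > critical:
        return "CRITICAL"
    if worst > warning:
        return "WARNING"
    return "OK"


# 15 new input errors over a 300 second interval is 3 errors per minute.
errors_in = rate_per_minute(100, 115, 300)
print(errors_in, classify(errors_in, 0.0, warning=1, critical=5))
```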

Note: You should set maximum check attempts to 1, because a subsequent invocation may find no errors, in which case a notification will not be raised.

Discards

Discards are monitored from the multiple service check called Discards. This calculates the average number of discards per minute between checks, and returns the input and output discards per minute. If the rate is above the threshold, then an alert will be raised at the appropriate level. Performance data will be returned based on the input and output discards per minute.

If there is a host on which you do not want to monitor discards, you can remove the service check from the host.

If the interface is down, then the state of Discards will be set to OK and the output will say Interface NAME is down.

Note: You should set maximum check attempts to 1, because a subsequent invocation may find no discards, in which case a notification will not be raised.

Limitations

You need SNMPv2c if you are monitoring an interface of 100 Mbit/s or over. This is because SNMPv2c supports 64-bit counters, but SNMPv1 doesn't. If you use SNMPv1, your graphs are likely to have gaps in them.
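
One way to see why 32-bit counters struggle on fast links is to work out how quickly an octet counter wraps at full line rate. The short Python sketch below does this arithmetic. The function name is hypothetical, and linking counter wrap to the graph gaps is offered as a likely explanation rather than a statement about Opsview internals.

```python
def wrap_seconds(counter_bits: int, link_bps: float) -> float:
    """Seconds for an octet counter to wrap when the link is saturated."""
    return (2 ** counter_bits) * 8 / link_bps


print(round(wrap_seconds(32, 100e6), 1))  # 32-bit counter at 100 Mbit/s
print(round(wrap_seconds(32, 1e9), 1))    # 32-bit counter at 1 Gbit/s
print(wrap_seconds(64, 1e9) / (365 * 24 * 3600))  # 64-bit counter, in years
```

A 32-bit counter wraps in under six minutes at 100 Mbit/s and in about 34 seconds at 1 Gbit/s, so a poller cannot always tell how often it wrapped between two checks. A 64-bit counter takes thousands of years to wrap at 1 Gbit/s.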

Interfaces are monitored by name, so if the SNMP index position changes (which could happen on a router reboot), the device is rescanned to check. Opsview treats the SNMP index as an internal number that a system administrator does not need to know about. By working with names only, Opsview can automatically follow any changes to the SNMP index position without human intervention.

If there are multiple interfaces with the same name, the ifIndex will also be passed to the plugin to check. If the ifName does not match the expected interface name for this ifIndex, an alert will be raised which says:

WARNING - Interface name $user_specified_ifname expected at index $user_specified_index, but got $name!

You will need to run Query host to list the interfaces to check again.

Note: if the index moves to a position with the same interface name, then Opsview will not see a change and continue monitoring this interface as usual even though it could be a different interface.
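
The name and index check, including the blind spot described in the note above, can be sketched in Python. The function `check_index` and the `<missing>` placeholder are hypothetical. Only the wording of the warning is taken from this page.

```python
from typing import Mapping, Optional


def check_index(if_names: Mapping[int, str], expected_name: str,
                expected_index: int) -> Optional[str]:
    """Return a warning if the name found at the index is not the expected one."""
    # "<missing>" is a placeholder invented for this sketch.
    actual = if_names.get(expected_index, "<missing>")
    if actual == expected_name:
        return None
    return (f"WARNING - Interface name {expected_name} expected at index "
            f"{expected_index}, but got {actual}!")


# The index now points at a different name: a warning is raised.
print(check_index({1: "eth0", 2: "eth1"}, "eth1", 1))

# Two interfaces share a name: a swap between them goes unnoticed.
print(check_index({3: "Vlan1", 4: "Vlan1"}, "Vlan1", 4))
```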

If you have a Cisco router, please check this Cisco support article regarding ifIndex persistence.

Troubleshooting

Why aren't my interfaces being monitored?

The services are only created if the host has the Interface Poller, Interface, Discards and Errors service checks associated with it, either directly in the host monitors tab or via a host template.

Newly created systems have a host template called SNMP - MIB-II, which you should assign to every host that needs interface monitoring.

I'm getting thresholds that are over 100%

For each interface, Opsview works out the utilisation from the number of bytes transferred between two readings, as reported by SNMP, divided by the time between those readings, and expresses the result as a percentage of the interface speed reported by SNMP's ifSpeed value.

We have compared the figures from Opsview's plugins with those from a regularly executed snmpwalk and found that the data values are exactly the same, so we are confident that the collection of data and the calculation of the utilisation are correct.

There seem to be several reasons why you can get over 100% utilisation:

  • The wrong ifSpeed is reported by the device. This can sometimes occur with Net-SNMP, but it is possible to set the speed correctly in the configuration file
  • Some speeds are not the maximum possible throughput. ifSpeed is defined as “An estimate of the interface's current bandwidth in bits per second”
  • Full duplex may skew the results as you may be able to get more transfer in one direction than in another
  • Some devices only update the SNMP counters at certain intervals. This means you could see sudden spikes in utilisation if Opsview gathers data at different intervals
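
The last reason in the list above can be illustrated with a small simulation. All numbers are invented for illustration: a link carrying a steady 80 Mbit/s on a 100 Mbit/s interface, with a device that refreshes its SNMP counter only every 10 seconds. The function names are hypothetical.

```python
def reported_octets(t: int, true_bps: int, refresh: int) -> int:
    """Counter value visible at time t when it is refreshed every `refresh` seconds."""
    last_refresh = (t // refresh) * refresh
    return true_bps // 8 * last_refresh


def apparent_utilisation(t1: int, t2: int, true_bps: int, refresh: int,
                         if_speed_bps: int) -> float:
    """Utilisation a poller would calculate from readings taken at t1 and t2."""
    delta = reported_octets(t2, true_bps, refresh) - reported_octets(t1, true_bps, refresh)
    return delta * 8 / (t2 - t1) / if_speed_bps * 100


# Polls aligned with the refresh interval see the true 80%.
print(apparent_utilisation(0, 300, 80_000_000, 10, 100_000_000))

# Polls at 9 s and 21 s see 20 s of traffic in a 12 s window.
print(apparent_utilisation(9, 21, 80_000_000, 10, 100_000_000))
```

The second poll pair reports well over 100% even though the link never exceeded 80%, because two counter refreshes fall inside a window shorter than two refresh intervals.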

If you have interfaces that are consistently reporting more than 100% utilisation, please contact support to investigate.

Plugin raises a WARNING about an interface with 0 speed

You may see a warning like:

INTERFACENAME throughput (in/out) 0 bps/0 bps but has an interface speed of 0, so cannot check a percentage threshold

When a threshold is specified as a percentage value, Opsview works out the percent utilisation based on the speed. However, if the speed is zero, this is not possible.

Possible resolutions:

  • The device is reporting the incorrect speed - contact the device manufacturer. If the device is a Unix server running net-snmp, you can force net-snmp to set a specific speed per interface
  • The interface is not valid for monitoring - simply untick the interface from being monitored
  • You still want to monitor the interface status - set the threshold to a dash (which means that no threshold check will be required) or set an absolute threshold, so the speed check is ignored

There are duplicate names in the interface SNMP table which has some limitations

Interfaces are tracked by their name rather than their ID as provided by the device being monitored. This is because some devices reallocate IDs on a reboot.

Opsview tracks these interfaces by fetching each interface's ifDescr, shortening it to 52 characters and storing it as the 'short interface name'. This limit is the standard length of interface description supported by the majority of devices. However, this can appear to cause duplicate interface names if the ifDescr contains unnecessary repeated text. For example:

Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 1
Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 2
Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 3

would all be shortened to

Nortel Ethernet Routing Switch 5510-48T Module - Un
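
The collision can be reproduced with a few lines of Python. The helper names are hypothetical and the cut-off is passed as a parameter, because the exact point at which Opsview truncates is not confirmed here beyond the 52 character limit stated above.

```python
from collections import Counter


def short_names(descriptions, limit=52):
    """Truncate each ifDescr to at most `limit` characters."""
    return [descr[:limit] for descr in descriptions]


def duplicates(names):
    """Names that occur more than once after truncation."""
    return sorted(name for name, count in Counter(names).items() if count > 1)


long_descr = [
    "Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 1",
    "Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 2",
    "Nortel Ethernet Routing Switch 5510-48T Module - Unit 1 Port 3",
]
short_descr = ["5510-48T Unit 1 Port 1", "5510-48T Unit 1 Port 2"]

print(len(duplicates(short_names(long_descr))))   # one colliding name
print(len(duplicates(short_names(short_descr))))  # no collisions
```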

The best solution to this problem is to reconfigure every interface ifDescr on the device to contain only a short, unique name such as

5510-48T Unit 1 Port 13

and then re-run 'Query host' on the device configuration SNMP page.
