Welcome to docs.opsview.com

Monitoring Web User Interface

See the Quick Start guide for information about various page elements.

General

The language used in the Opsview user interface can be changed. See the internationalisation page for more information.

Times in the web user interface are displayed in the local timezone of the Opsview Master server. However, all times stored in the database are in UTC.

Common Status Messages

In addition to specific plugin data you may see the following results returned by service checks:

Message | Meaning | Notes
Agent not responding | Monitoring plugin could not connect to SNMP agent | SNMP agent is not running on device, or plugin is configured to use wrong SNMP version
Connection Refused | Monitoring plugin could not connect to host | Often means that the service you are checking is not available
Plugin timeout after 10 seconds | Monitoring plugin will stop trying to connect after pre-specified time interval (10 seconds) | Service is down, or very slow to respond
Service results are stale | Data has not been received from slave server in a timely manner | Large numbers of 'stale' service checks indicate a problem communicating with the slave server
Socket timeout after 20 seconds | Monitoring plugin will stop trying to connect after pre-specified time interval (20 seconds) | Service is down, or very slow to respond

Status views

Host status:

  • UP: Host is responding to network requests
  • DOWN: Host is not responding to network requests within pre-defined timeframe
  • UNREACHABLE: It was not possible to establish status of host (due to network outage for example)

Service status

  • OK: Service check completed successfully and no thresholds have been exceeded
  • WARNING: Service check completed successfully and warning threshold was exceeded
  • CRITICAL: Service check completed successfully and critical threshold was exceeded
  • UNKNOWN: It was not possible to establish status of service check, or check failed to run

Host Group Hierarchy

Provides a top-down view of your entire system. This lets you see at a high level which hosts and services are in which state, and drill down into other host groups.

Note: Only hosts that have at least 1 service will be displayed. Empty hosts are not displayed.

  • Host Group: Hosts and network devices are grouped into Host Groups
  • Host Status Totals: Status totals for hosts and network devices (UP, DOWN and UNREACHABLE)
  • Service Status Totals: Status totals for service checks (OK, WARNING, CRITICAL and UNKNOWN)
  • Handled: Host or Service events that have been 'handled' via scheduling downtime or acknowledging
  • Unhandled: Host or Service events that have not been handled by an operator. The precise meaning of unhandled is:
    • A host is unhandled if:
      • It is DOWN and
      • it has not been acknowledged and
      • it is not in a period of downtime,
      • Otherwise it is handled
    • A service is unhandled if:
      • Its host is UP and
      • the service is not OK and
      • the service has not been acknowledged and
      • the service is not in a period of downtime,
      • Otherwise it is handled
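
The handled and unhandled rules above can be sketched as two small predicates. This is only an illustration of the definition, not Opsview's own code: the CheckState record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CheckState:
    """Hypothetical record; the field names are illustrative, not Opsview's schema."""
    state: str                  # e.g. "UP", "DOWN", "OK", "CRITICAL"
    acknowledged: bool = False
    in_downtime: bool = False


def host_is_unhandled(host: CheckState) -> bool:
    # Unhandled only when DOWN, not acknowledged and not in downtime.
    return host.state == "DOWN" and not host.acknowledged and not host.in_downtime


def service_is_unhandled(host: CheckState, service: CheckState) -> bool:
    # Unhandled only when the host is UP and the service is in a non-OK state
    # that has been neither acknowledged nor covered by downtime.
    return (
        host.state == "UP"
        and service.state != "OK"
        and not service.acknowledged
        and not service.in_downtime
    )
```

Under these rules a failing service on a DOWN host counts as handled, because the first condition (its host is UP) is not met.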

Sorting of status information

Status information can be sorted (see URL parameters below), and with some sort orders data may be repeated. For example, when sorting by state, a host with both a warning and a critical alert will be shown twice.

Keyword View

This gives an aggregated view of a collection of services. See the keyword view page for more information.

Network Map

Clickable map of network infrastructure and hosts. This map is drawn from the perspective of the monitoring system and may not directly match other network diagrams.

  • Layout Method: Different ways of drawing network map
  • Scaling factor: Allows you to change scale of network map (zoom in or zoom out)
  • Drawing Layers: Allows you to filter devices based on Host Group

Spacing out a subset of hosts

There is a hidden option which will try and space out the hosts in the network map.

A circular network map can look like this:

If you select the host groups (either by including or excluding), you can reduce the number of hosts:

However, this keeps the same spacing, which doesn't increase the readability of the hosts as they will overlap.

If you add the hidden option onlyincludehostsinlayerlist=1 to the URL, the algorithm will space out the hosts based on the host groups in the include list:

This algorithm is not perfect, as the links between hosts may not be correct if there is not a full tree between the monitoring system and the hosts, but it can help with visualising the parent tree relationship.

You can then bookmark this URL to get this view.
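
If you prefer to generate the bookmark rather than edit the URL by hand, the sketch below appends the hidden option to an existing URL. The network map URL shown is hypothetical (copy the real one from your browser), and add_query_param is an illustrative helper name, not part of Opsview.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def add_query_param(url: str, name: str, value: str) -> str:
    """Return url with name=value appended to its existing query string."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((name, value))
    return urlunsplit(parts._replace(query=urlencode(query, quote_via=quote)))


# Hypothetical network map URL, used only for illustration.
bookmark = add_query_param(
    "http://opsview/example/networkmap?layout=circular",
    "onlyincludehostsinlayerlist",
    "1",
)
print(bookmark)
```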

Service Detail

Detailed view of all hosts and service checks.

For each service, various icons will appear based on what information is available:

State | Notes
Acknowledged | For more information, see the acknowledgements page
Scheduled Downtime (downtime is in the future) |
Currently in downtime |
Comments | The comments will not show if an acknowledgement or downtime icon is displayed
Graphing | You may have to reload Opsview for new service checks to be marked with the graphing icon
Flapping |

Host Commands

Disable active checks for this host

Disables all active checks for this host. Active checks are those which use Nagios plugins; SNMP traps will still be processed.

Re-schedule the next check of this host

Allows the next host check to be rescheduled. This is useful if the host has recently recovered from a problem.

Start / stop accepting passive checks for this host

Passive checks allow host status data to be submitted for this host from an external source.

Submit passive check for this host

If passive checks are enabled, allows the host status to be set manually.

Start obsessing for this host

Enable / Disable notifications for this host

Configures whether status notifications should be sent for this host (does not affect service checks).

Enable / Disable notifications for all services on this host

Configures whether status notifications should be sent for services associated with this host

Schedule downtime for this host

Enable / Disable checks for all services on this host

Configures whether this host should be monitored. It is strongly recommended that checks are not disabled.

Enable / Disable event handler for this host

Configures whether the event handler is enabled. The event handler is used by distributed monitoring; it can also be used for integration with other management systems.

Enable / Disable flap detection for this host

Flap detection is a mechanism used to suppress notifications for a host which is frequently changing state (e.g. rebooting, or a recurring network issue). Generally this should be enabled.

Service Commands

Enable / disable active checks for this service

Configures whether active checks should be enabled for this service check. Active checks use Nagios plugins; passive checks may be SNMP traps or data from external systems.

Re-schedule the next check of this service

Allows the next service check to be rescheduled. This is useful if the service has recently recovered from a problem.

Start / stop accepting passive checks for this service

Passive checks allow service status data to be submitted for this service from an external source.

Submit passive check for this service

If passive checks are enabled, allows the service status and check result to be submitted manually.

Start obsessing for this service

Enable / Disable notifications for this service

Configures whether status notifications should be sent for this service.

Schedule downtime for this service

Enable / Disable event handler for this service

Configures whether the event handler is enabled. The event handler is used by distributed monitoring; it can also be used for integration with other management systems.

Enable / Disable flap detection for this service

Flap detection is a mechanism used to suppress notifications for a service which is frequently changing state (e.g. a service restarting, or a recurring network issue). Generally this should be enabled.

URL parameters

You can add URL parameters to affect the display of services. This works for /status/hostgroup, /status/host, /status/service and /viewport/{keyword}.

Parameters are appended to the URL as a query string, with the first introduced by ? and the rest separated by &:

/status/hostgroup?parameter=value&parameter=value

These are some of the filter parameters you can use:

Parameter | Effect
changeaccess=1 | Will show list of services that this logged in contact has change access for
host={name} | Will filter according to host name. Can use % as a wildcard. Can specify multiple times
servicecheck={name} | Will filter according to service name. Can use % as a wildcard. Can specify multiple times
host_state={hoststateid} | Will filter according to host state id. Can specify multiple times (up=0, down=1, unreachable=2)
state={stateid} | Will filter according to service state id. Can specify multiple times (ok=0, warning=1, critical=2, unknown=3)
host_filter={type} | Type is either handled or unhandled. Will filter results according to whether the host is handled or unhandled
filter={type} | Type is either handled or unhandled. Will filter results according to whether the service is handled or unhandled
asuser={contactname} | Will show list of services based on this contact. Only available for view all roles
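
As a rough illustration of combining these filters, the Python sketch below builds a status URL from a list of parameter pairs, so that a parameter such as state can be given more than once. The status_url helper and the host pattern web% are hypothetical; the state ids come from the table above.

```python
from urllib.parse import quote, urlencode

# State ids as listed in the parameter table.
HOST_STATES = {"up": 0, "down": 1, "unreachable": 2}
SERVICE_STATES = {"ok": 0, "warning": 1, "critical": 2, "unknown": 3}


def status_url(base: str, path: str, params: list[tuple[str, object]]) -> str:
    """Join base and path, then append params as a query string.

    A list of pairs (rather than a dict) lets a parameter repeat.
    """
    url = f"{base.rstrip('/')}{path}"
    query = urlencode(params, quote_via=quote)
    return f"{url}?{query}" if query else url


# Unhandled warning or critical services on hosts whose name starts with "web".
url = status_url(
    "http://opsview",
    "/status/service",
    [
        ("state", SERVICE_STATES["warning"]),
        ("state", SERVICE_STATES["critical"]),
        ("filter", "unhandled"),
        ("host", "web%"),
    ],
)
print(url)
# http://opsview/status/service?state=1&state=2&filter=unhandled&host=web%25
```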

For /status/service, extra parameters available are:

Parameter | Effect
htid={id} | Will show services related to host template id
bcid={id} | Will show services related to business component id
bsid={id} | Will show services related to business service id

In addition, you can specify an ordering for the results (ordering is not available with /status/host). These can be specified using order={value}. You can specify this multiple times, with the priority from left to right. Valid orders are:

Name | Effect
state | Orders by service state (order of states: OK, UNKNOWN, WARNING, CRITICAL)
service | Orders by service name
host | Orders by host name
host_state | Orders by host state (order of states: UP, DOWN, UNREACHABLE)
last_check | Orders by time of last check
last_state_change | Orders by time of last state change

Syntax notes: To include a space in a value, e.g. 'Unix Memory', replace the space with %20, giving a URL such as http://opsview/status/service/?state=2&servicename=Unix%20Memory. To use the % wildcard in a URL, encode it as %25. For example, to show all critical service checks whose names begin with 'Unix', use http://opsview/status/service/?state=2&servicename=Unix%25
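
If you generate these URLs from a script, the encoding can be left to a library. A minimal sketch using Python's standard urllib.parse.quote follows; encode_value is an illustrative helper name.

```python
from urllib.parse import quote


def encode_value(value: str) -> str:
    """Percent-encode one parameter value (space becomes %20, % becomes %25)."""
    return quote(value, safe="")


print(encode_value("Unix Memory"))  # Unix%20Memory
print(encode_value("Unix%"))        # Unix%25
```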

In addition, each order can be reversed by suffixing _desc to the value.

The default order is host, service.
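
To make the ordering rules concrete, the sketch below applies a list of order values to rows held in memory. It only illustrates the semantics described here (left-to-right priority, the _desc suffix, the default order and the stated order of states). It is not how Opsview implements sorting, and the row dictionaries are hypothetical.

```python
VALID_ORDERS = {"state", "service", "host", "host_state", "last_check", "last_state_change"}
DEFAULT_ORDER = ["host", "service"]

# Order of states as given in the table of valid orders.
STATE_RANK = {"OK": 0, "UNKNOWN": 1, "WARNING": 2, "CRITICAL": 3}
HOST_STATE_RANK = {"UP": 0, "DOWN": 1, "UNREACHABLE": 2}


def parse_order(values: list[str]) -> list[tuple[str, bool]]:
    """Turn order values into (key, descending) pairs, highest priority first."""
    parsed = []
    for value in values or DEFAULT_ORDER:
        descending = value.endswith("_desc")
        key = value[: -len("_desc")] if descending else value
        if key not in VALID_ORDERS:
            raise ValueError(f"unknown order: {value}")
        parsed.append((key, descending))
    return parsed


def sort_value(row: dict, key: str):
    """Map state names to their rank; other columns sort by their raw value."""
    if key == "state":
        return STATE_RANK[row[key]]
    if key == "host_state":
        return HOST_STATE_RANK[row[key]]
    return row[key]


def sort_rows(rows: list[dict], values: list[str]) -> list[dict]:
    """Sort rows by several keys; the leftmost order value has top priority."""
    result = list(rows)
    # Python's sort is stable, so apply the lowest-priority key first.
    for key, descending in reversed(parse_order(values)):
        result.sort(key=lambda row, k=key: sort_value(row, k), reverse=descending)
    return result
```

For example, sort_rows(rows, ["state_desc", "host"]) corresponds to order=state_desc&order=host: critical services first, then by host name within each state.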

General URL parameters

These parameters are available across Opsview web:

  • include_header - if set to 0, then the top banner is removed although the page header will remain
  • lang - override the language variable for this page