
Runtime Database

Opsview relies on updates to the Runtime database to hold the latest status data. Opsview uses NDOUtils for Nagios® Core to write data, and its own custom process to take the data and update the database.

Architecture Diagram

Nagios Core writes its data to the /usr/local/nagios/var/ndo.dat file via the ndomod broker module. This file is moved into the /usr/local/nagios/var/ndologs directory every 5 seconds.

The main daemon is import_ndologsd. This takes the ndo.dat files and handles the interaction with the Runtime database.

The interface between Nagios Core and import_ndologsd is the set of log files held in /usr/local/nagios/var/ndologs.

Note that if there is a problem with the import process, Nagios Core will continue to run and do its monitoring and continue alerting. However, the status screens in Opsview may not be up to date.

Troubleshooting

Where are the debug logs?

import_ndologsd writes its log output to /var/log/opsview/opsviewd.log.

Can I get extra debug?

You can uncomment the following line in /usr/local/nagios/etc/Log4perl.conf. It may take up to 30 seconds for the daemon to recognise the change:

log4perl.logger.import_ndologsd=DEBUG

You will get timing and size information for every log file imported.

Note that a DEBUG level will also copy any NDO log files from /usr/local/nagios/var/ndologs into /usr/local/nagios/var/ndologs.archive, so do not leave the debug level on for longer than necessary as this will use a lot of disk space. If you have issues with slow Runtime performance, gather these files over a reload and escalate to Opsview support for further investigation.
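While DEBUG is enabled, it can help to keep an eye on how much space the archive directory is using. The following is a minimal sketch, not part of Opsview: the function name directory_usage is hypothetical, and the path is the one given above, so adjust it if your install differs.

```python
import os

# Path taken from the text above; adjust if your install differs.
ARCHIVE_DIR = "/usr/local/nagios/var/ndologs.archive"


def directory_usage(path):
    """Return (file_count, total_bytes) for regular files directly in path.

    Hypothetical helper for illustration; subdirectories are not descended into.
    """
    count = 0
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                count += 1
                total += entry.stat(follow_symlinks=False).st_size
    return count, total
```

For example, calling directory_usage(ARCHIVE_DIR) before and after a reload gives a rough idea of how quickly the archive grows.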

What does "Import of 1257049915.210140, size=1123, took 5.63 seconds > 5 seconds" mean?

If you get log entries such as:

[2009/10/06 21:08:28] [import_ndologsd] [WARN] Import of 1254859702.684564, size=530288, took 6.21 seconds > 5 seconds

Entries like this are usually fine because a large import (around 530K in the example above) is occurring, most likely at the end of the reload process.

If you have entries where the size is small and the time is long, as in the example below, then there is probably some database tuning you can do.

[2009/11/01 04:32:01] [import_ndologsd] [WARN] Import of 1257049915.210140, size=1123, took 5.63 seconds > 5 seconds
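To separate the two cases, the warnings can be filtered by size. The sketch below assumes the log line format shown in the examples above. The 10,000 threshold for a "small" import is an arbitrary illustration, not an Opsview setting, and the function name slow_small_imports is hypothetical.

```python
import re

# Matches the WARN lines shown above, for example:
# [2009/11/01 04:32:01] [import_ndologsd] [WARN] Import of 1257049915.210140,
#   size=1123, took 5.63 seconds > 5 seconds
WARN_RE = re.compile(
    r"\[(?P<when>[^\]]+)\] \[import_ndologsd\] \[WARN\] "
    r"Import of (?P<name>\S+), size=(?P<size>\d+), "
    r"took (?P<secs>\d+(?:\.\d+)?) seconds"
)

# Assumption: what counts as a "small" import. Pick a value that suits your system.
SMALL_IMPORT_SIZE = 10_000


def slow_small_imports(lines, max_size=SMALL_IMPORT_SIZE):
    """Return (timestamp, file, size, seconds) for slow imports of small files."""
    hits = []
    for line in lines:
        match = WARN_RE.search(line)
        if match and int(match.group("size")) <= max_size:
            hits.append(
                (
                    match.group("when"),
                    match.group("name"),
                    int(match.group("size")),
                    float(match.group("secs")),
                )
            )
    return hits
```

Passing the open opsviewd.log file object as lines returns only the warnings worth investigating. Large imports, such as the 530288 example, are skipped.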

How do I know that the updates are timely?

In a default install of Opsview, a service check called Opsview NDO will be created and associated with the opsview host.

This service check uses the check_opsview_ndo_import plugin to collect information.

We recommend that you set up appropriate notifications for this service check so that you are informed when imports are not reaching the database in time.

ndo.dat is stale - service check statuses are not updating in UI

If you find that your ndo.dat file is stale and dashboard statuses are not updating, the cause may be the noexec mount option being enabled on /tmp. Review your /etc/fstab file, remove the noexec option from the /tmp entry, and remount /tmp.
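To check the options currently in effect, rather than what /etc/fstab says, you can inspect the mount table. This is a minimal sketch assuming a Linux host where /proc/mounts is available. The function names are hypothetical. If /tmp is not a separate mount, mount_options returns None, tmp_is_noexec returns False, and you would need to check the parent filesystem instead.

```python
def mount_options(mount_point, mounts_text):
    """Return the set of options for mount_point, or None if it is not listed.

    mounts_text is the content of a /proc/mounts-style table, where each line is:
    device mount_point fstype options dump pass
    """
    options = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            # Later entries override earlier ones for the same mount point.
            options = set(fields[3].split(","))
    return options


def tmp_is_noexec(mounts_text):
    """Return True if /tmp is a separate mount that carries the noexec option."""
    options = mount_options("/tmp", mounts_text)
    return options is not None and "noexec" in options
```

For example, tmp_is_noexec(open("/proc/mounts").read()) returns True when /tmp is mounted with noexec.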

What happens when the database is down?

If the database connection is down, the import process is blocked until the database is available again.

You will see errors like this in opsviewd.log:

[2013/04/04 11:37:59] [import_ndologsd] [WARN] Reconnecting to database
[2013/04/04 11:38:02] [import_ndologsd] [FATAL] ..reconnecting failed
[2013/04/04 11:38:02] [import_ndologsd] [WARN] Reconnecting to database
[2013/04/04 11:38:05] [import_ndologsd] [FATAL] ..reconnecting failed

I had a database problem - how can I get the imports to catch up?

If you have had a major database problem and the ndologs directory has many entries, then you can either:

  • leave the import process to continue, so all the data is inserted and wait for the latest status data to be displayed
  • or remove the log files from the directory, thus getting the latest status data into the Runtime database straight away, but losing state history data
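Before choosing between the two options, it may help to measure the backlog. The sketch below counts the pending files and reports the age of the oldest one, based on file modification times. It is an illustration only: the function name backlog_summary is hypothetical, and the path is the one given earlier on this page.

```python
import os
import time

# Path taken from the text above; adjust if your install differs.
NDOLOGS_DIR = "/usr/local/nagios/var/ndologs"


def backlog_summary(path, now=None):
    """Return (pending_files, oldest_age_seconds) for regular files in path.

    oldest_age_seconds is None when the directory holds no files.
    The optional now argument (seconds since the epoch) exists for testing.
    """
    now = time.time() if now is None else now
    mtimes = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                mtimes.append(entry.stat(follow_symlinks=False).st_mtime)
    if not mtimes:
        return 0, None
    return len(mtimes), now - min(mtimes)
```

Running backlog_summary(NDOLOGS_DIR) a few minutes apart shows whether the import process is gaining on the backlog or falling further behind.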

You can get the latest database status time by hovering over the Last Updated value in the status bar (when a refresh occurs). For example, it could read:

(Server status: 2009-11-03 09:02:08)

If you find that the Runtime database is updating very slowly, you should check whether the underlying I/O subsystem has a problem. We have seen issues where a virtual machine running Opsview has slow I/O due to backups on the VM host.

Also, you may be able to improve performance by tuning MySQL.

Can I tell how long updates have taken historically?

The new Perl NDO no longer logs the connection information, so this is not possible.
