opsview4.6:runtimedb [2014/09/09 12:19] (current)
 +====== Runtime Database ======
 +Opsview relies on updates to the Runtime database to hold the latest status data. Opsview uses NDOUtils for Nagios® Core to write data, and its own custom process to take the data and update the database.
 +===== Architecture Diagram =====
 +{{:opsview4.6:nagios_to_runtime_architecture.png|}}
 +
 +Nagios Core writes its data to the ''/usr/local/nagios/var/ndo.dat'' file via the ''ndomod'' broker module. This file is moved into the ''/usr/local/nagios/var/ndologs'' directory every 5 seconds.
 +
 +The main daemon is ''import_ndologsd''. This takes the ndo.dat files and handles the interaction with the Runtime database.
 +
 +The interface between Nagios Core and import_ndologsd is the set of log files held in ''/usr/local/nagios/var/ndologs''.
 +
 +Note that if there is a problem with the import process, Nagios Core will continue to run, monitor and alert. However, the status screens in Opsview may not be up to date.
 +
 +
 +
 +===== Troubleshooting =====
 +==== Where are the debug logs? ====
 +''import_ndologsd'' writes its log entries to ''/var/log/opsview/opsviewd.log''.
 +
 +==== Can I get extra debug? ====
 +You can uncomment the following line in ''/usr/local/nagios/etc/Log4perl.conf''. It may take up to 30 seconds for the daemon to recognise the change:
 +<code>
 +log4perl.logger.import_ndologsd=DEBUG
 +</code>
 +
 +You will get timing and size information for every log file imported.
 +
 +Note that a DEBUG level will also copy any NDO log files from ''/usr/local/nagios/var/ndologs'' into ''/usr/local/nagios/var/ndologs.archive'', so do not leave the debug level on for longer than necessary as this will use a lot of disk space. If you have issues with slow Runtime performance, gather these files over a reload and escalate to Opsview support for further investigation.
 +
 +
 +==== What does "Import of 1257049915.210140, size=1123, took 5.63 seconds > 5 seconds" mean? ====
 +If you get log entries such as:
 +<code>
 +[2009/10/06 21:08:28] [import_ndologsd] [WARN] Import of 1254859702.684564, size=530288, took 6.21 seconds > 5 seconds
 +</code>
 +
 +This is usually fine because a large import (around 530K in the example above) is occurring, most likely at the end of the reload process.
 +
 +If you have entries where the size is small and the time is long, then there is probably some [[opsview4.6:mysql#mysql_tuning|database tuning]] you can do.
 +<code>
 +[2009/11/01 04:32:01] [import_ndologsd] [WARN] Import of 1257049915.210140, size=1123, took 5.63 seconds > 5 seconds
 +</code>
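
The distinction above (large and slow is usually fine, small and slow suggests database tuning) can be checked across a whole log with a short script. The following is a minimal Python sketch, not part of Opsview: the regular expression is inferred only from the two sample lines shown above, and the ''SMALL_SIZE'' cut-off and the function name ''slow_small_imports'' are hypothetical choices for illustration.

```python
import re

# Pattern inferred from the two sample WARN lines above (an assumption,
# not a documented log format).
IMPORT_WARN = re.compile(
    r"\[(?P<when>[^\]]+)\] \[import_ndologsd\] \[WARN\] "
    r"Import of (?P<name>\S+), size=(?P<size>\d+), "
    r"took (?P<secs>[\d.]+) seconds"
)

SMALL_SIZE = 10_000  # hypothetical cut-off for a "small" import


def slow_small_imports(lines, small_size=SMALL_SIZE):
    """Return (timestamp, file name, size, seconds) for slow imports of small files."""
    hits = []
    for line in lines:
        m = IMPORT_WARN.search(line)
        if m and int(m["size"]) < small_size:
            hits.append((m["when"], m["name"], int(m["size"]), float(m["secs"])))
    return hits


# The two log entries quoted in this section.
sample = [
    "[2009/10/06 21:08:28] [import_ndologsd] [WARN] Import of 1254859702.684564, size=530288, took 6.21 seconds > 5 seconds",
    "[2009/11/01 04:32:01] [import_ndologsd] [WARN] Import of 1257049915.210140, size=1123, took 5.63 seconds > 5 seconds",
]
for hit in slow_small_imports(sample):
    print(hit)
```

With the two sample entries, only the second (size=1123) is reported. To scan a real log, pass the open file object for ''/var/log/opsview/opsviewd.log'' as ''lines''.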
 +
 +
 +==== How do I know that the updates are timely? ====
 +In a default install of Opsview, a service check called //Opsview NDO// will be created and associated with the opsview host.
 +
 +This service check uses the check_opsview_ndo_import plugin to collect information.
 +
 +We recommend that you set up the appropriate notifications for this service check so that you are informed when imports are not reaching the database in time.
 +
 +
 +==== NDO.dat is stale - service check statuses are not updating in UI ====
 +If you find that your NDO.dat file is stale and dashboard statuses are not updating, the cause may be the ''noexec'' mount option being enabled on ''/tmp''. Review your ''/etc/fstab'' file, remove the ''noexec'' option from the ''/tmp'' entry and remount.
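
A quick way to confirm whether ''/tmp'' currently carries the option is to inspect the mount table. The sketch below is an illustration only and assumes a Linux host, where ''/proc/mounts'' lists one mount per line as device, mount point, filesystem type and comma-separated options. The function name ''tmp_is_noexec'' and the sample mount table are hypothetical.

```python
def tmp_is_noexec(mounts_text, mount_point="/tmp"):
    """Return True if the last mount entry for mount_point has the noexec option."""
    noexec = False
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) >= 4 and fields[1] == mount_point:
            # A later entry for the same mount point overrides an earlier one.
            noexec = "noexec" in fields[3].split(",")
    return noexec


# Hypothetical mount table used only to demonstrate the check.
sample_mounts = """\
/dev/sda1 / ext4 rw,relatime 0 0
tmpfs /tmp tmpfs rw,nosuid,nodev,noexec 0 0
"""
print("noexec on /tmp:", tmp_is_noexec(sample_mounts))
```

On a live system, the same function can be given the contents of ''/proc/mounts'' instead of the sample text.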
 +
 +
 +==== What happens when the database is down? ====
 +If the database connection is down, the import process is blocked until the database is available again.
 +
 +You will see errors like this in ''opsviewd.log'':
 +<code>
 +[2013/04/04 11:37:59] [import_ndologsd] [WARN] Reconnecting to database
 +[2013/04/04 11:38:02] [import_ndologsd] [FATAL] ..reconnecting failed
 +[2013/04/04 11:38:02] [import_ndologsd] [WARN] Reconnecting to database
 +[2013/04/04 11:38:05] [import_ndologsd] [FATAL] ..reconnecting failed
 +</code>
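
To see how long the database was unreachable, the failed reconnect attempts can be counted and bounded by their first and last timestamps. This is a minimal Python sketch under the assumption that the entries always look like the sample above; the pattern and the function name ''reconnect_failures'' are illustrative, not part of Opsview.

```python
import re
from datetime import datetime

# Pattern inferred from the sample FATAL lines above (an assumption).
FAILED = re.compile(
    r"^\[(?P<when>[^\]]+)\] \[import_ndologsd\] \[FATAL\] \.\.reconnecting failed"
)


def reconnect_failures(lines):
    """Return (count, first, last) for failed reconnects, or (0, None, None)."""
    stamps = []
    for line in lines:
        m = FAILED.match(line.strip())
        if m:
            stamps.append(datetime.strptime(m["when"], "%Y/%m/%d %H:%M:%S"))
    if not stamps:
        return 0, None, None
    return len(stamps), min(stamps), max(stamps)


# The four log entries quoted in this section.
outage_sample = [
    "[2013/04/04 11:37:59] [import_ndologsd] [WARN] Reconnecting to database",
    "[2013/04/04 11:38:02] [import_ndologsd] [FATAL] ..reconnecting failed",
    "[2013/04/04 11:38:02] [import_ndologsd] [WARN] Reconnecting to database",
    "[2013/04/04 11:38:05] [import_ndologsd] [FATAL] ..reconnecting failed",
]
count, first, last = reconnect_failures(outage_sample)
print(count, first, last)
```

The gap between ''first'' and ''last'' gives a lower bound on the outage, since the import process stays blocked until a reconnect succeeds.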
 +
 +
 +==== I had a database problem - how can I get the imports to catch up? ====
 +If you have a major database problem and the ndologs directory has many entries, then you can either:
 +  * leave the import process to continue, so all the data is inserted and wait for the latest status data to be displayed
 +  * or remove the log files from the directory, thus getting the latest status data into the Runtime database straight away, but losing state history data
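
Before choosing between the two options, it can help to measure the backlog. The sketch below is a hedged illustration: it counts the files waiting in the ndologs directory and reports the age of the oldest one by modification time. The directory path comes from this page; the function name ''backlog_summary'' is hypothetical, and the script only reads the directory and never deletes anything.

```python
import os
import time

NDOLOGS_DIR = "/usr/local/nagios/var/ndologs"  # path taken from this page


def backlog_summary(directory=NDOLOGS_DIR, now=None):
    """Return (file count, age in seconds of the oldest file by mtime)."""
    now = time.time() if now is None else now
    mtimes = [
        entry.stat().st_mtime
        for entry in os.scandir(directory)
        if entry.is_file()
    ]
    if not mtimes:
        return 0, 0.0
    return len(mtimes), max(0.0, now - min(mtimes))


# Only report when the directory exists, so the sketch is safe to run anywhere.
if os.path.isdir(NDOLOGS_DIR):
    waiting, oldest_age = backlog_summary()
    print(f"{waiting} files waiting, oldest is {oldest_age:.0f}s old")
```

A large count with a steadily shrinking oldest age suggests the import is catching up by itself; a growing age suggests it is falling further behind.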
 +
 +You can get the latest database status time by hovering over the //Last Updated// value in the status bar (when a refresh occurs). For example, it could read:
 +<code>
 +(Server status: 2009-11-03 09:02:08)
 +</code>
 +
 +If you find that the Runtime database is updating very slowly, check whether the underlying I/O subsystem has a problem. We have seen issues where a virtual machine running Opsview has slow I/O due to backups on the VM host.
 +
 +Also, you may be able to improve performance by [[opsview4.6:mysql#mysql_performance_tuning|tuning MySQL]].
 +
 +==== Can I tell how long historically updates are taking? ====
 +The new Perl NDO no longer logs the connection information, so this is not possible.