Upgrading Opsview
This page details the steps to perform when upgrading Opsview. While most actions are handled during the package installation, there may be some pre-upgrade and post-upgrade steps that require manual intervention.
Pre upgrade
- Verify that the Opsview configuration is committed. As part of the upgrade, an Opsview reload will be run.
- Verify that all the Opsview slaves are contactable. As part of the upgrade, the new files will be pushed to the slaves.
- Run the backup script if required:
/usr/local/nagios/bin/rc.opsview backup
This will back up the whole of /usr/local/opsview-web, the important parts of /usr/local/nagios and the configuration parts of the databases. Check that the backup files have been created in $backup_dir (set in /usr/local/nagios/etc/opsview.conf)
- If you are using lighttpd with fastcgi, we recommend you stop lighttpd, as it uses a lot of CPU time when the opsview-web fastcgi processes are not available
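The check that backup files were actually written can be scripted. The following is a minimal sketch, not part of Opsview: the function name `new_backup_files` and the marker-file approach are assumptions made for illustration, and the backup directory must be passed in by hand (use the value of $backup_dir from opsview.conf).

```shell
# Hypothetical helper: list files under a backup directory that are newer
# than a marker file created just before the backup was started.
# $1 = backup directory, $2 = marker file
# Exit status: 0 if new files were found, 1 if none, 2 on bad arguments.
new_backup_files() (
    dir=$1
    marker=$2
    [ -d "$dir" ] || { echo "No such directory: $dir" >&2; exit 2; }
    [ -e "$marker" ] || { echo "No such marker file: $marker" >&2; exit 2; }
    # -newer compares modification times and is available in POSIX find
    found=$(find "$dir" -type f -newer "$marker")
    if [ -z "$found" ]; then
        echo "No backup files newer than $marker in $dir" >&2
        exit 1
    fi
    printf '%s\n' "$found"
)
```

To use it, create the marker with `touch /tmp/backup_marker`, run `/usr/local/nagios/bin/rc.opsview backup`, and then call `new_backup_files` with your backup directory and `/tmp/backup_marker`. The marker path here is only an example.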
You are now ready to upgrade Opsview, using your normal OS package methods.
Database schema changes
As part of the upgrade process, the database schema may change. Normal output looks like:
Mon Nov 9 17:47:00 2009: Starting for runtime-nagios
Mon Nov 9 17:47:01 2009: DB at version 3.3.0
Re-arranging configuration indexes
Updated database to version 3.3.1
From Opsview 3.4.0, it is possible to override some of the steps taken as part of the upgrade process.
The process to override the steps is:
- Find the appropriate upgrade script. There is an upgrade script for each database, with the name upgradedb_{name}.pl, which you can find at https://secure.opsview.com/svn/opsview/trunk/opsview-core/installer/
- Review the upgrade step that does not need to be run, noting the database name and the upgrade step number (runtime is split into runtime-nagios and runtime-opsview)
- Create the directory /tmp/opsview_upgrade_override
- Create a file in that directory with the name {db_name}-{upgrade_step_number}
- Now, when the upgrade runs, that step will be skipped
For example, the upgrade script upgradedb_runtime.pl has a step 3.0.1 for runtime-nagios which converts tables to InnoDB format. If you have already done this, then you do not need to run this step. Create a file to stop this step from running:
touch /tmp/opsview_upgrade_override/runtime-nagios-3.0.1
This upgrade step will now not run during the upgrade process.
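If several steps need to be skipped, the directory and file creation can be wrapped in a small function. This is a sketch only: `skip_upgrade_step` is a hypothetical name, and the `OVERRIDE_DIR` environment override is an addition made here so the function can be tried out against a scratch directory. The default directory is the one named above.

```shell
# Directory checked during the upgrade (from the text above). The environment
# override is an assumption added for testing, not an Opsview feature.
OVERRIDE_DIR=${OVERRIDE_DIR:-/tmp/opsview_upgrade_override}

# Hypothetical helper: create the override marker for one upgrade step.
# $1 = database name (e.g. runtime-nagios), $2 = step number (e.g. 3.0.1)
skip_upgrade_step() (
    db_name=$1
    step=$2
    if [ -z "$db_name" ] || [ -z "$step" ]; then
        echo "usage: skip_upgrade_step DB_NAME STEP" >&2
        exit 2
    fi
    # The marker file is named {db_name}-{upgrade_step_number}
    mkdir -p "$OVERRIDE_DIR" && touch "$OVERRIDE_DIR/$db_name-$step"
)
```

Calling `skip_upgrade_step runtime-nagios 3.0.1` has the same effect as the mkdir and touch commands in the example above.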
Version specific
These are some steps that should be checked before upgrading to a specific level.
| Version | Steps required |
|---|---|
| 3.1.0 | Host group hierarchy constraint |
| 3.3.0 | ODW table index change |
| 3.3.1 | Conversion to BIGINT for two tables in Runtime |
| 3.3.2 | Nagios® Core instances in Runtime database |
| 3.7.2 | Runtime database index changes, Service checks expected on your system |
During upgrade
Upgrade your package using your normal OS package methods. Notes below are based on your package management system.
Note: If you have a distributed environment, please check the output from the package installation. We've discovered a case where a slave upgrade can fail if there are files that are not readable by the nagios user. This means that the files on the slave may not be in sync with the master. This is tracked as bug OPS-896.
YUM
Due to the dependency logic not working correctly, you need to name each package explicitly when running yum:
yum install opsview opsview-core opsview-base opsview-perl opsview-web
This is tracked in OPS-1593.
RPMs
In Red Hat 4, if a post scriptlet fails, the installation as a whole does not fail, so it is possible to end up with a mixture of code in an inconsistent state. If this happens, you will need to reinstall the package that failed. The post scripts have been designed so that they can safely be rerun multiple times.
Ensure you review the output of the RPM installation to verify that the install was successful.
To reinstall a failed package, you have to remove it first. For instance, if you were attempting to install opsview-core-3.0.2.2276 and this failed, you would have to run:
rpm -e --nodeps opsview-core-3.0.2.2276
to remove the failed package before trying to install it again.
SLES
Erroneous Message of "Packages are Not Supported By Vendor"
When running zypper update, you could get output like:
Loading repository data...
Reading installed packages...
The following packages are going to be upgraded:
opsview opsview-base opsview-core opsview-perl opsview-web
The following packages are not supported by their vendor:
opsview opsview-base opsview-core opsview-perl opsview-web
Although it says that the packages are not supported by their vendor, Opsview Limited will continue to support these packages.
This appears to be a configuration setting within SLES' zypper repositories. This is tracked as a bug in https://secure.opsview.com/jira/browse/OPS-1784.
Packages Cannot Be Upgraded Individually
Although zypper update and zypper list-updates display Opsview packages that can be upgraded, running zypper update opsview-perl says that there is Nothing to do.
The only solution appears to be to run zypper update, though this is not ideal if there is a large number of packages that need to be upgraded.
This is tracked as a bug at https://secure.opsview.com/jira/browse/OPS-1785.
Vendor Change
If you are upgrading Opsview using zypper from a version before March 2012, you may get a message such as:
The following package update will NOT be installed:
opsview-perl opsview-base opsview-web opsview-core
This is likely to be because of changes to the vendor specification in the RPM package files from Opsera Limited to Opsview Limited.
To work around this, we recommend that you change the zypper configuration to allow vendor changes temporarily. You will need to edit /etc/zypp/zypp.conf and uncomment:
solver.allowVendorChange = true
If you now run zypper update, the packages will be allowed to upgrade.
You will only need to do this once.
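The edit to zypp.conf can also be made non-interactively. The sketch below is one possible way to do it, with the assumption that the setting already appears in the file as a commented or uncommented `solver.allowVendorChange` line. The function name `allow_vendor_change` and the `.bak` backup copy are illustrative choices.

```shell
# Hypothetical helper: make "solver.allowVendorChange = true" active in a
# zypp.conf-style file, keeping a backup copy alongside it.
# $1 = path to the file (normally /etc/zypp/zypp.conf)
# Exit status is non-zero if the file is missing or the setting was not found.
allow_vendor_change() (
    conf=$1
    [ -f "$conf" ] || { echo "No such file: $conf" >&2; exit 2; }
    cp -p "$conf" "$conf.bak" || exit 1
    # Rewrite any commented or uncommented solver.allowVendorChange line
    sed 's/^[#[:space:]]*solver\.allowVendorChange[[:space:]]*=.*/solver.allowVendorChange = true/' \
        "$conf.bak" > "$conf"
    # Confirm the active setting is now present
    grep '^solver\.allowVendorChange = true$' "$conf" >/dev/null
)
```

Run it as root with `allow_vendor_change /etc/zypp/zypp.conf`, then run zypper update. Because the change is meant to be temporary, the `.bak` copy can be moved back into place once the upgrade has completed.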
Solaris PKGs
The normal method of upgrading Solaris packages is to remove all the current packages and then install the new ones.
The correct order to remove the packages is
- ALTovweb
- ALTovcore
- ALTovperl (or opsview-perl, from March 2012 onwards)
- ALTovbase
The correct order of installing the new packages is
- ALTovbase
- opsview-perl
- ALTovcore
- ALTovweb
- ALTovreports
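Since the order matters, it can help to generate the command list rather than type it. The sketch below only prints the removal commands and the install order so they can be reviewed first; it does not run anything. The function names are hypothetical, and the location of the new package files is left out because it depends on your download.

```shell
# Hypothetical helper: print (not run) the pkgrm commands in removal order.
# $1 = name of the installed Perl package: ALTovperl (default) or opsview-perl
print_removal_plan() {
    for pkg in ALTovweb ALTovcore "${1:-ALTovperl}" ALTovbase; do
        echo "pkgrm $pkg"
    done
}

# Hypothetical helper: print the package names in installation order.
print_install_order() {
    for pkg in ALTovbase opsview-perl ALTovcore ALTovweb ALTovreports; do
        echo "$pkg"
    done
}
```

For a system installed after March 2012, `print_removal_plan opsview-perl` prints the four pkgrm commands with the renamed Perl package in third position.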
Post upgrade
Upgrade Bug on Solaris
Due to a bug in the Solaris packages, the nagios user's crontab may not have Opsview generated entries. Also, the opsview-agent will not be running.
This affects existing Solaris systems that have been upgraded to version 3.5.1 or higher.
Check the nagios user's crontab:
crontab -l nagios | grep OPSVIEW-START
If this returns with no lines, then you need to execute:
su - nagios
/usr/local/nagios/installer/postinstall
This will replace the crontab entries.
You will also need to start the agent:
/etc/init.d/opsview-agent start
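The crontab check above can be turned into a test that a script can act on. This is a minimal sketch: `has_opsview_cron` is a hypothetical name, and it looks only for the OPSVIEW-START marker used in the grep above. Output is discarded with a redirect rather than `grep -q`, on the assumption that older Solaris grep implementations may not support that option.

```shell
# Hypothetical helper: succeed if the crontab text on stdin contains the
# OPSVIEW-START marker, fail otherwise.
has_opsview_cron() {
    grep OPSVIEW-START >/dev/null
}

# Example use (commented out so that loading this file has no side effects):
# crontab -l nagios | has_opsview_cron ||
#     echo "Opsview crontab entries missing: run postinstall as the nagios user"
```

If the message is printed, run the postinstall script and start the agent as described above.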
Debian
Upgrading to Lenny
If you upgrade your Debian server from Etch to Lenny, please ensure you reboot with the upgraded kernel. There have been reports that using an old kernel with Lenny files on an ext3 filesystem can cause RRD to lose data.
Apache proxy
If you are using the Apache proxy (which we recommend for all installs), check the sample configuration file at /usr/local/nagios/installer/apache_proxy.conf for any changes.
Version specific
There are some manual post install steps that need to be done.
Note: You should follow instructions for all the versions that you upgrade through.
| Version | Steps required |
|---|---|
| 3.0.4 | Windows Event Checks, Opsview Daemon Service Check |
| 3.1.0 | Menus |
| 3.2.1 | Unix Swap Arguments |
| 3.3.0 | SNMP Trap Configuration |
| 3.7.0 | LDAP Synchronisation |
| 3.7.1 | SNMP Interface Checks, Opsview Slave Checks |
| 3.9.0 | New MySQL Checks, Unique Constraints on SNMP Trap Rules and Host Template Management URLs |
| 3.9.1 | New Check for import_perfdatarrd, New Performance Data in ODW, Changes to Enable SNMP Logic |
| 3.11.0 | Access Objects Moved to Roles, Keyword Name Restrictions |
Troubleshooting
Reloads fail following an upgrade
After an upgrade, a reload will automatically take place.
In a distributed environment, if there is a problem where the newly upgraded files have not been sent to the slaves, then new configurations created by Opsview may not pass validation, so you may see errors like:
Reading configuration data... Error in configuration file '/usr/local/nagios/tmp/nagios.27520/nagios.cfg' - Line 635 (UNKNOWN VARIABLE)
This usually means that the slave servers do not have the latest software.
To force sending the new binaries to the slave systems, run:
su - nagios
/usr/local/nagios/bin/send2slaves {slavename}
If sending to the slaves did not work correctly during the upgrade, permission errors are a possible cause.
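With more than a few slaves, the send2slaves command can be run in a loop that reports which slaves failed. The following is a sketch under assumptions: `push_to_slaves` is a hypothetical name, the slave names are whatever you pass in, and the `SEND2SLAVES` variable exists only so the loop can be dry-run with a harmless command such as `echo`.

```shell
# Path from the text above; overridable so the loop can be dry-run.
SEND2SLAVES=${SEND2SLAVES:-/usr/local/nagios/bin/send2slaves}

# Hypothetical helper: run send2slaves once per named slave.
# Prints OK or FAILED for each slave; exit status is 1 if any slave failed.
push_to_slaves() (
    failed=0
    for slave in "$@"; do
        # $SEND2SLAVES is left unquoted on purpose so it may include arguments
        if $SEND2SLAVES "$slave"; then
            echo "OK: $slave"
        else
            echo "FAILED: $slave" >&2
            failed=1
        fi
    done
    exit "$failed"
)
```

Run it as the nagios user, for example `push_to_slaves slave1 slave2`, where the slave names are placeholders for the names configured in your Opsview system.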
Database errors after an upgrade
If you are encountering database errors after an upgrade, it may be that the database upgrade scripts didn't run properly. All database upgrades are handled automatically and will continue from the last recorded database state.
To invoke the database changes, run as the nagios user:
/usr/local/nagios/installer/upgradedb.pl
This is not destructive (it will only make changes that are required), but you should only run it if you have problems.