Troubleshooting

I've added a new host, but no SVN entry has been created

When a reload occurs, Opsview generates a list of all hosts that have RANCID enabled. If the WebSVN repository does not show the host, the message may not have been passed through. Check the timestamp of the following directory:

ls -l /var/opt/opsview/activemq/spool/queue/rancid.master.events/

If the directory was not updated at the time of the reload, this could be a permissions issue. See the next FAQ entry.
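The timestamp check can also be scripted. The following is a minimal sketch, not part of Opsview: dir_updated_within is a hypothetical helper name, and it assumes a find that supports -maxdepth and -mmin (GNU and BSD find both do).

```shell
# Sketch: succeed if a directory's modification time is within the last
# N minutes. dir_updated_within is a hypothetical helper, not an Opsview tool.
dir_updated_within() {
    dir=$1
    minutes=$2
    # -maxdepth 0 tests only the directory itself; -mmin -N matches
    # entries modified less than N minutes ago.
    [ -n "$(find "$dir" -maxdepth 0 -mmin "-$minutes" 2>/dev/null)" ]
}
```

For example, shortly after a reload: dir_updated_within /var/opt/opsview/activemq/spool/queue/rancid.master.events 10 && echo updated || echo "not updated"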

Messages don't appear to be transferred

Check the ActiveMQ logs, such as /var/opt/opsview/activemq/log/opsview-activemq-scripts.log

Permissions could be an issue.

Check that the nagios user is a member of the opsview group. If this has changed, you will need to restart most daemons to get them to pick up the new permissions.
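Group membership can be confirmed from the command line. This is a small sketch using the standard id command; user_in_group is a hypothetical helper name.

```shell
# Sketch: succeed if the given user belongs to the given group.
# user_in_group is a hypothetical helper, not an Opsview tool.
user_in_group() {
    # id -nG prints the user's group names separated by spaces;
    # grep -x -F matches the group name as a whole line, literally.
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$2"
}
```

For example: user_in_group nagios opsview || echo "nagios is not in the opsview group"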

The flow of data is:

  • rancid collection is invoked by the nagios user's 4-hourly cron job
  • this collection puts checksum information into /var/opt/opsview/rancid/checksums
  • the timestamps on the checksums tell you that routers have been discovered
  • a message is placed into ActiveMQ's spool area, /var/opt/opsview/activemq/spool/queue/rancid.master.events
  • file2activemq picks up this file and places it into ActiveMQ
  • ActiveMQ will route the message to the rancid master which then processes it, using the consume_rancid_events script
  • the script will update the files on the rancid master in /var/opt/opsview/rancid/svn and run an svn commit to commit the changes to svn
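
To see how far data has travelled through this flow, it can help to list the timestamps of each location in turn. A minimal sketch, assuming only the paths named above; rancid_checkpoints is a hypothetical helper name, and it should be run on each server with the paths that exist there.

```shell
# Sketch: print a long listing (including modification time) for each
# path given, or flag it as missing. Hypothetical helper, not an Opsview tool.
rancid_checkpoints() {
    for path in "$@"; do
        if [ -e "$path" ]; then
            ls -ld "$path"
        else
            echo "MISSING $path"
        fi
    done
}
```

For example: rancid_checkpoints /var/opt/opsview/rancid/checksums /var/opt/opsview/activemq/spool/queue/rancid.master.events /var/opt/opsview/rancid/svn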

ActiveMQ has been seen to have problems registering the consumer at the rancid collector. Run the check_rancid_queues plugin to check that ActiveMQ is delivering messages correctly to its destinations. You may need to restart ActiveMQ on both the master and the slave.

Where are the rancid configuration files?

/usr/local/nagios/etc/plugins/rancid

Have router configuration files been updated?

Look in /var/opt/opsview/rancid/checksums. This gives the latest checksums for the router configurations.

To reduce traffic, only configurations that have changed are sent to the RANCID master. To force data to be sent to the RANCID master again, remove the checksum files. The configuration will then be sent to the RANCID master, but if it does not differ from what is in SVN, there will be no new check-in.
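
Rather than deleting the checksum files outright, they can be moved aside so they can be restored if needed. This is a sketch under the assumption that the checksums are regular files directly inside the directory; clear_checksums and the backup location are hypothetical.

```shell
# Sketch: move every regular file in a checksum directory to a backup
# directory, which has the same effect as removing them.
# Assumes checksum files sit directly in the directory (not verified).
clear_checksums() {
    src=$1
    backup=$2
    mkdir -p "$backup" || return 1
    # -maxdepth 1 -type f: only regular files at the top level
    find "$src" -maxdepth 1 -type f -exec mv {} "$backup"/ \;
}
```

For example, as the nagios user: clear_checksums /var/opt/opsview/rancid/checksums /tmp/checksums.bak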

Has the router configuration reached SVN?

Look in /var/opt/opsview/rancid/svn. If the file here contains information, then this is what should be in Subversion. You can run svn status in this directory to compare the working copy with what is in Subversion.
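
When many routers are involved, the svn status output can be long. The sketch below tallies it by the first status column (for example M for modified, A for added, ? for unversioned); summarize_svn_status is a hypothetical helper name.

```shell
# Sketch: count "svn status" lines by their first-column status code.
# Reads svn status output on stdin; hypothetical helper.
summarize_svn_status() {
    awk 'NF { count[substr($0, 1, 1)]++ }
         END { for (s in count) print s, count[s] }' | sort
}
```

For example: (cd /var/opt/opsview/rancid/svn && svn status) | summarize_svn_status. No output means svn status reported nothing, so there are no uncommitted local changes.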

SVN shows the file, but websvn shows a blank file

This could be due to enscript. In /opt/opsview/repository/include/config.php, comment out this line:

$config->useEnscript();

If websvn now shows the router configuration, then there is probably an issue with enscript.

Where are the latest files?

On the RANCID master, /var/opt/opsview/rancid/svn holds the latest versions of all the RANCID router output files.

Testing configuration

To test the RANCID configuration for a specific host, use the following on the monitoring server:

su - nagios
cd /opt/opsview/rancid/bin
export CLOGINRC=/usr/local/nagios/etc/plugins/rancid/cloginrc
./clogin -t 20 <hostname>
  • clogin may need to be replaced by a device-specific login script. See /opt/opsview/rancid/bin/check_rancid_connection for the table that maps device types to scripts

This should give you a terminal session. You may need to type exit to come out.

If this works but the test from the Opsview RANCID tab does not, the cause could be a tty setting.

Troubleshooting RANCID tab test connection

This simulates running the code to test the RANCID connection with credentials.

Create a temporary file with this data:

add password 192.168.13.2 {terminal} {password}
add method 192.168.13.2 telnet

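Replace 192.168.13.2 with the hostname. Replace terminal with the login password; the second value, password, is used for Cisco devices. Change telnet to ssh if applicable. Then run the following, which expects the file to be at /tmp/tempfile on the slave: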

ssh {slave} 'cat /tmp/tempfile | /opt/opsview/rancid/bin/check_rancid_connection -t {vendor} {hostname}'

There may be issues with tty, as this ssh session does not have a tty assigned (see the e30login script for setting tty settings within the expect script).
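
The two credential lines can be generated rather than typed by hand. This is a minimal sketch that follows the format shown above; rancid_cred_lines is a hypothetical helper name, and treating the second password as optional is an assumption.

```shell
# Sketch: print the "add password" and "add method" lines for one device.
# Arguments: host, login password, second password (may be empty),
# method (defaults to telnet). Hypothetical helper.
rancid_cred_lines() {
    host=$1
    login_pw=$2
    second_pw=$3
    method=${4:-telnet}
    if [ -n "$second_pw" ]; then
        printf 'add password %s {%s} {%s}\n' "$host" "$login_pw" "$second_pw"
    else
        # Assumption: the second value can be left out when not needed
        printf 'add password %s {%s}\n' "$host" "$login_pw"
    fi
    printf 'add method %s %s\n' "$host" "$method"
}
```

Since the ssh command above only feeds the file to standard input, the output can also be piped straight through ssh without a temporary file: rancid_cred_lines {hostname} {terminal} {password} ssh | ssh {slave} '/opt/opsview/rancid/bin/check_rancid_connection -t {vendor} {hostname}'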

Testing collection

On the appropriate Opsview server:

su - nagios
export CLOGINRC=/usr/local/nagios/etc/plugins/rancid/cloginrc
. /etc/opt/opsview/rancid/rancid.conf
mkdir -p /tmp/directory
cd /tmp/directory
rancid -d -l {hostname}
  • Make sure the cloginrc file has the authentication information
  • rancid may need to be replaced by a different script (e.g. arancid) depending on the type of device. See /opt/opsview/rancid/bin/check_rancid_connection for the table that maps device types to scripts

You should get a file in this temporary directory which is the router configuration as it will be pushed into SVN.

Walking through a collection

Running rancid -d -l {hostname} shows the commands being run. You can then run clogin {hostname} and enter the listed commands in turn to see what output each produces. This may help if you are having specific issues with the collection of data.

WebSVN

If “repos 1” is listed as a repository, edit /etc/websvn/svn_deb_conf.inc to remove it. The cause is not known.

check_rancid_status says: 'Some routers not updated'

This plugin checks for routers which have not been updated within the last 10 hours. You may get errors due to:

  • credentials changing on the device
  • the device being unavailable

This has also been seen for hosts that are monitored by a slave system which is deactivated. This will be fixed in a future version of Opsview RANCID.
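
To get a rough idea of which devices may be behind the warning, the checksum timestamps can be inspected by hand. The sketch below lists files not modified in the last 10 hours; stale_files is a hypothetical helper, and it is an approximation only, since how check_rancid_status itself decides that a router is out of date is not described here.

```shell
# Sketch: list regular files under a directory whose modification time
# is more than N hours old (default 10). Hypothetical helper; this is
# not necessarily how check_rancid_status works.
stale_files() {
    dir=$1
    hours=${2:-10}
    # -mmin +M matches files modified more than M minutes ago
    find "$dir" -type f -mmin "+$((hours * 60))"
}
```

For example: stale_files /var/opt/opsview/rancid/checksums 10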