Welcome to docs.opsview.com

Differences

This shows you the differences between two versions of the page.

opsview4.6:downtime [2014/09/09 12:19] (current)
Line 1: Line 1:
 +====== Downtime ======
 +
 +Downtime lets you mark host groups, hosts or services that are expected to fail. This is useful for "planned outages", such as an engineer visit or a software upgrade.
 +
 +During downtime, checks continue to run, but notifications are not sent out.
 +
 +Downtime set on a host group propagates to all hosts and services within it.
 +
 +Use the contextual menus in the Host group Hierarchy to schedule downtime. This lets you schedule downtime for many hosts and services at the same time.
 +
 +**Note**: Because the downtime affects all hosts and all services within a hostgroup, the ability to schedule downtime at a host group level is only available to a user with the [[opsview4.6:access|DOWNTIMEALL access]]. Users with DOWNTIMESOME can set downtime on a per-host or per-service level (if they have permission).
 +
 +**Note**: Opsview is designed so that the downtime for a host will always set the downtime for its services.
 +
 +In a distributed environment, setting downtime at the master will propagate to all the slaves (however, setting downtime on a slave will **not** propagate to the master). This downtime information is synchronised with other slaves and slave clusters at certain times - see the [[opsview4.6:slavesynchronisation|documentation]] for limitations you should be aware of.
 +
 +===== Scheduling Downtime =====
 +Select ''Schedule Downtime'' from the contextual menus.
 +
 +==== Host Groups ====
 +{{:opsview4.6:downtime_hostgroup.png|}}
 +
 +All host groups underneath, and all hosts and services they contain, will have downtime set against them.
 +
 +==== Hosts ====
 +{{:opsview4.6:downtime_host.png|}}
 +
 +The host and all its services will have downtime set.
 +
 +==== Services ====
 +{{:opsview4.6:downtime_service.png|}}
 +
 +This service will have downtime set.
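 +
 +The propagation described for the three levels can be pictured with a small model. This is only an illustrative sketch: the ''Host'' and ''HostGroup'' classes and the ''host/service'' labels are hypothetical and are not Opsview's own data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    # Hypothetical stand-in for a monitored host and its service names.
    name: str
    services: list = field(default_factory=list)


@dataclass
class HostGroup:
    # Hypothetical stand-in for a host group in the hierarchy.
    name: str
    subgroups: list = field(default_factory=list)
    hosts: list = field(default_factory=list)


def host_targets(host):
    """Downtime on a host also covers every service on that host."""
    return [host.name] + [f"{host.name}/{svc}" for svc in host.services]


def hostgroup_targets(group):
    """Downtime on a host group covers its hosts, their services,
    and everything in the host groups underneath it."""
    targets = []
    for host in group.hosts:
        targets.extend(host_targets(host))
    for sub in group.subgroups:
        targets.extend(hostgroup_targets(sub))
    return targets


if __name__ == "__main__":
    london = HostGroup("London", hosts=[Host("web1", ["HTTP", "SSH"])])
    uk = HostGroup("UK", subgroups=[london], hosts=[Host("db1", ["MySQL"])])
    for target in hostgroup_targets(uk):
        print(target)
```

 +In this model, scheduling at the service level would affect a single ''host/service'' entry only, which matches the narrowest of the three cases above.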
 +
 +===== Submitting Downtime =====
 +You need to fill out information to submit downtime.
 +
 +{{:opsview4.6:downtime_set.png|}}
 +
 +==== Comment ====
 +Reason for the downtime; should be descriptive but short.
 +
 +==== Start and End Dates ====
 +=== Format ===
 +
 +The basic format for the start and end date and time is:
 +  
 +  YYYY/MM/DD hh:mm:ss
 +
 +Some English phrases are also recognised:
 +
 +  now
 +  5am tomorrow
 +  midnight
 +  next friday
 +  8pm
 +  in 1 hours
 +  in 3 days
 +  8 hours from now
 +  wednesday this week
 +
 +Times are either taken specifically, or are relative to now (for example, ''wednesday this week'' means the current time of day on Wednesday of this week).
 +
 +From Opsview 3.11.2, you can also use [[http://www.atlassian.com/software/jira/|Atlassian Jira]] style durations when prefixed with a +, such as:
 +<code>
 ++5d     - add 5 days
 ++1d 8h  - add 1 day and 8 hours
 ++48h    - add 48 hours
 +</code>
 +
 +These are assumed to be relative to the start date.
 +
 +The times are assumed to be in the local time zone for the Opsview server.
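 +
 +To check what end time a ''+'' duration would produce before submitting it, the arithmetic can be sketched as below. This is a rough illustration, not Opsview's parser: the function names are hypothetical, and only the ''d'' and ''h'' units shown in the examples above are assumed to exist.

```python
import re
from datetime import datetime, timedelta

# The basic start/end format documented above: YYYY/MM/DD hh:mm:ss
DATE_FORMAT = "%Y/%m/%d %H:%M:%S"

# Assumption: only the d (days) and h (hours) units from the examples.
UNIT_SECONDS = {"d": 86400, "h": 3600}


def parse_duration(text):
    """Turn a '+'-prefixed duration such as '+1d 8h' into a timedelta."""
    if not text.startswith("+"):
        raise ValueError("duration must start with '+'")
    parts = text[1:].split()
    if not parts:
        raise ValueError("empty duration")
    total = 0
    for part in parts:
        match = re.fullmatch(r"(\d+)([dh])", part)
        if match is None:
            raise ValueError(f"unrecognised duration part: {part!r}")
        total += int(match.group(1)) * UNIT_SECONDS[match.group(2)]
    return timedelta(seconds=total)


def end_time(start, duration):
    """Add the duration to the start time; both use the documented format."""
    start_dt = datetime.strptime(start, DATE_FORMAT)
    return (start_dt + parse_duration(duration)).strftime(DATE_FORMAT)


if __name__ == "__main__":
    print(end_time("2014/09/09 12:00:00", "+1d 8h"))
```

 +The sketch uses naive timestamps, so it ignores daylight saving changes in the server's local time zone.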
 +
 +===== Cancelling Downtime =====
 +Downtime can be cancelled, either before it has started or during a downtime period.
 +
 +From the Hostgroup Hierarchy pages, click on the downtime icon, or select the contextual menu for Detail - you can then delete the downtime. In a distributed environment, the downtime deletion will be propagated to any slaves.
 +
 +Please note that deleting downtime entries on a slave will not result in the change being propagated to the master server. For this reason, administration should be performed on the master server wherever possible.
 +
 +===== Troubleshooting =====
 +
 +==== I have received a recovery notification, even though downtime was scheduled ====
 +This can occur if a host or a service is in a failure state, has previously sent out a notification about the failure, and downtime was then scheduled after the failure. In this case, a recovery of the host or service will send a notification.
 +
 +This is the behaviour in Nagios Core; the reasoning is that if a notification was sent out about the failure, then a notification should also be sent out about the recovery.
 +
 +However, in Opsview, scheduling downtime is considered to be a suppression of notifications, so a change has been made where Opsview will stop the recovery notification from being sent.
 +
 +
 +==== I have cancelled some downtime, but it does not disappear from the Detailed Downtime view ====
 +
 +This could occur if you have deleted the downtime but the record of the deletion was not written to the Runtime database
 +(possibly because of a full filesystem, or because old ndo.dat files were removed). You can confirm this is the case if Nagios
 +shows no associated downtime.
 +
 +To remove these downtimes, run the following in mysql as the nagios user (change the comment_data value as appropriate):
 +<code>
 +cdn
 +utils/cx runtime
 +delete from nagios_scheduleddowntime where comment_data = "Host 'hostname': Bank holiday";
 +</code>