Today an issue was brought to my attention in which a Nagios service check was genuinely stuck in downtime, and the only way to fix it was to manually update values in the database.
The service did not show up in the "Scheduled Downtime" page in the XI UI, nor did it show up in the
nagios_scheduleddowntime table in our MariaDB instance. Yet, the service page still showed it as being in downtime.
According to the comment history on the service, the downtime was scheduled until 2042, so we couldn't just wait and hope it expired. The comment read:
This service has been scheduled for fixed downtime from 2019-04-30 00:39:16 to 2042-02-21 07:39:16.
Note: Before I talk about how the problem was resolved, if you're just here wondering how to remove scheduled downtime, you can do so on the aforementioned "Scheduled Downtime" page. It's at
nagiosxi/includes/components/xicore/downtime.php in the XI UI, and under the "System -> Downtime" page in Nagios Core.
The issue that presented itself to us was that the entry in the nagios_servicestatus table had incorrect values in the acknowledgement_type and scheduled_downtime_depth columns:
acknowledgement_type can be 0, 1, or 2, which represent None, Normal, and Sticky, respectively.
scheduled_downtime_depth can be any smallint value, and it represents the number of downtimes the service is currently in (a service can be in multiple levels of downtime at once).
After determining the object ID of the service and verifying that no entries corresponding to it existed in
nagios_scheduleddowntime, I used the following SQL to fix the service's status.
-- Fix acknowledgement_type, replace service_object_id with the appropriate ID
update nagios.nagios_servicestatus set acknowledgement_type='0' where service_object_id='8540';

-- Fix scheduled_downtime_depth, replace service_object_id with the appropriate ID
update nagios.nagios_servicestatus set scheduled_downtime_depth='0' where service_object_id='8540';
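For reference, the lookup and verification steps that preceded the fix can be sketched roughly as follows. This is only a sketch: it assumes the standard NDOUtils schema, where the nagios_objects table maps host and service names to object IDs, and the host name and service description shown are hypothetical placeholders, not values from our environment.

```sql
-- Assumption: NDOUtils schema, where nagios_objects maps names to object IDs.
-- 'my-host' and 'My Service' are hypothetical placeholders.
select object_id
from nagios.nagios_objects
where objecttype_id = 2        -- 2 = service in the NDOUtils schema
  and name1 = 'my-host'        -- host name
  and name2 = 'My Service';    -- service description

-- Should return 0 if no downtime entry exists for the service.
-- Replace 8540 with the object ID returned above.
select count(*) as downtime_entries
from nagios.nagios_scheduleddowntime
where object_id = 8540;
```

If the second query returns anything other than 0, the downtime still exists and is better removed through the UI than by editing the status table.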
After fixing the status, you can also optionally remove the comment history for the downtime from the
nagios_commenthistory table.
-- replace object_id and commenthistory_id with appropriate values.
delete from nagios.nagios_commenthistory where object_id = '8540' and commenthistory_id='2461761';
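To confirm that the status fix took effect, the row can be read back. This is a minimal check that uses only the columns discussed above, and the ID is again the one from this example.

```sql
-- Both columns should now read 0 for the affected service.
-- Replace 8540 with the appropriate service_object_id.
select service_object_id, acknowledgement_type, scheduled_downtime_depth
from nagios.nagios_servicestatus
where service_object_id = 8540;
```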