Sunday, August 28, 2011

Nagios Group Service Checks and Exclusions

Using hostgroup_name in service checks is very useful. However, I often want to include the Nagios server in one of the hostgroups without having to configure NRPE on the Nagios host itself. An easy way around this is Nagios's exclusion feature: you can exclude a hostgroup (or host) by prefixing its name with "!".

For instance:

1. Create a group that contains only your Nagios host:

define hostgroup {
        hostgroup_name localhost-monitor-hosts
        members         mynagiosserver
}


2. Create the service, excluding that group:

define service {
        hostgroup_name          unix-hosts,!localhost-monitor-hosts
        use                             critical-service
        service_description             check_disk
        check_command                   check_nrpe!check_disk!80!85
        check_freshness                 1
        register                        1
}

This runs the check on every host in the unix-hosts group except those in localhost-monitor-hosts. You can create several such groups, each with a different purpose.
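
One purpose the exclusion group can serve is to give the Nagios host its own local version of the same check, so it is still monitored without NRPE. The following is only a sketch: it assumes a check_local_disk command like the one defined in the sample commands.cfg that ships with Nagios, and the thresholds (warn at 20% free, critical at 10% free, on /) are illustrative rather than taken from the setup above.

```
# Sketch: check the Nagios server's own disk locally instead of through NRPE.
# Assumes check_local_disk is defined as in the sample commands.cfg;
# the thresholds and the / partition are illustrative values.
define service {
        hostgroup_name          localhost-monitor-hosts
        use                     critical-service
        service_description     check_disk
        check_command           check_local_disk!20%!10%!/
}
```

Because the two service definitions apply to disjoint sets of hosts, they can share the same service_description without conflicting.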

Note that you can also list hosts directly in a service definition and exclude individual hosts the same way:

define service {
         host_name www1, www2, !www3
         use generic-service
         service_description check_ping
         check_command check_ping!100.0,20%!500.0,60%
}

but I find that more taxing, as you have to maintain the host_name list by hand, which kind of ignores the whole point of groups.