Alerting on Missing Data in Prometheus
Alerting on missing data in Prometheus is commonly handled with the absent() function, but that is only useful when you know ahead of time which labels to expect. How, then, can you alert on missing data dynamically?
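To make the limitation concrete, here is a minimal sketch of an absent()-based rule. absent() can only build its output labels from the equality matchers written in the selector, so every label set you care about has to be spelled out by hand. The group name, alert name, "for" duration, and severity label are hypothetical, and the service value is borrowed from the example label set later in this post.

```yaml
groups:
  - name: missing-data-static        # hypothetical group name
    rules:
      - alert: WebsiteProbeAbsent    # hypothetical alert name
        # Returns a single series with value 1, labelled
        # {job="blackbox_http_2xx", service="website"}, only when no
        # series matches the selector. The labels come solely from the
        # equality matchers below, so each expected service would need
        # its own copy of this rule.
        expr: absent(up{job="blackbox_http_2xx", service="website"})
        for: 5m                      # assumed duration
        labels:
          severity: warning          # assumed label
```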
By using the unless operator, you can return a set of labels only when no series with a matching label set exists in a second query. For example:
group without (instance) (up{job="blackbox_http_2xx"})
unless
count without (instance) (probe_http_status_code{job="blackbox_http_2xx"} == 200)
The alert fires only if there are no HTTP 200s for a label set produced by the group query. In my environment, it would fire with a label set similar to this:
{job="blackbox_http_2xx", environment="production", cluster="clusterA", service="website"}
Having these extra labels can be extremely useful in your Alertmanager routing configs and any templating you do, which is why I strive to keep as many labels as possible when designing alerting rules.
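As a rough sketch of how the query might be wired into an alerting rule, the rule file below wraps the same expression and uses the preserved labels in an annotation. The group name, alert name, "for" duration, and severity label are hypothetical, and the service and environment labels are assumed to exist on your targets as they do in my environment.

```yaml
groups:
  - name: missing-data-dynamic       # hypothetical group name
    rules:
      - alert: NoHTTP200             # hypothetical alert name
        # Left side: one series per label set (minus instance) that has
        # any "up" series. Right side: the same label sets, but only
        # where at least one probe returned 200. "unless" keeps the
        # left-side series that have no match on the right.
        expr: |
          group without (instance) (up{job="blackbox_http_2xx"})
          unless
          count without (instance) (probe_http_status_code{job="blackbox_http_2xx"} == 200)
        for: 5m                      # assumed duration
        labels:
          severity: critical         # assumed label
        annotations:
          # These labels survive the aggregation because only instance
          # is dropped, so they are available for templating and routing.
          summary: "No HTTP 200 responses for {{ $labels.service }} in {{ $labels.environment }}"
```

Because only instance is aggregated away, Alertmanager receives the remaining labels and can route on them without any extra relabeling.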