Deduplicating Prometheus Blackbox ICMP Checks with File-Based Service Discovery

In my Prometheus setup, most of my scrape targets come from file-based service discovery (file_sd). This makes it tricky to deduplicate Blackbox Exporter ICMP probes for hosts that expose more than one scrape endpoint.

Most of my file_sd targets are defined in JSON files that look like this:

[
  {
    "targets": [ "foo.example.com:9100","bar.example.com:9100" ],
    "labels": {
      "env": "prod",
      "job": "node",
      "service": "foobar"
    }
  },
  {
    "targets": [ "foo.example.com:9104" ],
    "labels": {
      "env": "prod",
      "job": "mysql",
      "service": "barfoo"
    }
  }
]

In the "standard" Blackbox Exporter setup, relabel_configs copies the target's address (host and port) into the __param_target label. For example:

  relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target

This means that foo.example.com would get probed twice, because it appears under two different __address__ values (one on port 9100 and one on port 9104).

To prevent this, we can use a regex to strip the port so that each host produces a single ICMP probe target. Update the part of relabel_configs that sets __param_target to look like this:

  relabel_configs:
    - source_labels: [__address__]
      regex: (.*?)(:[0-9]+)?
      target_label: __param_target
      replacement: ${1}

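Prometheus anchors relabel regexes at both ends, so this pattern has to match the whole address. The first capture group lazily takes everything before an optional trailing :port (the hostname without the port), and ${1} writes that group into the __param_target label.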
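
For context, here is a minimal sketch of how the port-stripping rule might sit inside a complete ICMP scrape job. The job name blackbox_icmp, the icmp module name, the file path, and the exporter address 127.0.0.1:9115 are all assumptions for illustration rather than values from my config. The last two rules are optional and only matter if your file_sd labels differ between entries for the same host.

```yaml
scrape_configs:
  - job_name: blackbox_icmp            # hypothetical job name
    metrics_path: /probe
    params:
      module: [icmp]                   # assumes an "icmp" module is defined in blackbox.yml
    file_sd_configs:
      - files:
          - /etc/prometheus/file_sd/*.json   # hypothetical path
    relabel_configs:
      # Strip the optional :port so every endpoint on a host maps to one probe target.
      - source_labels: [__address__]
        regex: '(.*?)(:[0-9]+)?'
        target_label: __param_target
        replacement: '${1}'
      # Show the probed host, not the exporter, as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Send the scrape itself to the Blackbox Exporter (assumed address).
      - target_label: __address__
        replacement: 127.0.0.1:9115
      # Optional: make labels identical across entries for the same host.
      - target_label: job
        replacement: blackbox_icmp
      - action: labeldrop
        regex: service
```

With the example file above, both foo.example.com entries should then end up with the same label set (env, instance, and job), which is what lets Prometheus treat them as a single target.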

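Now you can update your config, reload Prometheus, and check the Targets page: the ICMP job should no longer list the same host once per port. Prometheus only collapses targets whose label sets are identical after relabeling, so any file_sd labels that differ between entries for the same host (such as job or service) also need to match.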