File-based service discovery is one of the most popular and flexible service discovery mechanisms available in Prometheus. However, I knew of no good way to validate the files that file_sd will read before actually handing them to Prometheus.

This particular solution uses Drone, as it is my CI/CD tool of choice, but you can apply the same logic in Jenkins, GitLab CI, or whatever you use for your builds and deployments. I also use JSON as my file format for file_sd, but the same concepts apply if you use YAML.

First, we need a way to actually define what the file should look like. To do this, I chose to use JSON schemas. You can pop on over to jsonschema.net, paste in one of your valid JSON target files, and infer a schema based on it. Personally, I changed it so that the env, team, and service labels are required on targets, in order to enforce uniformity in our targets regardless of which team is adding them. Go ahead and save that schema as schema.json in a drone folder at the top level of your repository.
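As a point of reference, a schema along those lines might look something like the sketch below. It assumes the standard file_sd layout (an array of objects, each with a `targets` list and a `labels` map) and marks env, team, and service as required labels; adjust the required list and any value constraints to match your own conventions.

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "array",
  "items": {
    "type": "object",
    "required": ["targets", "labels"],
    "properties": {
      "targets": {
        "type": "array",
        "items": { "type": "string" }
      },
      "labels": {
        "type": "object",
        "required": ["env", "team", "service"],
        "properties": {
          "env": { "type": "string" },
          "team": { "type": "string" },
          "service": { "type": "string" }
        }
      }
    }
  }
}
```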

With that schema in hand, we now need a way to compare our JSON files against it. To do this, I wrote a simple Python script:

from jsonschema import validate
from jsonschema.exceptions import ValidationError
from json.decoder import JSONDecodeError

import glob
import json
import sys

errors: dict[str, str] = {}

# Load the schema that every target file must conform to.
with open("drone/schema.json") as f:
    schema = json.load(f)

# Validate each target file, collecting errors rather than failing fast
# so a single run reports every broken file.
for path in glob.glob("files/targets/*.json"):
    with open(path) as inst:
        try:
            validate(json.load(inst), schema)
        except ValidationError as e:
            errors[path] = f"Schema error: {e.message}"
        except JSONDecodeError as e:
            errors[path] = f"JSON decoding error: {e.msg} on line {e.lineno}"

for path, err in errors.items():
    print(f"[ ERROR in {path} ]")
    print("  |>", err)
    print()

if errors:
    sys.exit(1)

print("All tests passed! Have a cookie. 🍪")

The script hard-codes the locations of the schema file (drone/schema.json) and the target files (files/targets/*.json), but you can adapt those as needed.
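To get a feel for the kind of message the decoding branch surfaces, here is a small stdlib-only illustration (the trailing comma in the sample string is the deliberate mistake; `jsonschema` isn't needed for this path):

```python
import json

# A target file with a trailing comma -- tolerated in some formats, not in JSON.
bad = '[{"targets": ["host:9100"],}]'

try:
    json.loads(bad)
except json.JSONDecodeError as e:
    # e.msg and e.lineno are exactly what the validation script reports.
    print(f"JSON decoding error: {e.msg} on line {e.lineno}")
```

A schema violation (say, a target missing the team label) would instead be caught by the `ValidationError` branch, with `e.message` describing which constraint failed.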

Now, in your .drone.yml, just add a step to your pipeline to run the validation script (and install its dependency):

  - name: validate JSON of targets
    image: python:3-alpine
    commands:
      - pip3 install --quiet jsonschema
      - python3 drone/validate.py

Easy peasy. Now you'll never have another malformed target file again! (err... at least not one that makes it to prod).