infra-vintage/monitoring/prometheus
2022-07-11 16:28:55 +02:00

Custom prometheus docker image for pub.solar monitoring

Note: We pull the Prometheus Docker image from Docker Hub because nixpkgs lagged a few releases behind at the time of writing.

Linting the prometheus.yml after changing it

nix-shell -p prometheus --run 'promtool check config ./src/etc/prometheus/prometheus.yml'
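The lint command can be wrapped in a small helper so that a missing file is reported before nix-shell is started. This is only a sketch: the function name lint_prometheus_config is hypothetical, and it relies on promtool exiting with a non-zero status when the config is invalid.

```shell
# Hypothetical helper: lint the Prometheus config and return non-zero on any problem.
# The default path is the one used in the command above.
lint_prometheus_config() {
  config="${1:-./src/etc/prometheus/prometheus.yml}"
  if [ ! -f "$config" ]; then
    echo "config not found: $config" >&2
    return 1
  fi
  # promtool's exit status becomes the function's exit status.
  nix-shell -p prometheus --run "promtool check config '$config'"
}
```

Run it from this directory as lint_prometheus_config, or pass another path as the first argument.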

Updating our custom image to a new prometheus version

Once a new Prometheus version has been released, update default.nix with the new image tag. You can check the available tags on Docker Hub. Look for the Digest listed for OS/ARCH linux/amd64 of the new version, then paste it (e.g. sha256:f2fa04806b65f49b652c8d418544bb9660bb8224619ee8c960a778f46614dddf) into the imageDigest option.
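A copy-paste slip in the digest is easy to miss, so it may help to check its shape before editing default.nix. The sketch below assumes the digest is always a sha256 digest with 64 lowercase hex characters, as in the example above; the function name is_sha256_digest is hypothetical.

```shell
# Hypothetical helper: succeed only for a well-formed sha256 image digest,
# i.e. the literal prefix "sha256:" followed by exactly 64 lowercase hex characters.
is_sha256_digest() {
  printf '%s\n' "${1:-}" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Usage: warn before pasting a truncated or mangled value.
# is_sha256_digest "$DIGEST" || echo "digest looks malformed" >&2
```

This only checks the format. It does not confirm that the digest belongs to the intended tag or architecture.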

Build the image and load it into locally running docker:

docker load < $(nix-build ./monitoring/prometheus/default.nix)

Push the image with the new tag to a private registry:

docker push registry.greenbaum.cloud/pub_solar/prometheus:$NEW_TAG
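If $NEW_TAG is unset, the push command above expands to an image reference ending in a bare colon. A minimal guard, assuming the registry path stays as shown above (the function name image_ref is hypothetical):

```shell
# Hypothetical helper: print the full image reference for a tag,
# and fail instead of producing a reference with an empty tag.
image_ref() {
  if [ -z "${1:-}" ]; then
    echo "image_ref: tag must not be empty" >&2
    return 1
  fi
  printf '%s:%s\n' "registry.greenbaum.cloud/pub_solar/prometheus" "$1"
}

# Usage: only push when the reference could be built.
# ref="$(image_ref "$NEW_TAG")" && docker push "$ref"
```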

Run the newly pushed image with docker on triton:

Note: The current update process deletes Prometheus' time series database (TSDB), so only metrics that arrive after the update will appear in Grafana. This could be improved by using a volume or a separate Prometheus container for running the TSDB.

  • Set up your shell's environment, preferably using tritonshell, and use the profile pub_solar in data center lev-1
  • Rename the currently running prometheus container:
docker rename pub_solar_prometheus pub_solar_prometheus-old
  • Start a new container, using docker's --label=triton.cns.services=prometheus flag to auto-generate the DNS records prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.greenbaum.zone and prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.int.greenbaum.zone for the new container:
docker run -d \
  --name pub_solar_prometheus \
  -p 9090 \
  --label "triton.cns.services=prometheus" \
  --label "solar.pub.monitoring=true" \
  registry.greenbaum.cloud/pub_solar/prometheus:$NEW_TAG

Stop the old container, then verify the new one works fine (in grafana -> datasources):

docker stop pub_solar_prometheus-old

If everything works fine, remove the old container:

docker rm pub_solar_prometheus-old
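The rename, run, stop and remove steps above can be collected into two functions, sketched here. The function names and the DOCKER override are assumptions made for illustration: setting DOCKER=echo prints the commands instead of running them, which allows a dry run before touching the running container.

```shell
# DOCKER can be overridden (e.g. DOCKER=echo) to print commands instead of running them.
DOCKER="${DOCKER:-docker}"
IMAGE="registry.greenbaum.cloud/pub_solar/prometheus"

# Hypothetical helper: swap the running container for one using the given tag.
start_new_prometheus() {
  new_tag="${1:-}"
  if [ -z "$new_tag" ]; then
    echo "usage: start_new_prometheus NEW_TAG" >&2
    return 1
  fi
  # Keep the old container under a different name until the new one is verified.
  $DOCKER rename pub_solar_prometheus pub_solar_prometheus-old &&
  $DOCKER run -d \
    --name pub_solar_prometheus \
    -p 9090 \
    --label "triton.cns.services=prometheus" \
    --label "solar.pub.monitoring=true" \
    "$IMAGE:$new_tag" &&
  $DOCKER stop pub_solar_prometheus-old
}

# Hypothetical helper: call this only after the new container has been verified in Grafana.
remove_old_prometheus() {
  $DOCKER rm pub_solar_prometheus-old
}
```

The commands are chained with &&, so a failed rename or run leaves the old container untouched. Verification stays a manual step between the two functions.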