diff --git a/monitoring/prometheus/README.md b/monitoring/prometheus/README.md
new file mode 100644
index 0000000..fc46228
--- /dev/null
+++ b/monitoring/prometheus/README.md
@@ -0,0 +1,55 @@
+# Custom Prometheus Docker image for pub.solar monitoring
+
+**Note:** We pull the Prometheus Docker image from Docker Hub because
+nixpkgs lagged a few releases behind at the time of writing.
+
+### Updating our custom image to a new Prometheus version
+Once a new Prometheus version has been released, update `default.nix` with the
+new image tag. You can find it in the [tags on Docker Hub](https://hub.docker.com/r/prom/prometheus/tags).
+Look for the `Digest` listed under `OS/ARCH=linux/amd64` for the new version.
+Paste the new digest, e.g. `sha256:f2fa04806b65f49b652c8d418544bb9660bb8224619ee8c960a778f46614dddf`,
+into the `imageDigest` option.
+
+Build the image and load it into the locally running `docker`:
+```
+docker load < $(nix-build ./monitoring/prometheus/default.nix)
+```
+
+Push the image with the new tag to the private registry:
+```
+docker push registry.greenbaum.cloud/pub_solar/prometheus:$NEW_TAG
+```
+
+Run the newly pushed image with Docker on Triton:
+**Note:** the current update process deletes Prometheus' time series
+database, so only metrics that arrive after the update will appear in Grafana.
+This could be improved by using a volume or a separate Prometheus container for
+running the TSDB.
+
+- Set up your shell's environment, preferably using [`tritonshell`](https://git.greenbaum.cloud/dev/tritonshell),
+and use the profile `pub_solar` in data center `lev-1`.
+- Rename the currently running Prometheus container:
+```
+docker rename pub_solar_prometheus pub_solar_prometheus-old
+```
+- Start a new container, using Docker's `--label=triton.cns.services=prometheus`
+flag to auto-generate the DNS records
+`prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.greenbaum.zone` and
+`prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.int.greenbaum.zone`
+for the new container:
+```
+docker run -d \
+  --name pub_solar_prometheus \
+  -p 9090 \
+  --label "triton.cns.services=prometheus" \
+  --label "solar.pub.monitoring=true" \
+  registry.greenbaum.cloud/pub_solar/prometheus:$NEW_TAG
+```
+- Stop the old container, then verify that the new one works (in Grafana -> Data sources):
+```
+docker stop pub_solar_prometheus-old
+```
+- If everything works, remove the old container:
+```
+docker rm pub_solar_prometheus-old
+```
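
The rollover steps in the README above (rename, start, stop) could be wrapped in a small helper script. The following is a minimal sketch and not part of this change. The function names `image_ref`, `run` and `rollover` and the `DRY_RUN` variable are hypothetical. It assumes the shell is already set up for Triton (for example via `tritonshell`) and that the image has been pushed. It defaults to a dry run that only prints the commands, and it deliberately leaves `docker rm` as a manual step after the new container has been verified in Grafana.

```shell
#!/bin/sh
# Sketch only: wraps the manual rollover steps from the README.
# image_ref, run, rollover and DRY_RUN are hypothetical names.
set -eu

REGISTRY_IMAGE="registry.greenbaum.cloud/pub_solar/prometheus"
CONTAINER="pub_solar_prometheus"

# Build the full image reference for a given tag.
image_ref() {
  printf '%s:%s\n' "$REGISTRY_IMAGE" "$1"
}

# Print the command unless DRY_RUN=0, in which case execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '+ %s\n' "$*"
  else
    "$@"
  fi
}

# Rename the running container, start the new one, stop the old one.
# Removing the old container is left to the operator after verification.
rollover() {
  new_tag="$1"
  run docker rename "$CONTAINER" "${CONTAINER}-old"
  run docker run -d \
    --name "$CONTAINER" \
    -p 9090 \
    --label "triton.cns.services=prometheus" \
    --label "solar.pub.monitoring=true" \
    "$(image_ref "$new_tag")"
  run docker stop "${CONTAINER}-old"
}
```

After sourcing the file, `rollover "$NEW_TAG"` prints the three `docker` commands for review. Setting `DRY_RUN=0` before the call executes them instead.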