Custom prometheus docker image for pub.solar monitoring
Note: We pull the prometheus docker image from Docker Hub because nixpkgs lagged a few releases behind at the time of writing.
Linting the prometheus.yml after changing it
nix-shell -p prometheus --run 'promtool check config ./src/etc/prometheus/prometheus.yml'
Updating our custom image to a new prometheus version
Update default.nix with the new image tag once a new prometheus version has been released. You can check the tags on DockerHub for this. Look for the Digest for OS/ARCH=linux/amd64 of the new version.
Paste the new digest, e.g. sha256:f2fa04806b65f49b652c8d418544bb9660bb8224619ee8c960a778f46614dddf, into the imageDigest option.
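A copy-paste slip in the digest is easy to miss until the build fails, so a quick format check before editing default.nix can help. The following is a minimal sketch; the helper name is_sha256_digest is hypothetical and not part of this repository, and it only checks the shape of the string, not whether the digest exists on Docker Hub.

```shell
# Hypothetical helper (not part of this repo): succeeds only if the argument
# has the form "sha256:" followed by exactly 64 lowercase hex characters.
is_sha256_digest() {
  printf '%s\n' "$1" | grep -Eq '^sha256:[0-9a-f]{64}$'
}

# Example digest taken from the text above.
digest='sha256:f2fa04806b65f49b652c8d418544bb9660bb8224619ee8c960a778f46614dddf'

if is_sha256_digest "$digest"; then
  echo "digest looks well-formed"
else
  echo "digest is malformed, copy it again from Docker Hub" >&2
fi
```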
Build the image and load it into the locally running docker daemon:
docker load < $(nix-build ./monitoring/prometheus/default.nix)
Push the image with the new tag to a private registry:
docker push registry.greenbaum.cloud/pub_solar/prometheus:$NEW_TAG
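The push command relies on $NEW_TAG being set in the current shell; with an empty variable the image reference ends in a bare colon and docker rejects it. As a small sketch, the hypothetical helper image_ref below (not part of this repository) builds the reference and refuses an empty tag. It assumes the image built from default.nix already carries this name and tag; if it does not, it would have to be tagged with docker tag first.

```shell
# Hypothetical helper (not part of this repo): prints the full image
# reference for a tag, or fails when the tag is empty.
image_ref() {
  if [ -z "${1:-}" ]; then
    echo "image_ref: tag must not be empty" >&2
    return 1
  fi
  printf 'registry.greenbaum.cloud/pub_solar/prometheus:%s\n' "$1"
}

# Only print the reference when NEW_TAG is actually set in this shell.
if [ -n "${NEW_TAG:-}" ]; then
  echo "will push: $(image_ref "$NEW_TAG")"
fi
```

With this in place, the push can be written as docker push "$(image_ref "$NEW_TAG")", which stops early instead of sending a malformed reference to the registry.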
Run the newly pushed image with docker on triton:
Note: the current update process deletes prometheus' time series database (TSDB), so only metrics that arrive after the update will appear in grafana. This could be improved by using a volume or a separate prometheus container for the TSDB.
- Set up your shell's environment, preferably using tritonshell, and use the profile pub_solar in data center lev-1
- Rename the currently running prometheus container:
docker rename pub_solar_prometheus pub_solar_prometheus-old
- Start a new container, using docker's --label=triton.cns.services=prometheus flag to auto-generate the DNS records prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.greenbaum.zone and prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.int.greenbaum.zone for the new container:
docker run -d \
--name pub_solar_prometheus \
-p 9090 \
--label "triton.cns.services=prometheus" \
--label "solar.pub.monitoring=true" \
registry.greenbaum.cloud/pub_solar/prometheus:$NEW_TAG
Stop the old container, then verify the new one works fine (in grafana -> datasources):
docker stop pub_solar_prometheus-old
If everything works fine, remove the old container:
docker rm pub_solar_prometheus-old
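Besides checking the data source in grafana, the new container can be probed directly before the old one is removed. Prometheus serves a health endpoint at /-/healthy. The sketch below only builds the URL; the helper name prometheus_health_url is hypothetical, and it assumes the CNS name resolves and port 9090 is reachable from the machine you run it on.

```shell
# Hypothetical helper (not part of this repo): builds the URL of the
# prometheus health endpoint. The port defaults to 9090, the port
# published by the docker run command above.
prometheus_health_url() {
  host="$1"
  port="${2:-9090}"
  printf 'http://%s:%s/-/healthy\n' "$host" "$port"
}

# CNS name from the text above; assumed to be resolvable from your machine.
PROM_HOST='prometheus.svc.e5756d08-36fd-424b-f8bc-acdb92ca7b82.lev-1.greenbaum.zone'
echo "health endpoint: $(prometheus_health_url "$PROM_HOST")"
```

Running curl -fsS "$(prometheus_health_url "$PROM_HOST")" exits with a non-zero status when the endpoint does not answer with a success code, which makes it usable as a gate before docker rm.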