Commit graph

57 commits

Author SHA1 Message Date
Slavi Pantaleev 226c550ffa Add support for stream writer Synapse workers
As stream writer workers are also powered by the `generic_worker`
Synapse app, this necessitated that we provide means for distinguishing
between them and regular `generic_workers`.

I've also taken the time to optimize nginx configuration generation
(more Jinja2 macro usage, less duplication).

Worker names have also changed.
Workers are now named sequentially like this:
- `matrix-synapse-worker-0-generic`
- `matrix-synapse-worker-1-stream-writer-typing`
- `matrix-synapse-worker-2-pusher`

instead of `matrix-synapse-worker_generic_worker-18111` (indexed with a
port number).

People who modify `matrix_synapse_workers_enabled_list` directly will
need to adjust their configuration.
2022-09-15 08:10:04 +03:00
Aine f8cc48eacc
Update Prometheus 2.37.0 -> 2.38.0 2022-08-16 15:36:54 +00:00
Slavi Pantaleev d073c7ecb3 More ansible-lint fixes 2022-07-18 13:01:19 +03:00
Slavi Pantaleev ddf18eadc7 More ansible-lint fixes 2022-07-18 13:01:17 +03:00
Slavi Pantaleev 34cdaade08 Use fully-qualified module names for builtin Ansible modules
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/1939
2022-07-18 12:58:41 +03:00
Aine 4bc12fd560
update prometheus 2.36.2 -> 2.37.0 2022-07-17 17:31:41 +03:00
Aine e149f33140
add/unify 'Project source code URL' link across all roles 2022-07-16 23:59:21 +03:00
Aine c71fea70d3
matrix-prometheus feedback 2022-06-26 12:01:57 +03:00
Aine 1542e8bca0
Update roles/matrix-prometheus/templates/systemd/matrix-prometheus.service.j2
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
2022-06-26 06:59:46 +00:00
Aine 574f57c82c
expose prometheus process args 2022-06-26 08:41:22 +03:00
Aine c793fc5ff0
Update Prometheus (v2.33.3 -> v2.36.2) 2022-06-25 18:07:30 +00:00
Aine 502ea21fba
add retires to all get_url actions 2022-04-19 22:01:14 +03:00
Slavi Pantaleev 0364c6c634 Suppress old container cleanup (kill/rm) failures
People often report and ask about these "failures".
More-so previously, when the `docker kill/rm` output was collected,
but it still happens now when people do `systemctl status
matrix-something` and notice that it says "FAILURE".

Suppressing to avoid further time being wasted on saying "this is
expected".
2022-04-11 09:05:33 +03:00
Aine 2da3768b20
Added retries to the docker pulls (#1701) 2022-03-17 17:37:11 +02:00
Jim Myhrberg eeca3c8dca
fix: avoid yaml being wrapped at column 80 via to_nice_yaml
The `to_nice_yaml` helper will by default wrap any string YAML values on
the first space after column 80. This can in worst case yield invalid
YAML syntax. More details in Ansible's documentation here:

https://docs.ansible.com/ansible/latest/user_guide/playbooks_filters.html#formatting-data-yaml-and-json

In short, you need to explicitly provide a custom width argument of a
high number of some kind to avoid the line wrapping.
2022-03-16 01:10:26 +00:00
GoliathLabs 728123b9ab
Updated: prometheus to v2.33.3 2022-02-22 12:52:00 +01:00
Marko Weltzer 819574b8ba
Merge branch 'spantaleev:master' into master 2022-02-05 21:37:53 +01:00
Marko Weltzer 7e5b88c3b7 fix: all praise the allmighty yamllinter 2022-02-05 21:32:54 +01:00
Slavi Pantaleev 86c36523df Replace ExecStopPost with ExecStop
Reverts b1b4ba501f, 90c9801c56, a3c84f78ca, ..

I haven't really traced it (yet), but on some servers, I'm observing
`ansible-playbook ... --tags=start` completing very slowly, waiting
to stop services. I can't reproduce this on all Matrix servers I manage.
I suspect that either the systemd version is to blame or that some
specific service is not responding well to some `docker kill/rm` command.

`ExecStop` seems to work great in all cases and it's what we've been
using for a very long time, so I'm reverting to that.
2022-02-05 12:13:36 +02:00
GoliathLabs e0a088dbe3
Updated: prometheus to v.2.33.1 2022-02-05 11:01:52 +01:00
HarHarLinks 321ed9b609 Merge remote-tracking branch 'origin/master' into hookshot 2022-01-14 19:26:31 +01:00
HarHarLinks 87871040df add hookshot metrics to internal prometheus 2022-01-11 00:56:51 +01:00
Slavi Pantaleev b1b4ba501f Replace ExecStop with ExecStopPost
ExecStopPost should allow us to clean up (docker kill + docker rm)
even if the ExecStart (docker run ..) command failed, and not just after
a graceful service stop was initiated.

Source: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStopPost=
2022-01-04 17:27:25 +02:00
WobbelTheBear 3f0e8122ec
Update prometheus 2021-12-02 13:41:12 +01:00
Slavi Pantaleev 735c966ab6 Disable systemd services when stopping to uninstall them
Until now, we were leaving services "enabled"
(symlinks in /etc/systemd/system/multi-user.target.wants/).

We clean these up now. Broken symlinks may still exist in older
installations that enabled/disabled services. We're not taking care
to fix these up. It's just a cosmetic defect anyway.
2021-11-10 17:39:21 +02:00
sakkiii d09609b3bd
Update prometheus (2.29.2 -> 2.30.3) 2021-10-27 23:11:01 +05:30
WobbelTheBear 972077aa33
Update prometheus (2.29.1 -> 2.29.2)
Update prometheus (2.29.1 -> 2.29.2)
2021-08-27 16:51:38 +02:00
sakkiii 00d1804dd9 prometheus & its exporter updates 2021-08-24 10:24:54 +05:30
sakkiii 49455a9ce0
prometheus version 2.28.0 -> 2.28.1 2021-07-07 21:53:05 +05:30
Stuart Mumford 2aa457efcc Use a prom variable and not a synapse role variable 2021-07-02 15:41:36 +00:00
Stuart Mumford 09ee5ce52e we index from 0 apparently 2021-06-30 21:32:19 +00:00
Stuart Mumford 3d063f6ace make them show as jobs in grafana 2021-06-30 21:30:18 +00:00
Stuart Mumford 7b52e6ad5e Add worker metrics to prometheus exporter 2021-06-30 20:52:49 +00:00
sakkiii 2b881e245b
Update prometheus v2.27.1 -> v2.28.0 2021-06-24 10:07:14 +05:30
sakkiii 897c982517
prometheus security update 2.27.1 2021-05-30 14:32:51 +05:30
Raymond Coetzee 4e2780ff88 Add support for a prometheus postgres exporter
This commit introduces a new role that downloads and installs the
prometheus community postgres exporter  https://github.com/prometheus-community/postgres_exporter.
A new credential is added to matrix_postgres_additional_databases that
allows the exporter access to the database to gather statistics.
A new dashboard was added to the grafana role, with some refactoring
to enable the dashboard only if the new role is enabled.
I've included some basic instructions for how to enable the role in
the Docs section.

In terms of testing, I've tested enabling the role, and disabling
it to make sure it cleans up the container and systemd role.
2021-05-27 20:13:29 +01:00
sakkiii d5cd3d443d
Update prometheus (2.26.0->2.27.0) 2021-05-14 18:56:33 +05:30
Slavi Pantaleev adcecaffaf Fix connectivity between prometheus and prometheus-node-exporter
Expected to have regressed after https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/1008

This patch comes with its own downsides (as described in the comments
for matrix_prometheus_node_exporter_container_http_host_bind_port),
but at least there's:
- no security issue
- metrics remain readable from matrix-prometheus (even if the network metrics are inaccurate)

A better patch is certainly welcome.
2021-04-19 18:29:03 +03:00
Dan Arnfield f04614a993 Fix prometheus network for ansible < 2.8 2021-04-17 20:15:26 -05:00
Slavi Pantaleev badd81e0ec Revert "Attempt to fix docker_network result discrepancy between Ansible versions"
This reverts commit 68ca81c8c2.
2021-04-17 19:31:20 +03:00
Slavi Pantaleev 68ca81c8c2 Attempt to fix docker_network result discrepancy between Ansible versions
Supposedly fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/907
2021-04-17 11:42:06 +03:00
Dan Arnfield 8a550ce67c Update prometheus (2.24.1->2.26.0) 2021-04-16 09:25:45 -05:00
Ahmad Haghighi e335f3fc77 rename matrix_global_registry to matrix_container_global_registry_prefix related to #990
Signed-off-by: Ahmad Haghighi <haghighi@fedoraproject.org>
2021-04-12 17:23:55 +04:30
Ahmad Haghighi f52a8b6484 use custom docker registry 2021-04-12 17:23:55 +04:30
Slavi Pantaleev 0585a3ed9f
Merge pull request #896 from rakshazi/add_version_to_each_role
added "matrix_%SERVICE%_version" variable to all roles
2021-02-21 12:26:17 +02:00
Slavi Pantaleev 77ab0d3e98 Do not delete Prometheus/Grafana Docker images
Same reasoning as in 1cd251ed78
2021-02-21 11:14:40 +02:00
rakshazi 2f887f292c
added "matrix_%SERVICE%_version" variable to all roles, use it in "matrix_%SERVICE%_docker_image" var (preserving backward-compatibility) 2021-02-20 19:08:28 +02:00
Slavi Pantaleev 66d5b0e5b9 Do not fail on unrelated validation tasks when Prometheus not enabled
These validation tasks should only run when Prometheus is enabled.
2021-02-12 15:41:15 +02:00
Slavi Pantaleev c8ab200cb1 Break dependency between matrix-prometheus and (matrix-prometheus-node-exporter, matrix-synapse) 2021-02-12 11:59:24 +02:00
Slavi Pantaleev 6842102e00 Split install/uninstall tasks in matrix-prometheus 2021-02-12 11:59:24 +02:00