pub-solar/matrix-docker-ansible-deploy

Author	SHA1	Message	Date
Slavi Pantaleev	bab1ee2233	Work around mx-puppet-discord failing with "No relay found" after reboot Related to https://gitlab.com/mx-puppet/discord/mx-puppet-discord/-/issues/117 Looks like the bridge is too quick to start and fails to initialize itself by connecting to Synapse. It's mostly observed after a system reboot, because Synapse (and everything else) is slower to start. Once mx-puppet-discord fails to initialize itself, a "No relay found" error will be observed any time you try to relay a Matrix message to Discord. Relaying messages in the other direction (Discord to Matrix) also fails. With this workaround (longer delay on mx-puppet-discord startup), I observe mx-puppet-discord working well, even after a full reboot. Of course, a proper fix is preferable, instead of delaying by a magic number of seconds.	2022-05-17 11:34:00 +03:00
Slavi Pantaleev	0364c6c634	Suppress old container cleanup (kill/rm) failures People often report and ask about these "failures". More-so previously, when the `docker kill/rm` output was collected, but it still happens now when people do `systemctl status matrix-something` and notice that it says "FAILURE". Suppressing to avoid further time being wasted on saying "this is expected".	2022-04-11 09:05:33 +03:00
Slavi Pantaleev	86c36523df	Replace ExecStopPost with ExecStop Reverts `b1b4ba501f`, `90c9801c56`, `a3c84f78ca`, .. I haven't really traced it (yet), but on some servers, I'm observing `ansible-playbook ... --tags=start` completing very slowly, waiting to stop services. I can't reproduce this on all Matrix servers I manage. I suspect that either the systemd version is to blame or that some specific service is not responding well to some `docker kill/rm` command. `ExecStop` seems to work great in all cases and it's what we've been using for a very long time, so I'm reverting to that.	2022-02-05 12:13:36 +02:00
Slavi Pantaleev	b1b4ba501f	Replace ExecStop with ExecStopPost ExecStopPost should allow us to clean up (docker kill + docker rm) even if the ExecStart (docker run ..) command failed, and not just after a graceful service stop was initiated. Source: https://www.freedesktop.org/software/systemd/man/systemd.service.html#ExecStopPost=	2022-01-04 17:27:25 +02:00
Slavi Pantaleev	512f42aa76	Do not report docker kill/rm attempts as errors These are just defensive cleanup tasks that we run. In the good case, there's nothing to kill or remove, so they trigger an error like this: > Error response from daemon: Cannot kill container: something: No such container: something and: > Error: No such container: something People often ask us if this is a problem, so instead of always having to answer with "no, this is to be expected", we'd rather eliminate it now and make logs cleaner. In the event that: - a container is really stuck and needs cleanup using kill/rm - and cleanup fails, and we fail to report it because of error suppression (`2>/dev/null`) .. we'd still get an error when launching ("container name already in use .."), so it shouldn't be too hard to investigate.	2021-01-27 10:22:46 +02:00
Slavi Pantaleev	1692a28fe4	Work around annoying Docker warning about undefined $HOME > WARNING: Error loading config file: .dockercfg: $HOME is not defined .. which appeared in Docker 20.10.	2021-01-15 00:23:01 +02:00
Slavi Pantaleev	d08b27784f	Fix systemd services autostart problem with Docker 20.10 The Docker 19.03 -> 20.10 upgrade contains the following change in `/usr/lib/systemd/system/docker.service`: ``` -BindsTo=containerd.service -After=network-online.target firewalld.service containerd.service +After=network-online.target firewalld.service containerd.service multi-user.target -Requires=docker.socket +Requires=docker.socket containerd.service Wants=network-online.target ``` The `multi-user.target` requirement in `After` seems to be in conflict with our `WantedBy=multi-user.target` and `After=docker.service` / `Requires=docker.service` definitions, causing the following error on startup for all of our systemd services: > Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start A workaround which appears to work is to add `DefaultDependencies=no` to all of our services.	2020-12-10 11:43:20 +02:00
Scott Crossen	fa5d85426b	Renamed systemd descriptions for all bridges	2020-10-13 16:40:30 -07:00
Hugues Morisset	42e7f5e9bc	Add mx-puppet-discord	2020-07-01 13:31:31 +02:00

9 commits
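
Taken together, the commits above describe a systemd unit shaped roughly like the one below. This is only a sketch to make the individual changes easier to picture, not the playbook's actual template: the `Description` text, the container name `matrix-mx-puppet-discord`, the `HOME` value, the image reference, the restart settings and the 5-second delay are all assumptions chosen for illustration.

```ini
[Unit]
# Hypothetical description text
Description=Matrix Mx Puppet Discord bridge
Requires=docker.service
After=docker.service
# d08b27784f: avoids the "ordering cycle" error seen after upgrading to Docker 20.10
DefaultDependencies=no

[Install]
WantedBy=multi-user.target

[Service]
Type=simple
# 1692a28fe4: silences the "$HOME is not defined" warning (value is an assumption)
Environment="HOME=/root"
# 512f42aa76 / 0364c6c634: defensive cleanup of a leftover container.
# stderr is discarded and the exit status is forced to success, so that
# "No such container" is not reported as a failure.
ExecStartPre=-/usr/bin/env sh -c 'docker kill matrix-mx-puppet-discord 2>/dev/null || true'
ExecStartPre=-/usr/bin/env sh -c 'docker rm matrix-mx-puppet-discord 2>/dev/null || true'
# bab1ee2233: give Synapse time to come up before the bridge starts
# (the number of seconds here is an assumption)
ExecStartPre=/usr/bin/env sleep 5
# Placeholder image reference; replace with the real one
ExecStart=/usr/bin/env docker run --rm --name matrix-mx-puppet-discord example.invalid/mx-puppet-discord:latest
# 86c36523df: cleanup runs in ExecStop rather than ExecStopPost
ExecStop=-/usr/bin/env sh -c 'docker kill matrix-mx-puppet-discord 2>/dev/null || true'
ExecStop=-/usr/bin/env sh -c 'docker rm matrix-mx-puppet-discord 2>/dev/null || true'
Restart=always
RestartSec=30
```

The trade-off noted in 512f42aa76 still applies to a unit like this: if a stuck container cannot be removed, the suppressed cleanup stays silent, but the subsequent `docker run` fails with a "container name already in use" error, which points to the cause.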