Commit graph

34 commits

Author SHA1 Message Date
Michael 33ec5710d9 0.2.1 revision 2021-02-28 22:21:40 +08:00
Slavi Pantaleev 512f42aa76 Do not report docker kill/rm attempts as errors
These are just defensive cleanup tasks that we run.
In the good case, there's nothing to kill or remove, so they trigger an
error like this:

> Error response from daemon: Cannot kill container: something: No such container: something

and:

> Error: No such container: something

People often ask us if this is a problem, so instead of always having to
answer with "no, this is to be expected", we'd rather eliminate it now
and make logs cleaner.

In the event that:
- a container is really stuck and needs cleanup using kill/rm
- and cleanup fails, and we fail to report it because of error
suppression (`2>/dev/null`)

.. we'd still get an error when launching ("container name already in use .."),
so it shouldn't be too hard to investigate.
2021-01-27 10:22:46 +02:00
Slavi Pantaleev 1692a28fe4 Work around annoying Docker warning about undefined $HOME
> WARNING: Error loading config file: .dockercfg: $HOME is not defined

.. which appeared in Docker 20.10.
2021-01-15 00:23:01 +02:00
Slavi Pantaleev 05ca9357a8 Add .service suffix to systemd units list
We'll be adding `.timer` units later on, so it's good to be
more explicit.
2021-01-14 23:02:10 +02:00
Slavi Pantaleev 568cb3d86f Upgrade matrix-mailer (4.93-r0 -> 4.93-r1)
This is a bit misleading, because the old Docker image
was tagged as `4.93.1`. There hasn't been a `4.93.1` version yet though.

Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/792
2021-01-13 17:37:31 +02:00
Slavi Pantaleev 7593d969e3 Make matrix-mailer not occupy matrix_server_fqn_matrix
Starting with Docker 20.10, `--hostname` seems to have the side-effect
of making Docker's internal DNS server resolve said hostname to the IP
address of the container.

Because we were giving the mailer service a hostname of `matrix.DOMAIN`,
all requests destined for `matrix.DOMAIN` originating from other
services on the container network were resolving to `matrix-mailer`.
This is obviously wrong.

Initially reported here: https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/748

We normally try to not use the public hostname (and IP address) on the
container network and try to make services talk to one another locally,
but it sometimes could happen.

With this, we use a `matrix-mailer` hostname for the matrix-mailer
container. My testing shows that it doesn't cause any trouble with
email deliverability.
2020-12-10 23:51:11 +02:00
Slavi Pantaleev d08b27784f Fix systemd services autostart problem with Docker 20.10
The Docker 19.04 -> 20.10 upgrade contains the following change
in `/usr/lib/systemd/system/docker.service`:

```
-BindsTo=containerd.service
-After=network-online.target firewalld.service containerd.service
+After=network-online.target firewalld.service containerd.service multi-user.target
-Requires=docker.socket
+Requires=docker.socket containerd.service
Wants=network-online.target
```

The `multi-user.target` requirement in `After` seems to be in conflict
with our `WantedBy=multi-user.target` and `After=docker.service` /
`Requires=docker.service` definitions, causing the following error on
startup for all of our systemd services:

> Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start

A workaround which appears to work is to add `DefaultDependencies=no`
to all of our services.
2020-12-10 11:43:20 +02:00
Slavi Pantaleev 5eed874199 Improve self-building experience (avoid conflict with pullable images)
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/716

This patch makes us use more fully-qualified container image names
(either prefixed with docker.io/ or with localhost/).

The latter happens when self-building is enabled.

We've recently had issues where if an image was removed manually
and the service was restarted (making `docker run` fetch it from Docker Hub, etc.),
we'd end up with a pulled image, even though we're aiming for a self-built one.
Re-running the playbook would then not do a rebuild, because:
- the image with that name already exists (even though it's something
else)
- we sometimes had conditional logic where we'd build only if the git
repo changed

By explicitly changing the name of the images (prefixing with localhost/),
we avoid such confusion and the possibility that we'd automatically pul something
which is not what we expect.

Also, I've removed that condition where building would happen on git
changes only. We now always build (unless an image with that name
already exists). We just force-build when the git repo changes.
2020-11-14 23:00:49 +02:00
Slavi Pantaleev ab32f6adf6 Add self-building support to matrix-mailer (exim-relay) 2020-06-08 09:52:34 +03:00
Slavi Pantaleev 1f414a44ff Upgrade matrix-mailer 2020-06-08 09:37:28 +03:00
Chris van Dijk 6334f6c1ea Remove hardcoded command paths in systemd unit files
Depending on the distro, common commands like sleep and chown may either
be located in /bin or /usr/bin.

Systemd added path lookup to ExecStart in v239, allowing only the
command name to be put in unit files and not the full path as
historically required. At least Ubuntu 18.04 LTS is however still on
v237 so we should maintain portability for a while longer.
2020-05-27 23:14:54 +02:00
Chris van Dijk 7585bcc4ac Allow the matrix user username and groupname to be configured separately
No migration steps should be required.
2020-05-01 19:59:32 +02:00
mooomooo eebc6e13f8 Made directory variables for /etc/systemd/system , /etc/cron.d , /usr/local/bin 2020-03-24 11:27:58 -07:00
Slavi Pantaleev 6b8ca70a0b Upgrade Exim (4.92.1 -> 4.92.2) 2019-09-09 07:22:45 +03:00
Slavi Pantaleev 14e242aec1 Make matrix-mailer exit more gracefully 2019-09-04 10:04:57 +03:00
Slavi Pantaleev 18f6b29372 Bump matrix-mailer / exim release (4.92.1-r0-0 -> 4.92.1-r0-1)
It adds support for a new `DISABLE_SENDER_VERIFICATION` environment
variable that can be used to disable verification of sender addresses.

It doesn't matter for us, but we upgrade to keep up with latest.
2019-07-31 10:47:57 +03:00
Slavi Pantaleev 0e3b73a612 Upgrade matrix-mailer / exim (4.92 -> 4.92.1) 2019-07-30 20:56:05 +03:00
Slavi Pantaleev d8a4007220 Upgrade exim (4.91 -> 4.92)
Note: https://www.us-cert.gov/ncas/current-activity/2019/06/13/Exim-Releases-Security-Patches

That said, I don't believe we've been affected.
Not in a bad way at least, because:
- we run exim as non-root and capabilities dropped
- we run exim in a private Docker network with known trusted relayers
(Synapse and mxisd)
2019-06-14 08:07:54 +03:00
Slavi Pantaleev 7d3adc4512 Automatically force-pull :latest images
We do use some `:latest` images by default for the following services:
- matrix-dimension
- Goofys (in the matrix-synapse role)
- matrix-bridge-appservice-irc
- matrix-bridge-appservice-discord
- matrix-bridge-mautrix-facebook
- matrix-bridge-mautrix-whatsapp

It's terribly unfortunate that those software projects don't release
anything other than `:latest`, but that's how it is for now.

Updating that software requires that users manually do `docker pull`
on the server. The playbook didn't force-repull images that it already
had.

With this patch, it starts doing so. Any image tagged `:latest` will be
force re-pulled by the playbook every time it's executed.

It should be noted that even though we ask the `docker_image` module to
force-pull, it only reports "changed" when it actually pulls something
new. This is nice, because it lets people know exactly when something
gets updated, as opposed to giving the indication that it's always
updating the images (even though it isn't).
2019-06-10 14:30:28 +03:00
Dan Arnfield 9c23d877fe Fix docker_image option for ansible < 2.8 2019-05-22 05:43:33 -05:00
Dan Arnfield db15791819 Add source option to docker_image to fix deprecation warning 2019-05-21 10:29:12 -05:00
Dan Arnfield 3982f114af Fix CONDITIONAL_BARE_VARS deprecation warning in ansible 2.8 2019-05-21 10:25:59 -05:00
Slavi Pantaleev ae7c8d1524 Use SyslogIdentifier to improve logging
Reasoning is the same as for matrix-org/synapse#5023.

For us, the journal used to contain `docker` for all services, which
is not very helpful when looking at them all together (`journalctl -f`).
2019-05-16 09:43:46 +09:00
Hugues De Keyzer c451025134 Fix indentation in templates
Use Jinja2 lstrip_blocks option in templates to ensure consistent
indentation in generated files.
2019-05-07 21:23:35 +02:00
Sylvia van Os 75b1528d13 Add the possibility to pass extra flags to the docker container 2019-04-30 16:35:18 +02:00
Slavi Pantaleev 18a562c000 Upgrade services 2019-04-21 08:57:49 +03:00
Slavi Pantaleev 2d56ff0afa Skip some uninstall tasks if not necessary to run 2019-03-13 07:40:51 +02:00
Slavi Pantaleev 45618679f5 Reload systemd services when they get updated
Fixes #69 (Github Issue)
2019-03-03 11:55:15 +02:00
Slavi Pantaleev a43bcd81fe Rename some variables 2019-02-28 11:51:09 +02:00
Slavi Pantaleev 0be7b25c64 Make (most) containers run with a read-only filesystem 2019-01-29 18:52:02 +02:00
Slavi Pantaleev 316d653d3e Drop capabilities in containers
We run containers as a non-root user (no effective capabilities).

Still, if a setuid binary is available in a container image, it could
potentially be used to give the user the default capabilities that the
container was started with. For Docker, the default set currently is:
- "CAP_CHOWN"
- "CAP_DAC_OVERRIDE"
- "CAP_FSETID"
- "CAP_FOWNER"
- "CAP_MKNOD"
- "CAP_NET_RAW"
- "CAP_SETGID"
- "CAP_SETUID"
- "CAP_SETFCAP"
- "CAP_SETPCAP"
- "CAP_NET_BIND_SERVICE"
- "CAP_SYS_CHROOT"
- "CAP_KILL"
- "CAP_AUDIT_WRITE"

We'd rather prevent such a potential escalation by dropping ALL
capabilities.

The problem is nicely explained here: https://github.com/projectatomic/atomic-site/issues/203
2019-01-28 11:22:54 +02:00
Slavi Pantaleev 299a8c4c7c Make (most) containers start as non-root
This makes all containers (except mautrix-telegram and
mautrix-whatsapp), start as a non-root user.

We do this, because we don't trust some of the images.
In any case, we'd rather not trust ALL images and avoid giving
`root` access at all. We can't be sure they would drop privileges
or what they might do before they do it.

Because Postfix doesn't support running as non-root,
it had to be replaced by an Exim mail server.

The matrix-nginx-proxy nginx container image is patched up
(by replacing its main configuration) so that it can work as non-root.
It seems like there's no other good image that we can use and that is up-to-date
(https://hub.docker.com/r/nginxinc/nginx-unprivileged is outdated).

Likewise for riot-web (https://hub.docker.com/r/bubuntux/riot-web/),
we patch it up ourselves when starting (replacing the main nginx
configuration).
Ideally, it would be fixed upstream so we can simplify.
2019-01-27 20:25:13 +02:00
Slavi Pantaleev c10182e5a6 Make roles more independent of one another
With this change, the following roles are now only dependent
on the minimal `matrix-base` role:
- `matrix-corporal`
- `matrix-coturn`
- `matrix-mailer`
- `matrix-mxisd`
- `matrix-postgres`
- `matrix-riot-web`
- `matrix-synapse`

The `matrix-nginx-proxy` role still does too much and remains
dependent on the others.

Wiring up the various (now-independent) roles happens
via a glue variables file (`group_vars/matrix-servers`).
It's triggered for all hosts in the `matrix-servers` group.

According to Ansible's rules of priority, we have the following
chain of inclusion/overriding now:
- role defaults (mostly empty or good for independent usage)
- playbook glue variables (`group_vars/matrix-servers`)
- inventory host variables (`inventory/host_vars/matrix.<your-domain>`)

All roles default to enabling their main component
(e.g. `matrix_mxisd_enabled: true`, `matrix_riot_web_enabled: true`).
Reasoning: if a role is included in a playbook (especially separately,
in another playbook), it should "work" by default.

Our playbook disables some of those if they are not generally useful
(e.g. `matrix_corporal_enabled: false`).
2019-01-16 18:05:48 +02:00
Slavi Pantaleev 51312b8250 Split playbook into multiple roles
As suggested in #63 (Github issue), splitting the
playbook's logic into multiple roles will be beneficial for
maintainability.

This patch realizes this split. Still, some components
affect others, so the roles are not really independent of one
another. For example:
- disabling mxisd (`matrix_mxisd_enabled: false`), causes Synapse
and riot-web to reconfigure themselves with other (public)
Identity servers.

- enabling matrix-corporal (`matrix_corporal_enabled: true`) affects
how reverse-proxying (by `matrix-nginx-proxy`) is done, in order to
put matrix-corporal's gateway server in front of Synapse

We may be able to move away from such dependencies in the future,
at the expense of a more complicated manual configuration, but
it's probably not worth sacrificing the convenience we have now.

As part of this work, the way we do "start components" has been
redone now to use a loop, as suggested in #65 (Github issue).
This should make restarting faster and more reliable.
2019-01-12 18:01:10 +02:00