Commit graph

253 commits

Author SHA1 Message Date
Slavi Pantaleev 316d653d3e Drop capabilities in containers
We run containers as a non-root user (no effective capabilities).

Still, if a setuid binary is available in a container image, it could
potentially be used to give the user the default capabilities that the
container was started with. For Docker, the default set currently is:
- "CAP_CHOWN"
- "CAP_DAC_OVERRIDE"
- "CAP_FSETID"
- "CAP_FOWNER"
- "CAP_MKNOD"
- "CAP_NET_RAW"
- "CAP_SETGID"
- "CAP_SETUID"
- "CAP_SETFCAP"
- "CAP_SETPCAP"
- "CAP_NET_BIND_SERVICE"
- "CAP_SYS_CHROOT"
- "CAP_KILL"
- "CAP_AUDIT_WRITE"

We'd rather prevent such a potential escalation by dropping ALL
capabilities.

The problem is nicely explained here: https://github.com/projectatomic/atomic-site/issues/203
2019-01-28 11:22:54 +02:00
Slavi Pantaleev 299a8c4c7c Make (most) containers start as non-root
This makes all containers (except mautrix-telegram and
mautrix-whatsapp), start as a non-root user.

We do this, because we don't trust some of the images.
In any case, we'd rather not trust ALL images and avoid giving
`root` access at all. We can't be sure they would drop privileges
or what they might do before they do it.

Because Postfix doesn't support running as non-root,
it had to be replaced by an Exim mail server.

The matrix-nginx-proxy nginx container image is patched up
(by replacing its main configuration) so that it can work as non-root.
It seems like there's no other good image that we can use and that is up-to-date
(https://hub.docker.com/r/nginxinc/nginx-unprivileged is outdated).

Likewise for riot-web (https://hub.docker.com/r/bubuntux/riot-web/),
we patch it up ourselves when starting (replacing the main nginx
configuration).
Ideally, it would be fixed upstream so we can simplify.
2019-01-27 20:25:13 +02:00
Slavi Pantaleev 56d501679d Be explicit about the UID/GID we start Synapse with
We do match the defaults anyway (by default that is),
but people can customize `matrix_user_uid` and `matrix_user_uid`
and it wouldn't be correct then.

In any case, it's better to be explicit about such an important thing.
2019-01-26 20:21:18 +02:00
Slavi Pantaleev 1a80058a2a Indent (non-YAML) using tabs
Fixes #83 (Github issue)
2019-01-26 09:37:29 +02:00
Slavi Pantaleev a88b24ed2c Update matrix-corporal (1.2.2 -> 1.3.0) 2019-01-25 16:58:20 +02:00
Slavi Pantaleev fcceb3143d Update riot-web (0.17.8 -> 0.17.9) 2019-01-23 08:13:27 +02:00
Slavi Pantaleev a4e7ad5566 Use async Ansible task for importing Postgres
A long-running import task may hit the SSH timeout value
and die. Using async is supposed to improve reliability
in such scenarios.
2019-01-21 08:34:49 +02:00
Slavi Pantaleev 0392822aa7 Show Postgres import command and mention manual importing 2019-01-21 08:33:10 +02:00
Slavi Pantaleev 8d186e5194 Fix Postgres import when Postgres had never started
If this is a brand new server and Postgres had never started,
detecting it before we even start it is not possible.

This moves the logic, so that it happens later on, when Postgres
would have had the chance to start and possibly initialize
a new empty database.

Fixes #82 (Github issue)
2019-01-21 07:32:19 +02:00
Slavi Pantaleev fef6c052c3 Pass Host/X-Forwarded-For everywhere
It hasn't mattered much to have these so far, but
it's probably a good idea to have them.
2019-01-17 16:25:08 +02:00
Slavi Pantaleev ba75ab496d Send Host/X-Forwarded-For to mxisd
It worked without it too, but doing this is more consistent with the
mxisd recommendations.
2019-01-17 16:22:49 +02:00
Slavi Pantaleev cb11548eec Use mxisd for user directory searches
Implements #77 (Github issue).
2019-01-17 15:55:23 +02:00
Slavi Pantaleev df0d465482 Fix typos in some variables (matrix_mxid -> matrix_mxisd) 2019-01-17 14:47:37 +02:00
Slavi Pantaleev f4f06ae068 Make matrix-nginx-proxy role independent of others
The matrix-nginx-proxy role can now be used independently.
This makes it consistent with all other roles, with
the `matrix-base` role remaining as their only dependency.

Separating matrix-nginx-proxy was relatively straightforward, with
the exception of the Mautrix Telegram reverse-proxying configuration.
Mautrix Telegram, being an extension/bridge, does not feel important enough
to justify its own special handling in matrix-nginx-proxy.

Thus, we've introduced the concept of "additional configuration blocks"
(`matrix_nginx_proxy_proxy_matrix_additional_server_configuration_blocks`),
where any module can register its own custom nginx server blocks.

For such dynamic registration to work, the order of role execution
becomes important. To make it possible for each module participating
in dynamic registration to verify that the order of execution is
correct, we've also introduced a `matrix_nginx_proxy_role_executed`
variable.

It should be noted that this doesn't make the matrix-synapse role
dependent on matrix-nginx-proxy. It's optional runtime detection
and registration, and it only happens in the matrix-synapse role
when `matrix_mautrix_telegram_enabled: true`.
2019-01-17 13:32:46 +02:00
Slavi Pantaleev c10182e5a6 Make roles more independent of one another
With this change, the following roles are now only dependent
on the minimal `matrix-base` role:
- `matrix-corporal`
- `matrix-coturn`
- `matrix-mailer`
- `matrix-mxisd`
- `matrix-postgres`
- `matrix-riot-web`
- `matrix-synapse`

The `matrix-nginx-proxy` role still does too much and remains
dependent on the others.

Wiring up the various (now-independent) roles happens
via a glue variables file (`group_vars/matrix-servers`).
It's triggered for all hosts in the `matrix-servers` group.

According to Ansible's rules of priority, we have the following
chain of inclusion/overriding now:
- role defaults (mostly empty or good for independent usage)
- playbook glue variables (`group_vars/matrix-servers`)
- inventory host variables (`inventory/host_vars/matrix.<your-domain>`)

All roles default to enabling their main component
(e.g. `matrix_mxisd_enabled: true`, `matrix_riot_web_enabled: true`).
Reasoning: if a role is included in a playbook (especially separately,
in another playbook), it should "work" by default.

Our playbook disables some of those if they are not generally useful
(e.g. `matrix_corporal_enabled: false`).
2019-01-16 18:05:48 +02:00
Slavi Pantaleev 294a5c9083 Fix YAML serialization of empty matrix_synapse_federation_domain_whitelist
We've previously changed a bunch of lists in `homeserver.yaml.j2`
to be serialized using `|to_nice_yaml`, as that generates a more
readable list in YAML.

`matrix_synapse_federation_domain_whitelist`, however, couldn't have
been changed to that, as it can potentially be an empty list.

We may be able to differentiate between empty and non-empty now
and serialize it accordingly (favoring `|to_nice_yaml` if non-empty),
but it's not important enough to be justified. Thus, always
serializing with `|to_json`.

Fixes #78 (Github issue)
2019-01-16 17:06:58 +02:00
Sylvia van Os cec2aa61c1 Fix scalar widgets
Riot-web parses integrations_widgets_urls as a list, thus causing it to incorrectly think Scalar widgets are non-Scalar and not passing the scalar token
2019-01-16 14:03:39 +01:00
Stuart Mumford f8ebd94d08
Make the mode of the base path configurable 2019-01-14 14:40:11 +00:00
Slavi Pantaleev e8c78c1572 Merge branch 'master' into split-into-multiple-roles 2019-01-14 08:27:53 +02:00
Slavi Pantaleev 857603d9d7 Make nginx-proxy files owned by matrix:matrix, not root:root 2019-01-14 08:26:56 +02:00
Slavi Pantaleev b80d44afaa Stop Postgres before finding files to move over 2019-01-12 18:16:08 +02:00
Slavi Pantaleev 51312b8250 Split playbook into multiple roles
As suggested in #63 (Github issue), splitting the
playbook's logic into multiple roles will be beneficial for
maintainability.

This patch realizes this split. Still, some components
affect others, so the roles are not really independent of one
another. For example:
- disabling mxisd (`matrix_mxisd_enabled: false`), causes Synapse
and riot-web to reconfigure themselves with other (public)
Identity servers.

- enabling matrix-corporal (`matrix_corporal_enabled: true`) affects
how reverse-proxying (by `matrix-nginx-proxy`) is done, in order to
put matrix-corporal's gateway server in front of Synapse

We may be able to move away from such dependencies in the future,
at the expense of a more complicated manual configuration, but
it's probably not worth sacrificing the convenience we have now.

As part of this work, the way we do "start components" has been
redone now to use a loop, as suggested in #65 (Github issue).
This should make restarting faster and more reliable.
2019-01-12 18:01:10 +02:00
Slavi Pantaleev 6d253ff571 Switch to a better riot-web image (avhost/docker-matrix-riot -> bubuntux/riot-web)
The new container image is about 20x smaller in size, faster to start up, etc.

This also fixes #26 (Github issue).
2019-01-11 21:20:21 +02:00
Slavi Pantaleev 14a237885a Fix missing SMTP configuration for mxisd
Regression since 9a9b7383e9.
2019-01-11 20:26:40 +02:00
Slavi Pantaleev 9a9b7383e9 Completely redo how mxisd configuration gets generated
This change is provoked by a few different things:

- #54 (Github Pull Request), which rightfully says that we need a
way to support ALL mxisd configuration options easily

- the upcoming mxisd 1.3.0 release, which drops support for
property-style configuration (dot-notation), forcing us to
redo the way we generate the configuration file

With this, mxisd is much more easily configurable now
and much more easily maintaneable by us in the future
(no need to introduce additional playbook variables and logic).
2019-01-11 19:33:54 +02:00
Slavi Pantaleev fca2f2e036 Catch misconfigured REST Auth password provider during installation 2019-01-11 01:03:35 +02:00
Slavi Pantaleev 46c5d11d56 Update components 2019-01-10 19:29:56 +02:00
Slavi Pantaleev 2ae7c5e177
Merge pull request #68 from spantaleev/manage-cronjobs-with-cron-module
Switch to managing cronjobs with the Ansible cron module
2019-01-08 16:21:57 +02:00
Slavi Pantaleev 00ae435044 Use |to_json filter for serializing booleans to JSON
This should account for all cases where we were still doing such a thing.

Improvement suggested in #65 (Github issue).
2019-01-08 13:12:56 +02:00
Slavi Pantaleev b222d26c86 Switch to managing cronjobs with the Ansible cron module
As suggested in #65 (Github issue), this patch switches
cronjob management from using templates to using Ansible's `cron` module.

It also moves the management of the nginx-reload cronjob to `setup_ssl_lets_encrypt.yml`,
which is a more fitting place for it (given that this cronjob is only required when
Let's Encrypt is used).

Pros:
- using a module is more Ansible-ish than templating our own files in
special directories

- more reliable: will fail early (during playbook execution) if `/usr/bin/crontab`
is not available, which is more of a guarantee that cron is working fine
(idea: we should probably install some cron package using the playbook)

Cons:
- invocation schedule is no longer configurable, unless we define individual
variables for everything or do something smart (splitting on ' ', etc.).
Likely not necessary, however.

- requires us to deprecate and clean-up after the old way of managing cronjobs,
because it's not compatible (using the same file as before means appending
additional jobs to it)
2019-01-08 12:52:03 +02:00
Slavi Pantaleev ef2dc3745a Check DNS SRV record for _matrix-identity._tcp when mxisd enabled 2019-01-08 10:39:22 +02:00
Slavi Pantaleev f92c4d5a27 Use Ansible dig lookup instead of calling the dig program
This means we no longer have a dependency on the `dig` program,
but we do have a dependency on `dnspython`.

Improves things as suggested in #65 (Github issue).
2019-01-08 10:19:45 +02:00
Jan Christian Grünhage 29d10804f0 Use yaml syntax instead of key=value syntax consistently
fixes #62
2019-01-07 23:38:39 +01:00
Slavi Pantaleev 5135c0cc0a Add Ansible guide and Ansible version checks
After having multiple people report issues with retrieving
SSL certificates, we've finally discovered the culprit to be
Ansible 2.5.1 (default and latest version on Ubuntu 18.04 LTS).

As silly as it is, certain distributions ("LTS" even) are 13 bugfix
versions of Ansible behind.

From now on, we try to auto-detect buggy Ansible versions and tell the
user. We also provide some tips for how to upgrade Ansible or
run it from inside a Docker container.

My testing shows that Ansible 2.4.0 and 2.4.6 are OK.
All other intermediate 2.4.x versions haven't been tested, but we
trust they're OK too.

From the 2.5.x releases, only 2.5.0 and 2.5.1 seem to be affected.
Ansible 2.5.2 corrects the problem with `include_tasks` + `with_items`.
2019-01-03 16:24:14 +02:00
Slavi Pantaleev 99af4543ac Replace include usage with include_tasks and import_tasks
The long-deprecated (since Ansible 2.4) use of include is
no more.
2019-01-03 15:24:08 +02:00
Slavi Pantaleev 76506f34e0 Make media-store restore work with server files, not local
This is a simplification and a way to make it consistent with
how we do Postgres imports (see 6d89319822), using
files coming from the server, not from the local machine.

By encouraging people NOT to use local files,
we potentially avoid problems such as #34 (Github issue),
where people would download `media_store` to their Mac's filesystem
and case-sensitivity issues will actually corrupt it.

By not encouraging local files usage, it's less likely that
people would copy (huge) directories to their local machine like that.
2019-01-01 15:57:50 +02:00
Slavi Pantaleev e604a7bd43 Fix error message inaccuracy 2019-01-01 15:25:52 +02:00
Slavi Pantaleev 4c2e1a0588 Make SQLite database import work with server files, not local
This is a simplification and a way to make it consistent with
how we do Postgres imports (see 6d89319822), using
files coming from the server, not from the local machine.
2019-01-01 15:21:52 +02:00
Slavi Pantaleev f153c70a60 Reorganize some files 2019-01-01 14:47:22 +02:00
Slavi Pantaleev 6d89319822 Add support for importing an existing Postgres database 2019-01-01 14:45:37 +02:00
Slavi Pantaleev f472c1b9e5 Ensure psql returns a failure exit code when it fails
Until now, if the .sql file contained invalid data, psql would
choke on it, but still return an exit code of 0.
This is very misleading.

We need to pass `-v ON_ERROR_STOP=1` to make it exit
with a proper error exit code when failures happen.
2019-01-01 14:05:11 +02:00
Slavi Pantaleev a7f791f8f9 Make Postgres version detection logic reusable to ease maintenance
We've had that logic in 2 places so far, leading to duplication
and a maintenance burden.

In the future, we'll also have an import-postgres feature,
which will also need Postgres version detection,
leading to more benefit from that logic being reusable.
2019-01-01 13:43:51 +02:00
Slavi Pantaleev c59a53551a Make well-known self-check not depend on Content-Type: application/json
Fixes #60 (Github issue)
2018-12-31 11:19:59 +02:00
Hardy Erlinger 2fc0f5f3cf Set MAILNAME env variable to FQDN hostname for matrix-mailer. 2018-12-30 21:50:59 +01:00
Slavi Pantaleev 87b5f0a4d4 Server non-scary page at matrix domain (take 2)
Fix for 12b65d8ccc.
2018-12-29 20:11:37 +02:00
Slavi Pantaleev f7aa362961 Make "obtain certificates" tasks have unique names
We always skip at least one of these tasks, depending on which
SSL retrieval method is enabled, so it could have been confusing why.
2018-12-24 09:39:27 +02:00
Slavi Pantaleev 4757c13a2e Do not install openssl if not necessary
Fix for d28bdb3258.

We were only supposed to install openssl when the self-signed
SSL certificate retrieval method is used, not always.
2018-12-24 09:38:00 +02:00
Slavi Pantaleev 12b65d8ccc Serve a non-scary page at the matrix domain
Fixes #18 (Github issue).

It would probably be better if we serve our own page,
as the Matrix one says:

"To use this server you'll need a Matrix client", which
is true, but we install Riot by default and it'd be better if we mention
that instead.
2018-12-23 19:45:03 +02:00
Slavi Pantaleev b9b5674b8a Lowercase host_specific_hostname_identity to prevent troubles
If uppercase is used, certain tools (like certbot) would cause trouble.
They would retrieve a certificate for the lowercased domain name,
but we'd try to use it from an uppercase-named directory, which will
fail.

Besides certbot, we may experience other trouble too.
(it hasn't been investigated how far the breakage goes).

To fix it all, we lowercase `host_specific_hostname_identity` by default,
which takes care of the general use-case (people only setting that
and relying on us to build the other domain names - `hostname_matrix`
and `hostname_riot`).

For others, who decide to override these other variables directly
(and who may work around us and introduce uppercase there directly),
we also have the sanity-check tool warn if uppercase is detected
in any of the final domains.
2018-12-23 19:25:57 +02:00
Slavi Pantaleev fe9b9773c0 Move setup sanity checks to a central place 2018-12-23 19:15:23 +02:00