Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/716
This patch makes us use more fully-qualified container image names
(either prefixed with docker.io/ or with localhost/).
The latter happens when self-building is enabled.
We've recently had issues where if an image was removed manually
and the service was restarted (making `docker run` fetch it from Docker Hub, etc.),
we'd end up with a pulled image, even though we're aiming for a self-built one.
Re-running the playbook would then not do a rebuild, because:
- the image with that name already exists (even though it's something
else)
- we sometimes had conditional logic where we'd build only if the git
repo changed
By explicitly changing the name of the images (prefixing with localhost/),
we avoid such confusion and the possibility that we'd automatically pul something
which is not what we expect.
Also, I've removed that condition where building would happen on git
changes only. We now always build (unless an image with that name
already exists). We just force-build when the git repo changes.
also, worker.yaml.j2:
- hone worker_name
- remove worker_pid_file entry (would only be used if worker_daemonize
set to true; also, synapse only knows about the container namespace
and thus can not provide the required host-view PID)
We do use some `:latest` images by default for the following services:
- matrix-dimension
- Goofys (in the matrix-synapse role)
- matrix-bridge-appservice-irc
- matrix-bridge-appservice-discord
- matrix-bridge-mautrix-facebook
- matrix-bridge-mautrix-whatsapp
It's terribly unfortunate that those software projects don't release
anything other than `:latest`, but that's how it is for now.
Updating that software requires that users manually do `docker pull`
on the server. The playbook didn't force-repull images that it already
had.
With this patch, it starts doing so. Any image tagged `:latest` will be
force re-pulled by the playbook every time it's executed.
It should be noted that even though we ask the `docker_image` module to
force-pull, it only reports "changed" when it actually pulls something
new. This is nice, because it lets people know exactly when something
gets updated, as opposed to giving the indication that it's always
updating the images (even though it isn't).
This doesn't replace all usage of `-v`, but it's a start.
People sometimes troubleshoot by deleting files (especially bridge
config files). Restarting Synapse with a missing registration.yaml file
for a given bridge, causes the `-v
/something/registration.yaml:/something/registration.yaml:ro` option
to force-create `/something/registration.yaml` as a directory.
When a path that's provided to the `-v` option is missing, Docker
auto-creates that path as a directory.
This causes more breakage and confusion later on.
We'd rather fail, instead of magically creating directories.
Using `--mount`, instead of `-v` is the solution to this.
From Docker's documentation:
> When you use --mount with type=bind, the host-path must refer to an existing path on the host.
> The path will not be created for you and the service will fail with an error if the path does not exist.
The goal is to move each bridge into its own separate role.
This commit starts off the work on this with 2 bridges:
- mautrix-telegram
- mautrix-whatsapp
Each bridge's role (including these 2) is meant to:
- depend only on the matrix-base role
- integrate nicely with the matrix-synapse role (if available)
- integrate nicely with the matrix-nginx-proxy role (if available and if
required). mautrix-telegram bridge benefits from integrating with
it.
- not break if matrix-synapse or matrix-nginx-proxy are not used at all
This has been provoked by #174 (Github Issue).
The code used to check for a `homeserver.yaml` file and generate
a configuration (+ key) only if such a configuration file didn't exist.
Certain rare cases (setting up with one server name and then
changing to another) lead to `homeserver.yaml` being there,
but a `matrix.DOMAIN.signing.key` file missing (because the domain
changed).
A new signing key file would never get generated, because `homeserver.yaml`'s
existence used to be (incorrectly) satisfactory for us.
From now on, we don't mix things up like that.
We don't care about `homeserver.yaml` anymore, but rather
about the actual signing key.
The rest of the configuration (`homeserver.yaml` and
`matrix.DOMAIN.log.config`) is rebuilt by us in any case, so whether
it exists or not is irrelevant and doesn't need checking.
In most cases, there's not really a need to touch the system
firewall, as Docker manages iptables by itself
(see https://docs.docker.com/network/iptables/).
All ports exposed by Docker containers are automatically whitelisted
in iptables and wired to the correct container.
This made installing firewalld and whitelisting ports pointless,
as far as this playbook's services are concerned.
People that wish to install firewalld (for other reasons), can do so
manually from now on.
This is inspired by and fixes#97 (Github Issue).
By default, `--tags=self-check` no longer validates certificates
when `matrix_ssl_retrieval_method` is set to `self-signed`.
Besides this default, people can also enable/disable validation using the
individual role variables manually.
Fixes#124 (Github Issue)
Using `docker_container` with a `cap_drop` argument requires
Ansible >=2.7.
We want to support older versions too (2.4), so we either need to
stop invoking it with `cap_drop` (insecure), or just stop using
the module altogether.
Since it was suffering from other bugs too (not deleting containers
on failure), we've decided to remove `docker_container` usage completely.
`matrix_synapse_no_tls` is now implicit, so we've gotten rid of it.
The `homeserver.yaml.j2` template has been synchronized with the
configuration generated by Synapse v0.99.1 (some new options
are present, etc.)
This reverts commit 0dac5ea508.
Relying on pyOpenSSL is the Ansible way of doing things, but is
impractical and annoying for users.
`openssl` is easily available on most servers, even by default.
We'd better use that.
This makes all containers (except mautrix-telegram and
mautrix-whatsapp), start as a non-root user.
We do this, because we don't trust some of the images.
In any case, we'd rather not trust ALL images and avoid giving
`root` access at all. We can't be sure they would drop privileges
or what they might do before they do it.
Because Postfix doesn't support running as non-root,
it had to be replaced by an Exim mail server.
The matrix-nginx-proxy nginx container image is patched up
(by replacing its main configuration) so that it can work as non-root.
It seems like there's no other good image that we can use and that is up-to-date
(https://hub.docker.com/r/nginxinc/nginx-unprivileged is outdated).
Likewise for riot-web (https://hub.docker.com/r/bubuntux/riot-web/),
we patch it up ourselves when starting (replacing the main nginx
configuration).
Ideally, it would be fixed upstream so we can simplify.
The matrix-nginx-proxy role can now be used independently.
This makes it consistent with all other roles, with
the `matrix-base` role remaining as their only dependency.
Separating matrix-nginx-proxy was relatively straightforward, with
the exception of the Mautrix Telegram reverse-proxying configuration.
Mautrix Telegram, being an extension/bridge, does not feel important enough
to justify its own special handling in matrix-nginx-proxy.
Thus, we've introduced the concept of "additional configuration blocks"
(`matrix_nginx_proxy_proxy_matrix_additional_server_configuration_blocks`),
where any module can register its own custom nginx server blocks.
For such dynamic registration to work, the order of role execution
becomes important. To make it possible for each module participating
in dynamic registration to verify that the order of execution is
correct, we've also introduced a `matrix_nginx_proxy_role_executed`
variable.
It should be noted that this doesn't make the matrix-synapse role
dependent on matrix-nginx-proxy. It's optional runtime detection
and registration, and it only happens in the matrix-synapse role
when `matrix_mautrix_telegram_enabled: true`.
With this change, the following roles are now only dependent
on the minimal `matrix-base` role:
- `matrix-corporal`
- `matrix-coturn`
- `matrix-mailer`
- `matrix-mxisd`
- `matrix-postgres`
- `matrix-riot-web`
- `matrix-synapse`
The `matrix-nginx-proxy` role still does too much and remains
dependent on the others.
Wiring up the various (now-independent) roles happens
via a glue variables file (`group_vars/matrix-servers`).
It's triggered for all hosts in the `matrix-servers` group.
According to Ansible's rules of priority, we have the following
chain of inclusion/overriding now:
- role defaults (mostly empty or good for independent usage)
- playbook glue variables (`group_vars/matrix-servers`)
- inventory host variables (`inventory/host_vars/matrix.<your-domain>`)
All roles default to enabling their main component
(e.g. `matrix_mxisd_enabled: true`, `matrix_riot_web_enabled: true`).
Reasoning: if a role is included in a playbook (especially separately,
in another playbook), it should "work" by default.
Our playbook disables some of those if they are not generally useful
(e.g. `matrix_corporal_enabled: false`).
As suggested in #63 (Github issue), splitting the
playbook's logic into multiple roles will be beneficial for
maintainability.
This patch realizes this split. Still, some components
affect others, so the roles are not really independent of one
another. For example:
- disabling mxisd (`matrix_mxisd_enabled: false`), causes Synapse
and riot-web to reconfigure themselves with other (public)
Identity servers.
- enabling matrix-corporal (`matrix_corporal_enabled: true`) affects
how reverse-proxying (by `matrix-nginx-proxy`) is done, in order to
put matrix-corporal's gateway server in front of Synapse
We may be able to move away from such dependencies in the future,
at the expense of a more complicated manual configuration, but
it's probably not worth sacrificing the convenience we have now.
As part of this work, the way we do "start components" has been
redone now to use a loop, as suggested in #65 (Github issue).
This should make restarting faster and more reliable.