Fedora 35 is:
- `ansible_os_family = 'RedHat'`
- `ansible_distribution_major_version = '35'`
Our RedHat checks against v7/v8 are really for RHEL derivatives (CentOS, Rockylinux,
AlmaLinux), but the same checks (by coincidence) apply for Fedora 35.
The problem is that `'35' > '7'` (comparing these as strings) is
`false`.
This patch makes sure that we always cast
`ansible_distribution_major_version` to an integer.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1610
This also removes the `matrix_synapse_version_arm64` variable we've
been dragging around for a long time.
Since https://github.com/matrix-org/synapse/pull/11810, a multiarch Synapse
container image (for AMD64 and ARM64) is released at the same time.
Not hardcoding 'CentOS' and using the OS family ('RedHat') instead,
we now behave better on Rockylinux and AlmaLinux, etc.
With that said, we may or may not fully support CentOS/Rockylinux/AlmaLinux v8 yet.
Certain things were improved in
https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/300.
v8 support is discussed here: https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/300
Certain things (firewalld?) may still be problematic. This patch does not try to address those.
If the remaining issues are confirmed to be fixed in the future, we can mark v8 as supported.
Reverts b1b4ba501f, 90c9801c56, a3c84f78ca, ..
I haven't really traced it (yet), but on some servers, I'm observing
`ansible-playbook ... --tags=start` completing very slowly, waiting
to stop services. I can't reproduce this on all Matrix servers I manage.
I suspect that either the systemd version is to blame or that some
specific service is not responding well to some `docker kill/rm` command.
`ExecStop` seems to work great in all cases and it's what we've been
using for a very long time, so I'm reverting to that.
1.0.2 is the first container image tag that is available as a multi-arch image
with support for linux/amd64, linux/arm64/v8 (arm64) and linux/arm/v7 (arm32),
so self-building is no longer necessary on all these platforms.
4.95-r0-1 was problematic, because `/etc/exim/exim.conf` in the
container had the wrong permissions (writable by the `exim` user).
Fixed in 697f3cff7e
which is built as 4.95-r0-2
4.95-r0-1 is the first container image tag that is available as a multi-arch image
with support for linux/amd64, linux/arm64/v8 (arm64) and linux/arm/v7 (arm32),
so self-building is no longer necessary on all these platforms.
2.2.3 is the first container image tag that is available as a multi-arch image
with support for linux/amd64, linux/arm64/v8 (arm64) and linux/arm/v7 (arm32),
so self-building is no longer necessary on all these platforms.
The OAuth credentials method seems to be the only viable way to
configure the mx-puppet-bridge now. Legacy tokens can no longer be
created, and the other methods (xoxs and xoxc tokens) come with warnings
about them being against Slack's terms of service.
v1.50.0 was found to be buggy for people using a `webclient` listener.
This is fixed in v1.50.1.
We don't use such a listener, so we weren't affected anyway.
Also get rid of `--tags=update-user-password` in the
`matrix-dendrite` role, as what we had doesn't work.
We may be able to do it with some Ansible helper or something else.
For now, we'll omit this feature.
All the `matrix_s3_*` stuff happens in the `matrix-synapse` role.
If we are to have such S3 support for Dendrite, we should probably
extract it out of the `matrix-synapse` role (into a `matrix-s3` role or
`matrix-goofys`, etc.) and wire `matrix-dendrite` accordingly.
This may or may not be done in the future though. For now, I'm
cleaning things up in the `matrix-dendrite` role.
Seems like Dendrite encourages serving both the Client and Federation
API at the same port.
Coming from Synapse and how things are done there, we have separate
ports. Using separate ports probably makes matrix-corporal (etc.)
integration easier, so separating the APIs by default probably makes
sense.
The goal is to have a single variable which tells us which homeserver
software is in use. Much simpler than having if/elif/elif checks for
variables like (`matrix_synapse_enabled` and `matrix_dendrite_enabled`, etc.)
everywhere.
As of docker-jitsi-meet stable-6433 [1], `/config/interface_config.js`
is regenerated on every boot. The correct way to modify the interface
config is now via `/config/custom-interface_config.js`, which is
appended to a default copy of `interface_config.js` by
`/etc/cont-init.d/10-config` on every boot of the docker image.
Given that `interface_config.js` is considered deprecated by upstream
(all options will eventually be moved to `config.js`), we also deprecate
the `matrix_jitsi_web_interface_config_*` variables in favour of
`matrix_jitsi_web_custom_interface_config_extension`.
[1] https://github.com/jitsi/docker-jitsi-meet/blob/stable-6433/CHANGELOG.md#stable-6433
Enables optional access to Etherpad's web-UI. This is useful for
managing Etherpad plugins.
Among other things, plugins makes it easy to manage/delete pads if you
install the adminpads2 plugin.
This upgrade is technically not needed due to 1.49.1 and 1.49.2 being identical with a lone fix to Debian packaging being the only change.
Still some might want us to be on the absolutely latest version even tho these 2 are practically identical.
ARM64 has yet to be built so this has to wait for that before merge.
Related to https://github.com/matrix-org/synapse/issues/11604
Getting an upstream fix is preferable. In any case, it's probably nice
to have this defined explicitly in our configuration. This way, people
can more easily discover that they can override the URL preview
language.
Updates based off the variable names used in mautrix-facebook role.
Also update port number in defauts/main.yml, and disable presence
checking, because Twitter doesn't support that.
* Upgrade Mjolnir from 1.1.20 to version 1.2.1
https://hub.docker.com/r/matrixdotorg/mjolnir/tags
using the "latest" tag seems inefficient as it doesn't actually redirect to the latest release
In any case, the latest release is now 1.2.1
docker pull matrixdotorg/mjolnir:v1.2.1
* Fixup
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
The check was checking for an empty string in `matrix_jitsi_prosody_auth_internal_accounts`,
which is unlikely to happen. We should check for an empty list instead.
The check was not validating username/password values, so telling the user that they need a non-empty
username/password is misleading. It was merely checking if there's at least one entry in the list.
This patch adjusts the check and message accordingly.
Was renamed in 087dbe4ddc
It is unclear to me if there is anything you actually need to adjust with these variables. It looks like that is done automatically in `matrix_servers`.
We had to remove UID/GID environment variables that we used to pass
to the Synapse container, because it was causing a problem after
https://github.com/matrix-org/synapse/pull/11209
We were using both `--user` and UID/GID environment variables until now.
Until now, we were leaving services "enabled"
(symlinks in /etc/systemd/system/multi-user.target.wants/).
We clean these up now. Broken symlinks may still exist in older
installations that enabled/disabled services. We're not taking care
to fix these up. It's just a cosmetic defect anyway.
We were applying the low-memory system patch to webpack.config.js
on systems with < 4GB memory.
From now on, we also let people force-enable patching by toggling the
`matrix_client_element_container_image_self_build_low_memory_system_patch_enabled`
variable.
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1357
In short, the problem is that older Postgres versions store passwords
hashed as md5. When you dump such a database, the dump naturally also
contains md5-hashed passwords.
Restoring from that dump used to create users and updates their passwords
with these md5 hashes.
However, Postgres v14 prefers does not like md5-hashed passwords now (by default),
which breaks connectivity. Postgres v14 prefers `scram-sha-256` for
authentication.
Our solution is to just ignore setting passwords (`ALTER ROLE ..`
statements) when restoring dumps. We don't need to set passwords as
defined in the dump anyway, because the playbook creates users
and manages their passwords by itself.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1340
#1192 lead to the following error for me on Archlinux:
`TASK [matrix-base : Install host dependencies] *******************************************************************************************************************************
fatal: [matrix.***.de]: FAILED! => changed=false
msg: |-
failed to install systemd-timesyncd: error: target not found: systemd-timesyncd`
There is no package called `systemd-timesyncd` on Archlinux. The service is installed with the [`systemd`](https://archlinux.org/packages/core/x86_64/systemd/) package itself.
I suggest removing the `systemd-timesyncd` from 2453876eb9/roles/matrix-base/tasks/server_base/setup_archlinux.yml (L7)
* add timeout param for nginx proxy
default value matrix_nginx_proxy_request_timeout is 60s
* default matrix_nginx_proxy_request_timeout - 60s
* few more variables for request timeout
* Update nginx.conf.j2
* Update nginx.conf.j2
This bridge doesn't support SQLite anyway, so it's not necessary
to carry around configuration fields and code for migration from SQLite
to Postgres. There's nothing to migrate.
This configuration is supposed to be kept clean and not reference variables defined in other roles.
`group_vars/matrix_servers` redefines these to hook our various roles together.
The Github link is just a redirect to Tulir's own GitLab, so I replaced the self-build link
The docker container repository was rearranged hierarchically (dock.mau.dev/tulir/mautrix-facebook -> dock.mau.dev/mautrix/facebook)
Tagged versions have been made available, thus :latest -> :v0.3.1
Updated settings in template file:
* relay for any user
* user permissions only for HS domain users
Co-authored-by: Jan <31133207+Jaffex@users.noreply.github.com>
Permissions, when set in the template, will be augmented rahter than replaced when using matrix_mautrix_signal_configuration_extension_yaml. Therefore, permissions shall only be set in the defaults/vars.yml or in the HS specific vars.yml file
Per default relay bot functionality is disabled; the bridge user permissions depends on the relay bot, if enabled the base domain users are on level relay, else remain on user;
I changed the conditional statement in prosody systemd template to bind the localhost port by default if people have set ```matrix_nginx_proxy_enabled == false ```.
Hopefully that should make it the default behaviour now.
Jitsi-meet enabled websockets by default, claiming better reliability.
Matrix-nginx-proxy configuration has been set up according to the
Prosody documentation: https://prosody.im/doc/websocket
**HSTS Preloading:**
In its strongest and recommended form, the [HSTS policy](https://www.chromium.org/hsts) includes all subdomains, and indicates a willingness to be “preloaded” into browsers:
`Strict-Transport-Security: max-age=31536000; includeSubDomains; preload`
**X-Xss-Protection:**
`1; mode=block` which tells the browser to block the response if it detects an attack rather than sanitising the script.
Expected to have regressed after https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/1008
This patch comes with its own downsides (as described in the comments
for matrix_prometheus_node_exporter_container_http_host_bind_port),
but at least there's:
- no security issue
- metrics remain readable from matrix-prometheus (even if the network metrics are inaccurate)
A better patch is certainly welcome.
These old versions of TLS rely on MD5 and SHA-1, both now broken, and contain other flaws. TLS 1.0 is no longer PCI-DSS compliant and the TLS working group has adopted a document to deprecate TLS 1.0 and TLS 1.1.
See these last commits:
tulir/mautrix-signal@4fc34330c1f6947aece67863b0d04da34c776f80
tulir/mautrix-signal@64bc5c36a509ba435a0b01cf44afb1b5d2642efd
tulir/mautrix-signal@ddda1666d41d28750cc59d070e4388b24add6ad9
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/963
This also simplifies Prerequisites, which is great.
It'd be nice if we were doing these checks in some optional manner
and reporting them as helpful messages (using
`matrix_playbook_runtime_results`), but that's more complicated.
I'd rather drop these checks completely.
This variable was previously undefined in the role and was only getting
defined via `group_vars/matrix_servers`.
We now properly initialize it (and its good default value) in the role
itself.
Fixes a regression caused by a5ee39266c.
If the user id and group id were different than 991:991
(which used to be a hardcoded default for us long ago),
there was a mismatch between what Synapse was trying to use (991:991)
and what it was actually started with (in `--user=..`). It was then
trying to change ownership, which was failing.
This was mostly affecting newer installations which were not using the
991:991 defaults we had long ago (since a1c5a197a9).
We're talking about a webserver running on the same machine, which
imports the configuration files generated by the `matrix-nginx-proxy`
in the `/matrix/nginx-proxy/conf.d` directory.
Users who run an nginx webserver on some other machine will need to do
something different.
This give us the possibility to run multiple instances of
workers that that don't expose a port.
Right now, we don't support that, but in the future we could
run multiple `federation_sender` or `pusher` workers, without
them fighting over naming (previously, they'd all be named
something like `matrix-synapse-worker-pusher-0`, because
they'd all define `port` as `0`).
This leads to much easier management and potential safety
features (validation). In the future, we could try to avoid port
conflicts as well, but it didn't seem worth the effort to do it now.
Our port ranges seem large enough.
This can also pave the way for a "presets" feature
(similar to `matrix_nginx_proxy_ssl_presets`) which makes it even easier
for people to configure worker counts.
The quotes around "host" for both `--pid` and `--net` were
causing trouble for me:
> docker: --pid: invalid PID mode.
and:
> docker: Error response from daemon: network "host" not found.
I've also changed the `-v` call to `--mount` for consistency with the
rest of the playbook.
Also includes the dashboards for Synapse and for Node Exporter.
Again has only been tested on debian amd64 so far, but the grafana docker image is available for arm64 and arm32. Nice.
Basic system stats, to show stuff the synapse metrics
can't show such as resource usage by bridges, etc
Seems to work fine as well.
This too has only been tested on debian amd64 so far
I felt that adding another variable was probably going to be the easiest way to do this. I may end up adding another variable to enable this feature, for consistency with some of the other things.
This passes any arguments given to 'matrix-postgres-cli' to the 'psql' command.
Examples:
$ # start an interactive shell connected to a given db
$ sudo matrix-postgres-cli -d synapse
$ # run a query, non-interactively
$ sudo matrix-postgres-cli -d synapse -c 'SELECT group_id FROM groups;'
If they do, our next playbook runs would simply revert it
and report "changed" for that task.
There's no benefit to letting the bridge spew a new config file.
This does not apply to the mautrix whatsapp bridge, because that one
is written in Go (not Python) and takes different flags. There's no
equivalent flag there.
Fixes a regression introduced in f6097fbba1, which was cauing Synapse
to die with this error message:
> ValueError: sender_localpart needs characters which are not URL encoded.
These are just defensive cleanup tasks that we run.
In the good case, there's nothing to kill or remove, so they trigger an
error like this:
> Error response from daemon: Cannot kill container: something: No such container: something
and:
> Error: No such container: something
People often ask us if this is a problem, so instead of always having to
answer with "no, this is to be expected", we'd rather eliminate it now
and make logs cleaner.
In the event that:
- a container is really stuck and needs cleanup using kill/rm
- and cleanup fails, and we fail to report it because of error
suppression (`2>/dev/null`)
.. we'd still get an error when launching ("container name already in use .."),
so it shouldn't be too hard to investigate.
Not specifying bind addresses for the worker resulted in this warning:
> synapse.app - 47 - WARNING - None - Failed to listen on 0.0.0.0, continuing because listening on [::]
Additionally, metrics listening only on 127.0.0.1 seems like a no-op.
Only having it accessible from within the container is likely not what
we intend. Changed that to all interfaces as well.
Whether it actually gets exposed or not depends on the systemd service
and `matrix_synapse_workers_container_host_bind_address`.
This switches the `docker exec` method of spawning
Synapse workers inside the `matrix-synapse` container with
dedicated containers for each worker.
We also have dedicated systemd services for each worker,
so this are now:
- more consistent with everything else (we don't use systemd
instantiated services anywhere)
- we don't need the "parse systemd instance name into worker name +
port" part
- we don't need to keep track of PIDs manually
- we don't need jq (less depenendencies)
- workers dying would be restarted by systemd correctly, like any other
service
- `docker ps` shows each worker separately and we can observe resource
usage
We do this by creating one more layer of indirection.
First we reach some generic vhost handling matrix.DOMAIN.
A bunch of override rules are added there (capturing traffic to send to
ma1sd, etc). nginx-status and similar generic things also live there.
We then proxy to the homeserver on some other vhost (only Synapse being
available right now, but repointing this to Dendrite or other will be
possible in the future).
Then that homeserver-specific vhost does its thing to proxy to the
homeserver. It may or may not use workers, etc.
Without matrix-corporal, the flow is now:
1. matrix.DOMAIN (matrix-nginx-proxy/matrix-domain.conf)
2. matrix-nginx-proxy/matrix-synapse.conf
3. matrix-synapse
With matrix-corporal enabled, it becomes:
1. matrix.DOMAIN (matrix-nginx-proxy/matrix-domain.conf)
2. matrix-corporal
3. matrix-nginx-proxy/matrix-synapse.conf
4. matrix-synapse
(matrix-corporal gets injected at step 2).
This removes some `multi-target.wants` symlinks as well, etc.
But despite systemd saying:
> Removed symlink /etc/systemd/system/matrix-synapse.service.wants/matrix-synapse-worker@appservice:0.service
.. I still see such symlinks tehre for me for some reason, so keeping the
code (below) to find & delete them still seems like a good idea.
There was a `matrix_nginx_proxy_enabled|default(False)` check, but:
- it didn't seem to work reliably for some reason (hmm)
- referring to a `matrix_nginx_proxy_*` variable from within the
`matrix-synapse` role is not ideal
- exposing always happened on `127.0.0.1`, which may not be good enough
for some rarer setups (where the own webserver is external to the host)
I guess it didn't hurt to do it until now, but it's not great serving
federation APIs on the client-server API port, etc.
matrix-corporal doesn't work yet (still something to be solved in the
future), but its firewalling operations will also be sabotaged
by Client-Server APIs being served on the federation port (it's a way to get around its firewalling).
If we load it at runtime, during matrix-synapse role execution,
it's good enough for matrix-synapse and all roles after that,
but.. it breaks when someone uses `--tags=setup-nginx-proxy` alone.
The downside of including this vars file like this in `setup.yml`
is that the variables contained in it cannot be overriden by the user
(in their inventory's `vars.yml`).
... but it's not like overriding these variables was possible anyway
when including them at runtime.
Some people run Coturn or Jitsi, etc., by themselves and disable it
in the playbook.
Because the playbook is trying to be nice and clean up after itself,
it was deleting these Docker images.
However, people wish to pull and use them separately and would rather
they don't get deleted.
We could make this configurable for the sake of this special case, but
it's simpler to just avoid deleting these images.
It's not like this "cleaning things up" thing works anyway.
As time goes on, the playbook gets updated with newer image tags
and we leave so many images behind. If one doesn't run
`docker system prune -a` manually once in a while, they'd get swamped
with images anyway. Whether we leave a few images behind due to the lack
of this cleanup now is pretty much irrelevant.
We log everything in systemd/journald for every service already,
so there's no need for double-logging, bridges rotating log files
manually and other such nonsense.
In short, this makes Synapse a 2nd class citizen,
preparing for a future where it's just one-of-many homeserver software
options.
We also no longer have a default Postgres superuser password,
which improves security.
The changelog explains more as to why this was done
and how to proceed from here.
I had intentionally held it back in 39ea3496a4
until:
- it received more testing (there were a few bugs during the
migration, but now it seems OK)
- this migration guide was written
While administering we will occasionally invoke this script interactively with the "non-interactive" switch still there, yet still sit at the desk waiting for 300 seconds for this timer to run out.
The systemd-timer already uses a 3h randomized delay for automatic renewals, which serves this purpose well.
The `mobile` branch got merged to `master`, which ends up becoming
`:latest`. It's a "rewrite" of the bridge's backend and only
supports a Postgres database.
We'd like to go back (well, forward) to `:latest`, but that will take
a little longer, because:
- we need to handle and document things for people still on SQLite
(especially those with external Postgres, who are likely on SQLite for
bridges)
- I'd rather test the new builds (and migration) a bit before
releasing it to others and possibly breaking their bridge
Brave ones who are already using the bridge with Postgres
can jump on `:latest` and report their experience.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/756
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/737
I feel like timers are somewhat more complicated and dirty (compared to
cronjobs), but they come with these benefits:
- log output goes to journald
- on newer systemd distros, you can see when the timer fired, when it
will fire, etc.
- we don't need to rely on cron (reducing our dependencies to just
systemd + Docker)
Cronjobs work well, but it's one more dependency that needs to be
installed. We were even asking people to install it manually
(in `docs/prerequisites.md`), which could have gone unnoticed.
Once in a while someone says "my SSL certificates didn't renew"
and it's likely because they forgot to install a cron daemon.
Switching to systemd timers means that installation is simpler
and more unified.
This reverts commit 2a25b63bb6.
Looking at other roles, we trigger building regardless of this.
It's better to always trigger it, because it's less fragile.
If the build fails and we only trigger it on "git changes"
then we won't trigger it for a while. That's not good.
Triggering it each and every time may seem like a waste,
but it supposedly runs quickly due to Docker caching.
This variable has been useless since 2019-01-08.
We probably don't need to check for its usage anymore,
given how much time has passed since then, but ..
Before we potentially clone to that path, we'd better make sure it exists.
We also simplify `when` statements a bit.
Given that we're in `setup_install.yml`, we know that the bridge is enabled,
so there's no need to check for that.
Not sure if it breaks with them or not, but no other directive
uses quotes and the nginx docs show examples without quotes,
so we're being consistent with all of that.
The different configurations are now all lower case, for consistent
naming.
`matrix_nginx_proxy_ssl_config` is now called
`matrix_nginx_proxy_ssl_preset`. The different options for "modern",
"intermediate" and "old" are stored in the main.yml file, instead of
being hardcoded in the configuration files. This will improve the
maintainability of the code.
The "custom" preset was removed. Now if one of the variables is set, it
will use it instead of the preset. This will allow to mix and match more
easily, for example using all the intermediate options but only
supporting TLSv1.2. This will also provide better backward
compatibility.
While it's kind of nice having it, it's also somewhat raw
and unnecessary.
Having a good default and not even mentioning it seems better
for most users.
People who need a more exposed bridge (rare) can use
override the default configuration using
`matrix_mautrix_signal_configuration_extension_yaml`.
The answer to these is: it's good to have them in both places.
The role defines the obvious things it depends on (not knowing
what setup it will find itself into), and then
`group_vars/matrix_servers` "extends" it based on everything else it
knows (the homeserver being Synapse, whether or not the internal
Postgres server is being used, etc.)
We need to suppress systemd service-stopping requests in certain rare
cases like https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/771
That issue seems to describe a case, where a migration from mxisd to
ma1sd was happening (DB files had just been moved), and then we were
attemping to stop `matrix-ma1sd.service` so we could import that database into
Postgres. However, there's neither `matrix-mxisd.service`, nor
`matrix-ma1sd.service` after `migrate_mxisd.yml` had just run, so
stopping `matrix-ma1sd.service` was failing.
Fixes a problem like this:
> File "/usr/lib/python3.8/site-packages/mautrix/bridge/e2ee.py", line 79, in __init__
> raise RuntimeError("Unsupported database scheme")
mautrix-python's e2ee.py module expects to find `postgres://` instead of
`postgresql://`.
Our old (base-path -> data-path) SQLite migration can't work otherwise.
It's probably not necessary to keep it anymore, but since we still do,
at least we should take care to ensure it works.
Raspbian doesn't seem to support arm64, so this is somewhat pointless
right now.
However, they might in the future. Doing this should also unify us
some more with `setup_debian.yml` with the ultimate goal of
eliminating `setup_raspbian.yml`.
Until now, we've only supported non-amd64 on Raspbian.
Seems like there are now people running Debian/Ubuntu on ARM,
so we were forcing them into amd64 Docker packages.
I've gotten a report that this change fixes support
for Ubuntu Server 20.04 on RPi 4B.
A new variable called `matrix_nginx_proxy_ssl_config` is created for
configuring how the nginx proxy configures SSL. Also a new configuration
validation option and other auxiliary variables are created.
A new variable configuration called `matrix_nginx_proxy_ssl_config` is
created. This allow to set the SSL configuration easily using the
default options proposed by Mozilla. The default configuration is set to
"Intermediate", removing the weak ciphers used in the old
configurations.
The new variable can also be set to "Custom" for a more granular control.
This allows to set another three variables called:
- `matrix_nginx_proxy_ssl_protocols`,
- `matrix_nginx_proxy_ssl_prefer_server_ciphers`
- `matrix_nginx_proxy_ssl_ciphers`
Also a new task is added to validate the SSL configuration variable.
Revert "Correct inabillity for appservice-discord to connect"
This reverts commit 673e19f830.
While certain things do work even with such a local URL, sending
messages leads to an error like this:
> [DiscordBot] verbose: DiscordAPIError: Invalid Form Body
> avatar_url: Not a well formed URL.
Fixes https://github.com/Half-Shot/matrix-appservice-discord/issues/649
The sample configuration file for appservice-discord
c29cfc72f5/config/config.sample.yaml (L8)
explicitly says that we need a public URL.
Now that 0.7.2 is out, the Docker image supports Postgres
and we can do the (SQLite -> Postgres) migration.
I've also found out that we needed to fix up the `tokens.ex_date` column
data type a bit to prevent matrix-registration from raising exceptions
when comparing `datetime.now()` with `ex_date` coming from the database.
Example:
> File "/usr/local/lib/python3.8/site-packages/matrix_registration/tokens.py", line 58, in valid
> expired = self.ex_date < datetime.now()
> TypeError: can't compare offset-naive and offset-aware datetimes
In cases where pgloader is not enough and we need to do some additional
migration work after it, we can now use
`additional_psql_statements_list` and
`additional_psql_statements_db_name`.
This is to be used when migrating `matrix-registration`'s data at the
very least.
This switches us to a container image maintained by the
matrix-registration developer.
0.7.2 also supports a `base_url` configuration option we can use to
make it easier to reverse-proxy at a different base URL.
We still keep some workarounds, because of this issue:
https://github.com/ZerataX/matrix-registration/issues/47
We were running into conflicts, because having initialized
the roles (users) and databases, trying to import leads to
errors (role XXX already exists, etc.).
We were previously ignoring the Synapse database (`homeserver`)
when upgrading/importing, because that one gets created by default
whenever the container starts.
For our additional databases, it's a similar situation now.
It's not created by default as soon as Postgres starts with an empty
database, but rather we create it as part of running the playbook.
So we either need to skip those role/database creation statements
while upgrading/importing, or to avoid creating the additional database
and rely on the import for that. I've gone for the former, because
it's already similar to what we were doing and it's simpler
(it lets `setup_postgres.yml` be the same in all scenarios).
Auto-migration and everything seems to work. It's just that
matrix-registration cannot load the Python modules required
for talking to a Postgres database.
Tracked here: https://github.com/ZerataX/matrix-registration/issues/44
Until this gets fixed, we'll continue default to 'sqlite'.
I was thinking that it makes sense to be more specific,
and using `_postgres_` also separated these variables
from the `_database_` variables that ended up in bridge configuration.
However, @jdreichmann makes a good point
(https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/740#discussion_r542281102)
that we don't need to be so specific and can allow for other engines (like MySQL) to use these variables.
Regression since 2d99ade72f and 9bf8ce878e, respectively.
When SQLite is to be used, these bridges expect an `sqlite://`
connection string, and not a plain file name (path), like Appservice
Discord and mautrix-whatsapp do.
Instead of passing the connection string, we can now pass a name of a
variable, which contains a connection string.
Both are supported for having extra flexibility.
Since we'll likely have generic SQLite database importing
via [pgloader](https://pgloader.io/) for migrating bridge
databases from SQLite to Postgres, we'd rather avoid
calling the "import Synapse SQLite database" command
as just `--tags=import-sqlite-db`.
Similarly, for the media store, we'd like to mention that it's
related to Synapse as well.
We'd like to be more explicit, so as to be less confusing,
especially in light of other homeserver implementations
coming in the future.
People can toggle between them now. The playbook also defaults
to using SQLite if an external Postgres server is used.
Ideally, we'd be able to create databases/users in external Postgres
servers as well, but our initialization logic (and `docker run` command,
etc.) hardcode too many things right now.
While these modules are really nice and helpful, we can't use them
for at least 2 reasons:
- for us, Postgres runs in a container on a private Docker network
(`--network=matrix`) without usually being exposed to the host.
These modules execute on the host so they won't be able to reach it.
- these modules require `psycopg2`, so we need to install it before
using it. This might or might not be its own can of worms.
The tasks in `create_additional_databases.yml` will likely
ensure `matrix-postgres.service` is started, etc.
If no additional databases are defined, we'd rather not execute that
file and all these tasks that it may do in the future.
> Invalid data passed to 'loop', it requires a list, got this instead: matrix_postgres_additional_databases. Hint: If you passed a list/dict of just one element, try adding wantlist=True to your lookup invocation or use q/query instead of lookup.
Well, or working around it, as I've done in this commit (which seems
more sane than `wantlist=True` stuff).
To avoid needing to have `jq` installed on the machine, we could:
- try to run jq in a Docker container using some small image providing
that
- better yet, avoid `jq` altogether
Moving it above the "uninstalling" set of tasks is better.
Extracting it out to another file at the same time, for readability,
especially given that it will probably have to become more complex in
the future (potentially installing `jq`, etc.)
v0.7.0 is broken right now, because it calls
`/_matrix/client/r0/admin/register`, which is now at
`/_synapse/admin/v1/register`.
This has been fixed here: 6b26255fea
.. but it's not part of any release.
Switching to `master` (`docker.io/devture/zeratax-matrix-registration:latest`) until it gets resolved.
Reported upstream here: https://github.com/ZerataX/matrix-registration/issues/43
Starting with Docker 20.10, `--hostname` seems to have the side-effect
of making Docker's internal DNS server resolve said hostname to the IP
address of the container.
Because we were giving the mailer service a hostname of `matrix.DOMAIN`,
all requests destined for `matrix.DOMAIN` originating from other
services on the container network were resolving to `matrix-mailer`.
This is obviously wrong.
Initially reported here: https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/748
We normally try to not use the public hostname (and IP address) on the
container network and try to make services talk to one another locally,
but it sometimes could happen.
With this, we use a `matrix-mailer` hostname for the matrix-mailer
container. My testing shows that it doesn't cause any trouble with
email deliverability.
If a service is enabled, a database for it is created in postgres with a uniqque password. The service can then use this database for data storage instead of relying on sqlite.
The Docker 19.04 -> 20.10 upgrade contains the following change
in `/usr/lib/systemd/system/docker.service`:
```
-BindsTo=containerd.service
-After=network-online.target firewalld.service containerd.service
+After=network-online.target firewalld.service containerd.service multi-user.target
-Requires=docker.socket
+Requires=docker.socket containerd.service
Wants=network-online.target
```
The `multi-user.target` requirement in `After` seems to be in conflict
with our `WantedBy=multi-user.target` and `After=docker.service` /
`Requires=docker.service` definitions, causing the following error on
startup for all of our systemd services:
> Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start
A workaround which appears to work is to add `DefaultDependencies=no`
to all of our services.
After recently updating my matrix-docker-ansible-deploy installation, matrix-appservice-discord would refuse to start, logging ECONNREFUSED to https://matrix.[mydomain]:443, which was resolving to 172.18.0.2 due to the `--hostname` in mailer grabbing that hostname.
Curious why the IRC bridge didn't have this issue, I looked into it, and it was connecting to `http://matrix-synapse:8008`. Correcting this one to that URL resolved the issue.
ma1sd requires the openid endpoints for certain functionality.
Example: 90b2b5301c/src/main/java/io/kamax/mxisd/auth/AccountManager.java (L67-L99)
If federation is disabled, we still need to expose these openid APIs on the
federation port.
Previously, we were doing similar magic for Dimension.
As per its documentation, when running unfederated, one is to enable
the openid listener as well. As per their recommendation, people
are advised to do enable it on the Client-Server API port
and use the `federationUrl` variable to override where the federation
port is (making federation requests go to the Client-Server API).
Because ma1sd always uses the federation port (unless you do some
DNS overwriting magic using its configuration -- which we'd rather not
do), it's better if we just default to putting the `openid` listener
where it belongs - on the federation port.
With this commit, we retain the "automatically enable openid APIs" thing
we've been doing for Dimension, but move it to the federation port instead.
We also now do the same thing when ma1sd is enabled.
We've had a report of the `connection` value getting cut off,
supposedly because it contains something that breaks off the string.
Using `|to_json` takes care of it.
This may be a bit premature, because the bridge didn't work for me
the last time I tried it (RC3).
Some bugs have been fixed to make our config compatible with v1.0.0
though, so it may work for some people (especially those starting
fresh).
I'm not for shipping potentially broken things, but given that we were
using `docker.io/halfshot/matrix-appservice-discord:latest` and that
points to v1.0.0 already (with no other tag we can use), our setup was
already broken in any case.
Now, at least it has some chance of running.
Many people probably didn't even know this - that ansible can be
quite a bit picky about what it will be willing to work with remotely.
Thanks @maxklenk !
Some people requested that `--tags=start` not set up service autostart.
One can now do `--tags=start --extra-vars="matrix_services_autostart_enabled=false"`
to just start services ones and not set up autostarting.
It's not like it worked anyway, because we don't have the necessary
services installed for transcription (Jigasi), nor recording (Jibri).
Disabling these, should hopefully disable their related elements
in the Jitsi Web UI.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/726
This supersedes/fixes-up this Pull Request:
https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/719
The Jitsi Web and JVB containers now (in build 5142) always
start by bulding their own default configuration
(`config.js` and `sip-communicator.properties`, respectively).
The fact that we were generating these files ourselves was no longer of use,
because our configuration was thrown away in favor of the one created
by the containers on startup.
With this commit, we're completely redoing things. We no longer
generate these configuration files. We try to pass the proper
environment variables, so that Jitsi services can generate the
configuration files themselves.
Besides that, we try to use the "custom configuration" mechanism
provided by Jitsi Web and Jitsi JVB (`custom-config.js` and
`custom-sip-communicator.properties`, respectively), so that
we and our users can inject additional configuration.
Some configuration options we had are gone now. Others are no longer
controllable via variables and need to be injected using
the `_config_extension` variables that we provide.
The validation logic that is part of the role should take care
to inform people about how to upgrade (if they're using some custom
configuration, which needs special care now). Most users should not
have to do anything special though.
Since the switch from `-v` to `--mount` (in 1fca917ad1),
we've regressed when `matrix_ssl_retrieval_method == 'none'`.
In such a case, we don't create `/matrix/ssl` directories at all
and shouldn't be trying to mount them into the `matrix-nginx-proxy`
container.
Previously, with `-v`, Docker would auto-create them, effectively hiding
our mistake. Now that `--mount` doesn't do such auto-creation magic,
the `matrix-nginx-proxy` container was failing to start.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/734
`-v` magically creates the source destination as a directory,
if it doesn't exist already. We'd like to avoid this magic
and the potential breakage that it might cause.
We'd rather fail while Docker tries to find things to `--mount`
than have it automatically create directories and fail anyway,
while having contaminated the filesystem.
There's a lot more `-v` instances remaining to be fixed later on.
This is just some start.
Things like `matrix_synapse_container_additional_volumes` and
`matrix_nginx_proxy_container_additional_volumes` were not changed to
use `--mount`, as options for each one are passed differently
(`ro` is `ro`, but `rw` doesn't exist and `slave` is `bind-propagation=slave`).
To avoid breaking people's custom volume mounts, we keep it as it is for now.
A deficiency with `--mount` is that it lacks the `z` option (SELinux
ownership changes), and some of our `-v` instances use that. I'm not
sure how supported SELinux is for us right now, but it might be,
and breaking that would not be a good idea.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/716
This patch makes us use more fully-qualified container image names
(either prefixed with docker.io/ or with localhost/).
The latter happens when self-building is enabled.
We've recently had issues where if an image was removed manually
and the service was restarted (making `docker run` fetch it from Docker Hub, etc.),
we'd end up with a pulled image, even though we're aiming for a self-built one.
Re-running the playbook would then not do a rebuild, because:
- the image with that name already exists (even though it's something
else)
- we sometimes had conditional logic where we'd build only if the git
repo changed
By explicitly changing the name of the images (prefixing with localhost/),
we avoid such confusion and the possibility that we'd automatically pul something
which is not what we expect.
Also, I've removed that condition where building would happen on git
changes only. We now always build (unless an image with that name
already exists). We just force-build when the git repo changes.
We'd like the roles to be self-contained (as much as possible).
Thus, the `matrix-nginx-proxy` shouldn't reference any variables from
other roles. Instead, we rely on injection via
`group_vars/matrix_servers`.
Related to #681 (Github Pull Request)
Having it unset in the role itself (while referencign it) is a little strange.
Now people can look at the `roles/matrix-dynamic-dns/defaults/main.yml`
file and figure out everything that's necessary to run the role.
Related to #681 (Github Pull Request)
This broke in 63a49bb2dc.
Proxying the OpenID Connect endpoints is now possible,
but needs to be enabled explicitly now.
Supersedes #702 (Github Pull Request).
This patch builds up on the idea from that Pull Request,
but does things in a cleaner way.
We do this to match Synapse's new default "max_upload_size" (50MB).
This `matrix_nginx_proxy_proxy_matrix_client_api_client_max_body_size_mb`
default value only affects standalone usage of the `matrix-nginx-proxy`
role. When the role is used in the context of the playbook,
the value is dynamically assigned from `group_vars/matrix_servers`.
Somewhat related to #692 (Github Issue).
The regex introduced in 63a49bb2dc seems to take precedence
over the bare location blocks, causing a regression.
> It is important to understand that, by default, Nginx will serve regular expression matches in preference to prefix matches.
> However, it evaluates prefix locations first, allowing for the administer to override this tendency by specifying locations using the = and ^~ modifiers.
Source: https://www.digitalocean.com/community/tutorials/understanding-nginx-server-and-location-block-selection-algorithms
also, worker.yaml.j2:
- hone worker_name
- remove worker_pid_file entry (would only be used if worker_daemonize
set to true; also, synapse only knows about the container namespace
and thus can not provide the required host-view PID)