I was thinking that it makes sense to be more specific,
and using `_postgres_` also separated these variables
from the `_database_` variables that ended up in bridge configuration.
However, @jdreichmann makes a good point
(https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/740#discussion_r542281102)
that we don't need to be so specific and can allow for other engines (like MySQL) to use these variables.
Regression since 2d99ade72f and 9bf8ce878e, respectively.
When SQLite is to be used, these bridges expect an `sqlite://`
connection string, and not a plain file name (path), like Appservice
Discord and mautrix-whatsapp do.
Instead of passing the connection string, we can now pass a name of a
variable, which contains a connection string.
Both are supported for having extra flexibility.
Since we'll likely have generic SQLite database importing
via [pgloader](https://pgloader.io/) for migrating bridge
databases from SQLite to Postgres, we'd rather avoid
calling the "import Synapse SQLite database" command
as just `--tags=import-sqlite-db`.
Similarly, for the media store, we'd like to mention that it's
related to Synapse as well.
We'd like to be more explicit, so as to be less confusing,
especially in light of other homeserver implementations
coming in the future.
People can toggle between them now. The playbook also defaults
to using SQLite if an external Postgres server is used.
Ideally, we'd be able to create databases/users in external Postgres
servers as well, but our initialization logic (and `docker run` command,
etc.) hardcode too many things right now.
While these modules are really nice and helpful, we can't use them
for at least 2 reasons:
- for us, Postgres runs in a container on a private Docker network
(`--network=matrix`) without usually being exposed to the host.
These modules execute on the host so they won't be able to reach it.
- these modules require `psycopg2`, so we need to install it before
using it. This might or might not be its own can of worms.
The tasks in `create_additional_databases.yml` will likely
ensure `matrix-postgres.service` is started, etc.
If no additional databases are defined, we'd rather not execute that
file and all these tasks that it may do in the future.
> Invalid data passed to 'loop', it requires a list, got this instead: matrix_postgres_additional_databases. Hint: If you passed a list/dict of just one element, try adding wantlist=True to your lookup invocation or use q/query instead of lookup.
Well, or working around it, as I've done in this commit (which seems
more sane than `wantlist=True` stuff).
To avoid needing to have `jq` installed on the machine, we could:
- try to run jq in a Docker container using some small image providing
that
- better yet, avoid `jq` altogether
Moving it above the "uninstalling" set of tasks is better.
Extracting it out to another file at the same time, for readability,
especially given that it will probably have to become more complex in
the future (potentially installing `jq`, etc.)
v0.7.0 is broken right now, because it calls
`/_matrix/client/r0/admin/register`, which is now at
`/_synapse/admin/v1/register`.
This has been fixed here: 6b26255fea
.. but it's not part of any release.
Switching to `master` (`docker.io/devture/zeratax-matrix-registration:latest`) until it gets resolved.
Reported upstream here: https://github.com/ZerataX/matrix-registration/issues/43
Starting with Docker 20.10, `--hostname` seems to have the side-effect
of making Docker's internal DNS server resolve said hostname to the IP
address of the container.
Because we were giving the mailer service a hostname of `matrix.DOMAIN`,
all requests destined for `matrix.DOMAIN` originating from other
services on the container network were resolving to `matrix-mailer`.
This is obviously wrong.
Initially reported here: https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/748
We normally try to not use the public hostname (and IP address) on the
container network and try to make services talk to one another locally,
but it sometimes could happen.
With this, we use a `matrix-mailer` hostname for the matrix-mailer
container. My testing shows that it doesn't cause any trouble with
email deliverability.
If a service is enabled, a database for it is created in postgres with a uniqque password. The service can then use this database for data storage instead of relying on sqlite.
The Docker 19.04 -> 20.10 upgrade contains the following change
in `/usr/lib/systemd/system/docker.service`:
```
-BindsTo=containerd.service
-After=network-online.target firewalld.service containerd.service
+After=network-online.target firewalld.service containerd.service multi-user.target
-Requires=docker.socket
+Requires=docker.socket containerd.service
Wants=network-online.target
```
The `multi-user.target` requirement in `After` seems to be in conflict
with our `WantedBy=multi-user.target` and `After=docker.service` /
`Requires=docker.service` definitions, causing the following error on
startup for all of our systemd services:
> Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start
A workaround which appears to work is to add `DefaultDependencies=no`
to all of our services.
After recently updating my matrix-docker-ansible-deploy installation, matrix-appservice-discord would refuse to start, logging ECONNREFUSED to https://matrix.[mydomain]:443, which was resolving to 172.18.0.2 due to the `--hostname` in mailer grabbing that hostname.
Curious why the IRC bridge didn't have this issue, I looked into it, and it was connecting to `http://matrix-synapse:8008`. Correcting this one to that URL resolved the issue.
ma1sd requires the openid endpoints for certain functionality.
Example: 90b2b5301c/src/main/java/io/kamax/mxisd/auth/AccountManager.java (L67-L99)
If federation is disabled, we still need to expose these openid APIs on the
federation port.
Previously, we were doing similar magic for Dimension.
As per its documentation, when running unfederated, one is to enable
the openid listener as well. As per their recommendation, people
are advised to do enable it on the Client-Server API port
and use the `federationUrl` variable to override where the federation
port is (making federation requests go to the Client-Server API).
Because ma1sd always uses the federation port (unless you do some
DNS overwriting magic using its configuration -- which we'd rather not
do), it's better if we just default to putting the `openid` listener
where it belongs - on the federation port.
With this commit, we retain the "automatically enable openid APIs" thing
we've been doing for Dimension, but move it to the federation port instead.
We also now do the same thing when ma1sd is enabled.
We've had a report of the `connection` value getting cut off,
supposedly because it contains something that breaks off the string.
Using `|to_json` takes care of it.
This may be a bit premature, because the bridge didn't work for me
the last time I tried it (RC3).
Some bugs have been fixed to make our config compatible with v1.0.0
though, so it may work for some people (especially those starting
fresh).
I'm not for shipping potentially broken things, but given that we were
using `docker.io/halfshot/matrix-appservice-discord:latest` and that
points to v1.0.0 already (with no other tag we can use), our setup was
already broken in any case.
Now, at least it has some chance of running.
Many people probably didn't even know this - that ansible can be
quite a bit picky about what it will be willing to work with remotely.
Thanks @maxklenk !
Some people requested that `--tags=start` not set up service autostart.
One can now do `--tags=start --extra-vars="matrix_services_autostart_enabled=false"`
to just start services ones and not set up autostarting.
It's not like it worked anyway, because we don't have the necessary
services installed for transcription (Jigasi), nor recording (Jibri).
Disabling these, should hopefully disable their related elements
in the Jitsi Web UI.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/726
This supersedes/fixes-up this Pull Request:
https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/719
The Jitsi Web and JVB containers now (in build 5142) always
start by bulding their own default configuration
(`config.js` and `sip-communicator.properties`, respectively).
The fact that we were generating these files ourselves was no longer of use,
because our configuration was thrown away in favor of the one created
by the containers on startup.
With this commit, we're completely redoing things. We no longer
generate these configuration files. We try to pass the proper
environment variables, so that Jitsi services can generate the
configuration files themselves.
Besides that, we try to use the "custom configuration" mechanism
provided by Jitsi Web and Jitsi JVB (`custom-config.js` and
`custom-sip-communicator.properties`, respectively), so that
we and our users can inject additional configuration.
Some configuration options we had are gone now. Others are no longer
controllable via variables and need to be injected using
the `_config_extension` variables that we provide.
The validation logic that is part of the role should take care
to inform people about how to upgrade (if they're using some custom
configuration, which needs special care now). Most users should not
have to do anything special though.
Since the switch from `-v` to `--mount` (in 1fca917ad1),
we've regressed when `matrix_ssl_retrieval_method == 'none'`.
In such a case, we don't create `/matrix/ssl` directories at all
and shouldn't be trying to mount them into the `matrix-nginx-proxy`
container.
Previously, with `-v`, Docker would auto-create them, effectively hiding
our mistake. Now that `--mount` doesn't do such auto-creation magic,
the `matrix-nginx-proxy` container was failing to start.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/734
`-v` magically creates the source destination as a directory,
if it doesn't exist already. We'd like to avoid this magic
and the potential breakage that it might cause.
We'd rather fail while Docker tries to find things to `--mount`
than have it automatically create directories and fail anyway,
while having contaminated the filesystem.
There's a lot more `-v` instances remaining to be fixed later on.
This is just some start.
Things like `matrix_synapse_container_additional_volumes` and
`matrix_nginx_proxy_container_additional_volumes` were not changed to
use `--mount`, as options for each one are passed differently
(`ro` is `ro`, but `rw` doesn't exist and `slave` is `bind-propagation=slave`).
To avoid breaking people's custom volume mounts, we keep it as it is for now.
A deficiency with `--mount` is that it lacks the `z` option (SELinux
ownership changes), and some of our `-v` instances use that. I'm not
sure how supported SELinux is for us right now, but it might be,
and breaking that would not be a good idea.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/716
This patch makes us use more fully-qualified container image names
(either prefixed with docker.io/ or with localhost/).
The latter happens when self-building is enabled.
We've recently had issues where if an image was removed manually
and the service was restarted (making `docker run` fetch it from Docker Hub, etc.),
we'd end up with a pulled image, even though we're aiming for a self-built one.
Re-running the playbook would then not do a rebuild, because:
- the image with that name already exists (even though it's something
else)
- we sometimes had conditional logic where we'd build only if the git
repo changed
By explicitly changing the name of the images (prefixing with localhost/),
we avoid such confusion and the possibility that we'd automatically pul something
which is not what we expect.
Also, I've removed that condition where building would happen on git
changes only. We now always build (unless an image with that name
already exists). We just force-build when the git repo changes.
We'd like the roles to be self-contained (as much as possible).
Thus, the `matrix-nginx-proxy` shouldn't reference any variables from
other roles. Instead, we rely on injection via
`group_vars/matrix_servers`.
Related to #681 (Github Pull Request)
Having it unset in the role itself (while referencign it) is a little strange.
Now people can look at the `roles/matrix-dynamic-dns/defaults/main.yml`
file and figure out everything that's necessary to run the role.
Related to #681 (Github Pull Request)
This broke in 63a49bb2dc.
Proxying the OpenID Connect endpoints is now possible,
but needs to be enabled explicitly now.
Supersedes #702 (Github Pull Request).
This patch builds up on the idea from that Pull Request,
but does things in a cleaner way.
We do this to match Synapse's new default "max_upload_size" (50MB).
This `matrix_nginx_proxy_proxy_matrix_client_api_client_max_body_size_mb`
default value only affects standalone usage of the `matrix-nginx-proxy`
role. When the role is used in the context of the playbook,
the value is dynamically assigned from `group_vars/matrix_servers`.
Somewhat related to #692 (Github Issue).
The regex introduced in 63a49bb2dc seems to take precedence
over the bare location blocks, causing a regression.
> It is important to understand that, by default, Nginx will serve regular expression matches in preference to prefix matches.
> However, it evaluates prefix locations first, allowing for the administer to override this tendency by specifying locations using the = and ^~ modifiers.
Source: https://www.digitalocean.com/community/tutorials/understanding-nginx-server-and-location-block-selection-algorithms
also, worker.yaml.j2:
- hone worker_name
- remove worker_pid_file entry (would only be used if worker_daemonize
set to true; also, synapse only knows about the container namespace
and thus can not provide the required host-view PID)
- cherry-pick "Ensure worker config exists in systemd service (#7528)"
from synapse d74cdc1a42e8b487d74c214b1d0ca575429d546a:
"check that the worker config file exists instead of silently failing."
If the SQLite database was from an older version of Synapse, it appears
that Synapse would try to run migrations on it first, before importing.
This was failing, because the file wasn't writable.
Hopefully, this fixes the problem.
Interestingly, no one has reported this failure before #662 (Github
Issue).
It doesn't make sense to keep saying that we support such old Ansible
versions, when we're not even testing on anything close to those.
Time is also passing and such versions are getting more and more
ancient. It's time we bumped our requirements to something that is more
likely to work.
showLabsSettings is the new enableLabs I guess. enableLabs doesn't seem to do anything anymore. It had been deprecated for a while.
This PR also removes @riot-bot:matrix.org as the default welcome_user_id since it doesn't exist anymore.
We recently had a report of the Postgres backup container's log file
growing the size of /var/lib/docker until it ran out of disk space.
Trying to prevent similar problems in the future.
Certain more-minimal Debian installations may not have
lsb-release installed, which makes the playbook fail.
We need lsb-release on Debian, so that ansible_lsb
could tell us if this is Debian or Raspbian.
In #628 I proposed a CORS change that turns out not to be the root of the issue. Caffeine-addled diagnosis leads to sloppy thinking, and this change should be reverted. In fact, if left it will cause problems for new installations.
Even with the v2 updates listed in #503 and partially addressed in #614, this is still needed to enable identity services to function with Element Desktop/Web. Testing on multiple clients with a clean config has confirmed this, at least for my installation.
Fix regression since 2a50b8b6bb (#597).
Dimension is intended to be embedded in various clients,
be it the Element service that we host (at element.DOMAIN),
some other Element (element-desktop running locally), etc.
The when statement is supposed to be on the block, not on the individual task.
It affects all tasks within the block (they're all to be executed when ma1sd is enabled and self-building is requested0.
The tag format used in the `ma1sd` repo have change. Versions no longer
start with 'v', and when building for non-amd64, we also need to strip
off the '-$arch' bit from the Docker image name.
Further, when building the .jar file, `ma1sd` currently names the .jar
based on the project's directory, which we call 'docker-src'. This means
other parts of the `ma1sd` build can't find the .jar file. Remedy this
by ensuring that the dir is called `docker-src/ma1sd`.
`matrix_container_images_self_build` was not really doing anything
anymore. It previously was influencing `matrix_*_self_build` variables,
but it's no longer the case since some time ago.
Individual `matrix_*_self_build` variables are still available.
People that would like to toggle self-building for a specific component
ought to use those.
These variables are also controlled automatically (via
`group_vars/matrix_servers`) depending on `matrix_architecture`.
In other words, self-building is being done automatically for
all components when they don't have a prebuilt image for the specified
architecture. Some components only support `amd64`, while others also
have images for other architectures.
There's no change in the source code. Just a release bump for packaing
reasons. It doesn't matter much for us here, but let's be on the latest
tag anyway.
Postgres setup like
matrix_mautrix_telegram_configuration_extension_yaml: |
appservice:
database: "postgres://XXX:XXX@matrix-postgres:5432/mxtg"
will fail without the right Dockernetwork
`/etc/nginx/conf.d/default.conf` was previously causing
some issues when used with our `--user`.
It's not the case anymore, so we can remove it.
Fixes#369 (Github Issue).
Depending on the distro, common commands like sleep and chown may either
be located in /bin or /usr/bin.
Systemd added path lookup to ExecStart in v239, allowing only the
command name to be put in unit files and not the full path as
historically required. At least Ubuntu 18.04 LTS is however still on
v237 so we should maintain portability for a while longer.
the current version fails the import, because the volume for the media is missing. It still fails if you have the optional shared secret password provider is enabled, so that might need another mount. Commenting out the password provider in the hoimeserver.yaml during the run works as well.
This is mostly here to guard against problems happening
due to server migration and doing `chown -R matrix:matrix /matrix`.
Normally, the file is owned by `1000:1000`, as expected.
If ownership changes, Dimension could still start, but it will fail the
first time it tries to write to the database. Explicitly chowning
before startup guards against this.
Related to #485 and #486 (Github Pull Requests).
Also related to ccc7aaf0ce.
Dimension runs as the `node` user in the container (`1000:1000`).
It doesn't seem like we have a way around it. Thus, its configuration
must also be readable by that user (or group, in this case).
We don't really need to fail in such a spectactular way,
but it's probably good to do. It will only happen for people
who are defining their own user/group id, which is rare.
It seems like a good idea to tell them that this doesn't work
as they expect anymore and to ask them to remove these variables,
which otherwise give them a fake sense of hope.
Related to #486 (Github Pull Request).
If one runs the playbook with `--tags=setup-all`, it would have been
fine.
But running with a specific tag (e.g. `--tags=setup-riot-web`) would
have made that initialization be skipped, and the `matrix-riot-web` role
would fail, due to missing variables.
Ansible will migrate the ownership of the base path and config path, but
manual intervention will be required in order to migrate the ownership
of files in those directories (i.e. dimension.db).
Stop the services:
(local)$ ansible-playbook -i inventory/hosts setup.yml --tags=stop
Fix the permissions on the server:
(server)# chown -Rv "{{ matrix_user_username }}:{{ matrix_user_username }}" "{{ matrix_dimension_base_path }}"
which would typically look like:
(server)# chown -Rv matrix:matrix /matrix/dimension/
Reconfigure Dimension and start the services:
(local)$ ansible-playbook -i inventory/hosts setup.yml --tags=setup-dimension,start
* add permalinkPrefix to riot-web config
* add feature to change default theme of riot-web via its config file
* remove matrix_riot_web_change_default_theme and provide sane default
· 😅 How to keep this in sync with the matrix-synapse documentation?
· regex location matching is expensive
· nginx syntax limit: one location only per block / statement
· thus, lots of duplicate statements in this file
Well, actually 8cd9cde won't work, unless we put the
`|to_nice_yaml` thing on a new line.
We can, but that takes more lines and makes things look uglier.
Using `|to_json` seems good enough.
The whole file is parsed as YAML later on and merged with the
`_extension` variable before being dumped as YAML again in the end.
Hopefully fixes an error like this (which I haven't been able to
reproduce, but..):
> [modules/xmpp/strophe.util.js] <Object.i.Strophe.log>: Strophe: Error: Failed to construct 'RTCPeerConnection': 'matrix.DOMAIN' is not one of the supported URL schemes 'stun', 'turn' or 'turns'.
We define this password in the `sip-communicator.properties`
configuration file, so this is not needed for actually running JVB.
However, it does a (useless) safety check during container startup,
and we need to make that check happy.
We do this for 2 reasons:
- so we can control things which are not controllable using environment
variables (for example `stunServers` in jitsi/web, since we don't wish
to use the hardcoded Google STUN servers if our own Coturn is enabled)
- so playbook variable changes will properly rebuild the configuration.
When using Jitsi environment variables, the configuration is only built
once (the first time) and never rebuilt again. This is not the
consistent with the rest of the playbook and with how Ansible operates.
We're not perfect at it (yet), because we still let the Jitsi containers
generate some files on their own, but we are closer and it should be
good enough for most things.
Related to #415 (Github Pull Request).
This keeps the roles cleaner and more independent of matrix-base,
which may be important for people building their own playbook
out of the individual roles and not using the matrix-base role.
This adds into the Riot config.json the field
'default_server_config.m.homeserver.server_name'
with, by default, the value of the playbook's 'matrix_domain' variable.
Riot displays this string in its login page and will now say 'Sign in to
your Matrix account on example.org' (the server name) instead of 'Sign
in ... on matrix.example.org' (the server domain-name).
This string can be configured by setting the playbook variable
'matrix_riot_web_default_server_name'
to any string, so we can make Riot say for example 'Sign in ... on Our
Server'.
This fixes an incorrect indentation in the database specification for
appservice-irc which caused matrix-appservice-irc to refuse to start
with the remarkably unhelpful error message:
```
ERROR:CLI Failed to run bridge.
```
This also updates doc links to the new matrixdotorg repo because the
tedomum repo contains out-of-date documentation.
Synapse v1.9.0 changed some things which made the REST Auth Password
Provider break.
The ma1uta/matrix-synapse-rest-password-provider implements some
workarounds for now and will likely deliver a proper fix in the future.
Not much has changed between the 2 projects, so this should be a
painless transition.
This change allows us to work with both our existing Docker image
(`tedomum/matrix-appservice-irc:latest`) and with the
official Docker image (`matrixdotorg/matrix-appservice-irc`).
The actual change to the official Docker image requires more testing
and will be done separately.
Can you double check that the way I have this set only exposes it locally? It is important that the manhole is not available to the outside world since it is quite powerful and the password is hard coded.
Switching to the official image (vectorim/riot-web) should ensure:
- there's less breakage, as it's maintained by the same team as riot-web
- there's fewer actors we need to trust
- we can upgrade riot-web faster, as newer versions should be released
on Docker hub at the same time riot-web releases are made
Riot used to be fine with it being blank but now it complains. This creates an ugly looking comma when there is an identity server configured but I guess that's fine.
Prompted by: https://matrix.org/blog/2019/11/09/avoiding-unwelcome-visitors-on-private-matrix-servers
This is a bit controversial, because.. the Synapse default remains open,
while the general advice (as per the blog post) is to make it more private.
I'm not sure exactly what kind of server people set up and whether they
want to make the room directory public. Our general goal is to favor
privacy and security when running personal (family & friends) and corporate
homeservers, both of which likely benefit from having a more secure default.
Don't mention systemd-journald adjustment anymore, because
we've changed log levels to WARNING and Synapse is not chatty by default
anymore.
The "excessive log messages may get dropped on CentOS" issue no longer
applies to most users and we shouldn't bother them with it.
matrix_synapse_storage_path is already defined in matrix-synapse/defaults/main.yml (with a default of "{{ matrix_synapse_base_path }}/storage"), but was not being used for its presumed purpose in matrix-synapse.service.j2. As a result, if matrix_synapse_storage_path was overridden (in a vars.yml), the synapse service failed to start.
Discord client IDs are numeric (e.g. 12345).
Passing them as integers however, causes the Discord bridge's YAML parser
to parse them as integers and its config schema validation will fail.
Fixes#240 (Github Issue)
This only gets triggered if:
- the Synapse role is used standalone and the default values are used
- the whole playbook is used, with `matrix_mxisd_enabled: false`
Continuation of #234 (Github Pull Request).
I had unintentionally updated the documentation for the feature,
saying the page is available at `https://matrix.DOMAIN/nginx_status`.
Looks like it wasn't the case, going against my expectations.
I'm correcting this with this patch.
The status page is being made available on both HTTP and HTTPS.
Serving over HTTP is likely necessary for services like
Longview
(https://www.linode.com/docs/platform/longview/longview-app-for-nginx/)