[WIP] MSC4284: Policy Servers #4284

Draft
wants to merge 3 commits into
base: main
188 changes: 188 additions & 0 deletions proposals/4284-policy-servers.md
Member Author

Implementation requirements:

TBD when leaving WIP.

Member Author

To start tracking the implementations early though:

@@ -0,0 +1,188 @@
# MSC4284: Policy Servers
Contributor

In its current state, this MSC proposes something that sounds like the reverse of a recent idea in Draupnir: extending the API so as to allow the easy creation of external protections. This MSC instead proposes an API for calling into an evaluator over federation.

I personally think this sounds like a great idea, as this approach maintains the philosophy of letting homeservers focus on being homeservers and farms moderation out to dedicated tooling that is fit for purpose. This is great not only for extensibility but also for the maintainability of these capabilities.

Draupnir has recently started to venture in this direction a little via its HS Admin Capabilities and Synapse module. This shows there is definitely an appetite for this type of capability in the ecosystem. I have also seen wishes for exactly this type of MSC expressed recently.


**Note**: The concepts and architecture proposed by this MSC are rapidly iterating and will change.
Review is appreciated with the understanding that absolutely nothing is set in stone. Where security
issues are present, please use the [Security Disclosure Policy](https://matrix.org/security-disclosure-policy/)
rather than leaving inline comments. This is to ensure application integrity for those opting to use
highly experimental and changing implementations of this proposal.

Communities on Matrix are typically formed through [Spaces](https://spec.matrix.org/v1.14/client-server-api/#spaces),
but can be made up of singular rooms or loose collections of rooms. These communities often have a
desire to push unwelcome content out of their chats, and rely on bots like [Mjolnir](https://github.com/matrix-org/mjolnir),
[Draupnir](https://github.com/the-draupnir-project/Draupnir), and [Meowlnir](https://github.com/maunium/meowlnir)
to help manage their community. Many of these communities have additionally seen a large increase in
abusive content being sent to their rooms recently. While these existing tools allow for reactive
moderation (redactions after the fact), some impacted communities may benefit from having an option
to use a server of their choice to automatically filter events at a server level, reducing the spread
of abusive content. This proposal experiments with this idea, calling the concept *Policy Servers*.

This proposal does not seek to replace community management provided by the existing moderation bots,
but does intend to shift a large part of the "protections" concept present in many of these bots
onto the room's designated policy server.

At a high level, policy servers are *optional* recommendation systems which help proactively moderate
communities on Matrix. Communities which elect to use a policy server advertise their choice through
state events in individual rooms, and homeservers in the room may reach out to the chosen server for
opinions on how to handle local and remote events *before* those events are delivered to their users.
The functional role of being a policy server may be implemented by a dedicated server, typically
optimized for moderation, or integrated within a homeserver implementation.

In the general case, a homeserver which honours a policy server's recommendation to flag an event as
spam would [soft fail](https://spec.matrix.org/v1.14/server-server-api/#soft-failure) remote events
and reject local events to avoid delivering them to users. Servers which don't honour those recommendations
may see redactions issued by a server/user in the room to help protect those users too, much as the
moderation bots listed above already do today.

This tooling is entirely optional, and decided upon by the room/community itself, similar to moderation
bots. The specific filtering behaviour is left as an implementation detail, and is expected to be
configurable by the community using the policy server. Some examples may include preventing images
from being sent to their rooms, disallowing lots of messages from being sent in a row, and limiting
the number of mentions a user can make.

While there isn't anything which prevents policy servers from operating in private or encrypted rooms,
the intended audience is public (or near-public) rooms and communities. Most communities may not need
a policy server and can instead rely on moderation bots or other forms of moderation. Those which do
decide to use a policy server may find that they have it disabled or in a low power state most of the
time.

## Proposal

**This is a work in progress.**

A *Policy Server* (PS) is a server which implements the newly-defined `/check` API described below.
This may be an existing logical server, such as matrix.org, or a dedicated host which serves no other
purpose.
Comment on lines +52 to +54
Member Author

this needs to spell out that the PS implementation does not have to implement the full federation API if it doesn't want to: just the bits needed to handle /check and get a join event into the room.


Rooms which elect to use a policy server would do so via the new `m.room.policy` state
event (empty state key). The `content` would be something like:

```json5
{
  "via": "policy.example.org"
}
```

**TODO**: Array for multiple policy servers?

Provided `policy.example.org` is in the room, that server receives events as any other homeserver
in the room would, *plus* becomes a Policy Server. If `policy.example.org` is not in the room, the
assignment acts as though it was undefined: the room does not use a policy server.
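
The rule above can be sketched as a small lookup. This is a minimal illustration, not part of the proposal: the function name, the `policy_content` argument (the `content` of the room's `m.room.policy` event, or `None` if unset), and the `joined_servers` set (server names with at least one joined member) are all hypothetical, and treating a malformed `via` as "no policy server" is an assumption.

```python
from typing import Optional


def active_policy_server(
    policy_content: Optional[dict], joined_servers: set
) -> Optional[str]:
    """Return the room's policy server name, or None if the room has none."""
    if not policy_content:
        return None
    via = policy_content.get("via")
    # Assumption: a missing or non-string `via` is treated as no assignment.
    if not isinstance(via, str) or not via:
        return None
    # The assignment only takes effect when the named server is in the room.
    return via if via in joined_servers else None
```

With this sketch, a room whose `via` names a server that has not joined behaves exactly like a room with no `m.room.policy` event.
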
Member

Maybe I'm missing something obvious, but why is the policy server required to be a Matrix server that is participating in the room?

Contributor

This also appears to contradict the earlier statement: "This may be an existing logical server, such as matrix.org, or a dedicated host which serves no other purpose."

@Johennes My understanding of that sentence is simply that the /check endpoint may be implemented by an existing homeserver domain, or by a dedicated homeserver domain that doesn't provide any other Matrix functionality.

Member Author

> Maybe I'm missing something obvious, but why is the policy server required to be a Matrix server that is participating in the room?

This is to ensure the policy server consents to its role in the room. Otherwise, any random policy server could be dragged into doing work it's not prepared to deal with. The same concept is applied to homeservers: if your server isn't in a position to deal with HQ, then simply don't join HQ and you'll be okay.

Policy servers by their nature will need to consider how they work under attack/load, but most communities won't need massive DDoS protections to operate a policy server. In the general case, 99% uptime is probably fine, which will typically be the consequence of using a higher performance language like Rust or Go. Communities like matrix.org itself will need to consider putting Cloudflare in front of the server and designing the architecture to be resistant to takedown attempts, getting closer to 99.999% (or whatever SLA is more reasonable than just 99%).

Member Author

> This also appears to contradict the earlier statement: "This may be an existing logical server, such as matrix.org, or a dedicated host which serves no other purpose."

What @jivanpal said. A logical server would be matrix.org (which runs Synapse) having code that supports the policy server API. A dedicated server would implement the policy server API and whatever else is a necessity to participate in Matrix (basically just /send and /key/v2/server from the Federation API) - endpoints like /get_missing_events would be unimplemented.

This is supposed to be clarified in the MSC, but isn't. See https://github.com/matrix-org/matrix-spec-proposals/pull/4284/files#r2090214975


If a policy server is in use by the room, homeservers SHOULD call the `/check` API defined below on
all locally-generated events before fanning them out and on all remote events before delivering them
to local users. If the policy server recommends treating the event as spam, the event SHOULD be soft
Comment on lines +71 to +73

As an escape hatch, should the homeserver accept events recognised as coming from a room admin without a policy check?

(Maybe even allow a room admin to set a threshold, so that a moderation bot can operate directly, while letting the admins remove the bot with a normal kick)

Member Author

This would be something for implementations to consider for sure. The MSC uses SHOULD to be deliberately permissive to this kind of behaviour.

failed if remote and rejected if local. This means local users should encounter an error if they
attempt to send "spam" (by the policy server's definition), and events sent by remote users will
never make it to a local user's client. If the policy server recommends allowing the event, the event
should pass unimpeded.
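
As a rough sketch of the behaviour described in this paragraph, the mapping from a recommendation to a homeserver action might look like the following. The `Action` enum and `action_for` function are hypothetical names, and allowing any unrecognised recommendation is an assumption rather than something the proposal specifies.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REJECT = "reject"        # local event: return an error to the sender
    SOFT_FAIL = "soft_fail"  # remote event: never shown to local clients


def action_for(recommendation: str, is_local: bool) -> Action:
    """Map a policy server recommendation onto a homeserver action."""
    if recommendation == "spam":
        return Action.REJECT if is_local else Action.SOFT_FAIL
    # "ok" passes unimpeded; unknown values are also allowed (assumption).
    return Action.ALLOW
```

Because the proposal uses SHOULD throughout, a real implementation is free to add exceptions around this mapping.
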

For Synapse homeservers, the above paragraph's consequences are natural behaviour of the spam checker
module feature. A server could, with some performance penalty, deploy a module which calls the `/check`
API to enact the consequences described above.
Comment on lines +79 to +81
Member Author


The new `/check` is described as follows:

```
POST /_matrix/policy/v1/event/:eventId/check
Authorization: X-Matrix ...
Content-Type: application/json
{PDU-formatted event}
```

The request body is *optional* but *strongly recommended* for efficient processing, as the policy
@nexy7574, May 26, 2025

Is there ever a situation where a server would be checking an event, having only its ID, and no PDU content? It just feels like an odd thing to make optional, and opens up the door (imo) to corner-cutting in implementations that might end up being incompatible, making the policy server potentially ineffective.

An example could be the new Meowlnir policy server being deployed on a non-Synapse server. If the server requesting the check doesn't include the PDU, meowlnir has nothing to work with. Calling /event to fetch it from the homeserver has a high probability of not working, since it's likely that the event has been soft failed (and consequently won't be returned), which effectively defeats the check.
It's also been mentioned elsewhere that minimal federation implementations are something that has been thought about, and since the requirement is only that a local user is joined to the room that's being checked, this optionality might also catch them short.

I'd argue that making the PDU in the body required would be better and generally more consistent, assuming slightly more bandwidth isn't something this MSC cares too much about.

Member Author

This MSC is very concerned about bandwidth/performance, and about the policy server being lied to :)

A server may not include the PDU for a couple of primary reasons:

  1. It's processing a lot of traffic and those extra bytes are highly problematic for its operational environment.
  2. It wants to know if the policy server hasn't seen the event yet, as it could be an indication that the policy server is being excluded by a malicious server during fanout.

The second may be resolved by having the policy server return a flag, but the threat of having unknown events being flagged as spam should hopefully deter the behaviour.

I would expect that an implementation in meowlnir or another bot would not call /event to fetch it if excluded, and instead have (maybe configurable) behaviour to return a static response.

> This MSC is very concerned about bandwidth/performance

Ah, well, that changes things, and makes it being optional completely understandable!

> It wants to know if the policy server hasn't seen the event yet, as it could be an indication that the policy server is being excluded by a malicious server during fanout.

(assuming I haven't just missed it in my reading) this is probably something that should be clarified. Never crossed my mind that excluding the PDU could also be used to detect things like that, which actually brings it more use than bandwidth saving.

server may not make efforts to locate the event over federation, especially during `/check`.

Maybe add «If the body is provided, the policy server SHALL verify that the event is signed by the claimed originating server.» to make the answer to the concerns in #4284 explicit?

Member Author

This would be an implementation detail, but one we might want to formalize as a SHOULD rather than MUST.


Authentication is achieved using normal [Federation API request authentication](https://spec.matrix.org/v1.14/server-server-api/#request-authentication).
Member

Is routing to the server subject to the normal federation server name resolution stuff (https://spec.matrix.org/v1.14/server-server-api/#resolving-server-names) ?

Contributor

I would assume it makes sense to extend delegation in this MSC if we are using the normal server name stuff.

If the CS API can be hosted separately, it would make sense for this API to be separable too, though of course using a separate subdomain always works, as the MSC's example shows.

Member Author

> Is routing to the server subject to the normal federation server name resolution stuff (https://spec.matrix.org/v1.14/server-server-api/#resolving-server-names) ?

Yes - it's intended that the server participates in the room, but may not implement the full suite of federation endpoints.


Requests may be rate limited, but SHOULD have relatively high limits given event traffic.

The endpoint always returns `200 OK`, unless rate limited or a server-side unexpected error occurred.
If the request shape is invalid, the policy server SHOULD respond with a `spam` recommendation, as
shown below. If the event (or room) is not known to the policy server, it is left as an implementation
detail for whether to consider that event as `spam` or `ok`.

```
200 OK
Content-Type: application/json
{"recommendation": "spam"}
```

```
200 OK
Content-Type: application/json
{"recommendation": "ok"}
```

```
429 Rate Limited
Content-Type: application/json
{"error":"Too many requests","errcode":"M_RATE_LIMITED"}
```

```
500 Internal Server Error
Content-Type: application/json
{"error":"It broke","errcode":"M_UNKNOWN"}
```
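
On the policy server side, the rules above could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposal's implementation: `handle_check`, `seen_events` (events the policy server already received over federation, keyed by event ID), `looks_spammy` (the community's filter), and `unknown_recommendation` are all hypothetical, and a real server would also authenticate the request and may verify the supplied PDU.

```python
import json
from typing import Callable, Optional


def handle_check(
    event_id: str,
    raw_body: Optional[bytes],
    seen_events: dict,
    looks_spammy: Callable[[dict], bool],
    unknown_recommendation: str = "ok",
) -> dict:
    """Build the JSON body of a 200 OK response to a /check request."""
    if raw_body:
        try:
            event = json.loads(raw_body)
        except ValueError:  # invalid JSON, including undecodable bytes
            return {"recommendation": "spam"}
        if not isinstance(event, dict):
            # Invalid request shape: recommend spam.
            return {"recommendation": "spam"}
    else:
        # No PDU supplied: only consult events already seen; do not go
        # looking for the event over federation during the check.
        event = seen_events.get(event_id)
        if event is None:
            # Unknown event: left to the implementation, configurable here.
            return {"recommendation": unknown_recommendation}
    return {"recommendation": "spam" if looks_spammy(event) else "ok"}
```
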

**TODO**: Figure out a way to expose which filters (if any) caused an event to be flagged as spam, to
allow for more nuanced decision making by servers/bots. Also, if exposed, figure out a way to do so
which doesn't expose that detail to potential attackers. Consider exposing with score metrics per filter.

As shown, `recommendation` may either be `spam` or `ok`. (**TODO**: Consider different keywords)
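
A calling homeserver might interpret these responses along the following lines. This is a hedged sketch: `interpret_check_response` is a hypothetical helper, and returning `None` for anything other than a well-formed `200 OK` (so the caller can retry or fall back) is an assumption.

```python
import json
from typing import Optional


def interpret_check_response(status: int, body: bytes) -> Optional[str]:
    """Return "spam" or "ok", or None if the response carries no usable answer."""
    if status != 200:
        # Rate limited (429) or server error (500): no recommendation.
        return None
    try:
        data = json.loads(body)
    except ValueError:  # invalid JSON, including undecodable bytes
        return None
    if not isinstance(data, dict):
        return None
    recommendation = data.get("recommendation")
    return recommendation if recommendation in ("spam", "ok") else None
```
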

**TODO**: Support namespaced identifiers in an array for more recommendations? ie: `[ok, org.example.warn_user]`
Comment on lines +133 to +139
Contributor

A way to send a warning to the user without needing to send them a message would be very useful, even when their event is marked as spam, particularly when on-boarding users who are new to a community: https://marewolf.me/posts/draupnir/25/02.html#priorities-why-on-boarding-is-the-best-place-to-focus. New users will not always read the community's rules and welcome message, and will try to do things straight away, such as sending images and links.

It's not clear to me whether the error message that a user on a homeserver complying with the policy server will hopefully get from the client-server API will be enough, as clients might not make those errors user-facing. What do you think?

(Another workaround would be sending a phony whisper event down /sync, but it's a bit cheeky and requires way more changes.)

Member Author

We'd definitely need to specify some error codes for clients to use/warn as needed. We would also need to define some default behaviour for what a "warning" even is. Do we use server notice rooms? Do we return an M_WARN error code (but otherwise allow the event to send normally?)? Do we build some other system to handle warnings?

(these questions don't have answers)

I expect a warning system of any variety would be its own MSC to avoid this one getting stuck in process, once this one is closer to actually moving forward.


Homeserver implementations SHOULD fail safely and assume events are *not* spam when they cannot reach
the policy server. However, they SHOULD also attempt to retry the request for a reasonable amount of
time.
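
One way to combine the retry and fail-safe guidance is sketched below. Everything here is an assumption for illustration: the function name, the five-second deadline, the doubling backoff, and the convention that `do_check` returns `"spam"`, `"ok"`, or `None` (no usable answer) are not specified by the proposal, which only asks for retries over "a reasonable amount of time".

```python
import time
from typing import Callable, Optional


def check_with_retry(
    do_check: Callable[[], Optional[str]],
    deadline_seconds: float = 5.0,
    initial_delay: float = 0.25,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Retry a policy check until the deadline, then assume the event is not spam."""
    start = clock()
    delay = initial_delay
    while True:
        try:
            result = do_check()
        except OSError:  # connection refused, timeouts, DNS failures, etc.
            result = None
        if result in ("spam", "ok"):
            return result
        remaining = deadline_seconds - (clock() - start)
        if remaining <= 0:
            return "ok"  # fail safely: policy server unreachable, allow the event
        sleep(min(delay, remaining))
        delay *= 2  # simple exponential backoff (assumption)
```

The `sleep` and `clock` parameters exist only so the behaviour can be exercised without real waiting.
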

In some implementations, a homeserver may cooperate further with the policy server to issue redactions
for spammy events, helping to keep the room clear for users on servers which didn't check with the
policy server ahead of sending their event(s). For example, `matrix.example.org` may have a user in
the room with permission to send redactions, and may `/check` all events itself.

## Potential issues
@CobaltCause, Apr 17, 2025

I'd be worried about situations in which this MSC is in use in a room but not implemented in all HSs participating in that room, as the following could happen:

  1. all room moderators are on HSs that implement this MSC
  2. abuse content is sent to the room by abusive users
  3. moderators do not see these users or their content
  4. thus moderators don't redact the content and ban the users
  5. thus users on HSs that don't implement this MSC will still see the abuse content

Possibly this could be mitigated by having draupnir (or another moderation bot) also make requests to the policy server, and then redact events. You might see abuse content temporarily that way if your HS doesn't implement this MSC, but it may be better than nothing.

Important note about this is that the moderation bot would need to run on a server that does not soft-fail events based on the policy server, otherwise it wouldn't be able to see the events it needs to redact.

> Important note about this is that the moderation bot would need to run on a server that does not soft-fail events based on the policy server, otherwise it wouldn't be able to see the events it needs to redact.

Perhaps there should be a way to opt-in to receiving soft-failed events as a client, so that moderation tools can still view and send redactions for abuse content and such. Such functionality would feel less hacky than making sure to run moderation tools on an HS that doesn't support this MSC.

Member Author

we're currently experimenting with having this done at the server level, via https://github.com/element-hq/policyserv_spam_checker for now

element-hq/synapse#18238 is another early example of experimentation in this area

Member

It kind of feels conceptually that each server should be able to redact events for their users only. This is in keeping with Matrix's server autonomy. Mod/admin servers can still redact which will still be broadcast to all other servers, but really whether this is applied is a per-server decision.

@kegsay Such a design could also help prevent existing quirks, like DM conversation partners being able to unilaterally delete all conversation participants' messages.

Member Author

Servers which use the policy server should be rejecting the events before they're sent to the room, instead of redacting them. A server could feasibly redact instead of send, but then it's just wasted events which may be further against room policy (flooding, "apparent malicious redaction", etc).

Member Author

@mimi89999 says:

> What if a malicious server claims that they received abusive content from a specific user (and forges events that seem to be originating from that user) in order to get a user banned globally? Is there anything protecting against such a scenario or is it something that should be considered by policy servers?

Member Author

I'm not following the attack vector here. The server cannot forge events to appear as coming from another server without getting those events rejected completely, making them out of scope of the policy server.

If a malicious policy server does decide to auto-fail all events from a specific sender though, it can do so. The room can also elect to not use that policy server anymore. Policy servers should be cognisant of the impact they have when making decisions.

The attack vector is: a malicious homeserver in the room asks the policy server about another homeserver's event, and includes the event content as an optimisation. The content is forged, however. The policy server only follows the explicit SHALL and SHOULD of the spec, accepts the event content, and does not properly check its authenticity. The policy server proceeds to give a signed response that the event should be redacted. The malicious homeserver posts the response to get other homeservers to throw out the event. The stretch goal is to have enough blocked events, allegedly from the same user, in a row that some other policy mechanism bans the user.

Member Author

I suppose the question is whether the policy server should be compelled to check hashes, or if it should be left as an implementation detail. I'm leaning towards implementation detail, as a room can change policy servers if needed, and it could be considered a security vulnerability in the implementation.

Member Author

@Gnuxie says:

> If pass-through for bots and appservices is going to be left out of the proposal will there be some assurances that there will be something like a synapse module callback (so that a passthrough http api can be added via synapse-http-antispam)? Otherwise we're gonna lock out a few tools.

(prior comments at link provide a bit more context. Paraphrasing: bots will have a hard time dealing with X-Matrix auth)

see also:

> It is annoying because it means that basic policy servers written as bots need to ask their own homeserver for each event (ie via GET /_matrix/client/v3/rooms/{roomId}/event/{eventId}) they get asked about. But i haven't read the whole proposal yet so my bad if i'm missing something.

Member Author

We'll be experimenting with different architectures over the next few months, but some of those learnings will appear in future MSCs rather than this one.

Currently, the implementation behaves as a standalone server with the essential functionality implemented. This has led to some complications when we want to do more sophisticated checks, so we're working on shifting some of the responsibility out of a standalone server and into an existing server like Synapse.

Once the core functionality is in Synapse, it could export useful information out to the policy server itself through the appservice API. How non-appservice bots work, if at all, is not yet determined - it's possible they can't technically be policy servers, but they'll still have important value as layered protection for communities (especially if the advice is to not use a policy server most of the time).

Contributor

Flooding the policy server with events can still lead to a denial of service. This can be complicated by exploits that are being used in the wild that starve homeservers themselves of CPU.

Member Author

Yup - the MSC needs to call out more clearly that Policy Servers have to deal with that problem as a fact of life. They will be DDoS'd, either with event traffic or more generic attacks, and they need to survive that.

@viccuad, May 27, 2025

This is indeed a concern for Policy Servers. For Kubewarden's policy-server we tackled this with a configurable deadline, failing closed by default (if the deadline expires, the request is rejected; this is configurable too). Given that we can provision several policy-servers in parallel with differing policies on each, it is possible to have one policy-server with fast and important policies that fails closed, and another that fails open with more exploratory policies.

Kubewarden's policy-server gates Kubernetes via the Kubernetes admission controller API; it may be worth looking at k8s' admission controllers as a similar enough field of study.


**TODO**: This section.

Broadly:
* Lack of batching is unfortunate (**TODO**: Fix this)

## Safety considerations

**TODO**: This section.

## Security considerations
Member Author

Servers should not send policy server state events through the policy server, or at least not act on them. Especially if the senders are the same.

This is to prevent a malicious policy server from not allowing the room to escape, but we should still have a mechanism to prevent compromised accounts from adding/setting malicious servers if we can.

Risk level of these 2 concerns is TBD.


**TODO**: This section.

## Alternatives

**TODO**: This section. Many of the inline TODOs describe some alternatives.

An alternative was considered where, in a future room version, all events must be signed by the policy
server before they're able to be added to the DAG. However, this results in compulsory centralization
and usage, removing the room's agency to choose which moderation tools they utilize and that room's
ability to survive network partitions. This alternative does have an advantage of reducing bandwidth
Comment on lines +169 to +172

Maybe a Certificate-Transparency-like option?

A room admin MAY set a list L of policy servers, a count N, and a power level threshold T. By default, count is N=0, list L is empty, PL threshold is T=100. An event can only be incorporated if it originates from a user with PL≥T, OR it has N signatures from policy servers in L.

This way room admins can choose whom to trust, while avoiding a SPOF; they can also do an emergency override if it becomes necessary.
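
To make the suggested rule concrete, here is a minimal sketch of the acceptance check. The function and parameter names are hypothetical, and it assumes signatures have already been verified so that `signing_servers` only contains servers whose signatures are valid.

```python
def event_admissible(
    sender_power_level: int,
    signing_servers: set,
    policy_servers: set,
    required_signatures: int = 0,
    power_threshold: int = 100,
) -> bool:
    """Apply the PL >= T OR N-signatures-from-L rule suggested above."""
    if sender_power_level >= power_threshold:
        return True  # emergency override for sufficiently privileged senders
    # Only signatures from servers on the room's list L count towards N.
    return len(signing_servers & policy_servers) >= required_signatures
```

With the stated defaults (N=0, empty L, T=100) every event is admissible, so rooms that never configure the list are unaffected.
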

Member Author

This gets very complicated very quickly when working with state resolution and the auth rules, at least with how those algorithms currently work. Changes to the room model would certainly make a CT-style approach easier, though for now the rough plan is to spin the signing alternative out to a dedicated MSC to make this MSC more backwards-compatibility focused. The signing stuff would require a new room version to be enforceable, which doesn't help rooms that already exist today.

spend across the federation (as there's no point in sending a spammy event if the policy server won't
sign it), but would require that communities upgrade their rooms to a compatible room version, which
typically takes significant time to specify and deploy.
Comment on lines +169 to +175

This can still be added as an optimization - servers MAY get a signature from the policy server before sending an event to the room, which would allow other servers to skip the policy server check when receiving the event.

Member Author

It just doesn't have the same effect, unfortunately. The purpose of requiring a signature is to frontload the effort required to send an event rather than be purely reactive.


@viccuad, Apr 24, 2025

> We’d especially like to see implementations for non-Synapse servers

I would like to suggest Kubewarden's policy-server. Kubewarden is a CNCF project implementing a policy engine for Kubernetes, and its policy-server can run outside k8s and process any type of query (disclaimer, I'm a maintainer of the Kubewarden project).

The policy-server runs small policies compiled to Wasm (written in Rust, Go, CEL, or Rego, with our small provided SDKs). These policies are shipped as OCI Wasm artifacts. The policy-server implements a Wasm host with wapc-rs, receives a JSON request, and accepts, rejects, or mutates the request depending on the outcome of the policies. The policy-server can also make context-aware calls to other services beyond the JSON request it is evaluating at that moment (in this specific case, that would mean querying a Matrix client). Policy definitions allow for specifying granular permissions with respect to those context-aware needs, à la smartphone apps.

Users of the policy-server can develop their own policies in their preferred languages, test them locally with our kwctl CLI tool, and share and reuse them via OCI registries.

If interested, don't hesitate to have a look at our docs, ping me, or get in contact in our Slack channel.

## Unstable prefix

While this proposal is not considered stable, implementations should use the following unstable identifiers:

| Stable | Unstable |
|-|-|
| `/_matrix/policy/v1/event/:eventId/check` | `/_matrix/policy/unstable/org.matrix.msc4284/event/:eventId/check` |
| `m.room.policy` | `org.matrix.msc4284.policy` |
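
As a small illustration of how an implementation might switch between the identifiers in this table, the helpers below build the `/check` path and pick the state event type. The function and constant names are hypothetical. Percent-encoding the event ID is an assumption made here because event IDs start with `$` and, in some room versions, can contain `/`.

```python
from urllib.parse import quote

STABLE_PREFIX = "/_matrix/policy/v1"
UNSTABLE_PREFIX = "/_matrix/policy/unstable/org.matrix.msc4284"


def check_path(event_id: str, stable: bool = False) -> str:
    """Build the request path for checking a single event."""
    prefix = STABLE_PREFIX if stable else UNSTABLE_PREFIX
    # safe="" percent-encodes "/" as well, keeping the ID in one path segment.
    return f"{prefix}/event/{quote(event_id, safe='')}/check"


def policy_event_type(stable: bool = False) -> str:
    """Return the state event type used to advertise a policy server."""
    return "m.room.policy" if stable else "org.matrix.msc4284.policy"
```
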

## Dependencies

This proposal has no direct dependencies.