This repository has been archived by the owner on Nov 25, 2024. It is now read-only.

More efficient query of if server can see event #215

Open
erikjohnston opened this issue Sep 6, 2017 · 2 comments
Labels
C-Roomserver
T-Task: Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
X-Performance: Issue/PR around something that is slow or taking lots of memory

Comments

@erikjohnston
Member

Currently we answer the question "can a server see an event" by pulling out the state at the event (and potentially the current state). This is probably needlessly inefficient, especially as it's currently done in the room server.

@NegativeMjark
Contributor

NegativeMjark commented Sep 6, 2017

The problem with the "can a server see an event" query is that in order to answer it we need to know if the server is in the room at a given point. A server is in the room if there exists an event in the room state with event["type"] == "m.room.member", event["content"]["membership"] == "join" and SplitN(event["state_key"], ":", 2)[1] == serverName.
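The membership test above can be sketched in Go as follows. This is only an illustration of the predicate, not the room server's actual code: the `StateEvent` struct and the `ServerInRoom` and `serverOf` names are hypothetical, and the event is reduced to the three fields the check needs.

```go
package main

import "strings"

// StateEvent is a hypothetical, trimmed-down view of a room state event.
// Membership mirrors event["content"]["membership"] and is only meaningful
// for m.room.member events.
type StateEvent struct {
	Type       string
	StateKey   string
	Membership string
}

// serverOf returns the part of a user ID after the first ":".
// Splitting into at most two parts keeps a port in the server name intact,
// e.g. "@bob:example.org:8448" yields "example.org:8448".
func serverOf(userID string) (string, bool) {
	parts := strings.SplitN(userID, ":", 2)
	if len(parts) != 2 {
		return "", false
	}
	return parts[1], true
}

// ServerInRoom reports whether any joined member in the given state
// belongs to serverName. It scans every state event, which is the
// linear cost discussed in this issue.
func ServerInRoom(state []StateEvent, serverName string) bool {
	for _, ev := range state {
		if ev.Type != "m.room.member" || ev.Membership != "join" {
			continue
		}
		if server, ok := serverOf(ev.StateKey); ok && server == serverName {
			return true
		}
	}
	return false
}
```

The loop stops at the first matching member, but a server that is not in the room always forces a scan of the whole state.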

Currently we answer this query by iterating over the entire room state, which can be thousands of events for some Matrix rooms.

We could improve the query by selecting only those state events with event["type"] == "m.room.member" using existing indexes, but since those events account for the majority of state events in a lot of Matrix rooms, that doesn't help us much.

We could improve the query by selecting only those state events where SplitN(event["state_key"], ":", 2)[1] == serverName by adding an extra index somewhere, but for rooms that are bridged from IRC nearly all the state events are for users on a single server, so we would still end up loading nearly all the room state for queries about that server.

We could improve the query (StateSnapshotID, ServerName) -> InRoom by caching the result in RAM, but an individual room can have multiple snapshots, and there are many rooms and many servers, so this approach creates an arms race between remote server request patterns and the size of the in-memory cache. It also makes starting a server really painful, since it runs very slowly until the cache warms.

We could improve the query (StateSnapshotID, ServerName) -> InRoom by precalculating the result when we create a new snapshot and storing the result in the database. This will slow down inserting events in the room server, but might be the best option in terms of overall performance.
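A rough sketch of the precalculation idea, under stated assumptions: `JoinedServersStore` is a hypothetical in-memory stand-in for a database table keyed by (StateSnapshotID, ServerName), and `MemberEvent`, `RecordSnapshot` and `InRoom` are illustrative names rather than existing room server APIs. A real implementation would persist the rows and would probably derive the new set incrementally from the previous snapshot.

```go
package main

import "strings"

// StateSnapshotID identifies a state snapshot (assumed to be an integer key).
type StateSnapshotID int64

// MemberEvent is a hypothetical, trimmed-down m.room.member state event.
type MemberEvent struct {
	StateKey   string // user ID, e.g. "@alice:example.org"
	Membership string // e.g. "join", "leave", "invite"
}

// JoinedServersStore stands in for a table of (snapshot ID, server name) rows.
type JoinedServersStore struct {
	rows map[StateSnapshotID]map[string]struct{}
}

// NewJoinedServersStore returns an empty store.
func NewJoinedServersStore() *JoinedServersStore {
	return &JoinedServersStore{rows: make(map[StateSnapshotID]map[string]struct{})}
}

// RecordSnapshot is meant to run once, when a snapshot is created.
// It pays the cost of scanning the member events up front and stores
// the set of servers that have at least one joined user.
func (s *JoinedServersStore) RecordSnapshot(id StateSnapshotID, members []MemberEvent) {
	servers := make(map[string]struct{})
	for _, ev := range members {
		if ev.Membership != "join" {
			continue
		}
		parts := strings.SplitN(ev.StateKey, ":", 2)
		if len(parts) == 2 {
			servers[parts[1]] = struct{}{}
		}
	}
	s.rows[id] = servers
}

// InRoom answers (StateSnapshotID, ServerName) -> InRoom with a lookup.
// The second result is false when the snapshot was never recorded.
func (s *JoinedServersStore) InRoom(id StateSnapshotID, serverName string) (bool, bool) {
	servers, known := s.rows[id]
	if !known {
		return false, false
	}
	_, inRoom := servers[serverName]
	return inRoom, true
}
```

The trade-off matches the one described above: every new snapshot costs extra work and storage at insert time, while each visibility query becomes a keyed lookup instead of a scan over the room state.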

@kegsay
Member

kegsay commented Mar 20, 2020

Creating an extra table in roomserver to store this information per:

> precalculating the result when we create a new snapshot and storing the result in the database.

seems the most sensible approach here.

@kegsay kegsay added T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. C-Roomserver X-Performance Issue/PR around something that is slow or taking lots of memory and removed maintenance labels Dec 5, 2022

3 participants