# Caching design #1102
Writing this all down while it's still somewhat fresh in my mind.

## Problem

At the moment, we have two caches in a first-class implementation: the `RoomVersionCache` and the `ServerKeyCache`.

If we are going to embrace caching as a real architectural design decision, rather than something that we shoehorn in ad hoc, then we need to give some thought to what this will actually look like and the rules we should set for ourselves.

## Mutability

To start with, both of these caches were designed on the assumption that their contents are immutable; a room's version, for example, never changes once the room is created. Following #1094, where we now properly implement key-fetching once the validity of a key passes, we have had to change the `ServerKeyCache` so that entries can be replaced when keys are re-fetched, which makes it effectively mutable. A rough sketch of what a validity-aware entry could look like follows below.
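To illustrate the mutability point, here is a minimal sketch of a validity-aware server key cache. This is not Dendrite's actual implementation, and the type and method names (`ServerKeyCache`, `StoreServerKey`, `GetServerKey`) are hypothetical; the idea is simply that a mutable entry carries a deadline, and an expired entry reads as a miss so that the caller knows to re-fetch:

```go
package caching

import (
	"sync"
	"time"
)

// serverKeyEntry pairs a cached value with the point in time
// after which it can no longer be trusted.
type serverKeyEntry struct {
	key        []byte
	validUntil time.Time
}

// ServerKeyCache is a hypothetical mutable cache: entries can be
// overwritten when a key is re-fetched, and expired entries are
// reported as misses so that the caller knows to fetch again.
type ServerKeyCache struct {
	mu      sync.RWMutex
	entries map[string]serverKeyEntry
}

func NewServerKeyCache() *ServerKeyCache {
	return &ServerKeyCache{entries: map[string]serverKeyEntry{}}
}

// StoreServerKey inserts or replaces a key. Replacement is what
// makes this cache mutable, unlike a room version cache, where a
// stored value never changes.
func (c *ServerKeyCache) StoreServerKey(serverName string, key []byte, validUntil time.Time) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[serverName] = serverKeyEntry{key: key, validUntil: validUntil}
}

// GetServerKey returns a key only while it is still valid.
func (c *ServerKeyCache) GetServerKey(serverName string, now time.Time) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	entry, ok := c.entries[serverName]
	if !ok || now.After(entry.validUntil) {
		return nil, false // miss: caller should fetch fresh keys
	}
	return entry.key, true
}
```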
## Discoverability

In addition to this, we have other caches dotted around which aren't implemented in a first-class caching structure, like the `EDUCache` and `transactions.Cache`.

Generally, I believe in first-class structures like the one in #1101. They're easy to understand and to reason about, the logic and tuning for them live in one place, and it's really easy to see where they are used (both as readers and as writers). Go gives us good machinery for this as well in the form of interfaces.

## Monolith

In monolith mode, a lot of this is quite simple: for each type of cache, we can just use something like a single in-process LRU cache shared by every component, as sketched below.

There is an additional benefit here, in that one component that caches something also benefits all other components, as they will be able to hit the cache and get a value rather than having to make an API or a database call themselves.
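To make the first-class structure idea concrete, here is a hedged sketch of a typed cache container backed by `hashicorp/golang-lru`. The library choice is an assumption on my part, and the names (`Caches`, `NewCaches`, the accessors) are made up rather than the actual #1101 code. In monolith mode, one instance would be constructed at startup and handed to every component:

```go
package caching

import (
	lru "github.com/hashicorp/golang-lru"
)

// Caches is a hypothetical first-class cache container: every cache
// the process uses lives here, so sizing and tuning sit in one place
// and call sites are easy to find.
type Caches struct {
	roomVersions *lru.Cache // room ID -> room version (immutable)
}

// NewCaches creates the shared cache set. In monolith mode a single
// instance is shared, so a value cached by one component benefits
// all of the others.
func NewCaches() (*Caches, error) {
	roomVersions, err := lru.New(128) // bounded size: eviction is safe for immutable data
	if err != nil {
		return nil, err
	}
	return &Caches{roomVersions: roomVersions}, nil
}

// GetRoomVersion exposes a typed accessor rather than the raw LRU,
// which keeps the caching policy private to this package.
func (c *Caches) GetRoomVersion(roomID string) (string, bool) {
	if v, ok := c.roomVersions.Get(roomID); ok {
		return v.(string), true
	}
	return "", false
}

// StoreRoomVersion records a room's version, which never changes.
func (c *Caches) StoreRoomVersion(roomID, version string) {
	c.roomVersions.Add(roomID, version)
}
```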
## Polylith

In polylith mode, things get a lot more complicated. We currently build new caches in each component process, so every component ends up holding its own independent copy of the cached data.

For anything immutable, this is fine. We don't need to worry about invalidating anything, as the values will never change. If we hit upper bounds on, e.g., item count or memory, we can just evict the cache entries.

For anything mutable, this presents a problem. If the federation sender goes and gets an updated server key and updates its own cache, then there is no signalling or invalidation to other components to also update their caches, so the other components will operate on out-of-date information until the cache entries are evicted through other means.

## Possibilities

The following are possible things that we could try as longer-term goals (read: not necessarily right now!).

### Tiered caching

The idea here is that we would implement long caches and short caches as a two-tiered system; a rough sketch of the lookup path follows below.

Questions:
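A minimal sketch of the two-tiered lookup, assuming the short tier is a small, aggressively-evicted in-process cache sitting in front of a longer-lived tier (all names here are hypothetical):

```go
package caching

// Tier is any cache layer that can be consulted and populated.
// Both the short (in-process) and long (longer-lived) tiers
// would satisfy this interface.
type Tier interface {
	Get(key string) (value string, ok bool)
	Set(key, value string)
}

// TieredCache checks the fast short-lived tier first and falls back
// to the long tier, promoting hits back into the short tier.
type TieredCache struct {
	Short Tier // small, aggressively evicted, consulted first
	Long  Tier // larger and longer-lived, consulted on a short-tier miss
}

func (t *TieredCache) Get(key string) (string, bool) {
	if v, ok := t.Short.Get(key); ok {
		return v, true
	}
	if v, ok := t.Long.Get(key); ok {
		t.Short.Set(key, v) // promote so the next read is fast
		return v, true
	}
	return "", false
}

func (t *TieredCache) Set(key, value string) {
	// Writes go to both tiers so other readers of the long tier
	// still see the value after the short tier evicts it.
	t.Short.Set(key, value)
	t.Long.Set(key, value)
}
```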
### Single perspective (Redis?)

This doesn't matter in monolith mode, because we already have single caches in-process, and we can continue to do exactly that.

In polylith mode, if we don't want to deal with invalidation notifications/streams (like Synapse does presently), then we ideally need to maintain the illusion of having only one set of central caches. We could do this in a polylith deployment by offloading to Redis, which could run either on the same machine or on a different one, and would still ultimately be many times faster than pulling something from Postgres. This probably avoids the invalidation problem somewhat, in that all components are pulling from the same place; we don't need to stream invalidations between components. A sketch of what this could look like follows below.

Questions:
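As an illustration of the single-perspective idea, a Redis-backed cache shared by all polylith components could look something like this sketch using the `go-redis` client; the client choice, key scheme, and TTL are all assumptions made for the example:

```go
package caching

import (
	"context"
	"time"

	"github.com/go-redis/redis/v8"
)

// RedisCache is a hypothetical shared cache: every polylith
// component talks to the same Redis instance, so there is a single
// set of central caches and nothing to invalidate across processes.
type RedisCache struct {
	client *redis.Client
}

func NewRedisCache(addr string) *RedisCache {
	return &RedisCache{
		client: redis.NewClient(&redis.Options{Addr: addr}),
	}
}

// GetRoomVersion reads from the shared cache. A redis.Nil error just
// means a miss, in which case the caller falls back to Postgres.
func (c *RedisCache) GetRoomVersion(ctx context.Context, roomID string) (string, bool, error) {
	v, err := c.client.Get(ctx, "room_version:"+roomID).Result()
	if err == redis.Nil {
		return "", false, nil
	}
	if err != nil {
		return "", false, err
	}
	return v, true, nil
}

// StoreRoomVersion writes through to the shared cache. The TTL is
// arbitrary here; immutable values could live far longer.
func (c *RedisCache) StoreRoomVersion(ctx context.Context, roomID, version string) error {
	return c.client.Set(ctx, "room_version:"+roomID, version, 24*time.Hour).Err()
}
```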
## Justification

None of this should be taken as my wanting to eagerly cache everything. So far, room version and server key caching exist because we have seen significant performance gains from them (particularly where joining federated rooms and retrieving history are concerned). I would like to ensure that we only build in caches where we know it is justifiable to do so, since they can also create other problems. Whatever model we come up with, I think that we ideally need to update our contributors doc with information such as:
## Other issues

Some other questions that we might want to answer:
We need to think about caching properly, off the back of the following things:

- `RoomVersionCache`
- `ServerKeyCache`
- `EDUCache`
- `transactions.Cache`

And probably many others.