Offline and persistent cache support #61
Thank you for working on this. I think it would be good to allow everyone to use whatever persistent storage they want. For example, https://github.com/apollographql/apollo-cache-persist lets you pick from several storage backends, or any custom storage. For instance, I use this adapter to connect to IndexedDB via idb-keyval:

```js
import { get, set, keys, del, clear } from './idb-keyval';

export default {
  clear() {
    return clear();
  },
  getItem(key) {
    return get(key);
  },
  setItem(key, value) {
    return set(key, value);
  },
  keys() {
    return keys();
  },
  remove(key) {
    return del(key);
  },
  removeItem(key) {
    return del(key);
  },
};
```
Hi, I'm the maintainer of apollo-cache-persist and various offline libraries for GraphQL. I have been playing with urql-exchange-graphcache for a while and I absolutely love it. I think, from the community side, apollo-cache-persist is the main reason why so many people use Apollo Client at the moment in their React Native apps, which tend to kill views when transitioning.
I will also add some extra info for context, after 2 years of working with GraphQL caches:
I think for a quick win we could adjust the apollo-cache-persist implementation to work with urql, or create a separate package that hooks into cache write operations and knows how to restore them. I haven't really tried it yet, so I can't say how hard it will be. https://github.com/apollographql/apollo-cache-persist/blob/master/src/onCacheWrite.ts
Hey @wtrocki, I'm super happy that people are interested in this issue. We encourage community exchanges and are happy to help out where possible.
I can look into that wrapper this weekend
The way it would work is that there would be a separate persistor available globally that needs to be awaited and then sets up the initial cache. If graphcache has the ability to seed initial data, then this is very trivial to implement.
Absolutely not. They should be ignored completely. Most frameworks like offix or luna.js recreate them anyway. The trick is to have a cache that does not apply optimistic responses to the data. It has two separate fields:
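A minimal sketch of that layered idea (all names here are illustrative, not any library's real API): the optimistic layer shadows the base data on reads and is simply dropped when the real result commits, so only the base layer ever needs to be persisted.

```typescript
// Hypothetical two-field cache: base data is persistable, the optimistic
// layer lives only in memory and is never written to storage.
class LayeredCache {
  base = new Map<string, unknown>();       // persisted data
  optimistic = new Map<string, unknown>(); // in-memory only

  // Reads prefer the optimistic layer when an entry exists there.
  read(key: string): unknown {
    return this.optimistic.has(key)
      ? this.optimistic.get(key)
      : this.base.get(key);
  }

  writeOptimistic(key: string, value: unknown): void {
    this.optimistic.set(key, value);
  }

  // A real server result replaces the optimistic entry entirely.
  commit(key: string, value: unknown): void {
    this.base.set(key, value);
    this.optimistic.delete(key);
  }
}
```

Since a restart throws the in-memory optimistic layer away for free, nothing optimistic ever has to be restored.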
Restart is kind of tricky, as there is no way to restore the promise chain that usually removes optimistic responses, so it's best not to store them.
This should be possible by hooking into the client.destroy() method. Since the cache is a separate exchange from the client, it is best to hook into the client lifecycle (but I'm not sure about that).
IMHO this will be trivial. Let's collaborate on this. If there is a sample app for urql that has a cache, and we can simply add a console.log every time a backend payload gets saved, then integration should be trivial and we can donate tons of code from apollo-cache-persist that will work here. My main question is whether saving should persist the entire cache every time, or be connected to individual server responses (which comes with a tricky normalization challenge).
Optimistic responses are layered on top of the data, so that in essence is no issue. My reasoning behind keeping optimism around is that we want to restore the data AND be able to dispatch the request when the user comes back online. Maybe I was putting too many eggs in one basket though.
Definitely. https://github.com/JoviDeCroock/threed-web is an app we can use to test it, and we have an API for that at https://github.com/kitten/threed-example-api. We can use those to test on; this has all the graphcache features implemented (optimism, ...).
See https://offix.dev. This is the exact use case of that library. However, it is way too much responsibility for a cache persistence layer.
Perfect. Going to check this and provide an update in this PR.
@wtrocki That looks awesome! I'm thinking of how we could approach this at "scale" 😂 So pessimism ensures stable perf and immutability (which we don't need right now) but is otherwise really simple. I've been thinking that it'd be nice if we could have a store wrapper (or modify pessimism) that provides a synchronous KV layer (like what pessimism does right now) but flushes writes to any async storage. On start we'd then only have to restore from that async storage and queue up operations while we wait for it 🤔 I think, like you said, we wouldn't even have to preserve optimistic writes, since on a restart we'd just reexecute offline operations, which would restore the optimistic writes anyway. Regarding what needs to be saved: only records, connections, and links are relevant to persisting data. Edit: so my thinking is, we could allow for a persistence layer that accepts any store adhering to an interface with:
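As a rough sketch of that sync-KV-plus-async-flush idea (every name here is hypothetical, not the actual pessimism or urql API): reads and writes stay synchronous on the hot path, dirty keys are tracked, and a flush pushes only the changed keys to whatever async storage is plugged in.

```typescript
// Hypothetical async storage interface; any backend could implement it.
interface AsyncStorage {
  getItem(key: string): Promise<string | null>;
  setItem(key: string, value: string): Promise<void>;
  removeItem(key: string): Promise<void>;
}

class PersistedKV {
  private data = new Map<string, string>();
  private dirty = new Set<string>();

  constructor(private storage: AsyncStorage) {}

  // Reads and writes stay synchronous for the cache's hot path.
  get(key: string): string | undefined {
    return this.data.get(key);
  }

  set(key: string, value: string): void {
    this.data.set(key, value);
    this.dirty.add(key); // mark for the next flush
  }

  // Periodically (or on idle) flush only the keys that changed.
  async flush(): Promise<void> {
    const keys = Array.from(this.dirty);
    this.dirty.clear();
    await Promise.all(
      keys.map(k => {
        const v = this.data.get(k);
        return v === undefined
          ? this.storage.removeItem(k)
          : this.storage.setItem(k, v);
      })
    );
  }

  // On startup, seed the in-memory map from async storage while the
  // exchange queues incoming operations. The key list is assumed to be
  // known here; a real adapter would enumerate its own keys.
  async restore(keys: string[]): Promise<void> {
    await Promise.all(
      keys.map(async k => {
        const v = await this.storage.getItem(k);
        if (v !== null) this.data.set(k, v);
      })
    );
  }
}
```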
Does that sound about right?
Yes. I had exactly that in mind, and it should be trivial.
Would it work like a singleton, where the first call tries to restore from persistence?
Really nice idea. I need to think about how this would work: optimistic responses and update methods will be global, right?
I think we're in a much better position now to tackle this 🥳 The pessimism KV layer is gone and has been replaced with a much simpler backing store. It still stores and treats optimistic entries separately, which is perfect since we don't want to persist them. The next step would hence be to allow a persisted store to be slotted in that we can flush writes to regularly. Then we'd want to introduce operation buffering to delay operations on startup while the store is being seeded. And lastly we'll want to persist optimistic operations (and flush them after seeding and when the user goes back online). We may also want to enable full cache invalidation, which may need to be automatic. We could look at persisted schema information and invalidate parts of the offline store if it doesn't match the schema anymore (and allow full clearing on logout, for instance). One unanswered question is how we can achieve this without increasing the footprint of Graphcache massively.
I'd say that we'd only need a certain number of things in graphCache:
This way offix handles all the complex offline-online logic while graphcache remains focused on being a normalised cache. I do agree that we should have some low-priority work that involves taking our schema and removing fields, etc.; this can be considered a nice-to-have at the start though. I think I still have a working implementation of the buffered operations; this does imply that we expect an async function to be passed to retrieve the offline store data -> run our adapter/transformer -> inject into our store. I think if we limit ourselves to a serializer - transformer - "hydrator" split, the footprint impact would be small, since the transformer/serializer part can be tree-shaken out and the added logic will be minimal. The first thing, imo, would be to see how existing solutions persist offline storage at this time and how we can deserialize on our end to inject it.
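The serializer - transformer - "hydrator" split could look roughly like this (all names are hypothetical, not graphcache's real API); since each piece is a standalone function, unused parts can be tree-shaken out:

```typescript
// Hedged sketch of the serializer -> transformer -> hydrator pipeline.
type Snapshot = Record<string, unknown>;

// serializer: turn the cache's internal data into a storable string.
const serialize = (snapshot: Snapshot): string => JSON.stringify(snapshot);

// transformer: adapt a foreign dump (e.g. an apollo-cache-persist blob)
// into the shape our hydrator expects; identity when none is given.
const transform = (
  raw: string,
  adapt: (s: Snapshot) => Snapshot = s => s
): Snapshot => adapt(JSON.parse(raw));

// hydrator: seed a store from the transformed snapshot via a callback,
// keeping the storage-specific code out of the cache itself.
const hydrate = (snapshot: Snapshot, seed: (s: Snapshot) => void): void => {
  seed(snapshot);
};
```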
I think that is the key to everything
There is actually a nice thread on the Apollo Client repo (as it will get a cache storage feature for 3.0).
Awesome! Thank you so much, and sorry for not making it on time. Yes, I will try this out with the customized apollo-cache-persist, and if that works I will post it as a package. Having a dev version of the PR published would be amazing. A follow-up will be to get more complex use cases like cache invalidation etc. (offix)
Persistence has been implemented now by #137 and #138. There's an example that demos it in #141. The next step now is working on an offline exchange (or one built into the main

@wtrocki We publish every PR via Pika CI, so you can already give this a go by installing
I think to do this efficiently in a separate exchange we'll need to add:
Hi to all,

- wora/cache-persist: uses a JavaScript object synchronously and processes communication with storage asynchronously (highly configurable in all its aspects; storages: localStorage, sessionStorage, indexedDB, React Native AsyncStorage & any custom storage)
- wora/netinfo: a simple library that implements the react-native netinfo interface so it can also be used on the web
- wora/offline-first: a persistent cache store for offline-first applications, with first-class support for optimistic UI. Use with React, React Native, or any web app.

I used these to create:

The main advantages of integrating these libraries are:
In the offline-examples repository you can find examples of offline usage for Apollo (web and React Native) and Relay (web and React Native). For any additional information, or if you're interested in making a beta in which they are integrated, please contact me; I will be happy to answer and help.
Offline
We all think about this in the modern PWA era, but there's a lot to it. We'll have to keep track of which requests the user needs to send when the connection is restored; after these requests are sent, there will most likely be several optimistic entries to clear.
Operations
So for knowing which operations to cache, it should be sufficient to only cache mutation operations. These will be kept in a `Map<key, operation>` and persisted to some `indexedDB`/`localStorage` when we kill the application and they haven't been dispatched yet. The hard part about this is that we would have to restore the `optimisticKeys` in the exchange, which makes me think about moving these to our instance of `Store` instead, since the serialisation of `entities`, `links`, and `optimisticKeys` could then happen from one place. This brings the additional advantage that it can be done with one `restore` method.

One concern would be the read/write speed of killing/rebooting the cache in this state. The HAMT structure is quite hard to serialise, taking into account that it will contain optimistic values mixed with normal ones.
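The one-place `restore` idea might look something like this simplified sketch (the `MiniStore` shape here is an assumption for illustration, not graphcache's real internals):

```typescript
// Hypothetical serialised shape: entities and links flattened to plain
// JSON-compatible objects so they can be written to any storage.
interface SerializedState {
  entities: Record<string, Record<string, unknown>>;
  links: Record<string, string | string[] | null>;
}

class MiniStore {
  entities = new Map<string, Record<string, unknown>>();
  links = new Map<string, string | string[] | null>();

  // One restore method seeds both maps in a single pass.
  restore(state: SerializedState): void {
    for (const key of Object.keys(state.entities))
      this.entities.set(key, state.entities[key]);
    for (const key of Object.keys(state.links))
      this.links.set(key, state.links[key]);
  }

  serialize(): SerializedState {
    const entities: SerializedState['entities'] = {};
    this.entities.forEach((v, k) => { entities[k] = v; });
    const links: SerializedState['links'] = {};
    this.links.forEach((v, k) => { links[k] = v; });
    return { entities, links };
  }
}
```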
Connection checking
This should be easily doable by means of `navigator.onLine`: we could buffer all requests until we come online and then send them in the correct order, one by one, to avoid concurrency problems. The difficult part here would be that we buffer until all operations are dispatched, meaning that if the user performs another action while we are emptying the queue, it could take a while to get a response (though this is mitigated if we are using optimisticResponses).
Ideally, when we see we are offline, we filter out all queries and just keep them incomplete. When we see we are going offline, all subscriptions should receive an active teardown.
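The buffering described above can be sketched with an injected online check, so the same logic works with `navigator.onLine` in the browser or a netinfo module in React Native (all names here are illustrative, not the real exchange API):

```typescript
// Minimal operation shape for the sketch; the real urql Operation type
// carries much more.
type Operation = { key: number; kind: 'query' | 'mutation' };

class OfflineBuffer {
  private queue: Operation[] = [];

  constructor(
    private isOnline: () => boolean,      // e.g. () => navigator.onLine
    private dispatch: (op: Operation) => void
  ) {}

  push(op: Operation): void {
    if (this.isOnline()) {
      this.dispatch(op);
    } else {
      this.queue.push(op); // hold until connectivity returns
    }
  }

  // Call from an 'online' event listener; replays in original order to
  // avoid concurrency problems between dependent mutations.
  onOnline(): void {
    const pending = this.queue;
    this.queue = [];
    for (const op of pending) this.dispatch(op);
  }
}
```

In the browser this would be wired up with `window.addEventListener('online', () => buffer.onOnline())`.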
Exchange
When reasoning about this, my thoughts always wander to a separate exchange managing the operation buffering, while the restoring/serialising is incorporated inside graphcache. There's a bit of overlap, but I think there's sufficient reason to keep them separate.
Persistence
Here I'm having trouble seeing how we could effectively solve this. We have the schema now, so we could potentially just iterate over the whole schema and write it out that way, but that won't cover the case where people just want a persisted cache without the whole schema effort.
What scares me the most about this is that localStorage isn't the ideal candidate for a persisted cache, but by using indexedDB we exclude about 5% of the browser population.
IndexedDB seems to ask for permission if a blob is >50MB on Firefox; beyond that, there are no explicit size limitations, even for a single data field.
The max size for localStorage is 10MB, so I don't really think this is sufficient for big applications, since the initial cost of the data structure is also there. We could strip everything down, but how do we rebuild it then, maybe by bucket size?
This is a brain dump of what I've been thinking about and is by no means a final solution, but I think it could serve as an entry point to finding the solution to what feels like a really awesome feature.
Other relevant solution: https://github.com/redux-offline/redux-offline/tree/v1.1.0#persistence-is-key
This uses `redux-persist` under the hood, which in turn relies on `indexedDB`. Since this is a reliable and widespread solution, I think it's safe to resort to `indexedDB` and fall back to `localStorage` when needed.

For `react-native` we can easily resort to the `AsyncStorage` module. It seems that AsyncStorage isn't 100% safe either, since on Android it errors out when you exceed a 6MB write.

Introducing some way of leaving certain fields/queries out seems very much mandatory to me, since in the test described underneath we see that we're hitting the limits of localStorage pretty quickly.
Test
I did a small test with our current benchmarking where I serialised 50k entities and just wrote them to a JSON file to look at the size:
This already exceeds the limits of `localStorage` and would cause a prompt in `indexedDB` asking for permission to save this amount of data.

Code used:
Wild thoughts
I've been thinking about maybe making a distinction between a `storage.native` and a `storage` file. This way we could leverage web workers and the application cache to write our results at runtime instead of only when we close the application.
Requirements
To implement persistent data we would have to implement an adapter with an API surface for getting, setting, and deleting. People can in turn pass in any storage they like; this way, people who use something like PouchDB can write an adapter and just use that.
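One possible adapter surface, sketched under the assumption of a three-method async API (the names are made up; any backend such as localStorage, idb-keyval, or PouchDB would just implement these):

```typescript
// Hypothetical adapter interface: the only contract a storage backend
// would need to fulfil.
interface StorageAdapter {
  read(key: string): Promise<string | null>;
  write(key: string, value: string): Promise<void>;
  delete(key: string): Promise<void>;
}

// In-memory reference implementation, useful for tests and SSR where no
// real storage is available.
const memoryAdapter = (): StorageAdapter => {
  const data = new Map<string, string>();
  return {
    read: async key => data.get(key) ?? null,
    write: async (key, value) => { data.set(key, value); },
    delete: async key => { data.delete(key); },
  };
};
```

A localStorage or PouchDB adapter would implement the same three methods against its own backend.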
We should decide on an approach for when to write. After every query? That would make us have to write after every optimistic write as well, which makes everything a tad harder, certainly since it's going to be hard to incrementally write changes from our HAMT structure. I think it's better to work with a hydrate-and-exit approach. This could make writes take more time, but in the end it would require a whole lot less logic.
We would need an approach that can exempt certain portions of the state from being cached; an example would be an exclude/include pattern. When we include something, that will be the only thing being cached. When we exclude something, everything but the excluded parts will be cached. These should be mutually exclusive.
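A tiny sketch of that mutually exclusive include/exclude rule (the `PersistFilter` shape and key names are hypothetical):

```typescript
// Hypothetical filter: callers should supply include OR exclude, never
// both, matching the mutual-exclusivity rule above.
interface PersistFilter {
  include?: string[];
  exclude?: string[];
}

function shouldPersist(key: string, filter?: PersistFilter): boolean {
  if (!filter) return true; // no filter: persist everything
  if (filter.include) return filter.include.indexOf(key) !== -1;
  if (filter.exclude) return filter.exclude.indexOf(key) === -1;
  return true;
}
```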
When not supplied with a schema, how would we arrange for excluding data?
I drew up a diagram of how I expect this to happen; the code for the offline part was easy to write and is done.