Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce persistent cache for providers #1585

Merged
merged 1 commit into from
Feb 6, 2025

Conversation

szymonmaslowski
Copy link
Contributor

Context

We want to improve caching mechanism by making it persistent and therefore decrease the number of requests made to blockfrost

Proposed Solution

Allow utxo and chain history providers to use persistent cache

Important Changes Introduced

  • define a contract of the cache
  • enable chain history and utxo providers to use injected cache
  • create persistent cache storage
    • utilizing web extension storage
    • implementing the cache contract
    • having data evicting capability

The eviction mechanism

  • checks how much space is occupied by all of the collection's keys
  • if the allowed (specified in the parameters) space is exceeded it removes most rarely accessed items
  • the set of items to be removed is chosen based on the item's accessTime stored in metadata
  • the size of items to be removed is 10% of their count.
  • a fallback mechanism in case the getBytesInUse is not present - it is not present in the MDN docs neither in the polyfill lib we use (recent version)
  • the fallback checks if the total count of cached items exceeded the amount provided in fallbackMaxCollectionItemsGuard parameter

Example of data managed by the persistent cache implementation. Resouce name: utxo-provider-cache.
Result of storage.local.get():

{
  utxo-provider-cache-metadata: {
    utxo-provider-cache-item-x: {
      accessTime: 1738765316652
    },
    utxo-provider-cache-item-y: {
      accessTime: 1738765316653
    },
    utxo-provider-cache-item-z: {
      accessTime: 1738765316654
    }
  },
  utxo-provider-cache-item-x: {/* data */},
  utxo-provider-cache-item-y: {/* data */},
  utxo-provider-cache-item-z: {/* data */},
  /* other stored data */
}

Reasons behind that structure:

  1. Having metadata in a separate object we can safely access all the cache items for envicting.
  2. Storing each item flat allows to:
    • easily count occupied size (storage.local.getBytesInUse([...allKeys]))
    • easy removal and access to particular items.

@szymonmaslowski szymonmaslowski marked this pull request as draft February 5, 2025 14:42
@szymonmaslowski szymonmaslowski self-assigned this Feb 5, 2025
@szymonmaslowski
Copy link
Contributor Author

szymonmaslowski commented Feb 5, 2025

I need to finish unit tests but the implementation can already get the initial reviews

@@ -28,3 +28,8 @@ type Impossible<K extends keyof any> = {
[P in K]: never;
};
export type NoExtraProperties<T, U> = U & Impossible<Exclude<keyof U, keyof T>>;

export type Cache<T> = {
get(key: string): Promise<T | undefined>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I parametrised only the value type. The extension storage allows only for string keys.


const sizeOfChunkToBePurged = 0.1;

const isGetBytesInUsePresent = (
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The webextension-polyfill does not list getBytesInUse method in the storage API. With this type guard we ensure the getBytesInUse is present.

for (const key of mostDatedKeysToPurge) {
delete metadata[key];
}
await storage.local.set({ [metadataKey]: metadata });
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In worst-case scenario we will evict an item that was accessed right before the eviction logic started - accessing and evicting happening simultaneously:

  1. item X has an old accessTime
  2. item X gets accessed, but its accessTime update is only scheduled
  3. the eviction logic kicks in and accesses the metadata where X has old accessTime
  4. X gets clasified to be evicted
  5. X gets accessTime updated in the metadata
  6. X gets removed.
  7. X's metadata gets removed.

Another scenario: eviction starts first and simultaneously accessing happens

  1. metadata with old X's accessTime queried and X classified for eviction
  2. X accessed and its accessTime update scheduled
  3. X removed
  4. X's accessTime updated
  5. X's updated metadata removed

@szymonmaslowski szymonmaslowski force-pushed the feat/LW-12228-improve-caching-mechanism branch 2 times, most recently from d497913 to 0bc2019 Compare February 6, 2025 09:43
@szymonmaslowski szymonmaslowski marked this pull request as ready for review February 6, 2025 09:45
@szymonmaslowski szymonmaslowski force-pushed the feat/LW-12228-improve-caching-mechanism branch from 0bc2019 to d412ba9 Compare February 6, 2025 09:59
@szymonmaslowski szymonmaslowski marked this pull request as draft February 6, 2025 10:22
@szymonmaslowski szymonmaslowski force-pushed the feat/LW-12228-improve-caching-mechanism branch from d412ba9 to 7b53ece Compare February 6, 2025 10:31

void (async () => {
await updateAccessTime(itemKey);
if (await isQuotaExceeded()) await evict();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check exceeded quota before setting item, so we know we have room to insert it?
Or maybe we can do try ... catch and handle evict in catch, and after re do loaded.set in finally?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had a conversation with Piotr and agreed we dont want to catch any error here considering we are gonna have unlimited storage

pczeglik-iohk
pczeglik-iohk previously approved these changes Feb 6, 2025
Copy link

@pczeglik-iohk pczeglik-iohk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @szymonmaslowski 👏

@szymonmaslowski szymonmaslowski marked this pull request as ready for review February 6, 2025 11:02
- define a contract of the cache
- enable chain history and utxo providers to use injected cache
- create persistent cache storage
  - utilizing web extension storage
  - implementing the cache contract
  - having data envicting capability
Copy link
Contributor

@przemyslaw-wlodek przemyslaw-wlodek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏🏻 Great job @szymonmaslowski!

@szymonmaslowski szymonmaslowski merged commit 783b037 into master Feb 6, 2025
5 of 6 checks passed
@szymonmaslowski szymonmaslowski deleted the feat/LW-12228-improve-caching-mechanism branch February 6, 2025 13:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants