[QUESTION] Is it possible to have distinct objects for L1 and L2? #321

mejas opened this issue Oct 23, 2024 · 10 comments

mejas commented Oct 23, 2024

Hi, right now I am considering using FusionCache in an existing project, but I have a bit of a challenge: it seems that the L1 and L2 caches have to store exactly the same object.

A quick overview:

  1. I am currently using IMemoryCache to store compiled Roslyn objects from a string
  2. I would like to add FusionCache, but would like to do the following:
    1. L1 - store the compilation results (delegates) for execution by callers
    2. L2 - store the string data since serializing the delegates is not a guaranteed process

It looks like FusionCache only serializes what it already has in L1 into L2, with no way to provide a distinct value per level. Implementing my own IFusionCacheSerializer might work, but if L1 holds only the delegate, the serializer has no access to the source string. I could store the string alongside the delegate as a pair so it reaches L2, but then I have a string and wasted memory in L1, since it's only needed by the serializer.
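For illustration, a minimal sketch of the "store the pair" workaround mentioned above; CompiledScript and CompileDelegate are hypothetical names, and the actual Roslyn compilation step is left as a placeholder:

using System;
using System.Text.Json.Serialization;

// Hypothetical wrapper: only Source round-trips through the serializer to L2;
// the delegate is rebuilt lazily after an L2 hit. The trade-off described
// above still applies: the source string stays alive in L1 alongside the
// compiled delegate.
public sealed class CompiledScript
{
    private Func<object?>? _compiled; // private fields are not serialized

    public string Source { get; set; } = "";

    [JsonIgnore] // keep the non-serializable delegate out of the L2 payload
    public Func<object?> Compiled => _compiled ??= CompileDelegate(Source);

    // Placeholder for the actual Roslyn compilation step.
    private static Func<object?> CompileDelegate(string source) =>
        throw new NotImplementedException();
}

With this shape, an instance deserialized from L2 pays the compilation cost once, on the first access to Compiled.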

Why do this?

  1. The compilation process is a bit painful, so I would like the results stored in-memory
  2. FusionCache provides a nice backplane integration that I would like to utilize as well (for scaling up... or having another service signal a change in data)
  3. I would prefer not to roll my own, especially with a battle-tested library available

Your thoughts would be greatly appreciated!

@jodydonetti (Collaborator)

Uuh, this is an interesting challenge!
You made the scenario quite clear, so let me think about it; I'll let you know.

@adnan-kamili

@jodydonetti I have a related issue. App memory is limited, so typically one would not store everything in L1. Some things are meant to be stored only in L1 (e.g. memoization), some in both L1 and L2 (very hot data), and some only in L2, since L2 has much more room to store data.

For this I have created multiple named caches, but the note on skipping the memory cache, which is a very common scenario, is not encouraging:

NOTE: this option must be used very carefully and is generally not recommended, as it will not protect you from some problems like Cache Stampede. Also, it can lead to a lot of extra work for the 2nd level (distributed cache) and a lot of extra network traffic.

#region add fusion cache
builder.Services.AddFusionCacheSystemTextJsonSerializer();
builder.Services.AddFusionCacheStackExchangeRedisBackplane(options =>
{
    options.Configuration = builder.Configuration.Get<AppOptions>().Redis.Url;
    options.ConfigurationOptions.ChannelPrefix = "fusion-cache-invalidation-channel";
});

// this named cache is used for memoization with no invalidation needed
builder.Services.AddFusionCache("MemoizationCache")
    .WithDefaultEntryOptions(new FusionCacheEntryOptions
    {
        Duration = TimeSpan.FromDays(1),
        JitterMaxDuration = TimeSpan.FromHours(10)
    });

// this named cache is used for caching hot data in memory and distributed cache
builder.Services.AddFusionCache("HybridCache")
    .WithDefaultEntryOptions(new FusionCacheEntryOptions
    {
        Duration = TimeSpan.FromDays(1),
        JitterMaxDuration = TimeSpan.FromHours(1)
    })
    .WithRegisteredSerializer()
    .WithRegisteredDistributedCache()
    .WithRegisteredBackplane();

// this named cache is used for caching long lived data in distributed cache
builder.Services.AddFusionCache("DistributedCache")
    .WithDefaultEntryOptions(new FusionCacheEntryOptions
    {
        Duration = TimeSpan.FromDays(1),
        JitterMaxDuration = TimeSpan.FromHours(1),
        SkipMemoryCacheRead = true,
        SkipMemoryCacheWrite = true
    })
    .WithRegisteredSerializer()
    .WithRegisteredDistributedCache();
#endregion

@jodydonetti (Collaborator)

Hi @adnan-kamili

For this I have created multiple named caches, but the note on skipping the memory cache, which is a very common scenario, is not encouraging:

NOTE: this option must be used very carefully and is generally not recommended, as it will not protect you from some problems like Cache Stampede. Also, it can lead to a lot of extra work for the 2nd level (distributed cache) and a lot of extra network traffic.

Thanks for the input, appreciate it!
I may change the wording and add something like this: "if you are in a scenario with limited memory, instead of skipping L1 entirely you can set a lower duration for L1 so that memory usage will not grow".

Your setup then becomes like this:

// this named cache is used for caching long lived data in distributed cache
builder.Services.AddFusionCache("DistributedCache")
  .WithDefaultEntryOptions(new FusionCacheEntryOptions
  {
    Duration = TimeSpan.FromSeconds(1),
    DistributedCacheDuration = TimeSpan.FromDays(1),
    JitterMaxDuration = TimeSpan.FromHours(1)
  })
  .WithRegisteredSerializer()
  .WithRegisteredDistributedCache();

Thoughts?

@adnan-kamili

Thanks for the quick reply. This brings me to another question: how often does FusionCache scan the memory cache to evict expired items? In other caching libraries this is usually configurable; if it is not configurable here, what defaults does FusionCache use?

@jodydonetti (Collaborator)

Good question!

I recently answered exactly this question here.

Hope this helps, let me know.
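For reference: FusionCache's default L1 is the standard MemoryCache from Microsoft.Extensions.Caching.Memory, whose expired-entry scan is governed by MemoryCacheOptions.ExpirationScanFrequency (one minute by default, and run opportunistically during cache operations rather than on a dedicated timer). A sketch of wiring a custom-configured memory cache into a named cache, assuming the builder's WithMemoryCache method:

using Microsoft.Extensions.Caching.Memory;

// Sketch: tune how often expired L1 entries are scanned for eviction.
builder.Services.AddFusionCache("DistributedCache")
    .WithMemoryCache(new MemoryCache(new MemoryCacheOptions
    {
        ExpirationScanFrequency = TimeSpan.FromSeconds(30)
    }))
    .WithRegisteredSerializer()
    .WithRegisteredDistributedCache();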

@adnan-kamili

Thanks, this has been helpful!

I still believe that storing non-hot data in the in-memory cache is an unnecessary use of resources. It increases the number of entries that the expiration scan needs to process, which can impact performance. Additionally, according to one of the references I came across, cache entries are not immediately removed from memory when they expire or are evicted; the memory is only reclaimed during the next garbage collection (GC) cycle. Under high load this can lead to temporary memory spikes, which could have been avoided by relying solely on a distributed cache for non-hot data.

@jodydonetti (Collaborator)

One thing I'd like to understand is this: when you say "non-hot data", you still expect it to be accessed at some point, right?
I'm asking because if it's accessed and L1 is skipped entirely, you would:

  • not pay for the L1 memory that would be allocated, true
  • but pay, at every single access, for network (cpu+memory) and deserialization (cpu+memory)

So I'm not sure it's worth it.

Anyway, I would recap the two options (sketched in code below) as:

  1. LOW L1 DURATION: full protection from cache stampede, no matter what. If data is accessed: lower memory allocation and cpu usage. If data is not accessed: some memory is allocated, but only for a short time.
  2. SKIP L1: no full cache stampede protection. If data is accessed: higher memory allocation and cpu usage (network + deserialization). If data is not accessed: no memory allocated at all.
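A minimal side-by-side sketch of the two options, reusing only entry options already shown in this thread (the durations are illustrative):

// Option 1: LOW L1 DURATION - keeps full cache stampede protection;
// L1 entries expire quickly, so memory usage stays bounded.
var lowL1Duration = new FusionCacheEntryOptions
{
    Duration = TimeSpan.FromSeconds(1),             // short L1 lifetime
    DistributedCacheDuration = TimeSpan.FromDays(1) // long L2 lifetime
};

// Option 2: SKIP L1 - no L1 memory at all, but no full stampede
// protection, and every access pays for network + deserialization.
var skipL1 = new FusionCacheEntryOptions
{
    Duration = TimeSpan.FromDays(1),
    SkipMemoryCacheRead = true,
    SkipMemoryCacheWrite = true
};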

The nice thing is that you can pick the one you prefer (or try both and see how it goes) since they are both available.

Hope this helps, let me know.

@adnan-kamili

Thanks for the comprehensive answer. I am now more inclined towards your LOW L1 DURATION suggestion; if we face major memory-consumption issues we can easily switch to SKIP L1.

mejas (Author) commented Feb 5, 2025

Thanks for the insightful discussion! Using named caches might be a good idea.

I'll leave this ticket open for now should anyone else have any suggestions.

@jodydonetti (Collaborator)

Thanks all, these discussions are really, really great.
On one hand I can better share the reasoning behind certain design decisions, and on the other I can discover different use cases and perspectives.
Chef's kiss.
