Experiment: Plug caches in #18468
Conversation
❗ Release notes required

Caution: No release notes found for the changed paths (see table below). Please make sure to add an entry with an informative description of the change, as well as links to this pull request and issue, and a language suggestion if applicable. Release notes for this repository are based on the Keep A Changelog format. The following format is recommended for this repository:

If you believe that release notes are not necessary for this PR, please add the NO_RELEASE_NOTES label to the pull request.
As usual I installed this in Visual Studio. Things seem to work normally. Performance feels good.
I suggest collecting stats for a project while working with it in the IDE and adjusting the cache size to avoid resizes.
Also, perf benefits won't show on a normal solution; you can take the original OpenTK one from my original fix.
Original fix for reference: #17668, because every time I'm looking for it I can't find it 😄
src/Compiler/Utilities/Caches.fs
Outdated
// This leads to an exception when trying to evict without locking ("The index is equal to or greater than the length of the array,
// or the number of elements in the dictionary is greater than the available space from index to the end of the destination array.")
// This is caused by insertions happening between reading the `Count` and doing the `CopyTo`.
// This solution introduces a custom `ConcurrentDictionary.sortBy` which will call a proper `CopyTo`, the one on the ConcurrentDictionary itself.
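A minimal sketch of what such a helper could look like (hypothetical names; the PR's actual implementation lives in Caches.fs):

```fsharp
open System.Collections.Concurrent
open System.Collections.Generic

// Hypothetical helper illustrating the fix described above: take the snapshot
// via the ConcurrentDictionary's own ToArray (which acquires the internal
// locks), instead of the racy Count-then-CopyTo fast path.
let sortConcurrentBy
    (projection: KeyValuePair<'Key, 'Value> -> 'Proj)
    (dict: ConcurrentDictionary<'Key, 'Value>) : seq<KeyValuePair<'Key, 'Value>> =
    let snapshot = dict.ToArray()           // consistent snapshot; cannot throw mid-copy
    Array.sortInPlaceBy projection snapshot
    snapshot :> seq<_>
```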
I wonder if there would be any drawbacks in implementing that ToArray special handling (via a type check) in the Seq module directly.
This is an unpleasant bug to deal with, and I cannot think of regressions when moving from the old behavior to the one suggested here.
(The big drawback is the added cost of a type test for every single user; I do understand that :( )
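For illustration, a hypothetical sketch of that runtime type-test variant (not FSharp.Core's actual code; `isConcurrentDictionary` is an assumed helper):

```fsharp
open System.Collections.Concurrent

// Assumed helper: detect a ConcurrentDictionary<_,_> without knowing its
// type arguments.
let private isConcurrentDictionary (o: obj) =
    let t = o.GetType()
    t.IsGenericType && t.GetGenericTypeDefinition() = typedefof<ConcurrentDictionary<_, _>>

// Hypothetical safe toArray: every caller pays for the type test, which is
// the drawback mentioned above.
let toArrayChecked (source: seq<'T>) : 'T[] =
    if isConcurrentDictionary source then
        // Plain enumeration of a ConcurrentDictionary is lock-free and never
        // throws, unlike the ICollection Count-then-CopyTo fast path.
        [| for item in source -> item |]
    else
        Seq.toArray source
```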
I think it can be implemented and documented as "technically breaking, but correct change". I didn't bother with it, and just did the implementation I required in the cache itself.
Can we do it without a type test, via the inline compiler-only feature (i.e. type testing at the type level and choosing the correct implementation)?
```fsharp
module Seq =
    // Uses the FSharp.Core-internal "static optimization" feature:
    // a `when 'T : <type>` clause picks an implementation at compile time,
    // so no runtime type test is emitted. Requires `inline`.
    let inline sortBy<'T> projection source =
        checkNonNull "source" source
        mkDelayedSeq (fun () ->
            let array = source |> toArray
            Array.stableSortInPlaceBy projection array
            array :> seq<_>)
        when 'T : ConcurrentDictionary<_> =
            checkNonNull "source" source
            mkDelayedSeq (fun () ->
                let array = source.ToArray()
                Array.sortInPlaceBy projection array
                array :> seq<_>)
```
This way, there's no runtime type test drawback.
I'm not sure if this is sensible at all in light of #17668 (comment)
With the cache, those comments are not really relevant anymore. I've made cache keys rely on stamps (and more). Plus, with the cache instead of the ConcurrentDictionary, we now have eviction, and in the worst-case scenario sizes will need to be adjusted.
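For context, a minimal sketch of the shape a stamp-based key could take (assumed names, not the PR's actual types):

```fsharp
// Hypothetical stamp-based cache key: stamps change whenever the underlying
// entity changes, so stale entries stop being hit and age out via eviction
// instead of being returned incorrectly.
[<Struct>]
type CacheKey =
    { EntityStamp: int64    // stamp of the entity involved
      ExtraInputsHash: int  // hash of the remaining inputs ("and more")
    }
```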
Ah, now I see,
So, disregarding hit rate for now, I did count just the unique keys that get cached. Opening a handful of .fs files in the OpenTK 5 branch, the count explodes to 200k immediately. Edits in some of its most affected files cause a lot of churn, too, but it is usable. To see it in action, start a VisualFSharp debug build; the total count will show up in the Output window.
To reiterate, steps to test this with Visual Studio:
Some ideas:
The question is: what is the memory use increase in the IDE when using the cache?
Not a call to action, but I'd be curious about the hit/miss ratio, and maybe the overall MRU/LRU rate, i.e. how many keys we are consistently hitting 1) in the first typecheck and 2) in the rest of the tooling lifetime. This should help figure out the default sizes for different projects. Maybe even have some heuristic on how to grow/shrink the cache during tooling use.
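As an illustration, a minimal sketch of the kind of counters that could back such telemetry (hypothetical, not the PR's instrumentation):

```fsharp
open System.Threading

// Hypothetical hit/miss counters; thread-safe via Interlocked so recording
// from concurrent typechecks stays cheap.
type CacheStats() =
    let mutable hits = 0L
    let mutable misses = 0L
    member _.RecordHit() = Interlocked.Increment(&hits) |> ignore
    member _.RecordMiss() = Interlocked.Increment(&misses) |> ignore
    member _.HitRatio =
        let h = Interlocked.Read(&hits)
        let m = Interlocked.Read(&misses)
        if h + m = 0L then 0.0 else float h / float (h + m)
```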
I was afraid that we might require something like that, just didn't want to overcomplicate it. Evict failures aren't bad per se, as long as we are maintaining the cache properly, i.e. we evict the right amount, don't grow way beyond the capacity, and have the number of items we expect at any given moment.

Update 1: Actually, I don't fully understand what those represent. Are they the number of times we tried to remove an element from the cache, but it wasn't there by the time we tried? If so, during which time period? Also, I don't think at this point I get when it can happen: if we evict synchronously on the same thread, or on a separate thread (also synchronously), how can they already be evicted?
This is just the failure count per second. I tried making everything synchronous and it's still there. I suspect equality / hash code is not stable 😐
What it boils down to is: we put something in the ConcurrentDictionary and later it can't be found by supposedly the same key. In effect the store grows forever and failure rates pile up.
This is expected, since cache keys rely on the stamp, so any change to a type will result in a new stamp. But those would be misses; I don't get what evict failures are. Eviction is simply "get keys with a specific expiration, remove them". If we count removal failures, it means either that they were removed between fetch and remove, or that we didn't use the same key. What does the evict success metric look like? And evict attempts? I.e. how many times we tried to evict, how many keys we decided we want to evict, and how many we evicted successfully and unsuccessfully. When I tested it last time with BDN, I didn't see this behaviour. I saw the cache grow and shrink as I would expect it to.
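A minimal sketch of the accounting being asked for (hypothetical names; assumes the store is a ConcurrentDictionary):

```fsharp
open System.Collections.Concurrent

// Hypothetical eviction pass: returns (attempted, succeeded, failed) so the
// three metrics discussed above can be reported separately.
let evictExpired (store: ConcurrentDictionary<'Key, 'Value>) (expiredKeys: 'Key[]) =
    let mutable succeeded = 0
    for key in expiredKeys do
        match store.TryRemove key with
        | true, _ -> succeeded <- succeeded + 1
        | false, _ -> ()  // evict failure: already gone, or the key no longer hashes/compares the same
    expiredKeys.Length, succeeded, expiredKeys.Length - succeeded
```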
I tested this theory by making the override `override this.GetHashCode() : int = this.canCoerce.GetHashCode()`, and immediately the cache behaves perfectly: MaxCapacity is 10k, and the moment it's reached, evictions start and keep it there without failures. Also a lot more cache hits, because now we can actually find items in the store. I don't know how feasible this is, but the fix I envision is to provide a hash code that is tied to the source text as much as possible, not to the mutable type resolutions. It means take

Update: With this: fsharp/src/Compiler/Utilities/TypeHashing.fs, line 131 in 85ea990
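For illustration, a minimal standalone repro of that failure mode (hypothetical type, not the compiler's):

```fsharp
open System.Collections.Generic

// A key whose GetHashCode depends on mutable state can never be found again
// after that state changes, so the store grows forever and removals fail.
type UnstableKey(state: int ref) =
    member _.State = state
    override _.GetHashCode() = state.Value   // unstable: tracks mutable state
    override _.Equals(o) =
        match o with
        | :? UnstableKey as other -> other.State.Value = state.Value
        | _ -> false

let demo () =
    let d = Dictionary<UnstableKey, string>()
    let s = ref 1
    let k = UnstableKey s
    d.[k] <- "cached"
    s.Value <- 2                         // state mutated after insertion
    printfn "%b" (d.ContainsKey k)       // false: hash no longer matches the stored one
```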
The problem was in using the signature hash functions. I guess they will not produce a stable hash while compilation is not done, constraints are still being solved, and so on. This seems to do the trick and things work OK now. On the cache side, I added a simple eviction queue and an object pool to reuse nodes and keep allocations constant.
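A minimal sketch of such an object pool (assumed shape, not the PR's actual implementation):

```fsharp
open System.Collections.Concurrent

// Hypothetical node pool: retired eviction-queue nodes are returned here and
// reused, so steady-state operation allocates nothing new.
type NodePool<'T when 'T : (new : unit -> 'T)>() =
    let nodes = ConcurrentBag<'T>()
    member _.Rent() : 'T =
        match nodes.TryTake() with
        | true, node -> node
        | false, _ -> new 'T()          // pool empty: allocate a fresh node
    member _.Return(node: 'T) = nodes.Add node
```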
Description
Just connect #18190 to see what the performance is in the IDE.