Skip to content

Reduce background GC allocations #114320

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

timcassell
Copy link

Fixes #101536

I tried to build this to test locally, but was unable to.

Changes:

  1. Added internal ref struct enumerator to enumerate ConditionalWeakTable objects without allocations.
  2. Added a ThreadStatic GCMemoryInfoData to remove allocations after the first call to GC.GetGCMemoryInfo().

Both of these are used by SharedArrayPool.Trim() which is called from Gen2GcCallback.

For 2, it looks like the extern void GetMemoryInfo(GCMemoryInfoData data, int kind) function could be extern void GetMemoryInfo(GCMemoryInfo* data, int kind), passing a struct pointer instead of class to be always zero alloc, but I don't know enough about the system or C++. @Maoni0

cc @jkotas

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Apr 7, 2025
@@ -96,22 +92,64 @@ internal sealed class GCMemoryInfoData
/// </remarks>
public readonly struct GCMemoryInfo
{
private readonly GCMemoryInfoData _data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will change the atomicity of GCMemoryInfo. Is there anything depending on this?

Copy link
Author

@timcassell timcassell Apr 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All uses in the repo just read a value from the struct and throw it away (except one ArrayTest that stores the struct in a static readonly field). Also, it's just an implementation detail, and the old implementation before #37879 did not wrap a class, so it would be unsafe for anyone else to have taken a dependency on it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that the implementation was changed to use class internally to avoid large amount of copying. This is likely a performance regression for a typical GCMemoryInfo use.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unlikely to affect performance for typical use cases (which is just reading some values and throwing it away). It should at least reduce GC pressure for typical use cases that always throw away the allocated object. @Maoni0 Maybe you have some more insight here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, this change makes ReadOnlySpan<System.GCGenerationInfo> GenerationInfo getters dangerous. These getters assume that the lifetime for the underlying storage is managed by the GC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that object stack allocation might be able to optimize at some point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that object stack allocation might be able to optimize at some point?

Possibly. cc @dotnet/jit-contrib

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that object stack allocation might be able to optimize at some point?

Possibly. cc @dotnet/jit-contrib

Escape analysis doesn't handle objects with finalizers, since the runtime allocation helper for finalizable objects immediately causes an object reference to escape.

But it seems like if the JIT could prove that object was created within a try/finally with proper invocation of Dispose in the finally, stack allocation might be possible. I believe with foreach enumerations the enumerator constructor invocation is not typically within the try... will need to double-check.

In order to stack allocate we'd also need to ensure that the various enumerator methods were inlined. The methods here are fairly large so not clear if that would happen with current heuristics.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AndyAyersMS What about the GC.GetGCMemoryInfo, which allocates an object without a finalizer, but does send it to native code?

Copy link
Author

@timcassell timcassell Apr 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it seems like if the JIT could prove that object was created within a try/finally with proper invocation of Dispose in the finally, stack allocation might be possible. I believe with foreach enumerations the enumerator constructor invocation is not typically within the try... will need to double-check.

You're correct, the enumerator is retrieved outside of the try block. But why should it matter in .Net Core since it doesn't have thread aborts? Is there something else that could interrupt it between the enumerator constructor and the try?

Maybe Roslyn can change it so that it does retrieve inside of the try if it's necessary.

Also, would Dispose need to call GC.SuppressFinalize for the JIT proof? (Currently the CWT enumerator does not.)

@jkotas
Copy link
Member

jkotas commented Apr 7, 2025

As I have said in the issue, I am not sure whether it is worth it to try to eliminate some of these background allocations.

}

/// <summary>Provides an enumerator for the table that can be used on the stack without allocation.</summary>
internal ref struct RefEnumerator : IEnumerator<KeyValuePair<TKey, TValue>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #53296 for a long discussion about why it is not the best idea to expose struct enumerator for ConditionalWeakTable

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I see you have mentioned in a comment - but that does not make it a good idea.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm well aware of the dangers, which is why it's internal only. I left a lengthy comment about it on the EnumerateOnStack() method.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not interested in spreading more unsafe or otherwise dangerous code throughout the libraries. We believe we have too much of it and we are actively working on reducing it.

Adjust cpp impls to take struct pointers.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-VM-coreclr community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

RuntimeInformation causes unexpected allocations
4 participants