Skip to content

JIT: Support for devirtualizing array interface methods #108153

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Oct 2, 2024

Conversation

AndyAyersMS
Copy link
Member

Update JIT and runtime to devirtualize interface calls on arrays over non-shared element types.

Shared types are not (yet) handled.

Add intrinsic and inlining attributes to key methods in the BCL. This allows the JIT to devirtualize and inline enumerator creation and devirtualize and inline all methods that access the enumerator.

And this in turn allows the enumerator to be stack allocated.

However, the enumerator fields are not (yet) physically promoted, because of an optimization in the BCL to return a static empty array enumerator. So the object being accessed later is ambiguous.

Progress towards #62457.

Update JIT and runtime to devirtualize interface calls on arrays
over non-shared element types.

Shared types are not (yet) handled.

Add intrinsic and inlining attributes to key methods in the BCL.
This allows the JIT to devirtualize and inline enumerator creation
and devirtualize and inline all methods that access the enumerator.

And this in turn allows the enumerator to be stack allocated.

However, the enumerator fields are not (yet) physically promoted,
because of an optimization in the BCL to return a static empty
array enumerator. So the object being accessed later is ambiguous.

Progress towards dotnet#62457.
@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 23, 2024
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@EgorBo
Copy link
Member

EgorBo commented Sep 23, 2024

@EgorBot

using BenchmarkDotNet.Attributes;

public class Bench
{
    IList<int> list = new int[10];

    [Benchmark]
    public void Test()
    {
        foreach (var _ in list) {}
    }
}

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Sep 23, 2024

I wouldn't expect much perf from this PR, unless perhaps the array is in a local (say an arg)... but would expect the enumerator to be on the stack so a bit less allocation.

Ah, forgot that we are also devirtualizing / inlining, so I guess it does help a bit (been focusing on the stack alloc / promotion end of things lately).

@EgorBo
Copy link
Member

EgorBo commented Sep 23, 2024

I wouldn't expect much perf from this PR, unless perhaps the array is in a local (say an arg)... but would expect the enumerator to be on the stack so a bit less allocation.

Ah, forgot that we are also devirtualizing / inlining, so I guess it does help a bit (been focusing on the stack alloc / promotion end of things lately).

well, it does show an improvement EgorBot/runtime-utils#94 (comment) (forgot to add memory analyzer)

@AndyAyersMS
Copy link
Member Author

It's not handling the GDV case right (at least sometimes)... need to dig in a bit more.

The complication here is that GDV resolves the virtual call twice, and
expects to get similar results both times, but for array interface calls
the first devirt switches things over to the nonvirtual SZArrayHelper
method. The second call needs to take pains to exactly match that.
@AndyAyersMS
Copy link
Member Author

The JIT ends up calling resolveVirtualMethod twice (actually 3 times) for a GDV call site:

  1. the first one fails, as we can't devirtualize the site without GDV info
  2. the second one builds the GDV candidate (we won't do GDV unless we know the info will make the call devirtualize)
  3. the third one more or less replays the second one, but also does the transformation of the call. Here we often plug the outputs of (2) as the inputs for (3) and this actually works out ok.

For array interface methods the second call produces outputs that look nothing like the inputs, eg an interface + virtual method is mapped to a concrete class SZArrayHelper and a generic method. So when we get to step (3) we can't just use step (2)'s outputs as the inputs; we need to use the same inputs as step (2). That's more or less what my last commit does.

A separate issue we should look at is why we can't just remember everything from (2) and not reinvoke in (3)... One issue is that (3)'s invocation is done in the middle of impDevirtualizeCall which then goes on to do the transformation... if we split off the latter half of this method perhaps we could just reinvoke that part when we get to step (3).

@AndyAyersMS
Copy link
Member Author

@MihuBot

@AndyAyersMS
Copy link
Member Author

This seems to mostly work, but unfortunately doesn't enable stack allocation. The problem now is that the GetEnumerator calls don't appear inlineable, because we're now checking (2)'s inputs instead of (2)'s outputs when we ask if the GDV candidate can be inlined.

@AndyAyersMS
Copy link
Member Author

Fix for that is to remember the original context in the GDV info, and use that (plus original method) if we're trying to (re)-devirtualize an array interface call.

With that, we can now (once again) inline GetEnumerator and, for non-GDV cases, stack allocate the enumerator object in many cases. We still see the "empty array" path so we can't promote.

For GDV we also see the non-GDV paths lower down, and because of that, we think the array enumerator might escape (escape analysis is flow-insensitive...).

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Sep 25, 2024

A bit more on the above.

The path A->B->B1->D->F is infeasible, but escape analysis doesn't know that, so it thinks that the newobj enumerator escapes, when in fact it doesn't.

image - 2024-09-25T112702 621

Note the path A->C->D->E is feasible (as is A->B->B2->D->E), so even in the inlined code in E it is unclear which object is being accessed: it could be the newobj, the static, or some other other SZGenericArrayEnumerator returned by the interface call.

Without GDV the path A->C->D is gone, which means the type of e is known, which removes the path D->F, but in E there is still ambiguity as to which object is being used (newobj or static).

@AndyAyersMS
Copy link
Member Author

@MihuBot

@AndyAyersMS
Copy link
Member Author

Something is broken in the host testing, it is trying to add symbolic links and these are failing, eg

✘ HostActivation.Tests.SymbolicLinks.Run_apphost_behind_transitive_symlinks(firstSymlinkRelativePath: "a/b/FirstSymlink", secondSymlinkRelativePath: "c/SecondSymlink")​5ms
Error:
System.IO.IOException : Error creating symbolic link at C:\repos\runtime2\artifacts\tests\host\windows.x64.Release\ha\crzkrgns.dvp\symlink\c/SecondSymlink.exe pointing to C:\repos\runtime2\artifacts\tests\host\windows.x64.Release\ha\zo4lgsgj.5fa\HelloWorld\HelloWorld.exe: CreateSymbolicLink failed with error number 1314

Stack trace:
   at Microsoft.DotNet.CoreSetup.Test.SymLink..ctor(String src, String dest) in C:\repos\runtime2\src\installer\tests\TestUtils\SymLink.cs:line 19
   at HostActivation.Tests.SymbolicLinks.Run_apphost_behind_transitive_symlinks(String firstSymlinkRelativePath, String secondSymlinkRelativePath) in C:\repos\runtime2\src\installer\tests\HostActivation.Tests\SymbolicLinks.cs:line 59
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)

@lewing
Copy link
Member

lewing commented Sep 26, 2024

/azp run runtime

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@MichalStrehovsky
Copy link
Member

MichalStrehovsky commented Sep 30, 2024

We now know the type, but it still doesn't lead to devirtualization:

That looks like a separate problem. I have a fix in #108379. With that and your CorInfoImpl.cs change, this devirts fine!

Seems like if nothing else, adjustments are needed in the various ResolveVirtualMethod layers to handle arrays as interfaces, and devirtualization to a generic method.

Native AOT fortunately doesn't do the "non-generic interface call leads to a generic method" thing.

@AndyAyersMS
Copy link
Member Author

We now know the type, but it still doesn't lead to devirtualization:

That looks like a separate problem. I have a fix in #108379. With that and your CorInfoImpl.cs change, this devirts fine!

Seems like if nothing else, adjustments are needed in the various ResolveVirtualMethod layers to handle arrays as interfaces, and devirtualization to a generic method.

Native AOT fortunately doesn't do the "non-generic interface call leads to a generic method" thing.

Nice! Thanks for looking deeper.

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I need the changes to recognize [Intrinsic] after devirtualization too 🙂. Should we run PGO outerloops for it? (and, presumably, NativeAOT outerloop now)

@EgorBo

This comment was marked as resolved.

@EgorBo
Copy link
Member

EgorBo commented Sep 30, 2024

@EgorBot -intel

using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class Bench
{
    IList<int> list = new int[1024];

    [Benchmark]
    public int Test()
    {
        int sum = 0;
        foreach (var i in list)
            sum += i;
        return sum;
    }
}

@MichalStrehovsky
Copy link
Member

Is there something that prevents this kicking in for crossgen2? Do we need to set the "is array devirt" in resolveVirtualMethod under #if READYTORUN?

@AndyAyersMS
Copy link
Member Author

Is there something that prevents this kicking in for crossgen2? Do we need to set the "is array devirt" in resolveVirtualMethod under #if READYTORUN?

I haven't debugged it yet, but here's what the jit sees for R2R:

impDevirtualizeCall: Trying to devirtualize interface call:
    class for 'this' is int[] [exact] (attrib 000c0010)
    base method is System.Collections.Generic.IEnumerable`1[int]::GetEnumerator
--- base class is interface
ResolveVirtualMethod (method 4000000000420040 class 4000000000420088 context 4000000000420049)
--- no derived method: object class could not be cast to interface class
    Class not final or exact
Considering guarded devirtualization at IL offset 9 (0x9)
Too many exact classes implementing System.Collections.Generic.IEnumerable`1[int] (-1 > 1)
Not guessing; no PGO and no exact classes
INLINER: during 'impMarkInlineCandidate' result 'failed this call site' reason 'target not direct' for 'X:Main():int' calling 'System.Collections.Generic.IEnumerable`1[int]:GetEnumerator():System.Collections.Generic.IEnumerator`1[int]:this'
INLINER: during 'impMarkInlineCandidate' result 'failed this call site' reason 'target not direct

@MichalStrehovsky
Copy link
Member

I see, so it gets rejected during devirtualization. Psychic debugging tells me it's because of this:

/// <summary>
/// CoreCLR has no Array`1 type to hang the various generic interfaces off.
/// Return nothing at compile time so the runtime figures it out.
/// </summary>

Should be fine then.

@AndyAyersMS
Copy link
Member Author

@MichalStrehovsky are you going to sign off? You got added as a reviewer.

@MichalStrehovsky
Copy link
Member

@MichalStrehovsky are you going to sign off? You got added as a reviewer.

Only because of the CorInfoImpl.cs change. That diff LGTM.

@AndyAyersMS AndyAyersMS merged commit 6906730 into dotnet:main Oct 2, 2024
154 checks passed
MichalStrehovsky added a commit that referenced this pull request Oct 3, 2024
…08379)

We avoid devirtualizing into abstract classes because whole program view might have optimized away the method bodies and devirtualizing them doesn't lead to anything good.

However, if the whole program view had a non-abstract child of this, we can no longer optimize this out and devirtualization should be fine.

Fixes issue encountered in #108153 (comment)
sirntar pushed a commit to sirntar/runtime that referenced this pull request Oct 3, 2024
Update JIT and runtime to devirtualize interface calls on arrays
over non-shared element types.

Shared types are not (yet) handled.

Add intrinsic and inlining attributes to key methods in the BCL.
This allows the JIT to devirtualize and inline enumerator creation
and devirtualize and inline all methods that access the enumerator.

And this in turn allows the enumerator to be stack allocated in some
simple cases.

However, the enumerator fields are not (yet) physically promoted,
because of an optimization in the BCL to return a static empty
array enumerator. So the object being accessed later is ambiguous.

Alse ensure that since GDV resolves the virtual call twice, and
expects to get similar results both times, things work for the array
case by keeping track of the initial devirtualization inputs.

Progress towards dotnet#62457.
sirntar pushed a commit to sirntar/runtime that referenced this pull request Oct 3, 2024
…tnet#108379)

We avoid devirtualizing into abstract classes because whole program view might have optimized away the method bodies and devirtualizing them doesn't lead to anything good.

However, if the whole program view had a non-abstract child of this, we can no longer optimize this out and devirtualization should be fine.

Fixes issue encountered in dotnet#108153 (comment)
lambdageek pushed a commit to lambdageek/runtime that referenced this pull request Oct 3, 2024
Update JIT and runtime to devirtualize interface calls on arrays
over non-shared element types.

Shared types are not (yet) handled.

Add intrinsic and inlining attributes to key methods in the BCL.
This allows the JIT to devirtualize and inline enumerator creation
and devirtualize and inline all methods that access the enumerator.

And this in turn allows the enumerator to be stack allocated in some
simple cases.

However, the enumerator fields are not (yet) physically promoted,
because of an optimization in the BCL to return a static empty
array enumerator. So the object being accessed later is ambiguous.

Alse ensure that since GDV resolves the virtual call twice, and
expects to get similar results both times, things work for the array
case by keeping track of the initial devirtualization inputs.

Progress towards dotnet#62457.
lambdageek pushed a commit to lambdageek/runtime that referenced this pull request Oct 3, 2024
…tnet#108379)

We avoid devirtualizing into abstract classes because whole program view might have optimized away the method bodies and devirtualizing them doesn't lead to anything good.

However, if the whole program view had a non-abstract child of this, we can no longer optimize this out and devirtualization should be fine.

Fixes issue encountered in dotnet#108153 (comment)
@LoopedBard3
Copy link
Member

Found regressions:
Linux x64: dotnet/perf-autofiling-issues#42759

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
Status: Team User Stories
Development

Successfully merging this pull request may close these issues.

5 participants