Skip to content

Improve support for 512-bit Vector<T> #111472

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Mar 20, 2025
Merged

Conversation

saucecontrol
Copy link
Member

@saucecontrol saucecontrol commented Jan 15, 2025

Fixes #104978
Contributes to #108869

  • Fixes tests that had bogus failures (e.g. not enough test data)
  • Adds tests to cover some existing broken paths
  • Fixes JIT and CoreLib paths that did not support 512-bit Vector<T> or assumed AVX2 support meant Vector<T> was 256-bit
  • Enables support for --max-vectort-bitwidth:512 in AOT and adds a smoke test for it
  • Ensures Vector<T> is no larger than PreferredVectorBitWidth

Tests pass locally (AVX-512 machine) with DOTNET_MaxVectorTBitWidth=512 and DOTNET_MaxVectorTBitWidth=128

I believe this also addresses any problems that prevented opportunistic enablement of AVX2 on AOT when AVX is enabled, but I'll do some more testing around that and make a separate PR for that change if all is well.

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 15, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Jan 15, 2025
Copy link
Member Author

@saucecontrol saucecontrol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CC @tannergooding. This is ready for review.

Would it be all right to add the alt-sized Vector<T> configs to one of the jitstress-isas pipelines?

@@ -17,6 +17,6 @@
</PropertyGroup>

<ItemGroup>
<Compile Include="CpuId.cs" />
<Compile Include="../../../JIT/HardwareIntrinsics/X86/X86Base/CpuId.cs" />
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The local copy of these tests had been bit rotting for a while. I could bring it up to date, but it seems reasonable to just share it with JIT.

succeeded &= VectorByteCount == 32;
// MaxVectorTBitWidth env variable can be used to change Vector<T> size.
// We can only assume it is at least 16 bytes.
succeeded &= VectorByteCount >= 16;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept changes to the regression tests minimal -- this is handled more robustly in the CpuId tests.

@tannergooding
Copy link
Member

/azp run runtime-coreclr jitstress-isas-x86

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@xtqqczze
Copy link
Contributor

xtqqczze commented Jan 17, 2025

We do not guarantee alignment of writes in SpanHelpers.Fill<T>, which means using an AVX-512 path could potentially regress performance due to stores that cross a cache line boundary (64 bytes). Since AVX-512 operates with a width of 64 bytes, any misalignment of writes ensures that a cache line split will occur.

@tannergooding
Copy link
Member

which means using an AVX-512 path could potentially regress performance due to stores that cross a cache line boundary (64 bytes)

That's an issue with the code using Vector<T> without either pinning or attempting opportunistic alignment. The issue already exists due to the variable sized nature of Vector<T> and us not guaranteeing a maximize size.

It's not a blocker for this PR which enables opt-in usage of Vector<T> as Vector512<T>, nor would it be an issue for say Arm64 SVE where Vector<T> can be up to 2048 bits (although no such hardware exists today).

@BruceForstall
Copy link
Contributor

@saucecontrol Are you waiting for review, or are you still working on this? @tannergooding You've looked at this before; can you look again?

@BruceForstall
Copy link
Contributor

(@saucecontrol you should probably merge up to main)

@saucecontrol
Copy link
Member Author

@saucecontrol Are you waiting for review, or are you still working on this?

It's ready on my end. Never did get a clean CI run, but no failures appeared to be related. Let's see how it does after merging back to main.

if (Vector<uint>.Count == 8)
if (Vector<uint>.Count == 16)
{
// TODO: P/Invoke tests do not yet handle 512-bit Vector<T>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to either be handled or have a tracking issue that is picked up shortly after merge in a follow up PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worded that comment oddly. The issue is not specific to Vector<T> but rather Vector512 in general isn't covered by these interop tests. Do we have something tracking that already?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't believe so. These tests are really centered around Vector and most of the V128/256/512 tests are elsewhere and newer/different

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are under tests/Interop/PInvoke/Generics, which AFAICT cover scenarios involving all the generic types that have specific ABI handling. What I mean is that we don't currently have any tests for Vector512 ABI handling. Do we even have support for that?

Copy link
Member

@tannergooding tannergooding Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we even have support for that?

Yesn't. It's expected to throw today, but the tests are still setup since there's only a single known issue actually blocking the support (which is that the default calling convention for returning vectors on Windows should be in register, not via a return buffer).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the tests are still setup

This is the part I'm confused about. Vector512 is conspicuously absent from the list of types covered by the tests under https://github.com/dotnet/runtime/tree/main/src/tests/Interop/PInvoke/Generics.

I'm happy to open a tracking issue for adding them, but I would have assumed that would be an implicit part of enabling the ABI support.

I'm now concerned we may be missing some coverage, though. If Vector512 used in P/Invoke is supposed to throw, we'd need to be sure the same happens for 512-bit Vector<T>, but I don't see where we're testing either.

@tannergooding
Copy link
Member

CC. @dotnet/jit-contrib, @EgorBo, @BruceForstall for secondary review

@BruceForstall BruceForstall merged commit 67c10c8 into dotnet:main Mar 20, 2025
153 of 155 checks passed
@saucecontrol saucecontrol deleted the vectort512 branch March 20, 2025 00:38
@github-actions github-actions bot locked and limited conversation to collaborators Apr 19, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOTNET_PreferredVectorBitWidth is not respected by Vector<T>
5 participants