NativeAOT: Run-time simd checking #68110
Tagging subscribers to this area: @JulieLeeMSFT
Issue Details
AFAIK NativeAOT just targets some low cpu instruction set. Like:
if (Avx2.IsSupported) { // not a JIT constant, an actual getter.
    for (...) ExpensiveCallWithAvx2();
} else {
    for (...) ExpensiveCall();
}
Obviously don't emit them if it will not provide a massive performance boost. ... Yet I don't know how much those instruction sets add performance ... Hold on... Bepuphysics2 may benefit from it! (I think)
Related: #68038
The compiler already generates runtime checks wherever possible (e.g. Ssse3.IsSupported is a runtime check with default settings). Avx.IsSupported cannot be a runtime check because of how ISA support is structured in RyuJIT: not all use of VEX encoding in RyuJIT happens under IsSupported checks.
Would the expectation be that RyuJIT would know only to use VEX encoding on paths that have checked |
When Avx+ instruction set is allowed, the VEX encoding can be used implicitly by RyuJIT pretty much anywhere, e.g. even for zeroing locals in the method prolog. I think that the simplest variant of this would be for RyuJIT to only use the VEX encoding for explicit Avx and similar hardware intrinsics, and nothing else.
Yes, that would be a more advanced version. We would need a profitability function that decides where the code duplication is worth it.
Could be done with profiles?!
Run-time simd checks can be beneficial in long-running cases (SpanHelpers.SequenceEqual, for example).
Run-time simd checks can hurt performance because of redundant checks...
Could it be fixed? E.g.:
Can be transformed into:
This could be done by inlining, but it won't work for the most part...
The compiler should somehow produce unique codegen for every simd level for every subsequent method:
Will produce:
But this unbiased code generation will cause a dramatic size increase.
Could PGO possibly be used to know where checks are actually useful?
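The code samples from the issue body did not survive on this page. As a rough illustration of the kind of transformation being asked about, here is a minimal sketch that assumes Avx2.IsSupported is a real run-time getter, as it would be for a check the AOT compiler cannot fold. The names HoistedCheckSketch, AddPerBlockCheck, AddHoisted, AddBlockAvx2 and AddBlockScalar are hypothetical and are not taken from the runtime or from the original issue.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

static class HoistedCheckSketch
{
    const int Block = 8; // number of ints in one 256-bit vector

    // Scalar fallback: dst[i] += src[i] for one block.
    static void AddBlockScalar(Span<int> dst, ReadOnlySpan<int> src)
    {
        for (int i = 0; i < dst.Length; i++) dst[i] += src[i];
    }

    // Must only be called when Avx2.IsSupported is true and both spans
    // hold exactly Block ints.
    static void AddBlockAvx2(Span<int> dst, ReadOnlySpan<int> src)
    {
        ref Vector256<int> d = ref MemoryMarshal.Cast<int, Vector256<int>>(dst)[0];
        d = Avx2.Add(d, MemoryMarshal.Cast<int, Vector256<int>>(src)[0]);
    }

    // "Before": the check is repeated for every block.
    public static void AddPerBlockCheck(Span<int> dst, ReadOnlySpan<int> src)
    {
        for (int i = 0; i + Block <= dst.Length; i += Block)
        {
            if (Avx2.IsSupported) AddBlockAvx2(dst.Slice(i, Block), src.Slice(i, Block));
            else AddBlockScalar(dst.Slice(i, Block), src.Slice(i, Block));
        }
    }

    // "After": one check selects a whole specialized loop, at the cost of
    // duplicating the loop body once per instruction set.
    public static void AddHoisted(Span<int> dst, ReadOnlySpan<int> src)
    {
        if (Avx2.IsSupported)
        {
            for (int i = 0; i + Block <= dst.Length; i += Block)
                AddBlockAvx2(dst.Slice(i, Block), src.Slice(i, Block));
        }
        else
        {
            for (int i = 0; i + Block <= dst.Length; i += Block)
                AddBlockScalar(dst.Slice(i, Block), src.Slice(i, Block));
        }
    }
}
```

Both methods assume src is at least as long as dst, and both leave any trailing elements beyond a multiple of Block untouched to keep the sketch short. The duplicated loop in AddHoisted is the size cost the issue worries about: every extra instruction set multiplies the specialized copies, which is why a profitability function or PGO data would be needed to decide where the duplication is worth it.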
category:cq
theme:ready-to-run
skill-level:expert
cost:medium
impact:small