Universal abi used in amm functions, no need for OS differentiation #1869
Hey Dan, thanks for looking into this! Could you please re-enable the code path (https://github.com/aws/aws-lc/blob/main/crypto/fipsmodule/cpucap/internal.h#L144) to test the change in the CI?
2e8d424 does include that change!
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1869 +/- ##
==========================================
- Coverage 78.68% 78.67% -0.02%
==========================================
Files 585 585
Lines 100854 100861 +7
Branches 14298 14298
==========================================
- Hits 79357 79351 -6
- Misses 20863 20875 +12
- Partials 634 635 +1
### Description of changes:
This extends the work done on #1869.
* Updates to latest version of Intel SDE.
* Fixes minor bug in Go test script.
* Disables use of AVX512 IFMA on Windows.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.

---------

Co-authored-by: Dan Pittman <[email protected]>
Co-authored-by: Justin Smith <[email protected]>
I have been able to reproduce the issue, but it remains inconsistent. Sometimes it even happens with MSVC 2022, but not always; I’ve never seen it happen with MinGW. I have stepped through the code with WinDbg and gdb, depending on which compiler I use, and when using an MSVC compiler, these lines in
CHECK_ABI(rsaz_amm52x20_x2_ifma256, &res, &a, &b, &m, k2);
CHECK_ABI(extract_multiplier_2x20_win5, &red_Y, red_table2k, idx1, idx2);
When I hit the first line I copied above, this is what
What I would expect for un-zeroed malloc’d memory. When I step over that line, despite it not using
The address is garbage now, and when we hit the first access[3], SEH c0..5 is raised—an access fault. Through the help of a data breakpoint, I’ve found that, with MSVC compilers, the
TL;DR: The SDE tests hung because I blew the stack, and that specific type of failure was not caught by the Go automation. When I moved to using heap memory for the tables, but nothing else, I no longer blew the stack, but the assembly routines were wrecking it near the
On a c7i instance with Amazon Linux 2023 (meaning GCC 11.4.1 tooling and IFMA instructions), we ran into this issue! However, on an Ubuntu box with GCC 11.4.0 tooling, we couldn't reproduce it...
During the review process, I mistakenly believed that the `@_6_arg_universal_ABI` did not account for Windows, as it conspicuously matches the Linux calling convention. Additionally, Windows does not have 6 registers in its calling convention, only 4; args 5 and 6 would be pushed onto the stack, so what I had here was wrong in two ways.

While debugging, I could see that this was segfaulting during a read of a pointer in `%r11`; this was what tipped me off.

Running `crypto_test` was successful locally on:
On an SPR-based system. I have not tested this change on Linux locally yet, but I will in the next few hours.
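
To make the calling-convention difference mentioned above concrete, here is a minimal C sketch that reports where each integer or pointer argument lives at function entry under the System V AMD64 and Windows x64 conventions. It is illustrative only and is not code from AWS-LC: the names `abi_t`, `arg_register`, `arg_stack_offset`, and `demo_six_arg_call` are hypothetical, and the model covers only integer/pointer arguments, not floating-point or aggregate ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical enum for this sketch: the two x86-64 conventions compared. */
typedef enum { ABI_SYSV_AMD64, ABI_WIN64 } abi_t;

/* Integer/pointer argument registers, in argument order. */
static const char *const kSysVRegs[] = {"rdi", "rsi", "rdx", "rcx", "r8", "r9"};
static const char *const kWin64Regs[] = {"rcx", "rdx", "r8", "r9"};

/* Returns the register holding argument |index| (0-based), or NULL if the
 * argument is passed on the stack. */
const char *arg_register(abi_t abi, size_t index) {
  if (abi == ABI_SYSV_AMD64) {
    return index < 6 ? kSysVRegs[index] : NULL;
  }
  return index < 4 ? kWin64Regs[index] : NULL;
}

/* Returns the byte offset from %rsp at function entry (before any prologue)
 * of a stack-passed argument, or -1 if the argument is in a register.
 * [rsp] holds the return address in both conventions. */
long arg_stack_offset(abi_t abi, size_t index) {
  if (arg_register(abi, index) != NULL) {
    return -1;
  }
  if (abi == ABI_SYSV_AMD64) {
    /* Stack arguments start right after the return address. */
    return 8 + 8 * (long)(index - 6);
  }
  /* Windows x64: the caller reserves 32 bytes of shadow space for the four
   * register arguments, so argument n sits at rsp + 8 + 8*n. */
  return 8 + 8 * (long)index;
}

/* Prints where the six arguments of a 6-argument routine would be found. */
void demo_six_arg_call(void) {
  for (size_t i = 0; i < 6; i++) {
    const char *sysv = arg_register(ABI_SYSV_AMD64, i);
    const char *win = arg_register(ABI_WIN64, i);
    assert(sysv != NULL); /* all six fit in registers on System V */
    if (win != NULL) {
      printf("arg %zu: SysV %%%s, Win64 %%%s\n", i + 1, sysv, win);
    } else {
      printf("arg %zu: SysV %%%s, Win64 [rsp+%ld]\n", i + 1, sysv,
             arg_stack_offset(ABI_WIN64, i));
    }
  }
}
```

Calling `demo_six_arg_call()` shows that arguments 5 and 6 are in `%r8` and `%r9` under System V but on the stack under Windows x64, which is why a 6-argument assembly routine that assumes the Linux register layout would read the wrong values on Windows unless the arguments are remapped.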
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and the ISC license.