std.int128 unittest failure with -O1 -mcpu=x86-64-v3 since #4892 #4916

Open
the-horo opened this issue Apr 27, 2025 · 10 comments

@the-horo
Contributor

1.41.0-beta1 is failing one of its unittests for me:

$ ctest -V -R std.int128
UpdateCTestConfiguration  from :/home/happy/tmp/ldc-build/DartConfiguration.tcl
UpdateCTestConfiguration  from :/home/happy/tmp/ldc-build/DartConfiguration.tcl
Test project /home/happy/tmp/ldc-build
Constructing a list of tests
Done constructing a list of tests
Updating test list for fixtures
Added 0 tests to meet fixture requirements
Checking test dependency graph...
Checking test dependency graph end
test 370
    Start 370: std.int128-shared

370: Test command: /home/happy/tmp/ldc-build/runtime/phobos2-test-runner-shared "std.int128"
370: Working Directory: /home/happy/tmp/ldc-build/runtime
370: Test timeout computed to be: 10000000
370: ****** FAIL release64 std.int128
370: core.exception.AssertError@std/int128.d(574): Int128(Cent(14, 0)) != Int128(Cent(15, 0))
370: ----------------
370: ??:? _d_assert_msg [0x7f703c761383]
370: ??:? pure nothrow @nogc @safe void std.int128.__unittest_L521_C1() [0x7f703d7507c7]
370: ??:? [0x7f703d751a8f]
370: ??:? [0x5632a7c2c697]
370: ??:? [0x5632a7c2c53f]
370: ??:? [0x5632a7c2c451]
370: ??:? runModuleUnitTests [0x7f703c784173]
370: ??:? void rt.dmain2._d_run_main2(char[][], ulong, extern (C) int function(char[][])*).runAll() [0x7f703c79f78a]
370: ??:? _d_run_main2 [0x7f703c79f5a6]
370: ??:? _d_run_main [0x7f703c79f38c]
370: ??:? [0x7f703c4bd16d]
370: ??:? __libc_start_main [0x7f703c4bd228]
370: ??:? [0x5632a7c2c1c4]
1/2 Test #370: std.int128-shared ................***Failed    0.01 sec
test 813
    Start 813: std.int128-debug-shared

813: Test command: /home/happy/tmp/ldc-build/runtime/phobos2-test-runner-debug-shared "std.int128"
813: Working Directory: /home/happy/tmp/ldc-build/runtime
813: Test timeout computed to be: 10000000
813: 0.000s PASS debug64 std.int128
2/2 Test #813: std.int128-debug-shared ..........   Passed    0.01 sec

The following tests passed:
	std.int128-debug-shared

50% tests passed, 1 tests failed out of 2

Total Test time (real) =   0.03 sec

The following tests FAILED:
	370 - std.int128-shared (Failed)
Errors while running CTest
Output from these tests are in: /home/happy/tmp/ldc-build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.

I configured the project with -DD_FLAGS_RELEASE="-O1;-mcpu=x86-64-v3".

@kinke
Member

kinke commented Apr 27, 2025

Reduced:

import std.int128;

void main() {
    auto c = Int128(5, 6);
    c *= Int128(10, 20);
    c /= Int128(10, 20);
    assert(c == Int128(0, 15));
}

Works fine with -O or -mcpu=x86-64-v3 alone, but fails with both together (-O -mcpu=x86-64-v3), at least with LLVM 19 as used by the v1.41.0-beta1 package: https://d.godbolt.org/z/nP93WM6jK. The preceding multiplication is required for the failure to show up.

If this isn't an LLVM bug, it might be caused by incomplete inline-asm clobbers in https://github.com/dlang/dmd/blob/5aee9a9eb57d2566eca8c05ebbd9a798bd3645ea/druntime/src/core/int128.d#L640-L646.
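
For anyone wanting to check the three flag combinations locally, here is a minimal sketch. It assumes the reduced program above is saved as `repro.d` (a hypothetical file name) and that `ldc2` is on `PATH`; the helper names `flag_sets` and `run_matrix` are illustrative only. A compile error is also reported as FAIL, so check the compiler output if that happens.

```shell
#!/bin/sh
# Flag sets from the comment above: -O alone, -mcpu alone, and both together.
flag_sets() {
    printf '%s\n' "-O" "-mcpu=x86-64-v3" "-O -mcpu=x86-64-v3"
}

# Compile and run repro.d (hypothetical file name) once per flag set.
run_matrix() {
    flag_sets | while IFS= read -r flags; do
        # $flags is intentionally unquoted so it splits into separate options.
        # stdin is redirected so the child processes cannot eat the loop input.
        if ldc2 $flags -of=repro_bin repro.d < /dev/null \
            && ./repro_bin < /dev/null; then
            echo "PASS: $flags"
        else
            echo "FAIL: $flags"
        fi
    done
}

# Only run when the compiler and the source file are actually available.
if command -v ldc2 >/dev/null 2>&1 && [ -f repro.d ]; then
    run_matrix
else
    echo "skipped: ldc2 or repro.d not found"
fi
```

On an affected machine, the report above suggests only the last combination would print FAIL; other CPUs may pass all three.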

@kinke
Member

kinke commented Apr 27, 2025

Seems to work with LLVM 20, at least the isolated test case.

Edit: Hmm, not sure, as I can't reproduce the failure on my Intel Raptor Lake CPU when using the prebuilt v1.41.0-beta1 binaries...

@the-horo
Contributor Author

I'm on an AMD Ryzen 7 5825U if it matters

@the-horo
Contributor Author

LLVM 15 is failing

@the-horo
Contributor Author

The prebuilt packages fail for me too, so it is enough to apply the flags to the test program; the runtime doesn't have to be built with them.

@kinke
Member

kinke commented Apr 27, 2025

How's -mcpu=native for you? That made it work again on godbolt (once), but I'm not sure the godbolt runner CPUs are stable.

@the-horo
Contributor Author

-mcpu=native is what originally made it fail but I tried to trim down the attributes:

$ ldc2 -mcpu=native -vv
Targeting 'x86_64-pc-linux-gnu' (CPU 'znver3' with features '+prfchw,-cldemote,+avx,+aes,+sahf,+pclmul,-xop,+crc32,+xsaves,-avx512fp16,-usermsr,-sm4,-egpr,+sse4.1,-avx512ifma,+xsave,+sse4.2,-tsxldtrk,-sm3,-ptwrite,-widekl,+invpcid,+64bit,+xsavec,-avx10.1-512,-avx512vpopcntdq,+cmov,-avx512vp2intersect,-avx512cd,+movbe,-avxvnniint8,-ccmp,-amx-int8,-kl,-avx10.1-256,-sha512,-avxvnni,-rtm,+adx,+avx2,-hreset,-movdiri,-serialize,+vpclmulqdq,-avx512vl,-uintr,-cf,+clflushopt,-raoint,-cmpccxadd,+bmi,-amx-tile,+sse,-gfni,-avxvnniint16,-amx-fp16,-ndd,+xsaveopt,+rdrnd,-avx512f,-amx-bf16,-avx512bf16,-avx512vnni,-push2pop2,+cx8,-avx512bw,+sse3,+pku,+fsgsbase,+clzero,+mwaitx,-lwp,+lzcnt,+sha,-movdir64b,-ppx,+wbnoinvd,-enqcmd,-avxneconvert,-tbm,-pconfig,-amx-complex,+ssse3,+cx16,+bmi2,+fma,+popcnt,-avxifma,+f16c,-avx512bitalg,+rdpru,+clwb,+mmx,+sse2,+rdseed,-avx512vbmi2,-prefetchi,+rdpid,-fma4,-avx512vbmi,+shstk,+vaes,-waitpkg,-sgx,+fxsr,-avx512dq,+sse4a')
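
One possible way to trim the attributes further (not something tried in this thread) is to start from the x86-64-v2 baseline and add back one x86-64-v3 feature at a time with `-mattr`. A rough sketch, assuming the reduced program is saved as `repro.d` (a hypothetical file name), that `ldc2` is on `PATH`, and that the LLVM version in use knows the `x86-64-v2` CPU name; `v3_features` and `try_feature` are illustrative helper names.

```shell
#!/bin/sh
# Features that x86-64-v3 adds on top of x86-64-v2, using the names that
# appear in the `ldc2 -mcpu=native -vv` output above.
v3_features() {
    printf '%s\n' avx avx2 bmi bmi2 f16c fma lzcnt movbe xsave
}

# Build and run repro.d (hypothetical file name) with the v2 baseline plus
# a single extra feature. A compile error is reported as FAIL as well.
try_feature() {
    feature=$1
    if ldc2 -O1 -mcpu=x86-64-v2 "-mattr=+$feature" -of=repro_bin repro.d < /dev/null \
        && ./repro_bin < /dev/null; then
        echo "PASS: +$feature"
    else
        echo "FAIL: +$feature"
    fi
}

# Only run when the compiler and the source file are actually available.
if command -v ldc2 >/dev/null 2>&1 && [ -f repro.d ]; then
    for f in $(v3_features); do
        try_feature "$f"
    done
else
    echo "skipped: ldc2 or repro.d not found"
fi
```

A clean run here would not rule the features out: the failure may need several of them together, or may depend on something other than the feature set.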

@kinke
Member

kinke commented Apr 27, 2025

On godbolt, I get znver3 too for some attempts. And there -O1 -mcpu=native fails, but -O3 -mcpu=native passes again.

[FWIW, the isolated testcase works because the std.int128 binops are templates, and core.int128 is newly pragma(inline, true) for LDC, so the relevant code is all emitted during compilation with those flags.]

On my laptop's Intel i7-13700H, I haven't managed to make it fail so far.

@the-horo
Contributor Author

LLVM 20 also fails (I've used #4911). -O3 fails too, with both x86-64-v3 and native.

@kinke
Member

kinke commented Apr 29, 2025

On an AMD Ryzen 3960X (Zen 2), I can reproduce the failures with the prebuilt v1.41.0-beta1 (FWIW, on Ubuntu 24, same as my laptop), with both x86-64-v3 and native.
