-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use clang-12 to fix issue with LLVM_ABI_BREAKING_CHECKS #4760
Use clang-12 to fix issue with LLVM_ABI_BREAKING_CHECKS #4760
Conversation
5194c86
to
00b0907
Compare
Do we need to update LLVM hash in the same PR? |
It's actually not needed since not updating the llvm-hash will mean an old (in the cache) LLVM built with clang-11 but also LLVM_ABI_BREAKING_CHECKS=FORCE_OFF, but I needed to change it in order to test, so I included for completeness.
I'm actually not sure, but I guess that it's because mac-12 should be using a newer version of clang by default. I could not find which one they are using. |
does that mean any Triton build with clang <= 11 or gcc would show the error? |
It sounds risky to update the build flags while keeping the hash, because the prebuilt LLVM stored in Azure would no longer match the build script.
Any build of Triton with clang > 11 or gcc shows the error when linking against LLVM built with clang <= 11 (which is the case before this PR). |
Updated the llvm-hash (after #4791). Everything still green. Waiting for your input @ThomasRaoux :) |
sorry for the delay, I was OOO.
I'm still trying to understand that part. So right LLVM built with clang <= 11 is broken unless we set I thought the problem was that this preprocessor mismatches between LLVM build and triton?
Did I get that wrong? |
That's correct. You can't pass hash containers across a component A built with clang <= 11 and a different component B built with clang > 11 or gcc, unless one uses DLLVM_ABI_BREAKING_CHECKS=FORCE_OFF. Currently A=llvm and B=Triton, but it could also be the other way around.
No, you didn't get that wrong. This is the problem. |
Thanks for clarifying. So based on that if we were to keep |
The motivation is that we want to have checks. NDEBUG builds automatically disable LLVM_ABI_BREAKING_CHECKS, but we currently build with assertions, so ABI breaking checks should be enabled as well. |
I see your point. What kind of problem could that catch? |
actually, would that only break triton when built with clang <= 11? If so maybe we just need to require clang > 11 which would be reasonable |
Yes, only the issue is clang <= 11 on one side vs clang > 11 or gcc on the other. If we don't use clang <= 11 for LLVM or Triton, everything is fine. Clang 11 is the default for the GitHub Ubuntu 20.04 runner image though, so there is a chance that people will run into this. I can look into making the build fail explicitly for bad compiler combinations. |
That would be great. I think we should at least make clang minimum version to 12. It should be easy to do in cmake |
9ef20cc
to
c34395c
Compare
The issue raised in triton-lang#4212 (comment) and temporary fixed in triton-lang#4512 has now been fixed upstream. The non-deterministic seed inside the hashing function (included in this llvm commit: llvm/llvm-project@ce80c80) was disabled for clang <= 11 for non-pic builds. This was removed again in xgupta/llvm-project@5255b81 and we can enable ABI-breaking checks again. We build LLVM with PIC. If anyone ever builds with non-PIC, they will get a compilation error and need to disable LLVM_ABI_BREAKING_CHECKS manually.
c34395c
to
78050f1
Compare
The non-pic workaround in LLVM has been removed (llvm/llvm-project#110952). We can therefore remove the |
We should still make clang 12 a minimum requirement to build triton or is this not needed? |
No, not needed. I doubt anyone will build Triton with non-pic, and if anyone does and happens to use clang-11 for that it will produce a compile error. Specifically, the risk of silent ODR-violations no longer exists. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks good thanks for explaining the problem in detail. I'm probably doing something wrong but I can't find the hash 5255b819c1b117e6048f354c035f5d36f3afad2b in llvm?
cmake/llvm-hash.txt
Outdated
@@ -1 +1 @@ | |||
61f8a7f618901797ee8663389a29722f29216a96 | |||
5255b819c1b117e6048f354c035f5d36f3afad2b |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird I can't find this hash in llvm repo?
Yeah, sorry. I grabbed the wrong commit. The right one is |
This commit is fixing the root cause of the issue that appeared during an llvm update, and was raised in #4212 (comment).
The issue was temporary fixed in #4512, where we set LLVM_ABI_BREAKING_CHECKS to FORCE_OFF constant in order to not use a non-deterministic seed inside the hashing function (included in this llvm commit: llvm/llvm-project@ce80c80).
A further investigation (with chsigg@) found that the issue is that LLVM is built with clang11, while Triton use a newer version. The ABI issue is brought up here: llvm/llvm-project#96282 (comment), but the consensus seemed to be that this setup is rare.
Updated the clang version to 12 in the ubuntu build fixed the issue and therefore we can revert setting LLVM_ABI_BREAKING_CHECKS constant. I am additionaly erasing LLVM_ABI_BREAKING_CHECKS in the setup of the other hardwares (it seems it was not needed) and I am splitting ubuntu and macOS configurations because it seems cleaner than having a variable that sets the compiler version for each of them.