-
Notifications
You must be signed in to change notification settings - Fork 604
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bazel: upgrade to clang-19 and fix sanitizers #25120
base: dev
Are you sure you want to change the base?
Conversation
"@seastar//:use_system_allocator": [], | ||
"//conditions:default": ["@platforms//:incompatible"], | ||
"@seastar//:use_system_allocator": ["@platforms//:incompatible"], | ||
"//conditions:default": [], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, i wonder why it was working before... that's concerning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the default build uses the seastar allocator still, so it was just always disabled
src/v/rpc/rpc_compiler.py
Outdated
{{service_name}}_service_base({{service_name}}_service_base&& o) noexcept | ||
: _sc(std::move(o._sc)), _ssg(std::move(o._ssg)), _methods(std::move(o._methods)) {} | ||
: _sc(std::move(o._sc)), _ssg(std::move(o._ssg)) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We must never move these RPC services because this would be very
buggy otherwise.
was the intention to instead delete the move consturctor?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm game for deleting both the move constructor and move assignment operator - I don't think either are sound for _methods
. I do think we don't need to move the methods and it's fine with this change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah what I was trying to say was, today in the code base there is no way we in practice move any services today, because there would be mass bugs with rpcs invoking the wrong object. Maybe we should formalize this and delete the move ctor and assignment operator
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should formalize this and delete the move ctor and assignment operator
Yes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pushed a commit to remove them.
7e3c785
to
64949d6
Compare
64949d6
to
35318ed
Compare
"linux-x86_64": ["http://redpanda-core-toolchain.s3-website-us-east-1.amazonaws.com/llvm-19.1.1-ubuntu-22.04-x86_64.tar.gz"], | ||
}, | ||
llvm_version = "19.1.7", | ||
# Currently our custom build of clang causes false positives with sanitizers, so we use the default clang distribution. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh, bummer. I guess something about the way we build it causes this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah exactly, I'm not sure what. We may have to go back to custom builds and figure out what flag is calling this, because in CI we need older libraries, but locally I need newer ones
@@ -1 +1,2 @@ | |||
leak:seastar::net::conntrack::handle::~handle | |||
leak:fips/self_test.c |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note: self tests are run once per shard and, unfortunately, are within the validated section of the OpenSSL FIPS module which means we cannot touch it and maintain validation.
It looks like CI cpp lint (which I assume uses clang-format-18) passes against the changes in 35318ed. Do we need to ensure that core knows to use clang-format in bazel rather than in vtools? Should we remove |
Yeah people should use |
@@ -240,8 +240,8 @@ redpanda_cc_btest( | |||
"batch_cache_reclaim_test.cc", | |||
], | |||
target_compatible_with = select({ | |||
"@seastar//:use_system_allocator": [], | |||
"//conditions:default": ["@platforms//:incompatible"], | |||
"@seastar//:use_system_allocator": ["@platforms//:incompatible"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a WHY to the commit messages 🙇♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Should apply as well to buildkite? |
yeh i think it runs as a github action? |
redpanda/.github/workflows/lint-cpp.yml Line 38 in ceae5b0
|
This was incorrectly configured, the test relies on the seastar allocator's reclaim ability, which is required to pass. As is this always is disabled on CI but we only want to disable the test if the system allocator is being used because the reclaim ability is not available.
_methods is an array of non_copyable functions that have `[this]` pointers captured, we cannot move them into a new object because the `this` pointer become invalidated and pointing to the old object. We must never move these RPC services because this would be very buggy otherwise.
clang 19 is more strict on this
clang 19 is more strict here
libc++ 19 is more strict, but it requires types are copy assignable, and according to the standard copy assignable types must be move asssignable as well. https://en.cppreference.com/w/cpp/named_req/CopyAssignable
In getting the sanitizers work in openssl tests with fips enabled we need to suppress leaks there. The benchmarks are full of them, and I assume these are not caught in CMake because benchmarks don't run with sanitizers.
Upgrade to the latest clang. The real motivation here is that we want the sanitizers to work, however for some reason the custom built clang we use has false positives with coroutines and the sanitizers. However clang-18 is compiled against a super old version of ubuntu and missing libraries are missing (namely `libtinfo.5.so`) and not available on recent ubuntu and fedora distributions. Since we're upgrading the clang-19 we need to address a few deprecations in libc++, so we upgrade seastar for this.
In bazel we test benchmarks by running them once, well 1000 batches is really slow in debug mode, so running once doesn't do anything. Instead change the number of batches in debug mode to make the benchmark fast.
This isn't really safe because pointers are taken to `this`, we don't use these move operators anyways and they are a big footgun, so just remove them.
44213ea
to
082c967
Compare
CI test resultstest results on build#62070
|
Backports Required
Release Notes