Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch gtest to terminate as soon as it detects that a ScopedTrace has migrated to another worker thread #1258

Merged
merged 2 commits into from
Jan 29, 2025

Conversation

msimberg
Copy link
Collaborator

See first commit message for more details.

I've added a temporary commit which explicitly adds more yields while holding a ScopedTrace to check that it actually works in CI.

I've opted for a local patch which I think is easily updated if we update the gtest tag. An alternative strategy could be to apply the patch through the spack package and I don't have a strong preference either way, other than me not (yet) knowing how to apply the patch with spack to something that is pulled only at the CMake configure stage. One benefit of patching gtest like this is that it will always apply, even if one uses a spack build-env which doesn't run CMake through spack.

@msimberg msimberg requested a review from rasolca January 14, 2025 15:22
@msimberg msimberg self-assigned this Jan 14, 2025
@msimberg msimberg mentioned this pull request Jan 14, 2025
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg force-pushed the gtest-terminate-on-thread-migrate branch from 78fdf9c to a15bb32 Compare January 16, 2025 16:42
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg force-pushed the gtest-terminate-on-thread-migrate branch from a15bb32 to 83a4712 Compare January 16, 2025 17:00
@msimberg
Copy link
Collaborator Author

cscs-ci run

external/CMakeLists.txt Outdated Show resolved Hide resolved
@msimberg msimberg force-pushed the gtest-terminate-on-thread-migrate branch from 83a4712 to 07cf5f1 Compare January 17, 2025 16:31
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg msimberg force-pushed the gtest-terminate-on-thread-migrate branch from 07cf5f1 to 64e970c Compare January 17, 2025 19:12
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

The patch seems to be correctly applied now, but can't be tested because CI needs to be moved to alps.

Copy link
Collaborator

@rasolca rasolca left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but still open TODO

…read different than where it was destroyed

gtest ScopedTrace/SCOPED_TRACE uses thread local stacks to keep track of traces. When combined with pika
tasks that may yield and be stolen onto other worker threads, destroying a ScopedTrace on a different
thread can segfault, cause hangs, or simply corrupt data. Instead of allowing that to happen every now
and then we instead patch gtest to check if this happens and terminate as soon as it's detected. Use of
ScopedTrace should be avoided in situations where tasks can yield. This patch does not fix the issue, but
helps recognizing when e.g. a segfault is the cause of "only" improper use of ScopedTrace or if it is a
result of other algorithmic or implementation issues.
@msimberg msimberg force-pushed the gtest-terminate-on-thread-migrate branch from 64e970c to a8ad787 Compare January 27, 2025 17:00
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

Looks good, but still open TODO

Yep, I still want to confirm that it does the right thing in CI. Now that #1262 is merged we should get some results soon enough...

@msimberg msimberg force-pushed the gtest-terminate-on-thread-migrate branch from a8ad787 to 1ca6fc5 Compare January 28, 2025 10:53
@msimberg
Copy link
Collaborator Author

Seems to be working as intended: https://gitlab.com/cscs-ci/ci-testing/webhook-ci/mirrors/4700071344751697/7514005670787789/-/jobs/8962660045#L2249. I've removed the temporary commit.

@msimberg
Copy link
Collaborator Author

cscs-ci run

@rasolca rasolca merged commit f112e64 into eth-cscs:master Jan 29, 2025
4 of 5 checks passed
github-actions bot pushed a commit that referenced this pull request Jan 29, 2025
@msimberg msimberg deleted the gtest-terminate-on-thread-migrate branch January 29, 2025 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

3 participants