-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: intermittent segfault in PyPy #3089
Conversation
ianna
commented
Apr 22, 2024
•
edited
Loading
edited
b682f40
to
4c40472
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nope.
63243ba
to
e4baf54
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jpivarski - removing coverage from the PyPy test job cures the segfault! I think, you can merge this PR to fix the failing jobs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most of the diff in this PR removes the unnecessary length
argument from awkward_unique_ranges
, which can't be responsible for the PyPy segfault unless the length argument had been passed inconsistently, and now it's consistent. (It doesn't look that way—it looks to me like a neutral-refactoring.)
Another part of this diff is explicit integer casts in the C code. That also should be unrelated to PyPy segfaults, since it's not in the part of the code that's different between PyPy and CPython. Perhaps PyPy's ctypes is different, passing data as a different integer type than intended, but these are conversion-casts in the C code, not reinterpret-casts.
I don't see anything in this PR that explains why PyPy was segfaulting, or would cause PyPy to stop segfaulting. But if it's not segfaulting now, I'm definitely willing to merge this update and see what happens to the PyPy test for the next few weeks. (Everything in this PR is a net-positive, regardless of whether it fixes the segfault or not.) If the segfault is fixed, then it's likely something that I didn't catch by eye, but is a correction nonetheless.
@ianna, do you have a theory about what was causing the segfault before and not now? Some line of code in the diff that you can point to? Either way, go ahead and merge this PR and we'll see what happens. I'll even propagate it out to the Ragged library, where I initially encountered the segfault, to see if it's fixed there, too.
Actually, don't merge this until #3086 is merged. There's going to be a conflict, and it will be easier to resolve the conflict in this PR (the diff is smaller). So let's get #3086 into The conflict will be in how removing the unnecessary |
#3086 has been merged, and so I updated this branch and will run through the tests again. If all the tests pass, this PR can be merged, so I've enabled auto-merge. |
To record what I learned from @ianna during our meeting: the relevant change is that this PR runs the code-coverage test for CPython and not for PyPy. The hypothesis is that the segfault was never in Awkward, but in pytest-cov. (You have a link to an issue about that in the pytest-cov project, right, @ianna?) That hypothesis would make sense of a few things:
Although it's still possible that an error in Awkward is only revealed when pytest-cov runs in PyPy (because it somehow stomps on memory that doesn't get stomped on without pytest-cov), if I remove pytest-cov from ragged's tests and get no segfault, that will be a strong indication that it's caused by pytest-cov and not Awkward, since ragged's test suite is so different from Awkward's. It wouldn't be 100% conclusive, but it would be conclusive enough for me. |
There's a test of our minimally supported pyarrow version that intermittently, but rarely, fails because of AssertionError: assert [[['', 'αβγ', '', ''], ['', '', '']], [], [['', '→δε←', '', ''], ['', 'ζz', 'zζ', '', ''], ['', 'abc', ' ']]] == [[['', 'αβγ', '', ''], ['', '', '']], [], [['', '→δε←', '', ''], ['', 'ζz', 'zζ', '', ''], ['', 'abc', '', '']]] I am very sure that this one is due to something in pyarrow. We could increase our minimal pyarrow, but that wouldn't be for compatibility reasons, it would be because pyarrow 7 has issues. Instead, I'll just re-run this test by hand. |
I'm testing it in ragged here: scikit-hep/ragged#47 Although I remember segfaults in PyPy, I'm not seeing them now—with or without codecov. |
FYI: #3097 was updated to I was going to make the PyPy test required, but it looks like it's not ready for that, yet. |
yes, I was just looking at it :-( |