Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Py_CLEAR() to avoid a data race (#221) #222

Merged
merged 1 commit into from
Jan 22, 2025

Conversation

tornaria
Copy link
Contributor

Seems to fix #221

It turns out that

        Py_XDECREF(cysigs.exc_value)
        cysigs.exc_value = NULL

is unsafe, if the code is interrupted in between the two statements [0].

This is what happens in #221. Then verify_exc_value() is called again with
cysigs.exc_value.ob_refcnt=0, breaking badly. Later on this causes
verify_exc_value() to be called with cysigs.interrupt_received=1 which
should not happen.

Using instead the safe macro Py_CLEAR() seems to fix the issue.

[0] See the warning in
https://docs.python.org/3/c-api/refcounting.html#c.Py_DECREF
and see also
https://docs.python.org/3/c-api/refcounting.html#c.Py_CLEAR

@tornaria
Copy link
Contributor Author

Notes:

  • I wasn't able to reproduce this without using the divisors() integer method from sagemath. This seems to be really provoked by a high pressure on sig_occurred(), reentered several times possibly by gc.collect().
  • This doesn't address Nested signal handling leads to excessive slowdown of sig_occurred()  #215. The problem there is that the span of sig_occurred() is not well defined, and calling gc.collect() recursively just to test sig_occurred() produces a slow down. IMO the span of sig_occurred() has to be made precise and the code simplified.

@tornaria tornaria self-assigned this Jan 21, 2025
@tornaria
Copy link
Contributor Author

To ease reviewing, note that this PR should be very safe -- worst case a noop.

Indeed, all we do is replace

Py_XDECREF(cysigs.exc_value)

by

Py_CLEAR(cysigs.exc_value)

which has the exact same semantics except for making sure that cysigs.exc_value becomes NULL before decrementing the refcount. This is to make sure that it never happens that cysigs.exc_value is an invalid pointer.

In one case we do

Py_CLEAR(cysigs.exc_value)
cysigs.exc_value = val

so we are setting the pointer to NULL to later set it to val. Of course there's no harm on that. The better python idiom would be Py_XSETREF(cysigs.exc, val) but alas, that's not declared on cpython.ref.

@dimpase
Copy link
Member

dimpase commented Jan 22, 2025

ok, this might help potential races.

@dimpase dimpase merged commit 95ed336 into sagemath:main Jan 22, 2025
23 of 24 checks passed
@tornaria tornaria deleted the clear-exc_value branch January 22, 2025 03:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reentrancy issue in sig_occurred()
2 participants