Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compiler_fence example is misleading #133014

Open
lukas-code opened this issue Nov 13, 2024 · 7 comments · May be fixed by #136079
Open

compiler_fence example is misleading #133014

lukas-code opened this issue Nov 13, 2024 · 7 comments · May be fixed by #136079
Labels
A-atomic Area: Atomics, barriers, and sync primitives A-concurrency Area: Concurrency A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team

Comments

@lukas-code
Copy link
Member

lukas-code commented Nov 13, 2024

Location

https://doc.rust-lang.org/nightly/core/sync/atomic/fn.compiler_fence.html#examples

Summary

use std::sync::atomic::{AtomicBool, AtomicUsize};
use std::sync::atomic::Ordering;
use std::sync::atomic::compiler_fence;

static IMPORTANT_VARIABLE: AtomicUsize = AtomicUsize::new(0);
static IS_READY: AtomicBool = AtomicBool::new(false);

fn main() {
    IMPORTANT_VARIABLE.store(42, Ordering::Relaxed);
    // prevent earlier writes from being moved beyond this point
    compiler_fence(Ordering::Release);
    IS_READY.store(true, Ordering::Relaxed);
}

fn signal_handler() {
    if IS_READY.load(Ordering::Relaxed) {
        assert_eq!(IMPORTANT_VARIABLE.load(Ordering::Relaxed), 42);
    }
}

This example makes it look like there is some kind of guarantee for a release fence that isn't paired with an acquire operation or fence, which as far as I can tell, is not the case. (See the C++ docs for regular (thread) fences that do not guarantee anything for an unpaired release fence and compiler (signal) fences that are described as strictly weaker than thread fences.)

So in this example there is no happens-before relation between the store of IMPORTANT_VARIABLE in main and the load thereof in the signal handler and therefore the signal handler may actually observe IS_READY == true and IMPORTANT_VARIABLE == 0.

Presumably the example should have a load fence in the signal handler and explain in the prose text above that the loads can be reordered, too:

fn signal_handler() {
    if IS_READY.load(Ordering::Relaxed) {
+       // prevent later reads from being moved beyond this point
+       compiler_fence(Ordering::Acquire);
        assert_eq!(IMPORTANT_VARIABLE.load(Ordering::Relaxed), 42);
    }
}

related: #129189

@lukas-code lukas-code added the A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools label Nov 13, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 13, 2024
@lukas-code lukas-code added A-concurrency Area: Concurrency T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. A-atomic Area: Atomics, barriers, and sync primitives T-opsem Relevant to the opsem team and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Nov 13, 2024
@hkBst
Copy link
Contributor

hkBst commented Jan 21, 2025

This looks logical, but it's been a while since I read up on memory orderings.

@RalfJung want to take a look?

@RalfJung
Copy link
Member

Good catch, the example indeed seems wrong. However, the fence is not defined in terms of "reordering", it is defined in terms of happens-before and other relationships, so that's also what the comment should refer to.

Cc @rust-lang/opsem @chorman0773

@chorman0773
Copy link
Contributor

The regular fence examples use the synchronizes-with term in particular, I'd recommend the same term here.

https://doc.rust-lang.org/nightly/core/sync/atomic/fn.fence.html#examples

@CAD97
Copy link
Contributor

CAD97 commented Jan 21, 2025

Iff the signal_handler ABI of the platform implies a logical compiler_fence(Acquire) when it starts (which isn't an unreasonable guarantee to have), then the current example would be acceptable.

But I agree that the example would be better written with an explicit acquire edge, avoiding implying logic of “I observed the (relaxed) flag write after the fence, therefore this is okay.”1

Footnotes

  1. Especially because only atomics are used in the example, it's super easy to mentally try to justify that because the latter write is visible, the former write must also be visible. But since the hook is a separate logical thread, IIUC that same-thread sequencing logic doesn't apply inter-thread. AIUI, the needed chain is that {IV.store(42, Rel) happens-before fence(Rel) hb IR.store(true, Rel) ??? IR.load(Rel) -> true hb IV.load(Rel)}2 or we could just rely on the simpler {IV.store(42) hb fence(Rel) synchronizes-with fence(Acq) hb IV.load(Rel)}.

  2. The ordering established by observing a value via Relaxed store/load is weak enough that I generally pretend it doesn't exist at all; IIUC it only extends to ordering of observed values of that singular atomic object.

@lukas-code
Copy link
Member Author

lukas-code commented Jan 21, 2025

Iff the signal_handler ABI of the platform implies a logical compiler_fence(Acquire) when it starts (which isn't an unreasonable guarantee to have), then the current example would be acceptable.

I don't see how that would fix this example. If we assume an implicit acquire at the start of the signal handler here:

use std::sync::atomic::{AtomicBool, AtomicUsize};
use std::sync::atomic::Ordering;
use std::sync::atomic::compiler_fence;

static IMPORTANT_VARIABLE: AtomicUsize = AtomicUsize::new(0);
static IS_READY: AtomicBool = AtomicBool::new(false);

fn main() {
    IMPORTANT_VARIABLE.store(42, Ordering::Relaxed);
    // prevent earlier writes from being moved beyond this point
    compiler_fence(Ordering::Release);
    IS_READY.store(true, Ordering::Relaxed);
}

fn signal_handler() {
+   compiler_fence(Ordering::Acquire);
    if IS_READY.load(Ordering::Relaxed) {
        assert_eq!(IMPORTANT_VARIABLE.load(Ordering::Relaxed), 42);
    }
}

To me it looks like there is still no happens-before relation between IS_READY.store(true, Ordering::Relaxed) and IMPORTANT_VARIABLE.load(Ordering::Relaxed). To quote from the fence docs:

A fence ‘A’ which has (at least) Release ordering semantics, synchronizes with a fence ‘B’ with (at least) Acquire semantics, if and only if there exist operations X and Y, both operating on some atomic object ‘M’ such that A is sequenced before X, Y is sequenced before B and Y observes the change to M. This provides a happens-before dependence between A and B.

    Thread 1                                          Thread 2

fence(Release);      A --------------
x.store(3, Relaxed); X ---------    |
                               |    |
                               |    |
                               -------------> Y  if x.load(Relaxed) == 3 {
                                    |-------> B      fence(Acquire);
                                                     ...
                                                 }

In this example, "B" is compiler_fence(Ordering::Acquire) and "Y" is IS_READY.load(Ordering::Relaxed) , but Y is not sequenced-before B, so there is no synchronizes-with relation between the fences, even if Y observes the change to M (IS_READY).


Now, here I wanted to make a point that this fence is actually important to prevent LLVM from reordering the reads in the signal handler, but it looks like LLVM happily miscompiles this anyway, even if you add the fence 😕 Nevermind, I misread the assembly https://rust.godbolt.org/z/Mjq4x5PaK

use std::sync::atomic::AtomicBool;
use std::sync::atomic::Ordering;
use std::sync::atomic::{compiler_fence, fence};

#[no_mangle]
static mut IMPORTANT_VARIABLE: usize = 0;

static IS_READY: AtomicBool = AtomicBool::new(false);

#[no_mangle]
fn main() {
    unsafe { IMPORTANT_VARIABLE = 42; }
    // prevent earlier writes from being moved beyond this point
    fence(Ordering::Release);
    IS_READY.store(true, Ordering::Relaxed);
}

// SAFETY: `unknown` must be 0
#[no_mangle]
fn signal_handler(unknown: usize) -> usize {
    let mut r = 0;
    for _ in 0..unknown {
        // Trick LLVM into reading IMPORTANT_VARIABLE early.
        // By our safety condition, this read does not actually happen,
        // so we don't have a data race here.
        r += unsafe { IMPORTANT_VARIABLE };
    }

    if IS_READY.load(Ordering::Relaxed) {
        // prevent later reads from being moved beyond this point
        fence(Ordering::Acquire);
        if unsafe { IMPORTANT_VARIABLE } == 42 {
            return 42;
        }
    }

    r
}

@CAD97
Copy link
Contributor

CAD97 commented Jan 21, 2025

I don't see how that would fix this example.

Ah, whoops, you're correct, of course. I had mentally adjusted the loop outside the signal hook, because the loop is just useless in this example; the compiler_fence doesn't help at all if the write comes from a different physical thread than the signal hook is looping on, which is needed for the value of the atomic to change during a busy loop.

So this example really is just busted currently. It seems to be trying to use a compiler fence to ensure the store to IMPORTANT_VARIABLE is visible before the store to IS_READY, which is definitely not how a compiler fence works.

@RalfJung RalfJung linked a pull request Jan 26, 2025 that will close this issue
@RalfJung
Copy link
Member

I proposed a fix for the example in #136079, please have a look.

fmease added a commit to fmease/rust that referenced this issue Jan 27, 2025
…=jhpratt

compiler_fence: fix example

The old example was wrong, an acquire fence is required in the signal handler. To make the point more clear, I changed the "data" variable to use non-atomic accesses.

Fixes rust-lang#133014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-atomic Area: Atomics, barriers, and sync primitives A-concurrency Area: Concurrency A-docs Area: Documentation for any part of the project, including the compiler, standard library, and tools T-libs-api Relevant to the library API team, which will review and decide on the PR/issue. T-opsem Relevant to the opsem team
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants