
MPISpace: element operators with atomic trait aren't atomic #99

Closed
brian-kelley opened this issue May 6, 2024 · 7 comments
brian-kelley (Contributor) commented May 6, 2024

The operator overloads for the proxy type MPIDataElement with the atomic trait have the same implementation as the non-atomic versions, for example:

  KOKKOS_INLINE_FUNCTION
  void inc() const {
    T val = T();
    mpi_type_g(val, offset, pe, *win);  // get the remote value
    val++;                              // modify it locally
    mpi_type_p(val, offset, pe, *win);  // put it back
    // This get/modify/put sequence is not a single atomic RMA operation,
    // so concurrent increments from other processes can be lost.
  }

These operators should use the one-sided atomic functions: MPI_Accumulate, MPI_Get_accumulate, or MPI_Fetch_and_op.
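
For instance, inc() could become a single remote fetch-and-add. The sketch below is not the actual KRS code: the helper name, the hard-coded int/MPI_INT element type, and the assumption that a passive-target access epoch is already open on the window are all just for illustration.

  #include <mpi.h>

  // Hypothetical helper, not the MPIDataElement implementation: atomically
  // increments one int held in `win` at displacement `disp` on rank `pe`
  // with a single RMA operation instead of the get/modify/put above.
  inline int atomic_inc_remote(int pe, MPI_Aint disp, MPI_Win win) {
    const int one = 1;
    int old = 0;  // value held at the target before the increment
    MPI_Fetch_and_op(&one, &old, MPI_INT, pe, disp, MPI_SUM, win);
    MPI_Win_flush(pe, win);  // complete the operation at the target
    return old;
  }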

MPI_Compare_and_swap in a loop could be used for non-builtin operations like in Desul.
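
A sketch of that compare-and-swap fallback is below. Again hypothetical: the helper, its parameters, and the int element type are assumptions, and it likewise assumes a passive-target epoch is open on the window.

  #include <mpi.h>

  // Hypothetical helper, not KRS code: applies an arbitrary update function
  // to a remote int, retrying until the compare-and-swap succeeds.
  template <class UpdateFn>
  inline int atomic_update_remote(int pe, MPI_Aint disp, MPI_Win win,
                                  UpdateFn update) {
    int dummy = 0;
    int expected = 0;
    // Atomic read of the current value (MPI_NO_OP fetch) to seed the loop.
    MPI_Fetch_and_op(&dummy, &expected, MPI_INT, pe, disp, MPI_NO_OP, win);
    MPI_Win_flush(pe, win);
    while (true) {
      const int desired = update(expected);
      int observed = 0;
      // Install `desired` only if the element still holds `expected`.
      MPI_Compare_and_swap(&desired, &expected, &observed, MPI_INT, pe, disp,
                           win);
      MPI_Win_flush(pe, win);
      if (observed == expected) return desired;  // CAS took effect
      expected = observed;  // another process won; retry with the fresh value
    }
  }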

vmiheer (Contributor) commented May 6, 2024

Thanks for opening the issue, Brian! Is there a trait which says that associative reordering is okay? Although in a distributed scenario, I wonder how it could not be okay.

brian-kelley (Contributor, Author) commented May 6, 2024

@vmiheer It should be safe to assume that it's always OK, since without that assumption Kokkos couldn't do parallel reduce or scan. Do the MPI one-sided atomics give you a choice in that?

vmiheer (Contributor) commented May 6, 2024

I was looking at the semantics at https://docs.nvidia.com/nvshmem/archives/nvshmem-113/api/docs/gen/mem-model.html#differences-between-nvshmem-and-openshmem and was wondering.
Although that is a question for later; for now I am going with MPI_Accumulate.
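
To illustrate the MPI_Accumulate route, here is a minimal sketch for the non-fetching case where the old value isn't needed; the helper name and the int element type are assumptions, not KRS code.

  #include <mpi.h>

  // Hypothetical helper, not KRS code: remote atomic `+= v` with no fetch,
  // which is what MPI_Accumulate provides.
  inline void atomic_add_remote(int pe, MPI_Aint disp, MPI_Win win, int v) {
    MPI_Accumulate(&v, 1, MPI_INT, pe, disp, 1, MPI_INT, MPI_SUM, win);
    MPI_Win_flush(pe, win);  // complete the update at the target
  }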

devreal (Collaborator) commented May 6, 2024

MPI RMA accumulate operations are only defined on operator/type pairs that are associative.

brian-kelley (Contributor, Author):

@vmiheer I see, so this is about how atomic operations are ordered. I wasn't familiar with this detail, but the nvshmem behavior is actually different from Kokkos core, where if you do two atomic fetch-adds from the same thread, they will always execute in order.

But KRS is supposed to be fully portable, so it only makes guarantees that all its backends make. And KRS doesn't add fences to all the nvshmem atomics (I assume this would be horrible for performance). So for the MPISpace backend, you don't have to worry about how atomics are ordered.

janciesko (Contributor):

Related to #25

brian-kelley (Contributor, Author):

@janciesko Sorry, I didn't notice this was already an open issue!

brian-kelley closed this as not planned on May 12, 2024