Fix for communicator bug in MPI cache #72

Open
wants to merge 1 commit into
base: master
@Blixodus Blixodus commented Dec 4, 2024

Added rank translation from the task's MPI communicator to MPI_COMM_WORLD when checking whether data needs to be exchanged before execution. This mitigates the issue raised in https://sympa.inria.fr/sympa/arc/starpu-devel/2024-12/msg00001.html: without the translation, StarPU can mistakenly believe that some data has already been sent to a node, when it was in fact sent to a different node that has the same rank in some communicator.
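To illustrate the collision, here is a minimal sketch in plain C that needs no MPI installation. It models a communicator as an array mapping each local rank to its MPI_COMM_WORLD rank and keys a toy "already sent" cache on the world rank. All names (`comm_model`, `sent_cache`, `local_to_world`, `mark_sent`, `already_sent`, `WORLD_SIZE`) are hypothetical and are not StarPU's; in real MPI code the translation itself can be done with `MPI_Group_translate_ranks` on groups obtained through `MPI_Comm_group`.

```c
#include <assert.h>
#include <stdbool.h>

#define WORLD_SIZE 4 /* assumed number of processes in MPI_COMM_WORLD */

/* Hypothetical model of a communicator: world_of[i] is the MPI_COMM_WORLD
 * rank of the process that has rank i in this communicator. */
struct comm_model {
    int size;
    const int *world_of;
};

/* Translate a communicator-local rank to its MPI_COMM_WORLD rank. */
static int local_to_world(const struct comm_model *comm, int local_rank)
{
    assert(local_rank >= 0 && local_rank < comm->size);
    int world_rank = comm->world_of[local_rank];
    assert(world_rank >= 0 && world_rank < WORLD_SIZE);
    return world_rank;
}

/* Toy "already sent" cache for one piece of data, indexed by world rank. */
struct sent_cache {
    bool sent[WORLD_SIZE];
};

/* Record that the data was sent to local_rank of the given communicator. */
static void mark_sent(struct sent_cache *cache,
                      const struct comm_model *comm, int local_rank)
{
    cache->sent[local_to_world(comm, local_rank)] = true;
}

/* Check the cache using the world rank, so equal local ranks in
 * different communicators do not alias each other. */
static bool already_sent(const struct sent_cache *cache,
                         const struct comm_model *comm, int local_rank)
{
    return cache->sent[local_to_world(comm, local_rank)];
}
```

With two communicators covering world ranks {0, 1} and {2, 3}, marking local rank 1 of the first as sent leaves local rank 1 of the second unsent, because the two resolve to world ranks 1 and 3. A cache indexed directly by local rank would have treated them as the same node.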

The fix is somewhat slow (~1-2 µs per translation); a better approach would be to precompute all rank translations into a lookup table when new MPI communicators are registered with StarPU.
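The lookup-table idea could look roughly like the sketch below, again in plain C with no MPI dependency. It is not part of this PR: `rank_table`, `translate_fn`, `offset_translate` and the `rank_table_*` functions are hypothetical names, and the callback merely stands in for the expensive translation. In real MPI code the table could be filled with a single `MPI_Group_translate_ranks` call covering every rank of the communicator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-communicator table: world_rank[i] is the MPI_COMM_WORLD
 * rank of local rank i, filled once when the communicator is registered. */
struct rank_table {
    int size;
    int *world_rank;
};

/* Stand-in for the slow per-rank translation. */
typedef int (*translate_fn)(int local_rank, void *ctx);

/* Example translation for the sketch: a communicator that covers a
 * contiguous range of world ranks starting at the offset in ctx. */
static int offset_translate(int local_rank, void *ctx)
{
    return local_rank + *(int *)ctx;
}

/* Build the table once. Returns 0 on success, -1 on allocation failure. */
static int rank_table_init(struct rank_table *t, int size,
                           translate_fn translate, void *ctx)
{
    assert(size > 0);
    t->world_rank = malloc((size_t)size * sizeof *t->world_rank);
    if (t->world_rank == NULL)
        return -1;
    t->size = size;
    for (int i = 0; i < size; i++)
        t->world_rank[i] = translate(i, ctx);
    return 0;
}

/* Constant-time translation on the hot path. */
static int rank_table_lookup(const struct rank_table *t, int local_rank)
{
    assert(local_rank >= 0 && local_rank < t->size);
    return t->world_rank[local_rank];
}

/* Release the table when the communicator is no longer used. */
static void rank_table_free(struct rank_table *t)
{
    free(t->world_rank);
    t->world_rank = NULL;
    t->size = 0;
}
```

The trade-off is one integer of memory per rank per registered communicator in exchange for replacing each per-check translation with an array read.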

@Blixodus Blixodus changed the title from "a slow but working fix for communicator bug in cache" to "Fix for communicator bug in MPI cache" Dec 4, 2024