-
Notifications
You must be signed in to change notification settings - Fork 877
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Topo aware coll comm v2 #12032
Topo aware coll comm v2 #12032
Conversation
ompi/communicator/comm_cid.c
Outdated
static int ompi_comm_activate_complete (ompi_communicator_t **newcomm, ompi_communicator_t *comm) | ||
/* Callback function to set comunicator disjointness flags */ | ||
static inline void ompi_comm_set_disjointness_nb_complete (ompi_comm_cid_context_t *context) { | ||
if (!OMPI_COMM_IS_DISJOINT_SET(*context->newcommp)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function should only be called once per communicator. Calling a second time should trigger an error, at least a message in debug mode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks I will think about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please see update - I decided to add a warning message instead of hard failure.
ompi/communicator/comm_cid.c
Outdated
* send messages over the new communicator | ||
/** | ||
* Dual-purpose barrier: | ||
* 1. The communicator's disjointness is inferred fromimax_local_peers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks fixed.
c24c3e8
to
ea62e7c
Compare
@bosilca Heartbeat - I imagine you are busy in SC23. Please kindly review the change when you have time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please stash my comment into one of yours. Also, you need to sign your commits.
This patch introduces a new communicator flag to indicate if no processes share the same node. This flag is intended for optimization of hierarchy-aware collective operations to select the more efficient transport/algorithm. In this change we introduce a non-blocking allreducestep in communicator activation and set the new flag in the completion callback function. Signed-off-by: Wenduo Wang <[email protected]>
This patch introduces assertions to verify that sub-communicators are created with the expected OMPI_COMM_DISJOINT* flags. Signed-off-by: Wenduo Wang <[email protected]>
4fcafb0
to
61c9403
Compare
@bosilca Thank you very much! I have fixed the typo and rebased the branch. |
This patch refactors #11864 and prevents the deadlock reported in #12016
Specifically, this patch breaks down the disjointness flag determination logic in 2 parts:
max_local_peers
in a (non-blocking) allreduce unconditionally, and thenmax_local_peers
value and set the disjointness flag.As a result, this change makes the
context->id
field obsolete, and therefore we simply remove it. The allreduce also serves as a barrier function so we always invoke it, so we do not need the previous logic to check the parent disjointness flag - this simplifies the logic.I ran the reproducer provided by @dalcinl
NOTE: I observed a deadlock behavior while making this change:
context->id
andcontext->max_local_peers
, and pass them assubreqs
toompi_comm_request_schedule_append (request, ompi_comm_activate_nb_complete, &subreqs, 2);
the reproducer will hang. Back trace shows that the allreduce never completes:The allreduce used in this case was
ompi_comm_allreduce_intra_pmix_nb