Description
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
Open MPI v4.1.1 on Linux
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Built from a source tarball; no configure flags other than --prefix
Details of the problem
It's more of a question: with point-to-point communication, I can parallelise transfers using either non-blocking sends or blocking sends issued from separate threads. Is there any way to set up/build/configure the Open MPI library so that the same parallelism happens at the target?
I have a master process and many workers; each worker performs an MPI_Get with the master as the target, using passive-target synchronisation. The issue is that as the number of workers increases, so does the wall time for the transfers, so I'd like the target to service the Gets in parallel threads. Is that possible? With point-to-point communication I could achieve this with non-blocking sends, or with blocking sends inside parallel OpenMP sections, but it's not clear whether I can do anything equivalent with RMA. Is there some trick that can be done with active synchronisation? The only other thing I can think of is to have the master issue MPI_Puts in parallel instead; would that work?
Thanks in advance for any suggestions, or please point me to relevant documentation if it exists. The MPI spec doesn't really address this, so I'm hoping to understand how Open MPI actually implements RMA in order to fine-tune my application.