Plans for MPI_THREAD_MULTIPLE? #17
Comments
In an earlier version I indeed used MPI_THREAD_MULTIPLE.
I see. Do you remember which MPI libraries you experimented with? With multiple threads participating in communication, there exists a design space that could be explored: separate communicators, separate tags, etc., as ways to expose parallel communication to the MPI library. Is there a communication-kernel mini-application or microbenchmark that captures the communication pattern of TensorFlow? It would serve well for exploring the performance of the different strategies in this design space.

If a mini-app isn't available, I would be happy to help write one that captures the communication pattern of TensorFlow.
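A minimal sketch of the kind of kernel such a mini-app might contain, assuming MPI_THREAD_MULTIPLE is available; the thread count, the ring-exchange pattern, and all names here are illustrative, not taken from the TensorFlow code:

```cpp
// Build with an MPI C++ wrapper (e.g. mpicxx) and launch with mpirun.
#include <mpi.h>
#include <cstdio>
#include <thread>
#include <vector>

int main(int argc, char** argv) {
  int provided = 0;
  MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
  if (provided < MPI_THREAD_MULTIPLE) {
    std::fprintf(stderr, "MPI_THREAD_MULTIPLE not provided (got %d)\n", provided);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }

  int rank = 0, size = 0;
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);

  // One duplicated communicator per thread: one point in the design
  // space, isolating the concurrent exchanges from each other.
  const int kThreads = 4;
  std::vector<MPI_Comm> comms(kThreads);
  for (int t = 0; t < kThreads; ++t) MPI_Comm_dup(MPI_COMM_WORLD, &comms[t]);

  std::vector<std::thread> workers;
  for (int t = 0; t < kThreads; ++t) {
    workers.emplace_back([&, t] {
      // Every thread performs the same ring exchange concurrently,
      // each on its own communicator.
      int send = rank, recv = -1;
      const int dst = (rank + 1) % size;
      const int src = (rank - 1 + size) % size;
      MPI_Sendrecv(&send, 1, MPI_INT, dst, 0,
                   &recv, 1, MPI_INT, src, 0,
                   comms[t], MPI_STATUS_IGNORE);
    });
  }
  for (auto& w : workers) w.join();

  for (int t = 0; t < kThreads; ++t) MPI_Comm_free(&comms[t]);
  MPI_Finalize();
  return 0;
}
```

Swapping the per-thread communicators for a single shared communicator with per-thread tags would give a second point in the same design space to compare against.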
https://github.com/tensorflow/networking/blob/master/tensorflow_networking/mpi/mpi_utils.cc#L56
I see the use of MPI_THREAD_MULTIPLE has been commented out. From my understanding of the current design of exchanging data with MPI, we do not require MPI_THREAD_MULTIPLE, since a dedicated thread is responsible for communication. Are there future plans to have multiple threads perform communication simultaneously (once MPI implementations better support MPI_THREAD_MULTIPLE, of course)? If so, is it more likely that we would have dedicated communication threads, or is it possible that the computation threads also perform communication?
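For reference, a minimal, hypothetical sketch of how the required threading level is typically negotiated at startup; requesting MPI_THREAD_SERIALIZED here is an assumption matching the dedicated-communication-thread design described above, not taken from the tensorflow_networking code:

```cpp
#include <mpi.h>
#include <cstdio>

int main(int argc, char** argv) {
  int provided = 0;
  // MPI_THREAD_SERIALIZED suffices when a single dedicated (non-main)
  // thread makes all MPI calls; MPI_THREAD_FUNNELED would do if only the
  // main thread communicated. MPI_THREAD_MULTIPLE is needed only once
  // several threads issue MPI calls concurrently.
  MPI_Init_thread(&argc, &argv, MPI_THREAD_SERIALIZED, &provided);
  if (provided < MPI_THREAD_SERIALIZED) {
    std::fprintf(stderr, "insufficient MPI thread support: %d\n", provided);
    MPI_Abort(MPI_COMM_WORLD, 1);
  }

  // ... hand the communicator off to the dedicated communication thread ...

  MPI_Finalize();
  return 0;
}
```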