WeeklyTelcon_20160322
Geoff Paulsen edited this page Mar 22, 2016 · 17 revisions
- Dialup Info: (Do not post to public mailing list or public wiki)
- Jeff Squyres
- Geoff Paulsen
- Brad Benton
- Geoffroy Vallee
- Howard Pritchard
- Josh Hursey
- Nathan Hjelm
- Nysal
- Ralph Castain
- Sylvain Jeaugey
- Todd Kordenbrock
- Tommy Janjusic (Mellanox)
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
- 1.10.3 in April - collecting some smaller fixes accumulating in the branch.
- Nothing critical at the moment
- Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
- Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
- Issue 1406
- TCP BTL THREAD_MULTIPLE deadlock
- If old and new rewrite of TCP BTL are "compatible", then we can switch based on threaded state of MPI_Init.
- OR could require two different TCP components "tcp" / "tcpmt", and expose this issue to users.
- George working on patch.
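The two options discussed for Issue 1406 can be sketched roughly as follows. This is only an illustration of the selection logic, not Open MPI code: the `ThreadLevel` enum and `select_tcp_component` helper are hypothetical names, and only the component names "tcp" and "tcpmt" come from the discussion above.

```python
from enum import IntEnum


class ThreadLevel(IntEnum):
    """MPI thread support levels. The numeric values are illustrative;
    only their ordering follows the MPI standard."""
    SINGLE = 0
    FUNNELED = 1
    SERIALIZED = 2
    MULTIPLE = 3


def select_tcp_component(provided, user_choice=None):
    """Hypothetical helper showing both options from the notes.

    Option 1 (user_choice is None): switch automatically based on the
    thread level that was granted at init time.
    Option 2 (user_choice given): the user names "tcp" or "tcpmt"
    explicitly, which exposes the issue to users.
    """
    if user_choice is not None:
        if user_choice not in ("tcp", "tcpmt"):
            raise ValueError(f"unknown TCP component: {user_choice}")
        return user_choice
    # Automatic switch: only THREAD_MULTIPLE needs the thread-safe rewrite.
    return "tcpmt" if provided == ThreadLevel.MULTIPLE else "tcp"


if __name__ == "__main__":
    print(select_tcp_component(ThreadLevel.MULTIPLE))
    print(select_tcp_component(ThreadLevel.FUNNELED))
```

The automatic switch only works if the old and new code paths are wire-compatible, which is the open question in the notes.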
- Is Issue 299 an issue on 2.0.0? If so, that should also be a blocker!
- Issue linked on mailing list. User sent
- Related to Issue 429?
- Would merging opal and orte into one lib work around this? - Probably, but that's a lot of work.
- Still issue with UCX (who does hooking?)
- If we make it a blocker, can we accelerate the fix? - Yes.
- Some assembly required. Might need someone to
- Assign to Mark Allen. - HIGH PRIORITY.
- Unblock PR 1353 move to 2.1?
- For 2.0, document what the behavior is, and push the change back to 2.1.
- 1038 - waiting on review from Mellanox
- 1015 - segv IBM - Howard
- 1014 - don't return Err_pending from collectives?
- Jeff will test USOCK today. - Coverity found an issue; Ralph will look at it tomorrow.
- George has a patch to fix TCP (smaller patch to fix
- Issue 1406
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
- Master tests are failing.
- C++ bindings failing to compile is a symptom. Cyclic dependencies?
- Working on a submit interface to make the client easier to submit to. Should help Ralph's team.
- PR1483 - allows base btl iterators to determine flag values.
- Only issue is how this is exposed to MTT
- Expose to Tools Working Group
- Gives you a comma-delimited list, and gives you all possible values.
- PR1482 - Support in MCA base to force components to always be on.
- Applies if you make your component static and the component says it's always-on.
- Key usage for this is forcing BTL self (if you say not-self, it will give an error and abort)
- Useful for hook licensing framework.
- Need to look at other components (maybe coll-basic; same with libnbc)
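A rough sketch of the always-on behavior described for PR1482, assuming the usual MCA selection syntax (a comma-separated include list, or a `^`-prefixed exclude list). The `resolve_components` function is hypothetical and is not the MCA base implementation. That an include list implicitly keeps always-on components is an assumption here; the notes only state that excluding one is an error.

```python
def resolve_components(available, always_on, selection=None):
    """Hypothetical resolver, not Open MPI code.

    available: ordered list of component names that were built.
    always_on: set of component names that declare themselves always-on.
    selection: MCA-style string such as "self,tcp" or "^sm", or None.
    """
    if not selection:
        return list(available)

    exclude = selection.startswith("^")
    body = selection[1:] if exclude else selection
    names = [n.strip() for n in body.split(",") if n.strip()]

    if exclude:
        # Excluding an always-on component is an error (e.g. "^self").
        blocked = sorted(always_on.intersection(names))
        if blocked:
            raise RuntimeError(
                "cannot exclude always-on component(s): " + ", ".join(blocked)
            )
        return [c for c in available if c not in names]

    # Include list: assumption that always-on components stay selected
    # even when the user did not name them.
    return [c for c in available if c in names or c in always_on]


if __name__ == "__main__":
    built = ["self", "tcp", "sm"]
    print(resolve_components(built, {"self"}, "tcp"))
    print(resolve_components(built, {"self"}, "^sm"))
```

With this shape, asking for `^self` fails early with a clear message instead of leaving the job without a loopback path.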
- Mellanox -
- Sandia - Todd Kordenbrock - Made PR 1443, PR 1037 for rendezvous
- Intel - Working on Debugger Attach. If Ralph simulates debugger attach it works okay.
- Ralph doesn't have partner license.
- Not sure what the issue is, does mpirun not know if it's being debugged? Is the message not getting through?
- How does this relate to USOCK issue?
- Ralph was hoping to fix this, and debugger attach, and USOCK
- Once John helps Ralph reproduce the issue, can do USOCK right away.
- Still working a lot on PMIx event reservation system.
- Still working on ORTE launch fun.
- Cisco, ORNL, UTK, NVIDIA
- Mellanox, Sandia, Intel
- LANL, Houston, IBM