WeeklyTelcon_20160315
Geoff Paulsen edited this page Mar 15, 2016
- Dialup Info: (Do not post to public mailing list or public wiki)
- Jeff Squyres
- Geoff Paulsen
- Brad Benton
- Howard
- Josh Hursey
- Nathan Hjelm
- ralph
- Ryan Grant
- Sylvain Jeaugey
- Todd Kordenbrock
- Yohann Burette
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
-
PR 1004 - MPI_Ineighbor_alltoallw
- Needs reviewer
-
PR 1002 - Memory allocation hooks
- Jeff to review
- PR 1006 - Ralph will review.
- PR 1008 - Jeff already reviewed.
- Anyone testing on PSM2 or Omnipath? Intel guys have been. Howard - Ralph check this.
- Symbol confusion fix requires 1.10.2, and need newer Omnipath (PSM2) library/driver.
- 1.10.3 in Late April - Some smaller fixes accumulating in the branch
- Nothing critical at the moment
- Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
- Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
-
Issue 1425 - External PMIx server support
- Ralph working on a fix - should be quick.
-
Issue 1418 - fix MPI process suicide code
- Delayed until 2.1 or 3.0
-
Issue 1406 - TCP BTL THREAD_MULTIPLE deadlock
- Nathan - George is working on a fix, but it is a rewrite. So might take some time.
- If old and new rewrite of TCP BTL are "compatible", then we can switch based on threaded state of MPI_Init.
- OR could require two different TCP components "tcp" / "tcpmt", and expose this issue to users.
-
Issue 1353 - -host behavior
- Ticket has been updated with new commits. Jeff to test.
- Ralph is out of time to work on this.
- Need to document behavior in different releases, then close this for v2.0.0.
- Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
-
PR 1003 - Race condition in process matching thread
- Nathan to review the patch and sign off.
-
PR 1000 - misc warnings and missing include files
- Ralph to review and update
-
PR 977 - span on heterogeneous clusters
- Ralph to review and update
-
PR 973 - Parsing of envvars in MCA
- Nathan pushing update, Jeff to review after the call.
- Reviewed all Pull Requests, and pinged a few for comments.
- Really need more people doing testing on v2.0 branch
- Decision to shoot for April 24th for RC0 of 2.0.
- Jeff saw that someone is hitting OPAL-FIFO failures on 2.0.
- Need more thread safety testing.
- Jeff added that his thread safety tests take a two-night cycle to run in full.
- Jeff seeing SIGPIPEs ONLY on master with usNIC - Ralph wonders if it's valgrind issue Mellanox was seeing.
- MPI Forum - MPI_Info under discussion
- Don't propagate infos with MPI_Comm_dup - use MPI_Comm_dup_with_info to propagate infos
- Still discussion about MPI_Info_get/MPI_Info_set behavior
- Open MPI Developer's Meeting
- Nathan - Enabling thread_multiple all the time
- PR 1397 - always enable MPI_THREAD_MULTIPLE support
- Should we turn this on for everyone? General feeling is to accept this.
- Send another note to the devel list to give folks one last chance to comment before commit.
- Need to do some performance testing
- v2.0.0 better MPI_THREAD_MULTIPLE correctness
- focus on performance improvements in next v2.X series release
-
https://github.com/open-mpi/ompi/pull/1417: "RFC: change default build to always be optimized (even for developers)" If no one has any further comments, it's time to merge.
- Jeff thinks we are in consensus, but wants to check with developers.
- Developers who want a debug build would then need to turn on --enable-debug, --enable-memdebug, --enable-picky explicitly.
- Nathan - heads up that mpool re-write is ready to go.
- will get merged in afternoon today.
- Really need more people doing testing on v2.0 branch
- delayed until next week.
- Cisco, ORNL, UTK, NVIDIA
- Mellanox, Sandia, Intel
- LANL, Houston, IBM