WeeklyTelcon_20160501

Geoff Paulsen edited this page May 3, 2016 · 20 revisions

Open MPI Weekly Telcon


  • Dialup Info: (Do not post to public mailing list or public wiki)

Attendees

  • Geoff Paulsen
  • Jeff Squyres
  • Brad Benton
  • Howard
  • Josh Hursey
  • Joshua Ladd
  • Nathan Hjelm
  • Ralph
  • Todd Kordenbrock

Agenda

Review 1.10

  • Milestones: https://github.com/open-mpi/ompi-release/milestones/v1.10.3
    • Two pull requests that have not been reviewed:
      • NULL datatypes - Jeff's looking at some oddities to see if we care.
      • distinct namespaces for OSC for pt2pt. - Nathan thinks it's fine.
    • Blocker for "PSM stuff" - Howard reviewed this. Good with this.
    • Jeff owes a library versioning review. Already updated for 1.10.3.

Review 2.0.x

  • Wiki: https://github.com/open-mpi/ompi/wiki/Releasev20
  • Blocker Issues: https://github.com/open-mpi/ompi/issues?utf8=%E2%9C%93&q=is%3Aopen+milestone%3Av2.0.0+label%3Ablocker
  • Milestones: https://github.com/open-mpi/ompi-release/milestones/v2.0.0
    • Looking pretty good, until Paul found a bunch of obscure things.
      • Most of them either have fixes, or have issues or PRs open to fix them.
    • Nathan has one: can't clobber EBX. Would have hoped the compiler would put a store/restore around it.
      • Fix against master is in.
    • 32bit powerpc issue in hook.
    • PR 1129 - Ralph pulled the fix; waiting for Paul's test results.
    • PR 1133 - trivial.
    • PR 1134 - OMPIO comp on NetBSD - Paul queued tests.
    • Howard queued up some README changes.
    • PR 1051 - marked for 2.1, but the problem is annoying; would like to pull it back to 2.0.
      • Howard is okay.
    • Nathan would like PR 1127 in v2.x - OSC correctness fixes. Fixes map-by node for Graph500.
      • Important for Mellanox, v2.0.0
      • Howard is concerned about the change churn from putting this into v2.0.0, and would prefer it in v2.0.1.
    • master PR 1617 - hcoll, hang in Finalize with srun - Mellanox would prefer v2.0.0.
      • Fix is on 1.10, but not on master or 2.x; haven't opened a PR for v2.x yet (as of today).

v2.0 Migration Guide

  • Discussion:
    • What "gotchas" do we need to communicate to users? I.e., what will people upgrading from v1.8.x/v1.10.x be surprised by?
    • Want it to be googlable.
    • Two or three paragraphs on the biggest changes.
    • Removed support.
    • We need to collectively edit this on the wiki, and then we'll put it up on the open-mpi website.
    • New OSHMEM interfaces added, but not implemented until 2.1.
      • Biggest change is job launch / stuff to support (Josh)
      • PMI support changed; it's a framework now, so expect orte_info components.
    • New RMA capabilities (Nathan)
    • Two-minute blurbs; not too much detail here.
    • Work on this over the next couple of days.

Review Master MTT testing (https://mtt.open-mpi.org/)

  • min-dist mapper test is failing. Jeff opened Issue 1623.

    • PMIx external seems like a red herring.
    • hwloc was upgraded.
  • static build issue because MPIR_ symbols in wrong place, so ORTE

  • IBM would like an explicit declaration of the license under which the website / documentation is available.

    • No objections.
    • IBM will file a pull request, and email devel for more discussion.

MTT Dev status:

Status Updates:


Status Update Rotation

  1. Mellanox, Sandia, Intel
  2. LANL, Houston, IBM
  3. Cisco, ORNL, UTK, NVIDIA
