WeeklyTelcon_20220816
Geoffrey Paulsen edited this page Aug 23, 2022
- Dialup Info: (Do not post to public mailing list or public wiki)
- Austen Lauria (IBM)
- Geoffrey Paulsen (IBM)
- Jeff Squyres (Cisco)
- Brendan Cunningham (Cornelis Networks)
- Edgar Gabriel (UoH)
- Christoph Niethammer (HLRS)
- David Bernhold (ORNL)
- Harumi Kuno (HPE)
- Hessam Mirsadeghi (UCX/nVidia)
- Jingyin Tang
- Joseph Schuchart
- Josh Fisher (Cornelis Networks)
- Matthew Dosanjh (Sandia)
- Thomas Naughton (ORNL)
- Todd Kordenbrock (Sandia)
- Howard Pritchard (LANL)
- William Zhang (AWS)
- Jan (Sandia -ULT support in Open MPI)
- Josh Hursey (IBM)
- Tommy Janjusic (nVidia)
- Akshay Venkatesh (NVIDIA)
- Artem Polyakov (nVidia)
- Aurelien Bouteiller (UTK)
- Brandon Yates (Intel)
- Brian Barrett (AWS)
- Charles Shereda (LLNL)
- Erik Zeiske
- Geoffroy Vallee (ARM)
- George Bosilca (UTK)
- Joshua Ladd (nVidia)
- Marisa Roman (Cornelis Networks)
- Mark Allen (IBM)
- Matias Cabral (Intel)
- Michael Heinz (Cornelis Networks)
- Nathan Hjelm (Google)
- Noah Evans (Sandia)
- Raghu Raja (AWS)
- Ralph Castain (Intel)
- Sam Gutierrez (LLNL)
- Scott Breyer (Sandia?)
- Shintaro Iwasaki
- Xin Zhao (nVidia)
- v4.1.5
- Schedule: targeting ~6 months (Sept, Oct? Don't remember)
- No driver on schedule yet.
- Potential CVE issue in libevent.. but might not need to do anything.
- Worst case we'd just update our libevent version.
- CVE scanner doesn't find CVEs from Open MPI source or Open MPI with CVE fixes.
- New scanner doesn't find any issues with libevent anymore...
- Any updates on SLURM failures we're currently blocking on?
- Blocking on merging prte submodule pointer on SLURM.
- Testing mpirun command line options.
- Supposed to do automatic translations from old command line options to new options.
- Are we planning to get rid of options at some point?
- Not printing deprecated warning by default.
- We've made new options (that are the new way), but if we're not encouraging people to go to them, why?
- Can we even map old to new options one-to-one?
- We "own" the schizo component and we could ditch new options, and only use old options if we want.
- Before we force any change, we should get user's
- Old ones had auto-completion.
- If we have old options that are going to new options, weird that we don't print the messages.
- v5.0 was supposed to be pretty disruptive, but if we go back and make it less disruptive, that's fine, but we are kinda saying that the old options are the way.
- PRRTE v2 and v3 testing today.
- Where's the list that exists in general?
- What is this list to check on?
- It'd be pretty good to make a test suite that assumes 2-4 nodes with 4 ppr or so.
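The automatic translation of old mpirun options to new ones discussed above could be sketched as a simple lookup table. This is only an illustration, not the actual PRRTE implementation: the struct `opt_alias`, the function `translate_option`, and the three table entries are hypothetical names and examples chosen for this sketch, and the warning text is made up.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical table entry: a deprecated option and its replacement. */
struct opt_alias {
    const char *old_opt;
    const char *new_opt;
};

/* Illustrative mappings only; not the table PRRTE actually uses. */
static const struct opt_alias aliases[] = {
    { "--bynode",       "--map-by node"  },
    { "--bycore",       "--map-by core"  },
    { "--bind-to-core", "--bind-to core" },
};

/*
 * Return the replacement for a deprecated option, or the input pointer
 * unchanged when the option is not in the table. When `warn` is non-zero,
 * a deprecation message is printed to stderr for translated options.
 */
static const char *translate_option(const char *opt, int warn)
{
    size_t n = sizeof(aliases) / sizeof(aliases[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(opt, aliases[i].old_opt) == 0) {
            if (warn) {
                fprintf(stderr, "warning: %s is deprecated; use %s\n",
                        opt, aliases[i].new_opt);
            }
            return aliases[i].new_opt;
        }
    }
    return opt;
}
```

A plain table like this only covers options that map one-to-one. An old option whose replacement has to embed its argument would need more than a string lookup, which is part of the question raised in the discussion. The `warn` flag corresponds to the choice of whether deprecation messages are printed by default.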
- Schedule:
- PMIx and PRRTE changes coming at end of August.
- Try to have bugfixes PRed by end of August, to give time to iterate and merge.
- Still using Critical v5.0.x Issues (https://github.com/open-mpi/ompi/projects/3) yesterday
- Issue 10641 Ralph changed the PRRTE branches (switching us to v3.2 branch)
- Lots of changes from PRRTE v2.1 -> v3.2
- Still working to get CI working
- MTT still failing with SLURM.
- Gone from segv in MPIRUN to resource detection.
- Ralph doesn't have SLURM to help with.
- Looking for someone with SLURM to help.
- Austen will open an Issue for this.
- Does ANYONE use Open MPI's Java Bindings?
- Docs
- `mpirun --help` is OUT OF DATE.
- Have to do this relatively quickly, before PRRTE releases.
- Austen, Geoff and Tomi will be
- REASON for this, is because mpirun command line is in PRRTE.
- mpirun manpage needs to be re-written.
- Docs are online and can be updated asynchronously.
- Jeff posted PR to document runpath vs rpath
- Our configure checks some linker flags, but there might be default in linker or in system that really governs what happens.
- Ralph is looking to release PRRTE v3.x by end of the month.
- Java Binding discussion?
- If Open MPI wants Java Bindings, we'd need to do some Java work in PRRTE before end of the month.
- Small non-zero number of users, Howard may be interested.
- SLURM discussion
- PRRTE won't run mpirun inside of slurm allocation for SLURM < 17.11
- How many users will we hurt (require them to upgrade SLURM)?
- Jeff still thinks his case might be out of the norm
- HAN / Adapt runs.
- Post to Devel. Summary, and link to results.
- Want to make these the default.
- August 18, 2-3pm Central
- Geoff will send out web-ex to devel.
- Incompatibilities in User Level threading that Jan
- What's the schedule for fixes to get into v5.0.x
- Will try to get PRs in by end of August and then iterate.
- William said yesterday that they wanted one more day of testing.
- sm_cuda component was moved into framework.
- nVidia has some issues building, and will try again to test
- Accelerator framework: good first step, but will need fixes (super high level).
- Does this framework allow us to get rid of sm_cuda altogether?
- Brian added some comments and William needs to address them before merge.
- Switching to builtin atomics
- 10613 - Preferred PR. GCC / Clang should have that.
- Fallback to C11 atomics if not available.
- Had to do a bit in m4.
- Builtin atomics are volatile.
- Next step would be to refactor the atomics for post v5.0.
- Joseph will post some additional info in the ticket.
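The "builtin atomics first, C11 atomics as fallback" approach discussed above could look roughly like the sketch below. It is not the code from PR 10613: the `demo_` names are hypothetical, and the compiler-macro check is an assumed stand-in for whatever the configure (m4) tests actually detect.

```c
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)

/* Preferred path: GCC/Clang __atomic builtins, which work on plain objects. */
typedef int32_t demo_atomic_int32_t;

/* Atomically add `delta` and return the value held before the addition. */
static inline int32_t demo_atomic_fetch_add_32(demo_atomic_int32_t *addr,
                                               int32_t delta)
{
    return __atomic_fetch_add(addr, delta, __ATOMIC_SEQ_CST);
}

/* Atomically read the current value. */
static inline int32_t demo_atomic_load_32(demo_atomic_int32_t *addr)
{
    return __atomic_load_n(addr, __ATOMIC_SEQ_CST);
}

#else

/* Fallback path: C11 <stdatomic.h>, which requires an _Atomic-qualified type. */
#include <stdatomic.h>

typedef _Atomic int32_t demo_atomic_int32_t;

static inline int32_t demo_atomic_fetch_add_32(demo_atomic_int32_t *addr,
                                               int32_t delta)
{
    return atomic_fetch_add(addr, delta);
}

static inline int32_t demo_atomic_load_32(demo_atomic_int32_t *addr)
{
    return atomic_load(addr);
}

#endif
```

Callers use the same `demo_atomic_*` functions on either path. The difference in the underlying type (plain `int32_t` versus `_Atomic int32_t`) is one reason a wrapper layer like this may be worth refactoring later, as noted for post v5.0.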
- We're probably not getting together in person anytime soon.
- So we'll send around a doodle to have time to talk about our rules.
- Reflect the way we worked several years ago, but not really right now.
- We're to review the admin steering committee in July (per our rules).
- We're to review the technical steering committee in July (per our rules).
- We should also review all the OMPI GitHub, Slack, and Coverity members during the month of July.
- Jeff will kick that off sometime this week or next week.
- In the call we mentioned this, but no real discussion.
- Wiki for face to face: https://github.com/open-mpi/ompi/wiki/Meeting-2022
- Might be better to do a half-day/day-long virtual working session.
- Due to company's travel policies, and convenience.
- Could do administrative tasks here too.