WeeklyTelcon_20171010
Geoffrey Paulsen edited this page Jan 9, 2018 · 1 revision
- Dialup Info: (Do not post to public mailing list or public wiki)
- Geoff Paulsen (IBM)
- Jeff Squyres
- Edgar Gabriel
- Howard
- Josh Hursey
- Joshua Ladd
- George
- Mohan
- Nathan Hjelm
- Ralph
- Thomas Naughton
- Todd Kordenbrock
Review All Open Blockers
Review v2.0.x Milestones v2.0.4
- Iterating a bit on disabling CUDA inside of hwloc; PR 4249 is on this branch.
- Issue 4248 - disabling cuda on hwloc
- On all existing release branches, do -cuda=no for hwloc configury.
- Issue 2525 - may close since users can't access.
- Schedule: if we get PRs in today, we should aim to get v2.0.x release DONE this week.
Review v2.x Milestones v2.1.2
- v2.1.3 (unscheduled, but probably Jan 19, 2018)
- PR4172 - a mix between feature / bugfix.
- Are we going to do anything for v2.x for hwloc 2?
- At least put in a configure error if it detects hwloc v2.x
- HWLoc is about to release v2.0
- If topology info comes in from outside, what hwloc was that resource manager using?
- Is the XML annotated with which version of hwloc generated it?
- would be nice to gracefully fail, since fairly opaque.
- Seems like we'll need a rosetta stone for
- HWLOC is a static framework.
- Brice is aiming to get hwloc 2.0 out by Supercomputing, but it might be tight.
- Are we comfortable releasing with an alpha/beta version of hwloc embedded?
- OMPI 2.x will not work with hwloc 2.0, because the APIs changed.
- May want some configure errors (not in there yet)
- 3.0 only works with older hwloc pre-2.0. In v3.0.x if it's hwloc 2.0, we error at configure.
- in 3.1 branch external hwloc allows either hwloc 2.0 or older hwloc, but must decide at build time.
- Still have to run 3.1 everywhere.
- Do we want to backport the hwloc 2.0 support to v3.0?
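The per-branch hwloc policy discussed above can be sketched as a small lookup. This is only an illustration of the rule in these notes, not Open MPI's actual configury: the function names and the policy table are hypothetical, and the sketch assumes hwloc packs its API version as (major << 16) | (minor << 8), so that hwloc 2.0 reports 0x00020000.

```python
# Hypothetical sketch of the branch policy from the notes above.
# Assumption: hwloc's packed API version is (major << 16) | (minor << 8),
# e.g. 0x00020000 for hwloc 2.0 and 0x00010b00 for hwloc 1.11.

# Whether each release series can build against hwloc 2.x (per the notes).
ACCEPTS_HWLOC_2 = {
    "v2.x": False,    # changed APIs; should error at configure
    "v3.0.x": False,  # only works with hwloc older than 2.0
    "v3.1.x": True,   # external hwloc may be 2.0 or older, fixed at build time
}


def hwloc_major(api_version: int) -> int:
    """Extract the major version from the packed API version."""
    return api_version >> 16


def hwloc_supported(series: str, api_version: int) -> bool:
    """Return True if the given series can build against this hwloc."""
    if hwloc_major(api_version) >= 2:
        return ACCEPTS_HWLOC_2[series]
    return True  # every series listed here works with hwloc 1.x


if __name__ == "__main__":
    for series in ACCEPTS_HWLOC_2:
        ok = hwloc_supported(series, 0x00020000)
        print(f"{series} with hwloc 2.0: {'ok' if ok else 'configure error'}")
```

A configure-time check would make the same decision once per build, which is why the 3.1 branch has to pick hwloc 2.0 or an older hwloc at build time.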
Review v3.0.x Milestones v3.0
- v3.0.1 - Opened the branch for bugfixes Sep 18th.
- Still targeting End of October for release of v3.0.1
- Everything ready to push has been.
- a few PRs need review.
Review v3.1.x Milestones v3.1 (https://github.com/open-mpi/ompi/milestone/27)
- Branched v3.1 last night, but forgot to build nightly tarballs.
- Building now.
- Cisco plans to drop v2.0.x, to pickup v3.1
- v3.1.x snapshots are not getting posted. Has to do with cron failures.
- Causing nightly MTT runs to not be run.
- PMIx 2.1 should land in time for v3.1
- In master, but no PR to OMPI v3.1.x yet, since they haven't released it yet.
- Schedule is at Risk:
- What hwloc are we shipping.
- PMIx on track, but needs PR.
- Known issues on master, that need to be associated with v3.1
-
Administration
- Looking at a way to recognize some supporting organizations to help acknowledge their support.
Review Master Master Pull Requests
- MTT Amazon ARM v8 is failing all CI.
- Default behavior of show load errors is true (has been true since 2006)
- Been true for at least 8 years.
- Don't remember this being a conscious change, maybe by accident.
- If you're a packager, you build with all packages, so you can support everyone.
- But then their users get a bunch of errors because they don't have everything installed.
- Should put a configure option of what do you want the default to be.
- https://github.com/open-mpi/ompi/issues/4306
- Secondary question is what should the configure option default to?
- Jeff Squyres - signed up to do this configury work. - Thanks.
- Python client doesn't have nightly snapshot integration.
- Need this since this is most of the release testing.
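For the show load errors discussion above (issue 4306), the proposal amounts to a build-time default that a user can still override at run time. A minimal sketch of that resolution order follows. The function name and the accepted truthy strings are hypothetical, and the sketch assumes the parameter is the `mca_base_component_show_load_errors` MCA parameter, set through the usual `OMPI_MCA_` environment prefix.

```python
import os

# Assumed MCA parameter name and environment prefix; the helper itself
# is hypothetical and not part of Open MPI.
PARAM = "mca_base_component_show_load_errors"
ENV_VAR = "OMPI_MCA_" + PARAM


def show_load_errors(build_default: bool, environ=None) -> bool:
    """Resolve the effective setting: a run-time override wins, else the build default."""
    env = os.environ if environ is None else environ
    raw = env.get(ENV_VAR)
    if raw is None:
        # Nothing set at run time: fall back to what configure chose.
        return build_default
    # Assumption: these strings count as "true"; anything else is false.
    return raw.strip().lower() in ("1", "true", "yes", "on")


if __name__ == "__main__":
    # A packager might configure with the default off so that users without
    # every optional library installed do not see load errors.
    print(show_load_errors(build_default=False, environ={}))
    print(show_load_errors(build_default=False, environ={ENV_VAR: "1"}))
```

With this shape, packagers who build every component can ship a quiet default, while anyone debugging a missing component can turn the messages back on.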
Review Master MTT testing
- Website - openmpi.org
- Brian is trying to make things more automated, so that one can check out the repo, etc., but the repo is TOO large.
- The majority of the problem is the tarballs, and those are already stored in S3.
- Need to see if Attributes are MT - IBM will see if we have any tests to audit.
- Asked, need to get answer back from them.
- Jan / Feb
- Possible locations: San Jose, Portland, Albuquerque, Dallas
- Mellanox, Sandia, Intel
- LANL, Houston, IBM, Fujitsu
- Amazon,
- Cisco, ORNL, UTK, NVIDIA