6.0.x Feature List

Howard Pritchard edited this page Jan 14, 2025 · 61 revisions

Time Line

Target date - Q1CY25.

List of Features planned for the 6.0.x release stream

MPI 4.0:

  • Big count support
    • API level functions (In Progress PR OPEN feedback being addressed)
    • Collective embiggening (discussed at F2F, stage in non-v/w functions first) (DONE)
    • Changes to datatype engine/combiner support (could be a challenge)
    • ROMIO refresh
    • Embiggen man pages (Howard, probably will do the way MPICH does this if possible)
    • Embiggen other documentation (which documentation?)
    • Remove hcol component? (its API doesn't support big count and it's been superseded by UCC)
  • MPI_T events (probably won't do for 6.0.x).
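The motivation for big count can be illustrated without an MPI installation. The classic MPI bindings take an `int` count, so a single call cannot describe more than `INT_MAX` elements; MPI 4.0 adds `_c` variants (for example `MPI_Send_c`) that take an `MPI_Count` instead. The sketch below is only an illustration: `big_count_t`, `fits_in_int_count`, and `legacy_calls_needed` are hypothetical names, with `big_count_t` standing in for `MPI_Count`, and none of them are Open MPI code.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical stand-in for MPI_Count: a signed type wide enough for large counts. */
typedef int64_t big_count_t;

/* Returns 1 if `count` can be passed to a legacy API whose count argument is an int. */
int fits_in_int_count(big_count_t count)
{
    return count >= 0 && count <= (big_count_t)INT_MAX;
}

/*
 * Number of calls needed to move `count` elements through an int-count API
 * when each call carries at most INT_MAX elements. A big count API would
 * need a single call for any non-zero count.
 */
big_count_t legacy_calls_needed(big_count_t count)
{
    assert(count >= 0);
    if (count == 0) {
        return 0;
    }
    /* Ceiling division written to avoid overflow for very large counts. */
    return (count - 1) / (big_count_t)INT_MAX + 1;
}
```

A caller holding a buffer of `(big_count_t)INT_MAX + 1` elements would have to split it into two legacy calls (or build a derived datatype), whereas a `_c` function accepts the count directly.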

MPI 4.1:

MPI 5.0 ABI:

  • If Jake's ABI work is ready, it might help solidify the standard to have our implementation done.
    • Merge ABI work into main, enable it only when requested, and stress in documentation it is experimental.

PRRTE switch Phase 1

  • Resync with upstream PRRTE and decide which branch to use for the 6.0.x branch
  • Documentation Changes (partially DONE UofL)
  • Prefix prte binary names (DONE UofL)
  • Remove --with-prrte configure option from ompi (DONE UofL)
  • Remove unneeded MCA components and frameworks (DONE UofL/rhc54)
  • Need to merge UofL changes into whatever solution we find for embedding PRRTE in OMPI for 6.0.x. Note some UofL changes are in the OMPI source code.

Accelerator support:

  • Extended accelerator API functionality (IPC) and conversion of the last components to use accelerator API (DONE for ROCM and CUDA, not ZE).
  • Level Zero (ze) accelerator component (DONE basic support, IPC not implemented, Howard)
  • Support for MPI 4.1 memory kinds info object (assume we have PRRTE move, 1 month for basic support, AMD to do for rocm)
  • SMSC accelerator (Edgar - DONE, CUDA needs to be tested)
  • Add features to coll accelerator (DONE)
  • Runtime and maybe config time big flag to turn off/on accelerator support (Edgar/AMD)

Things to remove:

  • GNI BTL - no longer have access to systems to support this (Howard) (DONE)
  • UDREG Rcache - no longer have access to systems that can use this (Howard) (DONE)
  • FS/PVFS2 and FBTL/PVFS2 - no longer have access to systems to support this (Edgar) (DONE)
  • coll/sm (DONE)
  • Remove TKR version of use mpi module. (Howard)
    • This was deferred from 4.0.x in April/May 2018 (and again from v5.0.x in October 2018) because it was discovered that:
      1. The RHEL 7.x default gcc (4.8.5) still uses the TKR mpi module
      2. The NAG compiler still uses the TKR mpi module.

Collectives:

  • mca/coll: hierarchical MPI_Alltoall(v), MPI_Gatherv, MPI_Scatterv. (various orgs working on this)
  • might benefit from a JSON-file-based parameter file (AWS/Luke)
  • mca/coll: new algorithms (various orgs working on this)

There are quite a few open PRs related to collectives. Can some of these get merged? See notes from 2024 F2F Meeting

Random:

  • Sessions - add support for UCX PML (Howard, 2-3 weeks) (DONE)
  • Sessions - various small fixes (Howard, 1 month) (DONE)
  • Require C11 (Joseph PR being reviewed)

Likely to miss the 6.0.0 release

  • Phase 2 PRRTE
    • MCA parameters move into ompi namespace.
    • prte_info is gone, move those to ompi_info, perhaps a prte-mca option?
  • BTL Self accelerator aware (probably defer to later release)
  • What about smart pointers?
  • reduction op (and others) offload support (Joseph estimates 1-2 months to get in)
  • Stream-aware datatype engine.
  • Datatype engine accelerator awareness (e.g. memcpy2d) (George).
  • mca/coll: blocking reduction on accelerator (this is discussed above, Joseph)
  • Atomics - can we just rely on C11 and remove some of this code? We are currently using gcc atomics for performance reasons. Joseph would like to have a wrapper for atomic types and direct load/store access.
  • ZE support for IPC (maybe)
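For the atomics item above, a wrapper around C11 atomics with direct load/store access might look roughly like the following. This is a minimal sketch assuming only `<stdatomic.h>`; the `sketch_atomic_*` names and the chosen memory orders are hypothetical and do not reflect Open MPI's actual atomics layer or any design Joseph has proposed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper type: hides the C11 atomic behind a struct. */
typedef struct {
    _Atomic int32_t value;
} sketch_atomic_int32_t;

/* Initialize before the object is shared with other threads. */
void sketch_atomic_init(sketch_atomic_int32_t *a, int32_t v)
{
    assert(a != NULL);
    atomic_init(&a->value, v);
}

/* Direct store with release ordering. */
void sketch_atomic_store(sketch_atomic_int32_t *a, int32_t v)
{
    atomic_store_explicit(&a->value, v, memory_order_release);
}

/* Direct load with acquire ordering. */
int32_t sketch_atomic_load(sketch_atomic_int32_t *a)
{
    return atomic_load_explicit(&a->value, memory_order_acquire);
}

/* Adds delta and returns the value held before the addition. */
int32_t sketch_atomic_fetch_add(sketch_atomic_int32_t *a, int32_t delta)
{
    return atomic_fetch_add_explicit(&a->value, delta, memory_order_relaxed);
}

/*
 * Compare-and-swap: returns 1 if *expected matched and `desired` was stored.
 * On failure returns 0 and writes the current value into *expected.
 */
int sketch_atomic_cas(sketch_atomic_int32_t *a, int32_t *expected, int32_t desired)
{
    return atomic_compare_exchange_strong_explicit(
        &a->value, expected, desired,
        memory_order_acq_rel, memory_order_acquire);
}
```

Whether plain C11 atomics can match the performance of the gcc atomics currently in use is the open question the bullet raises; the sketch only shows the shape such a wrapper could take.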