Is UCX working with MPI-Sessions? #12566

Closed
TimEllersiek opened this issue May 22, 2024 · 7 comments

@TimEllersiek

UCX and MPI-Sessions

When I try to use Open MPI with UCX on our small university cluster, I get an error message
saying that MPI Sessions features are not supported by UCX (the cluster uses an InfiniBand interconnect).
However, when I install the same stack on my local machine (Arch Linux),
everything seems to work fine. So I'm wondering whether MPI Sessions are supported by UCX or not.

Source Code (main.c):

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void function_my_session_errhandler(MPI_Session *foo, int *bar, ...) {
    fprintf(stderr, "my error handler called here with error %d\n", *bar);
}

void function_check_print_error(char *format, int rc) {
    if (MPI_SUCCESS != rc) {
        fprintf(stderr, format, rc);
        abort();
    }
}

int main(int argc, char *argv[]) {
    MPI_Session session;
    MPI_Errhandler errhandler;
    MPI_Group group;
    MPI_Comm comm_world, comm_self;
    MPI_Info info;
    int rc, npsets, one = 1, sum;

    rc = MPI_Session_create_errhandler(function_my_session_errhandler, &errhandler);
    function_check_print_error("Error handler creation failed with rc = %d\n", rc);


    rc = MPI_Info_create(&info);
    function_check_print_error("Info creation failed with rc = %d\n", rc);

    rc = MPI_Info_set(info, "thread_level", "MPI_THREAD_MULTIPLE");
    function_check_print_error("Info key/val set failed with rc = %d\n", rc);

    rc = MPI_Session_init(info, errhandler, &session);
    function_check_print_error("Session initialization failed with rc = %d\n", rc);

    rc = MPI_Session_get_num_psets(session, MPI_INFO_NULL, &npsets);
    function_check_print_error(" with rc = %d\n", rc);

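    /* Enumerate the available process sets: the first call (with pset_len == 0
       and a NULL buffer) only returns the required name length, the second
       call fills pset_name with the actual name. */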
    for (int i = 0; i < npsets; i++) {
        int psetlen = 0;
        char pset_name[256];

        MPI_Session_get_nth_pset(session, MPI_INFO_NULL, i, &psetlen, NULL);
        MPI_Session_get_nth_pset(session, MPI_INFO_NULL, i, &psetlen, pset_name);
        fprintf(stderr, "  PSET %d: %s (len: %d)\n", i, pset_name, psetlen);
    }


   
    rc = MPI_Group_from_session_pset(session, "mpi://WORLD", &group);
    function_check_print_error("Could not get a group for mpi://WORLD. rc = %d\n", rc);

    rc = MPI_Comm_create_from_group(group, "my_world", MPI_INFO_NULL, MPI_ERRORS_RETURN, &comm_world);
    function_check_print_error("Could not create Communicator my_world. rc = %d\n", rc);

    MPI_Group_free(&group);

    MPI_Allreduce(&one, &sum, 1, MPI_INT, MPI_SUM, comm_world);

    fprintf(stderr, "World Comm Sum (1): %d\n", sum);

    rc = MPI_Group_from_session_pset(session, "mpi://SELF", &group);
    function_check_print_error("Could not get a group for mpi://SELF. rc = %d\n", rc);

    MPI_Comm_create_from_group(group, "myself", MPI_INFO_NULL, MPI_ERRORS_RETURN, &comm_self);
    MPI_Group_free(&group);

    MPI_Allreduce(&one, &sum, 1, MPI_INT, MPI_SUM, comm_self);

    fprintf(stderr, "Self Comm Sum (1): %d\n", sum);


    MPI_Errhandler_free(&errhandler);
    MPI_Info_free(&info);
    MPI_Comm_free(&comm_world);
    MPI_Comm_free(&comm_self);
    MPI_Session_finalize(&session);

    return 0;
}

Commands used to compile and run

mpicc -o main main.c
mpirun -np 1 -mca osc ucx out/main
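
One thing worth noting when reproducing this: -mca osc ucx only selects the one-sided (RMA) component, while the check that fails on the cluster comes from the PML framework (the cluster run below uses -mca pml ucx). Assuming a stock Open MPI 5 installation, the UCX PML can be forced or excluded explicitly to compare behaviour:

mpirun -np 1 --mca pml ucx ./main      # force the UCX PML (this is what fails with sessions on the cluster)
mpirun -np 1 --mca pml ^ucx ./main     # exclude UCX so another PML is selected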

Console Output Uni-Cluster

$ mpirun -np 1 -mca pml ucx main
  PSET 0: mpi://WORLD (len: 12)
  PSET 1: mpi://SELF (len: 11)
  PSET 2: mpix://SHARED (len: 14)
Could not create Communicator my_world. rc = 52
[nv46:97180] *** Process received signal ***
[nv46:97180] Signal: Aborted (6)
[nv46:97180] Signal code:  (-6)
--------------------------------------------------------------------------
Your application has invoked an MPI function that is not supported in
this environment.

  MPI function: MPI_Comm_from_group/MPI_Intercomm_from_groups
  Reason:       The PML being used - ucx - does not support MPI sessions related features
--------------------------------------------------------------------------
[nv46:97180] [ 0] /usr/lib/libc.so.6(+0x3c770)[0x72422de41770]
[nv46:97180] [ 1] /usr/lib/libc.so.6(+0x8d32c)[0x72422de9232c]
[nv46:97180] [ 2] /usr/lib/libc.so.6(gsignal+0x18)[0x72422de416c8]
[nv46:97180] [ 3] /usr/lib/libc.so.6(abort+0xd7)[0x72422de294b8]
[nv46:97180] [ 4] main(+0x12f4)[0x6239e33802f4]
[nv46:97180] [ 5] main(+0x1585)[0x6239e3380585]
[nv46:97180] [ 6] /usr/lib/libc.so.6(+0x25cd0)[0x72422de2acd0]
[nv46:97180] [ 7] /usr/lib/libc.so.6(__libc_start_main+0x8a)[0x72422de2ad8a]
[nv46:97180] [ 8] main(+0x1165)[0x6239e3380165]
[nv46:97180] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 97180 on node nv46 exited on
signal 6 (Aborted).

Console Output Local:

$ mpirun -np 1 -mca osc ucx main
  PSET 0: mpi://WORLD (len: 12)
  PSET 1: mpi://SELF (len: 11)
  PSET 2: mpix://SHARED (len: 14)
  World Comm Sum (1): 1
  Self Comm Sum (1): 1

Installation

Small Uni-Cluster

UCX Output

Output of configure-release:

configure:           ASAN check:   no
configure:         Multi-thread:   disabled
configure:            MPI tests:   disabled
configure:          VFS support:   yes
configure:        Devel headers:   no
configure: io_demo CUDA support:   no
configure:             Bindings:   < >
configure:          UCS modules:   < fuse >
configure:          UCT modules:   < ib rdmacm cma >
configure:         CUDA modules:   < >
configure:         ROCM modules:   < >
configure:           IB modules:   < >
configure:          UCM modules:   < >
configure:         Perf modules:   < >

After make install:

$UCXFOLDER/myinstall/bin/ucx_info -v
# Library version: 1.17.0
# Library path: ${HOME}/itoyori/ucx/myinstall/lib/libucs.so.0
# API headers version: 1.17.0
# Git branch 'master', revision a48ad8f
# Configured with: --disable-logging --disable-debug --disable-assertions --disable-params-check --prefix=${HOME}/itoyori/ucx/myinstall --without-go

OpenMPI

Output of configure:

Open MPI configuration:
-----------------------
Version: 5.0.3
MPI Standard Version: 3.1
Build MPI C bindings: yes
Build MPI Fortran bindings: mpif.h, use mpi, use mpi_f08
Build MPI Java bindings (experimental): no
Build Open SHMEM support: yes
Debug build: no
Platform file: (none)
 
Miscellaneous
-----------------------
Atomics: GCC built-in style atomics
Fault Tolerance support: mpi
HTML docs and man pages: installing packaged docs
hwloc: external
libevent: external
Open UCC: no
pmix: external
PRRTE: external
Threading Package: pthreads
 
Transports
-----------------------
Cisco usNIC: no
Cray uGNI (Gemini/Aries): no
Intel Omnipath (PSM2): no (not found)
Open UCX: yes
OpenFabrics OFI Libfabric: yes (pkg-config: default search paths)
Portals4: no (not found)
Shared memory/copy in+copy out: yes
Shared memory/Linux CMA: yes
Shared memory/Linux KNEM: no
Shared memory/XPMEM: no
TCP: yes
 
Accelerators
-----------------------
CUDA support: no
ROCm support: no
 
OMPIO File Systems
-----------------------
DDN Infinite Memory Engine: no
Generic Unix FS: yes
IBM Spectrum Scale/GPFS: no (not found)
Lustre: no (not found)
PVFS2/OrangeFS: no

Local

UCX Output

Output of configure-release:

configure: =========================================================
configure: UCX build configuration:
configure:         Build prefix:   ${HOME}/ucx/myinstall
configure:    Configuration dir:   ${prefix}/etc/ucx
configure:   Preprocessor flags:   -DCPU_FLAGS="" -I${abs_top_srcdir}/src -I${abs_top_builddir} -I${abs_top_builddir}/src
configure:           C compiler:   gcc -O3 -g -Wall -Werror -funwind-tables -Wno-missing-field-initializers -Wno-unused-parameter -Wno-unused-label -Wno-long-long -Wno-endif-labels -Wno-sign-compare -Wno-multichar -Wno-deprecated-declarations -Winvalid-pch -Wno-pointer-sign -Werror-implicit-function-declaration -Wno-format-zero-length -Wnested-externs -Wshadow -Werror=declaration-after-statement
configure:         C++ compiler:   g++ -O3 -g -Wall -Werror -funwind-tables -Wno-missing-field-initializers -Wno-unused-parameter -Wno-unused-label -Wno-long-long -Wno-endif-labels -Wno-sign-compare -Wno-multichar -Wno-deprecated-declarations -Winvalid-pch
configure:         Multi-thread:   disabled
configure:            MPI tests:   disabled
configure:          VFS support:   yes
configure:        Devel headers:   no
configure: io_demo CUDA support:   no
configure:             Bindings:   < >
configure:          UCS modules:   < fuse >
configure:          UCT modules:   < cma >
configure:         CUDA modules:   < >
configure:         ROCM modules:   < >
configure:           IB modules:   < >
configure:          UCM modules:   < >
configure:         Perf modules:   < >
configure: =========================================================

After make install:

$UCXFOLDER/myinstall/bin/ucx_info -v
# Library version: 1.16.0
# Library path: ${HOME}/ucx/myinstall/lib/libucs.so.0
# API headers version: 1.16.0
# Git branch '', revision e4bb802
# Configured with: --disable-logging --disable-debug --disable-assertions --disable-params-check --prefix=${HOME}/ucx/myinstall --without-go

OpenMPI Output

Output of configure:

Open MPI configuration:
-----------------------
Version: 5.0.3
MPI Standard Version: 3.1
Build MPI C bindings: yes
Build MPI Fortran bindings: no
Build MPI Java bindings (experimental): no
Build Open SHMEM support: yes
Debug build: no
Platform file: (none)
 
Miscellaneous
-----------------------
Atomics: GCC built-in style atomics
Fault Tolerance support: mpi
HTML docs and man pages: installing packaged docs
hwloc: internal
libevent: external
Open UCC: no
pmix: internal
PRRTE: internal
Threading Package: pthreads
 
Transports
-----------------------
Cisco usNIC: no
Cray uGNI (Gemini/Aries): no
Intel Omnipath (PSM2): no (not found)
Open UCX: yes
OpenFabrics OFI Libfabric: no (not found)
Portals4: no (not found)
Shared memory/copy in+copy out: yes
Shared memory/Linux CMA: yes
Shared memory/Linux KNEM: no
Shared memory/XPMEM: no
TCP: yes
 
Accelerators
-----------------------
CUDA support: no
ROCm support: no
 
OMPIO File Systems
-----------------------
DDN Infinite Memory Engine: no
Generic Unix FS: yes
IBM Spectrum Scale/GPFS: no (not found)
Lustre: no (not found)
PVFS2/OrangeFS: no

MPI and UCX Installation

Directory structure:

${HOME}/ucx
${HOME}/openmpi-5.0.3

Install OpenUCX

cd ${HOME}
git clone https://github.com/openucx/ucx.git
cd ucx
git checkout v1.16.0
export UCXFOLDER=${HOME}/ucx
./autogen.sh
./contrib/configure-release --prefix=$UCXFOLDER/myinstall --without-go

Install:

make -j32
make install
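
After installing, it can help to confirm which transports UCX actually detects on the node (the cluster build above lists the ib and rdmacm UCT modules, the local build only cma). This is a general check, not a step from the original report:

$UCXFOLDER/myinstall/bin/ucx_info -d | grep -i transport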

OpenMPI

cd ${HOME}
wget https://download.open-mpi.org/release/open-mpi/v5.0/openmpi-5.0.3.tar.gz
tar xfvz openmpi-5.0.3.tar.gz
export MPIFOLDER=${HOME}/openmpi-5.0.3
cd $MPIFOLDER
./configure --disable-io-romio --with-io-romio-flags=--without-ze --disable-sphinx --prefix="$MPIFOLDER/myinstall" --with-ucx="$UCXFOLDER/myinstall" 2>&1 | tee config.out

Install:

make -j32 all 2>&1 | tee make.out
make install 2>&1 | tee install.out
export OMPI="${MPIFOLDER}/myinstall"
export PATH=$OMPI/bin:$PATH
export LD_LIBRARY_PATH=$OMPI/lib:$LD_LIBRARY_PATH
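
A quick way to see which PML Open MPI actually selects at run time (useful for comparing the cluster and the local machine; a generic debugging step, not part of the original instructions):

mpirun -np 1 --mca pml_base_verbose 10 ./main 2>&1 | grep -i pml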
@janjust
Contributor

janjust commented May 23, 2024

So I'm wondering whether the MPI-Sessions are supported by UCX or not?

Yes, that's the case: MPI Sessions are not supported by UCX.

@hppritcha
Member

On the feature list for the next major release.

@TimEllersiek
Author

Thanks for the answers.

@devreal
Contributor

devreal commented May 24, 2024

Let's keep this open until it's fixed. Other people will probably run into this too.

@devreal devreal reopened this May 24, 2024
@jprotze

jprotze commented Jul 23, 2024

Is there any way to run Open MPI 5 with sessions?
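
A possible interim workaround, inferred from the commit message below rather than stated explicitly in this thread: the sessions/excid path is implemented in the OB1 PML, so forcing a non-UCX PML may allow sessions-based programs to run, e.g.

mpirun -np 2 --mca pml ob1 ./main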

hppritcha added a commit to hppritcha/ompi that referenced this issue Jul 30, 2024
Greatly simplify support for MPI_Comm_create_from_group and
MPI_Intercomm_create_from_group by removing the need to support
the 128-bit excid notion.

Instead, make use of a PMIx capability - PMIX_GROUP_LOCAL_CID and the notion of
PMIX_GROUP_INFO. This capability was introduced in Open PMIx 4.1.3.
This capability allows us to piggy-back a local cid selected
for the new communicator on the PMIx_Group_construct operation.
Using this approach, a lot of the complex active message style operations
implemented in the OB1 PML to support excids can be avoided.

This PR also includes simplifications to the OFI MTL to make use of the
PMIX_GROUP_LOCAL_CID feature.

Infrastructure for debugging communicator management routines was also
introduced, along with a new MCA parameter -  mpi_comm_verbose.

Related to open-mpi#12566

Signed-off-by: Howard Pritchard <[email protected]>
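
A rough sketch of the mechanism described in that commit message (an illustrative assumption, not the actual Open MPI code from the PR; it assumes Open PMIx >= 4.1.3 and that procs/nprocs already describe the group members):

#include <pmix.h>

/* Each process picks a communicator id that is unique locally (my_local_cid)
 * and publishes it as group info during the collective group construct, so
 * peers can later look it up instead of running a separate active-message
 * exchange. */
static pmix_status_t publish_local_cid(const char *grp_name,
                                       const pmix_proc_t *procs, size_t nprocs,
                                       size_t my_local_cid)
{
    pmix_info_t *cid_info;
    pmix_info_t directive;
    pmix_data_array_t darray;
    pmix_info_t *results = NULL;
    size_t nresults = 0;
    pmix_status_t rc;

    /* Wrap the local cid in a PMIX_GROUP_INFO array so the PMIx server
     * stores it with the group. */
    PMIX_INFO_CREATE(cid_info, 1);
    PMIX_INFO_LOAD(&cid_info[0], PMIX_GROUP_LOCAL_CID, &my_local_cid, PMIX_SIZE);

    darray.type  = PMIX_INFO;
    darray.size  = 1;
    darray.array = cid_info;
    PMIX_INFO_LOAD(&directive, PMIX_GROUP_INFO, &darray, PMIX_DATA_ARRAY);

    /* Collective over all group members; the per-process info piggy-backs
     * on the group construct operation. */
    rc = PMIx_Group_construct(grp_name, procs, nprocs, &directive, 1,
                              &results, &nresults);

    if (NULL != results) {
        PMIX_INFO_FREE(results, nresults);
    }
    PMIX_INFO_DESTRUCT(&directive);
    PMIX_INFO_FREE(cid_info, 1);
    return rc;
}

On the receiving side, a peer would look up the other process' cid via PMIx_Get on that proc with the PMIX_GROUP_LOCAL_CID key; the exact lookup used by Open MPI may differ.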
@hppritcha
Member

closed via #12723

@hppritcha
Member

No plans currently to push these changes back to v5.0.x branch.
