Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
Package: Open MPI root@sharp-ci-02 Distribution
Open MPI: 4.1.5rc2
Open MPI repo revision: v4.1.5rc1-16-g5980bac633
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Installed from a git clone.
If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.
Please describe the system on which you are running
Operating system/version:
Computer hardware:
Network type:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 36
On-line CPU(s) list: 0-35
Thread(s) per core: 1
Core(s) per socket: 18
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 85
Model name: Intel(R) Xeon(R) Gold 6240 CPU @ 2.60GHz
Stepping: 7
CPU MHz: 999.914
CPU max MHz: 3900.0000
CPU min MHz: 1000.0000
BogoMIPS: 5200.00
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 1024K
L3 cache: 25344K
NUMA node0 CPU(s): 0-17
NUMA node1 CPU(s): 18-35
Details of the problem
Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.
I have a compute cluster and want to run Open MPI communication over RDMA. I have configured Open MPI to use UCX and set the UCX transports to RC (Reliable Connected) and UD (Unreliable Datagram). I also set UCX_NET_DEVICES=mlx5_0. RDMA communication within a single node already works. However, when I configure a host file and try to run Open MPI over RDMA across nodes, error messages are reported.
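For reference, a cross-node launch of the kind described above could be assembled roughly as follows. This is only a sketch under stated assumptions, not the command actually used in this report: it assumes the transports were selected through `UCX_TLS`, and the host file name `hosts.txt`, the process count, and the program name `./my_mpi_app` are hypothetical placeholders. The `--mca pml ucx` option selects the UCX PML, and `-x` asks `mpirun` to export an environment variable to every launched rank, including ranks on remote nodes.

```python
import shlex


def build_mpirun_command(hostfile, nprocs, program,
                         tls=("rc", "ud"), net_device="mlx5_0"):
    """Return an mpirun argument list that selects the UCX PML and
    forwards the UCX settings to all ranks via -x.

    Assumption: the RC/UD transports are chosen through UCX_TLS.
    hostfile, nprocs and program are placeholders supplied by the caller.
    """
    return [
        "mpirun",
        "--hostfile", hostfile,                   # list of nodes to launch on
        "-np", str(nprocs),                       # total number of ranks
        "--mca", "pml", "ucx",                    # use the UCX point-to-point layer
        "-x", "UCX_TLS=" + ",".join(tls),         # export transport selection
        "-x", "UCX_NET_DEVICES=" + net_device,    # export device selection
        program,
    ]


if __name__ == "__main__":
    # Hypothetical file and program names, for illustration only.
    cmd = build_mpirun_command("hosts.txt", 2, "./my_mpi_app")
    print(shlex.join(cmd))
```

Printing the command before running it makes it easy to compare the single-node and cross-node invocations and to check that the UCX variables are passed with `-x` rather than set only in the local shell.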