Remove manual sections for obsolete machines and add ALCF Aurora #5280

Merged · 2 commits · Jan 14, 2025

191 changes: 7 additions & 184 deletions docs/installation.rst
@@ -822,89 +822,19 @@ package. This was successfully tested under OS X 10.15.7 "Catalina" on October 2

ctest -R deterministic

Installing on ALCF Theta, Cray XC40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Theta is a 9.65 petaflops system manufactured by Cray with 3,624 compute nodes.
Each node features a second-generation Intel Xeon Phi 7230 processor and 192 GB DDR4 RAM.

::

export CRAYPE_LINK_TYPE=dynamic
module load cmake/3.20.4
module unload cray-libsci
module load cray-hdf5-parallel
module load gcc/8.3.0 # Make C++ 14 standard library available to the Intel compiler
export BOOST_ROOT=/soft/libraries/boost/1.64.0/intel
cmake -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment ..
make -j 24
ls -l bin/qmcpack
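
Theta uses the Cobalt scheduler. A minimal job sketch for running the
resulting binary with one MPI rank per node; the project name, node count,
and thread count below are placeholders:

::

#!/bin/bash
#COBALT -n 2          # nodes (placeholder)
#COBALT -t 60         # walltime in minutes
#COBALT -A MYPROJECT  # project allocation (placeholder)

export OMP_NUM_THREADS=64
# aprun: -n total ranks, -N ranks per node, -d threads per rank,
# -j hardware threads per core, -cc depth for thread binding
aprun -n 2 -N 1 -d $OMP_NUM_THREADS -j 1 -cc depth ./bin/qmcpack input.xml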

Installing on ALCF Polaris
~~~~~~~~~~~~~~~~~~~~~~~~~~

Polaris is an HPE Apollo Gen10+ based 44-petaflops system.
Each node features an AMD EPYC 7543P CPU and four NVIDIA A100 GPUs.
A build recipe for Polaris can be found at ``<qmcpack_source>/config/build_alcf_polaris_Clang.sh``.
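
Polaris uses the PBS scheduler. A minimal job sketch with one MPI rank per
A100 GPU; the project name and rank/thread counts are placeholders:

::

#!/bin/bash
#PBS -l select=1:system=polaris
#PBS -l walltime=1:00:00
#PBS -q debug
#PBS -A MYPROJECT  # project allocation (placeholder)

cd ${PBS_O_WORKDIR}

NNODES=$(wc -l < $PBS_NODEFILE)
RANKS_PER_NODE=4   # one rank per GPU
TOTAL_RANKS=$((NNODES * RANKS_PER_NODE))

export OMP_NUM_THREADS=8
mpiexec -n $TOTAL_RANKS --ppn $RANKS_PER_NODE --depth=$OMP_NUM_THREADS --cpu-bind depth \
./bin/qmcpack --enable-timers=fine input.xml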

Installing on ORNL OLCF Summit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Summit is an IBM system at the ORNL OLCF built with IBM Power System AC922
nodes. Each node has two IBM POWER9 processors and six NVIDIA Volta V100
accelerators.

Building QMCPACK
^^^^^^^^^^^^^^^^

As of April 2023, LLVM Clang (>=15) is the only compiler validated by QMCPACK
developers on Summit for offloading computation to NVIDIA GPUs via OpenMP.

For ease of reproducibility, we provide build scripts for Summit.

::

cd qmcpack
./config/build_olcf_summit_Clang.sh
ls build_*/bin

Running QMCPACK
^^^^^^^^^^^^^^^
Job script example with one MPI rank per GPU.

::

#!/bin/bash
# Begin LSF directives
#BSUB -P MAT151
#BSUB -J test
#BSUB -o tst.o%J
#BSUB -W 60
#BSUB -nnodes 1
#BSUB -alloc_flags smt1
# End LSF directives and begin shell commands

module load gcc/9.3.0
module load spectrum-mpi
module load cuda
module load essl
module load netlib-lapack
module load hdf5/1.10.7
module load fftw
# private module until OLCF provides a new llvm build
module use /gpfs/alpine/mat151/world-shared/opt/modules
module load llvm/release-15.0.0-cuda11.0

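# Each Summit node contributes 42 usable cores; LSB_DJOB_NUMPROC also counts
# one core on the launch node, hence the -1 below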
NNODES=$(((LSB_DJOB_NUMPROC-1)/42))
RANKS_PER_NODE=6
RS_PER_NODE=6

exe_path=/gpfs/alpine/mat151/world-shared/opt/qmcpack/release-3.16.0/build_summit_Clang_offload_cuda_real/bin

prefix=NiO-fcc-S1-dmc

export OMP_NUM_THREADS=7
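# One resource set per node (-r 1): 6 ranks (-a), 42 cores (-c), 6 GPUs (-g);
# -b packed:7 binds 7 cores to each rank's OpenMP threads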
jsrun -n $NNODES -a $RANKS_PER_NODE -c $((RANKS_PER_NODE*OMP_NUM_THREADS)) -g 6 -r 1 -d packed -b packed:$OMP_NUM_THREADS \
--smpiargs="-disable_gpu_hooks" $exe_path/qmcpack --enable-timers=fine $prefix.xml >& $prefix.out

Installing on ALCF Aurora
~~~~~~~~~~~~~~~~~~~~~~~~~

Aurora is a 10,624-node HPE Cray EX based system comprising 166 racks with 21,248 CPUs and 63,744 GPUs.
Each node consists of two Intel Xeon CPU Max 9470C processors (codename Sapphire Rapids or SPR) with on-package HBM
and six Intel Data Center GPU Max 1550 GPUs (codename Ponte Vecchio or PVC).
Each Xeon has 52 physical cores supporting two hardware threads per core, along with 64 GB of HBM and 512 GB of DDR5.
A build recipe for Aurora can be found at ``<qmcpack_source>/config/build_alcf_aurora_icpx.sh``.
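
Aurora also uses the PBS scheduler. A minimal job sketch with one MPI rank
per GPU tile (each PVC exposes two tiles, giving 12 ranks per node); the
project name and thread count are placeholders, and the ALCF-provided
``gpu_tile_compact.sh`` tile-binding helper is assumed to be in ``PATH``:

::

#!/bin/bash
#PBS -l select=1:system=aurora
#PBS -l walltime=1:00:00
#PBS -q debug
#PBS -A MYPROJECT  # project allocation (placeholder)

cd ${PBS_O_WORKDIR}

NNODES=$(wc -l < $PBS_NODEFILE)
RANKS_PER_NODE=12  # one rank per GPU tile
TOTAL_RANKS=$((NNODES * RANKS_PER_NODE))

export OMP_NUM_THREADS=8
# gpu_tile_compact.sh: assumed ALCF helper that pins each rank to a distinct GPU tile
mpiexec -n $TOTAL_RANKS --ppn $RANKS_PER_NODE --depth=$OMP_NUM_THREADS --cpu-bind depth \
gpu_tile_compact.sh ./bin/qmcpack --enable-timers=fine input.xml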

Installing on ORNL OLCF Frontier/Crusher
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -965,113 +895,6 @@ Job script example with one MPI rank per GPU.
srun -n $TOTAL_RANKS --ntasks-per-node=$RANKS_PER_NODE --gpus-per-task=1 -c $THREAD_SLOTS --gpu-bind=closest \
$exe_path/qmcpack --enable-timers=fine $prefix.xml >& $prefix.out

Installing on NERSC Cori, Haswell Partition, Cray XC40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cori is a Cray XC40 installed at NERSC that includes nodes with 16-core
Intel "Haswell" processors. In the following example, the source code is
cloned into \$HOME/qmc/git\_QMCPACK and QMCPACK is built in the scratch
space.

::

mkdir $HOME/qmc
mkdir $HOME/qmc/git_QMCPACK
cd $HOME/qmc/git_QMCPACK
git clone https://github.com/QMCPACK/qmcpack.git
cd qmcpack
git checkout v3.7.0 # Edit for desired version
export CRAYPE_LINK_TYPE=dynamic
module unload cray-libsci
module load boost/1.70.0
module load cray-hdf5-parallel
module load cmake/3.14.4
module load gcc/8.3.0 # Make C++ 14 standard library available to the Intel compiler
cd $SCRATCH
mkdir build_cori_hsw
cd build_cori_hsw
cmake -DQMC_SYMLINK_TEST_FILES=0 -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment $HOME/qmc/git_QMCPACK/qmcpack/
nice make -j 8
ls -l bin/qmcpack

When the preceding was tested on June 15, 2020, the following module and
software versions were present:

::

build_cori_hsw> module list
Currently Loaded Modulefiles:
1) modules/3.2.11.4 13) xpmem/2.2.20-7.0.1.1_4.8__g0475745.ari
2) nsg/1.2.0 14) job/2.2.4-7.0.1.1_3.34__g36b56f4.ari
3) altd/2.0 15) dvs/2.12_2.2.156-7.0.1.1_8.6__g5aab709e
4) darshan/3.1.7 16) alps/6.6.57-7.0.1.1_5.10__g1b735148.ari
5) intel/19.0.3.199 17) rca/2.2.20-7.0.1.1_4.42__g8e3fb5b.ari
6) craype-network-aries 18) atp/2.1.3
7) craype/2.6.2 19) PrgEnv-intel/6.0.5
8) udreg/2.3.2-7.0.1.1_3.29__g8175d3d.ari 20) craype-haswell
9) ugni/6.0.14.0-7.0.1.1_7.32__ge78e5b0.ari 21) cray-mpich/7.7.10
10) pmi/5.0.14 22) craype-hugepages2M
11) dmapp/7.1.1-7.0.1.1_4.43__g38cf134.ari 23) gcc/8.3.0
12) gni-headers/5.0.12.0-7.0.1.1_6.27__g3b1768f.ari 24) cmake/3.14.4

The following slurm job file can be used to run the tests:

::

#!/bin/bash
#SBATCH --qos=debug
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32
#SBATCH --constraint=haswell
echo --- Start `date`
echo --- Working directory: `pwd`
ctest -VV -R deterministic
echo --- End `date`

Installing on NERSC Cori, Xeon Phi KNL partition, Cray XC40
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Cori is a Cray XC40 that includes Intel Xeon Phi Knights Landing (KNL) nodes. The following build recipe ensures that the code
generation is appropriate for the KNL nodes. The source is assumed to
be in \$HOME/qmc/git\_QMCPACK/qmcpack as per the Haswell example.

::

export CRAYPE_LINK_TYPE=dynamic
module swap craype-haswell craype-mic-knl # Only difference between Haswell and KNL recipes
module unload cray-libsci
module load boost/1.70.0
module load cray-hdf5-parallel
module load cmake/3.14.4
module load gcc/8.3.0 # Make C++ 14 standard library available to the Intel compiler
cd $SCRATCH
mkdir build_cori_knl
cd build_cori_knl
cmake -DQMC_SYMLINK_TEST_FILES=0 -DCMAKE_SYSTEM_NAME=CrayLinuxEnvironment $HOME/qmc/git_QMCPACK/qmcpack/
nice make -j 8
ls -l bin/qmcpack

When the preceding was tested on June 15, 2020, the following module and
software versions were present:

::

build_cori_knl> module list
Currently Loaded Modulefiles:
1) modules/3.2.11.4 13) xpmem/2.2.20-7.0.1.1_4.8__g0475745.ari
2) nsg/1.2.0 14) job/2.2.4-7.0.1.1_3.34__g36b56f4.ari
3) altd/2.0 15) dvs/2.12_2.2.156-7.0.1.1_8.6__g5aab709e
4) darshan/3.1.7 16) alps/6.6.57-7.0.1.1_5.10__g1b735148.ari
5) intel/19.0.3.199 17) rca/2.2.20-7.0.1.1_4.42__g8e3fb5b.ari
6) craype-network-aries 18) atp/2.1.3
7) craype/2.6.2 19) PrgEnv-intel/6.0.5
8) udreg/2.3.2-7.0.1.1_3.29__g8175d3d.ari 20) craype-mic-knl
9) ugni/6.0.14.0-7.0.1.1_7.32__ge78e5b0.ari 21) cray-mpich/7.7.10
10) pmi/5.0.14 22) craype-hugepages2M
11) dmapp/7.1.1-7.0.1.1_4.43__g38cf134.ari 23) gcc/8.3.0
12) gni-headers/5.0.12.0-7.0.1.1_6.27__g3b1768f.ari 24) cmake/3.14.4
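
The Haswell test job shown earlier can be adapted for KNL by changing the
node constraint and task count; a sketch for the 68-core KNL nodes:

::

#!/bin/bash
#SBATCH --qos=debug
#SBATCH --time=00:10:00
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=68
#SBATCH --constraint=knl
ctest -VV -R deterministic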

Installing on systems with ARMv8-based processors
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
