Home
hokiegeek2 edited this page Aug 3, 2021 · 5 revisions
Deploying Chapel programs such as Arkouda on Slurm via slurm-jupyterlab on k8s requires a Slurm batch script such as this one, which uses the C (custom) spawner to launch the locales through srun:
#!/bin/bash
#
#SBATCH --job-name=arkouda-2-node
#SBATCH --output=/tmp/arkouda.out
#SBATCH --mem=4096
#SBATCH --ntasks=3
#SBATCH --uid=1000
#SBATCH --nodes=3
export CHPL_COMM_SUBSTRATE=udp
export GASNET_MASTERIP=ace
export SSH_SERVERS='finkel einhorn shickadance'
export GASNET_SPAWNFN=C
export GASNET_CSPAWN_CMD="srun -N%N %C"
/tmp/arkouda_server -nl 3
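In the script above, `--nodes`, `--ntasks`, and `-nl` all carry the same locale count, so it can help to generate the file from a single value. The following is only a minimal sketch, not part of Arkouda or slurm-jupyterlab: the function name `make_arkouda_sbatch`, its argument order, and the job-name pattern are hypothetical, and the paths and settings simply mirror the example above.

```shell
# Hypothetical helper: print a batch script like the one above for N locales.
# Usage: make_arkouda_sbatch <num_locales> <master_host> <server hosts...>
make_arkouda_sbatch() {
  local nl="$1" master="$2"
  shift 2
  local servers="$*"   # remaining arguments, joined with spaces
  # Unquoted heredoc: ${...} is expanded, everything else is emitted literally.
  # The job-name pattern is an assumption made for this sketch.
  cat <<EOF
#!/bin/bash
#
#SBATCH --job-name=arkouda-${nl}-node
#SBATCH --output=/tmp/arkouda.out
#SBATCH --mem=4096
#SBATCH --ntasks=${nl}
#SBATCH --uid=1000
#SBATCH --nodes=${nl}
export CHPL_COMM_SUBSTRATE=udp
export GASNET_MASTERIP=${master}
export SSH_SERVERS='${servers}'
export GASNET_SPAWNFN=C
export GASNET_CSPAWN_CMD="srun -N%N %C"
/tmp/arkouda_server -nl ${nl}
EOF
}
```

For example, `make_arkouda_sbatch 3 ace finkel einhorn shickadance > arkouda.slurm` writes a file that can then be submitted with `sbatch arkouda.slurm` (the file name is arbitrary).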
Chapel jobs can also be deployed using the S (ssh) spawner, which starts the locales over ssh on the hosts listed in SSH_SERVERS:
#!/bin/bash
#
#SBATCH --job-name=arkouda-2-node
#SBATCH --output=/tmp/arkouda.out
#SBATCH --mem=4096
#SBATCH --ntasks=3
#SBATCH --uid=1000
# SBATCH --nodes=3
export CHPL_COMM_SUBSTRATE=udp
export GASNET_MASTERIP=ace
export SSH_SERVERS='finkel einhorn shickadance'
export GASNET_SPAWNFN=S
/tmp/arkouda_server -nl 3
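With the S spawner the locales are placed on the hosts named in `SSH_SERVERS`, so a quick check before launching can catch a host list that is shorter than the `-nl` value. This is a sketch under the assumption that one host per locale is wanted; `check_ssh_servers` is a hypothetical name, not a Chapel or GASNet utility.

```shell
# Hypothetical pre-flight check: does SSH_SERVERS list enough hosts for -nl?
# Usage: check_ssh_servers <num_locales>
check_ssh_servers() {
  local nl="$1"
  # Unquoted expansion splits the list on whitespace; "set --" then lets
  # $# report how many host names it contains.
  set -- $SSH_SERVERS
  if [ "$#" -lt "$nl" ]; then
    echo "SSH_SERVERS lists $# host(s), but $nl locale(s) were requested" >&2
    return 1
  fi
  return 0
}
```

In the batch script it could be called just before the server starts, for example `check_ssh_servers 3 && /tmp/arkouda_server -nl 3`.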
To fix a slurmd node in the "draining" state, use the method discussed here, entering the updates at the interactive scontrol prompt:
scontrol
scontrol: update NodeName=node10 State=DOWN Reason="undraining"
scontrol: update NodeName=node10 State=RESUME
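The same two updates can also be issued without the interactive prompt, since `scontrol` accepts a command on its command line. A minimal sketch follows; the `undrain_node` function and the `DRY_RUN` variable are hypothetical names introduced for illustration, and on a real cluster the commands need Slurm administrator privileges.

```shell
# Hypothetical wrapper around the two scontrol updates shown above.
# Usage: undrain_node <node_name>
undrain_node() {
  local node="$1"
  local run=""
  # With DRY_RUN set, print the commands instead of executing them.
  if [ -n "${DRY_RUN:-}" ]; then
    run="echo"
  fi
  # Mark the node DOWN first, then RESUME it; stop if the first step fails.
  $run scontrol update NodeName="$node" State=DOWN Reason="undraining" &&
  $run scontrol update NodeName="$node" State=RESUME
}
```

Running `DRY_RUN=1 undrain_node node10` prints the two commands for review; dropping `DRY_RUN` executes them.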