Version 2.2.2
Here is a Python script to run and monitor Slurm jobs from YAML configurations.
```bash
git clone https://github.com/Rhoana/3dxp.git
cd 3dxp/TASKS
```
0.5) Only if using 3DXP:

```bash
. harvard/environment.sh
```

(On the Harvard Odyssey Cluster)†

or

```bash
pip install -r ../PYTHON/requirements.txt
```

(Anywhere with pip...)

Then run:

```bash
python slyml.py my_config.yaml
```
Example: go from a volume to meshes in many blocks, handled in parallel.

```bash
cp demos/many.yaml my_config.yaml
```
In `Default.Constants`,

- Set `OUTPUT` to wherever you want your meshes.
- Set `HD_IDS` to the path to your input HDF5 file.
- Set `TODAY` to the current date (for logging).
- `BLOCK_COUNT: 4` breaks meshes into 64 blocks.
    - Each block is ¼×¼×¼ of the volume.
In `Main.Inputs`,

- Set `LIST_IDS` to select segments for STL generation.
```yaml
Main:
    python: ./PYTHON/all_stl.py
    args: "-b {BLOCK_COUNT} -l {LIST_IDS} {HD_IDS} {MESH_IDS}"
    Inputs:
        MESH_IDS: "{OUTPUT}/{TODAY}/many"
        LIST_IDS: "1:200:300"
Default:
    Constants:
        TODAY: "2017-11-11"
        HD_IDS: ~/data/ids.h5
        OUTPUT: ~/data
        BLOCK_COUNT: 4
    Workdir: "git rev-parse --show-toplevel"
    Slurm: ./SLURM/many.sbatch
    Exports: [python, args]
    Logs: "./LOGS/{TODAY}"
    Runs: "{BLOCK_COUNT}**3"
```
Running `python slyml.py my_config.yaml` acts like this:
```bash
TODAY="2017-11-11"
HD_IDS=~/data/ids.h5
OUTPUT=~/data
BLOCK_COUNT=4
Workdir=`git rev-parse --show-toplevel`
Slurm=$Workdir/SLURM/many.sbatch
Logs=$Workdir/LOGS/$TODAY
Runs=$((BLOCK_COUNT**3))
MESH_IDS=$OUTPUT/$TODAY/many
LIST_IDS="1:200:300"
export python="$Workdir/PYTHON/all_stl.py"
export args="-b $BLOCK_COUNT -l $LIST_IDS $HD_IDS $MESH_IDS"
sbatch --job-name=A --output=$Logs/A/array_%a.out --error=$Logs/A/array_%a.err --workdir=$Workdir --export=ALL --array=0-$((Runs-1)) $Slurm
```
This particular demo runs a python script from a very general sbatch file. The YAML file sent to `slyml.py` can parallelize any command by exporting environment variables to any sbatch file. The general format of `my_config.yaml` is given by `python slyml.py -h`!
- If your data does not need to be parallelized, you can omit `Runs` to schedule one job.
    - Set `BLOCK_COUNT` to 1 to handle the whole volume, as in the sketch below.
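For instance, a minimal one-job sketch of the demo above (the `one` output name is illustrative):

```yaml
# A one-job sketch: with no Runs key, slyml.py schedules a single job
# rather than a job array.
Main:
    python: ./PYTHON/all_stl.py
    args: "-b {BLOCK_COUNT} -l {LIST_IDS} {HD_IDS} {MESH_IDS}"
    Inputs:
        MESH_IDS: "{OUTPUT}/{TODAY}/one"
        LIST_IDS: "1:200:300"
Default:
    Constants:
        TODAY: "2017-11-11"
        HD_IDS: ~/data/ids.h5
        OUTPUT: ~/data
        BLOCK_COUNT: 1      # one block: the whole volume in a single job
    Workdir: "git rev-parse --show-toplevel"
    Slurm: ./SLURM/many.sbatch
    Exports: [python, args]
    Logs: "./LOGS/{TODAY}"
```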
- We can write the examples for one job and many jobs with fewer lines in a combined file.
- The `slyml.py` script will use any entry (like `Main`) if passed as the second argument.
- With the power to anchor `&`, refer `*`, and extend `<<:` objects and lists, YAML allows the quick recombination of tasks and parameters. A combined sketch follows.
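Here the `Task`, `One`, and `Many` entry names are illustrative, and per-task `Runs` is an assumption (the demo above sets it under `Default`):

```yaml
# A sketch of a combined file: One and Many share a template through
# a YAML anchor (&), reference (*), and merge key (<<:).
Task: &template
    python: ./PYTHON/all_stl.py
    args: "-b {BLOCK_COUNT} -l {LIST_IDS} {HD_IDS} {MESH_IDS}"

Many:
    <<: *template
    Runs: "{BLOCK_COUNT}**3"   # assumed per-task; the demo sets Runs under Default
    Inputs:
        MESH_IDS: "{OUTPUT}/{TODAY}/many"
        LIST_IDS: "1:200:300"

One:
    <<: *template
    Inputs:
        MESH_IDS: "{OUTPUT}/{TODAY}/one"
        LIST_IDS: "1:200:300"

Default:
    Constants:
        TODAY: "2017-11-11"
        HD_IDS: ~/data/ids.h5
        OUTPUT: ~/data
        BLOCK_COUNT: 4
    Workdir: "git rev-parse --show-toplevel"
    Slurm: ./SLURM/many.sbatch
    Exports: [python, args]
    Logs: "./LOGS/{TODAY}"
```

Then `python slyml.py my_config.yaml Many` schedules the array job, and `python slyml.py my_config.yaml One` schedules the single job.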
- The `Main` can have `Needs` that must be completed before `Main` can start (sketched below).
    - The `Needs` can have `Needs`, recursively and indefinitely.
    - The `Needs` inherit `Constants` and `Inputs` as `Default` values.
    - Other keywords like `Slurm`, `Workdir`, or user-defined keys are not inherited.
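A sketch of a two-step chain, where `some_preprocess.py` is a hypothetical script and a single nested task is assumed (the notes above do not say whether `Needs` also accepts a list):

```yaml
Main:
    python: ./PYTHON/all_stl.py
    args: "-b {BLOCK_COUNT} -l {LIST_IDS} {HD_IDS} {MESH_IDS}"
    Inputs:
        MESH_IDS: "{OUTPUT}/{TODAY}/many"
        LIST_IDS: "1:200:300"
    Needs:
        # Scheduled first: Main waits for this task to complete.
        # Needs inherit Constants and Inputs as Default values,
        # so {HD_IDS} is available here too.
        python: ./PYTHON/some_preprocess.py   # hypothetical script
        args: "{HD_IDS}"
        # A Needs task could carry its own Needs, recursively.
Default:
    Constants:
        TODAY: "2017-11-11"
        HD_IDS: ~/data/ids.h5
        OUTPUT: ~/data
        BLOCK_COUNT: 4
```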
- The `slyml.py` script takes optional keywords that can be set and unset directly in the yaml file (see the snippet below).
    - `python slyml.py -q` or `Default.Quiet` (only log errors or warnings).
    - `python slyml.py -d` or `Default.Debug` (only log, without scheduling jobs).
- Any key understands absolute paths (`/`), or paths relative to the `Workdir` (`.`) or home directory (`~`), as illustrated below.
    - The `Workdir` can be an existing directory or a valid shell command. No other keyword works like this.
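A sketch of the path styles, with illustrative paths:

```yaml
Default:
    # Workdir may be an existing directory or a valid shell command;
    # no other keyword accepts a command.
    Workdir: "git rev-parse --show-toplevel"
    Slurm: ./SLURM/many.sbatch      # "." is relative to the Workdir
    Logs: "~/logs/{TODAY}"          # "~" is relative to the home directory
    Constants:
        TODAY: "2017-11-11"
        HD_IDS: /n/data/ids.h5      # "/" is an absolute path
```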
- The `Exports` key lists all keys to export to the `Slurm` file.
- The `Flags` key lists all keys to use as flags to `sbatch`.
- The `Evals` key lists all keys to evaluate as `python`.
    - By default, this is `[Runs, Sync]`.

The three are combined in the sketch below.
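Here `time` and `partition` are hypothetical task keys, and the exact flag formatting is assumed:

```yaml
Main:
    python: ./PYTHON/all_stl.py
    args: "-b {BLOCK_COUNT}"
    time: "0-03:00"            # hypothetical: passed to sbatch via Flags
    partition: general         # hypothetical: passed to sbatch via Flags
Default:
    Constants:
        BLOCK_COUNT: 4
    Exports: [python, args]    # exported as environment variables to the Slurm file
    Flags: [time, partition]   # extra keys passed through as sbatch flags
    Evals: [Runs, Sync]        # evaluated as python (this is the default)
    Runs: "{BLOCK_COUNT}**3"   # evaluates to 64
```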
- The `Constants` and `Inputs` can format any value except `Workdir`, `Exports`, `Evals`, or `Flags` (see below).
    - But only the `Constants` can format the values for the `Inputs`.
    - The `Constants` cannot be formatted and must be literal.
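A sketch of what can and cannot be formatted:

```yaml
Default:
    Constants:
        # Constants are literal: they cannot be formatted themselves.
        OUTPUT: ~/data
        TODAY: "2017-11-11"
Main:
    Inputs:
        LIST_IDS: "1:200:300"
        # OK: only Constants may format Inputs values.
        MESH_IDS: "{OUTPUT}/{TODAY}/many"
    # OK: Constants and Inputs may format any other value...
    args: "-l {LIST_IDS} {MESH_IDS}"
    # ...except Workdir, Exports, Evals, and Flags.
```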
- To run jobs, set `Slurm` to the path of a valid `sbatch` file. A minimal example follows.
    - Based on `Runs`, the `sbatch` file can use `$SLURM_ARRAY_TASK_ID` and `$SLURM_ARRAY_TASK_COUNT`.
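A minimal sketch of such a file, assuming `python` and `args` are exported as in the demo above (the real `SLURM/many.sbatch` may differ):

```bash
#!/bin/bash
# A sketch of a general sbatch file: $python and $args arrive through
# `--export=ALL` from the Exports list in the config.
echo "Block $SLURM_ARRAY_TASK_ID of $SLURM_ARRAY_TASK_COUNT"
# $args is left unquoted so it word-splits into separate arguments.
python "$python" $args
```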
- Internally, `slyml.py` sets the `sbatch` arguments `job-name`, `workdir`, `array`, `dependency`, `output`, `error`, and `export`.
    - The `Flags` key selects any other `sbatch` flags from the keys of each task.
- Only Unix-like relative paths are expanded to absolute paths.
† Running `slyml.py` requires only a simple setup on the Harvard cluster, but in this example we also set up a virtual environment with the libraries needed to run the Python used in 3DXP.