GaNDLF Experiments Template

This repo contains a mechanism to run multiple GaNDLF experiments on the UPenn CUBIC cluster.

Pre-requisites

  • This repo will allow you to submit multiple GPU jobs on the cluster, which gives you great power; and you know what comes with that. If Mark's wrath falls upon you, you are on your own.
  • Be intimately familiar with the data you are going to use.
  • Be familiar with GaNDLF's usage, and try training for a single epoch on the toy dataset.
  • You have installed GaNDLF in your home directory or comp_space.
  • You have run a single epoch of the GaNDLF training loop (training and validation) on your own data, either on the CUBIC cluster or on your own machine, so that you know how to customize the configuration.

Configurations

All configuration options can be changed depending on the experiment at hand.

  • The file config_generator.py contains examples showing how the various hyper-parameters can be altered to create different configurations.
  • The user is free to choose the folder and configuration file structure.
  • It is suggested that the user alter only a few hyper-parameters while keeping the rest constant. This allows meaningful comparisons between different experiments.
  • This repo allows the creation of such an extensive experimental design.
  • The only requirement is that all configurations be generated under a single top-level folder. An example of such a structure, exploring different architectures for learning rates of [0.1, 0.01] and optimizers of [adam, sgd], is shown:
experiment_template_folder
│
├── README.md
├── unet
│   ├── lr_0.1_adam.yaml
│   ├── lr_0.01_adam.yaml
│   ├── lr_0.1_sgd.yaml
│   └── lr_0.01_sgd.yaml
├── transunet
│   ├── lr_0.1_adam.yaml
│   ├── lr_0.01_adam.yaml
│   ├── lr_0.1_sgd.yaml
│   └── lr_0.01_sgd.yaml
├── ...
└── unetr
    └── ...
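
The layout above can be produced by looping over the hyper-parameter grid. The following is a minimal sketch of that idea, not the contents of config_generator.py: the function name write_config_grid is hypothetical, and the YAML keys (model/architecture, learning_rate, optimizer) are assumed to match the GaNDLF configuration you already use. Only the varied keys are written; a real configuration also needs the options that stay constant across experiments.

```python
import itertools
from pathlib import Path

# Hypothetical grid; the values mirror the example folder structure.
ARCHITECTURES = ["unet", "transunet"]
LEARNING_RATES = [0.1, 0.01]
OPTIMIZERS = ["adam", "sgd"]


def write_config_grid(root):
    """Write one YAML file per (architecture, learning rate, optimizer)."""
    root = Path(root)
    written = []
    for arch, lr, opt in itertools.product(ARCHITECTURES, LEARNING_RATES, OPTIMIZERS):
        arch_dir = root / arch
        arch_dir.mkdir(parents=True, exist_ok=True)
        config_path = arch_dir / f"lr_{lr}_{opt}.yaml"
        # Assumed key names; add the options shared by all experiments here.
        config_path.write_text(
            f"model:\n  architecture: {arch}\n"
            f"learning_rate: {lr}\n"
            f"optimizer: {opt}\n"
        )
        written.append(config_path)
    return written
```

Calling write_config_grid("experiment_template_folder") would create one sub-folder per architecture with four configuration files each, all under the single top-level folder that the submitter expects.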

Usage

Creating Configurations

  • Once the experimental design has been established, the configurations can be generated using the config_generator.py script.
  • The user can edit this file to create the desired configurations.
python config_generator.py

Submitting Jobs to the CUBIC Cluster

python submitter.py -h
usage: GANDLF_Experiment_Submitter [-h] [-i] [-g] [-d] [-f] [-r] [-e] [-gpu] [-gpur]

Submit GaNDLF experiments on CUBIC Cluster.

Contact: [email protected]

This program is NOT FDA/CE approved and NOT intended for clinical use.
Copyright (c) 2023 University of Pennsylvania. All rights reserved.

optional arguments:
  -h, --help            show this help message and exit
  -i , --interpreter    Full path of python interpreter to be called.
  -g , --gandlfrun      Full path of 'gandlf_run' script to be called.
  -d , --datafile       Full path to 'data.csv'.
  -f , --foldertocopy   Full path to the data folder to copy into the location in '$CBICA_TMP'.
  -r , --runnerscript   'runner.sh' script to be called.
  -e , --email          Email address to be used for notifications.
  -gpu , --gputype      The parameter to pass after '-l' to the submit command.
  -gpur , --gpuratio    The number of jobs (starting from '0') to send to 'gpu' vs 'A40', since 'gpu' is more prevalent - ignores parameter `--gputype`.
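
When the same options are reused across many submissions, it can help to assemble the command in one place. The sketch below only builds the argument list from the flags shown in the help text; the helper name build_submitter_command and every path and address in it are placeholders, not values from this repo.

```python
import shlex


def build_submitter_command(interpreter, gandlf_run, datafile, email, gputype=None):
    """Assemble an argument list for submitter.py from the documented flags."""
    cmd = [
        "python", "submitter.py",
        "-i", interpreter,   # full path of the python interpreter
        "-g", gandlf_run,    # full path of the 'gandlf_run' script
        "-d", datafile,      # full path to 'data.csv'
        "-e", email,         # address used for notifications
    ]
    if gputype is not None:
        cmd += ["-gpu", gputype]  # passed after '-l' to the submit command
    return cmd


# Placeholder values for illustration only.
example = build_submitter_command(
    interpreter="/path/to/venv/bin/python",
    gandlf_run="/path/to/gandlf_run",
    datafile="/path/to/data.csv",
    email="user@example.com",
    gputype="A40",
)
print(shlex.join(example))
```

The resulting list can be handed to subprocess.run, or the printed line can be pasted into a shell after checking it.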

Getting overall statistics

The following command will collect the training and validation logs from all experiments and provide the best loss values along with specified metrics for each experiment:

python config_generator.py -c False

This will generate a file best_info.csv in the current directory. This file can be used to generate a table of best results for each experiment.
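
To illustrate what "best loss" means for a single experiment, here is a minimal sketch that picks the row with the lowest loss from one log file. It is not the script's actual implementation: the function name best_epoch is hypothetical, and the log is assumed to be a CSV with a header row whose column names (for example valid_loss) you pass in yourself.

```python
import csv


def best_epoch(log_path, loss_column):
    """Return the row (as a dict of strings) with the smallest loss value."""
    with open(log_path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    if not rows:
        raise ValueError(f"no rows found in {log_path}")
    # Loss values are stored as text in the CSV, so compare them as floats.
    return min(rows, key=lambda row: float(row[loss_column]))
```

Applying this to the validation log of each experiment folder and collecting the returned rows gives the kind of per-experiment summary that best_info.csv is meant to provide.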

Important Notes

  • All parameters have some defaults, and should be changed based on the experiment at hand.
  • Use this repo as a template to create a new PRIVATE repo.
  • Update common config properties as needed.
  • Edit the data.csv file to fill in the updated data list (the channel list should not matter as long as it is consistent). Ensure you have read access to the data. If needed, this can be replaced with separate train.csv and val.csv files, passed together as a comma-separated value.
  • Run python ./submitter.py with correct options (OR change the defaults - whatever is easier) to submit the experiments.