- Clone the data repository:
git lfs clone https://github.com/jump-cellpainting/pilot-cpjump1-data
If you do not have git lfs, refer to the installation instructions.
- Clone this repository:
git clone https://github.com/nivedithasi/gene-embed
- Symlink the CPJUMP data in gene-embed:
cd gene-embed
ln -s ../pilot-cpjump1-data/ .
- Create a micromamba environment:
micromamba env create -n gene-embed -f env.yaml
If you do not have micromamba set up, refer to this guide.
- Activate the environment and install mkl:
micromamba activate gene-embed
pip install mkl==2022.1.0
- Change into the code directory:
cd gene-embed/code
- Download gene_embed_experiment_runs.zip from Zenodo (link here).
- Unzip the archive to obtain the gene_embed_experiment_runs folder.
- Move its sub-folders into gene-embed/code:
mv gene_embed_experiment_runs/* gene-embed/code
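The unzip-and-move steps above can also be scripted. The following is a minimal Python sketch, assuming the archive unpacks to a top-level folder named after the zip file (gene_embed_experiment_runs, as described above). The function name `unpack_runs` is hypothetical and not part of the repository. Unlike a plain `mv`, it refuses to overwrite a sub-folder that already exists in the target directory.

```python
import shutil
import zipfile
from pathlib import Path


def unpack_runs(zip_path, code_dir):
    """Extract the runs archive next to itself and move its sub-folders into code_dir.

    Assumption: the archive contains a single top-level folder named after the
    zip file (e.g. gene_embed_experiment_runs.zip -> gene_embed_experiment_runs/).
    """
    zip_path, code_dir = Path(zip_path), Path(code_dir)
    extract_root = zip_path.parent

    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(extract_root)

    runs_dir = extract_root / zip_path.stem
    moved = []
    for child in sorted(runs_dir.iterdir()):
        target = code_dir / child.name
        if target.exists():
            # Stop rather than silently overwrite existing experiment results.
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(child), str(target))
        moved.append(target)
    return moved
```

A call such as `unpack_runs("gene_embed_experiment_runs.zip", "gene-embed/code")` would mirror the two steps above, with paths adjusted to wherever the archive was downloaded.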
- Change into gene-embed/code and activate the environment:
cd gene-embed/code
micromamba activate gene-embed
- Create a grid search by editing grid.py (edit the search space and the output_dir folder path).
- Create config files by running
python grid.py
output_dir should now contain one sub-folder per config.
- Launch the experiments:
bash run.sh output_dir
- All experiment results and the trained model weights are stored under output_dir (see test_scores.json, model.pt, etc.).
- Find the best model on the validation set by running
python find_best_val.py
This displays the test-set performance of the best validation model for each experiment. Edit source_folders in this file to obtain results for additional experiments.
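To make the workflow above concrete, here is a minimal Python sketch of the two ideas involved: expanding a search space into one sub-folder per config, and then selecting the run with the best validation score and reporting its test score. It is an illustration under stated assumptions, not the actual contents of grid.py or find_best_val.py. The file names `config.json` and `val_scores.json`, the `run_NNN` folder naming, and the `"score"` key are hypothetical; only test_scores.json is named in the steps above.

```python
import itertools
import json
from pathlib import Path


def expand_grid(search_space):
    """Return one dict per combination of values in the search space."""
    keys = sorted(search_space)
    combos = itertools.product(*(search_space[k] for k in keys))
    return [dict(zip(keys, values)) for values in combos]


def write_configs(search_space, output_dir):
    """Create one sub-folder per config, each holding a config.json (hypothetical name)."""
    output_dir = Path(output_dir)
    run_dirs = []
    for i, config in enumerate(expand_grid(search_space)):
        run_dir = output_dir / f"run_{i:03d}"
        run_dir.mkdir(parents=True, exist_ok=True)
        (run_dir / "config.json").write_text(json.dumps(config, indent=2))
        run_dirs.append(run_dir)
    return run_dirs


def best_by_validation(output_dir, metric="score"):
    """Pick the run with the highest validation metric and return its test metric.

    Assumption: each finished run folder holds val_scores.json (hypothetical)
    and test_scores.json, each a JSON object containing `metric`.
    """
    best_dir, best_val = None, float("-inf")
    for run_dir in sorted(Path(output_dir).iterdir()):
        val_file = run_dir / "val_scores.json"
        if not val_file.is_file():
            continue  # unfinished run, or not a run folder
        val = json.loads(val_file.read_text())[metric]
        if val > best_val:
            best_dir, best_val = run_dir, val
    if best_dir is None:
        raise FileNotFoundError(f"no finished runs under {output_dir}")
    test = json.loads((best_dir / "test_scores.json").read_text())
    return best_dir.name, best_val, test[metric]
```

The key design point is that model selection uses only the validation score; the test score is read once, for the selected run, so it is never used to choose between configs.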