In this module, we downloaded the SQLite files from AWS with aws-cli on Aug 10, 2023, following the instructions provided by JUMP Cell Painting Datasets. The pilot dataset (cpg0000) contains 51 plates, whose SQLite files total 1.1 TB of storage.
First, we generate a manifest file called jump_dataset_location_manifest.csv in the data folder. We then process each plate using CytoTable.
Optionally, to download only the SQLite files for the plates, use download_from_aws.sh, a bash script that downloads the files from the paths listed in the manifest.
Please see the notes in the main README.md on processing this step.
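
As a rough illustration of the manifest-driven download described above, the following Python sketch turns manifest rows into aws-cli commands. It is not the contents of download_from_aws.sh. The column name `sqlite_file`, the example bucket paths, and the `data/` destination are assumptions made for illustration only, so check them against the actual jump_dataset_location_manifest.csv. The `--no-sign-request` flag assumes the bucket allows anonymous reads.

```python
import csv
import io
import shlex


def build_download_commands(manifest_text, dest_dir, path_column="sqlite_file"):
    """Return one `aws s3 cp` command (as an argument list) per manifest row.

    `path_column` is a hypothetical column name; the real manifest may differ.
    """
    commands = []
    for row in csv.DictReader(io.StringIO(manifest_text)):
        s3_path = (row.get(path_column) or "").strip()
        if not s3_path:
            continue  # skip rows that have no SQLite path
        commands.append(
            ["aws", "s3", "cp", "--no-sign-request", s3_path, dest_dir]
        )
    return commands


# Hypothetical manifest contents, used only to demonstrate the function.
example_manifest = (
    "plate,sqlite_file\n"
    "PLATE_A,s3://example-bucket/backend/PLATE_A/PLATE_A.sqlite\n"
    "PLATE_B,s3://example-bucket/backend/PLATE_B/PLATE_B.sqlite\n"
)

# Print the commands instead of running them, so nothing is downloaded.
for command in build_download_commands(example_manifest, "data/"):
    print(shlex.join(command))
```

Printing the commands first makes it possible to review them before committing to a download of this size. Each argument list could then be passed to `subprocess.run(command, check=True)` to perform the transfer.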