Mantis deskewing with multiprocessing and slurm #47

Merged
merged 74 commits into from
Jul 10, 2023
Changes from 21 commits
Commits
3062cc4
prototype parallel deskew implementation
ieivanov May 1, 2023
5606cd2
Merge branch 'main' of github.com:czbiohub/mantis into feature/parall…
edyoshikun May 26, 2023
2b61e4e
added scipy dependency
edyoshikun May 26, 2023
60651ad
refactoring deskew to comply to our single position standard.
edyoshikun May 27, 2023
32bb3bc
adding napari and iohub dependencies
edyoshikun May 27, 2023
54e20c4
fixing mistake on changing this file
edyoshikun May 27, 2023
ee7c575
revive accidental deletion
talonchandler May 30, 2023
57fc884
fix input-output mixup bug
talonchandler May 31, 2023
94b6cb4
cleanup
talonchandler May 31, 2023
b10417c
whoops
talonchandler May 31, 2023
0353f18
remove vestigial param
talonchandler May 31, 2023
e8dcc43
process positions sequentially and parellize T and C.
edyoshikun May 31, 2023
7bf7cb6
change reader mode
talonchandler May 31, 2023
c06aa67
have a flag for slurm
edyoshikun Jun 8, 2023
e10fc93
adding the option 1 : creating an empty array beforehand
edyoshikun Jun 9, 2023
ed8bb2e
cleaning up the deskew cli function.
edyoshikun Jun 9, 2023
e5e1fbf
adding the fixes that ensure that the creation of the arrays follow n…
edyoshikun Jun 9, 2023
c8d8fcf
Merge branch 'feature/parallel-deskew' of github.com:czbiohub/mantis …
edyoshikun Jun 13, 2023
74bec5f
adding comments and the natsort
edyoshikun Jun 13, 2023
3f80a26
slurmkit example to consolidate mantis slurm jobs
edyoshikun Jun 13, 2023
df13a6b
adding todos and cleanup
edyoshikun Jun 13, 2023
e02171c
renamed to slurm deskew,
edyoshikun Jun 14, 2023
9c57c94
removed the slurm flag and cleaned the code
edyoshikun Jun 14, 2023
fe89a04
adding slurmkit dependency
edyoshikun Jun 14, 2023
9438c43
removing 's' in params for consistency
edyoshikun Jun 14, 2023
94c78e8
added some typing and defaults.
edyoshikun Jun 14, 2023
d33cf66
deleting the bash and python scripts from option 1
edyoshikun Jun 14, 2023
3a1c366
solving over wrapping, making this a script, cleaning up unused imports
edyoshikun Jun 14, 2023
d02d0c8
changing prints to click.echo
edyoshikun Jun 14, 2023
3feacae
upgrade to python 3.10
talonchandler Jun 21, 2023
fedfcf9
specify released iohub
talonchandler Jun 21, 2023
cc2f6ca
depend on slurmkit main branch
talonchandler Jun 21, 2023
9d35195
initial cleaning of example script
talonchandler Jun 21, 2023
3843fa4
rename deskew io
talonchandler Jun 21, 2023
eed65ab
docstring cleaning
talonchandler Jun 21, 2023
0370c98
clean unused
talonchandler Jun 21, 2023
de4b899
clean unused 2
talonchandler Jun 21, 2023
fe499e6
input_path -> input_paths
talonchandler Jun 21, 2023
85f7a01
`single_process` -> `deskew_zyx_and_save`
talonchandler Jun 21, 2023
44e4010
`deskew_cli` -> `deskew_single_position`
talonchandler Jun 21, 2023
c733883
fix comments
talonchandler Jun 21, 2023
a44aff2
fix bug
talonchandler Jun 21, 2023
ce2d607
test deskew cli
talonchandler Jun 21, 2023
0b1f177
update test data path
talonchandler Jun 21, 2023
2d7c317
match names
talonchandler Jun 21, 2023
74bbbde
simplify
talonchandler Jun 21, 2023
cbb5f5f
simplify scripts
talonchandler Jun 21, 2023
6c4f89f
remove unused
talonchandler Jun 21, 2023
1587ba8
write metadata at the same time as the data
talonchandler Jun 21, 2023
2ff0ea5
Better default path
talonchandler Jun 21, 2023
6cf8d02
clarify `deskew_zyx_and_save`
talonchandler Jun 21, 2023
d11639c
fix comment
talonchandler Jun 21, 2023
91668d6
add types
talonchandler Jun 21, 2023
5642ccf
typo
talonchandler Jun 21, 2023
d55cb90
basic deskew test
talonchandler Jun 21, 2023
f98dbc2
add comment
talonchandler Jun 21, 2023
cac9794
rename IO paths
talonchandler Jun 21, 2023
61b6c03
num_cores -> num_processes
talonchandler Jun 21, 2023
79beb03
fix renaming bug
talonchandler Jun 21, 2023
2eec840
improved defaults
talonchandler Jun 22, 2023
ffb7a77
Merge branch 'main' into use_slurmkit
talonchandler Jun 22, 2023
e38d540
Fix merge commit
talonchandler Jun 22, 2023
62a5c7c
update github workflows to python 3.10
talonchandler Jun 22, 2023
d455e1e
style
talonchandler Jun 22, 2023
f807ec1
isort
talonchandler Jun 22, 2023
27ac711
adding mantis_usage.md back from accidental delete
edyoshikun Jun 22, 2023
3025cb3
fixing minor docstring typos
edyoshikun Jun 22, 2023
79b015f
remove napari from dependencies (optional)
ieivanov Jun 23, 2023
b9817c4
reorder and match `get_deskewed_data_shape` and `deskew_data` args
ieivanov Jun 23, 2023
bc488f6
add natsort to project dependencies
ieivanov Jun 23, 2023
955d4bf
make napari non-optional dependency
ieivanov Jun 23, 2023
9240cc4
switch from using os to pathlib
ieivanov Jun 23, 2023
fca3f55
move keep_overhang to deskew_param yaml
ieivanov Jun 23, 2023
a5c81f4
change chunk size to XYZ dimensions
ieivanov Jun 23, 2023
123 changes: 0 additions & 123 deletions docs/mantis_usage.md

This file was deleted.

21 changes: 21 additions & 0 deletions examples/slurm_option1/batch.sh
@@ -0,0 +1,21 @@
#!/bin/bash

INPUT_DATA="/hpc/instruments/cm.mantis/2023_05_10_PCNA_RAC1/timelapse_2_3/timelapse_2_lightsheet_1.zarr/*/*/*"
OUTPUT_PATH="./timelapse_2_lightsheet_1_deskewed.zarr"
DESKEW_PARAMS="./deskew_settings.yml"

# Make an array of positions
POSPATHS=($(natsort -p $INPUT_DATA))
POSITIONS=${#POSPATHS[@]}

# Get the Zarr store name
IFS='/' read -ra path <<< "${POSPATHS[0]}"
ZARR_STORE=$(IFS='/'; echo "${path[*]:0:${#path[@]}-3}")
echo "Zarr Store: $ZARR_STORE"

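# Make sure the Slurm output directory exists and clear old output files
mkdir -p ./output/deskew
rm -f ./output/deskew/*.out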

# Create an empty array to pre-initialize positions, using position 0 as the sample for the shape
ZARR_JOB_ID=$(sbatch --parsable empty_zarr.sh "$INPUT_DATA" "$DESKEW_PARAMS" "$OUTPUT_PATH")
echo "Submitted Zarr job: $ZARR_JOB_ID"
# afterok: start the deskew array only after the Zarr job has finished successfully
DESKEW_JOB_ID=$(sbatch --parsable --array=0-$((POSITIONS-1)) -d afterok:$ZARR_JOB_ID deskew.sh "$INPUT_DATA" "$OUTPUT_PATH")
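The batch script depends on `natsort` producing the same position order in every script, so that Slurm array index `i` always refers to the same position. A minimal sketch of that idea in Python, using a standard-library natural-sort key instead of the `natsort` package; the helper names `natural_key` and `array_spec` and the example paths are hypothetical, not part of this PR:

```python
import re


def natural_key(path: str):
    """Split digit runs from text so that '10' sorts after '2'."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r"(\d+)", path)]


def array_spec(num_positions: int) -> str:
    """Build the value passed to `sbatch --array` for zero-based task IDs."""
    return f"0-{num_positions - 1}"


# Hypothetical position paths in a row/column/fov layout
paths = ["store.zarr/0/10/0", "store.zarr/0/2/0", "store.zarr/0/1/0"]
positions = sorted(paths, key=natural_key)
spec = array_spec(len(positions))

for task_id, position in enumerate(positions):
    print(task_id, position)
```

A plain lexicographic sort would place `0/10/0` before `0/2/0`, so a script that sorted differently from the others would map the same task ID to a different position.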
52 changes: 52 additions & 0 deletions examples/slurm_option1/deskew.sh
@@ -0,0 +1,52 @@
#!/bin/bash
#SBATCH --job-name=deskew
#SBATCH --time=4:00:00
#SBATCH --partition=cpu
#SBATCH --cpus-per-task=16
#SBATCH --mem-per-cpu=64G
#SBATCH --output=./output/deskew/deskew-%A-%a.out
env | grep "^SLURM" | sort

module load anaconda
module load comp_micro
conda activate pyplay

now=$(date '+%y-%m-%d')
logpath=./logs/$now/deskew
mkdir -p $logpath
logfile="$logpath/deskew_$SLURM_ARRAY_TASK_ID.out"

INPUT_PATH=$1
OUTPUT_PATH=$2
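# Match the number of worker processes to the CPUs requested above instead of hard-coding 128
NUM_CORES=${SLURM_CPUS_PER_TASK:-16}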

echo "in: $INPUT_PATH " >> ${logfile}
echo "out: $OUTPUT_PATH " >> ${logfile}
echo "idx: $SLURM_ARRAY_TASK_ID " >> ${logfile}

# Get the array of positions
POSPATHS=($(natsort -p $INPUT_PATH))
echo "pospaths: ${POSPATHS[$SLURM_ARRAY_TASK_ID]}" >> ${logfile}

# Create an array to store the last three directories for each path
OUTPUT_FOVS=()

# Extract the last three directories for each path
for path in "${POSPATHS[@]}"; do
    IFS='/' read -ra last_three_dirs <<< "$path"
    OUTPUT_FOVS+=("${last_three_dirs[-3]}/${last_three_dirs[-2]}/${last_three_dirs[-1]}")
done

# Start measuring the execution time
start_time=$(date +%s.%N)
echo "Starting the deskew" >> ${logfile}

# Key code: deskew the position that corresponds to this array task
mantis deskew "${POSPATHS[$SLURM_ARRAY_TASK_ID]}" ./deskew_settings.yml -o "$OUTPUT_PATH/${OUTPUT_FOVS[$SLURM_ARRAY_TASK_ID]}" -j $NUM_CORES --slurm >> ${logfile}

# End measuring the execution time
end_time=$(date +%s.%N)
# Calculate the elapsed time
elapsed_time=$(echo "$end_time - $start_time" | bc)

echo "Script execution time: $elapsed_time seconds" >> ${logfile}
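The path handling in `deskew.sh` keeps the last three directories of each input position (row, column, field of view) and appends them to the output store. A rough Python equivalent with `pathlib` is sketched below; `fov_name` and `output_position` are hypothetical helpers and the paths are made up for illustration:

```python
from pathlib import PurePosixPath


def fov_name(position_path: str) -> str:
    """Return the trailing 'row/column/fov' part of a position path."""
    return "/".join(PurePosixPath(position_path).parts[-3:])


def output_position(output_store: str, position_path: str) -> str:
    """Place the position under the output store, keeping its row/column/fov name."""
    return f"{output_store}/{fov_name(position_path)}"


# Hypothetical input position and output store
print(output_position("./deskewed.zarr", "/data/raw.zarr/0/1/0"))
```

This keeps the plate layout of the input store in the output store, so each array task writes to its own pre-created position.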
4 changes: 4 additions & 0 deletions examples/slurm_option1/deskew_settings.yml
@@ -0,0 +1,4 @@
pixel_size_um: 0.116
ls_angle_deg: 36.17 # can be calibrated using estimate_deskew.py
px_to_scan_ratio: 0.371 # Optional, can be calibrated using estimate_deskew.py
scan_step_um: 0.313 # Optional, corresponds to 10 mV
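The two optional values above appear to be redundant: 0.116 / 0.313 is about 0.371, which matches `px_to_scan_ratio`. The sketch below assumes that relationship (ratio = pixel size / scan step), which this diff does not state explicitly. `ExampleDeskewSettings` is a hypothetical stand-in and not the `DeskewSettings` class from mantis:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleDeskewSettings:
    """Hypothetical stand-in for the settings file; not the mantis class."""

    pixel_size_um: float
    ls_angle_deg: float
    px_to_scan_ratio: Optional[float] = None
    scan_step_um: Optional[float] = None

    def resolved_ratio(self) -> float:
        """Use the explicit ratio if given, else derive it from the scan step."""
        if self.px_to_scan_ratio is not None:
            return self.px_to_scan_ratio
        if self.scan_step_um is None:
            raise ValueError("Provide either px_to_scan_ratio or scan_step_um.")
        # Assumed relationship: pixel size divided by scan step
        return round(self.pixel_size_um / self.scan_step_um, 3)


settings = ExampleDeskewSettings(
    pixel_size_um=0.116, ls_angle_deg=36.17, scan_step_um=0.313
)
print(settings.resolved_ratio())
```

If the assumption holds, supplying either `px_to_scan_ratio` or `scan_step_um` would be enough to run the deskew.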
108 changes: 108 additions & 0 deletions examples/slurm_option1/empty_zarr.py
@@ -0,0 +1,108 @@
# %%
import os
from dataclasses import asdict

import click
import numpy as np
import yaml
from iohub import open_ome_zarr
from iohub.ngff_meta import TransformationMeta
from natsort import natsorted

from mantis.analysis.AnalysisSettings import DeskewSettings
from mantis.analysis.deskew import get_deskewed_data_shape
from mantis.cli.parsing import (
    deskew_param_argument,
    input_data_path_argument,
    output_dataset_options,
)

def _get_deskew_params(deskew_params_path):
    # Load params
    with open(deskew_params_path) as file:
        raw_settings = yaml.safe_load(file)
    settings = DeskewSettings(**raw_settings)
    print(f"Deskewing parameters: {asdict(settings)}")
    return settings


@click.group()
def cli():
    pass


@cli.command()
@input_data_path_argument()
@deskew_param_argument()
@output_dataset_options(default="./deskewed.zarr")
@click.option(
    "--keep-overhang",
    "-ko",
    default=False,
    is_flag=True,
    help="Keep the overhanging region.",
)
@click.help_option("-h", "--help")
def create_empty_zarr(input_data_path, deskew_param_path, output_path, keep_overhang):
    # Load the "0" position to infer dataset information
    input_data_path = natsorted(input_data_path)
    click.echo(f"Input data folders: {input_data_path}")

    input_dataset = open_ome_zarr(str(input_data_path[0]), mode="r")
    T, C, Z, Y, X = input_dataset.data.shape

    # Get the deskewing parameters
    settings = _get_deskew_params(deskew_param_path)
    deskewed_shape, voxel_size = get_deskewed_data_shape(
        (Z, Y, X),
        settings.pixel_size_um,
        settings.ls_angle_deg,
        settings.px_to_scan_ratio,
        keep_overhang,
    )

    click.echo("Creating empty array...")

    # Handle transforms and metadata
    transform = TransformationMeta(
        type="scale",
        scale=2 * (1,) + voxel_size,
    )

    # Prepare output dataset
    channel_names = input_dataset.channel_names

    # Output shape based on the type of reconstruction
    output_shape = (T, len(channel_names)) + deskewed_shape
    click.echo(f"Number of positions: {len(input_data_path)}")
    click.echo(f"Output shape: {output_shape}")
    chunk_size = (1, 1, 64) + deskewed_shape[-2:]
    click.echo(f"Chunk size: {chunk_size}")

    # Create output dataset
    output_dataset = open_ome_zarr(
        output_path, layout="hcs", mode="w", channel_names=channel_names
    )
    # Handles both a single position and multiple positions matched by wildcards
    for filepath in input_data_path:
        path_strings = filepath.split(os.path.sep)[-3:]
        pos = output_dataset.create_position(
            str(path_strings[0]), str(path_strings[1]), str(path_strings[2])
        )
        _ = pos.create_zeros(
            name="0",
            shape=(
                T,
                C,
            )
            + deskewed_shape,
            chunks=chunk_size,
            dtype=np.uint16,
            transform=[transform],
        )


if __name__ == "__main__":
    create_empty_zarr()
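The tuple arithmetic that builds the output shape, scale, and chunk size is easy to misread, because `2 * (1,) + voxel_size` evaluates as `(1, 1) + voxel_size`. The sketch below reproduces just that arithmetic with plain tuples. The helper `output_layout` and all numeric values are hypothetical, and the real deskewed shape comes from `get_deskewed_data_shape`:

```python
def output_layout(t, c, deskewed_shape, voxel_size, z_chunk=64):
    """Mirror the TCZYX shape, scale, and chunk tuples built in the script."""
    shape = (t, c) + tuple(deskewed_shape)
    # Unit scale for T and C, then the physical voxel size for Z, Y, X
    scale = 2 * (1,) + tuple(voxel_size)
    # One timepoint and one channel per chunk, 64 Z-slices, full Y and X
    chunks = (1, 1, z_chunk) + tuple(deskewed_shape[-2:])
    return shape, scale, chunks


# Hypothetical values for illustration only
shape, scale, chunks = output_layout(
    t=3, c=2, deskewed_shape=(100, 512, 800), voxel_size=(0.2, 0.116, 0.116)
)

# Size of one uint16 chunk in MiB (2 bytes per voxel)
chunk_mib = chunks[2] * chunks[3] * chunks[4] * 2 / 2**20
print(shape, scale, chunks, chunk_mib)
```

With these made-up numbers a single chunk holds 50 MiB of `uint16` data, which gives a feel for how the fixed 64-slice chunk depth scales with the Y and X extents.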
32 changes: 32 additions & 0 deletions examples/slurm_option1/empty_zarr.sh
@@ -0,0 +1,32 @@
#!/bin/bash

#SBATCH --job-name=ZARR_INIT
#SBATCH --time=0:10:00
#SBATCH --partition=cpu
#SBATCH --cpus-per-task=1
#SBATCH --mem-per-cpu=5G
#SBATCH --output=./output/deskew/empty_zarr_%j.out
env | grep "^SLURM" | sort

module load anaconda
module load comp_micro
conda activate pyplay

IN_DATA=$1
DESKEW_PARAMS=$2
OUT_DATA=$3

# Remove any previous output, then set up logging
rm -rf "$OUT_DATA"
now=$(date '+%y-%m-%d')
logpath=./logs/$now/deskew
mkdir -p $logpath
rm -f $logpath/*.out
logfile="$logpath/empty_zarr.out"

echo "raw: $IN_DATA" >> ${logfile}
echo "out: $OUT_DATA " >> ${logfile}
echo "deskew params: $DESKEW_PARAMS " >> ${logfile}

python -u ./empty_zarr.py $IN_DATA $DESKEW_PARAMS -o $OUT_DATA >> ${logfile}
