prototype parallel deskew implementation #33

ieivanov · 2023-05-01T22:13:36Z

Prototype parallel deskew implementation, see discussion in #6

TODO:

Debug multiprocessing of NDTiff datasets; see micro-manager/NDTiffStorage#109.
As a workaround for the above, open a new Dataset object in every worker rather than pickling the dask array from the main process.
Define the behavior of the --view argument when using multiprocessing.
Define the behavior of the --view argument when processing HCS datasets.
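
The workaround in the second TODO item can be sketched as follows. This is a minimal illustration, not the code in this branch: `open_dataset` is a hypothetical stand-in for whatever reader the pipeline uses (here it just opens a raw file of float64 values), and the file layout is made up for the example. The point is that only the path and an index cross the process boundary, so nothing that holds open file handles needs to be pickled.

```python
import struct
import tempfile
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

VALUES_PER_FRAME = 4  # made-up frame size for this sketch


def open_dataset(path):
    """Hypothetical stand-in for opening a dataset reader; returns a file handle."""
    return open(path, "rb")


def read_frame_sum(path, t):
    """Worker: open the dataset here, read frame ``t``, and return its sum."""
    frame_bytes = 8 * VALUES_PER_FRAME  # 8 bytes per float64
    with open_dataset(path) as dataset:
        dataset.seek(t * frame_bytes)
        values = struct.unpack(f"<{VALUES_PER_FRAME}d", dataset.read(frame_bytes))
    return sum(values)


def write_demo_file(path, n_frames):
    """Write ``n_frames`` frames where every value in frame ``t`` equals ``t``."""
    with open(path, "wb") as f:
        for t in range(n_frames):
            f.write(struct.pack(f"<{VALUES_PER_FRAME}d", *[float(t)] * VALUES_PER_FRAME))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = str(Path(tmp) / "demo.bin")
        write_demo_file(path, n_frames=3)
        with ProcessPoolExecutor(max_workers=2) as pool:
            # Only (path, t) is pickled and sent to each worker.
            sums = list(pool.map(read_frame_sum, [path] * 3, range(3)))
        print(sums)
```

Opening the reader once per task costs some overhead; if that matters, a pool initializer that opens it once per worker process is a common alternative.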


edyoshikun · 2023-05-27T01:31:00Z

The implementation should now mirror recOrder's new reconstruction functions, as well as the standard we plan to follow to make functions easy to parallelize: each function processes a single position and iterates over T and C. Currently, a single node gets one position, and within the node we can multiprocess over T and C.
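
A rough sketch of that single-position pattern, under assumptions: `deskew_single_tc`, `tc_tasks`, and the example path are hypothetical names for illustration, and the worker body is a placeholder rather than the branch's actual deskew call. It only shows how one position's (T, C) volumes could be fanned out over a process pool.

```python
from itertools import product
from multiprocessing import Pool


def deskew_single_tc(position_path, t, c):
    """Hypothetical per-volume worker: would read, deskew, and write one (T, C) volume.

    Here it only returns its arguments so the sketch stays runnable.
    """
    return (position_path, t, c)


def tc_tasks(position_path, n_timepoints, n_channels):
    """List one task per (T, C) pair for a single position."""
    return [
        (position_path, t, c)
        for t, c in product(range(n_timepoints), range(n_channels))
    ]


def deskew_position(position_path, n_timepoints, n_channels, num_processes=2):
    """Process one position, spreading its (T, C) volumes over a process pool."""
    tasks = tc_tasks(position_path, n_timepoints, n_channels)
    with Pool(num_processes) as pool:
        return pool.starmap(deskew_single_tc, tasks)


if __name__ == "__main__":
    # A cluster job would call this once per position.
    print(deskew_position("plate.zarr/A/1/0", n_timepoints=2, n_channels=2))
```

Keeping the position as the outermost unit means a scheduler can run one such call per node without the calls needing to know about each other.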

One problem at the moment is that the deskewed output does not match the shape returned by get_deskewed_data_shape. Let's chat about this next week, but I think this approach should make it easy to write slurm scripts. @talonchandler

talonchandler

We're getting there. I think the highest priority problems are:

Running one position and then another overwrites the first position's output (this prevents any slurm-level parallelization).
The parallel decorator isn't connected. I think this is a nice-to-have rather than a need-to-have, since we mainly plan to parallelize over P with slurm.

mantis/cli/deskew.py

ieivanov · 2023-06-02T18:05:45Z

I know that this is a work in progress, but before merging we should plan to expand the function documentation and come up with sensible function names. At this point we have functions named analysis.deskew.deskew_data, cli.deskew.deskew, and cli.deskew.deskew_cli; it's not immediately clear from the names what each function does.

analysis.deskew.deskew_data deskews NDArrays. I think we can rename it to analysis.deskew.deskew and update the function docstring to start with "Deskew NDArrays with 3 dimensions". Another name for this function could be analysis.deskew.deskew_ndarray.

cli.deskew.deskew is a workflow for reading zarr datasets, deskewing, and saving results as zarr datasets. I think we could keep this name, though it could clash / be confused with the one above. We could rename it to cli.deskew.deskew_workflow?

cli.deskew.deskew_cli needs a new name. This function deskews a single position from a zarr dataset. Maybe call it cli.deskew.deskew_position?

cli.deskew.single_process and cli.deskew.parallel also need better names. I'm not entirely clear on how they interact with the other functions, so I'll let you decide on more fitting names. They should probably also start with an underscore if they are largely internal methods.

P.S. We'll likely have similar issues with other analysis methods and CLI workflows. We could try to set a convention for naming the ndarray methods and the cli methods.

talonchandler · 2023-06-22T03:22:11Z

This branch is now superseded by #47.

To close the loop on the comments @ieivanov brought up here:

@edyoshikun and I decided to drop the deskewing-integrated viewer. We've found that napari's builtin and napari-ome-zarr viewers work well.
We've significantly reworked the names, but we're still very open to further discussion in "Mantis deskewing with multiprocessing and slurm" (#47). We can continue the discussion there.

prototype parallel deskew implementation

3062cc4

ieivanov mentioned this pull request May 1, 2023

Generalized deskewing for mantis #6

Merged

ziw-liu mentioned this pull request May 18, 2023

Convert to zarr as an initial analysis step #42

Closed

edyoshikun added 5 commits May 26, 2023 10:54

Merge branch 'main' of github.com:czbiohub/mantis into feature/parallel-deskew

5606cd2

added scipy dependency

2b61e4e

refactoring deskew to comply to our single position standard.

60651ad

this adds the skeleton for multiprocessing per position over T and C.

adding napari and iohub dependencies

32bb3bc

fixing mistake on changing this file

54e20c4

talonchandler added 5 commits May 30, 2023 14:25

revive accidental deletion

ee7c575

fix input-output mixup bug

57fc884

cleanup

94b6cb4

whoops

b10417c

remove vestigial param

0353f18

talonchandler reviewed May 31, 2023

process positions sequentially and parallelize T and C.

e8dcc43

talonchandler mentioned this pull request May 31, 2023

create_position fails after closing and reopening a dataset czbiohub-sf/iohub#135

Closed

change reader mode

7bf7cb6

talonchandler mentioned this pull request Jun 1, 2023

Remove state in NGFF plate initialization czbiohub-sf/iohub#138

Merged

edyoshikun added 2 commits June 8, 2023 14:38

have a flag for slurm

c06aa67

cleaning up the deskew cli function.

ed8bb2e

talonchandler mentioned this pull request Jun 12, 2023

create-then-fill #46

Closed

talonchandler closed this Jun 22, 2023

ieivanov deleted the feature/parallel-deskew branch July 9, 2023 22:17
