CLI for concatenating zarr stores #145

Merged
merged 13 commits into from
Jul 19, 2024

Contributor

@edyoshikun edyoshikun commented Jul 2, 2024

NOTE: this function might have to be ported to iohub or iphub later

This PR adds a CLI for concatenating zarr stores. Its main purpose is to merge multiple stores, similar to append_channels.

Additionally, since this relies on the copy_n_paste_czyx() function to crop the arrays, one can pass cropping parameters for T, Z, Y, and X in the config file.

This helps create toy datasets and merge datasets at the end of our pipelines. It assumes that all the datasets have the same folder structure.

Edit:

  • I made cropping optional: the parameters default to 'all'. The config file still exposes the variables for users who want to crop.
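
As a rough illustration of the cropping parameters described above, here is a minimal sketch of how a per-axis value of 'all' or a [start, stop] pair could be resolved into slices, and what output shape results from merging stores. It is not the code in this PR. The names `resolve_slice` and `concatenated_shape`, the crop dictionary keys, the TCZYX axis order, and concatenation along the channel axis are all assumptions made for the example.

```python
from typing import Dict, Sequence, Tuple, Union

# Hypothetical config value: the string "all" or a [start, stop] pair.
CropParam = Union[str, Sequence[int]]


def resolve_slice(param: CropParam, size: int) -> slice:
    """Turn one crop parameter into a slice over an axis of length `size`."""
    if param == "all":
        return slice(0, size)
    start, stop = param
    if not 0 <= start < stop <= size:
        raise ValueError(
            f"crop {list(param)} is out of bounds for an axis of size {size}"
        )
    return slice(start, stop)


def concatenated_shape(
    shapes: Sequence[Tuple[int, int, int, int, int]],
    crops: Dict[str, CropParam],
) -> Tuple[int, int, int, int, int]:
    """Output shape when TCZYX arrays are cropped in T, Z, Y, X and
    stacked along C (assumed here; the PR may merge differently)."""
    cropped_extents = set()
    total_channels = 0
    for t, c, z, y, x in shapes:
        sizes = {"T": t, "Z": z, "Y": y, "X": x}
        lengths = []
        for axis in ("T", "Z", "Y", "X"):
            # Axes missing from the config default to "all" (no cropping).
            s = resolve_slice(crops.get(axis, "all"), sizes[axis])
            lengths.append(s.stop - s.start)
        cropped_extents.add(tuple(lengths))
        total_channels += c
    if len(cropped_extents) != 1:
        raise ValueError("cropped T, Z, Y, X extents differ between stores")
    t, z, y, x = cropped_extents.pop()
    return (t, total_channels, z, y, x)
```

For example, two stores with shapes (10, 2, 5, 64, 64) and (10, 1, 5, 64, 64), cropped with `{"Z": [1, 4], "Y": [0, 32]}`, would give an output of shape (10, 3, 3, 32, 64) under these assumptions.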

@edyoshikun edyoshikun requested a review from ziw-liu July 2, 2024 22:57
Contributor

@talonchandler talonchandler left a comment

I will vote for simplifying this PR to concatenation without cropping.

I understand that there are many shared parts of concatenation and cropping and that you're trying to cut down on I/O here, but I value modularity and clearly named CLI functions over I/O concerns, which can be solved later during pipelining.

I will also suggest concatenate over concatenate_datasets.

Contributor

@talonchandler talonchandler left a comment

I needed this utility today and it worked well! Thanks @edyoshikun!

- 'hiding' cropping by defaulting parameters to 'all'.
@edyoshikun edyoshikun requested a review from ziw-liu July 15, 2024 22:47
@ziw-liu
Contributor

ziw-liu commented Jul 16, 2024

What is the testing requirement on shrimPy?

@edyoshikun
Contributor Author

I don't think we had a strict requirement; however, I will add a test function that runs the CLI.
The config is checked with the pydantic models.
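
To illustrate the kind of check meant here, the following is a minimal sketch that uses a standard-library dataclass as a stand-in for the pydantic models (the PR itself uses pydantic; this is a deliberately swapped-in technique to keep the example dependency-free). The class name `ConcatenateConfig` and all of its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union

# Hypothetical crop value: the string "all" or a [start, stop] pair.
Crop = Union[str, List[int]]


@dataclass
class ConcatenateConfig:
    """Hypothetical config; field names are assumptions, not the PR's schema."""

    input_paths: List[str]
    time_crop: Crop = "all"
    z_crop: Crop = "all"
    y_crop: Crop = "all"
    x_crop: Crop = "all"

    def __post_init__(self) -> None:
        # Reject an empty list of stores up front.
        if not self.input_paths:
            raise ValueError("input_paths must list at least one store")
        for name in ("time_crop", "z_crop", "y_crop", "x_crop"):
            value = getattr(self, name)
            if value == "all":
                continue
            # Anything else must be a [start, stop] pair with 0 <= start < stop.
            valid = (
                isinstance(value, (list, tuple))
                and len(value) == 2
                and all(isinstance(v, int) for v in value)
                and 0 <= value[0] < value[1]
            )
            if not valid:
                raise ValueError(f"{name} must be 'all' or [start, stop], got {value!r}")
```

A test that runs the CLI could then rely on a malformed config being rejected when it is loaded, before any data is read.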

Thanks @ziw-liu

Contributor

@talonchandler talonchandler left a comment

I've used and tested this, and I'm happy with where it's at. I'm fine with merging and moving w/o tests for now (but please go ahead if you're doing tests anyway @edyoshikun). Thanks!

@edyoshikun
Contributor Author

I added the test and made some changes to make sure the config and CLI files are named consistently with the other CLIs.

@talonchandler
Contributor

Thanks @edyoshikun! LGTM

@talonchandler talonchandler merged commit 0aec733 into main Jul 19, 2024
2 checks passed
@ieivanov ieivanov deleted the concat_cli branch August 13, 2024 15:53
3 participants