Commit: README update (#96)
* replace mantis preprint with QLIPP paper

* fix figure links and update conda command

* surfacing the new preprint

* added placeholders for new movies and new models

* move pipeline chart to its own page

* show video covers in a table

---------

Co-authored-by: Shalin Mehta <[email protected]>
ziw-liu and mattersoflight authored Jun 28, 2024
1 parent 2747e83 commit dde3e27
Showing 5 changed files with 128 additions and 110 deletions.
178 changes: 68 additions & 110 deletions README.md
# VisCy

VisCy is a deep learning pipeline for training and deploying computer vision models for image-based phenotyping at single-cell resolution.

The following methods are being developed:

- Image translation
  - Robust virtual staining of landmark organelles
- Image classification
  - Supervised learning of cell state (e.g. state of infection)
- Image representation learning
  - Self-supervised learning of the cell state and organelle phenotypes

VisCy is currently considered alpha software and is under active development.
Frequent breaking changes are expected.

## Virtual staining

A full illustration of the virtual staining pipeline can be found [here](docs/virtual_staining.md).
Below are some examples of virtually stained images (click to play videos).
See the full gallery [here](https://github.com/mehta-lab/VisCy/wiki/Gallery).

| VSCyto3D | VSNeuromast | VSCyto2D |
|:---:|:---:|:---:|
| [![HEK293T](docs/figures/svideo_1.png)](https://github.com/mehta-lab/VisCy/assets/67518483/d53a81eb-eb37-44f3-b522-8bd7bddc7755) | [![Neuromast](docs/figures/svideo_3.png)](https://github.com/mehta-lab/VisCy/assets/67518483/4cef8333-895c-486c-b260-167debb7fd64) | [![A549](docs/figures/svideo_5.png)](https://github.com/mehta-lab/VisCy/assets/67518483/287737dd-6b74-4ce3-8ee5-25fbf8be0018) |

### Reference

The virtual staining models and training protocols are reported in our recent [preprint on robust virtual staining](https://www.biorxiv.org/content/10.1101/2024.05.31.596901):

```bibtex
@article{Liu2024.05.31.596901,
  author = {Liu, Ziwen and Hirata-Miyasaki, Eduardo and Pradeep, Soorya and Rahm, Johanna and Foley, Christian and Chandler, Talon and Ivanov, Ivan and Woosley, Hunter and Lao, Tiger and Balasubramanian, Akilandeswari and Liu, Chad and Leonetti, Manu and Arias, Carolina and Jacobo, Adrian and Mehta, Shalin B.},
  title = {Robust virtual staining of landmark organelles},
  elocation-id = {2024.05.31.596901},
  year = {2024},
  doi = {10.1101/2024.05.31.596901},
  publisher = {Cold Spring Harbor Laboratory},
  url = {https://www.biorxiv.org/content/early/2024/06/03/2024.05.31.596901},
  eprint = {https://www.biorxiv.org/content/early/2024/06/03/2024.05.31.596901.full.pdf},
  journal = {bioRxiv}
}
```

This package evolved from the [TensorFlow version of virtual staining pipeline](https://github.com/mehta-lab/microDL), which we reported in [this paper in 2020](https://elifesciences.org/articles/55502):

```bibtex
@article{10.7554/eLife.55502,
  article_type = {journal},
  title = {Revealing architectural order with quantitative label-free imaging and deep learning},
  author = {Guo, Syuan-Ming and Yeh, Li-Hao and Folkesson, Jenny and Ivanov, Ivan E and Krishnan, Anitha P and Keefe, Matthew G and Hashemi, Ezzat and Shin, David and Chhun, Bryant B and Cho, Nathan H and Leonetti, Manuel D and Han, May H and Nowakowski, Tomasz J and Mehta, Shalin B},
  editor = {Forstmann, Birte and Malhotra, Vivek and Van Valen, David},
  volume = {9},
  year = {2020},
  month = {jul},
  pub_date = {2020-07-27},
  pages = {e55502},
  citation = {eLife 2020;9:e55502},
  doi = {10.7554/eLife.55502},
  url = {https://doi.org/10.7554/eLife.55502},
  keywords = {label-free imaging, inverse algorithms, deep learning, human tissue, polarization, phase},
  journal = {eLife},
  issn = {2050-084X},
  publisher = {eLife Sciences Publications, Ltd}
}
```

## Installation

1. We recommend using a new Conda/virtual environment.

```sh
conda create --name viscy python=3.10
# OR specify a custom path since the dependencies are large:
# conda create --prefix /path/to/conda/envs/viscy python=3.10
```

2. Clone this repository and install with pip:
[...]

The pipeline is built using the [PyTorch Lightning](https://www.pytorchlightning.ai) framework.
The [iohub](https://github.com/czbiohub-sf/iohub) library is used
for reading and writing data in [OME-Zarr](https://www.nature.com/articles/s41592-021-01326-w) format.
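
For instance, an OME-Zarr dataset can be inspected with iohub before training. A minimal sketch, assuming iohub's `open_ome_zarr` and an HCS plate layout; the path is a placeholder:

```python
# Minimal sketch: browse an OME-Zarr (HCS plate) dataset with iohub.
# "plate.zarr" is a placeholder path; layout and array keys depend on your data.
from iohub import open_ome_zarr

with open_ome_zarr("plate.zarr", mode="r") as plate:
    for name, position in plate.positions():
        image = position["0"]  # first image array of this field of view
        print(name, image.shape)  # (T, C, Z, Y, X)
```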

The full functionality is tested on Linux `x86_64` with NVIDIA Ampere GPUs (CUDA 12.4).
Some features (e.g. mixed precision and distributed training) may not be available with other setups;
see [PyTorch documentation](https://pytorch.org) for details.
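
For reference, mixed precision and distributed training are typically enabled through the PyTorch Lightning `Trainer`. A generic Lightning sketch, not a VisCy-specific API:

```python
# Generic PyTorch Lightning settings for mixed precision and DDP.
# Assumes CUDA-capable GPUs; this is not a VisCy-specific configuration.
from lightning.pytorch import Trainer

trainer = Trainer(
    accelerator="gpu",
    devices=2,             # number of GPUs on the node
    precision="16-mixed",  # automatic mixed precision
    strategy="ddp",        # distributed data parallel
)
```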

Binary file added docs/figures/svideo_1.png

Binary file added docs/figures/svideo_3.png

Binary file added docs/figures/svideo_5.png
60 changes: 60 additions & 0 deletions docs/virtual_staining.md
# Virtual staining of cellular compartments from label-free images

Predicting sub-cellular landmarks such as nuclei and membrane from label-free (e.g. phase) images
can improve imaging throughput and ease experiment design.
However, training a model directly for segmentation requires laborious manual annotation.
We use fluorescent markers as a proxy for human-annotated labels,
and turn this instance segmentation problem into a paired image-to-image translation (I2I) problem.

VisCy features an end-to-end pipeline to design, train and evaluate I2I models in a declarative manner.
It supports 2D, 2.5D (3D encoder, 2D decoder) and 3D U-Nets,
as well as 3D networks with anisotropic filters (UNeXt2).
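
At its core, the training objective is supervised regression from label-free input to fluorescence targets. A minimal PyTorch sketch of one such training step, with a toy model and placeholder loss, not the VisCy implementation:

```python
# Minimal I2I training-step sketch in PyTorch (not the VisCy implementation).
# The model maps label-free stacks to virtual fluorescence; MSE is a placeholder loss.
import torch
import torch.nn as nn

model = nn.Sequential(  # toy stand-in for a U-Net
    nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.Conv3d(16, 2, 3, padding=1)
)
loss_fn = nn.MSELoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

phase = torch.randn(4, 1, 5, 256, 256)         # (B, C, Z, Y, X) label-free input
fluorescence = torch.randn(4, 2, 5, 256, 256)  # paired nuclei + membrane targets

optimizer.zero_grad()
prediction = model(phase)
loss = loss_fn(prediction, fluorescence)
loss.backward()
optimizer.step()
```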

## Overview of the pipeline

```mermaid
flowchart LR
subgraph sp[Signal Processing]
Registration --> Reconstruction --> Resampling
end
subgraph viscy["Computer Vision (viscy)"]
subgraph Preprocessing
Normalization -.-> fd[Feature Detection]
end
subgraph Training
arch[Model Architecting]
hyper[Hyperparameter Tuning]
val[Performance Validation]
compute[Acceleration]
arch <--> hyper <--> compute <--> val <--> arch
end
subgraph Testing
regr[Regression Metrics]
segm[Instance Segmentation Metrics]
cp[Cellpose]
cp --> segm
end
Preprocessing --> Training --> Testing
Testing --> test{"Performance?"}
test -- good --> Deployment
test -- bad --> Training
end
subgraph Segmentation
Cellpose ~~~ aicssegmentation
end
input[(Raw Images)] --> sp --> stage{"Training?"}
stage -.- no -.-> model{{Virtual Staining Model}}
stage -- yes --> viscy
viscy --> model
model --> vs[(Predicted Images)]
vs --> Segmentation --> output[Biological Analysis]
```
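
The training and testing stages above are configured declaratively. A hypothetical entry point using Lightning's `LightningCLI` illustrates the pattern; this is a sketch of declarative configuration, not VisCy's actual CLI:

```python
# Hypothetical declarative entry point (not VisCy's actual CLI):
# the model, datamodule, and trainer are all specified in a YAML config.
from lightning.pytorch.cli import LightningCLI

if __name__ == "__main__":
    # Run as: python main.py fit --config config.yaml
    # class_path entries in the YAML select the model and datamodule classes.
    LightningCLI()
```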

## Model architectures

Reported in the [2024 preprint](https://www.biorxiv.org/content/10.1101/2024.05.31.596901):

Reported in the [2020 paper](https://elifesciences.org/articles/55502):

![2.5D U-Net light](https://github.com/mehta-lab/VisCy/blob/main/docs/figures/2_5d_unet_light.svg?raw=true#gh-light-mode-only)
![2.5D U-Net dark](https://github.com/mehta-lab/VisCy/blob/main/docs/figures/2_5d_unet_dark.svg?raw=true#gh-dark-mode-only)
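
The 2.5D idea (3D encoder, 2D decoder) can be summarized in a few lines. A conceptual PyTorch sketch with assumed shapes and channel counts, not the published architecture:

```python
# Conceptual 2.5D sketch: the encoder sees a 3D stack, the decoder predicts a 2D plane.
# Shapes and channel counts are illustrative, not the published architecture.
import torch
import torch.nn as nn

class TwoPointFiveD(nn.Module):
    def __init__(self, depth: int = 5):
        super().__init__()
        # 3D encoder: convolve within the stack, then collapse Z with a valid conv.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv3d(16, 16, kernel_size=(depth, 1, 1)),  # Z: depth -> 1
        )
        # 2D decoder: predict the target plane.
        self.decoder = nn.Sequential(
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 1, 3, padding=1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.encoder(x).squeeze(2)  # (B, C, 1, Y, X) -> (B, C, Y, X)
        return self.decoder(z)

y = TwoPointFiveD()(torch.randn(2, 1, 5, 128, 128))  # -> (2, 1, 128, 128)
```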
