Make image handling more efficient #58

Open
nilseling opened this issue Feb 4, 2022 · 11 comments
Comments

@nilseling
Contributor

Somehow use raster images to write .h5 files - basically avoiding loading images into memory

@nilseling
Contributor Author

Add browser functionality as used in EBImage

@nilseling
Contributor Author

Check out vitessceR for integration once it's released

@Pancreas-Pratik

@nilseling I wanted to thank you and everyone who put in the work into cytomapper.

I am working on analyzing IMC data. I am stuck on the interactive "gating" that is done through the shiny app.

I have about 8 images with 20-30 channels each. Originally, I tried the whole pipeline, but I got stuck on this step on my personal Ubuntu server (128 GB RAM, 16 cores). I was able to use the Shiny app to try out gating, but it was slow and took some time to reload after each change.

I thought that I maybe needed more RAM/memory and cores/threads, therefore I have now brought over the analysis to my institution's HPC cluster.

Now, I am using RStudio Server Open Source on an HPC. I tried the gating with about 260GB RAM/memory and about 26 threads (I could allocate more computational resources as well) but it was still the same (slow).

After some googling around, I found that in order to make shiny faster (or I guess more parallel?) I would have to find a way to pay $$ for RStudio Connect (which gives the capability to spawn multiple processes per app - rather than the 1 process limit per app on the free shiny app) or request it from my institution, which I am trying to avoid...

--
@nilseling Have you experienced the same thing? If so, what is your workaround (without, of course, paying for an RStudio Connect subscription)?

@Pancreas-Pratik

Pancreas-Pratik commented Jul 17, 2022

Never mind about my question... I guess this makes sense, since the images object in the R environment that is loaded into the Shiny app is ~11 GB for me. I read through the documentation (?cytomapperShiny) and found a subtle, but very helpful, clue on what to do: use only the masks, not the images.

Now, I only have spe and masks loaded (which are much smaller, <1 GB). Running the Shiny app with the masks (which, I guess, internally uses plotCells() rather than plotPixels()) is much smoother and very fast.

Thank you again, sorry for any disruption!

@nilseling
Contributor Author

Hi @Pancreas-Pratik,
thanks for the detailed comments. Indeed, image handling is not so straightforward in R. I'd like to understand the setting a bit better, as it will also help other users. If the images are 11 GB in memory, loading them into memory on your laptop shouldn't be a problem. What are the dimensions of the images? I believe the issue comes from drawing the composites on the R graphics device when there are a lot of pixels to display. Unfortunately, displaying the masks should actually be slower due to internal subsetting operations.

@Pancreas-Pratik

Hi @nilseling,
You're welcome! I really appreciate your prompt response.

The dimensions are 2372 pixels x 1947 pixels
Just curious when you guys did your studies, were the dimensions the same?

@nilseling I think you are right. I think that was the main issue I was having: the images themselves were taking a lot of time to draw on the R graphics device. Is there a solution to this?

Regardless, since yesterday or so, I have been using only the masks, so something like this:

if (interactive()) {
    library(cytomapper)
    # images <- readRDS("data/images.rds")
    spe <- readRDS("data/spe.rds")
    masks <- readRDS("data/masks.rds")
    cytomapperShiny(object = spe, mask = masks,
                    cell_id = "ObjectNumber", img_id = "sample_id")
}

and the gating experience was very smooth (even though you mention the internal subsetting operations would be, I guess, slower?). I could probably try passing images.rds to cytomapperShiny() and just avoid the second tab altogether, where the composites are drawn on the R graphics device. I am on a bit of a time crunch right now, and using the masks alone as above is working, so I am just going to use it that way for now. I am so happy I got through the roadblock.

@nilseling you and your team are amazing!

P.S. I have another question, but it's a different topic, so I'll open a new issue in just a heartbeat.

@Pancreas-Pratik

Pancreas-Pratik commented Jul 19, 2022

But I do want to mention that I could imagine... when doing 3D IMC... doing the gating for every "2D slice" of the "3D block" could...take some time!

@nilseling
Contributor Author

Yes, these images are quite large and plotting them is the limiting factor. The fastest way of gating would be to not load any masks or images and run cytomapperShiny only on the spe object. But then you won't be able to observe the spatial distribution of cells.
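
For a sense of scale, here is a rough back-of-the-envelope estimate in base R. It uses the 2372 x 1947 pixel dimensions reported above and assumes 30 channels, 8 images, and 8 bytes per stored value (double precision). The channel count and storage mode are assumptions, so treat the result as an order-of-magnitude sketch only.

```r
# Dimensions reported in this thread
width  <- 2372
height <- 1947

# Assumptions (not confirmed in the thread): upper end of the
# "20-30 channels" range, all 8 images in memory, 8-byte doubles
n_channels      <- 30
n_images        <- 8
bytes_per_value <- 8

# Pixels that must be drawn for a single channel of a single image
pixels_per_channel <- width * height

# Approximate in-memory size of one image and of the full set
bytes_per_image <- pixels_per_channel * n_channels * bytes_per_value
total_gb        <- bytes_per_image * n_images / 1e9

cat("Pixels per channel:", pixels_per_channel, "\n")
cat("Approx. size of one image (GB):", round(bytes_per_image / 1e9, 2), "\n")
cat("Approx. size of all images (GB):", round(total_gb, 2), "\n")
```

Under these assumptions each channel holds about 4.6 million pixels, and the total comes out in the same order of magnitude as the ~11 GB reported earlier. Every redraw of a composite has to push that many pixels to the graphics device, which fits the observation that plotting, not memory, is the bottleneck.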

@Pancreas-Pratik

Pancreas-Pratik commented Aug 11, 2022

Thank you @nilseling

You are correct again that plotting the images (drawing the composites on the R graphics device) is the limiting factor. I have an additional question: is there any way to speed up drawing the composites on the R graphics device in Shiny? Would running on a GPU-enabled node on my HPC cluster help? Do you think there is any code within cytomapper I could modify to speed this up?

EDIT: On second thought, it's good the way it is. I ended up loading two instances of RStudio Server and gating two cell types at the same time. (While one is drawing the composite on its R graphics device, I work on the other.) This works! 👍

@Pancreas-Pratik

Pancreas-Pratik commented Aug 11, 2022

Also, I now understand that using the spe and masks only (without images) or spe only (without images or masks) for interactive gating using cytomapperShiny will not serve my purpose for accurately categorizing individual segmented cells in my images into their respective cell types for quantification/spatial analysis/etc. (It took some time and energy to realize this!)

@nilseling mentioned this here #58 (comment), but I did not completely understand at the time. To save someone else time in the future, my conclusion, although obvious in hindsight, is that the images are required to see the spatial distribution of cells on the image and also, very importantly, to determine whether actual channel/marker signal is being detected or whether what is detected is very low signal/noise (false positives). Visualizing the gating on the images after each adjustment is vital to knowing whether gates and cell types are being assigned to their appropriate categories.

I made the mistake of gating all signal greater than 0 on my asinh-transformed counts as positive for a channel/marker, and all cells with 0 signal as negative... I should have adjusted the gating so that it wasn't just "black and white" (positive and negative expression), but recognized that there are gradients of expression/signal such as zero, low, medium, high, etc., where "high" signal may be where the cell type of interest is, and low and zero may be noise (false positives) for that particular channel/marker.
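
The difference between a binary gate and a graded one can be sketched in base R. The counts, the cofactor of 1, and the cut points below are made-up illustration values, not recommendations; in practice the thresholds would be chosen per marker while checking the result on the images.

```r
# Hypothetical raw counts for one marker across six cells
counts <- c(0, 0.5, 2, 5, 60, 300)

# asinh transformation; the cofactor of 1 is an assumption
cofactor <- 1
expr <- asinh(counts / cofactor)

# Hypothetical cut points on the transformed scale.
# cut() uses right-closed intervals by default: (lower, upper]
breaks <- c(-Inf, 0, 1, 3, Inf)
labels <- c("zero", "low", "medium", "high")
bins <- cut(expr, breaks = breaks, labels = labels)

# Binary gate (anything above 0 is positive) versus a stricter gate
# that only accepts cells in the "high" bin
naive_positive  <- expr > 0
strict_positive <- bins == "high"

print(data.frame(counts, expr = round(expr, 2), bins,
                 naive_positive, strict_positive))
```

With these illustration values the binary gate marks every cell with a non-zero count as positive, while the stricter gate keeps only the cells in the top bin.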

Here is an example where I selected too many cells (and a lot of noise/false positives):
[image: ex-too-much]

Here is where I have adjusted it to now, which is better:
[image: ex-better]

@nilseling
Contributor Author

The imager::display function could be a potential solution

2 participants