Processing vizgen data #7080
@nzhang89 thanks for posting an issue about this - I heard Vizgen was planning to modify the output segmentation format, but I have not yet seen what these files look like. Any chance you could share some data with me in this format? My email is [email protected]. I can then go ahead and update the LoadVizgen function to support cell boundaries in parquet format. In the meantime, the steps for creating a |
Thanks for addressing this - having exactly the same issue here. Looking at Vizgen's latest user guide: it seems that with MERSCOPE instrument software v232 or later, they've moved to parquet files for the cell boundaries instead of the hdf5 format. |
Hey @andrewjkwok, thank you for your patience and pointing me to the user guide. I've reached out to Vizgen for additional information, and I'll update you here once I have a solution |
Vizgen now has Cellpose segmentation output stored as Additionally, there were some issue using |
@alikhuseynov thank you so much! A PR to handle parquet files would be much appreciated |
@AustinHartman sure, I like Seurat package and it's always nice to contribute. I will try to work on that. |
PR submitted #7190 |
Thanks @alikhuseynov for the update. I installed the branch that you updated, but unfortunately encountered an error saying:
Do you have any idea what the issue might be? When I look at the traceback, this is what I get:
|
What does the content of these directories look like?
Could you post your |
My
I'm not sure about the directory containing the Cellpose output - the .parquet file seems to be in the My script currently involves nothing other than loading the libraries and running the command!
|
in our case the Vizgen directory content looks like this:
I would assume that your Since I don't have access to the test data, could you run the following and see if the content of your
|
Huh this is strange. Sorry for the poor formatting earlier. This is the directory structure:
So it seems our directory structures are a bit different, as the parquet file is directly in my The parquet file seems to be more or less similar?:
|
No worries, sounds good. As far as we know, Vizgen has updated the MERSCOPE output. I will add support for reading any single
|
Just added the support in #7190; it seems to work. |
Thanks for such quick responses. I've reinstalled as you've instructed. The issue coming up now is:
i.e. I've renamed my parquet file to what you've suggested? |
Ok, given the last fix, you don't need to rename the below is my session info, just in case.
|
Very sorry but it still throws an error, this time like this...
Very happy to share data if you would prefer to best troubleshoot? |
yes, it would then be easier to debug.
there will be more updates on that PR #7190 |
I think the problem might come from your
|
Shared - please let me know if there are issues accessing the data. |
got it! |
I'm also getting the same issues as above |
yes, I'm working on the fix, which will be available asap. ideally one should have single polygon for a single segmented cell ID:
However, some have these, ie multiple polygons for single cell.
It's hard to tell which one of those multiple polygons would be the actual segmented cell. Right now, I'm keeping the single polygon with maximum points for a single cell. Suggestions on which polygons to choose are welcome. |
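The "keep the polygon with the most points" rule described above can be illustrated with a minimal base R sketch. This is not the code from the PR: the toy table and its column names (`cell`, `polygon_id`, `x`, `y`) are hypothetical and do not reflect Vizgen's parquet schema, and `keep_largest_polygon()` is a helper invented for this example.

```r
# Toy boundaries table: one row per polygon vertex.
# Column names are hypothetical, not Vizgen's parquet schema.
boundaries <- data.frame(
  cell       = c("a", "a", "a", "a", "a", "a", "a", "b", "b", "b"),
  polygon_id = c(1, 1, 1, 2, 2, 2, 2, 1, 1, 1),
  x          = c(0, 1, 0, 5, 6, 6, 5, 2, 3, 2),
  y          = c(0, 0, 1, 5, 5, 6, 6, 2, 2, 3)
)

# Hypothetical helper: keep, for every cell, only the polygon with the most vertices.
keep_largest_polygon <- function(df) {
  # Count vertices for every (cell, polygon_id) pair.
  counts <- aggregate(x ~ cell + polygon_id, data = df, FUN = length)
  names(counts)[names(counts) == "x"] <- "n_points"
  # Within each cell, put the polygon with the most points first
  # (ties fall back to the lower polygon_id).
  counts <- counts[order(counts$cell, -counts$n_points, counts$polygon_id), ]
  best <- counts[!duplicated(counts$cell), ]
  # Subset rather than merge() so the vertex order inside each polygon is preserved.
  key_all  <- paste(df$cell, df$polygon_id, sep = "|")
  key_best <- paste(best$cell, best$polygon_id, sep = "|")
  df[key_all %in% key_best, , drop = FALSE]
}

single <- keep_largest_polygon(boundaries)
print(single)
```

In this toy input, cell `a` has a 3-point and a 4-point polygon, so only the 4-point polygon survives; cell `b` has a single polygon and is left unchanged. Other criteria (for example polygon area) could be swapped in at the ordering step.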
Yah I think max points would be fine, would love to try the fix! Thanks, |
The fix is ready, try out. re-install: These packages The test Vizgen data is from @andrewjkwok, this part will generate the object:
Result, it took ~ 5 mins with
|
Hmm I'm getting |
strange, could you please list the |
|
I was able to get it to work with hdf5 files by just using
I had this same issue, and eventually realized I had mol.type = 'microns' as an environment variable when testing later on, which fixed the LoadVizgen() function. |
I'm going to push for a fix later today, that will solve above mentioned issues.
For multiple
if yes, that will be solved too. ..will do few tests on my side, before committing |
Yes, I was running it on a Windows laptop. For the multiple .parquet files, I had a subset of the data in a folder of the directory for testing. This led to multiple |
I see, yeah.. that was done for the situation when one has
I will try to add support to like: .."look for parquet file only in the current dir, if not found, look in the sub dir" |
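The lookup order described above ("current dir first, then sub dir") could be sketched in base R as follows. This is only an illustration of the idea, not the implementation in the PR; `find_parquet()` and the demo directory layout are hypothetical.

```r
# Hypothetical helper: prefer .parquet files at the top level of data.dir,
# and only search sub-directories when none are found there.
find_parquet <- function(data.dir) {
  hits <- list.files(data.dir, pattern = "\\.parquet$", full.names = TRUE)
  if (length(hits) == 0) {
    # Nothing at the top level: fall back to a recursive search.
    hits <- list.files(data.dir, pattern = "\\.parquet$",
                       full.names = TRUE, recursive = TRUE)
  }
  if (length(hits) == 0) {
    stop("No .parquet file found in ", data.dir)
  }
  if (length(hits) > 1) {
    warning("Multiple .parquet files found; consider keeping only one.")
  }
  hits
}

# Demo on a throwaway directory (empty placeholder files, never read as parquet).
root <- tempfile("vizgen_demo_")
dir.create(file.path(root, "subset"), recursive = TRUE)
file.create(file.path(root, "subset", "cell_boundaries.parquet"))
only_sub <- find_parquet(root)   # found via the sub-directory fallback

file.create(file.path(root, "cell_boundaries.parquet"))
top_first <- find_parquet(root)  # the top-level file now takes precedence
```

With this order, a test subset stored in a sub-folder no longer competes with the main boundaries file in the data directory.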
I just did the last fix, please try it out. It did work on my side for old data (ie Test run would be this
sessionInfo()
|
Thanks for the update. I installed the latest version with This somehow now throws up an error:
Incidentally I keep also getting an error saying that there's no default argument for |
You are welcome, I want to make sure it's robust and always works. Does your
|
..for sanity check, could you run the following and see if all 3 files are present? sanity check
output should look something like this:
|
Got it - turns out exactly as you predicted there was an extra cell metadata file (had tried to edit a version) - works now after deleting that file! |
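For anyone hitting the same problem, a check along these lines can flag a duplicated input file before loading. It is a sketch, not the sanity check posted earlier in the thread: the three file-name prefixes are assumptions based on typical Vizgen output folders, and `check_vizgen_inputs()` is a hypothetical helper.

```r
# Hypothetical helper: count how many CSV files match each expected input.
# The prefixes are assumptions; adjust them to match your own export.
check_vizgen_inputs <- function(data.dir) {
  prefixes <- c("cell_by_gene", "cell_metadata", "detected_transcripts")
  found <- lapply(prefixes, function(p) {
    list.files(data.dir, pattern = paste0("^", p, ".*\\.csv$"))
  })
  n <- lengths(found)
  # Exactly one match per input is the unambiguous case.
  data.frame(prefix = prefixes, n_files = n, ok = n == 1)
}

# Demo: a folder with an extra, hand-edited metadata file.
demo.dir <- tempfile("vizgen_inputs_")
dir.create(demo.dir)
file.create(file.path(demo.dir, c("cell_by_gene.csv",
                                  "cell_metadata.csv",
                                  "cell_metadata_edited.csv",
                                  "detected_transcripts.csv")))
report <- check_vizgen_inputs(demo.dir)
print(report)
```

Any row with `ok == FALSE` points at a missing or duplicated file, as with the extra cell metadata file described above.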
Hi Alik, thanks a lot for this implementation! Currently the LoadVizgen function can only load the segmentation from one z-layer. Vizgen has released a post-processing package on GitHub that lets us redo the segmentation across multiple z-layers, so the parquet files now contain the segmentation of multiple layers and can identify new cells in different layers. To combine all of this in a single Seurat object, a preliminary approach seems to be running LoadVizgen() per layer and then merging the Seurat objects: seurat.objF <- merge(seurat_z1.obj, y = c(seurat_z2.obj, seurat_z3.obj, seurat_z4.obj, seurat_z5.obj, seurat_z6.obj), add.cell.ids = c("1z", "2z", "3z", "4z", "5z", "6z"), project = "test") However, some cells are observed in several z-layers, so after merging the objects we would still need to merge the same cells by ID so that they are not counted as different cells. Would it be simpler to adapt the LoadVizgen() function to use information from more than one z-layer? |
Hey Mar, thanks. Support for multiple segmentations is on "todo" list. I need to take a look at Vizgen repo.
One could generate single dataframe of all (or selected) z-planes with unique cell IDs (also found in the count matrix) and use it as final Alternative post-processing could be Maximum Intensity Projection (MIP) on all z-planes, then segmentation on that and use the output for single z-plane @AustinHartman, do you have any suggestion? Thanks |
For some current (and potentially future) versions of the Vizgen segmentation, the segmentations for a cell won't be the same on all planes. From my knowledge, the newly generated Though based on @munizmom's comment it seems like there are multiple output files. My two cents on adding in multiple layers: while I think it would be interesting to view multiple cell segmentations, I'm not sure what the benefit of doing this in Seurat would be. The segmentations take up a ton of space in the Seurat object, and dealing with the complete set of segmentations would be a slog. Being able to view the segmentations in Seurat is very useful for image generation and some small exploratory analysis, but for me Seurat isn't meant to be an exploratory image analysis tool. |
Hi Alik, thanks for the super prompt answer . To answer your questions:
“One could generate single dataframe of all (or selected) z-planes with unique cell IDs (also found in the count matrix) and use it as final "segmentation" FOV.” MM: I am going to test this and come back to you. Thanks! Alik, I will send you an email with access to the multi-layer segmentation and all the output needed for loading the experiment. Give me just a day to put it together :). Thanks for the feedback and help of the community! If we manage to retain more z-layer info, it will at least enable better niche analyses and 3D analyses |
Thanks! sounds good, I see the point and agree that it will be huge/heavy object with all the z-planes segmentations. |
I'm happy that the current implementation works and is useful for users ;) |
Hi, I tested the function on our latest Vizgen data and the following error appeared:
Can you help to avoid this problem ? |
Hi, that has to do with segmentation output, I did implement few sanity checks to deal with nested polygon lists. Alternatively, if you don't need segmentations for your preliminary analysis, try to load without segmentations first, eg: Also, try to use parallelization with Hope this helps. |
can you please share your |
@heilandd ..my last commit in:
just re-install |
Hi all, this is very helpful, thank you very much for the updates! I have been trying to process a Vizgen run with .parquet cell boundaries. I used Alik's updates to LoadVizgen. However, I get the following warnings:
I went ahead with data processing. The UMAP looks good but I get the following when I try to plot the ImageDimPlot:
Does anyone know what the issue is? |
It's difficult to understand what your object looks like. Try to check which spatial FOVs are present
When does that warning come up? |
Vizgen has a PR with an updated function here: We are working on integrating this into Seurat (apologies for the delay), but you should be able to clone/install this now to get started |
Thank you for your continuous and dedicated efforts of incorporating more and more useful features to Seurat. We recently received our first batch of vizgen data, and I would like to process and visualize them using Seurat.
But the folder structure and files we received are a little different from the public vizgen data. We do not have a cell_boundaries folder with a bunch of h5 files. Instead, we have a single cell_boundaries.parquet file. In this situation, can I still use the cell boundary/segmentation information?
Thank you for your help.