Assigning neighborhood labels to single cells -- different strategies? #358

Open
MaximilianNuber opened this issue Jan 10, 2025 · 0 comments


Dear Dr. Morgan,

In the milo tutorials, you show how to use findNhoodGroupMarkers. In the first example, this is done without additional clustering of neighborhoods, based only on the significance and log fold change of the neighborhoods.
I would like to use this approach to identify disease states in single cells, akin to https://www.nature.com/articles/s41588-023-01523-7.

I either run milo on individual cell types or subset the neighborhoods before calling findNhoodGroupMarkers.
The differential expression looks quite good. However, I noticed a problem with the label assignment from neighborhoods to single cells. In one case, a cell type consists of ~48k cells, with about 4,600 non-DA neighborhoods and about 300 DA neighborhoods. findNhoodGroupMarkers assigns the "test" label to only about 200 to 300 cells, even though 20k cells are in at least one DA neighborhood. Checking the code, I realized this is because cells that overlap between neighborhoods are excluded from assignment. I got similar cell numbers using majority voting, as implemented in the publication code of the paper above.
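
To make the difference between the assignment rules concrete, here is a minimal sketch on a toy example. It is not miloR's code: the function `assign_cells`, the strategy names, and the toy neighborhoods are hypothetical, and "exclusive" is only my reading of the behaviour described above (a cell keeps the "test" label only if every neighborhood it belongs to is DA).

```python
def assign_cells(nhoods, is_da, strategy="exclusive", threshold=0.5):
    """Return the set of cells labelled 'test' under a given strategy.

    nhoods: dict mapping neighborhood name -> set of member cells
    is_da:  dict mapping neighborhood name -> True if the neighborhood is DA
    """
    total_counts = {}  # number of neighborhoods each cell belongs to
    da_counts = {}     # number of DA neighborhoods each cell belongs to
    for name, cells in nhoods.items():
        for cell in cells:
            total_counts[cell] = total_counts.get(cell, 0) + 1
            if is_da[name]:
                da_counts[cell] = da_counts.get(cell, 0) + 1

    labelled = set()
    for cell, total in total_counts.items():
        n_da = da_counts.get(cell, 0)
        if strategy == "any":            # in at least one DA neighborhood
            keep = n_da >= 1
        elif strategy == "exclusive":    # only in DA neighborhoods
            keep = n_da == total
        elif strategy == "majority":     # DA fraction strictly above threshold
            keep = n_da / total > threshold
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        if keep:
            labelled.add(cell)
    return labelled


# Toy data: three DA neighborhoods surrounded by two non-DA ones.
nhoods = {
    "n1": {"a", "b", "c"},
    "n2": {"b", "c", "d"},
    "n3": {"b", "c"},
    "n4": {"c", "d", "e"},
    "n5": {"c", "e", "f"},
}
is_da = {"n1": True, "n2": True, "n3": True, "n4": False, "n5": False}

for strategy in ("any", "exclusive", "majority"):
    print(strategy, sorted(assign_cells(nhoods, is_da, strategy)))
```

In this toy case, "any" labels four cells, "majority" three, and "exclusive" only two, which mirrors the drop I see when DA neighborhoods are embedded among non-DA ones.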

Unfortunately, I cannot share my data yet, and I have not been able to find a similar dataset.
I assume the problem is that the DA neighborhoods are quite central to the cell type cluster, so they are surrounded by non-DA neighborhoods. As a result, each cell within a DA neighborhood is also part of at least one non-DA neighborhood (and, it seems, part of more non-DA neighborhoods than DA neighborhoods).
I am quite certain about the relevance of the DA neighborhoods I found, as they overlap between different but related conditions.

In the code of your publication on milo 2.0, I saw a comment that neighborhood clustering is applied to find neighborhood clusters consisting mostly, if not only, of DA neighborhoods. However, when I apply neighborhood clustering, the clusters remain quite coarse, even for different values of max.lfc.delta and overlap, and therefore contain many more non-DA neighborhoods than they otherwise would.

My question: are there other established ways to assign neighborhood clusters to single cells?
Maybe neighborhoods should be further refined/filtered before cell assignment (like in miloDE).
Or maybe the situation changes when the neighborhood graph is raised to the second order of neighbors? (again like in miloDE)
Lastly, I have been trying to use label propagation in igraph, but I don't understand it well enough yet. Do you think this is a good idea?
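
For reference, this is roughly what I mean by label propagation, as a minimal sketch under my own assumptions rather than igraph's or miloR's implementation. Cells that can be assigned unambiguously act as fixed seeds (1.0 for DA, 0.0 for non-DA), and every other cell repeatedly takes the mean score of its graph neighbours. The function name `propagate_labels` and the toy chain graph are hypothetical.

```python
def propagate_labels(adjacency, seeds, n_iter=50):
    """Spread seed scores over a graph by repeated neighbour averaging.

    adjacency: dict mapping cell -> list of neighbouring cells (undirected)
    seeds:     dict mapping cell -> 1.0 (DA) or 0.0 (non-DA); kept fixed
    Returns a dict mapping cell -> score in [0, 1].
    """
    # Unlabelled cells start at the neutral value 0.5.
    scores = {cell: seeds.get(cell, 0.5) for cell in adjacency}
    for _ in range(n_iter):
        updated = {}
        for cell, neighbours in adjacency.items():
            if cell in seeds or not neighbours:
                updated[cell] = scores[cell]  # clamp seeds, skip isolated cells
            else:
                updated[cell] = sum(scores[n] for n in neighbours) / len(neighbours)
        scores = updated
    return scores


# Toy chain a - b - c - d - e with a DA seed at one end and a non-DA seed at the other.
adjacency = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}
seeds = {"a": 1.0, "e": 0.0}
scores = propagate_labels(adjacency, seeds)
print({cell: round(score, 2) for cell, score in scores.items()})
```

Cells could then be labelled by thresholding the score, and the choice of threshold would play the same role as the majority-vote cutoff.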

TL;DR:

  • Are there other strategies for assigning neighborhood labels to single cells?
  • Many cells are in at least one DA neighborhood, but the implemented strategy labels very few cells after assignment.
  • Grouping with groupNhoods did not identify clusters with relevant neighborhoods.

Thank you for any help and best regards,
max
