Overlapping observations obstruct obvious outcomes #820

Open
jahilton opened this issue Mar 11, 2024 · 5 comments

Comments

@jahilton

Is your feature request related to a problem? Please describe.
A contributor plotted known marker genes for their Dataset and did not see high expression in the areas they expected (in fact, they did not see high expression anywhere). The cause is the draw order of obs: the observations with higher expression for that gene happened to be buried under other observations, so they were not visible in Explorer.

Describe the solution you'd like
The contributor offered two solutions:

  1. Mirror scanpy by reordering the observations based on whatever is being used to 'paint' them, so that observations with higher values appear on top.
  2. Allow users to adjust the spot size, thus reducing the overlap between observations.
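
As a rough illustration of the first option, here is a minimal sketch of a value-based draw order. It is not taken from scanpy or from the Explorer code; the function name `draw_order` and the sample values are made up for the example. The idea is only that points are painted in ascending order of the value used for coloring, so the highest values end up on top.

```python
def draw_order(values):
    """Return point indices in paint order: lowest value first, highest last.

    Hypothetical helper for illustration only. Python's sorted() is stable,
    so points with equal values keep their original relative order.
    """
    return sorted(range(len(values)), key=lambda i: values[i])


# Made-up expression values for five observations.
expression = [0.0, 3.2, 0.0, 1.5, 0.1]
order = draw_order(expression)

# Painting in this order puts index 1 (the highest expression) on top.
print(order)  # [0, 2, 4, 3, 1]
```

The same permutation would have to be applied to the coordinates and the colors together, so each point keeps its own position and color.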
@corismall

Qian Li asked: When looking at the expression of a certain gene, would it be possible (and better) to place cells that express this gene on top of those that do not on the UMAP, as Scanpy does?

@tihuan
Contributor

tihuan commented Jun 27, 2024

Thanks for flagging this, @jahilton!

Do you happen to have a screenshot of what it looks like, and perhaps steps to reproduce it? 🙏 That'll help us plan a solution!

Thanks so much!

@jahilton
Author

Try this dataset: switch to the PCA embedding (it crowds the obs more than UMAP/t-SNE), search for the gene CLIC3, and color by its expression.

Screenshot 2024-06-27 at 2 26 47 PM

Then download the same Dataset and plot it in scanpy:
Screenshot 2024-06-27 at 2 27 47 PM

In scanpy you'll see a streak of high expression that isn't visible in CELLxGENE, because those obs sit underneath obs with low CLIC3 expression.
Screenshot 2024-06-27 at 2 29 37 PM

@tihuan
Contributor

tihuan commented Jul 1, 2024

Oh amazing! This is super helpful 💡 Thanks so much for the screenshots and explanation, @jahilton 🙏 !!

@tihuan
Contributor

tihuan commented Jul 1, 2024

A quick search suggests that to make sure the higher-expression dots come out on top, we just need to paint them last in the UI. So that'd be a direction the eng team can investigate 😄

I'm mostly thinking about the performance implications: to do so, we'd need to sort the data points, and we could have > 1M data points. However, we could also avoid paying that cost in the UI by pre-sorting in the backend (BE) 👍
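
On the performance concern, one possible way to keep the ordering cheap is sketched below. It is an assumption for illustration, not something proposed in this thread or present in the codebase: if values are quantized into a fixed number of color bins anyway, a counting sort over those bins gives a paint order in linear time instead of a full comparison sort. The function name `binned_draw_order` and the default of 256 bins are hypothetical.

```python
def binned_draw_order(values, n_bins=256):
    """Order point indices by quantized value in O(n + n_bins) time.

    Hypothetical sketch: points in the same bin keep their original
    relative order, and higher bins are painted later (on top).
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal

    # Map each value to a bin index in [0, n_bins - 1].
    bins = [min(int((v - lo) / span * n_bins), n_bins - 1) for v in values]

    # Counting sort: count points per bin, then compute each bin's start offset.
    counts = [0] * n_bins
    for b in bins:
        counts[b] += 1
    starts = [0] * n_bins
    for b in range(1, n_bins):
        starts[b] = starts[b - 1] + counts[b - 1]

    # Place each point index into its bin's next free slot.
    order = [0] * len(values)
    for i, b in enumerate(bins):
        order[starts[b]] = i
        starts[b] += 1
    return order


# Made-up values; with 4 bins the highest value (index 1) is painted last.
print(binned_draw_order([0.0, 3.2, 0.0, 1.5, 0.1], n_bins=4))  # [0, 2, 4, 3, 1]
```

Whether this runs in the UI or in the BE, the result is just a permutation of indices, so it could be computed once per selected gene and reused while the coloring stays the same.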
