Problem in interpreting the SAMap integration results #152

Open
GGboy-Zzz opened this issue Sep 16, 2024 · 2 comments

Comments

@GGboy-Zzz

Hello,
Thank you for developing such a useful tool! I'm working on integrating scRNA-seq data across species, and with SAMap I got an integration result that looks pretty good. I have some confusion about interpreting the SAMap results and am hoping to get your help.
My stitched SAMap UMAP is shown below:
image
My questions are:

  1. I passed my known cell annotations to both keys and neigh_from_keys in the SAMap run. Is it necessary to pass both parameters at the same time? Previously I passed the cell annotation only to neigh_from_keys. In addition, do you think using leiden clustering would improve the integration result?
  2. For some cell types, the correspondence is not completely one-to-one (depending on the resolution of the cell annotations). I want to identify the specific cell barcodes that map, or fail to map, to a given cell type in the other species, as in cell label transfer. How can I achieve this?

Thank you in anticipation

Best regards

@atarashansky
Owner

  1. neigh_from_keys actually expects a dictionary of booleans keyed by species ID - sorry the documentation isn't clear. Species where neigh_from_keys is True use the values defined in keys to determine neighborhoods. By default, keys uses leiden clustering. So if you'd like to use custom annotations the right way is to set neigh_from_keys to True and set keys to the annotation column name for each species. (Incidentally, setting neigh_from_keys to a dictionary of strings ends up being truthy anyway, so you probably don't need to rerun samap.)

  2. If you're comfortable working with sparse adjacency matrices, you can always look at the graph in sm.samap.adata.obsp['connectivities'] and for each row (cell) see which other cells it is connected to (nonzero columns).
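The row-wise lookup described in point 2 can be sketched as a simple neighbor-vote label transfer. This is an illustrative sketch, not part of SAMap: `transfer_labels` and its arguments are hypothetical names, and it assumes the CSR arrays of the graph, plus a per-cell species ID and annotation label in the same row order, have already been extracted. It uses only the standard library so the toy example runs without SAMap installed.

```python
from collections import defaultdict


def transfer_labels(indptr, indices, data, species, labels, target_species):
    """Neighbor-vote label transfer over a CSR adjacency matrix (hypothetical helper).

    For every cell that is NOT from `target_species`, sum the edge weights to
    its neighbors from `target_species`, grouped by those neighbors' labels.
    Returns {row_index: (best_label, share_of_cross_species_weight)}; cells
    with no cross-species neighbor get (None, 0.0), i.e. "unmapped".
    """
    result = {}
    for i in range(len(indptr) - 1):
        if species[i] == target_species:
            continue
        votes = defaultdict(float)
        # Nonzero columns of row i live in indices[indptr[i]:indptr[i + 1]].
        for k in range(indptr[i], indptr[i + 1]):
            j = indices[k]
            if species[j] == target_species:
                votes[labels[j]] += data[k]
        total = sum(votes.values())
        if total == 0:
            result[i] = (None, 0.0)
        else:
            best = max(votes, key=votes.get)
            result[i] = (best, votes[best] / total)
    return result


# Toy graph: cells 0-1 are mouse ('mo'), cells 2-3 are zebrafish ('ze').
indptr = [0, 3, 4, 5, 6]
indices = [1, 2, 3, 0, 0, 0]
data = [0.5, 0.75, 0.25, 0.5, 0.75, 0.25]
species = ['mo', 'mo', 'ze', 'ze']
labels = ['A', 'B', 'X', 'Y']

mapped = transfer_labels(indptr, indices, data, species, labels, target_species='ze')
```

With a real run, the arrays would come from sm.samap.adata.obsp['connectivities'] converted to CSR (its `indptr`, `indices`, and `data` attributes), and `species` and `labels` would be plain lists or arrays taken from columns of sm.samap.adata.obs. Which columns hold those values is an assumption to check against your own object. Row indices in the result can be translated back to barcodes through the order of adata.obs_names.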

@GGboy-Zzz
Author

Thanks for your clear response. I set both keys and neigh_from_keys to my annotation column, with the code below:
names = {'mo': ENSMUST_array, 'ze': ENSDART_array}
sm = SAMAP(filenames, f_maps='./maps/', save_processed=False, names=names, keys={'mo': 'celltype.predicted', 'ze': 'ClusterName_short'})
sm.run(neigh_from_keys={'mo': 'celltype.predicted', 'ze': 'ClusterName_short'})
samap = sm.samap
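For comparison, the form described in the maintainer's reply (annotation column names in keys, booleans in neigh_from_keys) could be built roughly as below. This is only a sketch: `build_key_args` is a hypothetical helper, and the SAMap calls are left as comments because they depend on the filenames and names objects from the snippet above.

```python
def build_key_args(annotation_columns):
    """Return (keys, neigh_from_keys) for a {species_id: obs_column} mapping.

    Hypothetical helper: `keys` carries the annotation column per species,
    while `neigh_from_keys` only flags, per species, whether that column
    should define the neighborhoods.
    """
    keys = dict(annotation_columns)
    neigh_from_keys = {species_id: True for species_id in keys}
    return keys, neigh_from_keys


keys, neigh_from_keys = build_key_args(
    {'mo': 'celltype.predicted', 'ze': 'ClusterName_short'}
)

# Assumed usage, mirroring the call in this thread:
# sm = SAMAP(filenames, f_maps='./maps/', save_processed=False, names=names, keys=keys)
# sm.run(neigh_from_keys=neigh_from_keys)
```

As the reply notes, a dictionary of strings is truthy as well, so the two forms should select the same behavior; the boolean form just matches what the parameter is documented to expect.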
I then wanted to identify aligned cell types by calculating cell type mapping scores. Most cell types connected as expected, with high mapping scores. However, a small portion of cell types showed either low mapping scores or incorrect connections, which I suspect may be due to inconsistencies in the granularity of the cell annotations.
I would like to ask the following:

  1. What is the threshold for a reliable mapping score? Is the score robust to the number of cells in a given cell type?
  2. After rerunning SAMap on a subset of cell types (not a one-to-one correspondence), I noticed that the cells from the species with fewer cells were more scattered on the UMAP. Could this be due to over-integration?
    custom cluster annotation
    image
    leiden_cluster
    image
