Different results for same cluster #20

Open
ayyildizd opened this issue May 28, 2024 · 6 comments

Comments

@ayyildizd

Hi, thanks for developing this nice tool.

I have two different subsets of data that share the same stem-cell-like cluster as a common population. When I run tricycle on these two subsets, the same stem cell population gets different tricyclePosition assignments: in one case the cells are in the mitotic phase, and in the other they are in G1/G0. The two subsets differ greatly in cell number; the one that gives the mitotic assignment has 10K fewer cells than the other and is the less heterogeneous subset.
I am wondering whether this is normal behaviour, since their PCAs are different. Could you elaborate on which score to trust in these cases?
For info: neither of the datasets I used yielded an ellipsoid PCA.
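
One possible factor, offered as an assumption rather than a confirmed diagnosis for this issue: if the position is the polar angle of each cell in the first two principal components, then it depends on how those axes are oriented, and PCA axes learned separately on each dataset are only defined up to sign. The base-R sketch below uses made-up coordinates (not data from this thread, and no tricycle functions) to show how flipping one axis changes the angle of every cell.

```r
# Toy 2-D embedding: each row is one cell's (PC1, PC2) coordinates.
# These numbers are invented purely for illustration.
emb <- rbind(c(1, 1), c(-1, 1), c(-1, -1), c(1, -1))

# Polar angle in [0, 2*pi), i.e. a position on a circle.
polar_angle <- function(m) atan2(m[, 2], m[, 1]) %% (2 * pi)

theta <- polar_angle(emb)

# PCA axes are only defined up to sign: flip PC1 and recompute the angles.
theta_flip <- polar_angle(emb %*% diag(c(-1, 1)))

# Angles before and after the flip, in units of pi.
print(round(cbind(theta, theta_flip) / pi, 2))
```

In this toy case every angle theta maps to (pi - theta) mod 2*pi, so cells that sat in one part of the circle land in a different part even though the underlying data are unchanged.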

@kasperdanielhansen
Collaborator

kasperdanielhansen commented May 28, 2024 via email

@ayyildizd
Author

Thanks for the fast response.

First dataset
image

image

Then the second dataset:
image

image

As you can see, the common cluster nNSC and two others (NB1 and NB2) get different cell cycle phase assignments.

@ayyildizd
Author

I was re-examining the last dataset and I see that the TOP2A plot looks really weird.

image

Then I re-ran tricycle with exactly the same code (called from history), and now I see the complete opposite of what I saw before.
image

image image

Do you know why this happens?

@kasperdanielhansen
Collaborator

kasperdanielhansen commented May 28, 2024 via email

@ayyildizd
Author

They are all single-nucleus sequencing samples from the 10x platform. Our average sequencing depth is 50K. There are 24 samples, from both healthy and disease (3 different stages) conditions. There are batch effects in these samples, so I used harmony to integrate them. And yes, both datasets I ran tricycle on are subsets of certain cell type populations.
Do I really need to run tricycle on each sample one by one?
What I don't get is this: the harmony UMAPs here are used for visualisation purposes only, just like regular UMAPs, and I thought they would not interfere with tricycle, since it uses the PCA embedding. After running tricycle and getting the timing, I just transferred it to my Seurat object to plot it and see which cell type is in which cell cycle phase.
Please let me know if I did or interpreted this the wrong way.

Here is the code I used (with tricycle version 1.12.0):

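library(Seurat); library(SingleCellExperiment); library(tricycle)
sce.in <- as.SingleCellExperiment(DietSeurat(seurat_obj))  # convert once and reuse
# Learn a dataset-specific PCA on cell cycle genes; keep the first two loadings as the reference
ref.o <- run_pca_cc_genes(sce.in, exprs_values = "logcounts", species = "human", gname.type = "SYMBOL")
cc.ref <- attr(reducedDim(ref.o, "PCA"), "rotation")[, seq_len(2)]
sce <- estimate_cycle_position(sce.in, ref.m = cc.ref)  # project onto this custom reference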

Then I transferred this column to the Seurat object to plot it in the reduction I want.

@ayyildizd
Author

Also, to add: it does not give the same result when I rerun the same code. Please see below. I re-ran the same code for the smaller dataset, and now the clusters on the right side of the UMAP get a different color/assignment.
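
To put a number on how two runs differ, instead of comparing UMAP colors by eye, one could compare the two position vectors directly on the circle. The sketch below is base R only; `circ_dist` and the candidate names are hypothetical helpers (not part of tricycle), and `run1`/`run2` are made-up values standing in for the positions from two runs on the same cells. It checks whether the second run is closest to the first run as is, or to the first run after a sign flip of one or both axes.

```r
# Smallest absolute angular distance between two vectors of angles (radians).
circ_dist <- function(a, b) {
  d <- (a - b) %% (2 * pi)
  pmin(d, 2 * pi - d)
}

# Made-up positions from two runs on the same cells (illustration only).
run1 <- c(0.3, 1.2, 2.9, 4.0, 5.5)
run2 <- (pi - run1) %% (2 * pi)  # constructed here as a PC1 sign flip of run1

# Compare run2 against run1 and against simple axis flips of run1.
candidates <- list(
  unchanged = run1,
  flip_pc1  = (pi - run1) %% (2 * pi),
  flip_pc2  = (-run1) %% (2 * pi),
  flip_both = (run1 + pi) %% (2 * pi)
)
mean_dist <- sapply(candidates, function(x) mean(circ_dist(run2, x)))

# The candidate with the smallest mean distance best explains the difference.
print(round(mean_dist, 3))
```

With real data, `run1` and `run2` would be the tricyclePosition columns from the two runs, matched by cell. If none of the candidates gives a small distance, the difference between runs is not a simple axis flip.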

image

image
