Pancreas example dataset #474
Hi, I need some help interpreting the analysis of the pancreas dataset. I followed the tutorial/recipe here: https://scvelo.readthedocs.io/Pancreas.html and got the same velocity embedding stream graph as the documentation (also shown below). I then modified the recipe by removing some precomputed data, as shown in the code snippet below:
The fate of the epsilon cells has changed quite substantially, and the velocity embedding stream now leads into the alpha cells. From the paper, I understand that the stochastic steady-state model can sometimes be inaccurate, and that running the dynamical model may improve the accuracy of velocities across the manifold. So I reran the code using the dynamical model.
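The snippet used for this rerun is not preserved in the thread. For orientation only, a dynamical-model run in scVelo typically looks like the sketch below. The function name `run_dynamical_velocity` is hypothetical, and the default values (`min_shared_counts=20`, `n_top_genes=2000`, `n_pcs=30`, `n_neighbors=30`) are the ones used in the scVelo tutorials, not the settings from this post.

```python
def run_dynamical_velocity(min_shared_counts=20, n_top_genes=2000,
                           n_pcs=30, n_neighbors=30):
    """Sketch of a dynamical-model velocity run on the pancreas dataset.

    Defaults are ASSUMPTIONS taken from the scVelo tutorials; they are not
    the values used by the original poster.
    """
    # Imported inside the function so the definition loads without scvelo.
    import scvelo as scv

    adata = scv.datasets.pancreas()  # downloads the example dataset

    # Gene filtering, normalization, and highly-variable-gene selection.
    scv.pp.filter_and_normalize(
        adata, min_shared_counts=min_shared_counts, n_top_genes=n_top_genes
    )
    # First- and second-order moments over nearest neighbors.
    scv.pp.moments(adata, n_pcs=n_pcs, n_neighbors=n_neighbors)

    # Fit the full splicing kinetics, then compute dynamical velocities.
    scv.tl.recover_dynamics(adata)
    scv.tl.velocity(adata, mode="dynamical")
    scv.tl.velocity_graph(adata)
    return adata
```

The returned object can then be plotted with `scv.pl.velocity_embedding_stream(adata, basis="umap")`. Changing `min_shared_counts`, `n_top_genes`, or the neighbor settings changes which genes and neighborhoods enter the fit, so the stream plot can differ between runs.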
The velocities now show a "backflow" in the alpha cells, which cannot be seen in either the documentation or the paper. Is this due to a difference in the UMAP embedding, or maybe to changes in the parameter values for the neighbor calculation? Or have I messed up somewhere? I would really appreciate your comments on this. Versions from
Thanks!
Replies: 1 comment
The projection of velocities depends on the embedding coordinates and the genes that are included in the analysis. In the second example you're filtering out many important genes: setting `min_shared_counts=1` and then going for 2k HVGs is not a good selection criterion. The min_counts definitely needs to be higher; otherwise many important genes are removed, while genes of secondary importance with few counts and high variation are included in the velocity analysis, even though they simply cannot yield any signal in splicing kinetics with such low cell numbers. Anyway, I wouldn't claim from a 2D projection that the fate of the epsilon cells has changed. In fact, if you explore velocities in th…

For the latter dynamical example, this is due to gene selection. It is more likely to filter out important genes when setting the thresholds relatively low (min counts) and cutting out all those that don't make it into the top 2000 HVGs. I would generally recommend to set the threshold…
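To make the filtering argument concrete, here is a toy sketch of what a shared-count threshold does. It assumes that a gene's "shared counts" are its spliced plus unspliced counts summed over the cells where both are nonzero, which is my understanding of `min_shared_counts` but is not copied from scVelo's code. The matrices, the helper name `shared_counts_per_gene`, and the threshold of 20 are made up for illustration.

```python
import numpy as np


def shared_counts_per_gene(spliced, unspliced):
    """Sum spliced + unspliced counts per gene over cells where both are > 0.

    ASSUMPTION: an approximation of the quantity thresholded by
    `min_shared_counts`, not scVelo's actual implementation.
    """
    both_nonzero = (spliced > 0) & (unspliced > 0)  # cells x genes mask
    return ((spliced + unspliced) * both_nonzero).sum(axis=0)


# Toy data: 6 cells x 3 genes. Gene 0 is well covered; genes 1 and 2 are
# each detected (spliced and unspliced) in a single cell only.
spliced = np.array([
    [5, 0, 1],
    [4, 0, 0],
    [6, 1, 0],
    [3, 0, 0],
    [7, 0, 0],
    [5, 0, 0],
])
unspliced = np.array([
    [2, 0, 1],
    [1, 0, 0],
    [3, 1, 0],
    [2, 0, 0],
    [2, 0, 0],
    [1, 0, 0],
])

shared = shared_counts_per_gene(spliced, unspliced)

keep_low = shared >= 1    # very permissive threshold keeps every gene
keep_high = shared >= 20  # stricter (illustrative) threshold keeps gene 0 only

print("shared counts:", shared.tolist())
print("kept with threshold 1: ", keep_low.tolist())
print("kept with threshold 20:", keep_high.tolist())
```

With the permissive threshold, the two genes seen in a single cell pass the count filter and can then compete for a place among the highly variable genes, even though one cell cannot support a fit of splicing kinetics.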