Myelin insulation as a risk factor for axonal degeneration in autoimmune demyelinating disease. Analysis. Profiling of CD45+ immune cells from wt and hMBP SpC 4 days post-EAE onset by single-cell RNA sequencing
Reading in raw count matrices output by CellRanger, annotating metadata, and merging the hMBP and WT counts into one object. Barcodes with fewer than 100 reads are discarded as empty droplets.
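As a minimal base-R sketch of the merge and empty-droplet filter described above (the toy matrices, barcodes, and the `min_reads` variable are illustrative assumptions; in the actual notebook this step presumably operates on the CellRanger matrices via Seurat):

```r
# Toy count matrices (genes x barcodes); all values are made up for illustration.
genes <- c("Ptprc", "Cd3g", "P2ry12")
wt <- matrix(c(80, 30, 10,    # 120 reads: kept
               20,  5,  0,    #  25 reads: discarded
               60, 40, 50),   # 150 reads: kept
             nrow = 3, dimnames = list(genes, c("AAAC", "AAAG", "AAAT")))
hmbp <- matrix(c( 0, 50, 49,  #  99 reads: discarded
                 70, 20, 10), # 100 reads: kept
               nrow = 3, dimnames = list(genes, c("AAAC", "CCCA")))

# Prefix barcodes with the sample name so they stay unique after merging.
colnames(wt) <- paste0("WT_", colnames(wt))
colnames(hmbp) <- paste0("hMBP_", colnames(hmbp))
merged <- cbind(wt, hmbp)

# Per-barcode metadata: here only the sample of origin.
meta <- data.frame(sample = sub("_.*", "", colnames(merged)),
                   row.names = colnames(merged))

# Discard barcodes with fewer than 100 reads as empty droplets.
min_reads <- 100
keep <- colSums(merged) >= min_reads
merged <- merged[, keep, drop = FALSE]
meta <- meta[keep, , drop = FALSE]
```

Prefixing the barcodes matters because the same barcode sequence can occur in both samples.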
Plotting densities of counts/cell by sample. The line represents the applied cutoff.
Plotting the density of detected genes/cell by sample. The line represents the applied cutoff.
Summary plotting read counts vs. number of genes, colored by ratio of mitochondrial genes.
Filtering for high-quality cells is done according to cutoffs stated above.
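The filtering step can be sketched in base R as follows. The cutoff values below are hypothetical placeholders, not the ones applied in this analysis (those are shown as lines in the plots), and the `^mt-` prefix assumes mouse gene symbols:

```r
# Toy count matrix (genes x cells); values are illustrative only.
genes <- c("Ptprc", "Aif1", "Cd3g", "mt-Co1", "mt-Nd1")
counts <- matrix(c(50, 30, 20,  5,  5,   # cell1: healthy cell
                   10,  0,  0, 40, 50,   # cell2: mostly mitochondrial reads
                    3,  0,  0,  0,  0),  # cell3: very few reads and genes
                 nrow = 5,
                 dimnames = list(genes, c("cell1", "cell2", "cell3")))

# Per-cell QC metrics.
n_count   <- colSums(counts)        # total reads per cell
n_feature <- colSums(counts > 0)    # detected genes per cell
is_mt     <- grepl("^mt-", rownames(counts))
pct_mt    <- 100 * colSums(counts[is_mt, , drop = FALSE]) / n_count

# Hypothetical cutoffs, for illustration only.
min_count   <- 50
min_feature <- 2
max_pct_mt  <- 10

pass <- n_count >= min_count & n_feature >= min_feature & pct_mt <= max_pct_mt
filtered <- counts[, pass, drop = FALSE]
```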
Count data is scaled to 10,000 counts/cell and log1p-transformed.
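For illustration, the normalization can be written out in base R. This is a sketch assuming the usual log-normalization convention (divide by the cell's total counts, multiply by the scale factor, apply the natural-log `log1p`); the toy matrix is made up:

```r
# Toy count matrix (genes x cells); values are illustrative only.
counts <- matrix(c(1, 3, 0,
                   10, 0, 30),
                 nrow = 3,
                 dimnames = list(c("g1", "g2", "g3"), c("c1", "c2")))

scale_factor <- 1e4
lib_size <- colSums(counts)                       # total counts per cell

# Divide each column by its library size, then rescale to 10,000 counts/cell.
scaled <- sweep(counts, 2, lib_size, "/") * scale_factor

# log1p(x) = log(1 + x), which keeps zeros at zero.
lognorm <- log1p(scaled)
```

After this step, cells with very different sequencing depth become comparable: back-transforming with `expm1` gives 10,000 counts for every cell.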
2000 most variable genes are selected for downstream analysis.
Data is normalized and PCA is calculated on the 2000 most variable genes. The elbow plot shows the percentage of variance explained by each PC. There is no clear shoulder, so 30 PCs are used for downstream analysis.
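The quantity shown in such an elbow plot can be reproduced with base R's `prcomp`. The sketch below uses random toy data (so, as expected, there is no elbow at all) and is only meant to show how the percentage of variance per PC is derived:

```r
set.seed(42)
# Toy data: 200 cells x 50 genes of random noise (illustrative only).
expr <- matrix(rnorm(200 * 50), nrow = 200)

# PCA on centered and scaled genes.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)

# Variance explained by each PC, in percent.
pct_var <- 100 * pca$sdev^2 / sum(pca$sdev^2)

# Elbow plot: look for the point where the curve flattens.
plot(seq_along(pct_var), pct_var, type = "b",
     xlab = "PC", ylab = "% variance explained")

# Keep the first 30 PCs for downstream steps.
n_pcs <- 30
embedding <- pca$x[, seq_len(n_pcs)]
```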
For visualization, UMAP is calculated using the first 30 PCs and default parameters (n.neighbors = 30, min.dist = 0.3, seed.use = 42, etc.). The embedding is plotted below, colored by sample.
Nearest-neighbor and shared nearest-neighbor graphs are constructed using the first 30 dimensions of the PCA and standard parameters (k = 20, nn.method = ‘annoy’, annoy.metric = ‘euclidean’, prune.SNN = 1/15, n.trees = 50, …). Cells are clustered by modularity optimization using the Louvain algorithm (Waltman and van Eck, 2013). Standard parameters are used (resolution = 0.8).
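To make the roles of `k` and `prune.SNN` concrete, here is a small base-R sketch of a shared nearest-neighbor graph. It uses exact Euclidean distances instead of the approximate annoy search, counts each cell among its own neighbors, and runs on random toy data, so it illustrates the idea rather than reproducing the Seurat implementation:

```r
set.seed(1)
# Toy embedding: 60 cells x 5 PCs (illustrative only).
pcs <- matrix(rnorm(60 * 5), nrow = 60)
n <- nrow(pcs)
k <- 20

# Exact pairwise Euclidean distances.
d <- as.matrix(dist(pcs))

# Indices of the k nearest neighbors of each cell (n x k matrix).
knn <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Binary kNN adjacency matrix: adj[i, j] = 1 if j is among i's neighbors.
adj <- matrix(0, n, n)
adj[cbind(rep(seq_len(n), k), as.vector(knn))] <- 1

# Number of neighbors shared by each pair of cells.
shared <- adj %*% t(adj)

# Jaccard index of the two neighborhoods: |A and B| / |A or B|.
snn <- shared / (2 * k - shared)

# Prune weak edges, analogous to prune.SNN = 1/15.
snn[snn < 1 / 15] <- 0
```

The pruned `snn` matrix is the weighted graph on which a modularity-based method such as Louvain would then search for communities.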
To assign cell types, two approaches are used jointly: (1) plotting canonical immune cell markers and (2) calculating marker genes for individual clusters and manually searching databases with cell type annotations (,
Plotting Ptprc (CD45) expression shows low background from CD45- cells. Clusters 16 and 17 look like contamination.
clPlot = DimPlot_scCustom(sc.norm)&NoAxes()
#Ptprc: CD45+, the cells that don't express are background
FeaturePlot(sc.norm, features = c('Ptprc')) + clPlot + plot_layout(ncol = 2)
Plotting Cd14, C1qa, Aif1 (IBA1) and Trem2 shows signal in both macrophages and microglia.
wrap_plots(FeaturePlot(sc.norm, features = c('Cd14','C1qa','Aif1','Trem2'))&NoAxes())+clPlot
Plotting P2ry12 and Tmem119 as markers for microglia. Both are fairly specific to Clusters 4, 5 and 9. C1qa and Cx3cr1 also have higher expression in these clusters.
wrap_plots(FeaturePlot(sc.norm, features = c('P2ry12','Tmem119','C1qa','Cx3cr1'))&NoAxes())+clPlot
Plotting Fcgr1 (CD64), Ccr2, Cxcr4 as markers for macrophages. They are all strongest in Clusters 0,1,2,3,13
#macrophages: Fcgr1: CD64, Cx3cr1
wrap_plots(FeaturePlot(sc.norm, features = c('Fcgr1','Ccr2','Cxcr4'))&NoAxes())+clPlot
Plotting Cd3g, Cd3d (CD3) and Trbc1, Trbc2 (TCR) shows that clusters 6, 7 and 12 are T cells. (1) Cd4 (CD4) and Cd8a (CD8) are often not well detected in scRNA-seq data, but both populations can be seen. (2) NK cells are defined by the absence of CD3 and the presence of, e.g., Klrk1 (CD314), Klrb1 (CD161) and Ncr1 (NKp46); a CD3-, CD314+, NKp46+ population is visible in one of the T-cell clusters.
#T Cells Cd3g/Cd3d: CD3
wrap_plots(FeaturePlot(sc.norm, features = c('Cd3g','Cd3d','Trbc1', 'Trbc2'))&NoAxes())+clPlot
Plotting Cd19 (CD19), Ms4a1 (CD20) and Cd79a + Cd79b (CD79) shows B cells in Cluster 18. This looks like a very small percentage; there might be some selection bias.
wrap_plots(FeaturePlot(sc.norm, features = c('Cd19','Ms4a1','Cd79a','Cd79b'))&NoAxes())+clPlot
Plotting Itgax (CD11c) and H2-Aa, H2-Ab1 (MHCII) and H2-D1 (MHCI) as markers for conventional dendritic cells (DC). None of them is very specific; expression is strongest in Cluster 10. Additionally, Xcr1 and Clec9a (CD370), reported markers of DCs that present to CD8+ T cells, are also concentrated in Cluster 10.
wrap_plots(FeaturePlot(sc.norm, features = c('Itgax','H2-Aa','Xcr1','Clec9a'))&NoAxes())+clPlot
Plotting Siglech (Siglec-H), which also marks microglia, and Bst2 (CD317) as markers of plasmacytoid DCs. Both are strongest in Cluster 14.
wrap_plots(FeaturePlot(sc.norm, features = c('Siglech','Bst2'),ncol = 1) & NoAxes()) + clPlot
Expression of Ngp (neutrophilic granule protein), S100a8 and Retnlg identifies Cluster 15 as granulocytes. These are neutrophils, which are important in EAE. Additional markers: Cxcr2 (the main chemokine receptor for neutrophils) and Ly6g.
wrap_plots(FeaturePlot(sc.norm, features = c('S100a8','Ngp','Retnlg'))&NoAxes())+clPlot
Clusters 11, 16 and 17 still remain unclear. Checking by calculating unique cluster markers:
# Positive markers of cluster 17 versus all other cells
FindMarkers(sc.norm, group.by = 'seurat_clusters', ident.1 = 17, logfc.threshold = 2, min.diff.pct = 0.7, only.pos = TRUE)
p_val avg_log2FC pct.1 pct.2 p_val_adj
Tm4sf1 0.000000e+00 2.805910 0.707 0.002 0.000000e+00
Id3 0.000000e+00 2.596465 0.747 0.016 0.000000e+00
Sparcl1 0.000000e+00 3.010900 0.813 0.003 0.000000e+00
Ptn 0.000000e+00 3.544829 0.787 0.008 0.000000e+00
Nedd4 0.000000e+00 2.546426 0.773 0.014 0.000000e+00
Igfbp7 5.368829e-211 4.577987 0.813 0.047 1.667183e-206
Tsc22d1 6.104530e-184 2.607984 0.813 0.052 1.895640e-179
Sparc 1.052539e-68 2.639328 0.987 0.220 3.268449e-64
Cluster 17 seems to be oligodendrocytes (Mag, Klk6, Ptgds).
# Positive markers of cluster 16 versus all other cells
FindMarkers(sc.norm, group.by = 'seurat_clusters', ident.1 = 16, logfc.threshold = 2, min.diff.pct = 0.7, only.pos = TRUE)
p_val avg_log2FC pct.1 pct.2 p_val_adj
S100a9 0.000000e+00 8.055196 0.901 0.013 0.00000e+00
S100a8 9.247286e-81 7.028866 0.938 0.228 2.87156e-76
sc.norm %>% FeaturePlot('Cldn5')
Tm4sf1 points to endothelial cells for cluster 16, which can be confirmed by looking at Cldn5
One cluster remains unannotated. By calculating marker genes we find Fscn1 and Ccr7 among the top markers which are annotated in literature as highly specific markers for activation of dendritic cells.
sc.norm %>% FindMarkers(ident.1 = '11',logfc.threshold = 0.5, only.pos = TRUE,min.diff.pct = 0.7)
p_val avg_log2FC pct.1 pct.2 p_val_adj
Fscn1 0.000000e+00 4.467155 0.905 0.058 0.000000e+00
Ramp3 0.000000e+00 2.848859 0.709 0.008 0.000000e+00
Ccr7 0.000000e+00 4.102384 0.921 0.007 0.000000e+00
Serpinb6b 0.000000e+00 3.729916 0.852 0.034 0.000000e+00
Traf1 1.047943e-240 2.608671 0.820 0.100 3.254177e-236
Cytip 6.252751e-192 2.987493 0.931 0.183 1.941667e-187
Tmem123 4.074869e-170 3.310465 0.942 0.241 1.265369e-165
Cell cycle phase scoring (using an enrichment approach adapted from Tirosh et al. 2016) shows actively dividing cells only in Cluster 8, which is recapitulated by expression of the canonical proliferation markers Top2a and Mki67. This cluster shows markers for both T cells and microglia, so we have to regress out cell cycle phase to place each cell type in its corresponding cluster.
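# Cells colored and split by cell cycle phase, next to proliferation markers
wrap_plots(DimPlot(sc.norm, group.by = 'Phase', split.by = 'Phase', ncol = 1) & NoAxes()) + wrap_plots(FeaturePlot(sc.norm, features = c('Top2a','Mki67'), ncol = 1) & NoAxes())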
Now cycling cells are closer to the microglia and T-cell clusters.
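The idea behind regressing out cell cycle phase can be sketched in base R: each gene is modeled as a linear function of the cell cycle scores, and the residuals are used in place of the original expression values. The scores and expression values below are simulated, and the variable names (`s_score`, `g2m_score`) are assumptions for illustration, not the notebook's own objects:

```r
set.seed(7)
n_cells <- 100

# Simulated cell cycle scores (stand-ins for S and G2/M phase scores).
s_score   <- rnorm(n_cells)
g2m_score <- rnorm(n_cells)

# gene1 is driven by the cell cycle, gene2 is independent of it.
expr <- cbind(
  gene1 = 2 * s_score + g2m_score + rnorm(n_cells, sd = 0.1),
  gene2 = rnorm(n_cells)
)

# Fit one linear model per gene (matrix response) and keep the residuals.
fit <- lm(expr ~ s_score + g2m_score)
resid_expr <- residuals(fit)

# After regression, gene1 no longer tracks the cell cycle scores.
cor_before <- cor(expr[, "gene1"], s_score)
cor_after  <- cor(resid_expr[, "gene1"], s_score)
```

Because the residuals carry no linear dependence on the scores, PCA and clustering on them are driven by cell identity rather than proliferation state.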
## Doublet & contamination removal
In order to improve the clustering we will also remove doublet cells using scDoubletFinder and filter out any clusters that are not immune cells. As scDoubletFinder relies on randomization, the clustering results downstream of that will not be 100% reproducible via this notebook. Total number of clusters and assignment of cells to (unsupervised) clusters will vary slightly. Therefore we provide the code we used and will then load a list containing the filtered cell barcodes as they were output when we ran the function at the end of the snippet in order to generate exactly the same figures as presented in the paper.
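Restricting the data to a stored barcode list could look like the following sketch. The file name `filtered_barcodes.txt` and the toy matrix are hypothetical; the temporary file only stands in for the list shipped with the notebook:

```r
# Toy expression matrix: genes x barcodes (values are illustrative only).
counts <- matrix(1:12, nrow = 2,
                 dimnames = list(c("Ptprc", "Mki67"), paste0("cell", 1:6)))

# Stand-in for the saved barcode list; the file name is hypothetical.
barcode_file <- file.path(tempdir(), "filtered_barcodes.txt")
writeLines(c("cell1", "cell2", "cell4", "cell6"), barcode_file)

# Load the stored barcodes and check that all of them exist in the data.
kept_barcodes <- readLines(barcode_file)
stopifnot(all(kept_barcodes %in% colnames(counts)))

# Keep only the stored barcodes, so downstream results no longer depend on
# the random component of doublet detection.
filtered <- counts[, colnames(counts) %in% kept_barcodes, drop = FALSE]
```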
Now we assign final cell type labels.
Based on marker genes for the individual cell types identified in our dataset no single cluster seems to contain a contamination.
Cell states were analyzed based on the discussion and marker gene expression reported in Giladi et al. 2021. They state that Ly6c+ monocytes (human: CD14hi-CD16lo) carry chemokine receptors that allow tissue infiltration and subsequent generation of different monocyte-derived cells. After infiltration, they gain expression of MHCII-related genes. Monocytes expressing Ccr2 were identified as the main drivers of EAE pathogenesis.
In their single-cell analysis they could identify a more fine-grained classification of different cell states:
- One cluster showed expression of interferon-responsive genes and was labeled Ifi2+
- Two subsets expressed Arg1, Apoc2 and C1qb and were designated as Arg1+ I and II
- A cluster was characterized by expression of Nos2, Gpnmb, Arg1 and Fabp5 and designated Nos2+
- One population that expressed inflammatory genes such as Saa3, Plac8 and Gbp2 was called Saa3+
- A cluster expressing Cxcl9, Cxcl10 and Il1b was designated as Cxcl10+
While the markers do not match perfectly, there is still a good correspondence between our subclustering and the respective substate markers described by Giladi et al. It does not seem possible to recover the two individual Arg1+ populations. According to these markers, our clusters correspond to:
- 0: Arg1+
- 1: Arg1+
- 2: Nos2+
- 3: Cxcl10+
- 4: Saa3+ (by the markers other than Saa3)
- 5: Ifi2+
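The cluster-to-state correspondence listed above can be applied to per-cell cluster assignments with a named lookup vector. The example cluster vector is made up; only the mapping itself comes from the text:

```r
# Mapping from subcluster ID to cell state, as listed above.
state_labels <- c("0" = "Arg1+",
                  "1" = "Arg1+",
                  "2" = "Nos2+",
                  "3" = "Cxcl10+",
                  "4" = "Saa3+",
                  "5" = "Ifi2+")

# Example per-cell cluster assignments (illustrative only).
clusters <- factor(c(0, 3, 5, 1, 2, 4))

# Index by character so that cluster "0" maps to the entry named "0",
# not to the first position of the factor.
cell_state <- unname(state_labels[as.character(clusters)])
```

Converting the factor with `as.character` first avoids the common pitfall of indexing by the factor's internal integer codes.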
Cluster 6 seems to be contamination, as several markers from other cell types are expressed. Cluster 5 seems to be contamination as well, as expression of several microglia markers is missing.
The following analysis of microglia subclusters is based on the findings by Jordao et al. 2019.
- In their analysis, all MG populations expressed Bhlhe41, Gpr34, Hexb, Olfml3, P2ry12, P2ry13, Sall1, Serpine2, Siglech, and Sparc.
- But daMG clusters showed lower expression of P2ry12, Maf, Slc2a5 and higher expression of Ccl2, Cxcl10, Ly86, and Mki67, indicating a proliferative capacity of daMG alongside the production of chemokines.
- From the paper by Jordao et al.: "The most inflammatory disease-associated microglial subsets (daMG2, daMG3, and daMG4) strongly downregulated the core microglial genes P2ry12, Tmem119, and Selplg and up-regulated Ly86. Both P2RY12 and TMEM119 immunoreactivities were clearly down-regulated within the core of spinal cord lesions, whereas CD162 (encoded by Selplg) was only weakly reduced. By contrast, microglial MD-1 (encoded by Ly86) was strongly up-regulated in the lesions. Because of their transcriptional profile and their P2RY12loTMEM119loMD-1hi phenotype, we determined that only daMG2, daMG3, and daMG4 localized within the lesion sites."
In our data:
- All cells express most of the microglia markers
- Clusters 1,2,4 express more daMG associated genes
- Cluster 0 contains hMG (has stronger expression in all genes downregulated in daMG and less expression in genes upregulated in daMG)
- Cluster 3 contains microglia with an active cell cycle
- Genes that mark daMG1 don’t show variation across our data set so daMG1 population is probably not captured
- Cluster 2 is most similar to daMG2
- Cluster 1 seems to be an intermediate between hMG and daMG2, but also has some daMG3 genes expressed
- Cluster 4 is most similar to daMG3
- The daMG4 signature doesn’t show as a clear subpopulation and might not be well captured here
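One simple way to compare clusters against such a signature is a per-cell score: mean expression of the genes reported as up-regulated in daMG minus mean expression of those reported as down-regulated. This is a simplified stand-in chosen for illustration, not the scoring used in the paper; the gene sets come from the quoted passage, while the expression values and cluster labels are made up:

```r
# Gene sets taken from the quoted description of daMG2-4.
genes_down <- c("P2ry12", "Tmem119", "Selplg")   # lower in daMG
genes_up   <- c("Ly86")                          # higher in daMG

# Toy normalized expression (genes x cells); values are illustrative only.
expr <- matrix(c(3, 3, 2, 0,    # cellA
                 3, 2, 3, 1,    # cellB
                 1, 0, 1, 3,    # cellC
                 0, 1, 1, 4),   # cellD
               nrow = 4,
               dimnames = list(c(genes_down, genes_up),
                               c("cellA", "cellB", "cellC", "cellD")))

# Hypothetical cluster assignment per cell.
cluster <- c(cellA = "0", cellB = "0", cellC = "2", cellD = "2")

# Per-cell score: mean of up-regulated minus mean of down-regulated genes.
score <- colMeans(expr[genes_up, , drop = FALSE]) -
         colMeans(expr[genes_down, , drop = FALSE])

# Average score per cluster; higher values look more daMG-like.
score_by_cluster <- tapply(score, cluster[names(score)], mean)
```

In this toy example the first cluster scores negative (homeostatic-like) and the second positive (daMG-like).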
The following blocks generate the figures as found in the paper.
