Commit

update cell annotation details
enc-kcotto committed Apr 23, 2024
1 parent 1cf8505 commit aff7dc6
Showing 1 changed file with 28 additions and 4 deletions.
32 changes: 28 additions & 4 deletions _posts/0008-03-01-Cell_annotation.md
@@ -9,16 +9,35 @@ feature_image: "assets/genvis-dna-bg_optimized_v1a.png"
date: 0008-03-01
---

## Clustering/Cell type annotation
## Cell Type Annotation

TODO: add intro

First, we need to load the relevant libraries.
```R
library('SingleR')
library(SingleR)
library(celldex)
```

The ImmGenData() function from celldex provides normalized expression values for 830 microarray samples generated by [ImmGen](http://www.immgen.org/) from pure populations of murine immune cells. The samples were processed and normalized as described in Aran, Looney and Liu et al. (2019): CEL files from the Gene Expression Omnibus (GEO; GSE15907 and GSE37448) were downloaded, processed, and normalized using the robust multi-array average (RMA) procedure on probe-level data. The dataset covers 20 broad cell types ("label.main") and 253 finely resolved cell subtypes ("label.fine"). The subtypes have also been mapped to the Cell Ontology ("label.ont", present unless cell.ont is set to "none"), which can be used for further programmatic queries.

Calling the ImmGenData() function returns a SummarizedExperiment object containing a matrix of log-expression values with sample-level labels.

```R
#cell typing with single R
#load singler immgen reference
#load singleR immgen reference
ref_immgen <- celldex::ImmGenData()

ref_immgen

head(ref_immgen$label.main)
head(ref_immgen$label.fine)
head(ref_immgen$label.ont)

```
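
To get a feel for how the reference samples are spread across labels, it can help to tabulate them. The sketch below uses only base R; `count_labels()` is a hypothetical helper introduced here for illustration (it is not part of SingleR or celldex), and the toy vector merely stands in for `ref_immgen$label.main`.

```R
# Hypothetical helper (not from SingleR or celldex): summarise a label vector.
count_labels <- function(labels) {
  # Samples per label, most frequent first; NA labels get their own entry if present
  counts <- sort(table(labels, useNA = "ifany"), decreasing = TRUE)
  list(
    n_labels = length(unique(labels[!is.na(labels)])),  # number of distinct labels
    counts = counts
  )
}

# Toy stand-in for ref_immgen$label.main (the real vector has one entry per sample)
toy_labels <- c("T cells", "B cells", "T cells", "Macrophages", "T cells", "B cells")

label_summary <- count_labels(toy_labels)
label_summary$n_labels
label_summary$counts
```

With the reference loaded, the same helper could be applied to `ref_immgen$label.main` or `ref_immgen$label.fine` to compare against the label counts quoted above.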

```R
#generate predictions for our seurat object
predictions_main = SingleR(test = GetAssayData(merged),
ref = ref_immgen,
@@ -40,4 +59,9 @@ DimPlot(merged, group.by = c("immgen_singler_main"))

DimPlot(merged, group.by = c("immgen_singler_fine"))

```

### Note on reference annotation datasets
As one might expect, the choice of reference can have a major impact on the annotation results. It is essential to choose a reference whose labels include, and ideally extend beyond, the cell types we expect in our test dataset. We also have to trust that the original authors labeled the reference samples appropriately, which is often a leap of faith, and it is unsurprising that some references perform better than others because of differences in sample preparation quality. Ideally, we would use a reference generated with a technology or protocol similar to that of our test dataset, although this is typically not an issue when using SingleR() to annotate well-defined cell types.
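
One quick sanity check on label coverage is to compare the cell types we expect against the labels a reference actually offers. This is a minimal base R sketch: `missing_from_reference()` is a hypothetical helper, and both vectors are toy stand-ins (in practice `reference_labels` might be `ref_immgen$label.main`, and `expected` would come from your own knowledge of the tissue).

```R
# Hypothetical helper: which expected cell types have no matching reference label?
missing_from_reference <- function(expected, reference_labels) {
  setdiff(expected, unique(reference_labels))
}

# Toy stand-ins; label spellings must match the reference exactly to count as covered
reference_labels <- c("T cells", "B cells", "NK cells", "Macrophages", "T cells")
expected <- c("T cells", "B cells", "Hepatocytes")

not_covered <- missing_from_reference(expected, reference_labels)
not_covered
```

Any cell type reported here cannot be assigned by SingleR, because predictions are drawn only from the labels present in the reference; such cells would end up with some other label or be pruned.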

Users are advised to read the relevant vignette for more details about the available references as well as some recommendations on which to use. (As an aside, the ImmGen dataset and other references were originally supplied along with SingleR itself but have since been migrated to the separate celldex package for more general use throughout Bioconductor.) Of course, as we shall see in the next chapter, it is entirely possible to supply your own reference datasets instead; all we need are log-expression values and a set of labels for the cells or samples.
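
To make the "log-expression values plus labels" requirement concrete, here is a minimal sketch of a custom reference built from made-up numbers. All object names (`ref_mat`, `ref_labels`, `test_mat`, `toy_pred`) and all values are assumptions for illustration only; the `SingleR()` call is skipped if the package is not installed.

```R
set.seed(1)

# Toy reference: 50 genes x 6 samples of made-up log-expression values
genes <- paste0("gene", 1:50)
ref_mat <- matrix(rnorm(50 * 6, mean = 5), nrow = 50,
                  dimnames = list(genes, paste0("sample", 1:6)))
ref_labels <- rep(c("typeA", "typeB"), each = 3)  # one label per reference column

# Give each toy type its own block of elevated "marker" genes
ref_mat[1:10, ref_labels == "typeA"] <- ref_mat[1:10, ref_labels == "typeA"] + 5
ref_mat[11:20, ref_labels == "typeB"] <- ref_mat[11:20, ref_labels == "typeB"] + 5

# Toy test data: noisy copies of reference samples 1, 2, 4 and 5
test_mat <- ref_mat[, c(1, 2, 4, 5)] + matrix(rnorm(50 * 4, sd = 0.1), nrow = 50)
colnames(test_mat) <- paste0("cell", 1:4)

# The two things a custom reference needs: matching labels and shared gene names
stopifnot(length(ref_labels) == ncol(ref_mat))
stopifnot(length(intersect(rownames(test_mat), rownames(ref_mat))) > 0)

# Run the annotation only if SingleR is available
if (requireNamespace("SingleR", quietly = TRUE)) {
  toy_pred <- SingleR::SingleR(test = test_mat, ref = ref_mat, labels = ref_labels)
  print(table(toy_pred$labels))
}
```

For real data, the reference matrix would hold log-normalized expression from annotated samples or cells, with row names drawn from the same gene identifier scheme as the test dataset.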
