-
Notifications
You must be signed in to change notification settings - Fork 1
/
Simon DESeq.Rmd
73 lines (57 loc) · 1.74 KB
/
Simon DESeq.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
DE via Pseudobulk
=================
Simon Anders, 2019-07-30
Using the data from the ASD paper, this short notebook shows how a DE analysis can be done using pseudobulks: For a given cell sub-population
(here used as example: the L2/3 neurons), add up the counts over all cells in each sample, then perform a normal DESeq2 analysis on that.
```{r}
suppressPackageStartupMessages({
library( tidyverse )
library( Matrix )
library( DESeq2 )
})
```
Load ASD data:
```{r}
path <- "~/Desktop/ASD/rawMatrix/"
cellinfo <- read.delim( file.path( path, "meta.txt" ), stringsAsFactors=FALSE )
counts <- readMM( file.path( path, "matrix.mtx" ) )
rownames(counts) <- read.delim( file.path( path, "genes.tsv" ), header=FALSE )$V2
colnames(counts) <- readLines( file.path( path, "barcodes.tsv" ) )
head(cellinfo)
```
Convert counts from a coordinate-sparse to a column-sparse matrix. This speeds up the calculation of the pseudobulks (but does require lots of memory here).
```{r}
counts <- as( counts, "dgCMatrix" )
counts[ 1:5, 1:5 ]
```
Extract sample information from 'meta' table:
```{r}
sampleTable <-
cellinfo %>% select( sample : RNA.Integrity.Number ) %>% unique
sampleTable
```
Make pseudobulk for L2/3 cells by simply summing up L2/3 cells for each sample
```{r}
pseudobulk_L23 <-
sapply( sampleTable$sample, function(s)
rowSums( counts[ , cellinfo$sample == s & cellinfo$cluster=="L2/3", drop=FALSE ] ) )
pseudobulk_L23[ 1:5, 1:5 ]
```
Run DESeq on the pseudobulk.
```{r}
dds <- DESeqDataSetFromMatrix( pseudobulk_L23, sampleTable, ~ age + sex + region + diagnosis )
dds <- DESeq( dds )
res <- results( dds )
```
Show top hits
```{r}
res[ order(res$padj), ]
```
How many hits at 10% FDR?
```{r}
table( res$padj < .1 )
```
Session info
```{r}
sessionInfo()
```