-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathenrichMiR.Rmd
163 lines (131 loc) · 5.44 KB
/
enrichMiR.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
---
title: "miRNA target enrichment analysis with enrichMiR"
author:
- name: Pierre-Luc Germain
affiliation:
- D-HEST Institute for Neuroscience, ETH
- Lab of Statistical Bioinformatics, UZH
- name: Michael Soutschek
affiliation: Lab of Systems Neuroscience, D-HEST Institute for Neuroscience, ETH
package: enrichMiR
output:
BiocStyle::html_document
abstract: |
This vignettes explores the main functionalities of the enrichMiR package.
vignette: |
%\VignetteIndexEntry{enrichMiR}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---
```{r, include=FALSE}
library(BiocStyle)
```
```{r echo = FALSE, out.width="30%", fig.align = 'right'}
knitr::include_graphics(system.file('docs', 'enrichMiR_sticker.png', package='enrichMiR'))
```
## Installing
You can install the package using:
```{r, eval=FALSE}
BiocManager::install("ETHZ-INS/enrichMiR")
```
<br/><br/>
## Preparing the data
EnrichMiR requires two main inputs: a target annotation object, and the signal in
which you want to look for target enrichment, such as a geneset (versus a
background set) or the results of a differential-expression analysis (DEA).
Genesets would simply be vectors of gene identifiers, while a DEA would be a
table that looks like this:
```{r, message=FALSE}
library("enrichMiR")
data("exampleDEA",package="enrichMiR")
head(exampleDEA)
```
The row names should be gene identifiers (or also transcript identifiers,
depending on the target annotation), and it should minimally have the 'logFC',
'PValue' and 'FDR' columns (similar column names from established RNAseq DEA
packages can be automatically recognized).
The other important input is a target annotation, which looks like this:
```{r}
# we load the TargetScan conserved target annotation for human:
data("consTShs", package="enrichMiR")
consTShs[,1:4]
```
A target annotation minimally has the columns "set" (indicating the sets to be
tested, in this case the miRNA families, denoted by their seed region) and
"feature" (the features for which enrichment is tested, such as genes), together
denoting a miRNA-target pair, and optionally the columns "sites" (indicating how
many binding sites the feature has for the miRNA) and "score" indicating for
instance a repression score. (The objects can also include metadata regarding
synonymous feature names and miRNA family members).
# Running an enrichment analysis
An enrichment analysis can be launched as follows:
```{r}
devtools::load_all("../")
er <- testEnrichment(exampleDEA, consTShs)
```
Unless the tests are specified, this will run the default tests, namely the
siteoverlap test respectively on significantly upregulated and downregulated
genes, as well as the areamir test on the continous differential expression
signal. Printing the object will give an overview of the top results of each test:
```{r}
er
```
We see that both the siteoverlap test on downregulated genes and the areamir
test give the same top enrichments, while the siteoverlap test on upregulated
genes gives no enrichment. An aggregated table for all tests can also be
obtained using `getResults(er)`, and more detailed results of a test can be
viewed with:
```{r}
er$siteoverlap.down
```
The results can alternatively be visualized with the `enrichPlot` function:
```{r}
enrichPlot(er$siteoverlap.down)
```
# CD plots
A good way to visualize an enrichment is to plot the cumulative foldchange
distributions of different target groups. This can be done with the
`CDplotWrapper` function:
```{r}
CDplotWrapper(dea=exampleDEA, sets=consTShs, setName="UGGUCCC")
```
The genes in each sets are ranked according to their foldchange (the x-axis),
while the y-axis shows the proportion of genes in the set that have a foldchange
below or equal to that given on x. In this way, small shifts in foldchanges can
easily be visualized.
In the presence of a real miRNA effect, we expect to see a gradual dose-response
pattern, as shown above, meaning that targets with stronger binding sites also
have stronger log-foldchanges (i.e. lower logFC, in the case of an increased
miRNA activity, as in this example).
There are different ways of splitting the genes into groups, which can be
specified using the `by` argument:
```{r}
CDplotWrapper(dea=exampleDEA, sets=consTShs, setName="UGGUCCC", by="sites")
```
Because UTR length is correlated with more binding sites for any miRNA, the
number of sites tends to give spurious signals. We therefore recommend
splitting by best binding type if this information is available in
the target annotation you are using, or otherwise splitting by predicted
repression score (if available).
# Overview of the different enrichment tests
enrichMiR implements different statistical tests for target enrichment. You
can view what tests are possible for a given input using:
```{r}
availableTests(exampleDEA, consTShs)
```
Some tests depend on continous inputs (e.g. fold-changes), and hence require
the results of a differential expression analysis (DEA) to be provided),
while others are binary, i.e. based on the membership in a gene set.
When the input is a DEA, each binary tests will be performed on both
significantly upregulated and downregulated genes.
Below is a description of the various enrichment tests. We recommend using the
siteoverlap (most sensitive) or areamir tests. For a more detailed benchmark,
please consult the publication.
```{r, echo=FALSE, results='asis'}
enrichMiR:::.testDescription()
```
<br/><br/>
# Session info
```{r}
sessionInfo()
```