Commit ff87193

Update salmon slides
1 parent 913411c commit ff87193

5 files changed: 140 additions and 4 deletions

Markdowns/03_Quantification_with_Salmon_introduction.Rmd

Lines changed: 64 additions & 3 deletions
@@ -2,12 +2,12 @@
 title: "Alignment and Quantification of Gene Expression with Salmon"
 date: "March 2023"
 output:
-  beamer_presentation: default
   ioslides_presentation:
     css: css/stylesheet.css
     logo: images/CRUK_Cambridge_Institute.png
     smaller: yes
     widescreen: yes
+  beamer_presentation: default
 bibliography: ref.bib
 ---
 
@@ -17,6 +17,9 @@ bibliography: ref.bib
 
 <img src="images/workflow_3Day.svg" class="centerimg" style="width: 80%; margin-top: 60px;">
 
+
+
+
 ## Traditional Alignment
 
 AIM: Given a reference sequence and a set of short reads, align each read to
@@ -45,6 +48,14 @@ Aligners: STAR, HISAT2
 
 <img src="images/quasi_mapping_2.svg" class="centerimg" style="width: 90%; margin-top: 40px;">
 
+
+## Alignment and Quantification overview {#less_space_after_title}
+
+<div style="line-height: 10%;"><br></div>
+
+<img src="images/aln_quant_overview.png" class="centerimg" style="width: 48%; margin-top: 60px;">
+
+
 ## Alignment
 * Traditional alignment performs base-by-base alignment
 * Traditional alignment is (relatively) slow and computationally intensive
@@ -111,15 +122,65 @@ Salmon also takes account of biases:
 
 * Because Salmon searches the transcriptome, not the genome, it is not the right tool for finding new genes or isoforms
 
-## Salmon workflow
 
-<img src="images/Salmon_workflow_2.png" class="centerimg" style="width: 55%;">
+
+## Salmon workflow
+* Salmon essential steps
+    1. Salmon indexing
+    2. Quasi-mapping and abundance quantification
+<img src="images/Salmon_workflow_2.png" class="centerimg" style="width: 40%;">
 
 <div style="text-align: right">
 Patro *et al.* (2017) Nature Methods doi:10.1038/nmeth.4197
 </div>
 
+
+## Salmon: Salmon indexing
+
+* Two essential steps
+    1. Create transcriptome index
+        * This makes the downstream quasi-mapping and quantification step faster and more efficient
+        * Once you create an index, you can use it again and again
+        * Salmon indexing has two components
+            * Creates the reference transcriptome suffix array (SA)
+            * Each transcript in the reference transcriptome is mapped to its location in the SA using a hash table
+    2. Quasi-mapping and quantification
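
As a rough illustration of the indexing slide, the sketch below builds a suffix array over a `$`-separated transcriptome string in Python. It is a toy under stated assumptions, not Salmon's implementation: the function names (`build_index`, `transcript_at`), the two example transcripts, and the naive sort-based construction are all made up for illustration, and a sorted list of start offsets stands in for the lookup from SA positions back to transcripts.

```python
from bisect import bisect_right


def build_index(transcripts):
    """Concatenate transcripts into a '$'-separated text and build its suffix array."""
    names = list(transcripts)
    text = "$".join(transcripts[n] for n in names) + "$"

    # Record where each transcript starts inside the concatenated text.
    starts = []
    offset = 0
    for n in names:
        starts.append(offset)
        offset += len(transcripts[n]) + 1  # +1 for the '$' separator

    # Naive suffix array: sort all suffix start positions lexicographically.
    # (Fine for a toy example; real tools use far more efficient constructions.)
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    return text, sa, names, starts


def transcript_at(pos, names, starts):
    """Return the name of the transcript that covers text position `pos`."""
    return names[bisect_right(starts, pos) - 1]


# Hypothetical two-transcript "transcriptome".
transcripts = {"t1": "ACGT", "t2": "CGTA"}
text, sa, names, starts = build_index(transcripts)

print(text)
for row, pos in enumerate(sa):
    print(row, pos, text[pos:], transcript_at(pos, names, starts))
```

Because the index depends only on the reference, it can be built once and reused for every sample, which is the point made on the slide.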
+
+
+## Salmon: Quasi-mapping
+<div class="columns-2">
+<img src="images/quasi-mapping_overview.png" class="centerimg" style="width: 100%; height: 100%">
+
+* The transcriptome (consisting of transcripts $t_1, \ldots, t_6$) is converted into a \$-separated string "T"
+* From "T", a suffix array, SA[T], and a hash table, h, are constructed (in the indexing step).
+* The mapping operation begins with a k-mer (here, k = 3)
+* From left to right, the read is scanned until a k-mer appears in the hash table.
+* The hash table returns the SA interval of all suffixes that start with that k-mer
+* The maximal matching prefix (MMP) is determined by finding the longest stretch of the read, starting at that k-mer, that exactly matches a reference suffix in the interval
+* This process is repeated until the end of the read
+* The final mapping is generated by determining the transcripts that appear in all MMPs for the read
+
+</div>
+
+\
+
+Avi Srivastava *et al.* (2016) Bioinformatics 2016 Jun 15;32(12)
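
The steps on the quasi-mapping slide can be sketched in Python as follows. This is a simplified illustration, not the RapMap or Salmon code: the names (`build_index`, `match_length`, `quasi_map`), the three example transcripts, and the reads are hypothetical, the suffix array is built by naive sorting, and after each MMP the sketch simply jumps to the end of the match instead of using the skipping rule of the published method.

```python
from bisect import bisect_right


def build_index(transcripts, k):
    """Build the '$'-separated text, its suffix array, and a k-mer hash table."""
    names = list(transcripts)
    text = "$".join(transcripts[n] for n in names) + "$"
    starts, offset = [], 0
    for n in names:
        starts.append(offset)               # where each transcript begins in text
        offset += len(transcripts[n]) + 1   # +1 for the '$' separator
    sa = sorted(range(len(text)), key=lambda i: text[i:])  # naive suffix array

    # Hash table: k-mer -> [lo, hi) range of SA rows whose suffix starts with it.
    # Rows sharing a k-mer prefix are contiguous because the SA is sorted.
    kmer_interval = {}
    for row, pos in enumerate(sa):
        kmer = text[pos:pos + k]
        if len(kmer) == k and "$" not in kmer:
            lo, _ = kmer_interval.get(kmer, (row, row))
            kmer_interval[kmer] = (lo, row + 1)
    return text, sa, names, starts, kmer_interval


def match_length(text, pos, read, i):
    """Length of the exact match between text[pos:] and read[i:], stopping at '$'."""
    n = 0
    while (i + n < len(read) and pos + n < len(text)
           and text[pos + n] != "$" and text[pos + n] == read[i + n]):
        n += 1
    return n


def quasi_map(read, index, k):
    """Return the set of transcripts that appear in every MMP of the read."""
    text, sa, names, starts, kmer_interval = index
    hits = []  # one set of transcript names per MMP
    i = 0
    while i + k <= len(read):
        interval = kmer_interval.get(read[i:i + k])
        if interval is None:
            i += 1  # k-mer not in the index: slide one base to the right
            continue
        lo, hi = interval
        lengths = [match_length(text, sa[row], read, i) for row in range(lo, hi)]
        mmp = max(lengths)  # longest exact match among suffixes in the interval
        hits.append({names[bisect_right(starts, sa[row]) - 1]
                     for row, n in zip(range(lo, hi), lengths) if n == mmp})
        i += mmp  # simplification: continue scanning after the matched stretch
    return set.intersection(*hits) if hits else set()


# Hypothetical toy transcriptome and reads, with k = 3 as on the slide.
transcripts = {"t1": "ACGTACGT", "t2": "ACGTTTGA", "t3": "GGGTTTGA"}
k = 3
index = build_index(transcripts, k)
for read in ["CGTTTG", "TTTGA", "AAAA"]:
    print(read, sorted(quasi_map(read, index, k)))
```

In this toy data the first read is compatible only with `t2`, the second with both `t2` and `t3` (a multi-mapping read), and the third with nothing, because none of its k-mers occur in the index.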
+
+
+## Abundance estimation
+
+* With the quasi-mapping method, the best mapping is determined for each read
+* After modeling sample-specific parameters and biases, Salmon will generate transcript abundance estimates
+* A read that maps equally to more than one transcript will have its count divided among them (isoform information is not lost)
+* A variety of complex modeling approaches are used to estimate transcript abundances, including Expectation Maximization (EM); the model also corrects for sample-specific biases:
+    * GC bias
+    * Positional bias
+    * Fragment length bias
+    * Sequence-based bias
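
To make the idea of dividing multi-mapping reads concrete, here is a minimal Expectation Maximization sketch in Python. It is an assumption-laden toy rather than Salmon's inference procedure: the function name `em_abundance`, the transcripts `A` and `B`, their effective lengths, and the read counts are invented, and none of the bias corrections listed above are modeled.

```python
def em_abundance(read_compat, eff_len, n_iter=100):
    """Estimate per-transcript read counts with a basic EM loop.

    read_compat: list of sets, the transcripts each read is compatible with.
    eff_len:     dict of transcript name -> effective length.
    """
    tx = list(eff_len)
    theta = {t: 1.0 / len(tx) for t in tx}  # start from uniform read fractions
    counts = {t: 0.0 for t in tx}
    for _ in range(n_iter):
        counts = {t: 0.0 for t in tx}
        # E-step: split each read among its compatible transcripts in
        # proportion to current abundance divided by effective length.
        for compat in read_compat:
            if not compat:
                continue  # unmapped read contributes nothing
            weights = {t: theta[t] / eff_len[t] for t in compat}
            total = sum(weights.values())
            for t, w in weights.items():
                counts[t] += w / total
        # M-step: re-estimate the read fractions from the expected counts.
        n = sum(counts.values())
        if n == 0:
            break
        theta = {t: counts[t] / n for t in tx}
    return counts


# Hypothetical data: 6 reads unique to A, 2 unique to B, 4 shared by both.
reads = [{"A"}] * 6 + [{"B"}] * 2 + [{"A", "B"}] * 4
counts = em_abundance(reads, {"A": 100.0, "B": 100.0})
print(counts)
```

With equal effective lengths, the four shared reads are not split evenly: the unique reads make `A` look three times as abundant as `B`, so the estimates settle at about 9 reads for `A` and 3 for `B`.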
+
+
 ## Practical
 
+
 1. Create an index to the transcriptome with Salmon
 2. Quantify transcript expression using Salmon

Markdowns/03_Quantification_with_Salmon_introduction.html

Lines changed: 76 additions & 1 deletion
Large diffs are not rendered by default.
Binary file not shown.

images/aln_quant_overview.png

55.4 KB

images/quasi-mapping_overview.png

61.5 KB
Loading
