Update salmon slides
chilamakuricsreddy committed Mar 14, 2024
1 parent 913411c commit ff87193
Showing 5 changed files with 140 additions and 4 deletions.
67 changes: 64 additions & 3 deletions Markdowns/03_Quantification_with_Salmon_introduction.Rmd
@@ -2,12 +2,12 @@
title: "Alignment and Quantification of Gene Expression with Salmon"
date: "March 2023"
output:
beamer_presentation: default
ioslides_presentation:
css: css/stylesheet.css
logo: images/CRUK_Cambridge_Institute.png
smaller: yes
widescreen: yes
beamer_presentation: default
bibliography: ref.bib
---

@@ -17,6 +17,9 @@ bibliography: ref.bib

<img src="images/workflow_3Day.svg" class="centerimg" style="width: 80%; margin-top: 60px;">




## Traditional Alignment

AIM: Given a reference sequence and a set of short reads, align each read to
@@ -45,6 +48,14 @@ Aligners: STAR, HISAT2

<img src="images/quasi_mapping_2.svg" class="centerimg" style="width: 90%; margin-top: 40px;">


## Alignment and Quantification overview {#less_space_after_title}

<div style="line-height: 10%;"><br></div>

<img src="images/aln_quant_overview.png" class="centerimg" style="width: 48%; margin-top: 60px;">


## Alignment
* Traditional alignment performs base-by-base alignment
* Traditional alignment is (relatively) slow and computationally intensive
@@ -111,15 +122,65 @@ Salmon also takes account of biases:

* Because Salmon maps reads to the transcriptome, not the genome, it is not the right tool for finding new genes or isoforms

## Salmon workflow

<img src="images/Salmon_workflow_2.png" class="centerimg" style="width: 55%;">

## Salmon workflow
* Salmon essential steps
1. Salmon indexing
2. Quasi-mapping and abundance quantification
<img src="images/Salmon_workflow_2.png" class="centerimg" style="width: 40%;">

<div style="text-align: right">
Patro *et al.* (2017) Nature Methods doi:10.1038/nmeth.4197
</div>


## Salmon: Salmon indexing

* Two essential steps
    1. Create transcriptome index
        * This makes the downstream quasi-mapping and quantification steps faster and more efficient
        * Once you have created an index, you can reuse it for every sample
        * Salmon indexing has two components
            * A suffix array (SA) of the reference transcriptome is created
            * Each k-mer in the reference transcriptome is mapped to its location (interval) in the SA using a hash table
    2. Quasi-mapping and quantification
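
The two index components above can be illustrated with a toy Python sketch. It is not Salmon's implementation: the function name `build_index`, the naive sorting-based suffix array and the tiny `k` are assumptions chosen for readability.

```python
def build_index(transcripts, k=3):
    """Toy index: "$"-separated text, suffix array, and k-mer -> SA interval table.

    `transcripts` is a dict of name -> sequence (hypothetical input format).
    """
    # Concatenate the transcripts into one "$"-separated string T.
    text = "$".join(transcripts.values()) + "$"
    # Naive suffix array: start positions ordered by the suffix beginning there.
    # Fine for a toy example; real tools use far more efficient construction.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # Hash table: k-mer -> half-open interval [lo, hi) of SA ranks whose
    # suffixes start with that k-mer (such suffixes are contiguous in the SA).
    table = {}
    for rank, pos in enumerate(sa):
        kmer = text[pos:pos + k]
        if len(kmer) < k or "$" in kmer:
            continue  # skip k-mers that run off the end or span two transcripts
        lo, _ = table.get(kmer, (rank, rank))
        table[kmer] = (lo, rank + 1)
    return text, sa, table


text, sa, table = build_index({"t1": "ACGT", "t2": "CGTA"})
lo, hi = table["CGT"]
print(sorted(sa[lo:hi]))  # text positions of every suffix starting with "CGT"
```

Because the index depends only on the reference, it can be built once and reused across samples, which is the point made in the list above.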


## Salmon: Quasi-mapping
<div class="columns-2">
<img src="images/quasi-mapping_overview.png" class="centerimg" style="width: 100%; height: 100%">

* The transcriptome (consisting of transcripts $t_1,\dots,t_6$) is converted into a \$-separated string "T"
* From "T", a suffix array, SA[T], and a hash table, h, are constructed (in the indexing step)
* The mapping operation begins with a k-mer (here, k = 3)
* From left to right, the read is scanned until a k-mer appears in the hash table
* The hash table lookup returns the SA interval containing all suffixes that start with that k-mer
* The maximal matching prefix (MMP) is determined by finding the longest stretch of the read that exactly matches a reference suffix in that interval
* This process is repeated until the end of the read
* The final mapping is generated by determining the transcripts that appear in all MMPs for the read

</div>

\

Srivastava *et al.* (2016) Bioinformatics 32(12)
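
The quasi-mapping steps on this slide can be sketched in Python as follows. This is a simplified illustration, not the RapMap or Salmon code: the names `build_index`, `match_len` and `quasi_map` are hypothetical, the suffix array is built naively, and after each MMP the sketch simply jumps past the matched stretch instead of computing the next informative position.

```python
from bisect import bisect_right


def build_index(transcripts, k):
    """Toy index over a "$"-separated transcriptome string (illustrative only)."""
    names = list(transcripts)
    text = "$".join(transcripts[n] for n in names) + "$"
    starts, offset = [], 0  # starts[i] = offset of transcript names[i] in text
    for n in names:
        starts.append(offset)
        offset += len(transcripts[n]) + 1
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    table = {}  # k-mer -> half-open interval [lo, hi) of SA ranks
    for rank, pos in enumerate(sa):
        kmer = text[pos:pos + k]
        if len(kmer) == k and "$" not in kmer:
            lo, _ = table.get(kmer, (rank, rank))
            table[kmer] = (lo, rank + 1)
    return names, starts, text, sa, table


def match_len(read, i, text, pos):
    """Length of the exact match between read[i:] and text[pos:]."""
    n = 0
    while i + n < len(read) and pos + n < len(text) and read[i + n] == text[pos + n]:
        n += 1
    return n


def quasi_map(read, index, k=3):
    """Return the transcripts that appear in every MMP found for the read."""
    names, starts, text, sa, table = index
    hits, i = [], 0
    while i + k <= len(read):
        interval = table.get(read[i:i + k])
        if interval is None:
            i += 1  # scan right until a k-mer is present in the hash table
            continue
        ranks = range(*interval)
        lengths = [match_len(read, i, text, sa[r]) for r in ranks]
        mmp = max(lengths)  # maximal matching prefix length (always >= k)
        # Transcripts whose suffixes achieve the MMP for this part of the read.
        hits.append({names[bisect_right(starts, sa[r]) - 1]
                     for r, n in zip(ranks, lengths) if n == mmp})
        i += mmp  # simplification: continue right after the matched stretch
    return set.intersection(*hits) if hits else set()


index = build_index({"t1": "ACGTACGGA", "t2": "TTACGGACC", "t3": "GGGGCCCC"}, k=3)
print(quasi_map("ACGGAC", index))  # transcripts compatible with the read
```

Note that the result is a set of compatible transcripts rather than a base-by-base alignment, which is what makes the approach fast.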


## Abundance estimation

* With the quasi-mapping method, the best mapping is determined for each read
* After modeling sample-specific parameters and biases, salmon will generate transcript abundance estimates
* A read that maps equally to more than one transcript will have its count divided among them (isoform information is not lost)
* A variety of complex modelling approaches are used to estimate transcript abundances, including Expectation Maximization (EM), together with models that correct for sample-specific biases:
    * GC bias
    * Positional bias
    * Fragment length bias
    * Sequence-based bias
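
How EM divides multi-mapping reads can be shown with a minimal Python sketch. It assumes a deliberately simplified model: each read is a list of compatible transcripts, every transcript has a known effective length, and no bias correction is applied. The function name `em_abundance` and the toy inputs are hypothetical, and Salmon's actual inference is considerably more elaborate.

```python
def em_abundance(compat, eff_len, n_iter=200):
    """Estimate expected read counts per transcript with a basic EM loop.

    compat  : list of reads, each a list of transcript names the read maps to
    eff_len : dict of transcript name -> effective length
    """
    names = list(eff_len)
    # Start from uniform relative abundances.
    theta = {t: 1.0 / len(names) for t in names}
    counts = dict.fromkeys(names, 0.0)
    for _ in range(n_iter):
        counts = dict.fromkeys(names, 0.0)
        # E-step: split each read among its compatible transcripts in
        # proportion to current abundance divided by effective length.
        for ts in compat:
            if not ts:
                continue  # unmapped reads contribute nothing
            weights = {t: theta[t] / eff_len[t] for t in ts}
            total = sum(weights.values())
            for t in ts:
                counts[t] += weights[t] / total
        # M-step: abundances become the normalised expected counts.
        n = sum(counts.values())
        theta = {t: counts[t] / n for t in names}
    return counts


# Toy data: 6 reads unique to A, 2 unique to B, 4 compatible with both.
reads = [["A"]] * 6 + [["B"]] * 2 + [["A", "B"]] * 4
print(em_abundance(reads, {"A": 1000, "B": 1000}))
```

In this toy case the ambiguous reads are shared in favour of transcript A, because the uniquely mapping reads indicate that A is the more abundant transcript.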


## Practical


1. Create an index of the transcriptome with Salmon
2. Quantify transcript expression using Salmon
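
As a rough guide to the two practical steps, the Python sketch below only assembles the corresponding `salmon index` and `salmon quant` command lines. All file and directory names, the k-mer size and the thread count are placeholder assumptions rather than the course's actual values, and the helper names are hypothetical.

```python
import shlex


def salmon_index_cmd(transcripts_fa, index_dir, k=31):
    """Command to build a Salmon index from a transcriptome FASTA file."""
    return ["salmon", "index", "-t", transcripts_fa, "-i", index_dir, "-k", str(k)]


def salmon_quant_cmd(index_dir, reads_1, reads_2, out_dir, threads=4):
    """Command to quantify one paired-end sample against an existing index."""
    return ["salmon", "quant",
            "-i", index_dir,
            "-l", "A",            # let Salmon infer the library type
            "-1", reads_1, "-2", reads_2,
            "-o", out_dir,
            "-p", str(threads),
            "--gcBias"]           # enable GC bias correction


# Placeholder paths (assumptions for illustration only).
index_cmd = salmon_index_cmd("transcripts.fa", "salmon_index")
quant_cmd = salmon_quant_cmd("salmon_index", "sample_1.fastq.gz",
                             "sample_2.fastq.gz", "salmon_output/sample")
print(shlex.join(index_cmd))
print(shlex.join(quant_cmd))
```

To actually run a step, a list like `index_cmd` could be passed to `subprocess.run(index_cmd, check=True)` on a machine where Salmon is installed.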
77 changes: 76 additions & 1 deletion Markdowns/03_Quantification_with_Salmon_introduction.html

Large diffs are not rendered by default.

Binary file modified Markdowns/03_Quantification_with_Salmon_introduction.pdf
Binary file not shown.
Binary file added images/aln_quant_overview.png
Binary file added images/quasi-mapping_overview.png
