diff --git a/Markdowns/01_Introduction_to_RNAseq_Methods.Rmd b/Markdowns/01_Introduction_to_RNAseq_Methods.Rmd index 2ba9a74..992d5d9 100644 --- a/Markdowns/01_Introduction_to_RNAseq_Methods.Rmd +++ b/Markdowns/01_Introduction_to_RNAseq_Methods.Rmd @@ -1,6 +1,6 @@ --- title: "Introduction to RNAseq Methods" -date: "March 2023" +date: "October 2024" output: ioslides_presentation: css: css/stylesheet.css @@ -93,7 +93,7 @@ output:
-**Experimental Design** +**Experimental Design**
@@ -112,61 +112,23 @@ output:
- +

Image adapted from: Wang, Z., et al. (2009), Nature Reviews Genetics, 10, 57–63.

- - - - - - - - - - -## Designing the right experiment - - - - - - - - - -### A good experiment should: - -* Have clear objectives - -* Have sufficient power - -* Be amenable to statisical analysis - -* Be reproducible - -* More on experimental design later - - -## Designing the right experiment - -### Practical considerations for RNAseq - +## Practical considerations for RNAseq + * Coverage: how many reads? - + * Read length & structure: Long or short reads? Paired or Single end? - -* Controlling for batch effects - + * Library preparation method: Poly-A, Ribominus, other? - -## Designing the right experiment - How many reads do we need? - - + +## How many reads do we need? +

The coverage is defined as: @@ -176,191 +138,24 @@ $\frac{Read\,Length\;\times\;Number\,of\,Reads}{Length\,of\,Target\,Sequence}$
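As a rough illustration of the coverage formula above, the sketch below computes coverage from read length, read count and target length, and inverts the formula to estimate how many reads a given coverage would need. The function names and the example figures (75 bp reads, 20 million reads, a 100 Mb target) are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the slides.

```python
import math


def coverage(read_length, number_of_reads, target_length):
    """Coverage = (read length x number of reads) / length of target sequence."""
    return read_length * number_of_reads / target_length


def reads_needed(target_coverage, read_length, target_length):
    """Invert the coverage formula; round up to a whole number of reads."""
    return math.ceil(target_coverage * target_length / read_length)


# Hypothetical example values, not recommendations.
example_coverage = coverage(read_length=75,
                            number_of_reads=20_000_000,
                            target_length=100_000_000)
example_reads = reads_needed(target_coverage=15,
                             read_length=75,
                             target_length=100_000_000)
print(example_coverage, example_reads)
```

For paired-end data the same arithmetic applies if each mate is counted as a read, which is an assumption of this sketch rather than something the slides state.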

-The amount of sequencing needed for a given sample is determined by the goals of -the experiment and the nature of the RNA sample. - - * For a general view of differential expression: 5–25 million reads per sample * For alternative splicing and lowly expressed genes: 30–60 million reads per sample. * In-depth view of the transcriptome/assemble new transcripts: 100–200 million reads * Targeted RNA expression requires fewer reads. * miRNA-Seq or Small RNA Analysis require even fewer reads. - + ## Designing the right experiment - Read length -### Long or short reads? Paired or Single end? +Long or short reads? Paired or Single end? The answer depends on the experiment: * Gene expression – typically just a short read e.g. 50/75 bp; SE or PE. * kmer-based quantification of Gene Expression (Salmon etc.) - benefits from PE. * Transcriptome Analysis – longer paired-end reads (such as 2 x 75 bp). -* Small RNA Analysis – short single read, e.f. SE50 - will need trimming. - - - - -## Designing the right experiment - Replication - -### Biological Replication - -* Measures the biological variations between individuals - -* Accounts for sampling bias - -### Technical Replication - -* Measures the variation in response quantification due to imprecision in the -technique - -* Accounts for technical noise - - -## Designing the right experiment - Replication - -### Biological Replication - -
-Each replicate is from an indepent biological individual - -* *In Vivo*: - - * Patients - * Mice - -* *In Vitro*: - - * Different cell lines - * Different passages +* Small RNA Analysis – short single read, e.g. SE50 - will need trimming. + -
- -
- -
- -## Designing the right experiment - Replication - -### Technical Replication - - -Replicates are from the same individual but processed separately - -* Experimental protocol -* Measurement platform - - -
- -
- -## Designing the right experiment - Batch effects - -* Batch effects are sub-groups of measurements that have qualitatively different behavior across conditions and are unrelated to the biological or scientific variables in a study. - -* Batch effects are problematic if they are confounded with the experimental variable. - -## Designing the right experiment - Batch effects - - - -## Designing the right experiment - Batch effects - - - -## Designing the right experiment - Batch effects - -* Batch effects are sub-groups of measurements that have qualitatively different behavior across conditions and are unrelated to the biological or scientific variables in a study. - -* Batch effects are problematic if they are confounded with the experimental variable. - -* Batch effects that are randomly distributed across experimental variables can be controlled for. - -## Designing the right experiment - Batch effects - -* Batch effects are sub-groups of measurements that have qualitatively different behavior across conditions and are unrelated to the biological or scientific variables in a study. - -* Batch effects are problematic if they are confounded with the experimental variable. - -* Batch effects that are randomly distributed across experimental variables can be controlled for. - -* Randomise all technical steps in data generation in order to avoid batch effects. - - - -## Designing the right experiment - Batch effects - -* Batch effects are sub-groups of measurements that have qualitatively different behavior across conditions and are unrelated to the biological or scientific variables in a study. - -* Batch effects are problematic if they are confounded with the experimental variable. - -* Batch effects that are randomly distributed across experimental variables can be controlled for. - -* Randomise all technical steps in data generation in order to avoid batch effects. 
- - - -## Designing the right experiment - Batch effects - -* Batch effects are sub-groups of measurements that have qualitatively different behavior across conditions and are unrelated to the biological or scientific variables in a study. - -* Batch effects are problematic if they are confounded with the experimental variable. - -* Batch effects that are randomly distributed across experimental variables can be controlled for. - -* Randomise all technical steps in data generation in order to avoid batch effects. - - - -## Designing the right experiment - Batch effects - -* Batch effects are sub-groups of measurements that have qualitatively different behavior across conditions and are unrelated to the biological or scientific variables in a study. - -* Batch effects are problematic if they are confounded with the experimental variable. - -* Batch effects that are randomly distributed across experimental variables can be controlled for. - -* Randomise all technical steps in data generation in order to avoid batch effects - -* **Record everything**: Age, sex, litter, cell passage .. - - -## RNAseq Workflow - - -
- - -
-**Experimental Design** -
- -
-**Library Preparation** -
- -
-**Sequencing** -
- -
-**Bioinformatics Analysis** -
-
- -
- - -
- -
-

Image adapted from: Wang, Z., et al. (2009), Nature Reviews Genetics, 10, 57–63.

-
## Library preparation @@ -372,7 +167,7 @@ Replicates are from the same individual but processed separately position: absolute; top: 0px; left: 0px"> - +
- - Ribosomal RNA + - Ribosomal RNA
- - Poly-A transcripts + - Poly-A transcripts
- - Other RNAs e.g. tRNA, miRNA etc. + - Other RNAs e.g. tRNA, miRNA etc.
@@ -406,7 +201,7 @@ Replicates are from the same individual but processed separately
- +
Poly-A transcripts e.g.: @@ -424,7 +219,7 @@ Poly-A transcripts e.g.:
- +
Poly-A transcripts + Other mRNAs e.g.: @@ -435,43 +230,9 @@ Poly-A transcripts + Other mRNAs e.g.:
-## RNAseq Workflow - - -
- - -
-**Experimental Design** -
- -
-**Library Preparation** -
- -
-**Sequencing** -
- -
-**Bioinformatics Analysis** -
-
- -
- - -
- -
-

Image adapted from: Wang, Z., et al. (2009), Nature Reviews Genetics, 10, 57–63.

-
- -## Sequencing by synthesis +## Sequencing by Synthesis -A complimentary strand is synthesized using the cDNA fragment as template. +A complementary strand is synthesized using the cDNA fragment as template. Each nucleotide includes a fluorescent tag and as the new strand is synthesized, the colour of the fluorescence indicates which base is being added. @@ -479,98 +240,26 @@ the colour of the fluorescence indicates which base is being added. The sequencer records the order of these flashes of light and translates them to a base sequence. -[see this animation](https://emea.illumina.com/science/technology/next-generation-sequencing/sequencing-technology.html) - -
- -
- -
- -
- -## Sequencing by synthesis - sequencing errors Sequencing errors cause uncertainty in calling the nucleotide at a given -location. These reductions in confidence would be reflected int he quality +location. These reductions in confidence would be reflected in the quality scores in your fastq output. -
- -If a probe doesn't shine as bright as it should, the sequencer is less confident -in calling that base. - - - - -
- -
- -If there are lots of probes the same colour in the same region the sequencer -finds it harder to identify the individual reads. - - - -
- -## RNAseq Workflow - - -
- - -
-**Experimental Design** -
- -
-**Library Preparation** -
- -
-**Sequencing** -
- -
-**Bioinformatics Analysis** -
+
+
-
- - +
+
-
-

Image adapted from: Wang, Z., et al. (2009), Nature Reviews Genetics, 10, 57–63.

-
## Case Study - + ## Differential Gene Expression Analysis Workflow {#less_space_after_title}

- + diff --git a/Markdowns/01_Introduction_to_RNAseq_Methods.html b/Markdowns/01_Introduction_to_RNAseq_Methods.html index 1c7f7ed..e44e3af 100644 --- a/Markdowns/01_Introduction_to_RNAseq_Methods.html +++ b/Markdowns/01_Introduction_to_RNAseq_Methods.html @@ -3061,20 +3061,23 @@ }); }; -if (document.readyState !== "loading" && - document.querySelector('slides') === null) { - // if the document is done loading but our element hasn't yet appeared, defer - // loading of the deck - window.setTimeout(function() { - loadDeck(null); - }, 0); -} else { - // still loading the DOM, so wait until it's finished - document.addEventListener("DOMContentLoaded", loadDeck); +if (!window.Shiny) { + // If Shiny is loaded, the slide deck is initialized in ioslides template + + if (document.readyState !== "loading" && + document.querySelector('slides') === null) { + // if the document is done loading but our element hasn't yet appeared, defer + // loading of the deck + window.setTimeout(function() { + loadDeck(null); + }, 0); + } else { + // still loading the DOM, so wait until it's finished + document.addEventListener("DOMContentLoaded", loadDeck); + } } - + + + + + + + + + + + + + + + + + + + + + + + +
+

+ +

+

October 2024

+
+
+ +

Differential Gene Expression Analysis Workflow

+ +
+
+ +

+ +

Transformation

+ +

For differential expression analysis we use raw counts, but to visualise and explore the data we use transformed data.

+ +
+
+

+ +
+
    +
  • The range of raw counts is very large
  • +
  • Variance increases with mean gene expression
  • +
+ +
+
+

+ +
+
    +
  • Allows us to more clearly assess differences between sample groups
  • +
+ +

Types of Transformations

+ +
    +
  • Log2
  • +
  • Rlog - Performs a log2 scale transformation in a way that compensates for differences between samples for genes with low read count and also normalizes between samples for library size.
  • +
VST - Variance stabilizing transformation (VST) aims to generate a matrix of values for which the variance is constant across the range of mean values, especially at low means, and accounts for library size.
  • +
+ +

Comparison between the two: https://bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#count-data-transformations
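To make the motivation for the Log2 transformation listed above more concrete, the sketch below compares the variance of a lowly and a highly expressed gene before and after a log2 transformation. The counts are invented for illustration, and the pseudocount of 1 (added so that zero counts can be logged) is an assumption of this sketch, not something specified in the slides. It does not reproduce rlog or VST, which also account for library size.

```python
import math
from statistics import variance


def log2_transform(counts, pseudocount=1):
    """log2(count + pseudocount); the pseudocount avoids log2(0)."""
    return [math.log2(c + pseudocount) for c in counts]


# Hypothetical raw counts for two genes across four samples.
low_gene = [3, 5, 4, 8]
high_gene = [3000, 5000, 4000, 8000]

# On the raw scale the variance grows with the mean expression level.
raw_ratio = variance(high_gene) / variance(low_gene)

# On the log2 scale the two genes have variances of similar magnitude.
log_ratio = variance(log2_transform(high_gene)) / variance(log2_transform(low_gene))

print(f"raw variance ratio:  {raw_ratio:.0f}")
print(f"log2 variance ratio: {log_ratio:.2f}")
```

In this toy example the two genes differ only by a constant factor, which is the case where a plain log2 works well. Real low-count genes tend to be noisier on the log scale, which is the situation the Rlog and VST entries above describe.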

+ +

Principal Component Analysis

+ +
    +
  • Unsupervised analysis
  • +
  • If the experiment is well controlled and has worked well, we should find that replicate samples cluster closely, whilst the greatest sources of variation in the data should be between treatments/sample groups
  • +
  • Useful tool for checking for outliers and batch effects
  • +
+ +

+ + + + +
+ + + + + + + + + diff --git a/additional_scripts_and_materials/RNA-seq_stats.pptx b/additional_scripts_and_materials/RNA-seq_stats.pptx index f562ae5..1d1ac10 100644 Binary files a/additional_scripts_and_materials/RNA-seq_stats.pptx and b/additional_scripts_and_materials/RNA-seq_stats.pptx differ diff --git a/images/Illumina_SBS.A.png b/images/01s_Illumina_SBS.1.png similarity index 100% rename from images/Illumina_SBS.A.png rename to images/01s_Illumina_SBS.1.png diff --git a/images/Illumina_SBS.B.png b/images/01s_Illumina_SBS.2.png similarity index 100% rename from images/Illumina_SBS.B.png rename to images/01s_Illumina_SBS.2.png diff --git a/images/RNA_Extraction.svg b/images/01s_RNA_Extraction.svg similarity index 100% rename from images/RNA_Extraction.svg rename to images/01s_RNA_Extraction.svg diff --git a/images/RNAseq_WorkFlow.png b/images/01s_RNAseq_WorkFlow.png similarity index 100% rename from images/RNAseq_WorkFlow.png rename to images/01s_RNAseq_WorkFlow.png diff --git a/images/case_study.png b/images/01s_case_study.png similarity index 100% rename from images/case_study.png rename to images/01s_case_study.png diff --git a/images/OtherRNA.svg b/images/01s_libprep_OtherRNA.svg similarity index 100% rename from images/OtherRNA.svg rename to images/01s_libprep_OtherRNA.svg diff --git a/images/mRNA.svg b/images/01s_libprep_mRNA.svg similarity index 100% rename from images/mRNA.svg rename to images/01s_libprep_mRNA.svg diff --git a/images/rRNA.svg b/images/01s_libprep_rRNA.svg similarity index 100% rename from images/rRNA.svg rename to images/01s_libprep_rRNA.svg diff --git a/images/polyA_selection.svg b/images/01s_polyA_selection.svg similarity index 100% rename from images/polyA_selection.svg rename to images/01s_polyA_selection.svg diff --git a/images/ribominus_selection.svg b/images/01s_ribominus_selection.svg similarity index 100% rename from images/ribominus_selection.svg rename to images/01s_ribominus_selection.svg diff --git 
a/images/workflow_3Day.svg b/images/01s_workflow_3Day.svg similarity index 100% rename from images/workflow_3Day.svg rename to images/01s_workflow_3Day.svg diff --git a/images/FastQC_logo.png b/images/02s_FastQC_logo.png similarity index 100% rename from images/FastQC_logo.png rename to images/02s_FastQC_logo.png diff --git a/images/bad1.png b/images/02s_fastqc_bad1.png similarity index 100% rename from images/bad1.png rename to images/02s_fastqc_bad1.png diff --git a/images/bad2.png b/images/02s_fastqc_bad2.png similarity index 100% rename from images/bad2.png rename to images/02s_fastqc_bad2.png diff --git a/images/bad3.png b/images/02s_fastqc_bad3.png similarity index 100% rename from images/bad3.png rename to images/02s_fastqc_bad3.png diff --git a/images/bad4.png b/images/02s_fastqc_bad4.png similarity index 100% rename from images/bad4.png rename to images/02s_fastqc_bad4.png diff --git a/images/good1.png b/images/02s_fastqc_good1.png similarity index 100% rename from images/good1.png rename to images/02s_fastqc_good1.png diff --git a/images/good2.png b/images/02s_fastqc_good2.png similarity index 100% rename from images/good2.png rename to images/02s_fastqc_good2.png diff --git a/images/good3.png b/images/02s_fastqc_good3.png similarity index 100% rename from images/good3.png rename to images/02s_fastqc_good3.png diff --git a/images/good4.png b/images/02s_fastqc_good4.png similarity index 100% rename from images/good4.png rename to images/02s_fastqc_good4.png diff --git a/images/fq.png b/images/02s_fq.png similarity index 100% rename from images/fq.png rename to images/02s_fq.png diff --git a/images/fq_3rd_line.png b/images/02s_fq_3rd_line.png similarity index 100% rename from images/fq_3rd_line.png rename to images/02s_fq_3rd_line.png diff --git a/images/fq_headers.png b/images/02s_fq_headers.png similarity index 100% rename from images/fq_headers.png rename to images/02s_fq_headers.png diff --git a/images/fq_quality.png b/images/02s_fq_quality.png 
similarity index 100% rename from images/fq_quality.png rename to images/02s_fq_quality.png diff --git a/images/fq_seq.png b/images/02s_fq_seq.png similarity index 100% rename from images/fq_seq.png rename to images/02s_fq_seq.png diff --git a/images/GappedAlignment.svg b/images/03s_GappedAlignment.svg similarity index 100% rename from images/GappedAlignment.svg rename to images/03s_GappedAlignment.svg diff --git a/images/Read_counting_2.svg b/images/03s_Read_counting_2.svg similarity index 100% rename from images/Read_counting_2.svg rename to images/03s_Read_counting_2.svg diff --git a/images/SAM_alignment_1c.png b/images/03s_SAM_alignment_1c.png similarity index 100% rename from images/SAM_alignment_1c.png rename to images/03s_SAM_alignment_1c.png diff --git a/images/SRAlignment.svg b/images/03s_SRAlignment.svg similarity index 100% rename from images/SRAlignment.svg rename to images/03s_SRAlignment.svg diff --git a/images/03s_aln_quant_overview.svg b/images/03s_aln_quant_overview.svg new file mode 100644 index 0000000..3ddcdc8 --- /dev/null +++ b/images/03s_aln_quant_overview.svg @@ -0,0 +1,91 @@ + + + + + + + + + + + + + + + + + + + diff --git a/images/quasi_mapping_1.svg b/images/03s_quasi_mapping_1.svg similarity index 100% rename from images/quasi_mapping_1.svg rename to images/03s_quasi_mapping_1.svg diff --git a/images/quasi_mapping_2.svg b/images/03s_quasi_mapping_2.svg similarity index 100% rename from images/quasi_mapping_2.svg rename to images/03s_quasi_mapping_2.svg diff --git a/images/quasi_mapping_3.svg b/images/03s_quasi_mapping_3.svg similarity index 100% rename from images/quasi_mapping_3.svg rename to images/03s_quasi_mapping_3.svg diff --git a/images/Insert_Size.svg b/images/04s_Insert_Size.svg similarity index 100% rename from images/Insert_Size.svg rename to images/04s_Insert_Size.svg diff --git a/images/Insert_Size_QC.svg b/images/04s_Insert_Size_QC.svg similarity index 100% rename from images/Insert_Size_QC.svg rename to 
images/04s_Insert_Size_QC.svg diff --git a/images/TranscriptCoverage.svg b/images/04s_TranscriptCoverage.svg similarity index 100% rename from images/TranscriptCoverage.svg rename to images/04s_TranscriptCoverage.svg diff --git a/images/05s_PCA.png b/images/05s_PCA.png new file mode 100644 index 0000000..e4d5538 Binary files /dev/null and b/images/05s_PCA.png differ diff --git a/images/05s_log2countsBoxPlot.png b/images/05s_log2countsBoxPlot.png new file mode 100644 index 0000000..42d98df Binary files /dev/null and b/images/05s_log2countsBoxPlot.png differ diff --git a/images/05s_rawCountsBoxPlot.png b/images/05s_rawCountsBoxPlot.png new file mode 100644 index 0000000..559fe1f Binary files /dev/null and b/images/05s_rawCountsBoxPlot.png differ diff --git a/images/IlluminaLibraryPrep1.png b/images/IlluminaLibraryPrep1.png deleted file mode 100644 index 3f56f44..0000000 Binary files a/images/IlluminaLibraryPrep1.png and /dev/null differ diff --git a/images/IlluminaLibraryPrep2.png b/images/IlluminaLibraryPrep2.png deleted file mode 100644 index a3b7e82..0000000 Binary files a/images/IlluminaLibraryPrep2.png and /dev/null differ diff --git a/images/Illumina_SBS.001.png b/images/Illumina_SBS.001.png deleted file mode 100644 index 178fe84..0000000 Binary files a/images/Illumina_SBS.001.png and /dev/null differ diff --git a/images/Illumina_SBS.002.png b/images/Illumina_SBS.002.png deleted file mode 100644 index c3600ba..0000000 Binary files a/images/Illumina_SBS.002.png and /dev/null differ diff --git a/images/Illumina_SBS.003.png b/images/Illumina_SBS.003.png deleted file mode 100644 index a56783b..0000000 Binary files a/images/Illumina_SBS.003.png and /dev/null differ diff --git a/images/Illumina_SBS.004.png b/images/Illumina_SBS.004.png deleted file mode 100644 index 71cd7dd..0000000 Binary files a/images/Illumina_SBS.004.png and /dev/null differ diff --git a/images/Illumina_SBS.005.png b/images/Illumina_SBS.005.png deleted file mode 100644 index 0c2d548..0000000 
Binary files a/images/Illumina_SBS.005.png and /dev/null differ diff --git a/images/Illumina_SBS.006.png b/images/Illumina_SBS.006.png deleted file mode 100644 index c4b9d52..0000000 Binary files a/images/Illumina_SBS.006.png and /dev/null differ diff --git a/images/Illumina_SBS.007.png b/images/Illumina_SBS.007.png deleted file mode 100644 index cb3a5b0..0000000 Binary files a/images/Illumina_SBS.007.png and /dev/null differ diff --git a/images/Illumina_SBS.008.png b/images/Illumina_SBS.008.png deleted file mode 100644 index 7751a16..0000000 Binary files a/images/Illumina_SBS.008.png and /dev/null differ diff --git a/images/Illumina_SBS.009.png b/images/Illumina_SBS.009.png deleted file mode 100644 index eab1514..0000000 Binary files a/images/Illumina_SBS.009.png and /dev/null differ diff --git a/images/Illumina_SBS.010.png b/images/Illumina_SBS.010.png deleted file mode 100644 index d46418f..0000000 Binary files a/images/Illumina_SBS.010.png and /dev/null differ diff --git a/images/Illumina_SBS.011.png b/images/Illumina_SBS.011.png deleted file mode 100644 index a740756..0000000 Binary files a/images/Illumina_SBS.011.png and /dev/null differ diff --git a/images/Illumina_SBS.012.png b/images/Illumina_SBS.012.png deleted file mode 100644 index e14a611..0000000 Binary files a/images/Illumina_SBS.012.png and /dev/null differ diff --git a/images/Illumina_SBS.013.png b/images/Illumina_SBS.013.png deleted file mode 100644 index 5b44c10..0000000 Binary files a/images/Illumina_SBS.013.png and /dev/null differ diff --git a/images/Illumina_SBS.014.png b/images/Illumina_SBS.014.png deleted file mode 100644 index 0bc7f3e..0000000 Binary files a/images/Illumina_SBS.014.png and /dev/null differ diff --git a/images/Illumina_SBS.015.png b/images/Illumina_SBS.015.png deleted file mode 100644 index 4cbd783..0000000 Binary files a/images/Illumina_SBS.015.png and /dev/null differ diff --git a/images/Illumina_SBS.016.png b/images/Illumina_SBS.016.png deleted file mode 100644 index 
65b50db..0000000 Binary files a/images/Illumina_SBS.016.png and /dev/null differ diff --git a/images/Illumina_SBS.017.png b/images/Illumina_SBS.017.png deleted file mode 100644 index 7ae933f..0000000 Binary files a/images/Illumina_SBS.017.png and /dev/null differ diff --git a/images/Illumina_SBS.018.png b/images/Illumina_SBS.018.png deleted file mode 100644 index 341522c..0000000 Binary files a/images/Illumina_SBS.018.png and /dev/null differ diff --git a/images/Illumina_SBS.019.png b/images/Illumina_SBS.019.png deleted file mode 100644 index e157cc1..0000000 Binary files a/images/Illumina_SBS.019.png and /dev/null differ diff --git a/images/quasi-mapping_overview.png b/images/quasi-mapping_overview.png deleted file mode 100644 index 9a67d51..0000000 Binary files a/images/quasi-mapping_overview.png and /dev/null differ