
Lab 01: QC

Ryan edited this page Aug 4, 2023 · 2 revisions

Software

FastQC Website

MultiQC Website

Quality assessment of read files

All raw data will be located in /pickett_shared/teaching/RNASeq_workshop/raw_data/reads.

To confirm, run:

ls /pickett_shared/teaching/RNASeq_workshop/raw_data/reads

You should see 8 files. Does that mean we have 8 samples?

Symbolic links

These files are big, and copying each one would use up too much disk space on our system. Rather than copying files to your directory, I recommend creating a symbolic link.

Navigate to /pickett_shared/teaching/RNASeq_workshop/analysis, and create a directory with your UTK user name; this is where you will store your output files.

mkdir <your_username>
cd <your_username>

Within this directory, create a sub-directory to hold the first step of our analysis:

mkdir 01_fastqc
cd 01_fastqc

Now, run the command:

ln -s /pickett_shared/teaching/RNASeq_workshop/raw_data/reads/SRR17062759_1.fastq.gz ./

This creates a symbolic link to the file. Rather than duplicating the data, the command creates a small file of a different type that points to the original file.
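To see what a symbolic link looks like without touching the shared data, here is a minimal sketch that runs in a throwaway directory. The directory layout and the file name sample_1.fastq.gz are hypothetical stand-ins, not part of the workshop data.

```shell
# Sketch: build a small scratch area (hypothetical paths) to inspect a symlink.
scratch=$(mktemp -d)
mkdir "$scratch/raw" "$scratch/work"

# Stand-in for a read file; this is plain text, not a real gzip archive.
printf 'placeholder read data\n' > "$scratch/raw/sample_1.fastq.gz"

# Link the "raw" file into the working directory, as in the step above.
ln -s "$scratch/raw/sample_1.fastq.gz" "$scratch/work/"

# -L is true when the path is a symbolic link; readlink prints its target.
link="$scratch/work/sample_1.fastq.gz"
if [ -L "$link" ]; then
    link_target=$(readlink "$link")
    echo "symlink -> $link_target"
fi

# Reading through the link returns the original file's content.
link_content=$(cat "$link")
echo "$link_content"

# When finished, the scratch area can be removed with: rm -rf "$scratch"
```

Running ls -l in your own 01_fastqc directory should show the same kind of arrow from the link name to the original file.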

FastQC is not available by default on Sphinx; load it with the following command:

Test that fastqc loaded properly for you. What message pops up if you just run fastqc? How about fastqc -h?
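If you are unsure whether the tool is reachable at all, one quick check is to ask the shell where it is. This is a minimal sketch; the variable name fastqc_status is only for illustration.

```shell
# Sketch: check whether fastqc is on the PATH before trying to run it.
if command -v fastqc >/dev/null 2>&1; then
    fastqc_status="found"
    # Print the installed version as a quick sanity check.
    fastqc --version
else
    fastqc_status="missing"
    echo "fastqc is not on your PATH; load it first."
fi
echo "fastqc status: $fastqc_status"
```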

To run fastqc on your data, run the following:

mkdir fastqc_output
fastqc -o fastqc_output ./SRR17062759_1.fastq.gz

This creates an HTML report, which cannot be viewed in the terminal. To open it, run the scp command below on your personal computer (not on Sphinx); it copies the report into your current local directory, where you can open it in a web browser.

scp <your_username>@sphinx.ag.utk.edu:/pickett_shared/teaching/RNASeq_workshop/analysis/<your_username>/01_fastqc/fastqc_output/*html .
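If you would rather copy one specific report than rely on the *html wildcard, the report name can be worked out from the input file name. The sketch below assumes FastQC's usual naming (the input name minus .fastq.gz, plus _fastqc.html); check your fastqc_output directory to confirm before relying on it.

```shell
# Sketch: derive the expected FastQC report path from the read file name.
reads="./SRR17062759_1.fastq.gz"

# Drop the directory part, then the .fastq.gz suffix.
base=$(basename "$reads")
stem=${base%.fastq.gz}

# Assumed naming convention: <stem>_fastqc.html inside the output directory.
report="fastqc_output/${stem}_fastqc.html"
echo "$report"
```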

Challenge

We have performed quality assessment on the forward read file for sample SRR17062759. Repeat this for the reverse read file of the same pair.

MultiQC

Once you have both FastQC HTML files, we can run MultiQC to aggregate our results. Load it with the following command:

In the same directory where you ran FastQC, run the following command:

multiqc ./fastqc_output

What is the importance of the . in this command?

Once it has finished running, you will have a file named multiqc_report.html in your 01_fastqc directory. MultiQC uses this file name by default for every run.
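Because the default name is the same every time, reports from different analyses can be hard to tell apart. A minimal sketch, assuming MultiQC's --filename option (short form -n) and a hypothetical report name; the guard simply skips the run if multiqc is not loaded or the fastqc_output directory is missing.

```shell
# Sketch: give the MultiQC report a descriptive name instead of the default.
report_name="SRR17062759_multiqc.html"   # hypothetical name, choose your own

if command -v multiqc >/dev/null 2>&1 && [ -d ./fastqc_output ]; then
    # --filename sets the report file name for this run.
    multiqc --filename "$report_name" ./fastqc_output
else
    echo "multiqc or ./fastqc_output not available; nothing was run."
fi
```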
