Lab 01: QC
All raw data will be located in /pickett_shared/teaching/RNASeq_workshop/raw_data/reads.
To confirm, run:
ls /pickett_shared/teaching/RNASeq_workshop/raw_data/reads
You should see 8 files. Does that mean we have 8 samples?
These files are large, and copying each one would use too much disk space on our system. Rather than copying files to your directory, I recommend creating a symbolic link.
Navigate to /pickett_shared/teaching/RNASeq_workshop/analysis, and create a directory named after your UTK user name; this is where you will store your output files.
mkdir <your_username>
cd <your_username>
Within this directory, create a sub-directory to hold the first step of our analysis:
mkdir 01_fastqc
cd 01_fastqc
Now, run the command:
ln -s /pickett_shared/teaching/RNASeq_workshop/raw_data/reads/SRR17062759_1.fastq.gz ./
This creates a symbolic link to the file. Rather than duplicating the data, the command creates a small file of a different type that points to the original file.
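To see what a symbolic link looks like, the following minimal sketch repeats the idea behind the ln -s command above in a throwaway directory. The file and directory names here (source_data, work, example_1.fastq.gz) are made up for illustration and are not part of the workshop data.

```shell
# Work in a temporary directory so nothing real is touched.
demo_dir=$(mktemp -d)
cd "$demo_dir"

# A small stand-in for a large read file (hypothetical name).
mkdir source_data work
printf 'fake reads\n' > source_data/example_1.fastq.gz

# Link the file into the working directory instead of copying it.
ln -s "$demo_dir/source_data/example_1.fastq.gz" work/

# ls -l marks the link with an arrow pointing at its target.
ls -l work/

# readlink prints the path the link points to.
link_target=$(readlink work/example_1.fastq.gz)
echo "$link_target"

# Reading through the link returns the contents of the original file.
link_contents=$(cat work/example_1.fastq.gz)
echo "$link_contents"
```

Running ls -l in your own 01_fastqc directory should show the same kind of arrow, pointing back to the file in raw_data/reads.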
FastQC is not available by default on Sphinx; load it with the following command:
spack load [email protected]%[email protected]
Test that fastqc loaded properly for you. What message pops up if you just run fastqc? How about fastqc -h?
To run fastqc on your data, run the following:
mkdir fastqc_output
fastqc -o fastqc_output ./SRR17062759_1.fastq.gz
This creates an HTML file, which cannot be viewed in the terminal. Using the scp command, copy this file to your personal computer and open it there. Run scp from a terminal on your personal computer, not from your session on Sphinx:
scp <your_username>@sphinx.ag.utk.edu:/pickett_shared/teaching/RNASeq_workshop/analysis/<your_username>/01_fastqc/fastqc_output/*html .
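The *html pattern in the scp command matches whatever reports are in fastqc_output. As far as I know, FastQC names each report after its input file, with the FASTQ extensions replaced by _fastqc.html (plus a matching .zip archive). The helper below is a hypothetical convenience function, not part of FastQC, that sketches that naming rule so you know which file to look for.

```shell
# Print the report names FastQC is expected to write for one input file.
# fastqc_report_names is an illustrative helper, not a FastQC command.
fastqc_report_names() {
    base=$(basename "$1")
    # Strip the compression and FASTQ extensions to get the file stem.
    stem=${base%.gz}
    stem=${stem%.fastq}
    stem=${stem%.fq}
    echo "${stem}_fastqc.html"
    echo "${stem}_fastqc.zip"
}

fastqc_report_names ./SRR17062759_1.fastq.gz
# prints:
# SRR17062759_1_fastqc.html
# SRR17062759_1_fastqc.zip
```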
We have now performed quality assessment on the forward read file for sample SRR17062759. Repeat these steps for the reverse read file of the pair.
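One way to avoid retyping the commands for each read file is a small loop. The sketch below assumes the reverse read file follows the same naming pattern with _2 in place of _1 (check the ls output to confirm). plan_fastqc is a hypothetical helper that only prints the commands, so you can inspect them before running anything.

```shell
# Print the link and FastQC commands for both mates of one paired-end sample.
# This is a dry run: nothing is executed, the commands are only printed.
plan_fastqc() {
    sample=$1
    reads_dir=$2
    for mate in 1 2; do
        # Assumed naming pattern: <sample>_1.fastq.gz and <sample>_2.fastq.gz
        file="${sample}_${mate}.fastq.gz"
        # -f replaces a link of the same name if one already exists.
        echo "ln -sf ${reads_dir}/${file} ./"
        echo "fastqc -o fastqc_output ./${file}"
    done
}

plan_fastqc SRR17062759 /pickett_shared/teaching/RNASeq_workshop/raw_data/reads
```

If the printed commands look right, you can copy and paste them into your 01_fastqc directory one at a time.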
Once you have both FastQC HTML files, we can run MultiQC to aggregate our results. Load it with the following command:
spack load [email protected]%[email protected]
In the same directory where you ran FastQC, run the following command:
multiqc ./fastqc_output
What is the importance of the . in this command?
Once it has finished running, you will have a file named multiqc_report.html in your 01_fastqc directory. This is the default report name for every MultiQC run.