This pipeline facilitates the analysis of bulk RNA-seq data on UCloud HPC, using Ubuntu-Terminal with the Slurm workload manager. It is designed to:
- perform quality control (QC) on paired-end FASTQ files.
- align paired-end FASTQ files using STAR and, optionally, Salmon.
- process BAM files: it performs mapping QC and data preprocessing steps to prepare the data for downstream analysis (read counting, bigWig coverage files, etc.).
- organize files by category.
Supported genomes: hg38, mm10, mm39.
Quality control of FASTQ files: runs FastQC, Fastq_Screen and AdapterRemoval2.
Alignment of RNA-seq data: uses STAR (and optionally Salmon).
Post-alignment of BAM files: read counting is done by featureCounts.
- Clone the repository and copy the script to your Scripts folder:
git clone <repository-url>
cd <repository-directory>
Modify SLURM Parameters (Optional): Open a script and modify the SLURM parameters at the beginning of the file, such as account, output file, email notifications, nodes, memory, CPU cores, and runtime. Alternatively, you can override these parameters on the command line when submitting the script.
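Options given on the `sbatch` command line take precedence over the matching `#SBATCH` directives inside a script, so a one-off change does not require editing the file. The following is a minimal sketch, assuming a hypothetical script name `pipeline.sh`, a made-up input file, and illustrative resource values (none of these come from the repository). The `submit` wrapper only prints the command so that it can be inspected first.

```shell
# Dry-run wrapper: prints the sbatch command instead of running it.
# To submit for real, change the body to:  sbatch "$@"
submit() { echo sbatch "$@"; }

# Hypothetical names and values (assumptions): adjust to your setup.
script="path_to/Scripts_folder/pipeline.sh"
input="/path/to/sample_R1.fastq.gz"
cpus=16
mem="60G"
walltime="12:00:00"

# Standard sbatch options; each one overrides the corresponding #SBATCH line.
submit --job-name=rnaseq_test \
       --cpus-per-task="$cpus" \
       --mem="$mem" \
       --time="$walltime" \
       "$script" "$input"
```

Only the options that are passed are overridden; everything else keeps the value set at the top of the script.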
On UCloud, start a Terminal Ubuntu run:
- Enable Slurm cluster
- To process several samples, consider requesting more than one node.
- Set the modules path to FGM > Utilities > App > easybuild
- Include the References folder FGM > References > References
- Include your Scripts folder and the folder with the fastq.gz/bam files.
- Match the job CPUs to the amounts requested in the script.
- If you modify the memory parameter in the script, specify 5-10% less than the memory available in the terminal run.
- Although it is not necessary to enable tmux, it is good practice to do so.
- The configuration file of Fastq_Screen is also located in the /References folder.
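The memory guideline in the list above (5-10% below what the terminal run offers) can be worked out with shell arithmetic. A small sketch, assuming a hypothetical machine with 384 GB of memory and a 10% margin; substitute the figure shown for your own run:

```shell
# Memory available in the terminal run, in GB (illustrative value, an assumption).
avail_gb=384
# Safety margin in percent; the guideline suggests 5 to 10.
margin_pct=10

# Integer arithmetic rounds down, which errs on the safe side.
mem_gb=$(( avail_gb * (100 - margin_pct) / 100 ))

# Print the value in the form accepted by sbatch's --mem option.
echo "--mem=${mem_gb}G"   # prints --mem=345G for the values above
```

The printed value can be used for the memory parameter in the script or passed to `sbatch` at submission time.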
Run the Script: Submit the script to the SLURM cluster:
sbatch -J <job_name> path_to/Scripts_folder/ <input-file>
Replace <input-file> with the full path to your file.
For several samples you can use a for loop:
for i in *<file-pattern>; do sbatch -J <job_name> path_to/Scripts_folder/ "$i"; sleep 1; done
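A slightly more defensive variant of the loop above is sketched here. It skips the submission when the pattern matches no files and names each job after its input file. The script name `pipeline.sh` and the `_R1.fastq.gz` suffix are hypothetical placeholders, and the `submit` wrapper only prints the commands.

```shell
# Dry-run wrapper: prints the command; change the body to  sbatch "$@"  to submit.
submit() { echo sbatch "$@"; }

# Hypothetical script name (assumption, not taken from the repository).
script="path_to/Scripts_folder/pipeline.sh"

# Usage: submit_all <directory> <suffix>
submit_all() {
  dir=$1
  suffix=$2
  for f in "$dir"/*"$suffix"; do
    # If nothing matches, the pattern stays literal; skip it instead of submitting.
    [ -e "$f" ] || continue
    # Name each job after its input file so it is easy to spot in the queue.
    name=$(basename "$f" "$suffix")
    submit -J "$name" "$script" "$f"
    sleep 1   # short pause between submissions, as in the loop above
  done
}

# Submit one job per matching file in the current directory (full paths).
submit_all "$PWD" "_R1.fastq.gz"
```

Passing `"$PWD"` as the directory keeps the input paths absolute, in line with the requirement to give the full path to each file.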
Monitor Job: You can monitor the job with SLURM commands such as squeue and scontrol show job followed by the job ID, and by checking the log files generated.
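The monitoring commands can be combined into a short routine. This is a sketch, assuming a made-up job ID and SLURM's default log name `slurm-<jobid>.out`; the output file set in the script may be different. The SLURM calls are skipped on machines where the tools are not installed.

```shell
# sbatch reports a submission as "Submitted batch job <id>"; keep only the ID.
job_id_from() { printf '%s\n' "$1" | awk '{print $NF}'; }

# Made-up submission message, for illustration only.
job_id=$(job_id_from "Submitted batch job 123456")

if command -v squeue >/dev/null 2>&1; then
  squeue -u "$USER"             # all of your queued and running jobs
  scontrol show job "$job_id"   # full details for a single job
fi

# Default SLURM log name (assumption: the script does not set its own output file).
log="slurm-${job_id}.out"
if [ -f "$log" ]; then
  tail -n 20 "$log"             # last lines of the job log
fi
```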
Notes: Test data are available on UCloud in the FGM project (Utilities/Example_data/bulkRNA/Fastq).