Pipeline for bulk RNA-seq in UCloud using Slurm

Overview of the Pipeline

This pipeline facilitates the analysis of bulk RNA-seq data on UCloud HPC, using the Terminal Ubuntu app with the Slurm workload manager. It consists of four scripts:

pe_fastq_qc.sh: performs quality control (QC) on paired-end FASTQ files.
pe_align_rnaseq_v2_multigenome.sh: aligns paired-end FASTQ files using STAR and, optionally, Salmon.
pe_postalign_RNA-seq_multigenome.sh: processes BAM files. It performs mapping QC and preprocessing steps that prepare the data for downstream analysis (read counting, bigWig coverage files, etc.).
reorganize_files_rnaseq.sh: organizes files by category.

Supported genomes: hg38, mm10, mm39.

Access to the guides

Quality Control of fastq files: runs FastQC, Fastq_Screen and AdapterRemoval2.
Alignment of RNA-seq data using STAR (and optionally Salmon)
Postalignment of BAM files - Read counting: read counting is done by featureCounts.
Organize files by category

General Usage

Clone the repository and copy the scripts to your Scripts folder:

git clone <repository-url> 
cd <repository-directory>

Modify Slurm Parameters (Optional): Open a script (script.sh) and edit the Slurm parameters at the beginning of the file, such as account, output file, email notifications, nodes, memory, CPU cores, and runtime. Alternatively, you can override these parameters on the command line when you submit the script.
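As a sketch of the on-the-fly option: options passed on the sbatch command line take precedence over the #SBATCH lines inside the script. The helper below only prints the command it would run, so it can be checked before submitting. The function name, the job name, the resource values, and the choice of --cpus-per-task are assumptions made for illustration; they are not taken from the pipeline scripts.

```shell
# Print (do not run) an sbatch command that overrides resources at submission time.
# Arguments: job name, CPUs, memory in GB, walltime, script path, input file.
# All values used below are illustrative placeholders.
print_sbatch_cmd() {
    printf 'sbatch -J %s --cpus-per-task=%s --mem=%sG --time=%s %s %s\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

print_sbatch_cmd qc_sample1 8 40 12:00:00 path_to/Scripts_folder/script.sh /path/to/input_file
```

Once the printed command looks right, it can be pasted into the terminal as-is.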
On UCloud, start a Terminal Ubuntu run:
- Enable Slurm cluster.
- To process several samples, consider requesting nodes > 1.
- Set the modules path to FGM > Utilities > App > easybuild.
- Include the References folder FGM > References > References.
- Include your Scripts folder and the folder with the fastq.gz/bam files.
Notes:
- Match the number of CPUs in the job to the number requested in the script.
- If you modify the memory parameter in the script, specify 5-10% less than the memory available in the terminal run.
- Although it is not necessary to enable tmux, it is good practice to always do it.
- The configuration file of Fastq_Screen is also located in the /References folder.
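The memory note above amounts to a small calculation. A minimal sketch, assuming the memory of the terminal run is known in GB; the function name and the 384 GB figure are hypothetical:

```shell
# Suggest a memory value that leaves headroom below the memory of the run.
# $1 = memory available in the terminal run (GB)
# $2 = headroom in percent (optional, defaults to 10)
suggest_mem_gb() {
    avail_gb="$1"
    headroom_pct="${2:-10}"
    # Integer division rounds down, which errs on the safe side.
    echo $(( avail_gb * (100 - headroom_pct) / 100 ))
}

suggest_mem_gb 384      # 10% headroom, prints 345
suggest_mem_gb 384 5    # 5% headroom, prints 364
```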

Run the Script: Submit the script to the SLURM cluster:

sbatch -J <job_name> path_to/Scripts_folder/script.sh <input-file>

Replace input-file with the full path to your file.

For several samples you can use a for loop:

for i in *<file-pattern>; do sbatch -J <job_name> path_to/Scripts_folder/script.sh "$i"; sleep 1; done
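The loop can also be wrapped in a small function that skips unmatched patterns, passes full paths, and offers a dry run. This is a sketch under assumptions: the function name, the DRY_RUN variable, and the idea of naming each job after its file (minus the suffix) are illustrative and not part of the pipeline.

```shell
# Submit one job per file in the current directory whose name ends in a given suffix.
# $1 = file suffix (hypothetical example: _R1.fastq.gz)
# $2 = path to the pipeline script
# With DRY_RUN=1 the sbatch commands are printed instead of executed.
submit_all() {
    suffix="$1"
    script="$2"
    for f in *"$suffix"; do
        [ -e "$f" ] || continue          # the pattern matched no file
        job="${f%"$suffix"}"             # job name = file name without the suffix
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "sbatch -J $job $script $PWD/$f"
        else
            sbatch -J "$job" "$script" "$PWD/$f"
            sleep 1                      # same pause as in the one-line loop
        fi
    done
}

# Example (prints the commands only):
#   DRY_RUN=1 submit_all <file-pattern> path_to/Scripts_folder/script.sh
```

Running with DRY_RUN=1 first makes it easy to confirm that the pattern selects the intended files before any job is queued.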

Monitor Job: You can monitor the job using Slurm commands such as squeue and scontrol show job <job_id>, and by checking the log files generated.
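A minimal sketch of the monitoring step: sbatch reports the new job as "Submitted batch job <id>", so the ID can be captured and passed to scontrol. The helper name is hypothetical, and the log file location depends on the output setting in each script, so it is not shown here.

```shell
# Read sbatch output on stdin and print only the numeric job ID.
job_id_from_sbatch() {
    awk '/^Submitted batch job/ { print $4 }'
}

echo "Submitted batch job 12345" | job_id_from_sbatch   # prints 12345

# On the cluster (not executed in this sketch):
#   jobid=$(sbatch -J <job_name> path_to/Scripts_folder/script.sh <input-file> | job_id_from_sbatch)
#   squeue -u "$USER"             # list your pending and running jobs
#   scontrol show job "$jobid"    # detailed state of a single job
```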

Note: Test data is available on UCloud in the FGM project (Utilities/Example_data/bulkRNA/Fastq).

kristinekdl/Pipeline-bulk-RNA-seq_ucloud

Folders and files

Latest commit

History

Repository files navigation

Pipeline for bulk RNA-seq in UCloud using Slurm

Overview of the Pipeline

Access to the guides

General Usage

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages