Next-Generation Sequencing (NGS)

This work builds on the study by Yang et al. (2016), who investigated the epithelial-mesenchymal transition (EMT). In their work, EMT was induced by ectopic expression of Zeb1 in a lung cancer cell line (H358). The authors collected RNAseq data over 7 days, starting from uninduced cells.

The original data are available from NCBI. To reduce computation time, we used only 0.5% of the total RNAseq data, available at the following address: http://rssf.i2bc.paris-saclay.fr/X-fer/AtelierNGS/TPrnaseq.tar.gz

Dependencies

The pipeline runs in bash. Some packages are required to run commands such as fastqc, trimmomatic and featureCounts.

sudo apt-get install -y fastqc # For using fastqc
conda install -c bioconda trimmomatic # For using trimmomatic
sudo apt-get install -y subread # For using featureCounts
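
After installation, it can help to confirm that the three tools are visible on the PATH before launching the pipeline. The following is a minimal sketch, not part of the repository: the function name missing_tools is hypothetical, and it assumes the bioconda package exposes a launcher called trimmomatic.

```shell
# Print the name of every given command that is not found on PATH.
# (missing_tools is an illustrative helper, not part of this repository.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Tools named in this README; the trimmomatic launcher name is an assumption.
missing=$(missing_tools fastqc trimmomatic featureCounts)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
else
    echo "All tools found."
fi
```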

Hardware requirements

A machine with at least 16 GB of free RAM is required (to build the index and map the reads to chromosome 18 of the reference genome).
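
On Linux, the memory currently available can be checked before starting. The sketch below is an illustration only: the helper name available_gb is hypothetical, and it assumes a kernel that reports a MemAvailable line in /proc/meminfo.

```shell
# Print the available memory in whole GB, read from a meminfo-style file.
# Defaults to /proc/meminfo (Linux only); available_gb is an illustrative name.
available_gb() {
    awk '/^MemAvailable:/ { print int($2 / 1024 / 1024) }' "${1:-/proc/meminfo}"
}

required_gb=16
free_gb=$(available_gb)
if [ "${free_gb:-0}" -lt "$required_gb" ]; then
    echo "Only ${free_gb:-0} GB available; ${required_gb} GB recommended."
else
    echo "${free_gb} GB available."
fi
```

The value is truncated to whole gigabytes, so a machine sitting just under the threshold is reported as below it.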

Executing The Pipeline

The pipeline creates a file named "hugo-counts.txt" which gives, for each gene, the HUGO identifier and the number of reads aligned for each observation. This file is available in the Data/Counts directory. The steps are as follows.

Clone the GitHub repository to your machine

git clone https://github.com/Theo-Roncalli/RNAseq-EMT.git
cd RNAseq-EMT

Download the reads and the reference genome

bash install.sh

Create the count file, which contains the number of reads per observation for each gene (HUGO code) on chromosome 18.

bash counting.sh
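
Once counting.sh has finished, the resulting table can be sanity-checked by summing the reads in each observation. This is only a sketch under assumptions: the layout of hugo-counts.txt is not described here, so the code assumes a tab-separated file with a header row, the gene identifier in the first column and one count column per observation. The helper name column_totals is hypothetical.

```shell
# Sum the read counts of every observation column in a tab-separated table.
# Assumed layout: header row, gene identifier in column 1, counts in columns 2..N.
column_totals() {
    awk -F '\t' '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        { for (i = 2; i <= n; i++) total[i] += $i }
        END { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], total[i] }
    ' "$1"
}

counts=Data/Counts/hugo-counts.txt
if [ -f "$counts" ]; then
    column_totals "$counts"
else
    echo "$counts not found; run counting.sh first."
fi
```

If the actual file uses a different separator or has extra annotation columns, the field separator and the starting column need to be adjusted accordingly.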

Cleaning Repository

To clean the repository (i.e. delete the Data and Figures folders), run:

bash clean.sh
