Analysis of human differentially expressed genes during the EMT process

Theo-Roncalli/RNAseq-EMT

Next-Generation Sequencing (NGS)

This work builds on the study by Yang et al. (2016), which investigated the epithelial-mesenchymal transition (EMT). In that study, EMT was induced by ectopic expression of Zeb1 in a lung cancer cell line (H358), and the authors analysed RNA-seq data collected over 7 days, starting from uninduced cells.

The original data are available on the NCBI site. To reduce computation time, we use only 0.5% of the total RNA-seq data, available at the following address: http://rssf.i2bc.paris-saclay.fr/X-fer/AtelierNGS/TPrnaseq.tar.gz

Dependencies

The pipeline runs in bash. Some packages are required to run commands such as fastqc, trimmomatic and featureCounts.

sudo apt-get install -y fastqc # For using fastqc
conda install -c bioconda trimmomatic # For using trimmomatic
sudo apt-get install -y subread # For using featureCounts
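
Before launching the pipeline, it can help to confirm that the three tools are actually on the PATH. The following is a minimal sketch, not part of the repository: the helper name `missing_tools` is hypothetical, and it assumes the executables are named exactly as in the commands above.

```shell
# Print the name of each tool from the argument list that is not on PATH.
# Returns 0 when every tool is found, 1 otherwise.
# (missing_tools is a hypothetical helper, not a script from the repository.)
missing_tools() {
    rc=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            rc=1
        fi
    done
    return "$rc"
}

# Tool names as given in the Dependencies section.
if missing=$(missing_tools fastqc trimmomatic featureCounts); then
    echo "All required tools were found."
else
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
fi
```

If a tool is reported as missing, rerun the corresponding install command above and make sure the conda environment containing trimmomatic is activated.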

Hardware requirements

A machine with at least 16 GB of free RAM is required to build the index and map the reads to chromosome 18 of the reference genome.
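
On Linux, the amount of free memory can be checked before starting. The sketch below assumes a kernel that exposes a `MemAvailable` line in `/proc/meminfo`; the helper name `available_ram_gb` is hypothetical and not part of the repository.

```shell
# Print available memory in whole GB, read from a /proc/meminfo-style file.
# $1: path to the meminfo file (defaults to /proc/meminfo, Linux only).
# (available_ram_gb is a hypothetical helper, not a script from the repository.)
available_ram_gb() {
    # MemAvailable is reported in kB; dividing twice by 1024 converts it to GB.
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' "${1:-/proc/meminfo}"
}

# 16 GB is the minimum stated in the hardware requirements.
required_gb=16
if [ -r /proc/meminfo ]; then
    avail_gb=$(available_ram_gb)
    if [ "${avail_gb:-0}" -ge "$required_gb" ]; then
        echo "OK: ${avail_gb} GB available"
    else
        echo "Warning: only ${avail_gb:-0} GB available, ${required_gb} GB required"
    fi
fi
```

The value is truncated to whole GB, so a machine reporting slightly under 16 GB of available memory will trigger the warning.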

Executing The Pipeline

The pipeline creates a file named "hugo-counts.txt" that lists, for each gene, the HUGO identifier and the number of reads aligned for each observation. This file is available in the Data/Counts directory of the repository. The steps are as follows.

  1. Clone the GitHub repository to your machine:
git clone https://github.com/Theo-Roncalli/RNAseq-EMT.git
cd RNAseq-EMT
  2. Import the reads and the reference genome:
bash install.sh
  3. Create the counts file, which contains, for each HUGO symbol on chromosome 18, the number of reads per gene and per observation:
bash counting.sh
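
Once the steps above have finished, a quick sanity check is to sum the counts per observation. The README does not specify the exact layout of hugo-counts.txt, so the sketch below assumes one gene per line, whitespace-separated fields, the gene identifier in column 1, one count column per observation, and no header line. The helper name `column_totals` is hypothetical; adapt it if the real file differs.

```shell
# Sum each count column of a counts table and print the totals on one line.
# Assumed layout (not confirmed by the README): one gene per line,
# whitespace-separated, gene identifier in column 1, one count column per
# observation, no header line.
# (column_totals is a hypothetical helper, not a script from the repository.)
column_totals() {
    awk '{
        for (i = 2; i <= NF; i++) total[i] += $i
        if (NF > ncol) ncol = NF
    }
    END {
        for (i = 2; i <= ncol; i++) printf "%s%d", (i > 2 ? "\t" : ""), total[i]
        if (ncol >= 2) printf "\n"
    }' "$1"
}

# Path taken from the description of the pipeline output.
counts_file="Data/Counts/hugo-counts.txt"
if [ -f "$counts_file" ]; then
    column_totals "$counts_file"
fi
```

A column whose total is zero or far below the others would suggest that the corresponding observation was not mapped or counted correctly.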

Cleaning Repository

To clean the repository (i.e. delete the Data and Figures folders), run:

bash clean.sh
