Set of scripts to analyse RRBS data, from the optimisation of experimental design up to the identification of differentially methylated regions

ljouneau/RRBS-toolkit

RRBS-toolkit

Our pipeline is a comprehensive set of tools for conducting RRBS analysis, from the optimisation of the experimental design up to the identification of differentially methylated regions (DMRs) between two groups of samples.

Organisation of scripts

The main steps of the pipeline are organised in the following directories:

RR_genome

Directory contains a script to simulate *in silico* fragmentation of the genome after digestion by a restriction enzyme (e.g. MspI) and selection of fragments. The selected fragments are stored in a FASTA file. This FASTA file can be annotated in a further step (see **Annotation**) to produce a table showing in which gene or genomic feature each fragment is located.
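To make the idea of *in silico* digestion concrete, here is a minimal sketch in Python. It is not the script shipped in this directory. It assumes the enzyme cuts after the first base of its recognition site (C^CGG for MspI) and that selection is a plain length filter. The function names `digest` and `select_fragments` and the default 40-220 bp window are hypothetical choices for illustration.

```python
import re


def digest(sequence, site="CCGG", cut_offset=1):
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for C^CGG). Fragments at both sequence ends are kept here,
    although they carry only one real cut site.
    """
    sequence = sequence.upper()
    # A lookahead pattern finds every site, including overlapping ones.
    pattern = "(?=%s)" % re.escape(site)
    cuts = [m.start() + cut_offset for m in re.finditer(pattern, sequence)]
    bounds = [0] + cuts + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds[:-1], bounds[1:]) if b > a]


def select_fragments(fragments, min_len=40, max_len=220):
    """Keep fragments whose length lies in [min_len, max_len] (assumed window)."""
    return [f for f in fragments if min_len <= len(f) <= max_len]


if __name__ == "__main__":
    toy = "AAACCGGTTTTCCGGAA"
    fragments = digest(toy)
    print(fragments)
    print(select_fragments(fragments, min_len=5, max_len=8))
```

On a real genome the same logic would be applied chromosome by chromosome, and the size window would be set to match the experimental protocol.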

Bismark_methylation_call

Directory contains scripts to prepare the genome for Bismark mapping and to process fastq files. Once the fastq files have been processed, the pipeline provides a file (**synthese_CpG.txt**) containing the coverage and the percentage of methylation for each CpG covered by at least one read.
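The two quantities reported per CpG can be illustrated with a short Python sketch. It is only an illustration of the arithmetic, not the pipeline's code. The input (one `(chromosome, position, is_methylated)` tuple per read covering a CpG) and the function name `summarise_cpg` are hypothetical, and the actual layout of synthese_CpG.txt is not reproduced here.

```python
def summarise_cpg(calls):
    """Return {(chrom, pos): (coverage, percent_methylation)}.

    `calls` holds one (chrom, pos, is_methylated) tuple per read covering
    a CpG, so every CpG in the result is covered by at least one read.
    """
    counts = {}
    for chrom, pos, is_methylated in calls:
        methylated, total = counts.get((chrom, pos), (0, 0))
        counts[(chrom, pos)] = (methylated + (1 if is_methylated else 0), total + 1)
    return {
        key: (total, 100.0 * methylated / total)
        for key, (methylated, total) in counts.items()
    }


if __name__ == "__main__":
    toy_calls = [
        ("chr1", 100, True),
        ("chr1", 100, False),
        ("chr1", 100, True),
        ("chr1", 100, True),
        ("chr2", 5, False),
    ]
    for key, value in sorted(summarise_cpg(toy_calls).items()):
        print(key, value)
```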

Descriptive_analysis

Directory contains a script to produce a hierarchical clustering and principal component analyses of several samples.

Differential analysis

Directory contains a script to compare the methylation of two groups of samples, using either the [methylKit](https://bioconductor.org/packages/devel/bioc/vignettes/methylKit/inst/doc/methylKit.html) or the [methylSig](http://sartorlab.ccmb.med.umich.edu/node/17) R package.

Annotation

Directory contains a script to annotate results of RR_genome or differential analysis.
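The core of such an annotation is an interval overlap test between regions (fragments or DMRs) and genomic features. The following Python sketch shows that test with a simple linear scan. It is an assumption-laden illustration and not the toolkit's script, which relies on the bx.intervals.intersection module. The function name `annotate`, the tuple layouts and the half-open coordinate convention are all hypothetical.

```python
def annotate(regions, features):
    """List, for each region, the names of the features it overlaps.

    regions:  iterable of (chrom, start, end)
    features: iterable of (chrom, start, end, name)
    Coordinates are assumed half-open: [start, end).
    """
    features = list(features)
    annotated = []
    for chrom, start, end in regions:
        hits = [
            name
            for f_chrom, f_start, f_end, name in features
            # Two half-open intervals overlap when each starts before the other ends.
            if f_chrom == chrom and f_start < end and start < f_end
        ]
        annotated.append(((chrom, start, end), hits))
    return annotated


if __name__ == "__main__":
    toy_features = [("chr1", 100, 200, "geneA"), ("chr1", 150, 300, "geneB")]
    toy_regions = [("chr1", 180, 190), ("chr1", 300, 400)]
    for region, names in annotate(toy_regions, toy_features):
        print(region, names)
```

A linear scan is enough for a handful of regions. For genome-wide tables an interval tree, as provided by bx-python, avoids comparing every region with every feature.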

Venn

Directory contains a script to compare 2 or 3 analysis results.
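Comparing two or three result lists amounts to counting how many items fall in each region of a Venn diagram. The sketch below shows that counting step with Python sets. It is a hedged illustration rather than the toolkit's script (which draws the diagram with matplotlib), and the function name `venn_regions` and the use of plain string identifiers are assumptions.

```python
from collections import Counter


def venn_regions(**named_sets):
    """Count items per Venn region.

    Each keyword argument is one analysis result (an iterable of identifiers).
    The returned dict maps a tuple of result names, e.g. ("a", "b"), to the
    number of items found in exactly those results and no others.
    """
    names = sorted(named_sets)
    sets = {name: set(named_sets[name]) for name in names}
    universe = set().union(*sets.values())
    counts = Counter()
    for item in universe:
        membership = tuple(name for name in names if item in sets[name])
        counts[membership] += 1
    return dict(counts)


if __name__ == "__main__":
    regions = venn_regions(a=["x", "y", "z"], b=["y", "z", "w"], c=["z"])
    for membership, count in sorted(regions.items()):
        print(membership, count)
```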

In each of these directories, you will find a dedicated readme file (in PDF format) describing the main goals of the step and how to use the scripts.

Schema describing the relationships between the main modules:

![RRBS toolkit schema](https://github.com/ljouneau/RRBS-toolkit/blob/master/RRBBS_toolkit_schema.png)

Technical prerequisites

Our scripts have been developed in:

  • Python 2.7 (with bx.intervals.intersection module for the Annotation and matplotlib for the Venn)
  • R (version >= 3.30)
  • Shell

The toolkit also integrates several external tools. All of these tools must be installed before using RRBS toolkit.

Once all these prerequisites are satisfied, edit the file RRBS_HOME/config.sh and set the paths to these external tools (RRBS_HOME refers to the directory where RRBS toolkit is installed).

If you plan to launch your processing on a cluster (using, for instance, the SGE qsub command), you should define the environment variable RRBS_HOME:

export RRBS_HOME=/path_to_the_directory/where/RRBS_toolkit/has/been/installed

(You can place this line in your $HOME/.profile file.)
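Before submitting jobs, it can help to verify that the variable is visible and points to an installation containing config.sh. The following Python sketch performs that check. It is a hypothetical helper (the name `check_rrbs_home` is not part of the toolkit) and it only assumes what is stated above: that config.sh sits directly under RRBS_HOME.

```python
import os


def check_rrbs_home(environ=None):
    """Return None if RRBS_HOME looks usable, else a short error message."""
    environ = os.environ if environ is None else environ
    home = environ.get("RRBS_HOME")
    if not home:
        return "RRBS_HOME is not set"
    config = os.path.join(home, "config.sh")
    if not os.path.isfile(config):
        return "config.sh not found in %s" % home
    return None


if __name__ == "__main__":
    problem = check_rrbs_home()
    print(problem if problem else "RRBS_HOME looks fine")
```

Running the check inside the job script itself, rather than only on the login node, shows whether the variable is actually propagated to the cluster nodes.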
