ChIP-nf

A Nextflow pipeline for processing ChIP-seq data.

Installing Nextflow

Nextflow can be installed by using the following command:

curl -fsSL get.nextflow.io | bash

Running the pipeline

First you need to pull the pipeline using Nextflow:

$ nextflow pull guigolab/chip-nf
Checking guigolab/chip-nf ...
 downloaded from https://github.com/guigolab/chip-nf.git

You can get the pipeline help with the following command:

$ nextflow run chip-nf --help

N E X T F L O W  ~  version 0.24.1
Launching `guigolab/chip-nf` [nostalgic_franklin] - revision: 974a45c356 [master]

C H I P - N F ~ ChIP-seq Pipeline
---------------------------------
Run ChIP-seq analyses on a set of data.

Usage:
    chipseq-pipeline.nf --index TSV_FILE --genome GENOME_FILE [OPTION]...

Options:
    --help                              Show this message and exit.
    --index TSV_FILE                    Tab separted file containing information about the data.
    --genome GENOME_FILE                Reference genome file.
    --genome-index GENOME_INDEX_ FILE   Reference genome index file.
    --genome-size GENOME_SIZE           Reference genome size for MACS2 callpeaks. Must be one of
                                        MACS2 precomputed sizes: hs, mm, dm, ce. (Default: hs)
    --mismatches MISMATCHES             Sets the maximum number/percentage of mismatches allowed for a read (Default: 2).
    --multimaps MULTIMAPS               Sets the maximum number of mappings allowed for a read (Default: 10).
    --min-matched-bases BASES           Sets the minimum number/percentage of bases that have to match with the reference (Default: 0.80).
    --quality-threshold THRESHOLD       Sets the sequence quality threshold for a base to be considered as low-quality (Default: 26).
    --fragment-length LENGTH            Sets the fragment length globally for all samples (Default: 200).
    --remove-duplicates                 Remove duplicate alignments instead of just flagging them (Default: false).
    --rescale                           Rescale peak scores to conform to the format supported by the
                                        UCSC genome browser (score must be <1000) (Default: false).
    --shift                             Move fragments ends and apply global extsize in peak calling. (Default: false).

Input

The input data and metadata should be specified using a tab separated file and passing it to the pipeline command with the option --index. Here is an example of the file format:

sample1    sample1_run1     /path/to/sample1_run1.fastq.gz    -           H3
sample1    sample1_run2     /path/to/sample1_run2.fastq.gz    -           H3
sample1    sample1_run3     /path/to/sample1_run3.fastq.gz    -           H3
sample1    sample1_run4     /path/to/sample1_run4.fastq.gz    -           H3
sample2    sample2_run1     /path/to/sample2_run1.fastq.gz    control1    H3K4me2
control1   control1_run1    /path/to/control1_run1.fastq.gz   control1    input

The fields in the file correspond to:

identifier used for merging the BAM files
single run identifier
path to the fastq file to be processed
identifier of the input or - if no control is used
mark/histone or input if the line refers to a control
optional sample fragment length. If not specified the fragment length is estimated using SPP

Output

The pipeline will produce the following output data:

Alignments
pileupSignal, pileup signal tracks
fcSignal, fold enrichment signal tracks
pvalueSignal, -log_10(P) signal tracks
narrowPeak, peak locations with peak summit, pvalue and qvalue (BED6+4)
broadPeak, similar to narrowPeak (BED6+3)
gappedPeak, both narrow and broad peaks (BED12+3)

Check MACS2 output files for details.

The output data information is written to a file called chipseq-pipeline.db created in the folder from where the pipeline is run. Here is an example of the db file:

sample1   /path/to/results/peakOut/sample1.pileup_signal.bw    H3         255     pileupSignal    0.9960   0.4393
sample1   /path/to/results/peakOut/sample1_peaks.narrowPeak    H3         255     narrowPeak      0.9960   0.4393
sample1   /path/to/results/sample1.bam                         H3         255     Alignments      0.9960   0.4393
sample1   /path/to/results/peakOut/sample1_peaks.gappedPeak    H3         255     gappedPeak      0.9960   0.4393
sample1   /path/to/results/peakOut/sample1_peaks.broadPeak     H3         255     broadPeak       0.9960   0.4393
sample2   /path/to/results/peakOut/sample2_peaks.gappedPeak    H3K4me2    200     gappedPeak      0.9995   0.7216
sample2   /path/to/results/peakOut/sample2.fc_signal.bw        H3K4me2    200     fcSignal        0.9995   0.7216
sample2   /path/to/results/peakOut/sample2.pval_signal.bw      H3K4me2    200     pvalueSignal    0.9995   0.7216
sample2   /path/to/results/peakOut/sample2_peaks.broadPeak     H3K4me2    200     broadPeak       0.9995   0.7216
sample2   /path/to/results/peakOut/sample2.pileup_signal.bw    H3K4me2    200     pileupSignal    0.9995   0.7216
sample2   /path/to/results/peakOut/sample2_peaks.narrowPeak    H3K4me2    200     narrowPeak      0.9995   0.7216
sample2   /path/to/results/sample2_GCCAAT_primary.bam          H3K4me2    200     Alignments      0.9995   0.7216

The fields in the file correspond to:

merge identifier
path
mark/histone
(estimated) fragment length
data type
NRF (Nonredundant Fraction)
FRiP (Fraction of Reads in Peaks)

Name	Name	Last commit message	Last commit date
Latest commit emi80 Support gzipped genome in gem index process Oct 6, 2022 87bbbe8 · Oct 6, 2022 History 281 Commits
.github/workflows	.github/workflows	Update ci script - fix cleanup	Nov 25, 2021
bin	bin	Fix NRF and FRiP processes and update number of decimals in output	Apr 11, 2016
data	data	Convert pipeline to DSL2	Nov 25, 2021
docker	docker	Update Dockerfile and tool versions	Nov 23, 2021
modules	modules	Support gzipped genome in gem index process	Oct 6, 2022
scripts	scripts	[ci skip] Update changelog script	Jul 4, 2018
.circ	.circ	Update circ	Aug 30, 2017
.gitignore	.gitignore	[ci skip] Update gitignore	Jul 4, 2018
CHANGELOG.md	CHANGELOG.md	Update changelog for release	Jul 4, 2018
README.adoc	README.adoc	Move to Travis CI	Mar 2, 2018
TODO.adoc	TODO.adoc	Update todo list	Feb 20, 2017
chipseq-pipeline.nf	chipseq-pipeline.nf	Move BAM files merging to 'bam' module	Nov 25, 2021
nextflow.config	nextflow.config	Update Dockerfile and tool versions	Nov 23, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ChIP-nf

Installing Nextflow

Running the pipeline

Input

Output

About

Releases 7

Packages

Languages

guigolab/chip-nf

Folders and files

Latest commit

History

Repository files navigation

ChIP-nf

Installing Nextflow

Running the pipeline

Input

Output

About

Topics

Resources

Stars

Watchers

Forks

Releases 7

Packages 0

Languages

Packages