IMPORTANT: READ THROUGH THE GUIDE INFORMATION IN THE TEMPLATES TO MAKE CORRECT MANIFEST TABLES AND CONFIG FILE.
- You can download the reference genome, pre-built BWA index, and annotated regions (e.g., blacklist) from ENCODE for hg38 and hg19 using the scripts in .assets/Reference. The reference manifest table hg38/hg19.tsv will be generated accordingly. Currently, the ENCODE blacklist and BWA index are mandatory entries in the manifest file. You can also create the manifest yourself based on .Reference/hg38_template.tsv or the example below:
genome_name | hg38 |
---|---|
bwa_index | /path/to/bwa/index |
blacklist | /path/to/blacklist/regions |
- The sample sequence data table template. Note: prepare separate tables for single-end and paired-end samples, and use exactly the same table header. If a sample has multiple lanes, separate the file paths with commas.
sample_id | R1 | R2(p.r.n.) |
---|---|---|
A | full/path/to/A_L001_R1.fq.gz | |
B | full/path/to/B_L001_R1.fq.gz | full/path/to/B_L001_R2.fq.gz |
C | path/C_L001_R1.fq.gz,path/C_L002_R1.fq.gz | path/C_L001_R2.fq.gz,path/C_L002_R2.fq.gz |
- The sample aggregation template. Groups can refer to different batches, cancer subtypes, etc.
sample_id | group |
---|---|
A | G1 |
B | G1 |
C | G2 |
D | G2 |
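
Before launching a run, it can help to sanity-check the three manifest tables described above. The following is a minimal sketch, not part of the pipeline itself. It assumes the tables are tab-separated, that the reference manifest is a two-column key/value file, and that the sample and aggregation tables keep the column order shown in the templates. All function names are hypothetical, and the header text in the example strings is illustrative only; copy the real header from the templates.

```python
import csv
import io
from collections import defaultdict

REQUIRED_REFERENCE_KEYS = ("bwa_index", "blacklist")  # mandatory per this guide


def read_rows(text):
    """Split tab-separated text into rows, dropping blank lines."""
    return [row for row in csv.reader(io.StringIO(text), delimiter="\t") if any(row)]


def missing_reference_keys(text):
    """Return mandatory keys that are absent or empty in a key/value manifest."""
    manifest = {row[0].strip(): row[1].strip() for row in read_rows(text) if len(row) >= 2}
    return [key for key in REQUIRED_REFERENCE_KEYS if not manifest.get(key)]


def split_lanes(field):
    """Turn a comma-separated lane list into a list of paths."""
    return [part.strip() for part in field.split(",") if part.strip()]


def parse_samples(text):
    """Map sample_id to its R1/R2 lane lists, skipping the header row."""
    samples = {}
    for row in read_rows(text)[1:]:
        row = row + [""] * (3 - len(row))  # pad when the R2 column is absent
        r1, r2 = split_lanes(row[1]), split_lanes(row[2])
        if r2 and len(r1) != len(r2):
            raise ValueError(f"{row[0]}: {len(r1)} R1 lanes but {len(r2)} R2 lanes")
        samples[row[0].strip()] = {"R1": r1, "R2": r2}
    return samples


def parse_groups(text):
    """Map each group to its sample_ids, skipping the header row."""
    groups = defaultdict(list)
    for row in read_rows(text)[1:]:
        if len(row) >= 2:
            groups[row[1].strip()].append(row[0].strip())
    return dict(groups)


# Example input mirroring the templates (paired-end table only).
reference_tsv = (
    "genome_name\thg38\n"
    "bwa_index\t/path/to/bwa/index\n"
    "blacklist\t/path/to/blacklist/regions\n"
)
samples_tsv = (
    "sample_id\tR1\tR2\n"
    "B\tfull/path/to/B_L001_R1.fq.gz\tfull/path/to/B_L001_R2.fq.gz\n"
    "C\tpath/C_L001_R1.fq.gz,path/C_L002_R1.fq.gz\t"
    "path/C_L001_R2.fq.gz,path/C_L002_R2.fq.gz\n"
)
groups_tsv = "sample_id\tgroup\nB\tG1\nC\tG2\n"

samples = parse_samples(samples_tsv)
groups = parse_groups(groups_tsv)
grouped_ids = {sample_id for ids in groups.values() for sample_id in ids}
ungrouped = sorted(set(samples) - grouped_ids)  # samples missing from the group table

print(missing_reference_keys(reference_tsv), groups, ungrouped)
```

The check on lane counts reflects the comma-separated convention: for a paired-end sample, each R1 lane is expected to have a matching R2 lane in the same position.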
A config YAML file specifies the paths of all input files and the parameters needed to run this pipeline, including the path to the genome reference files. Please specify absolute paths rather than relative paths in your config files. More detailed instructions can be found in the config_template.
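
The absolute-path requirement is easy to check once the config values are loaded into a dictionary. This is a small sketch under stated assumptions: the key names below are hypothetical (the real ones are defined in the config_template), and loading the YAML itself is left out.

```python
from pathlib import PurePosixPath


def relative_paths(path_settings):
    """Return the names of settings whose value is not an absolute POSIX path."""
    return sorted(
        name
        for name, value in path_settings.items()
        if not PurePosixPath(value).is_absolute()
    )


# Hypothetical config entries; the real key names come from config_template.
settings = {
    "reference_manifest": "/path/to/hg38.tsv",
    "sample_table": "manifests/samples.tsv",
}

print(relative_paths(settings))  # prints ['sample_table']
```

Any name reported here points at a value that should be rewritten as an absolute path before running the pipeline.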
There are two randomly generated cfMeDIP-seq test datasets; more details can be found here.
- Reference/: Genome reference and BWA index (Athaliana.BAC.F19K16.F24B22) used for the extra environments installation.
- Fastq/: Randomly generated paired-end FASTQ reads. Samples A and B were derived from two Athaliana BACs; toy samples 1 and 2 incorporate both human and the two BAC sequences, with UMI barcodes.
- Res/: The outcomes of the test run (samples A and B) will be deposited here. In addition, the aggregated QC reports for the 2 toy samples and the 163 brain cancer samples from this study are attached.