
Download test data

Anusri Pampari edited this page Jan 10, 2023 · 12 revisions

Step 1

We will start by creating a directory (~/chrombpnet_tutorial/data/downloads) to store downloaded data.

mkdir -p ~/chrombpnet_tutorial/data/downloads

Step 2

Download the hg38 human reference genome data: the FASTA file, the chromosome sizes file, and the blacklist regions BED file.

# download reference data
wget https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.gz -O ~/chrombpnet_tutorial/data/downloads/hg38.fa.gz
# decompress; "yes n" declines any overwrite prompt, so an existing hg38.fa is kept
yes n | gunzip ~/chrombpnet_tutorial/data/downloads/hg38.fa.gz

# download reference chromosome sizes 
wget https://www.encodeproject.org/files/GRCh38_EBV.chrom.sizes/@@download/GRCh38_EBV.chrom.sizes.tsv -O ~/chrombpnet_tutorial/data/downloads/hg38.chrom.sizes

# download reference blacklist regions 
wget https://www.encodeproject.org/files/ENCFF356LFX/@@download/ENCFF356LFX.bed.gz -O ~/chrombpnet_tutorial/data/downloads/blacklist.bed.gz
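
Large downloads can fail silently or stop partway, so it may be worth checking the files before moving on. The following is a minimal sketch, not part of the original tutorial: `check_downloads` is a hypothetical helper that reports whether each named file exists and is non-empty, and additionally runs `gzip -t` on files ending in `.gz`.

```shell
# check_downloads DIR FILE...
# Hypothetical helper (an assumption, not part of the tutorial).
# Prints "ok: NAME" for each file that exists and is non-empty;
# files ending in .gz must also pass gzip's integrity test.
# The body runs in a subshell, so its variables do not leak and
# "exit" only sets the function's return status.
check_downloads() (
    dir="$1"; shift
    status=0
    for name in "$@"; do
        path="$dir/$name"
        if [ ! -s "$path" ]; then
            echo "missing or empty: $name" >&2
            status=1
        elif [ "${name##*.}" = "gz" ] && ! gzip -t "$path" 2>/dev/null; then
            echo "corrupt gzip: $name" >&2
            status=1
        else
            echo "ok: $name"
        fi
    done
    exit "$status"
)
```

After Step 2 it could be called as `check_downloads ~/chrombpnet_tutorial/data/downloads hg38.fa hg38.chrom.sizes blacklist.bed.gz`; a non-zero return status means at least one file needs to be downloaded again.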

Step 3

Next, download the ATAC-seq reads for ENCODE experiment ENCSR868FGK in BAM format using the commands below.

# download bam files

wget https://www.encodeproject.org/files/ENCFF077FBI/@@download/ENCFF077FBI.bam -O ~/chrombpnet_tutorial/data/downloads/rep1.bam
wget https://www.encodeproject.org/files/ENCFF128WZG/@@download/ENCFF128WZG.bam -O ~/chrombpnet_tutorial/data/downloads/rep2.bam
wget https://www.encodeproject.org/files/ENCFF534DCE/@@download/ENCFF534DCE.bam -O ~/chrombpnet_tutorial/data/downloads/rep3.bam
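
The three commands above differ only in the accession and the replicate number, so they can also be written as a loop. The sketch below is one possible way to do that and is not part of the original tutorial: `encode_bam_url` and `download_replicates` are hypothetical helper names, and the URL pattern is copied from the commands above. Each file is downloaded to a `.part` name first and renamed only on success, so a rerun skips finished replicates and never mistakes a partial file for a complete one.

```shell
# Build the ENCODE download URL for a BAM accession
# (same pattern as the wget commands above).
encode_bam_url() {
    echo "https://www.encodeproject.org/files/$1/@@download/$1.bam"
}

# download_replicates DIR ACCESSION...
# Hypothetical helper: saves the accessions as rep1.bam, rep2.bam, ...
# in the order given, skipping any replicate that already exists.
# The body runs in a subshell, so "exit" only ends the function.
download_replicates() (
    dir="$1"; shift
    i=1
    for acc in "$@"; do
        out="$dir/rep$i.bam"
        if [ -s "$out" ]; then
            echo "skipping existing $out"
        else
            # Download to a temporary name; rename only if wget succeeds.
            wget "$(encode_bam_url "$acc")" -O "$out.part" \
                && mv "$out.part" "$out" || exit 1
        fi
        i=$((i + 1))
    done
)
```

With these definitions, `download_replicates ~/chrombpnet_tutorial/data/downloads ENCFF077FBI ENCFF128WZG ENCFF534DCE` would fetch the same three files as the commands above.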