This repository has been archived by the owner on Jul 22, 2018. It is now read-only.

CGL-Deeplearning/genomePileup

genomePileup

Description

This program generates training data for deepore, a neural-network-based variant caller. The training data is generated chromosome by chromosome, using the VCF file as ground truth.

Dependencies

python3.5+, pysam, numpy, scipy, PIL

Usage

Whole genome run

./pileupProcessor.sh [bam_file_path] [ref_fasta_file_path] [vcf_file_path] [output_directory] [number_of_threads]

Parameters:

-- bam_file_path: Path to the alignment BAM file
-- ref_fasta_file_path: Path to the reference FASTA file
-- vcf_file_path: Path to the VCF file
-- output_directory: Path to the directory where output will be saved
-- number_of_threads: Maximum number of threads to use when generating images

Example run:

./pileupProcessor.sh ~/illumina/chr3.bam ~/illumina/chr3.fa ~/illumina/chr3_whole_chr.vcf.gz ~/pileup_output/test/ 8


Specific site run using the python script

python3 main.py --bam [bam_file_path] --ref [ref_fasta_file_path] --vcf [vcf_file_path] --output_dir [output_directory] --parallel [bool] --max_threads [int] --contig [string] --site [string]

Parameters

-- bam_file_path: Path to the alignment BAM file
-- ref_fasta_file_path: Path to the reference FASTA file
-- vcf_file_path: Path to the VCF file
-- output_directory: Path to the directory where output will be saved
-- parallel: If true, use multiprocessing (Default is False)
-- max_threads: If parallel is true, use up to max_threads threads (Default is 5)
-- contig: Contig to focus on (Default is "chr3") [Example: chr3, chr2 etc.]
-- site: Site of the contig to focus on (Default is empty) [Example: ":100000-200000"]
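
The flags and defaults listed above could be declared with argparse roughly as follows. This is only a sketch and not the actual contents of main.py. The helper names `str2bool` and `build_parser`, and the choice to mark the four path arguments as required, are assumptions made for illustration. The explicit boolean converter is there because `--parallel` is passed a literal such as `True` on the command line.

```python
import argparse


def str2bool(value):
    """Hypothetical helper: turn 'True'/'False' style text into a bool."""
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "1"):
        return True
    if text in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got {!r}".format(value))


def build_parser():
    """Hypothetical parser mirroring the documented flags and defaults."""
    parser = argparse.ArgumentParser(description="Generate pileup training data.")
    # Marking the path arguments as required is an assumption.
    parser.add_argument("--bam", required=True, help="Path to alignment BAM file")
    parser.add_argument("--ref", required=True, help="Path to reference FASTA file")
    parser.add_argument("--vcf", required=True, help="Path to VCF file")
    parser.add_argument("--output_dir", required=True, help="Output directory")
    # Defaults below are the ones stated in the parameter list.
    parser.add_argument("--parallel", type=str2bool, default=False)
    parser.add_argument("--max_threads", type=int, default=5)
    parser.add_argument("--contig", type=str, default="chr3")
    parser.add_argument("--site", type=str, default="")
    return parser


if __name__ == "__main__":
    example = build_parser().parse_args([
        "--bam", "chr3.bam", "--ref", "chr3.fa",
        "--vcf", "chr3_whole_chr.vcf.gz", "--output_dir", "out/",
        "--site", ":100000-200000", "--parallel", "True",
    ])
    print(example.parallel, example.max_threads, example.contig, example.site)
```

A plain `type=bool` would treat any non-empty string, including `False`, as true, which is why the sketch uses its own converter.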

Example run:

python3 main.py --bam ~/illumina/chr3.bam --ref ~/illumina/chr3.fa --vcf ~/illumina/chr3_whole_chr.vcf.gz --output_dir ~/pileup_output/test/ --contig chr3 --site :100000-200000 --parallel True
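
In the example above, the `--contig` and `--site` values appear to be designed to join into a single region string such as `chr3:100000-200000`. A minimal sketch of that idea follows. The function `parse_region` is hypothetical and is not taken from the repository. Whether the program treats the coordinates as 0-based or 1-based is not stated here, so the sketch only splits the numbers and does not convert them.

```python
def parse_region(contig, site=""):
    """Hypothetical helper: join contig and site, and split out the bounds.

    Returns (region, start, end). start and end are None when site is empty,
    which corresponds to processing the whole contig.
    """
    region = contig + site
    if not site:
        return region, None, None
    if not site.startswith(":") or "-" not in site:
        raise ValueError("site must look like ':START-END', got {!r}".format(site))
    start_text, end_text = site[1:].split("-", 1)
    start, end = int(start_text), int(end_text)
    if start > end:
        raise ValueError("site start must not exceed site end")
    return region, start, end


print(parse_region("chr3", ":100000-200000"))
# ('chr3:100000-200000', 100000, 200000)
print(parse_region("chr3"))
# ('chr3', None, None)
```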
