This program generates training data for deepore, a neural-network-based variant caller. Training data is generated chromosome by chromosome, using the VCF file as ground truth.
Requirements: Python 3.5+, pysam, numpy, scipy, PIL
./pileupProcessor.sh [bam_file_path] [ref_fasta_file_path] [vcf_file_path] [output_directory] [number_of_threads]
-- bam_file_path
: Path to alignment bam file
-- ref_fasta_file_path
: Path to reference fasta file
-- vcf_file_path
: Path to vcf file
-- output_directory
: Path to a directory where output will be saved
-- number_of_threads
: Maximum number of threads to use when generating images
./pileupProcessor.sh ~/illumina/chr3.bam ~/illumina/chr3.fa ~/illumina/chr3_whole_chr.vcf.gz ~/pileup_output/test/ 8
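The wrapper takes its five arguments positionally, so getting the order wrong is an easy mistake when calling it from another script. The following is a minimal sketch of driving it from Python. The helper names `build_wrapper_command` and `run_wrapper` are hypothetical and not part of this repository, and the sketch assumes `pileupProcessor.sh` is in the current directory.

```python
import subprocess
from pathlib import Path


def build_wrapper_command(bam, ref, vcf, output_dir, threads,
                          script="./pileupProcessor.sh"):
    """Return the argv list for the wrapper, in the documented positional order.

    Hypothetical helper: only the argument order comes from the usage line above.
    """
    threads = int(threads)
    if threads < 1:
        raise ValueError("number_of_threads must be at least 1")
    # Expand "~" here because no shell is involved when passing an argv list.
    paths = [str(Path(p).expanduser()) for p in (bam, ref, vcf, output_dir)]
    return [script] + paths + [str(threads)]


def run_wrapper(bam, ref, vcf, output_dir, threads):
    """Run the wrapper and raise CalledProcessError if it exits non-zero."""
    cmd = build_wrapper_command(bam, ref, vcf, output_dir, threads)
    subprocess.run(cmd, check=True)
```

For instance, `run_wrapper("~/illumina/chr3.bam", "~/illumina/chr3.fa", "~/illumina/chr3_whole_chr.vcf.gz", "~/pileup_output/test/", 8)` would issue the same call as the example command above.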
python3 main.py --bam [bam_file_path] --ref [ref_fasta_file_path] --vcf [vcf_file_path] --output_dir [output_directory] --parallel [bool] --max_threads [int] --contig [string] --site [string]
-- bam_file_path
: Path to alignment bam file
-- ref_fasta_file_path
: Path to reference fasta file
-- vcf_file_path
: Path to vcf file
-- output_directory
: Path to a directory where output will be saved
-- parallel
: If true, multiprocessing is used (Default is False)
-- max_threads
: If parallel is true, at most max_threads threads are used (Default is 5)
-- contig
: Contig to process (Default is "chr3") [Example: chr3, chr2 etc.]
-- site
: Site (region) within the contig to focus on (Default is empty) [Example: ":100000-200000"]
python3 main.py --bam ~/illumina/chr3.bam --ref ~/illumina/chr3.fa --vcf ~/illumina/chr3_whole_chr.vcf.gz --output_dir ~/pileup_output/test/ --contig chr3 --site :100000-200000 --parallel True
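The options above can be sketched as an `argparse` parser. This is an illustration of the documented flags and defaults, not the actual contents of `main.py`. The helpers `str2bool`, `build_parser`, and `region_string` are hypothetical names. In particular, joining `--contig` and `--site` into a single region string is an assumption based on the `":100000-200000"` example format.

```python
import argparse


def str2bool(value):
    """Parse "True"/"False"-style text; plain type=bool would treat "False" as True."""
    text = str(value).strip().lower()
    if text in ("true", "t", "yes", "1"):
        return True
    if text in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected a boolean, got %r" % value)


def build_parser():
    """Build a parser mirroring the documented flags and defaults (a sketch)."""
    parser = argparse.ArgumentParser(description="Generate pileup training data.")
    parser.add_argument("--bam", required=True, help="Path to alignment bam file")
    parser.add_argument("--ref", required=True, help="Path to reference fasta file")
    parser.add_argument("--vcf", required=True, help="Path to vcf file")
    parser.add_argument("--output_dir", required=True, help="Output directory")
    parser.add_argument("--parallel", type=str2bool, default=False)
    parser.add_argument("--max_threads", type=int, default=5)
    parser.add_argument("--contig", default="chr3")
    parser.add_argument("--site", default="")
    return parser


def region_string(contig, site):
    """Assumed convention: the site is appended to the contig, e.g. chr3:100000-200000."""
    return contig + site


# Parse the same arguments as the example command above.
args = build_parser().parse_args([
    "--bam", "chr3.bam", "--ref", "chr3.fa", "--vcf", "chr3_whole_chr.vcf.gz",
    "--output_dir", "test/", "--contig", "chr3", "--site", ":100000-200000",
    "--parallel", "True",
])
print(region_string(args.contig, args.site))  # prints chr3:100000-200000
```

The explicit `str2bool` converter is there because `--parallel` takes a value (`True` or `False`) rather than acting as a bare switch.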