Kids First Data Resource Center Sentieon gVCF Workflow

This workflow takes a BAM/CRAM file, runs VerifyBamID, then runs Sentieon Haplotyper and CollectVCMetrics.

The input BAM/CRAM file can either be a BQSR-recalibrated file or a pre-recalibration file with an accompanying recalibration table provided in the recal_table input.

This pipeline was made possible thanks to significant software and support contributions from Sentieon. For more information on our collaborators, check out their website:

Relevant Software and Versions

Outputs

  • gvcf: The germline variant calls in gVCF format
  • gvcf_calling_metrics: Detail and summary metrics about the gVCF
  • verifybamid_output: If a contamination value is not provided by the user, the workflow will output VerifyBamID's selfSM file

Tips for running:

  1. For contamination input, either populate the contamination field or provide the three contamination files: contamination_sites_bed, contamination_sites_mu, and contamination_sites_ud. Failure to provide one of these groups will result in a failed run.
  2. Suggested reference inputs (available from the Broad Resource Bundle):
    • contamination_sites_bed: Homo_sapiens_assembly38.contam.bed
    • contamination_sites_mu: Homo_sapiens_assembly38.contam.mu
    • contamination_sites_ud: Homo_sapiens_assembly38.contam.UD
    • dbsnp_vcf: Homo_sapiens_assembly38.dbsnp138.vcf
    • reference_tar: Homo_sapiens_assembly38.tgz
    • wgs_calling_interval_list: wgs_coverage_regions.hg38.interval_list
    • wgs_evaluation_interval_list: wgs_evaluation_regions.hg38.interval_list
  3. The reference_tar input must be a tar file containing the reference fasta along with its indexes. The required indexes are [.64.ann,.64.amb,.64.bwt,.64.pac,.64.sa,.dict,.fai] and are generated by bwa, picard, and samtools. Additionally, a .64.alt index is recommended.
  4. If you are making your own bwa indexes, make sure to use the -6 flag to obtain the .64 version of the indexes. Indexes that do not match this naming scheme will cause a failure in certain runner ecosystems.
  5. If you create your own reference indexes and omit the ALT index file from the reference, or if its name does not match the naming of the other indexes, your alignments will be equivalent to the results you would obtain by running BWA-MEM with the -j option.
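
Tips 1 and 3 describe conditions that can be checked before submitting a run. The following is a minimal Python sketch of such a pre-flight check, not part of the workflow itself. The function names, and the idea of passing the workflow inputs as a plain dictionary, are assumptions made for illustration; the index suffixes and input names are taken from the tips above. The tar check only looks for index files by suffix and does not verify the fasta itself.

```python
import tarfile

# Index suffixes listed in tip 3; ".64.alt" is recommended rather than required.
REQUIRED_SUFFIXES = (".64.ann", ".64.amb", ".64.bwt", ".64.pac", ".64.sa", ".dict", ".fai")
RECOMMENDED_SUFFIXES = (".64.alt",)

# The three contamination files named in tip 1.
CONTAMINATION_FILES = (
    "contamination_sites_bed",
    "contamination_sites_mu",
    "contamination_sites_ud",
)


def missing_suffixes(member_names, suffixes):
    """Return the suffixes that no name in member_names ends with."""
    return [s for s in suffixes if not any(n.endswith(s) for n in member_names)]


def check_reference_tar(path):
    """Return (missing_required, missing_recommended) for a reference tar (hypothetical helper)."""
    # The default read mode detects gzip compression automatically.
    with tarfile.open(path) as tar:
        names = tar.getnames()
    return (
        missing_suffixes(names, REQUIRED_SUFFIXES),
        missing_suffixes(names, RECOMMENDED_SUFFIXES),
    )


def contamination_inputs_ok(inputs):
    """True if `contamination` is set or all three site files are provided (hypothetical helper)."""
    if inputs.get("contamination") is not None:
        return True
    return all(inputs.get(key) for key in CONTAMINATION_FILES)
```

For example, `check_reference_tar("Homo_sapiens_assembly38.tgz")` returns two lists; an empty first list means every required index suffix was found in the archive, and a non-empty second list means the recommended .64.alt index is absent.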