VariantRecord.stop is not always correct #966

Closed
zhouyangyu opened this issue Nov 6, 2020 · 2 comments

zhouyangyu commented Nov 6, 2020

I may have discovered a bug. I'm using pysam==0.16.0.1.

Given a test.vcf file like this:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=2>
##contig=<ID=3>
##INFO=<ID=END,Number=1,Type=Integer,Description="End position of the structural variant">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  I-H-108298-N1-1-D1-1    I-H-136057-T1-1-D1-1
2       199167883       .       N       N]3:1666499]    .       PASS    END=166649      brass_I-H-136057-T1-1-D1-1_PS:brass_I-H-136057-T1-1-D1-1_RC:gridss_REF:gridss_ASQ:gridss_ASRP:gridss_ASSR:gridss_BANRP:gridss_BANRPQ:gridss_BANSR:gridss_BANSRQ:gridss_BAQ:gridss_BASRP:gridss_BASSR:gridss_BQ:gridss_BSC:gridss_BSCQ:gridss_BUM:gridss_BUMQ:gridss_BVF:gridss_CASQ:gridss_IC:gridss_IQ:gridss_QUAL:gridss_RASQ:gridss_REFPAIR:gridss_RP:gridss_RPQ:gridss_SR:gridss_SRQ:gridss_VF      0:0:54:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:0:21:0:0:0:0:0   6:0:68:75.1:15:2:2:37.55:0:0:0:0:0:124.01:0:0:4:124.01:4:0:0:0:508.75:321.01:10:6:112.64:0:0:12                                         

running the following code:

import pysam

with pysam.VariantFile("test.vcf", mode="r") as vcf:
    for rec in vcf:
        print(rec.start, rec.stop)

expected output:

199167883 1666499

observed output:

199167882 199167883

Even though END=1666499, VariantRecord.stop is not correct on TRA (translocation) records: it should be 1666499, not 199167883. This is not a problem when using pysam==0.15.3.

Thanks for developing this tool, and I look forward to hearing from you.

@jmarshall
Member

rec.start to rec.stop would be the interval on chromosome 2 of this side of the structural variant. So it makes no sense to expect rec.stop to be unrelated to, and in fact enormously less than, rec.start.
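
The interval arithmetic can be sketched as follows. This is a minimal sketch and not pysam's implementation: `record_interval` is a hypothetical helper, and the fallback rule (use `start + len(REF)` when END is absent or lies before POS) is an assumption chosen because it reproduces the output observed in this issue.

```python
from typing import Optional, Tuple


def record_interval(pos: int, ref: str, end: Optional[int] = None) -> Tuple[int, int]:
    """Return a 0-based, half-open (start, stop) interval on the record's own contig.

    pos -- 1-based POS column from the VCF line
    ref -- REF allele string
    end -- INFO/END value (1-based, inclusive), if present

    Assumption: an END that lies before POS is ignored and the interval
    falls back to the length of REF.
    """
    start = pos - 1  # VCF POS is 1-based; start is 0-based
    if end is not None and end >= pos:
        return start, end  # 1-based inclusive END equals 0-based exclusive stop
    return start, start + len(ref)


# The record from this issue: POS=199167883, REF=N, END=166649
print(record_interval(199167883, "N", 166649))
```

With the values from the report this gives `(199167882, 199167883)`, the same pair as the observed output.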

Your record uses the END field invalidly. See samtools/hts-specs#425, PR samtools/hts-specs#436 et al.
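
In VCF breakend notation the mate's contig and position are encoded in the ALT string itself (here `N]3:1666499]`), so the partner location can be read from ALT rather than from rec.stop. A minimal sketch, assuming simple contig names and a single breakend ALT; `parse_breakend_mate` is a hypothetical helper, not part of pysam:

```python
import re
from typing import Optional, Tuple

# Matches the four breakend forms: t[p[, t]p], ]p]t, [p[t
# where p is "contig:position" and both brackets are the same character.
_BND = re.compile(r"^[A-Za-z.]*([\[\]])(.+):(\d+)\1[A-Za-z.]*$")


def parse_breakend_mate(alt: str) -> Optional[Tuple[str, int]]:
    """Return (mate_contig, mate_pos) for a breakend ALT, or None otherwise.

    mate_pos is returned as written in the ALT string (1-based).
    """
    match = _BND.match(alt)
    if match is None:
        return None
    return match.group(2), int(match.group(3))


print(parse_breakend_mate("N]3:1666499]"))
```

With pysam, the same function could be applied to each entry of `rec.alts` while iterating over a `VariantFile`.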

@zhouyangyu
Author

Thank you for your response. Closing this, as my problem does not stem from pysam.
