This repository has been archived by the owner on Dec 9, 2023. It is now read-only.

papaemmelab / mergeSVvcf: invalid chromosome issue with pysam #5

Open
yangyxt opened this issue Mar 17, 2021 · 12 comments

@yangyxt

yangyxt commented Mar 17, 2021

I tested a run with a simple VCF file from DELLY (to try converting all symbolic records to BND records), and it failed.
The specific error log is:

Traceback (most recent call last):
  File "/software/mergeSVvcf/1.0.2_python3.6.4/bin/mergesvvcf", line 11, in <module>
    load_entry_point('mergesvvcf==1.0.2', 'console_scripts', 'mergesvvcf')()
  File "/software/mergeSVvcf/1.0.2_python3.6.4/lib/python3.6/site-packages/mergesvvcf/__init__.py", line 36, in main
    debug=args.debug)
  File "/software/mergeSVvcf/1.0.2_python3.6.4/lib/python3.6/site-packages/mergesvvcf/mergedfile.py", line 310, in merge
    vcfrec.contig = avgloc1.chrom
  File "pysam/libcbcf.pyx", line 3061, in pysam.libcbcf.VariantRecord.contig.__set__
ValueError: Invalid chromosome/contig

However, I checked that the contig names in the VCF records are consistent with the header meta-lines, so I'm not sure what's wrong.
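
For anyone who wants to repeat that consistency check, here is a minimal sketch in plain Python (no pysam required). It assumes an uncompressed VCF held as text, and the helper name `undeclared_contigs` is made up for illustration. It collects the IDs declared in the ##contig header lines and reports any CHROM value in the records that has no declaration.

```python
import re

# Hypothetical helper for illustration; not part of mergeSVvcf or pysam.
CONTIG_ID = re.compile(r"^##contig=<(?:.*?,)?ID=([^,>]+)")


def undeclared_contigs(vcf_text):
    """Return the set of CHROM values that have no ##contig header line."""
    declared, used = set(), set()
    for line in vcf_text.splitlines():
        if line.startswith("##contig="):
            match = CONTIG_ID.match(line)
            if match:
                declared.add(match.group(1))
        elif line and not line.startswith("#"):
            # First tab-separated column of a record is CHROM.
            used.add(line.split("\t", 1)[0])
    return used - declared


# Made-up example data: chrX is used in a record but never declared.
example = "\n".join([
    "##fileformat=VCFv4.2",
    "##contig=<ID=chr1,length=248956422>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t1000\tDEL001\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000",
    "chrX\t5000\tDEL002\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=6000",
])
print(undeclared_contigs(example))
```

An empty result only shows that the input file is self-consistent; it says nothing about names that a tool changes internally after reading the file.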

I am providing a simple VCF to reproduce the bug. Could you please take a look at it? @ljdursi @mflevine

@yangyxt
Author

yangyxt commented Mar 17, 2021

Sorry, I forgot to attach the file. Please check the VCF file below. Note that I changed the extension to .txt so that it could be uploaded.
H19080A.delly.txt

@ljdursi
Owner

ljdursi commented Mar 18, 2021

Thanks for letting me know! I'll look into it this weekend.

@yangyxt
Author

yangyxt commented Mar 22, 2021

Thanks for your response, and for your time. Could you please let me know whether this is likely to be a simple fix, or whether I should expect to wait a few days?

To be honest, this is a critical step in my workflow: I need to inspect the format-normalization result before writing the downstream pipelines.

BTW, do you happen to know of any tool that can convert BND-notation VCF records back to symbolic SV records?

Again, thank you very much for making this tool.

@ljdursi
Owner

ljdursi commented Mar 22, 2021

Hi @yangyxt - I just ran

mergevcf -o foo.vcf H19080A.delly.vcf

I haven't examined the output in any detail but it definitely ran and generated plausible-seeming output. Could you share with me the exact command you ran to get the 'Invalid chromosome/contig' error message?

@YY-SONG0718

YY-SONG0718 commented Mar 23, 2021

Dear maintainer,

I am experiencing the same issue with mergeSVvcf.

When I was trying to use mergevcf to process chrX, it returned an assertion error, but other chromosomes were fine. #6

So I tried mergeSVvcf instead, and it returned the Invalid chromosome/contig error. This time, other chromosomes failed with the same error as well.

Traceback (most recent call last):
  File "/anaconda3/bin/mergesvvcf", line 33, in <module>
    sys.exit(load_entry_point('mergesvvcf==1.0.2', 'console_scripts', 'mergesvvcf')())
  File "/mergeSVvcf/mergesvvcf/__init__.py", line 31, in main
    mergedfile.merge(input_files, labels, args.sv, args.output,
  File "/mergeSVvcf/mergesvvcf/mergedfile.py", line 310, in merge
    vcfrec.contig = avgloc1.chrom
  File "pysam/libcbcf.pyx", line 3061, in pysam.libcbcf.VariantRecord.contig.__set__
ValueError: Invalid chromosome/contig

I am attaching a chrX VCF to help you reproduce the problem.
test_chrX.txt

I am using Python 3.8.5 this time because mergeSVvcf failed to build under Python 2.7.15.

Kind regards,
Yuyao

@yangyxt
Author

yangyxt commented Mar 23, 2021

Dear @ljdursi

I just ran

mergesvvcf -o foo.vcf H19080A.delly.vcf

and I still hit the same error (screenshot attached).

@yangyxt
Author

yangyxt commented Mar 23, 2021

BTW, I installed the mergeSVvcf fork, not the main branch, so maybe this is related to @mflevine?

@yangyxt
Author

yangyxt commented Mar 23, 2021

Dear Yuyao,
I think the mergeSVvcf fork is built for Python 3.6 rather than Python 2.7.

@YY-SONG0718

YY-SONG0718 commented Mar 23, 2021

Hi yang,

Thanks for pointing this out. mergeSVvcf built and installed successfully under Python 3.6.13. However, the same issue persists for both chrX and chr1, which I just tested.

Traceback (most recent call last):
  File "anaconda3/envs/mergeSVvcf/bin/mergesvvcf", line 33, in <module>
    sys.exit(load_entry_point('mergesvvcf==1.0.2', 'console_scripts', 'mergesvvcf')())
  File "anaconda3/envs/mergeSVvcf/lib/python3.6/site-packages/mergesvvcf-1.0.2-py3.6-linux-x86_64.egg/mergesvvcf/__init__.py", line 36, in main
    debug=args.debug)
  File "anaconda3/envs/mergeSVvcf/lib/python3.6/site-packages/mergesvvcf-1.0.2-py3.6-linux-x86_64.egg/mergesvvcf/mergedfile.py", line 310, in merge
    vcfrec.contig = avgloc1.chrom
  File "pysam/libcbcf.pyx", line 3061, in pysam.libcbcf.VariantRecord.contig.__set__
ValueError: Invalid chromosome/contig

I am using pysam 0.16.0.1.

Kind regards,
Yuyao

@mflevine

I've narrowed the bug down to GRCh vs. hg contig naming. Contigs seem to be stripped of the chr prefix internally, so they no longer match the names declared in the header, which is why pysam complains.
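
To illustrate the mismatch, below is a small pure-Python model of what seems to happen. It does not call pysam: `HeaderModel` and `match_header_name` are hypothetical names, and the ValueError only imitates the message in the tracebacks above. The second function sketches one possible fix, which is to map the stripped name back onto whichever spelling the header declares.

```python
class HeaderModel:
    """Toy stand-in for a VCF header: it only knows its declared contigs."""

    def __init__(self, contigs):
        self.contigs = set(contigs)

    def check(self, name):
        # Imitates the rejection of a contig the header does not declare.
        if name not in self.contigs:
            raise ValueError("Invalid chromosome/contig")
        return name


def match_header_name(name, header):
    """Return the header's spelling of `name`, with or without 'chr'."""
    bare = name[3:] if name.startswith("chr") else name
    for candidate in (name, bare, "chr" + bare):
        if candidate in header.contigs:
            return candidate
    raise ValueError("Invalid chromosome/contig")


header = HeaderModel(["chr1", "chrX"])  # hg-style names in the header
stripped = "chrX"[3:]                   # "X", as after the chr stripping

try:
    header.check(stripped)
except ValueError as err:
    print("rejected:", err)

print(match_header_name(stripped, header))
```

This sketch only covers the chr prefix. Names that differ in other ways, such as chrM versus MT, would need an explicit mapping.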

@mflevine

@ljdursi it seems the chr stripping comes from the original code. Is there a specific reason for it, or do you think it can be safely removed?

@yangyxt the fork should be compatible with both Python 2 and 3.

@yangyxt
Author

yangyxt commented Mar 24, 2021

Thanks for the info! I did manage to run mergeSVvcf after stripping the chr prefix from the VCF before passing it in. It would be great if mergeSVvcf could handle this more flexibly in the future. Thanks!
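
The workaround described above can be sketched roughly as follows. This is only an illustration under assumptions: a plain-text VCF, contig names that differ from the GRCh style only by a leading "chr", and a hypothetical helper name `strip_chr_prefix`. It rewrites the ##contig header IDs, the CHROM column, mate positions in BND-style ALT fields, and the CHR2 INFO key that DELLY output typically carries.

```python
import re


# Hypothetical helper for illustration; not part of mergeSVvcf.
def strip_chr_prefix(vcf_text):
    """Return VCF text with a leading 'chr' removed from contig names."""
    out = []
    for line in vcf_text.splitlines():
        if line.startswith("##contig="):
            # Header declaration, e.g. ##contig=<ID=chr1,length=...>
            line = re.sub(r"\bID=chr", "ID=", line, count=1)
        elif line and not line.startswith("#"):
            fields = line.split("\t")
            if fields[0].startswith("chr"):
                fields[0] = fields[0][3:]
            if len(fields) > 4:
                # Mate position in breakend ALT alleles, e.g. N[chr2:321[
                fields[4] = re.sub(r"([\[\]])chr", r"\1", fields[4])
            if len(fields) > 7:
                # Second-chromosome INFO key, e.g. CHR2=chr2
                fields[7] = re.sub(r"(^|;)CHR2=chr", r"\1CHR2=", fields[7])
            line = "\t".join(fields)
        out.append(line)
    result = "\n".join(out)
    # Keep the trailing newline if the input had one.
    return result + "\n" if vcf_text.endswith("\n") else result


# Made-up example record with a breakend ALT and a CHR2 INFO key.
example = "\n".join([
    "##contig=<ID=chr1,length=248956422>",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\tBND001\tN\tN[chr2:321[\t.\tPASS\tSVTYPE=BND;CHR2=chr2",
])
print(strip_chr_prefix(example))
```

The merged output will then use the stripped names too, so the prefix has to be added back afterwards if downstream tools expect hg-style contigs.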
