Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variant causing server error #518

Open
Peter-J-Freeman opened this issue Aug 10, 2023 · 11 comments
Open

Variant causing server error #518

Peter-J-Freeman opened this issue Aug 10, 2023 · 11 comments

Comments

@Peter-J-Freeman
Copy link
Collaborator

Describe the bug
image

@Peter-J-Freeman
Copy link
Collaborator Author

@leicray, we need a test suite for these. I think that if the boundary is correct, but just badly numbered by the user in terms of the direction they went into the intron, it is valid to correct

@Peter-J-Freeman
Copy link
Collaborator Author

Hello,

I think these are very good fixes.
""ExonBoundaryError: Position c.791-802 has been updated to position to 790+532 ensuring correct HGVS numbering for transcript NM_000086.2" is now an informative message for what has been updated and why. It is also wonderful that VariantValidator offers this update automatically, since calculating and comparing these distances manually would be really cumbersome.

Thank you also for the explanation for the UCSC Genome Browser link. It is now clear why those chr16:28,480,676-28,483,512 coordinates are shown. This is also now clear in the browser since you can see where the actual variant is. Without the VariantValidator track I just expected that the whole view is the variant, which would have been incorrect.

Thank you!

@leicray
Copy link
Contributor

leicray commented Aug 14, 2023

Could you elaborate on what is needed regarding a "test suite"?

@Peter-J-Freeman
Copy link
Collaborator Author

We need a few examples where the user has specified a base in an intron but has counted in the wrong direction, like the example above. I will then add tests to make sure VV handles these correctly.

The example above shows the used defining the incorrect start position but a correct exon boundary (791) and counting backwards into the intron (791-X) when they should have counted forwards into the intron (790+X).

So the reciprocal would be a variant in the back half of an intron but the user has counted forwards from the (790+X) when they should have counted backwards (791-X). I think we only need one example. I can then use these 2 examples and embed them into the variant end position to create 4 separate tests to take care of all eventualities.

Actually, do we need tests where both the start and end of the variant are ill defined do you think @leicray ?

@leicray
Copy link
Contributor

leicray commented Aug 15, 2023

Here are some test descriptions:

Valid variants near splice sites:

NM_000088.4:c.472-1_472del
NM_000088.4:c.543_543+1del
NM_000088.4:c.2560-3_2561dup
NM_000088.4:c.2608_2613+5 del
NM_000088.4:c.2559+1_2559+44del

These each validate correctly.

Invalid variants near splice sites:

NM_000088.4:c.472-1_472+1del - ExonBoundaryError: Position c.472+1 does not correspond with an exon boundary for transcript NM_000088.4
NM_000088.4:c.543-1_543+1del - ExonBoundaryError: Position c.543-1 does not correspond with an exon boundary for transcript NM_000088.4
NM_000088.4:c.543_453+1del - Interval end position 453 < interval start position 543
NM_000088.4:c.2560-3_2560+3dup - ExonBoundaryError: Position c.2560+3 does not correspond with an exon boundary for transcript NM_000088.4
NM_000088.4:c.2608_2608+5del - ExonBoundaryError: Position c.2608+5 does not correspond with an exon boundary for transcript NM_000088.4

An invalid variant with an unclear error message:

NM_000088.4:c.2559+1_2599+54del - ExonBoundaryError: Position c.2559+54 does not correspond with an exon boundary for transcript NM_000088.4

The issue here is that the intron length downstream of c,2559 is 88 nucleotides. The +54 position does not formally exist, but there perhaps ought to be automatic correction of the variant description to NM_000088.4:c.2559+2_2560-34del which is the correct description.

@Peter-J-Freeman
Copy link
Collaborator Author

Thanks @leicray. To clarify, you are happy with the warnings here

Invalid variants near splice sites:

NM_000088.4:c.543-1_543+1del - ExonBoundaryError: Position c.543-1 does not correspond with an exon boundary for transcript NM_000088.4
NM_000088.4:c.543_453+1del - Interval end position 453 < interval start position 543
NM_000088.4:c.2560-3_2560+3dup - ExonBoundaryError: Position c.2560+3 does not correspond with an exon boundary for transcript NM_000088.4
NM_000088.4:c.2608_2608+5del - ExonBoundaryError: Position c.2608+5 does not correspond with an exon boundary for transcript NM_000088.4

but 2599+54 should be 2560-34del

That's ideal because we are already handling NM_000086.2(CLN3):c.791-802_1056+1445del and warning

ExonBoundaryError: Position c.791-802 has been updated to position to 790+532 ensuring correct HGVS numbering for transcript NM_000086.2

So, we should also be catching

ExonBoundaryError: Position c.2599+54 has been updated to position to 2560-34 ensuring correct HGVS numbering for transcript NM_000088.4

I can make appropriate tests from these. Thanks

@leicray
Copy link
Contributor

leicray commented Aug 15, 2023

I am happy with the error messages that you listed as a group.

The example that you give for locations either side of an intron mid point suggests that the correction works in one direction, at present, but not the other.

@Peter-J-Freeman
Copy link
Collaborator Author

Yes, think that is currently the case. I have set the code up, but the flow seems to be broken. Fixing now. Thanks for the examples. Can make it happen now

@Peter-J-Freeman
Copy link
Collaborator Author

NM_000086.2:c.791-802_1056+1445del

[
            "ExonBoundaryError: Position c.791-802 has been updated to position to 790+532 ensuring correct HGVS numbering for transcript NM_000086.2"
        ],

NM_000088.4:c.2559_2559+54del

[
            "ExonBoundaryError: Position c.2559+54 has been updated to position to 2560-35 ensuring correct HGVS numbering for transcript NM_000088.4"
        ]

NM_000086.2:c.790_791-802del

[
            "ExonBoundaryError: Position c.791-802 has been updated to position to 790+532 ensuring correct HGVS numbering for transcript NM_000086.2",
            "NM_000086.2:c.790_790+532del normalized to NM_000086.2:c.790+1_790+533del"
        ]

NM_000088.4:c.2559+54_2560del

[
            "ExonBoundaryError: Position c.2559+54 has been updated to position to 2560-35 ensuring correct HGVS numbering for transcript NM_000088.4"
        ],

NM_000088.4:c.2559+54_2559+55del

[
            "ExonBoundaryError: Position c.2559+54 has been updated to position to 2560-35 ensuring correct HGVS numbering for transcript NM_000088.4",
            "ExonBoundaryError: Position c.2559+55 has been updated to position to 2560-34 ensuring correct HGVS numbering for transcript NM_000088.4"
        ],


NM_000086.2:c.791-803_791-802del

[
            "ExonBoundaryError: Position c.791-803 has been updated to position to 790+531 ensuring correct HGVS numbering for transcript NM_000086.2",
            "ExonBoundaryError: Position c.791-802 has been updated to position to 790+532 ensuring correct HGVS numbering for transcript NM_000086.2"
        ]


@leicray Does this look comprehensive?

@leicray
Copy link
Contributor

leicray commented Aug 15, 2023

All looks good to me.

@Peter-J-Freeman
Copy link
Collaborator Author

Ace. Writing tests then will link in the push and close

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants