gene2transcripts API: Gene Panel Functionality #516

Open
Sophiaj93 opened this issue Jul 14, 2023 · 4 comments

@Sophiaj93

Describe the solution you'd like
It might be useful to have some functionality around gene panels for the gene2transcripts web page/API...

  • Ability to paste a list of genes on the webpage and return transcripts for all of them
  • Ability to use a PanelApp ID as input on the webpage/API and return transcripts for all genes in the panel. It would be important to also be able to specify the panel version (e.g. a panel version number or the latest signed-off version) and to select the confidence level for the genes (e.g. include only Green, or Green & Amber)
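
The confidence-level selection requested above could be sketched as follows. This is only an illustration: the record shape (`symbol`, `confidence_level`) and the numeric coding of Green/Amber/Red as 3/2/1 are assumptions, the helper name `select_genes` is hypothetical, and the sample panel is made up rather than taken from a real PanelApp panel.

```python
# Assumed mapping of confidence colours to numeric levels (not confirmed by this thread).
CONFIDENCE = {"green": 3, "amber": 2, "red": 1}


def select_genes(panel_genes, include=("green",)):
    """Return sorted, de-duplicated gene symbols whose confidence colour is in `include`."""
    wanted = {CONFIDENCE[colour.lower()] for colour in include}
    return sorted(
        {g["symbol"] for g in panel_genes if int(g["confidence_level"]) in wanted}
    )


# Made-up panel content, for illustration only.
panel = [
    {"symbol": "GENE_A", "confidence_level": "3"},
    {"symbol": "GENE_B", "confidence_level": "3"},
    {"symbol": "GENE_C", "confidence_level": "2"},
    {"symbol": "GENE_D", "confidence_level": "1"},
]

green_only = select_genes(panel)
green_and_amber = select_genes(panel, include=("green", "amber"))
print(green_only)
print(green_and_amber)
```

The resulting gene list could then be fed to gene2transcripts in the same way as a pasted list of genes.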

Describe alternatives you've considered
I have written a Python script that loops through a list of genes from a 'panel', makes a gene2transcripts_v2 API call for each gene, extracts the information that I want (transcript reference ID, chromosome, exon genomic start/end positions, orientation) into a dictionary, and writes a BED file containing this information for all the genes in the panel.

This works very well for my current requirements (where I need an automated script and not all genes in the panel are necessarily from PanelApp), however, I think that having some in-built panel functionality in VV could be useful in different contexts in the future.
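
A minimal sketch of the kind of script described above, not the author's actual code. The response field names (`transcripts`, `reference`, `genomic_spans`, `exon_structure`, `genomic_start`, `genomic_end`, `orientation`, `exon_number`) are assumptions about the gene2transcripts_v2 JSON, and `fetch` is a hypothetical callable that returns the decoded response for one gene, so the conversion logic can be run without network access. The sample record and its coordinates are made up.

```python
import io


def transcript_to_bed_rows(transcript):
    """Convert one transcript record into BED rows: chrom, start, end, name, score, strand."""
    rows = []
    for chrom, span in transcript["genomic_spans"].items():
        strand = "+" if span["orientation"] == 1 else "-"
        for exon in span["exon_structure"]:
            # Order the coordinates defensively, then convert 1-based inclusive
            # positions to BED's 0-based half-open interval.
            low = min(exon["genomic_start"], exon["genomic_end"])
            high = max(exon["genomic_start"], exon["genomic_end"])
            name = f'{transcript["reference"]}_exon{exon["exon_number"]}'
            rows.append((chrom, low - 1, high, name, 0, strand))
    return rows


def panel_to_bed(genes, fetch, out):
    """Write BED lines for every gene in `genes` to the file-like `out`; return the line count."""
    written = 0
    for gene in genes:
        for record in fetch(gene):  # assumed: a list of gene records per call
            for transcript in record.get("transcripts", []):
                for row in transcript_to_bed_rows(transcript):
                    out.write("\t".join(str(field) for field in row) + "\n")
                    written += 1
    return written


def fake_fetch(gene):
    """Stand-in for the API call; returns a made-up record in the assumed shape."""
    return [{"transcripts": [{
        "reference": "NM_000000.0",
        "genomic_spans": {"NC_000001.11": {
            "orientation": -1,
            "exon_structure": [
                {"exon_number": 1, "genomic_start": 100, "genomic_end": 200},
            ],
        }},
    }]}]


buf = io.StringIO()
count = panel_to_bed(["GENE_A"], fake_fetch, buf)
print(buf.getvalue(), end="")
```

Passing the fetch function in as an argument keeps the BED logic independent of how, or how fast, the API is called.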

@leicray
Contributor

leicray commented Jul 16, 2023

PanelApp https://nhsgms-panelapp.genomicsengland.co.uk/ is probably familiar only to a small subset of our users who will be mostly UK-based and are involved with Genomics England projects. As an indicator of the somewhat limited global impact of PanelApp, a search of PubMed for the term "panelapp" yields just one result: https://pubmed.ncbi.nlm.nih.gov/34329581/.

PanelApp panels are versioned. If we were to implement support for PanelApp panels it would have to be done in a fashion that complies with our strict versioning schema. We never update data that are used in analyses "on-the-fly", say by accessing an API on another site. This ensures that results that are returned to the user are accompanied by full details of the versions of software and data that were used in the analysis. It does appear that PanelApp panels are not updated too frequently, so that would make the implementation process easier.

However, we would need access, ideally, to plain-text lists of the data for each panel. This would ensure that we could easily transform the data for use within our existing processes. Can you point us to a source of plain-text files for each panel?

To be clear, this is not a promise that we will implement support for PanelApp.

You might be interested to know that we intend to provide support for Ensembl transcripts by the end of the year.

@Peter-J-Freeman
Collaborator

@Sophiaj93 Code is now live, ready for testing. https://rest.variantvalidator.org/
gene2transcripts_v2

@Sophiaj93
Author

Hi Pete,

Sorry for the huge delay in looking at this. Thanks again for adding these features!

I've just updated my code based on the new version of the gene2transcripts_v2 endpoint and all is working well. The genome build filter in particular is really helpful.

I haven't properly implemented the "|" delimited genes list in my code yet but have tried it out via the URL. Can't see any issues.

Thanks
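
A hedged sketch of how the "|" delimited gene list could be used from code rather than typed into the URL. The path layout after the host and the default values `mane_select`, `refseq` and `GRCh38` are assumptions that should be checked against the documentation at https://rest.variantvalidator.org/; `build_url` and `batches` are hypothetical helper names.

```python
from urllib.parse import quote

# Assumed endpoint path; only the host and endpoint name appear in this thread.
BASE = "https://rest.variantvalidator.org/VariantValidator/tools/gene2transcripts_v2"


def build_url(genes, limit_transcripts="mane_select", transcript_set="refseq",
              genome_build="GRCh38"):
    """Join gene symbols with '|' and build one request URL for the whole group."""
    gene_query = quote("|".join(genes), safe="")  # '|' is percent-encoded as %7C
    return f"{BASE}/{gene_query}/{limit_transcripts}/{transcript_set}/{genome_build}"


def batches(items, size):
    """Yield consecutive slices of `items` holding at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


genes = ["GENE_A", "GENE_B", "GENE_C"]
urls = [build_url(group) for group in batches(genes, 2)]
for url in urls:
    print(url)
```

Batching keeps each request to a handful of genes, which may help individual calls stay within the server's response timeout.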

@Peter-J-Freeman
Collaborator

Actually, your timing is perfect. I have been implementing some server changes to try to speed up processing. If you get in touch by email when you want to run a larger job, I will monitor it and make changes as required.

Lessons learned last week

  • Where possible apply transcript filtering. This will speed up calls
  • Where possible apply genome build selection (again, speeds up the response)
  • The response timeout is set to 5 minutes, and gene2transcripts can be one of the slower endpoints, so limit the rate of requests to ~1-3 per second. Feel free to play with this though

Hope this helps for now
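
The rate and timeout advice above could be applied on the client side roughly as follows. This is a sketch using only the Python standard library, not code from VariantValidator: `http_get_json` and `fetch_all` are hypothetical helpers, the 300-second timeout mirrors the 5-minute server limit, and the default of 2 requests per second is simply a value inside the suggested 1-3 range.

```python
import json
import time
import urllib.request


def http_get_json(url, timeout=300):
    """GET `url` and decode the JSON body; `timeout` is the socket timeout in seconds."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def fetch_all(urls, get=http_get_json, per_second=2.0,
              sleep=time.sleep, clock=time.monotonic):
    """Call `get` for each URL, starting requests no faster than `per_second`."""
    min_gap = 1.0 / per_second
    results = {}
    last_start = None
    for url in urls:
        if last_start is not None:
            wait = min_gap - (clock() - last_start)
            if wait > 0:
                sleep(wait)  # pause so consecutive requests are spaced out
        last_start = clock()
        results[url] = get(url)
    return results
```

Because `get`, `sleep` and `clock` are parameters, the throttling can be tried out with stand-in functions before any real requests are sent.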
