Include Ensembl Tark API for reference retrieval #21

Merged (26 commits) on Apr 30, 2024

@XLIU-hub (Contributor) commented Apr 25, 2024

This pull request enables retrieval of references from older Ensembl versions. It resolves #16.

Currently, Mutalyzer uses the GRCh38 and GRCh37 Ensembl REST APIs, which only support reference ID queries for the newest version of each reference. This PR adds a new API, Ensembl Tark, which also supports queries for older versions.

Note that Tark supports only human transcripts. In the future, it could be extended to genes and other organisms.

Because Tark returns only exon sequences in its response, we still use the existing Ensembl REST APIs to retrieve the entire transcript sequences (introns included).
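As a rough illustration of the two lookups described above, the sketch below only builds the request URLs and sends nothing. The helper names are hypothetical and not taken from this PR. The Tark path and its `stable_id`, `stable_id_version` and `expand_all` parameters are assumptions about the public Tark API, so check them against the Tark documentation before relying on them.

```python
from urllib.parse import urlencode

# Public Ensembl REST servers per assembly.
REST_BASE = {
    "GRCh38": "https://rest.ensembl.org",
    "GRCh37": "https://grch37.rest.ensembl.org",
}
# Assumed base URL of the Tark API.
TARK_BASE = "https://tark.ensembl.org/api"


def split_versioned_id(reference_id):
    """Split 'ENST00000269305.9' into ('ENST00000269305', 9).

    The version is None when the ID carries no '.<version>' suffix.
    """
    stable_id, _, version = reference_id.partition(".")
    return stable_id, int(version) if version else None


def tark_transcript_url(reference_id):
    """URL for a specific (possibly older) transcript version in Tark (assumed parameters)."""
    stable_id, version = split_versioned_id(reference_id)
    params = {"stable_id": stable_id, "expand_all": "true"}
    if version is not None:
        params["stable_id_version"] = version
    return f"{TARK_BASE}/transcript/?{urlencode(params)}"


def rest_sequence_url(reference_id, assembly="GRCh38"):
    """URL for the genomic (exons plus introns) sequence from Ensembl REST."""
    stable_id, _ = split_versioned_id(reference_id)
    return (
        f"{REST_BASE[assembly]}/sequence/id/{stable_id}"
        "?type=genomic&content-type=application/json"
    )


print(tark_transcript_url("ENST00000269305.9"))
print(rest_sequence_url("ENST00000269305.9", assembly="GRCh37"))
```

The version suffix is dropped for the REST request because, as noted above, those servers answer only for the newest version. How the retrieved sequence is then combined with the Tark exon data is not covered by this sketch.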

Changes

  • Add Ensembl reference source options to the command line interface (users can choose between ensembl, ensembl_rest and ensembl_tark).
  • Convert the ensembl_tark JSON output into the retrieval model.
  • Add tests to check that the new functionality works for ensembl, ensembl_rest and ensembl_tark.
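For readers unfamiliar with the three source names, here is a minimal sketch of how such a choice could be exposed and dispatched. It is not the code in this PR: the program name, the `--id` and `--source` flags, and the fallback order used for the generic `ensembl` option are all assumptions made for illustration.

```python
import argparse

# The three source names listed in this PR.
ENSEMBL_SOURCES = ("ensembl", "ensembl_rest", "ensembl_tark")


def build_parser():
    """Hypothetical parser; flag names are illustrative, not the real CLI."""
    parser = argparse.ArgumentParser(prog="retriever-sketch")
    parser.add_argument("--id", required=True, help="reference ID, e.g. ENST00000269305.9")
    parser.add_argument("--source", choices=ENSEMBL_SOURCES, default="ensembl")
    return parser


def backends_for(source):
    """Return the backends to try, in order.

    Assumption: the generic 'ensembl' option tries REST first and falls
    back to Tark; the explicit options pin a single backend.
    """
    if source == "ensembl":
        return ["ensembl_rest", "ensembl_tark"]
    return [source]


args = build_parser().parse_args(["--id", "ENST00000269305.9", "--source", "ensembl_tark"])
print(args.id, backends_for(args.source))
```

Restricting the option with `choices` makes argparse reject unknown source names before any request is made.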

@mihailefter mihailefter merged commit b623aa5 into master Apr 30, 2024
6 checks passed
Linked issue: Support for older Ensembl versions