queries fail for some uniprot accessions #128
Comments
Just to add a tiny bit more info: I suspect the difference in behavior between these accessions comes from the source data. The source file for the uniprot data plugin appears to be https://ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/idmapping_selected.tab.gz. From the README, the column headings for this file are as follows:
Note the difference in the records below in column 3 which should have a mapping to Entrez Gene.
This difference can also be seen on the corresponding UniProt web pages
Having said that, the reciprocal links do exist in NCBI Gene (likely through a mapping to RefSeq Protein):
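The column 3 comparison described in this comment could be reproduced locally with something like the sketch below. It assumes a downloaded copy of `idmapping_selected.tab.gz`, that the file is tab-separated with the UniProtKB accession in column 1 and the Entrez GeneID in column 3 (as the comment says of the README), and the function names are made up for illustration.

```python
import gzip

# 0-based positions of the two columns this sketch reads.
# Assumption: column 1 is the UniProtKB accession, column 3 the Entrez GeneID.
ACCESSION_COL = 0
GENEID_COL = 2


def entrez_mappings(lines, accessions):
    """Map each requested accession found in `lines` to its column 3 value.

    An empty string means the row exists but carries no Entrez Gene mapping.
    Accessions that never appear in `lines` are absent from the result.
    """
    wanted = set(accessions)
    found = {}
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if cols[ACCESSION_COL] in wanted:
            gene_id = cols[GENEID_COL] if len(cols) > GENEID_COL else ""
            found[cols[ACCESSION_COL]] = gene_id
    return found


def scan_idmapping(path, accessions):
    """Stream the gzipped mapping file from disk and collect column 3 values."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return entrez_mappings(handle, accessions)
```

Calling `scan_idmapping("idmapping_selected.tab.gz", ["P63044", "P23819"])` would then show whether either accession has an empty GeneID column in the release being inspected.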
Some uniprot accessions are not available for querying nor as output in the "uniprot" field/scope. To illustrate I've included 2 examples, one accession that works (P63044) and one that fails (P23819).
this works via https://mygene.info/v3/api#/query/get_query ;
"q" input: P63044
"fields" input: symbol,name,taxid,entrezgene,uniprot
returns:
this works via https://mygene.info/v3/api#/query/get_query ;
in "q" input: P23819
in "fields" input: symbol,name,taxid,entrezgene,uniprot
and returns:
However, note that for the latter query, the UniProt ID that I queried (a Swiss-Prot record) is not included in the "uniprot" output field! So there seems to be a problem with the mygene.info database: possibly a subset of UniProt accessions is not stored or linked under "uniprot". Affected examples include P23819, Q61941, and Q8VHW2.
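A scripted version of this check might look like the sketch below. It assumes the GET endpoint behind the linked API page is `https://mygene.info/v3/query`, that the response carries a `hits` list, and that each hit's `uniprot` field is an object whose values are either a single accession or a list of accessions. The helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint behind the "get_query" page linked in this issue.
BASE_URL = "https://mygene.info/v3/query"
FIELDS = "symbol,name,taxid,entrezgene,uniprot"


def build_get_url(accession, fields=FIELDS):
    """Build the GET URL with the same "q" and "fields" inputs used above."""
    params = urllib.parse.urlencode({"q": accession, "fields": fields})
    return BASE_URL + "?" + params


def uniprot_ids(hit):
    """Flatten a hit's "uniprot" field into a set of accessions.

    Assumption: the field is a dict whose values are a string or a list.
    """
    ids = set()
    for value in (hit.get("uniprot") or {}).values():
        ids.update([value] if isinstance(value, str) else value)
    return ids


def hits_missing_accession(accession, fields=FIELDS):
    """Return the hits that matched the query but do not list the accession."""
    with urllib.request.urlopen(build_get_url(accession, fields)) as resp:
        hits = json.load(resp).get("hits", [])
    return [hit for hit in hits if accession not in uniprot_ids(hit)]
```

If the behavior reported here still holds, `hits_missing_accession("P23819")` would return a non-empty list while `hits_missing_accession("P63044")` would return an empty one.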
Furthermore, POST queries against these accessions fail even though they should not (probably same root cause).
this works via https://mygene.info/v3/api#/query/post_query ;
{ "q": "P63044", "scopes": "uniprot" }
returns:
this query fails, but it should not, as this is a valid uniprot accession that is in the mygene.info dataset (see GET query above) ;
{ "q": "P23819", "scopes": "uniprot" }
returns:
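Both POST queries can be sent in one batch to compare them side by side. The sketch below assumes the POST endpoint is `https://mygene.info/v3/query`, that it accepts form-encoded `q`, `scopes` and `fields` parameters, and that unmatched terms come back as entries with a `query` key and a truthy `notfound` flag. The function names are illustrative only.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint behind the "post_query" page linked in this issue.
POST_URL = "https://mygene.info/v3/query"


def build_post_request(accessions, scopes="uniprot",
                       fields="symbol,entrezgene,uniprot"):
    """Build (but do not send) a form-encoded batch query."""
    body = urllib.parse.urlencode({
        "q": ",".join(accessions),
        "scopes": scopes,
        "fields": fields,
    }).encode("ascii")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(POST_URL, data=body, headers=headers,
                                  method="POST")


def not_found(results):
    """List the query terms whose result entry is flagged as not found."""
    return [entry["query"] for entry in results if entry.get("notfound")]


def failing_accessions(accessions):
    """Send the batch query and return the accessions that did not match."""
    with urllib.request.urlopen(build_post_request(accessions)) as resp:
        return not_found(json.load(resp))
```

With the two accessions from this issue, `failing_accessions(["P63044", "P23819"])` would be expected to return only the second one for as long as the problem persists.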