Drug repurposing: use APIs for DOID-SNOMED traversal and for hposim #57

Open
cmungall opened this issue Oct 23, 2017 · 11 comments

@cmungall
Member

https://github.com/NCATS-Tangerine/cq-notebooks/blob/master/Orange_QB2_Other_CQs/Drug_Repurpose_By_Pheno/BTExplorer-QB2.3.ipynb

Convert SNOMED ID to DOID using Drugcentral doid_xref file

We shouldn't be dependent on files checked into github.

I thought this would be possible in scigraph, but there is an annoying blocker:
SciGraph/SciGraph#248

Another option is wikidata, but snomed xrefs don't seem to be there:
https://www.wikidata.org/wiki/Q206901

@stuppie / @putmantime is this a licensing issue? I would have thought it ok to put xrefs in, just not any further content. Maybe wd playing it safe?

Calculating phenotypic similarity

This can be done using the owlsim web API
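
For illustration only, phenotypic similarity between two diseases can be thought of as overlap between their phenotype term sets. The sketch below uses plain Jaccard overlap on phenotype IDs as a simplified stand-in. It is not the owlsim web API, and it ignores the ontology hierarchy that a semantic-similarity service would take into account. The function name and the HPO-style IDs are placeholders.

```python
def jaccard_similarity(phenotypes_a, phenotypes_b):
    """Jaccard overlap of two collections of phenotype IDs (0.0 to 1.0)."""
    set_a, set_b = set(phenotypes_a), set(phenotypes_b)
    if not set_a and not set_b:
        # Two empty profiles share nothing we can measure.
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


# Placeholder HPO-style IDs, not real disease profiles.
disease_1 = ["HP:0000001", "HP:0000002", "HP:0000003"]
disease_2 = ["HP:0000002", "HP:0000003", "HP:0000004"]
score = jaccard_similarity(disease_1, disease_2)
```

A real comparison would first expand each profile with its ancestor terms, so that closely related but non-identical phenotypes still contribute to the score.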

@TomConlin
Contributor

CM: "We shouldn't be dependent on files checked into github."

Why not?
They can be fetched remotely or cloned locally.
The cq notebooks already depend quite heavily on GitHub.
We can be sure the file is under 100M.
We have used it for the FA gene sets since the last hackathon.

not sure I see the point of the directive

@cmungall
Member Author

We should limit this.

  • the source database may change, but there is no procedure in place to update the files in github
  • it bypasses the APIs and one of our goals is to demonstrate doing things by APIs
  • it limits our ability to recapitulate the logic of the notebook using more generic mechanisms such as the biothings explorer or knowledge beacons: a human has to know of the existence of these files and their semantics

@cmungall
Member Author

To expand on the last point, this notebook demonstrates how you can ask the explorer "starting from a drug name, how do I get to the snomed ID of a disease that the drug denoted by the drug name treats?". This is good, but it would be more powerful if you could go even further: really what we want here is the phenotypes, so we'd like to just say "drug names to phenotypes of treated disease" and have it figure everything out. I assume that is what @kevinxin90 is aiming for here, but for pragmatic reasons (i.e. the fact that the only perceived way to do this is via files) the last part is done explicitly rather than automatically by the explorer.

As an aside, I think this also shows the need for relationship types. When we ask the explorer "connect drug names to snomed IDs", how do we know it is connecting to diseases in snomed? There are also snomed IDs for drugs, and plenty of other things. It is only an accident that the APIs used happen to give the desired answers here. Curious as to @newgene and @kevinxin90's thoughts on this.

@kevinxin90
Contributor

@cmungall For the first part, totally agreed. Ideally, we should connect directly from drug name to phenotype. However, the results from MyChem can't be linked directly to BioLink, given that MyChem only outputs 'SNOMED' while BioLink only takes 'DOID'. It would be great if there were an additional API to do that mapping.
For the second part, I think the question should be addressed by URI repos, e.g. 'identifiers.org' or 'prefixcommons'. A good example is 'KEGG', which has information on several kinds of bio-entities: drugs, reactions, pathways, genes, etc. 'identifiers.org' has a separate entry for each one, e.g. http://identifiers.org/kegg.compound/, http://identifiers.org/kegg.pathway/, http://identifiers.org/kegg.genes/. I think the same thing should be done for 'snomed', 'omim' or 'ensembl', where one database contains information on multiple kinds of bio-entities.
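
A minimal sketch of how a per-entity prefix could be used by a client: the entity type is read off the namespace part of the ID. The three KEGG prefixes come from the identifiers.org URLs above. The type labels, the registry dictionary and the function name are assumptions made for illustration, and the local ID in the usage line is a placeholder.

```python
# Hypothetical registry: namespace prefix -> entity type label.
PREFIX_TO_TYPE = {
    "kegg.compound": "compound",
    "kegg.pathway": "pathway",
    "kegg.genes": "gene",
}


def entity_type(curie):
    """Return the entity type implied by a CURIE's prefix, or None if unknown."""
    prefix, sep, local_id = curie.partition(":")
    if not sep or not local_id:
        raise ValueError(f"not a CURIE: {curie!r}")
    return PREFIX_TO_TYPE.get(prefix)


# A bare 'snomed' prefix carries no type information, so the lookup gives None.
kinds = [entity_type("kegg.pathway:placeholder"), entity_type("snomed:placeholder")]
```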

@stuppie
Contributor

stuppie commented Oct 24, 2017

Convert SNOMED ID to DOID using Drugcentral doid_xref file

Another option is wikidata, but snomed xrefs don't seem to be there
@stuppie / @putmantime is this a licensing issue? I would have thought it ok to put xrefs in, just not any further content. Maybe wd playing it safe?

Unfortunately, I think this is a licensing issue, as SNOMED has a very strict license covering everything, explicitly including identifiers.
See: https://www.wikidata.org/wiki/Wikidata:Property_proposal/sctid
Also, pinging @andrewsu

@stuppie
Contributor

stuppie commented Oct 24, 2017

@cmungall A crude workaround using an API:
Get the xrefs for a given DOID using OLS: example
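
A rough sketch of what such a lookup might look like in Python. The endpoint URL, the `obo_id` query parameter and the `_embedded` / `terms` / `obo_xref` / `database` / `id` field names are assumptions about the OLS response layout and should be checked against the OLS documentation. The function names are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the OLS API documentation before use.
OLS_TERMS_URL = "https://www.ebi.ac.uk/ols/api/ontologies/doid/terms"


def snomed_xrefs(ols_response):
    """Collect SNOMED IDs from an OLS-style term response (assumed layout)."""
    ids = []
    for term in ols_response.get("_embedded", {}).get("terms", []):
        # 'obo_xref' may be missing or null for terms without xrefs.
        for xref in term.get("obo_xref") or []:
            database = str(xref.get("database", ""))
            if database.upper().startswith("SNOMED") and xref.get("id"):
                ids.append(xref["id"])
    return ids


def fetch_snomed_xrefs(doid):
    """Network call: look up one DOID and return its SNOMED xrefs."""
    url = OLS_TERMS_URL + "?" + urlencode({"obo_id": doid})
    with urlopen(url) as resp:
        return snomed_xrefs(json.load(resp))
```

Keeping the parsing separate from the network call means the extraction logic can be checked against a saved response without hitting the service.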

@cmungall
Member Author

workaround

Thanks for the OLS example. Is it possible to go the opposite way? Or does that require OxO API?

Soon it should be possible to query for HPO phenotypes for a SNOMED ID from biolink directly (with the proviso that often the snomed term will be more general, e.g. less coverage of rare diseases). Of course, that takes some of the fun out of chaining APIs....

license

Thanks for the info.

@cmungall
Member Author

@kevinxin90 and @newgene - have you considered using a different vocabulary than SNOMED in MyChem? We can discuss some other options this week. You could use mondo as part of your MyChem ingest pipeline to map.

For the second part, I think the question should be addressed by URI repos

So I think what you're suggesting is IDs like snomed.disease:nnn, snomed.drug:nnnn. I'm not a big fan of this approach; we can discuss more this week.

@stuppie
Contributor

stuppie commented Oct 24, 2017

Is it possible to go the opposite way?

Not that I can see. Here are the docs.
Closest I can get is through search.
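
If the reverse lookup is not offered by the API, one possible workaround is to collect forward lookups (DOID to SNOMED) and invert them on the client side. A small sketch, assuming the forward results are already held in a dictionary; the function name and all IDs are placeholders.

```python
from collections import defaultdict


def invert_xrefs(doid_to_snomed):
    """Turn {DOID: [SNOMED IDs]} into {SNOMED ID: sorted list of DOIDs}."""
    reverse = defaultdict(set)
    for doid, snomed_ids in doid_to_snomed.items():
        for sctid in snomed_ids:
            reverse[sctid].add(doid)
    # Sort so the output is stable regardless of input order.
    return {sctid: sorted(doids) for sctid, doids in reverse.items()}


# Placeholder IDs: one SNOMED ID may be referenced by several DOID terms.
forward = {
    "DOID:0000001": ["S1", "S2"],
    "DOID:0000002": ["S2"],
}
snomed_to_doid = invert_xrefs(forward)
```

This only covers DOID terms that were actually fetched, so it does not replace a proper reverse query on the server.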

@cmungall
Member Author

Yep, same for scigraph API: SciGraph/SciGraph#248

@cmungall
Member Author

Following on from the discussion about overloading IDs, I see that there is a lot of really useful stuff in mychem, and that it's really well described by jsonld - nice work!

E.g. this is where the snomed IDs come from (via drugcentral)

https://github.com/NCATS-Tangerine/translator-api-registry/blob/master/mychem.info/jsonld_context/mychem_drug_1.1.json#L135-L147

But it looks like you can also get indications from aeolus (using meddra IDs):
http://mychem.info/v1/query?q=drugbank.name%3Ariluzole
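
For reference, the query URL above can be built programmatically, which keeps the `field:value` expression readable and leaves the percent-encoding to the standard library. The helper name is made up for this sketch; the base URL and the `q` parameter are taken from the link above.

```python
from urllib.parse import urlencode

MYCHEM_QUERY_URL = "http://mychem.info/v1/query"


def mychem_query_url(field, value):
    """Build a MyChem query URL of the form ...?q=<field>%3A<value>."""
    return MYCHEM_QUERY_URL + "?" + urlencode({"q": f"{field}:{value}"})


url = mychem_query_url("drugbank.name", "riluzole")
# url == "http://mychem.info/v1/query?q=drugbank.name%3Ariluzole"
```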

It would be good to expand this notebook to include this, but I guess we would need to expand the jsonld to describe it?

But here is my concern: if a database were to include contraindications, how would we annotate this in the jsonld? It seems the explorer approach just looks for some kind of path through the json from the query id to any id annotated with the relevant identifiers.org id. Without a predicate or relationship type connecting the two, we don't know what the semantics are. To distinguish drug->snomed "equivalent" from drug->snomed "indicated for", we can use different id prefixes for the two kinds of entities, but I'm not sure this approach can be extended to distinguish indications from contraindications.

(this concern also applies to things like negative annotations in GO in mygene)
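
To make the concern concrete, here is a small sketch of associations that carry an explicit predicate, so that a client can ask for indications without also picking up contraindications. The class, the predicate labels and all IDs are hypothetical and are not taken from the mychem jsonld.

```python
from typing import NamedTuple


class Association(NamedTuple):
    subject: str    # e.g. a drug ID
    predicate: str  # relationship type; labels here are made up
    object: str     # e.g. a disease ID


def objects_for(associations, subject, predicate):
    """Return objects linked to `subject` by exactly this predicate."""
    return [a.object for a in associations
            if a.subject == subject and a.predicate == predicate]


# Placeholder data: same drug, same target vocabulary, opposite meaning.
associations = [
    Association("DRUG:1", "indicated_for", "DISEASE:1"),
    Association("DRUG:1", "contraindicated_for", "DISEASE:2"),
]
indications = objects_for(associations, "DRUG:1", "indicated_for")
```

With ID prefixes alone, both rows would look like "drug to disease"; only the predicate tells them apart.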

Look forward to discussing these issues this week!

@kshefchek kshefchek removed their assignment Jul 12, 2019