Tool: AMIGO API for enrichment of gene for an organism #16

nathandunn · 2017-06-30T10:05:49Z

Want to emulate the behavior of this tool:

I think it actually calls this tool (which would also be fine), but I don't see a webservice for this (though maybe just a post is fine).

You can look at Amigo as well, but I think that is just wrapping panther:
http://wiki.geneontology.org/index.php/AmiGO_2_Web_Services#API_Documentation

@kltm is that true (the amigo enrichment is just calling Panther) or are you calling the GOLR backend? Are you doing a post to pantherdb for this or is there a hidden web-service that you are calling?

hexylena · 2017-06-30T12:23:35Z

This could be a data source, but as a tool this would be completely unreproducible. Is there any way we can reproduce the infrastructure in some offline manner?

nathandunn · 2017-06-30T12:38:33Z

I was just thinking of using it as a datasource.

That being said, it would be possible to run it locally, but I think the footprint / effort may be immense, and it would need to ingest / update data from a number of sources to be appropriately functional.

@kltm would know more.

hexylena · 2017-06-30T14:25:47Z

Yes, but that would make it reproducible. I assume term enrichment is usually done as a step in a workflow: it isn't the source of data, it's a processing step. Non-reproducible processing steps are not good, so if there's a way to work around this by having local databases that are run for a query, then this becomes an attractive proposition for a Galaxy tool.

nathandunn · 2017-06-30T14:40:26Z

@erasche That is an excellent point. I think the way you get around that is to provide data versions during the query to amigo, which I don't think it supports. (is this right @cjmungall / @kltm ?)

hexylena · 2017-06-30T14:46:46Z

@nathandunn yep, that would be one solution (but it would not work for completely network-isolated Galaxy instances, hence always the push for "is there a DB dump we can download? we can investigate modifying tooling to search that")

nathandunn · 2017-06-30T14:50:15Z

This is a good idea. If we had a self-contained version that would be great. @cjmungall / @kltm would it be feasible for @erasche to dockerize amigo or would this be a herculean undertaking?

kltm · 2017-06-30T23:31:04Z

@cmungall, not @cjmungall

There is no real API API for the enrichment at PANTHER anymore. At one point we had worked out TERP (a Term EnRichment Protocol), but that fell by the wayside with the practical considerations of working with PANTHER's TE quickly (e.g. https://github.com/geneontology/amigo/search?utf8=%E2%9C%93&q=TERP&type=Issues).

cmungall · 2017-07-01T00:39:29Z

You can write a python script to do this easily using ontobio

not well documented yet as you can see!
http://ontobio.readthedocs.io/en/latest/analyses.html#enrichment

Here's an example:
http://nbviewer.jupyter.org/github/biolink/ontobio/blob/master/notebooks/Phenotype_Enrichment.ipynb

(change category from 'phenotype' to 'function' for GO enrichment)

(this notebook does a whole lot more that you don't need, or if you did would be better split into separate tools)

Re: TERP, this capability should be exposed via the web API (biolink) soon, but for Galaxy the direct Python API is probably fine

caveat: for querying GO you need to know in advance what kind of IDs to use. MOD IDs for MODs, UniProtKB IDs for everything else. I'll add an example for mapping. Would you want to do the ID mapping via a separate galaxy tool, or just bundle into the enrichment capability?
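If the mapping ends up as its own step, it could look roughly like the sketch below. This is not ontobio code: `load_mapping` and `map_ids` are hypothetical helpers, the two-column TSV layout is an assumption, and the IDs are made-up placeholders. The point is only that unmapped IDs get reported rather than silently dropped before enrichment.

```python
def load_mapping(tsv_lines):
    """Parse 'input_id<TAB>target_id' lines into a dict (assumed layout)."""
    mapping = {}
    for line in tsv_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        source_id, target_id = line.split("\t")[:2]
        mapping[source_id] = target_id
    return mapping


def map_ids(gene_ids, mapping):
    """Return (mapped, unmapped) so the caller can see what was lost."""
    mapped, unmapped = [], []
    for gene_id in gene_ids:
        target = mapping.get(gene_id)
        if target is None:
            unmapped.append(gene_id)
        else:
            mapped.append(target)
    return mapped, unmapped


# Placeholder IDs for illustration only; these are not real accessions.
example_table = [
    "# input\ttarget",
    "geneA\tUniProtKB:EXAMPLE1",
    "geneB\tUniProtKB:EXAMPLE2",
]
example_mapping = load_mapping(example_table)
example_mapped, example_unmapped = map_ids(["geneA", "geneB", "geneC"], example_mapping)
```

Keeping this separate from the enrichment call would let the same mapping output feed other tools.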

cmungall · 2017-07-01T00:40:04Z

When writing the tool, make sure to include a background gene set; this is very important
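To illustrate why the background matters, here is a minimal sketch using a plain one-sided hypergeometric test from the standard library. It is not how ontobio or PANTHER compute enrichment, and the gene names and set sizes are toy values invented for the example. The same sample and the same term give very different p-values depending on which background is supplied.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def enrichment_p(sample, background, annotated):
    """One-sided enrichment p-value for a single term.

    Both the sample and the term's annotated genes are restricted to the
    background, so the background defines the universe being tested.
    """
    background = set(background)
    sample = set(sample) & background
    term_genes = set(annotated) & background
    k = len(sample & term_genes)
    return hypergeom_sf(k, len(background), len(term_genes), len(sample))


# Toy data: 5 sampled genes, 4 of which carry the term.
sample = ["g1", "g2", "g3", "g4", "g5"]
annotated = ["g1", "g2", "g3", "g4", "g6"]

wide_background = [f"g{i}" for i in range(1, 101)]   # e.g. the whole genome
narrow_background = [f"g{i}" for i in range(1, 11)]  # e.g. only assayed genes

p_wide = enrichment_p(sample, wide_background, annotated)
p_narrow = enrichment_p(sample, narrow_background, annotated)
```

With the wide background the term looks strongly enriched; with the narrow one it does not pass a 0.05 threshold, which is why the tool should take the background as an explicit input.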

cmungall · 2017-07-01T02:29:54Z

Reproducibility: use a versioned PURL for the ontology and a specific version of the annotation files. Instructions in the PR about to come
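A rough sketch of what pinning could look like on the tool side. The release-path pattern for the GO PURL below is an assumption based on the usual OBO versioned-IRI layout and should be checked against the actual instructions; `versioned_go_purl` and `provenance_record` are hypothetical helper names. The checksum is there so results can later be tied back to the exact file that was used.

```python
import hashlib
from datetime import date


def versioned_go_purl(release_date, filename="go.owl"):
    """Build a release-pinned ontology URL (assumed OBO-style path pattern)."""
    # Raises ValueError if the release is not a valid YYYY-MM-DD date.
    date.fromisoformat(release_date)
    return f"http://purl.obolibrary.org/obo/go/releases/{release_date}/{filename}"


def provenance_record(url, content):
    """Describe a downloaded file so its version can be reported with results."""
    return {
        "source_url": url,
        "sha256": hashlib.sha256(content).hexdigest(),
        "bytes": len(content),
    }


# The bytes here stand in for a downloaded file; no network call is made.
example_url = versioned_go_purl("2017-07-01")
example_record = provenance_record(example_url, b"placeholder ontology bytes")
```

The same record could be written next to the enrichment output, and an equivalent one kept for the annotation file.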

cmungall · 2017-07-03T20:34:52Z

OK, slightly improved docs on enrichment here: http://ontobio.readthedocs.io/en/latest/analyses.html#enrichment

Note that one option is simply to wrap the existing cli script

nathandunn · 2017-07-06T16:07:11Z

@cmungall I think that wrapping the existing script is a much better idea, especially if you plan to publish on PyPI or the like.

Is there a way to query available versions and more importantly set URL version from within the API? Also, is there a way to confirm the source versions when getting results?

Maybe we can wait until the doc is fully developed.

cmungall · 2017-07-06T16:56:54Z

Should this be the responsibility of the enrichment tool, or should there be a separate fetcher tool (or two, one to get a versioned ont, other to get versioned annotations)? The latter seems more modular, you can then plug in different analytic tools without each analytic tool worrying about versioning. But I don't know galaxy best practice these days

hexylena · 2017-07-10T17:48:28Z

some brief notes

existing cli = great!
For a very "galaxy" way of doing things, I think this would be separate tools.
- One "data manager" which fetches the reference PURL (and maybe lets the user specify a specific version)
- And separate querying tools.

But these are implementation details. Maybe the next time IUC has a hackathon I'll have some time to work on this.

cmungall added a commit to biolink/ontobio that referenced this issue Jul 1, 2017

Added to RTD for enrichment analyses, driven by galaxy-genome-annotat…

694dde6

…ion/docker-galaxy-genome-annotation#16

cmungall mentioned this issue Jul 1, 2017

WIP: Enrichment docs biolink/ontobio#55

Merged
