This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

thesis pieces

Kai Blumberg edited this page Nov 6, 2017 · 31 revisions

Early draft(s) of the written sections of my master's thesis. I want to write as I go once I have an application ontology and data compiled, so that I can iteratively ask competency questions, answer them by querying, and write about the results. The written thesis and the application ontology will therefore grow together, one iteration at a time.

Table of Contents

Lookup author of ontology term

Pre and post-composition of complex classes

Increased illumination of a marine water body

Query processes involving sea ice which may have an effect on phytoplankton blooms

What processes connect phytoplankton blooms to the ocean floor?

What genes are involved in the accumulation of toxins from algal bloom process?

What compounds play a role as algae metabolites?

Hunting for knowledge preying upon Eddie

Example GFbio use cases of ontology term requests from the scientific community

PCO contributions

ENVO contributions

ECOCORE contributions

PATO contributions


Lookup author of ontology term

For work up to this project see log from 20.10.17

To answer a competency question such as:

"Can we find the author of an ontology term and, if so, retrieve their contact information?"

We begin with a SPARQL query such as:

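PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX obo: <http://purl.obolibrary.org/obo/>
SELECT DISTINCT ?x ?label ?author
FROM <http://purl.obolibrary.org/obo/merged/ENVO>
WHERE
{
  # direct subclasses of 'sea ice' (ENVO_00002200)
  ?x rdfs:subClassOf obo:ENVO_00002200 .
  ?x rdfs:label ?label .
  # IAO_0000117 is the 'term editor' annotation property
  ?x obo:IAO_0000117 ?author .
}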

This retrieves the term editors (IAO_0000117) of the terms that are subclasses of 'sea ice' (ENVO_00002200). It returns a list of links to ORCIDs.
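
Because the ?author values come back as ORCID links, a small post-processing step can reduce them to the bare iDs needed for the ORCID API. The following is a minimal sketch in Python, assuming the values look like `http://orcid.org/0000-0002-9227-8514`. The function name is hypothetical, and the pattern only checks the shape of an iD, not its checksum.

```python
import re

# Four hyphen-separated groups of four characters; the final character may be X.
ORCID_PATTERN = re.compile(r"(\d{4}-\d{4}-\d{4}-\d{3}[\dX])")


def extract_orcid_ids(author_values):
    """Return the unique ORCID iDs found in a list of ?author values, in order."""
    seen = []
    for value in author_values:
        match = ORCID_PATTERN.search(value)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    # Illustrative values only; free-text editor names are skipped.
    sample = [
        "http://orcid.org/0000-0002-9227-8514",
        "https://orcid.org/0000-0002-9227-8514",
        "free-text editor name",
    ]
    print(extract_orcid_ids(sample))
```

Term editor annotations that are plain names rather than ORCID links are dropped by this sketch, so they would need separate handling.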

To get contact information from these ORCIDs, we can follow the ORCID API tutorial here and query for the associated email addresses.

First, a two-legged OAuth request obtains the token needed to query ORCID records:

curl -H "Accept: application/json" -d "client_id=APP-NPXKK6HFN6TJ4YYI" -d "client_secret=060c36f2-cce2-4f74-bde0-a17d8bb30a97" -d "scope=/read-public" -d "grant_type=client_credentials" "https://sandbox.orcid.org/oauth/token"

Using the returned token 2bd6d6b7-9438-4a5a-8f87-7e43d6eaac25, I ran the following query to retrieve the email address associated with the example ORCID. Code sourced from here.

curl -i -H "Accept: application/vnd.orcid+xml" -H 'Authorization: Bearer 2bd6d6b7-9438-4a5a-8f87-7e43d6eaac25' 'https://api.sandbox.orcid.org/v2.0/0000-0002-9227-8514/email'

returning

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<email:emails path="/0000-0002-9227-8514/email" xmlns:internal="http://www.orcid.org/ns/internal" xmlns:funding="http://www.orcid.org/ns/funding" xmlns:preferences="http://www.orcid.org/ns/preferences" xmlns:address="http://www.orcid.org/ns/address" xmlns:education="http://www.orcid.org/ns/education" xmlns:work="http://www.orcid.org/ns/work" xmlns:deprecated="http://www.orcid.org/ns/deprecated" xmlns:other-name="http://www.orcid.org/ns/other-name" xmlns:history="http://www.orcid.org/ns/history" xmlns:employment="http://www.orcid.org/ns/employment" xmlns:error="http://www.orcid.org/ns/error" xmlns:common="http://www.orcid.org/ns/common" xmlns:person="http://www.orcid.org/ns/person" xmlns:activities="http://www.orcid.org/ns/activities" xmlns:record="http://www.orcid.org/ns/record" xmlns:researcher-url="http://www.orcid.org/ns/researcher-url" xmlns:peer-review="http://www.orcid.org/ns/peer-review" xmlns:personal-details="http://www.orcid.org/ns/personal-details" xmlns:bulk="http://www.orcid.org/ns/bulk" xmlns:keyword="http://www.orcid.org/ns/keyword" xmlns:email="http://www.orcid.org/ns/email" xmlns:external-identifier="http://www.orcid.org/ns/external-identifier">
    <common:last-modified-date>2016-10-26T17:03:35.739Z</common:last-modified-date>
    <email:email visibility="public" verified="true" primary="false">
        <common:created-date>2016-10-26T17:03:05.707Z</common:created-date>
        <common:last-modified-date>2016-10-26T17:03:35.739Z</common:last-modified-date>
        <common:source>
            <common:source-orcid>
                <common:uri>http://sandbox.orcid.org/0000-0002-9227-8514</common:uri>
                <common:path>0000-0002-9227-8514</common:path>
                <common:host>sandbox.orcid.org</common:host>
            </common:source-orcid>
            <common:source-name>Sofia Maria Hernandez Garcia</common:source-name>
        </common:source>
        <email:email>[email protected]</email:email>
    </email:email>
</email:emails>

From which we can grep out: <email:email>[email protected]</email:email>
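
As an alternative to grep, the response can be parsed as XML, which is less sensitive to line breaks and attribute order. The following is a minimal sketch using Python's standard library. The namespace URI is taken from the `xmlns:email` declaration in the response above; the function name and the `user@example.org` address are placeholders, not values from the actual record.

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns:email declaration in the response.
NS = {"email": "http://www.orcid.org/ns/email"}


def extract_emails(xml_text):
    """Return the address text of every nested <email:email> element."""
    root = ET.fromstring(xml_text)
    # The outer email:email wraps metadata; the address sits in the inner one.
    return [
        node.text.strip()
        for node in root.findall("email:email/email:email", NS)
        if node.text
    ]


if __name__ == "__main__":
    # Trimmed-down stand-in for the response, with a placeholder address.
    sample = (
        '<email:emails xmlns:email="http://www.orcid.org/ns/email">'
        '<email:email visibility="public" verified="true" primary="false">'
        "<email:email>user@example.org</email:email>"
        "</email:email>"
        "</email:emails>"
    )
    print(extract_emails(sample))
```

A record with no public email would simply yield an empty list here.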

Unfortunately, I can't yet get this to work with other ORCIDs (mine or Piers); perhaps continue here. I emailed ORCID support to see whether they can help. For now this is a proof of concept for one way to answer the competency question.

Pre and post-composition of complex classes

log entry

see competency question 4

To answer the question:

"Can ontology users create their own complex union class post-compositionally to achieve the same effect as precomposed classes such as marine environment determined by a diatom community and marine environment determined by a diatom community bloom?"

and to illustrate how complex classes can be defined in semantic research:

Answer:

For marine environment determined by a diatom community: instead of using the environment determined by an ecological community hierarchy to obtain marine environment determined by a diatom community as a subclass with the axioms is a 'marine environment determined by a phytoplankton community' and 'determined by' some 'diatom-dominated community', an ontology user could create the class post-compositionally using the axioms is an 'environmental system' and 'determined by' some (('ecological community' and 'composed primarily of' some Bacillariophyta) and 'marine water body').

For marine environment determined by a diatom community bloom ... same idea.

I will do it once as a pre-composition, where I push the class and get an IRI for it as a whole. Then I will show how to post-compose such classes using pieces from other classes as building blocks, to illustrate that it can be done either way.
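
The building-block idea behind post-composition can be sketched as nested expression objects that render to a Manchester-syntax-like string. This is only an illustration in plain Python under my own assumptions: the `Named`, `Some`, and `And` helpers are hypothetical and are not part of any ontology library, and the labels are copied from the axioms above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Named:
    """A named class, rendered in single quotes unless quoted=False."""
    label: str
    quoted: bool = True

    def render(self) -> str:
        return f"'{self.label}'" if self.quoted else self.label


@dataclass(frozen=True)
class Some:
    """An existential restriction: 'prop' some filler."""
    prop: str
    filler: object

    def render(self) -> str:
        inner = self.filler.render()
        # Compound fillers need parentheses; plain named classes do not.
        if not isinstance(self.filler, Named):
            inner = f"({inner})"
        return f"'{self.prop}' some {inner}"


@dataclass(frozen=True)
class And:
    """An intersection of class expressions."""
    operands: Tuple[object, ...]

    def render(self) -> str:
        parts = []
        for op in self.operands:
            text = op.render()
            # Nested intersections are parenthesised to keep the grouping visible.
            parts.append(f"({text})" if isinstance(op, And) else text)
        return " and ".join(parts)


if __name__ == "__main__":
    diatom_community = And((
        Named("ecological community"),
        Some("composed primarily of", Named("Bacillariophyta", quoted=False)),
    ))
    post_composed = And((
        Named("environmental system"),
        Some("determined by", And((diatom_community, Named("marine water body")))),
    ))
    print(post_composed.render())
```

The same three helpers can also express the pre-composed definition, which makes it easier to compare the two routes side by side.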

Increased illumination of a marine water body

see entry here

To answer the following competency question:

"What processes result in increased illumination of a marine water body?"

Making use of the ENVO class sea ice melting which contains the axioms ... we are able to perform the following query ... to give us any process which affects the illumination state of a water body. The query returns the sea ice melting class.

Query processes involving sea ice which may have an effect on phytoplankton blooms

see comp question here

To answer the question:

"Are there any processes involving sea ice which may have an effect on phytoplankton blooms?"

Which could also be asked as:

"What processes are causally downstream of sea ice melting?"

Answer using a query over the 'causally downstream of' axioms, which should link:

sea ice melting -> marine water body stratification -> marine environment determined by a phytoplankton community bloom
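
The intended result of such a query is everything reachable from sea ice melting by following causal links transitively. A toy sketch of that traversal in Python follows. It assumes the links are held in a plain dictionary rather than in the ontology; only the chain named above is encoded, and the variable and function names are hypothetical.

```python
from collections import deque

# Toy stand-in for the causal axioms; only the chain named in the text.
CAUSALLY_UPSTREAM_OF = {
    "sea ice melting": ["marine water body stratification"],
    "marine water body stratification": [
        "marine environment determined by a phytoplankton community bloom"
    ],
}


def causally_downstream(start, edges):
    """Breadth-first walk returning everything reachable from start."""
    found = []
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in edges.get(current, []):
            # Skip nodes already seen so cycles cannot loop forever.
            if nxt != start and nxt not in found:
                found.append(nxt)
                queue.append(nxt)
    return found


if __name__ == "__main__":
    for name in causally_downstream("sea ice melting", CAUSALLY_UPSTREAM_OF):
        print(name)
```

Against the real ontology, the same transitive walk would more likely be expressed with a reasoner or a SPARQL property path than with an in-memory dictionary.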

What processes connect phytoplankton blooms to the ocean floor?

To answer the question:

"What processes connect phytoplankton blooms to the ocean floor?" //could revise wording

Answer:

phytoplankton bloom process is 'part of' some 'surface photoautotrophic biomass formation'. This can then be linked to classes within the aggregation of aquatic detritus in a water body hierarchy, which inherit the axiom 'causally downstream of or within' some ('surface photoautotrophic biomass formation' and 'located in' some 'water body'). The marine snow formation process, a subclass of aggregation of aquatic detritus in a water body, 'has output' some 'marine snow'. The marine particle sinking process then 'has input' some 'marine snow' and finally 'has output' some (('organic material' or ('part of' some 'biogenous sediment')) and 'part of' some 'ocean floor').

All of this is contained in the more easily searchable union class marine benthic carbon export process, which 'has related synonym' 'biological carbon pump'.

What genes are involved in the accumulation of toxins from algal bloom process?

// Similar to the ontobee sparql tutorial example query #7

What compounds play a role as algae metabolites?

"What compounds play a role as algae metabolites?"

//Make use of the CHEBI class: algal metabolite //could have a link to this from soon to be PCO:alga for example 'has part' some CHEBI'algal metabolite'

collaborations

Hunting for knowledge preying upon Eddie

see issue 13

Get Eddie's input on the hole in the ice bloom class, plus some of the microbial communities useful for his work. Perhaps I could even use some of his 16S rRNA data in my compiled dataset.

From Eddie's input, the hole in the ice bloom class would be something along the lines of ice free zone in ice-covered region. He will also send me his ORCID so I can put it on the PCO classes. The 16S data should be published by the end of the year, so it should be available for use in the thesis.

Example GFbio use cases of ontology term requests from the scientific community

From discussions with Ivo: he requested an example-based guide to how GFbio should use the following terms to annotate data: environment, ecosystem, biome, habitat.

On the theme of annotating data with ontology classes, using examples from GFbio submissions, we could sketch out potential competency questions such as:

  1. Given an abstract/manuscript, what ontology terms could be suggested for annotating the dataset (using a text-mining approach similar to what I did in the lab rotation)? We could do this with GFbio submissions, using the manually finished annotations as the "true" reference. Alternatively, we could do this on ENA datasets, the same subset Henny is working on in her review of the Nagoya protocols. Give me the ID of all datasets of a certain type, then give me all samples from those datasets, give me all properties of type environmental feature and environmental biome, and aggregate back to the dataset level: the samples in this dataset have these ... biomes.

  2. Discuss how to propagate ontology annotations made on a sample back up to the whole dataset, in order to enable a semantic search (for example with GFbio) for ontology classes associated with datasets. Currently the ENVO annotations are on the samples, not on the datasets. Does GFbio propagate all unique ENVO terms from all samples in a dataset up to the dataset level, or do we annotate a dataset only with higher-level classes?
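
The first propagation option, collecting all unique terms from a dataset's samples at the dataset level, can be sketched as a simple aggregation. This is a minimal Python illustration under assumed inputs: the (dataset, sample, term) tuple layout, the function name, and all identifiers except ENVO:00002200 ('sea ice') are hypothetical and do not reflect how GFbio or ENA actually store annotations.

```python
from collections import defaultdict


def propagate_to_datasets(sample_annotations):
    """Collect the unique ontology terms of all samples under each dataset.

    sample_annotations: iterable of (dataset_id, sample_id, term_id) tuples.
    Returns {dataset_id: sorted list of unique term ids}.
    """
    terms_by_dataset = defaultdict(set)
    for dataset_id, _sample_id, term_id in sample_annotations:
        terms_by_dataset[dataset_id].add(term_id)
    return {ds: sorted(terms) for ds, terms in terms_by_dataset.items()}


if __name__ == "__main__":
    # Hypothetical rows; EXAMPLE:* identifiers are placeholders, not real terms.
    rows = [
        ("dataset-1", "sample-a", "ENVO:00002200"),
        ("dataset-1", "sample-b", "ENVO:00002200"),
        ("dataset-1", "sample-b", "EXAMPLE:biome-x"),
        ("dataset-2", "sample-c", "EXAMPLE:biome-y"),
    ]
    for dataset, terms in propagate_to_datasets(rows).items():
        print(dataset, terms)
```

The second option, annotating only with higher-level classes, would additionally need the class hierarchy to replace each term with a chosen ancestor before aggregating.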

ontology contributions

PCO contributions

possibly see PCO issue 61

ENVO contributions

include releases from lab rotation

ECOCORE contributions

PATO contributions
