Implement reaction ingests (Rhea, BioPAX, etc) #372
Comments
What about adding CHEBI? In PheKnowLator, we have created specific triples that allow us to explicitly represent CHEBI chemicals, catalysts, and cofactors with respect to Reactome pathways. Ignacio Tripodi and I collaborated on validating this and ran some wet lab experiments that seemed to suggest this worked well when applied to a small human RNA-Seq time series toxicogenomics assay. Also, I am a huge fan of exploring different KG modeling design patterns. Perhaps after the PheKnowLator manuscript we can talk more seriously about some projects in that domain.
Yes, we should definitely add ChEBI; Rhea and Reactome already use ChEBI. Curious: how did you go about modeling this? Wow, that's amazing about validating on wet lab experiments.
We actually ingest CHEBI now in KG-COVID-19 (see here), although probably not as elaborately as what you describe for PheKnowLator
Yes, let's discuss, post-manuscript!
Yes, we do add ChEBI to our KG. @callahantiff Would love to include what you have for PheKnowLator or find ways of subsetting specific parts. Happy to chat more on this when you are ready 👍
Happy to discuss that. I think you might be disappointed by how simple it ended up being. How can I best answer the modeling question? I can describe the edge types/data sources we used? Small wet lab experiments, but some nonetheless. I'd love to do more. Thoroughly validating the content and relationships in a large heterogeneous KG (aside from using reasoners, to at least cover some of the logical aspects) is tough!
Sorry, slight non sequitur here, but I just want to mention that a "Knowledge Beacon" was built to access Rhea. It is still quietly running on the Translator subnet at https://kba.ncats.io/beacon/rhea/. It probably didn't adequately cover Rhea, but it could be a source of inspiration or a few Python code snippets (or not?)
Note: this should move to a generic kghub repo, keeping here for now
Need a TSV of reaction->participant edges from various sources, in order of priority
(we also have a heuristic way of generating these from GO text descriptions but this is outside the scope of this ticket)
The fields would be:
This schema to be added to bl (biolink/biolink-model#478)
The nodes would have all the usual properties. E.g. rhea would provide a description, xrefs
maybe additional node properties like
I suggest the ingest does not try to normalize the IDs, but leaves the source ID prefixes.
Some sources may have catalysis too; add these as another edge type.
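To make the shape of such an ingest concrete, here is a minimal sketch in Python. It is not the agreed schema: the column names (`subject`, `predicate`, `object`, `provided_by`), the predicate labels, and the helper names `reaction_edges` and `to_tsv` are all assumptions for illustration, and the IDs in the usage example are placeholders rather than real Rhea, ChEBI, or UniProtKB records. It only illustrates the two suggestions above: IDs are passed through with their source prefixes untouched, and catalysis is emitted as its own edge type.

```python
import csv
import io

# Hypothetical column layout (an assumption; the real fields belong to the schema discussion).
FIELDS = ["subject", "predicate", "object", "provided_by"]

# Hypothetical predicate labels (assumptions, pending the Biolink schema ticket).
HAS_INPUT = "biolink:has_input"
HAS_OUTPUT = "biolink:has_output"
HAS_CATALYST = "biolink:has_catalyst"


def reaction_edges(reaction_id, inputs, outputs, catalysts=(), source=""):
    """Build reaction->participant edge rows for one reaction.

    IDs are copied through exactly as given: no prefix normalization.
    Catalysts get a separate edge type rather than being folded into participants.
    """
    rows = []
    for participant in inputs:
        rows.append({"subject": reaction_id, "predicate": HAS_INPUT,
                     "object": participant, "provided_by": source})
    for participant in outputs:
        rows.append({"subject": reaction_id, "predicate": HAS_OUTPUT,
                     "object": participant, "provided_by": source})
    for catalyst in catalysts:
        rows.append({"subject": reaction_id, "predicate": HAS_CATALYST,
                     "object": catalyst, "provided_by": source})
    return rows


def to_tsv(rows):
    """Serialize edge rows to a TSV string with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    # Placeholder IDs only; these do not describe a real reaction.
    edges = reaction_edges(
        "RHEA:00000",
        inputs=["CHEBI:00001"],
        outputs=["CHEBI:00002"],
        catalysts=["UniProtKB:X00000"],
        source="rhea",
    )
    print(to_tsv(edges), end="")
```

Keeping the predicate in its own column means a source that supplies catalysis just adds rows, without changing the file layout for sources that do not.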
Not of direct relevance to KG-hub, but relevant to @goodb @balhoff, we will also have something like a SPARQL transform that turns this into our standard OWL representation, which can be complex, involving unions, e.g
This is what we would use for OWL reasoning and in GO
Note that having alternate levels of representation for different purposes is exactly what I am getting at in Biological Knowledge Graph Modeling Design Patterns
We can also see this as akin to dosdp templating: we have a simple TSV representation and an OWL expansion