20170227 Ontology Change Improvement Call

marijane white edited this page Feb 27, 2017 · 1 revision

Date: Monday, Feb 27, 2017

Attendees: Mike Conlon, Marijane White, Brian Lowe, Christian Hauschke, Graham Triggs, Matt Mayernik, Tatiana Walther, Tenille Johnson, Juliane Schneider, Muhammad Javed, Benjamin Gross, Dong Joon (DJ) Lee, Damaris Murry, Linda Rowan, Huda Khan

Mike's Agenda and Meeting Notes: https://docs.google.com/document/d/1THkc6-pw1Gpsdqw1C0ZxxZSq0NVTB2XzNgZHNvqwaMY/edit#heading=h.clo3gxnndz34

Ontology changes for the VIVO project have not happened for some time, for a couple of reasons: the last round of changes had dramatic impacts, and we would like to understand how to manage those impacts while making needed changes; and we didn't really know how to make them. Multiple repos, extraction processes, and complexity led us to a place where we are not comfortable making changes of any kind.

Three big questions in the agenda doc:

  • Why?
  • What would success look like?
  • How do we get there?

Marijane: What would success look like? People would be able to upgrade, we would be able to extract vivo-core from vivo-isf, and we'd have a process for making changes.

Damaris: we have a lot of customized ontologies for artistic works and professional activities, we've been talking about adding them for years, and would love to be able to.

Christian: German community members had an ontology call today. They think it's not transparent where the discussion takes place: who is in charge? Who is responsible? They have some classes, one found at Cornell, and another; how would we make a suggestion?

Javed: Defining the scope of the vivo-core ontology? What do we want to capture that will be pushed by default to a VIVO instance? Should the very specific ontology terms like medical terms be part of the default ontology? 4 points for management: 1. who suggests the change, and how, 2. who analyzes the need for the change, 3. who analyzes the impact of a change? 4. How do we communicate with the community about that change?

Marijane: Another thing that I think would define success would perhaps be some funding for an ontologist role, not just to manage change but to also work with people like Graham to incorporate new parts of the ontology into the software

Damaris: is the goal of the VIVO ontology to include only more established ontologies? Duke is trying to keep changes low key, but that's hard to avoid

DJ: Has customized local ontology for grey literature, and how to display it like publications, has to use a config file. Whenever he needs to customize, he has to do things an alternate way from the existing Authorship class, because the software makes it difficult to customize.

Javed: When I joined Cornell and started looking at the Ontology, didn't expect that there would be locally defined classes. Should maybe have some calls to help people understand how to extend things in a local namespace, etc. Need some documentation and guidance around this.

Mike: Javed has thought about modules, has that been in the context of vivo-core or the ISF? Agreed that medical terms (mentioned by Javed) should not be in a general instance.

Javed shares his screen to show some of the clinical stuff that seems out of scope for a generic ontology

Tenille: this brings up something that frustrates the eagle-i folks, which is keeping up with external changes and pulling them in.

Marijane: the MIREOT approach is great if your tools can't handle huge ontologies, but it is a maintenance headache.

Mike: two different versions of SKOS referenced in vivo-core, for example

Marijane: again, we need a process. Christian mentioned the lack of transparency; in reality it is silence. Do we need to have a regular call? Fifth Thursday?

Mike: needs to be more often in the beginning. This time of day works well for European participants.

Mike: Let's go back to the topic of success. It is a problem that changes in the ontology are not reflected in the software. Changes have to happen in lockstep.

DJ: Sometimes we want to update ontology in our system, but concerned that software updates will break local changes.

Graham: software dependencies depend on what is being changed. Regardless of what is being changed, you need to add descriptions that are local only to the application in order to have them displayed. What group does it go in, what rank does it have on the page, but those are still ontology-driven, they're just not part of the core ontology for describing what a particular class means. For simple extensions, that kind of works, but for more extensive changes you do need to make changes to the application. For DJ's example of publications, there are custom queries and templates.

Mike: so when we think about what would it take to be good at this, we also have to be thinking about the changes to/impact on the community. Local development could be frozen waiting for changes to be made. Communication needs to be clear so that sites understand what will happen to their local changes. Distinguish between proposed changes and actual changes. Every open ticket is a proposed change. Also changes that haven't been put into a ticket yet.

Benjamin: Thinking about need to change the software based on ontology changes, this means we need a confirmed set of changes actually being made, which will mean changes to the software, this creates some lead time.

Mike: We used to deliver a new piece of software with significant ontology changes and updated scripts to change the triplestore to keep sites whole. New ontology, new ontology changes, new triplestore that reflected the changes, and new software that reflected the changes. This was not enough. Making the system whole did not make the community whole, because people had queries, ingest scripts, software, etc. that used the API and the data that depended on the old ontology. The work to reflect the changes was significant. To be good at this, we have to make the community whole. This requires more thinking about communication, training, workshops, etc. What do we need to get any site through the change? There may be changes below that threshold, and then there

Marijane: Did previous changes remove old ontology? What if we could generate any version of vivo-core from the ISF?

Mike: changes were to patterns/structure of the ontology, things were too dramatic to leave old ontology in place. Changes like these come down to communication, education, helping people understand how to change their queries.

Graham: Everybody has to get data into the system, so you might have, for common data patterns, let's support a means of getting that data in that doesn't depend on the ontology, so we can change the ontology and people can still get their data into the system. Examples of things like Cornell's data distributor, encourage people to get data out rather than just writing SPARQL queries.

Marijane: so abstracting the ontology away from the data interface? Graham says yes.

Javed: Whenever a downstream app is dependent on the upstream app, you have to manage those changes. Whenever something changes in vivo-isf, we should update vivo-core. But you have to update the SPARQL queries, and therefore, like Mike says, we need to communicate to users about the changes.

Marijane: with 10 minutes left, should we start talking about next steps? are there any next actions we can take?

Damaris: there were some task forces last year, those worked very well. Could we do that?

Mike: agree, we need a way to do work. But people find it easier to discover ways to work as they're working. They decide what work they're going to do even without a process, and the process is discovered along the way.

Graham: could lock down the software to encourage things like making ontology changes in the right namespace

Mike: another thing, VIVO-ISF, and the extracted eagle-i ontology is its own thing, ERO. Can we use whatever they use to extract vivo-core?

Marijane: I know there are scripts that extract eagle-i, not sure where they live outside of Harvard's SVN? Could also use OBO tools like ROBOT https://github.com/ontodev/robot

Tenille: Yes, Shahim set it up and I believe the plan was for it to be used by VIVO as well, although I'm not sure what would be required on the technical end to make that happen. There is some documentation here: https://docs.google.com/document/d/1HO0s1C9D9f5gh2puS8_B3i_U2n78KB0uKnxbhoVzboI/edit

Graham: there should be published ontologies, the software should use whatever published ontologies it wants to use, it might hide some other parts of the ontology, but you hide them in the application and the administrator configures what users see.

Tenille: eagle-i application is all local, ontology modules are extracted and are not stored locally. Recent changes were process-heavy, made pull requests that were merged by Marijane

Marijane: at the same time, that process is kind of like what Javed proposes. I couldn't test locally in eagle-i but I could at least look at changes in Protege before approving the merge.

Tenille: what made it difficult was not being able to see changes in eagle-i before Marijane merged them. Need a way to test locally.

Mike: we are at the top of the hour, can we meet again in two weeks? (March 13)

To be continued!

The VIVO-ISF ontology is an information standard for representing scholarly work.