20180320 Ontology Change Improvement Call

marijane white edited this page Mar 20, 2018 · 1 revision

Date: 20180320

Attendees: Mike Conlon, Marijane White, Muhammad Javed, Brian Lowe, Andrei Tudor, Tatiana Walther, Violeta Ilik, Damaris Murry

Agenda:

  1. vivo.owl and applicationConfiguration.owl – current state and next steps
    1. ApplicationConfiguration ontology terms https://goo.gl/yavY9D
    2. Namespaces referenced in VIVO and Vitro https://goo.gl/GPjmUa
    3. JIRA: VIVO-1447 - Organize the ontology files. Produce a vivo.owl (status: In Review)
  2. Working ontology issues
    1. create a JIRA (or a GitHub issue – Mike keeps them synced)
    2. fork the ontology lab VIVO repo (we're not quite ready to work directly in the VIVO repo)
    3. create a branch named for the JIRA (vivo-xxxx)
    4. Do the work in your fork
    5. Make a minor pull request
    6. Review others' PRs and approve or request changes
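
The branch-naming step in the workflow above can be sketched as a small helper. This is only an illustration: the function names are hypothetical, the JIRA key format (letters, a hyphen, digits) is an assumption based on keys like VIVO-1447, and `origin` is assumed to be the remote for your own fork.

```python
import re


def branch_name_for_issue(jira_key: str) -> str:
    """Return a branch name for a JIRA key, e.g. 'VIVO-1447' -> 'vivo-1447'."""
    key = jira_key.strip()
    # Assumed key format: project letters, a hyphen, then the issue number.
    if not re.fullmatch(r"[A-Za-z]+-\d+", key):
        raise ValueError(f"not a JIRA key: {jira_key!r}")
    return key.lower()


def branch_commands(jira_key: str) -> list[str]:
    """Return the git commands that create and publish the issue branch.

    'origin' is assumed to point at your fork, not at the VIVO repo.
    """
    branch = branch_name_for_issue(jira_key)
    return [
        f"git checkout -b {branch}",
        f"git push -u origin {branch}",
    ]


if __name__ == "__main__":
    for command in branch_commands("VIVO-1447"):
        print(command)
```

The helper only builds the command strings. Running them, and opening the pull request afterwards, is still done by hand.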

Minutes:

Mike: We have the vivo.owl candidate file and a pull request to the VIVO project to replace the existing files with it. When I opened the pull request, I asked several of you to be reviewers, which means you will tell us whether you believe it should be part of the software. If you find something you would prefer to be different, you can leave comments.

Javed: I was looking at those files yesterday. I downloaded the vivo-ontology-lab repo, started tomcat, and pulled up the VIVO instance. There were no issues. I logged in, went to the ontology list page, and found that a few of the ontology names are missing, I believe.

Mike: Yes, that could be possible. Because there is a separate JIRA issue about updating ontologies.owl. What did you see in the ontology list?

Javed: The ones that were missing from the ontology list were...

Mike: And this was in the ontology list?

Javed: Yes. All of the terms were there.

Violeta: So we should worry about this in the separate ticket?

Tatiana: I think dcterms was not defined as an ontology in the last version, so you couldn't see it in the last version in the list, either.

Mike: I'll be sure to put it in the ticket, then.

Tatiana: It's also not an OWL ontology, I don't know if that might be why it doesn't show up.

Javed: No, it just needs a definition.

Mike: Yes, if we put it in ontologies.owl, it will show up there.

Javed: I think we do use it as OWL.

Violeta: And so does everyone else. =)

Tatiana: Ok.

Javed: So these are the ones I noticed missing, but all the properties were there.

Violeta: Javed, you did a better job than I did, because I didn't notice this was missing.

Javed: So I have my VIVO 1.10 with the new ontologies, and I want to use it with our Scholars @ Cornell data. I had to use the Jena model to convert from Jena 2 to Jena 3, using the wiki page with the changes and instructions on how to change my data. The steps are simple, one to import and one to export, but it didn't work for me: I got a Null Pointer Exception when I started VIVO. Then I shut down tomcat to run the other .jar file, and got another Null Pointer Exception. I asked Jim Blake for help, and he suggested putting quotes around the VIVO home path. Then Jim mentioned that Mike had made some changes and maybe the steps need to be updated?

Mike: Ralph, have you tried this?

Ralph: Yes, but I didn't change the name to VIVO 2.0. There are some other pages that might have gotten updated.

Mike: Sorry about this mess, we thought this was the page, but a different page was created. The Jena tools are in the VIVO repo, the ones you used, Javed, are in the lab ontology, which was created months and months ago. Try the one in the official repo and let me know if you have the same problem.

Ralph: Looking at my notes, one thing is to make sure you have the right directory, and then the original directions didn't indicate that you need the -i flag for import. I updated the instructions to reflect that. I did have some initial problems but everything went fine once I figured out the commands.

Javed: So do I have to create a jar file?

Ralph: For me it was in the package with the VIVO 2.0 snapshot

Mike: So that's a problem, because that means the updates that were made to that recently were not in that package.

Ralph: Yes, but I made my updates before you made those changes.

Mike: I'm just suggesting we use the ones in the official distribution. So I think the answer to Javed's question is yes, you have to create a jarfile.

Javed: Ok, so I will try that.

Mike: Which is not ideal, but if you do a maven clean install it should create it.

Javed: Should we have the jarfiles in the repo so people can just download them?

Mike: The problem is keeping them up to date.

Marijane: Wouldn't jarfiles be something we'd want to put in a GitHub release?

Mike: Yes, but they're under active development, they're moving too quickly.

Javed: OK, I will do that and continue with my testing. I was thinking before we do the first development sprint, I should figure out the steps to update the ontology and figure out who has updated.

Mike: I know Ralph has done it, and I have done it, and I believe Jim has. But I don't know how realistic a data set we used.

Ralph: I did a full system conversion, all of our data. It took the majority of the day to run, but it worked fine. I didn't like the lack of feedback it gave me.

Mike: Yeah, it's a distributed Jena library.

Javed, the page you showed us, the comments at the bottom are from Jim and myself about our experience, including size of data and runtime, and it turned out to be about the same. Like Ralph said, if you have a huge database it will take a long time. You can kind of estimate it from mine and Jim's numbers.

Bringing up the lab system is good, and doing smokescreen testing, that's one thing. A second thing is to do an ontological inspection of the file, make sure all the classes are there.

Marijane: That is the kind of review I'm planning to do.

Tatiana: I have also tested vivo.owl, following the steps as described, and it works fine, but the VIVO application configuration and Vitro files seem to be empty if you click on them to display the classes and properties.

Mike: Yes, that's the second ticket about ontologies.owl, and what you're describing is a consequence of that other ticket not being fixed.

Marijane: Can we note the missing stuff in the JIRA ticket for vivo.owl?

Javed: Marijane, can you add yourself to the sprint tasks page?

Marijane: Yes. Mike invited me to review, so I can do that.

Mike: This is the JIRA ticket that describes ontologies.owl, the missing stuff that we've discussed. https://jira.duraspace.org/browse/VIVO-1463

Tatiana: So if I have understood correctly, I should make a comment about what I missed.

Mike: Yes.

Violeta: So what is the conclusion, the owl file is OK, but the Jena tools were not the right version? So there are no issues with the ontology?

Javed: No issues if we load it with no data. Next I need to try it with data.

Mike: Yes, we're all reviewing the pull request.

Javed: Ok, so we're talking about these technical parts, things that are missing, how about the documentation now? Nobody here is working on the documentation. Do we know what needs to be changed?

Mike: I'm preparing to participate in the sprint, during which I will be focusing on documentation. There are many places where changes will need to be made to reflect the fact that this will all be in one file called vivo.owl. There will be a lot of searching and updating to make sure the wiki reflects the organization of the files as they will be in 1.10.

Javed: Do you think you might need some help with that?

Mike: I'll be on the sprint, so I'll expect a lot of conversation. It's something I know about and that I expect to be working on.

Javed: Ok. So do you think this ticket about the ontologies.owl file should be in the sprint?

Mike: Yes, I was hoping it would be.

Javed: I agree. So I'm updating the sprint document to include ontologies.owl.

Mike: Most of the sprint tasks in there with names on them are ontology tasks.

Tatiana: I just made a comment on the application configuration ontology files. I have also tried to add new classes, spin-off company, for example. I inserted it into the TBox but it was not enough.

Mike: So there is a ticket about spinoff company, and you're working on that ticket?

Tatiana: Yes, and I propose to also add the other FRAPO classes?

Marijane: The funding ones I suggested?

Javed: What is the difference between funding organization and funding agency?

Marijane: Oh, do we already have Funding Organization? I only suggested the FRAPO classes because Suzanne needed the identifier.

Violeta: Oh, ISNI.

Marijane: And funding agency.

Violeta: I could swear we already had ISNI.

Mike: It's in OpenVIVO.

And in your local Northwestern VIVO.

Violeta: Oh right!

Tatiana: So it should be a datatype property.

Violeta: That's how I did it.

Mike: That's how I did it too. There's a question about whether they should be properties or classes.

Marijane: That's what BIBFRAME does.

Violeta: Because it can be assigned to a person or an organization.

Marijane: I don't understand, wouldn't you just define the domain as the union of the two?

Javed: BIBFRAME is RDF, not OWL, so no union.

Marijane: Ohhh.

Mike: This is a pattern we need to examine, because of the ORCID implementation. If we're going to fix it we're going to want to have an identifier class, and the ORCID would be a subclass, with a property hanging off that.

Marijane: Do we have a ticket for that?

Mike: Yes.

Right now the ORCID is an owl:Thing and nothing else, which doesn't make any sense. A human can look at it and figure out what it is, but there's nothing else to indicate what it is.

I don't see a near-term reason to adopt an identifier pattern just so we can adopt an identifier like an ISNI.

Violeta: I don't remember why it was implemented this way. I'm concerned that this will be difficult to change in VIVOs.

Mike: I've looked at it, it's manageable. That's why we're assigning analysis to figure out the impact of tickets. So the ISNI would be a low impact since it has no impact on the software or anyone's running installation. It has an impact for a site like Northwestern, with a local implementation, they might want to migrate to the new one once we have it. That would be a medium impact ticket. You would need a CONSTRUCT query to add new triples and remove old ones.
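
The kind of migration described here (add new triples, remove old ones) can be sketched over plain tuples. This is a minimal illustration, not the actual VIVO migration: every property and class name below (`local:isni`, `ex:hasIdentifier`, `ex:ISNI`, `ex:identifierValue`) is a hypothetical placeholder, and the identifier-node naming scheme is an assumption. A real site would express the same change as a SPARQL query or update against its triple store.

```python
from typing import Iterable

Triple = tuple[str, str, str]

# Hypothetical names, for illustration only.
OLD_PROP = "local:isni"             # locally defined datatype property
HAS_IDENTIFIER = "ex:hasIdentifier"  # link from an entity to an identifier node
ISNI_CLASS = "ex:ISNI"              # subclass of a generic identifier class
VALUE_PROP = "ex:identifierValue"   # literal value hanging off the identifier
RDF_TYPE = "rdf:type"


def migrate_isni(triples: Iterable[Triple]) -> tuple[set[Triple], set[Triple]]:
    """Return (additions, removals) that move ISNI values to identifier nodes."""
    additions: set[Triple] = set()
    removals: set[Triple] = set()
    for s, p, o in triples:
        if p != OLD_PROP:
            continue
        # Assumed scheme: derive the identifier node name from subject and value.
        node = f"{s}-isni-{o.replace(' ', '')}"
        additions.add((s, HAS_IDENTIFIER, node))
        additions.add((node, RDF_TYPE, ISNI_CLASS))
        additions.add((node, VALUE_PROP, o))
        removals.add((s, p, o))
    return additions, removals


def apply_migration(triples: Iterable[Triple]) -> set[Triple]:
    """Apply the migration: drop the old triples, then add the new ones."""
    graph = set(triples)
    additions, removals = migrate_isni(graph)
    return (graph - removals) | additions
```

Keeping additions and removals as separate sets makes it possible to review the change before applying it, which fits the impact analysis Mike describes.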

Ralph: If there's an existing place for it in the ontology, that's where it will go. When we set up our Symplectic Elements system, they rewrote every single datatype. So when we set up VIVO, I said, this is the ontology, we should use it, and if it's not there, then we can make a small change.

Mike: That is something we should talk about. When people need to add something, they should ask.

Ralph: And that's something I've seen about adding ontology for arts and humanities. I would like to say there's an ontology here that already fits this so we don't have to switch later. I see us defining something like that sooner than later would be a beneficial thing for me, but also for the group.

Mike: It's on our todo list. We're sort of working out our ability to get things done.

Marijane: And I'd like to suggest that when people need to add things, they find an existing ontology, that could make transitioning easier if we decide to include that same ontology.

Javed: We have a few minutes left, so in the last developer call, there was a question about changing the ontology files. Don mentioned that they have made some changes in their ontology.

Mike: Who has?

Javed: Don has, at CU Boulder. I think they changed some labels. They wanted to change how things showed up in the UI. So Don is wondering how they keep their changes. Should they just keep track of what changes they made?

So let's say other organizations have their own customizations to the third-tier templates, so their person page and organization page look different. And we have the first-tier templates. When people make changes in their third-tier templates and we replace the first tier, it changes automatically.

Violeta: I don't know if that can be done.

Mike: That's how it works. If you have vivo.owl in your third tier, if you replace it with the new vivo.owl it will update.

Javed: The question is replacing a label in the original ontology. Is it a template question, or an ontology question? I see it two different ways.

Mike: Yes, those are two different ways to do it.

I'd really like to discuss any specific case, because the idea is that things have labels, and if they replaced labels, maybe they actually intended to replace the thing.

Javed: That's a different thing. In English we have synonyms. At Cornell, I call someone Faculty, but in another organization they might call them something else.

Mike: That's why I would like specifics, because if it's a synonym case, maybe we should handle that.

We found we had to add classes with labels because they were different things.

Javed: Maybe that isn't the best example, maybe consider the Funding Agency/Organization example, maybe they are the same thing with different names.

Violeta: I did this at Texas A&M, they didn't like the labels and so I had to rename them from the site admin.

Javed: So you updated the ontology.

Violeta: Not really, the classes stay the same.

Brian: No, traditionally we kept the labels separate, and the local value would override in an upgrade. This was a design decision in 2009 because we anticipated people might want different labels, which is why we had the first-time ontology folder.

Mike: But Brian, if a class has two labels, how does the interface decide which to show?

Brian: It picks the first one by alphabetical order.

Marijane: eagle-i handles this with a special application-layer OWL file. They define a preferred label, which cleans up inconsistent capitalization and can also be used to make a synonymous label preferred by an SME.
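
The two label strategies discussed here can be sketched together. This is an illustration of the behavior as described in the call, not code from VIVO or eagle-i: the function name is hypothetical, and the case-insensitive ordering is an assumption about what "alphabetical" means.

```python
from typing import Iterable, Optional


def display_label(labels: Iterable[str], preferred: Optional[str] = None) -> str:
    """Choose the label to show for a class that may have several labels.

    If a preferred label is given, it wins. Otherwise fall back to the
    first label in alphabetical order.
    """
    if preferred is not None:
        return preferred
    candidates = list(labels)
    if not candidates:
        raise ValueError("no labels to choose from")
    # Assumption: compare case-insensitively, with the raw string as a
    # tie-breaker so the result does not depend on input order.
    return sorted(candidates, key=lambda s: (s.casefold(), s))[0]


if __name__ == "__main__":
    labels = ["Funding Organization", "Funding Agency"]
    print(display_label(labels))
    print(display_label(labels, preferred="Funding Organization"))
```

With only the alphabetical fallback, the label shown depends on spelling alone, which is why a site that wants a particular synonym displayed needs some way to mark it as preferred.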

Violeta: And this is important in the library world, we define synonyms all the time.

Mike: I think that property is in SKOS too.

Tatiana: And even better properties in SKOS-XL.

The VIVO-ISF ontology is an information standard for representing scholarly work.
