
20180712 Ontology Change Improvement Call

Ralph O'Flinn edited this page Jul 12, 2018 · 2 revisions

2018.07.12

Attendees: Christian Hauschke, Michael Conlon, Brian Lowe, Damaris Murry, Kitio Fofack, Muhammad Javed, Tatiana Walther, Violeta Ilik, Ralph O'Flinn

Agenda: https://wiki.duraspace.org/display/VIVO/2018-07-12+Ontology+Improvement+Call

Mike just got back from China: a very interesting trip, with lots of interesting ontology people and lots of good conversations.

  • Mike is now the editor of vivo.owl at BioPortal, though today the site doesn't show that. https://bioportal.bioontology.org/ontologies/VIVO (note: this is NOT the VIVO-ISF entry, which is separate!)
  • Conversation with Mark from Stanford while in China; he is not a fan of BFO and upper ontologies. Mike's position is that he's just inherited these things.
  • Stefan suggested we get more involved with schema.org in some fashion

Use of XMLLiteral and VIVO-1528 https://jira.duraspace.org/browse/VIVO-1528

  • HTML editor pops up frequently because of this, which is inappropriate. HTML tags do not belong in most datatype values.
  • Can strings be marked as something that would bring up a plain text editor?
  • Mike and Huda agree this seems like app config stuff, not ontology stuff. Turns out functionality can be improved by assigning some domains and ranges. 160 of 220 datatype properties are lacking ranges.
  • Brian doesn't recall why these don't have ranges; he would like to know if any of them are used in restrictions on classes.
  • Mike's inspection revealed many of these are from inherited ontologies, like BIBO and the FAO geopolitical ontology.
  • Javed points out that people think about domains and ranges as constraints, even though they are not, so lots of ontology developers leave them blank.
  • Several people ask, doesn't that apply more to object properties than datatype properties? Javed doesn't think so.
  • So people can put whatever they want in a datatype property. Even a URL for a page number.
  • This is the problem that SHACL was designed to solve. Ralph notes that the Product Evolution group may not be interested in SHACL; but Mike believes this is a premature ruling. Nobody yet knows enough about SHACL to know whether VIVO can use it, and the Ontology TF has the best chance of understanding that. Product Evolution is making some specious statements about RDF in VIVO, but we don't want to worry about that right now.
  • Back to the issue at hand, Brian agrees that this issue seems like an app config problem, but it does seem like it would be nice if the ontology had more detail to help the application do what it needs to do.
  • This would be particularly useful for the Overview properties.
  • Stefan's PR is a practical solution.
  • Javed doesn't think it would be a bad idea.
  • If we want to give the Overview properties a range, which one should it be? The HTML type has been suggested, but it is problematic: an HTML-typed literal cannot carry a language tag, which could break things. Keeping things working could need a somewhat complex change to the application, because the language would have to be pulled out of the HTML. This is not an insurmountable challenge, but it might take a day or two of developer effort.
  • Ralph notes the history of the type: it used to have a language tag, but a discussion led them to implement the language as an embedded tag.
  • Mike notes we just implemented RDF 1.1, which means as soon as you put a type on it, you get rdf:langString, which is incompatible with HTML. Ralph thinks Jena ignores this, however.
  • Ideally we'd have HTML with a language tag. Mike thinks the RDF design is wrong, that datatype and language are orthogonal and it's not implemented that way. Brian disagrees, because an HTML document could contain multiple languages, which Ralph notes is why the language tag was removed from the design.
  • Kitio thinks other languages in the HTML is a user matter.
  • Ralph asks whether it's OK to say that the 4-5 fields that should have the WYSIWYG editor get rdf:HTML, and everything else gets the standard text box.
  • Mike thinks rdf:HTML seems appropriate but he's concerned about losing functionality.
  • Ralph doesn't think we lose anything, that this is part of our overall internationalization efforts.
  • Mike is concerned about playing whack-a-mole, where one change has unintentional cascades.
  • Ralph thinks HTML is more appropriate than XMLLiteral because the latter requires well-formed XML, which HTML is not.
  • Javed asks what the expectation is if we use rdf:HTML: do users have to enter full HTML content? Christian says the user would use a Rich Text Editor that produces HTML. And what goes in the triple? Mike and Ralph say HTML.
  • Christian asks if we should look at all the properties to see if other properties could benefit from HTML? Particularly if they have special characters that can't be entered as plain text. Mike agrees that's a good point. Ralph says that's why we have prereleases and testing, etc.
  • Brian points out we currently use rdfs:label for all those things, and we could not change the range of that property. So we'd have to introduce separate properties for things like titles, which is not a bad idea, but it's also more work.
  • Christian asks what happens if the HTML change were eventually reversed as a result of prerelease testing: would that mean there would be HTML in the plain text? Yes.
  • Ralph notes that people already manage to get HTML into strings at his site, and he strips HTML out of those.
  • Mike summarizes: we are ok with the proposal, but we are concerned about unintended consequences of the change. See Mike's recommendation text in the agenda document at the Duraspace wiki, linked above.
  • Overall this is a good, useful discussion to have. It will help improve the ontology.
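
The editor question discussed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not VIVO's actual implementation: the property names in `PROPERTY_RANGES`, their range assignments, and the functions `choose_editor` and `is_well_formed_xml` are all hypothetical. It shows an application offering the rich text editor only for properties whose declared range is the RDF 1.1 HTML datatype, and it uses Python's standard XML parser to show why typical editor output would not qualify as an XMLLiteral.

```python
import xml.etree.ElementTree as ET

# Datatype IRIs (these two IRIs are standard; everything else here is made up).
RDF_HTML = "http://www.w3.org/1999/02/22-rdf-syntax-ns#HTML"
XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"

# Hypothetical property-to-range table. None means "no range declared",
# which the notes say is the case for most datatype properties today.
PROPERTY_RANGES = {
    "ex:overview": RDF_HTML,
    "ex:pageStart": XSD_STRING,
    "ex:legacyNote": None,
}


def choose_editor(prop):
    """Return 'rich-text' only when the declared range is the HTML datatype."""
    if PROPERTY_RANGES.get(prop) == RDF_HTML:
        return "rich-text"
    # Missing or non-HTML ranges fall back to the plain text box.
    return "plain-text"


def is_well_formed_xml(fragment):
    """Return True if the fragment parses as a single well-formed XML element."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Valid HTML with an unclosed <br>, as a rich text editor might produce.
editor_output = "<p>First line<br>second line</p>"

print(choose_editor("ex:overview"))       # rich-text
print(choose_editor("ex:legacyNote"))     # plain-text
print(is_well_formed_xml(editor_output))  # False
```

Defaulting to the plain text box when no range is declared matches the proposal that only the handful of rich text fields get the HTML type.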

Versioning

  • Brief discussion about how to version the ontology moving forward: semantic versioning or not? Marijane points out that semantic versioning is a recipe for breaking things, but if breaking changes are unavoidable then maybe it's appropriate.
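
To make the semantic versioning option concrete, here is a minimal sketch. The change categories and the example version number are assumptions for illustration only; no versioning policy was agreed on the call.

```python
def bump(version, change):
    """Return the next MAJOR.MINOR.PATCH version for a given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change == "breaking":
        # Assumed example: removing a class or changing a property's range.
        return f"{major + 1}.0.0"
    if change == "additive":
        # Assumed example: adding a new class or property.
        return f"{major}.{minor + 1}.0"
    if change == "fix":
        # Assumed example: correcting a label or annotation.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change}")


print(bump("1.4.2", "breaking"))  # 2.0.0
print(bump("1.4.2", "additive"))  # 1.5.0
print(bump("1.4.2", "fix"))       # 1.4.3
```

Under this scheme, a breaking change is signalled by the major number rather than avoided, which is the trade-off raised in the discussion.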

Humanities content

  • Ralph would like to review the ticket about the Duke humanities ontology.
  • Mike regrets that this was not discussed at the conference; there are a handful of existing ontologies out there that could be used.

The VIVO-ISF ontology is an information standard for representing scholarly work.
