Natural language definitions/comments missing after v2.3 #218

Closed
brandonnodnarb opened this issue Sep 9, 2020 · 3 comments

@brandonnodnarb
Member

brandonnodnarb commented Sep 9, 2020

Previous versions of SWEET (v2.0 to v2.3) had natural language descriptions in an rdfs:comment tag. Many of these include only a general citation to Wikipedia, i.e. the trailing text [Wikipedia] without a direct page link.

The results of a SPARQL query for rdfs:comment in v2.3 only are in the attached txt file, which contains two fields: IRI -> rdfs:comment_text
SWEET_v2.3_URI-comments.txt The file contains 1119 lines; there may be duplicate entries.
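
Since the file may contain duplicates, a quick de-duplication pass would give the real number of distinct comments. Below is a minimal Python sketch, assuming each line has the form `IRI -> comment text`; the exact separator string, the function name `load_comments`, and the constant `SEPARATOR` are illustrative assumptions, not something taken from the file itself.

```python
SEPARATOR = " -> "  # assumed field separator; adjust to match the real file


def load_comments(lines):
    """Return unique (iri, comment) pairs, preserving first-seen order."""
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for raw in lines:
        line = raw.strip()
        if not line or SEPARATOR not in line:
            continue  # skip blank or malformed lines
        # Split on the first separator only, so comments may contain " -> "
        iri, comment = line.split(SEPARATOR, 1)
        seen[(iri.strip(), comment.strip())] = None
    return list(seen)
```

Calling `load_comments(open("SWEET_v2.3_URI-comments.txt", encoding="utf-8"))` and taking `len()` of the result would then give the de-duplicated count. Note that this only removes exact duplicates; the same IRI with two different comment texts is kept twice.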

As per the discussion in #208 and #211, it would probably be prudent to carry these over to the current version where applicable (e.g. non-Cryo terms). Even if these natural language descriptors aren't well cited, they were manually added by previous developer(s); at minimum, they provide further information that can be leveraged to improve the accuracy of automated mapping methods.

Thoughts? Comments?

@brandonnodnarb
Member Author

(adding to the previous discussion on Slack)

I had thought a SPARQL update query could do the trick.
Something along the lines of:

`PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
 # Copy only the rdfs:comment triples from the v2.3 named graph into the default graph.
 # (COPY ... TO DEFAULT would copy the whole graph, and cannot be combined with SELECT.)
 INSERT { ?sub rdfs:comment ?com }
 WHERE
 {
      GRAPH <http://sweetontology.net/sweetAll_version=20171004T160715#> { ?sub rdfs:comment ?com }
 }`

(I have not tested this for syntactic validity or accuracy.)

If one were to try to run the query 'directly' via COR, I'm not sure how to specify the SWEET version in the query. I'm also assuming an admin would need to actually issue the query. There may be a better way...
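
One possible way to sidestep the version question is to build the update offline from comments that have already been extracted, so the v2.3 graph does not need to be addressable on the server at all. The following is only a sketch under that assumption: the helper names `escape_literal` and `build_insert_data` are made up for illustration, and whether COR accepts update requests in this form is not something I have checked.

```python
def escape_literal(text):
    """Escape a string for use inside a double-quoted SPARQL literal."""
    return (
        text.replace("\\", "\\\\")  # backslashes first, so later escapes survive
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def build_insert_data(pairs):
    """Build a single SPARQL INSERT DATA update from (iri, comment) pairs."""
    triples = "\n".join(
        f'  <{iri}> rdfs:comment "{escape_literal(comment)}" .'
        for iri, comment in pairs
    )
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "INSERT DATA {\n" + triples + "\n}"
    )
```

The resulting string targets the default graph; it does not check whether a term still exists in the current version or already has a comment, so the pairs would need to be filtered first.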

@brandonnodnarb
Member Author

I have 1053 rdfs:comments from v2.3. I have created a one-off PR to gather feedback before processing the lot.

As these are from a previous version (and dropped somewhere along the line), I'm inclined to add them back with the understanding that any rdfs:comment not reified, or properly cited, is a holdover and needs to be verified and validated.

Please keep in mind that the main reason these comments would be useful, in addition to contextualizing some of the entities, is to help automated matching algorithms, e.g. #225, #208, and similar. I would expect these comments to drop off in the future as they are replaced with cited material.

Thoughts?

@brandonnodnarb
Member Author

resolved with #246

@brandonnodnarb brandonnodnarb added this to the 3.5.0 milestone Jul 12, 2022