Sandra Mierz edited this page Nov 5, 2021 · 5 revisions

generate2vivo

generate2vivo is an extensible data ingest tool for the open source software VIVO. It currently queries metadata from DataCite Commons, Crossref, ROR and ORCID and maps it to the VIVO ontology using sparql-generate. The resulting RDF data can be exported directly to a VIVO instance or returned as JSON-LD.

Features

The starting point was the sparql-generate library, which we use as the engine for our transformations. The transformations themselves are defined in separate GENERATE queries.
Because code and queries are kept separate, users can

  • write or change queries without touching the code
  • reuse the queries (you can discard the code and use only the queries, for example with the command line tool provided on the sparql-generate website)
  • reuse the code (if the data sources are not relevant to you, you can discard the queries and use the code with your own queries)
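To illustrate why this separation makes queries swappable, here is a minimal sketch of how an application could discover query files at runtime. It is not taken from the generate2vivo code base: the function name `load_queries`, the directory layout and the use of the `.rqg` file extension are assumptions made for the example.

```python
from pathlib import Path


def load_queries(query_dir):
    """Return a mapping of query name -> query text for every *.rqg file.

    Hypothetical helper: the name of a query is its path relative to
    query_dir, without the extension (e.g. "orcid/person").
    """
    queries = {}
    for path in sorted(Path(query_dir).rglob("*.rqg")):
        name = path.relative_to(query_dir).with_suffix("").as_posix()
        queries[name] = path.read_text(encoding="utf-8")
    return queries
```

With a layout like this, adding or changing a transformation only means adding or editing a query file; the code that runs the queries stays untouched.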

In addition, the application has a REST API, so other programs or services can communicate with it using HTTP requests. This allows generate2vivo to be integrated into an existing data ingest process.
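As a rough sketch of what such an integration could look like, the snippet below builds an HTTP request that another service might send to generate2vivo. The base URL, the path scheme `/{source}/{entity}`, the `id` parameter and the use of GET are all hypothetical placeholders for illustration; consult the API documentation of your installation for the real endpoints.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical address of a locally running generate2vivo instance.
BASE_URL = "http://localhost:9000"


def build_ingest_request(source, entity, **params):
    """Build (but do not send) a request for one data source and entity type.

    The path layout and parameter names are assumptions, not the documented API.
    """
    url = f"{BASE_URL}/{source}/{entity}?{urlencode(params)}"
    return Request(url, method="GET", headers={"Accept": "application/ld+json"})


# Example: ask for a person record by ORCID iD (path and parameter are assumed).
request = build_ingest_request("orcid", "person", id="0000-0002-1825-0097")
```

The request object can then be passed to `urllib.request.urlopen` or translated into the HTTP client your ingest pipeline already uses.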

On the output side, the generated data can be exported directly into a VIVO instance via its SPARQL API. Alternatively, if you want to check the data before importing it, or if you use a messaging service such as Apache Kafka, the generated data can be returned as JSON-LD for post-processing.
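The two output paths can be sketched as follows. The first helper is a simple sanity check on returned JSON-LD that counts nodes per `@type`. The second builds a request for VIVO's SPARQL Update API, which accepts a POST to `/api/sparqlUpdate` with the form fields `email`, `password` and `update`. Both function names are hypothetical, and the sketch assumes the triples are already available as N-Triples; it does not show how generate2vivo itself performs the export.

```python
import json
from collections import Counter
from urllib.parse import urlencode
from urllib.request import Request


def count_types(jsonld_text):
    """Count nodes per @type so the data can be checked before importing."""
    doc = json.loads(jsonld_text)
    # Accept either a top-level array of nodes or an object with "@graph".
    nodes = doc.get("@graph", [doc]) if isinstance(doc, dict) else doc
    counts = Counter()
    for node in nodes:
        types = node.get("@type", [])
        if isinstance(types, str):
            types = [types]
        counts.update(types)
    return counts


def build_vivo_update(vivo_url, email, password, ntriples, graph):
    """Build (but do not send) a SPARQL Update request for a VIVO instance.

    Assumes `ntriples` is valid N-Triples and `graph` is the target graph IRI.
    """
    update = f"INSERT DATA {{ GRAPH <{graph}> {{ {ntriples} }} }}"
    body = urlencode({"email": email, "password": password, "update": update})
    return Request(
        f"{vivo_url}/api/sparqlUpdate",
        data=body.encode("utf-8"),
        method="POST",
    )
```

Checking the type counts first is a cheap way to catch an empty or unexpectedly small result before anything is written to the VIVO triple store.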
