Alternative Graph Backend deployment
We are moving away from Stardog as a graph backend, mostly because they no longer provide a free academic license but instead provide short-term "trials".
Take a look at https://github.com/neurobagel/planning/issues/9 to see our progress in picking a replacement.
In the meantime, here are instructions for deploying GraphDB as our graph backend instead of Stardog.
Follow the "Launch the API" section of our public docs, but change the following variables in the .env file from the defaults described in the docs:
NB_GRAPH_IMG=ontotext/graphdb:10.3.1
NB_GRAPH_ROOT_CONT=/opt/graphdb/home
NB_GRAPH_PORT=7200
NB_GRAPH_PORT_HOST=7200
NB_GRAPH_DB=repositories/my_db # NOTE: for GraphDB, this value should always take the format repositories/<your_database_name>
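A malformed NB_GRAPH_DB value is an easy mistake to make, so a quick check before launching the services may help. The following is a minimal sketch, not part of Neurobagel: the function name check_graph_db is hypothetical, and it assumes the .env file uses plain KEY=VALUE lines with an optional trailing comment and that database names contain only letters, digits, "-" and "_".

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Neurobagel): checks that NB_GRAPH_DB in a
# .env file has the form repositories/<your_database_name>.
check_graph_db() {
    local env_file="${1:-.env}"
    local value
    # Take the last NB_GRAPH_DB assignment, then drop any trailing
    # "# comment" and trailing whitespace.
    value=$(grep -E '^NB_GRAPH_DB=' "$env_file" | tail -n 1 | cut -d= -f2- \
        | sed -e 's/[[:space:]]*#.*$//' -e 's/[[:space:]]*$//')
    # Assumption: database names use only letters, digits, "-" and "_".
    if printf '%s\n' "$value" | grep -Eq '^repositories/[A-Za-z0-9_-]+$'; then
        echo "OK: NB_GRAPH_DB=$value"
    else
        echo "ERROR: expected repositories/<your_database_name>, got '$value'" >&2
        return 1
    fi
}
```

Calling check_graph_db with no argument reads the .env file in the current directory; a non-zero exit status means the value should be corrected before running docker compose.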
Make a copy of the default docker-compose.yml file in the same directory and then run docker compose up -d to launch the Neurobagel services.
Refer to the API readme for additional instructions.
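GraphDB can take a little while to start accepting requests after the containers come up, so it may be useful to wait for it before running the configuration commands. This is only a sketch: the function name wait_for_graphdb and its default values (30 attempts, 2 seconds apart) are assumptions for illustration, and it assumes GraphDB is exposed on http://localhost:7200 as set by NB_GRAPH_PORT_HOST.

```shell
#!/usr/bin/env bash
# Hypothetical helper: polls the GraphDB address until it accepts connections
# or the maximum number of attempts is reached.
wait_for_graphdb() {
    local base_url="${1:-http://localhost:7200}"
    local attempts="${2:-30}"
    local delay="${3:-2}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # curl exits with 0 as soon as the server answers the request,
        # whatever the HTTP status of the response is.
        if curl -s -o /dev/null "$base_url"; then
            echo "GraphDB is reachable at $base_url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "GraphDB did not respond at $base_url after $attempts attempts" >&2
    return 1
}
```

For example, wait_for_graphdb http://localhost:7200 10 3 would try up to 10 times with 3 seconds between attempts.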
When the API, graph, and query tool have been started and are running for the first time, you will have to do some first-run configuration.
Also refer to https://graphdb.ontotext.com/documentation/10.0/devhub/rest-api/curl-commands.html#security-management
First, change the password for the admin user that has been automatically created by GraphDB:
curl -X PATCH --header 'Content-Type: application/json' http://localhost:7200/rest/security/users/admin -d '
{"password": "NewAdminPassword"}'
Make sure to replace "NewAdminPassword" with your own, secure password.
Next, enable GraphDB security so that only authenticated users have access:
curl -X POST --header 'Content-Type: application/json' -d true http://localhost:7200/rest/security
and confirm that this was successful:
➜ curl -X POST http://localhost:7200/rest/security
Unauthorized (HTTP status 401)
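If you want to script this check rather than read the output by eye, the HTTP status code can be captured directly. The sketch below is only an illustration: the function names security_status and security_is_enabled are hypothetical, and it assumes GraphDB is reachable at http://localhost:7200 as in the commands above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the HTTP status code returned by an
# unauthenticated POST to the GraphDB security endpoint.
security_status() {
    local base_url="${1:-http://localhost:7200}"
    curl -s -o /dev/null -w '%{http_code}' -X POST "$base_url/rest/security"
}

# Succeeds only if the unauthenticated request was rejected with 401,
# which is the response shown above once security is enabled.
security_is_enabled() {
    [ "$(security_status "$@")" = "401" ]
}
```

A possible use is security_is_enabled && echo "security is on", which prints the message only when the unauthenticated request is refused.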
Now we can create a user for the API:
curl -X POST --header 'Content-Type: application/json' -u "admin:NewAdminPassword" -d '
{
"username": "DBUSER",
"password": "DBPASSWORD"
}' http://localhost:7200/rest/security/users/DBUSER
In GraphDB, graph databases are called repositories.
To create a new one, you will also have to prepare a data-config.ttl file that contains the settings for the repository you will create (see the GraphDB docs).
Make sure that the value for rep:repositoryID in the data-config.ttl file matches the database name in the value of NB_GRAPH_DB in your .env file.
For example, if NB_GRAPH_DB=repositories/my_db, then the file should contain rep:repositoryID "my_db" ;
You can use this example file and save it as data-config.ttl locally:
#
# RDF4J configuration template for a GraphDB repository
#
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix rep: <http://www.openrdf.org/config/repository#>.
@prefix sr: <http://www.openrdf.org/config/repository/sail#>.
@prefix sail: <http://www.openrdf.org/config/sail#>.
@prefix graphdb: <http://www.ontotext.com/config/graphdb#>.
[] a rep:Repository ;
rep:repositoryID "my_db" ;
rdfs:label "" ;
rep:repositoryImpl [
rep:repositoryType "graphdb:SailRepository" ;
sr:sailImpl [
sail:sailType "graphdb:Sail" ;
graphdb:read-only "false" ;
# Inference and Validation
graphdb:ruleset "rdfsplus-optimized" ;
graphdb:disable-sameAs "true" ;
graphdb:check-for-inconsistencies "false" ;
# Indexing
graphdb:entity-id-size "32" ;
graphdb:enable-context-index "false" ;
graphdb:enablePredicateList "true" ;
graphdb:enable-fts-index "false" ;
graphdb:fts-indexes ("default" "iri") ;
graphdb:fts-string-literals-index "default" ;
graphdb:fts-iris-index "none" ;
# Queries and Updates
graphdb:query-timeout "0" ;
graphdb:throw-QueryEvaluationException-on-timeout "false" ;
graphdb:query-limit-results "0" ;
# Settable in the file but otherwise hidden in the UI and in the RDF4J console
graphdb:base-URL "http://example.org/owlim#" ;
graphdb:defaultNS "" ;
graphdb:imports "" ;
graphdb:repository-type "file-repository" ;
graphdb:storage-folder "storage" ;
graphdb:entity-index-size "10000000" ;
graphdb:in-memory-literal-properties "true" ;
graphdb:enable-literal-index "true" ;
]
].
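Before creating the repository, it can help to confirm that the configuration file and the .env file agree on the database name. The following is a minimal sketch under stated assumptions: the function name check_repository_id is hypothetical, rep:repositoryID is assumed to sit on its own line as in the example file, and NB_GRAPH_DB is assumed to have the form repositories/<your_database_name>.

```shell
#!/usr/bin/env bash
# Hypothetical helper: compares rep:repositoryID in data-config.ttl with the
# database name in NB_GRAPH_DB (the part after "repositories/") in .env.
check_repository_id() {
    local ttl_file="${1:-data-config.ttl}"
    local env_file="${2:-.env}"
    local repo_id db_name
    # Extract the quoted value that follows rep:repositoryID.
    repo_id=$(sed -n 's/^[[:space:]]*rep:repositoryID[[:space:]]*"\([^"]*\)".*/\1/p' "$ttl_file" | head -n 1)
    # Extract the name after "repositories/", stopping at whitespace or "#".
    db_name=$(sed -n 's/^NB_GRAPH_DB=repositories\/\([^[:space:]#]*\).*/\1/p' "$env_file" | tail -n 1)
    if [ -n "$repo_id" ] && [ "$repo_id" = "$db_name" ]; then
        echo "OK: both files use '$repo_id'"
    else
        echo "MISMATCH: $ttl_file has '$repo_id', $env_file has '$db_name'" >&2
        return 1
    fi
}
```

Run with no arguments, it reads data-config.ttl and .env from the current directory.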
Then you can create a new graph db with the following command (replace "my_db" as needed):
curl -X PUT -u "admin:NewAdminPassword" http://localhost:7200/repositories/my_db --data-binary "@data-config.ttl" -H "Content-Type: application/x-turtle"
and give our user access permission to the new repository:
curl -X PUT --header 'Content-Type: application/json' -d '
{"grantedAuthorities": ["WRITE_REPO_my_db","READ_REPO_my_db"]}' http://localhost:7200/rest/security/users/DBUSER -u "admin:NewAdminPassword"
- "WRITE_REPO_my_db": Grants write permission.
- "READ_REPO_my_db": Grants read permission.
Note: make sure you replace my_db with the name of the graph db you have just created.
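Because the database name appears twice in the permissions payload, one way to avoid typos is to generate the JSON from a single argument. This is a sketch only: the function name repo_authorities_json is hypothetical, and <ADMIN_PASSWORD> in the usage comment stands for the admin password you set earlier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the JSON payload granting read and write access
# to one repository, so the name only has to be typed once.
repo_authorities_json() {
    local repo="$1"
    printf '{"grantedAuthorities": ["WRITE_REPO_%s","READ_REPO_%s"]}\n' "$repo" "$repo"
}

# Possible usage (placeholders as in the commands above):
#   curl -X PUT --header 'Content-Type: application/json' \
#       -u "admin:<ADMIN_PASSWORD>" \
#       -d "$(repo_authorities_json my_db)" \
#       http://localhost:7200/rest/security/users/DBUSER
```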
To test that the above setup steps worked correctly, we can add some example graph-ready data (JSONLD files) to the new graph db from the neurobagel/neurobagel_examples repository.
First, clone neurobagel/neurobagel_examples:
git clone https://github.com/neurobagel/neurobagel_examples.git
The neurobagel/api repo comes with a helper script add_data_to_graph.sh to automatically upload all JSONLD files in a directory to a user-specified graph database, with the option to clear the existing data in the database first.
A version of this script for a GraphDB endpoint is available from here.
Download the add_data_to_graph_graphdb.sh script (cloning the gist creates a directory named after the gist ID, which contains the script):
git clone https://gist.github.com/e10d0ba1d8e89d1564b7029b386e6637.git
To view all the command line arguments for the script:
./add_data_to_graph_graphdb.sh --help
ℹ️ Note: If you prefer to directly use curl requests to modify the graph database instead of the helper script, you can use commands like the following.
Add a single dataset to the graph database (example):
curl -u "<USERNAME>:<PASSWORD>" -i -X POST http://localhost:7200/repositories/<DATABASE_NAME>/statements \
-H "Content-Type: application/ld+json" \
--data-binary @<DATASET_NAME>.jsonld
Clear all data in the graph database (example):
curl -u "<USERNAME>:<PASSWORD>" -X POST http://localhost:7200/repositories/<DATABASE_NAME>/statements \
-H "Content-Type: application/sparql-update" \
--data-binary "DELETE { ?s ?p ?o } WHERE { ?s ?p ?o }"
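To upload several files with plain curl, the single-dataset request can be wrapped in a loop. The following is a rough sketch and not the actual helper script: the function name upload_jsonld_dir and its argument order are assumptions chosen to mirror the helper script's arguments, and --fail is added so that an HTTP error stops the loop.

```shell
#!/usr/bin/env bash
# Hypothetical sketch (not the actual helper script): uploads every .jsonld
# file in a directory to a GraphDB repository, one POST request per file.
upload_jsonld_dir() {
    local dir="$1" base_url="$2" db="$3" user="$4" password="$5"
    local file
    for file in "$dir"/*.jsonld; do
        # Skip the unexpanded pattern when the directory has no .jsonld files.
        [ -e "$file" ] || continue
        echo "Uploading $file"
        curl -s --fail -u "$user:$password" -X POST "$base_url/$db/statements" \
            -H "Content-Type: application/ld+json" \
            --data-binary "@$file" || return 1
    done
}

# Possible usage (placeholders as in the rest of this page):
#   upload_jsonld_dir PATH/TO/jsonld_dir http://localhost:7200 repositories/my_db DBUSER DBPASSWORD
```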
Now, we will upload the data in the directory neurobagel_examples/data-upload/pheno-bids-output to the graph db we created above. To do this, run the helper script as follows:
./add_data_to_graph_graphdb.sh PATH/TO/neurobagel_examples/data-upload/pheno-bids-output localhost:7200 repositories/my_db DBUSER DBPASSWORD \
--clear-data
NOTE: Here we added the --clear-data flag to remove any existing data in the database (if the database is empty, the flag has no effect). You can choose to omit the flag or explicitly specify --no-clear-data (default behaviour) to skip this step.
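To confirm that the upload actually added data, you can ask the database how many statements it holds. This is a hedged sketch: the function names repo_size and repo_has_data are hypothetical, and it assumes your GraphDB version exposes the RDF4J-style size endpoint at <base_url>/repositories/<database_name>/size, which returns a plain integer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the number of statements in a repository using
# the RDF4J-style /size endpoint (assumed to be available on this GraphDB).
repo_size() {
    local base_url="$1" db="$2" user="$3" password="$4"
    curl -s --fail -u "$user:$password" "$base_url/$db/size"
}

# Succeeds if the repository reports at least one statement.
repo_has_data() {
    local size
    size=$(repo_size "$@") || return 1
    # A non-numeric response makes the comparison fail, which is treated
    # the same as an empty repository.
    [ "$size" -gt 0 ] 2>/dev/null
}

# Possible usage (placeholders as in the rest of this page):
#   repo_has_data http://localhost:7200 repositories/my_db DBUSER DBPASSWORD && echo "data uploaded"
```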