
Annotating OpenNeuro datasets

Alyssa Dai edited this page Dec 21, 2023 · 3 revisions

Context

Current process to update and re-process the Neurobagel OpenNeuro forks and graph

  1. For datasets that we've annotated (openneuro-annotations):
    1. Clone the original OpenNeuroDatasets repos
    2. Pull any changes to the dataset from the original/upstream repo into our forks of the repos (OpenNeuroDatasets-JSONLD)
    3. Re-apply our annotations from openneuro-annotations to each fork's participants.json (preserving the original indentation if a participants.json already exists!) and push the changes to our forks
    • i.e., right now, if the Neurobagel data dictionaries need to be updated following a data model change, this must be done in openneuro-annotations first
  2. Run the Neurobagel CLI on our forks (OpenNeuroDatasets-JSONLD), making a named copy of the pheno-bids JSONLD in openneuro-annotations
  3. Push any changes to JSONLD files in openneuro-annotations (to keep our public copy of the graph data files up-to-date)
  4. Upload the JSONLD data in openneuro-annotations to our open_neuro graph database, clearing the existing data in the graph
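
The indentation-preserving write in step 1.3 could be sketched as follows. This is a minimal illustration, not the script actually used: the helper names detect_indent and apply_annotations are hypothetical, and it assumes the annotated dictionary replaces the file contents wholesale and that the existing file is indented consistently.

```python
import json
import re
from pathlib import Path


def detect_indent(text, default=4):
    """Guess a JSON document's indentation from its first indented line."""
    match = re.search(r"^([ \t]+)\S", text, flags=re.MULTILINE)
    if match is None:
        # Compact (single-line) JSON has no indentation to copy.
        return default
    whitespace = match.group(1)
    # json.dump accepts an int (number of spaces) or a string such as "\t".
    return whitespace if "\t" in whitespace else len(whitespace)


def apply_annotations(target: Path, annotated: dict, default_indent=4):
    """Write `annotated` to `target`, reusing the existing file's indentation.

    Assumption: `annotated` is the complete data dictionary to write,
    not a partial update to merge into the existing file.
    """
    indent = default_indent
    if target.exists():
        indent = detect_indent(target.read_text(encoding="utf-8"), default_indent)
    with target.open("w", encoding="utf-8") as f:
        json.dump(annotated, f, indent=indent, ensure_ascii=False)
        f.write("\n")
```

Reusing the upstream indentation keeps the diff between the fork and the original repo limited to the annotation content itself.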

Problems encountered when updating the forks (OpenNeuroDatasets-JSONLD)

  • 7/448 datasets from openneuro-annotations could not be found in OpenNeuroDatasets - these datasets exist in https://datasets.datalad.org/openneuro, but not in OpenNeuroDatasets/OpenNeuroDatasets-JSONLD
  • 7/448 datasets from openneuro-annotations could not be updated in OpenNeuroDatasets-JSONLD
    • fatal: Not possible to fast-forward, aborting. -> these forks could not be brought up to date with their upstreams, so the new participants.json files were also not pushed successfully
  • 2/448 datasets failed due to upstream default branch main not being found
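
One way to tell these update failures apart when syncing a fork is sketched below. It is only an assumption about how the update could be scripted: the function names and status labels are hypothetical, the remote is assumed to be called upstream, and the target branch is assumed to be checked out already in the local clone of the fork.

```python
import subprocess

# Message quoted in the list above; git prints it to stderr.
FAST_FORWARD_ERROR = "Not possible to fast-forward"


def classify_merge_result(returncode, stderr):
    """Map the outcome of `git merge --ff-only` to a short status label."""
    if returncode == 0:
        return "updated"
    if FAST_FORWARD_ERROR in stderr:
        return "diverged"
    return "error"


def update_fork(repo_dir, branch="main", remote="upstream"):
    """Fast-forward the checked-out branch of a fork clone to its upstream."""

    def git(*args):
        return subprocess.run(
            ["git", "-C", str(repo_dir), *args],
            capture_output=True,
            text=True,
        )

    if git("fetch", remote).returncode != 0:
        return "fetch_failed"

    # Check that the upstream branch exists before trying to merge it.
    ref = f"refs/remotes/{remote}/{branch}"
    if git("rev-parse", "--verify", "--quiet", ref).returncode != 0:
        return "upstream_branch_missing"

    merge = git("merge", "--ff-only", f"{remote}/{branch}")
    return classify_merge_result(merge.returncode, merge.stderr)
```

Recording a status per dataset makes it possible to skip pushing new annotations to forks that are marked "diverged" or "upstream_branch_missing" and to list them for manual follow-up.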

CLI failures

  • 3/441 datasets failed because both master and main branches exist: upstream now uses main, so main was the fork branch that got updated, but the 'default' branch used by the CLI is still master
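
A mismatch like this could be detected before running the CLI by comparing the default branch that each remote advertises. The sketch below assumes that the 'default' branch in question is the one the remote HEAD points to, which may not be how the CLI step actually picks its branch; the function names are hypothetical.

```python
import subprocess


def parse_default_branch(ls_remote_output):
    """Extract the branch HEAD points to from `git ls-remote --symref` output.

    The relevant line looks like: "ref: refs/heads/main<TAB>HEAD".
    Returns None if no symbolic HEAD ref is reported.
    """
    prefix = "refs/heads/"
    for line in ls_remote_output.splitlines():
        if line.startswith("ref: ") and line.endswith("\tHEAD"):
            ref = line[len("ref: "):].split("\t")[0]
            if ref.startswith(prefix):
                return ref[len(prefix):]
    return None


def remote_default_branch(url):
    """Ask a remote which branch its HEAD points to."""
    result = subprocess.run(
        ["git", "ls-remote", "--symref", url, "HEAD"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_default_branch(result.stdout)


def default_branches_match(fork_url, upstream_url):
    """Return True if fork and upstream advertise the same default branch."""
    return remote_default_branch(fork_url) == remote_default_branch(upstream_url)
```

Datasets for which default_branches_match returns False would be candidates for changing the fork's default branch before re-running the CLI.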

Internal issues relevant to updating the OpenNeuro node