Annotating OpenNeuro datasets
Alyssa Dai edited this page Dec 21, 2023 · 3 revisions
- The OpenNeuroDatasets-JSONLD repo contains forks of datasets from the OpenNeuroDatasets repo for which we have been able to semi-manually create data dictionaries.
- https://github.com/OpenNeuroDatasets-JSONLD/.github contains the code (`code/`) for updating the repos of OpenNeuroDatasets-JSONLD. ⚠️ The script `update_json` assumes upstream doesn't have any Neurobagel annotations.
- openneuro-annotations contains JSONs auto-generated (with some minor manual revisions?) based on our manual spreadsheet annotation of the OpenNeuro datalad superdataset.
  ⚠️ This datalad superdataset does not entirely overlap with the OpenNeuroDatasets repo (e.g., some datasets exist only in the datalad superdataset).
- For datasets that we've annotated (openneuro-annotations):
  - Clone the original OpenNeuroDatasets repos
  - Pull any changes to the dataset from the original/upstream repo into our forks of the repos (OpenNeuroDatasets-JSONLD)
  - Re-apply our annotations to the `participants.json` found in openneuro-annotations (preserving the original indentation if a `participants.json` already exists!) and push changes to our forks
    - i.e., right now, if the Neurobagel data dictionaries need to be updated following a data model change, this must be done in `openneuro-annotations` first
  - Run the Neurobagel CLI on our forks (OpenNeuroDatasets-JSONLD), making a named copy of the pheno-bids JSONLD in openneuro-annotations
  - Push any changes to JSONLD files in openneuro-annotations (to keep our public copy of the graph data files up-to-date)
  - Upload the JSONLD data in `openneuro-annotations` to our `open_neuro` graph database, clearing the existing data in the graph
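The "preserving the original indentation" step can be sketched roughly as follows. This is a minimal illustration, not the code in `code/`: the function names `detect_indent` and `apply_annotations` are hypothetical, and it assumes the annotated file replaces the contents of the fork's `participants.json` wholesale, falling back to 4 spaces when no existing file (or no indented line) is found.

```python
import json
import re
from pathlib import Path


def detect_indent(text, default=4):
    """Guess the indentation unit of a JSON document from its first indented line."""
    match = re.search(r"^([ \t]+)\S", text, flags=re.MULTILINE)
    if match is None:
        # Single-line or empty file: nothing to infer, use the fallback.
        return default
    whitespace = match.group(1)
    # json.dumps accepts an int (number of spaces) or a string such as "\t".
    return "\t" if "\t" in whitespace else len(whitespace)


def apply_annotations(target: Path, annotated: Path) -> None:
    """Write the annotated dictionary to `target`, reusing its existing indentation.

    Assumption: the annotated file is a complete replacement for `target`.
    """
    indent = 4
    if target.exists():
        indent = detect_indent(target.read_text(encoding="utf-8"))
    data = json.loads(annotated.read_text(encoding="utf-8"))
    serialized = json.dumps(data, indent=indent, ensure_ascii=False)
    target.write_text(serialized + "\n", encoding="utf-8")
```

Matching the upstream indentation keeps the diff between a fork and its upstream limited to the annotation content itself.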
- 7/448 datasets from openneuro-annotations could not be found in OpenNeuroDatasets: these datasets exist in https://datasets.datalad.org/openneuro, but not in OpenNeuroDatasets/OpenNeuroDatasets-JSONLD
- 7/448 datasets from openneuro-annotations could not be updated in OpenNeuroDatasets-JSONLD
  - `fatal: Not possible to fast-forward, aborting.` -> these forks could not be brought up-to-date with their upstreams, and so the new `participants.json` files were also not pushed successfully
- 2/448 datasets failed due to the upstream default branch `main` not being found
- 3/441 datasets failed because both `master` and `main` branches exist: upstream now uses `main`, so this was the fork branch that was updated, but the 'default' branch that is used by the CLI is still `master`
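The two branch-related failures above could in principle be caught before updating a fork. The sketch below is one possible check, not part of the existing scripts: it parses the text printed by `git ls-remote --symref <url> HEAD` to find a remote's default branch, and the helper names `parse_default_branch` and `diagnose_branches` are hypothetical.

```python
import re


def parse_default_branch(ls_remote_output):
    """Return the branch that HEAD points to, or None if no symref line is present.

    Expects the text printed by `git ls-remote --symref <url> HEAD`, whose first
    line looks like "ref: refs/heads/main<TAB>HEAD".
    """
    for line in ls_remote_output.splitlines():
        match = re.match(r"ref: refs/heads/(\S+)\s+HEAD$", line)
        if match:
            return match.group(1)
    return None


def diagnose_branches(expected_branch, upstream_branches, fork_default):
    """Describe a branch problem for one dataset, or return None if all is consistent."""
    if expected_branch not in upstream_branches:
        return f"branch '{expected_branch}' not found upstream"
    if fork_default != expected_branch:
        return (
            f"fork default is '{fork_default}' but upstream uses '{expected_branch}'"
        )
    return None
```

For example, a fork whose default branch is still `master` while upstream has both `master` and `main` and now uses `main` would be flagged by the second check.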