how to push sensitive information to a confidential section #261
Replies: 2 comments 2 replies
-
There is one crucial step missing here: datalad run-procedure setup --confidential
This will configure the
This would be ignored by ChildProject, which looks for recordings inside of
If you think it is necessary, I'll add more rules to the EL1000 datasets' setup procedure to allow for recordings (as has been done with png2019 and tsimane).
-
Update: Trev uploaded the confidential .its files and I saw them online, so I know the upload was successful. However, now that I'm trying to anonymize (and then re-import) the .its files, they don't seem to be there: https://gin.g-node.org/EL1000/winnipeg/src/main/annotations/its/confidential/raw/C004_20090801.its
When I look at the history, I only see his commit. I'm not sure what we did wrong; help welcome!
-
It's a good idea to keep a backup that is well organized and complete, including perhaps some sensitive information. To this end, you can have a datalad dataset set up with a structure whereby sensitive information goes into a "confidential" section. We are not going to explain how to set this up here, but instead assume this has already been done, and what you want to know is how to contribute to such a repository.
To make this more precise, we are going to be using the soderstrom corpus, which is part of the EL1000 superdataset. Soderstrom is private, and only admins have access to the confidential portion, so you will not be able to reproduce these steps unless you are one of the admins.
1. Make sure you have access to the confidential portion
For the Soderstrom dataset, this means following the EL1000's instructions for "gaining access to the data". Once we have given you the go-ahead, check that you can visit this site while you are logged in. If you can, you have access rights. (If you get a 404 error, you are either not logged in or you don't have access rights.)
2. Make a local version of the dataset (not the confidential part, but the normal part)
For the Soderstrom dataset, this means that you navigate to where you want to make a local copy of the data, and run:
3. Add your confidential files in a subfolder that contains "confidential" in its path -- provided the repo has been correctly set up
How these paths will work (i.e., whether the contents will be reflected only in the confidential section or more broadly) depends on how your datalad dataset was set up. If you are not sure, it's better to do a trial run without any sensitive information.
If you are adding .its files, then put your confidential .its files inside annotations/its/confidential/raw/.
If you are adding unvetted .eaf files that contain sensitive info, then put your confidential .eaf files inside annotations/eaf/confidential/.
If you are adding metadata, then put it inside metadata/confidential/.
If you are adding unvetted recordings, then put them inside recordings/raw/confidential/. NOTE: This one is not currently set up for the Soderstrom corpus!! So please don't share unvetted recordings just yet. If you are hoping to, let us know!
And for anything else, consider putting it inside extra/, but always using confidential/ in the path. For instance, perhaps you want to store transcripts of a discussion with the families. Then you can name the folder extra/interviews/confidential/, so that one day you can put vetted transcripts in extra/interviews/.
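The folder conventions above can be summarized in a small helper that builds the destination for a file and checks that a path really goes through a confidential/ folder. This is only a sketch: the function names and the dictionary are hypothetical, and the check assumes that a folder named exactly "confidential" is what the dataset's setup looks for, which may be stricter or looser than the actual rules in your repository.

```python
from pathlib import PurePosixPath

# Destination folders taken from the list above, relative to the dataset root.
CONFIDENTIAL_DIRS = {
    "its": "annotations/its/confidential/raw",
    "eaf": "annotations/eaf/confidential",
    "metadata": "metadata/confidential",
    "recordings": "recordings/raw/confidential",  # not yet set up for Soderstrom
}


def is_confidential(path: str) -> bool:
    """Return True if one of the folders in the path is named 'confidential'.

    Assumption: matching on an exact folder name; the real rules depend on
    how the dataset was set up.
    """
    return "confidential" in PurePosixPath(path).parent.parts


def destination(kind: str, filename: str) -> str:
    """Build the dataset-relative destination for a confidential file."""
    return str(PurePosixPath(CONFIDENTIAL_DIRS[kind]) / filename)


# Hypothetical file names, for illustration only.
print(destination("its", "example.its"))
print(is_confidential("extra/interviews/confidential/notes.txt"))
print(is_confidential("extra/interviews/notes.txt"))
```

Running a check like this on every new file before sharing anything is a cheap way to catch a file that ended up outside the confidential section by mistake.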
4. Submit your changes
Share your changes by:
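As a minimal sketch, assuming the usual DataLad workflow of saving and then pushing, plus the setup procedure named in the replies, the submission could be scripted as below. The helper names, the commit message and the dry-run flag are hypothetical, and placing the setup procedure right before the save is an assumption; it may belong earlier, just after making the local copy.

```python
import subprocess


def submit_commands(message: str, confidential: bool = True) -> list[list[str]]:
    """Return the DataLad commands for sharing changes, in order."""
    commands = []
    if confidential:
        # Setup procedure named in the replies as a crucial step.
        commands.append(["datalad", "run-procedure", "setup", "--confidential"])
    # Record the new files in the dataset, then publish them.
    commands.append(["datalad", "save", "-m", message])
    commands.append(["datalad", "push"])
    return commands


def submit(message: str, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in submit_commands(message):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default: nothing is executed, the commands are only printed.
    submit("add confidential .its files")
```

Keeping the dry run as the default matches the advice in step 3: do a trial run without any sensitive information before pushing for real.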