create example notebooks for dandi #1

Open · satra opened this issue Jan 3, 2020 · 5 comments

satra (Member) commented Jan 3, 2020

@bendichter - would it be possible to create a DANDI-specific example notebook? It should include a cell at the top that downloads the data, or instructs the user how to download it, using datalad. Once we move all the dandisets to AWS, this should become much faster.

Just as we pull in the nwb repo, we can pull this repo in whenever a user logs in to the DANDI Hub, so any changes to this repo will get picked up.

You could, in fact, create the examples in the hub and then send a PR to this repo with those notebooks.

satra (Member, Author) commented Sep 26, 2020

@bendichter - now that there are different datasets on DANDI, can we update the example notebooks to demonstrate different kinds of visualization and analysis on the data? What would be a good way of getting community contributions?

bendichter (Member) commented

@satra For the datalad request, are you looking for something like this:

!git config --global user.email "[email protected]"
!git config --global user.name "Ben Dichter"

!datalad download-url https://girder.dandiarchive.org/api/v1/item/5e70d3093da50caa9adaf2e6/download
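
A minimal sketch of the same download through DataLad's Python API, in case that reads better in a notebook cell (the target path is a placeholder, not something specified here):

from datalad.api import download_url

url = "https://girder.dandiarchive.org/api/v1/item/5e70d3093da50caa9adaf2e6/download"
download_url(url, path="data/")  # placeholder path; fetches the asset into ./data/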

bendichter (Member) commented Sep 27, 2020

@satra Sure, let's discuss what would be most useful and what we'd need to do to get there.

Here are some ideas for analysis and visualization:

  1. Show off NWBWidgets, which has been tested on a variety of dandisets: 3-7, 9-11, 13-17, 19, and 21. That covers most of the extracellular electrophysiology and optical physiology dandisets currently available. We already have one notebook up, and the code is identical for each of the other datasets (see the sketch after this list); I'm not sure how best to communicate that it works on a variety of datasets.

  2. We could host the analysis notebook that accompanies the Rutishauser dataset (dandiset 000055; see #4, "adding notebooks for #000055").

  3. This SpikeInterface notebook, which demonstrates the following functionality (sketched below):

  • load the data with the spikeextractors package
  • preprocess the signals
  • run a popular spike sorting algorithm with different parameters
  • curate the spike sorting output using (1) automatic quality metrics and (2) consensus-based curation
  • save the results to NWB!
  4. It would be nice to post a similar pipeline for CaImAn (example notebook) or suite2p (example notebook).
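
To make item 1 concrete, the per-dandiset code is essentially the following (a minimal sketch; the file path is a placeholder for whichever NWB file has been downloaded):

from pynwb import NWBHDF5IO
from nwbwidgets import nwb2widget

# Placeholder path: any NWB file downloaded from a dandiset
io = NWBHDF5IO("sub-001_ses-001.nwb", mode="r")
nwbfile = io.read()
nwb2widget(nwbfile)  # interactive browser for the file's contents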

We need raw extracellular electrophysiology for 3 and raw optical physiology for 4. We don't have either yet; however, we do have simulation data in dandiset 28 that would work for 3. @alejoe91, can you help with this?
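
For item 3, the pipeline would look roughly like the sketch below. It uses the legacy spikeextractors/spiketoolkit/spikesorters packages named above, and the file path and sorter choice are placeholders, not settled decisions:

import spikeextractors as se
import spiketoolkit as st
import spikesorters as ss

# Placeholder file: a raw extracellular ephys NWB file (e.g. from dandiset 28)
recording = se.NwbRecordingExtractor("raw_session.nwb")

# Preprocess the signals
recording_f = st.preprocessing.bandpass_filter(recording, freq_min=300, freq_max=6000)

# Placeholder sorter name; any installed sorter accepted by run_sorter works
sorting = ss.run_sorter("herdingspikes", recording_f, output_folder="sorting_output")

# Automatic quality metrics to support curation
metrics = st.validation.compute_quality_metrics(sorting, recording=recording_f)

# Save the sorting results back to NWB
se.NwbSortingExtractor.write_sorting(sorting, "sorted_session.nwb")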

Those are the main Python-based analysis packages that are explicitly compatible with NWB and don't require a Qt window or something like that. We could potentially build interfaces to other analysis and visualization software, but that would take considerable development, so let's think about what would be most useful and be selective about which tools we want to demonstrate to start.

bendichter (Member) commented Oct 3, 2020

@satra

Rutishauser notebooks
Item 2 is now done: link. That repo also contains a second notebook here, but it reads all of the data in the dandiset, not just a single session. I decided not to add it to the example notebooks for two reasons: (1) the analysis relies on the lab's internal file structure, and (2) running it would require downloading the entire dandiset into the JupyterHub instance. I could work through (1) and power through (2) if you want; just let me know.

SpikeInterface
Satra, you were right that we do appear to have raw data from an example session in dandiset #28, but that file is 12 GB, and we'd need to download it. I tried timing the download with %time in the notebook, but that didn't work, so I don't know how long it takes. I'll work on that over the next few days and see whether the transfer would be too slow.
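
One way to time the transfer without cell magics is to wrap the download in plain Python (a rough sketch; the URL is a placeholder, not the actual dandiset 28 asset URL):

import subprocess
import time

url = "https://example.org/path/to/asset"  # placeholder URL
t0 = time.time()
subprocess.run(["datalad", "download-url", url], check=True)  # same command as above
print(f"download took {time.time() - t0:.1f} s")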

satra (Member, Author) commented Oct 5, 2020

Thanks for adding the first one. Perhaps we can augment that dataset with zarrhdf5 files and use those to see if we can do the second notebook? Or does it really require downloading all the data?

For the moment the 12 GB dataset should be OK to download, but again we could consider whether that's a good use case for zarrhdf5.
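
(As an aside, and not something settled in this thread: one way to avoid a full download entirely is to stream the NWB file straight from S3. A minimal sketch, assuming pynwb's ros3 driver and an h5py build with ROS3 support; the URL is a placeholder.)

from pynwb import NWBHDF5IO

s3_url = "https://dandiarchive.s3.amazonaws.com/path/to/file.nwb"  # placeholder, not a real asset URL
with NWBHDF5IO(s3_url, mode="r", driver="ros3") as io:
    nwbfile = io.read()  # opens the file over HTTP instead of downloading it first
    print(nwbfile.session_description)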
