Use Pooch to download tutorial data #188

Open
3 of 5 tasks
remrama opened this issue Dec 20, 2024 · 4 comments · May be fixed by #192
Assignees
Labels
enhancement 🚧 New feature or request

Comments

@remrama
Collaborator

remrama commented Dec 20, 2024

Accessing/downloading datasets for YASA tutorials would be much easier with Pooch.

  • Add data files to OSF or Zenodo repository and remove from GitHub.
  • Create a fetchers.py module to access downloaders
  • Update tutorials and notebooks with new way to load data
  • Add dataset to test evaluation module
  • Update changelog
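
A rough sketch of what the fetchers.py task above could look like. It is an illustration under stated assumptions, not a settled design: `BASE_URL` is a placeholder (the hosting location is still undecided at this point), the registry lists a single file name from the notebooks folder with its checksum left as `None`, and `fetch` is a hypothetical function name.

```python
"""Sketch of a hypothetical yasa/fetchers.py built on Pooch."""

# Placeholder location (assumption): the real base URL would point at the
# OSF or Zenodo repository once the files are uploaded there.
BASE_URL = "https://example.org/yasa-data/"

# Hypothetical registry mapping file names to checksums. A value of None
# tells Pooch to skip hash verification; real entries should be
# "sha256:..." or "md5:..." strings.
REGISTRY = {
    "data_N2_spindles_15sec_200Hz.txt": None,
}


def _repository():
    """Build the Pooch object that knows where files live and where to cache them."""
    import pooch  # deferred so that importing this module does not require Pooch

    return pooch.create(
        path=pooch.os_cache("yasa"),  # per-user cache directory
        base_url=BASE_URL,
        registry=REGISTRY,
    )


def fetch(fname):
    """Download `fname` on first use and return the path to the local copy."""
    if fname not in REGISTRY:
        raise ValueError(f"Unknown dataset {fname!r}; choose from {sorted(REGISTRY)}")
    return _repository().fetch(fname)
```

A tutorial could then call something like `path = fetch("data_N2_spindles_15sec_200Hz.txt")` and load the file from `path`, instead of relying on files shipped in the GitHub repository.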

Pooch has very simple and clear documentation for this, and a great resource to work off of is Ensaio.

I'm afraid that some version of this will need to be implemented in v0.7. It feels necessary for me to make the tutorial notebooks for the new evaluation module, since those will require larger datasets to show their purpose. (therefore kind of blocking #166)

In the future it would be cool to access larger datasets here, such as those for full analyses rather than demos. It doesn't require any storage. A major benefit would be if YASA performed some standardization step, handling the various formats that people share their PSG data in. But given all the complexities that come with that, this feature should be reserved for a dedicated contributor later on (if anyone is interested in implementing it at all).

@remrama remrama self-assigned this Dec 20, 2024
@remrama remrama added the enhancement 🚧 New feature or request label Dec 20, 2024
@raphaelvallat
Owner

“I'm afraid that some version of this will need to be implemented in v0.7.”

Agreed 👍

@remrama
Collaborator Author

remrama commented Dec 21, 2024

Noting here that we will use Zenodo to store the YASA tutorial data. I remembered that Pooch has a nice DOIDownloader class that integrates well with Zenodo to automatically pull metadata about all the files. It's a simpler implementation this way.
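
A minimal sketch of the DOI-based approach, assuming Pooch's documented behavior that a `base_url` of the form `doi:<DOI>/` selects `DOIDownloader` automatically and that `Pooch.load_registry_from_doi()` fills the registry from the repository's file listing. The helper names `doi_base_url` and `make_repository` are hypothetical, and the cache name "yasa" is an assumption.

```python
def doi_base_url(doi):
    """Turn a bare DOI or a doi.org link into the 'doi:<DOI>/' form used by Pooch."""
    doi = doi.strip()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
            break
    # The trailing slash matters: Pooch appends the file name to base_url.
    return f"doi:{doi.strip('/')}/"


def make_repository(doi, cache_name="yasa"):
    """Create a Pooch object whose registry is filled from the DOI's file listing."""
    import pooch  # deferred so that importing this module does not require Pooch

    repo = pooch.create(
        path=pooch.os_cache(cache_name),
        base_url=doi_base_url(doi),
        registry=None,  # left empty; populated from the repository below
    )
    # Queries the data repository for file names and checksums.
    # No data file is downloaded at this step.
    repo.load_registry_from_doi()
    return repo
```

With a Zenodo DOI in hand, `make_repository(doi).fetch(fname)` would download the named file on first use and return its local path, so no hand-written list of hashes needs to be maintained.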

Plus I just noticed that YASA also gets archived on Zenodo -- sweet :)
This is good to know, and I might squeeze it into the documentation somewhere. If people want to cite the specific version of YASA they used in a paper (a good idea, considering YASA is in the early stages of development), I think this is the best way to do it.

@raphaelvallat, I'll create a new YASA tutorial data repository on Zenodo and add you as admin or whatever the highest privileges are.

@remrama remrama mentioned this issue Dec 28, 2024
5 tasks
@remrama
Collaborator Author

remrama commented Dec 28, 2024

Placing a list here of all the YASA data files to grab and put on Zenodo. If anyone notices a file I'm missing, please let me know in this thread.

From the documentation quickstart page, there is a full-night PSG recording with corresponding hypnogram:

From the notebooks folder on the GitHub repository:

  • data_ECG_8hrs_200Hz.npz
  • data_EOGs_REM_256Hz.npz
  • data_N2_spindles_15sec_200Hz.txt
  • data_N3_no-spindles_30sec_100Hz.txt
  • data_full_6hrs_100Hz_9channels.npz
  • data_full_6hrs_100Hz_Cz+Fz+Pz.npz
  • data_full_6hrs_100Hz_hypno.npz
  • data_full_6hrs_100Hz_hypno_30s.txt
  • data_resting_EO_200Hz_raw.fif
  • sub-02_hypno_30s.txt
  • sub-02_mne_raw.fif

@remrama
Collaborator Author

remrama commented Dec 28, 2024

All data files are up on a new Zenodo repository here: https://doi.org/10.5281/zenodo.14564284

All filenames are the same except I removed some of the prefixes (e.g., data_) that seemed redundant now that they are all surrounded by other data files on a data repository.

@raphaelvallat I added you as another owner/admin on that Zenodo repository. Feel free to change anything around. Any file change will bump the version; changing metadata will not (either is fine).
