Commit 2e61fc8

Sara04, pre-commit-ci[bot], bruAristimunha, and PierreGtch authored
[WIP] Adding the "A large EEG database with users' profile information for motor imagery Brain-Computer Interface" dataset (#404)
* Adding a new dataset
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Add data loading, update docstring
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Update dataset and add example
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Update Dreyer 2023 dataset class and example
* Remove loading info before db is downloaded
* Update dataset session and run naming; update plotting
* Add whats_new
* Add words to ignore in pre commit config
* Add words to ignore in pre commit config
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* [pre-commit.ci] auto fixes from pre-commit.com hooks
* Update docstrings
* Update summary table
* Rename file to lowercase
* rename file to lowercase
* Update docstring ref links
* Small adjustment on the tutorial
* Do not expose the subjects and db_id parameters
* playing and cleaning
* updating the table
* updating the table
* fixing the limits
* updating the API
* updating here
* suggestions from the code review
* making the example run
* Update moabb/datasets/dreyer2023.py

  Co-authored-by: Pierre Guetschel <[email protected]>
  Signed-off-by: Bru <[email protected]>
* Update moabb/datasets/dreyer2023.py

  Co-authored-by: Pierre Guetschel <[email protected]>
  Signed-off-by: Bru <[email protected]>

---------

Signed-off-by: Bru <[email protected]>
Signed-off-by: Pierre Guetschel <[email protected]>
Signed-off-by: Bru <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Bru <[email protected]>
Co-authored-by: Pierre Guetschel <[email protected]>
Co-authored-by: Pierre Guetschel <[email protected]>
Co-authored-by: Bru <[email protected]>
1 parent 8ca59e4 commit 2e61fc8

10 files changed (+779, -2 lines changed)

.pre-commit-config.yaml

+1 -1
@@ -79,7 +79,7 @@ repos:
     hooks:
       - id: codespell
         args:
-          - --ignore-words-list=assertIn,additionals,alle,alot,bund,currenty,datas,farenheit,falsy,fo,haa,hass,iif,incomfort,ines,ist,nam,nd,pres,pullrequests,resset,rime,ser,serie,te,technik,ue,unsecure,withing,zar,crate
+          - --ignore-words-list=assertIn,additionals,alle,alot,bund,currenty,datas,farenheit,falsy,fo,haa,hass,iif,incomfort,ines,ist,nam,nd,pres,pullrequests,resset,rime,ser,serie,te,technik,ue,unsecure,withing,zar,crate,Perfomances,Aline
           - --skip="./.*,*.csv,*.json,*.ambr"
           - --quiet-level=2
         exclude_types: [ csv, json, svg, pdf ]

docs/source/api/datasets.rst

+4
@@ -15,6 +15,10 @@ Motor Imagery Datasets
     BNCI2015_001
     BNCI2015_004
     Cho2017
+    Dreyer2023
+    Dreyer2023A
+    Dreyer2023B
+    Dreyer2023C
     Lee2019_MI
     GrosseWentrup2009
     Ofner2017

docs/source/whats_new.rst

+2 -1
@@ -17,12 +17,14 @@ Develop branch - 1.2.1
 
 Enhancements
 ~~~~~~~~~~~~
+- Adding new motor imagery dataset, Dreyer2023 (PR :gh:`404` by `Sara Sedlar`_, `Sylvain Chevallier`_ and `Bruno Aristimunha`_)
 - Reordering the examples in the documentation (:gh:`807` by `Bruno Aristimunha`_)
 - Creating the meta information for the BIDS converted datasets (:gh:`688` by `Bruno Aristimunha`_)
 
 
 Bugs
 ~~~~
+- Fix caching issue with incomplete results (:gh:`715` by `Sylvain Chevallier`_)
 
 API changes
 ~~~~~~~~~~~
@@ -57,7 +59,6 @@ Bugs
 - Fixing the dataset details for bids conversion (:gh:`698` by `Bruno Aristimunha`_)
 - Fixing unit issue and lack of montage with :class:`moabb.datasets.Rodrigues2017`, :class:`moabb.datasets.Rodrigues2017`, :class:`moabb.datasets.BaseCastillos2023`, :class:`moabb.datasets.BaseCastillos2023`, :class:`moabb.datasets.Huebner2018`, :class:`moabb.datasets.Cattan2019_PHMD`, :class:`moabb.datasets.Ofner2017` (:gh:`700` `Bruno Aristimunha`_)
 - Fix t-test permutation tests (:gh:`684` and :gh:`709` by `Gregoire Cattan`_, `Anton Andreev`_, `Marco Congedo`_ and `Bruno Aristimunha`_)
-- Fix caching issue with incomplete results (:gh:`715` by `Sylvain Chevallier`_)
 
 
 API changes
@@ -0,0 +1,164 @@
+"""
+===============================================
+Example of analysis of the Dreyer2023A dataset.
+===============================================
+
+This example shows how to plot Dreyer2023A Left-Right Imagery ROC AUC scores
+obtained with a CSP+LDA pipeline versus demographic information of the examined
+subjects (gender and age) and experimenters (gender).
+
+To reduce computational time, the example is provided for four subjects.
+
+"""
+
+# Authors: Sara Sedlar <[email protected]>
+#          Sylvain Chevallier <[email protected]>
+# License: BSD (3-clause)
+
+import matplotlib.patches as mpatches
+import matplotlib.pyplot as plt
+import seaborn as sb
+from pyriemann.estimation import Covariances
+from pyriemann.spatialfilters import CSP
+from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA
+from sklearn.pipeline import make_pipeline
+
+from moabb.datasets import Dreyer2023A
+from moabb.evaluations import WithinSessionEvaluation
+from moabb.paradigms import MotorImagery
+
+
+########################################################################################
+# 1. Defining dataset, selecting subjects for analysis and getting data
+dreyer2023 = Dreyer2023A()
+dreyer2023.subject_list = [1, 5, 7, 35]
+dreyer2023.get_data()
+########################################################################################
+# 2. Defining MotorImagery paradigm and CSP+LDA pipeline
+paradigm = MotorImagery()
+pipelines = {}
+pipelines["CSP+LDA"] = make_pipeline(
+    Covariances(estimator="oas"), CSP(nfilter=6), LDA(solver="lsqr", shrinkage="auto")
+)
+########################################################################################
+# 3. Within session evaluation of the pipeline
+evaluation = WithinSessionEvaluation(
+    paradigm=paradigm, datasets=[dreyer2023], suffix="examples", overwrite=False
+)
+results = evaluation.process(pipelines)
+
+########################################################################################
+# 4. Loading dataset info and merging it with the obtained results
+info = dreyer2023.get_subject_info().rename(columns={"score": "score_MR"})
+# Creating a new column with subject's age
+info["Age"] = 2019 - info["Birth_year"]
+# Casting to int for merging
+info["subject"] = info["SUJ_ID"].astype(int)
+results["subject"] = results["subject"].astype(int)
+
+results_info = results.merge(info, on="subject", how="left")
+
+########################################################################################
+########################################################################################
+# 5.1 Plotting subject AUC ROC scores vs subject's gender
+fig, ax = plt.subplots(nrows=2, ncols=2, facecolor="white", figsize=[16, 8], sharey=True)
+fig.subplots_adjust(wspace=0.0, hspace=0.5)
+sb.boxplot(
+    data=results_info, y="score", x="SUJ_gender", ax=ax[0, 0], palette="Set1", width=0.3
+)
+sb.stripplot(
+    data=results_info,
+    y="score",
+    x="SUJ_gender",
+    ax=ax[0, 0],
+    palette="Set1",
+    linewidth=1,
+    edgecolor="k",
+    size=3,
+    alpha=0.3,
+    zorder=1,
+)
+ax[0, 0].set_title("AUC ROC scores vs. subject gender")
+ax[0, 0].set_xticklabels(["Man", "Woman"])
+ax[0, 0].set_ylabel("ROC AUC")
+ax[0, 0].set_xlabel(None)
+ax[0, 0].set_ylim(0.3, 1)
+########################################################################################
+# 5.2 Plotting subject AUC ROC scores vs subject's age per gender
+sb.regplot(
+    data=results_info[results_info["SUJ_gender"] == 1][["score", "Age"]].astype(
+        "float32"
+    ),
+    y="score",
+    x="Age",
+    ax=ax[0, 1],
+    scatter_kws={"color": "#e41a1c", "alpha": 0.5},
+    line_kws={"color": "#e41a1c"},
+)
+sb.regplot(
+    data=results_info[results_info["SUJ_gender"] == 2][["score", "Age"]].astype(
+        "float32"
+    ),
+    y="score",
+    x="Age",
+    ax=ax[0, 1],
+    scatter_kws={"color": "#377eb8", "alpha": 0.5},
+    line_kws={"color": "#377eb8"},
+)
+ax[0, 1].set_title("AUC ROC scores vs. subject age per gender")
+ax[0, 1].set_ylabel(None)
+ax[0, 1].set_xlabel(None)
+ax[0, 1].legend(
+    handles=[
+        mpatches.Patch(color="#e41a1c", label="Man"),
+        mpatches.Patch(color="#377eb8", label="Woman"),
+    ]
+)
+########################################################################################
+# 5.3 Plotting subject AUC ROC scores vs experimenter's gender
+sb.boxplot(
+    data=results_info, y="score", x="EXP_gender", ax=ax[1, 0], palette="Set1", width=0.3
+)
+sb.stripplot(
+    data=results_info,
+    y="score",
+    x="EXP_gender",
+    ax=ax[1, 0],
+    palette="Set1",
+    linewidth=1,
+    edgecolor="k",
+    size=3,
+    alpha=0.3,
+    zorder=1,
+)
+ax[1, 0].set_title("AUC ROC scores vs. experimenter gender")
+ax[1, 0].set_xticklabels(["Man", "Woman"])
+ax[1, 0].set_ylabel("ROC AUC")
+ax[1, 0].set_xlabel(None)
+ax[1, 0].set_ylim(0.3, 1)
+########################################################################################
+# 5.4 Plotting subject AUC ROC scores vs subject's age
+sb.regplot(
+    data=results_info[["score", "Age"]].astype("float32"),
+    y="score",
+    x="Age",
+    ax=ax[1, 1],
+    scatter_kws={"color": "black", "alpha": 0.5},
+    line_kws={"color": "black"},
+)
+ax[1, 1].set_title("AUC ROC scores vs. subject age")
+ax[1, 1].set_ylabel(None)
+plt.show()
+########################################################################################
+# 5.5 Obtained results for four selected subjects correspond to the following figure.
+#
+# .. image:: ../images/Dreyer_clf_scores_vs_subj_info/4_selected_subjects.png
+#    :align: center
+#    :alt: 4_selected_subjects
+
+########################################################################################
+# Obtained results for all subjects correspond to the following figure.
+#
+# .. image:: ../images/Dreyer_clf_scores_vs_subj_info/all_subjects.png
+#    :align: center
+#    :alt: all_subjects
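
Step 4 of the example above joins the evaluation scores with the per-subject metadata. The sketch below replays that step on small made-up frames so it can be run without downloading the dataset. The column names (`SUJ_ID`, `Birth_year`, `SUJ_gender`, `score`) are taken from the example, but every value is invented, and the `validate="many_to_one"` argument is an addition that is not in the commit. The rename presumably exists because the metadata carries its own `score` column, which would otherwise clash with the classification score.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by evaluation.process(pipelines);
# subject identifiers are strings here to show why the example casts them to int.
results = pd.DataFrame(
    {"subject": ["1", "5", "7", "35"], "score": [0.91, 0.62, 0.78, 0.55]}
)

# Hypothetical stand-in for dreyer2023.get_subject_info(); all values are invented.
info = pd.DataFrame(
    {
        "SUJ_ID": [1, 5, 7, 35],
        "Birth_year": [1990, 1985, 1998, 1979],
        "SUJ_gender": [1, 2, 2, 1],
        "score": [10, 12, 8, 15],
    }
)

# Rename the metadata score so it does not collide with the classification score.
info = info.rename(columns={"score": "score_MR"})
# Age relative to 2019, the reference year used in the example.
info["Age"] = 2019 - info["Birth_year"]
# Both frames need the same key name and dtype before merging.
info["subject"] = info["SUJ_ID"].astype(int)
results["subject"] = results["subject"].astype(int)

# Left merge keeps every result row; validate (an addition to the original
# example) raises if a subject appears more than once in the metadata.
results_info = results.merge(info, on="subject", how="left", validate="many_to_one")
print(results_info[["subject", "score", "score_MR", "Age", "SUJ_gender"]])
```

With real results there is one row per subject, session and pipeline, so the merge is many-to-one, which is why that validation mode is used here.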

moabb/datasets/__init__.py

+1
@@ -55,6 +55,7 @@
     CastillosCVEP40,
     CastillosCVEP100,
 )
+from .dreyer2023 import Dreyer2023, Dreyer2023A, Dreyer2023B, Dreyer2023C
 from .epfl import EPFLP300
 from .erpcore2021 import (
     ErpCore2021_ERN,

moabb/datasets/download.py

+47
@@ -9,6 +9,7 @@
 import urllib
 from pathlib import Path
 
+import pandas as pd
 import requests
 from mne import get_config, set_config
 from mne.datasets.utils import _get_path
@@ -297,3 +298,49 @@ def download_if_missing(file_path, url, warn_missing=True):
         if warn_missing:
             warn(f"{file_path} not found. Downloading from {url}")
         urllib.request.urlretrieve(url, file_path)
+
+
+def create_metainfo_osf(osf_code: str) -> pd.DataFrame:
+    """Create a metadata file for a dataset stored on OSF."""
+    # OSF API base URL for the project's OSF storage
+
+    base_url = f"https://api.osf.io/v2/nodes/{osf_code}/files/osfstorage/"
+
+    files = []  # to collect (name, url) tuples
+    stack = [base_url + "?page[size]=100"]  # start with base URL, up to 100 results
+
+    while stack:
+        url = stack.pop()
+        try:
+            response = requests.get(url)
+            data = response.json()
+        except Exception as e:
+            print(f"Failed to fetch {url}: {e}")
+            continue
+
+        # Loop through items in this page
+        for item in data.get("data", []):
+            attrs = item.get("attributes", {})
+            kind = attrs.get("kind")
+            if kind == "folder":
+                # If folder, add its listing URL to stack for later retrieval
+                rel = item.get("relationships", {})
+                files_rel = rel.get("files", {}) if rel else {}
+                folder_url = files_rel.get("links", {}).get("related", {}).get("href")
+                if folder_url:
+                    # Append page[size]=100 to folder URL as well for efficiency
+                    stack.append(folder_url + "?page[size]=100")
+            elif kind == "file":
+                name = attrs.get("name")
+                download_url = item.get("links", {}).get("download")
+                if name and download_url:
+                    files.append((name, download_url))
+
+        # If there's a next page, add it to stack to continue pagination
+        next_url = data.get("links", {}).get("next")
+        if next_url:
+            stack.append(next_url)
+
+    metainfo = pd.DataFrame(files, columns=["filename", "url"])
+
+    return metainfo
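
`create_metainfo_osf` walks a paginated, nested file listing with an explicit stack: folders push their listing URL and each page pushes its `next` link. The sketch below isolates that traversal so it can be run offline. It is not the function from the commit: `walk_listing`, the injected `fetch` callable, the `seen` set and the `FAKE_PAGES` dictionary are all hypothetical names introduced for illustration, and the fake pages only imitate the JSON fields that the committed code reads.

```python
from typing import Callable, List, Tuple


def walk_listing(start_url: str, fetch: Callable[[str], dict]) -> List[Tuple[str, str]]:
    """Collect (name, download_url) pairs from a paginated, nested listing."""
    files = []
    stack = [start_url]
    seen = set()  # illustrative guard against visiting the same URL twice
    while stack:
        url = stack.pop()
        if url in seen:
            continue
        seen.add(url)
        page = fetch(url)
        for item in page.get("data", []):
            attrs = item.get("attributes", {})
            kind = attrs.get("kind")
            if kind == "folder":
                # Folders expose the URL of their own listing.
                rel = item.get("relationships", {})
                href = rel.get("files", {}).get("links", {}).get("related", {}).get("href")
                if href:
                    stack.append(href)
            elif kind == "file":
                name = attrs.get("name")
                download = item.get("links", {}).get("download")
                if name and download:
                    files.append((name, download))
        # Follow pagination when the page advertises a next link.
        next_url = page.get("links", {}).get("next")
        if next_url:
            stack.append(next_url)
    return files


def _file(name: str) -> dict:
    """Build a fake file entry (hypothetical helper for this sketch)."""
    return {"attributes": {"kind": "file", "name": name}, "links": {"download": f"dl/{name}"}}


# Fake in-memory "API": two root pages and one sub-folder page.
FAKE_PAGES = {
    "root": {
        "data": [
            _file("a.txt"),
            {
                "attributes": {"kind": "folder", "name": "sub"},
                "relationships": {"files": {"links": {"related": {"href": "sub"}}}},
            },
        ],
        "links": {"next": "root?page=2"},
    },
    "root?page=2": {"data": [_file("b.txt")], "links": {"next": None}},
    "sub": {"data": [_file("c.txt")], "links": {}},
}

found = walk_listing("root", FAKE_PAGES.__getitem__)
print(sorted(name for name, _ in found))
```

Passing the fetcher in as an argument is a design choice of this sketch only. It makes the traversal testable without network access, whereas the committed function calls `requests.get` directly.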
