Here we list the publicly available datasets that we have generated and uploaded to repositories. Some datasets are yet to be released and will become available following publication.
NCBI Sequence Read Archive
The following datasets have been uploaded to the NCBI Sequence Read Archive (SRA) database in their original FASTQ data format.
| Sequencing type | Sequencing runs (uploaded) |
| --- | --- |
| Bulk transcriptomics | 425 |
| Single-cell transcriptomics | 2 |
| Shotgun metagenomics | 310 |
| 16S amplicon | 1,069 |
| ITS amplicon | 373 |
| Host organism | Context | BioProject | Availability | Samples per sequencing type (listed in the order bulk transcriptomics, single-cell transcriptomics, shotgun metagenomics, 16S amplicon, ITS amplicon; types without samples omitted) |
| --- | --- | --- | --- | --- |
| Human | Infant cystic fibrosis | PRJNA978345 – 2024 | ✅ Released | 96 stool samples; 75 BAL samples |
| Rat | Early life stress + mild traumatic brain injury | PRJNA940177 – 2024 | ✅ Released | 76 stool samples |
| Mouse | SHIP-deficient model of Crohn's-like ileitis and chronic lung inflammation | PRJNA1086166 – 2024 | ✅ Released | 24 stool samples |
| Human | Paediatric healthy + infant wheeze | PRJNA1076275 – 2024 | ✅ Released | 188 nasal swabs + 73 blood samples; 320 nasal swabs; 135 nasal swabs |
| Human | Early life + airways | PRJNA694493 – 2021 | ✅ Released | 85 nasal swabs; 118 nasal swabs + 119 oropharyngeal swabs; 119 nasal swabs + 119 oropharyngeal swabs |
| Mouse OTII cells | Germinal centre expansion + IL-21 role | PRJNA776662 – 2021 | ✅ Released | 8 culture samples |
| Mouse | Allergic airway inflammation | PRJNA641984 – 2020 | ✅ Released | 20 stool samples; 127 stool samples |
| Human | Male-associated infertility | PRJNA509076 – 2018 | ✅ Released | 94 seminal fluid samples |
| Human | Early life + immune development | PRJNA475630 – 2018 | ✅ Released | 16 tracheal aspirates; 45 tracheal aspirates |
| Human | Paediatric severe wheeze + asthma | PRJNA1080233 | ⏳ To be released | 55 bronchial brushes; 28 bronchial brushes |
| Mouse | High fat diet | PRJNA1131116 | ⏳ To be released | 24 ileum luminal samples + 24 ileum mucosal samples + 22 colon luminal samples |
| Mouse | Early life antibiotic treatment | PRJNA1112091 | ⏳ To be released | 2 lung structural cell digests; 96 stool samples; 41 lung tissue samples + 30 BAL samples |
European Nucleotide Archive
The following datasets have been uploaded to the European Nucleotide Archive (ENA) database in their original FASTQ data format.
| Sequencing type | Sequencing runs (uploaded) |
| --- | --- |
| 16S amplicon | 1,179 |

| Host organism | Context | Project ID | Availability | 16S amplicon |
| --- | --- | --- | --- | --- |
| Human | Early life + atopic dermatitis | PRJEB42268 – 2022 | ✅ Released | 1,179 lateral upper arm swabs |
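
For readers who want to locate the FASTQ files behind a released project, the following is a minimal sketch of one possible approach rather than a procedure described on this page. It assumes the ENA Portal API file-report endpoint and its `run_accession` and `fastq_ftp` fields. The helper names `filereport_url` and `parse_filereport` are hypothetical. The code only builds the query URL and parses a report that has already been downloaded; it makes no network request itself.

```python
import csv
import io
from urllib.parse import urlencode

# Assumption: the ENA Portal API file-report endpoint (not named in the listing above).
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"


def filereport_url(accession: str) -> str:
    """Build a file-report URL listing the runs and FASTQ files of a project."""
    query = urlencode({
        "accession": accession,
        "result": "read_run",
        "fields": "run_accession,fastq_ftp",
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def parse_filereport(tsv_text: str) -> dict[str, list[str]]:
    """Map each run accession to its FASTQ paths.

    The report is tab-separated; multiple files for one run are assumed
    to be joined with semicolons in the fastq_ftp column.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["run_accession"]: [path for path in row["fastq_ftp"].split(";") if path]
        for row in reader
    }


if __name__ == "__main__":
    # Prints the URL only; fetch it with a browser, curl, or urllib.request.
    print(filereport_url("PRJEB42268"))
```

The text returned from that URL can be passed to `parse_filereport` to get one list of FASTQ paths per run. Projects marked "To be released" are not expected to return any runs until they are made public.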