The KORA cohort study data [Holle et al., 2005] are only accessible after applying for a research project on the KORA-passt platform.
The microbiome_ASV_data folder contains details about the data pre-processing and about the data files provided after a successful application to a KORA project.
The R code for our pair matching implementation and for generating the diagnostic plots can be found in the design file. The matrix of 10,000 possible randomizations of the intervention assignment is also generated there, directly after matching.
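As a rough illustration of what such a randomization matrix could look like, the base R sketch below assumes a matched-pair design in which exactly one unit per pair receives the intervention, and re-draws the assignment with an independent fair coin flip per pair. The names `pair_id`, `draw_assignment`, and `W_matrix`, as well as the toy number of pairs, are hypothetical and are not taken from the design file.

```r
# Minimal sketch (assumed design): one treated and one control unit per pair.
set.seed(1)

n_pairs <- 20                               # toy number of matched pairs
pair_id <- rep(seq_len(n_pairs), each = 2)  # pair membership of each unit
n_rand  <- 10000                            # number of randomizations to draw

# Draw one assignment vector: within each pair, randomly decide
# which of the two units gets W = 1 (intervention) and which gets W = 0.
draw_assignment <- function(pair_id) {
  w <- integer(length(pair_id))
  for (p in unique(pair_id)) {
    idx <- which(pair_id == p)
    w[idx] <- sample(c(0L, 1L))             # random order of (0, 1) within the pair
  }
  w
}

# Rows are units, columns are randomized assignment vectors.
W_matrix <- replicate(n_rand, draw_assignment(pair_id))
dim(W_matrix)
```

Each column of `W_matrix` can then be used in place of the observed assignment to recompute a test statistic under the same matched-pair design.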
Note 1: the matching functions in Stephane_matching.R were written in Rcpp by Stéphane Shao.
Note 2: other matching strategies are valid. Researchers should take the conceptual hypothetical experiment into account when choosing their strategy.
Note 3: to make the matching easier, we reformatted and recoded the original KORA variables. See the misc/format_KORA_variables.R file.
The ASV (or OTU) data table, matched dataset, and phylogenetic tree are combined in a phyloseq object before the statistical analyses are run. Thus, the following code can be used for any other data combined in a phyloseq object.
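For readers who want to bring their own data, a phyloseq object of this kind can be assembled roughly as follows. This is a minimal sketch on simulated toy data, assuming the phyloseq and ape packages are installed; the object names (`counts`, `meta`, `tree`, `ps`) and the columns `pair_id` and `W` are hypothetical placeholders, not the variables used in this repository.

```r
library(phyloseq)
library(ape)

set.seed(1)
taxa_ids   <- paste0("ASV", 1:5)
sample_ids <- paste0("S", 1:6)

# Toy count table with taxa in rows and samples in columns.
counts <- matrix(rpois(30, lambda = 10), nrow = 5,
                 dimnames = list(taxa_ids, sample_ids))

# Toy matched dataset: pair membership and intervention indicator per sample.
meta <- data.frame(pair_id = rep(1:3, each = 2),
                   W       = rep(c(1, 0), times = 3),
                   row.names = sample_ids)

# Toy random tree whose tip labels match the taxa names.
tree <- rtree(n = 5, tip.label = taxa_ids)

# Combine the three components into one phyloseq object.
ps <- phyloseq(otu_table(counts, taxa_are_rows = TRUE),
               sample_data(meta),
               phy_tree(tree))
ps
```

Sample names must agree between the count table and the sample data, and taxa names must agree between the count table and the tree tips.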
R code in the 1_alpha_diversity folder.
We used Amy Willis’ R packages breakaway for richness estimation [Willis and Bunge, 2015] and DivNet for Shannon index estimation [Willis and Martin, 2020].
R code in the 2_beta_diversity folder.
The distance calculations were done with the phyloseq package, and we used Anna Plantinga’s R package MiRKAT for the test statistic calculations [Zhao et al., 2015].
R code in the 3_mean_diff_test folder.
Cao, Lin, and Li’s GitHub repository: composition-two-sampe-test [Cao, Lin, and Li, 2018].
R code in the 4_differential_abundance folder.
We use the function dacomp.test() from Barak Brill’s R package dacomp to calculate the test statistic for all taxa at once [Brill, Amir, and Heller, 2020].
R code in the 5_networks folder.
Peschel et al.’s (2020) R package NetCoMi enables the estimation and comparison of networks for compositional data.
[Holle et al., 2005] Holle R, Happich M, Löwel H, Wichmann HE, MONICA/KORA Study Group (2005); KORA - a research platform for population based health research. Gesundheitswesen, 67.
[Willis and Bunge, 2015] Willis A and Bunge J (2015); Estimating diversity via frequency ratios. Biometrics, 71:1042-1049.
[Willis and Martin, 2020] Willis AD and Martin BD (2020); Estimating diversity in networked ecological communities. Biostatistics, kxaa015.
[Zhao et al., 2015] Zhao N, Chen J, Carroll IM et al. (2015); Testing in Microbiome-Profiling Studies with MiRKAT, the Microbiome Regression-Based Kernel Association Test. Am J Hum Genet., 96(5):797-807.
[Cao, Lin, and Li, 2018] Cao Y, Lin W, and Li H (2018); Two-sample tests of high-dimensional means for compositional data. Biometrika, 105:115-132.
[Brill, Amir, and Heller, 2020] Brill B, Amir A, and Heller R (2020); Testing for differential abundance in compositional counts data, with application to microbiome studies. arXiv.
[Peschel et al., 2020] Peschel et al. (2020); NetCoMi: network construction and comparison for microbiome data in R. Briefings in Bioinformatics, bbaa290.