Mediation Chapter #667

Open
wants to merge 12 commits into
base: devel
6 changes: 4 additions & 2 deletions DESCRIPTION
@@ -1,7 +1,7 @@
Package: OMA
Title: Orchestrating Microbiome Analysis with Bioconductor
-Version: 0.98.35
-Date: 2025-02-05
+Version: 0.98.36
+Date: 2025-02-10
Authors@R:
c(
person(given = "Tuomas", family = "Borman", role = c("aut", "cre"), email = "[email protected]", comment = c(ORCID = "0000-0002-8563-8884")),
@@ -49,6 +49,7 @@ Suggests:
factoextra,
fido,
forcats,
ggdag,
ggplot2,
ggpubr,
ggtree,
@@ -62,6 +63,7 @@ Suggests:
IntegratedLearner,
knitr,
maaslin3,
mediation,
mia,
miaTime,
miaViz,
1 change: 1 addition & 0 deletions inst/assets/_book.yml
@@ -50,6 +50,7 @@ book:
chapters:
- pages/differential_abundance.qmd
- pages/correlation.qmd
- pages/mediation.qmd
- part: "Networks"
chapters:
- pages/network_learning.qmd
52 changes: 52 additions & 0 deletions inst/assets/bibliography.bib
@@ -2384,6 +2384,58 @@ @article{mangiola2023
eprint = {https://www.pnas.org/doi/pdf/10.1073/pnas.2203828120},
}

@Article{Tingley2014mediation,
title = {{mediation}: {R} Package for Causal Mediation Analysis},
author = {Dustin Tingley and Teppei Yamamoto and Kentaro Hirose and Luke Keele and Kosuke Imai},
journal = {Journal of Statistical Software},
year = {2014},
volume = {59},
number = {5},
pages = {1--38},
url = {http://www.jstatsoft.org/v59/i05/},
}

@incollection{Xia2021mediation,
title={Mediation Analysis of Microbiome Data and Detection of Causality in Microbiome Studies},
author={Xia, Yinglin},
booktitle={Inflammation, Infection, and Microbiome in Cancers: Evidence, Mechanisms, and Implications},
pages={457--509},
year={2021},
publisher={Springer}
}

@article{Dinan2022antibiotics,
title={Antibiotics and mental health: The good, the bad and the ugly},
author={Dinan, Katherine and Dinan, Timothy},
journal={Journal of Internal Medicine},
volume={292},
number={6},
pages={858--869},
year={2022},
publisher={Wiley Online Library}
}

@article{Logan2014nutritional,
title={Nutritional psychiatry research: an emerging discipline and its intersection with global urbanization, environmental challenges and the evolutionary mismatch},
author={Logan, Alan C and Jacka, Felice N},
journal={Journal of Physiological Anthropology},
volume={33},
pages={1--16},
year={2014},
publisher={Springer}
}

@article{Fairchild2017best,
title={Best (but oft-forgotten) practices: mediation analysis},
author={Fairchild, Amanda J and McDaniel, Heather L},
journal={The American Journal of Clinical Nutrition},
volume={105},
number={6},
pages={1259--1271},
year={2017},
publisher={Elsevier}
}

@article{Marchesi2015,
title = {The vocabulary of microbiome research: a proposal},
volume = {3},
295 changes: 295 additions & 0 deletions inst/pages/mediation.qmd
@@ -0,0 +1,295 @@
# Mediation {#sec-mediation}

```{r setup, echo=FALSE, results="asis"}
library(rebook)
chapterPreamble()
```

Mediation analysis is used to study the effect of an exposure variable (X) on
an outcome (Y) through a third factor, known as the mediator (M). Mathematically,
this relationship can be described as follows:

$$
Y \thicksim X * M
$$

The contribution of a mediator is typically quantified in terms of the Average
Causal Mediation Effect (ACME), that is, the portion of the association between X
and Y that is explained by M. In practice, this corresponds to the difference
between the Total Effect (TE) and the Average Direct Effect (ADE):

$$
ACME = TE - ADE
$$
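
For intuition, the decomposition can be written out for the simplest case. The following is only a sketch: it assumes linear models without an interaction between exposure and mediator, and the coefficient names $a$, $b$, $c$ and $c'$ are notation introduced here for illustration.

$$
\begin{aligned}
M &= \alpha_0 + a X + \varepsilon_M \\
Y &= \beta_0 + c' X + b M + \varepsilon_Y \\
Y &= \gamma_0 + c X + \varepsilon
\end{aligned}
$$

Under these assumptions, $TE = c$, $ADE = c'$ and $ACME = a b$. Since $c = c' + a b$ for such models, this is the same statement as $ACME = TE - ADE$.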

The microbiome can mediate the effects of multiple environmental stimuli on
human health. However, the importance of its role as a mediator depends on the
nature of the stimulus. For example, the effect of dietary fiber intake on host
behaviour is largely mediated by the gut microbiome [@Logan2014nutritional]. In
contrast, the indirect impact of antibiotic use on mental health through an
altered microbiome represents a more subtle process [@Dinan2022antibiotics].

In general, the wide range of mediation effects can be divided into two classes:

- partial mediation, where the mediator assists the exposure variable in
conveying only part of the effect on the outcome
- complete mediation, where the mediator conveys the full effect of the
exposure variable on the outcome
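
The two classes can be illustrated on simulated data. The sketch below uses base R only; the effect sizes, sample size and the helper name `decompose` are made up for illustration and are not part of any package used in this chapter.

```r
# Simulated illustration of partial vs. complete mediation (all values made up)
set.seed(1)
n <- 2000
x <- rbinom(n, 1, 0.5)    # binary exposure
m <- 0.8 * x + rnorm(n)   # mediator depends on the exposure

# Partial mediation: x affects y both directly and through m
y_partial <- 0.5 * x + 0.6 * m + rnorm(n)
# Complete mediation: x affects y only through m
y_complete <- 0.6 * m + rnorm(n)

# Hypothetical helper: decompose the total effect with two linear models
decompose <- function(y, x, m) {
    te  <- coef(lm(y ~ x))[["x"]]       # total effect of x on y
    ade <- coef(lm(y ~ x + m))[["x"]]   # direct effect, adjusting for m
    c(TE = te, ADE = ade, ACME = te - ade)
}

res_partial  <- decompose(y_partial, x, m)
res_complete <- decompose(y_complete, x, m)
round(rbind(partial = res_partial, complete = res_complete), 2)
```

In the complete scenario the estimated ADE should be close to zero, whereas in the partial scenario both ADE and ACME should be clearly different from zero.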

```{r}
#| label: fig-mediation
#| fig-cap: Directed acyclic graphs for two possible relationships involving
#|   mediation. A. The effect of x on y is mostly direct, and only a portion
#|   thereof is mediated by m. B. The effect of x on y is completely mediated by m.
#| message: false
#| echo: false

# Import libraries
library(ggdag)
library(ggplot2)
library(patchwork)

# Plot triangle for partial mediation
p1 <- ggdag_mediation_triangle(x_y_associated = TRUE, stylized = TRUE) +
    ggtitle("A: Partial mediation") +
    theme_dag()

# Plot triangle for complete mediation
p2 <- ggdag_mediation_triangle(stylized = TRUE) +
    ggtitle("B: Complete mediation") +
    theme_dag()

# Combine plots
p1 | p2
```

Mediation analysis is based on the assumption that the exposure variable, the
mediator and the outcome follow one another in this temporal sequence. Therefore,
investigating mediation is a suitable analytical choice in longitudinal studies,
whereas it is discouraged in cross-sectional ones [@Fairchild2017best].

We demonstrate a standard mediation analysis with the `hitchip1006` dataset
from the `miaTime` package, which contains a genus-level assay for 1006 Western
adults of 6 different nationalities.

```{r}
#| label: mediation1
#| message: false

# Import libraries
library(mia)
library(miaViz)
library(scater)
library(patchwork)
library(knitr)

# Load dataset
data(hitchip1006, package = "miaTime")
tse <- hitchip1006
```

In our analyses, nationality and BMI group will represent the exposure (X)
and outcome (Y) variables, respectively. We make the broad assumption that
nationality reflects differences in the living environment between subjects.

```{r}
#| label: mediation2

# Convert BMI variable to numeric
tse$bmi_group <- as.numeric(tse$bmi_group)

# Agglomerate features by phylum
tse <- agglomerateByRank(tse, rank = "Phylum")

# Apply clr transformation to counts assay
tse <- transformAssay(
    tse,
    method = "clr",
    pseudocount = 1
)
```
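
What the two preprocessing steps above do can be sketched in base R on toy data. This is an illustration under assumptions: the BMI levels and counts are made up, `bmi_group` is assumed to be stored as a factor, and the hypothetical `clr` helper follows the usual definition of the centred log-ratio (log counts minus their per-sample mean), not necessarily the exact internals of `transformAssay`.

```r
# Toy outcome: a factor with hypothetical BMI categories in increasing order
bmi <- factor(
    c("lean", "obese", "overweight", "lean"),
    levels = c("lean", "overweight", "obese")
)
# as.numeric() returns the integer codes, which follow the level order
as.numeric(bmi)

# Toy count matrix: 3 features (rows) x 2 samples (columns)
counts <- matrix(
    c(10, 0, 90, 5, 5, 40), nrow = 3,
    dimnames = list(c("f1", "f2", "f3"), c("s1", "s2"))
)

# Hypothetical helper: clr with a pseudocount, centred per sample (column)
clr <- function(mat, pseudocount = 1) {
    logged <- log(mat + pseudocount)
    sweep(logged, 2, colMeans(logged))
}
clr_mat <- clr(counts)

# Each sample sums to zero after the transformation, up to rounding error
colSums(clr_mat)
```

Note that the numeric codes only make sense as an outcome if the factor levels are ordered meaningfully, and that the pseudocount is what allows zero counts to be log-transformed.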

In the following examples, the effect of living environment on BMI mediated
by the microbiome is investigated in three different steps:

1. global contribution by alpha diversity
2. individual contributions by assay features
3. joint contributions by reduced dimensions

## Alpha diversity as mediator

First, we ask whether alpha diversity mediates the effect of living environment
on BMI. Using the `getMediation` function, the variables X, Y and M are specified
with the arguments `treatment`, `outcome` and `mediator`, respectively. We
control for sex and age and limit comparisons to two nationality groups, Central
Europeans (control) vs. Scandinavians (treatment).

```{r}
#| label: mediation3
#| message: false

# Analyse mediated effect of nationality on BMI via alpha diversity
# 100 bootstrap simulations are used to speed up execution, but ~1000 are
# recommended
med_df <- getMediation(
    tse,
    treatment = "nationality",
    outcome = "bmi_group",
    mediator = "diversity",
    covariates = c("sex", "age"),
    treat.value = "Scandinavia",
    control.value = "CentralEurope",
    boot = TRUE, sims = 100
)

# Plot results as a forest plot
plotMediation(med_df, layout = "forest")
```

The forest plot above shows significance for both ACME and ADE, which suggests
that alpha diversity partially mediates the effect of living environment on BMI.
In contrast, if ACME but not ADE were significant, complete mediation would be
inferred. The negative sign of the effect means that a lower BMI and alpha
diversity are associated with the treatment group (Scandinavians).
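
How bootstrapped estimates lead to this kind of reading can be sketched in base R. The example below is a simplified illustration on simulated data, not the procedure implemented in the `mediation` package: it assumes linear models, uses percentile intervals as one common choice, and the helpers `estimate_effects` and `classify_mediation` are hypothetical names.

```r
# Simulated data with a true partial mediation (all values are made up)
set.seed(42)
n <- 1000
dat <- data.frame(x = rbinom(n, 1, 0.5))
dat$m <- 0.8 * dat$x + rnorm(n)
dat$y <- 0.5 * dat$x + 0.6 * dat$m + rnorm(n)

# Estimate ACME and ADE from a mediator model and an outcome model
estimate_effects <- function(d) {
    a   <- coef(lm(m ~ x, data = d))[["x"]]
    fit <- coef(lm(y ~ x + m, data = d))
    c(ACME = a * fit[["m"]], ADE = fit[["x"]])
}

# Nonparametric bootstrap: resample rows with replacement and re-estimate
sims <- 200
boot_est <- t(replicate(sims, {
    idx <- sample(nrow(dat), replace = TRUE)
    estimate_effects(dat[idx, ])
}))

# Percentile 95% confidence intervals for both effects
ci <- apply(boot_est, 2, quantile, probs = c(0.025, 0.975))

# An effect is called significant here if its interval excludes zero
significant <- ci[1, ] > 0 | ci[2, ] < 0

# Hypothetical helper that turns the two calls into a verbal reading
classify_mediation <- function(acme_sig, ade_sig) {
    if (!acme_sig) return("no evidence of mediation")
    if (ade_sig) "partial mediation" else "complete mediation"
}
classify_mediation(significant[["ACME"]], significant[["ADE"]])
```

The "no evidence of mediation" branch is an addition for completeness; the chapter itself only distinguishes partial from complete mediation.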

## Assay features as mediators

If we suspect that only certain features of the microbiome act as mediators,
we can estimate their individual contributions by fitting one model for each
feature in a selected assay. As multiple tests are performed, it is good
practice to adjust the p-values for multiple testing with a method of choice,
such as the false discovery rate (FDR) used below.
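
The effect of such a correction can be seen with base R's `p.adjust` on a handful of made-up p-values. This is a sketch that assumes the adjustment behaves like `p.adjust`; the p-values are hypothetical and do not come from the dataset.

```r
# Hypothetical raw p-values, one per tested feature
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.2, 0.5, 0.7, 0.9)

# Benjamini-Hochberg adjustment ("fdr" is an alias of "BH" in p.adjust)
p_adj <- p.adjust(p_raw, method = "fdr")

# Compare raw and adjusted values side by side
data.frame(p_raw, p_adj = round(p_adj, 3))

# Number of features below a 0.05 threshold before and after adjustment
sum(p_raw < 0.05)
sum(p_adj < 0.05)
```

Adjusted p-values are never smaller than the raw ones, so fewer features pass the same threshold after correction.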

```{r}
#| label: mediation4
#| message: false

# Analyse mediated effect of nationality on BMI via clr-transformed features
# 100 bootstrap simulations are used to speed up execution, but ~1000 are
# recommended
tse <- addMediation(
    tse, name = "assay_mediation",
    treatment = "nationality",
    outcome = "bmi_group",
    assay.type = "clr",
    covariates = c("sex", "age"),
    treat.value = "Scandinavia",
    control.value = "CentralEurope",
    boot = TRUE, sims = 100,
    p.adj.method = "fdr"
)

# View results
kable(metadata(tse)$assay_mediation)
```

For convenience, results can be visualized with a heatmap, where rows represent
features and columns correspond to the coefficients for TE, ADE and ACME.
Significant findings can be marked with p-values or stars.

```{r}
#| label: mediation5

# Plot results as a heatmap
plotMediation(
    tse, "assay_mediation",
    layout = "heatmap",
    add.significance = "symbol"
)
```

Results suggest that only four out of eight features (Bacteroidetes, Firmicutes,
Proteobacteria and Verrucomicrobia) partially mediate the effect of living
environment on BMI. As the sign is negative, a smaller abundance of these
mediators is found in Scandinavians compared to Central Europeans, which matches
the negative trend of alpha diversity.

While analyses were conducted at the phylum level to simplify results, using
the original assay without agglomeration is also a valid option. However,
the increase in taxonomic resolution also implies a larger number of tests and
thus a higher probability of spurious findings, which in turn necessitates a
stronger correction for multiple comparisons. A solution to this issue is
proposed in the following section.

## Reduced dimensions as mediators

Performing mediation analysis for each feature provides insight into individual
contributions. However, this approach greatly increases the number of tests to
correct for and thus reduces statistical power. To overcome this issue, it is
possible to assess the joint contributions of groups of features by means of
dimensionality reduction.
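
Why a principal component represents a joint contribution can be shown with base R's `prcomp` on random toy data. This is a sketch, not the `runPCA` implementation: the matrix is made up, and unlike a `TreeSummarizedExperiment` assay, `prcomp` expects samples in rows and features in columns.

```r
# Toy clr-like matrix: 50 samples (rows) x 4 features (columns), random values
set.seed(7)
X <- matrix(
    rnorm(200), nrow = 50,
    dimnames = list(NULL, paste0("feature", 1:4))
)

# PCA on centred, unscaled data, keeping three components
pca <- prcomp(X, center = TRUE, scale. = FALSE, rank. = 3)

# Loadings: the weight of each feature in each component
pca$rotation

# Scores are the centred data multiplied by the loadings, so every
# component value is a weighted sum over all features
Xc <- scale(X, center = TRUE, scale = FALSE)
manual_scores <- Xc %*% pca$rotation
all.equal(unname(manual_scores), unname(pca$x))
```

Using a component as mediator therefore tests a single weighted combination of features instead of one test per feature.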

```{r}
#| label: mediation6
#| message: false

# Reduce dimensions with PCA
tse <- runPCA(
    tse, name = "PCA",
    assay.type = "clr",
    ncomponents = 3
)

# Analyse mediated effect of nationality on BMI via principal components
# 100 bootstrap simulations are used to speed up execution, but ~1000 are
# recommended
tse <- addMediation(
    tse, name = "reddim_mediation",
    treatment = "nationality",
    outcome = "bmi_group",
    dimred = "PCA",
    covariates = c("sex", "age"),
    treat.value = "Scandinavia",
    control.value = "CentralEurope",
    boot = TRUE, sims = 100,
    p.adj.method = "fdr"
)

# View results
kable(metadata(tse)$reddim_mediation)
```

Results can be displayed as one forest plot for each reduced dimension. Combined
with a heatmap of the feature loadings by dimension, these plots help deduce
whether certain groups of features act as mediators.

```{r}
#| label: mediation7

# Plot results as multiple forest plots
p1 <- plotMediation(
    tse, "reddim_mediation",
    layout = "forest"
)

# Plot loadings by principal component
p2 <- plotLoadings(
    tse, "PCA",
    ncomponents = 3, n = 8,
    layout = "heatmap"
)

# Combine plots
p1 / p2
```

The plot above suggests that only PC1 partially mediates the effect of living
environment on BMI. Within this dimension, Bacteroidetes and Actinobacteria are
the largest contributors in opposite directions. Interestingly, in the previous
section the former but not the latter appeared significant individually, which
suggests that mediation might emerge from their joint contribution.

## Final remarks

This chapter introduced the concept of mediation and demonstrated a standard
analysis of the microbiome as a mediator at three different levels (global,
individual and joint contributions). Importantly, the provided method is
based on the `mediation` package and is limited to univariate comparisons and
binary conditions for the exposure variable [@Tingley2014mediation]. Therefore,
it is recommended to reduce the number of mediators under study by means of a
knowledge-based strategy to preserve statistical power.

A few methods for multivariate mediation analysis of high-dimensional omic
data also exist [@Xia2021mediation]. However, no single solution has yet emerged
as the gold standard in microbiome data analysis, mainly because the
available approaches can only partially accommodate the specific properties
of microbiome data, such as compositionality, sparsity and hierarchical
structure. While this chapter proposed a standard approach to mediation
analysis, in the future fine-tuned solutions for the microbiome may also become
common.