cite fig 1 in section 2, add ribbons to legends (#66)
eahowerton committed May 8, 2024
1 parent 5614f51 commit efe8917
Showing 1 changed file with 11 additions and 7 deletions.
18 changes: 11 additions & 7 deletions analysis/paper/hubEnsembles_manuscript.qmd
@@ -102,15 +102,17 @@ For probabilistic predictions, there are two commonly used classes of methods to

The quantile average combines a set of quantile functions, $\mathcal{Q} = \{F_i^{-1}(\theta)| i \in 1,...,N \}$, with a given set of weights, $\pmb{w}$, as $$
F^{-1}_Q(\theta) = C_Q(\mathcal{Q}, \pmb{w}) = \sum_{i = 1}^Nw_iF^{-1}_i(\theta).
-$$This computes the average value of predictions across different models for each fixed quantile level $\theta$. It is also possible to use other combination functions, such as a weighted median, to combine quantile predictions.
+$$
+This computes the average value of predictions across different models for each fixed quantile level $\theta$. It is also possible to use other combination functions, such as a weighted median, to combine quantile predictions.
The probability average or linear pool is calculated by averaging probabilities across predictions for a fixed value of the target variable, $x$. In other words, for a set $\mathcal{F} = \{F_i(x)| i \in 1,...,N \}$ containing the values of CDFs at the point $x$ and weights $\pmb{w}$, the linear pool is calculated as
$$
F_{LOP}(x) = C_{LOP}(\mathcal{F}, \pmb{w}) = \sum_{i = 1}^Nw_iF_i(x).
$$
-For a set of PMF values, $\{f_i(x)|i \in 1, ..., N\}$, the linear pool can be equivalently calculated: $f_{LOP}(x) = \sum_{i = 1}^N w_i f_i(x)$.
+For a set of PMF values, $\{f_i(x)|i \in 1, ..., N\}$, the linear pool can be equivalently calculated: $f_{LOP}(x) = \sum_{i = 1}^N w_i f_i(x)$. For a visual depiction of these equations, see @fig-example-quantile-average-and-linear-pool below.
The different averaging methods for probabilistic predictions yield different properties of the resulting ensemble distribution. For example, the variance of the linear pool is $\sigma^2_{LOP} = \sum_{i=1}^Nw_i\sigma_i^2 + \sum_{i=1}^Nw_i(\mu_i-\mu_{LOP})^2$, where $\mu_i$ is the mean and $\sigma^2_i$ is the variance of individual prediction $i$, and although there is no closed-form variance for the quantile average, the variance of the quantile average will always be less than or equal to that of the linear pool [@lichtendahl2013]. Both methods generate distributions with the same mean, $\mu_Q = \mu_{LOP} = \sum_{i=1}^Nw_i\mu_i$, which is the mean of individual model means [@lichtendahl2013]. The linear pool method preserves variation between individual models, whereas the quantile average cancels away this variation under the assumption it constitutes sampling error [@howerton2023].
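The context lines of this hunk state that the quantile average and the linear pool share a mean while the linear pool has the larger variance. A minimal base R sketch of that claim follows, assuming two hypothetical normal predictions, N(0, 1) and N(4, 1), with equal weights. The names `quantile_average()` and `linear_pool_cdf()` are illustrative only and are not the [hubEnsembles]{.pkg} implementation.

```r
# Hypothetical component predictions (assumed for illustration):
# N(0, 1) and N(4, 1), combined with equal weights.
mu <- c(0, 4)
sigma <- c(1, 1)
w <- c(0.5, 0.5)

# Quantile average: weighted mean of the component quantile functions
# at each fixed quantile level theta.
quantile_average <- function(theta) {
  sapply(theta, function(p) sum(w * qnorm(p, mean = mu, sd = sigma)))
}

# Linear pool: weighted mean of the component CDFs at each fixed value x.
linear_pool_cdf <- function(x) {
  sapply(x, function(v) sum(w * pnorm(v, mean = mu, sd = sigma)))
}

# Linear pool moments from the closed-form expressions quoted above.
mu_lop <- sum(w * mu)
var_lop <- sum(w * sigma^2) + sum(w * (mu - mu_lop)^2)

# Quantile average moments, approximated on a midpoint grid of quantile
# levels (the mean of F^{-1}(theta) over theta in (0, 1) is the mean of X).
n <- 10000
theta <- (seq_len(n) - 0.5) / n
q_vals <- quantile_average(theta)
mu_q <- mean(q_vals)
var_q <- mean((q_vals - mu_q)^2)

# Compare the two ensembles side by side.
print(c(mu_q = mu_q, mu_lop = mu_lop, var_q = var_q, var_lop = var_lop))
```

In this setup the quantile average is itself the N(2, 1) distribution, so its variance stays near 1, whereas the linear pool keeps the between-model spread and has variance 5. Both ensembles are centered at 2.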
@@ -185,7 +187,7 @@ The third group of columns in model output specify the model predictions and det
This representation of predictive model output is codified by the `model_out_tbl` S3 class in the [hubUtils]{.pkg} package, one of the foundational hubverse packages. Although this S3 class is required for all [hubEnsembles]{.pkg} functions, model predictions in other formats can easily be transformed using the `as_model_out_tbl()` function from [hubUtils]{.pkg}. An example of this transformation is provided in @sec-case-study.
| `output_type` | `output_type_id` | `value` |
-|:----------------|:----------------------|:-------------------------------|
+|:-----------------|:----------------------|:------------------------------|
| `mean` | NA (not used for mean predictions) | Numeric: The mean of the predictive distribution |
| `median` | NA (not used for median predictions) | Numeric: The median of the predictive distribution |
| `quantile` | Numeric between 0.0 and 1.0: A quantile level | Numeric: The quantile of the predictive distribution at the quantile level specified by the `output_type_id` |
@@ -200,7 +202,7 @@ This representation of predictive model output is codified by the `model_out_tbl
The [hubEnsembles]{.pkg} package includes two functions that perform ensemble calculations: `simple_ensemble()`, which applies some function to each model prediction, and `linear_pool()`, which computes an ensemble using the linear opinion pool method. In the following sections, we outline the implementation details for each function and how these implementations correspond to the statistical ensembling methods described in @sec-defs. A short description of the calculation performed by each function is summarized by output type in @tbl-fns-by-output-type.
| `output_type` | `simple_ensemble(..., agg_fun="mean")` | `linear_pool()` |
-|----------------|----------------------------|----------------------------|
+|------------------|---------------------------|---------------------------|
| `mean` | mean of individual model means | mean of individual model means |
| `median` | mean of individual model medians | NA |
| `quantile` | mean of individual model target variable values at each quantile level, $F^{-1}_Q(\theta)$ | quantile of the distribution obtained by computing the mean of estimated individual model cumulative probabilities at each target variable value, $F^{-1}_{LOP}(\theta)$ |
@@ -583,7 +585,7 @@ We can plot these forecasts and the target data using the `plot_step_ahead_model
#| fig-cap: "One example quantile forecast of weekly incident influenza
#| hospitalizations in Massachusetts from each of three models (panels).
#| Forecasts are represented by a median (line), 50% and 90% prediction
-#| intervals. Gray points represent observed incident hospitalizations."
+#| intervals (ribbons). Gray points represent observed incident hospitalizations."
#| fig-width: 8
#| fig-height: 4
model_outputs_plot <- hubExamples::forecast_outputs |>
@@ -758,7 +760,8 @@ As expected, the mean, median, and geometric mean each give us slightly differen
#| hospitalizations in Massachusetts. Each ensemble combines individual
#| predictions from the example hub (@fig-plot-ex-mods) using a different
#| method: arithmetic mean, geometric mean, or median. All methods correspond to
-#| variations of the quantile average approach."
+#| variations of the quantile average approach. Ensembles are represented by a median
+#| (line), 50% and 90% prediction intervals (ribbons)."
#| fig-height: 4
#| fig-width: 8
@@ -835,7 +838,8 @@ In @fig-plot-ex-quantile-and-linear-pool, we compare ensemble results generated
#| predictions of weekly incident influenza hospitalizations in Massachusetts,
#| which provide an example of quantile output type. Note, for quantile output
#| type, `simple_ensemble` corresponds to a quantile average. Ensembles combine
-#| individual models from the example hub (@fig-plot-ex-mods)."
+#| individual models from the example hub, and are represented by a median
+#| (line), 50% and 90% prediction intervals (ribbons) (@fig-plot-ex-mods)."
#| fig-width: 10
#| fig-height: 4