Presentation of results from cohort analyses
Cohort studies are important scientific sources of health and wellness information, so how their results are presented needs careful consideration. The medium of presentation, whether plots or tables, affects how readers perceive and use the findings. This chapter covers practical ways of presenting cohort findings.

Presenting cohort findings is tricky, be careful

Adding more details to each model item in a list

Imagine you've run several models. You:

  • Scaled predictors to compare estimates.
  • Set confounders and other predictors as baseline age, sex, smoking, and education.
  • Fitted unadjusted and adjusted models for each predictor (time and subject ID were included in all).
  • Tidied models and exponentiated estimates.

You have 8 models in total, stored as a list. To tell them apart, add more details to each model. Use map() from purrr to apply the same wrangling to every model in the list. The new code here, term[2], selects the main predictor, which is the second element in the term column.
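As a minimal sketch of this pattern, the toy example below applies mutate() to every data frame in a list with map(). The object toy_models_list, its term names, and its estimates are made up for illustration and are not the course's models. The sketch also assumes the first row of each tidied model is the intercept, so that term[2] is the main predictor.

```r
library(dplyr)
library(purrr)

# Hypothetical tidied models with made-up numbers: an intercept row,
# then the main predictor, then a confounder
toy_models_list <- list(
    tibble(
        term = c("(Intercept)", "body_mass_index_scaled", "sex"),
        estimate = c(0.05, 1.30, 0.80)
    ),
    tibble(
        term = c("(Intercept)", "total_cholesterol_scaled", "sex"),
        estimate = c(0.04, 1.12, 0.79)
    )
)

# map() calls the formula once per list item; .x is the current item
toy_models_list <- map(
    toy_models_list,
    ~ mutate(.x, predictor = term[2], model = "Unadjusted")
)

# Every row of a model now carries that model's main predictor
toy_models_list[[1]]
```

Because term[2] is a single value, mutate() recycles it down the whole predictor column, which is what later makes it possible to filter on predictor == term.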



  • Using map(), add a model column to indicate the models are "Unadjusted".


  • Add the adjustment column using model = "Unadjusted".


# Add predictor and model type to each list item
unadjusted_models_list <- ___(
    unadjusted_models_list,
    # .x is purrr for "model goes here"
    ~ mutate(
        .x,
        # This selects predictor, not confounder
        predictor = term[2],
        # Indicate model "adjustment"
        model = ___
    )
)


# Add predictor and model type to each list item
unadjusted_models_list <- map(
    unadjusted_models_list,
    # .x is purrr for "model goes here"
    ~ mutate(
        .x,
        # This selects predictor, not confounder
        predictor = term[2],
        # Indicate model "adjustment"
        model = "Unadjusted"
    )
)



  • Do the same thing for the "Adjusted" model list.


  • Refer to each list object in map with .x.
  • Include the ~ before mutate.


# Add predictor and model type to each list item
adjusted_models_list <- map(
    adjusted_models_list,
    # .x is purrr for "model goes here"
    ~ mutate(
        .x,
        # This selects predictor, not confounder
        predictor = term[2],
        # Indicate model "adjustment"
        model = ___
    )
)


# Add predictor and model type to each list item
adjusted_models_list <- map(
    adjusted_models_list,
    # .x is purrr for "model goes here"
    ~ mutate(
        .x,
        # This selects predictor, not confounder
        predictor = term[2],
        # Indicate model "adjustment"
        model = "Adjusted"
    )
)


Combining the list of models into one data frame

The most efficient approach to later plotting and creating tables is to have all models in a single data frame. You've already prepared the models a bit. Now it's time to combine them so you can continue working with them.


  • Use bind_rows() to combine unadjusted_models_list and adjusted_models_list.
  • Continuing the pipe, add an outcome column with the value "got_cvd".
  • Finally, use filter() to keep only rows where predictor equals term and effect equals "fixed".


  • Filter when predictor and term are the same (predictor == term).


unadjusted_models_list <- map(
    unadjusted_models_list,
    ~ mutate(.x, predictor = term[2], model = "Unadjusted")
)
adjusted_models_list <- map(
    adjusted_models_list,
    ~ mutate(.x, predictor = term[2], model = "Adjusted")
)


all_models <- bind_rows(
    # Combine the two lists of models
    ___,
    ___
) %>% 
    mutate(outcome = ___) %>% 
    # Keep only predictor rows and fixed effects
    filter(___ == ___, ___ == ___)



all_models <- bind_rows(
    # Combine the two lists of models
    unadjusted_models_list,
    adjusted_models_list
) %>% 
    mutate(outcome = "got_cvd") %>% 
    # Keep only predictor rows and fixed effects
    filter(predictor == term, effect == "fixed")
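
The combine-then-filter steps above can be sketched on toy data. The two lists below, their term names, and their numbers are hypothetical stand-ins for tidied mixed-model output. The "ran_pars" value used for the non-fixed rows is an assumption about what the effect column contains.

```r
library(dplyr)

# Hypothetical tidied models (made-up values), already labelled with
# predictor and model as in the previous exercise
toy_unadjusted <- list(
    tibble(
        effect = c("fixed", "fixed", "ran_pars"),
        term = c("(Intercept)", "body_mass_index_scaled", "sd__(Intercept)"),
        estimate = c(0.05, 1.30, 2.10),
        predictor = "body_mass_index_scaled",
        model = "Unadjusted"
    )
)
toy_adjusted <- list(
    tibble(
        effect = c("fixed", "fixed", "fixed", "ran_pars"),
        term = c("(Intercept)", "body_mass_index_scaled", "sex", "sd__(Intercept)"),
        estimate = c(0.04, 1.22, 0.80, 2.00),
        predictor = "body_mass_index_scaled",
        model = "Adjusted"
    )
)

# bind_rows() accepts lists of data frames and stacks every item
toy_all_models <- bind_rows(toy_unadjusted, toy_adjusted) %>%
    mutate(outcome = "got_cvd") %>%
    # Drops the intercept, confounder, and random-effect rows
    filter(predictor == term, effect == "fixed")

toy_all_models
```

Only one row per model survives the filter: the row whose term is the main predictor. That is the row that gets plotted and tabulated later.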



Communicating cohort findings through graphs

Plotting model estimate and uncertainty

Statistical analyses of cohort data usually output some type of regression estimate along with a measure of uncertainty (e.g. a 95% confidence interval). Sometimes it makes sense to present these results in a table, but often the better approach is to create a figure instead. Figures show magnitude, direction, uncertainty, and comparison of results very effectively.
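To make this concrete, here is a minimal sketch of a dot-and-error-bar plot built from toy data. The data frame toy_results and all of its values are invented for illustration. The column names mirror the ones used in this chapter (predictor, estimate, conf.low, conf.high).

```r
library(ggplot2)

# Made-up odds ratios and 95% confidence limits for two predictors
toy_results <- data.frame(
    predictor = c("body_mass_index_scaled", "total_cholesterol_scaled"),
    estimate = c(1.30, 1.12),
    conf.low = c(1.10, 0.95),
    conf.high = c(1.54, 1.32)
)

toy_plot <- ggplot(
    toy_results,
    aes(y = predictor, x = estimate, xmin = conf.low, xmax = conf.high)
) +
    # Dot marks the estimate (magnitude and direction)
    geom_point() +
    # Horizontal bar marks the confidence interval (uncertainty)
    geom_errorbarh() +
    # Reference line at 1, the "no association" value for a ratio
    geom_vline(xintercept = 1)

toy_plot
```

An interval that crosses the reference line at 1 is compatible with no association, which a reader can see at a glance in the figure. Depending on the installed ggplot2 version, geom_errorbarh() may print a deprecation notice.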

Create a plot of the unadjusted model results that highlights the estimate and uncertainty of the estimate.



  • For this exercise, filter to keep only the rows where model equals "Unadjusted".


  • Filter should be model == "Unadjusted".


# Keep only unadjusted models
unadjusted_results <- all_models %>% 
    filter(___ == ___)

# Check filtered data
unadjusted_results


# Keep only unadjusted models
unadjusted_results <- all_models %>% 
    filter(model == "Unadjusted")

# Check filtered data
unadjusted_results



  • Set y as predictor, x as estimate, xmin as conf.low, and xmax as conf.high.
  • Add geom_point() and geom_errorbarh() layers.


  • geom_errorbarh() draws horizontal error bars and requires xmin = conf.low and xmax = conf.high.


unadjusted_results <- all_models %>% 
    filter(model == "Unadjusted")

# Create a dot and error bar plot
model_plot <- unadjusted_results %>% 
    ggplot(aes(y = ___, x = ___,
               xmin = ___, xmax = ___)) +
    ___() +
    ___()

# Check the plot
model_plot


unadjusted_results <- all_models %>% 
    filter(model == "Unadjusted")

# Create a dot and error bar plot
model_plot <- unadjusted_results %>% 
    ggplot(aes(y = predictor, x = estimate, 
               xmin = conf.low, xmax = conf.high)) +
    geom_point() +
    geom_errorbarh()

# Check the plot
model_plot



  • Use geom_vline() to add a vertical line, setting xintercept as 1 for the "center line".


  • Set xintercept = 1 for the center line.


unadjusted_results <- all_models %>% 
    filter(model == "Unadjusted")

# Create a dot and error bar plot
model_plot <- unadjusted_results %>% 
    ggplot(aes(y = predictor, x = estimate, 
               xmin = conf.low, xmax = conf.high)) +
    geom_point() +
    geom_errorbarh() +
    # Add vertical line
    ___(xintercept = ___)

# Check the plot
model_plot


unadjusted_results <- all_models %>% 
    filter(model == "Unadjusted")

# Create a dot and error bar plot
model_plot <- unadjusted_results %>% 
    ggplot(aes(y = predictor, x = estimate, 
               xmin = conf.low, xmax = conf.high)) +
    geom_point() +
    geom_errorbarh() +
    # Add vertical line
    geom_vline(xintercept = 1)

# Check the plot
model_plot


Create a more polished plot

Now that we've created this plot, let's polish it up. We want it to be "publication quality", since we'll eventually present this figure to others.

As with the previous exercise, use the unadjusted_results data frame you created to plot the findings. This time, make the plot more polished and presentable.


  • Set the point size to 3, the error bar height to 0.1, and the linetype of the vertical line to "dotted".
  • Include appropriate axis labels ("Predictors" on the y and "Odds ratio (95% CI)" on the x). Recall that CI means confidence interval.
  • Set the theme to theme_classic().


  • The labels should be of the form x = "Axis Label" (e.g. for the x-axis).


unadjusted_results <- all_models %>% 
    filter(model == "Unadjusted")


# Make the plot more polished
model_plot <- unadjusted_results %>% 
    ggplot(aes(y = predictor, x = estimate, xmin = conf.low, xmax = conf.high)) +
    geom_point(size = ___) +
    geom_errorbarh(height = ___) +
    geom_vline(xintercept = 1, linetype = ___) +
    labs(y = ___, x = ___) +
    # Set the theme
    ___()


# Make the plot more polished
model_plot <- unadjusted_results %>% 
    ggplot(aes(y = predictor, x = estimate, xmin = conf.low, xmax = conf.high)) +
    geom_point(size = 3) +
    geom_errorbarh(height = 0.1) +
    geom_vline(xintercept = 1, linetype = "dotted") +
    labs(y = "Predictors", x = "Odds ratio (95% CI)") +
    # Set the theme
    theme_classic()


Visualize unadjusted and adjusted model results

The STROBE best practices recommend showing both "crude" (unadjusted) and adjusted model results. Showing both can be informative and give insight into the research question. Create a plot of your results showing both unadjusted and adjusted models, following the same steps as in the previous exercise.


  • Use the all_models data frame this time.
  • Use facet_grid() to split the plot into rows by the model variable (wrapped in vars()).


  • The faceting variable should be called as vars(model).




# Show results of both adjusted and unadjusted
plot_all_models <- ___ %>% 
    ggplot(aes(y = predictor, x = estimate, xmin = conf.low, xmax = conf.high)) +
    geom_point(size = 3) +
    geom_errorbarh(height = 0.1) +
    geom_vline(xintercept = 1, linetype = "dotted") +
    # Facet plot by model
    ___(rows = vars(___)) +
    labs(y = "Predictors", x = "Odds ratio (95% CI)") +
    theme_classic()


# Show results of both adjusted and unadjusted
plot_all_models <- all_models %>% 
    ggplot(aes(y = predictor, x = estimate, xmin = conf.low, xmax = conf.high)) +
    geom_point(size = 3) +
    geom_errorbarh(height = 0.1) +
    geom_vline(xintercept = 1, linetype = "dotted") +
    # Split plot by model
    facet_grid(rows = vars(model)) +
    labs(y = "Predictors", x = "Odds ratio (95% CI)") +
    theme_classic()
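
The faceted layout described above can be tried on toy data without the course objects. In this sketch the data frame toy_all_results and every number in it are invented. Only the column names follow the chapter.

```r
library(ggplot2)

# Made-up estimates for two predictors under both model types
toy_all_results <- data.frame(
    model = rep(c("Unadjusted", "Adjusted"), each = 2),
    predictor = rep(c("body mass index", "total cholesterol"), times = 2),
    estimate = c(1.30, 1.12, 1.22, 1.08),
    conf.low = c(1.10, 0.95, 1.02, 0.90),
    conf.high = c(1.54, 1.32, 1.46, 1.30)
)

toy_facet_plot <- ggplot(
    toy_all_results,
    aes(y = predictor, x = estimate, xmin = conf.low, xmax = conf.high)
) +
    geom_point(size = 3) +
    geom_errorbarh(height = 0.1) +
    geom_vline(xintercept = 1, linetype = "dotted") +
    # One row of panels per model type, sharing the same x-axis
    facet_grid(rows = vars(model)) +
    labs(y = "Predictors", x = "Odds ratio (95% CI)") +
    theme_classic()

toy_facet_plot
```

Stacking the panels in rows keeps the x-axis aligned, so the shift in each estimate after adjustment can be read straight down the figure.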


Use tables effectively to show your results

Present the basic characteristics of the cohort

A classic use for tables is showing the basic characteristics of a cohort dataset, as there are diverse data types and summary statistics that need to be shown. Including a basic participant characteristics table is part of the STROBE best practices. This table can be quite informative for others when they interpret your analysis results.

Using the carpenter package, create a table showing summary statistics for each data collection visit.



  • Set "followup_visit_number" as the header argument of outline_table().


  • Use quotes around followup_visit_number.


# Create a table of summary statistics
characteristics_table <- tidied_framingham %>% 
    # These discrete variables are numeric, but must be factors
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    # Set followup visit number as table column
    outline_table(header = ___) 

# Check the table


# Create a table of summary statistics
characteristics_table <- tidied_framingham %>% 
    # These discrete variables are numeric, but must be factors
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    # Set followup visit number as table column
    outline_table(header = "followup_visit_number") 

# Check the table



  • Add a row for the "got_cvd", "sex", and "education_combined" variables, using stat_nPct for the stat argument.


  • The variables should be quoted, e.g. "sex".


# Create a table of summary statistics
characteristics_table <- tidied_framingham %>% 
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    outline_table(header = "followup_visit_number") %>% 
    # Show n (%) for discrete variables as rows
    add_rows(c(___, ___, ___), stat = ___)

# Check the table


# Create a table of summary statistics
characteristics_table <- tidied_framingham %>% 
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    outline_table(header = "followup_visit_number") %>% 
    # Show n (%) for discrete variables as rows
    add_rows(c("got_cvd", "sex", "education_combined"), stat = stat_nPct)

# Check the table



  • Add rows for "total_cholesterol", "body_mass_index", and "participant_age" using stat_medianIQR for the stat argument.


  • The variables need to be surrounded by quotes, just like the function above.


# Create a table of summary statistics
characteristics_table <- tidied_framingham %>% 
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    outline_table(header = "followup_visit_number") %>% 
    add_rows(c("got_cvd", "sex", "education_combined"), stat = stat_nPct) %>% 
    # Show median (interquartile range) for continuous variables
    add_rows(c(___, ___, ___), stat = ___)

# Check the table


# Create a table of summary statistics
characteristics_table <- tidied_framingham %>% 
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    outline_table(header = "followup_visit_number") %>% 
    add_rows(c("got_cvd", "sex", "education_combined"), stat = stat_nPct) %>% 
    # Show median (interquartile range) for continuous variables
    add_rows(c("total_cholesterol", "body_mass_index", "participant_age"), stat = stat_medianIQR)

# Check the table



  • Rename the table "header" columns to "Measures", "Baseline", "Second followup", and "Third followup", then use build_table() to convert the table to Markdown format.


  • The new column headers should be given as a character vector.


characteristics_table <- tidied_framingham %>% 
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    outline_table(header = "followup_visit_number") %>% 
    add_rows(c("got_cvd", "sex", "education_combined"), stat = stat_nPct) %>% 
    add_rows(c("participant_age", "body_mass_index", "total_cholesterol"), stat = stat_medianIQR) %>% 
    # Rename headers to better titles
    renaming("header", c(___, ___, ___, ___))

# Build the table and convert to markdown form


characteristics_table <- tidied_framingham %>% 
    mutate_at(vars(followup_visit_number, got_cvd), as.factor) %>% 
    outline_table(header = "followup_visit_number") %>% 
    add_rows(c("got_cvd", "sex", "education_combined"), stat = stat_nPct) %>% 
    add_rows(c("participant_age", "body_mass_index", "total_cholesterol"), stat = stat_medianIQR) %>% 
    # Rename headers to better titles
    renaming("header", c("Measures", "Baseline", "Second followup", "Third followup"))

# Build the table and convert to markdown form
build_table(characteristics_table)


Supplemental tables of raw numbers for results

While the main presentation of results should emphasize figures over tables, it is often useful to other researchers (especially those doing meta-analyses or aggregating results) to have the raw model results as well. Tables are a good way to provide these numbers as a supplement to the figure.

Provide the estimates and confidence intervals of the unadjusted and adjusted model results in a table format that you could include in a document or report. The packages glue, stringr, and knitr have been loaded.



  • Round estimate, conf.low, and conf.high to 2 digits using the function round (pass it by name, without the parentheses).


  • mutate_at() takes variables (as the first argument) inside vars() and applies a function like round as the second argument.


# Prepare the results for the table
table_model_results <- all_models %>% 
    # Round values of variables to 2 digits
    mutate_at(vars(___, ___, ___), ___, digits = ___)

# View wrangled data
table_model_results


# Prepare the results for the table
table_model_results <- all_models %>% 
    # Round values of variables to 2 digits
    mutate_at(vars(estimate, conf.low, conf.high), round, digits = 2)

# View wrangled data
table_model_results



  • Use glue() to create a new variable, estimate_ci, in the form estimate (conf.low, conf.high), then use str_replace_all() to replace underscores with spaces in predictor.


  • The estimate and CI variables should be placed inside the {} in glue().


# Prepare the results for the table
table_model_results <- all_models %>% 
    mutate_at(vars(estimate, conf.low, conf.high), round, digits = 2) %>% 
    # Use glue function to combine variables
    mutate(estimate_ci = glue("{___} ({___}, {___})"),
           # Underscores to spaces in predictor
           predictor = str_replace_all(___, "_", " "))

# View wrangled data
table_model_results


# Prepare the results for the table
table_model_results <- all_models %>% 
    mutate_at(vars(estimate, conf.low, conf.high), round, digits = 2) %>% 
    # Use glue function to combine variables
    mutate(estimate_ci = glue("{estimate} ({conf.low}, {conf.high})"),
           # Underscores to spaces in predictor
           predictor = str_replace_all(predictor, "_", " "))

# View wrangled data
table_model_results



  • Keep only the model, predictor, and estimate_ci variables, then use spread() with model as the key and estimate_ci as the value.
  • Create the formatted table with kable().


  • spread() takes two arguments: 1) the current discrete column (model) whose values will become the new column names, and 2) the values (estimate_ci) that will fill the new columns.


# Prepare the results for the table
table_model_results <- all_models %>% 
    mutate_at(vars(estimate, conf.low, conf.high), round, digits = 2) %>% 
    mutate(estimate_ci = glue("{estimate} ({conf.low}, {conf.high})"),
           predictor = predictor %>% 
               str_remove("_scaled") %>% 
               str_replace_all("_", " ")) %>%
    # Keep then spread variables for final table
    select(___, ___, ___) %>% 
    spread(___, ___)

# Create a Markdown table
kable(___, caption = "Estimates and 95% CI from all models.")


# Prepare the results for the table
table_model_results <- all_models %>% 
    mutate_at(vars(estimate, conf.low, conf.high), round, digits = 2) %>% 
    mutate(estimate_ci = glue("{estimate} ({conf.low}, {conf.high})"),
           predictor = predictor %>% 
               str_remove("_scaled") %>% 
               str_replace_all("_", " ")) %>%
    # Keep then spread variables for final table
    select(model, predictor, estimate_ci) %>% 
    spread(model, estimate_ci)

# Create a Markdown table
kable(table_model_results, caption = "Estimates and 95% CI from all models.")
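
The whole table-preparation pipeline above can be sketched end to end on toy data. The data frame toy_all_models and its numbers are invented for illustration. The as.character() wrapper around glue() is an extra precaution added in this sketch, not part of the exercise code, so that the spread column is a plain character vector.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(glue)
library(knitr)

# Made-up results for one predictor under both model types
toy_all_models <- tibble(
    model = c("Unadjusted", "Adjusted"),
    predictor = "body_mass_index_scaled",
    estimate = c(1.234, 1.187),
    conf.low = c(1.056, 1.012),
    conf.high = c(1.449, 1.391)
)

toy_table <- toy_all_models %>%
    # Round the three numeric columns to 2 digits
    mutate_at(vars(estimate, conf.low, conf.high), round, digits = 2) %>%
    mutate(
        # Combine estimate and interval into one display string
        estimate_ci = as.character(glue("{estimate} ({conf.low}, {conf.high})")),
        # Tidy the predictor name for display
        predictor = predictor %>%
            str_remove("_scaled") %>%
            str_replace_all("_", " ")
    ) %>%
    select(model, predictor, estimate_ci) %>%
    # One column per model type, one row per predictor
    spread(model, estimate_ci)

toy_kable <- kable(toy_table, caption = "Toy estimates and 95% CI.")
toy_kable
```

The result has one row per predictor and one column per model type. Note that round() drops trailing zeros, so a value such as 1.10 is displayed as 1.1 in the glued string.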


