Skip to content

357 stat methods vignette #365

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 29 commits into from
Feb 13, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
29 commits
Select commit Hold shift + click to select a range
fc424e5
initial stats methods vignette
jtalboys Nov 21, 2024
3a866c1
creating ards for stats methods up to as_cards_fn usage
jtalboys Dec 12, 2024
0c9c733
Merge branch 'main' into 357-stat-methods-vignette
jtalboys Dec 12, 2024
11b1dce
add as_cards_fn content
jtalboys Dec 12, 2024
5f91fc6
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Jan 8, 2025
1827617
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Jan 14, 2025
dd2cdcb
new line for each sentence
jtalboys Jan 27, 2025
cdbcb05
mention single row required in broom tidy output
jtalboys Jan 27, 2025
1b69db0
simplify function definition
jtalboys Jan 27, 2025
e39b7d5
add wilcox to WORDLIST for stats methods vignette
jtalboys Jan 27, 2025
a5a6f16
style code in vignette
jtalboys Jan 27, 2025
bd5546f
simplify remaining wilcox test calls
jtalboys Jan 27, 2025
bcc224f
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Jan 30, 2025
24aacd5
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Jan 30, 2025
a3c67cf
Update DESCRIPTION
ddsjoberg Jan 30, 2025
ba47bef
Update DESCRIPTION
ddsjoberg Jan 31, 2025
64665ba
Update docs.yaml
ddsjoberg Jan 31, 2025
cc6ee35
Update creating-ards-for-stats-methods.Rmd
ddsjoberg Jan 31, 2025
f6e32f7
extend detail around broom::tidy example
jtalboys Feb 10, 2025
e53f587
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Feb 10, 2025
3ce16f3
Update DESCRIPTION
ddsjoberg Feb 10, 2025
20cfac4
Merge branch '357-stat-methods-vignette' of https://github.com/jtalbo…
ddsjoberg Feb 10, 2025
01c67b2
Adding `cards::ard_complex()` example
ddsjoberg Feb 10, 2025
9c7f2ed
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Feb 12, 2025
7da3428
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Feb 12, 2025
4a0ec85
Merge branch 'main' into 357-stat-methods-vignette
ddsjoberg Feb 13, 2025
27b9d3b
fix typo in stats methods vignette
jtalboys Feb 13, 2025
9a19105
pkgdown updates for new article
ddsjoberg Feb 13, 2025
c5ee67a
Merge branch '357-stat-methods-vignette' of https://github.com/jtalbo…
ddsjoberg Feb 13, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .github/workflows/docs.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -42,3 +42,4 @@ jobs:
with:
default-landing-page: latest-tag
additional-unit-test-report-directories: unit-test-report-non-cran
deps-installation-method: setup-r-dependencies
4 changes: 2 additions & 2 deletions DESCRIPTION
Original file line number Diff line number Diff line change
Expand Up @@ -31,8 +31,8 @@ Suggests:
testthat (>= 3.2.0),
withr (>= 3.0.0)
Config/Needs/coverage: hms
Config/Needs/website: rmarkdown, jsonlite, yaml, gtsummary, tfrmt, gt,
fontawesome, insightsengineering/nesttemplate
Config/Needs/website: rmarkdown, jsonlite, yaml, gtsummary, tfrmt, cardx,
gt, fontawesome, insightsengineering/nesttemplate
Config/testthat/edition: 3
Config/testthat/parallel: true
Encoding: UTF-8
Expand Down
23 changes: 19 additions & 4 deletions _pkgdown.yml
Original file line number Diff line number Diff line change
Expand Up @@ -6,11 +6,26 @@ template:

navbar:
structure:
left: [intro, reference, articles, news]
left: [intro, reference, tutorials, news]
right: [search, github]
github:
icon: fa-github
href: https://github.com/insightsengineering/cards
left:
- icon: fa-home
href: index.html
- text: Reference
href: reference/index.html
- text: Articles
menu:
- text: "Getting Started"
href: articles/getting-started.html
- text: "Creating New ARDs"
href: articles/creating-ards.html
- text: "Other ARD Representations"
href: articles/structures.html
- text: News
href: news/index.html
github:
icon: fa-github
href: https://github.com/insightsengineering/cards

# development:
# mode: auto
Expand Down
1 change: 1 addition & 0 deletions inst/WORDLIST
Original file line number Diff line number Diff line change
Expand Up @@ -47,3 +47,4 @@ unlist
unlists
unnested
unnests
wilcox
184 changes: 184 additions & 0 deletions vignettes/articles/creating-ards.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,184 @@
---
title: "Creating New ARDs"
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```

The basic functions offered by {cards} can create a wide variety of ARDs.
However, sometimes we may need to include the outputs from more complicated statistical methods in our ARDs.
In this article we'll look at a few different ways to implement the output from these statistical methods.

## {cardx}

The {cardx} package is an extension of {cards}.
The idea is that {cards} provides the core functions to create ARDs, while {cardx} contains a large number of extensions that implement various, commonly used statistical methods.
There are a large number of extensions for a wide variety of methods, including (but not limited to):

* Regression models
* ANOVA
* Chi-squared Test
* t-test
* LS Mean Difference
* Survival Estimates and Differences


When looking to include the output from a statistical method your first port of call should be to see if it has already been implemented in {cardx}.
You can find the full list of available functions [here](https://insightsengineering.github.io/cardx/main/reference/index.html).


Consider a simple t-test comparing the mean age (`AGE`) across two treatments arms (`ARM`).
In {cardx} we have the function `cardx::ard_stats_t_test()`

```{r cardx}
cards::ADSL |>
dplyr::filter(ARM %in% c("Xanomeline High Dose", "Xanomeline Low Dose")) |>
cardx::ard_stats_t_test(by = ARM, variables = AGE)
```

In the output, we see the outputs from the t-test; the mean difference, confidence interval limits and p-value.
It's also useful to see the functions inputs; for example, we can see that we did not use equal variances as the `stat` is `FALSE` for `stat_name` `var.equal`.
This is useful for re-use, if we need to run the test again we can use the ARD to see what options we need to use to recreate the result.

# Create a new `ard_*()` function

_But what do we do if the statistical method that we want to use hasn't been implemented already in {cardx}?_

Implementing a new function to create an ARD for a statistical method is often simple; all we need to do is write a function that outputs the results as a named list!

We'll first look at how `broom::tidy()` can make this even easier for us, then provide an example on how to implement from scratch.

## Using `broom::tidy()`

Typically, a user will pass a function which returns a scalar value in the `cards::ard_continuous(statistic)` argument.
However, the argument also allows for functions that return named lists, where the names from the list will then be used as the statistic names.
Since data frames or tibbles are just named lists with a little more formatting, we can pass a function to the `statistic` argument which returns a single row data frame or tibble and the behavior will be the same.
Each column of the table gives the name of the statistic via its column name, and then the corresponding value.

Commonly used statistical methods outputs are able to be passed through `broom::tidy()`, which will convert the output into a tibble.
We can then pass the output of `broom::tidy()` through to the `cards::ard_continuous(statistic)` argument of the ARD function we wish to use, this leads to an ARD output like we see in the above example, where we have one row per relevant input or output from the statistical method.

Please note that we can only use the `broom::tidy()` output directly, as we see in the below example, when the output is a tibble with a single row.

Let's extend our t-test example from above.
This time we want to carry out a one-sample t-test.
We can just pass the code to carry out the one-sample t-test and pass the output through `broom::tidy()` to the `statistic` argument like so:

```{r tidy-stats-method}
cards::ADSL |>
dplyr::filter(ARM %in% c("Xanomeline High Dose", "Xanomeline Low Dose")) |>
cards::ard_continuous(
variables = AGE,
statistic = everything() ~ list(t_test = \(x) t.test(x) |> broom::tidy())
) |>
dplyr::mutate(context = "t_test_one_sample")
```

In the above chunk of code, if we focus on what we pass to the `statistic` argument:

* `everything()` means we want to run this test on all columns passed to the `variables` argument.
* `t.test(x)` is the function which carries out the statistical test, in this case with the default arguments.
* We pipe the output of `t.test(x)` into `broom::tidy()` which converts the output of the test into a tibble (which, remember, is just a named list!)
* The function we've defined (`\(x) t.test(x) |> broom::tidy()`) itself needs to be included in a named list, where here we've chosen to use the label `'t_test'`.

Over 100 different statistical methods implemented in R are able to be 'tidied' using `broom::tidy()`.
However, the method you aim to use might not be, or the current `broom::tidy()` implementation might not contain the information that you need to be in your ARD.
In that case we'll have to format the output ourselves.

## Without `broom::tidy()`

As mentioned above, we need to define a function which carries out our required statistical method and outputs a named list of the information we wish to include in the ARD.

As an example, let's write a function which carries out a Wilcoxon signed rank test over one variable using the function `wilcox.test`.
As an output we just want to record the method and the p-value.

```{r wilcox-function}
wilcox_one_var <- \(x) wilcox.test(x)[c("method", "p.value")]
```

Let's now use this function when creating an ARD with {cards}.
Remember we just need the statistic to be a named list, so we'll call our function inside a named list.
We also don't need to specify any arguments, in this case it will pick up that the one variable `x` corresponds to the data we are testing, in this case `AGE` for the individual treatment arms.

```{r wilcox-with-cards}
cards::ADSL |>
cards::ard_continuous(
variables = AGE,
by = ARM,
statistic = ~ list(wilcox = wilcox_one_var)
)
```

We see here that we get an output of 8 rows, 2 rows (one for the method, and one for the p-value) for each of the 4 treatment arms.

## Complex inputs

The examples above are great to illustrate a simple case, but it is perhaps a rare scenario where we are implementing a statistical method with a single vector as its input.
How would we update the code above if we need to implement a two-sample t-test?

The `cards::ard_complex()` is similar to `cards::ard_continuous()`, but allows for more complex inputs in the function passed in the `statistic` argument.
In `cards::ard_continuous()`, the functions passed must accept a single vector, e.g. `\(x) t.test(x)`.
But in `cards::ard_complex()`, in addition to the vector being passed, the `data` subset, the `full_data`, character `by`, and character `strata` are also passed (see the `cards::ard_complex()` for a full description). Your function does not need to utilize each of these elements, but each _will be passed_ to your function.
As a result, we recommend your function accept the triple dots to handle unused arguments.

An implementation of a two-sample t-test may look like this:

```{r}
ttest_two_sample <- \(x, data, ...) t.test(x ~ data[["ARM"]]) |> broom::tidy()

cards::ADSL |>
dplyr::filter(ARM %in% c("Xanomeline High Dose", "Xanomeline Low Dose")) |>
cards::ard_complex(
variables = AGE,
statistic = everything() ~ list(t_test = ttest_two_sample)
) |>
dplyr::mutate(context = "t_test_two_sample")
```

# Handling errors

Let's consider what happens when we encounter an error in our statistical method.

```{r wilcox-fn-error}
wilcox_one_var_error <- function(x) {
stop("AN ERROR!")
wilcox.test(x)[c("method", "p.value")]
}

cards::ADSL |>
cards::ard_continuous(
variables = AGE,
by = ARM,
statistic = ~ list(wilcox = wilcox_one_var_error)
)
```

In the output we see that we only get 4 rows of output, the error has been stored in the `error` column but `stat_name` and `stat_label` now just take the list name of "wilcox" that we define in the statistic argument.
This could have unintended effects in downstream code, we may be relying on the `stat_name` and `stat_label` having values of "method" and "p-value", or just that the output has 2 rows per treatment arm.

To handle this we can specify the expected results from our function, so that even if we encounter an error during the code run we can be assured that the output will be of a consistent format so as not to impact downstream code.

Here's an example of how to specify the expected output using `cards::as_cards_fn()`:

```{r as_cards_fn}
wilcox_one_var_error <- cards::as_cards_fn(
wilcox_one_var_error,
stat_names = c("method", "p.value")
)

cards::ADSL |>
cards::ard_continuous(
variables = AGE,
by = ARM,
statistic = ~ list(wilcox = wilcox_one_var_error)
)
```

Our function becomes the first argument to `cards::as_cards_fn()`, then the second argument is `stat_names` where we specify the expected names of the output list.

In the output shown here, the `error` column is still populated with the error.
However, now we have the expected 8 rows and we can see that the `stat_name` and `stat_label` match the values specified in the `stat_names` argument in the `as_cards_fn()`---helping us to avoid problems in code that relies on this output.
Loading