Adds denom argument to count s_* functions #1326

Merged
merged 4 commits into from
Oct 24, 2024
9 changes: 7 additions & 2 deletions NEWS.md
@@ -1,5 +1,10 @@
# tern 0.9.6.9007

### Enhancements
* Added the `denom` parameter to `s_count_cumulative()`, `s_count_missed_doses()`, and `s_count_occurrences_by_grade()`.
* Added `"N_row"` as an optional input to `denom` in `s_count_occurrences()`.
* Refactored `a_count_occurrences_by_grade()` to no longer use `make_afun()`.

### Bug Fixes
* Fixed bug in `a_summary()` causing non-unique `row_name` values to occur when multiple statistics are selected for count variables.

@@ -15,7 +20,7 @@
* Refactored `estimate_incidence_rate` to work as both an analyze function and a summarize function, controlled by the added `summarize` parameter. When `summarize = TRUE`, labels can be fine-tuned via the new `label_fmt` argument to the same function.
* Added `fraction` statistic to the `analyze_var_count` method group.
* Improved `summarize_glm_count()` documentation and all its associated functions to better describe the results and the functions' purpose.
* Added `method` argument to `s_odds_ratio()` and `estimate_odds_ratio()` to control whether exact or approximate conditional likelihood calculations are used.
* Added `method` argument to `s_odds_ratio()` and `estimate_odds_ratio()` to control whether exact or approximate conditional likelihood calculations are used.

### Bug Fixes
* Added defaults for `d_count_cumulative` parameters as described in the documentation.
@@ -72,7 +77,7 @@
### Miscellaneous
* Added function `expect_snapshot_ggplot` to test setup file to process plot snapshot tests and allow plot dimensions to be set.
* Adapted to argument renames introduced in `ggplot2` 3.5.0.
* Renamed `individual_patient_plot.R` to `g_ipp.R`.
* Renamed `individual_patient_plot.R` to `g_ipp.R`.
* Removed all instances of deprecated parameters `time_unit_input`, `time_unit_output`, `na_level` and `indent_mod`.
* Removed deprecated functions `summarize_vars`, `control_summarize_vars`, `a_compare`, `create_afun_summary`, `create_afun_compare`, and `summary_custom`.
* Removed `vdiffr` package from Suggests in DESCRIPTION file.
48 changes: 20 additions & 28 deletions R/analyze_variables.R
@@ -238,11 +238,6 @@ s_summary.numeric <- function(x,

#' @describeIn analyze_variables Method for `factor` class.
#'
#' @param denom (`string`)\cr choice of denominator for factor proportions. Options are:
#' * `n`: number of values in this row and column intersection.
#' * `N_row`: total number of values in this row across columns.
#' * `N_col`: total number of values in this column across rows.
#'
#' @return
#' * If `x` is of class `factor` or converted from `character`, returns a `list` with named `numeric` items:
#' * `n`: The [length()] of `x`.
@@ -283,12 +278,11 @@ s_summary.numeric <- function(x,
#' @export
s_summary.factor <- function(x,
na.rm = TRUE, # nolint
denom = c("n", "N_row", "N_col"),
denom = c("n", "N_col", "N_row"),
.N_row, # nolint
.N_col, # nolint
...) {
assert_valid_factor(x)
denom <- match.arg(denom)

if (na.rm) {
x <- x[!is.na(x)] %>% fct_discard("<Missing>")
@@ -301,20 +295,23 @@ s_summary.factor <- function(x,
y$n <- length(x)

y$count <- as.list(table(x, useNA = "ifany"))
dn <- switch(denom,
n = length(x),
N_row = .N_row,
N_col = .N_col
)

denom <- match.arg(denom) %>%
switch(
n = length(x),
N_row = .N_row,
N_col = .N_col
)

y$count_fraction <- lapply(
y$count,
function(x) {
c(x, ifelse(dn > 0, x / dn, 0))
c(x, ifelse(denom > 0, x / denom, 0))
}
)
y$fraction <- lapply(
y$count,
function(count) c("num" = count, "denom" = dn)
function(count) c("num" = count, "denom" = denom)
)

y$n_blq <- sum(grepl("BLQ|LTR|<[1-9]|<PCLLOQ", x))
@@ -346,7 +343,7 @@ s_summary.factor <- function(x,
#' @export
s_summary.character <- function(x,
na.rm = TRUE, # nolint
denom = c("n", "N_row", "N_col"),
denom = c("n", "N_col", "N_row"),
.N_row, # nolint
.N_col, # nolint
.var,
@@ -370,11 +367,6 @@ s_summary.character <- function(x,

#' @describeIn analyze_variables Method for `logical` class.
#'
#' @param denom (`string`)\cr choice of denominator for proportion. Options are:
#' * `n`: number of values in this row and column intersection.
#' * `N_row`: total number of values in this row across columns.
#' * `N_col`: total number of values in this column across rows.
#'
#' @return
#' * If `x` is of class `logical`, returns a `list` with named `numeric` items:
#' * `n`: The [length()] of `x` (possibly after removing `NA`s).
@@ -406,22 +398,22 @@ s_summary.character <- function(x,
#' @export
s_summary.logical <- function(x,
na.rm = TRUE, # nolint
denom = c("n", "N_row", "N_col"),
denom = c("n", "N_col", "N_row"),
.N_row, # nolint
.N_col, # nolint
...) {
denom <- match.arg(denom)
if (na.rm) x <- x[!is.na(x)]
y <- list()
y$n <- length(x)
count <- sum(x, na.rm = TRUE)
dn <- switch(denom,
n = length(x),
N_row = .N_row,
N_col = .N_col
)
denom <- match.arg(denom) %>%
switch(
n = length(x),
N_row = .N_row,
N_col = .N_col
)
y$count <- count
y$count_fraction <- c(count, ifelse(dn > 0, count / dn, 0))
y$count_fraction <- c(count, ifelse(denom > 0, count / denom, 0))
y$n_blq <- 0L
y
}
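
The three `s_summary()` methods in this file now share one idiom: `match.arg()` validates the `denom` string and `switch()` turns it into a number. A minimal base-R sketch of that idiom follows. `resolve_denom()` and the `.N_row`/`.N_col` values are hypothetical, chosen only for illustration, and are not part of tern.

```r
# Hypothetical helper (not part of tern): maps a `denom` choice to a number.
resolve_denom <- function(x, denom = c("n", "N_col", "N_row"), .N_row, .N_col) {
  # match.arg() rejects anything outside the allowed choices;
  # switch() only evaluates the branch that was selected.
  switch(match.arg(denom),
    n = length(x),
    N_row = .N_row,
    N_col = .N_col
  )
}

x <- c(TRUE, FALSE, TRUE, NA)
x <- x[!is.na(x)] # mirrors na.rm = TRUE
count <- sum(x)

# Assumed totals for illustration: 20 values in the row, 10 in the column.
fractions <- sapply(c("n", "N_col", "N_row"), function(d) {
  dn <- resolve_denom(x, denom = d, .N_row = 20L, .N_col = 10L)
  if (dn > 0) count / dn else 0
})
fractions
```

The same count of 2 is reported against 3, 10, or 20 depending on the choice, so the `denom` setting should probably be stated wherever these fractions are displayed.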
4 changes: 4 additions & 0 deletions R/argument_convention.R
@@ -30,6 +30,10 @@
#' @param col_by (`factor`)\cr defining column groups.
#' @param conf_level (`proportion`)\cr confidence level of the interval.
#' @param data (`data.frame`)\cr the dataset containing the variables to summarize.
#' @param denom (`string`)\cr choice of denominator for proportion. Options are:
#' * `n`: number of values in this row and column intersection.
#' * `N_row`: total number of values in this row across columns.
#' * `N_col`: total number of values in this column across rows.
#' @param df (`data.frame`)\cr data set containing all analysis variables.
#' @param groups_lists (named `list` of `list`)\cr optionally contains for each `subgroups` variable a
#' list, which specifies the new group levels via the names and the
16 changes: 14 additions & 2 deletions R/count_cumulative.R
@@ -78,7 +78,10 @@ h_count_cumulative <- function(x,
length(x[is_keep & x > threshold])
}

result <- c(count = count, fraction = count / .N_col)
result <- c(
count = count,
fraction = if (count == 0 && .N_col == 0) 0 else count / .N_col
)
result
}

@@ -112,11 +115,20 @@ s_count_cumulative <- function(x,
lower_tail = TRUE,
include_eq = TRUE,
.N_col, # nolint
.N_row, # nolint
denom = c("N_col", "n", "N_row"),
...) {
checkmate::assert_numeric(thresholds, min.len = 1, any.missing = FALSE)

denom <- match.arg(denom) %>%
switch(
n = length(x),
N_row = .N_row,
N_col = .N_col
)

count_fraction_list <- Map(function(thres) {
result <- h_count_cumulative(x, thres, lower_tail, include_eq, .N_col = .N_col, ...)
result <- h_count_cumulative(x, thres, lower_tail, include_eq, .N_col = denom, ...)
label <- d_count_cumulative(thres, lower_tail, include_eq)
formatters::with_label(result, label)
}, thresholds)
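
To make the effect of the new `denom` argument on cumulative counts concrete, here is a small self-contained sketch. `cumulative_sketch()` is a hypothetical stand-in for `h_count_cumulative()`: it drops `NA`s unconditionally and reproduces only the counting and the zero-denominator guard shown in the diff, not the real function's full argument handling.

```r
# Hypothetical stand-in for h_count_cumulative(); not the tern implementation.
cumulative_sketch <- function(x, threshold, lower_tail = TRUE, include_eq = TRUE, denom) {
  x <- x[!is.na(x)] # assumption: always drop missing values
  count <- if (lower_tail && include_eq) {
    sum(x <= threshold)
  } else if (lower_tail) {
    sum(x < threshold)
  } else if (include_eq) {
    sum(x >= threshold)
  } else {
    sum(x > threshold)
  }
  # Same guard as in the diff: avoid 0 / 0 producing NaN.
  c(count = count, fraction = if (count == 0 && denom == 0) 0 else count / denom)
}

x <- c(1, 2, 3, 4, 5)
res_n <- cumulative_sketch(x, threshold = 2, denom = length(x)) # like denom = "n"
res_col <- cumulative_sketch(x, threshold = 2, denom = 10) # assumed .N_col = 10
res_empty <- cumulative_sketch(numeric(0), threshold = 2, denom = 0) # guarded case
```

In this sketch the already-resolved number is passed in, which matches how the diff hands the resolved `denom` to the helper through its `.N_col` argument.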
8 changes: 6 additions & 2 deletions R/count_missed_doses.R
@@ -58,13 +58,17 @@ d_count_missed_doses <- function(thresholds) {
#' @keywords internal
s_count_missed_doses <- function(x,
thresholds,
.N_col) { # nolint
.N_col, # nolint
.N_row, # nolint
denom = c("N_col", "n", "N_row")) {
stat <- s_count_cumulative(
x = x,
thresholds = thresholds,
lower_tail = FALSE,
include_eq = TRUE,
.N_col = .N_col
.N_col = .N_col,
.N_row = .N_row,
denom = denom
)
labels <- d_count_missed_doses(thresholds)
for (i in seq_along(stat$count_fraction)) {
31 changes: 18 additions & 13 deletions R/count_occurrences.R
@@ -51,9 +51,10 @@ NULL
#' @describeIn count_occurrences Statistics function which counts number of patients that report an
#' occurrence.
#'
#' @param denom (`string`)\cr choice of denominator for patient proportions. Can be:
#' - `N_col`: total number of patients in this column across rows
#' - `n`: number of patients with any occurrences
#' @param denom (`string`)\cr choice of denominator for proportion. Options are:
#' * `N_col`: total number of patients in this column across rows.
#' * `n`: number of patients with any occurrences.
#' * `N_row`: total number of patients in this row across columns.
#'
#' @return
#' * `s_count_occurrences()` returns a list with:
@@ -66,15 +67,17 @@ NULL
#' s_count_occurrences(
#' df,
#' .N_col = 4L,
#' .N_row = 4L,
#' .df_row = df,
#' .var = "MHDECOD",
#' id = "USUBJID"
#' )
#'
#' @export
s_count_occurrences <- function(df,
denom = c("N_col", "n"),
denom = c("N_col", "n", "N_row"),
.N_col, # nolint
.N_row, # nolint
.df_row,
drop = TRUE,
.var = "MHDECOD",
@@ -84,7 +87,6 @@ s_count_occurrences <- function(df,
checkmate::assert_count(.N_col)
checkmate::assert_multi_class(df[[.var]], classes = c("factor", "character"))
checkmate::assert_multi_class(df[[id]], classes = c("factor", "character"))
denom <- match.arg(denom)

occurrences <- if (drop) {
# Note that we don't try to preserve original level order here since a) that would require
@@ -101,10 +103,12 @@ s_count_occurrences <- function(df,
df[[.var]]
}
ids <- factor(df[[id]])
dn <- switch(denom,
n = nlevels(ids),
N_col = .N_col
)
denom <- match.arg(denom) %>%
switch(
n = nlevels(ids),
N_row = .N_row,
N_col = .N_col
)
has_occurrence_per_id <- table(occurrences, ids) > 0
n_ids_per_occurrence <- as.list(rowSums(has_occurrence_per_id))
list(
@@ -118,12 +122,12 @@ s_count_occurrences <- function(df,
c(i, i / denom)
}
},
denom = dn
denom = denom
),
fraction = lapply(
n_ids_per_occurrence,
function(i, denom) c("num" = i, "denom" = denom),
denom = dn
denom = denom
)
)
}
@@ -147,9 +151,10 @@ s_count_occurrences <- function(df,
a_count_occurrences <- function(df,
labelstr = "",
id = "USUBJID",
denom = c("N_col", "n"),
denom = c("N_col", "n", "N_row"),
drop = TRUE,
.N_col, # nolint
.N_row, # nolint
.var = NULL,
.df_row = NULL,
.stats = NULL,
@@ -159,7 +164,7 @@ a_count_occurrences <- function(df,
na_str = default_na_str()) {
denom <- match.arg(denom)
x_stats <- s_count_occurrences(
df = df, denom = denom, .N_col = .N_col, .df_row = .df_row, drop = drop, .var = .var, id = id
df = df, denom = denom, .N_col = .N_col, .N_row = .N_row, .df_row = .df_row, drop = drop, .var = .var, id = id
)
if (is.null(unlist(x_stats))) {
return(NULL)
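
As a rough illustration of how the three `denom` choices change the fractions reported by `s_count_occurrences()`, the sketch below repeats the per-patient counting from the diff on a toy data frame. `occurrence_fractions()`, the toy data, and the `.N_col`/`.N_row` values are hypothetical and only meant to show the arithmetic.

```r
# Hypothetical simplification of s_count_occurrences(); not the tern implementation.
occurrence_fractions <- function(df, denom = c("N_col", "n", "N_row"), .N_col, .N_row,
                                 .var = "MHDECOD", id = "USUBJID") {
  ids <- factor(df[[id]])
  dn <- switch(match.arg(denom),
    n = nlevels(ids), # patients with any occurrence
    N_row = .N_row,
    N_col = .N_col
  )
  # A patient counts once per occurrence term, however often it was reported.
  has_occurrence_per_id <- table(df[[.var]], ids) > 0
  n_ids_per_occurrence <- rowSums(has_occurrence_per_id)
  lapply(n_ids_per_occurrence, function(i) c(num = i, denom = dn))
}

df <- data.frame(
  USUBJID = c("1", "1", "2", "3", "3"),
  MHDECOD = c("A", "B", "A", "C", "C"),
  stringsAsFactors = FALSE
)

# Assumed totals for illustration: 4 patients in the column, 8 in the row.
by_col <- occurrence_fractions(df, denom = "N_col", .N_col = 4L, .N_row = 8L)
by_n <- occurrence_fractions(df, denom = "n", .N_col = 4L, .N_row = 8L)
by_row <- occurrence_fractions(df, denom = "N_row", .N_col = 4L, .N_row = 8L)
```

Term `"C"` is reported twice by patient 3 but contributes a numerator of 1, and only the denominator differs between the three results.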