-
Notifications
You must be signed in to change notification settings - Fork 63
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closes #1792 multiple summary vars #2124
Changes from 9 commits
71ad294
55dc1b0
8540821
44ad305
38e2011
11a9e25
f0c96f6
585edc7
0856f6b
5e1ef4c
8589e49
2693711
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
Package: admiral | ||
Type: Package | ||
Title: ADaM in R Asset Library | ||
Version: 0.12.0 | ||
Version: 0.12.0.9000 | ||
Authors@R: c( | ||
person("Ben", "Straub", email = "[email protected]", role = c("aut", "cre")), | ||
person("Stefan", "Bundfuss", role = "aut"), | ||
|
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
|
@@ -853,34 +853,52 @@ get_not_mapped <- function() { | |||||
admiral_environment$nmap | ||||||
} | ||||||
|
||||||
#' Merge a Summary Variable | ||||||
#' Merge Summary Variables | ||||||
#' | ||||||
#' @description Merge a summary variable from a dataset to the input dataset. | ||||||
#' | ||||||
#' **Note:** This is a wrapper function for the more generic `derive_vars_merged`. | ||||||
#' | ||||||
#' @param dataset Input dataset | ||||||
#' | ||||||
#' The variables specified by the `by_vars` argument are expected. | ||||||
#' | ||||||
#' @param dataset_add Additional dataset | ||||||
#' | ||||||
#' The variables specified by the `by_vars` and the `analysis_var` arguments | ||||||
#' are expected. | ||||||
#' The variables specified by the `by_vars` and the variables used on the left | ||||||
#' hand sides of the `new_vars` arguments are expected. | ||||||
#' | ||||||
#' @param new_var Variable to add | ||||||
#' | ||||||
#' `r lifecycle::badge("deprecated")` Please use `new_vars` instead. | ||||||
#' | ||||||
#' The specified variable is added to the input dataset (`dataset`) and set to | ||||||
#' the summarized values. | ||||||
#' | ||||||
#' @param by_vars Grouping variables | ||||||
#' | ||||||
#' The values of `analysis_var` are summarized by the specified variables. The | ||||||
#' summarized values are merged to the input dataset (`dataset`) by the | ||||||
#' specified by variables. | ||||||
#' The expressions on the left hand sides of `new_vars` are evaluated by the | ||||||
#' specified variables. The resulting values are merged to the input dataset | ||||||
#' (`dataset`) by the specified by variables. | ||||||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
I got confused There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The by variables are used twice. I have emphasized both. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I also added more details to "Details". |
||||||
#' | ||||||
#' *Permitted Values*: list of variables created by `exprs()` | ||||||
#' | ||||||
#' @param new_vars New variables to add | ||||||
#' | ||||||
#' The specified variables are added to the input dataset. | ||||||
#' | ||||||
#' A named list of expressions is expected: | ||||||
#' + LHS refer to a variable. | ||||||
#' + RHS refers to the values to set to the variable. This can be a string, a | ||||||
#' symbol, a numeric value, an expression or NA. If summary functions are | ||||||
#' used, the values are summarized by the variables specified for `by_vars`. | ||||||
#' | ||||||
#' For example: | ||||||
#' ``` | ||||||
#' new_vars = exprs( | ||||||
#' DOSESUM = sum(AVAL), | ||||||
#' DOSEMEAN = mean(AVAL) | ||||||
#' ) | ||||||
#' ``` | ||||||
#' | ||||||
#' @param filter_add Filter for additional dataset (`dataset_add`) | ||||||
#' | ||||||
#' Only observations fulfilling the specified condition are taken into account | ||||||
|
@@ -891,11 +909,15 @@ get_not_mapped <- function() { | |||||
#' | ||||||
#' @param analysis_var Analysis variable | ||||||
#' | ||||||
#' `r lifecycle::badge("deprecated")` Please use `new_vars` instead. | ||||||
#' | ||||||
#' The values of the specified variable are summarized by the function | ||||||
#' specified for `summary_fun`. | ||||||
#' | ||||||
#' @param summary_fun Summary function | ||||||
#' | ||||||
#' `r lifecycle::badge("deprecated")` Please use `new_vars` instead. | ||||||
#' | ||||||
#' The specified function that takes as input `analysis_var` and performs the | ||||||
#' calculation. This can include built-in functions as well as user defined | ||||||
#' functions, for example `mean` or `function(x) mean(x, na.rm = TRUE)`. | ||||||
|
@@ -906,18 +928,16 @@ get_not_mapped <- function() { | |||||
#' 1. The records from the additional dataset (`dataset_add`) are restricted | ||||||
#' to those matching the `filter_add` condition. | ||||||
#' | ||||||
#' 1. The values of the analysis variable (`analysis_var`) are summarized by | ||||||
#' the summary function (`summary_fun`) for each by group (`by_vars`) in the | ||||||
#' additional dataset (`dataset_add`). | ||||||
#' 1. The new variables (`new_vars`) are created for each by group (`by_vars`) | ||||||
#' in the additional dataset (`dataset_add`). | ||||||
#' | ||||||
#' 1. The summarized values are merged to the input dataset as a new variable | ||||||
#' (`new_var`). For observations without a matching observation in the | ||||||
#' additional dataset the new variable is set to `NA`. Observations in the | ||||||
#' additional dataset which have no matching observation in the input dataset | ||||||
#' are ignored. | ||||||
#' 1. The new variables are merged to the input dataset. For observations | ||||||
#' without a matching observation in the additional dataset the new variables | ||||||
#' are set to `NA`. Observations in the additional dataset which have no | ||||||
#' matching observation in the input dataset are ignored. | ||||||
#' | ||||||
#' @return The output dataset contains all observations and variables of the | ||||||
#' input dataset and additionally the variable specified for `new_var`. | ||||||
#' input dataset and additionally the variables specified for `new_vars`. | ||||||
#' | ||||||
#' @family der_gen | ||||||
#' @keywords der_gen | ||||||
|
@@ -947,9 +967,10 @@ get_not_mapped <- function() { | |||||
#' adbds, | ||||||
#' dataset_add = adbds, | ||||||
#' by_vars = exprs(USUBJID, AVISIT), | ||||||
#' new_var = MEANVIS, | ||||||
#' analysis_var = AVAL, | ||||||
#' summary_fun = function(x) mean(x, na.rm = TRUE) | ||||||
#' new_vars = exprs( | ||||||
#' MEANVIS = mean(AVAL, na.rm = TRUE), | ||||||
#' MAXVIS = max(AVAL, na.rm = TRUE) | ||||||
#' ) | ||||||
#' ) | ||||||
#' | ||||||
#' # Add a variable listing the lesion ids at baseline | ||||||
|
@@ -981,46 +1002,55 @@ get_not_mapped <- function() { | |||||
#' dataset_add = adtr, | ||||||
#' by_vars = exprs(USUBJID), | ||||||
#' filter_add = AVISIT == "BASELINE", | ||||||
#' new_var = LESIONSBL, | ||||||
#' analysis_var = LESIONID, | ||||||
#' summary_fun = function(x) paste(x, collapse = ", ") | ||||||
#' new_vars = exprs(LESIONSBL = paste(LESIONID, collapse = ", ")) | ||||||
#' ) | ||||||
#' | ||||||
derive_var_merged_summary <- function(dataset, | ||||||
dataset_add, | ||||||
by_vars, | ||||||
new_vars = NULL, | ||||||
new_var, | ||||||
filter_add = NULL, | ||||||
analysis_var, | ||||||
summary_fun) { | ||||||
assert_vars(by_vars) | ||||||
by_vars_left <- replace_values_by_names(by_vars) | ||||||
by_vars_right <- chr2vars(paste(vars2chr(by_vars))) | ||||||
new_var <- assert_symbol(enexpr(new_var)) | ||||||
analysis_var <- assert_symbol(enexpr(analysis_var)) | ||||||
# once new_var is removed new_vars should be mandatory | ||||||
assert_expr_list(new_vars, named = TRUE, optional = TRUE) | ||||||
filter_add <- | ||||||
assert_filter_cond(enexpr(filter_add), optional = TRUE) | ||||||
assert_s3_class(summary_fun, "function") | ||||||
assert_data_frame( | ||||||
dataset, | ||||||
required_vars = by_vars_left | ||||||
) | ||||||
assert_data_frame( | ||||||
dataset_add, | ||||||
required_vars = expr_c(by_vars_right, analysis_var) | ||||||
required_vars = expr_c(by_vars_right, extract_vars(new_vars)) | ||||||
) | ||||||
|
||||||
if (!missing(new_var) || !missing(analysis_var) || !missing(summary_fun)) { | ||||||
deprecate_warn( | ||||||
"1.0.0", | ||||||
I("derive_var_merged_summary(new_var = , anaylsis_var = , summary_fun = )"), | ||||||
"derive_var_merged_summary(new_vars = )" | ||||||
) | ||||||
new_var <- assert_symbol(enexpr(new_var)) | ||||||
analysis_var <- assert_symbol(enexpr(analysis_var)) | ||||||
assert_s3_class(summary_fun, "function") | ||||||
new_vars <- exprs(!!new_var := {{ summary_fun }}(!!analysis_var), !!!new_vars) | ||||||
} | ||||||
|
||||||
# Summarise the analysis value and merge to the original dataset | ||||||
derive_vars_merged( | ||||||
dataset, | ||||||
dataset_add = get_summary_records( | ||||||
dataset_add, | ||||||
by_vars = by_vars_right, | ||||||
filter = !!filter_add, | ||||||
analysis_var = !!analysis_var, | ||||||
summary_fun = summary_fun | ||||||
), | ||||||
by_vars = by_vars, | ||||||
new_vars = exprs(!!new_var := !!analysis_var) | ||||||
set_values_to = new_vars, | ||||||
) %>% | ||||||
select(!!!by_vars_right, names(new_vars)), | ||||||
by_vars = by_vars | ||||||
) | ||||||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should have
admiral (development version)
instead of the number. I thinkusethis::use_version("dev")
will update for youThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated, although
use_version("dev")
did not work well (it increased to 0.12.0.9001 and added an additional heading inNEWS.md
and the commit failed). I think we should calluse_version("dev")
once directly after each release.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe every time we merge into
main
whiledeveloping
this will be incremented. @zdz2101 @ddsjobergThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep it simple and always stay at x.x.x.9000. Even in the "R packages" book it is suggested to increment the development version only in exceptional cases:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree it will make it a lot easier on us if we don't have to change this every time we merge into
main
. Seems like this will cause a lot of tedious merge conflicts.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does seem to make more sense, I'll remove the running of the line from the PR template, and it can be added as a standard post-release task
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like the admiral family often makes updates that affect downstream dependencies. With that in mind, incrementing the version number will make it easy for these downstream dependencies to ensure the min dev version is installed. @cicdguy created a workflow to increment the dev version number with each PR, and I think it's a good addition to include in admiral