Closes #34 function for growth parameters for height/length #45
Changes from all commits
5ba5655
70271b3
e7f97e7
879fe9f
e5a231b
357c726
719fd83
7e3af6e
af7d60c
52b2885
d56dab4
5d11c2f
9bc420f
feba775
0918f3d
c9015a8
d550a0d
14270f6
4fa8ceb
6fd61fc
617f781
b3bf788
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,289 @@ | ||
#' Derive Anthropometric indicators (Z-Scores/Percentiles-for-Height/Length) | ||
#' based on Standard Growth Charts | ||
#' | ||
#' Derive Anthropometric indicators (Z-Scores/Percentiles-for-Height/Length) | ||
#' based on Standard Growth Charts for Weight by Height/Length | ||
#' | ||
#' @param dataset Input dataset | ||
#' | ||
#' The variables specified by `sex`, `height`, `height_unit`, and `analysis_var`, | ||
#' and any variables referenced in `parameter`, are expected to be in the dataset. | ||
#' | ||
#' @param sex Sex | ||
#' | ||
#' A character vector is expected. | ||
#' | ||
#' Expected Values: `M`, `F` | ||
#' | ||
#' @param height Current Height/length | ||
#' | ||
#' A numeric vector is expected. Note that this is the actual height at the current visit. | ||
#' | ||
#' @param height_unit Height/Length Unit | ||
#' | ||
#' A character vector is expected. | ||
#' | ||
#' Expected values: `cm` | ||
#' | ||
#' @param meta_criteria Metadata dataset | ||
#' | ||
#' A metadata dataset with the following expected variables: | ||
#' `HEIGHT_LENGTH`, `HEIGHT_LENGTHU`, `SEX`, `L`, `M`, `S` | ||
#' | ||
#' The dataset can be derived from WHO or user-defined datasets. | ||
#' The WHO growth chart metadata datasets are available in the package and will | ||
Shall we list out the names of the metadata and clarify that "who_wt_for_lgth_boys" and "who_wt_for_lgth_girls" are for subjects with age < 730.5 days, and "who_wt_for_ht_boys" and "who_wt_for_ht_girls" are for those with age >= 730.5 days? By doing this, the user knows the metadata he/she can use, and also knows to apply different metadata based on subjects' age.
This is a good idea, but I think we'll need to draw up a bigger vignette/article for metadata creation/preprocessing anyway
hopefully the template and vignette can cover this more so user gets the full context |
||
#' require small modifications. | ||
The original WHO metadata table has a height/length increment of 0.5, but the metadata in the admiralpeds package has an increment of 0.1. How was it extrapolated? Do we have any documentation on that?
https://www.who.int/tools/child-growth-standards/standards/weight-for-length-height scroll to the bottom on the |
||
#' | ||
#' Datasets `who_wt_for_lgth_boys`/`who_wt_for_lgth_girls` are for subjects aged < 730.5 days. | ||
#' Datasets `who_wt_for_ht_boys`/`who_wt_for_ht_girls` are for subjects aged >= 730.5 days. | ||
#' | ||
#' * `HEIGHT_LENGTH` - Height/Length | ||
#' * `HEIGHT_LENGTHU` - Height/Length Unit | ||
#' * `SEX` - Sex | ||
#' * `L` - Power in the Box-Cox transformation to normality | ||
#' * `M` - Median | ||
#' * `S` - Coefficient of variation | ||
#' | ||
#' @param parameter Anthropometric measurement parameter to calculate z-score or percentile | ||
Shall we include a reminder that the expected unit of weight is kg? |
||
#' | ||
#' A condition is expected with the input dataset `VSTESTCD`/`PARAMCD` | ||
#' for which we want growth derivations: | ||
#' | ||
#' e.g. `parameter = VSTESTCD == "WEIGHT"`. | ||
#' | ||
#' WHO metadata for Weight is available in the `admiralpeds` package. | ||
#' Weight measures are expected to be in the unit "kg". | ||
#' | ||
#' @param analysis_var Variable containing anthropometric measurement | ||
#' | ||
#' A numeric vector is expected, e.g. `AVAL`, `VSSTRESN` | ||
#' | ||
#' @param set_values_to_sds Variables to be set for Z-Scores | ||
#' | ||
#' The specified variables are set to the specified values for the new | ||
#' observations. For example, | ||
#' `set_values_to_sds = exprs(PARAMCD = "WTASDS", PARAM = "Weight-for-height z-score")` | ||
#' defines the parameter code and parameter. | ||
#' | ||
#' *Permitted Values*: List of variable-value pairs | ||
#' | ||
#' If left as the default value, `NULL`, the z-score parameter is not derived in the output dataset. | ||
#' | ||
#' @param set_values_to_pctl Variables to be set for Percentile | ||
#' | ||
#' The specified variables are set to the specified values for the new | ||
#' observations. For example, | ||
#' `set_values_to_pctl = exprs(PARAMCD = "WTHPCTL", PARAM = "Weight-for-height percentile")` | ||
#' defines the parameter code and parameter. | ||
#' | ||
#' *Permitted Values*: List of variable-value pairs | ||
#' | ||
#' If left as the default value, `NULL`, the percentile parameter is not derived in the output dataset. | ||
#' | ||
#' @return The input dataset with additional records for the new parameter(s) added. | ||
#' | ||
#' | ||
#' @family der_prm_bds_vs | ||
#' | ||
#' @keywords der_prm_bds_vs | ||
#' | ||
#' @export | ||
#' | ||
#' @examples | ||
#' library(dplyr) | ||
#' library(lubridate) | ||
#' library(rlang) | ||
#' library(admiral) | ||
#' | ||
#' advs <- dm_peds %>% | ||
Can we add comments to this example please? It's quite a lot for users to follow, with so much pre-processing before we even get to the example function call. |
||
#' select(USUBJID, BRTHDTC, SEX) %>% | ||
#' right_join(., vs_peds, by = "USUBJID") %>% | ||
#' mutate( | ||
#' VSDT = ymd(VSDTC), | ||
#' BRTHDT = ymd(BRTHDTC) | ||
#' ) %>% | ||
#' derive_vars_duration( | ||
#' new_var = AAGECUR, | ||
#' new_var_unit = AAGECURU, | ||
#' start_date = BRTHDT, | ||
#' end_date = VSDT, | ||
#' out_unit = "days" | ||
#' ) | ||
#' | ||
#' heights <- vs_peds %>% | ||
#' filter(VSTESTCD == "HEIGHT") %>% | ||
#' select(USUBJID, VSSTRESN, VSSTRESU, VSDTC) %>% | ||
#' rename( | ||
#' HGTTMP = VSSTRESN, | ||
#' HGTTMPU = VSSTRESU | ||
#' ) | ||
#' | ||
#' advs <- advs %>% | ||
#' right_join(., heights, by = c("USUBJID", "VSDTC")) | ||
#' | ||
#' advs_under2 <- advs %>% | ||
#' filter(AAGECUR < 730.5) | ||
#' | ||
#' advs_over2 <- advs %>% | ||
#' filter(AAGECUR >= 730.5) | ||
#' | ||
#' who_under2 <- bind_rows( | ||
#' (admiralpeds::who_wt_for_lgth_boys %>% | ||
#' mutate( | ||
#' SEX = "M", | ||
#' height_unit = "cm" | ||
#' ) | ||
#' ), | ||
#' (admiralpeds::who_wt_for_lgth_girls %>% | ||
#' mutate( | ||
#' SEX = "F", | ||
#' height_unit = "cm" | ||
#' ) | ||
#' ) | ||
#' ) %>% | ||
#' rename( | ||
#' HEIGHT_LENGTH = Length, | ||
#' HEIGHT_LENGTHU = height_unit | ||
#' ) | ||
#' | ||
#' who_over2 <- bind_rows( | ||
#' (admiralpeds::who_wt_for_ht_boys %>% | ||
#' mutate( | ||
#' SEX = "M", | ||
#' height_unit = "cm" | ||
#' ) | ||
#' ), | ||
#' (admiralpeds::who_wt_for_ht_girls %>% | ||
#' mutate( | ||
#' SEX = "F", | ||
#' height_unit = "cm" | ||
#' ) | ||
#' ) | ||
#' ) %>% | ||
#' rename( | ||
#' HEIGHT_LENGTH = Height, | ||
#' HEIGHT_LENGTHU = height_unit | ||
#' ) | ||
#' | ||
#' | ||
#' advs_under2 <- derive_params_growth_height( | ||
#' advs_under2, | ||
#' sex = SEX, | ||
#' height = HGTTMP, | ||
#' height_unit = HGTTMPU, | ||
#' meta_criteria = who_under2, | ||
#' parameter = VSTESTCD == "WEIGHT", | ||
#' analysis_var = VSSTRESN, | ||
#' set_values_to_sds = exprs( | ||
#' PARAMCD = "WGHSDS", | ||
#' PARAM = "Weight-for-height z-score" | ||
#' ), | ||
#' set_values_to_pctl = exprs( | ||
#' PARAMCD = "WGHPCTL", | ||
#' PARAM = "Weight-for-height percentile" | ||
#' ) | ||
#' ) | ||
#' | ||
#' advs_over2 <- derive_params_growth_height( | ||
#' advs_over2, | ||
#' sex = SEX, | ||
#' height = HGTTMP, | ||
#' height_unit = HGTTMPU, | ||
#' meta_criteria = who_over2, | ||
#' parameter = VSTESTCD == "WEIGHT", | ||
#' analysis_var = VSSTRESN, | ||
#' set_values_to_sds = exprs( | ||
#' PARAMCD = "WGHSDS", | ||
#' PARAM = "Weight-for-height z-score" | ||
#' ), | ||
#' set_values_to_pctl = exprs( | ||
#' PARAMCD = "WGHPCTL", | ||
#' PARAM = "Weight-for-height percentile" | ||
#' ) | ||
#' ) | ||
Shall we include codes to bind rows of
Makes sense just for the sake of completing the example |
||
#' | ||
#' bind_rows(advs_under2, advs_over2) | ||
derive_params_growth_height <- function(dataset, | ||
sex, | ||
height, | ||
height_unit, | ||
meta_criteria, | ||
parameter, | ||
analysis_var, | ||
set_values_to_sds = NULL, | ||
set_values_to_pctl = NULL) { | ||
# Apply assertions to each argument to ensure each object is of the appropriate class | ||
sex <- assert_symbol(enexpr(sex)) | ||
height <- assert_symbol(enexpr(height)) | ||
height_unit <- assert_symbol(enexpr(height_unit)) | ||
analysis_var <- assert_symbol(enexpr(analysis_var)) | ||
assert_data_frame(dataset, required_vars = expr_c(sex, height, height_unit, analysis_var)) | ||
assert_data_frame(meta_criteria, required_vars = exprs(SEX, HEIGHT_LENGTH, HEIGHT_LENGTHU, L, M, S)) # nolint | ||
|
||
assert_expr(enexpr(parameter)) | ||
assert_varval_list(set_values_to_sds, optional = TRUE) | ||
assert_varval_list(set_values_to_pctl, optional = TRUE) | ||
|
||
if (is.null(set_values_to_sds) && is.null(set_values_to_pctl)) { | ||
abort("One of `set_values_to_sds`/`set_values_to_pctl` has to be specified.") | ||
} | ||
|
||
# create a unified join naming convention, hard to figure out in by argument | ||
dataset <- dataset %>% | ||
mutate( | ||
sex_join := {{ sex }}, | ||
heightu_join := {{ height_unit }} | ||
) | ||
|
||
# Process metadata | ||
# Metadata should contain SEX, HEIGHT_LENGTH, HEIGHT_LENGTHU, L, M, S | ||
# Processing the data to be compatible with the dataset object | ||
processed_md <- meta_criteria %>% | ||
arrange(SEX, HEIGHT_LENGTHU, HEIGHT_LENGTH) %>% | ||
group_by(SEX, HEIGHT_LENGTHU) %>% | ||
mutate(next_height = lead(HEIGHT_LENGTH)) %>% | ||
rename( | ||
sex_join = SEX, | ||
prev_height = HEIGHT_LENGTH, | ||
heightu_join = HEIGHT_LENGTHU | ||
) | ||
|
||
# Merge the dataset that contains the vs records and filter the L/M/S that match height | ||
# To select the matching height band, create [x, y) intervals using prev_height <= height < next_height | ||
added_records <- dataset %>% | ||
filter(!!enexpr(parameter)) %>% | ||
left_join(., | ||
processed_md, | ||
by = c("sex_join", "heightu_join"), | ||
relationship = "many-to-many" | ||
) %>% | ||
filter(prev_height <= {{ height }} & {{ height }} < next_height) | ||
|
||
dataset_final <- dataset | ||
|
||
# create separate record objects depending on whether the user specified sds and/or pctl | ||
if (!is_empty(set_values_to_sds)) { | ||
add_sds <- added_records %>% | ||
mutate( | ||
AVAL := (({{ analysis_var }} / M)^L - 1) / (L * S), # nolint | ||
!!!set_values_to_sds | ||
) | ||
|
||
dataset_final <- bind_rows(dataset, add_sds) %>% | ||
select(-c(L, M, S, sex_join, heightu_join, prev_height, next_height)) | ||
} | ||
|
||
if (!is_empty(set_values_to_pctl)) { | ||
add_pctl <- added_records %>% | ||
mutate( | ||
AVAL := (({{ analysis_var }} / M)^L - 1) / (L * S), # nolint | ||
AVAL = pnorm(AVAL) * 100, | ||
!!!set_values_to_pctl | ||
) | ||
|
||
dataset_final <- bind_rows(dataset_final, add_pctl) %>% | ||
select(-c(L, M, S, sex_join, heightu_join, prev_height, next_height)) | ||
} | ||
|
||
return(dataset_final) | ||
} |
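The height-band lookup and LMS transformation used in the function above can be sketched in base R as follows. This is only an illustration under stated assumptions: `lms_zscore()` and the `meta` values are hypothetical (the L/M/S numbers are made up, not WHO data), `findInterval()` stands in for the join-and-filter step, and unmatched heights return `NA` here, whereas the function in this diff drops those records.

```r
# Hypothetical LMS reference rows (illustrative values only, NOT WHO data)
meta <- data.frame(
  HEIGHT_LENGTH = c(65.0, 65.5, 66.0),
  L = c(-0.35, -0.35, -0.35),
  M = c(7.4, 7.6, 7.7),
  S = c(0.08, 0.08, 0.08)
)

# Hypothetical helper: z-score for `value` (e.g. weight in kg) given `height` in cm
lms_zscore <- function(value, height, meta) {
  # Index i of the band with HEIGHT_LENGTH[i] <= height < HEIGHT_LENGTH[i + 1]
  idx <- findInterval(height, meta$HEIGHT_LENGTH)
  # Below the first row, or at/above the last row, there is no [x, y) band
  idx[idx == 0 | idx >= nrow(meta)] <- NA
  row <- meta[idx, ]
  # LMS transformation (this form assumes L != 0)
  ((value / row$M)^row$L - 1) / (row$L * row$S)
}

z <- lms_zscore(value = 7.6, height = 65.7, meta = meta)
pctl <- pnorm(z) * 100
```

With these made-up rows, a height of 65.7 falls in the band starting at 65.5, and a value equal to that band's median `M` gives a z-score of 0 and a percentile of 50.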
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
ADSL | ||
ADaM | ||
ADaMs | ||
Anthropometric | ||
|
Missing parameters from the issue: age, age_unit, height_age.
These are all needed, as we need to know at which age to assume height instead of body length, but they're optional: if only ever length or height is used, then leave this NULL and just feed in only the corresponding length or height metadata (instead of the combined version which has both).
I agree the measure argument in the issue is not needed, as we can add the height temp var to the input dataset.
This is so that you use the right metadata from WHO depending on patient age and which way they were likely measuring (height or body length). it'd be good in the examples to use height_age = 730.5 days i.e. 2 years (even as default from what David explained to us?). See https://github.com/pharmaverse/admiralpeds/blob/35_advs_vignette/vignettes/advs.Rmd from line 229 for further explanation.
@rossfarrugia I think that's why in the example I actually split the dataset into over 2 and under 2 and just ran the function twice, otherwise you would need to create some sort of additional joining variable on both sides, dataset and metadata, which involves additional pre-processing to both datasets, while adding complexity to the function too, the "modularity" of running it twice felt more intuitive to me
I'm open to this adoption with additional arguments, but I wonder what other programmers would think
@zdz2101 i'm open to your approach - it does give the user complete control still. we'll just need to well comment this to explain our approach in your roxygen2 function documentation example (e.g. at the end you should add a comment to explain that the 2 resulting dataframes would need to be set back together to get the complete ADVS for this parameter) and also we'll need to explain well in our template comments and our vignette.
@Fanny-Gautier @Lina2689 what do you think? as the template authors i would trust your advice here as you'll have a good read on what makes this least complex for users.
In the template we selected the records from the metadata where MEASURE = "LENGTH" for patients < 730.5 days, and MEASURE = "HEIGHT" for patients >= 730.5 days. As Zelos mentioned, it will require additional variables to merge the right data depending on the age.
We also added a message in the ADVS peds template for the same purpose:
message("To derive height/length parameters, below function needs to call separately for Height and Length based on the input data and current age of the patient, as it depends on your CRF guidelines.")
I think it is easier to split it, because if the user has only HEIGHT then there is only one call, and similarly for LENGTH. But if the user has both LENGTH and HEIGHT in the data, it will complicate the merge. I think the user has more flexibility when splitting the derivation.
makes sense Fanny, sounds like we have a plan then! and thanks for the other comments here too - looks like me and you picked up similar spots.
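The split agreed on in this thread can be sketched in base R as below. This is a minimal illustration, not code from the PR: `who_measure_for_age()` and the toy `advs` rows are hypothetical, and the 730.5-day cut-off is taken from the comments above. Each resulting subset would then be passed to the function separately with the matching length or height metadata, and the outputs bound back together.

```r
# Hypothetical helper: label each record with the WHO table family implied by age
who_measure_for_age <- function(age_days, height_age = 730.5) {
  ifelse(age_days < height_age, "LENGTH", "HEIGHT")
}

# Toy records (made-up subjects and ages in days)
advs <- data.frame(
  USUBJID = c("01", "01", "02", "02"),
  AAGECUR = c(100, 730, 731, 1200)
)

advs$MEASURE <- who_measure_for_age(advs$AAGECUR)

# One data frame per measure; a subject aged exactly 730 days stays in LENGTH
advs_split <- split(advs, advs$MEASURE)
```

Using a single threshold for both sides of the split avoids dropping records that fall between two different cut-offs.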