Closes #34 function for growth parameters for height/length #45

Merged: 22 commits, Jun 17, 2024
1 change: 1 addition & 0 deletions NAMESPACE
@@ -1,6 +1,7 @@
# Generated by roxygen2: do not edit by hand

export(derive_params_growth_age)
export(derive_params_growth_height)
importFrom(admiral,derive_vars_dy)
importFrom(admiraldev,assert_character_scalar)
importFrom(admiraldev,assert_character_vector)
289 changes: 289 additions & 0 deletions R/derive_params_growth_height.R
@@ -0,0 +1,289 @@
#' Derive Anthropometric indicators (Z-Scores/Percentiles-for-Height/Length)
#' based on Standard Growth Charts
#'
#' Derive Anthropometric indicators (Z-Scores/Percentiles-for-Height/Length)
#' based on Standard Growth Charts for Weight by Height/Length
#'
#' @param dataset Input dataset
Collaborator

Missing parameters from the issue: `age`, `age_unit`, `height_age`.
These are needed because we have to know at which age to assume height instead of body length. They can be optional, though: if only length or only height is ever used, leave them `NULL` and feed in only the corresponding by-length or by-height metadata (instead of the combined version, which has both).

I agree the `measure` argument from the issue is not needed, as we can add the temporary height variable to the input dataset.

Collaborator

This is so that you use the right WHO metadata depending on the patient's age and how they were likely measured (height or body length). It'd be good for the examples to use `height_age = 730.5` days, i.e. 2 years (maybe even as the default, from what David explained to us?). See https://github.com/pharmaverse/admiralpeds/blob/35_advs_vignette/vignettes/advs.Rmd from line 229 for further explanation.

Collaborator Author

@rossfarrugia I think that's why in the example I split the dataset into over 2 and under 2 and just ran the function twice. Otherwise you would need to create some sort of additional joining variable on both sides, dataset and metadata, which involves additional pre-processing of both datasets and adds complexity to the function too. The "modularity" of running it twice felt more intuitive to me.

I'm open to adopting the additional arguments, but I wonder what other programmers would think.

Collaborator
@rossfarrugia, Jun 6, 2024

@zdz2101 I'm open to your approach - it still gives the user complete control. We'll just need to comment this well to explain our approach in the example of your roxygen2 function documentation (e.g. at the end, add a comment explaining that the 2 resulting data frames need to be set back together to get the complete ADVS for this parameter), and we'll also need to explain it well in our template comments and our vignette.

@Fanny-Gautier @Lina2689 what do you think? As the template authors, I would trust your advice here, as you'll have a good read on what makes this least complex for users.

Contributor

In the template we selected the metadata records where MEASURE = "LENGTH" for patients < 730.5 days, and MEASURE = "HEIGHT" for patients >= 730.5 days. As Zelos mentioned, merging the right data depending on age would require additional variables.
We also added a message to the ADVS peds template to the same effect: message("To derive height/length parameters, below function needs to call separately for Height and Length based on the input data and current age of the patient, as it depends on your CRF guidelines.").
I think it is easier to split: if the user has only HEIGHT then there is only one call, and similarly for LENGTH. But if the user has both LENGTH and HEIGHT in the data, a single call would complicate the merge. I think splitting the derivation gives the user more flexibility.
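
A minimal sketch of the split-then-recombine approach described in this thread, written in base R to stay dependency-free. The toy data frame, its values, and the helper `derive_stub()` are hypothetical stand-ins (the real call would be `derive_params_growth_height()` with the matching WHO metadata); only the 730.5-day cut-off and the LENGTH/HEIGHT rule come from the discussion.

```r
# Hypothetical toy data: one row per subject with current age in days.
advs <- data.frame(
  USUBJID = c("P01", "P02", "P03", "P04"),
  AAGECUR = c(200, 730, 731, 1500)
)

# Cut-off taken from the discussion: 730.5 days (2 years).
height_age <- 730.5

# Split so that every record falls into exactly one group.
advs_under2 <- advs[advs$AAGECUR < height_age, ]
advs_over2 <- advs[advs$AAGECUR >= height_age, ]

# Stand-in for the real derivation (assumption): it only tags which
# metadata would be used for each group.
derive_stub <- function(dataset, measure) {
  dataset$MEASURE <- rep(measure, nrow(dataset))
  dataset
}

advs_under2 <- derive_stub(advs_under2, "LENGTH")
advs_over2 <- derive_stub(advs_over2, "HEIGHT")

# Set the two pieces back together to get the complete dataset.
advs_all <- rbind(advs_under2, advs_over2)
```

Using the same threshold in both filters matters: a pair such as `< 730` and `>= 730.5` would silently drop a subject aged exactly 730 days.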

Collaborator

makes sense Fanny, sounds like we have a plan then! and thanks for the other comments here too - looks like me and you picked up similar spots.

#'
#' The variables specified in `sex`, `height`, `height_unit`, `parameter`, `analysis_var`
#' are expected to be in the dataset.
#'
#' @param sex Sex
#'
#' A character vector is expected.
#'
#' Expected Values: `M`, `F`
#'
#' @param height Current Height/length
#'
#' A numeric vector is expected. Note that this is the actual height at the current visit.
#'
#' @param height_unit Height/Length Unit
#'
#' A character vector is expected.
#'
#' Expected values: `cm`
#'
#' @param meta_criteria Metadata dataset
#'
#' A metadata dataset with the following expected variables:
#' `HEIGHT_LENGTH`, `HEIGHT_LENGTHU`, `SEX`, `L`, `M`, `S`
#'
#' The dataset can be derived from WHO or user-defined datasets.
#' The WHO growth chart metadata datasets are available in the package and will
Contributor
@Minlei0201, Jun 11, 2024

Shall we list the names of the metadata datasets and clarify that `who_wt_for_lgth_boys` and `who_wt_for_lgth_girls` are for subjects aged < 730.5 days, and `who_wt_for_ht_boys` and `who_wt_for_ht_girls` are for those aged >= 730.5 days? That way users know which metadata they can use, and also know to apply different metadata based on the subject's age.

Collaborator Author

This is a good idea, but I think we'll need to draw up a bigger vignette/article for metadata creation/preprocessing anyway

Collaborator

Hopefully the template and vignette can cover this in more detail, so the user gets the full context.

#' require small modifications.
Contributor

The original WHO metadata table has a height/length increment of 0.5, but the metadata in the admiralpeds package has an increment of 0.1. How was it extrapolated? Do we have any documentation on that?

Collaborator Author

https://www.who.int/tools/child-growth-standards/standards/weight-for-length-height

Scroll to the bottom: in the expanded tables section you will find that the increments are 0.1.
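
To illustrate why the grid spacing matters: the function in this PR matches a measured height to the metadata row whose interval [prev_height, next_height) contains it. The sketch below reproduces that idea in base R with `findInterval()` on a made-up 0.1 cm grid. The grid and the `M` values are invented for illustration and are not WHO values, and `findInterval()` is a substitute for the `lead()`-based join used in the PR, not the PR's own method.

```r
# Made-up metadata grid in 0.1 cm steps (M values are NOT real WHO numbers).
meta <- data.frame(
  HEIGHT_LENGTH = c(65.0, 65.1, 65.2, 65.3),
  M = c(7.40, 7.42, 7.44, 7.46)
)

# Hypothetical measured heights in cm, chosen to fall between grid points.
height <- c(65.04, 65.27, 65.12)

# findInterval() returns i such that grid[i] <= height < grid[i + 1],
# i.e. the row whose left-closed interval contains the measurement.
idx <- findInterval(height, meta$HEIGHT_LENGTH)

# Attach the matched metadata row to each measurement.
matched <- data.frame(height = height, meta[idx, ])
```

One difference worth keeping in mind: `findInterval()` assigns values at or above the last grid point to the last row, whereas a filter on `height < next_height` drops them because the last row has no next height.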

#'
#' Datasets `who_wt_for_lgth_boys`/`who_wt_for_lgth_girls` are for subjects aged < 730.5 days.
#' Datasets `who_wt_for_ht_boys`/`who_wt_for_ht_girls` are for subjects aged >= 730.5 days.
#'
#' * `HEIGHT_LENGTH` - Height/Length
#' * `HEIGHT_LENGTHU` - Height/Length Unit
#' * `SEX` - Sex
#' * `L` - Power in the Box-Cox transformation to normality
#' * `M` - Median
#' * `S` - Coefficient of variation
#'
#' @param parameter Anthropometric measurement parameter to calculate z-score or percentile
Contributor

Shall we include a reminder that the expected unit of weight is kg?

#'
#' A condition is expected with the input dataset `VSTESTCD`/`PARAMCD`
#' for which we want growth derivations:
#'
#' e.g. `parameter = VSTESTCD == "WEIGHT"`.
#'
#' WHO metadata for weight is available in the `admiralpeds` package.
#' Weight measures are expected to be in the unit "kg".
#'
#' @param analysis_var Variable containing anthropometric measurement
#'
#' A numeric vector is expected, e.g. `AVAL`, `VSSTRESN`
#'
#' @param set_values_to_sds Variables to be set for Z-Scores
#'
#' The specified variables are set to the specified values for the new
#' observations. For example,
#' `set_values_to_sds = exprs(PARAMCD = "WTASDS", PARAM = "Weight-for-height z-score")`
#' defines the parameter code and parameter.
#'
#' *Permitted Values*: List of variable-value pairs
#'
#' If left as the default value, `NULL`, the parameter is not derived in the output dataset.
#'
#' @param set_values_to_pctl Variables to be set for Percentile
#'
#' The specified variables are set to the specified values for the new
#' observations. For example,
#' `set_values_to_pctl = exprs(PARAMCD = "WTHPCTL", PARAM = "Weight-for-height percentile")`
#' defines the parameter code and parameter.
#'
#' *Permitted Values*: List of variable-value pairs
#'
#' If left as the default value, `NULL`, the parameter is not derived in the output dataset.
#'
#' @return The input dataset with additional records for the new parameter(s) added.
#'
#'
#' @family der_prm_bds_vs
#'
#' @keywords der_prm_bds_vs
#'
#' @export
#'
#' @examples
#' library(dplyr)
#' library(lubridate)
#' library(rlang)
#' library(admiral)
#'
#' advs <- dm_peds %>%
Collaborator

Can we add comments to this example please? It's quite a lot for users to follow, with so much pre-processing before we even get to the example function call.

#'   select(USUBJID, BRTHDTC, SEX) %>%
#'   right_join(., vs_peds, by = "USUBJID") %>%
#'   mutate(
#'     VSDT = ymd(VSDTC),
#'     BRTHDT = ymd(BRTHDTC)
#'   ) %>%
#'   derive_vars_duration(
Collaborator

`trunc_out = FALSE` can be removed as it's the default for this function anyway - to simplify the example.

#'     new_var = AAGECUR,
#'     new_var_unit = AAGECURU,
#'     start_date = BRTHDT,
#'     end_date = VSDT,
#'     out_unit = "days"
#'   )
#'
#' heights <- vs_peds %>%
#'   filter(VSTESTCD == "HEIGHT") %>%
#'   select(USUBJID, VSSTRESN, VSSTRESU, VSDTC) %>%
#'   rename(
#'     HGTTMP = VSSTRESN,
#'     HGTTMPU = VSSTRESU
#'   )
#'
#' advs <- advs %>%
#'   right_join(., heights, by = c("USUBJID", "VSDTC"))
#'
#' advs_under2 <- advs %>%
#'   filter(AAGECUR < 730.5)
#'
#' advs_over2 <- advs %>%
#'   filter(AAGECUR >= 730.5)
#'
#' who_under2 <- bind_rows(
#'   (admiralpeds::who_wt_for_lgth_boys %>%
#'     mutate(
#'       SEX = "M",
#'       height_unit = "cm"
#'     )
#'   ),
#'   (admiralpeds::who_wt_for_lgth_girls %>%
#'     mutate(
#'       SEX = "F",
#'       height_unit = "cm"
#'     )
#'   )
#' ) %>%
#'   rename(
#'     HEIGHT_LENGTH = Length,
#'     HEIGHT_LENGTHU = height_unit
#'   )
#'
#' who_over2 <- bind_rows(
#'   (admiralpeds::who_wt_for_ht_boys %>%
#'     mutate(
#'       SEX = "M",
#'       height_unit = "cm"
#'     )
#'   ),
#'   (admiralpeds::who_wt_for_ht_girls %>%
#'     mutate(
#'       SEX = "F",
#'       height_unit = "cm"
#'     )
#'   )
#' ) %>%
#'   rename(
#'     HEIGHT_LENGTH = Height,
#'     HEIGHT_LENGTHU = height_unit
#'   )
#'
#'
#' advs_under2 <- derive_params_growth_height(
#'   advs_under2,
#'   sex = SEX,
#'   height = HGTTMP,
#'   height_unit = HGTTMPU,
#'   meta_criteria = who_under2,
#'   parameter = VSTESTCD == "WEIGHT",
#'   analysis_var = VSSTRESN,
#'   set_values_to_sds = exprs(
#'     PARAMCD = "WGHSDS",
#'     PARAM = "Weight-for-height z-score"
#'   ),
#'   set_values_to_pctl = exprs(
#'     PARAMCD = "WGHPCTL",
#'     PARAM = "Weight-for-height percentile"
#'   )
#' )
#'
#' advs_over2 <- derive_params_growth_height(
#'   advs_over2,
#'   sex = SEX,
#'   height = HGTTMP,
#'   height_unit = HGTTMPU,
#'   meta_criteria = who_over2,
#'   parameter = VSTESTCD == "WEIGHT",
#'   analysis_var = VSSTRESN,
#'   set_values_to_sds = exprs(
#'     PARAMCD = "WGHSDS",
#'     PARAM = "Weight-for-height z-score"
#'   ),
#'   set_values_to_pctl = exprs(
#'     PARAMCD = "WGHPCTL",
#'     PARAM = "Weight-for-height percentile"
#'   )
#' )
Contributor

Shall we include code to bind the rows of `advs_under2` and `advs_over2`?

Collaborator Author

Makes sense just for the sake of completing the example

#'
#' bind_rows(advs_under2, advs_over2)
derive_params_growth_height <- function(dataset,
                                        sex,
                                        height,
                                        height_unit,
                                        meta_criteria,
                                        parameter,
                                        analysis_var,
                                        set_values_to_sds = NULL,
                                        set_values_to_pctl = NULL) {
  # Apply assertions to each argument to ensure each object is of the appropriate class
  sex <- assert_symbol(enexpr(sex))
  height <- assert_symbol(enexpr(height))
  height_unit <- assert_symbol(enexpr(height_unit))
  analysis_var <- assert_symbol(enexpr(analysis_var))
  assert_data_frame(dataset, required_vars = expr_c(sex, height, height_unit, analysis_var))
  assert_data_frame(meta_criteria, required_vars = exprs(SEX, HEIGHT_LENGTH, HEIGHT_LENGTHU, L, M, S)) # nolint

  assert_expr(enexpr(parameter))
  assert_varval_list(set_values_to_sds, optional = TRUE)
  assert_varval_list(set_values_to_pctl, optional = TRUE)

  if (is.null(set_values_to_sds) && is.null(set_values_to_pctl)) {
    abort("One of `set_values_to_sds`/`set_values_to_pctl` has to be specified.")
  }

  # Create unified join variable names, as the user-supplied names are
  # awkward to pass to the `by` argument
  dataset <- dataset %>%
    mutate(
      sex_join := {{ sex }},
      heightu_join := {{ height_unit }}
    )

  # Process metadata
  # Metadata should contain SEX, HEIGHT_LENGTH, HEIGHT_LENGTHU, L, M, S
  # Processing the data to be compatible with the dataset object
  processed_md <- meta_criteria %>%
    arrange(SEX, HEIGHT_LENGTHU, HEIGHT_LENGTH) %>%
    group_by(SEX, HEIGHT_LENGTHU) %>%
    mutate(next_height = lead(HEIGHT_LENGTH)) %>%
    rename(
      sex_join = SEX,
      prev_height = HEIGHT_LENGTH,
      heightu_join = HEIGHT_LENGTHU
    )

  # Merge the dataset that contains the vs records and filter the L/M/S that match height
  # To pick the appropriate height interval, create [x, y) using prev_height <= height < next_height
  added_records <- dataset %>%
    filter(!!enexpr(parameter)) %>%
    left_join(.,
      processed_md,
      by = c("sex_join", "heightu_join"),
      relationship = "many-to-many"
    ) %>%
    filter(prev_height <= {{ height }} & {{ height }} < next_height)

  dataset_final <- dataset

  # Create separate record objects depending on whether the user specified sds and/or pctl
  if (!is_empty(set_values_to_sds)) {
    add_sds <- added_records %>%
      mutate(
        AVAL := (({{ analysis_var }} / M)^L - 1) / (L * S), # nolint
        !!!set_values_to_sds
      )

    dataset_final <- bind_rows(dataset, add_sds) %>%
      select(-c(L, M, S, sex_join, heightu_join, prev_height, next_height))
  }

  if (!is_empty(set_values_to_pctl)) {
    add_pctl <- added_records %>%
      mutate(
        AVAL := (({{ analysis_var }} / M)^L - 1) / (L * S), # nolint
        AVAL = pnorm(AVAL) * 100,
        !!!set_values_to_pctl
      )

    dataset_final <- bind_rows(dataset_final, add_pctl) %>%
      select(-c(L, M, S, sex_join, heightu_join, prev_height, next_height))
  }

  return(dataset_final)
}
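
As a worked illustration of the LMS calculation used in the function above, the sketch below computes a z-score and a percentile for a single measurement in base R. The weight and the L, M, S values are made up for illustration and are not taken from the WHO tables, and `lms_zscore()` is a hypothetical helper, not part of `admiralpeds`. It also covers L = 0 with the usual logarithmic form of the LMS method, a case the formula in the function does not handle and which only matters if a metadata row has L = 0.

```r
# Hypothetical helper (not part of admiralpeds): z-score from LMS parameters.
lms_zscore <- function(y, L, M, S) {
  ifelse(
    L == 0,
    log(y / M) / S,            # limiting case of the Box-Cox form when L = 0
    ((y / M)^L - 1) / (L * S)  # same expression as used for AVAL in the function
  )
}

# Made-up inputs (NOT real WHO values): weight in kg and LMS parameters.
weight <- 8.0
L <- -0.35
M <- 7.4
S <- 0.09

# Z-score, then percentile via the standard normal CDF.
z <- lms_zscore(weight, L, M, S)
pctl <- pnorm(z) * 100
```

A measurement equal to the median (`y == M`) gives a z-score of 0 and therefore the 50th percentile, which is a quick sanity check for any metadata row.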
1 change: 1 addition & 0 deletions inst/WORDLIST
@@ -1,3 +1,4 @@
ADSL
ADaM
ADaMs
Anthropometric
4 changes: 4 additions & 0 deletions man/derive_params_growth_age.Rd
