Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closes #2427 general issue adopt data and data raw conventions #2494

Conversation

jimrothstein
Copy link
Collaborator

@jimrothstein jimrothstein commented Aug 8, 2024

Thank you for your Pull Request! We have developed this task checklist from the Development Process Guide to help with the final steps of the process. Completing the below tasks helps to ensure our reviewers can maximize their time on your code as well as making sure the admiral codebase remains robust and consistent.

Please check off each taskbox as an acknowledgment that you completed the task or check off that it is not relevant to your Pull Request. This checklist is part of the Github Action workflows and the Pull Request will not be merged into the main branch until you have checked off each task.

  • Place Closes #<insert_issue_number> into the beginning of your Pull Request Title (Use Edit button in top-right if you need to update)
  • Code is formatted according to the tidyverse style guide. Run styler::style_file() to style R and Rmd files
  • Updated relevant unit tests or have written new unit tests, which should consider realistic data scenarios and edge cases, e.g. empty datasets, errors, boundary cases etc. - See Unit Test Guide
  • If you removed/replaced any function and/or function parameters, did you fully follow the deprecation guidance?
  • Review the Cheat Sheet. Make any required updates to it by editing the file inst/cheatsheet/admiral_cheatsheet.pptx and re-upload a PDF and a PNG version of it to the same folder. (The PNG version can be created by taking a screenshot of the PDF version.)
  • Update to all relevant roxygen headers and examples, including keywords and families. Refer to the categorization of functions to tag appropriate keyword/family.
  • Run devtools::document() so all .Rd files in the man folder and the NAMESPACE file in the project root are updated appropriately
  • Address any updates needed for vignettes and/or templates
  • Update NEWS.md under the header # admiral (development version) if the changes pertain to a user-facing function (i.e. it has an @export tag) or documentation aimed at users (rather than developers). A Developer Notes section is available in NEWS.md for tracking developer-facing issues.
  • Build admiral site pkgdown::build_site() and check that all affected examples are displayed correctly and that all new functions occur on the "Reference" page.
  • Address or fix all lintr warnings and errors - lintr::lint_package()
  • Run R CMD check locally and address all errors and warnings - devtools::check()
  • Link the issue in the Development Section on the right hand side.
  • Address all merge conflicts and resolve appropriately
  • Pat yourself on the back for a job well done! Much love to your accomplishment!

@jimrothstein jimrothstein marked this pull request as draft August 8, 2024 22:51
@jimrothstein
Copy link
Collaborator Author

jimrothstein commented Aug 8, 2024

This is a DRAFT

GOAL:

create data/*.rda files following pharamaversesdtm convention

QUESTION:

Is this draft PR an acceptable approach?

What I did:

Create: example_qs.rda in TWO WAYS:

  • the original method and
  • newer convention (as pharamaversesdtm), scripts in data-raw/*.R

Pls. lookover new code: data-raw/example_qs_new.R

And tested in data-raw/test.R ; this checks that created *.rda files are IDENTICAL
(test.R to be removed)

Thx.

@manciniedoardo
Copy link
Collaborator

HI @jimrothstein is this ready for review? I see some comments but also see it's still marked draft.

@jimrothstein
Copy link
Collaborator Author

"ready for review? "

review = NO

@manciniedoardo

To address this issue #2494 (data conventions), I am proposing using a code similar to this draft.
Before I do more, or prepare true PR, I hope for thumbs up for thumbs down or discussion.

Goal: Create data/example_qs.rda

Replace current code: inst/example_scripts/example_qs.R
With example code: data-raw/example_qs.R

For future, how should I communicate or label these kinds of pre-PR questions about proposed code/proposed approach?

Thx

@manciniedoardo
Copy link
Collaborator

@jimrothstein sorry I had misunderstood. The second way (using data_raw) looks good to me. @pharmaverse/admiral any other thoughts before Jim prepares a PR?

@jimrothstein
Copy link
Collaborator Author

jimrothstein commented Aug 17, 2024

@manciniedoardo DRAFT

GOAL

Replace: inst/example_scripts/example_qs.R
With: data-raw/example_qs.R

To create: data/example_qs.rda

Note: Once this PR is successful:

  • Will do several other datasets similarly.

Still to do for THIS PR

Except for a few comments (to be removed or modified), this will become a PR to create *data/example_qs.rda"

Not sure:

  • Need to test the example data?
  • Remove old code in inst/example_scripts/example_qs.R ?
  • Not sure Proper way to document this change?

@bms63
Copy link
Collaborator

bms63 commented Aug 20, 2024

HI @jimrothstein this looks good so far and creating a draft and asking for feedback seems appropriate to me.

I would just make sure that the example data where it is used doesn't change anything. I'm not super familiar with where this is being used.

Also - looking over at the inst folder - I see some other data creation stuff, e.g. adlb_grading. I am wondering if that should be moved over as well.

Anyone (@pharmaverse/admiral ) know what this script is for? https://github.com/pharmaverse/admiral/blob/main/inst/example_scripts/derive_single_dose.R

@bundfussr
Copy link
Collaborator

Anyone (@pharmaverse/admiral ) know what this script is for? https://github.com/pharmaverse/admiral/blob/main/inst/example_scripts/derive_single_dose.R

It creates the ex_single dataset, which is used in the OCCDS vignette.

@jimrothstein
Copy link
Collaborator Author

jimrothstein commented Aug 22, 2024

Plan (Updated 22 AUG 2024)

To verify that created data files (*.rda) are IDENTICAL to those before any changes.
SITUATION NOW:

  1. data/.rda* Path to created data.
  2. inst/example_scripts/*.R Path to *.R files, create *.rda files.

AFTER:

  1. (NO CHANGE) data/*.rda Path to original data.
  2. (NEW) data-raw/*.R Path to new code.
  3. (REMOVE?) inst/example_scripts/*.R Path to old code.

To effect this, propose these intermediate steps:

  1. (Temporary) Add: old_data/ folder to hold copy of *.rda files.
  2. Empty: data/ .
  3. Run: new code (data-raw/*.R) New data saved in data/*.rda
  4. Run: **test.R (**or equivalent as testthat) verify old and new code produces IDENTICAL *.rda files.
  5. Pause for review/discuss

As a final step

  1. Remove folder and old code inst/example_scripts/*.R
  2. Remove old_data/*.rda (copies of orginal data)
  3. Remove test.R,
  4. PR

As initial run through:

  1. Do this for example_qs.rda ONLY
  2. Pause for review/discuss.
  3. Then to for *all remaining .rda files

How to document this?

@bms63
Copy link
Collaborator

bms63 commented Aug 22, 2024

Hey Jim - You can play fast and loose with this. The actions we have set up will crash out with all our tests and examples if things are not set up correctly with the example data.

Everything is backed up as well - so if something goes awry we can also revert. Thanks for taking this on!!

@jimrothstein
Copy link
Collaborator Author

jimrothstein commented Aug 23, 2024

@bms63 Ready for 1st serious review.

Note:

  • example_qs.rda ONLY
  • documentation requirements? roxygen?
  • test.R in data-raw/ for convenience, remove?
  • note use of environments when using load the data (in test.R)
  • a couple other small things.

@bms63
Copy link
Collaborator

bms63 commented Aug 24, 2024

@bms63 Ready for 1st serious review.

Note:

  • example_qs.rda ONLY

looking good

  • documentation requirements? roxygen?

I think we could benefit from describing the variables, but lets do this in a separate issue and just focus on adopting the data/data-raw convention. Do you mind making one?

  • test.R in data-raw/ for convenience, remove?
    I think this should be removed unless you think this might be handy for others. It should probably have a different name
  • note use of environments when using load the data (in test.R)

noted

  • a couple other small things.

can you fix all the failing actions: code style, linting, etc.

Thanks again!!

@bms63
Copy link
Collaborator

bms63 commented Aug 25, 2024

Since the test.R file is just for testing purposes. Can you just comment out the source parts until we can remove this file? Honestly, it feels like things are pretty good and we can just drop testing for if the two are identical. I'm not really worried about it.

…no more testing. The file data/example_qs.rda will be used, going forward.
@jimrothstein
Copy link
Collaborator Author

jimrothstein commented Aug 25, 2024

@bms63

Removed all the testing files for example_qs.

still failing (URL). I think my local, #2427 issue branch is not up-to-date. Working on it ...

update
My local copy has this url and it is not broken -- I do not see the line github actions is unhappy with.

@bms63
Copy link
Collaborator

bms63 commented Aug 25, 2024

Ah dang looks like we need to roxygenize these files

@jimrothstein
Copy link
Collaborator Author

  1. roxygenize ?

Which files?? In pharamversesdtm, I do not think R files in data-raw/ are documented/roxygenized .
(Example: dm_vaccine.R https://github.com/pharmaverse/pharmaversesdtm/blob/main/data-raw/dm_vaccine.R)

  1. From above (Closes #2427 general issue adopt data and data raw conventions #2494 (comment))

I think we could benefit from describing the variables, but lets do this in a separate issue and just focus on adopting the data/data-raw convention. Do you mind making one?

Something like?

new issue:
Document variables in data/*.rda files ( created by R code in data-raw/)

@bms63
Copy link
Collaborator

bms63 commented Aug 26, 2024

@jimrothstein I haven't looked too closely. But whatever is causing this needs to get fixed. thought it was telling us to use roxygen stuff.
image

@jimrothstein
Copy link
Collaborator Author

jimrothstein commented Aug 26, 2024

lint and styler pass now. 👍
(I learned this: no need to use library(xyz) in data-raw/*.R)

fda URL has me stumped - rest my eyes and take another look, must be something obvious.

Errors in vignettes/hys_law.Rmd

* [https://www.fda.gov/media/116737/download](https://www.fda.gov/media/116737/download): Failed: Network error (status code: 404)

@jimrothstein
Copy link
Collaborator Author

(will re-lint and re-style, but hope otherwise getting closer to goal)

Copy link
Collaborator

@bms63 bms63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you run styler on your code please

@bms63
Copy link
Collaborator

bms63 commented Sep 27, 2024

Styler will fix a lot of these linting issues so then we can discuss the more problematic ones

@bms63
Copy link
Collaborator

bms63 commented Sep 27, 2024

Alright! We are almost there!! Yay!

Ran  devtools::lint()

To be changed to standard form:   CACHE_DIR, DATA_RAW ... once I am sure code is correct.

NO change to .RBuildignore (not sure if consensus)
@jimrothstein
Copy link
Collaborator Author

adsl.rmd vignette code

knitr::purl("adsl.Rmd", output = "output_file.R")

## ----setup, include = FALSE----------------------------------------------------------------------------------------------------------------------------------------
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(admiraldev)


## ----message=FALSE, warning=FALSE----------------------------------------------------------------------------------------------------------------------------------
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(pharmaversesdtm)
library(lubridate)
library(stringr)

dm <- pharmaversesdtm::dm
ds <- pharmaversesdtm::ds
ex <- pharmaversesdtm::ex
ae <- pharmaversesdtm::ae
lb <- pharmaversesdtm::lb

dm <- convert_blanks_to_na(dm)
ds <- convert_blanks_to_na(ds)
ex <- convert_blanks_to_na(ex)
ae <- convert_blanks_to_na(ae)
lb <- convert_blanks_to_na(lb)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- dm %>%
  select(-DOMAIN)


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, RFSTDTC, COUNTRY, AGE, SEX, RACE, ETHNIC, ARM, ACTARM)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- dm %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
# impute start and end time of exposure to first and last respectively,
# do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = exprs(STUDYID, USUBJID)
  )


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = exprs(TRTSDTM, TRTEDTM))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_var_trtdurd()


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, RFSTDTC, TRTSDTM, TRTSDT, TRTEDTM, TRTEDT, TRTDURD)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  ds_ext,
  display_vars = exprs(USUBJID, DSCAT, DSDECOD, DSTERM, DSSTDT, DSSTDTC),
  filter = DSDECOD != "SCREEN FAILURE"
)


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
format_eosstt <- function(x) {
  case_when(
    x %in% c("COMPLETED") ~ "COMPLETED",
    x %in% c("SCREEN FAILURE") ~ NA_character_,
    TRUE ~ "DISCONTINUED"
  )
}


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(STUDYID, USUBJID),
    filter_add = DSCAT == "DISPOSITION EVENT",
    new_vars = exprs(EOSSTT = format_eosstt(DSDECOD)),
    missing_values = exprs(EOSSTT = "ONGOING")
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT, EOSSTT))


## ------------------------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(USUBJID),
    new_vars = exprs(DCSREAS = DSDECOD, DCSREASP = DSTERM),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      !(DSDECOD %in% c("SCREEN FAILURE", "COMPLETED", NA))
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT, EOSSTT, DCSREAS, DCSREASP))


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  select(-DCSREAS, -DCSREASP)


## ------------------------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(USUBJID),
    new_vars = exprs(DCSREAS = DSDECOD),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      DSDECOD %notin% c("SCREEN FAILURE", "COMPLETED", NA)
  ) %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(USUBJID),
    new_vars = exprs(DCSREASP = DSTERM),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD %in% "OTHER"
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT, EOSSTT, DCSREAS, DCSREASP))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    filter_add = DSDECOD == "RANDOMIZED",
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(RANDDT = DSSTDT)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, RANDDT))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_dt(
    new_vars_prefix = "DTH",
    dtc = DTHDTC
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl %>% filter(!is.na(DTHDT) | row_number() %% 50 == 0), display_vars = exprs(USUBJID, TRTEDT, DTHDTC, DTHDT, DTHFL))


## ----eval=FALSE----------------------------------------------------------------------------------------------------------------------------------------------------
## adsl <- adsl %>%
##   derive_vars_dt(
##     new_vars_prefix = "DTH",
##     dtc = DTHDTC,
##     date_imputation = "first"
##   )


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        condition = AEOUT == "FATAL",
        set_values_to = exprs(DTHCAUS = AEDECOD),
      ),
      event(
        dataset_name = "ds",
        condition = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
        set_values_to = exprs(DTHCAUS = DSTERM),
      )
    ),
    source_datasets = list(ae = ae, ds = ds),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr),
    mode = "first",
    new_vars = exprs(DTHCAUS)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, EOSDT, DTHDTC, DTHDT, DTHCAUS),
  filter = DTHFL == "Y"
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  select(-DTHCAUS) %>% # remove it before deriving it again
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        condition = AEOUT == "FATAL",
        set_values_to = exprs(DTHCAUS = AEDECOD, DTHDOM = "AE", DTHSEQ = AESEQ),
      ),
      event(
        dataset_name = "ds",
        condition = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
        set_values_to = exprs(DTHCAUS = DSTERM, DTHDOM = "DS", DTHSEQ = DSSEQ),
      )
    ),
    source_datasets = list(ae = ae, ds = ds),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr),
    mode = "first",
    new_vars = exprs(DTHCAUS, DTHDOM, DTHSEQ)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, DTHDT, DTHCAUS, DTHDOM, DTHSEQ),
  filter = DTHFL == "Y"
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  mutate(DTHCGR1 = case_when(
    is.na(DTHDOM) ~ NA_character_,
    DTHDOM == "AE" ~ "ADVERSE EVENT",
    str_detect(DTHCAUS, "(PROGRESSIVE DISEASE|DISEASE RELAPSE)") ~ "PROGRESSIVE DISEASE",
    TRUE ~ "OTHER"
  ))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_duration(
    new_var = DTHADY,
    start_date = TRTSDT,
    end_date = DTHDT
  )


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_duration(
    new_var = LDDTHELD,
    start_date = TRTEDT,
    end_date = DTHDT,
    add_one = FALSE
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, DTHDT, DTHCAUS, DTHADY, LDDTHELD),
  filter = DTHFL == "Y"
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        order = exprs(AESTDTC, AESEQ),
        condition = !is.na(AESTDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AESTDTC, highest_imputation = "M"),
          seq = AESEQ
        ),
      ),
      event(
        dataset_name = "ae",
        order = exprs(AEENDTC, AESEQ),
        condition = !is.na(AEENDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AEENDTC, highest_imputation = "M"),
          seq = AESEQ
        ),
      ),
      event(
        dataset_name = "lb",
        order = exprs(LBDTC, LBSEQ),
        condition = !is.na(LBDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(LBDTC, highest_imputation = "M"),
          seq = LBSEQ
        ),
      ),
      event(
        dataset_name = "adsl",
        condition = !is.na(TRTEDT),
        set_values_to = exprs(LSTALVDT = TRTEDT, seq = 0),
      )
    ),
    source_datasets = list(ae = ae, lb = lb, adsl = adsl),
    tmp_event_nr_var = event_nr,
    order = exprs(LSTALVDT, seq, event_nr),
    mode = "last",
    new_vars = exprs(LSTALVDT)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, LSTALVDT),
  filter = !is.na(TRTSDT)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  select(-LSTALVDT) %>% # created in the previous call
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        order = exprs(AESTDTC, AESEQ),
        condition = !is.na(AESTDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AESTDTC, highest_imputation = "M"),
          LALVSEQ = AESEQ,
          LALVDOM = "AE",
          LALVVAR = "AESTDTC"
        ),
      ),
      event(
        dataset_name = "ae",
        order = exprs(AEENDTC, AESEQ),
        condition = !is.na(AEENDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AEENDTC, highest_imputation = "M"),
          LALVSEQ = AESEQ,
          LALVDOM = "AE",
          LALVVAR = "AEENDTC"
        ),
      ),
      event(
        dataset_name = "lb",
        order = exprs(LBDTC, LBSEQ),
        condition = !is.na(LBDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(LBDTC, highest_imputation = "M"),
          LALVSEQ = LBSEQ,
          LALVDOM = "LB",
          LALVVAR = "LBDTC"
        ),
      ),
      event(
        dataset_name = "adsl",
        condition = !is.na(TRTEDT),
        set_values_to = exprs(LSTALVDT = TRTEDT, LALVSEQ = NA_integer_, LALVDOM = "ADSL", LALVVAR = "TRTEDTM"),
      )
    ),
    source_datasets = list(ae = ae, lb = lb, adsl = adsl),
    tmp_event_nr_var = event_nr,
    order = exprs(LSTALVDT, LALVSEQ, event_nr),
    mode = "last",
    new_vars = exprs(LSTALVDT, LALVSEQ, LALVDOM, LALVVAR)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, LSTALVDT, LALVDOM, LALVSEQ, LALVVAR),
  filter = !is.na(TRTSDT)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
format_agegr1 <- function(var_input) {
  case_when(
    var_input < 18 ~ "<18",
    between(var_input, 18, 64) ~ "18-64",
    var_input > 64 ~ ">64",
    TRUE ~ "Missing"
  )
}

format_region1 <- function(var_input) {
  case_when(
    var_input %in% c("CAN", "USA") ~ "North America",
    !is.na(var_input) ~ "Rest of the World",
    TRUE ~ "Missing"
  )
}


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  mutate(
    AGEGR1 = format_agegr1(AGE),
    REGION1 = format_region1(COUNTRY)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, AGE, SEX, COUNTRY, AGEGR1, REGION1)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTSDT, ARM, ACTARM, SAFFL)
)

@bms63
Copy link
Collaborator

bms63 commented Sep 30, 2024

I say we #nolint all the issues present in the lintr check. I'm not super worried about them as these are just helper scripts.

I feel like the admiral adsl was not run with the latest template/vignette scripts updates to them (they look pretty similar). Let's just go with the current script you have put together for admiral_adsl

@pharmaverse/admiral - I think this is getting very close. Any final thoughts.

…o nolint a block of code: # nolint start and # nolint end can be used; (done in this commit)

Also note, possible to turn off  SPECIFIC linters (also done in this commit for "object_name_linter")
@jimrothstein
Copy link
Collaborator Author

Note: This commit uses a few features of package:lintr

…o nolint a block of code: # nolint start and # nolint end can be used; (done in this commit)

Also note, possible to turn off  SPECIFIC linters (also done in this commit for "object_name_linter")
…it reset back before I accidentally pt it into a push?)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need this

@bms63
Copy link
Collaborator

bms63 commented Oct 3, 2024

I'm ready to merge this into main. @pharmaverse/admiral any last thoughts or concerns?

@bms63 bms63 merged commit 9cf826d into pharmaverse:main Oct 3, 2024
18 of 19 checks passed
@bundfussr
Copy link
Collaborator

I'm ready to merge this into main. @pharmaverse/admiral any last thoughts or concerns?

Sorry for the late reply. I was off because the 3rd October is a bank holiday here.

I think the PR shouldn't have been merged. There are a lot of unresolved issues:

  • The create_*() scripts don't run.
  • The scripts include code which is never run.
  • data-raw/admiral_adlb.R was added to the repo but it is never used.
  • The data-backup folder should be removed because it is not necessary but increases the size of the repo (and the scripts do not update the backup anyway).
  • ...

@bms63 , how should we move on?

@bms63
Copy link
Collaborator

bms63 commented Oct 7, 2024

I'm ready to merge this into main. @pharmaverse/admiral any last thoughts or concerns?

Sorry for the late reply. I was off because the 3rd October is a bank holiday here.

I think the PR shouldn't have been merged. There are a lot of unresolved issues:

  • The create_*() scripts don't run.
  • The scripts include code which is never run.
  • data-raw/admiral_adlb.R was added to the repo but it is never used.
  • The data-backup folder should be removed because it is not necessary but increases the size of the repo (and the scripts do not update the backup anyway).
  • ...

@bms63 , how should we move on?

I think another PR will do the trick:

  • Remove data-raw/admiral_adlb.R and data_backup.
  • I was under the impression the create_*() scripts are run manually - are you saying they don't work or they don't auto-run??
  • We can clean up the code to remove unused code.
    @jimrothstein can you fix this please

@bundfussr
Copy link
Collaborator

  • I was under the impression the create_*() scripts are run manually - are you saying they don't work or they don't auto-run??

I get (when running them manually):

> #
> # Finally, save reduced ds
> #
> use_data(admiral_adsl, overwrite = TRUE)
Error in use_data(admiral_adsl, overwrite = TRUE) : 
  could not find function "use_data"

@bms63
Copy link
Collaborator

bms63 commented Oct 7, 2024

should it be usethis::use_data() ?

@bundfussr
Copy link
Collaborator

should it be usethis::use_data() ?

Yes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants