Closes #2427 general issue adopt data and data raw conventions #2494

jimrothstein · 2024-08-08T22:51:11Z

Thank you for your Pull Request! We have developed this task checklist from the Development Process Guide to help with the final steps of the process. Completing the below tasks helps to ensure our reviewers can maximize their time on your code as well as making sure the admiral codebase remains robust and consistent.

Please check off each taskbox as an acknowledgment that you completed the task or check off that it is not relevant to your Pull Request. This checklist is part of the Github Action workflows and the Pull Request will not be merged into the main branch until you have checked off each task.

jimrothstein · 2024-08-08T22:55:42Z

This is a DRAFT

GOAL:

create data/*.rda files following pharamaversesdtm convention

QUESTION:

Is this draft PR an acceptable approach?

What I did:

Create: example_qs.rda in TWO WAYS:

the original method and
newer convention (as pharamaversesdtm), scripts in data-raw/*.R

Pls. lookover new code: data-raw/example_qs_new.R

And tested in data-raw/test.R ; this checks that created *.rda files are IDENTICAL
(test.R to be removed)

Thx.

manciniedoardo · 2024-08-12T07:09:37Z

HI @jimrothstein is this ready for review? I see some comments but also see it's still marked draft.

jimrothstein · 2024-08-12T16:56:13Z

"ready for review? "

review = NO

@manciniedoardo

To address this issue #2494 (data conventions), I am proposing using a code similar to this draft.
Before I do more, or prepare true PR, I hope for thumbs up for thumbs down or discussion.

Goal: Create data/example_qs.rda

Replace current code: inst/example_scripts/example_qs.R
With example code: data-raw/example_qs.R

For future, how should I communicate or label these kinds of pre-PR questions about proposed code/proposed approach?

Thx

manciniedoardo · 2024-08-13T15:15:59Z

@jimrothstein sorry I had misunderstood. The second way (using data_raw) looks good to me. @pharmaverse/admiral any other thoughts before Jim prepares a PR?

…st PR

jimrothstein · 2024-08-17T00:02:06Z

@manciniedoardo DRAFT

GOAL

Replace: inst/example_scripts/example_qs.R
With: data-raw/example_qs.R

To create: data/example_qs.rda

Note: Once this PR is successful:

Will do several other datasets similarly.

Still to do for THIS PR

Except for a few comments (to be removed or modified), this will become a PR to create *data/example_qs.rda"

Not sure:

Need to test the example data?
Remove old code in inst/example_scripts/example_qs.R ?
Not sure Proper way to document this change?

bms63 · 2024-08-20T12:13:20Z

HI @jimrothstein this looks good so far and creating a draft and asking for feedback seems appropriate to me.

I would just make sure that the example data where it is used doesn't change anything. I'm not super familiar with where this is being used.

Also - looking over at the inst folder - I see some other data creation stuff, e.g. adlb_grading. I am wondering if that should be moved over as well.

Anyone (@pharmaverse/admiral ) know what this script is for? https://github.com/pharmaverse/admiral/blob/main/inst/example_scripts/derive_single_dose.R

bundfussr · 2024-08-20T12:24:17Z

Anyone (@pharmaverse/admiral ) know what this script is for? https://github.com/pharmaverse/admiral/blob/main/inst/example_scripts/derive_single_dose.R

It creates the ex_single dataset, which is used in the OCCDS vignette.

jimrothstein · 2024-08-22T22:03:20Z

Plan (Updated 22 AUG 2024)

To verify that created data files (*.rda) are IDENTICAL to those before any changes.
SITUATION NOW:

data/.rda* Path to created data.
inst/example_scripts/*.R Path to *.R files, create *.rda files.

AFTER:

(NO CHANGE) data/*.rda Path to original data.
(NEW) data-raw/*.R Path to new code.
(REMOVE?) inst/example_scripts/*.R Path to old code.

To effect this, propose these intermediate steps:

(Temporary) Add: old_data/ folder to hold copy of *.rda files.
Empty: data/ .
Run: new code (data-raw/*.R) New data saved in data/*.rda
Run: **test.R (**or equivalent as testthat) verify old and new code produces IDENTICAL *.rda files.
Pause for review/discuss

As a final step

Remove folder and old code inst/example_scripts/*.R
Remove old_data/*.rda (copies of orginal data)
Remove test.R,
PR

As initial run through:

Do this for example_qs.rda ONLY
Pause for review/discuss.
Then to for *all remaining .rda files

How to document this?

bms63 · 2024-08-22T22:36:06Z

Hey Jim - You can play fast and loose with this. The actions we have set up will crash out with all our tests and examples if things are not set up correctly with the example data.

Everything is backed up as well - so if something goes awry we can also revert. Thanks for taking this on!!

jimrothstein · 2024-08-23T19:19:38Z

@bms63 Ready for 1st serious review.

Note:

example_qs.rda ONLY
documentation requirements? roxygen?
test.R in data-raw/ for convenience, remove?
note use of environments when using load the data (in test.R)
a couple other small things.

bms63 · 2024-08-24T15:02:06Z

@bms63 Ready for 1st serious review.

Note:

example_qs.rda ONLY

looking good

documentation requirements? roxygen?

I think we could benefit from describing the variables, but lets do this in a separate issue and just focus on adopting the data/data-raw convention. Do you mind making one?

test.R in data-raw/ for convenience, remove?
I think this should be removed unless you think this might be handy for others. It should probably have a different name

note use of environments when using load the data (in test.R)

noted

a couple other small things.

can you fix all the failing actions: code style, linting, etc.

Thanks again!!

…onventions

bms63 · 2024-08-25T14:33:27Z

Since the test.R file is just for testing purposes. Can you just comment out the source parts until we can remove this file? Honestly, it feels like things are pretty good and we can just drop testing for if the two are identical. I'm not really worried about it.

…no more testing. The file data/example_qs.rda will be used, going forward.

jimrothstein · 2024-08-25T19:14:29Z

@bms63

Removed all the testing files for example_qs.

still failing (URL). I think my local, #2427 issue branch is not up-to-date. Working on it ...

update
My local copy has this url and it is not broken -- I do not see the line github actions is unhappy with.

(DILI)

bms63 · 2024-08-25T23:30:34Z

Ah dang looks like we need to roxygenize these files

jimrothstein · 2024-08-26T17:30:56Z

roxygenize ?

Which files?? In pharamversesdtm, I do not think R files in data-raw/ are documented/roxygenized .
(Example: dm_vaccine.R https://github.com/pharmaverse/pharmaversesdtm/blob/main/data-raw/dm_vaccine.R)

From above (Closes #2427 general issue adopt data and data raw conventions #2494 (comment))

I think we could benefit from describing the variables, but lets do this in a separate issue and just focus on adopting the data/data-raw convention. Do you mind making one?

Something like?

new issue:
Document variables in data/*.rda files ( created by R code in data-raw/)

bms63 · 2024-08-26T17:39:01Z

@jimrothstein I haven't looked too closely. But whatever is causing this needs to get fixed. thought it was telling us to use roxygen stuff.

…alls

style_pkg() - no errors, does check example_qs.R lint() reports no errors.

jimrothstein · 2024-08-26T19:05:07Z

lint and styler pass now. 👍
(I learned this: no need to use library(xyz) in data-raw/*.R)

fda URL has me stumped - rest my eyes and take another look, must be something obvious.

Errors in vignettes/hys_law.Rmd

* [https://www.fda.gov/media/116737/download](https://www.fda.gov/media/116737/download): Failed: Network error (status code: 404)

jimrothstein · 2024-09-24T17:48:06Z

(will re-lint and re-style, but hope otherwise getting closer to goal)

bms63

Can you run styler on your code please

bms63 · 2024-09-27T19:46:19Z

Styler will fix a lot of these linting issues so then we can discuss the more problematic ones

bms63 · 2024-09-27T19:50:19Z

Alright! We are almost there!! Yay!

Ran devtools::lint() To be changed to standard form: CACHE_DIR, DATA_RAW ... once I am sure code is correct. NO change to .RBuildignore (not sure if consensus)

jimrothstein · 2024-09-29T00:24:07Z

adsl.rmd vignette code

knitr::purl("adsl.Rmd", output = "output_file.R")

## ----setup, include = FALSE----------------------------------------------------------------------------------------------------------------------------------------
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(admiraldev)


## ----message=FALSE, warning=FALSE----------------------------------------------------------------------------------------------------------------------------------
library(admiral)
library(dplyr, warn.conflicts = FALSE)
library(pharmaversesdtm)
library(lubridate)
library(stringr)

dm <- pharmaversesdtm::dm
ds <- pharmaversesdtm::ds
ex <- pharmaversesdtm::ex
ae <- pharmaversesdtm::ae
lb <- pharmaversesdtm::lb

dm <- convert_blanks_to_na(dm)
ds <- convert_blanks_to_na(ds)
ex <- convert_blanks_to_na(ex)
ae <- convert_blanks_to_na(ae)
lb <- convert_blanks_to_na(lb)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- dm %>%
  select(-DOMAIN)


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, RFSTDTC, COUNTRY, AGE, SEX, RACE, ETHNIC, ARM, ACTARM)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- dm %>%
  mutate(TRT01P = ARM, TRT01A = ACTARM)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
# impute start and end time of exposure to first and last respectively,
# do not impute date
ex_ext <- ex %>%
  derive_vars_dtm(
    dtc = EXSTDTC,
    new_vars_prefix = "EXST"
  ) %>%
  derive_vars_dtm(
    dtc = EXENDTC,
    new_vars_prefix = "EXEN",
    time_imputation = "last"
  )

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXSTDTM),
    new_vars = exprs(TRTSDTM = EXSTDTM, TRTSTMF = EXSTTMF),
    order = exprs(EXSTDTM, EXSEQ),
    mode = "first",
    by_vars = exprs(STUDYID, USUBJID)
  ) %>%
  derive_vars_merged(
    dataset_add = ex_ext,
    filter_add = (EXDOSE > 0 |
      (EXDOSE == 0 &
        str_detect(EXTRT, "PLACEBO"))) & !is.na(EXENDTM),
    new_vars = exprs(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
    order = exprs(EXENDTM, EXSEQ),
    mode = "last",
    by_vars = exprs(STUDYID, USUBJID)
  )


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_dtm_to_dt(source_vars = exprs(TRTSDTM, TRTEDTM))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_var_trtdurd()


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, RFSTDTC, TRTSDTM, TRTSDT, TRTEDTM, TRTEDT, TRTDURD)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
# convert character date to numeric date without imputation
ds_ext <- derive_vars_dt(
  ds,
  dtc = DSSTDTC,
  new_vars_prefix = "DSST"
)

adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(EOSDT = DSSTDT),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD != "SCREEN FAILURE"
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  ds_ext,
  display_vars = exprs(USUBJID, DSCAT, DSDECOD, DSTERM, DSSTDT, DSSTDTC),
  filter = DSDECOD != "SCREEN FAILURE"
)


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
format_eosstt <- function(x) {
  case_when(
    x %in% c("COMPLETED") ~ "COMPLETED",
    x %in% c("SCREEN FAILURE") ~ NA_character_,
    TRUE ~ "DISCONTINUED"
  )
}


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(STUDYID, USUBJID),
    filter_add = DSCAT == "DISPOSITION EVENT",
    new_vars = exprs(EOSSTT = format_eosstt(DSDECOD)),
    missing_values = exprs(EOSSTT = "ONGOING")
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT, EOSSTT))


## ------------------------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(USUBJID),
    new_vars = exprs(DCSREAS = DSDECOD, DCSREASP = DSTERM),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      !(DSDECOD %in% c("SCREEN FAILURE", "COMPLETED", NA))
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT, EOSSTT, DCSREAS, DCSREASP))


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  select(-DCSREAS, -DCSREASP)


## ------------------------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(USUBJID),
    new_vars = exprs(DCSREAS = DSDECOD),
    filter_add = DSCAT == "DISPOSITION EVENT" &
      DSDECOD %notin% c("SCREEN FAILURE", "COMPLETED", NA)
  ) %>%
  derive_vars_merged(
    dataset_add = ds,
    by_vars = exprs(USUBJID),
    new_vars = exprs(DCSREASP = DSTERM),
    filter_add = DSCAT == "DISPOSITION EVENT" & DSDECOD %in% "OTHER"
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, EOSDT, EOSSTT, DCSREAS, DCSREASP))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_merged(
    dataset_add = ds_ext,
    filter_add = DSDECOD == "RANDOMIZED",
    by_vars = exprs(STUDYID, USUBJID),
    new_vars = exprs(RANDDT = DSSTDT)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl, display_vars = exprs(USUBJID, RANDDT))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_dt(
    new_vars_prefix = "DTH",
    dtc = DTHDTC
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(adsl %>% filter(!is.na(DTHDT) | row_number() %% 50 == 0), display_vars = exprs(USUBJID, TRTEDT, DTHDTC, DTHDT, DTHFL))


## ----eval=FALSE----------------------------------------------------------------------------------------------------------------------------------------------------
## adsl <- adsl %>%
##   derive_vars_dt(
##     new_vars_prefix = "DTH",
##     dtc = DTHDTC,
##     date_imputation = "first"
##   )


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        condition = AEOUT == "FATAL",
        set_values_to = exprs(DTHCAUS = AEDECOD),
      ),
      event(
        dataset_name = "ds",
        condition = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
        set_values_to = exprs(DTHCAUS = DSTERM),
      )
    ),
    source_datasets = list(ae = ae, ds = ds),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr),
    mode = "first",
    new_vars = exprs(DTHCAUS)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, EOSDT, DTHDTC, DTHDT, DTHCAUS),
  filter = DTHFL == "Y"
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  select(-DTHCAUS) %>% # remove it before deriving it again
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        condition = AEOUT == "FATAL",
        set_values_to = exprs(DTHCAUS = AEDECOD, DTHDOM = "AE", DTHSEQ = AESEQ),
      ),
      event(
        dataset_name = "ds",
        condition = DSDECOD == "DEATH" & grepl("DEATH DUE TO", DSTERM),
        set_values_to = exprs(DTHCAUS = DSTERM, DTHDOM = "DS", DTHSEQ = DSSEQ),
      )
    ),
    source_datasets = list(ae = ae, ds = ds),
    tmp_event_nr_var = event_nr,
    order = exprs(event_nr),
    mode = "first",
    new_vars = exprs(DTHCAUS, DTHDOM, DTHSEQ)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, DTHDT, DTHCAUS, DTHDOM, DTHSEQ),
  filter = DTHFL == "Y"
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  mutate(DTHCGR1 = case_when(
    is.na(DTHDOM) ~ NA_character_,
    DTHDOM == "AE" ~ "ADVERSE EVENT",
    str_detect(DTHCAUS, "(PROGRESSIVE DISEASE|DISEASE RELAPSE)") ~ "PROGRESSIVE DISEASE",
    TRUE ~ "OTHER"
  ))


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_duration(
    new_var = DTHADY,
    start_date = TRTSDT,
    end_date = DTHDT
  )


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_duration(
    new_var = LDDTHELD,
    start_date = TRTEDT,
    end_date = DTHDT,
    add_one = FALSE
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, DTHDT, DTHCAUS, DTHADY, LDDTHELD),
  filter = DTHFL == "Y"
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        order = exprs(AESTDTC, AESEQ),
        condition = !is.na(AESTDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AESTDTC, highest_imputation = "M"),
          seq = AESEQ
        ),
      ),
      event(
        dataset_name = "ae",
        order = exprs(AEENDTC, AESEQ),
        condition = !is.na(AEENDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AEENDTC, highest_imputation = "M"),
          seq = AESEQ
        ),
      ),
      event(
        dataset_name = "lb",
        order = exprs(LBDTC, LBSEQ),
        condition = !is.na(LBDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(LBDTC, highest_imputation = "M"),
          seq = LBSEQ
        ),
      ),
      event(
        dataset_name = "adsl",
        condition = !is.na(TRTEDT),
        set_values_to = exprs(LSTALVDT = TRTEDT, seq = 0),
      )
    ),
    source_datasets = list(ae = ae, lb = lb, adsl = adsl),
    tmp_event_nr_var = event_nr,
    order = exprs(LSTALVDT, seq, event_nr),
    mode = "last",
    new_vars = exprs(LSTALVDT)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, LSTALVDT),
  filter = !is.na(TRTSDT)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  select(-LSTALVDT) %>% # created in the previous call
  derive_vars_extreme_event(
    by_vars = exprs(STUDYID, USUBJID),
    events = list(
      event(
        dataset_name = "ae",
        order = exprs(AESTDTC, AESEQ),
        condition = !is.na(AESTDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AESTDTC, highest_imputation = "M"),
          LALVSEQ = AESEQ,
          LALVDOM = "AE",
          LALVVAR = "AESTDTC"
        ),
      ),
      event(
        dataset_name = "ae",
        order = exprs(AEENDTC, AESEQ),
        condition = !is.na(AEENDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(AEENDTC, highest_imputation = "M"),
          LALVSEQ = AESEQ,
          LALVDOM = "AE",
          LALVVAR = "AEENDTC"
        ),
      ),
      event(
        dataset_name = "lb",
        order = exprs(LBDTC, LBSEQ),
        condition = !is.na(LBDTC),
        set_values_to = exprs(
          LSTALVDT = convert_dtc_to_dt(LBDTC, highest_imputation = "M"),
          LALVSEQ = LBSEQ,
          LALVDOM = "LB",
          LALVVAR = "LBDTC"
        ),
      ),
      event(
        dataset_name = "adsl",
        condition = !is.na(TRTEDT),
        set_values_to = exprs(LSTALVDT = TRTEDT, LALVSEQ = NA_integer_, LALVDOM = "ADSL", LALVVAR = "TRTEDTM"),
      )
    ),
    source_datasets = list(ae = ae, lb = lb, adsl = adsl),
    tmp_event_nr_var = event_nr,
    order = exprs(LSTALVDT, LALVSEQ, event_nr),
    mode = "last",
    new_vars = exprs(LSTALVDT, LALVSEQ, LALVDOM, LALVVAR)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTEDT, DTHDTC, LSTALVDT, LALVDOM, LALVSEQ, LALVVAR),
  filter = !is.na(TRTSDT)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
format_agegr1 <- function(var_input) {
  case_when(
    var_input < 18 ~ "<18",
    between(var_input, 18, 64) ~ "18-64",
    var_input > 64 ~ ">64",
    TRUE ~ "Missing"
  )
}

format_region1 <- function(var_input) {
  case_when(
    var_input %in% c("CAN", "USA") ~ "North America",
    !is.na(var_input) ~ "Rest of the World",
    TRUE ~ "Missing"
  )
}


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  mutate(
    AGEGR1 = format_agegr1(AGE),
    REGION1 = format_region1(COUNTRY)
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, AGE, SEX, COUNTRY, AGEGR1, REGION1)
)


## ----eval=TRUE-----------------------------------------------------------------------------------------------------------------------------------------------------
adsl <- adsl %>%
  derive_var_merged_exist_flag(
    dataset_add = ex,
    by_vars = exprs(STUDYID, USUBJID),
    new_var = SAFFL,
    condition = (EXDOSE > 0 | (EXDOSE == 0 & str_detect(EXTRT, "PLACEBO")))
  )


## ----eval=TRUE, echo=FALSE-----------------------------------------------------------------------------------------------------------------------------------------
dataset_vignette(
  adsl,
  display_vars = exprs(USUBJID, TRTSDT, ARM, ACTARM, SAFFL)
)

bms63 · 2024-09-30T16:13:37Z

I say we #nolint all the issues present in the lintr check. I'm not super worried about them as these are just helper scripts.

I feel like the admiral adsl was not run with the latest template/vignette scripts updates to them (they look pretty similar). Let's just go with the current script you have put together for admiral_adsl

@pharmaverse/admiral - I think this is getting very close. Any final thoughts.

…o nolint a block of code: # nolint start and # nolint end can be used; (done in this commit) Also note, possible to turn off SPECIFIC linters (also done in this commit for "object_name_linter")

jimrothstein · 2024-09-30T21:51:33Z

Note: This commit uses a few features of package:lintr

turn off lintr for block of code (vs line-by-line)
turn off a specific lintr ("object_name_deliminter")
Very rich number of options to finely control linting: SEE https://lintr.r-lib.org/articles/lintr.html

…o nolint a block of code: # nolint start and # nolint end can be used; (done in this commit) Also note, possible to turn off SPECIFIC linters (also done in this commit for "object_name_linter")

…it reset back before I accidentally pt it into a push?)

bms63 · 2024-10-01T12:46:02Z

purl_adsl_vignette.R

We do not need this

bms63 · 2024-10-03T13:40:09Z

I'm ready to merge this into main. @pharmaverse/admiral any last thoughts or concerns?

bundfussr · 2024-10-07T15:52:36Z

I'm ready to merge this into main. @pharmaverse/admiral any last thoughts or concerns?

Sorry for the late reply. I was off because the 3rd October is a bank holiday here.

I think the PR shouldn't have been merged. There are a lot of unresolved issues:

The create_*() scripts don't run.
The scripts include code which is never run.
data-raw/admiral_adlb.R was added to the repo but it is never used.
The data-backup folder should be removed because it is not necessary but increases the size of the repo (and the scripts do not update the backup anyway).
...

@bms63 , how should we move on?

bms63 · 2024-10-07T16:01:02Z

I'm ready to merge this into main. @pharmaverse/admiral any last thoughts or concerns?

Sorry for the late reply. I was off because the 3rd October is a bank holiday here.

I think the PR shouldn't have been merged. There are a lot of unresolved issues:

The create_*() scripts don't run.

The scripts include code which is never run.

data-raw/admiral_adlb.R was added to the repo but it is never used.

The data-backup folder should be removed because it is not necessary but increases the size of the repo (and the scripts do not update the backup anyway).

...

@bms63 , how should we move on?

I think another PR will do the trick:

Remove data-raw/admiral_adlb.R and data_backup.
I was under the impression the create_*() scripts are run manually - are you saying they don't work or they don't auto-run??
We can clean up the code to remove unused code.
@jimrothstein can you fix this please

bundfussr · 2024-10-07T16:06:13Z

I was under the impression the create_*() scripts are run manually - are you saying they don't work or they don't auto-run??

I get (when running them manually):

> #
> # Finally, save reduced ds
> #
> use_data(admiral_adsl, overwrite = TRUE)
Error in use_data(admiral_adsl, overwrite = TRUE) : 
  could not find function "use_data"

bms63 · 2024-10-07T16:07:45Z

should it be usethis::use_data() ?

bundfussr · 2024-10-07T16:11:03Z

should it be usethis::use_data() ?

Yes

jimrothstein added 2 commits August 5, 2024 17:28

attempt zero

11a337f

2427_adopt_data_data_raw_conventions-FIrst

eaea9d6

jimrothstein marked this pull request as draft August 8, 2024 22:51

jimrothstein requested review from manciniedoardo and bms63 August 8, 2024 22:56

2427 ..DRAFT .. except for a few comments to be removed, this is almo…

95b4eca

…st PR

Issue 2427 first serious attempt for example_qs.rda, see comments

2bd4290

jimrothstein added 2 commits August 24, 2024 17:01

2427_ ran spelling, styler, lintr, but it dislikes use of source()

f16131d

Merge branch 'main' into 2427_general_issue_adopt_data_and_data_raw_c…

bd67872

…onventions

2427 removed data-raw/test.R and old_data/example_qs.rda. This means …

05d16aa

…no more testing. The file data/example_qs.rda will be used, going forward.

jimrothstein added 2 commits August 26, 2024 11:09

2427 EXPERIMENT ... data-raw/example_qs.R ... commented out library c…

f70b75e

…alls

EXPERIMENT 2 - remove unnecessary commnets;

6e41535

style_pkg() - no errors, does check example_qs.R lint() reports no errors.

EXPERIMENT 3 - hys_law.Rmd missing close paraenthesis at end of FDA url.

14e144b

jimrothstein added 2 commits September 24, 2024 10:32

add diffdf and key to track differences

f6aa82f

update my branch with latest main

a5bb4f1

jimrothstein requested review from bundfussr and bms63 September 27, 2024 02:33

bms63 reviewed Sep 27, 2024

View reviewed changes

jimrothstein added 2 commits September 27, 2024 13:05

Ran styler::style_pkg()

d89ce0c

Ran devtools::lint() To be changed to standard form: CACHE_DIR, DATA_RAW ... once I am sure code is correct. NO change to .RBuildignore (not sure if consensus)

re-run roxygen

ff67700

Apply # nolint; NOTE: library(lintr) has many options. For example, t…

0076b24

…o nolint a block of code: # nolint start and # nolint end can be used; (done in this commit) Also note, possible to turn off SPECIFIC linters (also done in this commit for "object_name_linter")

jimrothstein added 3 commits September 30, 2024 15:04

Apply # nolint; NOTE: library(lintr) has many options. For example, t…

873f163

…o nolint a block of code: # nolint start and # nolint end can be used; (done in this commit) Also note, possible to turn off SPECIFIC linters (also done in this commit for "object_name_linter")

fixing

1456d5e

manged to mangle git. Do not think need file with purl. (Delete it? g…

cd5244b

…it reset back before I accidentally pt it into a push?)

bms63 reviewed Oct 1, 2024

View reviewed changes

purl_adsl_vignette.R Outdated

Copy link

Collaborator

bms63 Oct 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do not need this

jimrothstein added 2 commits October 1, 2024 10:19

remove purl_adsl_vignette.R; added by error

678a46e

merge local branch 2427 with upstream

1323f8c

bms63 added 2 commits October 3, 2024 11:48

chore: #2427 developer news

392cdfe

chore: #2427 spelling

1850a8a

bms63 merged commit 9cf826d into pharmaverse:main Oct 3, 2024
18 of 19 checks passed

bms63 mentioned this pull request Oct 7, 2024

General Issues: data-raw/data Round 2 #2526

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Closes #2427 general issue adopt data and data raw conventions #2494

Closes #2427 general issue adopt data and data raw conventions #2494

jimrothstein commented Aug 8, 2024 •

edited

Loading

jimrothstein commented Aug 8, 2024 •

edited

Loading

manciniedoardo commented Aug 12, 2024

jimrothstein commented Aug 12, 2024

manciniedoardo commented Aug 13, 2024

jimrothstein commented Aug 17, 2024 •

edited

Loading

bms63 commented Aug 20, 2024

bundfussr commented Aug 20, 2024

jimrothstein commented Aug 22, 2024 •

edited

Loading

bms63 commented Aug 22, 2024

jimrothstein commented Aug 23, 2024 •

edited

Loading

bms63 commented Aug 24, 2024 •

edited

Loading

bms63 commented Aug 25, 2024

jimrothstein commented Aug 25, 2024 •

edited

Loading

bms63 commented Aug 25, 2024

jimrothstein commented Aug 26, 2024

bms63 commented Aug 26, 2024

jimrothstein commented Aug 26, 2024 •

edited

Loading

jimrothstein commented Sep 24, 2024

bms63 left a comment

bms63 commented Sep 27, 2024

bms63 commented Sep 27, 2024

jimrothstein commented Sep 29, 2024

bms63 commented Sep 30, 2024

jimrothstein commented Sep 30, 2024

bms63 Oct 1, 2024

bms63 commented Oct 3, 2024

bundfussr commented Oct 7, 2024

bms63 commented Oct 7, 2024

bundfussr commented Oct 7, 2024

bms63 commented Oct 7, 2024

bundfussr commented Oct 7, 2024

Closes #2427 general issue adopt data and data raw conventions #2494

Closes #2427 general issue adopt data and data raw conventions #2494

Conversation

jimrothstein commented Aug 8, 2024 • edited Loading

jimrothstein commented Aug 8, 2024 • edited Loading

This is a DRAFT

GOAL:

QUESTION:

What I did:

manciniedoardo commented Aug 12, 2024

jimrothstein commented Aug 12, 2024

manciniedoardo commented Aug 13, 2024

jimrothstein commented Aug 17, 2024 • edited Loading

GOAL

Note: Once this PR is successful:

Still to do for THIS PR

bms63 commented Aug 20, 2024

bundfussr commented Aug 20, 2024

jimrothstein commented Aug 22, 2024 • edited Loading

bms63 commented Aug 22, 2024

jimrothstein commented Aug 23, 2024 • edited Loading

bms63 commented Aug 24, 2024 • edited Loading

bms63 commented Aug 25, 2024

jimrothstein commented Aug 25, 2024 • edited Loading

bms63 commented Aug 25, 2024

jimrothstein commented Aug 26, 2024

bms63 commented Aug 26, 2024

jimrothstein commented Aug 26, 2024 • edited Loading

jimrothstein commented Sep 24, 2024

bms63 left a comment

Choose a reason for hiding this comment

bms63 commented Sep 27, 2024

bms63 commented Sep 27, 2024

jimrothstein commented Sep 29, 2024

adsl.rmd vignette code

bms63 commented Sep 30, 2024

jimrothstein commented Sep 30, 2024

bms63 Oct 1, 2024

Choose a reason for hiding this comment

bms63 commented Oct 3, 2024

bundfussr commented Oct 7, 2024

bms63 commented Oct 7, 2024

bundfussr commented Oct 7, 2024

bms63 commented Oct 7, 2024

bundfussr commented Oct 7, 2024

jimrothstein commented Aug 8, 2024 •

edited

Loading

jimrothstein commented Aug 8, 2024 •

edited

Loading

jimrothstein commented Aug 17, 2024 •

edited

Loading

jimrothstein commented Aug 22, 2024 •

edited

Loading

jimrothstein commented Aug 23, 2024 •

edited

Loading

bms63 commented Aug 24, 2024 •

edited

Loading

jimrothstein commented Aug 25, 2024 •

edited

Loading

jimrothstein commented Aug 26, 2024 •

edited

Loading