Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove dynamic domain determination #193

Merged
merged 21 commits into from
Jan 16, 2024
Merged

Remove dynamic domain determination #193

merged 21 commits into from
Jan 16, 2024

Conversation

elimillera
Copy link
Member

@elimillera elimillera commented Nov 27, 2023

Changes Description

First draft of removing they dynamic determination of .df names. This is replaced with the domain argument and a new function for setting the df_arg attribute called xportr_domain_name. Closes #182

Task List

  • The spirit of xportr is met in your Pull Request
  • Place Closes #<insert_issue_number> into the beginning of your Pull Request Title (Use Edit button in top-right if you need to update)
  • Summary of changes filled out in the above Changes Description. Can be removed or left blank if changes are minor/self-explanatory.
  • Check that your Pull Request is targeting the devel branch, Pull Requests to main should use the Release Pull Request Template
  • Code is formatted according to the tidyverse style guide. Use styler package and functions to style files accordingly.
  • Updated relevant unit tests or have written new unit tests. See our Wiki for conventions used in this package.
  • Creation/updated relevant roxygen headers and examples. See our Wiki for conventions used in this package.
  • Run devtools::document() so all .Rd files in the man folder and the NAMESPACE file in the project root are updated appropriately
  • Run pkgdown::build_site() and check that all affected examples are displayed correctly and that all new/updated functions occur on the "Reference" page.
  • Update NEWS.md if the changes pertain to a user-facing function (i.e. it has an @export tag) or documentation aimed at users (rather than developers)
  • Address any updates needed for vignettes and/or templates
  • Link the issue Development Panel so that it closes after successful merging.
  • Fix merge conflicts
  • Pat yourself on the back for a job well done! Much love to your accomplishment!

Copy link

codecov bot commented Nov 27, 2023

Codecov Report

Attention: 1 lines in your changes are missing coverage. Please review.

Comparison is base (6a74578) 100.00% compared to head (466d7ef) 99.82%.
Report is 6 commits behind head on main.

❗ Current head 466d7ef differs from pull request most recent head e3a31b1. Consider uploading reports for the commit e3a31b1 to get more accurate results

Files Patch % Lines
R/metadata.R 66.66% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##              main     #193      +/-   ##
===========================================
- Coverage   100.00%   99.82%   -0.18%     
===========================================
  Files           12       12              
  Lines          595      582      -13     
===========================================
- Hits           595      581      -14     
- Misses           0        1       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@vedhav vedhav left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job reflecting the domain changes across all the xportr functions. Left a few small comments. Will have a closer look at the vignette and docs after this.

R/metadata.R Outdated Show resolved Hide resolved
tests/testthat/test-metadata.R Show resolved Hide resolved
tests/testthat/test-type.R Show resolved Hide resolved
R/utils-xportr.R Show resolved Hide resolved
R/type.R Show resolved Hide resolved
@bms63
Copy link
Collaborator

bms63 commented Dec 5, 2023

@elimillera what is your ETA on this PR? We need to merge into #190 to finish this one off

@elimillera elimillera marked this pull request as ready for review December 6, 2023 13:53
Copy link
Collaborator

@bms63 bms63 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this update warrants a note in the news/changelog!!

@bms63 bms63 linked an issue Dec 7, 2023 that may be closed by this pull request
@EeethB EeethB linked an issue Dec 7, 2023 that may be closed by this pull request
Copy link
Collaborator

@averissimo averissimo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • There's a styler CI error
  • Minor comment to update comments in R/*
  • Should this change be added to NEWS.md? (it seems like a big breaking change for those that use this functionality)

R/df_label.R Show resolved Hide resolved
Copy link
Collaborator

@averissimo averissimo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests/testthat/test-pipe.R should be renamed as it no longer test pipes independently.

Some options:

  • rename to tests/testthat/test-domain.R
  • same rename, but move xporter_domain_name() to its own R/domain.R
    • following {testthat} convention to test active file
  • move the tests to tests/testthat/test-metadata.R

R/length.R Show resolved Hide resolved
Copy link
Collaborator

@averissimo averissimo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry for the multiple review, but I've been noticing as I take a look at the assertion PR)

As it is, it allows for domain to be NULL with all the xportr functions, which I think itsn't the desired behavior.

Previously it wasn't the case as it was fallbacking to the dynamic determination.

WDYT @elimillera ?

@elimillera
Copy link
Member Author

(sorry for the multiple review, but I've been noticing as I take a look at the assertion PR)

As it is, it allows for domain to be NULL with all the xportr functions, which I think itsn't the desired behavior.

Previously it wasn't the case as it was fallbacking to the dynamic determination.

WDYT @elimillera ?

I actually think this is ok. If the user has really basic metadata that just has one domain. I think xportr is safe to assume that no filtering needs to be done for domain. I do see the arguments for requiring users explicitly supply it though

R/metadata.R Outdated Show resolved Hide resolved
README.Rmd Outdated Show resolved Hide resolved
@bms63
Copy link
Collaborator

bms63 commented Dec 19, 2023

I think some updates need to be made to the vignettes so the R-CMD checks pass

@EeethB
Copy link
Collaborator

EeethB commented Dec 21, 2023

Could you add the new function to the pkgdown site too?

@averissimo
Copy link
Collaborator

averissimo commented Dec 21, 2023

I was now testing all examples with empty domain and the "empty domain" problem appears on all other functions (with exception of df_label)

😕

Do you want to deal this on this PR or deal with this on a new one?

Expand to see failing (reproducible) code
library(xportr)
#> ℹ Loading xportr

# format ----------------------------------------------------------------------

adsl <- data.frame(
  USUBJID = c(1001, 1002, 1003),
  BRTHDT = c(1, 1, 2)
)

metadata <- data.frame(
  dataset = c("adsl", "adsl"),
  variable = c("USUBJID", "BRTHDT"),
  format = c(NA, "DATE9.")
)

adsl <- xportr_format(adsl, metadata)
#> Error in `dplyr::filter()`:
#> ℹ In argument: `dataset == domain & !is.na(format)`.
#> Caused by error:
#> ! `..1` must be of size 2 or 1, not size 0.
#> Backtrace:
#>      ▆
#>   1. ├─xportr::xportr_format(adsl, metadata)
#>   2. │ └─metadata %>% ... at xportr/R/format.R:76:5
#>   3. ├─dplyr::filter(., !!sym(domain_name) == domain & !is.na(!!sym(format_name)))
#>   4. ├─dplyr:::filter.data.frame(., !!sym(domain_name) == domain & !is.na(!!sym(format_name))) at dplyr/R/filter.R:119:3
#>   5. │ └─dplyr:::filter_rows(.data, dots, by) at dplyr/R/filter.R:134:3
#>   6. │   └─dplyr:::filter_eval(...) at dplyr/R/filter.R:149:3
#>   7. │     ├─base::withCallingHandlers(...) at dplyr/R/filter.R:217:3
#>   8. │     └─mask$eval_all_filter(dots, env_filter) at dplyr/R/filter.R:217:3
#>   9. │       └─dplyr (local) eval() at dplyr/R/data-mask.R:99:7
#>  10. ├─dplyr:::dplyr_internal_error(...)
#>  11. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data) at dplyr/R/conditions.R:194:3
#>  12. │   └─rlang:::signal_abort(cnd, .file) at rlang/R/cnd-abort.R:379:3
#>  13. │     └─base::signalCondition(cnd) at rlang/R/cnd-abort.R:850:5
#>  14. └─dplyr (local) `<fn>`(`<dpl:::__>`)
#>  15.   └─rlang::abort(message, class = error_class, parent = parent, call = error_call) at dplyr/R/conditions.R:235:5

# df_label --------------------------------------------------------------------

# adsl <- data.frame(
#   USUBJID = c(1001, 1002, 1003),
#   SITEID = c(001, 002, 003),
#   AGE = c(63, 35, 27),
#   SEX = c("M", "F", "M")
# )
#
# metadata <- data.frame(
#   dataset = c("adsl", "adae"),
#   label = c("Subject-Level Analysis", "Adverse Events Analysis")
# )
#
# adsl <- xportr_df_label(adsl, metadata)

# label -----------------------------------------------------------------------

adsl <- data.frame(
  USUBJID = c(1001, 1002, 1003),
  SITEID = c(001, 002, 003),
  AGE = c(63, 35, 27),
  SEX = c("M", "F", "M")
)

metadata <- data.frame(
  dataset = "adsl",
  variable = c("USUBJID", "SITEID", "AGE", "SEX"),
  label = c("Unique Subject Identifier", "Study Site Identifier", "Age", "Sex")
)

adsl <- xportr_label(adsl, metadata)
#> Error in `dplyr::filter()`:
#> ℹ In argument: `dataset == domain`.
#> Caused by error:
#> ! `..1` must be of size 4 or 1, not size 0.
#> Backtrace:
#>      ▆
#>   1. ├─xportr::xportr_label(adsl, metadata)
#>   2. │ └─metadata %>% dplyr::filter(!!sym(domain_name) == domain) at xportr/R/label.R:92:5
#>   3. ├─dplyr::filter(., !!sym(domain_name) == domain)
#>   4. ├─dplyr:::filter.data.frame(., !!sym(domain_name) == domain) at dplyr/R/filter.R:119:3
#>   5. │ └─dplyr:::filter_rows(.data, dots, by) at dplyr/R/filter.R:134:3
#>   6. │   └─dplyr:::filter_eval(...) at dplyr/R/filter.R:149:3
#>   7. │     ├─base::withCallingHandlers(...) at dplyr/R/filter.R:217:3
#>   8. │     └─mask$eval_all_filter(dots, env_filter) at dplyr/R/filter.R:217:3
#>   9. │       └─dplyr (local) eval() at dplyr/R/data-mask.R:99:7
#>  10. ├─dplyr:::dplyr_internal_error(...)
#>  11. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data) at dplyr/R/conditions.R:194:3
#>  12. │   └─rlang:::signal_abort(cnd, .file) at rlang/R/cnd-abort.R:379:3
#>  13. │     └─base::signalCondition(cnd) at rlang/R/cnd-abort.R:850:5
#>  14. └─dplyr (local) `<fn>`(`<dpl:::__>`)
#>  15.   └─rlang::abort(message, class = error_class, parent = parent, call = error_call) at dplyr/R/conditions.R:235:5

# length ----------------------------------------------------------------------

adsl <- data.frame(
  USUBJID = c(1001, 1002, 1003),
  BRTHDT = c(1, 1, 2)
)

metadata <- data.frame(
  dataset = c("adsl", "adsl"),
  variable = c("USUBJID", "BRTHDT"),
  length = c(10, 8)
)

adsl <- xportr_length(adsl, metadata)
#> Error in `filter()`:
#> ℹ In argument: `dataset == domain`.
#> Caused by error:
#> ! `..1` must be of size 2 or 1, not size 0.
#> Backtrace:
#>      ▆
#>   1. ├─xportr::xportr_length(adsl, metadata)
#>   2. │ └─metadata %>% filter(!!sym(domain_name) == domain) at xportr/R/length.R:99:5
#>   3. ├─dplyr::filter(., !!sym(domain_name) == domain)
#>   4. ├─dplyr:::filter.data.frame(., !!sym(domain_name) == domain) at dplyr/R/filter.R:119:3
#>   5. │ └─dplyr:::filter_rows(.data, dots, by) at dplyr/R/filter.R:134:3
#>   6. │   └─dplyr:::filter_eval(...) at dplyr/R/filter.R:149:3
#>   7. │     ├─base::withCallingHandlers(...) at dplyr/R/filter.R:217:3
#>   8. │     └─mask$eval_all_filter(dots, env_filter) at dplyr/R/filter.R:217:3
#>   9. │       └─dplyr (local) eval() at dplyr/R/data-mask.R:99:7
#>  10. ├─dplyr:::dplyr_internal_error(...)
#>  11. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data) at dplyr/R/conditions.R:194:3
#>  12. │   └─rlang:::signal_abort(cnd, .file) at rlang/R/cnd-abort.R:379:3
#>  13. │     └─base::signalCondition(cnd) at rlang/R/cnd-abort.R:850:5
#>  14. └─dplyr (local) `<fn>`(`<dpl:::__>`)
#>  15.   └─rlang::abort(message, class = error_class, parent = parent, call = error_call) at dplyr/R/conditions.R:235:5

# order -----------------------------------------------------------------------

adsl <- data.frame(
  BRTHDT = c(1, 1, 2),
  STUDYID = c("mid987650", "mid987650", "mid987650"),
  TRT01A = c("Active", "Active", "Placebo"),
  USUBJID = c(1001, 1002, 1003)
)

metadata <- data.frame(
  dataset = c("adsl", "adsl", "adsl", "adsl"),
  variable = c("STUDYID", "USUBJID", "TRT01A", "BRTHDT"),
  order = 1:4
)

adsl <- xportr_order(adsl, metadata)
#> Error in `dplyr::filter()`:
#> ℹ In argument: `dataset == domain & !is.na(order)`.
#> Caused by error:
#> ! `..1` must be of size 4 or 1, not size 0.
#> Backtrace:
#>      ▆
#>   1. ├─xportr::xportr_order(adsl, metadata)
#>   2. │ └─metadata %>% ... at xportr/R/order.R:95:5
#>   3. ├─dplyr::filter(., !!sym(domain_name) == domain & !is.na(!!sym(order_name)))
#>   4. ├─dplyr:::filter.data.frame(., !!sym(domain_name) == domain & !is.na(!!sym(order_name))) at dplyr/R/filter.R:119:3
#>   5. │ └─dplyr:::filter_rows(.data, dots, by) at dplyr/R/filter.R:134:3
#>   6. │   └─dplyr:::filter_eval(...) at dplyr/R/filter.R:149:3
#>   7. │     ├─base::withCallingHandlers(...) at dplyr/R/filter.R:217:3
#>   8. │     └─mask$eval_all_filter(dots, env_filter) at dplyr/R/filter.R:217:3
#>   9. │       └─dplyr (local) eval() at dplyr/R/data-mask.R:99:7
#>  10. ├─dplyr:::dplyr_internal_error(...)
#>  11. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data) at dplyr/R/conditions.R:194:3
#>  12. │   └─rlang:::signal_abort(cnd, .file) at rlang/R/cnd-abort.R:379:3
#>  13. │     └─base::signalCondition(cnd) at rlang/R/cnd-abort.R:850:5
#>  14. └─dplyr (local) `<fn>`(`<dpl:::__>`)
#>  15.   └─rlang::abort(message, class = error_class, parent = parent, call = error_call) at dplyr/R/conditions.R:235:5

# type ------------------------------------------------------------------------

metadata <- data.frame(
  dataset = "test",
  variable = c("Subj", "Param", "Val", "NotUsed"),
  type = c("numeric", "character", "numeric", "character"),
  format = NA
)

.df <- data.frame(
  Subj = as.character(123, 456, 789),
  Different = c("a", "b", "c"),
  Val = c("1", "2", "3"),
  Param = c("param1", "param2", "param3")
)

df2 <- xportr_type(.df, metadata)
#> Error in `filter()`:
#> ℹ In argument: `dataset == domain`.
#> Caused by error:
#> ! `..1` must be of size 4 or 1, not size 0.
#> Backtrace:
#>      ▆
#>   1. ├─xportr::xportr_type(.df, metadata)
#>   2. │ └─metadata %>% filter(!!sym(domain_name) == domain) at xportr/R/type.R:117:5
#>   3. ├─dplyr::filter(., !!sym(domain_name) == domain)
#>   4. ├─dplyr:::filter.data.frame(., !!sym(domain_name) == domain) at dplyr/R/filter.R:119:3
#>   5. │ └─dplyr:::filter_rows(.data, dots, by) at dplyr/R/filter.R:134:3
#>   6. │   └─dplyr:::filter_eval(...) at dplyr/R/filter.R:149:3
#>   7. │     ├─base::withCallingHandlers(...) at dplyr/R/filter.R:217:3
#>   8. │     └─mask$eval_all_filter(dots, env_filter) at dplyr/R/filter.R:217:3
#>   9. │       └─dplyr (local) eval() at dplyr/R/data-mask.R:99:7
#>  10. ├─dplyr:::dplyr_internal_error(...)
#>  11. │ └─rlang::abort(class = c(class, "dplyr:::internal_error"), dplyr_error_data = data) at dplyr/R/conditions.R:194:3
#>  12. │   └─rlang:::signal_abort(cnd, .file) at rlang/R/cnd-abort.R:379:3
#>  13. │     └─base::signalCondition(cnd) at rlang/R/cnd-abort.R:850:5
#>  14. └─dplyr (local) `<fn>`(`<dpl:::__>`)
#>  15.   └─rlang::abort(message, class = error_class, parent = parent, call = error_call) at dplyr/R/conditions.R:235:5

Created on 2023-12-21 with reprex v2.0.2

@bms63
Copy link
Collaborator

bms63 commented Dec 21, 2023

I was now testing all examples with empty domain and the "empty domain" problem appears on all other functions (with exception of df_label)

😕

Do you want to deal this on this PR or deal with this on a new one?

Expand to see failing (reproducible) code

I think we should handle this issue in this PR.

@elimillera
Copy link
Member Author

I was now testing all examples with empty domain and the "empty domain" problem appears on all other functions (with exception of df_label)

😕

Do you want to deal this on this PR or deal with this on a new one?

Expand to see failing (reproducible) code

Thanks for that @averissimo I made some updates and added some tests for that. It will just ignore the domain if it is null here. Is that more what you had in mind to fix it?

@EeethB
Copy link
Collaborator

EeethB commented Jan 9, 2024

@elimillera I saw the xportr_domain_name() function is like xportr_metadata() but only for the domain. Could we just make metadata an optional argument in xportr_metadata() and not need an additional function?

R/metadata.R Show resolved Hide resolved
README.Rmd Outdated Show resolved Hide resolved
@elimillera elimillera merged commit 48f8130 into main Jan 16, 2024
11 checks passed
@elimillera elimillera deleted the 182-remove-df-expr branch January 16, 2024 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants