Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding tbl_merge(merge_vars) argument #2119

Merged
merged 5 commits into from
Jan 15, 2025
Merged

Adding tbl_merge(merge_vars) argument #2119

merged 5 commits into from
Jan 15, 2025

Conversation

ddsjoberg
Copy link
Owner

@ddsjoberg ddsjoberg commented Jan 9, 2025

What changes are proposed in this pull request?

  • Adding the tbl_merge(merge_vars) argument, which allows users to specify any merging columns. Additionally, columns selected by cards::all_ard_groups() have been added to the default merging columns, which provides the functionality for merging the results from tbl_hierarchical() and tbl_hierarchical_count(). (Add tbl_merge(merge_vars) argument? #1861)

    This does, however, introduce one change in behavior from the previous version of tbl_merge(). Previously, merging on a table with the same variable, but with a different label would be reconciled silently in the background and the first label would be used in the final table. While this may have been useful in a few edge cases, it largely was an unintuitive result. This update performs more straightforward merging and the results are more aligned with users' expectations.

If there is an GitHub issue associated with this pull request, please provide link.
closes #1861
closes #2086

@michaelcurry1123 thank you so much for reviewing this one!!

While making this update, I attempted to simplify the function quite a bit. I am nervous that I removed some complexity that was required to properly merge. 😬
We used to have this section of code where we would nest the table by the "variable" column, then merge each of the nested data frames for each variable separately. Then would stack them all. I am not sure why we had that additional complexity??? Anyway, while you're testing, if you could keep a script of the tested scenarios and perhaps we could add additional unit tests?


Reviewer Checklist (if item does not apply, mark is as complete)

  • PR branch has pulled the most recent updates from main branch.
  • If a bug was fixed, a unit test was added.
  • Run pkgdown::build_site(). Check the R console for errors, and review the rendered website.
  • Code coverage is suitable for any new functions/features: devtools::test_coverage()
  • usethis::use_spell_check() runs with no spelling errors in documentation
  • All GitHub Action workflows pass with a ✅

When the branch is ready to be merged into master:

  • Update NEWS.md with the changes from this pull request under the heading "# gtsummary (development version)". If there is an issue associated with the pull request, reference it in parentheses at the end update (see NEWS.md for examples).
  • Increment the version number using usethis::use_version(which = "dev")
  • Run usethis::use_spell_check() again
  • Approve Pull Request
  • Merge the PR. Please use "Squash and merge".

@ddsjoberg ddsjoberg changed the title Adding tbl_merge(mege_vars) argument Adding tbl_merge(merge_vars) argument Jan 9, 2025
@michaelcurry1123
Copy link
Collaborator

Hi @ddsjoberg ! Yes I do remember that, I believe that we had issues with when variables either had different names and/or labels and what to merge on. I think most of the pressure testing I did showed that if you don't have variable and labels similar you will get weird results (as your error messages say). I would just check out example 9 in the attached script
pressure_test_tbl_merge.txt
because I believe that may be an error (but might be good to check all the samples)!

@ddsjoberg
Copy link
Owner Author

@michaelcurry1123 ! Thank you for the thorough review!! I took your code and tried to make a quick comparison document between the old version and updated version of the merging function.
https://rpubs.com/ddsjoberg/gtsummary-merge-comparison

It seems like the only change from the new function from the previous version is that we used to allow merging the same variable, but with different labels. I think I prefer the updated version because it's ore clear, but will need to document this well, just in case someone is utilizing this strange "feature".

How does that sound to you? A reasonable breaking change?

@michaelcurry1123
Copy link
Collaborator

yes that sounds great to me! hope the review was helpful

@ddsjoberg
Copy link
Owner Author

yes that sounds great to me! hope the review was helpful

@michaelcurry1123 very helpful, thank you!! I am going to go ahead and merge. Do you think any of the examples you tested should be added as unit tests?

@ddsjoberg ddsjoberg merged commit accfc2b into main Jan 15, 2025
7 checks passed
@ddsjoberg ddsjoberg deleted the 1861-tbl_merge-vars branch January 15, 2025 00:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Merge tbl_hierarchical() and  tbl_hierarchical_count() Add tbl_merge(merge_vars) argument?
2 participants