Refactor datatype logic #68

Merged: mayer79 merged 28 commits into main from refactor-datatype-logic on Jul 21, 2024

mayer79 (Owner) commented on Jul 21, 2024

This is a large PR that refactors how missRanger() handles variables that ranger() cannot model directly. The new implementation is slightly stricter, but also safer.

It is an important step towards out-of-sample application (#58).

Here is a summary:

  • Columns of special types, such as date/time, can no longer be imputed.
  • pmm() is stricter: xtrain and xtest must both be numeric, logical, or factor (with identical levels).
  • Now requires ranger >= 0.16.0.
  • More compact vignettes.
  • Many relevant ranger() arguments are now explicit arguments of missRanger() to improve the tab-completion experience:
    • num.trees = 500
    • mtry = NULL
    • min.node.size = NULL
    • min.bucket = NULL
    • max.depth = NULL
    • replace = TRUE
    • sample.fraction = if (replace) 1 else 0.632
    • case.weights = NULL
    • num.threads = NULL
    • save.memory = FALSE
  • Slightly more info before fitting.
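
To make the stricter pmm() input rule concrete, here is a minimal base-R sketch of such a check. The helper name `compatible_for_pmm()` is hypothetical and not part of missRanger, and the sketch assumes one reading of the rule: both vectors must be of the same kind (numeric, logical, or factor), and factors must share identical levels. The actual checks inside pmm() may differ.

```r
# Hypothetical helper, NOT part of missRanger: returns TRUE if xtrain and xtest
# look compatible under the assumed reading of the rule in this PR.
compatible_for_pmm <- function(xtrain, xtest) {
  # Integer and double vectors both count as numeric here
  if (is.numeric(xtrain) && is.numeric(xtest)) return(TRUE)
  if (is.logical(xtrain) && is.logical(xtest)) return(TRUE)
  # Factors must have the same levels in the same order
  if (is.factor(xtrain) && is.factor(xtest)) {
    return(identical(levels(xtrain), levels(xtest)))
  }
  # Everything else (dates, characters, mixed kinds) is rejected
  FALSE
}

compatible_for_pmm(c(0.1, 0.5), c(0.2, 0.9))                # numeric vs numeric
compatible_for_pmm(factor(c("a", "b")), factor("a"))        # levels differ
compatible_for_pmm(as.Date("2024-07-21"), as.Date("2024-07-22"))  # special type
```

Under these assumptions, the first call passes while the other two are rejected, which matches the idea that special types such as dates are no longer handled.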

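The default `sample.fraction = if (replace) 1 else 0.632` listed above works because R evaluates default arguments lazily, inside the function, so a default can depend on another argument. The following base-R sketch illustrates this; `resolve_sampling()` is a hypothetical stand-in that only returns the resolved values and does not call ranger() or missRanger().

```r
# Hypothetical stand-in for the sampling-related part of the new signature.
# It mirrors the two defaults from the list above and nothing else.
resolve_sampling <- function(replace = TRUE,
                             sample.fraction = if (replace) 1 else 0.632) {
  # The default for sample.fraction is evaluated here, after `replace` is known
  list(replace = replace, sample.fraction = sample.fraction)
}

resolve_sampling()$sample.fraction                  # bootstrap: full sample size
resolve_sampling(replace = FALSE)$sample.fraction   # subsampling: 0.632
resolve_sampling(replace = FALSE, sample.fraction = 0.5)$sample.fraction  # explicit value wins
```

An explicitly supplied `sample.fraction` always overrides the conditional default, whatever the value of `replace`.
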
codecov-commenter commented Jul 21, 2024

Codecov Report

Attention: Patch coverage is 87.17949% with 15 lines in your changes missing coverage. Please review.

Project coverage is 87.83%. Comparing base (20247e3) to head (77037c1).
Report is 11 commits behind head on main.

Files            Patch %   Lines
R/missRanger.R   84.94%    14 Missing ⚠️
R/pmm.R          94.11%    1 Missing ⚠️

Additional details and impacted files
@@             Coverage Diff             @@
##             main      #68       +/-   ##
===========================================
- Coverage   97.95%   87.83%   -10.13%     
===========================================
  Files           5        5               
  Lines         245      263       +18     
===========================================
- Hits          240      231        -9     
- Misses          5       32       +27     

mayer79 merged commit 2af3e44 into main on Jul 21, 2024 (7 checks passed).
mayer79 deleted the refactor-datatype-logic branch on July 21, 2024, 20:08.