Different imputation results for mice.impute.rf based on operating system #688

Open
janickweberpals opened this issue Feb 5, 2025 · 0 comments
janickweberpals commented Feb 5, 2025

First of all thanks for this amazing package and associated documentation, I really learned so much about missing data!

I noticed that I get slightly different results depending on the OS/platform when running imputations that use a random forest model. This happens even when setting the same seed. I ran the following reprex with the same R version (4.3.2) and the same mice package version (3.17.0) on two platforms (Ubuntu Linux and macOS) and got two slightly different results:

# mice version 3.17.0
library(mice)
 
# impute data
imp <- mice(
  data = nhanes,
  m = 20,
  method = "rf",
  rfPackage = "ranger",
  seed = 1234,
  printFlag = FALSE
  )
 
# make long
imp_long <- complete(imp, action = "long")
 
# show bmi average
mean(imp_long$bmi)

Output on Ubuntu 20.04.6: 26.5236
Output on macOS 15.2: 26.4744

I suspect it may have to do with the RNG invoked by the downstream random forest package.
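
One way to narrow this down might be to rerun the reprex with a different random forest backend on both platforms and compare. The sketch below is untested on both systems and rests on a few assumptions: `bmi_mean` is a hypothetical helper introduced only for this comparison, the randomForest package is installed, and `rfPackage = "randomForest"` is accepted by `mice.impute.rf` in this mice version. If the randomForest results agree across platforms while the ranger results do not, that would point towards the backend's random number generation rather than mice itself.

```r
library(mice)

# Record the RNG configuration and platform details so they can be
# compared side by side across machines
print(RNGkind())
print(sessionInfo())

# Hypothetical helper (not part of mice): impute nhanes with the given
# random forest backend and return the mean of bmi over all imputations
bmi_mean <- function(rf_package) {
  imp <- mice(
    data = nhanes,
    m = 20,
    method = "rf",
    rfPackage = rf_package,
    seed = 1234,
    printFlag = FALSE
  )
  mean(complete(imp, action = "long")$bmi)
}

# Run both backends on each platform and compare the values
bmi_mean("ranger")
bmi_mean("randomForest")
```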
