-
-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Global configuration could manage global number of threads #10027
Comments
After working on distributed systems, I'm a bit afraid of this global variable, it's global and it's thread local. This results in the fact that it needs special handling for distributed systems, and it fails without possible fix when user launches a new thread without setting the value again. |
But a failure here would be no issue, since then it could just default back to OMP's config. The idea would be to use this as a means to override the OMP config, so that it would check for this first, and then if it's not set, would look at the OMP config. |
Depending on your perspective, this change of behavior can be considered a bug. Let's not make things that can surprise people. In general, global variables are not very desirable.
I think booster local variable is fine. It's a bit inconvenient when writing tests and examples, but it's really a CRAN issue instead of a real world problem. We can workaround it by setting global OMP environment variable before running any test, XGBoost respects those variables. xgboost/tests/python/test_openmp.py Line 83 in 4dfbe2a
|
I think we can close this now that #10028 is merged. |
XGBoost has a global configuration state:
https://xgboost.readthedocs.io/en/stable/parameter.html#global-configuration
Some places in the code try to retrieve a global number of threads, such as here:
xgboost/src/gbm/gbtree_model.cc
Line 112 in 4dfbe2a
The default number of threads is determined from OpenMP, which is a reasonable thing to do:
xgboost/src/common/threading_utils.cc
Line 100 in 4dfbe2a
But being a global configuration, would be more logical to allow users to modify it through the functions that are named as such.
It would make things better for the R interface, since currently there are workarounds like these which result in degraded performance for users:
xgboost/src/gbm/gbtree_model.cc
Line 120 in 4dfbe2a
If it were a global option, should be possible to adopt a mechanism like LightGBM's which changes the number of configured threads before tests and examples, but doesn't have the lines that change the threads in the user-visible part of the examples, so that it executes with reduced threads on CRAN but doesn't suggest doing so to the user.
Example:
https://github.com/microsoft/LightGBM/blob/252828fd86627d7405021c3377534d6a8239dd69/R-package/R/lgb.Dataset.R#L783
The text was updated successfully, but these errors were encountered: