[R] Enable vector-valued parameters #9849

Merged (3 commits) on Dec 6, 2023

david-cortes
Contributor

ref #9810

This PR adds support for passing parameters that might consist of a vector of values instead of a single value, thereby enabling support for multi-quantile regression.

In the Python interface, the eval_metric parameter is handled differently: each entry is passed through a separate call to XGBoosterSetParam under the same parameter name, instead of passing a json list as a single parameter. That is the approach I am following here for eval_metric.
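
The repeated-name approach for eval_metric can be sketched in plain R, independent of the package internals. The helper name `expand_repeated` is hypothetical and not part of xgboost; it only illustrates how one vector-valued entry could become several same-named scalar entries, each of which would then correspond to one set-parameter call.

```r
# Hypothetical helper (not part of xgboost): split one vector-valued entry
# into several scalar entries that all share the same name.
expand_repeated <- function(params, name) {
  values <- params[[name]]
  if (length(values) <= 1) {
    return(params)  # nothing to expand
  }
  repeated <- as.list(values)
  names(repeated) <- rep(name, length(repeated))
  # Drop the original multi-valued entry, then append the scalar ones.
  c(params[names(params) != name], repeated)
}

params <- list(max_depth = 2, eval_metric = c("error", "auc", "logloss"))
expanded <- expand_repeated(params, "eval_metric")
str(expanded)
```

The result is a list with one `max_depth` entry followed by three entries named `eval_metric`, one per metric.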

Member

@trivialfis trivialfis left a comment

Thank you for the work on quantile regression! One small question in the comment.

> instead of passing a json list as a single parameter

That would be possible. The JSON parameter was introduced long after the initial parameter implementation.

Comment on lines +98 to +103
if (NROW(params[["eval_metric"]]) > 1) {
  eval_metrics <- as.list(params[["eval_metric"]])
  names(eval_metrics) <- rep("eval_metric", length(eval_metrics))
  params_without_ev_metrics <- within(params, rm("eval_metric"))
  params <- c(params_without_ev_metrics, eval_metrics)
}
Member

I'm a bit confused here: how does the existing code handle multiple evaluation metrics? I think PR #8657 was supposed to handle it, but I could be wrong.

Contributor Author

Up to this point, following the code in the master branch, the object p would have a list with one entry per metric, plus another entry in which the metrics are grouped together.

For example, the code in the test would produce something like this for variable p up to the point before the lapply:

$eval_metric
$eval_metric[[1]]
[1] "error"

$eval_metric[[2]]
[1] "auc"

$eval_metric[[3]]
[1] "logloss"


$eval_metric
[1] "error"

$eval_metric
[1] "auc"

$eval_metric
[1] "logloss"

The lapply call in the current master branch would take only the first element of the multi-valued entry, so it would transform the list like this:

$eval_metric
[1] "error"

$eval_metric
[1] "error"

$eval_metric
[1] "auc"

$eval_metric
[1] "logloss"

After the changes in this PR, it will only have the non-repeated ones. Since the lapply code is being changed here in a way that makes multi-valued entries produce a different input to the C function, I thought the easiest way would be to remove the multi-valued entry.
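
The duplication described above can be reproduced in plain R without loading xgboost. This is only a sketch: the anonymous function passed to lapply is an assumed stand-in for the old conversion (it keeps the first element of each entry), and the `lengths()` filter is an illustration of "remove the multi-valued entry", not the code from the PR.

```r
# The list shape described above: one multi-valued entry plus one entry per metric.
p <- list(
  eval_metric = list("error", "auc", "logloss"),
  eval_metric = "error",
  eval_metric = "auc",
  eval_metric = "logloss"
)

# Assumed stand-in for the old conversion: keep only the first element of
# each entry, which turns the multi-valued entry into a duplicate "error".
first_only <- lapply(p, function(x) as.character(x)[1])

# Illustration of the fix: drop any entry holding more than one value.
deduplicated <- p[lengths(p) == 1]

str(first_only)
str(deduplicated)
```

With the first approach "error" appears twice; with the filter, each metric appears once.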

I experimented with passing them as a JSON list, but that doesn't seem to have the intended effect - for example, xgb.config produces this error:

Error in xgboost::xgb.config(object) : 
  [21:31:12] ../..//src/metric/metric.cc:49: Unknown metric function ["error", "auc", "logloss"]
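
The error text suggests that the whole bracketed string was looked up as a single metric name. As a sketch under that assumption, the following base R snippet builds the same kind of string from a character vector; the variable names are illustrative and this is not how the package serializes parameters.

```r
metrics <- c("error", "auc", "logloss")

# Quote each metric, join with ", ", and wrap in brackets to form one string.
as_json_list <- paste0("[", paste0('"', metrics, '"', collapse = ", "), "]")

# A single string, not three separate metric names.
print(as_json_list)
print(length(as_json_list))
```

Because the result is one string of length 1, a lookup that expects a plain metric name such as "auc" would not match it.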

Member

Thank you for the explanation, I can reproduce the repeated values.

@trivialfis trivialfis merged commit 1de3f41 into dmlc:master Dec 6, 2023
24 checks passed