MultiOutputClassifier can not work with early_stopping_round? #5277

Closed
yiwiz-sai opened this issue Jun 10, 2022 · 4 comments

@yiwiz-sai

yiwiz-sai commented Jun 10, 2022

pip3 freeze | grep lightgbm
lightgbm==3.3.2

import lightgbm as lgb
from sklearn.datasets import make_multilabel_classification
from sklearn.multioutput import MultiOutputClassifier

params = {
    'num_leaves': 30,
    'verbose': -1,
    'num_iterations': 10,
    'early_stopping_round': 3,
    'metric': ['auc', 'binary_logloss'],
}

clf = MultiOutputClassifier(lgb.LGBMClassifier(**params))
# x, y = make_classification(n_classes=2, n_samples=50)
x, y = make_multilabel_classification(n_features=5, n_samples=50, n_classes=5, n_labels=2)
# print(x)
# print(y)
# The fit call below fails with a ValueError.
clf.fit(x, y, eval_set=[(x, y)])
clf.predict_proba(x)

Running this raises the following error:

ValueError: y should be a 1d array, got an array of shape (50, 5) instead.

I couldn't find a related test in https://github.com/microsoft/LightGBM/blob/master/tests/python_package_test/test_sklearn.py

@jmoralez
Collaborator

Hi @yiwiz-sai, thank you for your interest in LightGBM. Indeed, this doesn't work. MultiOutputClassifier fits one model per column of y, but it passes the eval_set argument to every model as-is, so each model receives the full 2D y in its validation set. For this to work, each model would need its own validation set, something like eval_set=[(x, y[:, i])] for the model fitted on column i. You could fit one model per column manually, or use sklearn.ensemble.HistGradientBoostingClassifier, which is inspired by LightGBM and builds the validation sets for early stopping internally, so it works the way you want.

Please let us know if this helps.

@yiwiz-sai
Author

I see, thanks. Can LightGBM support this in the future?

@jmoralez
Collaborator

I think that feature request is already in #2302 as #3313.

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed.
To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues
including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 15, 2023