GH-16312 constrainted glm issues [nocheck] #16317
Conversation
Thanks @wendycwong . I suggest one improvement; otherwise, LGTM.
model = glm(**params)
model.train(x = predictors, y = response, training_frame = train_data)
print(model.coef())
print(glm.getConstraintsInfo(model))
You wrote: no need to check anything. But what about adding an assertion at the end that the model is not None? I suggest adding a try/except and raising a clear message instead of letting the test fail opaquely (in case of some problem).
If the model is None, then it would fail in model.train(...). What benefit would an assert have (especially if it would be at the end)? Or did you mean something like:
model = glm(**params)
+ assert model is not None, "The model is None"
model.train(x = predictors, y = response, training_frame = train_data)
print(model.coef())
print(glm.getConstraintsInfo(model))
?
If so, I don't understand the reason for adding the try/except clause.
(Sorry if it's obvious; I haven't properly started the review yet, so I might figure it out as I go.)
The advantage of failing with an assert message is that it explains why the test failed. If the NPE occurs and we know the potential cause, we can explain it in the message and save time in the future. We can also catch only the NPE and let other errors surface as usual.
I suggest improving the message from "The model is None" to something like "The model training failed with NPE. It could be caused by beta constraints setting."
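The pattern being proposed could be sketched roughly as follows. This is a minimal illustration, not the actual test code: `train_with_clear_message` and `broken_train` are hypothetical stand-ins for wrapping the real `model.train(...)` call, and the use of `AttributeError` as the Python-side symptom of the NPE is an assumption.

```python
# Minimal sketch of the suggested try/except pattern (assumption: the NPE
# surfaces on the Python side as an exception around model.train(...)).
def train_with_clear_message(train_fn):
    """Run a training callable; on failure, re-raise with a clearer message."""
    try:
        return train_fn()
    except AttributeError as err:  # stand-in for the NPE seen in the test
        raise AssertionError(
            "The model training failed with NPE. "
            "It could be caused by beta constraints setting."
        ) from err

def broken_train():
    # Hypothetical failure mode: calling .train() on a None model.
    raise AttributeError("'NoneType' object has no attribute 'train'")
```

The `raise ... from err` keeps the original traceback attached, so the diagnostic message is added without hiding the underlying error.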
Without my fix, the test fails; after my fix, the NPE is no longer there. I did add a check, though: since the beta constraints should always be satisfied, I assert at the end that they are.
@maurever is not comfortable with tests that lack some kind of assert statement, while @tomasfryda feels that an assert statement that is too general is not very meaningful.
Taking both of your suggestions, I added assert statements that check that the beta constraints are satisfied at the end of the model-building process.
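A check of that kind might look roughly like the sketch below. It is illustrative only: `coefs` and `bounds` are hypothetical stand-ins for the values a test would pull from `model.coef()` and the beta-constraints specification, and the function name is made up.

```python
# Hypothetical sketch: assert every constrained coefficient lies within
# its (lower, upper) bounds, with a small numerical tolerance.
def assert_beta_constraints_satisfied(coefs, bounds, tol=1e-10):
    """coefs:  dict mapping coefficient name -> fitted value
    bounds: dict mapping coefficient name -> (lower, upper)"""
    for name, (lower, upper) in bounds.items():
        value = coefs[name]
        assert lower - tol <= value <= upper + tol, (
            f"Beta constraint violated for {name}: "
            f"{value} not in [{lower}, {upper}]"
        )

# Example with made-up numbers:
assert_beta_constraints_satisfied(
    {"x1": 0.5, "x2": -0.2},
    {"x1": (0.0, 1.0), "x2": (-1.0, 0.0)},
)
```

An assertion like this is specific enough to satisfy both concerns: the test fails only when a constraint is actually violated, and the message names the offending coefficient and its bounds.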
Hi @wendycwong. Thanks for improving the test. I went through the PR again and found nothing else to fix. I see that the Jenkins test has not passed yet, but that may be due to infrastructure issues. Please check it; otherwise, LGTM.
* Continue to double check algo.
* Fix bug in gradient update.
* Implemented various versions of IRLSM.
* Found GLM original with gradient magnitude change best.
* GH-16312: fix wrong error raised by duplicated/conflicted constraints.
* Force beta constraint to be satisfied at the end if it is not.
* GH-16312: add assert check to test per Veronika's suggestion.
* GH-16312: fix tests after fixing constrained GLM bugs.
* GH-16312: fixed NPE error in checkCoeffsBounds.
* GH-16312: fix test failure.
* Remove conflicting constraint tests as we currently do not have the capability to handle them.
* Change dataset path from AWS to local.
This PR fixes problems in #16312
I fixed three issues here: