
GH-16312 constrainted glm issues [nocheck] #16317

Merged
14 commits merged into master on Aug 9, 2024

Conversation

wendycwong
Contributor

This PR fixes problems in #16312

I fixed three issues here:

  1. Constrained GLM finished before the constraint conditions were satisfied. There was a bug in the gradient check, which I fixed. The constraints may still not be fully satisfied at the end of the iterations, due to either a line-search failure or a lack of progress. However, if you compare the results with and without constraints, the coefficients from the constrained model come closer to satisfying the constraints than those from the unconstrained model.
  2. When beta constraints were specified with only lower bounds, an NPE occurred. I fixed this by checking that the beta constraint is not null before accessing its elements.
  3. A set of three linear constraints was flagged as duplicated or conflicting when it was neither. The error came from dropping the last column/row, which is supposed to hold the constant term, even though the constant was never included when the constraints matrix was generated. I fixed this as well.
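As a hypothetical illustration of the third issue (a sketch, not H2O's actual code): when duplicate/conflict detection compares constraint rows, the rows must consistently exclude the constant term, otherwise off-by-one slicing can make distinct constraints look identical. A pairwise comparison over coefficient rows without the constant column correctly reports the three constraints below as neither duplicated nor conflicting.

```python
def find_constraint_issues(rows, tol=1e-12):
    """Compare linear-constraint coefficient rows pairwise.

    Each row holds only the variable coefficients of one constraint;
    the constant term is deliberately NOT included, mirroring how the
    constraints matrix was generated.  Returns (i, j, kind) tuples for
    rows that are identical ("duplicated") or exact negations
    ("conflicting").  Illustrative only -- not H2O's implementation.
    """
    def close(a, b):
        return all(abs(x - y) <= tol for x, y in zip(a, b))

    issues = []
    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if close(rows[i], rows[j]):
                issues.append((i, j, "duplicated"))
            elif close(rows[i], [-v for v in rows[j]]):
                issues.append((i, j, "conflicting"))
    return issues

# Three constraints on (b1, b2, b3): b1+b2<=1, b1-b2<=1, b2+b3<=1.
constraints = [[1.0, 1.0, 0.0],
               [1.0, -1.0, 0.0],
               [0.0, 1.0, 1.0]]
print(find_constraint_issues(constraints))  # -> [] (no false positives)
```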

Contributor

@maurever maurever left a comment


Thanks @wendycwong. I suggest one improvement; otherwise, LGTM.

model = glm(**params)
model.train(x = predictors, y = response, training_frame = train_data)
print(model.coef())
print(glm.getConstraintsInfo(model))
Contributor


You wrote that there is no need to check anything, but what about adding an assertion at the end that the model is not None? I suggest adding a try-catch that throws a clear message instead of letting the test fail (in case of some problem).

Contributor


If the model were None, it would fail in model.train(...). What benefit would an assert have, especially at the end? Or did you mean something like:

    model = glm(**params)
+   assert model is not None, "The model is None"
    model.train(x = predictors, y = response, training_frame = train_data)
    print(model.coef())
    print(glm.getConstraintsInfo(model))

?

If so, I don't understand the reason for adding the try/except clause.

(Sorry if this is obvious; I haven't properly started the review yet, so I might still work it out.)

Contributor


The advantage of failing with an assert message is that it explains why the test failed. If the NPE occurs and we know the potential problem, we can explain it in the message and save time in the future. We can also catch only the NPE and let other errors surface as usual.

I suggest improving the message from "The model is None" to something like "The model training failed with NPE. It could be caused by beta constraints setting."
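A minimal sketch of this suggestion (hypothetical helper name; a Python TypeError stands in for the Java-side NPE surfaced to the client): catch only the expected error, re-raise it with the diagnostic message, and let everything else propagate as usual.

```python
def train_with_diagnostics(train_fn):
    """Run a training callable; convert only the expected failure into
    an AssertionError with an explanatory message.  Other exceptions
    propagate unchanged.  Hypothetical sketch, not an H2O API."""
    try:
        return train_fn()
    except TypeError as err:  # stand-in for the NPE seen from Java
        raise AssertionError(
            "The model training failed with NPE. It could be caused by "
            "the beta constraints setting.") from err

# usage sketch (assumed names):
#   model = train_with_diagnostics(
#       lambda: model.train(x=predictors, y=response,
#                           training_frame=train_data))
```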

Contributor Author


Without my fix, the test fails; after my fix, the NPE is no longer there. I also added a test: since the beta constraints should always be satisfied, I check that they are.

Contributor Author


@maurever is not comfortable with tests that lack some kind of assert statement, while @tomasfryda feels that an overly general assert statement is not very meaningful.

Taking both suggestions, I added assert statements that check that the beta constraints are satisfied at the end of the model-building process.
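Such an assertion could look like the following sketch (coefficient and bound names are illustrative, not the actual test helpers): every coefficient must lie within its [lower, upper] bound, where either bound may be absent.

```python
def check_beta_bounds(coefs, bounds, tol=1e-6):
    """Assert each coefficient respects its beta constraint.

    coefs:  dict mapping coefficient name -> fitted value.
    bounds: dict mapping coefficient name -> (lower, upper); either
            bound may be None for 'unbounded'.  Names and shapes are
            illustrative, not the actual H2O test helpers.
    """
    for name, value in coefs.items():
        lower, upper = bounds.get(name, (None, None))
        if lower is not None:
            assert value >= lower - tol, (
                f"{name}={value} violates lower bound {lower}")
        if upper is not None:
            assert value <= upper + tol, (
                f"{name}={value} violates upper bound {upper}")

# passes: both coefficients sit inside their bounds
check_beta_bounds({"x1": 0.5, "x2": -0.2},
                  {"x1": (0.0, 1.0), "x2": (-1.0, None)})
```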

@wendycwong wendycwong requested a review from maurever July 18, 2024 21:43
maurever
maurever previously approved these changes Jul 31, 2024
Contributor

@maurever maurever left a comment


Hi @wendycwong. Thanks for improving the test. I went through the PR again and found nothing else to fix. I just saw that the Jenkins test has not passed, but that may be due to infrastructure issues. Please check; otherwise, LGTM.

h2o-algos/src/test/java/hex/glm/GLMConstrainedTest.java (outdated, resolved)
@wendycwong wendycwong changed the title from "GH-16312 constrainted glm issues" to "GH-16312 constrainted glm issues [nocheck]" Jul 31, 2024
maurever
maurever previously approved these changes Aug 1, 2024
@wendycwong wendycwong merged commit 17467da into master Aug 9, 2024
67 of 69 checks passed
@wendycwong wendycwong deleted the wendy_gh_16312_constrainted_GLM_issues branch August 9, 2024 17:23
wendycwong added a commit that referenced this pull request Sep 3, 2024
* Continue to double check algo.

* fix bug in gradient update.

* implemented various version of IRLSM

* Found GLM original with gradient magnitude change best

* GH-16312: fix wrong error raised by duplicated/conflicted constraints.

* force beta constraint to be satisfied at the end if it is not.

* GH-16312: add assert check to test per Veronika suggestion.

* GH-16312: fix tests after fixing constrained GLM bugs.

* GH-16312: fixed NPE error in checkCoeffsBounds

* GH-16312: fix test failure.

* remove conflicting constraint tests as we currently do not have the capability to do so right now.
* change dataset path from AWS to local
wendycwong added a commit that referenced this pull request Sep 11, 2024