-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
components: move params.env image updating to Init stage #1191
base: incubation
Are you sure you want to change the base?
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
i am fine with this change, but would like to confirm one requirement first. |
ec79c0f
to
d91c3e7
Compare
@bartoszmajsak may be you want to be aware of the proposal |
6245043
to
a78e709
Compare
a78e709
to
ea0ebda
Compare
@@ -311,6 +328,10 @@ func main() { //nolint:funlen,maintidx | |||
setupLog.Error(err, "unable to set up ready check") | |||
os.Exit(1) | |||
} | |||
if err := initComponents(platform); err != nil { | |||
setupLog.Error(err, "unable to init components") | |||
os.Exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should here be os.Exit(1)?
say any of the downstream component fails to set image, will cause operator crash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question. One one hand it makes problems obvious on startup. On it would make some of the system working even if dashboard tries to use wrong repo.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what i am afarid is
if compoent X fail to update, but user does not even want to enable component X but only Y and Z, should that make pod crash ?
in thoery , it should not even fail,
and in another theory if it was a glitch: the restart of pod , due to crash, can fix X in the next run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If fatal init is not needed, implicit init() will work. UPD: missed platform parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For non-fatal case I would like at least to log the error. Would it be ok to pass Context and use Bartosz' idea to wrap logger in it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that sounds better, we will need to show the error, only part is i dont know if we should make it fatal.
@lburgazzoli / @VaishnaviHire how you feel about this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it non-fatal, but on the per-Init level. Leaving possibility :)
components/component.go
Outdated
@@ -39,6 +39,10 @@ type Component struct { | |||
DevFlags *DevFlags `json:"devFlags,omitempty"` | |||
} | |||
|
|||
func (c *Component) Init(_ cluster.Platform) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe rename to InitImage
() if it is only for image not other extra params.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did hope it can be useful for other initializations as well. (but actually could no find any independent of reconcile loop)
"codeflare-operator-controller-image": "RELATED_IMAGE_ODH_CODEFLARE_OPERATOR_IMAGE", // no need mcad, embedded in cfo | ||
} | ||
|
||
if err := deploy.ApplyParams(ParamsPath, imageParamMap); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we also update docs on https://github.com/opendatahub-io/opendatahub-operator/blob/incubation/pkg/deploy/envParams.go#L67
just to be clear that it only update if it does not match otherwise it wont do anything
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, by reading the comments, i do not feel it reflects what the fuction is doing with recent chagnes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's anyway unrelated to this patchset, can come as fixup for the aforementioned patch.
components/codeflare/codeflare.go
Outdated
if err := deploy.ApplyParams(ParamsPath, imageParamMap, map[string]string{"namespace": dscispec.ApplicationsNamespace}); err != nil { | ||
return fmt.Errorf("failed update image from %s : %w", CodeflarePath+"/bases", err) | ||
} | ||
if err := deploy.ApplyParams(ParamsPath, map[string]string{}, map[string]string{"namespace": dscispec.ApplicationsNamespace}); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if err := deploy.ApplyParams(ParamsPath, map[string]string{}, map[string]string{"namespace": dscispec.ApplicationsNamespace}); err != nil { | |
if err := deploy.ApplyParams(ParamsPath, nil, map[string]string{"namespace": dscispec.ApplicationsNamespace}); err != nil { |
do not even need an empty map ? same for the ray one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, leftover. (It's a bit counter intuitive that access nil map does not crash)
FYI @csams |
b3d279f
to
4898857
Compare
/retest-required |
/retest required |
@ykaliuta: The
Use In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/retest-required |
3 similar comments
/retest-required |
/retest-required |
/retest-required |
4898857
to
59e68b2
Compare
/test opendatahub-operator-e2e |
2 similar comments
/test opendatahub-operator-e2e |
/test opendatahub-operator-e2e |
59e68b2
to
f46c9fc
Compare
It passes locally if I disable training operator which fails to fetch image from docker.io. |
/test opendatahub-operator-e2e |
2 similar comments
/test opendatahub-operator-e2e |
/test opendatahub-operator-e2e |
f46c9fc
to
199dc92
Compare
/test opendatahub-operator-e2e |
199dc92
to
f0302f4
Compare
Add Init() method to the Component interface and call it from main on startup. Will be used to move startup-time code from ReconcileComponent (like adjusting params.env). Signed-off-by: Yauheni Kaliuta <[email protected]>
Since it can be not desireable to return fatal error for some init problems, it is still useful to log them. Wrap the logger into Context like controller-runtime does [1][2]. [1] https://github.com/kubernetes-sigs/controller-runtime/blob/38546806f2faf5973e3321a7bd5bb3afdbb5767d/pkg/internal/controller/controller.go#L297 [2] https://github.com/kubernetes-sigs/controller-runtime/blob/38546806f2faf5973e3321a7bd5bb3afdbb5767d/pkg/internal/controller/controller.go#L111 Signed-off-by: Yauheni Kaliuta <[email protected]>
f0302f4
to
095d4e6
Compare
/retest-required |
Jira: https://issues.redhat.com/browse/RHOAIENG-11592 Image names in environment are not supposed to be changed during runtime of the operator, so it makes sense to update them only on startup. If manifests are overriden by DevFlags, the DevFlags' version will be used. The change is straight forward for most of the components where only images are updated and params.env is located in the kustomize root directory, but some components (dashboard, ray, codeflare, modelregistry) also update some extra parameters. For them image part only is moved to Init since other updates require runtime DSCI information. The patch also changes logic for ray, codeflare, and modelregistry in this regard to update non-image parameters regardless of DevFlags like it was changed in dashboard recently. The DevFlags functionality brings some concerns: - For most components the code is written such a way that as soon as DevFlags supplied the global path variables are changed and never reverted back to the defaults. For some (dashboard, trustyai) there is (still global) OverridePath/entryPath pair and manifests reverted to the default, BUT there is no logic of transition. - codeflare: when manifests are overridden namespace param is updated in the hardcoded (stock) path; This logic is preserved. Signed-off-by: Yauheni Kaliuta <[email protected]>
04d8aad
to
15f9b9c
Compare
read logs more carefully and found my mistake |
func (c *Component) Init(_ context.Context, _ cluster.Platform) error { | ||
return nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why need this? you foresee some component might not have implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly put just in case. Can remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought was left from the first commit, when only one component implemented with Init(..) at that time.
Jira: https://issues.redhat.com/browse/RHOAIENG-11592
Image names in environment are not supposed to be changed during
runtime of the operator, so it makes sense to update them only on
startup.
If manifests are overriden by DevFlags, the DevFlags' version will
be used.
The change is straight forward for most of the components where only
images are updated and params.env is located in the kustomize root
directory, but some components (dashboard, ray, codeflare,
modelregistry) also update some extra parameters. For them image
part only is moved to Init since other updates require runtime DSCI
information.
The patch also changes logic for ray, codeflare, and modelregistry
in this regard to update non-image parameters regardless of DevFlags
like it was changed in dashboard recently.
The DevFlags functionality brings some concerns:
For most components the code is written such a way that as soon as
DevFlags supplied the global path variables are changed and never
reverted back to the defaults. For some (dashboard, trustyai) there
is (still global) OverridePath/entryPath pair and manifests reverted
to the default, BUT there is no logic of transition.
codeflare: when manifests are overridden namespace param is
updated in the hardcoded (stock) path;
This logic is preserved.
Description
How Has This Been Tested?
Screenshot or short clip
Merge criteria