fix(csi): make csi use mr version from go.mod in ci #598

Al-Pragliola · 2024-11-27T18:11:26Z

Description

When building/testing CSI in the CI we are using "local" mr as a dependency instead of the version pinned in the go.mod file, this bring an inconsistency between what we are shipping as a docker image and what is actually in the codebase.

Side effect: This should fix all the problems that dependabot gives us when touching mr go.mod.

How Has This Been Tested?

no local testing is needed

Merge criteria:

All the commits have been signed-off (To pass the DCO check)

The commits have meaningful messages; the author will squash them after approval or in case of manual merges will ask to merge with squash.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that the changes work.
Code changes follow the kubeflow contribution guidelines.

Signed-off-by: Alessio Pragliola <[email protected]>

google-oss-prow · 2024-11-27T18:11:32Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ckadner for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Signed-off-by: Alessio Pragliola <[email protected]>

Al-Pragliola · 2024-11-27T18:15:17Z

/cc @tarilabs @lampajr

tarilabs

I understand this re-uses (*in github action) the standard docker-build target from the main makefile, and

/lgtm

thanks a lot @Al-Pragliola

lampajr

Hi @Al-Pragliola,

Overall I don't think this is going to work as expected for two reasons:

If we change .github/workflows/csi-build-and-push-image.yml we might be in a chicken-egg situation as we would like to release CSI together with model-registry but we don't have yet the tag released
If we change .github/workflows/csi-test.yml we might be not able to test csi with a dev version of model-registry, i.e., we might be not able to detect any breaking change in model-registry that would break csi

wdyt?

lampajr · 2024-11-27T18:26:04Z

.github/workflows/csi-test.yml

If we do this, we are making use of the model-registry version that has pinned in the go.mod as you said, but this makes impossible to test the current CSI with the dev (not yet released) version of model-registry.

Yeah, this is not the solution to the bigger problem of the relationship between CSI and MR and it's not aimed to be, for that we should consider what you guys proposed in the related issue #338

But to test CSI with the current unreleased version of MR we could make use of:

go get github.com/kubeflow/model-registry@<commit-hash>

so that when we are releasing a new version of both we take the latest commit hash and update the go.mod (if it's needed at all, because if there's no change to the rest client it should be compatible)

wdyt?

Al-Pragliola · 2024-11-27T18:29:45Z

I understand this re-uses (*in github action) the standard docker-build target from the main makefile, and

/lgtm

thanks a lot @Al-Pragliola

yep, with docker-build-dev we had

# Copy the Go Modules manifests
COPY csi/go.mod go.mod
COPY csi/go.sum go.sum
RUN echo "replace github.com/kubeflow/model-registry => ../" >> go.mod

overriding the version of model-registry from the csi/go.mod file

lampajr · 2024-11-27T18:39:37Z

Expanding on my previous comment, with docker-build-dev I had to do

# Copy the Go Modules manifests
COPY csi/go.mod go.mod
COPY csi/go.sum go.sum
RUN echo "replace github.com/kubeflow/model-registry => ../" >> go.mod

exactly because we would like to test the csi with the local/dev model-registry in order to be able to detect any breaking change earlier.

Consider the scenario where you change something, by mistake or on purpose, you would not be able to detect that if you use a pinned version of model registry.

On the other hand, we could try to change .github/workflows/csi-build-and-push-image.yml to make use of a pinned version of the model-registry but either:

We change/set that version during the workflow (as it is triggered on tag push) but how do we manage the main branch push?
We decide the keep csi and model registry NOT in sync and keep using a pinned version, in this case we should not release csi on tag push

Al-Pragliola · 2024-11-27T18:48:37Z

Thanks for you feedback @lampajr , what do you think about having another GHA that procs only when changes are made in the pkg folder and checks compatibility with CSI (and in the future the inferenceservice controller) using in this case the docker-build-dev target?

thinking out loud, we are only changing dependencies in the build phase, we still test against the current MR deployment when running kind so we should still be able to detect breaking changes

Al-Pragliola · 2024-11-27T19:40:10Z

Expanding on my previous comment, with docker-build-dev I had to do
# Copy the Go Modules manifests
COPY csi/go.mod go.mod
COPY csi/go.sum go.sum
RUN echo "replace github.com/kubeflow/model-registry => ../" >> go.mod
exactly because we would like to test the csi with the local/dev model-registry in order to be able to detect any breaking change earlier.

Consider the scenario where you change something, by mistake or on purpose, you would not be able to detect that if you use a pinned version of model registry.

On the other hand, we could try to change .github/workflows/csi-build-and-push-image.yml to make use of a pinned version of the model-registry but either:
* We change/set that version during the workflow (as it is triggered on tag push) but how do we manage the `main` branch push?

* We decide the keep csi and model registry **NOT** in sync and keep using a pinned version, in this case we should not release csi on tag push

Let's analyze this scenario:

breaking changes from MR and pinned version in go.mod

(from this PR) CI will build using the pinned version from go.mod so the build will succeed - then the e2e tests will fail because we are testing against the local built version of MR (and there's a breaking change)

pro:

we avoid false negatives from go.mod bumps in MR
docker image and the codebase are in sync

cons:

we rely on the robustness of our e2e tests to be sure that everything works
we may have to bump using commit hashes for dev releases of MR

(previous) CI will fail to build and stop here

pro:

ci will fail earlier
no need to bump the go.mod file on release

cons:

what we push to the image hub does not reflect what is in the codebase
there will be false negatives if we change the go.mod in MR

neither is a perfect solution, as mentioned above, we can improve the "fix" from this PR by adding a GHA that actually checks if something breaks when there's a change in the mr pkg folder

lampajr · 2024-11-28T11:24:15Z

Hi @Al-Pragliola,

I agree there is no "best" solution here and I see drawbacks on both approaches tbh.

Let me write down some comments

(from this PR) CI will build using the pinned version from go.mod so the build will succeed - then the e2e tests will fail because we are testing against the local built version of MR (and there's a breaking change)
pro:
we avoid false negatives from go.mod bumps in MR

I am not sure this is actually guaranteed, are you saying that because you had some evidence confirming this?

docker image and the codebase are in sync

I agree this would be clearly better, or alternatively we could clearly specify the CSI uses the locally referenced model-registry without ever pinning a version there

cons:
we rely on the robustness of our e2e tests to be sure that everything works
we may have to bump using commit hashes for dev releases of MR

Another cons I see is that model-registry and CSI won't be in sync anymore and a tagged version of model registry will contain a CSI that references an older version of model-registry (as the tag was not there when you are tagging MR) and this could create even further confusion imho.

Overall, I think that having the possibility to check incompatibility issues earlier is appreciated, so having a way to check the CSI with the local/dev version of model registry is really good.

TBH I think the ~best solution would be to find what is the best approach among those mentioned here #338

lampajr · 2024-11-28T11:27:28Z

(previous) CI will fail to build and stop here
pro:
ci will fail earlier
no need to bump the go.mod file on release

cons:
what we push to the image hub does not reflect what is in the codebase

I think we can remove this cons by clearly setting that CSI uses the locally referenced model-registry codebase, without pinning any version, I think this is actully possible (i.e., keeping "replace github.com/kubeflow/model-registry => ../" by default)

there will be false negatives if we change the go.mod in MR

I agree this is a problem, but I am not sure this is going to be fixed with this PR, happy to be wrong though

Al-Pragliola · 2024-11-28T12:10:06Z

I think we can remove this cons by clearly setting that CSI uses the locally referenced model-registry codebase, without pinning any version, I think this is actully possible (i.e., keeping "replace github.com/kubeflow/model-registry => ../" by default)

there will be false negatives if we change the go.mod in MR

I agree this is a problem, but I am not sure this is going to be fixed with this PR, happy to be wrong though

You are right, I did not provide any evidence 🙏🏼 I opened a PR for a recent example of a false negative, here you can see the previous behavior:

#548
https://github.com/kubeflow/model-registry/actions/runs/12032677420/job/33545058641?pr=548

and post-"fix":

#600

I agree with you that we should implement a solution from #338 , I am a fan of the sub command approach 👍🏼

lampajr · 2024-11-29T10:25:04Z

I think we can remove this cons by clearly setting that CSI uses the locally referenced model-registry codebase, without pinning any version, I think this is actully possible (i.e., keeping "replace github.com/kubeflow/model-registry => ../" by default)

there will be false negatives if we change the go.mod in MR

I agree this is a problem, but I am not sure this is going to be fixed with this PR, happy to be wrong though

You are right, I did not provide any evidence 🙏🏼 I opened a PR for a recent example of a false negative, here you can see the previous behavior:

#548 https://github.com/kubeflow/model-registry/actions/runs/12032677420/job/33545058641?pr=548

and post-"fix":

#600

Thanks for the examples 🙏🏻

I agree with you that we should implement a solution from #338 , I am a fan of the sub command approach 👍🏼

I am planning to propose something sooner, then I think we can discuss whether that approach is cleaner or not and if we want to proceed that way, wdyt?

Al-Pragliola · 2024-11-29T11:07:19Z

I am planning to propose something sooner, then I think we can discuss whether that approach is cleaner or not and if we want to proceed that way, wdyt?

Sure, that sounds good to me. Let's discuss it further once you've proposed the approach

Al-Pragliola · 2024-12-05T16:22:16Z

Closed in favor of #601

fix(csi): make csi use mr version from go.mod in ci

062ae3f

Signed-off-by: Alessio Pragliola <[email protected]>

google-oss-prow bot added the do-not-merge/work-in-progress label Nov 27, 2024

google-oss-prow bot requested review from andreyvelich and rareddy November 27, 2024 18:11

github-actions bot added the Area/GitHub label Nov 27, 2024

google-oss-prow bot added the size/XS label Nov 27, 2024

chore(csi): bump mr version in go.mod file

73ac6f1

Signed-off-by: Alessio Pragliola <[email protected]>

github-actions bot added the Area/CSI label Nov 27, 2024

google-oss-prow bot added size/S and removed size/XS labels Nov 27, 2024

Al-Pragliola marked this pull request as ready for review November 27, 2024 18:15

google-oss-prow bot removed the do-not-merge/work-in-progress label Nov 27, 2024

google-oss-prow bot requested review from ckadner and Tomcli November 27, 2024 18:15

google-oss-prow bot requested review from lampajr and tarilabs November 27, 2024 18:15

tarilabs reviewed Nov 27, 2024

View reviewed changes

google-oss-prow bot assigned tarilabs Nov 27, 2024

google-oss-prow bot added the lgtm label Nov 27, 2024

lampajr reviewed Nov 27, 2024

View reviewed changes

Al-Pragliola mentioned this pull request Nov 28, 2024

chore: bump logr dep to 1.4.2 #600

Closed

4 tasks

Al-Pragliola mentioned this pull request Dec 5, 2024

feat(csi): restructure the CSI as different mr executable #601

Merged

7 tasks

Al-Pragliola closed this Dec 5, 2024

Al-Pragliola deleted the al-pragliola-fix-make-csi-use-pinned-mr-version branch December 5, 2024 16:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(csi): make csi use mr version from go.mod in ci #598

fix(csi): make csi use mr version from go.mod in ci #598

Al-Pragliola commented Nov 27, 2024 •

edited

Loading

google-oss-prow bot commented Nov 27, 2024

Al-Pragliola commented Nov 27, 2024

tarilabs left a comment •

edited

Loading

lampajr left a comment •

edited

Loading

lampajr Nov 27, 2024

Al-Pragliola Nov 27, 2024

Al-Pragliola commented Nov 27, 2024

lampajr commented Nov 27, 2024

Al-Pragliola commented Nov 27, 2024 •

edited

Loading

Al-Pragliola commented Nov 27, 2024 •

edited

Loading

lampajr commented Nov 28, 2024

lampajr commented Nov 28, 2024

Al-Pragliola commented Nov 28, 2024 •

edited

Loading

lampajr commented Nov 29, 2024

Al-Pragliola commented Nov 29, 2024

Al-Pragliola commented Dec 5, 2024

fix(csi): make csi use mr version from go.mod in ci #598

fix(csi): make csi use mr version from go.mod in ci #598

Conversation

Al-Pragliola commented Nov 27, 2024 • edited Loading

Description

How Has This Been Tested?

Merge criteria:

google-oss-prow bot commented Nov 27, 2024

Al-Pragliola commented Nov 27, 2024

tarilabs left a comment • edited Loading

Choose a reason for hiding this comment

lampajr left a comment • edited Loading

Choose a reason for hiding this comment

lampajr Nov 27, 2024

Choose a reason for hiding this comment

Al-Pragliola Nov 27, 2024

Choose a reason for hiding this comment

Al-Pragliola commented Nov 27, 2024

lampajr commented Nov 27, 2024

Al-Pragliola commented Nov 27, 2024 • edited Loading

Al-Pragliola commented Nov 27, 2024 • edited Loading

lampajr commented Nov 28, 2024

lampajr commented Nov 28, 2024

Al-Pragliola commented Nov 28, 2024 • edited Loading

lampajr commented Nov 29, 2024

Al-Pragliola commented Nov 29, 2024

Al-Pragliola commented Dec 5, 2024

Al-Pragliola commented Nov 27, 2024 •

edited

Loading

tarilabs left a comment •

edited

Loading

lampajr left a comment •

edited

Loading

Al-Pragliola commented Nov 27, 2024 •

edited

Loading

Al-Pragliola commented Nov 27, 2024 •

edited

Loading

Al-Pragliola commented Nov 28, 2024 •

edited

Loading