
feat(py): Add support for storing models in S3 - [DRAFT] #765

Draft · wants to merge 3 commits into base: main

Conversation

@syntaxsdev commented Feb 5, 2025

Within the Python client, users will be able to store models directly in S3-compatible object storage.
[DRAFT]

Description

The bulk of the changes were done in clients/python/src/_client.py

How Has This Been Tested?

Merge criteria:

  • All the commits have been signed-off (To pass the DCO check)
  • The commits have meaningful messages; the author will squash them after approval or, in the case of a manual merge, will ask to merge with squash.
  • Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
  • The developer has manually tested the changes and verified that the changes work.
  • Code changes follow the kubeflow contribution guidelines.

If you have UI changes

  • The developer has added tests or explained why testing cannot be added.
  • Necessary screenshots or GIFs have been included for UI changes.
  • UI/UX changes conform to the UX guidelines for Kubeflow.


[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign ckadner for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Member

@tarilabs tarilabs left a comment


thank you @syntaxsdev for this!

Some initial comments below.
Which type of test can we consider to make sure the functionality is covered?

I'm thinking we could have some dedicated e2e test by extending the current opt-in pytest mechanism and deploy minio in that "scenario" of e2e testing. Do you have some additional ideas?
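One possible shape for the opt-in e2e scenario suggested above, assuming a MinIO instance deployed for that pytest "scenario": the marker name, environment variables, and fixture below are assumptions for illustration, not existing project conventions.

```python
import os

import pytest


def minio_config() -> dict:
    """Connection settings for the e2e MinIO deployment, with local defaults."""
    return {
        "endpoint_url": os.environ.get("MINIO_ENDPOINT", "http://localhost:9000"),
        "access_key_id": os.environ.get("MINIO_ACCESS_KEY", "minioadmin"),
        "secret_access_key": os.environ.get("MINIO_SECRET_KEY", "minioadmin"),
    }


@pytest.fixture
def s3_client():
    """boto3 client pointed at MinIO; skips cleanly when boto3 is unavailable."""
    boto3 = pytest.importorskip("boto3")
    cfg = minio_config()
    return boto3.client(
        "s3",
        endpoint_url=cfg["endpoint_url"],
        aws_access_key_id=cfg["access_key_id"],
        aws_secret_access_key=cfg["secret_access_key"],
    )


@pytest.mark.e2e  # opt-in: run with `pytest -m e2e` once MinIO is up
def test_save_to_s3_roundtrip(s3_client, tmp_path):
    f = tmp_path / "model.bin"
    f.write_bytes(b"weights")
    s3_client.create_bucket(Bucket="models")
    s3_client.upload_file(str(f), "models", "model.bin")
    objs = s3_client.list_objects_v2(Bucket="models")["Contents"]
    assert "model.bin" in [o["Key"] for o in objs]
```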

clients/python/src/model_registry/_client.py (outdated)
        secret_access_key=secret_access_key,
    )
    try:
        s3.upload_file(file, bucket_name, name)
Member

see above comment about "file" vs "path" (don't recall if there is any native boto3 API to do this for a folder)

Author

I'm not sure what you meant by this, but I have renamed the parameter to path.
If you give it a relative location, it will resolve the file.

Member

it was in relation to whether it uploads recursively or needs explicit orchestration

Author

It does not. See here: #765 (comment)


    def save_to_s3(
        self,
        file: str,
Member

I presume if this is a path, it uploads recursively the path contents.
Can we confirm this, and describe it also in the pydoc?

@syntaxsdev commented Feb 6, 2025

No, it only uploads a singular file due to how upload_file works. Are you suggesting writing a wrapper to achieve recursive path uploads?

Member

see in some of the tutorials how we show usage of S3 for multiple files in a bucket, wdyt?

Author

what exactly are you referring to?
if you are referring to this then that's not exactly what I was talking about.

Afaik, boto3 S3 does not have a multiple-upload definition or allow recursive uploads; we'd have to build that.

that's not a problem - the issue is whether we want that built into this method; if so, see #765 (comment)
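To make the thread concrete: since `upload_file` takes exactly one object, a recursive directory upload would have to be built on top of it, e.g. by walking the tree and deriving object keys from relative paths. A sketch under that assumption (function and parameter names are illustrative, not the PR's API):

```python
from pathlib import Path


def iter_upload_pairs(root: str, prefix: str = ""):
    """Yield (local_file, object_key) for every file under `root`,
    preserving the directory layout in the key."""
    base = Path(root)
    for p in sorted(base.rglob("*")):
        if p.is_file():
            yield str(p), f"{prefix}{p.relative_to(base).as_posix()}"


def upload_dir(s3, bucket_name: str, root: str, prefix: str = "") -> int:
    """Upload every file under `root` via repeated upload_file calls;
    returns the number of objects uploaded."""
    count = 0
    for local, key in iter_upload_pairs(root, prefix):
        s3.upload_file(local, bucket_name, key)
        count += 1
    return count
```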

Member

what exactly are you referring to?

tutorials (of ODH, but also other projects) that show how to persist multiple files in the identified bucket; sorry if I was not clear

os.remove(model_file.name)


@pytest.fixture
Author

no scope added here because monkeypatch is function-scoped, and the default scope is function, so it's omitted
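For context on the scope remark: pytest's built-in `monkeypatch` fixture is function-scoped, so any fixture depending on it must keep the default function scope. A hypothetical illustration (the fixture and env-var values are made up, not this PR's tests):

```python
import os

import pytest


@pytest.fixture  # scope="function" is the default, so it is omitted
def s3_env(monkeypatch):
    """Fake S3 credentials for exactly one test; monkeypatch undoes the
    setenv calls automatically when the test finishes."""
    monkeypatch.setenv("AWS_ACCESS_KEY_ID", "test-key")
    monkeypatch.setenv("AWS_SECRET_ACCESS_KEY", "test-secret")


def test_uses_fake_creds(s3_env):
    assert os.environ["AWS_ACCESS_KEY_ID"] == "test-key"
```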

@@ -623,3 +629,44 @@ def test_hf_import_default_env(client: ModelRegistry):

for k in env_values:
os.environ.pop(k)


@pytest.mark.dd
Author

ignore for now, will change
