Feature: Support multiple inference.py files and universal inference.… #228
Conversation
@pytest.mark.skip_gpu
def test_specific_versions():
    MODEL_NAME = MODEL_NAMES[0]
Why do we test model0 only?
Model0 is half_plus_three and Model1 is half_plus_two. We test only model0 here because half_plus_three has two versions available in the repo, while I found only one version of half_plus_two in the repo.
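For context, a version-pinned request against TensorFlow Serving's REST API could look roughly like the sketch below; the helper name, port, and payload are assumptions for illustration, not code from this PR:

import json

import requests

MODEL_NAMES = ["half_plus_three", "half_plus_two"]

def predict_version(model_name, version, instances):
    # TensorFlow Serving exposes /v1/models/<name>/versions/<version>:predict;
    # 8501 is TFS's default REST port and may differ inside this container.
    url = f"http://localhost:8501/v1/models/{model_name}/versions/{version}:predict"
    response = requests.post(url, data=json.dumps({"instances": instances}))
    response.raise_for_status()
    return response.json()["predictions"]

# Only half_plus_three ships two versions in the test resources, so it is
# the only model a version-pinned test can exercise here.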
def _get_number_of_gpu_on_host(self):
    nvidia_smi_exist = os.path.exists("/usr/bin/nvidia-smi")
    if nvidia_smi_exist:
        return len(subprocess.check_output(['nvidia-smi', '-L'])
                   .decode('utf-8').strip().split('\n'))

    return 0
These are not relevant. Could you try to pull from the latest commit?
I have removed them in the latest commit.
…py file along with universal requirements.txt file
…py file along with universal requirements.txt file
…hanub/sagemaker-tensorflow-serving-container into multimodel_endpoints_support
README.md (Outdated)
2. If you are working in a network-isolated environment, or if you don't want to install dependencies at runtime every time your Endpoint starts or a Batch Transform job runs, you may want to put pre-downloaded dependencies under the `code/lib` directory in your model archive; the container will then add those modules to the Python path. Note that if both `code/lib` and `code/requirements.txt` are present in the model archive, `requirements.txt` will be ignored.

Your untarred model directory structure may look like this if you are using `requirements.txt`:

model1
/opt/ml/models/model1/model
Why does this need the extra `model` subdirectory?
I think that is the directory structure that is expected when we create the endpoint. We might need to confirm this directory structure with the hosting team.
I think this is what the directory structure will look like on the platform once the files are downloaded to disk. Better not to confuse users, since this section refers to the directory structure of the archive.
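As an aside on the `code/lib` mechanism discussed above, its effect is roughly the sketch below; the exact path and the moment the container prepends it are assumptions for illustration:

import sys

# The container adds the archive's code/lib directory to the Python path
# (single-model path shown; assumed location, done by the container itself).
sys.path.insert(0, "/opt/ml/model/code/lib")

# A package vendored under code/lib, such as the external_module shown in
# the structure below, then imports normally from inference.py.
import external_module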
@@ -687,7 +687,20 @@ Multi-Model Endpoint can be used together with Pre/Post-Processing. Each model w
|--lib
|--external_module
|--inference.py
How do we provide a model-specific inference.py via the SM SDK for MME? Can you add a notebook to SM examples?
I can create a notebook for SM examples that demonstrates the usage of model-specific inference.py files.
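For reference, deploying such models through the SageMaker Python SDK could look roughly like this sketch, using the SDK's MultiDataModel class; the bucket, image URI, role, instance type, and payload are placeholders, and each model.tar.gz under the prefix can carry its own code/inference.py:

from sagemaker.multidatamodel import MultiDataModel

mdm = MultiDataModel(
    name="tfs-mme",                               # hypothetical model name
    model_data_prefix="s3://my-bucket/models/",   # prefix holding model1.tar.gz, model2.tar.gz, ...
    image_uri="<tfs-serving-image-uri>",          # placeholder
    role="<execution-role-arn>",                  # placeholder
)
predictor = mdm.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# Route a request to a specific archive under the prefix:
data = {"instances": [1.0, 2.0]}                  # hypothetical payload
predictor.predict(data, target_model="model1.tar.gz")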
+1
Updated the directory structure
@@ -26,8 +27,8 @@
 import tfs_utils

 SAGEMAKER_MULTI_MODEL_ENABLED = os.environ.get("SAGEMAKER_MULTI_MODEL", "false").lower() == "true"
-MODEL_DIR = "models" if SAGEMAKER_MULTI_MODEL_ENABLED else "model"
-INFERENCE_SCRIPT_PATH = f"/opt/ml/{MODEL_DIR}/code/inference.py"
+MODEL_DIR = "" if SAGEMAKER_MULTI_MODEL_ENABLED else "model/"
Why do the directory structures need to change?
The problem with this path is that `/opt/ml/models` is a read-only file system for multi-model endpoints. Assuming the structure shown below in an S3 bucket, the universal `inference.py` can't be downloaded to `/opt/ml/models/code/inference.py`. This problem has been highlighted in other GitHub issues (#212 and #211) as well, and a PR (#215) was created to solve it but was not merged. We can use the path `/opt/ml/code/inference.py` for the universal `inference.py` file in the case of a multi-model endpoint.
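Concretely, combining the diff above with this explanation, the new path logic would resolve as in the minimal sketch below; the `INFERENCE_SCRIPT_PATH` line is an assumption, since the diff only shows the new `MODEL_DIR`:

import os

SAGEMAKER_MULTI_MODEL_ENABLED = os.environ.get("SAGEMAKER_MULTI_MODEL", "false").lower() == "true"

# Multi-model: /opt/ml/models is read-only, so the universal script lives at
# /opt/ml/code/inference.py. Single-model: /opt/ml/model/code/inference.py.
MODEL_DIR = "" if SAGEMAKER_MULTI_MODEL_ENABLED else "model/"
INFERENCE_SCRIPT_PATH = f"/opt/ml/{MODEL_DIR}code/inference.py"  # assumed counterpart change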
README.md (Outdated)
/opt/ml/models/model1/model
Same as above, remove `/opt/ml/models`.
Updated
Design doc: https://quip-amazon.com/biizAu4KYIuP/Multi-Model-Endpoint-Model-Specific-Inference-Files
Issue #, if available: https://t.corp.amazon.com/D30094580
Description of changes:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.