Feature: Support multiple inference.py files and universal inference.… #228
Conversation
@pytest.mark.skip_gpu
def test_specific_versions():
    MODEL_NAME = MODEL_NAMES[0]
Why do we test model0 only?
Model0 is half_plus_three and Model1 is half_plus_two. We test only model0 here because half_plus_three has two versions available in the repo, while I found only one version of half_plus_two in the repo.
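For context, a version-pinned request against TensorFlow Serving's REST API could look roughly like the sketch below; the helper name, port, and payload are assumptions for illustration, not code from this PR:

import json

import requests

MODEL_NAMES = ["half_plus_three", "half_plus_two"]

def predict_version(model_name, version, instances):
    # TensorFlow Serving exposes /v1/models/<name>/versions/<version>:predict;
    # 8501 is TFS's default REST port and may differ inside this container.
    url = f"http://localhost:8501/v1/models/{model_name}/versions/{version}:predict"
    response = requests.post(url, data=json.dumps({"instances": instances}))
    response.raise_for_status()
    return response.json()["predictions"]

# Only half_plus_three ships two versions in the test resources, so it is
# the only model a version-pinned test can exercise here.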
def _get_number_of_gpu_on_host(self):
    nvidia_smi_exist = os.path.exists("/usr/bin/nvidia-smi")
    if nvidia_smi_exist:
        return len(subprocess.check_output(['nvidia-smi', '-L'])
                   .decode('utf-8').strip().split('\n'))

    return 0
These are not relevant. Could you try to pull from the latest commit?
I have removed them in the latest commit.
…py file along with universal requirements.txt file
…py file along with universal requirements.txt file
…hanub/sagemaker-tensorflow-serving-container into multimodel_endpoints_support
README.md (Outdated)
2. If you are working in a network-isolated environment, or if you don't want to install dependencies at runtime every time your Endpoint starts or a Batch Transform job runs, you may want to put pre-downloaded dependencies under the `code/lib` directory in your model archive; the container will then add those modules to the Python path. Note that if both `code/lib` and `code/requirements.txt` are present in the model archive, `requirements.txt` will be ignored.

Your untarred model directory structure may look like this if you are using `requirements.txt`:

model1
/opt/ml/models/model1/model
Why does this need the extra `model` subdirectory?
I think that is the directory structure that is expected when we create the endpoint. We might need to confirm this directory structure with the hosting team.
I think this is what the directory structure will look like on the platform once the files are downloaded to disk. Better not to confuse users, since this section refers to the directory structure of the archive.
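As an aside on the `code/lib` mechanism discussed above, its effect is roughly the sketch below; the exact path and the moment the container prepends it are assumptions for illustration:

import sys

# The container adds the archive's code/lib directory to the Python path
# (single-model path shown; assumed location, done by the container itself).
sys.path.insert(0, "/opt/ml/model/code/lib")

# A package vendored under code/lib, such as the external_module shown in
# the structure below, then imports normally from inference.py.
import external_module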
@@ -687,7 +687,20 @@ Multi-Model Endpoint can be used together with Pre/Post-Processing. Each model w
|--lib
|--external_module
|--inference.py
How do we provide a model-specific inference.py via the SM SDK for MME? Can you add a notebook to SM examples?
I can create a notebook for SM examples that demonstrates the usage of model-specific inference.py files.
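For reference, deploying such models through the SageMaker Python SDK could look roughly like this sketch, using the SDK's MultiDataModel class; the bucket, image URI, role, instance type, and payload are placeholders, and each model.tar.gz under the prefix can carry its own code/inference.py:

from sagemaker.multidatamodel import MultiDataModel

mdm = MultiDataModel(
    name="tfs-mme",                               # hypothetical model name
    model_data_prefix="s3://my-bucket/models/",   # prefix holding model1.tar.gz, model2.tar.gz, ...
    image_uri="<tfs-serving-image-uri>",          # placeholder
    role="<execution-role-arn>",                  # placeholder
)
predictor = mdm.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# Route a request to a specific archive under the prefix:
data = {"instances": [1.0, 2.0]}                  # hypothetical payload
predictor.predict(data, target_model="model1.tar.gz")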
+1
Updated the directory structure
@@ -26,8 +27,8 @@
 import tfs_utils

 SAGEMAKER_MULTI_MODEL_ENABLED = os.environ.get("SAGEMAKER_MULTI_MODEL", "false").lower() == "true"
-MODEL_DIR = "models" if SAGEMAKER_MULTI_MODEL_ENABLED else "model"
-INFERENCE_SCRIPT_PATH = f"/opt/ml/{MODEL_DIR}/code/inference.py"
+MODEL_DIR = "" if SAGEMAKER_MULTI_MODEL_ENABLED else "model/"
Why do the directory structures need to change?
The problem with this path is that `/opt/ml/models` is a read-only file system for multi-model endpoints. Assuming the structure shown below in an S3 bucket, the universal `inference.py` can't be downloaded to `/opt/ml/models/code/inference.py`. This problem has been highlighted in other GitHub issues (#212 and #211) as well, and a PR (#215) was created to solve it but was not merged. We can use the path `/opt/ml/code/inference.py` for the universal `inference.py` file in the case of a multi-model endpoint.
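Concretely, combining the diff above with this explanation, the new path logic would resolve as in the minimal sketch below; the `INFERENCE_SCRIPT_PATH` line is an assumption, since the diff only shows the new `MODEL_DIR`:

import os

SAGEMAKER_MULTI_MODEL_ENABLED = os.environ.get("SAGEMAKER_MULTI_MODEL", "false").lower() == "true"

# Multi-model: /opt/ml/models is read-only, so the universal script lives at
# /opt/ml/code/inference.py. Single-model: /opt/ml/model/code/inference.py.
MODEL_DIR = "" if SAGEMAKER_MULTI_MODEL_ENABLED else "model/"
INFERENCE_SCRIPT_PATH = f"/opt/ml/{MODEL_DIR}code/inference.py"  # assumed counterpart change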
README.md (Outdated)
/opt/ml/models/model1/model
Same as above, remove `/opt/ml/models`.
Updated
Design doc: https://quip-amazon.com/biizAu4KYIuP/Multi-Model-Endpoint-Model-Specific-Inference-Files
Issue #, if available: https://t.corp.amazon.com/D30094580
Description of changes:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.