create endpoint with InferenceAmiVersion #602

CoderHam · 2024-11-20T16:45:05Z

This PR changes client.py, mainly the create_endpoint function. It updates the deployment path for both fine-tuned and pre-trained models, works around a validation parameter issue on Python 3.6, and adds error handling around model deletion.

Summary of Changes:

Import Updates: The import statements have been streamlined, removing redundant imports from the typing module.
create_endpoint Function:
- A new variable useBoto is introduced to determine the deployment method based on the presence of s3_models_dir.
- The function now includes a try-except block to handle potential errors when deleting the model.
- The logic for setting the role variable has been improved, considering the useBoto flag.
- The model deployment process has been updated to handle validation parameter issues in Python 3.6.
- For pre-trained models, the PR adds a new deployment method using boto to set the InferenceAmiVersion.
S3 URL Parsing: The PR replaces the parse_s3_url function with lazy_sagemaker().s3.parse_s3_url for parsing S3 URLs, ensuring consistency and compatibility.
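
The boto-based path for pre-trained models described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names (`should_use_boto`, `build_endpoint_config_request`, `deploy_with_boto`), the rule that an absent s3_models_dir selects the boto path, and the variant name are assumptions. Only the `create_endpoint_config` and `create_endpoint` calls and the `InferenceAmiVersion` field of a production variant come from the boto3 SageMaker API.

```python
from typing import Any, Dict, Optional


def should_use_boto(s3_models_dir: Optional[str]) -> bool:
    # Assumption: pre-trained deployments pass no s3_models_dir,
    # so an empty or missing value selects the boto path.
    return not s3_models_dir


def build_endpoint_config_request(
    endpoint_name: str,
    model_name: str,
    instance_type: str,
    n_instances: int,
    inference_ami_version: str,
) -> Dict[str, Any]:
    # Keyword arguments for the SageMaker client's create_endpoint_config.
    # InferenceAmiVersion is set per production variant.
    return {
        "EndpointConfigName": endpoint_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # hypothetical variant name
                "ModelName": model_name,
                "InstanceType": instance_type,
                "InitialInstanceCount": n_instances,
                "InferenceAmiVersion": inference_ami_version,
            }
        ],
    }


def deploy_with_boto(sm_client: Any, request: Dict[str, Any]) -> None:
    # sm_client is expected to behave like boto3.client("sagemaker").
    sm_client.create_endpoint_config(**request)
    sm_client.create_endpoint(
        EndpointName=request["EndpointConfigName"],
        EndpointConfigName=request["EndpointConfigName"],
    )
```

Passing the client in as an argument keeps the sketch independent of how the real client.py obtains its SageMaker client.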

Detailed Changes:

Import Updates:
- Removed: TokenLikelihood from the generation module.
- Removed: Tuple from the typing module.
create_endpoint Function:
- Added: useBoto variable to determine the deployment method.
- Added: try-except block to handle potential errors when deleting the model.
- Modified: The logic for setting the role variable now considers the useBoto flag.
- Modified: The model deployment process now handles validation parameter issues in Python 3.6.
- Added: A new deployment method for pre-trained models using boto to set the InferenceAmiVersion.
S3 URL Parsing:
- Replaced: parse_s3_url with lazy_sagemaker().s3.parse_s3_url for parsing S3 URLs.
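
For readers unfamiliar with the S3 URL parsing mentioned above, here is a standard-library stand-in that shows the expected behavior. It assumes that lazy_sagemaker().s3.parse_s3_url resolves to the SageMaker Python SDK's `sagemaker.s3.parse_s3_url`, which returns a (bucket, key prefix) pair and rejects non-s3 schemes. The function below is an illustration written for this note, not the SDK's code or the helper removed by this PR.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_s3_url(url: str) -> Tuple[str, str]:
    # Split "s3://bucket/key/prefix" into (bucket, key prefix).
    parsed = urlparse(url)
    if parsed.scheme != "s3":
        raise ValueError(f"Expecting 's3' scheme, got: {parsed.scheme!r} in {url}")
    # The bucket is the network location; the key drops the leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, key_prefix = parse_s3_url("s3://my-bucket/models/v1/")
```

Here `my-bucket` and `models/v1/` are placeholder values chosen for the example.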

hemant-co added 3 commits November 20, 2024 21:50

- f3fd5a8: x
- d57d515: pickup changes from cohere-ai/cohere-aws#196
- 76920c1: misc cleanup and import fixes

jpekmez approved these changes Nov 20, 2024


CoderHam marked this pull request as ready for review November 20, 2024 17:33

billytrend-cohere approved these changes Nov 21, 2024


CoderHam merged commit 756515a into main Nov 21, 2024
4 checks passed

CoderHam deleted the hemant-aws-sagemaker branch November 21, 2024 20:15

CoderHam commented Nov 20, 2024 • edited by cohere-pr-pal bot
