Support load_lora_weights in inference API deploy #131

Open
haktan-suren opened this issue Aug 21, 2024 · 0 comments

Comments

@haktan-suren

Currently there is no way to apply load_lora_weights when deploying a model for inference:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()                      # IAM role with permissions to create an endpoint

hub = {
    'HF_MODEL_ID': 'black-forest-labs/FLUX.1-dev',         # model id from hf.co/models
    'HF_TASK': 'text-to-image',
    'HF_TOKEN': 'TOKEN'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,                                                # configuration for loading model from Hub
   role=role,                                              # IAM role with permissions to create an endpoint
   transformers_version="4.26",                            # Transformers version used
   pytorch_version="1.13",                                 # PyTorch version used
   py_version='py39',                                      # Python version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.g4dn.xlarge",
)

Maybe a new environment variable such as "HF_LORA_MODEL" could be added to the hub configuration.
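
For example, the hub configuration could then look like this (the HF_LORA_MODEL key and the LoRA repo name below are hypothetical and only illustrate the request):

hub = {
    'HF_MODEL_ID': 'black-forest-labs/FLUX.1-dev',
    'HF_TASK': 'text-to-image',
    'HF_LORA_MODEL': 'some-user/flux-lora',    # hypothetical: LoRA adapter repo on the Hub
    'HF_TOKEN': 'TOKEN'
}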

A similar implementation is already present here: aws-samples/sagemaker-stablediffusion-quick-kit@bd37fe9...2d1c43b
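
Until something like this is supported, a minimal sketch of a workaround is a custom inference script that reads a hypothetical HF_LORA_MODEL variable from env=hub and applies it with diffusers' load_lora_weights. The model_fn/predict_fn hooks below follow the SageMaker Hugging Face inference toolkit convention; the exact behavior depends on the DLC and diffusers versions:

# inference.py -- minimal sketch, assuming a diffusers pipeline and a
# hypothetical HF_LORA_MODEL environment variable passed via env=hub
import base64
import os
from io import BytesIO

import torch
from diffusers import DiffusionPipeline


def model_fn(model_dir):
    # load the base pipeline from the Hub (or from model_dir if already downloaded)
    pipe = DiffusionPipeline.from_pretrained(
        os.environ.get("HF_MODEL_ID", model_dir),
        torch_dtype=torch.float16,
        token=os.environ.get("HF_TOKEN"),
    )
    lora_model = os.environ.get("HF_LORA_MODEL")  # hypothetical env var
    if lora_model:
        # apply the LoRA adapter on top of the base weights
        pipe.load_lora_weights(lora_model)
    return pipe.to("cuda")


def predict_fn(data, pipe):
    # generate an image from the prompt and return it base64-encoded
    image = pipe(data["inputs"]).images[0]
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    return {"generated_image": base64.b64encode(buffer.getvalue()).decode("utf-8")}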
