Support load_lora_weights in inference API deploy #131

Open
haktan-suren opened this issue Aug 21, 2024 · 0 comments

Comments

@haktan-suren

Currently there is no way to apply load_lora_weights when deploying a model for inference:

import sagemaker
from sagemaker.huggingface import HuggingFaceModel

role = sagemaker.get_execution_role()                      # IAM role with permissions to create an endpoint

hub = {
    'HF_MODEL_ID': 'black-forest-labs/FLUX.1-dev',         # model id from hf.co/models
    'HF_TASK': 'text-to-image',
    'HF_TOKEN': 'TOKEN'
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
   env=hub,                                                # configuration for loading model from Hub
   role=role,                                              # IAM role with permissions to create an endpoint
   transformers_version="4.26",                            # Transformers version used
   pytorch_version="1.13",                                 # PyTorch version used
   py_version='py39',                                      # Python version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
   initial_instance_count=1,
   instance_type="ml.g4dn.xlarge",
)

Maybe a new environment variable such as "HF_LORA_MODEL" could be added to the hub configuration.
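
For example, the hub configuration could then look like this (the HF_LORA_MODEL key and the LoRA repo name below are hypothetical and only illustrate the request):

hub = {
    'HF_MODEL_ID': 'black-forest-labs/FLUX.1-dev',
    'HF_TASK': 'text-to-image',
    'HF_LORA_MODEL': 'some-user/flux-lora',    # hypothetical: LoRA adapter repo on the Hub
    'HF_TOKEN': 'TOKEN'
}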

A similar implementation is already present here: aws-samples/sagemaker-stablediffusion-quick-kit@bd37fe9...2d1c43b
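
Until something like this is supported, a minimal sketch of a workaround is a custom inference script that reads a hypothetical HF_LORA_MODEL variable from env=hub and applies it with diffusers' load_lora_weights. The model_fn/predict_fn hooks below follow the SageMaker Hugging Face inference toolkit convention; the exact behavior depends on the DLC and diffusers versions:

# inference.py -- minimal sketch, assuming a diffusers pipeline and a
# hypothetical HF_LORA_MODEL environment variable passed via env=hub
import base64
import os
from io import BytesIO

import torch
from diffusers import DiffusionPipeline


def model_fn(model_dir):
    # load the base pipeline from the Hub (or from model_dir if already downloaded)
    pipe = DiffusionPipeline.from_pretrained(
        os.environ.get("HF_MODEL_ID", model_dir),
        torch_dtype=torch.float16,
        token=os.environ.get("HF_TOKEN"),
    )
    lora_model = os.environ.get("HF_LORA_MODEL")  # hypothetical env var
    if lora_model:
        # apply the LoRA adapter on top of the base weights
        pipe.load_lora_weights(lora_model)
    return pipe.to("cuda")


def predict_fn(data, pipe):
    # generate an image from the prompt and return it base64-encoded
    image = pipe(data["inputs"]).images[0]
    buffer = BytesIO()
    image.save(buffer, format="PNG")
    return {"generated_image": base64.b64encode(buffer.getvalue()).decode("utf-8")}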
