Add pytorch/inference/{cpu,gpu}/2.3.1/transformers/4.48.0/py311
#135
Description
This PR bumps the version of `huggingface-inference-toolkit` to 0.5.4 to release the latest PyTorch DLC for Inference, which comes with bumped versions of `transformers`, `diffusers`, `huggingface_hub` and `accelerate`.

The dependency bump for `transformers` mainly introduces new architectures such as ModernBERT, ColPali and Falcon 3, as well as several fixes and improvements overall. Also, `diffusers` comes with new `text-to-image` pipelines for SANA and Flux Control.

Read more about the latest releases for each dependency in their respective release notes:

- `transformers` at https://github.com/huggingface/transformers/releases/tag/v4.48.0
- `diffusers` at https://github.com/huggingface/diffusers/releases/tag/v0.32.2
- `accelerate` at https://github.com/huggingface/accelerate/releases/tag/v1.2.1
- `huggingface_hub` at https://github.com/huggingface/huggingface_hub/releases/tag/v0.27.0

Additionally, this PR updates the `entrypoint.sh` to make it more robust and consistent in formatting, while also adding the `requirements.txt` installation when `HF_MODEL_DIR` is set. Previously, when running the PyTorch Inference DLC on GKE with custom code mounted via a volume, the `requirements.txt` was not being installed, so custom code with custom requirements couldn't be used; now the `requirements.txt` will be installed whenever a path is provided in `HF_MODEL_ID`, `HF_MODEL_DIR` or `AIP_STORAGE_URI`.

Finally, it also adds `flash-attn` as a dependency of the GPU image, so as to benefit from it in the scenarios where it applies, e.g. `answerdotai/ModernBERT-base` as described in https://huggingface.co/answerdotai/ModernBERT-base#usage, among many others.
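
The conditional `requirements.txt` installation described above can be sketched roughly as follows. This is a hypothetical illustration, not the actual `entrypoint.sh` from this PR: the `install_requirements` function name is invented for the sketch, and `pip install` is replaced by `echo` so the control flow can be exercised without touching the environment; only the `HF_MODEL_DIR` variable name comes from the PR description.

```shell
#!/bin/bash
# Hypothetical sketch: install custom requirements if the mounted model
# directory (e.g. HF_MODEL_DIR on GKE) contains a requirements.txt file.

install_requirements() {
  local model_dir="$1"
  if [[ -f "${model_dir}/requirements.txt" ]]; then
    # The real entrypoint would run something like:
    #   pip install -r "${model_dir}/requirements.txt"
    echo "installing requirements from ${model_dir}/requirements.txt"
  else
    echo "no requirements.txt found in ${model_dir}, skipping"
  fi
}

# Demo: a fake model directory with custom requirements, and one without.
demo_dir="$(mktemp -d)"
echo "einops" > "${demo_dir}/requirements.txt"
install_requirements "${demo_dir}"
install_requirements "/nonexistent"
rm -rf "${demo_dir}"
```

The same check would run for whichever of `HF_MODEL_ID`, `HF_MODEL_DIR` or `AIP_STORAGE_URI` resolves to a local path holding the custom code.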