Add pytorch/inference/{cpu,gpu}/2.3.1/transformers/4.48.0/py311 #135

Merged
alvarobartt merged 10 commits into main from pytorch-inference-release on Jan 28, 2025

alvarobartt commented Dec 17, 2024

Description

This PR bumps the version of the huggingface-inference-toolkit to 0.5.4 to release the latest PyTorch DLC for Inference, which comes with bumped versions of transformers, diffusers, huggingface_hub and accelerate.

The dependency bump for transformers mainly introduces new architectures such as ModernBERT, ColPali, Falcon 3, etc., as well as several fixes and improvements overall. diffusers also comes with new text-to-image pipelines for SANA and Flux Control.

Read more about the latest releases for each dependency on their respective release notes:


Additionally, this PR updates entrypoint.sh to be more robust and consistently formatted. It also adds the requirements.txt installation when HF_MODEL_DIR is set: when running the PyTorch Inference DLC on GKE with custom code on a mount or a volume, the requirements.txt was not being installed, so custom code with custom requirements couldn't be used. Now the requirements.txt is installed if a path is provided in HF_MODEL_ID, HF_MODEL_DIR or AIP_STORAGE_URI.
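To illustrate the requirements.txt lookup described above, here is a minimal Python sketch of the decision logic. It is not the actual entrypoint.sh, which is a shell script; the function names `find_requirements` and `pip_install_command` are hypothetical, the lookup order is an assumption, and only local directories are handled (a Hub model ID or a remote URI simply yields no match here).

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional

# Variable names come from the PR description; the lookup order is an assumption.
CANDIDATE_VARS = ("HF_MODEL_ID", "HF_MODEL_DIR", "AIP_STORAGE_URI")


def find_requirements(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the first requirements.txt found in a local directory named by
    one of the candidate variables, or None if there is none."""
    for name in CANDIDATE_VARS:
        value = env.get(name)
        if not value:
            continue
        candidate = Path(value) / "requirements.txt"
        # Hub model IDs and remote URIs are not local directories, so they
        # fall through here without matching.
        if candidate.is_file():
            return candidate
    return None


def pip_install_command(requirements: Path) -> list[str]:
    """Build (but do not run) the pip command for the given requirements file."""
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements)]
```

A caller could pass the result of `pip_install_command(...)` to `subprocess.run(..., check=True)` when `find_requirements()` returns a path, and skip the installation otherwise.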

Finally, it also adds flash-attn as a dependency for the GPU image. This benefits models that can use it, e.g. answerdotai/ModernBERT-base as described in https://huggingface.co/answerdotai/ModernBERT-base#usage, among many others.
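As a rough sketch of how custom code might take advantage of flash-attn when it is present, the snippet below picks an attention implementation and passes it to transformers. The helpers `pick_attn_implementation` and `load_model` are hypothetical and not part of the toolkit; the fallback to `"sdpa"` is an assumption about what suits the CPU image, where flash-attn is not installed.

```python
import importlib.util


def pick_attn_implementation(cuda_available: bool) -> str:
    """Choose "flash_attention_2" only when a GPU is available and the
    flash_attn package is importable; otherwise fall back to "sdpa"."""
    has_flash_attn = importlib.util.find_spec("flash_attn") is not None
    if cuda_available and has_flash_attn:
        return "flash_attention_2"
    return "sdpa"


def load_model(model_id: str = "answerdotai/ModernBERT-base"):
    """Load a masked-LM model with the selected attention implementation.

    torch and transformers are imported lazily so the helper above can be
    used (and tested) without them installed.
    """
    import torch
    from transformers import AutoModelForMaskedLM

    attn = pick_attn_implementation(torch.cuda.is_available())
    return AutoModelForMaskedLM.from_pretrained(model_id, attn_implementation=attn)
```

Keeping the choice in a small pure function means the same custom handler can run unchanged on both the CPU and GPU images.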

alvarobartt self-assigned this Dec 17, 2024
alvarobartt added the container and pytorch (Pytorch related Issues) labels Dec 17, 2024
alvarobartt changed the title from "Add pytorch/inference/{cpu,gpu}/2.3.1/transformers/4.47.0/py311" to "Add pytorch/inference/{cpu,gpu}/2.3.1/transformers/4.48.0/py311" Jan 15, 2025
alvarobartt marked this pull request as ready for review January 18, 2025 12:14
alvarobartt merged commit 60b829f into main Jan 28, 2025
1 check passed
alvarobartt deleted the pytorch-inference-release branch January 28, 2025 15:35