entrypoint.sh for TGI does not implement the requirements.txt installation process #138
Comments
It also seems that the custom handler, handler.py, is not recognized by the TGI container.
Hi here @jk1333, sorry in advance for any misunderstanding! Indeed, the custom handler only applies to the PyTorch Inference DLC which is powered internally by the Are you asking because you faced a problem when serving a model? Is there anything I can help you with? Please let me know 🤗
Thanks for confirming, @alvarobartt :) What we want to achieve is collecting the /metrics values, which we can see through Prometheus, into Vertex AI's Cloud Monitoring. We also wondered whether we could customize model loading and request handling, but linking that to a custom handler seems quite complicated. Is there a way to bring the /metrics values to other monitoring APIs, for example by pushing them? It seems that with the current DLC architecture, once the container is hosted on Vertex AI, those values cannot be used.
AFAIK that's one of the Vertex AI constraints, as the endpoints other than
Thanks for the discussion, and thanks for your interest!
Hi here @jk1333 thanks for sharing, I'll have a look at your code; for Vertex AI we're exposing the
Hello team,
Like this sample, https://github.com/huggingface/Google-Cloud-Containers/blob/main/containers/pytorch/inference/gpu/2.3.1/transformers/4.46.1/py311/entrypoint.sh
The entrypoint needs a requirements.txt provisioning step.
But the TGI sample does not contain this step:
https://github.com/huggingface/Google-Cloud-Containers/blob/main/containers/tgi/gpu/3.0.1/entrypoint.sh
Is it missing, or is it handled internally by the text_generation_launcher process?
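To make the request concrete, here is a minimal sketch of the kind of provisioning step I mean. It is not copied from either linked entrypoint.sh; the function name `install_requirements_if_present`, the default directory `/opt/model`, and the `PIP_CMD` override are hypothetical names used only for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: install extra Python dependencies shipped next to the model.
# The default directory and the PIP_CMD override are assumptions for this sketch,
# not values taken from the real containers.
install_requirements_if_present() {
    local model_dir="${1:-/opt/model}"
    local req_file="${model_dir}/requirements.txt"

    if [[ -f "${req_file}" ]]; then
        echo "Installing custom dependencies from ${req_file}"
        # PIP_CMD can be set to another command (e.g. echo) for a dry run.
        ${PIP_CMD:-pip} install --no-cache-dir -r "${req_file}"
    else
        echo "No requirements.txt found in ${model_dir}, skipping"
    fi
}
```

An entrypoint could call `install_requirements_if_present "<model directory>"` before starting the server process, so a missing file is a no-op rather than an error.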