Hi @AndrewNgo-ini, I'm afraid you won't be able to run TGI on CPUs with the current container on Google Cloud, as it only supports GPUs (with TPU support coming too).
Great, so at the moment I'm afraid we're not shipping the Text Generation Inference (TGI) DLC for any hardware other than NVIDIA GPUs and, soon, Google TPUs. You could still demo it, but without the container being officially hosted on Google Cloud. Anyway, let me know if you run into any issues when running the demo, happy to help! 🤗
Hi,
I'm currently following this doc: https://huggingface.co/docs/google-cloud/en/examples/gke-tgi-multi-lora-deployment
After hitting the error "Can’t scale up due to exceeded quota" and doing some research, I suspect that my free trial ($300) account cannot increase its GPU quota (even though I have upgraded the account so it is no longer a trial, I still have to contact sales).
Is there any way I can run this on CPU instead?
Thank you