Can we use Multi-LORA CPU #128

Open
AndrewNgo-ini opened this issue Dec 5, 2024 · 3 comments

@AndrewNgo-ini

Hi,

I'm currently following this doc: https://huggingface.co/docs/google-cloud/en/examples/gke-tgi-multi-lora-deployment

After hitting an error: "Can't scale up due to exceeded quota" and doing some research, I suspect that my free trial ($300) account cannot increase its GPU quota (even though I have upgraded my account out of the trial, I still have to contact sales).
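
For reference, a region's current GPU quota and usage can be inspected with gcloud; a minimal sketch, assuming the us-central1 region (adjust to whichever region the deployment doc targets):

```bash
# List the region's quotas and keep GPU metrics such as NVIDIA_T4_GPUS;
# each matching entry shows its limit, metric name, and current usage.
gcloud compute regions describe us-central1 | grep -B 1 -A 1 NVIDIA
```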

Is there any way I can run this with a CPU instead?

Thank you

@alvarobartt
Member

Hi @AndrewNgo-ini, I'm afraid you won't be able to run TGI on CPUs with the current container on Google Cloud, as that one is GPU-only (with TPU support coming too).

Anyway, you should be able to run TGI on Intel CPUs as per https://huggingface.co/docs/text-generation-inference/installation_intel#using-tgi-with-intel-cpus, even if it's not the recommended hardware. You should be able to re-use the container at ghcr.io/huggingface/text-generation-inference:2.4.1-intel-cpu; let me know if that works.
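
For reference, a minimal sketch of launching that container on a CPU machine with multi-LoRA enabled; the base model and adapter IDs below are placeholders (swap in the ones from the deployment doc), and it assumes TGI's `--lora-adapters` launcher flag (the `LORA_ADAPTERS` environment variable is the equivalent):

```bash
# Sketch: TGI on Intel CPUs with LoRA adapters loaded at startup.
# Model and adapter IDs are placeholders, not the doc's actual values.
model=google/gemma-2b
adapters=adapter-org/adapter-one,adapter-org/adapter-two

docker run --rm -p 8080:80 \
  --shm-size 1g \
  -v $PWD/data:/data \
  -e HF_TOKEN=$HF_TOKEN \
  ghcr.io/huggingface/text-generation-inference:2.4.1-intel-cpu \
  --model-id $model \
  --lora-adapters $adapters
```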

Hope that helps 🤗 Also happy to know more about your use-case / needs to see how we can support those better in the coming months!

@alvarobartt alvarobartt self-assigned this Dec 5, 2024
@AndrewNgo-ini
Author

Good day @alvarobartt
My use case is a demo for the GDG DevFest day, so latency wouldn't be a problem.

I'll try out the CPU approach, and I wouldn't mind opening a PR to address this problem. Any suggestions on how I can do this properly?

@alvarobartt
Member

Great! At the moment I'm afraid we're not shipping the Text Generation Inference (TGI) DLC for any hardware other than NVIDIA GPUs (with Google TPUs coming soon), so you could still demo it, just without the container being officially hosted on Google Cloud. Anyway, let me know if you run into any issues when running the demo, happy to help! 🤗
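
For the demo itself, once the container is up you can route a request to a given adapter via the `adapter_id` parameter of the `/generate` endpoint; a minimal sketch (the adapter ID is the same placeholder used in the launch command above):

```bash
# Generate with one specific LoRA adapter selected per request;
# omit adapter_id to query the base model instead.
curl http://localhost:8080/generate \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": "What is Deep Learning?",
    "parameters": {
      "max_new_tokens": 64,
      "adapter_id": "adapter-org/adapter-one"
    }
  }'
```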
