Support Mac/MPS for inference #7

carlosgjs · 2023-12-07T18:39:47Z

The HF pipeline with bitsandbytes quantization doesn't work on MPS yet. However, the llama.cpp runtime works well on a Mac, so we can use it there. We need to select and load the runtime dynamically based on the platform.
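A minimal sketch of what platform-based runtime selection could look like. Everything named here is an assumption for illustration, not part of the project: the labels `hf-bitsandbytes` and `llama-cpp`, the helpers `select_runtime` and `load_runtime_module`, and the choice of the `llama_cpp` module (from the llama-cpp-python bindings) as the way to reach llama.cpp from Python.

```python
import importlib
import platform
from typing import Optional

# Hypothetical runtime labels; the issue does not name any identifiers.
HF_BITSANDBYTES = "hf-bitsandbytes"
LLAMA_CPP = "llama-cpp"

# Assumed mapping from runtime label to the Python module that backs it.
RUNTIME_MODULES = {
    HF_BITSANDBYTES: "transformers",
    LLAMA_CPP: "llama_cpp",
}


def select_runtime(system: Optional[str] = None) -> str:
    """Return a runtime label for the given OS name (defaults to this machine)."""
    system = system or platform.system()
    # platform.system() reports "Darwin" on macOS, where bitsandbytes
    # quantization is not usable on MPS, so fall back to llama.cpp there.
    if system == "Darwin":
        return LLAMA_CPP
    return HF_BITSANDBYTES


def load_runtime_module(runtime: str):
    """Import the backing module lazily, so only the selected runtime
    needs to be installed on a given platform."""
    return importlib.import_module(RUNTIME_MODULES[runtime])
```

Importing lazily keeps the platform-specific dependency optional: a Mac install would not need bitsandbytes, and a Linux/CUDA install would not need the llama.cpp bindings.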
