Support Llama / Hugging Face's Universal Format (GGUF) #659
BrainSlugs83 started this conversation in New features / APIs · Replies: 3 comments
-
@BrainSlugs83 I saw this post and was inspired to come up with a solution, at least in the interim, because I was hoping for a single-file format as well. I'm working on this project you may find useful. I'll just let the video roll. If it's something you're interested in and wish to test, let me know. vfolder_phi3-onnx.mp4
-
Hi @BrainSlugs83, this API uses the ONNX (Open Neural Network Exchange) model format. Moving this issue into a Discussion as a feature request.
-
Are there any plans to support the Hugging Face / Llama.cpp universal format (GGUF)?
This format is very popular and uses just a single file to describe a whole model (even for mixture of experts models), it's optimized for fast loading for inference (whether on the CPU or GPU, or elsewhere), and supports quantization. There's also built-in tooling on hugging face to automatically convert other repositories to this format.
The format is designed to be unambiguous by containing all the information needed to load a model. It is also designed to be extensible, so that new information can be added to models without breaking compatibility.
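To illustrate the self-describing design: every GGUF file begins with a small fixed header (magic bytes, format version, tensor count, and a count of metadata key/value pairs) before any tensor data. Here's a minimal Python sketch of parsing that header, based on my reading of the public GGUF spec; `read_gguf_header` is just a name I made up for illustration, and the demo builds a fake in-memory header rather than opening a real model file:

```python
import struct
from io import BytesIO

def read_gguf_header(stream):
    """Parse the fixed-size GGUF header from a binary stream."""
    magic = stream.read(4)
    if magic != b"GGUF":
        raise ValueError("not a GGUF file")
    # version is a little-endian uint32; tensor_count and
    # metadata_kv_count are little-endian uint64s
    version, tensor_count, kv_count = struct.unpack("<IQQ", stream.read(20))
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }

# Demo: a hand-built header (version 3, 291 tensors, 24 metadata pairs)
fake_file = BytesIO(b"GGUF" + struct.pack("<IQQ", 3, 291, 24))
print(read_gguf_header(fake_file))
```

The metadata key/value pairs that follow the header are what make the format unambiguous: architecture, tokenizer, quantization type, etc. all travel inside the one file instead of scattered JSON sidecars.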
And there is a huge repo of models based on Llama, Mistral, etc. that are already in this format; including fine tunes of Microsoft Phi.
It would be hugely convenient for developers if the DirectML model loader could just load these directly...
[Side question: what is the currently supported format? I can't find any repos on Hugging Face similar enough to the supported Phi-3 repo to "just work"; they always complain about missing JSON files, etc. I'm not even fully sure what the current format is, let alone how to convert an existing model to it.]