Error with gguf conversion. #1416
Comments
I'm trying to add a new method that should make GGUF conversions easier. I was planning to add it today, but it's more complicated than I expected. It'll come out by EOW, hopefully! In the meantime, use |
The build tutorial says we need to build llama.cpp first, after which these files should be generated. I used cmake, but the files were not generated, and nothing I tried worked. I then used the Python scripts in the llama.cpp folder to convert the model to GGUF manually.
At least the model files are now being generated, but I am unable to convert them for use in Ollama.
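For anyone following the same manual route, here is a rough sketch of the two llama.cpp steps (convert, then quantize) driven from Python. It assumes a llama.cpp checkout in `llama.cpp/` whose converter is named `convert_hf_to_gguf.py` (older checkouts used `convert-hf-to-gguf.py`) and an already built `llama-quantize` binary. The helper names and the `merged_model` directory are hypothetical, not part of Unsloth.

```python
import subprocess
import sys
from pathlib import Path


def build_convert_command(model_dir, outfile, outtype="f16",
                          llama_cpp_dir="llama.cpp"):
    """Build the command that converts a Hugging Face model directory to GGUF.

    Assumes the converter script is called convert_hf_to_gguf.py.
    """
    script = Path(llama_cpp_dir) / "convert_hf_to_gguf.py"
    return [
        sys.executable, str(script), str(model_dir),
        "--outfile", str(outfile),
        "--outtype", outtype,
    ]


def build_quantize_command(quantize_bin, src_gguf, dst_gguf, method="Q4_K_M"):
    """Build the llama-quantize call: <binary> <input> <output> <type>."""
    return [str(quantize_bin), str(src_gguf), str(dst_gguf), method]


def run(cmd):
    """Run one of the commands above and raise if it fails."""
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)


# Hypothetical paths; adjust to your own layout.
convert_cmd = build_convert_command("merged_model", "model-f16.gguf")
quantize_cmd = build_quantize_command(
    "llama.cpp/llama-quantize", "model-f16.gguf", "model-q4_k_m.gguf"
)
print(convert_cmd)
print(quantize_cmd)
```

Nothing is executed until each list is passed to `run(...)`, so the commands can be inspected first. The model directory must contain merged 16-bit weights rather than only a LoRA adapter.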
@jainpradeep Did you create a Modelfile?
@danielhanchen Yes sir. I fixed the issue following the link
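Since the link is not preserved here, a minimal sketch of the Modelfile step may help. It writes the simplest possible Ollama Modelfile, a single `FROM` line pointing at a local GGUF file. The helper name and file names are hypothetical, and a fine-tuned chat model will usually also need a `TEMPLATE` matching its chat format, which this sketch leaves out.

```python
from pathlib import Path


def write_modelfile(gguf_path, modelfile_path="Modelfile"):
    """Write a minimal Ollama Modelfile that points at a local GGUF file.

    The GGUF path is made absolute so it does not depend on where
    `ollama create` is later run from.
    """
    gguf_abs = Path(gguf_path).resolve()
    content = f"FROM {gguf_abs.as_posix()}\n"
    Path(modelfile_path).write_text(content, encoding="utf-8")
    return content


# Hypothetical file name; use the GGUF file you actually produced.
print(write_modelfile("model-q4_k_m.gguf"))
```

After that, `ollama create my-model -f Modelfile` followed by `ollama run my-model` should load the model, where `my-model` is any name you choose.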
@danielhanchen Any update? Or is it like #1376?
Still working on it!
On Windows, the executable will be *.exe,
Besides, the file name "convert-hf-to-gguf.py" is wrong in save.py.
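Both points can be checked before calling `save_pretrained_gguf`. The sketch below looks for the quantize binary under either name, with an `.exe` suffix on Windows, and for the converter script under either spelling. It is not what `save.py` does; the `build/bin` and `build/bin/Release` locations are an assumption about where a cmake build places its binaries, and the function names are hypothetical.

```python
import sys
from pathlib import Path


def candidate_quantize_paths(llama_cpp_dir="llama.cpp", platform=sys.platform):
    """List the places where the quantize binary might live."""
    suffix = ".exe" if platform.startswith("win") else ""
    root = Path(llama_cpp_dir)
    names = ["llama-quantize", "quantize"]  # newer and older binary names
    # Assumed locations: repo root (make), build/bin (cmake),
    # build/bin/Release (cmake with a multi-config generator such as MSVC).
    dirs = [root, root / "build" / "bin", root / "build" / "bin" / "Release"]
    return [d / (name + suffix) for d in dirs for name in names]


def find_quantize_binary(llama_cpp_dir="llama.cpp"):
    """Return the first existing quantize binary, or None."""
    for path in candidate_quantize_paths(llama_cpp_dir):
        if path.is_file():
            return path
    return None


def find_convert_script(llama_cpp_dir="llama.cpp"):
    """Return the converter script under its newer or older file name."""
    root = Path(llama_cpp_dir)
    for name in ("convert_hf_to_gguf.py", "convert-hf-to-gguf.py"):
        if (root / name).is_file():
            return root / name
    return None


print("quantize binary:", find_quantize_binary())
print("convert script:", find_convert_script())
```

If the binary only turns up under `build/bin`, copying it into the `llama.cpp/` root is one possible workaround, since the error message further down names `llama.cpp/llama-quantize` and `llama.cpp/quantize` as the expected paths.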
Here's what I get while trying to quantize my latest attempt at finetuning.
---------------------------------------------------------------------------
RuntimeError Traceback (most recent call last)
Cell In[12], line 12
9 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "f16", token = "")
11 # Save to q4_k_m GGUF
---> 12 if True: model.save_pretrained_gguf("fictions", tokenizer, quantization_method = "q5_k")
13 if False: model.push_to_hub_gguf("hf/model", tokenizer, quantization_method = "q4_k_m", token = "")
15 # Save to multiple GGUF options - much faster if you want multiple!
File /usr/local/lib/python3.11/dist-packages/unsloth/save.py:1734, in unsloth_save_pretrained_gguf(self, save_directory, tokenizer, quantization_method, first_conversion, push_to_hub, token, private, is_main_process, state_dict, save_function, max_shard_size, safe_serialization, variant, save_peft_format, tags, temporary_location, maximum_memory_usage)
1731 is_sentencepiece_model = check_if_sentencepiece_model(self)
1733 # Save to GGUF
-> 1734 all_file_locations, want_full_precision = save_to_gguf(
1735 model_type, model_dtype, is_sentencepiece_model,
1736 new_save_directory, quantization_method, first_conversion, makefile,
1737 )
1739 # Save Ollama modelfile
1740 modelfile = create_ollama_modelfile(tokenizer, all_file_locations[0])
File /usr/local/lib/python3.11/dist-packages/unsloth/save.py:1069, in save_to_gguf(model_type, model_dtype, is_sentencepiece, model_directory, quantization_method, first_conversion, _run_installer)
1067 quantize_location = "llama.cpp/llama-quantize"
1068 else:
-> 1069 raise RuntimeError(
1070 "Unsloth: The file 'llama.cpp/llama-quantize' or 'llama.cpp/quantize' does not exist.\n"
1071 "But we expect this file to exist! Maybe the llama.cpp developers changed the name?"
1072 )
1073 pass
1075 # See #730
1076 # Filenames changed again!
RuntimeError: Unsloth: The file 'llama.cpp/llama-quantize' or 'llama.cpp/quantize' does not exist.
But we expect this file to exist! Maybe the llama.cpp developers changed the name?