Training fails for Mistral 7B #1

bengrine · 2024-05-20T21:54:43Z

Running python run.py --train --model ./Mistral-7B-v0.1 --data ./data --batch-size 1 --lora-layers 4 fails with the following error:

Loading pretrained model
Traceback (most recent call last):
  File "/Users/Tony/Devel/qlora-mlx/run.py", line 153, in <module>
    model, tokenizer, train_set, valid_set, test_set = initilize_converted_model(args)
                                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Tony/Devel/qlora-mlx/run.py", line 20, in initilize_converted_model
    model, tokenizer, _ = lora_utils.load(args.model)
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/Tony/Devel/qlora-mlx/utils.py", line 137, in load
    model.load_weights(list(weights.items()))
  File "/Users/Tony/opt/miniconda3/envs/AppleMLX2/lib/python3.11/site-packages/mlx/nn/layers/base.py", line 215, in load_weights
    raise ValueError(
ValueError: Expected shape (4096, 512) but received shape (4096, 4096) for parameter model.layers.0.self_attn.q_proj.weight

Running on Intel Mac Pro 2019.
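
A possible reading of the two shapes, offered as an assumption rather than a confirmed diagnosis: MLX stores quantized weights packed into 32-bit words along the input dimension, so a 4-bit weight with 4096 input features would occupy 4096 * 4 / 32 = 512 columns. That would match the expected shape (4096, 512) if the model was built with 4-bit quantized layers while the checkpoint in ./Mistral-7B-v0.1 holds unquantized (4096, 4096) weights. The helper below is hypothetical (it is not part of run.py or utils.py) and only reproduces the arithmetic:

```python
def packed_shape(out_dim: int, in_dim: int, bits: int) -> tuple[int, int]:
    """Shape of a weight whose low-bit values are packed into 32-bit words
    along the input axis. Hypothetical helper for illustration only."""
    values_per_word = 32 // bits  # e.g. 8 values per word for 4-bit weights
    if in_dim % values_per_word != 0:
        raise ValueError("in_dim must be divisible by the values packed per word")
    return (out_dim, in_dim // values_per_word)


# Assumption: q_proj for Mistral 7B is 4096 x 4096 before quantization.
print(packed_shape(4096, 4096, bits=4))  # (4096, 512)
```

If this reading is right, the mismatch comes from mixing a quantized model definition with unquantized weights, not from the --batch-size or --lora-layers settings.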
