I followed the instructions in MoA and MoA-kernel to configure the environment, but I ran into the issues described below:
1. When running accuracy_test.py in MoA-kernel, I encounter the following error:
ImportError: cannot import name '_prepare_4d_causal_attention_mask_for_sdpa' from 'transformers.models.llama.modeling_llama'
2. When running the code in MoA, the "Calibration Dataset Generation", "Profile", and "Optimization" steps run successfully, but "Validation" does not. Running scripts/pipeline/perplexity_evaluate.py reported "ModuleNotFoundError: No module named 'flash_attn'". After installing this package (flash_attn 2.5.8) and running again, it reported:
MoA/models/llama/modeling_llama.py, line 221, in forward
    self.num_key_value_groups == 1
AssertionError: only support one key value group now, but got 4
Was the MoA-kernel not installed successfully, and how can I solve these problems? :)
Hi @youyu-2024.
For issue 1, it looks like a problem with the transformers installation. Can you check whether you have installed the correct version of transformers? If not, you can run pip install transformers==4.44.2. We suggest installing requirements.txt from the MoA repo before installing the kernel.
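As a quick sanity check before the kernel test, you can print the installed versions. This is a minimal sketch, not an MoA script; the pip distribution names are assumed to match the import names:

```python
# Illustrative environment check; package names are assumed pip distribution names.
import importlib.metadata

for pkg in ("transformers", "torch", "flashinfer"):
    try:
        print(f"{pkg}: {importlib.metadata.version(pkg)}")
    except importlib.metadata.PackageNotFoundError:
        print(f"{pkg}: not installed")
```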
For issue 2, it seems that you are using a GQA model, not an MHA model. Please follow MoA's instructions to convert it to the MHA version before compression. Alternatively, you can use our one-step compression pipeline: python scripts/pipeline/main.py --model_path X --model_name Y --is_gqa (remember to add --is_gqa). If this is not the case, please let us know.
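For background on the assertion in issue 2: num_key_value_groups = num_attention_heads // num_key_value_heads, so "got 4" means four query heads share each KV head, i.e. a GQA model. Here is a minimal sketch for checking this from the model config (this is not MoA's own check, just an illustration using the model from this thread):

```python
# Sketch: detect GQA from a Hugging Face config.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("lmsys/vicuna-7b-v1.5-16k")  # example model
n_heads = config.num_attention_heads
n_kv_heads = getattr(config, "num_key_value_heads", n_heads)  # MHA if absent

groups = n_heads // n_kv_heads
print(f"heads={n_heads}, kv_heads={n_kv_heads}, num_key_value_groups={groups}")
if groups > 1:
    print("GQA model: convert to MHA or pass --is_gqa before compression.")
```

Vicuna-7B is MHA (groups == 1), which is why it works with the example plan.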
@fuvty For issue 1, I did install requirements.txt from the MoA repo before installing the kernel. The version of transformers is definitely 4.44.2. Other package versions, in part: python 3.10.16, torch 2.2.0, print(torch.__version__) gives 2.2.0+cu121, nvcc --version reports cuda_12.4, torchvision 0.17.0, flashinfer 0.1.5, transformers 4.44.2.
For issue 2, I now ran your example plan: CUDA_VISIBLE_DEVICES=0 python scripts/evaluate/retrieval_evaluate.py --model_name lmsys/vicuna-7b-v1.5-16k --moa_config examples/lmsys--vicuna-7b-v1.5-16k/moa_alpha_beta.json --output_dir output/lmsys--vicuna-7b-v1.5-16k/evaluate/retrieval --length_level 8, but the error is like this:
@youyu-2024 Hi. Sorry it took us a little time to identify the issue.
For issue 1, could you try installing transformers==4.36.2 just for the kernel test (accuracy_test.py)? If it passes, the kernel installation should be fine, and you can then switch back to 4.44.2 for MoA. We will update the test script to adapt to 4.44.2.
It would be best to move on to issue 2 after we have verified that the installation succeeded. Also, please provide the complete error message with the full call stack.
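For reference, the adaptation could be as small as a version-tolerant import. This is only a hedged sketch, assuming the helper is still defined in transformers.modeling_attn_mask_utils in 4.44.x after it stopped being imported in modeling_llama; it is not the actual patch:

```python
# Version-tolerant import sketch (assumption: the helper still lives in
# transformers.modeling_attn_mask_utils in 4.44.x; verify against your install).
try:
    # transformers 4.36.x imports the helper into the llama modeling file
    from transformers.models.llama.modeling_llama import (
        _prepare_4d_causal_attention_mask_for_sdpa,
    )
except ImportError:
    # newer versions only expose it from the module where it is defined
    from transformers.modeling_attn_mask_utils import (
        _prepare_4d_causal_attention_mask_for_sdpa,
    )
```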