From a8ff1ddd5f30c4728ff6951a925e7ffcf82486cf Mon Sep 17 00:00:00 2001
From: charlifu
Date: Thu, 13 Jun 2024 19:46:32 +0000
Subject: [PATCH 1/2] update quark quantizer command

---
 ROCm_performance.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/ROCm_performance.md b/ROCm_performance.md
index 1c47a818ec852..302366d3f1a7b 100644
--- a/ROCm_performance.md
+++ b/ROCm_performance.md
@@ -30,8 +30,8 @@ python3 quantize_quark.py --model_dir [llama2 checkpoint folder] \
     --output_dir output_dir \
     --quant_scheme w_fp8_a_fp8_o_fp8 \
     --num_calib_data 128 \
-    --export_safetensors \
-    --no_weight_matrix_merge
+    --model_export vllm_adopted_safetensors \
+    --no_weight_matrix_mergee
 ```
 
 For more details, please refer to Quark's documentation.

From 7e09aeacddbdbf4dae13f1b497b1003cb0272818 Mon Sep 17 00:00:00 2001
From: charlifu
Date: Thu, 13 Jun 2024 19:49:27 +0000
Subject: [PATCH 2/2] typo

---
 ROCm_performance.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/ROCm_performance.md b/ROCm_performance.md
index 302366d3f1a7b..bae57ea62d47c 100644
--- a/ROCm_performance.md
+++ b/ROCm_performance.md
@@ -31,7 +31,7 @@ python3 quantize_quark.py --model_dir [llama2 checkpoint folder] \
     --quant_scheme w_fp8_a_fp8_o_fp8 \
     --num_calib_data 128 \
     --model_export vllm_adopted_safetensors \
-    --no_weight_matrix_mergee
+    --no_weight_matrix_merge
 ```
 
 For more details, please refer to Quark's documentation.