From e7c6fc61ed587ab3d469dfee21fa1dc435c4f057 Mon Sep 17 00:00:00 2001
From: jiqing-feng <107918818+jiqing-feng@users.noreply.github.com>
Date: Sat, 21 Sep 2024 10:45:15 +0800
Subject: [PATCH] docs: add cpu benchmark (#1366)

* cpu benchmark

* try to fix formatting

* cleanup

* cleanup

---------

Co-authored-by: Titus <9048635+Titus-von-Koeller@users.noreply.github.com>
---
 docs/source/non_cuda_backends.mdx | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/docs/source/non_cuda_backends.mdx b/docs/source/non_cuda_backends.mdx
index fca586534..fc7c6ac27 100644
--- a/docs/source/non_cuda_backends.mdx
+++ b/docs/source/non_cuda_backends.mdx
@@ -24,4 +24,18 @@ Thank you for your support!
 
 ### Intel
 
-### AMD
+The following performance data is collected from Intel 4th Gen Xeon (SPR) platform. The tables show speed-up and memory compared with different data types of [Llama-2-7b-chat-hf](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf).
+
+#### Inference (CPU)
+
+| Data Type | BF16 | INT8 | NF4 | FP4 |
+|---|---|---|---|---|
+| Speed-Up (vs BF16) | 1.0x | 0.6x | 2.3x | 0.03x |
+| Memory (GB) | 13.1 | 7.6 | 5.0 | 4.6 |
+
+#### Fine-Tuning (CPU)
+
+| Data Type | AMP BF16 | INT8 | NF4 | FP4 |
+|---|---|---|---|---|
+| Speed-Up (vs AMP BF16) | 1.0x | 0.38x | 0.07x | 0.07x |
+| Memory (GB) | 40 | 9 | 6.6 | 6.6 |