[Doc] Improve installation signposting (#12575)
- Make device tab names more explicit
- Add a comprehensive list of devices to https://docs.vllm.ai/en/latest/getting_started/installation/index.html
- Add `attention` blocks to the intros of all devices that don't have pre-built wheels/images

---------

Signed-off-by: Harry Mellor <[email protected]>
hmellor authored Jan 31, 2025
1 parent fc54214 commit 60808bd
Showing 13 changed files with 111 additions and 59 deletions.
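
Background for the hunks below: each installation index page repeats one tab-set per setup step, and the tab-sets are kept in lockstep by the `:sync-group: device` / `:sync:` options visible in the diff, so renaming a tab means touching every tab-set on the page. A minimal sketch of the pattern, with illustrative placeholder body text (the real pages pull content in via `{include}` directives):

```markdown
:::::{tab-set}
:sync-group: device

::::{tab-item} Google TPU
:selected:
:sync: tpu

TPU-specific steps render here (the real page includes tpu.inc.md).
::::

::::{tab-item} AWS Neuron
:sync: neuron

Neuron-specific steps render here (the real page includes neuron.inc.md).
::::

:::::
```

Because every tab-set shares the same sync group, picking a device in one section switches all the other sections to that device too, which is why the rename from `TPU` to `Google TPU` recurs in each hunk rather than appearing once.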
@@ -2,6 +2,10 @@
 
 This tab provides instructions on running vLLM with Intel Gaudi devices.
 
+:::{attention}
+There are no pre-built wheels or images for this device, so you must build vLLM from source.
+:::
+
 ## Requirements
 
 - OS: Ubuntu 22.04 LTS
33 changes: 17 additions & 16 deletions docs/source/getting_started/installation/ai_accelerator/index.md
@@ -5,7 +5,8 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
+:selected:
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -25,7 +26,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -52,7 +53,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -72,7 +73,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -99,7 +100,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -119,7 +120,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -146,7 +147,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -166,7 +167,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -193,7 +194,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -213,7 +214,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -242,7 +243,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -262,7 +263,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -289,7 +290,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -309,7 +310,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -336,7 +337,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} TPU
+::::{tab-item} Google TPU
 :sync: tpu
 
 :::{include} tpu.inc.md
@@ -354,7 +355,7 @@ vLLM is a Python library that supports the following AI accelerators. Select you
 
 ::::
 
-::::{tab-item} Neuron
+::::{tab-item} AWS Neuron
 :sync: neuron
 
 :::{include} neuron.inc.md
@@ -4,6 +4,10 @@ vLLM 0.3.3 onwards supports model inferencing and serving on AWS Trainium/Infere
 Paged Attention and Chunked Prefill are currently in development and will be available soon.
 Data types currently supported in Neuron SDK are FP16 and BF16.
 
+:::{attention}
+There are no pre-built wheels or images for this device, so you must build vLLM from source.
+:::
+
 ## Requirements
 
 - OS: Linux
@@ -2,6 +2,10 @@
 
 vLLM powered by OpenVINO supports all LLM models from [vLLM supported models list](#supported-models) and can perform optimal model serving on all x86-64 CPUs with, at least, AVX2 support, as well as on both integrated and discrete Intel® GPUs ([the list of supported GPUs](https://docs.openvino.ai/2024/about-openvino/release-notes-openvino/system-requirements.html#gpu)).
 
+:::{attention}
+There are no pre-built wheels or images for this device, so you must build vLLM from source.
+:::
+
 ## Requirements
 
 - OS: Linux
@@ -30,6 +30,10 @@ For TPU pricing information, see [Cloud TPU pricing](https://cloud.google.com/tp
 You may need additional persistent storage for your TPU VMs. For more
 information, see [Storage options for Cloud TPU data](https://cloud.devsite.corp.google.com/tpu/docs/storage-options).
 
+:::{attention}
+There are no pre-built wheels for this device, so you must either use the pre-built Docker image or build vLLM from source.
+:::
+
 ## Requirements
 
 - Google Cloud TPU VM
4 changes: 4 additions & 0 deletions docs/source/getting_started/installation/cpu/apple.inc.md
@@ -4,6 +4,10 @@ vLLM has experimental support for macOS with Apple silicon. For now, users shall
 
 Currently the CPU implementation for macOS supports FP32 and FP16 datatypes.
 
+:::{attention}
+There are no pre-built wheels or images for this device, so you must build vLLM from source.
+:::
+
 ## Requirements
 
 - OS: `macOS Sonoma` or later
4 changes: 4 additions & 0 deletions docs/source/getting_started/installation/cpu/arm.inc.md
@@ -4,6 +4,10 @@ vLLM has been adapted to work on ARM64 CPUs with NEON support, leveraging the CP
 
 ARM CPU backend currently supports Float32, FP16 and BFloat16 datatypes.
 
+:::{attention}
+There are no pre-built wheels or images for this device, so you must build vLLM from source.
+:::
+
 ## Requirements
 
 - OS: Linux
13 changes: 7 additions & 6 deletions docs/source/getting_started/installation/cpu/index.md
@@ -5,7 +5,8 @@ vLLM is a Python library that supports the following CPU variants. Select your C
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} x86
+::::{tab-item} Intel/AMD x86
+:selected:
 :sync: x86
 
 :::{include} x86.inc.md
@@ -15,7 +16,7 @@ vLLM is a Python library that supports the following CPU variants. Select your C
 
 ::::
 
-::::{tab-item} ARM
+::::{tab-item} ARM AArch64
 :sync: arm
 
 :::{include} arm.inc.md
@@ -44,7 +45,7 @@ vLLM is a Python library that supports the following CPU variants. Select your C
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} x86
+::::{tab-item} Intel/AMD x86
 :sync: x86
 
 :::{include} x86.inc.md
@@ -54,7 +55,7 @@ vLLM is a Python library that supports the following CPU variants. Select your C
 
 ::::
 
-::::{tab-item} ARM
+::::{tab-item} ARM AArch64
 :sync: arm
 
 :::{include} arm.inc.md
@@ -92,7 +93,7 @@ Currently, there are no pre-built CPU wheels.
 :::::{tab-set}
 :sync-group: device
 
-::::{tab-item} x86
+::::{tab-item} Intel/AMD x86
 :sync: x86
 
 :::{include} x86.inc.md
@@ -102,7 +103,7 @@ Currently, there are no pre-built CPU wheels.
 
 ::::
 
-::::{tab-item} ARM
+::::{tab-item} ARM AArch64
 :sync: arm
 
 :::{include} arm.inc.md
12 changes: 8 additions & 4 deletions docs/source/getting_started/installation/cpu/x86.inc.md
@@ -2,12 +2,20 @@
 
 vLLM initially supports basic model inferencing and serving on x86 CPU platform, with data types FP32, FP16 and BF16.
 
+:::{attention}
+There are no pre-built wheels or images for this device, so you must build vLLM from source.
+:::
+
 ## Requirements
 
 - OS: Linux
 - Compiler: `gcc/g++ >= 12.3.0` (optional, recommended)
 - Instruction Set Architecture (ISA): AVX512 (optional, recommended)
 
+:::{tip}
+[Intel Extension for PyTorch (IPEX)](https://github.com/intel/intel-extension-for-pytorch) extends PyTorch with up-to-date features optimizations for an extra performance boost on Intel hardware.
+:::
+
 ## Set up using Python
 
 ### Pre-built wheels
@@ -29,7 +37,3 @@ vLLM initially supports basic model inferencing and serving on x86 CPU platform,
 ### Build image from source
 
 ## Extra information
-
-## Intel Extension for PyTorch
-
-- [Intel Extension for PyTorch (IPEX)](https://github.com/intel/intel-extension-for-pytorch) extends PyTorch with up-to-date features optimizations for an extra performance boost on Intel hardware.