Commit a5ba0fc

doc: faq gpu compatibility (ollama#3142)
1 parent 3a30bf5 commit a5ba0fc

4 files changed

+79
-34
lines changed

Dockerfile

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 ARG GOLANG_VERSION=1.22.1
 ARG CMAKE_VERSION=3.22.1
+# this CUDA_VERSION corresponds with the one specified in docs/gpu.md
 ARG CUDA_VERSION=11.3.1
 ARG ROCM_VERSION=6.0

docs/faq.md

Lines changed: 4 additions & 0 deletions
@@ -14,6 +14,10 @@ curl -fsSL https://ollama.com/install.sh | sh
 
 Review the [Troubleshooting](./troubleshooting.md) docs for more about using logs.
 
+## Is my GPU compatible with Ollama?
+
+Please refer to the [GPU docs](./gpu.md).
+
 ## How can I specify the context window size?
 
 By default, Ollama uses a context window size of 2048 tokens.

docs/gpu.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+# GPU
+## Nvidia
+Ollama supports Nvidia GPUs with compute capability 5.0+.
+
+Check your card's compute capability to see if it is supported:
+[https://developer.nvidia.com/cuda-gpus](https://developer.nvidia.com/cuda-gpus)
+
+| Compute Capability | Family | Cards |
+| ------------------ | ------------------- | ----------------------------------------------------------------------------------------------------------- |
+| 9.0 | NVIDIA | `H100` |
+| 8.9 | GeForce RTX 40xx | `RTX 4090` `RTX 4080` `RTX 4070 Ti` `RTX 4060 Ti` |
+| | NVIDIA Professional | `L4` `L40` `RTX 6000` |
+| 8.6 | GeForce RTX 30xx | `RTX 3090 Ti` `RTX 3090` `RTX 3080 Ti` `RTX 3080` `RTX 3070 Ti` `RTX 3070` `RTX 3060 Ti` `RTX 3060` |
+| | NVIDIA Professional | `A40` `RTX A6000` `RTX A5000` `RTX A4000` `RTX A3000` `RTX A2000` `A10` `A16` `A2` |
+| 8.0 | NVIDIA | `A100` `A30` |
+| 7.5 | GeForce GTX/RTX | `GTX 1650 Ti` `TITAN RTX` `RTX 2080 Ti` `RTX 2080` `RTX 2070` `RTX 2060` |
+| | NVIDIA Professional | `T4` `RTX 5000` `RTX 4000` `RTX 3000` `T2000` `T1200` `T1000` `T600` `T500` |
+| | Quadro | `RTX 8000` `RTX 6000` `RTX 5000` `RTX 4000` |
+| 7.0 | NVIDIA | `TITAN V` `V100` `Quadro GV100` |
+| 6.1 | NVIDIA TITAN | `TITAN Xp` `TITAN X` |
+| | GeForce GTX | `GTX 1080 Ti` `GTX 1080` `GTX 1070 Ti` `GTX 1070` `GTX 1060` `GTX 1050` |
+| | Quadro | `P6000` `P5200` `P4200` `P3200` `P5000` `P4000` `P3000` `P2200` `P2000` `P1000` `P620` `P600` `P500` `P520` |
+| | Tesla | `P40` `P4` |
+| 6.0 | NVIDIA | `Tesla P100` `Quadro GP100` |
+| 5.2 | GeForce GTX | `GTX TITAN X` `GTX 980 Ti` `GTX 980` `GTX 970` `GTX 960` `GTX 950` |
+| | Quadro | `M6000 24GB` `M6000` `M5000` `M5500M` `M4000` `M2200` `M2000` `M620` |
+| | Tesla | `M60` `M40` |
+| 5.0 | GeForce GTX | `GTX 750 Ti` `GTX 750` `NVS 810` |
+| | Quadro | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
+
+
+## AMD Radeon
+Ollama supports the following AMD GPUs:
+| Family | Cards and accelerators |
+| -------------- | ---------------------------------------------------------------------------------------------------------------------------------------------- |
+| AMD Radeon RX | `7900 XTX` `7900 XT` `7900 GRE` `7800 XT` `7700 XT` `7600 XT` `7600` `6950 XT` `6900 XTX` `6900XT` `6800 XT` `6800` `Vega 64` `Vega 56` |
+| AMD Radeon PRO | `W7900` `W7800` `W7700` `W7600` `W7500` `W6900X` `W6800X Duo` `W6800X` `W6800` `V620` `V420` `V340` `V320` `Vega II Duo` `Vega II` `VII` `SSG` |
+| AMD Instinct | `MI300X` `MI300A` `MI300` `MI250X` `MI250` `MI210` `MI200` `MI100` `MI60` `MI50` |
+
+### Overrides
+Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
+some cases you can force the system to try to use a similar LLVM target that is
+close. For example, the Radeon RX 5400 is `gfx1034` (also known as 10.3.4);
+however, ROCm does not currently support this target. The closest supported target is
+`gfx1030`. You can use the environment variable `HSA_OVERRIDE_GFX_VERSION` with
+`x.y.z` syntax. So, for example, to force the system to run on the RX 5400, you
+would set `HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable for the
+server. If you have an unsupported AMD GPU you can experiment using the list of
+supported types below.
+
+At this time, the known supported GPU types are the following LLVM Targets.
+This table shows some example GPUs that map to these LLVM targets:
+| **LLVM Target** | **An Example GPU** |
+|-----------------|---------------------|
+| gfx900 | Radeon RX Vega 56 |
+| gfx906 | Radeon Instinct MI50 |
+| gfx908 | Radeon Instinct MI100 |
+| gfx90a | Radeon Instinct MI210 |
+| gfx940 | Radeon Instinct MI300 |
+| gfx941 | |
+| gfx942 | |
+| gfx1030 | Radeon PRO V620 |
+| gfx1100 | Radeon PRO W7900 |
+| gfx1101 | Radeon PRO W7700 |
+| gfx1102 | Radeon RX 7600 |
+
+AMD is working on enhancing ROCm v6 to broaden support for more families of GPUs
+in a future release.
+
+Reach out on [Discord](https://discord.gg/ollama) or file an
+[issue](https://github.com/ollama/ollama/issues) for additional help.
+
+### Metal (Apple GPUs)
+Ollama supports GPU acceleration on Apple devices via the Metal API.
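
The Overrides section added in docs/gpu.md maps an LLVM target such as `gfx1034` to the `x.y.z` form `10.3.4`. As a rough sketch of that mapping, the shell function below converts a target name into the version string expected by `HSA_OVERRIDE_GFX_VERSION`. The function name `gfx_to_version` is hypothetical and is not part of Ollama or ROCm. It assumes the last two characters of the target are the minor and stepping digits and everything before them is the major version, which matches the `gfx1034` example; targets with a letter stepping, such as `gfx90a`, are not handled.

```shell
# Hypothetical helper (not part of Ollama or ROCm): convert an LLVM target
# name such as gfx1030 into the x.y.z form used by HSA_OVERRIDE_GFX_VERSION.
# Assumption: the last two characters are the minor and stepping digits and
# the leading characters are the major version. Letter steppings (gfx90a)
# are not converted.
gfx_to_version() {
  target="${1#gfx}"                 # strip the "gfx" prefix
  len=${#target}                    # number of remaining characters
  major=$(printf '%s' "$target" | cut -c1-$((len - 2)))
  minor=$(printf '%s' "$target" | cut -c$((len - 1)))
  step=$(printf '%s' "$target" | cut -c"$len")
  printf '%s.%s.%s\n' "$major" "$minor" "$step"
}
```

Assuming the server is started with `ollama serve`, the override from the example in the docs could then be applied as `HSA_OVERRIDE_GFX_VERSION="$(gfx_to_version gfx1030)" ollama serve`.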

docs/troubleshooting.md

Lines changed: 0 additions & 34 deletions
@@ -67,40 +67,6 @@ You can see what features your CPU has with the following.
 cat /proc/cpuinfo| grep flags | head -1
```

-## AMD Radeon GPU Support
-
-Ollama leverages the AMD ROCm library, which does not support all AMD GPUs. In
-some cases you can force the system to try to use a similar LLVM target that is
-close. For example The Radeon RX 5400 is `gfx1034` (also known as 10.3.4)
-however, ROCm does not currently support this target. The closest support is
-`gfx1030`. You can use the environment variable `HSA_OVERRIDE_GFX_VERSION` with
-`x.y.z` syntax. So for example, to force the system to run on the RX 5400, you
-would set `HSA_OVERRIDE_GFX_VERSION="10.3.0"` as an environment variable for the
-server. If you have an unsupported AMD GPU you can experiment using the list of
-supported types below.
-
-At this time, the known supported GPU types are the following LLVM Targets.
-This table shows some example GPUs that map to these LLVM targets:
-| **LLVM Target** | **An Example GPU** |
-|-----------------|---------------------|
-| gfx900 | Radeon RX Vega 56 |
-| gfx906 | Radeon Instinct MI50 |
-| gfx908 | Radeon Instinct MI100 |
-| gfx90a | Radeon Instinct MI210 |
-| gfx940 | Radeon Instinct MI300 |
-| gfx941 | |
-| gfx942 | |
-| gfx1030 | Radeon PRO V620 |
-| gfx1100 | Radeon PRO W7900 |
-| gfx1101 | Radeon PRO W7700 |
-| gfx1102 | Radeon RX 7600 |
-
-AMD is working on enhancing ROCm v6 to broaden support for families of GPUs in a
-future release which should increase support for more GPUs.
-
-Reach out on [Discord](https://discord.gg/ollama) or file an
-[issue](https://github.com/ollama/ollama/issues) for additional help.
-
 ## Installing older or pre-release versions on Linux
 
 If you run into problems on Linux and want to install an older version, or you'd
