Commit
docs: create blog post for vulkan support (#2021)
* docs: create blog post for vulkan support
* Update website/blog/2024-05-01-vulkan-support/index.md

---------

Co-authored-by: Meng Zhang <[email protected]>
Showing 4 changed files with 70 additions and 0 deletions.
@@ -0,0 +1,61 @@
---
title: 'Vulkan Support: LLMs for Everyone'
authors: [boxbeam]
tags: [deployment]
---

Machine learning models have long been run on GPUs to improve their performance.
GPUs are far more effective than CPUs at the kinds of computations needed for AI, so GPU compute libraries
such as CUDA and ROCm are typically used.

However, requiring these libraries restricts which graphics cards are compatible, leaving many people
with older or less popular cards unable to run LLMs efficiently.

Tabby is happy to announce that we now support Vulkan, a graphics API created primarily for games. Because of that original purpose,
it is designed to work on a very broad range of cards, and leveraging it to host LLMs means that we can now
offer GPU acceleration to people whose cards are not supported by CUDA or ROCm.

Vulkan works on nearly any modern GPU, so if you have previously been forced to host local models on your CPU, now is the time
to see what Tabby with Vulkan can do for you!

## Vulkan Installation

To begin, make sure that you have Vulkan installed.

For Windows users, Vulkan may already be available through your graphics driver. Otherwise, the Vulkan SDK can be downloaded at https://vulkan.lunarg.com/sdk/home#windows.

For Linux users, Vulkan can be installed through your package manager:
- Arch Linux: `vulkan-icd-loader` (universal), plus `vulkan-radeon` (for AMD) or `vulkan-nouveau` (for Nvidia)
- Debian Linux: `libvulkan1`

![Vulkan installed on Arch Linux](./vulkan-installed-on-arch.png)

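
To confirm that a Vulkan driver is actually visible before moving on, one option is the `vulkaninfo` utility. The sketch below assumes that utility is installed; it is commonly packaged separately as `vulkan-tools`, which the steps above do not require. The function name `check_vulkan` is a hypothetical helper, not part of Tabby.

```shell
# Hypothetical helper: report whether a Vulkan driver is visible on this machine.
# Assumes the optional `vulkaninfo` utility (often packaged as vulkan-tools).
check_vulkan() {
  if command -v vulkaninfo >/dev/null 2>&1; then
    # Prints a short summary of the detected GPUs and driver versions.
    vulkaninfo --summary
  else
    echo "vulkaninfo not found; install vulkan-tools to verify your setup" >&2
    return 1
  fi
}
```

Run `check_vulkan` in your shell. If your card is listed in the summary, the Vulkan loader and driver are in place.
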
## Tabby Installation

To start using Tabby with Vulkan, download the pre-built Vulkan binary for your platform:
- Linux: https://github.com/TabbyML/tabby/releases/download/v0.10.0/tabby_x86_64-manylinux2014-vulkan
- Windows: https://github.com/TabbyML/tabby/releases/download/v0.10.0/tabby_x86_64-windows-msvc-vulkan.exe

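
If you prefer to fetch the binary from a terminal, the download can be scripted. This is a minimal sketch, assuming `curl` is installed; the helper names `tabby_vulkan_url` and `download_tabby_linux` and the `TABBY_VERSION` variable are hypothetical and only wrap the release links listed above.

```shell
# Release tag taken from the download links above.
TABBY_VERSION="v0.10.0"

# Hypothetical helper: print the Vulkan release asset URL for a platform.
tabby_vulkan_url() {
  case "$1" in
    linux)   asset="tabby_x86_64-manylinux2014-vulkan" ;;
    windows) asset="tabby_x86_64-windows-msvc-vulkan.exe" ;;
    *) echo "unsupported platform: $1" >&2; return 1 ;;
  esac
  echo "https://github.com/TabbyML/tabby/releases/download/${TABBY_VERSION}/${asset}"
}

# Hypothetical helper: download the Linux binary and mark it executable.
# -L follows GitHub's redirect to the asset; -f makes curl fail on HTTP errors.
download_tabby_linux() {
  url="$(tabby_vulkan_url linux)" || return 1
  # ${url##*/} strips everything up to the last slash, leaving the file name.
  curl -fL -o "${url##*/}" "$url" && chmod +x "${url##*/}"
}
```

Calling `download_tabby_linux` saves the binary in the current directory under its release name. The `chmod +x` step matters on Linux because downloaded files are not executable by default.
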
## Running

Once you've downloaded the appropriate binary, you can run it from the command line.

For Windows, open a command prompt, navigate to the download folder, and run:

```
tabby_x86_64-windows-msvc-vulkan serve --model StarCoder-1B --device vulkan
```

For Linux:

```
# Make the downloaded binary executable (only needed once)
chmod +x ./tabby_x86_64-manylinux2014-vulkan
./tabby_x86_64-manylinux2014-vulkan serve --model StarCoder-1B --device vulkan
```

When it starts, you should see a printout indicating that Vulkan has found your card and is working properly:

![Vulkan running on Linux](./vulkan-running.png)

Now enjoy your speedy completions!

![Completion example](./completion.png)
website/blog/2024-05-01-vulkan-support/vulkan-installed-on-arch.png (3 changes: 3 additions & 0 deletions)