
Windows 11 install procedure and missing dependencies (transformers) #31

Open
Anon426 opened this issue Jan 19, 2024 · 2 comments

@Anon426 commented Jan 19, 2024

Hi all,

OK, I had multiple issues getting this working with the default requirements.txt install, and although the additional install info is useful, there's no full guide, so here are the steps that will hopefully get it running for you. I'm using Windows 11 (NOT WSL) and a 4090.

1. install python 3.10.x + tick the "add to PATH" option

2. install git

3. install cuda 12.1 + make sure the paths are correct

4. install vscode for c++ (unknown if necessary, but I already have it for many other AI gens)

5. git clone the repo and prepare it:

git clone https://github.com/turboderp/exui.git

cd exui

python -m venv venv

call venv\Scripts\activate

6. open requirements.txt, delete the torch & exllamav2 entries, and save

7. install torch for cuda 12.1:

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
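
Optional: if you want to sanity-check the torch install at this point, something like this (run inside the venv) should show the cu121 build and the 4090:

import torch

# the version string should end in "+cu121" and CUDA should be available
print(torch.__version__)
print(torch.version.cuda)             # expect "12.1"
print(torch.cuda.is_available())      # expect True
print(torch.cuda.get_device_name(0))  # expect the RTX 4090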

7a. then install the remaining requirements deps:

pip install -r requirements.txt

8. download the correct pre-compiled wheels for exllamav2 & flash-attention 2:

https://github.com/turboderp/exllamav2/releases/download/v0.0.11/exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl

https://github.com/jllllll/flash-attention/releases/download/v2.4.2/flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl

9. move these wheels to the exui directory, then install them:

pip install "exllamav2-0.0.11+cu121-cp310-cp310-win_amd64.whl

pip install "flash_attn-2.4.2+cu121torch2.1cxx11abiFALSE-cp310-cp310-win_amd64.whl"

10. install transformers + sentencepiece (sentencepiece may not be needed, but I got it anyway). This is missing from the documentation here, and you'll need these to actually load a model:

pip install --no-cache-dir transformers sentencepiece
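
Optional: a one-off check that both packages import, since (as noted above) they're only needed once you actually load a model:

# the tokenizer deps exui needs at model-load time
import transformers
import sentencepiece

print(transformers.__version__)
print(sentencepiece.__version__)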

11. run exui:

python server.py

12. go to Models and load your model

enjoy

@Odin7094

I'm on W10, but still, thanks! It finally worked once I went by your instructions.

@tamanor commented May 10, 2024

I've tried getting exui to work for the past few days and had nothing but issues; your guide has got me the furthest so far. But when I run the python server.py command, I get the following:

(venv) J:\exui>python server.py
Traceback (most recent call last):
  File "J:\exui\server.py", line 11, in <module>
    from backend.models import update_model, load_models, get_model_info, list_models, remove_model, load_model, unload_model, get_loaded_model
  File "J:\exui\backend\models.py", line 5, in <module>
    from exllamav2 import(
  File "J:\exui\venv\Lib\site-packages\exllamav2\__init__.py", line 3, in <module>
    from exllamav2.model import ExLlamaV2
  File "J:\exui\venv\Lib\site-packages\exllamav2\model.py", line 29, in <module>
    from exllamav2.attn import ExLlamaV2Attention, has_flash_attn
  File "J:\exui\venv\Lib\site-packages\exllamav2\attn.py", line 26, in <module>
    import flash_attn
  File "J:\exui\venv\Lib\site-packages\flash_attn\__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "J:\exui\venv\Lib\site-packages\flash_attn\flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: DLL load failed while importing flash_attn_2_cuda: The specified procedure could not be found.

From what I can see I have everything installed; any idea what I'm doing wrong?
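
For reference, a "DLL load failed while importing flash_attn_2_cuda" error usually points to a flash_attn wheel built against a different torch/CUDA than the one in the venv. A version check like the following (a minimal sketch, assuming the setup from the guide above) makes such a mismatch visible:

import torch
from importlib.metadata import version

# the flash_attn wheel in the guide targets torch 2.1.x with CUDA 12.1
print("torch:", torch.__version__)
print("torch CUDA:", torch.version.cuda)
print("flash_attn:", version("flash_attn"))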
