
Installation fails in Windows 11 #5

Open
cleverestx opened this issue Nov 8, 2023 · 9 comments


@cleverestx

[screenshot of the failed installation]

I was able to clone, but installing from requirements fails. How do I get around this?

For reference, I'm using an RTX 4090 system with an i9-13900K CPU and 96 GB of RAM, so any optimization suggestions are also appreciated.

@turboderp
Member

You can install PyTorch separately, from here. The CUDA toolkit can be installed from here.

To run the JIT version of ExLlamaV2 you'll also need Visual Studio (or the VS Build Tools) installed, but alternatively you can get a prebuilt wheel from here.

As for optimization, you'll probably want flash-attn-2 installed, though it can be a bit tricky on Windows; some people have got it working. Then, if you're on the latest NVIDIA driver, keep an eye on your VRAM usage and see if it seems suspiciously high (as in, higher than 24 GB). If so, the driver may have started swapping VRAM to system RAM, which slows everything to a crawl. There should be an option for disabling that behavior in the latest driver, or you could downgrade to 531.x or lower.
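As a rough sanity check of the prerequisites discussed above, a small standard-library sketch can report which pieces look installed without importing them (so it won't crash if torch is missing). The function name `check_prereqs` and the choice of what to probe are my own; `nvcc` on `PATH` is only a rough proxy for the CUDA toolkit.

```python
# Minimal sketch, assuming you just want a quick yes/no report on the
# prerequisites mentioned in this thread. Uses only the standard library.
import importlib.util
import shutil

def check_prereqs():
    """Return a dict describing which prerequisites appear to be installed."""
    return {
        # find_spec() checks for the package without importing it
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "flash_attn_installed": importlib.util.find_spec("flash_attn") is not None,
        # nvcc on PATH is a rough proxy for the CUDA toolkit being present
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }

if __name__ == "__main__":
    for name, ok in check_prereqs().items():
        print(f"{name}: {ok}")
```

Running this before `pip install -r requirements.txt` makes it obvious whether the failure is a missing torch rather than something ExLlamaV2-specific.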

@cleverestx
Author

"You can install PyTorch separately, from here. CUDA toolkit can be installed from here."

Thanks, but I already have both of these installed, because I use Automatic1111 (and SD.NEXT) and OOBE for AI text generation, all of which require them... at least I know OOBE is using CUDA 12.1, and my other image-generation software is using PyTorch... so I'm not sure why it's acting like I don't have these installed.

I guess I'll wait until flash-attn-2 is friendlier to install on Windows... I'm not good enough at that...

@turboderp
Member

Are you sure you have torch installed? According to the screenshot you don't, or at least it's in an isolated venv somewhere. It's also possible you have an older version (probably 2.0.1) and simply need to upgrade.

As for Flash Attention, it's entirely optional. It helps on long contexts, but it's not a massive difference most of the time.
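The "older version, simply need to upgrade" check above can be sketched as a small version comparison. This is a minimal sketch under the assumption that the installed version string looks like PyTorch's usual `"2.0.1+cu118"` form, where the `+cuXXX`/`+cpu` suffix is the build's local tag; the function name `needs_upgrade` is hypothetical.

```python
# Minimal sketch: decide whether an installed torch version string
# (e.g. from torch.__version__) is older than the 2.1.0 mentioned above.

def needs_upgrade(version: str, minimum=(2, 1, 0)) -> bool:
    base = version.split("+", 1)[0]            # drop the "+cu121"/"+cpu" tag
    parts = tuple(int(p) for p in base.split(".")[:3])
    return parts < minimum                     # tuple comparison, element-wise

print(needs_upgrade("2.0.1+cu118"))  # True  -> upgrade needed
print(needs_upgrade("2.1.0+cu121"))  # False -> new enough
```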

@Zueuk

Zueuk commented Nov 18, 2023

so... it needs to install another copy of PyTorch and everything else, in addition to Auto1111, Oobabooga, etc.?

@turboderp
Member

It needs a copy of PyTorch. Oobabooga etc. install themselves into virtual environments to keep everything self-contained, because they have a mountain of dependencies that all need to be very specific versions. That also means those dependencies aren't available to other applications.

This pretty much just needs the latest PyTorch and ExLlamaV2.

@Zueuk

Zueuk commented Nov 18, 2023

Hmm ok, I installed the requirements, now it fails to start with

No CUDA runtime is found, using CUDA_HOME='C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA\v11.3'
Traceback (most recent call last):
...
import exllamav2_ext
ImportError: DLL load failed while importing exllamav2_ext: The specified module could not be found.

Does it really need the CUDA toolkit installed? Which version?
Where do I get exllamav2_ext?

@turboderp
Member

exllamav2_ext is a component of the exllamav2 package. It's a PyTorch extension that gets built and loaded when you import exllamav2, which requires the CUDA toolkit to be installed. You can also install the extension as its own package with `python setup.py install --user`.

Alternatively, there are prebuilt wheels here that contain both exllamav2 and exllamav2_ext, precompiled for various CUDA and Python versions. These should work without the CUDA toolkit installed. There were some changes to PyTorch between versions 2.0.1 and 2.1.0, so you'll probably need PyTorch>=2.1.0, since the wheels were built against that version.
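When picking one of those prebuilt wheels, the filename encodes the CUDA build, Python version, and platform following the standard wheel naming convention (PEP 427): `distribution-version(+local)-pythontag-abitag-platformtag.whl`. A minimal sketch for constructing the expected name follows; the package version `0.0.8` and tags here are illustrative placeholders, not the actual release numbers, and the function name is my own.

```python
# Minimal sketch: build the expected wheel filename for a given
# CUDA build / Python version / platform, per PEP 427 naming.
# All default values below are illustrative placeholders.

def expected_wheel_name(pkg="exllamav2", version="0.0.8", cuda="cu121",
                        py="cp310", platform="win_amd64"):
    # distribution-version(+local)-pythontag-abitag-platformtag.whl
    return f"{pkg}-{version}+{cuda}-{py}-{py}-{platform}.whl"

print(expected_wheel_name())
print(expected_wheel_name(cuda="cu118", py="cp311"))
```

Matching the `cuXXX` tag to your installed CUDA runtime and the `cpXXX` tag to your Python interpreter is what makes a prebuilt wheel work without the toolkit.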

@Zueuk

Zueuk commented Nov 19, 2023

This is where I installed exllamav2 from, and it gave me all the above error messages.

It seems to me that pip install -r requirements.txt installed a CPU-only version of PyTorch. I've reinstalled it using the command line from https://pytorch.org/get-started/locally/, and now it actually works.
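The CPU-only-wheel problem above can be spotted from the version string alone: on Windows, CUDA-enabled PyTorch wheels carry a `+cuXXX` local tag (e.g. `2.1.0+cu121`) while the default PyPI wheel is tagged `+cpu`. A minimal sketch under that assumption (the function name `is_cuda_build` is hypothetical, and note that default Linux wheels ship with CUDA and no tag, so this heuristic is Windows-specific):

```python
# Minimal sketch: detect a CUDA-enabled Windows PyTorch build from its
# version string (e.g. the value of torch.__version__).

def is_cuda_build(version: str) -> bool:
    # "+cu121" etc. marks a CUDA wheel; "+cpu" marks the CPU-only wheel
    return "+cu" in version

print(is_cuda_build("2.1.0+cu121"))  # True
print(is_cuda_build("2.1.0+cpu"))    # False
```

At runtime you can also check `torch.version.cuda`, which is `None` in a CPU-only build.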

@turboderp
Member

Oh. Well, sadly there's no good way to declare PyTorch as a requirement. I guess I should at least make a note in the readme, because if you don't already have it installed, the default version pip pulls in is always going to be the CPU-only one you got. Which won't work, of course.
