Installation fails in Windows 11 #5
You can install PyTorch separately, from here. The CUDA toolkit can be installed from here. To run the JIT version of ExLlamaV2 you'll also need Visual Studio (or the VS build tools) installed, but alternatively you can get a prebuilt wheel from here. As for optimization, you'll probably want flash-attn-2 installed, though it can be a bit tricky on Windows. There are people who've got it working. Then, if you're on the latest NVIDIA driver, keep an eye on your VRAM usage and see if it seems suspiciously high (as in, higher than 24 GB), since then the driver may have started swapping VRAM to system RAM, which slows everything to a crawl. There should be an option for disabling that behavior in the latest driver, or you could downgrade to 531.x or lower.
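Not part of the original reply, but the VRAM-swapping symptom above can be watched from a script. A minimal sketch, assuming `nvidia-smi` is on your PATH (the helper name `query_vram_mib` is mine, not from the thread):

```python
import shutil
import subprocess

def query_vram_mib():
    """Return (used_mib, total_mib) for GPU 0 via nvidia-smi, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # no NVIDIA driver/tools on PATH
    proc = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if proc.returncode != 0 or not proc.stdout.strip():
        return None  # driver present but query failed
    used, total = proc.stdout.strip().splitlines()[0].split(", ")
    return int(used), int(total)

print(query_vram_mib())
```

On a 24 GB card, a "used" figure that keeps climbing past the total is the sign the driver has started spilling into system RAM.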
"You can install PyTorch separately, from here. The CUDA toolkit can be installed from here." Thanks, but I already have both of these installed, because I use Automatic1111 (and SD.NEXT) and OOBE for AI text generation, all of which require them. At least I know OOBE is using CUDA 12.1 and my other image-generation software is using PyTorch, so I'm not sure why it's acting like I don't have these installed. I guess I'll wait until flash-attn-2 is more friendly to install on Windows... not good enough at that...
Are you sure you have torch installed? According to the screenshot you don't, or at least it's in an isolated venv somewhere. It's also possible you have an older version (probably 2.0.1) and simply need to upgrade. As for Flash Attention, it's entirely optional. It helps on long contexts, but it's not a massive difference most of the time.
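One quick way to settle the "is torch actually installed?" question is to ask the exact interpreter you'd run ExLlamaV2 with. A small sketch (the helper name `torch_status` is mine):

```python
import importlib.metadata
import importlib.util

def torch_status():
    """Version of torch visible to THIS interpreter, or None if it isn't installed."""
    if importlib.util.find_spec("torch") is None:
        return None
    return importlib.metadata.version("torch")

print(torch_status())  # e.g. None, or a version string like "2.1.0+cu121"
```

If this prints `None` from the same Python you launch ExLlamaV2 with, that matches the screenshot: torch simply isn't visible to that interpreter, even if another venv has it.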
So... it needs to install another copy of PyTorch and everything else, in addition to Auto1111, Oobabooga, etc.?
It needs a copy of PyTorch. Oobabooga etc. install themselves into virtual environments to keep everything contained, because they have a mountain of dependencies that all need to be very specific versions. Which also means that all those dependencies aren't available to other applications. This pretty much just needs the latest PyTorch and ExLlamaV2.
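To see the venv isolation described above in action, you can ask any Python interpreter whether it's inside a virtual environment. A minimal sketch (the helper name `in_virtualenv` is mine; the `sys.prefix`/`sys.base_prefix` comparison is the standard venv check):

```python
import sys

def in_virtualenv():
    """True when this interpreter is running inside a venv/virtualenv."""
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)

# Run this with each tool's Python to see which environment you're actually in:
print(sys.executable, in_virtualenv())
```

Running this with Oobabooga's bundled Python versus your system Python makes it obvious why packages installed in one aren't visible to the other.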
Hmm, ok, I installed the requirements, but now it fails to start with
Does it really need the CUDA toolkit installed? Which version? |
Alternatively, there are prebuilt wheels here that contain both |
This is where I installed exllamav2 from, and it gave me all the above error messages. It seems to me that |
Oh. Well, sadly there's no good way to deal with PyTorch as a requirement. I guess I should make a note in the readme at least, because I think the default version, if you don't have it installed already, is always going to be what you got. Which won't work, of course.
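The "default version" problem above usually means pip pulled a CPU-only torch wheel. You can tell which build you ended up with by checking `torch.version.cuda`. A small sketch (the helper name `torch_cuda_build` is mine):

```python
import importlib.util

def torch_cuda_build():
    """CUDA version torch was built against, "" for a CPU-only wheel,
    or None when torch isn't installed in this environment."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return torch.version.cuda or ""

print(torch_cuda_build())
```

An empty string here means the CPU-only build, which is exactly the case that "won't work" for ExLlamaV2; reinstalling torch from the CUDA-specific index fixes it.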
I was able to clone, but installing from requirements fails. How do I get around this?
For reference I'm using a RTX 4090 system with i9-13900k CPU, 96GB of RAM, so any optimizations suggestions also appreciated.