Can please somebody give a clear explanation of how to install torch-tensorrt on Windows? #2577
Hello - we are working on a pip-installable Windows distribution for which a sample can be found on the In order to use it, the Torch distribution requires this patch: pytorch/pytorch#111313, to enable Dynamo experimentally on Windows, since Torch does not yet fully support Dynamo on Windows. Then, once cloned, running |
Hello again, thank you for your reply, @gs-olive, and I apologize for replying so late. Thank you for pointing me to the windows_CI branch; I hadn't noticed it before. Now, if I use git clone https://github.com/pytorch/TensorRT.git (even though this is the same link as the non-windows_CI one, it's on the windows_CI branch) and run Instead, I downloaded the zip file of the windows_CI branch, but when I run setup.py, I get an error that this is not a git repository. I ran the command
specifically the line Thank you for your help |
Hello - thanks for the follow-up. To clone a specific branch from the command line, I use:

```
git clone --single-branch --branch windows_CI https://github.com/pytorch/TensorRT.git
```

This should resolve the git issues related to that repository. Then, |
Hi again @gs-olive, I've successfully installed it like you said, but now when I try to import torch_tensorrt, I get this error: C:\Users\Tomas\AppData\Local\Programs\Python\Python310\lib\site-packages\torch_tensorrt\_Device.py:19: UserWarning: Unable to import torchscript frontend core and torch-tensorrt runtime. Some dependent features may be unavailable. |
Thanks for testing that out - I have seen this error before; it relates to the torch version being used. The branch is based on the latest Torch nightly, which has the |
Thank you @gs-olive for your fast reply. Now I'm really confused. I installed torch_tensorrt, but when I try to import torch_tensorrt I get this error and the program stops: For some reason, when I first installed it I was having some problems, so I did pip uninstall torch_tensorrt and then ran python setup.py install again, and I get the error Or, for it to work, do I need to download the PyTorch 2.2.0 nightly? |
@gs-olive Hello, I cloned it again as well and it ends with the same error, "returned non-zero exit status 128". Could we have some kind of step-by-step guide of what exactly we have to do? 1.) Download this, that and that. Something like that, I mean. Thanks a lot. |
I still get the following error using the latest commit (3c099ef at the time of writing) from the
|
@HolyWu I don't know exactly why your problem is happening, since I'm not the dev of this, but perhaps you could try Python 3.10, because the dev mentioned it. |
By the way, I have solved it myself. Here is how: 1.) Before all of this, please install Python 3.11. Setup should install the library. However, I have not tried more than this for now. Some kind of example would be helpful, though. |
Hello - thanks for the comments on this thread. To address them: @ninono12345 - the @HolyWu - thanks for this catch/report - @jensdraht1999 - the steps provided are very close to what I am using on this experimental branch to get a working build. For a detailed set of steps I am currently using with this experimental branch, here is what I have: Steps for Dynamo-Only, Python-Runtime-Only Windows build using
|
@gs-olive I have done all of this. But Step 8 is a problem, and Step 9 doesn't really work, I think. Please look at the pictures. Would you recommend that I use Python 3.10? By the way, I have completely reinstalled Python 3.11, deleted the whole site-packages folder and more to have it all clean, and also CUDA 12.1. Step 8: Step 9: If I run the example file without changing anything: What could be the problem? |
Hi - thanks for the follow-up. I do not think Python 3.10 vs 3.11 should make a difference here, though I am using Python 3.10. Regarding the attached image, in the IDE on the left, it seems that there is an error finding a Regarding the error logs, that is due to the version of the example which I had linked in the original message. I have fixed the link now, and it uses the correct configuration parameters for Windows. It is linked here as well for convenience. |
@gs-olive When I "import torch; import torch_tensorrt" in IDLE, it throws multiple .dll "procedure entry point not found" (German: "Prozedureinsprungpunkt nicht gefunden") errors. After I click them away, it runs correctly. The files are actually there where they are supposed to be. Please look: Also, the eval file is where it belongs: This is the log for the example file that you have linked just now: |
@gs-olive I tried Python 3.10 as well; it did not work. Could you upload your site-packages folder to the internet so I could try it out? |
Got it - thanks for sharing this. It seems that the issue there is in Dynamo and not Torch-TensorRT. If the below snippet also fails on your machine, that would indicate a Torch issue.

```python
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True).half().eval().to("cuda")
inputs = [torch.randn((1, 3, 224, 224)).to("cuda").half()]
optimized_model = torch.compile(model, backend="eager")
optimized_model (*inputs)
```

Additionally, I am not using IDLE as my Python interpreter - I have installed Python for Windows from the Python site directly and I am invoking it in PowerShell - I think this could also potentially affect Torch. The versions of the relevant packages on my machine are:

```
$ python -m pip list
Package                  Version
------------------------ ------------------------
tensorrt                 9.2.0.post12.dev5
tensorrt-bindings        9.2.0.post12.dev5
tensorrt-libs            9.2.0.post12.dev5
torch                    2.3.0.dev20240116+cu121
```
|
The example you have provided did give me the same error, so my PyTorch nightly somehow does not work. So I have to work it out. My dependencies: C:\Users\King>python -m pip list black 23.12.1 |
@gs-olive Just one question. I think I have narrowed down the problem. I think torchvision is definitely needed, and I also installed the nightly version, but it is causing problems. Can you just tell me which torchvision and torchaudio versions you have installed? I think this would make the problem go away. |
Okay, I fixed the problem with torchvision like this. Please try to download them like this:

```
pip install --pre torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
```

While downloading, you will see the download links for the .whl files. Put the links in a browser and download them. Then:

```
pip install "C:\Users\{UserName}\Desktop\whl files\torchaudio-2.2.0.dev20240118+cu121-cp310-cp310-win_amd64.whl" --force-reinstall --no-deps
pip install "C:\Users\{UserName}\Desktop\whl files\torchvision-0.18.0.dev20240118+cu121-cp310-cp310-win_amd64.whl" --force-reinstall --no-deps
```

This will install both packages regardless of their dependencies. However, the problem with the original example still exists, but I will look at it now. |
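Since the wheels here are installed by hand with --no-deps, it is easy to grab one built for the wrong interpreter. The cp310/win_amd64 tags in a wheel filename can be checked programmatically; a small sketch, where `parse_wheel_name` is a hypothetical helper (not part of pip) and the optional PEP 427 build tag is assumed absent:

```python
def parse_wheel_name(filename: str) -> dict:
    """Split a PEP 427 wheel filename into its tag fields.

    Assumes the optional build tag is absent, which holds for the
    torch nightly wheels discussed in this thread.
    """
    dist, version, py_tag, abi_tag, platform = (
        filename.removesuffix(".whl").split("-")
    )
    return {"name": dist, "version": version, "python": py_tag,
            "abi": abi_tag, "platform": platform}

info = parse_wheel_name(
    "torchaudio-2.2.0.dev20240118+cu121-cp310-cp310-win_amd64.whl")
print(info["python"], info["platform"])  # cp310 win_amd64
```

A cp310 tag means the wheel only matches Python 3.10, which is why mixing interpreter versions caused trouble earlier in this thread.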
Thanks for the update. I do not have
I installed these using something very similar to what you have written:

```
pip uninstall -y torch torchvision
pip install --pre torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu121
```
|
@gs-olive Okay, now I have fixed it; the remaining errors are the expected ones. The problem was this: you have to delete the following folder: "C:\Users\{UserName}\AppData\Local\Programs\Python\Python310\Lib\site-packages\nvidia\cudnn" It seems that torch already ships those files somehow, and after deleting the folder the DLL errors disappeared. |
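A quick way to see whether this situation applies on a given machine is to list DLL names that appear both in torch's bundled lib directory and in the pip-installed nvidia cudnn package. The snippet below is a sketch: `duplicate_dlls` is a made-up helper, and the two directory paths are assumed from the paths quoted in this thread:

```python
import site
from pathlib import Path

def duplicate_dlls(dir_a, dir_b) -> set:
    """Return DLL file names present in both directories (case-insensitive)."""
    names_a = {p.name.lower() for p in Path(dir_a).glob("*.dll")}
    names_b = {p.name.lower() for p in Path(dir_b).glob("*.dll")}
    return names_a & names_b

if __name__ == "__main__":
    sp = Path(site.getsitepackages()[0])
    torch_lib = sp / "torch" / "lib"             # DLLs bundled with torch
    cudnn_bin = sp / "nvidia" / "cudnn" / "bin"  # DLLs from nvidia-cudnn-cu12
    if torch_lib.is_dir() and cudnn_bin.is_dir():
        for name in sorted(duplicate_dlls(torch_lib, cudnn_bin)):
            print("present twice on the DLL search path:", name)
```

Any name printed exists in two versions on the search path, which is the kind of shadowing that produces "entry point not found" errors.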
Tips: Do NOT use IDLE. There is really something broken with it. Use cmd or PowerShell or something else. The example really worked when run through cmd. There's still an error, but it is definitely loading everything into VRAM. Is this error at the end of the script expected? @gs-olive Thanks again for your help and patience. @gs-olive |
Updated to CUDA 12.3 and did all the steps again. Now it seems to work better, but I'm not sure whether the provided example is supposed to produce exactly the error below. Also, this time I deleted only the .dll files in the folder "C:\Users\{UserName}\AppData\Local\Programs\Python\Python310\Lib\site-packages\nvidia\cudnn"; everything else stayed in there. The error seems to be less problematic. And again, run in cmd, NOT in IDLE. |
Thanks for the details - the error at the end is not expected and I want to determine whether the issue is in |
@gs-olive I do it like this: I have saved the code in a file called 1.py. The content:
I run via CMD. This happens:
But if I run this code here via CMD:
This will happen:
If I run this code via CMD:
This happens:
I'm not sure if I have forgotten something somehow. I have followed all your steps, step by step. Here are the packages:
If I clone to and then run: python setup.py install via CMD, this is what I get:
Also, if I delete the folders torch_tensorrt and torch_tensorrt-2.3.0.dev0+e7e92911-py3.10.egg-info in the site-packages folder and run the code again, this happens, and the folders are created there again:
So what do we do now? |
This is what happens in PowerShell when the import is done: PS C:\Users\{UserName}> python
|
WAIT A MINUTE: Could it be that regular GPUs like the laptop RTX 3060 are not supported by TensorRT at all? |
I thought gs-olive had a typo in #2577 (comment). It should be

```python
import torch
import torchvision.models as models

model = models.resnet18(pretrained=True).half().eval().to("cuda")
inputs = [torch.randn((1, 3, 224, 224)).to("cuda").half()]
optimized_model = torch.compile(model, backend="eager")
print(optimized_model(*inputs))
```
|
It worked with the code you have provided. Thank you. @HolyWu Here is the output:
|
@HolyWu I just wanted to ask: did it work for you without deleting the "C:\Users\{UserName}\AppData\Local\Programs\Python\Python310\Lib\site-packages\nvidia\cudnn" folder? |
Yes. Although it throws multiple "entry point not found" errors about cuDNN DLLs, torch_compile_resnet_example still runs fine after clicking them away. You can of course The reason for the error is that the version of the cuDNN DLLs bundled in the torch Windows wheel is still 8.8.0.1. When you install tensorrt==9.2.0.post12.dev5, it installs the latest nvidia-cudnn-cu12 for you, which currently is 8.9.7.29. When you |
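The version skew described above can be made explicit with a few lines of standard-library code; a sketch, where `versions_agree` is a hypothetical helper and the two version strings are the ones quoted in this comment:

```python
from importlib import metadata

def pip_cudnn_version():
    """Version of the pip-installed nvidia-cudnn-cu12 package, or None if absent."""
    try:
        return metadata.version("nvidia-cudnn-cu12")
    except metadata.PackageNotFoundError:
        return None

def versions_agree(a: str, b: str) -> bool:
    """Compare two dotted version strings component by component."""
    as_tuple = lambda v: tuple(int(x) for x in v.split("."))
    return as_tuple(a) == as_tuple(b)

# torch's bundled copy vs. what tensorrt pulled in, per the comment above:
print(versions_agree("8.8.0.1", "8.9.7.29"))  # False -> mismatched entry points
```

When the two copies disagree, whichever DLL is found first on the search path wins, and the other library's expected entry points may be missing from it.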
Thank you @HolyWu for the catch - that was a bug in my initial script which has now been addressed; apologies for that @jensdraht1999. Regarding the cuDNN issue, I'm looking for a workaround to make the imports smoother - potentially by having one of @jensdraht1999 - it is good to see the |
@gs-olive I am reinstalling everything again, because it seems I had the TensorRT folder in the PATH, which might have caused errors. I will tell you everything when done. |
Run as non-admin and also as admin per cmd. GPU hardware scheduling tested on/off; currently off. Energy profiles Performance and Powersave both tested. Windows was also restarted after installation. The file "eval_frame.py" from the link you posted has been put in the folder "C:\Users\{UserName}\AppData\Local\Programs\Python\Python310\Lib\site-packages\torch\_dynamo" I can confirm it exists and gets downloaded again if non-existent: Nvidia CUDA System Fallback Policy: Default, but ON/OFF also tested, the same. Everything from the PATH is in the jpg below: This is the system environment: This happens if I run "import torch; import torch_tensorrt":
And then your example: torch_compile_resnet_example.py Please look at this file: torch_compile_resnet_example.txt WITH THE FOLDER "C:\Users\{UserName}\AppData\Local\Programs\Python\Python310\Lib\site-packages\nvidia\cudnn" the following happens: WITHOUT THE FOLDER: C:\Users\{UserName}\AppData\Local\Programs\Python\Python310\Lib\site-packages\nvidia\cudnn Also, this happens with a file with the following content (HolyWu's corrected example): import torch model = models.resnet18(pretrained=True).half().eval().to("cuda") log3.txt - please look at it. @gs-olive I wonder where the compiled model or the workspace is stored? |
Hi @jensdraht1999 - thanks for testing this out - these logs are very helpful. Normally, the compiled model is stored in the My guess is that the issue with the |
I need to install what you had installed: This should be the exact version you already have: My current installation is like this: I also applied the patch you provided; it was unfortunately worse, because it caused an error. The log is here: @gs-olive I will use the exact versions you used and try again. I will inform you about it. And if it's working for you, HolyWu, and the other user in this thread, then this problem must somehow be my setup. |
I have installed an older version of torch and torchvision. It still does not work. My installed packages: So it seems it must be an error on my side somehow. I'll have to look. |
Thanks for your input. I have tested your steps and they do not work on my Windows machine. When I try to run setup.py, it gives me an error saying I need to specify the commands. Any ideas? |
@enislalmi What exactly does it say? |
Hi, I got it to download your way, but I get this error |
|
Before everything, install a CUDA 12 version. Do you have git installed? Because after installing it, you need to do this: And then the rest of this: Run python setup.py install to install torch_tensorrt. Open a Python interpreter and run There may be warnings about TorchScript and torchvision, but the import should succeed. |
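The last verification step above - importing while tolerating the TorchScript warnings - can be scripted so the check is repeatable; a minimal standard-library sketch, where `smoke_test` is a made-up name and not a torch_tensorrt utility:

```python
import importlib
import warnings

def smoke_test(module_name: str):
    """Try to import a module, returning (success, captured warning messages)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            importlib.import_module(module_name)
            ok = True
        except ImportError:
            ok = False
    return ok, [str(w.message) for w in caught]

if __name__ == "__main__":
    ok, msgs = smoke_test("torch_tensorrt")
    print("import OK" if ok else "import FAILED")
    for m in msgs:
        print("warning:", m)  # TorchScript warnings land here but are not fatal
```

The point is to separate a failed import (a real problem) from a successful import that merely emits warnings (expected on this experimental branch).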
@jensdraht1999 yes everything is done as you said, but I still receive the same error. Thanks! |
Hi @gs-olive. I modified
Full log: ir_dynamo.log By the way, I couldn't find a page on https://pytorch.org/TensorRT explaining the key difference between |
@enislalmi If the import worked as it should and the example provided by @gs-olive worked, then it worked. You have to be aware that this is currently in development and not everything is guaranteed to work. |
Hi @HolyWu - thanks for testing that out! This is actually expected in this case, since Below is a brief summary of the different
|
@gs-olive I tried again; somehow there could still be a problem. What exactly do you get as the output for the example you provided? Here is my output:

```
[03/03/2024-23:16:52] [TRT] [V] CUDA lazy loading is enabled.
The graph consists of 89 Total Operators, of which 89 operators are supported, 100.0% coverage
Compiled with: CompilationSettings(precision=torch.float16, debug=True, workspace_size=21474836480, min_block_size=7, torch_executed_ops={}, pass_through_build_failures=False, max_aux_streams=None, version_compatible=False, optimization_level=None, use_python_runtime=True, truncate_long_and_double=False, use_fast_partitioner=True, enable_experimental_decompositions=False, device=Device(type=DeviceType.GPU, gpu_id=0), require_full_compilation=False, disable_tf32=False, sparse_weights=False, refit=False, engine_capability=<EngineCapability.DEFAULT: 0>, num_avg_timing_iters=1, dla_sram_size=1048576, dla_local_dram_size=1073741824, dla_global_dram_size=536870912, dryrun=False, hardware_compatible=False)

Graph Structure:

Inputs: List[Tensor: (1, 3, 224, 224)@float16]

------------------------- Aggregate Stats -------------------------

Average Number of Operators per TRT Engine: 89.0

********** Recommendations **********
```
|
Thanks for the follow-up. For the example I had referenced, it runs to completion on my machine with no errors. Not sure if this will help, but there was a Windows-support PR I recently added to Torch which was merged: pytorch/pytorch#115969. Upgrading to the latest nightly version of Torch could help with this, since the
|
Ok, I will do it, but I wonder if this can work, because the PR was closed, not merged. Are you sure that if I run this, it will work? pip3 install --pre -U torch torchvision --index-url https://download.pytorch.org/whl/nightly/cu121 I will have to wait until I'm home to install this. But thank you very much for the help. |
I think the logs from this run will still be helpful for debugging if a different error appears. Even though it shows as "Closed", the changes are reflected on |
Okay thanks, I will try as fast as possible. I will report anything to you. |
@gs-olive I wanted to ask whether you intend to implement caching TensorRT engines to a file, so it doesn't have to recompute everything? |
If you are referring to model export, #2806 should enable this workflow via the C++ runtime. |
I mean the model export in PyTorch. You know that there are three kinds of PyTorch workflows: normal, compiled, and TensorRT. Compiled and TensorRT have not been available in PyTorch ON WINDOWS since forever. Then, a few days before this question was asked, there was a PR for PyTorch compile on Windows; however, it performed very badly, so as far as I know it will not be added. TensorRT from Nvidia has been supported with your help, but the model which is optimized and then used for inference is not saved like it is on Linux. Let's say I want to use "RESNET50" for inference, and my performance is 50 per second. Then I use Nvidia's TensorRT, which optimizes the model and saves the optimized model to a directory I choose; the file name will be, for example, "resnet-rtx3060.onnx". After that it will use this model, and this model will only be compatible with my GPU. And because it's compiled/optimized for my NVIDIA GPU, it's 75 per second and probably even more. The range of how well it performs is around 30%-200%. This is the example code I mean (https://pytorch.org/TensorRT/tutorials/getting_started_with_fx_path.html): def compile( I'm 100% honest with you: I do not know a lot; perhaps this PR is something else. I apologize if this is something totally unrelated. |
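The workflow being asked for - build once per GPU, reuse the artifact afterwards - can be sketched generically. This is not the torch-tensorrt API; `cached_compile` and the file naming are invented for illustration, with the cache key mirroring the "resnet-rtx3060" idea above:

```python
import hashlib
import pickle
from pathlib import Path

def cached_compile(model_name: str, gpu_name: str, compile_fn,
                   cache_dir="trt_cache"):
    """Run the expensive compile_fn once per (model, GPU) pair, then reuse
    the serialized artifact from disk on later calls."""
    cache = Path(cache_dir)
    cache.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(f"{model_name}-{gpu_name}".encode()).hexdigest()[:16]
    path = cache / f"{model_name}-{gpu_name}-{key}.bin"
    if path.exists():
        # Cache hit: skip the slow optimization step entirely.
        return pickle.loads(path.read_bytes())
    artifact = compile_fn()  # the slow, GPU-specific optimization step
    path.write_bytes(pickle.dumps(artifact))
    return artifact
```

An engine serialized this way is only valid for the GPU it was built on, which matches the per-GPU behavior described in the comment above.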
❓ Question
Hello,
I've encountered problems installing torch-tensorrt on Windows 10
No matter how I try, no matter how many sources I look up, there is no clear explanation of how to do everything. The documentation is vague, and because I am used to working with Python code, which does everything for you - pip install ..., python code.py, and nothing more is required - I do not have much experience with CMake, building libraries, files, and C++, which makes it very difficult to follow along with the installation process.
Now I've tried to follow the instructions from the main page:
- pip install torch-tensorrt doesn't work
- downloaded the zip file of this repository; python setup.py install also doesn't work
- installed bazel
- modified the WORKSPACE, still nothing
- tried to directly import py/torch-tensorrt into code - nothing
Then, inside the py folder, I opened a command prompt and typed:
```
bazel build //:libtorchtrt --compilation_mode=dbg
```

and received this error:

```
Starting local Bazel server and connecting to it...
INFO: Repository libtorch instantiated at:
D:/pyth/tensorrt-main/WORKSPACE:53:13: in
Repository rule http_archive defined at:
C:/users/tomas/_bazel_tomas/r4zfvyvs/external/bazel_tools/tools/build_defs/repo/http.bzl:372:31: in
WARNING: Download from https://download.pytorch.org/libtorch/nightly/cu121/libtorch-cxx11-abi-shared-with-deps-latest.zip failed: class com.google.devtools.build.lib.bazel.repository.downloader.ContentLengthMismatchException Bytes read 2210658461 but wanted 2501377827
ERROR: An error occurred during the fetch of repository 'libtorch':
Traceback (most recent call last):
File "C:/users/tomas/_bazel_tomas/r4zfvyvs/external/bazel_tools/tools/build_defs/repo/http.bzl", line 132, column 45, in _http_archive_impl
download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error downloading [https://download.pytorch.org/libtorch/nightly/cu121/libtorch-cxx11-abi-shared-with-deps-latest.zip] to C:/users/tomas/_bazel_tomas/r4zfvyvs/external/libtorch/temp7217651597570855917/libtorch-cxx11-abi-shared-with-deps-latest.zip: Bytes read 2210658461 but wanted 2501377827
ERROR: D:/pyth/tensorrt-main/WORKSPACE:53:13: fetching http_archive rule //external:libtorch: Traceback (most recent call last):
File "C:/users/tomas/_bazel_tomas/r4zfvyvs/external/bazel_tools/tools/build_defs/repo/http.bzl", line 132, column 45, in _http_archive_impl
download_info = ctx.download_and_extract(
Error in download_and_extract: java.io.IOException: Error downloading [https://download.pytorch.org/libtorch/nightly/cu121/libtorch-cxx11-abi-shared-with-deps-latest.zip] to C:/users/tomas/_bazel_tomas/r4zfvyvs/external/libtorch/temp7217651597570855917/libtorch-cxx11-abi-shared-with-deps-latest.zip: Bytes read 2210658461 but wanted 2501377827
ERROR: D:/pyth/tensorrt-main/core/util/logging/BUILD:13:11: //core/util/logging:logging depends on @libtorch//:libtorch in repository @libtorch which failed to fetch. no such package '@libtorch//': java.io.IOException: Error downloading [https://download.pytorch.org/libtorch/nightly/cu121/libtorch-cxx11-abi-shared-with-deps-latest.zip] to C:/users/tomas/_bazel_tomas/r4zfvyvs/external/libtorch/temp7217651597570855917/libtorch-cxx11-abi-shared-with-deps-latest.zip: Bytes read 2210658461 but wanted 2501377827
ERROR: Analysis of target '//:libtorchtrt' failed; build aborted:
INFO: Elapsed time: 458.697s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (64 packages loaded, 413 targets configured)
Fetching https://download.pytorch.org/...orch-cxx11-abi-shared-with-deps-latest.zip; 2.1 GiB (2,210,121,825B) 446s
```
I also tried some other things which I cannot remember, but unsuccessfully.
THANK YOU FOR YOUR HELP IN ADVANCE
Environment