Can not compile on cuda:1 on multi-gpus #2668
Hi - thanks for the report - could you share the error logs when compiling with
As an update on this, I am seeing the following output when running your script:
This seems to happen because the active CUDA device in the context is being overwritten or incorrectly set. I am looking into the issue. For a temporary workaround, consider using the prefix argument
I have some additional details to share on this, as well as the proposed fix. In the initial code, the

If the intended use case was to compile/run on multiple GPUs, I would suggest using separate Python threads, each with a

We recently added a feature, Multi-Device Runtime Safety, which gives more verbose messages regarding device contexts and will automatically switch devices if a mismatch is detected, at the cost of runtime device-checking. This can be enabled via

This code works on my end; please let me know if it also resolves your case:

```python
import torch
from torchvision import models

import torch_tensorrt

dtype = torch.float32
device = torch.device("cuda:1")
torch.cuda.set_device(device)

model = models.resnet50().to(dtype).to(device)
model.eval()

inputs_1 = torch.rand(12, 3, 224, 224).to(device).to(dtype)

optimized_model = torch.compile(
    model,
    backend="torch_tensorrt",
    dynamic=False,
    fullgraph=True,
    options={
        "precision": dtype,
        "device": device,
    },
)

with torch.no_grad():
    useless = optimized_model(inputs_1)
    print(useless)
```
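The one-thread-per-GPU approach suggested above can be sketched in pure Python; the actual CUDA/compile calls are left as comments, since they require torch, torch_tensorrt, and two GPUs. The worker body and its names are illustrative, not part of any library API:

```python
import threading

# Each thread owns one GPU: it would pin its device first, then compile and
# run its own model instance there, so device contexts never collide.
results = {}

def worker(device_id: int) -> None:
    # In real code (assumption, requires torch + GPUs):
    #   torch.cuda.set_device(device_id)
    #   model = build_model().to(f"cuda:{device_id}")
    #   optimized = torch.compile(model, backend="torch_tensorrt", ...)
    results[device_id] = f"compiled on cuda:{device_id}"

threads = [threading.Thread(target=worker, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results.items()))
```

Each thread writes only its own key, so no lock is needed for this sketch; real per-device state should still be kept thread-local.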
Thanks
Thanks - when will this nice feature be released, and should I close this issue?
Hi - this feature should already be in the version you are using. Once you have verified the resolution works for you, feel free to close this bug and open a new one if there are other issues.
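For reference, a minimal sketch of turning the safety feature on. The API name `torch_tensorrt.runtime.set_multi_device_safe_mode` is my assumption and should be verified against the installed torch_tensorrt version; the import guard lets the snippet run even where the package is absent:

```python
# Hedged sketch: enable Multi-Device Runtime Safety before compiling.
# API name assumed: torch_tensorrt.runtime.set_multi_device_safe_mode.
try:
    import torch_tensorrt
    torch_tensorrt.runtime.set_multi_device_safe_mode(True)
    enabled = True
except (ImportError, AttributeError):
    # torch_tensorrt not installed, or this version lacks the helper.
    enabled = False

print("multi-device safe mode requested:", enabled)
```

Safe mode adds a runtime device check per invocation, so expect a small latency cost in exchange for automatic device switching on mismatch.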
The following code works well with 'cuda:0', but raises an error when 'cuda:1' is used.
Both GPUs are A100s; pytorch 2.2.1+cu118, torch-tensorrt 2.2.0+cu118, tensorrt 8.6.0.