Cannot compile on cuda:1 on multi-GPUs #2668

Closed
alibool opened this issue Mar 3, 2024 · 5 comments

@alibool

alibool commented Mar 3, 2024

The following code works well with 'cuda:0', but raises an error when using 'cuda:1':

import torch
from torchvision import models
import torch_tensorrt


dtype = torch.float32
# works fine when cuda:0 but error on cuda:1
device = torch.device("cuda:0")

model = models.resnet50().to(dtype).to(device)
model.eval()
inputs_1 = torch.rand(12, 3, 224, 224).to(device).to(dtype)

optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False, fullgraph=True,
                                options={
                                    "precision": dtype,
                                    "device": device
                                })

with torch.no_grad():
    useless = optimized_model(inputs_1)
print(useless)

Both GPUs are A100s. Environment: PyTorch 2.2.1+cu118, torch-tensorrt 2.2.0+cu118, TensorRT 8.6.0.

alibool added the bug label Mar 3, 2024
@gs-olive
Collaborator

gs-olive commented Mar 4, 2024

Hi - thanks for the report - could you share the error logs when compiling with cuda:1, preferably with "debug": True?
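
For reference, the flag goes through the same options dict as in the original snippet (a sketch; the rest of the call mirrors the script above):

optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False, fullgraph=True,
                                options={
                                    "precision": dtype,
                                    "device": device,
                                    "debug": True,  # emit verbose compilation logs
                                })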

@gs-olive
Collaborator

gs-olive commented Mar 5, 2024

As an update on this, I am seeing the following output when running your script:

ERROR: [Torch-TensorRT] - 1: [cudaResources.cpp::~ScopedCudaEvent::24] Error Code 1: Cuda Runtime (an illegal memory access was encountered)
ERROR: [Torch-TensorRT] - 1: [defaultAllocator.cpp::deallocate::61] Error Code 1: Cuda Runtime (an illegal memory access was encountered)

This appears to happen because the active CUDA device in the context is being overwritten or incorrectly set. I am looking into the issue. As a temporary workaround, consider prefixing the script invocation with CUDA_VISIBLE_DEVICES=1 so that cuda:1 acts as cuda:0 from within the script.
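
For example (a sketch; the script name repro.py is chosen here only for illustration):

# Shell: expose only the second physical GPU, which the process then sees as cuda:0
#   CUDA_VISIBLE_DEVICES=1 python repro.py

# Or equivalently from within Python, before CUDA is initialized:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"  # physical cuda:1 becomes cuda:0

import torch
device = torch.device("cuda:0")  # now refers to the physical cuda:1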

gs-olive added the bug: triaged [verified] label Mar 5, 2024
@gs-olive
Collaborator

gs-olive commented Mar 5, 2024

I have some additional details to share on this, as well as the proposed fix.

In the initial code, the device parameter was set to "cuda:1", but the ambient active CUDA device in the thread was the default ("cuda:0"). In order to have the active thread use "cuda:1", it is necessary to add the line device = torch.device("cuda:1"); torch.cuda.set_device(device) before compilation. This makes the compilation and inference functional on my end.

If the intended use case is to compile/run on multiple GPUs, I would suggest using separate Python threads, each with a set_device call to ensure the ambient device in that thread is the intended one; a sketch of this pattern is shown below. Alternatively, you could enable torch_tensorrt.runtime.set_multi_device_safe_mode(True), described after the sketch.
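
A minimal sketch of the per-thread approach (illustrative only; the helper name and thread structure are assumptions, not an official API):

import threading

import torch
from torchvision import models
import torch_tensorrt  # registers the "torch_tensorrt" backend


def compile_and_run(device_index: int):
    # Pin this thread's ambient CUDA device before compiling.
    device = torch.device(f"cuda:{device_index}")
    torch.cuda.set_device(device)

    model = models.resnet50().eval().to(device)
    inputs = torch.rand(12, 3, 224, 224, device=device)

    optimized = torch.compile(model, backend="torch_tensorrt", dynamic=False,
                              fullgraph=True, options={"device": device})
    with torch.no_grad():
        return optimized(inputs)


# One thread per GPU; set_device only affects the calling thread.
threads = [threading.Thread(target=compile_and_run, args=(i,)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()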

We recently added a feature, Multi-Device Runtime Safety, which gives more verbose messages regarding device contexts and will automatically switch devices if a mismatch is detected, at the cost of runtime device-checking. This can be enabled via torch_tensorrt.runtime.set_multi_device_safe_mode(True). The default is False.
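
Enabling it is a single call before compilation and inference, e.g.:

import torch_tensorrt

# Opt in to runtime device checking; mismatched device contexts are then
# detected and corrected automatically, with a small per-call overhead.
torch_tensorrt.runtime.set_multi_device_safe_mode(True)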

This code works on my end, please let me know if it also resolves your case:

import torch
from torchvision import models
import torch_tensorrt


dtype = torch.float32
device = torch.device("cuda:1")
torch.cuda.set_device(device)

model = models.resnet50().to(dtype).to(device)
model.eval()
inputs_1 = torch.rand(12, 3, 224, 224).to(device).to(dtype)

optimized_model = torch.compile(model, backend="torch_tensorrt", dynamic=False, fullgraph=True,
                                options={
                                    "precision": dtype,
                                    "device": device
                                })

with torch.no_grad():
    useless = optimized_model(inputs_1)
print(useless)

@alibool
Author

alibool commented Mar 5, 2024

Thanks! When will this nice feature be released, and should I close this issue?

@gs-olive
Collaborator

gs-olive commented Mar 5, 2024

Hi - this feature should already be in the version you are using. Once you have verified the resolution works for you, feel free to close this bug and open a new one if there are other issues.

alibool closed this as completed Mar 5, 2024