🐛 [Bug] Accuracy issue when using Torch-TensorRT #3119

Open
cehongwang opened this issue Aug 23, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@cehongwang
Collaborator

cehongwang commented Aug 23, 2024

Bug Description

To Reproduce

Steps to reproduce the behavior:

  1. Run the code
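import torch
import torch_tensorrt as torch_trt
import torchvision.models as models

# NOTE: the report does not show how `inputs` was built; a single
# ImageNet-sized batch is assumed here.
inputs = [torch.rand((1, 3, 224, 224)).to("cuda")]
model = models.resnet101(pretrained=False).eval().to("cuda")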
exp_program = torch.export.export(model, tuple(inputs))
enabled_precisions = {torch.float}
debug = False
workspace_size = 20 << 30
min_block_size = 0
use_python_runtime = False
torch_executed_ops = {}
trt_gm = torch_trt.dynamo.compile(
    exp_program,
    tuple(inputs),
    use_python_runtime=use_python_runtime,
    enabled_precisions=enabled_precisions,
    debug=debug,
    min_block_size=min_block_size,
    torch_executed_ops=torch_executed_ops,
    make_refitable=True,
)  # Output is a torch.fx.GraphModule

expected_outputs, compiled_outputs = model(*inputs), trt_gm(*inputs)
for expected_output, compiled_output in zip(expected_outputs, compiled_outputs):
    assert torch.allclose(
        expected_output, compiled_output, rtol=1e-2, atol=1e-2
    ), "Compilation Result is not correct. Compilation failed"

print("Compilation successfully!")

Expected behavior

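The error between the Torch-TensorRT output and the eager PyTorch output should be small enough to pass the torch.allclose check in the script above (relative and absolute tolerance of 1e-2).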
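
To make "smaller" measurable, it may help to spell out the element-wise criterion that torch.allclose applies, |input - other| <= atol + rtol * |other|, and report how far the outputs miss it. The following is a minimal pure-Python sketch of that criterion over flat lists of floats; the helper name allclose_report and the returned keys are hypothetical and not part of PyTorch or Torch-TensorRT.

```python
def allclose_report(expected, compiled, rtol=1e-2, atol=1e-2):
    """Check |expected - compiled| <= atol + rtol * |compiled| per element.

    `expected` and `compiled` are flat sequences of floats of equal length.
    Hypothetical helper for illustration; it mirrors the documented
    torch.allclose criterion but is not a PyTorch API.
    """
    if len(expected) != len(compiled):
        raise ValueError("inputs must have the same length")

    failures = 0        # number of elements outside the tolerance
    max_abs_err = 0.0   # largest |expected - compiled| seen
    worst_excess = 0.0  # largest amount by which the tolerance was exceeded

    for e, c in zip(expected, compiled):
        err = abs(e - c)
        allowed = atol + rtol * abs(c)
        max_abs_err = max(max_abs_err, err)
        if err > allowed:
            failures += 1
            worst_excess = max(worst_excess, err - allowed)

    return {
        "passed": failures == 0,
        "failures": failures,
        "max_abs_err": max_abs_err,
        "worst_excess": worst_excess,
    }
```

With tensors, the inputs could be obtained via expected_output.flatten().tolist() and compiled_output.flatten().tolist(). Reporting max_abs_err alongside the pass/fail result shows whether the mismatch is marginal or large.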

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • Torch-TensorRT Version (e.g. 1.0.0): 2.5.0
  • PyTorch Version (e.g. 1.0): 2.5.0
  • CPU Architecture: x86
  • OS (e.g., Linux): Linux
  • How you installed PyTorch (conda, pip, libtorch, source): source
  • Build command you used (if compiling from source): develop/editable
  • Are you using local sources or building from archives:
  • Python version: 3.10.14
  • CUDA version: 12.1
  • GPU models and configuration: A40
  • Any other relevant information:

Additional context

@cehongwang added the bug label on Aug 23, 2024
@HolyWu
Contributor

HolyWu commented Aug 25, 2024

I'm not sure about the actual culprit, but using resnet101(pretrained=True) instead of resnet101(pretrained=False) does not trigger the accuracy issue.
