
❓ [Question] dynamo conversion failing w/ TRTInterpreter #3124

Open
patrick-botco opened this issue Aug 28, 2024 · 8 comments
Labels: question (Further information is requested)
patrick-botco commented Aug 28, 2024

❓ Question

I'm able to torch.export and generate an ExportedProgram with no issues for my model. Upon compiling with torch_tensorrt...

import torch
import torch_tensorrt

ep = torch.export.load("...")
example_inputs = ep.example_inputs[0]
model = ep.module().to("cuda")

# same settings as in the full repro posted further down
enabled_precisions = {torch.float, torch.half}
workspace_size = 4 << 30
min_block_size = 7

compile_spec = {
    "ir": "torch_compile",
    "inputs": example_inputs,
    "enabled_precisions": enabled_precisions,
    "workspace_size": workspace_size,
    "min_block_size": min_block_size,
    "torch_executed_ops": {},
    "sparse_weights": True,
}

optimized_model = torch_tensorrt.compile(model, **compile_spec)

... I run into this error:

ERROR:torch_tensorrt [TensorRT Conversion Context]:INetworkDefinition::addConstant: Error Code 3: API Usage Error (Parameter check failed, condition: !weights.values == !weights.count. )
Traceback (most recent call last):
...
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 479, in run
    self._construct_trt_network_def()
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 325, in _construct_trt_network_def
    super().run()
  File ".../lib/python3.10/site-packages/torch/fx/interpreter.py", line 145, in run
    self.env[node] = self.run_node(node)
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 529, in run_node
    trt_node: torch.fx.Node = super().run_node(n)
  File ".../lib/python3.10/site-packages/torch/fx/interpreter.py", line 202, in run_node
    return getattr(self, n.op)(n.target, args, kwargs)
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/_TRTInterpreter.py", line 638, in call_function
    return converter(self.ctx, target, args, kwargs, self._cur_node_name)
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/aten_ops_converters.py", line 242, in aten_ops_cat
    return impl.cat.cat(
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/impl/cat.py", line 31, in cat
    each_input = get_trt_tensor(ctx, each_input, f"{name}_tensor_{i}")
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/converter_utils.py", line 384, in get_trt_tensor
    return create_constant(ctx, input_val, name, dtype, min_rank)
  File ".../lib/python3.10/site-packages/torch_tensorrt/dynamo/conversion/converter_utils.py", line 349, in create_constant
    constant.name = name
torch._dynamo.exc.BackendCompilerFailed: backend='torch_tensorrt_backend' raised:
AttributeError: 'NoneType' object has no attribute 'name'

I'm currently able to cleanly generate an ExportedProgram via torch.export, and outputs from the trace match the original PyTorch model. In particular, it's unclear to me why !weights.values == !weights.count would be an API Usage Error, or why there is a discrepancy between torch.compile and how torch_tensorrt interprets / performs the op conversion (plain torch.compile on the ExportedProgram module works fine).

What you have already tried

I've narrowed the issue down to a single module that does positional encoding. The output of this module is then concatenated with another tensor, and that concat is where the error above is raised. Without this module, everything works as expected, and I see about a 5x speedup.

The only unique thing about this module is that it has a buffer and some in-place operations; however, I've dumped and manually inspected the fx Graph and the trace looks correct (the buffer is lifted as a constant input). I've also tried re-writing the forward so that there are no in-place operations, to make graph capture easier (see the sketch below).
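
for reference, this is the kind of out-of-place rewrite I mean for forward_with_coords (an illustrative sketch only, not the exact code I ran):

def forward_with_coords(self, coords_input, image_size):
    # out-of-place: divide x by W and y by H in one broadcasted op,
    # instead of clone() followed by sliced in-place assignment
    scale = torch.tensor(
        [image_size[1], image_size[0]],
        dtype=torch.float32,
        device=coords_input.device,
    )
    coords = coords_input / scale
    return self._pe_encoding(coords.to(torch.float))  # B x N x C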

Environment

Build information about Torch-TensorRT can be found by turning on debug messages

  • PyTorch Version (e.g., 1.0): 2.4
  • CPU Architecture: aarch64
  • OS (e.g., Linux): Ubuntu
  • How you installed PyTorch (conda, pip, libtorch, source): pip
  • Build command you used (if compiling from source): modified bazel build rules + install
  • Are you using local sources or building from archives: local build from source
  • Python version: 3.10
  • CUDA version: 12.4
  • GPU models and configuration: Ampere (Jetson Nano, JetPack 6.0)
  • Any other relevant information: I compiled torch_tensorrt on HEAD of main as of last Friday (8/23)
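
re: the debug messages note above, this is how I surface them (a sketch assuming the dynamo path logs through Python's standard logging module; the "torch_tensorrt" logger name is my assumption based on the ERROR line in the trace):

import logging

# route all log records to stderr and raise the level so the
# Torch-TensorRT build/conversion details show up alongside the error
logging.basicConfig(level=logging.DEBUG)
logging.getLogger("torch_tensorrt").setLevel(logging.DEBUG)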

Additional context

cc @narendasan not sure if you have any insight here. thanks!

patrick-botco added the question (Further information is requested) label Aug 28, 2024
narendasan (Collaborator) commented Aug 28, 2024

@patrick-botco Are you able to share a repro of this issue?
Seems like the problem is in the cat converter (cc: @apbose)

patrick-botco (Author):

yea let me get one

patrick-botco (Author) commented Aug 28, 2024

@narendasan @apbose this is a stripped-down portion of Meta's SAM2 (original at https://github.com/facebookresearch/segment-anything-2/blob/main/sam2/modeling/sam/prompt_encoder.py), with minor modifications to _embed_points.

fill in CHECKPOINT_PATH and run the script below with python ...

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.

# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

from typing import Optional, Tuple, Any

import torch
from torch import nn

import numpy as np


class PositionEmbeddingRandom(nn.Module):
    """
    Positional encoding using random spatial frequencies.
    """

    def __init__(self, num_pos_feats: int = 64, scale: Optional[float] = None) -> None:
        super().__init__()
        if scale is None or scale <= 0.0:
            scale = 1.0
        self.register_buffer(
            "positional_encoding_gaussian_matrix",
            scale * torch.randn((2, num_pos_feats)),
        )

    def _pe_encoding(self, coords: torch.Tensor) -> torch.Tensor:
        """Positionally encode points that are normalized to [0,1]."""
        # assuming coords are in [0, 1]^2 square and have d_1 x ... x d_n x 2 shape
        coords = 2 * coords - 1
        coords = coords @ self.positional_encoding_gaussian_matrix
        coords = 2 * np.pi * coords
        # outputs d_1 x ... x d_n x C shape
        return torch.cat([torch.sin(coords), torch.cos(coords)], dim=-1)

    def forward(self, size: Tuple[int, int]) -> torch.Tensor:
        """Generate positional encoding for a grid of the specified size."""
        h, w = size
        device: Any = self.positional_encoding_gaussian_matrix.device
        grid = torch.ones((h, w), device=device, dtype=torch.float32)
        y_embed = grid.cumsum(dim=0) - 0.5
        x_embed = grid.cumsum(dim=1) - 0.5
        y_embed = y_embed / h
        x_embed = x_embed / w

        pe = self._pe_encoding(torch.stack([x_embed, y_embed], dim=-1))
        return pe.permute(2, 0, 1)  # C x H x W

    def forward_with_coords(
        self, coords_input: torch.Tensor, image_size: Tuple[int, int]
    ) -> torch.Tensor:
        """Positionally encode points that are not normalized to [0,1]."""
        coords = coords_input.clone()
        coords[:, :, 0] = coords[:, :, 0] / image_size[1]
        coords[:, :, 1] = coords[:, :, 1] / image_size[0]
        return self._pe_encoding(coords.to(torch.float))  # B x N x C


class PromptEncoder(nn.Module):
    def __init__(
        self,
        embed_dim: int,
        image_embedding_size: Tuple[int, int],
        input_image_size: Tuple[int, int],
    ) -> None:
        """
        Encodes prompts for input to SAM's mask decoder.

        Arguments:
          embed_dim (int): The prompts' embedding dimension
          image_embedding_size (tuple(int, int)): The spatial size of the
            image embedding, as (H, W).
          input_image_size (int): The padded size of the image as input
            to the image encoder, as (H, W).
        """
        super().__init__()
        self.embed_dim = embed_dim
        self.input_image_size = input_image_size
        self.image_embedding_size = image_embedding_size
        self.pe_layer = PositionEmbeddingRandom(embed_dim // 2)

        self.num_point_embeddings: int = 4  # pos/neg point + 2 box corners
        point_embeddings = [
            nn.Embedding(1, embed_dim) for i in range(self.num_point_embeddings)
        ]
        self.point_embeddings = nn.ModuleList(point_embeddings)
        self.not_a_point_embed = nn.Embedding(1, embed_dim)

    def _embed_points(
        self,
        points: torch.Tensor,
        labels: torch.Tensor,
        pad: bool,
    ) -> torch.Tensor:
        """Embeds point prompts."""
        points = points + 0.5  # Shift to center of pixel
        if pad:
            padding_point = torch.zeros((points.shape[0], 1, 2), device=points.device)
            padding_label = -torch.ones((labels.shape[0], 1), device=labels.device)
            points = torch.cat([points, padding_point], dim=1)
            labels = torch.cat([labels, padding_label], dim=1)
        point_embedding = self.pe_layer.forward_with_coords(
            points, self.input_image_size
        )
        point_embedding = torch.where(
            labels[:, :, None] == -1,
            self.not_a_point_embed.weight,
            point_embedding + torch.where(
                labels[:, :, None] == 0,
                self.point_embeddings[0].weight,
                torch.where(
                    labels[:, :, None] == 1,
                    self.point_embeddings[1].weight,
                    torch.where(
                        labels[:, :, None] == 2,
                        self.point_embeddings[2].weight,
                        self.point_embeddings[3].weight,
                    ),
                ),
            ),
        )
        return point_embedding

    def _get_device(self) -> torch.device:
        return self.point_embeddings[0].weight.device

    def forward(
        self,
        points: Optional[Tuple[torch.Tensor, torch.Tensor]],
        boxes: Optional[torch.Tensor],
        masks: Optional[torch.Tensor],
    ) -> Tuple[torch.Tensor, torch.Tensor]:
        """
        Embeds different types of prompts, returning both sparse and dense
        embeddings.

        Arguments:
          points (tuple(torch.Tensor, torch.Tensor) or none): point coordinates
            and labels to embed.

        Returns:
          torch.Tensor: sparse embeddings for the points and boxes, with shape
            BxNx(embed_dim), where N is determined by the number of input points
            and boxes.
        """
        sparse_embeddings = torch.empty(
            (1, 0, self.embed_dim), device=self._get_device()
        )
        if points is not None:
            coords, labels = points
            point_embeddings = self._embed_points(coords, labels, pad=True)
            sparse_embeddings = torch.cat([sparse_embeddings, point_embeddings], dim=1)

        return sparse_embeddings


class PositionalEncoder(nn.Module):
    def __init__(self) -> None:
        super().__init__()
        self.sam_prompt_encoder = PromptEncoder(
            embed_dim=256,
            image_embedding_size=(16, 16),
            input_image_size=(256, 256),
        )

    def forward(self, sam_point_coords: torch.Tensor, sam_point_labels: torch.Tensor) -> torch.Tensor:
        sparse_embeddings = self.sam_prompt_encoder(
            points=(sam_point_coords, sam_point_labels),
            boxes=None,
            masks=None,
        )
        return sparse_embeddings

CHECKPOINT_PATH = "/path/to/checkpoint.pt"

if __name__ == "__main__":
    model = PositionalEncoder().to("cuda")
    model.eval()

    sam_point_coords = torch.randn([1, 1, 2], dtype=torch.float32, device="cuda") 
    sam_point_labels = torch.randn([1, 1], dtype=torch.float32, device="cuda")

    inputs = (sam_point_coords, sam_point_labels)

    # reference output from original model
    ref_out = model(*inputs)

    # export, serialize, deserialize
    ep = torch.export.export(model, inputs)
    torch.export.save(ep, CHECKPOINT_PATH)
    reloaded_model = torch.export.load(CHECKPOINT_PATH)

    # compare outputs
    trace_out = reloaded_model.module()(*inputs)

    assert torch.allclose(ref_out, trace_out)

then load the checkpoint on a Jetson (this may be unnecessary to repro) and attempt to build the engine:

import torch
import torch_tensorrt

ep = torch.export.load("/path/to/checkpoint.pt")
example_inputs = ep.example_inputs[0]
model = ep.module().to("cuda")

# reference output from traced model
ref_out = model(*example_inputs)

optimized_model = torch_tensorrt.compile(
    model,
    ir="torch_compile",
    inputs=example_inputs,
    enabled_precisions={torch.float, torch.half},
    workspace_size=4 << 30,
    min_block_size=7,
    torch_executed_ops={},
)

opt_out = optimized_model(*example_inputs)

assert torch.allclose(ref_out, opt_out)

patrick-botco (Author):

you should see this fx Graph dump as part of the stack trace:

While executing %cat_5 : [num_users=1] = call_function[target=torch.ops.aten.cat.default](args = ([%_frozen_param0, %where_3], 1), kwargs = {_itensor_to_tensor_meta: {<tensorrt.tensorrt.ITensor object at 0xfffea332b3f0>: ((1, 1, 2), torch.float32, False, (2, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea331dd70>: ((1, 1, 2), torch.float32, False, (2, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33ba730>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33b8230>: ((1, 1), torch.float32, False, (1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33bbaf0>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33b83f0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33d8cf0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33d9830>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33da530>: ((1, 2), torch.float32, False, (4, 2), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea347f9f0>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33cc3b0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33cd570>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33ce870>: ((1, 2), torch.float32, False, (4, 2), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33cff30>: ((1, 2), torch.float32, False, (4, 2), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33c5470>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea33c6bb0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29d0570>: ((1, 2, 1), torch.float32, False, (4, 2, 1), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29d3bf0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29e15b0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29e3370>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29e52b0>: ((1, 2), torch.float32, False, (4, 2), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29e7330>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29ed6b0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29ef9f0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29f5f30>: ((1, 2), torch.float32, False, (4, 2), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea29fc830>: ((1, 2), torch.float32, False, (4, 2), None, False, {}), <tensorrt.tensorrt.ITensor object at 
0xfffea29fef30>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a018b0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a08470>: ((1, 2, 1), torch.float32, False, (4, 2, 1), None, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a11370>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a11af0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a11cb0>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a11d30>: ((1, 2, 2), torch.float32, False, (4, 2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12270>: ((2, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12430>: ((2, 128), torch.float32, False, (128, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12530>: ((1, 2, 128), torch.float32, False, (256, 128, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a126b0>: ((1, 2, 128), torch.float32, False, (256, 128, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a129b0>: ((1, 2, 128), torch.float32, False, (256, 128, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12ab0>: ((1, 2, 128), torch.float32, False, (256, 128, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12c70>: ((1, 2, 256), torch.float32, False, (512, 256, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12db0>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a12eb0>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a13170>: ((1, 2, 1), torch.float32, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a132b0>: ((1, 2, 1), torch.bool, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a13230>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a13670>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a138b0>: ((1, 2, 1), torch.float32, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a139f0>: ((1, 2, 1), torch.bool, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a13a70>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a13db0>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a280b0>: ((1, 2, 1), torch.float32, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffeae5c48f0>: ((1, 2, 1), torch.bool, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a282f0>: ((1, 2), torch.float32, False, (2, 1), 
torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a284f0>: ((1, 2), torch.float32, False, (2, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a28730>: ((1, 2, 1), torch.float32, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea34bc2f0>: ((1, 2, 1), torch.bool, False, (2, 1, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea3456970>: ((1, 2, 256), torch.float32, False, (512, 256, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a28d70>: ((1, 2, 256), torch.float32, False, (512, 256, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a28ff0>: ((1, 2, 256), torch.float32, False, (512, 256, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a291b0>: ((1, 2, 256), torch.float32, False, (512, 256, 1), torch.contiguous_format, False, {}), <tensorrt.tensorrt.ITensor object at 0xfffea2a29430>: ((1, 2, 256), torch.float32, False, (512, 256, 1), torch.contiguous_format, False, {})}})

patrick-botco (Author):

@narendasan @apbose does the above repro for you guys?

apbose (Collaborator) commented Sep 4, 2024

Reproed. Looking into this.

patrick-botco (Author):

thanks @apbose, lmk if I can help in any way

apbose (Collaborator) commented Sep 6, 2024

Seems like the cat converter is receiving an empty tensor. In the above code, sparse_embeddings is

sparse_embeddings = torch.empty(
    (1, 0, self.embed_dim), device=self._get_device()
)

which is an empty tensor because dim 1 has size 0. I do not see the above cat failing when I instead initialize it as

sparse_embeddings = torch.empty(
    (1, self.embed_dim), device=self._get_device()
)
sparse_embeddings = sparse_embeddings.unsqueeze(1)

Instead, it now fails because the torch reference, traced, and torchTRT outputs do not match. The output shape is [1, 3, 256]; the first and second [1, 256] slices match across all three, but the zeroth [1, 256] slice does not. Note that the traced and torch outputs also disagree with each other, so it seems to be something model-specific rather than a torchTRT issue. As mentioned above, passing the input in this way should not hit the torchTRT error.
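
For reference, a quick sketch of the shape difference between the two initializations (illustrative only, with made-up point embeddings):

import torch

embed_dim = 256
point_embeddings = torch.randn(1, 2, embed_dim)  # B x N x C, as from _embed_points

# original: dim 1 has length 0, so the cat effectively returns the point embeddings
empty_sparse = torch.empty((1, 0, embed_dim))
print(torch.cat([empty_sparse, point_embeddings], dim=1).shape)  # (1, 2, 256)

# workaround: unsqueeze yields one extra (uninitialized) row, so the cat
# gains a leading [1, 256] slice of arbitrary values, hence the (1, 3, 256) output
nonempty_sparse = torch.empty((1, embed_dim)).unsqueeze(1)  # (1, 1, 256)
print(torch.cat([nonempty_sparse, point_embeddings], dim=1).shape)  # (1, 3, 256)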

@patrick-botco would you know why the traced and torch output won't match?
