### 🐛 Describe the bug
When running add.Tensor on ET (optimized kernels) with input shapes (8) and (1,8), the operator call fails at runtime because the output tensor's rank cannot be resized (full error below). The same model runs successfully in eager mode.
When the inputs are swapped so that (1,8) comes first, it runs successfully. This suggests a bug in the output size calculation when broadcasting. I haven't tested the portable kernels specifically.
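For reference, standard broadcasting is order-independent: (8) + (1,8) and (1,8) + (8) should both produce a rank-2 output of shape (1,8). A minimal sketch of the rule (hypothetical helper, not ExecuTorch code) illustrates what the output size calculation should yield:

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Compute the broadcast output shape per NumPy/PyTorch rules:
    align shapes on the trailing dimension, pad the shorter shape
    with 1s, and take the max of each compatible dimension pair."""
    result = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != 1 and y != 1 and x != y:
            raise ValueError(f"incompatible shapes {a} and {b}")
        result.append(max(x, y))
    return tuple(reversed(result))


print(broadcast_shape((8,), (1, 8)))   # (1, 8)
print(broadcast_shape((1, 8), (8,)))   # (1, 8) — operand order doesn't matter
```

Given that, the "old=2, new=1" rank error below looks like the kernel is deriving the output rank from the first operand rather than from the broadcast of both.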
```python
import torch

from executorch.exir import to_edge_transform_and_lower
from executorch.runtime import Runtime


class Model(torch.nn.Module):
    def forward(self, x, y):
        return x + y


inputs = (
    torch.randn(8),
    torch.randn(1, 8),
)

et_program = to_edge_transform_and_lower(
    torch.export.export(Model(), inputs),
).to_executorch()
print(et_program.exported_program())

runtime = Runtime.get()
program = runtime.load_program(et_program.buffer)
method = program.load_method("forward")
method.execute(inputs)
```
Output:

```
[tensor_impl.cpp:78] Attempted to change the tensor rank which is immutable: old=2, new=1
[op_add_sub_impl.h:91] Check failed (error == Error::Ok): Failed to resize output tensor.
[method.cpp:1303] KernelCall failed at instruction 0:0 in operator aten::add.out: 0x12
```
### Versions
N/A