Reshape error after onnx conversion #123

Open
charlieWyatt opened this issue Jan 22, 2024 · 13 comments

Comments

@charlieWyatt

charlieWyatt commented Jan 22, 2024

I have successfully trained a fasterrcnn_mobilenetv3_large_fpn model and can run inference on it in Python.

I get no errors when converting the model to ONNX with export.py:
python export.py --weights outputs/training/fasterrcnn_mobilenetv3_large_fpn_iNaturalist/best_model.pth --data data_configs/iNaturalist.yaml --out model.onnx

However, when I run inference on the ONNX model, I get a reshape error:

python onnx_inference_image.py --input ../input/iNaturalist/inference_images/ --weights weights/model.onnx --data data_configs/iNaturalist.yaml --show --imgsz 640

onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'/roi_heads/Reshape_2' Status Message: C:\a_work\1\s\onnxruntime\core\providers\cpu\tensor\reshape_helper.h:40 onnxruntime::ReshapeHelper::ReshapeHelper size != 0 && (input_shape_size % size) == 0 was false. The input tensor cannot be reshaped to the requested shape. Input shape:{115,2479}, requested shape:{-1,4}

My data has 620 classes.
When I look up the failing node in Netron, the Reshape node sits partway through the network, closer to the end than to the start.
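For readers following along, the error message itself can be checked by hand. A reshape to {-1,4} only succeeds when the total number of elements is divisible by 4. The sketch below mirrors that divisibility rule in plain Python; `can_reshape` is a hypothetical helper written for this note, not part of ONNX Runtime, and it ignores the special meaning of 0 in ONNX Reshape targets.

```python
from math import prod

def can_reshape(input_shape, requested_shape):
    """Return True if input_shape can be reshaped to requested_shape.

    Supports at most one -1 (inferred) dimension in requested_shape.
    """
    total = prod(input_shape)
    known = prod(d for d in requested_shape if d != -1)
    if -1 in requested_shape:
        # The -1 dimension is inferred, so the known dims must divide the total evenly.
        return known != 0 and total % known == 0
    return total == known

# Shapes copied from the error message above.
print(can_reshape((115, 2479), (-1, 4)))  # False
# Hypothetical comparison: 4 box coordinates for each of 620 classes.
print(can_reshape((115, 2480), (-1, 4)))  # True
```

The second call is only an illustration: a row width of 2480 (4 x 620) would pass the check, while the reported width of 2479 does not.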

@sovit-123
Owner

Hello @charlieWyatt
There are a few things to consider here.
When you export a model to ONNX using the script, it is built by default with a static input shape of 640x640. That means you can run inference only on 640x640 images. You can adjust the resolution using the appropriate arguments of the export.py script.

I see that during inference, you are already providing --imgsz 640. So, this error should not arise.

Can you please provide me with your training command and if possible a few images for me to reproduce the issue on my side?
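One way to confirm what input shape an exported model expects is to read it back from the ONNX file. The sketch below uses onnxruntime's `InferenceSession.get_inputs()`; the helper names `is_static` and `model_input_shapes` are made up for this note, and the model path in the usage comment is the one from the inference command in this thread.

```python
def is_static(shape):
    """True if every dimension is a concrete integer (no symbolic or unknown dims)."""
    return all(isinstance(d, int) for d in shape)

def model_input_shapes(model_path):
    """Map each input name of an ONNX model to its declared shape."""
    import onnxruntime  # imported lazily so is_static works without onnxruntime installed
    session = onnxruntime.InferenceSession(model_path, providers=["CPUExecutionProvider"])
    return {inp.name: inp.shape for inp in session.get_inputs()}

# Usage (assumes the exported file exists at this path):
# for name, shape in model_input_shapes("weights/model.onnx").items():
#     print(name, shape, "static" if is_static(shape) else "dynamic")
```

A statically exported model should report integer dimensions only, while a dynamic export typically shows named or unknown dimensions for the axes that may vary.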

@charlieWyatt
Author

Thanks @sovit-123 I appreciate the response.

I am printing image.shape in onnx_inference_image.py:
[screenshot]
and I get torch.Size([1, 3, 640, 640]).

Here are some example images I am training on:
[image: australian_magpie_22133012]
[image: australian_ibis_144215602]
[image: laughing_kookaburra_153012569]

And my inference image:
[image: magpie]

The training command I am using is -
python train.py --model fasterrcnn_mobilenetv3_large_fpn.py --epochs 75 --data data_configs/iNaturalist.yaml --batch 4 --mosaic 0

@sovit-123
Owner

sovit-123 commented Jan 23, 2024

May I know your ONNX and ONNX Runtime versions? Can you please try with ONNX Runtime 1.14.0 and the corresponding version of ONNX if they are different?

Also, please try to run inference with the same image folder path using the inference.py script by using the PyTorch trained model. Please check that the PyTorch inference is running as expected.

@charlieWyatt
Author

charlieWyatt commented Jan 23, 2024

Inference on the same image folder path using inference.py:
python inference.py --data data_configs/iNaturalist.yaml --weights outputs/training/fasterrcnn_mobilenetv3_large_fpn_iNaturalist/best_model.pth --input ../input/iNaturalist/inference_images/
[image: magpie]

Old versions -
onnxruntime version - 1.16.3
onnx version - 1.15.0

Tried running it with -

  • onnx==1.14.0 and onnxruntime==1.14.0
  • onnx==1.14.0 and onnxruntime==1.16.0

and I am getting the same error.

Interestingly, if I try the resnet18 model with all other parameters the same, I am able to export to ONNX and run inference. I would still prefer to use the mobilenet model if possible.

@sovit-123
Owner

That's interesting. In that case, I will check the source code to see whether it only runs for the ResNet18 model. There may be an error there.

@charlieWyatt
Author

charlieWyatt commented Jan 23, 2024

I have a suspicion it is because of the redefinition of the roi heads in the mobilenet model on line 16:
[screenshot]

Also the inference error seems to indicate it is an issue in the roi heads -
"
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'/roi_heads/Reshape_2' Status Message: C:\a_work\1\s\onnxruntime\core\providers\cpu\tensor\reshape_helper.h:40 onnxruntime::ReshapeHelper::ReshapeHelper size != 0 && (input_shape_size % size) == 0 was false. The input tensor cannot be reshaped to the requested shape. Input shape:{115,2479}, requested shape:{-1,4}
"

@sovit-123
Owner

Thanks for the observation. I will take a look. Although it may take some time.

@sovit-123
Owner

Hi. I tested the ResNet backbone models and a few other models. They seem to be working fine, while the MobileNet one you mentioned is not. It may be because of some internal resizing. I will need some more time to test it out.

@charlieWyatt
Author

No worries, thanks @sovit-123!

@soobin508

Hi @sovit-123, I also encountered the same issue with the mobilenet reshape. Is there any update on this?

@sovit-123
Owner

From a first look, it seemed like an internal transform issue of the model. However, the transforms are the same as the ResNet backbone ones. So, I still need to figure this out.

@sovit-123
Owner

Hi. Can you train again? I have pushed an update that lets you export at 640x640 and run inference at any resolution, i.e. a dynamic export. Do not pass a resolution while exporting; only provide the desired resolution when running inference.
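For context, a dynamic export in PyTorch is usually requested through the `dynamic_axes` argument of `torch.onnx.export`. The sketch below shows one way that could look. It is not the code from the repository update, which is not shown in this thread; the helper names, the axis labels, and the output names are assumptions chosen for illustration, and it is not confirmed here that such an export avoids the Reshape error.

```python
def dynamic_axes_for(input_name, output_names):
    """Build a dynamic_axes mapping for torch.onnx.export.

    Marks image height/width as dynamic on the input and the number of
    detections (axis 0) as dynamic on every output.
    """
    axes = {input_name: {2: "height", 3: "width"}}
    for name in output_names:
        axes[name] = {0: "num_detections"}
    return axes

def export_dynamic(model, path, size=640):
    """Trace the model at size x size but leave spatial dims dynamic (illustrative only)."""
    import torch  # imported lazily so dynamic_axes_for can be used without torch
    model.eval()
    dummy = torch.rand(1, 3, size, size)
    output_names = ["boxes", "labels", "scores"]  # assumed names, not taken from export.py
    torch.onnx.export(
        model,
        dummy,
        path,
        input_names=["input"],
        output_names=output_names,
        dynamic_axes=dynamic_axes_for("input", output_names),
    )

print(dynamic_axes_for("input", ["boxes"]))
```

The dummy input still fixes the resolution used for tracing; `dynamic_axes` only tells the exporter which dimensions may differ at inference time.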

@juliomilani

juliomilani commented Jul 17, 2024

It seems that the problem is in the torchvision implementation. Here is a minimal reproducible example.

Note: the error shows up only with the cropped image. If I comment out img = img[50:200, 150:250], it works fine.

import torch
import torchvision
import cv2
import requests
import numpy as np
import onnxruntime

print("Exporting model")
model = torchvision.models.detection.fasterrcnn_mobilenet_v3_large_fpn(weights='DEFAULT')
torch.onnx.export(model, torch.rand(1, 3, 640, 640), '/tmp/model.onnx', 
                    input_names=['input'], output_names=['boxes', 'scores', 'labels'])


print("Downloading image")
r = requests.get('https://docs.opencv.org/4.x/roi.jpg', allow_redirects=True)
with open('/tmp/roi.jpg', 'wb') as f:
    f.write(r.content)
img = cv2.imread('/tmp/roi.jpg')

# This crop triggers the error; without it, inference works.
img = img[50:200, 150:250]
cv2.imwrite('/tmp/roi2.jpg', img)
# Convert BGR to RGB, resize to the export resolution, and scale to [0, 1].
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
image_dim_dim = cv2.resize(img, (640, 640))
image_dim_dim = np.array(image_dim_dim, dtype=np.float32) / 255.0
# Add a batch dimension and reorder HWC to CHW: (1, 3, 640, 640).
image_bchw = np.transpose(np.expand_dims(image_dim_dim, 0), (0, 3, 1, 2))

print("Running inference")
session = onnxruntime.InferenceSession('/tmp/model.onnx', providers=["CPUExecutionProvider"])
outputs = [o.name for o in session.get_outputs()]
inputs = [o.name for o in session.get_inputs()]
prediction = session.run(outputs, {inputs[0]: image_bchw})
print(prediction)

Error:
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running Reshape node. Name:'/roi_heads/Reshape_2' Status Message: /onnxruntime_src/onnxruntime/core/providers/cpu/tensor/reshape_helper.h:39 onnxruntime::ReshapeHelper::ReshapeHelper(const onnxruntime::TensorShape&, onnxruntime::TensorShapeVector&, bool) size != 0 && (input_shape_size % size) == 0 was false. The input tensor cannot be reshaped to the requested shape. Input shape:{490,363}, requested shape:{-1,4}

Packages:
onnxruntime 1.18.1
torch 2.2.2
torchvision 0.17.2
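Both error messages in this thread show the same arithmetic pattern, which the short check below makes explicit. It assumes 4 box coordinates per class, the 620 classes stated in the original report, and 91 classes for the torchvision COCO weights loaded with weights='DEFAULT'; `missing_columns` is a hypothetical helper for this note.

```python
def missing_columns(row_width, num_classes, coords_per_box=4):
    """How many columns short the tensor is of a full (num_classes * 4) row."""
    return coords_per_box * num_classes - row_width

# Row widths copied from the two error messages in this thread.
cases = {
    "original report, 620 classes": (2479, 620),
    "torchvision repro, 91 classes (assumed COCO default)": (363, 91),
}
for label, (row_width, num_classes) in cases.items():
    print(label, "-> missing columns:", missing_columns(row_width, num_classes))
```

In both cases the row is exactly one column short of 4 x num_classes. That would be consistent with a single column being dropped from a flattened tensor before the reshape to {-1,4}, but this is only a guess from the numbers, not a confirmed diagnosis.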
