
Error "tried creating tensor with negative value in shape" #71

Open
fullymiddleaged opened this issue Dec 18, 2024 · 7 comments

@fullymiddleaged

Hi,

I have an ONNX model that works correctly via the Python API for onnxruntime, but when I send identical input array values (the same ones used in the Python version), the API here errors with "tried creating tensor with negative value in shape". My arrays are int64 type, containing input_ids and attention_mask values.

I have read posts suggesting that because the input dimension is listed as -1, the shape has to be specified manually in C; that doesn't seem to be an issue with the Python API for onnxruntime.

I am happy to attach my model if it helps. I have a feeling this could be an easy fix for a C guru. At least I hope so!
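For reference, here's a minimal sketch of the working Python path (the file name and placeholder arrays below are illustrative, not my real data):

import numpy as np
import onnxruntime as ort

# Load the model and run it with int64 inputs of shape (batch_size, sequence_length)
session = ort.InferenceSession("model.onnx")
inputs = {
    "input_ids": np.zeros((1, 512), dtype=np.int64),      # placeholder token IDs
    "attention_mask": np.ones((1, 512), dtype=np.int64),  # placeholder mask
}
outputs = session.run(None, inputs)  # None -> return all model outputs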

Thanks!

@kibae (Owner) commented Dec 19, 2024

Hi, @fullymiddleaged

Could you please attach the model if it's not too big?
Also, could you give me some sample parameters for inference? I'll give it a try.

@fullymiddleaged (Author) commented Dec 19, 2024

Hi @kibae !

Sure, the model and example inputs can be downloaded via OneDrive here (don't worry, it's only 40 MB); these use the same test data and expected result I have been using. To get these arrays, I ran a working inference in Python, dumped the inputs and prediction values to text, then manipulated the inputs into the format required for posting JSON to your API server.

For reference, the snippet of my Python code below shows how the inputs are created and the prediction obtained. This may be helpful in case I've translated the input JSON for your API incorrectly, and it also shows how my arrays were created.

import numpy as np
import tensorflow as tf
from transformers import AutoTokenizer

# Tokenize to numpy, padded to a fixed length of 512, then cast to int64
tokenizer = AutoTokenizer.from_pretrained(config.MODEL_FOLDER)
inputs = tokenizer(text, return_tensors="np", truncation=True, padding="max_length", max_length=512)
inputs = {k: v.astype(np.int64) for k, v in inputs.items()}

# Run the ONNX session and softmax the logits into class probabilities
output = onnxsession.run(output_names=[label_name], input_feed=dict(inputs))[0]
prediction = np.squeeze(tf.keras.activations.softmax(tf.convert_to_tensor(output)).numpy())
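
For completeness, a rough sketch of how I turn those arrays into the JSON request body, assuming plain nested lists are what the server expects:

import json

# Each int64 numpy array becomes nested Python lists for the JSON body
payload = {k: v.tolist() for k, v in inputs.items()}
body = json.dumps(payload)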

Perhaps I'm being silly and not posting the inputs correctly; Netron shows both inputs as int64[batch_size, sequence_length], so maybe I've missed something there? I look forward to your reply, many thanks!

@fullymiddleaged (Author) commented Dec 20, 2024

Please also see the additional info below: I am running the 1.20.1-linux-cpu Docker image with the following options:

onnxruntime_server:
    # After the docker container is up, you can use the REST API (http://localhost:8080).
    # API documentation will be available at http://localhost:8080/api-docs.
    image: kibaes/onnxruntime-server:1.20.1-linux-cpu
    ports:
      - "8080:80" # for http backend
    volumes:
      # for model files
      # https://github.com/kibae/onnxruntime-server#run-the-server
      - ./models:/app/models

      # for log files
      - ./logs:/app/logs
    environment:
      # for swagger(optional)
      - ONNX_SERVER_SWAGGER_URL_PATH=/api-docs
      - ONNX_SERVER_MODEL_DIR=/app/models
      - ONNX_SERVER_PREPARE_MODEL=quantized:v1(cuda=false)

@kibae (Owner) commented Dec 21, 2024

Hello, @fullymiddleaged :)

I’ve written test code based on the ONNX file and the request you kindly provided. The shape of input_ids defined in the ONNX file was (-1, -1). The first -1 likely represents the batch size, and the second -1 seems to correspond to the list of 512 (max_length) token IDs for input_ids.

When using the model in Python, it works without confusion because the tokenizer returns input_ids of length 512. However, if the shape of input_ids is not explicitly defined when exporting the model to ONNX, ONNX Runtime cannot distinguish whether a flat input represents batch_size: 1 x input_ids: 512 or batch_size: 2 x input_ids: 256. This ambiguity causes an error inside ONNX Runtime.
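
A quick numpy illustration of that ambiguity (a sketch with arbitrary values):

import numpy as np

flat = np.arange(512, dtype=np.int64)  # 512 token IDs as one flat buffer
print(flat.reshape(1, 512).shape)      # (1, 512): one sequence of 512
print(flat.reshape(2, 256).shape)      # (2, 256): two sequences of 256, same data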

torch.onnx.export(
    model_to_save,
    (torch.tensor(x[:1], dtype=torch.float32),
     torch.tensor(y[:1], dtype=torch.float32),
     torch.tensor(z[:1], dtype=torch.float32)),
    "../fixture/sample/1/model.onnx",
    export_params=True,
    input_names=['x', 'y', 'z'],
    output_names=['output'],
    dynamic_axes={'x': {0: 'batch_size'}, 'y': {0: 'batch_size'}, 'z': {0: 'batch_size'},
                  'output': {0: 'batch_size'}},
    verbose=True,
)

This link includes an example of how to specify the shape for each variable during ONNX export. Could you try re-exporting the ONNX model following this approach?
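
Applied to the model in this thread, that would mean keeping only the batch dimension dynamic and leaving the sequence length fixed at 512. A sketch, with the input names taken from what Netron showed you:

dynamic_axes = {
    'input_ids': {0: 'batch_size'},       # dim 1 left fixed at 512
    'attention_mask': {0: 'batch_size'},  # dim 1 left fixed at 512
    'output': {0: 'batch_size'},
}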

In the meantime, I will consider ways to forcibly define the shape during the session creation process. If you have any good ideas, I’d appreciate it if you could share them!

@fullymiddleaged (Author)

Hi @kibae - thanks for the reply and for investigating. Sadly, I've made a few attempts at changing my model, but I'm not having much luck loading my model.safetensors file into Torch to re-export it (I'm using Colab, and I think some dependency versions aren't matching, eek). So I may end up having to re-train first. 😓

But it would indeed be amazing to define the shape during session creation; I'm happy to test this hard-coded for now if you like. As for possible configuration methods, maybe either of the options below could work, and both are easy to document?

Option 1: via the session options environment variable:

- ONNX_SERVER_PREPARE_MODEL=quantized:v2(cuda=false,batchsize=1,length=512)

Option 2: via POST call to /api/sessions:

{
  "model": "string",
  "version": "string",
  "option": {
    "cuda": true,
    "batchsize": 1,
    "length": 512
  }
}

In both cases, an assumption could be made that this is a global option applied to all of the model's inputs (e.g. if it had two inputs, they would both need matching sizes), since beyond that it becomes much harder to define clearly. Would the array type (e.g. int64/float32) also need to be specified? Hopefully not.
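
For illustration, the kind of resolution the server might do internally once batchsize and length are known. This is just a Python sketch of the idea, not the server's actual C++:

import numpy as np

def build_tensor(flat_values, batchsize=1, length=512):
    # With explicit sizes there is no -1 left for the runtime to guess at
    arr = np.asarray(flat_values, dtype=np.int64)
    assert arr.size == batchsize * length, "payload size must match batchsize * length"
    return arr.reshape(batchsize, length)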

@kibae (Owner) commented Dec 22, 2024

Hi, @fullymiddleaged

Thank you for your great suggestion. 👍 Since there could be multiple inputs, I believe option 2 might be a better approach than option 1. I’ll give it more thought after the year-end holidays.

I was also considering whether we could treat shapes like (-1, -1) or (-1, -1, -1) as (batch_size=1, input_length=length of the data). I’m still evaluating whether this would work without issues in various environments.
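
Roughly, the fallback I have in mind would look like this (a sketch of the idea, not an actual implementation):

def resolve_shape(declared_shape, data_length):
    # Treat a fully dynamic (-1, -1) as (batch_size=1, input_length=data_length)
    if tuple(declared_shape) == (-1, -1):
        return (1, data_length)
    return tuple(declared_shape)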

Wishing you a Merry Christmas and a Happy New Year! Thank you.

@fullymiddleaged (Author)

Hey, that's actually a great idea for the -1 inputs. If the input length could remain dynamic like that, it would make it extra usable. :)

I've managed to edit my ONNX file's inputs and can now carry on with my tests. Have a good Christmas break! Let's pick this up later; I may have some other ideas that could be awesome to add.

🎄🎄

@kibae kibae added the enhancement New feature or request label Dec 31, 2024
@kibae kibae self-assigned this Dec 31, 2024