Description
Hello and thanks for this wonderful library! For the past 18 months, we've been using version 1.14.0 to convert a scikit-learn pipeline model containing an XGBClassifier to ONNX. Everything has worked great. Recently we needed to upgrade to newer versions of onnx and onnxruntime. Upgrading to the latest versions of those libraries, as well as the latest version of this library, resulted in score mismatches between the sklearn pipeline version of our model and the ONNX version. After much head scratching and searching, I narrowed the issue down to version 1.14.1 of skl2onnx. If I keep the old versions of onnxruntime and onnx that we've been using for the last 18 months and only switch from skl2onnx==1.14.0 to skl2onnx==1.14.1, I can reproduce the score mismatch. (I can also reproduce the issue with any newer version of onnx, onnxruntime, or skl2onnx>=1.14.1.)
After inspecting the models in Netron, it looks like the underlying structure has changed a bit. For context, our sklearn model pipeline takes in a combination of float64 and string inputs. The string inputs are all one-hot encoded by the pipeline, while the float64 inputs are run through the venerable sklearn passthrough transformer. These values are then fed to the XGBClassifier.
In version 1.14.0, the numeric float64 inputs feed into a Concat node and then a Cast node which converts them all to float32. The string inputs go one-hot encoding -> Concat -> Reshape and then meet up with the numeric inputs at a final Concat node.
In version 1.14.1, however, the numeric inputs are not cast to float32; instead, the Reshape output of the string inputs is cast to float64.
I believe this indicates that the input to my TreeEnsembleClassifier node in 1.14.0 was an array of float32, but in 1.14.1 (and beyond) it's an array of float64. (I tried loading the model into memory and running shape inference to confirm the data types, but it didn't work the way I expected.)
For a sample dataset of 1k rows, this seemingly minor change results in 33% of the scores differing by more than 1e-5 between the sklearn and ONNX versions of the model. The greatest difference I observed with my small test dataset was 0.04.
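For anyone curious why the input dtype alone can move scores this much: XGBoost stores split thresholds as float32, so whether the input is compared in float32 or float64 can route a sample down a different branch. A numpy-only sketch (the threshold 0.1 is invented for illustration):

```python
import numpy as np

# XGBoost stores split thresholds as float32. Suppose a tree split is
# "go left if x < t" with a learned threshold of 0.1, stored as float32.
t = np.float32(0.1)           # actually 0.10000000149011612

x = np.float64(0.1)           # the same feature value, arriving as float64

# If the graph casts the input to float32 first (the 1.14.0 behaviour),
# the comparison sees two identical float32 values:
left_f32 = np.float32(x) < t  # False -> go right

# If the input stays float64 and the threshold is promoted to float64
# (the 1.14.1 behaviour described above), the comparison flips:
left_f64 = x < np.float64(t)  # True -> go left

print(left_f32, left_f64)     # False True
```

A single flipped split like this changes which leaf value a sample receives, which is consistent with the occasional large (0.04) score differences rather than uniform rounding noise.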
Workarounds:
- Change the data type on the numeric inputs from float64 to float32 upfront. This resolves the score mismatch issue, but won't be accepted by our model serving environment (it insists on passing float64 for reasons outside of our control).
- Instead of using passthrough in the sklearn model, I created a custom sklearn transformer that converts its input to float32, and registered a corresponding ONNX converter which translates it into a basic Cast node. This works.
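For reference, the sklearn side of the second workaround looks roughly like this (a sketch; the class name is mine, and the matching ONNX converter, registered via skl2onnx's update_registered_converter, simply emits a Cast to float32):

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin

class CastToFloat32(BaseEstimator, TransformerMixin):
    """Drop-in replacement for 'passthrough' that downcasts to float32,
    so the ONNX graph sees the same dtype the trees compare against."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.asarray(X, dtype=np.float32)

X = np.array([[0.1, 2.5]], dtype=np.float64)
out = CastToFloat32().fit_transform(X)
print(out.dtype)  # float32
```

This lets the serving environment keep sending float64 while guaranteeing the tree ensemble receives float32 in both the sklearn and ONNX versions.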
It's been 18 months since the release of 1.14.1 and I couldn't find any similar issues. Is this a bug? Or were we simply getting lucky before, with scores matching perfectly between sklearn and ONNX on 1.14.0? FWIW, we pass float64 to the sklearn model when generating scores, so it seems wrong that we'd get one score back from the sklearn model but a different score back from the ONNX model when passing in the same float64 inputs.
Can you provide any insight into what caused this change? I've pored over the code in this library and the changes in the 1.14.1 release, and it's not immediately obvious to me which one caused it. My guess is that it's this one.
Also, I think this commit was included in that release even though it wasn't in the release notes.