
If some layers of my TensorFlow model must be in training mode to work properly, what should I do to export the correct ONNX? #2379

Open
jungyin opened this issue Jan 11, 2025 · 1 comment
Labels: question (An issue, pull request, or discussion needs more information)


jungyin commented Jan 11, 2025

Ask a Question

My model has some BatchNormalization layers, and when I switch it to testing (inference) mode, the results are completely different from those in training mode. What should I do to export the ONNX model correctly?
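
For reference, a minimal sketch of the export path, assuming tf2onnx's from_keras API is used (the input shape, tensor name, and file name below are illustrative placeholders, not the real values from my setup):

import numpy as np
import tensorflow as tf
import tf2onnx

model = PhysNet(norm='batch')

# Build the model's variables with one forward pass on dummy data.
# Shape is hypothetical: (batch, frames, height, width, channels).
dummy = np.random.rand(1, 32, 64, 64, 3).astype(np.float32)
_ = model(dummy)

spec = (tf.TensorSpec((None, 32, 64, 64, 3), tf.float32, name="input"),)
model_proto, _ = tf2onnx.convert.from_keras(
    model, input_signature=spec, opset=18, output_path="physnet.onnx")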

Further information

  • Is this issue related to a specific model?
    Model name: phynet

Model code:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


class PhysNet(keras.Model):

    def __init__(self, norm='batch'):
        super().__init__()  # must run before assigning attributes on a keras.Model
        self.norm = norm
        # Map the norm name to a layer factory.
        if norm == 'batch':
            norm = layers.BatchNormalization
        if norm == 'layer':
            norm = lambda: layers.LayerNormalization(axis=(1,))
        if norm == 'layer_frozen':
            norm = lambda: layers.LayerNormalization(axis=(1,), trainable=False)
        self.ConvBlock1 = keras.Sequential([
            layers.Conv3D(16, kernel_size=(1, 5, 5), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock2 = keras.Sequential([
            layers.Conv3D(32, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock3 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock4 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock5 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock6 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock7 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock8 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.ConvBlock9 = keras.Sequential([
            layers.Conv3D(64, kernel_size=(3, 3, 3), strides=1, padding='same'),
            norm(),
            layers.Activation('relu')
        ])
        self.upsample = keras.Sequential([
            layers.Conv3DTranspose(64, kernel_size=(4, 1, 1), strides=(2, 1, 1), padding='same'),
            norm(),
            layers.Activation('elu')
        ])
        self.upsample2 = keras.Sequential([
            layers.Conv3DTranspose(64, kernel_size=(4, 1, 1), strides=(2, 1, 1), padding='same'),
            norm(),
            layers.Activation('elu')
        ])
        self.convBlock10 = layers.Conv3D(1, kernel_size=(1, 1, 1), strides=1)
        self.MaxpoolSpa = layers.MaxPool3D((1, 2, 2), strides=(1, 2, 2))
        self.MaxpoolSpaTem = layers.MaxPool3D((2, 2, 2), strides=2)
        self.poolspa = layers.AvgPool3D((1, 2, 2))
        self.flatten = layers.Reshape((-1,))

    def call(self, x):
        # BatchNormalization must run in training mode to reproduce the expected
        # results, so the flag is pinned here instead of taken from the caller.
        training = (self.norm == 'batch')
        x = self.ConvBlock1(x, training=training)
        x = self.MaxpoolSpa(x)
        x = self.ConvBlock2(x, training=training)
        x = self.ConvBlock3(x, training=training)
        x = self.MaxpoolSpaTem(x)
        x = self.ConvBlock4(x, training=training)
        x = self.ConvBlock5(x, training=training)
        x = self.MaxpoolSpaTem(x)
        x = self.ConvBlock6(x, training=training)
        x = self.ConvBlock7(x, training=training)
        x = self.MaxpoolSpa(x)
        x = self.ConvBlock8(x, training=training)
        x = self.ConvBlock9(x, training=training)
        x = self.upsample(x, training=training)
        x = self.upsample2(x, training=training)
        x = self.poolspa(x)
        x = self.convBlock10(x, training=training)
        x = self.flatten(x)
        # Zero-center each output sequence along the time axis.
        x = x - tf.expand_dims(tf.reduce_mean(x, axis=-1), -1)
        return x

Output comparison ("eval" is the ONNX/TF result, "train" is the TF source output):
[image: plot comparing the eval and train outputs]
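
The comparison above can be reproduced with a check along these lines (a sketch using onnxruntime; the input shape and the tensor name "input" are the same placeholders as in the export snippet):

import numpy as np
import onnxruntime as ort

# Run the same random input through the Keras model and the exported file.
x = np.random.rand(1, 32, 64, 64, 3).astype(np.float32)
tf_out = model(x).numpy()

sess = ort.InferenceSession("physnet.onnx")
onnx_out = sess.run(None, {"input": x})[0]
print("max abs diff:", np.abs(tf_out - onnx_out).max())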

Model opset:
18

Notes

jungyin added the question label on Jan 11, 2025
jungyin commented Jan 11, 2025

When I locked my model in validation mode, the ONNX and TF outputs were exactly the same. However, I now need to export the model in training mode, so that I can pinpoint the issue caused by BatchNormalization running in non-training mode.
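
One way to pin training mode during conversion, sketched under the assumption that tf2onnx's from_function API is used (shapes, names, and the output file are placeholders as before): trace the forward pass inside a tf.function so that the training-mode graph is what gets converted.

import tensorflow as tf
import tf2onnx

model = PhysNet(norm='batch')  # call() pins training=True for batch norm

spec = (tf.TensorSpec((1, 32, 64, 64, 3), tf.float32, name="input"),)

@tf.function(input_signature=list(spec))
def forward(x):
    # Tracing captures the training-mode BatchNormalization ops,
    # since PhysNet.call forces training=True when norm == 'batch'.
    return model(x)

model_proto, _ = tf2onnx.convert.from_function(
    forward, input_signature=spec, opset=18,
    output_path="physnet_train_mode.onnx")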
