
Decouple padding from keras frontend #143

Open
TerrestrialEclipse opened this issue Sep 24, 2021 · 4 comments

Comments

@TerrestrialEclipse

In the intermediate representation, padding (when used without an explicit padding layer, e.g., as a parameter in pooling layers) isn't decoupled from the Keras frontend.
Keras allows this padding to be asymmetric, while, e.g., PyTorch only supports symmetric padding.

Hence, it would be good to have an explicit numerical representation of the padding sizes in the intermediate format. This would allow better decoupling of the intermediate format from the frontends.
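As a sketch of what such an entry could look like (all field names here are hypothetical, not part of the project's current format), a pooling layer could carry the resolved padding sizes directly instead of just a padding mode:

```python
# Hypothetical intermediate-format entry for a pooling layer; the padding
# mode ('same') has already been resolved into explicit sizes
# [top, bottom, left, right] by the frontend.
layer_entry = {
    "layer_type": "AveragePooling2D",  # field names are illustrative only
    "pool_size": [4, 4],
    "strides": [1, 1],
    # 'same' padding for a 4x4 window with stride 1 needs 3 elements in
    # total per axis, split asymmetrically as 1 before and 2 after.
    "padding": [1, 2, 1, 2],
}
```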

pg020196 (Owner) commented Oct 4, 2021

Hi @TerrestrialEclipse,
if I understand your request correctly, you would like to adjust the padding property in the intermediate format so that it not only specifies the kind of padding but also the padding sizes, e.g.

padding=[<top>, <bottom>, <left>, <right>] //padding size at specified position

// Example for padding of size 2
padding = [2,2,2,2]

I agree that this could be a great idea. So far, I don't see any reason why this couldn't be implemented.

TerrestrialEclipse (Author) commented Oct 5, 2021

Hi @pg020196,

yes, that is what I would suggest. Up until now, padding has had to be recomputed from the parameters in the intermediate representation, and this implementation only allows for symmetric padding. Asymmetric padding is still possible, though, e.g., when pooling with a window size of 4x4 and padding "same" (in Keras): the total padding width/height is 3, i.e., 1 element is added at the beginning and 2 at the end of a column/row.

As far as I understand, PyTorch, on the other hand, explicitly requires specifying the number of padded elements as
padding=[<top>, <bottom>, <left>, <right>] //padding size at specified position

pg020196 (Owner) commented Oct 6, 2021

Hi @TerrestrialEclipse,
I did some quick initial research on the padding behavior in Keras. Our current padding implementation is based on the padding types used in the AvgPool and MaxPool layers (e.g., https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D?hl=de). It turns out that Keras also supports asymmetric padding when specifically using padding layers (e.g., https://www.tensorflow.org/api_docs/python/tf/keras/layers/ZeroPadding2D?hl=de).

Therefore, I would suggest changing the padding representation in the intermediate format in the aforementioned way. Additionally, instead of merely identifying the padding type, the Keras frontend should translate the padding type (valid, same) into numeric values (top, bottom, left, right). Finally, the implementation in the .c template file (or backend) must be adjusted to perform padding according to the given values.
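As a sketch of that translation step (the function name and helper are assumptions for illustration, not the project's code), the "same" padding split used by TF/Keras can be computed per axis like this:

```python
def same_padding_1d(input_len, pool_size, stride):
    """Split TF/Keras 'same' padding into (before, after) for one axis.

    TF pads so that output_len == ceil(input_len / stride); when the
    total padding is odd, the extra element goes at the end, which is
    exactly the asymmetric case discussed in this issue.
    """
    out_len = -(-input_len // stride)  # ceil(input_len / stride)
    total = max((out_len - 1) * stride + pool_size - input_len, 0)
    before = total // 2
    after = total - before
    return before, after

# pool_size=4, stride=1 on 6 elements: total padding 3 -> (1, 2)
```

Running this for each spatial axis in the frontend would yield the numeric (top, bottom, left, right) values for the intermediate format.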

If I didn't forget anything, this would also allow the implementation of padding layers with asymmetric properties in future releases.

I didn't have a look at the PyTorch implementation, but if you're right, the proposed solution would also work for a future PyTorch frontend.

TerrestrialEclipse (Author) commented

Hi @pg020196,

as far as I understand, your current implementation tries to replicate the Keras behavior, but it inherently only allows for symmetric padding: it computes a padding width and adds this exact amount both at the beginning and the end of a row/column. Keras, on the other hand, does allow asymmetric padding implicitly, i.e., even without an explicit padding layer.

Maybe we should merge this issue with the second issue regarding padding (issue #144): that Keras cuts off the filter kernel at the edges.

Here is an example that showcases both issues, asymmetric padding width and cutting off kernels at the edges. I slightly adapted the model in the notebook Neural-Network-Translator/samples/keras/keras example networks.ipynb to

from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.AveragePooling1D(pool_size=4, strides=1, padding='same', input_shape=(6, 1)))

I use keras 2.5.0 and tensorflow 2.5.0-rc1. The total padding is pool_size - 1 = 3, i.e., asymmetric. This could be realized by adding 1 element before and 2 after the data, followed by regular pooling/filtering. However, Keras does not do exactly that: instead, it cuts off the first element of the filter at the first position of the sliding window, and 1 or 2 elements of the filter at the second-to-last and last positions, respectively. This can be seen with

import numpy as np

data_x = np.array([[1, 2, 3, 4, 5, 6]]).reshape([1, 6, 1])
model.predict(data_x)

which yields

array([[[2. ],
        [2.5],
        [3.5],
        [4.5],
        [5. ],
        [5.5]]], dtype=float32)

Here is a graphic representation of the filtering behavior in Keras, where the X's in one row together represent one position of the sliding window for AveragePooling. In the first row, AveragePooling is thus computed over only 3 elements, and in the last two positions, pooling is carried out over 3 and 2 elements, respectively.

1 2 3 4 5 6
X X X          --> 2.0
X X X X        --> 2.5
  X X X X      --> 3.5
    X X X X    --> 4.5
      X X X    --> 5.0
        X X    --> 5.5
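This behavior can be sketched in plain Python (this is an illustration of the observed Keras semantics, not the project's code): the average is taken only over the in-bounds elements of each window, which reproduces the values above.

```python
def avg_pool1d_same(x, pool_size, stride=1):
    """Average pooling with 'same' padding where out-of-bounds positions
    are excluded from the average, matching the Keras AveragePooling1D
    behavior shown in the diagram above."""
    n = len(x)
    out_len = -(-n // stride)  # ceil(n / stride)
    total_pad = max((out_len - 1) * stride + pool_size - n, 0)
    pad_before = total_pad // 2  # the odd extra element is padded after
    out = []
    for i in range(out_len):
        start = i * stride - pad_before
        # keep only the in-bounds elements of this window
        window = [x[j] for j in range(start, start + pool_size) if 0 <= j < n]
        out.append(sum(window) / len(window))
    return out

# avg_pool1d_same([1, 2, 3, 4, 5, 6], 4) -> [2.0, 2.5, 3.5, 4.5, 5.0, 5.5]
```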

I think this is not supported in the current implementation. It is also not yet supported in the C# backend, but I would add it once the intermediate format is adapted.
