
Decouple padding from keras frontend #143

Open
TerrestrialEclipse opened this issue Sep 24, 2021 · 4 comments

Comments

@TerrestrialEclipse

In the intermediate representation, padding (when used without an explicit padding layer, e.g., as a parameter in pooling layers) isn't decoupled from the Keras frontend.
Keras allows this padding to be asymmetric, while, e.g., PyTorch only supports symmetric padding.

Hence, it would be good to have an explicit numerical representation of the padding sizes in the intermediate format. This would allow better decoupling of the intermediate format from the frontends.
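As a sketch of what such an entry could look like (all field names here are hypothetical, not part of the project's current format), a pooling layer could carry the resolved padding sizes directly instead of just a padding mode:

```python
# Hypothetical intermediate-format entry for a pooling layer; the padding
# mode ('same') has already been resolved into explicit sizes
# [top, bottom, left, right] by the frontend.
layer_entry = {
    "layer_type": "AveragePooling2D",  # field names are illustrative only
    "pool_size": [4, 4],
    "strides": [1, 1],
    # 'same' padding for a 4x4 window with stride 1 needs 3 elements in
    # total per axis, split asymmetrically as 1 before and 2 after.
    "padding": [1, 2, 1, 2],
}
```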

pg020196 (Owner) commented Oct 4, 2021

Hi @TerrestrialEclipse,
if I understand your request correctly, you would like to adjust the padding property in the intermediate format so that it not only specifies the kind of padding but also the padding sizes, e.g.

padding=[<top>, <bottom>, <left>, <right>] //padding size at specified position

// Example for padding of size 2
padding = [2,2,2,2]

I agree that this could be a great idea. So far, I don't see any reason why this couldn't be implemented.

TerrestrialEclipse (Author) commented Oct 5, 2021

Hi @pg020196,

yes, that is what I would suggest. Up until now, padding has had to be recomputed from the parameters in the intermediate representation, and this implementation only allows for symmetric padding. Asymmetric padding is still possible, though, e.g., when pooling with a window size of 4x4 and padding "same" (in Keras): the total padding width/height is 3, i.e., 1 element is added at the beginning and 2 at the end of a column/row.

As far as I understand, PyTorch, on the other hand, explicitly requires specifying the number of padded elements as
padding=[<top>, <bottom>, <left>, <right>] //padding size at specified position

pg020196 (Owner) commented Oct 6, 2021

Hi @TerrestrialEclipse,
I did some quick initial research on the padding behavior in Keras. Our current padding implementation is based on the padding types used in the AvgPool and MaxPool layers (e.g., https://www.tensorflow.org/api_docs/python/tf/keras/layers/MaxPool2D?hl=de). It turns out that Keras also supports asymmetric padding when specifically using padding layers (e.g., https://www.tensorflow.org/api_docs/python/tf/keras/layers/ZeroPadding2D?hl=de).

Therefore, I would suggest changing the padding representation in the intermediate format in the aforementioned way. Additionally, instead of merely identifying the padding type, the Keras frontend should translate the padding type (valid, same) into numeric values (top, bottom, left, right). Finally, the implementation in the .c template file (or backend) must be adjusted to perform padding according to the given values.
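As a sketch of that translation step (the function name and helper are assumptions for illustration, not the project's code), the "same" padding split used by TF/Keras can be computed per axis like this:

```python
def same_padding_1d(input_len, pool_size, stride):
    """Split TF/Keras 'same' padding into (before, after) for one axis.

    TF pads so that output_len == ceil(input_len / stride); when the
    total padding is odd, the extra element goes at the end, which is
    exactly the asymmetric case discussed in this issue.
    """
    out_len = -(-input_len // stride)  # ceil(input_len / stride)
    total = max((out_len - 1) * stride + pool_size - input_len, 0)
    before = total // 2
    after = total - before
    return before, after

# pool_size=4, stride=1 on 6 elements: total padding 3 -> (1, 2)
```

Running this for each spatial axis in the frontend would yield the numeric (top, bottom, left, right) values for the intermediate format.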

If I didn't forget anything, this would also allow the implementation of padding layers with asymmetric properties in future releases.

I didn't have a look at the PyTorch implementation, but if you're right, the proposed solution would also work for a future PyTorch frontend.

TerrestrialEclipse (Author) commented

Hi @pg020196,

as far as I understand, your current implementation tries to replicate the Keras behavior, but it inherently only allows for symmetric padding: it computes a padding width and adds this exact amount both at the beginning and the end of a row/column. Keras, on the other hand, does allow asymmetric padding implicitly, i.e., even without an explicit padding layer.

Maybe we should merge this issue with the second issue regarding padding (issue #144): that Keras cuts off the filter kernel at the edges.

Here is an example that showcases both issues, asymmetric padding width and cutting off kernels at the edges. I slightly adapted the model in the notebook Neural-Network-Translator/samples/keras/keras example networks.ipynb to

from tensorflow import keras

model = keras.Sequential()
model.add(keras.layers.AveragePooling1D(pool_size=4, strides=1, padding='same', input_shape=(6, 1)))

I use keras 2.5.0 and tensorflow 2.5.0-rc1. The total padding is pool_size - 1 = 3, i.e., asymmetric. This could be realized by adding 1 element before and 2 after the data, followed by regular pooling/filtering. However, Keras does not do exactly that: instead, it cuts off the first element of the filter at the first position of the sliding window, and 1 or 2 elements of the filter at the second-to-last and last positions, respectively. This can be seen with

import numpy as np

data_x = np.array([[1, 2, 3, 4, 5, 6]]).reshape([1, 6, 1])
model.predict(data_x)

which yields

array([[[2. ],
        [2.5],
        [3.5],
        [4.5],
        [5. ],
        [5.5]]], dtype=float32)

Here is a graphic representation of the filtering behavior in Keras, where the X's in one row together represent one position of the sliding window for AveragePooling. In the first row, AveragePooling is thus computed over only 3 elements, and in the last two positions, pooling is carried out over 3 and 2 elements, respectively.

1 2 3 4 5 6
X X X          --> 2.0
X X X X        --> 2.5
  X X X X      --> 3.5
    X X X X    --> 4.5
      X X X    --> 5.0
        X X    --> 5.5
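This behavior can be sketched in plain Python (this is an illustration of the observed Keras semantics, not the project's code): the average is taken only over the in-bounds elements of each window, which reproduces the values above.

```python
def avg_pool1d_same(x, pool_size, stride=1):
    """Average pooling with 'same' padding where out-of-bounds positions
    are excluded from the average, matching the Keras AveragePooling1D
    behavior shown in the diagram above."""
    n = len(x)
    out_len = -(-n // stride)  # ceil(n / stride)
    total_pad = max((out_len - 1) * stride + pool_size - n, 0)
    pad_before = total_pad // 2  # the odd extra element is padded after
    out = []
    for i in range(out_len):
        start = i * stride - pad_before
        # keep only the in-bounds elements of this window
        window = [x[j] for j in range(start, start + pool_size) if 0 <= j < n]
        out.append(sum(window) / len(window))
    return out

# avg_pool1d_same([1, 2, 3, 4, 5, 6], 4) -> [2.0, 2.5, 3.5, 4.5, 5.0, 5.5]
```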

I think this is not supported in the current implementation. It is also not yet supported in the C# backend, but I would add it once the intermediate format is adapted.
