fix left-padding #2278
Changes from all commits: 299102c, 59627a4, 5d1b2c0, 97cada7, 85bb256, 6ab5ea9, f1c55ac, 8c40279, 7ceef48, edc01a2, 11f77c5, 6fa4685
```diff
@@ -17,7 +17,7 @@ def test_dense_input(self):
             sequence_length=5, padding_side="left"
         )
         output = start_end_packer(input_data)
-        expected_output = [0, 0, 5, 6, 7]
+        expected_output = [5, 6, 7, 0, 0]
         self.assertAllEqual(output, expected_output)

     def test_bfloat16_dtype(self):
```

Review comment (on the changed `expected_output`): This doesn't make sense. Why is left padding the same as right padding in this case? The test case before looks correct; this looks wrong.

Reply: No, it was obviously wrong before.
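The reviewer's objection can be checked with a minimal pure-Python sketch of what left padding to `sequence_length=5` should produce. `left_pad` here is a hypothetical helper for illustration, not the library's API:

```python
def left_pad(seq, length, pad_value=0):
    """Left-pad (or left-truncate) seq to exactly `length` entries."""
    seq = list(seq)[-length:]  # keep at most the last `length` tokens
    return [pad_value] * (length - len(seq)) + seq  # padding goes on the LEFT

print(left_pad([5, 6, 7], 5))  # [0, 0, 5, 6, 7] -- the pre-change expectation
```

Under this reading, the original expectation `[0, 0, 5, 6, 7]` is the left-padded result, and the new expectation `[5, 6, 7, 0, 0]` is what right padding would give.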
```diff
@@ -40,7 +40,7 @@ def test_dense_2D_input(self):
             sequence_length=5, padding_side="left"
         )
         output = start_end_packer(input_data)
-        expected_output = [[0, 0, 5, 6, 7]]
+        expected_output = [[5, 6, 7, 0, 0]]
         self.assertAllEqual(output, expected_output)

     def test_ragged_input(self):
```

Review comment: Ditto, we are showing a lot of right-hand-side padding for the left padding option. I think we have introduced a bug.
```diff
@@ -55,7 +55,7 @@ def test_ragged_input(self):
             sequence_length=5, padding_side="left"
         )
         output = start_end_packer(input_data)
-        expected_output = [[0, 0, 5, 6, 7], [0, 8, 9, 10, 11]]
+        expected_output = [[0, 5, 6, 7, 0], [8, 9, 10, 11, 0]]
         self.assertAllEqual(output, expected_output)

     def test_start_end_token(self):
```
```diff
@@ -119,7 +119,7 @@ def test_start_end_padding_value(self):
             padding_side="left",
         )
         output = start_end_packer(input_data)
-        expected_output = [[3, 3, 1, 5, 6, 7, 2], [3, 1, 8, 9, 10, 11, 2]]
+        expected_output = [[3, 1, 5, 6, 7, 2, 3], [1, 8, 9, 10, 11, 2, 3]]
         self.assertAllEqual(output, expected_output)

     def test_truncation(self):
```
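Applying the same rule row by row reproduces the pre-change expectations for the ragged and start/end-padding-value cases above. `batch_left_pad` is a hypothetical illustration, not the packer's implementation:

```python
def batch_left_pad(rows, length, pad_value=0):
    """Left-pad each row of a ragged batch to exactly `length` entries."""
    out = []
    for row in rows:
        row = list(row)[-length:]  # keep at most the last `length` tokens
        out.append([pad_value] * (length - len(row)) + row)
    return out

print(batch_left_pad([[5, 6, 7], [8, 9, 10, 11]], 5))
# [[0, 0, 5, 6, 7], [0, 8, 9, 10, 11]] -- the pre-change ragged expectation
```

With `length=7` and `pad_value=3` on `[[1, 5, 6, 7, 2], [1, 8, 9, 10, 11, 2]]` this likewise yields `[[3, 3, 1, 5, 6, 7, 2], [3, 1, 8, 9, 10, 11, 2]]`, the expectation the PR removed.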
```diff
@@ -21,17 +21,24 @@
 NO_CONVERT_COUNTER = threading.local()


-def pad(x, shape, padding_side, pad_value):
-    if padding_side == "left":
+def pad(x, shape, padding_side, pad_value, axis=-1):
+    if padding_side == "left" and pad_value is not None:
         x = x[..., ::-1]
-    outputs = x.to_tensor(
-        default_value=pad_value,
-        shape=shape,
-    )
     if padding_side == "left":
+        outputs = x.to_tensor(
+            default_value=pad_value,
+        )
         outputs = outputs[..., ::-1]
+        padding_shape = [tf.shape(outputs)[0]] + [1] * (len(outputs.shape) - 1)
+        padding_shape[axis] = shape[axis] - tf.shape(outputs)[axis]
+        padding_shape = tf.cast(padding_shape, "int64")
+        padding = tf.fill(padding_shape, pad_value)
+        padding = tf.cast(padding, outputs.dtype)
+        outputs = tf.concat([outputs, padding], axis=axis)
+    else:
+        outputs = x.to_tensor(
+            default_value=pad_value,
+            shape=tf.cast(shape, "int64"),
+        )
     return outputs
```

Review comment (on `if padding_side == "left" and pad_value is not None:`): I don't fully understand what we are trying to do here, but I think it is buggy, we should go back to the reverse and
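The approach the reviewer refers to (reverse each row, right-pad to the target shape, reverse back) can be sketched without TensorFlow. `to_dense` below is a pure-Python stand-in for `tf.RaggedTensor.to_tensor`, which only ever pads on the right; names and shapes here are illustrative, not the library's code:

```python
def to_dense(rows, length, pad_value):
    """Stand-in for RaggedTensor.to_tensor: pads/truncates on the RIGHT only."""
    return [(list(r) + [pad_value] * length)[:length] for r in rows]

def pad(rows, length, padding_side, pad_value):
    if padding_side == "left":
        rows = [list(r)[::-1] for r in rows]  # reverse each row
    outputs = to_dense(rows, length, pad_value)  # right-pad the reversed rows
    if padding_side == "left":
        outputs = [r[::-1] for r in outputs]  # reverse back: padding lands on the left
    return outputs

print(pad([[5, 6, 7], [8, 9, 10, 11]], 5, "left", 0))
# [[0, 0, 5, 6, 7], [0, 8, 9, 10, 11]]
```

Because the double reversal turns `to_dense`'s right padding into left padding, this sketch produces the pre-change test expectations directly, without building a separate padding tensor.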
Review comment: Why do we have this line for StartEndPacker but not MultiSegmentPacker? What's the difference?

Reply: If you delete the test, an error will be reported.

Review comment: Thanks! But that's not the question. We want the implementations of StartEndPacker and MultiSegmentPacker to look similar where we can. Having a `None` pad value behave differently between these layers could lead to subtle bugs for end users. Is there a technical reason why we need this line for StartEndPacker and not MultiSegmentPacker? If this is just for tests, let's rework the tests. Let's try to keep the layers working roughly the same.