Add tests for invalid utf8 when formatting #473

Open: 3 commits to be merged into base: master

@smaye81 smaye81 commented May 28, 2025

According to docs:

The value is formatted as if string(value) was performed and any invalid UTF-8 sequences are replaced with \ufffd. Multiple adjacent invalid UTF-8 sequences must be replaced with a single \ufffd.

This adds two additional tests to verify:

  • invalid UTF-8 sequences are each replaced with \ufffd.
  • multiple adjacent invalid UTF-8 sequences are replaced with a single \ufffd.
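
To illustrate the behavior these two tests target, here is a minimal Python sketch. It is not the CEL implementation and not the test data from this PR; the function name `decode_collapsing_invalid` is hypothetical, and it assumes "adjacent" means invalid sequences with no valid bytes between them.

```python
def decode_collapsing_invalid(data: bytes) -> str:
    """Decode bytes as UTF-8, emitting one U+FFFD per run of adjacent invalid sequences."""
    parts: list[str] = []
    pos = 0
    last_was_replacement = False
    while pos < len(data):
        try:
            # Everything from pos onward is valid: append it and stop.
            parts.append(data[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start and err.end are offsets into data[pos:].
            valid_prefix = data[pos:pos + err.start].decode("utf-8")
            if valid_prefix:
                parts.append(valid_prefix)
                last_was_replacement = False
            # Only emit a replacement if the previous output was not one already.
            if not last_was_replacement:
                parts.append("\ufffd")
                last_was_replacement = True
            pos += err.end
    return "".join(parts)


# Separated invalid bytes are each replaced.
print(repr(decode_collapsing_invalid(b"a\xffb\xfec")))
# Adjacent invalid bytes collapse into a single replacement character.
print(repr(decode_collapsing_invalid(b"\xff\xfe\xfd")))
```

For comparison, Python's built-in `bytes.decode("utf-8", errors="replace")` emits one `\ufffd` per invalid sequence and does not collapse adjacent ones, which is the kind of difference the second test is meant to catch.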

google-cla bot commented May 28, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

smaye81 added a commit to bufbuild/protovalidate-java that referenced this pull request May 29, 2025
The current conformance tests for `string.format` are not exhaustive and
do not account for all scenarios in the
[docs](https://github.com/google/cel-spec/blob/master/doc/extensions/strings.md).
One such example is a test for invalid UTF-8.

This adds the ability to specify supplemental conformance tests in the
form of another textproto file. The content is merged with the actual
CEL conformance tests and then run against our implementation. This
allows us to specify our own tests not yet covered in the official
conformance tests. As a result, this includes two tests for invalid
UTF-8, which incidentally turned up a bug involving collapsing
placeholders for contiguous invalid UTF-8 bytes.

Note that a PR has been created
[here](google/cel-spec#473) to add these tests
to the spec. Once added and released, they can be removed from our
supplemental tests.
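
The merge step described in the commit message above could look roughly like the following Python sketch. Everything here is an assumption for illustration: the `Section` and `ConformanceFile` types, the `merge_supplemental` helper, and the choice to match sections by name are hypothetical. The real implementations work on protobuf messages parsed from textproto files, which this sketch does not model.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    # Hypothetical stand-in for a named group of conformance tests.
    name: str
    tests: list[str] = field(default_factory=list)


@dataclass
class ConformanceFile:
    # Hypothetical stand-in for one parsed conformance test file.
    name: str
    sections: list[Section] = field(default_factory=list)


def merge_supplemental(official: ConformanceFile, supplemental: ConformanceFile) -> ConformanceFile:
    """Return a new file containing the official tests plus the supplemental ones."""
    # Copy the official sections so the inputs are left untouched.
    merged = {s.name: Section(s.name, list(s.tests)) for s in official.sections}
    for extra in supplemental.sections:
        # Assumption: supplemental tests join the section with the same name,
        # or start a new section if none exists.
        target = merged.setdefault(extra.name, Section(extra.name))
        target.tests.extend(extra.tests)
    return ConformanceFile(official.name, list(merged.values()))


official = ConformanceFile("string_ext", [Section("format", ["basic_format"])])
supplemental = ConformanceFile(
    "string_ext_supplemental",
    [Section("format", ["invalid_utf8", "adjacent_invalid_utf8"])],
)
merged = merge_supplemental(official, supplemental)
print([(s.name, s.tests) for s in merged.sections])
```

Once the upstream tests are released, dropping the supplemental file leaves the official tests running unchanged, which matches the removal plan in the commit message.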
smaye81 added a commit to bufbuild/protovalidate-python that referenced this pull request May 30, 2025
The current conformance tests for `string.format` are not exhaustive and
do not account for all scenarios in the
[docs](https://github.com/google/cel-spec/blob/master/doc/extensions/strings.md).
One such example is a test for invalid UTF-8.

This adds the ability to specify supplemental conformance tests in the
form of another textproto file. The content is merged with the actual
CEL conformance tests and then run against our implementation. This
allows us to specify our own tests not yet covered in the official
conformance tests. As a result, this includes two tests for invalid
UTF-8, which incidentally turned up a bug involving collapsing
placeholders for contiguous invalid UTF-8 bytes.

Note that a PR has been created
[here](google/cel-spec#473) to add these tests
to the spec. Once added and released, they can be removed from our
supplemental tests.

See bufbuild/protovalidate-java#294 for a
similar PR in protovalidate-java.

This also renames some functions to make the test implementation more
consistent across protovalidate implementations.