explain_output_differences escape unicode is needlessly restrictive #29

Open
insou22 opened this issue Apr 22, 2022 · 1 comment
insou22 commented Apr 22, 2022

https://github.com/COMP1511UNSW/autotest/blob/main/explain_output_differences.py#L379

# ...
        if len(line) > max_line_length_shown:
            line = line[0:max_line_length_shown] + " ..."
        line = line.encode("unicode_escape").decode("ascii")
        if leave_colorization:
            line = line.replace(r"\x1b", "\x1b")
        if leave_tabs:
            line = line.replace(r"\t", "\t")
            line = line.replace(r"\\", "\\")
# ...

Ideally I would like to see the line `line = line.encode("unicode_escape").decode("ascii")` removed,
but @hexDoor has pointed out that it may be needed to escape things such as ANSI colour codes.

Unfortunately, it also escapes every non-ASCII character, i.e. anything that takes more than one byte in UTF-8 (e.g. text in other languages, emoji, etc.), even though these are perfectly displayable.
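For reference, a minimal reproduction of the escaping step in isolation. The helper name `escape_like_autotest` is made up for this example; only the `encode`/`decode` call is taken from the snippet above.

```python
def escape_like_autotest(line):
    # Same call as in explain_output_differences.py: every non-ASCII
    # character and every ASCII control character becomes a backslash escape.
    return line.encode("unicode_escape").decode("ascii")


print(escape_like_autotest("too 😂"))  # too \U0001f602
print(escape_like_autotest("café"))    # caf\xe9
print(escape_like_autotest("a\tb"))    # a\tb  (a literal backslash followed by t)
```

So the emoji and the accented letter are treated exactly like the tab, although only the tab is non-printable.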

Can we make it less restrictive, i.e. render displayable characters normally and escape only dangerous or non-displayable characters?
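One possible shape for this, as a rough sketch rather than a proposed patch: escape character by character and only touch what `str.isprintable()` rejects. The function name `escape_non_printable` is hypothetical; the `leave_colorization` and `leave_tabs` flags are borrowed from the snippet above, and I am assuming they should keep their current meaning.

```python
def escape_non_printable(line, leave_colorization=False, leave_tabs=False):
    """Escape only characters that are not printable; keep everything else."""
    pieces = []
    for ch in line:
        if ch == "\x1b" and leave_colorization:
            pieces.append(ch)  # keep ANSI escape sequences intact
        elif ch == "\t" and leave_tabs:
            pieces.append(ch)  # keep real tabs
        elif ch.isprintable():
            pieces.append(ch)  # letters, emoji, spaces, punctuation, ...
        else:
            # Control and other non-printable characters: \x1b, \t, \u200b, ...
            pieces.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(pieces)


print(escape_non_printable("emojis should work too 😂😂😃"))
print(escape_non_printable("red: \x1b[31mtext\x1b[0m"))
```

One caveat with this sketch: a literal backslash is printable, so it is left alone, which means an escaped tab and a literal backslash followed by `t` look the same in the output. Whether that matters for autotest is a separate decision.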


insou22 commented Apr 22, 2022

Current behaviour:

Test 3 (./mini_grep '😂') - failed (Incorrect output)
Your program produced this line of output:
oh and; emojis should work too \U0001f602\U0001f602\U0001f602

The correct 1 lines of output for this test were:
oh and; emojis should work too \U0001f602\U0001f602\U0001f603

The difference between your output(-) and the correct output(+) is:
- oh and; emojis should work too 😂😂😂
?                                  ^

+ oh and; emojis should work too 😂😂😃
?                                  ^


The input for this test was:
oh and; emojis should work too 😂😂😂
and they follow the same rules: 😃 🌍 🍞 🚗

You can reproduce this test by executing these commands:
  rustc mini_grep.rs
  echo -e 'oh and; emojis should work too 😂😂😂\nand they follow the same rules: 😃 🌍 🍞 🚗' | ./mini_grep 😂

With that line of code removed:

Test 3 (./mini_grep '😂') - failed (Incorrect output)
Your program produced this line of output:
oh and; emojis should work too 😂😂😂

The correct 1 lines of output for this test were:
oh and; emojis should work too 😂😂😃

The difference between your output(-) and the correct output(+) is:
- oh and; emojis should work too 😂😂😂
?                                  ^

+ oh and; emojis should work too 😂😂😃
?                                  ^


The input for this test was:
oh and; emojis should work too 😂😂😂
and they follow the same rules: 😃 🌍 🍞 🚗

You can reproduce this test by executing these commands:
  rustc mini_grep.rs
  echo -e 'oh and; emojis should work too 😂😂😂\nand they follow the same rules: 😃 🌍 🍞 🚗' | ./mini_grep 😂
