-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Send non-printable codepoints as U+FFFD #76
Conversation
termbox2.h
Outdated
@@ -1535,6 +1535,7 @@ static int send_sgr(uint32_t fg, uint32_t bg, int fg_is_default, | |||
static int send_cursor_if(int x, int y); | |||
static int send_char(int x, int y, uint32_t ch); | |||
static int send_cluster(int x, int y, uint32_t *ch, size_t nch); | |||
static int is_valid_codepoint(uint32_t ch); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just yak shaving, but is_invalid_codepoint
with inverted logic will be more suitable here. No ternary operator and not
ing its return :-).
I reached this conclusion when I read the function definition and encountered the ternary in the return
. And it is entirely possible you did it intentionally, in which case, ignore this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The word invalid
is even in the title of the PR :-D.
Thank you for the review @pepe. I altered the PR to filter based on
Of course there are codepoints above the 1-byte range, but the same categories apply. I think it comes down to control codes and whitespaces.
One more note about control codes, specifically NULL: I'm not sure why termbox v1 went out of its way to replace NULL with a space. I think it may be from when clearing the buffer was a simple zero-fill, but now we set everything to a space. Let me know if you can think of any counter-examples. I will leave these PRs open for a while for visibility. |
Thank you. That is a much better choice, anyway. I am planning to rewrite a TUI of mine in the coming weeks, so I will definitely use this version to test it. |
Presently if someone sets a cell to an invalid codepoint, termbox will write it out verbatim, but it's guaranteed to be garbage. Instead send U+FFFD. This effectively limits cell contents to valid Unicode.
Any reasons to not do this? Could we be even more strict and useiswprint
?EDIT: Updated PR to use
iswprint
Note users can still send arbitrary data via
tb_send
.