-
Notifications
You must be signed in to change notification settings - Fork 160
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Konsole: overly wide Unicode characters mess up intended layouts #51
Comments
This is a bug in Konsole's bi-directional text rendering support and there's nothing Turbo Vision can do about it. The problem goes away if you disable this feature from Change that setting and share the results, because there may still be something wrong with these apparently double-width characters. |
Disabling bi-directional text rendering at least improved the cursor movement issue for me. But in your case it looks that Konsole is additionally rendering certain characters as double-width when they are not, I don't know why. Can you please run the following program and share the output? #include <locale.h>
#include <wchar.h>
#include <stdio.h>
struct UChar { const char *mbc; wchar_t wc; };
const UChar chars[] =
{
{"a", L'a'},
{"❶", L'❶'},
{"⑪", L'⑪'},
{"🤡", L'🤡'},
};
int main()
{
setlocale(LC_ALL, "");
for (const auto &ch : chars)
printf("wcwidth(%s) = %d\n", ch.mbc, wcwidth(ch.wc));
} I get the following:
|
In my earlier screenshot of the character picker, I notice the scrollbar is in the correct spot and it cuts off the characters inside the list, rather than itself being shifted to the right. That makes me think that if we had a list of the affected characters, this could be worked around by always explicitly moving to the expected screen position after writing one of those characters. That would hopefully cut off the right side of the over-wide character and allow the rest of the line to be in the right spot. |
Well, it's clear that there's something wrong with how Konsole chooses to render these characters. I suggest you try another terminal emulator (alacritty, gnome-terminal, kitty, or even xterm). If the problem persists in any of these, then the problem may lie in the font rendering libraries. Otherwise, it may be an issue unique to Konsole (which version are you using, BTW?).
This suggests to me that Konsole is aware these characters are actually just one cell wide, but for some reason they are rendered wider than they should.
I have never experienced this issue before, so it can be assumed that not all Konsole users suffer from it. Then, how would Turbo Vision detect whether this issue is happening or not? I don't think enabling this workaround unconditionally would be very comfortable. Cheers. |
Also, please try using different fonts (mine is Hack). |
This is Konsole version 17.12.3. Changing the font does fix it. The fonts preinstalled on Ubuntu, "DejaVu Sans Mono", "Courier 10 Pitch", "Nimbus Mono L", and "Noto Mono", all produce the over-wide characters. My favorite third party font (Iosevka) looks correct. I wonder if this is some kind of font fallback issue, maybe ❶ doesn't exist in any of the built-in mono fonts. A workable solution for me is to simply omit these characters from the symbol picker, but they are handy-looking glyphs that I'd like to salvage if I can. Another workable solution for me is to ignore the problem and just let it be broken on Konsole. These characters look good in every other system and terminal combination I've tried. |
That's what I would do. At the most, this issue can be documented somewhere so that in the unlikely case a user runs across it, they can fix it themselves. |
Works for me. Thanks! |
wcwidth() lies in a huge number of cases. The only reliable way to determine the actual character width is as it is done for Windows, by outputting the character and measuring the cursor offset. By the way, dividing a line into grapheme clusters is possible using the same method. An example in Python is here: |
Hi @unxed! I'm afraid that solution is only feasible on Windows. In order to do that in a Unix terminal:
So, in my opinion, you would end up with a poor experience for both the user and the programmer. |
Why not output chars outside the visible area, above it? |
I haven't actually tried that. But I suspect that drawing outside the visible area only makes sense if there is a scrollback area. Turbo Vision uses the alternate screen buffer, which results in scrollback being disabled in most terminal emulators. At this point, I expect that the terminal won't allow the cursor to be moved out of bounds. So, it looks to me that such a strategy is likely not to work in many terminal emulators, and therefore it would not be portable. Besides the fact that it would tackle just the first of the problems I mentioned. |
Could whose problems be solved using atomic updates proposal as described here: ? |
Synchronized updates allow you to avoid having the characters you are trying to measure shown on screen. But that just solves the first issue I mentioned. In addition, it introduces more steps to the process of measuring a character's width, so the performance may be even worse depending on how you use it. As absurd as it may sound, I think the only way around the issues I mentioned is to tacke this issue the other way around, and have the client application tell the terminal whether the characters it is printing should be displayed as single- or double-width. This way the application's expectations would match the actual display, so there could be no screen garbling because of width mismatches. And since this would just require a one-way communication from the application to the terminal, it would avoid the performance penalty of having to wait for replies from the terminal, and it would also avoid any conflicts with user input. |
That sounds pretty reasonable! Could you offer a draft standard? It could be implemented, for example, in far2l built-in terminal or maybe in kitty if the author is willing to do so. |
By the way, since we are talking about Unicode support. Could you please tell me if grapheme clusters can have varying displayed widths depending on neighboring grapheme clusters? Or is the width of a grapheme cluster a fixed value? I haven't been able to figure this out yet, maybe you know? Thank you! |
Hi @unxed. Regarding your question on grapheme clusters, I don't know much about these details of the Unicode specification, so I can't help you. All I know is that I cannot expect the average terminal emulator to be fully compliant with Unicode. Not just because some terminals never intended to be compliant in the first place, but also because the specification evolves over time and implementations may become outdated (e.g. the internet is full of different implementations of A clear example of this was the behaviour of Kate's embedded terminal widget by the time I wrote the following comment: #26 (comment)
That's why I think that attempting to solve the issue of character widths by focusing on Unicode standard compliancy is not the best idea. For this to work, both the application and the terminal emulator should either implement these complex Unicode logics, or rely on third-party dependencies that implement such logics. Even if they did so, the Unicode support in them would inevitably be in risk of becoming outdated, unless both these programs and/or the systems where they would be running kept receiving updates. Considering that one of the main points of text-based applications is portability (e.g. being able to run in a remote host), it seems to me that tackling this issue in this way would be senseless. Having the client application ask the terminal the width of text is a possible solution, but it will only work performantly in specific scenarios with very low latency. It takes at least 1 A serious proposal for the solution I mentioned in my previous comment would require considering a lot of things into account, since this is not just about single characters. For example, a client application may want to ensure that displaying "👨👩👧👦" (consisting of 5 Unicode codepoints) will occupy just two screen columns. The terminal may not know how to render this grapheme cluster properly (as in the previous example of Kate's embedded terminal), so inevitably these characters won't be displayed the way the application expected, but they should still occupy exactly two screen columns, since messing up the application's layout can be avoided. For a standard proposal to be effective in solving this issue, it should provide clear hints for terminal emulator developers on how to handle this situation and many other ones. But I have never developed a terminal emulator and I am not familiar with font rendering, so I have no idea what it makes sense to ask the terminal emulator to do and what it doesn't. |
Taking into account everything we've discussed, the only solution that comes to mind is to pass a set of rules (describing how to split a string into grapheme clusters and determine the width of these clusters) from the terminal to the application (or vice versa) at the app start. Because it doesn’t seem like the Unicode standard logic is now actively changing between versions, but just new characters are being added. Currently, such rules are usually statically compiled into the application. If they are made dynamically loadable, this could solve the issue, although it would result in a slight delay when launching the application. As for terminal support, we could experiment with this in the built-in far2l terminal, and if we find a sustainable solution, we could propose it to other developers. What do you think of this approach? |
Here's another idea. Perhaps we could develop a protocol that allows the terminal and the application to agree on the highest Unicode standard version they both support and then operate using that version. If this protocol isn't supported, we could fall back to the current approach. |
The point of my suggestion was that things should be made as simple as possible for both the client application and the terminal emulator. Turbo Vision currently uses the system-provided Similarly, I think that having the client application and the terminal emulator talk to each other about rules describing how to split a string into grapheme clusters and determine the width of these clusters does not sound much simpler. What would those rules be like? How much code would it take in the client application to support that? When writing about my suggestion, I was thinking of something like this:
This wouldn't ensure that all grapheme clusters are rendered properly, but it would prevent the client application's layout from messing up. In addition, the terminal may have to reply to some of the escape sequences so that the client knows this feature is supported. But, as I said, maybe implementing this in a terminal emulator is very complex and unconvenient. I don't know. |
@elfmz can you please look into this? Can we support this experimental approach in far2l's VT? |
I tried to implement an approach in which the application informs the terminal about the size of the grapheme cluster on a per-cluster basis. And the terminal simply has to fit the grapheme cluster into the required matrix of cells. You can play with it using vtm built-in terminal (
The explicitly specified codepoint (joining modifier) is taken from the Unicode codepoint range 0xD0000-0xD02A2 (not allocated yet range), the value of which is encoded by the "wh_xy" literal value enumeration:
If you dive deeper, you can get the following things with rotation, mirroring and halves: |
I've updated the draft: Unicode Character Geometry Modifiers |
Btw, iTerm2 has ESC sequence to specify Unicode version for characters with detection:
|
I'm trying to find a solution to this rendering issue in Konsole.
This is a
TEditor
example showing how the line formatting gets shifted around due to those ❺ symbols. Also, the title of this window is clipped (there's no close parenthesis). These symbols are rendered slightly wider than a single cell, leading to the rest of the printed line being shifted out of alignment. Other terminals show this symbol in a single terminal cell, but Konsole is painting the line with variable character widths with chaotic effects.Here's my evolution of the ASCII table from tvdemo. On the right you can see some extra-wide characters that mess up the whole line. ⑪-⑯ and ❽-❿ are clipped. When you click on one, it selects the "wrong" character due to the rendering discrepancy.
Open to suggestions on this. Konsole is the only terminal I've tested so far with this issue.
The text was updated successfully, but these errors were encountered: