Hi, I copied a paragraph from a PDF that contained hardcoded Unicode italic characters, each of which takes 4 bytes in UTF-8 and two 16-bit code units (a surrogate pair) in UTF-16. After pasting it into a Markdown file and saving the file in UTF-8 encoding, I started receiving the warning `Line length [Expected: 80, Actual: 85]`, even though only 74 Unicode characters are displayed on the line (stored as 107 bytes):

    - $\forall 𝑣_1, 𝑣_2, \ldots, 𝑣_𝑛 \in 𝑇_𝑛: 𝑣_1, 𝑣_2, \ldots, 𝑣_𝑛 \in 𝐾 \iff

(I assume the intention of the rule is to count characters as they are rendered in the editor, which is 74 in this case.)
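
The three numbers in the report can be reproduced with a few lines of JavaScript. This is only a sketch run against the sample line; it assumes a runtime that provides `TextEncoder` (Node.js or a browser), and the variable names are illustrative.

```javascript
// The offending line, copied verbatim. String.raw keeps the backslashes as-is.
const line = String.raw`- $\forall 𝑣_1, 𝑣_2, \ldots, 𝑣_𝑛 \in 𝑇_𝑛: 𝑣_1, 𝑣_2, \ldots, 𝑣_𝑛 \in 𝐾 \iff`;

// String.prototype.length counts UTF-16 code units.
const utf16Units = line.length;

// The string iterator yields whole code points, so spreading counts them.
const codePoints = [...line].length;

// TextEncoder always encodes to UTF-8.
const utf8Bytes = new TextEncoder().encode(line).length;

console.log(utf16Units, codePoints, utf8Bytes); // 85 74 107
```

The 11 italic letters on the line each occupy two code units, which accounts for the difference between 74 and 85.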
I may be missing some context or implementation detail, but I think the issue is a combination of two things. First, JS handles strings as UTF-16 rather than UTF-8, so the reported `.length` of the line counts UTF-16 code units rather than characters. Second, the rule uses regular, Unicode-unaware regular expressions, in which `.` likewise matches a single UTF-16 code unit.

So I think the correct way to handle these would be to use `[...line].length` to get the total length of the line and to add the `u` flag to the regular expressions to switch them to Unicode mode.
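
A minimal sketch of the two proposed changes follows. The helpers `codePointLength` and `exceedsLimit` are hypothetical names used for illustration and are not taken from the linter's source; only the built-in `.length`, the string iterator, and the `u` flag are real language features.

```javascript
// One italic letter from the sample line (U+1D463), which lies outside the
// Basic Multilingual Plane and is stored as a surrogate pair.
const v = "\u{1D463}";

// Without the u flag, `.` matches one UTF-16 code unit, so a single astral
// character fails /^.$/ but matches /^..$/. With the u flag, `.` matches
// one whole code point.
console.log(/^.$/.test(v));  // false
console.log(/^..$/.test(v)); // true
console.log(/^.$/u.test(v)); // true

// Hypothetical helper: length in code points instead of code units.
function codePointLength(text) {
  return [...text].length;
}

// Hypothetical line-length check built on the helper above.
function exceedsLimit(text, limit) {
  return codePointLength(text) > limit;
}
```

Counting code points fixes the case reported here, but it is still an approximation of what is rendered: combining marks and emoji sequences span several code points per visible character.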