Execution character set on Windows - add execution_windows_acp and execution_windows_ocp? #45
Comments
Seems like something where we make the encoding classes ( I didn't think Right proper kind of messed up the Windows situation is, though.
For "can we have encodings for
Mm, I was more imagining the possibility of a backend based on Windows' EDIT: Also, to go back to an earlier statement:
It's bloody well not supposed to, that's for sure. In the C++17 standard, 24.5.5 paragraph 1 delegates to the C standard, and notes "ISO C 7.28". If we pull up the C11 spec, 7.28.1.3 specifically is
(emphasis mine) This, then, clearly constrains the three functions to process their input alike, and Microsoft is at least documenting their implementation of
I should come back to this to state: I can't very well just use You can read about why it's unsuitable here: https://thephd.dev/the-c-c++-rust-string-text-encoding-api-landscape#windows-api
On Windows, the question of "execution character set" (at least for narrow characters) is complicated by some additional factors:

1. There are two code pages in play:
   - OEM Code Page (CP_OCP), as is used for the console
   - ANSI Code Page (CP_ACP), as is used for the GUI
2. The behavior of the mbrto* and *tombr functions is, at least according to the documentation, inconsistent:
   - mbrtowc is documented as treating its input as the "current locale" (and you can play games with the .ACP and .OCP locales, accordingly)
   - mbrtoc32, on the other hand, is documented as treating its input as UTF-8 unconditionally.

As a result, I'm not sure that there is any way that ztd.text currently handles the "execution character set" on Windows that provides the expected result under all circumstances:

- <cuchar>/<uchar.h> is affected by (2), as it uses mbrtoc32
- iconv is not available
- cuneicode only seems to have three approaches:
  - ztdc_is_execution_encoding_utf8(), which is false (or at least ought to be) unless the system code page has been set to CP_UTF8/65001
  - mbrtoc32, which falls to (2) above
  - reinterpret_cast to treat the input as UTF-8, which is certainly not correct.

I haven't verified at runtime that (2) actually presents itself, partly because while this documentation is for Visual Studio I'm using Embarcadero C++ Builder (and their standard library is sorely underdocumented and variable by version), and partly because issue (1) is the more pressing (the application I'm working with needs to interact with both, as we're currently in the midst of making it UTF-8 native, but need to retain the ability to interact with legacy files that were, due to oversights, written according to CP_ACP, and also need to emit data to the console in certain circumstances).

As a result, is there any chance that flavors of the execution character set could be added for CP_ACP and CP_OCP - or possibly for Windows CP_* values in general?
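
To make the request concrete, here is a rough sketch of what decoding under an explicitly chosen Windows code page might look like with a one-shot Win32 call. It is not something ztd.text or cuneicode provides: `decode_with_code_page` is a hypothetical helper name, and the non-Windows branch is only a stub so the snippet compiles elsewhere. Note that the Win32 constant for the OEM code page is spelled `CP_OEMCP`. A one-shot call like this does not report how much input it consumed, so it is an illustration of the desired result rather than a drop-in streaming backend.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>
#include <string_view>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of ztd.text or cuneicode): decode `input`,
// interpreted in the given Windows code page (e.g. CP_ACP or CP_OEMCP),
// into UTF-16. Returns an empty string on failure or on non-Windows builds.
std::wstring decode_with_code_page(unsigned code_page, std::string_view input) {
    // The Win32 API takes lengths as int, so oversized input is a caller error.
    assert(input.size() <= static_cast<std::size_t>(INT_MAX));
    if (input.empty()) {
        return {};
    }
#ifdef _WIN32
    const int input_size = static_cast<int>(input.size());
    // First call: ask how many UTF-16 code units the result needs.
    // MB_ERR_INVALID_CHARS makes invalid input fail instead of being replaced;
    // a few code pages require the flags argument to be 0 instead.
    const int needed = MultiByteToWideChar(code_page, MB_ERR_INVALID_CHARS,
                                           input.data(), input_size, nullptr, 0);
    if (needed <= 0) {
        return {};
    }
    std::wstring output(static_cast<std::size_t>(needed), L'\0');
    // Second call: perform the conversion into the sized buffer.
    const int written = MultiByteToWideChar(code_page, MB_ERR_INVALID_CHARS,
                                            input.data(), input_size,
                                            output.data(), needed);
    if (written <= 0) {
        return {};
    }
    output.resize(static_cast<std::size_t>(written));
    return output;
#else
    (void)code_page; // stub: no Windows code page machinery is available here
    return {};
#endif
}
```

On a system whose ANSI code page is Windows-1252 and whose OEM code page is 437, the single byte 0x82 would be expected to decode to U+201A under `CP_ACP` and to U+00E9 under `CP_OEMCP`, which is the kind of divergence factor (1) describes.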
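
Since (2) has not been verified at runtime, a small probe along these lines could be compiled under each toolchain to see whether `mbrtowc` and `mbrtoc32` really disagree on the same bytes. This is only a sketch: `probe_result`, `probe_mbrtowc`, `probe_mbrtoc32` and `probes_agree` are hypothetical names, and it assumes the toolchain ships `<cuchar>` at all, which may not hold for every C++ Builder version.

```cpp
#include <cassert>
#include <cstddef>
#include <cuchar>
#include <cwchar>

struct probe_result {
    std::size_t status;      // raw return value of the conversion function
    unsigned long code_unit; // first output code unit, 0 if none was written
};

// Decode the first character of `bytes` with mbrtowc (current locale).
probe_result probe_mbrtowc(const char* bytes, std::size_t size) {
    assert(bytes != nullptr);
    std::mbstate_t state{};
    wchar_t out = 0;
    const std::size_t status = std::mbrtowc(&out, bytes, size, &state);
    return { status, static_cast<unsigned long>(out) };
}

// Decode the first character of `bytes` with mbrtoc32.
probe_result probe_mbrtoc32(const char* bytes, std::size_t size) {
    assert(bytes != nullptr);
    std::mbstate_t state{};
    char32_t out = 0;
    const std::size_t status = std::mbrtoc32(&out, bytes, size, &state);
    return { status, static_cast<unsigned long>(out) };
}

// True when both functions report the same status and the same first code
// unit. Where wchar_t is 16 bits wide, this is only meaningful for
// characters in the Basic Multilingual Plane.
bool probes_agree(const char* bytes, std::size_t size) {
    const probe_result wide = probe_mbrtowc(bytes, size);
    const probe_result utf32 = probe_mbrtoc32(bytes, size);
    return wide.status == utf32.status && wide.code_unit == utf32.code_unit;
}
```

One way to use it would be to call `std::setlocale(LC_ALL, ".ACP")` first and then check `probes_agree("\x82", 1)`. If the documentation is accurate, the two functions should disagree on a non-UTF-8 code page, since a lone 0x82 byte is not valid UTF-8.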