This is a running list of all the (mildly to extremely) cursed encodings, and whether or not we should implement them. More can be suggested on Twitter. Here goes:
UTF-EBCDIC This may be patent-encumbered or license-checked, and therefore cannot be implemented.
UTF-7 This may be patent-encumbered or license-prohibited, and therefore cannot be implemented.
UTF-7-IMAP This may be patent-encumbered or license-prohibited, and therefore cannot be implemented.
UTF-1 Not a good encoding.
Some that might not be possible within the framework of this library:
Early Cangjie input method translation: this is more a system of input that is then converted to characters than a character set itself. It also seems to have a (potentially?) unbounded set of inputs that can produce an equally wild number of outputs, making the encode_one/decode_one limitations potentially useless? Needs more research.
For what it's worth, the Unicode Consortium published conversion tables for many of those encodings; conversion to Unicode from these encodings ends up going through lookup tables, and conversion back is likely the same for "properly normalized" Unicode.
For what it's worth, I've already started working on lookup tables for most of the single- and double-byte encodings. They're not derived from the ICU data, though, but from other sources.
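To make the lookup-table idea from the comments above concrete, here is a rough Python sketch of a table-driven single-byte codec. It is not this library's API: the names `decode_one` and `encode_one` are borrowed from the issue text, but the signatures, the `DECODE_TABLE`/`ENCODE_TABLE` names, and the error handling are all assumptions made for illustration. The table is deliberately partial (ASCII plus three Windows-1252 entries), not a complete mapping.

```python
# Hypothetical, partial table: ASCII maps to itself, plus three
# Windows-1252 entries. A real table would cover every byte value.
DECODE_TABLE = {b: b for b in range(0x80)}
DECODE_TABLE.update({
    0x80: 0x20AC,  # EURO SIGN
    0x85: 0x2026,  # HORIZONTAL ELLIPSIS
    0x99: 0x2122,  # TRADE MARK SIGN
})

# The reverse table is the inverse of the forward table. This only
# works because each code point appears at most once in DECODE_TABLE.
ENCODE_TABLE = {cp: b for b, cp in DECODE_TABLE.items()}


def decode_one(data: bytes, pos: int = 0) -> tuple[int, int]:
    """Decode one code point at `pos`; return (code_point, next_pos)."""
    byte = data[pos]
    try:
        return DECODE_TABLE[byte], pos + 1
    except KeyError:
        raise ValueError(f"byte 0x{byte:02X} has no mapping") from None


def encode_one(code_point: int) -> bytes:
    """Encode one code point to a single byte via the reverse table."""
    try:
        return bytes([ENCODE_TABLE[code_point]])
    except KeyError:
        raise ValueError(f"U+{code_point:04X} has no mapping") from None
```

Encoding only succeeds for code points that appear in the table, which is one reading of the "properly normalized" caveat: a decomposed sequence would have to be composed first before the reverse lookup can find it.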
MULE_INTERNAL (Multilanguage Emacs internal encoding) Garbage encoding for an even more garbage text editor.