Don't Cursed Open Inside #21

Open
11 of 15 tasks
ThePhD opened this issue Aug 13, 2021 · 3 comments

Comments


ThePhD commented Aug 13, 2021

This is a running list of all the (mildly to extremely) cursed encodings, and whether or not we should implement them. More can be suggested on Twitter. Here goes:

Some that might not be possible within the framework of this library:

  • Early Cangjie input method translation: this is more a system of input that is then converted to characters than a character set itself. It also seems to have a (potentially?) unbounded set of inputs that can produce an equally wild number of outputs, which might make the encode_one/decode_one limitations a poor fit. Needs more research.

marzojr commented Apr 1, 2023

For what it's worth, the Unicode Consortium has published conversion tables for many of those encodings; conversion to Unicode from these encodings ends up going through lookup tables, and conversion back is likely the same for "properly normalized" Unicode.

The data can be found here: https://github.com/unicode-org/icu-data.
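
As a rough illustration of the lookup-table approach described above, here is a minimal C++ sketch for a made-up single-byte encoding. Everything in it is an assumption for illustration: the table contents, `table_entry`, `decode_byte`, and `encode_code_point` are hypothetical names and do not come from icu-data or from this library. It also shows why the reverse direction depends on normalization: only a precomposed code point has a table entry, so a lone combining mark fails to encode.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical table entry: one byte of the legacy encoding and the
// Unicode code point it maps to. Not taken from any real data set.
struct table_entry {
    std::uint8_t byte;
    char32_t code_point;
};

// Made-up mappings for a few high bytes; bytes below 0x80 are treated as ASCII.
inline constexpr std::array<table_entry, 3> high_byte_table{{
    {0x80, U'\u20AC'}, // EURO SIGN
    {0x81, U'\u00E9'}, // LATIN SMALL LETTER E WITH ACUTE (precomposed)
    {0x82, U'\u00F1'}, // LATIN SMALL LETTER N WITH TILDE (precomposed)
}};

// Decode one byte to one code point, or nothing if the byte is unmapped.
inline std::optional<char32_t> decode_byte(std::uint8_t byte) {
    if (byte < 0x80) {
        return static_cast<char32_t>(byte);
    }
    for (const auto& entry : high_byte_table) {
        if (entry.byte == byte) {
            return entry.code_point;
        }
    }
    return std::nullopt;
}

// Encode one code point to one byte by searching the same table in reverse.
// A linear scan keeps the sketch short; a real table would likely be sorted
// or indexed for faster lookup.
inline std::optional<std::uint8_t> encode_code_point(char32_t code_point) {
    if (code_point < 0x80) {
        return static_cast<std::uint8_t>(code_point);
    }
    for (const auto& entry : high_byte_table) {
        if (entry.code_point == code_point) {
            return entry.byte;
        }
    }
    return std::nullopt;
}
```

With this table, `decode_byte(0x81)` yields U+00E9 and `encode_code_point(U'\u00E9')` yields `0x81`, while the combining acute accent U+0301 on its own has no entry and produces an empty result.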


ThePhD commented Apr 1, 2023

Yeah, I've seen that!

For what it's worth, I've already started working on lookup tables for most of the single- and double-byte encodings. They're not derived from the ICU data, though, but from other sources.

See here: https://github.com/soasis/encoding_tables


marzojr commented Apr 2, 2023

Oh, nice! I was going by your encoding docs, which, I guess, are out of date then.

ThePhD self-assigned this May 16, 2023
ThePhD added the "enhancement" and "good first issue" labels May 16, 2023