Should add flags for filename encodings #43

Open
fragglet opened this issue Mar 29, 2023 · 2 comments

@fragglet
Owner

fragglet commented Mar 29, 2023

The code currently does no translation for filename encodings, and there are a variety of different ways that filenames can be encoded. In particular, Shift-JIS and EUC support are important since the lha format is/was very popular in Japan. These unfortunately will need to be manually specified since, as far as I know, there is no way to detect the encodings. We should internally translate everything to UTF-8.
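As a rough illustration of the "manually specified encoding, translated to UTF-8" idea, here is a minimal Python sketch. It is not lhasa code; the function name `filename_to_utf8` and the idea of passing the encoding as a plain string are assumptions made for the example. It relies only on Python's built-in `shift_jis` and `euc_jp` codecs.

```python
def filename_to_utf8(raw: bytes, encoding: str) -> bytes:
    """Translate raw filename bytes from a user-specified encoding to UTF-8.

    `encoding` is whatever the user asked for (hypothetically via a
    command line flag), e.g. "shift_jis" or "euc_jp". Bytes that are
    invalid in that encoding become U+FFFD rather than raising an error.
    """
    text = raw.decode(encoding, errors="replace")
    return text.encode("utf-8")


if __name__ == "__main__":
    # The same name stored in two different legacy encodings.
    sjis_bytes = "テスト.txt".encode("shift_jis")
    euc_bytes = "テスト.txt".encode("euc_jp")
    print(filename_to_utf8(sjis_bytes, "shift_jis").decode("utf-8"))
    print(filename_to_utf8(euc_bytes, "euc_jp").decode("utf-8"))
```

The two byte strings differ on disk, which is why the encoding has to come from the user: nothing in the bytes themselves reliably says which one was used.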

There are some extended ASCII formats that can be reasonably autodetected based on the OS field: for example, CP437 is probably a sensible default for DOS archives (or the system codepage when running on Windows), and Mac Extended ASCII for macOS archives. If the encoding cannot be determined, then non-ASCII characters should become the Unicode replacement character.
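A sketch of how that fallback could look, again in Python rather than lhasa's own code. The OS labels (`"msdos"`, `"macos"`) are hypothetical stand-ins for whatever the header's OS field yields, and using Python's `mac_roman` codec for "Mac Extended ASCII" is an assumption.

```python
# Hypothetical mapping from an archive's OS field to a default codec.
DEFAULT_ENCODINGS = {
    "msdos": "cp437",      # sensible default for DOS archives
    "macos": "mac_roman",  # assumed meaning of "Mac Extended ASCII"
}


def decode_with_default(raw: bytes, os_type: str) -> str:
    """Decode filename bytes using a per-OS default encoding.

    If no default is known for `os_type`, decode as ASCII so that every
    non-ASCII byte becomes U+FFFD (the Unicode replacement character).
    """
    encoding = DEFAULT_ENCODINGS.get(os_type, "ascii")
    return raw.decode(encoding, errors="replace")


if __name__ == "__main__":
    dos_name = "café.txt".encode("cp437")
    print(decode_with_default(dos_name, "msdos"))    # decoded via CP437
    print(decode_with_default(dos_name, "unknown"))  # non-ASCII byte replaced
```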

With this in place we can relax the "safe print" code currently in place, although it's still important to never print a terminal escape character or anything in the C0/C1 control character ranges (and probably the specials range too).
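For illustration, a relaxed filter along those lines might look like the Python sketch below. The function name is hypothetical, and substituting U+FFFD for rejected characters (rather than `?` or dropping them) is a choice made for the example. It treats C0 (U+0000 to U+001F), DEL (U+007F), C1 (U+0080 to U+009F) and the Specials block (U+FFF0 to U+FFFF) as unprintable and lets everything else through.

```python
REPLACEMENT = "\ufffd"


def is_unsafe(ch: str) -> bool:
    """Return True for characters that should never reach the terminal."""
    cp = ord(ch)
    return (
        cp <= 0x1F                 # C0 controls, including ESC (0x1B)
        or 0x7F <= cp <= 0x9F      # DEL plus the C1 controls
        or 0xFFF0 <= cp <= 0xFFFF  # Specials block
    )


def sanitize_for_terminal(name: str) -> str:
    """Replace unsafe characters with U+FFFD, leaving other text intact."""
    return "".join(REPLACEMENT if is_unsafe(ch) else ch for ch in name)


if __name__ == "__main__":
    # The ESC byte is neutralised; the accented letters are kept.
    print(sanitize_for_terminal("espa\u00f1ol\x1b[31m.ct"))
```

U+FFFD itself lies in the Specials block, so it maps to itself here and a name that already contains replacement characters is unchanged.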

@gryf

gryf commented Oct 22, 2023

Also, lha has been popular on Amiga OS. The default encoding seems to be Latin1, although there are different mappings for different countries, which don't easily fall into Latin1. I guess auto detection for corner cases could be difficult, if not impossible. Perhaps an external mapfile as a command line option could help in such situations, so that lhasa doesn't need to make assumptions.
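To make the mapfile idea a bit more concrete, here is a small Python sketch. The file format is invented for the example (one `0xBYTE 0xCODEPOINT` pair per line, `#` starting a comment); it is not an existing lhasa feature, and the function names are hypothetical.

```python
def parse_mapfile(text: str) -> dict:
    """Parse a hypothetical mapfile into {byte value: Unicode character}."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        byte_str, codepoint_str = line.split()[:2]
        table[int(byte_str, 16)] = chr(int(codepoint_str, 16))
    return table


def decode_with_map(raw: bytes, table: dict) -> str:
    """Decode filename bytes: ASCII passes through, high bytes use the map.

    High bytes missing from the map become U+FFFD.
    """
    return "".join(
        chr(b) if b < 0x80 else table.get(b, "\ufffd") for b in raw
    )


if __name__ == "__main__":
    mapfile = """
    # byte  code point
    0xF1    0x00F1   # n with tilde
    0xE7    0x00E7   # c with cedilla
    """
    table = parse_mapfile(mapfile)
    print(decode_with_map(b"espa\xf1ol.ct", table))
    print(decode_with_map(b"fran\xe7ais.ct", table))
```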

@polluks
Contributor

polluks commented Dec 18, 2023

Indeed, Latin1 looks strange

...
[generic]                  909    2192  41.5% -lh5- 651e Nov 24  2018 AmiArcadia/Source/generic/espa�Ðl.ct
[generic]                  935    2225  42.0% -lh5- 3231 Nov 24  2018 AmiArcadia/Source/generic/franíÂis.ct
...

http://aminet.net/package/misc/emu/AmiArcadiaMOS
