Should add flags for filename encodings #43

fragglet · 2023-03-29T15:42:31Z

The code currently does no translation of filename encodings, and there are a variety of ways that filenames can be encoded. In particular, Shift-JIS and EUC support are important since the lha format is/was very popular in Japan. These unfortunately will need to be specified manually since, as far as I know, there is no way to detect the encodings. We should internally translate everything to UTF-8.

Some extended ASCII formats can be reasonably autodetected based on the OS field: for example, CP437 is probably a sensible default for DOS archives (or the system codepage when running on Windows), and Mac Extended ASCII for macOS archives. If the encoding cannot be determined then non-ASCII characters should become the Unicode replacement character (U+FFFD).
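To make the proposal concrete, here is a rough sketch in Python (not lhasa's C code) of the decoding policy described above: an explicit user-supplied encoding wins, otherwise a default is picked from the header's OS field, and if neither is available the non-ASCII bytes become U+FFFD. The function name, the `override` parameter and the table are hypothetical, and the OS identifier letters (`"M"` for MS-DOS, `"m"` for classic Mac OS) are an assumption about the header format. `mac_roman` is used as a stand-in for "Mac Extended ASCII".

```python
from typing import Optional

# Hypothetical defaults keyed by the OS identifier from the archive header.
# The letters used here are assumptions, not taken from the lhasa source.
DEFAULT_ENCODINGS = {
    "M": "cp437",      # MS-DOS archives
    "m": "mac_roman",  # classic Mac OS archives
}


def decode_filename(raw: bytes, os_id: str, override: Optional[str] = None) -> str:
    """Decode a raw header filename into a Python (Unicode) string.

    An explicit override (e.g. from a command line flag) takes priority,
    since encodings such as Shift-JIS cannot be detected reliably.
    """
    encoding = override or DEFAULT_ENCODINGS.get(os_id)
    if encoding is None:
        # Unknown encoding: keep ASCII, turn every other byte into U+FFFD.
        return raw.decode("ascii", errors="replace")
    # Bytes that are invalid in the chosen encoding also become U+FFFD.
    return raw.decode(encoding, errors="replace")


if __name__ == "__main__":
    print(decode_filename(b"\x81ber.txt", "M"))
    print(decode_filename("日本語.txt".encode("shift_jis"), "M", override="shift_jis"))
```

The result would then be written out as UTF-8 at the output boundary. Whether the Windows system codepage should replace CP437 as the DOS default is left out of this sketch.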

Once this is done we can relax the current "safe print" code, although it's still important never to print a terminal escape character or anything in the C0/C1 control character ranges (and probably the Specials range too).
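As an illustration of what a relaxed "safe print" filter might look like once filenames are decoded to Unicode, here is a minimal Python sketch (again, not lhasa's actual code). It replaces C0 controls (U+0000 to U+001F, which includes ESC), DEL, C1 controls (U+0080 to U+009F) and the Specials block (U+FFF0 to U+FFFF) with a placeholder. The function name and the `?` placeholder are hypothetical, and exempting U+FFFD from the Specials check is my own assumption, made so that the marker for undecodable bytes stays visible.

```python
def sanitize_for_terminal(name: str, placeholder: str = "?") -> str:
    """Replace characters that should never reach a terminal."""
    out = []
    for ch in name:
        cp = ord(ch)
        # C0 controls (including ESC, 0x1B), DEL, and C1 controls.
        is_control = cp <= 0x1F or 0x7F <= cp <= 0x9F
        # Specials block, except U+FFFD (assumption: keep it printable).
        is_special = 0xFFF0 <= cp <= 0xFFFF and cp != 0xFFFD
        out.append(placeholder if is_control or is_special else ch)
    return "".join(out)


if __name__ == "__main__":
    # The escape byte is neutralised; the rest of the sequence is harmless text.
    print(sanitize_for_terminal("evil\x1b[31m.txt"))
```

Note that the filter has to run on decoded code points rather than raw bytes, otherwise legitimate multi-byte UTF-8 sequences would be mangled.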

gryf · 2023-10-22T16:38:03Z

Also, lha has been popular on AmigaOS. The default encoding seems to be Latin1, although there are different mappings for some countries which don't fit neatly into Latin1. I guess auto detection for corner cases could be difficult, if not impossible. Perhaps an external mapfile passed as a command line option could help in such situations, so that lhasa doesn't need to make assumptions.
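To sketch the mapfile idea: the Python below assumes a purely hypothetical file format where each line holds a byte value and a Unicode code point in hex (`0xF1 0x00F1`), with `#` starting a comment. Nothing like this exists in lhasa today, and the function names are made up for illustration. ASCII bytes map to themselves unless overridden, and unmapped high bytes become U+FFFD.

```python
from typing import Dict


def parse_mapfile(text: str) -> Dict[int, str]:
    """Parse lines of the form '0xNN 0xUUUU  # comment' into a byte->char table."""
    table: Dict[int, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # A malformed line raises ValueError; a real tool would report it.
        byte_field, codepoint_field = line.split()[:2]
        table[int(byte_field, 16)] = chr(int(codepoint_field, 16))
    return table


def decode_with_map(raw: bytes, table: Dict[int, str]) -> str:
    """Decode bytes using the table; unmapped non-ASCII bytes become U+FFFD."""
    return "".join(
        table.get(b, chr(b) if b < 0x80 else "\ufffd") for b in raw
    )


if __name__ == "__main__":
    mapping = parse_mapfile("0xF1 0x00F1  # n with tilde\n0xE7 0x00E7\n")
    print(decode_with_map(b"espa\xf1ol.ct", mapping))
```

This only covers single-byte encodings, which seems to be the Amiga case being discussed. Multi-byte encodings would still need a proper codec.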

polluks · 2023-12-18T10:25:12Z

Indeed, Latin1 looks strange:

...
[generic]                  909    2192  41.5% -lh5- 651e Nov 24  2018 AmiArcadia/Source/generic/espa�Ðl.ct
[generic]                  935    2225  42.0% -lh5- 3231 Nov 24  2018 AmiArcadia/Source/generic/franíÂis.ct
...

http://aminet.net/package/misc/emu/AmiArcadiaMOS
