Change from English names of languages to ISO 639 for internal recognition of languages #807

Open
PonPonTheBonBon opened this issue Jan 20, 2024 · 1 comment

@PonPonTheBonBon
Contributor

As of now, all translation-files of UltraStar use the English names of the languages (except Basque written as "Euskara"), and for the language-covers to work in UltraStar, the language of the txt-files must be written in English (except Colognian written as "Kölsch"). So I suggest that these two, and any other reference to languages, are written in the 2-letter ISO 639-code if available, otherwise the 3-letter ISO 639-code or other appropriate code.

Currently, when changing the language of UltraStar, the languages are ordered by the names of the files, which causes "Deutsch" to be sorted as G (German) and "Español" as S (Spanish). If the files are instead named by their ISO 639 alpha-2 codes, "Deutsch" sorts as D (de) and "Español" sorts as E (es). While an improvement, it isn't perfect, since "Suomi" will still sort as F (fi) and "日本語" (nihongo) as J (ja), but I would argue it's better than how it is now.
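To illustrate the difference in ordering, here is a minimal Python sketch (UltraStar itself is not written in Python, and the data structure and function names are hypothetical). It only compares sorting by English file name with sorting by ISO 639 code for the four languages mentioned above.

```python
# Hypothetical (ISO 639 code, English file name, native name) triples,
# limited to the four languages used as examples in this issue.
LANGUAGES = [
    ("de", "German", "Deutsch"),
    ("es", "Spanish", "Español"),
    ("fi", "Finnish", "Suomi"),
    ("ja", "Japanese", "日本語"),
]


def order_by_english_name(langs):
    """Current behaviour as described above: sort by the English file name."""
    return [native for _, _, native in sorted(langs, key=lambda lang: lang[1])]


def order_by_code(langs):
    """Proposed behaviour: sort by the ISO 639 code used as the file name."""
    return [native for _, _, native in sorted(langs, key=lambda lang: lang[0])]


print(order_by_english_name(LANGUAGES))  # Finnish, German, Japanese, Spanish
print(order_by_code(LANGUAGES))          # de, es, fi, ja
```

With the code-based order, "Deutsch" and "Español" land where a user reading the native names would expect them, while "Suomi" and "日本語" stay under F and J.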

Currently, when grouping songs by language, the languages are listed in English regardless of the language chosen for UltraStar, and when editing a song in UltraStar, the language is also listed in English. For the txt-files, instead of writing #LANGUAGE:English, you could write #LANGUAGE:en, or perhaps better #LANG:en to save 4 more bytes, and treat "LANG" as the attribute that takes ISO 639 codes to avoid confusion with older versions. The language-files for UltraStar would then include a list of names of languages, for example LANG_EL=Greek, LANG_EN=English, LANG_ES=Spanish. UltraStar can then show the correct cover while still displaying the languages in the user's chosen language. The language of the song could also be shown when selecting a song to sing, since the language of the song is a major factor in karaoke. If a song lacks the ISO 639-code and instead has it written as #LANGUAGE:English, UltraStar can treat this as an alias for en, for compatibility with older files.
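As a rough sketch of how the header could be read, the following Python snippet accepts both the proposed #LANG attribute and the legacy #LANGUAGE attribute, and looks up a display name from LANG_xx keys. Everything here is an assumption for illustration: the function names, the small alias table, and the fallback to the bare code are not part of UltraStar.

```python
# Hypothetical alias table from legacy English names (as written in old
# txt-files) to ISO 639 codes. Only a few entries are listed as examples.
LEGACY_NAME_TO_CODE = {
    "english": "en",
    "greek": "el",
    "spanish": "es",
}


def parse_language_code(line):
    """Return the ISO 639 code for a #LANG or #LANGUAGE line, else None."""
    if not line.startswith("#") or ":" not in line:
        return None
    tag, _, value = line[1:].partition(":")
    tag = tag.strip().upper()
    value = value.strip()
    if not value:
        return None
    if tag == "LANG":
        # Proposed attribute: the value is already an ISO 639 code.
        return value.lower()
    if tag == "LANGUAGE":
        # Legacy attribute: treat the English name as an alias for the code.
        return LEGACY_NAME_TO_CODE.get(value.lower())
    return None


def display_name(code, translations):
    """Look up LANG_<CODE> in the user's language-file, else show the code."""
    return translations.get("LANG_" + code.upper(), code)


# Example entries as they might appear in an English language-file.
english_file = {"LANG_EL": "Greek", "LANG_EN": "English", "LANG_ES": "Spanish"}
print(display_name(parse_language_code("#LANGUAGE:English"), english_file))
```

Keeping the alias lookup separate from the display lookup means old files keep working while the displayed name always comes from the user's chosen language-file.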

Since songs can include more than one language, the languages can be separated by commas, such as #LANG:en,es. It would display in UltraStar as "English, Spanish", showing that English is the primary language and that there is some Spanish in the song as well. When grouping songs by language, the user could be given an option for how to treat multilingual songs, for example: 1. duplicate the song in each language it features, 2. only group it in its primary language, 3. treat the combined languages as a new language.
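The three grouping options could look roughly like this in Python. This is only a sketch: the mode names "duplicate", "primary" and "combined" and both function names are made up for the example.

```python
from collections import defaultdict


def split_languages(value):
    """Split a value such as 'en,es' into codes, primary language first."""
    return [code.strip().lower() for code in value.split(",") if code.strip()]


def group_songs(songs, mode):
    """Group song titles by language.

    songs maps a title to its list of codes (primary first).
    mode is one of the hypothetical options:
      'duplicate' - list the song under every language it features
      'primary'   - list the song only under its primary language
      'combined'  - treat the combination (e.g. 'en,es') as its own group
    """
    groups = defaultdict(list)
    for title, codes in songs.items():
        if not codes:
            continue  # songs without a language are left ungrouped here
        if mode == "duplicate":
            keys = codes
        elif mode == "primary":
            keys = codes[:1]
        elif mode == "combined":
            keys = [",".join(codes)]
        else:
            raise ValueError(f"unknown grouping mode: {mode}")
        for key in keys:
            groups[key].append(title)
    return dict(groups)


songs = {
    "Bilingual song": split_languages("en,es"),
    "English song": split_languages("en"),
}
print(group_songs(songs, "duplicate"))
```

Because the primary language is simply the first code in the list, option 2 needs no extra data in the txt-file.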

I do understand that UltraStar has existed for a long time and that this would be a big change. But websites like UltraStar DataBase, which already modify uploaded lyrics-files, could update the field for language to the correct ISO 639-code, and other databases could do similarly. In addition, UltraStar can still support the old method of using English names of languages for backwards compatibility, while the ISO 639-codes would be the preferred attribute moving forward.
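A database that wants to update existing files could do something along these lines. This Python sketch is an assumption about how such a migration might work, not how UltraStar DataBase actually processes files. It works on already-decoded header lines and leaves unknown language names untouched.

```python
# Hypothetical, deliberately incomplete mapping from legacy names to codes.
LEGACY_NAMES = {"english": "en", "spanish": "es", "greek": "el"}


def migrate_header_lines(lines):
    """Rewrite '#LANGUAGE:<English name>' lines to '#LANG:<code>'."""
    migrated = []
    for line in lines:
        tag, sep, value = line.partition(":")
        code = LEGACY_NAMES.get(value.strip().lower())
        if sep and tag.strip().upper() == "#LANGUAGE" and code:
            migrated.append(f"#LANG:{code}")
        else:
            # Not a language line, or a name we cannot map: keep as-is.
            migrated.append(line)
    return migrated


print(migrate_header_lines(["#TITLE:Song", "#LANGUAGE:English"]))
```

Leaving unmapped names as they are keeps the old attribute readable by versions of UltraStar that only understand English names.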

Summary

  • Rename the translation-files to the 2-letter-codes.
  • Rename language-based covers to the 2-letter-codes.
  • Allow and prefer using ISO 639-codes in the lyrics-files.
  • Allow more than one language in the file for songs with multiple languages.
  • Display all references to languages in UltraStar in the user's language (except for the option of switching language).
  • Display the language of the song when picking a song in the user's language.

Table of languages in UltraStar

Language ISO 639 Transl. Cover Note
Austrian de-AT X Isn't this an odd inclusion?
Bavarian bar X
Catalan ca X
Chinese zh X X
Croatian hr X
Czech cs X
Danish da X X
Dutch nl X X
English en X X
Euskara eu X "Basque" is used by CLDR
Finnish fi X X
French fr X X
Gaelic gd X "Scottish Gaelic" is used by CLDR
Galician gl X
German de X X
Greek el X X
Hungarian hu X
Icelandic is X
Italian it X X
Japanese ja X X
Kölsch ksh X "Colognian" is used by CLDR
Luxembourgish lb X
Norwegian no X X Perhaps 'nb' and 'nn' should be used instead?
Peruvian es-PE X Isn't this an odd inclusion?
Polish pl X X
Portuguese pt X
Romanian ro X X
Russian ru X X
Serbian sr X
Slovak sk X
Slovenian sl X / The cover is using the flag of Russia
Spanish es X X
Swedish sv X X
Turkish tr X
@PonPonTheBonBon
Contributor Author

I was bored and wrote the translations and syntax for all languages in UltraStar, and the translations can be found here: https://pastebin.com/x0Pxr5gW

If any additional languages are requested to be added, I am willing to help. These translations are taken from CLDR, which has free data to be used, otherwise from Wikipedia, Wiktionary or Wikidata, or given a very close guess by me (mostly related to Bavarian, Austrian and Peruvian).

@PonPonTheBonBon changed the title from "Move from using English names of languages to ISO 639" to "Change from English names of languages to ISO 639 for internal recognition of languages" on Feb 9, 2024