Why is there no Intl.Locale.prototype.variants? #900
Comments
@zbraniecki Thoughts on this? |
TG2 discussion: https://github.com/tc39/ecma402/blob/main/meetings/notes-2024-11-25.md#why-is-there-no-intllocaleprototypevariants-900 There were questions about motivation (most use cases for variants are better served by a corresponding Unicode extension keyword), as well as the shape of this getter (does it return a list? is the list sorted? or does it return a string with multiple subtags?) |
I see, thanks for having the discussion! Since when is the variants part in CLDR "deprecated" though? I sadly can't remember my exact use-case and should have included it in my original post, but I think it was about two things:
The latter got added to the IANA language subtag registry in March this year. I know that isn't CLDR, but I was under the impression that this file is the "source of truth" for registered language subtags used in CLDR and everything else. I also don't see any kind of "deprecation" of variants here: https://www.unicode.org/reports/tr35 Regarding the type of an eventual I don't know what
is referencing and I don't see the difficulty of parsing variants. They can only ever be 5-8 long alphanumeric strings and they can only be followed by extensions and private use tags, so what's wrong with While we're at it, I don't see a reason why |
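To make the parsing point concrete, here is a minimal sketch rather than a definitive implementation. In the BCP 47 and UTS 35 grammars a variant subtag is either 5 to 8 alphanumerics or exactly 4 characters starting with a digit (which is how `1994` qualifies), and variants come after the optional script and region and before any single-character extension or private-use marker. The helper name `extractVariants` is hypothetical and not part of `Intl`; it assumes a well-formed tag with no extlang subtags.

```javascript
// Hypothetical helper, not part of Intl: returns the variant subtags of a
// well-formed language tag, lowercased, in the order they were written.
// Assumption: the tag has no extlang subtags (as in Unicode locale identifiers).
const VARIANT_SUBTAG = /^(?:[0-9a-z]{5,8}|[0-9][0-9a-z]{3})$/;

function extractVariants(tag) {
  const subtags = tag.toLowerCase().split("-");
  let i = 1; // subtags[0] is the language subtag

  // Optional script: exactly four letters.
  if (/^[a-z]{4}$/.test(subtags[i] ?? "")) i++;
  // Optional region: two letters or three digits.
  if (/^(?:[a-z]{2}|[0-9]{3})$/.test(subtags[i] ?? "")) i++;

  // Collect variants until an extension or private-use singleton
  // (or the end of the tag) is reached.
  const variants = [];
  while (i < subtags.length && VARIANT_SUBTAG.test(subtags[i])) {
    variants.push(subtags[i]);
    i++;
  }
  return variants;
}

console.log(extractVariants("sl-IT-rozaj-biske-1994"));
console.log(extractVariants("en-US-u-ca-gregory"));
```

The first call yields the three variants in their written order; the second yields an empty array because the region is followed directly by an extension singleton.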
A little more context: I think people on the call were referring to variants as "legacy" or "deprecated" because of the following issues:
In other words, the comments from the discussion were based on the point of view that variants are basically a grab bag of things that would be better expressed as more specific locale extensions. Personally, I still think variants are motivated because they remain the standard way of expressing orthographies. Something like
Returning an ECMAScript |
I don't think it would, because ECMAScript Set instances are deterministically ordered. |
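A short illustration of that ordering guarantee, using made-up input values: a `Set` iterates in insertion order, so converting it to an array gives the same sequence every time.

```javascript
// A Set iterates in insertion order and silently drops repeated values.
const variantSet = new Set(["rozaj", "biske", "1994", "biske"]);
const asArray = [...variantSet];

console.log(asArray); // [ 'rozaj', 'biske', '1994' ]
console.log(variantSet.has("1994")); // true
```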
Yeah, I mainly suggested the |
Regarding the ordering of variants: I don't really think the array or set needs to be ordered in any specific way other than "the same as supplied". Both CLDR and IANA, if I understand correctly, just define a recommended or canonical way to order them in the context of language subtags, not in the context of JavaScript arrays. And AFAIK implementers need to be able to understand any ordering. One could even argue that it's more expected if the ordering is the same as the user specified it, even if it's "wrong". So in general, the ordering, of all things, shouldn't be the blocker here. |
BCP47 suggests that the order in the original language tag carries meaning, with earlier subtags subordinating later ones. Specifically, item 6 under section 4.1 (Choice of Language Tag) says:
This means that the order should be preserved when there are two or more (and presuming, for the moment, that the tag's author has paid attention to the details in the registry as well as the text just above). Unicode/CLDR says some different things about the ordering. In practice, the variants are only useful in very specific applications, most of which have nothing to do with locales. In either case, the original order affects tag matching using one of the fallback schemes, and so should probably be preserved by |
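One way to see how order can affect matching is the "lookup" scheme in RFC 4647, which progressively truncates a tag from the end. The sketch below is a simplified illustration under that assumption; `lookupFallbackChain` is a hypothetical name, and it only lists the candidate tags without matching them against anything.

```javascript
// Hypothetical helper: the progressively truncated tags that an
// RFC 4647-style lookup would try, most specific first.
function lookupFallbackChain(tag) {
  const chain = [];
  let subtags = tag.split("-");
  while (subtags.length > 0) {
    chain.push(subtags.join("-"));
    subtags = subtags.slice(0, -1);
    // A single-character subtag (extension or private-use marker) is
    // never left dangling at the end of a candidate.
    while (subtags.length > 0 && subtags[subtags.length - 1].length === 1) {
      subtags = subtags.slice(0, -1);
    }
  }
  return chain;
}

console.log(lookupFallbackChain("sl-IT-rozaj-biske-1994"));
console.log(lookupFallbackChain("sl-IT-1994-biske-rozaj"));
```

With the original order the chain passes through `sl-IT-rozaj`; with the variants reordered it passes through `sl-IT-1994` instead, so the two spellings fall back to different intermediate tags.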
Unfortunately I believe the ordering is one of the main issues that needs to be resolved. We have two specs, IETF and UTS 35, which disagree on the ordering (preserved or alphabetical). ECMA-402 mostly follows Unicode's reckoning of locale identifiers, so it would follow that variants should be alphabetical. However, variants are most useful in IETF's reckoning, where order is preserved. What currently happens with variant ordering in Intl.Locale.prototype.toString? Can we follow that? |
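For illustration, the "alphabetical" option can be sketched as a plain sort of the lowercased variant subtags. This is only a sketch of the idea, with a hypothetical helper name `sortVariants`; it assumes that a default code-unit comparison is what "alphabetical" means here, and it does not claim to reproduce any engine's canonicalization.

```javascript
// Hypothetical helper: lowercase the variants and sort them with the
// default Array.prototype.sort comparison (UTF-16 code unit order).
// Digits sort before letters, and numbers are compared as text.
const sortVariants = (variants) =>
  [...variants].map((variant) => variant.toLowerCase()).sort();

console.log(sortVariants(["rozaj", "biske", "1994"]));
// [ '1994', 'biske', 'rozaj' ]
console.log(sortVariants(["bcdefg", "abcdefg", "12345", "1000000", "1996"]));
// [ '1000000', '12345', '1996', 'abcdefg', 'bcdefg' ]
```

Note that `1000000` sorts before `12345` because the comparison is textual, not numeric.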
The current reality in Chrome at least is this:
(new Intl.Locale("de-bcdefg-abcdefg-12345-1000000-1996")).toString() === "de-1000000-12345-1996-abcdefg-bcdefg"
(new Intl.Locale("sl-IT-rozaj-biske-1994")).toString() === "sl-IT-1994-biske-rozaj"
So it's basically just alphabetic with no special numeric handling ("1000000" ≺ "12345" ∧ "12345" ≺ "abcdefg" ∧ "abcdefg" ≺ "bcdefg"). I couldn't find anything about ordering here or anywhere else in ECMA402, so I guess
Another resource that says basically the same as that BCP47 section and what @sffc has said: https://www.w3.org/International/questions/qa-choosing-language-tags#variants
Both that BCP47 section and that W3C link claim that the ordering of variants helps with interoperability, but they don't get more specific, so I'm really unsure if this is actually the case. Like, is there any application out there that would completely break down if I give it a
Either way, I get that the ordering is important within language tag strings. But this would be addressed by fixing
const describeBookLanguage = (bookName, locale) => {
const prefix = `${bookName} is written in`;
let languageLabel = "Sanskrit";
if (locale.language === "sa" && locale.variants.length !== 0) {
// don't care about the order here
if (locale.variants.includes("itihasa")) {
languageLabel = `Epic ${languageLabel}`;
}
if (locale.variants.includes("bauddha")) {
languageLabel = `Buddhist Hybrid ${languageLabel}`;
// "Buddhist Hybrid Epic Sanskrit" is technically possible here but probably not real
}
}
else if (locale.language === "cls") {
languageLabel = `Classical ${languageLabel}`;
}
else if (locale.language === "vsn") {
languageLabel = `Vedic ${languageLabel}`;
}
return `${prefix} ${languageLabel}.`;
}
Or is it expected that something like the below should also work?
const firstPart = "sl-IT";
const secondPart = "rozaj-biske-1994";
const localeString1 = `${firstPart}-${secondPart}`; // "sl-IT-rozaj-biske-1994";
const locale = new Intl.Locale(localeString1);
const localeString2 = locale.toString(); // "sl-IT-1994-biske-rozaj";
const variants = locale.variants; // ["1994", "biske", "rozaj"]
const localeString3 = `${firstPart}-${variants.join("-")}`; // "sl-IT-1994-biske-rozaj"
const allSame = localeString1 === localeString2 && localeString2 === localeString3; // false, but should this be true?
The below would also be an issue if one expects a specific ordering of variants, but again, I don't think that expectation exists.
const slovenianVariantDescriptionParts = new Map([
["rozaj", "Resian"],
["biske", ", San Giorgio dialect"],
["lipaw", ", Lipovaz dialect"],
["njiva", ", Gniva dialect"],
["osojs", ", Oseacco dialect"],
["solba", ", Stolvizza dialect"],
["1994", ", in standardized 1994 orthography"]
]);
const describeSlovenianLanguageUsed = (locale) => {
if (locale.language !== "sl") {
throw new Error("Not Slovenian");
}
if (locale.variants.length === 0 || !locale.variants.includes("rozaj")) {
return "Slovenian";
}
return locale.variants
.map(variant => slovenianVariantDescriptionParts.get(variant))
.join("");
// depending on the order of variants, this could result in:
// - "Resian, San Giorgio dialect, in standardized 1994 orthography" ✅
// - ", Gniva dialect, in standardized 1994 orthographyResian" ❌
// - ", in standardized 1994 orthographyResian" ❌
// - ", Stolvizza dialectResian" ❌
};
I also still don't agree with this sentiment:
Again, maybe I'm misunderstanding something, and I'm not saying this is the most important thing ever, but I also wouldn't dismiss variants as something "deprecated" or "only useful in very specific applications, most of which have nothing to do with locales". So all in all, I think the ordering of |
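As a sketch of how consumer code can avoid depending on variant order at all, the description can be assembled in the order of the lookup table rather than the order of the input. The function name `describeResianVariants` is hypothetical, and it takes a plain array of variant strings so that it does not assume any particular shape for a `variants` getter.

```javascript
// Description fragments, listed in the order they should appear in the output.
const resianDescriptionParts = new Map([
  ["rozaj", "Resian"],
  ["biske", ", San Giorgio dialect"],
  ["lipaw", ", Lipovaz dialect"],
  ["njiva", ", Gniva dialect"],
  ["osojs", ", Oseacco dialect"],
  ["solba", ", Stolvizza dialect"],
  ["1994", ", in standardized 1994 orthography"],
]);

// Hypothetical helper: iterate over the table, not over the input, so the
// result is the same whatever order the variants arrive in.
const describeResianVariants = (variants) => {
  const present = new Set(variants);
  if (!present.has("rozaj")) {
    return "Slovenian";
  }
  return [...resianDescriptionParts]
    .filter(([variant]) => present.has(variant))
    .map(([, description]) => description)
    .join("");
};

console.log(describeResianVariants(["rozaj", "biske", "1994"]));
console.log(describeResianVariants(["1994", "biske", "rozaj"]));
// Both print: Resian, San Giorgio dialect, in standardized 1994 orthography
```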
For reference, how the ordering issue has been "solved" in a past issue: #330 (comment) |
Sorry for the triple comment, but here is a quote from UTS 35, which ECMA402 follows, as far as I understand now:
|
For the record, it is defined:
|
Ok, so |
@sffc That seems logical to me, with the only addition that it might need to be |
Why is there no Intl.Locale.prototype.variants? There are getters for language, region and script, but I saw no information about the reason variants is missing or shouldn't be there as well.