Add PLURAL support to TranslateWiki integration #77
Comments
@verdy-p if you have time, we would love your help with refactoring the internationalisation, including the PLURAL possibilities of TranslateWiki. |
@mzeinstra are there any specific messages for which PLURAL is needed? One example:
|
That is a good question. I presume that the strings that handle days have priority, as they are presented as numbers (first, second of the month). I'll start a conversation with our French- and German-speaking project members to see if they think that some of them need to have pluralisation enabled. |
There's no case in French where you need a plural for ordinal numbers of
days of months (and the only ordinal needing it in abbreviated form with
digits is "1<sup>er</sup>", or "premier" if spelled out, for the
first day of the month; all other days use cardinal numbers).
On Tue, 13 Dec 2022 at 09:26, Maarten Zeinstra wrote:
… That is a good question.
I presume that those strings that handle days have the priority. As they
are presented as numbers (first, second of month).
I'll start a conversation with our French and German speaking project
members to see if they think that some of them need to have pluralisation
enabled.
|
@verdy-p It would be helpful to see which of these needs to be plural: Or are you only talking about the numbering of the months that need to differentiate between 1st, 2nd, etc.?
|
That's exactly it. In French only "1er" is distinguished from the other
day numbers, which use the cardinal form:
"Le 1er, le 2 et le 31" (Le premier, le deux et le trente-et-un)
"Du 1er au 31 décembre" (Du premier au trente-et-un décembre)
Note that "Depuis un début inconnu jusqu’à $1," causes a problem when we
don't know the value of the date in $1:
- if there's a day number, it becomes "Depuis un début inconnu jusqu’au
$1," (e.g. "Depuis un début inconnu jusqu’au 1er décembre 2022," or "Depuis
un début inconnu jusqu’au 31 décembre 2022,")
- if there's no day number, it remains "Depuis un début inconnu jusqu’à
$1," (e.g. "Depuis un début inconnu jusqu’à décembre 2022,")
So the precision of the date determines the form of the preposition to be
used ("à" vs. "au", the latter being the mandatory contraction of the
preposition "à" with the following article "le"; the same occurs with "de"
vs. "du", the latter being the mandatory contraction of the preposition "de"
with the following article "le"), because there's an article "le" only with
known day numbers (or known weekday names where the day number is implied).
This problem also affects all your listed translations containing $1,
except those implying only a month or year precision. It does not affect
the translation "$3 $2 $1" for the date format ("daynumber
monthname yearnumber"), because an article is not needed where it occurs in
isolation (but it becomes needed inside sentences or after any preposition).
Unfortunately, there's still no parameter to specify the precision of
the date given in $1.
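The preposition choice described above could be sketched roughly as follows, assuming a precision value were passed along with the formatted date. No such parameter exists today; the `Precision` enum and the function name are hypothetical and only illustrate the rule. Only the day and month cases from the examples above are covered.

```python
from enum import Enum


class Precision(Enum):
    """Hypothetical precision marker for the date substituted into $1."""
    MONTH = "month"
    DAY = "day"


def until_unknown_start_fr(formatted_date: str, precision: Precision) -> str:
    """Render "from an unknown start until $1" in French.

    A date with a known day number takes the article "le", and
    "à" + "le" must contract to "au"; without a day number there is
    no article, so the preposition stays "à".
    """
    preposition = "jusqu’au" if precision is Precision.DAY else "jusqu’à"
    return f"Depuis un début inconnu {preposition} {formatted_date},"


print(until_unknown_start_fr("1er décembre 2022", Precision.DAY))
print(until_unknown_start_fr("décembre 2022", Precision.MONTH))
```

The point of the sketch is that the message itself cannot make this choice: the caller has to know the precision of the value before the right template can be selected.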
On Tue, 13 Dec 2022 at 15:49, Maarten Zeinstra wrote:
… @verdy-p <https://github.com/verdy-p> It would be helpful to see which of
these needs to be plural:
Or are you only talking about the numbering of the months that need to
differentiate between 1st, 2nd, etc.?
Autour du $1 (incertain),
Autour du $1,
$1 (incertain),
Printemps,
Été,
Automne,
Hiver,
Printemps (hémisphère nord),
Été (hémisphère nord),
Automne (hémisphère nord),
Hiver (hémisphère nord),
Printemps (hémisphère sud),
Été (hémisphère sud),
Automne (hémisphère sud),
Hiver (hémisphère sud),
Premier trimestre,
Deuxième trimestre,
Troisième trimestre,
Quatrième trimestre,
Premier quadrimestre,
Deuxième quatrimestre,
Troisième quadrimestre,
Premier semestre,
Second semestre,
$3 $2 $1,
$1 $2,
jour $1 d’un mois inconnu de $2,
$1 avant J.-C.,
Année $1 avant J.-C.,
Année $1,
janvier,
février,
mars,
avril,
mai,
juin,
juillet,
août,
septembre,
octobre,
novembre,
décembre,
De $1 à $2,
Depuis $1 (fin indéterminée),
Jusqu’à $1,
Depuis $1 jusqu’à une fin inconnue,
Depuis un début inconnu jusqu’à $1,
heure locale,
Ensemble vide,
Tous celles-ci :,
Une de celles-ci :,
L’année $1 et toutes les précédentes,
L’année $1 ou la précédente,
L’année $1 et toutes les suivantes,
L’année $1 ou la suivante,
$1 et tous les mois précédents,
$1 et tous les mois suivants,
$1 ou le mois précédent,
$1 ou le mois suivant,
$1 et toutes les dates précédentes,
$1 et toutes les dates suivantes,
$1 ou une date précédente,
$1 ou une date suivante,
$1 ou une saison précédente,
$1 ou une saison suivante,
$1 et toutes les saisons précédentes,
$1 et toutes les saisons suivantes,
$1 et $2,
$1 ou $2,
Tous celles-ci : $1,
Une de celles-ci : $1,
Toutes les années de $1 à $2,
Tous les mois de $1 à $2,
Tous les jours du $1 au $2,
$1, $2 ou une année entre les deux,
$1, $2 ou un mois entre les deux,
$1, $2 ou un jour entre les deux
|
Also, none of these examples needs a distinct plural form; they all
use the singular, independently of the day number, month, or year number.
|
Note that we also use an article « le » for season names (they are all
singular in these formats) but possibly changed into « l’ » before a vowel
or mute h:
"le printemps, l’été, l’automne, l’hiver" (so this depends on the season,
they all have a masculine grammatical gender, so there's no "la" article)
After a preposition "de" or "à" (including "jusqu’à" which mandatorily
contracts "jusque" with a following "à"), this article "le" contracts to
"du" or "au", but not when it is muted with an apostrophe:
- from the season: "du printemps, de l’été, de l’automne, de l’hiver"
- to the season: "au printemps, à l’été, à l’automne, à l’hiver"
(alternative: "jusqu’au printemps, jusqu’à l’été, jusqu’à l’automne,
jusqu’à l’hiver")
- in/during a season: "au printemps, en été, en automne, en hiver"
A plural would occur only with weekday names ("lundi(s), mardi(s),
..., dimanche(s)"), season names ("printemps, été(s), automne(s),
hiver(s)"; note that "printemps" is invariable), quarter names
("trimestre(s)") and half-year names ("semestre(s)"), and only over a
larger period (e.g. a given month name over a range of years, or a given
weekday name over a full month):
- all Mondays in March: "tous les lundis en mars"
- all winters in years 2000 to 2022: "tous les hivers de 2000 à 2022"
Month names ("janvier, ..., décembre") are invariably singular; for the
plural meaning we'd expand them to "les mois de janvier, ..., les mois de
décembre".
But this case does not occur in the existing set of translations.
So in summary, in French, there cannot be plurals for month names, but
there can be distinct plurals for weekday names, season names,
quarter names ("trimestre(s)"), third-year names ("quadrimestre(s)") and
half-year names ("semestre(s)"), though only in specific cases still not
covered in EDTF.
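The contraction rules for season names listed above can be sketched as a small lookup. This is only an illustration: the function name is made up, and the elision test is hard-coded for the four season names rather than being a general French rule.

```python
# Seasons whose article "le" is elided to "l’" (initial vowel or mute h).
ELIDED_SEASONS = {"été", "automne", "hiver"}

# Mandatory contractions of a preposition with a following article "le".
CONTRACTIONS = {"de": "du", "à": "au", "jusqu’à": "jusqu’au"}


def season_with_preposition(season: str, preposition: str) -> str:
    """Combine "de", "à" or "jusqu’à" with a French season name."""
    if season in ELIDED_SEASONS:
        # The elided article "l’" never contracts with the preposition.
        return f"{preposition} l’{season}"
    # "le printemps": preposition and article contract ("de" + "le" -> "du").
    return f"{CONTRACTIONS[preposition]} {season}"


for name in ("printemps", "été", "automne", "hiver"):
    print(season_with_preposition(name, "de"), "/", season_with_preposition(name, "jusqu’à"))
```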
|
Thank you @verdy-p for providing additional context. To make this as concrete as possible for the developers, would you mind going to https://edtf.wikibase.wiki/wiki/Property:P1 (switch the language to French in the upper right corner) to see which of the humanisations need pluralisation for French? Really concrete examples help the developers focus on the parts of the tool that improve the humanisation of the strings. For example, is this a good humanisation with pluralisation?
|
@mzeinstra translators are now able to use PLURAL in their translations (code change). Is there anything else that needs to happen before we can close this task? |
@JeroenDeDauw is this also available on Translatewiki? If so, we'll ask our team members to have a look at it for French. Is it also possible to document the new feature in the main and on the relevant page on translatewiki? |
Yes, translators on TW can now use PLURAL in their translations. Note, though, that for various reasons the original ordinal suffixing (i.e. 1st, 2nd, 3rd, etc.) is still used, so translators should not use PLURAL for these values. What is the relevant page on TW? |
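For readers unfamiliar with the syntax, translatewiki messages use the MediaWiki form `{{PLURAL:$1|singular|plural}}`. The following is a minimal sketch of how such a message could be resolved. It is not the library's actual implementation: the function names and the example messages are hypothetical, and the plural rules are simplified to two forms (French treats 0 and 1 as singular, English only 1).

```python
import re

# Matches {{PLURAL:$n|form|form...}} where n is a parameter index.
PLURAL_PATTERN = re.compile(r"\{\{PLURAL:\$(\d+)\|([^{}]*)\}\}")
PARAM_PATTERN = re.compile(r"\$(\d+)")


def is_singular(count: int, language: str) -> bool:
    """Simplified two-form plural rule (assumption, not full CLDR rules)."""
    if language == "fr":
        return count in (0, 1)
    return count == 1


def resolve_message(message: str, params: list, language: str) -> str:
    """Resolve PLURAL blocks first, then substitute $1, $2, ... parameters."""

    def pick_form(match: re.Match) -> str:
        count = params[int(match.group(1)) - 1]
        forms = match.group(2).split("|")
        # Fall back to the last available form when no plural form is given.
        return forms[0] if is_singular(count, language) else forms[-1]

    resolved = PLURAL_PATTERN.sub(pick_form, message)
    return PARAM_PATTERN.sub(lambda m: str(params[int(m.group(1)) - 1]), resolved)


print(resolve_message("$1 {{PLURAL:$1|day|days}}", [3], "en"))
print(resolve_message("$1 {{PLURAL:$1|jour|jours}}", [0], "fr"))
```

Ordinal suffixes such as "1st" or "1er" are a separate concern and are not handled by this kind of substitution, which matches the note above that PLURAL should not be used for them.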
I assumed there was a project page on TW, like https://translatewiki.net/wiki/Translating:OpenStreetMap, but there is none. Would that be https://translatewiki.net/wiki/Translating:EDTF then? |
We are already getting translations for this lib via twn. All messages available here are also available on twn: https://translatewiki.net/w/i.php?title=Special:Translate&language=de&group=mwgithub-edtf&filter=&action=translate Not sure / cannot tell, if twn likes to create an individual project for this lib. |
@kghbln Yes, we have a lot of translations already. However, if I understand the change correctly, we now allow all parameters to have pluralisation, right? That means we have to communicate to that community that this is possible. |
That is correct @mzeinstra |
No, it is not correct because of the required contraction of "de autour" into "d’autour". This means that "de $1 à $2" mutates into "d’$1 à $2". But note that "autour du" (meaning "circa") has also changed the leading preposition. When we have a simple date in $1 (without "circa"), the date takes an article "le" (that MUST be contracted with the previous "de" into "du"). This also means that there's a mutation from "de $1 à $2" into "du $1 au $2" or "d’$1 à $2", depending on the value of $1.
But there's also a variation of the second preposition "à" (or "jusqu’à", preferred when $1 also contains a time, because "à" translates both "to" for the end of the interval and "at" for setting a time) into "au" (or "jusqu’au", when $2 starts with a date and not a time). For all these reasons the English format "from $1 to $2" does not have a single translation: it depends on the precision of the values in $1 and $2. In CLDR data, there are separate items for translating date/time intervals depending on the precision of the values. Look at the Unicode CLDR data: it is much more accurate than what EDTF provides, which still cannot translate correctly.
In fact I really think that EDTF is absolutely not needed at all: everything it does (and partially documents) is FULLY covered and documented in CLDR, with MANY examples already translated into many more languages. So I strongly suggest deprecating EDTF, or reimplementing it based on CLDR (e.g. with ICU4C, ICU4J or the newest ICU4X, all documented and supported by the Unicode Consortium and open-sourced. Note that ICU4X is now fully supported and offers many more bindings than ICU4C and ICU4J to support more languages; it is easier to integrate into MediaWiki, offers significant security advantages, and its code is much more thoroughly tested, even if some earlier bugs in ICU4C and ICU4J were fixed by backporting coverage tests and code fixes discovered in ICU4X).
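To make the "de $1 à $2" variations concrete, here is a rough sketch that picks each preposition from the precision of the corresponding endpoint. It is an illustration under assumptions: the function names and the string-valued precision argument are hypothetical, the elision test is a crude first-letter check, and values containing a time are not handled.

```python
# Initial letters that force "de" to elide to "d’" (crude approximation).
ELIDING_INITIALS = set("aeiouéè")


def start_preposition(value: str, precision: str) -> str:
    """Return "du ", "d’" or "de " for the start of an interval."""
    if precision == "day":
        return "du "  # "de" + article "le" contracts to "du"
    if value[:1].lower() in ELIDING_INITIALS:
        return "d’"   # e.g. "d’avril", "d’autour du ..."
    return "de "


def end_preposition(precision: str) -> str:
    """Return "au" for day precision ("à" + "le"), otherwise "à"."""
    return "au" if precision == "day" else "à"


def format_interval_fr(start: str, start_precision: str,
                       end: str, end_precision: str) -> str:
    """Format "from $1 to $2" in French, given each endpoint's precision."""
    return (f"{start_preposition(start, start_precision)}{start} "
            f"{end_preposition(end_precision)} {end}")


print(format_interval_fr("1er décembre 2022", "day", "31 décembre 2022", "day"))
print(format_interval_fr("avril 2022", "month", "juin 2022", "month"))
print(format_interval_fr("2000", "year", "2022", "year"))
```

A single "de $1 à $2" template cannot produce all three results, which is why separate items per precision (as CLDR provides for intervals) or a precision parameter would be needed.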
But if you cannot reimplement EDTF based on CLDR data (or do not want to integrate ICU4X), make sure you look at the data already covered there: what I described above about French also applies to other languages using common contractions for prepositions and articles, including for example Italian or Spanish. Then add the necessary translatable items to fix the initial bad assumptions made about intervals. You may still maintain "compatibility items" in EDTF, but mark them as deprecated in favor of more precise items (where the precision of datetime variables is explicitly specified). Also, I request again that you avoid forcing a leading capital in sources (e.g. "from $1 to $2" and not "From $1 to $2", "yesterday" and not "Yesterday"...) and that all translations use uncapitalized terms (unless these terms are always capitalized, like "Monday" or "March" in English), i.e. like entries in dictionaries. The capitalization at the start of a sentence or title can be inferred. CLDR does not force capitalisation in any of these translatable terms. |
CLDR supports EDTF? Don't think so. This is what EDTF stands for https://www.loc.gov/standards/datetime/ Looks like CLDR is a MediaWiki extension that thus cannot be used standalone. |
@verdy-p Thanks for the feedback. I see your argument on capitalisation; I will spin that off into its own issue. We also know that generic humanisation of dates and times into different natural languages has its limitations. We are probably not able to use generic humanisation when some parts of the date affect other parts of the date. We should accept the limitations of the current iteration of this tool and of the improvements made to it. Please remember that, as @JeroenDeDauw already said, this is an open source repository for the standalone library for EDTF, which is subsequently used in the extension Wikibase-EDTF. It can also be used in other systems that want to adopt or humanise EDTF strings. This is not the place for discussions on whether EDTF is fit for purpose at Wikidata; please address those comments to the appropriate platform. |
My comment was not about whether EDTF is fine in Wikidata or Wikibase (in fact EDTF is also questionable for its indirect use in Wikibase-EDTF). It was about whether we want to maintain the translation of EDTF formats as an integral part of EDTF, developed separately, or whether we should think about refactoring EDTF itself based on CLDR, which already performs (with ICU) and translates (with CLDR data) absolutely EVERYTHING that EDTF wants to support. What I mean is that this is independent of the choice of EDTF as an interface used in Wikidata or Wikibase: Wikidata/Wikibase are themselves based on MediaWiki, which is ALREADY integrating CLDR data and the ICU library for many things, and I just don't understand the need to deviate from CLDR data, which is already vetted and already has a much broader coverage, and where all the issues above have already been discussed at length and are ALREADY solved. I therefore view EDTF as a "poor man's" implementation that falls far short and wants to reinvent things that have already been solved in CLDR, which is already widely used. The only interest of EDTF is then to allow integrating overrides as a workaround for some translations that CLDR still has not been able to vet and release (because CLDR vetting is extremely slow, and for many minority languages it takes considerable time to have them supported, whereas Wikimedia can support them faster in a community effort; but as soon as CLDR data is available, it should become the standard and EDTF data would progressively be deprecated, allowing wikis to make a transition if they need some temporary stability, e.g. if EDTF-formatted dates are used in page names). We had the same issue with the translation of language names: CLDR data is impressive, but MediaWiki includes its own limited set of overrides (to avoid using fallbacks to other supported languages), and wikis themselves have their own local overrides to what MediaWiki features. 
In conclusion, EDTF can remain as a useful transition library, but in the long term it should allow wikis to converge on the international CLDR standard (which is also used in many other non-Wikimedia projects, including various other i18n libraries like the standard libraries in C/C++ or PHP that all wikis are also using, and many system libraries, components and protocols). A transition scheme is useful, but you should already think about rebasing EDTF to solve many existing issues. CLDR is the way to go (and ICU4X allows EDTF to do that quite simply): translating date and time values is a very common task that all development frameworks need to integrate for their i18n support. And here we need convergence. EDTF just fills a small niche but is far behind what modern apps need and already use today. This is not just about translation: because date and time values have legal implications and are security-sensitive, we cannot translate them as we want and must avoid all ambiguities, so we cannot do that alone with a very small developer team and a few translators who for now can't properly do their work as expected. |
I am still not sure what exactly you mean. By "EDTF", do you mean "the i18n code part of the EDTF library"? |
As far as I know, the answer to both of these questions is NO:
|
|
So with "CLDR" are you not talking about https://www.mediawiki.org/wiki/Extension:CLDR? From the links you provided, it is not at all clear to me there is support for the Extended Date Time Format that we can simply use instead of this library. |
Please note that this thread was started about translatability. We don't care here about the custom syntax used to represent dates with EDTF in a locale-neutral way; that EDTF form may, however, be considered a specific locale, just like the "root" locale in CLDR or the "POSIX" locale: EDTF just defines its own "language", just like POSIX does in legacy C libraries. Beyond that, the library should parse formatted dates (including those in EDTF form) into some Datetime, DatetimeRange or DatetimeSet object. To format these objects into human-readable text (or back to EDTF form), it can perfectly well use CLDR data, which contains almost all the needed formats and translations (except maybe a couple of additional qualifiers for uncertainty or approximation, but these should rapidly be supported in CLDR data as well; existing translations made in TWN should then only be needed to increase the coverage for languages or locales that CLDR still does not provide, or if there's a need for overrides). What will remain to translate for EDTF will be dramatically smaller (and most existing translations may be deprecated as no longer needed). |
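The "parse into an object, then format" split described above can be sketched as follows. This is a minimal illustration, not the library's code: `PartialDate` and `parse_simple_date` are hypothetical names, and only the plain `YYYY`, `YYYY-MM` and `YYYY-MM-DD` forms are handled (no qualifiers, intervals or sets). The object keeps the precision explicit, so a formatter could later choose the right template per locale.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """A date that may be known only to year or month precision."""
    year: int
    month: Optional[int] = None
    day: Optional[int] = None

    @property
    def precision(self) -> str:
        if self.day is not None:
            return "day"
        if self.month is not None:
            return "month"
        return "year"


# Year, optionally followed by -MM, optionally followed by -DD.
SIMPLE_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def parse_simple_date(text: str) -> PartialDate:
    """Parse YYYY, YYYY-MM or YYYY-MM-DD into a PartialDate."""
    match = SIMPLE_DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"unsupported date form: {text!r}")
    year, month, day = match.groups()
    return PartialDate(
        int(year),
        int(month) if month else None,
        int(day) if day else None,
    )


for raw in ("2022", "2022-12", "2022-12-01"):
    parsed = parse_simple_date(raw)
    print(raw, "->", parsed.precision)
```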
A pull request demonstrating translation via CLDR is definitely welcome |
This idea for further developing this EDTF library is based on the question of #76
Currently, the library does not support PLURAL messages from translatewiki, although this would enhance the readability of humanised EDTF strings.
Translatewiki does support PLURAL messages, see: https://translatewiki.net/wiki/Plural
Internationalisation was built from scratch, see: https://github.com/ProfessionalWiki/EDTF/tree/master/src/PackagePrivate/Humanizer/Internationalization