-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Spanish Implementation #29
base: master
Are you sure you want to change the base?
Conversation
The Lang enum should be really small since it doesn't contain any inherent data. So it should be dirt cheap to copy
Alright, this should be about it. (except for mod.rs because I was too lazy to undo rustfmt on that. Shame on me) Rustfmt should be easily be doable in another pull request by just re-adding the the undonde derives could be additionally done in another PR. and optionally fix clippy warnings in another PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misread the commit content. This commit should've been add missing docs
Sorry for making you go through all of this trouble! and thank you very much for taking the time to work on it, it's very kind :) I'll see the PR more in depth this weekend! If you're not planning on redoing the derives or rustfmt, I'll take on that job once this is merged. |
I was thinking about figuring what rustfmt config was the one that onelined the chained methods and see if it was alright to disable it. The reason is because if the method chaining gets one-lined, then the type-hint for each method return on the chain will not be hinted by rust-analyser. but anyways, the chainings tends to be short more often than not, so the loss of the type-hint which can only be seen in an IDE isn't that relevant |
Hi, I was thinking about making a PR Benching with Criterion with another branch for benchmarks. Mainly I want to see what issues can arises when I expect merge conflicts( expecting so because I had to add some derives to do the benchmark) The reason was to see how the spanish implementation fared because I did a fix to the unneeded 0 spam that the split_thousands() method created. (Ref: link) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you again for taking the time to participate! I'm sorry I am relatively slow, I've got a lot going on lately.
Feel free to take a look at the comments I've put :)
@@ -25,6 +25,7 @@ AVAILABLE LANGUAGES: | |||
fr: French (France and Canada) | |||
fr_BE: French (Belgium and the Democratic Republic of the Congo) | |||
fr_CH: French (Swiss Confederation and Aosta Valley) | |||
es: Spanish |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto the pokemon?
|
||
let word = to_words(BigFloat::from(21.01), Outputs::Currency, &[]); | ||
assert_eq!(word.unwrap(), "veintiún US dollars con un centavo"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw, compared the results from es.rs to python num2words and see differences
I don't mind much the differences in megas as it depends a lot on the source. however mil trillones
makes me wonder, is this what should actually be the number?
10000000000000
diez billones
diez trillones
100000000000000
cien billones
cien trillones
1000000000000000000
un trillón
un quintillón
1000000000000000000000 <=== here
mil trillones
un sextillón
10000000000000000000000000
diez cuatrillones
diez septillones
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's different?. Is Mil Trillones from Python?
I followed the standard of Each thousands changes the Millard because It was a bit confusing that a Billion*1000 is called the equivalent in spanish of "thousand billions" instead of "trillion"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://es.wikipedia.org/wiki/Anexo:Nombres_de_los_n%C3%BAmeros_en_espa%C3%B1ol
So according to this source, it starts to jump by 1_000_000 instead of 1_000 after the trillion
As summary, English's Sextillion is 10^21
Spanish's Sextillion (according to this particular page in wiki) is 10^36
I guess it would still best to discuss with someone more familiarized with spanish, because if this really is the spanish standard, I'm pretty sure almost none will get spanish's quadrillon right (Because it actually is "A Million Trillion")
based on [Real Academia Española](https://www.rae.es/dpd/ordinales) standards, the TENS of 2 have no space between itself and its units. `"la primera y a la segunda decena se pueden escribir en una o en dos palabras, siendo hoy mayoritaria y siempre preferible la grafía univerbal"`
From Ballasi#29 (comment), fix ordinal representation based on the rules defined in 2.b && 2.d @ https://www.rae.es/dpd/ordinales
Alright. That should be it |
Oh, btw. I'm working on translating the currencies. Might submit a commit by the end of the day |
Fix the wrong semantics that was used. Spanish semantics follows this pattern of for each big milliard (Million, Billion, Trillion) grows at a pace of `1 000 000`^n This means that 1 million is `1 000 000`^1 1 billion => `1 000 000`^2 [1 000_000 000_000] 1 trillion => `1 000 000`^3 [1 000_000 000_000 000_000] 1 quadrillion => `1 000 000`^4 [......] ...... etc This semantic differs from english's 1 billion => `1 000`^3 [1 000 000_000] 1 trillion => `1 000 000`^4 [1 000_000 000_000] 1 quadrillion => `1 000 000`^4 [......]
* Fix Tens edge case for Ordinals based on [Real Academia Española](https://www.rae.es/dpd/ordinales) standards, the TENS of 2 have no space between itself and its units. `"la primera y a la segunda decena se pueden escribir en una o en dos palabras, siendo hoy mayoritaria y siempre preferible la grafía univerbal"` * Fix Ordinals Representation From Ballasi#29 (comment), fix ordinal representation based on the rules defined in 2.b && 2.d @ https://www.rae.es/dpd/ordinales * Fix Tests & remove comments * Currency Translation for spanish * Cardinal Fix - Mistakenly used English semantics (#1) Fix the wrong semantics that was used. Spanish semantics follows this pattern of for each big milliard (Million, Billion, Trillion) grows at a pace of `1 000 000`^n This means that 1 million is `1 000 000`^1 1 billion => `1 000 000`^2 [1 000_000 000_000] 1 trillion => `1 000 000`^3 [1 000_000 000_000 000_000] 1 quadrillion => `1 000 000`^4 [......] ...... etc This semantic differs from english's 1 billion => `1 000`^3 [1 000 000_000] 1 trillion => `1 000 000`^4 [1 000_000 000_000] 1 quadrillion => `1 000 000`^4 [......]
I have a question if you can get the answer I'll be leaving it as the first meanwhile, but should it be like the second case? assert_eq!(
ordinal(124_001_091_000_000_000_001),
"ciento veinticuatrotrillonésimos milnoventa y unobillonésimos primeros"
// "ciento veinticuatrotrillonésimos mil noventa y unobillonésimos primeros"
); |
* Staging changes for ordinal Fix Some data to take into consideration Todo: Fix accent on monomorphized words i.e. VigesimoSegundo Si el ordinal se escribe en dos palabras, el primer elemento mantiene la tilde que le corresponde como palabra independiente: vigésimo segundo, vigésima cuarta, trigésimo octavo, cuadragésima quinta; pero, si se escribe en una sola palabra, el compuesto resultante, al ser ser una voz llana terminada en vocal, debe escribirse sin tilde, pues no le corresponde llevarla según las reglas de acentuación (v. cap. II, § 3.4.5.1.1): vigesimosegundo (no ⊗vigésimosegundo). Los ordinales complejos escritos en una sola palabra solo presentan variación de género y número en el segundo componente: vigesimoprimero, vigesimoprimera, vigesimoprimeros, vigesimoprimeras; pero, si se escriben en dos palabras, ambos componentes son variables: vigésimo primero, vigésima primera, vigésimos primeros, vigésimas primeras. No se consideran correctas las grafías en dos palabras si se mantiene invariable el primer componente: ⊗vigésimo segundos, ⊗vigésimo cuarta, ⊗vigésimo octavas. * temporarily add rustfmt back * Finish Ordinal Fix. Reference: https://www.rae.es/dpd/ordinales * remove accent on composites of `Vigesimo` * remove rustfmt for merge
I couldn't redo the Cardinal implementation. |
Hey guys! Native Spanish speaker here. Is there anything that you need help with? I'd love to use this crate in a project that I'm currently developing |
Priority close to 0:
Otherwise, everything should be fine. Other than big numbers which can lose precision |
Hey @Laifsyn! I just review the "Semantically Incorrect" numbers that you mention and as far as I know almost the complete Latin-america region use those big numbers exactly in the same way as in English, so a direct translation of them should be precise enough. Later versions of this crate could include some other forms of Spanish cardinals numbers but knowing that they are rarely used this should be not a priority. But let me ask some friends in Colombia and Argentina (Countries that use particularly high number prices in the every day life) to confirm that. In the other hand, let me test the USD and PESOS translation, those are the currencies that I'm familiar with. And last but not least, I just want to say a big thank you to make the Spanish implementation possible! |
Submitting as draft because
1-) I'm unclear how pull requests actually works
2-) I haven't properly made an Integration Test because I found it extremely un-ergonomic to constantly build the Num2Words settings after every .to_word() call
I also submitted a .devcontainer configuration because I found it extremely helpful to be able to test stuffs in a virtual environment specially so when I didn't have available a beefy-enough computer (I might've forgotten to add Rust Docs Viewer in the extensions though)