Add Italian translation #257

Open
Tnonis90 opened this issue Mar 19, 2024 · 23 comments

Comments

@Tnonis90
Contributor

Hello @NSoiffer ,
this is Tommaso from VisionDept SRL, the Italian distributor for Vispero / JAWS. We are interested in tackling the Italian translation for MathCAT, and are ready to start translating.
Regarding speech, I kindly ask you to set up the environment with the automatic translations, so we can get started.
As for Braille, we'd need a discussion on what code to choose: in Italy, almost everyone nowadays uses the LAMBDA Math Code (Italian version). Do you happen to have any familiarity with that?

Thanks a lot, and I look forward to getting started with this.

Tommaso

@NSoiffer
Owner

I'm very glad to help out with the Italian translations.

I'll build an initial translation in the next day or two and let you know the details.

I know a little about the LAMBDA math code, but I need a specification as to how the MathML maps to its linear format (I didn't see anything at https://www.lambdaproject.org/). I just finished implementing the German LaTeX braille code. For that, I didn't translate directly to the braille dots, but instead to the ASCII chars for LaTeX and then let the current braille mapping table do the translation. I was told that was preferable because each country that might use the LaTeX code would have different 8-dot mappings, or might use a 6-dot mapping. I suspect something similar is desirable for the LAMBDA code. Is that true?

@Tnonis90
Contributor Author

Thanks for letting me know. At www.veia.it you can download Lambda 1.44, which does contain an XML specification with all Braille markup code. Do not use Lambda 2, as everything's embedded in the executable file in that specific version.

Thanks!

@NSoiffer
Owner

NSoiffer commented Mar 23, 2024

I've created an "it" branch. Clone MathCAT and checkout that branch. There are instructions for translators here. Here's a short list:

  1. Go to Rules/Languages/it
  2. Open unicode.yaml and look through the translations. They are likely mostly good. If a translation is good, change the "t: ..." to "T: ...". This marks the translation as verified. Some entries contain "if" tests, and the translations inside those are more likely to need work. Hopefully the syntax is understandable.
  3. Open SimpleSpeak_Rules.yaml and again look through the translations (search for "t: "). Again, there are tests here such as for Verbosity and for Blindness. English often uses "the square root of ..." in a verbose setting, but in a terse one, it might be shortened to "square root x" (dropping "the" and "of"). If Italian never uses those extra words, just use an empty string. In a second pass, we can talk about making the rules more natural for Italian.
  4. Open all the files in SharedRules and do the same thing as for SimpleSpeak_Rules.yaml
  5. Open definitions.yaml. This has words for cardinal numbers (one, two, three...) and ordinal numbers (first, second, third...). Also for words used in fractions ("half", ...). These translations are likely correct, but there might be some bad ones.
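
To keep track of how much is left to verify, the "t:" versus "T:" convention above can be counted with a small script. This is only a sketch, not part of MathCAT: the function names are made up for illustration, and the regular expression assumes that a marker is a bare `t:` or `T:` followed by whitespace and not preceded by a letter.

```python
import re
from pathlib import Path

# Assumption: a translation marker is a bare `t:` or `T:` followed by
# whitespace and not preceded by a letter (so a key like `test:` is skipped).
MARKER = re.compile(r"(?<![A-Za-z])([tT]):\s")


def count_markers(text):
    """Return (unverified, verified): the counts of `t:` and `T:` markers."""
    unverified = verified = 0
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip lines that are entirely a YAML comment
        for match in MARKER.finditer(line):
            if match.group(1) == "t":
                unverified += 1
            else:
                verified += 1
    return unverified, verified


def count_in_dir(root):
    """Map each .yaml file under `root` to its (unverified, verified) counts."""
    results = {}
    for path in sorted(Path(root).rglob("*.yaml")):
        results[str(path)] = count_markers(path.read_text(encoding="utf-8"))
    return results
```

Calling `count_in_dir("Rules/Languages/it")` from the root of a MathCAT checkout would then list, per file, how many translations still carry the lowercase marker.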

At any point, you can test these out in NVDA if you have the MathCAT addon. After installing the addon, copy the 'it' directory to %AppData%\nvda\addons\MathCAT\globalPlugins\MathCAT\Rules\Languages. Then start NVDA or, if it is already running, restart it (or go to/click on NVDA:Tools:Reload Plugins). If you have an Italian voice, it should use the Italian speech rules. Go to any page with MathML (e.g., https://it.wikipedia.org/wiki/Equazione_di_secondo_grado) and the math should be spoken in Italian. If NVDA+MathPlayer works with LAMBDA, NVDA+MathCAT should also, so that would be another source of math. If NVDA says there is an error in speaking the math, open the NVDA log (NVDA+F1). The message is a little bit hard to understand, but it will hopefully guide you to a place where you have a typo (e.g., an accidentally deleted quote mark).
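
If you end up copying the directory often while testing, the copy step can be scripted. The following is a minimal sketch; the helper names are hypothetical, and the destination is simply the NVDA path given above, built from the `APPDATA` environment variable.

```python
import os
import shutil
from pathlib import Path


def default_languages_dir():
    """Build the NVDA addon path mentioned above from %AppData% (Windows)."""
    appdata = os.environ.get("APPDATA", "")
    return (Path(appdata) / "nvda" / "addons" / "MathCAT"
            / "globalPlugins" / "MathCAT" / "Rules" / "Languages")


def install_language(src_dir, languages_dir):
    """Copy a language directory (e.g. 'it') into the addon's Languages folder.

    Files that already exist at the destination are overwritten, so the
    function can be re-run after every edit. Requires Python 3.8 or later
    for the `dirs_exist_ok` argument.
    """
    src = Path(src_dir)
    dest = Path(languages_dir) / src.name
    shutil.copytree(src, dest, dirs_exist_ok=True)
    return dest
```

For example, `install_language("Rules/Languages/it", default_languages_dir())` run from the root of the checkout would refresh the installed files; NVDA still needs a restart or a plugin reload afterwards.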

A similar process applies to MathCAT in JAWS. However, I haven't used JAWS much and not with MathCAT at all. I'm not sure where the MathCAT files are stored, but wherever that is, a similar process (copy the files to Rules/Languages/it) as with NVDA should be followed. I'm not sure if it picks up the Italian voice automatically. In NVDA, that's code that I wrote.

Good luck. If you want to do a teleconference call some time, I can walk you through the process and that might clear up some questions. In a few hours, you might have something that speaks ok for some expressions and in a few days, does ok for many common expressions. To get a really natural translation, you may want to add or delete some rules and I can talk you through those or write them for you with your input.

At some point, you should do some work on unicode-full.yaml. This is where less commonly used characters can be found. It is a huge file (~3,600 lines), so you may want to scan through it every now and then when you are feeling a little bored and fix characters you think are really poorly translated (anything marked with 'google translation' is more likely to be poorly translated).

And also we can talk about whether it makes sense to implement ClearSpeak or some other speech style.

@NSoiffer
Owner

@Tnonis90

Thanks for letting me know. At www.veia.it you can download Lambda 1.44, which does contain an XML specification with all Braille markup code.

I only see options for LAMBDA 2, BM2021, and EBKey. I tried BM2021 in the hope that it was an older version, but after downloading, I see the "BM" stands for "Braille Music", so that's not the right thing. The EBKey description indicates that's not right either. I didn't see anything else that I could download. Can you clarify what I should get?

@Tnonis90
Contributor Author

Tnonis90 commented Mar 25, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Mar 26, 2024 via email

@NSoiffer
Owner

@Tnonis90 : I don't see your commit. Did you forget to do a "git push"?

Also, I think I answered this:

So far, I have found a problem with an untranslatable string, “out of”. This string is spoken when you up arrow out of an inner element (e.g. denominator). Could you please tell me where to fix this string so it speaks in correct Italian?

In case I didn't, you need to translate navigate.yaml -- I left that out of my shortened instructions by accident.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 15, 2024 via email

@NSoiffer
Owner

When I click on your SHA, the top of the page says "This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository."

I need to get some sleep now. If you don't beat me to it, I'll look into what's going on when I get up and see if I can correct it/bring it into the repo.

@NSoiffer
Owner

I asked my git-savvy son for help and the only thing we could come up with is to essentially clone things and copy files over. That's error-prone.

If you created a fork, and committed your changes there, I could do something, but I'd need to know what your fork is.

Maybe you can try and do a pull request from your repo or branch into the MathCAT repo. That would likely be the shortest path to getting this right. One place I saw says that the error sometimes comes from pushing a tag, not a branch.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 23, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Apr 23, 2024 via email

@NSoiffer
Owner

Probably the best path is to do a "Pull Request" (on top, third item after "code" and "issues") on your repo's page. It will probably suggest what to do. If not, click "New pull request" (on top right) and then choose the "...compare across forks" link.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 29, 2024 via email

@NSoiffer
Owner

As you probably saw, I merged your code into the 'it' branch. I know you want to use the JAWS character translations for the characters they have. However, if you think the current files are good enough to use until you do more work, let me know and I'll merge the 'it' branch into main.

@Tnonis90
Contributor Author

Tnonis90 commented Apr 30, 2024 via email

@NSoiffer
Owner

At the end of March, I downloaded Lambda from your link and tried it out and ran into several issues. Between having to write/finish a paper for ICCHP and immersing myself in the update to Nemeth for chemistry, my memory is hazy on the problems I found :-{

I do remember that Lambda didn't work on many of the MathML examples I tried to import. I remember decompiling mathml2lambda.pyc to get a better sense of what Lambda supports, but I don't remember what I concluded (if anything). I don't think I found any documentation on the lambda code itself. Do you know where there is documentation on it?

@Tnonis90
Contributor Author

Tnonis90 commented May 9, 2024 via email

@NSoiffer
Owner

I could take the list of more commonly used math symbols in MathCAT (in unicode.yaml) and pull the symbols out into a file that I could then copy and paste to see what Lambda generates. Then I could copy the output back and, with the aid of a program, put the result into the proper spot in the MathCAT table. I'm travelling right now and don't have Lambda on my laptop, but if you paste in something like ÷, λ, ←, does Lambda show useful dots? If so, that would not be too much work.

@Tnonis90
Contributor Author

Tnonis90 commented May 14, 2024 via email

@NSoiffer
Owner

I've attached a list of 360 characters, one character per line. If this isn't a good format, let me know what format you would like (e.g., all chars on a single line or 40 chars per line or ...).

I tried Lambda 2 myself, but I don't know what settings I should use to get the proper braille chars. In JAWS, if I set the output table to Italian, I only see 6 dot and computer braille options. I think you need to do the conversion.

Note: there are four invisible chars in the list (U+2061 - U+2064). I don't know if lambda supports these. I suspect there might be some others that aren't supported. The list begins with a blank char (which probably translates to an empty braille cell).

List of characters: chars.txt

In order to know what braille char corresponds to what Unicode char (and hence create the list in MathCAT), don't delete any chars even if they don't translate. That way I can know that what is on line 137 corresponds to α. The alternative is for you to send back something like
Λ = dots 1238
for each char.
What you send back can be the actual braille char (you need to let me know what the mapping is or use the Unicode braille chars) or something like 1238 and I can convert that to dots.
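
For reference, converting the "Λ = dots 1238" form is mechanical. Here is a rough sketch of what I have in mind; the function names are just for illustration, and it assumes exactly one braille cell per character. In the Unicode braille block, dot n sets bit n-1 of an offset from U+2800.

```python
BRAILLE_BASE = 0x2800  # start of the Unicode "Braille Patterns" block


def dots_to_braille(dots):
    """Convert a dot-number string such as "1238" to one Unicode braille cell.

    Dot n sets bit (n - 1), so dots 1238 give 1 + 2 + 4 + 128 = 0x87,
    which is U+2887.
    """
    bits = 0
    for digit in dots:
        if digit not in "12345678":
            raise ValueError(f"not a braille dot number: {digit!r}")
        bits |= 1 << (int(digit) - 1)
    return chr(BRAILLE_BASE + bits)


def parse_reply_line(line):
    """Parse a line like 'Λ = dots 1238' into (character, braille cell).

    The split is on the last '=' so that the '=' character itself can
    appear on the left-hand side.
    """
    left, sep, right = line.rpartition("=")
    tokens = right.split()
    if not sep or len(tokens) != 2 or tokens[0].lower() != "dots":
        raise ValueError(f"unrecognized line: {line!r}")
    return left.strip(), dots_to_braille(tokens[1])
```

If Lambda uses more than one cell for some symbols, the format (and this sketch) would need to be extended to a sequence of dot groups per character.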

Hopefully this approach works.

@Tnonis90
Contributor Author

Tnonis90 commented Jul 3, 2024 via email

@Tnonis90
Contributor Author

Tnonis90 commented Jul 3, 2024 via email
