Liblouis table for Dutch #6

bertfrees · 2015-03-16T15:19:59Z

Include some official braille code specification document
Write tests based on the specification
Validate the existing braille table for Dutch. There is currently a table for the Netherlands and a table for Belgium which are slightly different (and should ideally be merged, see Common table for Dutch liblouis/liblouis#34)
Improve the existing table or write a new one from scratch

Related issues:

bertfrees · 2015-05-18T18:21:18Z

@dkager See branch "dkager_dutch"

dkager · 2015-05-19T07:13:29Z

Thanks. Should we update nl-BE or do a new set of tests for nl-NL? Would be nice to keep them in sync, or to have just one set of tests because these tables are almost identical.
Also, I think task 1 can be marked as completed: we have the 2005 specification of the Dutch braille code (is it online somewhere?).

bertfrees · 2015-05-19T07:27:09Z

Try extending nl_BE. If all the tests in there are valid for the Netherlands (Dedicon) rename it to nl. Otherwise make a new test file for nl_NL and we can get them in sync again later.

dkager · 2015-05-19T12:19:57Z

I've reorganized the tests somewhat and added examples from the official documentation. There still isn't a lot of data, but I'm not sure what else can be added: these tests cover most of the corner cases.

I still have to actually run these, but one omission I can spot right away is the special braille notation for minutes and seconds, i.e.:
The world record time is 3' 5''.
Here you don't use an apostrophe. Instead you write:
⠼⠉⠈⠔ ⠼⠑⠈⠔⠔
I don't think the nl-BE (or nl-NL) table defines this either. It may not be possible to figure this out from the context, because normal apostrophes are used.

bertfrees · 2015-05-19T12:27:19Z

These are things that liblouis has difficulty with, because it knows little about the context, except for the actual text you give it (and typeform info). You could make a rule that finds occurrences of the pattern "digit-digit-apos-digit-digit-apos-apos" and translate it in a special way. For such simple heuristics this approach works, but if you need to do advanced text analysis, or if you want to mark up minutes and seconds in XML, you need something besides liblouis.
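
As a rough illustration of that kind of heuristic, here is a minimal Python sketch that runs outside liblouis as a preprocessing step. It is not a liblouis table rule, and the function name `braille_primes` is made up for this example. It assumes the minute and second signs are written as in the example above (⠈⠔ and ⠈⠔⠔) and that digits use the usual number sign plus the letters a-j. Everything that does not match the pattern is left untouched for the normal translation step.

```python
import re

# Digits 1-9 and 0 are written as the letters a-j after the number sign.
DIGITS = dict(zip("1234567890", "⠁⠃⠉⠙⠑⠋⠛⠓⠊⠚"))
NUMBER_SIGN = "⠼"

# One or more ASCII digits followed by exactly one or two apostrophes.
# The negative lookahead rejects three or more apostrophes in a row.
PATTERN = re.compile(r"([0-9]+)('{1,2})(?!')")


def braille_primes(text):
    """Replace occurrences like 3' or 5'' with the special braille notation.

    Hypothetical helper: the prime prefix (dot 4) followed by one dots-35
    cell per apostrophe is inferred from the single example in this thread.
    """
    def repl(match):
        digits = "".join(DIGITS[d] for d in match.group(1))
        primes = "⠈" + "⠔" * len(match.group(2))
        return NUMBER_SIGN + digits + primes

    return PATTERN.sub(repl, text)


print(braille_primes("3' 5''"))
```

A heuristic like this cannot tell a closing quotation mark after a number from a minute sign, which is the limitation described above.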

dkager · 2015-05-19T12:41:11Z

I don't think we get a lot of books where this shows up, so not very high priority just yet.
One more thing that is missing is tests for emphasized text, etc. Will add that and test the whole thing on Thursday.

bertfrees · 2015-05-19T12:42:22Z

The new data looks very good. It's not a problem that there isn't a lot: if it covers everything, it covers everything.

I noticed you deleted some tests. In almost all cases you replaced them with an equivalent test but in one or two cases you didn't (e.g. "SLAAT DE VLAM IN DE PAN..."; I know it's kind of the same as "MEER DAN DRIE WOORDEN..." but not completely because the last uppercase word doesn't end with a period). I want to avoid touching other people's tests as much as possible so if there's no good reason to delete a test I think we should leave it.

I'm gonna add back the ascii-output because it's useful for readability. I can do that with a script.

dkager · 2015-05-19T12:52:59Z

Can you also re-add the all-capitals sentence?

I'm not a big fan of the ascii-output because it's redundant and to me (as a braille reader) the symbols don't make much sense. Could you hold off on adding the ascii-output until I've finished the test data? It makes editing somewhat easier for me as I don't know the ASCII code liblouis uses. Also, I noticed that other harness tests (fi_harness.txt) don't have ASCII?
The Unicode output specification is proving to be very nice to work with, because I can verify it straight away on my braille display! :)

bertfrees · 2015-05-19T13:01:13Z

I know, that's why I added the Unicode alongside the ASCII. The Unicode is used by the test harness; the ASCII makes it readable for sighted people (documentation purposes). I'll wait until you've finished though, then I only have to run the conversion script once.

bertfrees · 2015-05-21T13:56:08Z

@dkager In commit 1a92520 I've added xfail flags to all failing tests.

dkager · 2015-05-21T14:22:06Z

"xfail": "Fails because special handling of more than three capitalized words is not supported by liblouis",

So we need support for this that is similar to how emphasized text is handled?

I'll look at the Greek capital letters. They are used in maths so it is important that they work.

bertfrees · 2015-05-21T14:57:16Z

> So we need support for this that is similar to how emphasized text is handled?

Yes, that support is already implemented by Michael Gray, but his patch isn't merged into master yet. See liblouis#50.

> I'll look at the Greek capital letters. They are used in maths so it is important that they work.

I think the nl-NL table handles Greek capitals differently than the nl-BE table.

dkager · 2015-05-21T16:39:26Z

Forgot to bring this up re: commit 1a92520. Ignoring unknown keys in the JSON file sounds like a good idea, but I think it has to be an option. Right now if you want to "test the tests" for syntax errors you can do so. If you ignore unknown keys, you can't. So maybe add an option to the harness runner so that we can have both output and ascii-output keys.

bertfrees · 2015-05-21T16:43:22Z

Okay sounds reasonable. Maybe call it --strict (i.e. make "ignore" the default, because we want to be able to automatically run all the tests as they are).

dkager · 2015-05-21T17:41:18Z

Not entirely sure why it's failing in the first place since no explicit format validation is performed except for checking the input is valid JSON. Something to check next week. :)

bertfrees · 2015-05-21T17:45:14Z

The validation happens implicitly when constructing a BrailleTest instance. The solution was to add **kwargs to the constructor (see 0ea22d6#diff-f1bb1dc1db8387a1553e444ef5b0da82R123)
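
To make the trade-off concrete, here is a minimal sketch of how implicit validation through the constructor could be combined with the --strict idea discussed above. It is not the actual harness code: the class name `BrailleTestSketch`, the `load_tests` helper and the `strict` parameter are hypothetical, and only the keys mentioned in this thread (input, output, xfail, ascii-output) are assumed.

```python
import json


class BrailleTestSketch:
    """Hypothetical stand-in for the harness's test class."""

    def __init__(self, input, output, xfail=False, strict=False, **kwargs):
        # Without **kwargs, an unknown key such as "ascii-output" raises a
        # TypeError here, which is the implicit validation. With **kwargs
        # the extra keys are collected and only rejected in strict mode.
        if strict and kwargs:
            raise ValueError("unknown keys: " + ", ".join(sorted(kwargs)))
        self.input = input
        self.output = output
        self.xfail = xfail


def load_tests(text, strict=False):
    """Build one test object per JSON entry, assuming a list of objects."""
    return [BrailleTestSketch(strict=strict, **entry) for entry in json.loads(text)]


sample = '[{"input": "a", "output": "⠁", "ascii-output": "a"}]'
print(len(load_tests(sample)))  # lenient mode accepts the extra key
```

With `strict=True` the same sample raises a `ValueError`, so the test files can still be checked for misspelled keys when that is wanted.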

dkager · 2015-05-26T10:19:03Z

The nl-NL table has a few issues that should be easy to fix:

Greek letters (can probably take those from nl-BE).
For the percent sign some bogus whitespace is added.
The ² and ³ symbols need a / added to them.

Furthermore, the emphasis tests are failing. For one of them it looks like expected == received, yet the test fails with a braille difference error. I hope I'm not overlooking something!
Exp: ⠨⠓⠑⠃ ⠚⠑ ⠸⠸⠨⠧⠇⠥⠉⠓⠞ ⠇⠁⠝⠛⠎ ⠙⠑⠀⠸⠨⠁⠝⠁⠏⠕⠑⠗⠀⠁⠇ ⠛⠑⠇⠑⠵⠑⠝⠢
Rec: ⠨⠓⠑⠃ ⠚⠑ ⠸⠸⠨⠧⠇⠥⠉⠓⠞ ⠇⠁⠝⠛⠎ ⠙⠑ ⠸⠨⠁⠝⠁⠏⠕⠑⠗ ⠁⠇ ⠛⠑⠇⠑⠵⠑⠝⠢

There are also some accented letters that aren't in the standard. Need to figure out what to do with those. See eb4ef4d.

What's the best way to proceed after I've fixed nl-NL? Rename the test harness to nl-NL-g1_harness.txt and associate it with the nl-NL table?

bertfrees · 2015-05-26T11:16:36Z

Don't attach too much importance to the emphasis tests. Emphasis in liblouis is currently broken. For that one test that seems to have the correct output, the expected output had some blank braille patterns instead of regular spaces. I've corrected it.
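
A mismatch like this is hard to spot by eye or on a braille display, because the blank braille pattern (U+2800) and a regular space (U+0020) look and feel the same. A small Python sketch such as the following lists the differing code points; the helper name `diff_codepoints` is made up for this example, and it only compares positions that exist in both strings.

```python
def diff_codepoints(expected, received):
    """Return (index, expected code point, received code point) per mismatch.

    Hypothetical helper: zip() stops at the shorter string, so a pure
    length difference is not reported.
    """
    diffs = []
    for index, (exp_char, rec_char) in enumerate(zip(expected, received)):
        if exp_char != rec_char:
            diffs.append((index, f"U+{ord(exp_char):04X}", f"U+{ord(rec_char):04X}"))
    return diffs


# The expected string contains a blank braille pattern, the received one a space.
print(diff_codepoints("⠙⠑\u2800⠸", "⠙⠑ ⠸"))
```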

For things that are not in the standard such as those accented letters maybe you could make a separate table? Are Greek letters in the standard by the way?

Copy the table you want to edit to nl and rename the test to nl. We should probably name the table g0 instead of g1 (see liblouis#16 (comment)).

dkager · 2015-05-26T12:04:01Z

> Don't attach too much importance to the emphasis tests. Emphasis in liblouis is currently broken.

Should we do xfails for those?

> Are Greek letters in the standard by the way?

The standard describes the first three letters in an example. This matches what nl-NL does, except for the letter beta which was missing dot 1 (will push a fix in a moment). The definitions in nl-NL are the same as in the example, but I'll check if Dedicon is maybe using something else.

> We should probably name the table g0 instead of g1

It's also not really contracted (.ctb versus .utb).

Also, is the patch for the MORE THAN THREE WORDS in capitals going to be included in the next release? We now have an xfail for this, but it's important to resolve the problem to get proper Dutch braille output.

bertfrees · 2015-05-26T12:18:35Z

Should we do xfails for emphasis tests: yes.

The extension should probably be changed to .utb, yes.

The patch which fixes emphasis and adds support for indicating phrases of capitalized words will not be in the June release. We've targeted September.

dkager · 2015-05-26T12:23:01Z

September is okay, as long as it’s in there before delivering the system. :)

bertfrees · 2015-05-26T12:27:13Z

It's in the branch https://github.com/MikeGray-APH/liblouis/commits/mrg_ueb_update, in case you already want to try it out.

dkager · 2015-05-28T07:18:03Z

This is pretty much done now. What remains:

Look at the remaining FIXMEs in nl-NL-chardefs.cti.
Merge the two tables as Common table for Dutch liblouis/liblouis#34 describes.

bertfrees · 2015-05-28T09:54:43Z

About the FIXMEs in nl-NL-chardefs.cti: they refer to the standard "World Braille Usage (3rd edition)": https://cdn.rawgit.com/liblouis/braille-specs/master/world-braille-usage-third-edition.pdf

bertfrees · 2015-05-28T09:59:01Z

By the way I put those FIXMEs in there. They're mostly about differences between NL and BE that need to be sorted out.

dkager · 2015-06-04T14:50:24Z

There are now quite a few tests that are failing (see dkager_dutch_ueb). For next week: find out why, and fix it while not breaking the nocaps table.

dkager · 2015-07-07T08:28:04Z

The Dutch braille standard 2005 does not specify that the plus sign doesn't cancel a capitalized word. But the Dutch liblouis table does treat it that way. There is a test: P+R park and ride

Is there any additional documentation that specifies this or is it an organization-specific choice? Maybe we should discuss this in the process of unifying the Belgium and Netherlands tables.

bertfrees · 2015-07-07T09:50:37Z

Yes, I probably based my implementation on the test that I got and didn't check the documentation.

dkager · 2015-07-30T08:06:04Z

Add emphmodechars opcode liblouis/liblouis#116

bertfrees added this to the dutch (1) milestone Mar 16, 2015

bertfrees mentioned this issue Apr 8, 2015

Support Dutch braille code snaekobbi/pipeline-mod-braille#39

Open

4 tasks

bertfrees added the 1 - Next label Apr 8, 2015

bertfrees assigned dkager Apr 8, 2015

bertfrees added 2 - In progress and removed 1 - Next labels May 21, 2015

dkager mentioned this issue Jul 6, 2015

Liblouis table for mathematical expressions in Dutch #8

Open
