Ordering of the open and closing "tags" in braille #10

Open
josteinaj opened this issue Aug 17, 2015 · 9 comments

@josteinaj

@usama49 and I discovered a quirk in liblouis the other day in the ordering of open and closing "tags" (begin-/end-letters).

Italic/bold/underline in liblouis table no-no-g0.utb:

firstwordital 23
firstletterital 23
firstwordbold 6-3
firstletterbold 6-3
firstwordunder 456
firstletterunder 456

lastworditalafter 56
lastletterital 56
lastwordboldafter 6-3
lastletterbold 6-3
lastwordunderafter 456
lastletterunder 456
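
For readers comparing these dot patterns with the braille output further down: each dot number n corresponds to bit n-1 of a Unicode braille cell starting at U+2800. A minimal Python sketch of that conversion follows; the function name `dots_to_braille` is hypothetical and is not part of liblouis.

```python
def dots_to_braille(pattern: str) -> str:
    """Convert a dot pattern such as "6-3" to Unicode braille cells.

    Cells are separated by "-"; each digit 1-8 names one raised dot.
    A "0" digit is treated as "no dot" and skipped.
    """
    cells = []
    for cell in pattern.split("-"):
        bits = 0
        for dot in cell:
            if dot == "0":
                continue
            bits |= 1 << (int(dot) - 1)  # dot n sets bit n-1
        cells.append(chr(0x2800 + bits))  # U+2800 is the blank braille cell
    return "".join(cells)


italic_open = dots_to_braille("23")   # the firstwordital indicator
bold_open = dots_to_braille("6-3")    # the firstwordbold indicator (two cells)
```

With this mapping, `23` becomes ⠆, `56` becomes ⠰, `456` becomes ⠸ and `6-3` becomes ⠠⠄.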

Test harness:

{
  "tables": [
    "unicode.dis",
    "no-no-g0.utb"
  ],
  "tests": [
    {
      "data": [
        {
          "comment": "Test",
          "input":    "a b c",
          "typeform": "00700",
          "output": ""
        }
      ]
    }
  ]
}

Running input "a b c" with different typeforms for the "b":

| typeform | description | output braille | output dots | output HTML-ified |
|---|---|---|---|---|
| 00000 | | ⠁ ⠃ ⠉ | 1 12 14 | ⠁ ⠃ ⠉ |
| 00100 | italic | ⠁ ⠆⠃⠰ ⠉⠰ | 1 23-12-56 14-56 | ⠁ <i>⠃</i> ⠉</i> |
| 00200 | underline | ⠁ ⠸⠃⠸ ⠉⠸ | 1 456-12-456 14-456 | ⠁ <u>⠃</u> ⠉</u> |
| 00300 | italic + underline | ⠁ ⠸⠆⠃⠸⠰ ⠉⠸⠰ | 1 456-23-12-456-56 14-456-56 | ⠁ <u><i>⠃</u></i> ⠉</u></i> |
| 00400 | bold | ⠁ ⠠⠄⠃⠠⠄ ⠉⠠⠄ | 1 6-3-12-6-3 14-6-3 | ⠁ <b>⠃</b> ⠉</b> |
| 00500 | bold + italic | ⠁ ⠠⠄⠆⠃⠠⠄⠰ ⠉⠠⠄⠰ | 1 6-3-23-12-6-3-56 14-6-3-56 | ⠁ <b><i>⠃</b></i> ⠉</b></i> |
| 00600 | bold + underline | ⠁ ⠸⠃⠠⠄⠸ ⠉⠠⠄⠸ | 1 456-12-6-3-456 14-6-3-456 | ⠁ <u>⠃<b></u> ⠉</b></u> |
| 00700 | bold + underline + italic | ⠁ ⠃ ⠉ | 1 12 14 | ⠁ ⠃ ⠉ |

  • When there are two typeforms, they're not opened and closed like in XML. For instance,
    typeform 3 closes its underline "tag" before its italic "tag". I'm not sure if this is a
    problem for braille readers, or if any braille specification says anything about it.
    Maybe there could be an opcode for defining the default nesting order of bold/italic/underline?
    Would it be possible to transform <em><strong>TEXT</strong></em> differently
    than <strong><em>TEXT</em></strong>?
  • Each line seems to end with closing tags; I don't know why that happens in this example.
  • Typeform 7 doesn't seem to work.
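
The typeform digits used in the runs above behave like bit flags: 1 is italic, 2 is underline and 4 is bold, so 3 is italic + underline and 7 is all three. That reading is inferred from the descriptions in the table, not taken from the liblouis documentation. A small Python sketch of it, where `FLAGS` and `describe_typeform` are names made up for illustration:

```python
# Assumed bit values, inferred from the typeform/description pairs above.
FLAGS = {"italic": 1, "underline": 2, "bold": 4}


def describe_typeform(digit: str) -> list[str]:
    """Return the emphasis names encoded in a single typeform digit."""
    value = int(digit)
    return [name for name, bit in FLAGS.items() if value & bit]


for digit in "01234567":
    print(digit, " + ".join(describe_typeform(digit)) or "(none)")
```

Decoding each digit this way makes it easier to see which combinations (such as 7) a table has to handle.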
@bertfrees
Member

Thanks, guys, for the thorough testing. Typeform is currently broken in liblouis, so it doesn't surprise me that some things are not going as expected. It's the reason we're working on getting the changes from APH into the next release (liblouis#50). Could somebody please try to rebase the Norwegian table changes onto branch liblouis/ueb_update_code and check whether that makes things better or worse? We might have to modify the table a bit because the behavior of some opcodes has changed. Documentation is at https://github.com/liblouis/liblouis/wiki/New-opcodes-for-UEB (work in progress).

@bertfrees
Member

There seems to be an additional problem with nested typeforms where the inner typeform starts at a different letter than the outer one, such as The <em>quick <strong>brown</strong> fox</em> jumps.... The resulting braille closes and reopens the outer element at the start of the inner typeform, like this: The <em>quick </em><em><strong>brown</em></strong><em> fox</em> jumps....

@josteinaj
Author

The test results are the same as before after rebasing to ueb_update_code.

Here's a test where bold is nested inside italics, but does not start at the same place:

    {
      "data": [
        {
          "comment": "Test",
          "input":    "The quick brown fox jumps...",
          "typeform": "0000111111555551111000000000",
          "output": ""
        }
      ]
    }
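
For anyone writing further tests like this one, the typeform string can be generated from marked-up segments instead of being counted by hand. This is only a sketch: the flag values (1 italic, 2 underline, 4 bold) are inferred from the earlier typeform runs, and `build_typeform` is a hypothetical helper, not part of liblouis or its test harness.

```python
# Assumed flag values, inferred from the earlier typeform runs.
FLAGS = {"italic": 1, "underline": 2, "bold": 4}


def build_typeform(segments):
    """Build (input, typeform) from a list of (text, emphasis names) pairs."""
    input_text = ""
    typeform = ""
    for text, emphases in segments:
        digit = sum(FLAGS[name] for name in set(emphases))
        input_text += text
        typeform += str(digit) * len(text)  # one digit per input character
    return input_text, typeform


# The <em>quick <strong>brown</strong> fox</em> jumps...
segments = [
    ("The ", []),
    ("quick ", ["italic"]),
    ("brown", ["italic", "bold"]),
    (" fox", ["italic"]),
    (" jumps...", []),
]
input_text, typeform = build_typeform(segments)
print(input_text)
print(typeform)
```

For these segments the helper reproduces the input and typeform strings used in the test above.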

The results here are also the same after rebasing:

output braille: ⠠⠞⠓⠑ ⠆⠟⠥⠊⠉⠅ ⠰⠠⠄⠆⠃⠗⠕⠺⠝⠠⠄⠰⠆ ⠋⠕⠭⠰ ⠚⠥⠍⠏⠎⠄⠄⠄
output dots: 6-2345-125-15 23-12345-136-24-14-13 56-6-3-23-12-1235-135-2456-1345-6-3-56-23 124-135-1346-56 245-136-134-1234-234-3-3-3
output HTML-ified: ⠠⠞⠓⠑ <i>⠟⠥⠊⠉⠅ </i><b><i>⠃⠗⠕⠺⠝</b></i><i> ⠋⠕⠭</i> ⠚⠥⠍⠏⠎⠄⠄⠄

Branch rebased to ueb_update_code: issue-10-norwegian-typeform

bertfrees added a commit that referenced this issue Aug 17, 2015
@bertfrees
Member

I'm getting different results and they look better at first sight. @josteinaj @usama49 Could you please add the expected output to tests/harness/no_typeform_harness.txt and then run

./autogen.sh && ./configure --enable-ucs4 && make
cd tests/harness
make no_typeform_harness.txt

bertfrees added a commit that referenced this issue Aug 17, 2015
@dkager

dkager commented Aug 31, 2015

Nesting is supposed to work in the ueb_update_code branch. That is, closing tags should appear in the reverse order of their opening tags.
Starting and/or ending different types of emphasis in the same phrase is a notorious problem. It is not possible to implement generic behavior in liblouis that works for all braille standards. We should try to clearly define the expected behavior and look at adding opcodes that reliably produce these results.
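
The "reverse order" rule can be checked mechanically on the HTML-ified output used earlier in this thread. Below is a minimal Python sketch using a stack; `first_nesting_error` is a hypothetical helper for illustration and only looks at simple tags such as <i>, <b> and <u>.

```python
import re

TAG = re.compile(r"<(/?)([a-z]+)>")


def first_nesting_error(markup: str):
    """Return the first closing tag that does not match the innermost
    open tag, or None if every closing tag is properly nested.
    Tags left open at the end are not reported by this sketch."""
    stack = []
    for match in TAG.finditer(markup):
        closing, name = match.group(1), match.group(2)
        if not closing:
            stack.append(name)
        elif not stack or stack[-1] != name:
            return match.group(0)
        else:
            stack.pop()
    return None


print(first_nesting_error("<u><i>⠃</u></i>"))   # the typeform 3 case from the first comment
print(first_nesting_error("<i><u>⠃</u></i>"))   # a properly nested variant
```

The first call reports `</u>`, because `<i>` is still the innermost open tag at that point; the second returns `None`.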

@bertfrees
Member

@usama49 @josteinaj Is this fixed in the latest version of pipeline-mod-nlb? (which does not use liblouis anymore to handle emphasis)

@josteinaj
Author

I don't quite remember. We haven't worked on this lately, I think.

@usama49 @KariRudjord do you have time to test this?

bertfrees pushed a commit that referenced this issue Jul 1, 2016
bertfrees added a commit that referenced this issue Jul 1, 2016
@bertfrees
Member

bertfrees commented Jul 1, 2016

I've rebased the branch issue-10-norwegian-typeform: liblouis/liblouis@master...snaekobbi:issue-10-norwegian-typeform. It uses the new test format, and hopefully, with the latest additions to liblouis, you can now make this work.

bertfrees pushed a commit that referenced this issue Apr 14, 2019
bertfrees added a commit that referenced this issue Apr 14, 2019
bertfrees pushed a commit that referenced this issue Apr 14, 2019
bertfrees pushed a commit that referenced this issue May 27, 2019
@KariRudjord

I have checked the “The quick brown fox jumps” tests that Jostein gave me. They say the devil is in the details, but I found no devils. Everything is correct. Good work, Bert!
