Symbols list #28

Open
dkager opened this issue Mar 24, 2016 · 8 comments

Comments

@dkager
Contributor

dkager commented Mar 24, 2016

How does Dotify determine which content belongs to a volume? That is, would a volume-level symbols list be possible if it was generated as boilerplate BEFORE being fed into dtbook-to-pef?

@bertfrees
Member

Dotify splits the document body (in CSS terms: everything in the normal flow, i.e. not flowed into @begin or @end areas; in OBFL terms: everything not included in pre-content or post-content) into equal parts. Table of contents and volume endnotes sections work as follows: they must be included in pre-content or post-content, and they are basically lists of items that reference elements in the document body. This way Dotify can determine which items need to be included in which volumes and, in the case of a "document range" table of contents, how to group the items according to volume.

A volume-level symbols list could be done similarly to how endnotes are done: before formatting, you generate a list of "symbol items" that reference the positions in the body where they are used.

However one issue is that I'm not sure whether OBFL allows the same item to reference multiple positions in the body, and what Dotify does in that case. What would be most logical is to include the item in as many volumes as needed, and that is also exactly what we need. (As a matter of fact, this may also be an issue with endnotes. Can several noterefs in DTBook reference the same note? I'm not sure.)

Whether the order of items in the result is determined by the order of items in the collection, or by the order of the referenced elements in the body, I don't know, because these orders have always been identical until now. This could be another possible issue if we want to use this approach for symbols lists.
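To make the grouping idea concrete, here is a minimal Python sketch. It is not Dotify's actual behaviour, only an illustration of the approach described above: each symbol item lists the body positions where it is used, each position belongs to a volume, and an item is included in every volume that contains at least one of its positions. The names `SymbolItem`, `volume_of` and `group_by_volume` are hypothetical, and keeping the items in collection order is an assumption, since that question is still open.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SymbolItem:
    symbol: str
    positions: tuple  # indices of body elements where the symbol is used


def group_by_volume(items, volume_of):
    """Map volume number -> list of symbols, in collection order.

    volume_of maps a body position to its volume number. An item is
    listed once in every volume that contains one of its positions.
    """
    volumes = defaultdict(list)
    for item in items:  # collection order is kept (an assumption)
        for vol in sorted({volume_of[p] for p in item.positions}):
            volumes[vol].append(item.symbol)
    return dict(volumes)


# Hypothetical data: six body positions split over two volumes.
volume_of = {0: 1, 1: 1, 2: 1, 3: 2, 4: 2, 5: 2}
items = [
    SymbolItem("&", (1, 4)),  # used in both volumes
    SymbolItem("%", (5,)),    # used only in volume 2
    SymbolItem("@", (0, 2)),  # used twice in volume 1, listed once
]
result = group_by_volume(items, volume_of)
```

Ordering by the first referenced position instead would only require sorting `items` on `min(item.positions)` before grouping.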

@dkager
Contributor Author

dkager commented Apr 6, 2016

Yesterday I learned that the symbols list is book-level. Who knew?! So let's start with that.

My initial thoughts:

  • Add an option symbols-list-file to the script. This can be used to reference the list of symbols definitions. Maybe the absence of this can signal that no symbols list is to be included, so we can drop the boolean option.
  • Traverse the input file character-by-character and look them up in the symbols file. Probably this means another XSLT pre-processing step.
  • As a result of the previous step, add and populate a symbols list section in the input file.
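The lookup step could be sketched roughly as follows. This is Python rather than XSLT, purely for illustration. The format of the symbols file is not specified here, so the sketch assumes it has already been loaded into a dictionary from character to replacement; the function name `collect_symbols` and the sample replacements are hypothetical.

```python
def collect_symbols(text, symbols):
    """Return (character, replacement) pairs for every character of `text`
    that has an entry in `symbols`, in order of first appearance."""
    seen = set()
    found = []
    for ch in text:
        if ch in symbols and ch not in seen:
            seen.add(ch)
            found.append((ch, symbols[ch]))
    return found


# Hypothetical symbol definitions: character -> ASCII braille replacement.
symbols = {"&": "@&", "%": ".0", "§": "_s"}

used = collect_symbols("Tom & Jerry earn 5% & more", symbols)
# An empty result would mean that no symbols list section needs to be added.
```

The resulting pairs are what would populate the symbols list section of the input file.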

The problem I see with this approach is that the symbols file uses ASCII for the replacements, which should be inserted as-is. I need to look into that.

My initial idea was to add a liblouis table to translate all the symbols, but this makes maintenance more difficult because the symbols file is also used in other systems.

@bertfrees
Member

Regarding the as-is insertion of ASCII braille, that should be possible but it will probably involve a translation to Unicode braille because the result needs to be PEF. Related issue: snaekobbi/issues#9
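As an illustration of such a translation, here is a minimal Python sketch that maps ASCII braille to Unicode braille patterns and back. It assumes the common North American Braille ASCII table for 6-dot cells; the table that is actually needed for the Braillo may differ, in which case only the `BRAILLE_ASCII` string would have to change. The function names are hypothetical.

```python
# North American Braille ASCII, indexed by 6-dot pattern
# (dot 1 = bit 0 ... dot 6 = bit 5), matching the order of U+2800..U+283F.
# Assumption: the real embosser table may differ from this one.
BRAILLE_ASCII = (
    " A1B'K2L@CIF/MSP"
    "\"E3H9O6R^DJG>NTQ"
    ",*5<-U8V.%[$+X!&"
    ";:4\\0Z7(_?W]#Y)="
)


def ascii_to_unicode_braille(brf):
    """Translate ASCII braille to Unicode braille patterns (6-dot only)."""
    out = []
    for ch in brf.upper():  # lowercase letters share the uppercase cells
        index = BRAILLE_ASCII.find(ch)
        if index < 0:
            raise ValueError(f"no braille cell for {ch!r}")
        out.append(chr(0x2800 + index))
    return "".join(out)


def unicode_braille_to_ascii(cells):
    """Translate 6-dot Unicode braille patterns back to ASCII braille."""
    return "".join(BRAILLE_ASCII[ord(c) - 0x2800] for c in cells)


cells = ascii_to_unicode_braille("@&")
round_trip = unicode_braille_to_ascii(cells)
```

Because the table is a one-to-one mapping over the 64 cells, a replacement converted this way comes back unchanged when the PEF is converted to BRF with the same table.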

@dkager
Contributor Author

dkager commented Apr 7, 2016

To clarify, the ASCII is the BRF output that the Braillo needs. So either we'd have to back-translate it to Unicode in the pre-processing step, or this can be done based on the ascii-table option. In the latter case the Braillo table needs more supplements to guarantee that every replacement can be back-translated.
Basically the end result needs to be that these are passed on from input to BRF without change. Since DP2 works with PEF that is probably too optimistic. I can provide a symbols file with Unicode braille, but that breaks compatibility with our legacy system.
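One way to find out which supplements the table would need is to check every replacement against it before conversion. The Python sketch below only illustrates that check; the table and the replacements are made up, and `unmapped_characters` is a hypothetical helper. It covers only missing entries, not whether the table is one-to-one.

```python
def unmapped_characters(replacements, table):
    """Return the sorted list of characters used in any replacement
    that the ASCII-to-braille table cannot translate."""
    return sorted({ch for text in replacements for ch in text if ch not in table})


# Hypothetical, deliberately incomplete table: ASCII character -> Unicode braille cell.
table = {"@": "\u2808", "&": "\u282f", ".": "\u2828"}

# Hypothetical replacements taken from a symbols file.
replacements = ["@&", ".0", "_s"]

missing = unmapped_characters(replacements, table)
```

An empty `missing` list would mean every replacement can be translated to Unicode braille and passed through PEF with this table.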

@dkager
Contributor Author

dkager commented May 3, 2016

I discussed this with the product manager. Our decision is not to implement the symbols list in phase 3, making it out of scope for the project.

@dkager dkager closed this as completed May 3, 2016
@bertfrees
Member

I'm reopening this issue because we're using the tracker now for the backlog of issues that need to be fixed in the project follow-up.

@bertfrees bertfrees reopened this Nov 28, 2016
@bertfrees
Member

According to Arjan the symbols list is volume-level, not book-level like Davy said above.

In our current conversion we use a symbols list to convert certain characters. We also put this character in each volume in a paragraph "Symbols list" as an additional declaration. Take as an example the ampersand (&) sign.

@dkager
Contributor Author

dkager commented Nov 29, 2016

> According to Arjan the symbols list is volume-level, not book-level like Davy said above.

Tests with the current conversion software show that this depends on the book type (RO or SV).
