TTS lexicon #753
Unanswered
torchtrust
asked this question in
Q&A
Replies: 2 comments
I think what's going on is that the word detection does not recognize "Ps." as an abbreviation. The "." is treated as punctuation and falls outside of the word token. This is why "Ps." is not matched by the lexicon if you include the "." in the grapheme:
<lexeme>
<grapheme regex="true" positive-lookahead="[ ]+[0-9]">Ps.</grapheme>
<alias>Psalm</alias>
</lexeme>
I don't think this can be solved at the lexicon level if the word detection is wrong. So we either
Thanks Bert,
There are also plenty of other abbreviations such as:
Vol.
p.
pp.
c.
d.
No.
So for now the only solution I can think of is to pre-process the DTBook,
which we could do in my system. A solution within the DAISY Pipeline
would be ideal, but that would need another way around this issue.
Not being a linguist, I don't know whether this is limited to
English.
Thanks
Paul
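The pre-processing idea above could be sketched roughly as follows. This is only an illustration in Python, not part of DAISY Pipeline: the `ABBREVIATIONS` table, its expansions, and the function names are assumptions made for the example, and it works on plain strings, so applying it to the text nodes of a DTBook file is left out. It expands an abbreviation and drops its full stop only when whitespace and a digit follow, mirroring the lookahead used in the lexicon entries in this thread.

```python
import re

# Hypothetical abbreviation table; the expansions are illustrative assumptions.
ABBREVIATIONS = {
    "Ps": "Psalm",
    "Vol": "Volume",
    "No": "Number",
    "pp": "pages",
    "p": "page",
}


def build_pattern(abbreviations):
    """Compile one regex that matches any listed abbreviation plus its full stop."""
    # Longest first, so "pp" is tried before "p".
    alternatives = "|".join(
        re.escape(abbr) for abbr in sorted(abbreviations, key=len, reverse=True)
    )
    # \b keeps "p." from matching inside a word such as "Stop.".
    # The lookahead requires spaces and a digit but does not consume them.
    return re.compile(r"\b(" + alternatives + r")\.(?=[ ]+[0-9])")


def expand_abbreviations(text, abbreviations=ABBREVIATIONS):
    """Replace 'Ps. 23' style abbreviations with their expansion, without the full stop."""
    pattern = build_pattern(abbreviations)
    return pattern.sub(lambda match: abbreviations[match.group(1)], text)


if __name__ == "__main__":
    print(expand_abbreviations("See Ps. 23 and Vol. 2, pp. 10"))
```

Because the full stop is part of the match rather than the lookahead, it is removed together with the abbreviation, so the TTS engine never sees it as sentence-final punctuation.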
I am moving on to the lexicon now that we can produce a whole DTBook with CSS added. Many of our books contain abbreviations for books of the Bible, and I have worked out a lookahead that includes the full stop (period), e.g. Ps. for Psalm:
<lexeme> <grapheme positive-lookahead="(\.)?[ ]+[0-9]">Ps</grapheme> <alias>Psalm</alias> </lexeme>
But I want to get rid of the full stop (period), because the TTS pauses too long: it thinks it has reached the end of a sentence!
Putting a full stop in the grapheme doesn't work even with a backslash.
Any ideas?
thanks
Paul
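To illustrate why the full stop survives, here is a small sketch using Python's `re` module. It assumes that the lexicon's `positive-lookahead` attribute behaves like an ordinary regex lookahead, which is an assumption about the pipeline and not something confirmed in this thread; the function name `apply_alias` is made up for the example. A lookahead only checks what follows and consumes nothing, so only "Ps" is replaced and the "." stays in the text.

```python
import re

# Assumed Python equivalent of grapheme "Ps" with
# positive-lookahead "(\.)?[ ]+[0-9]" from the lexicon entry above.
PATTERN = re.compile(r"Ps(?=(\.)?[ ]+[0-9])")


def apply_alias(text):
    """Replace the matched grapheme with the alias; the lookahead text is left untouched."""
    return PATTERN.sub("Psalm", text)


if __name__ == "__main__":
    print(apply_alias("Ps. 23"))  # prints "Psalm. 23": the full stop is still there
    print(apply_alias("Ps 23"))   # prints "Psalm 23"
```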