Transcriber notes containing brackets #195
Comments
I thought about this when creating the CZ data. I did not pay much attention to the Parla-CLARIN recommendation on this point. I decided to keep the brackets for this reason:
so I am not sure whether the brackets should be removed (or added where they are missing?)
OK, so we have two conflicting requirements:
I had a look at the similar example of the q element, and TEI is (of course!) agnostic on whether to keep the quotation marks, although it does advise that, if the quotation marks are not kept, their original form be kept in the … Note also that if 2. is taken, we have, as you note, further choices:
So, I would either delete them, or delete them but reintroduce some common brackets. The second might indeed be preferable, but it does mean that maybe the Parla-CLARIN and definitely the ParlaMint guidelines would need to be changed, and this implemented in v2tov3 and probably the validation script.
@matyaskopp, as we are closing issues, maybe we should now also decide how to treat these brackets. My suggestion (delete + re-introduce) is above. What do you think?
@TomazErjavec, I agree with delete + reintroduce, but I am not sure where it should be implemented:
I would vote for finalization, as there we can also catch and correct errors from the new partners.
OK, agreed.
Transcriber notes are now, after finalization, bracketless. So, closing.
In the source documents, transcriber notes are often enclosed in brackets or similar marks, which also serve to identify the notes. The (admittedly implicit) assumption in the Parla-CLARIN as well as the ParlaMint recommendations is that these marks are not retained in the marked-up TEI document. This seems to make sense, as they are mark-up baggage from the source document and only make the actual content of the notes more opaque.
However, many corpora, e.g. CZ, have retained these markers:
<note type="comment">(otevřením?)</note>
I propose that they be deleted. Rather than making an issue for every corpus that has them, this could be one of the v2tov3 script functions and recorded in #183.
@matyaskopp, would you agree?
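To illustrate the proposed deletion, here is a minimal Python sketch. It is not the project's actual v2tov3 or finalization code; the function names `strip_enclosing_brackets` and `strip_note_brackets` and the set of bracket pairs are assumptions made for this example. It removes a single pair of brackets only when that pair wraps the entire note text, so a note such as "(a) b (c)" is left alone.

```python
import xml.etree.ElementTree as ET

# Assumed bracket pairs; the thread does not list which marks occur in the corpora.
BRACKET_PAIRS = {"(": ")", "[": "]", "{": "}"}


def strip_enclosing_brackets(text: str) -> str:
    """Remove one bracket pair if, and only if, it wraps the whole text."""
    stripped = text.strip()
    if len(stripped) < 2:
        return text
    open_ch, close_ch = stripped[0], stripped[-1]
    if BRACKET_PAIRS.get(open_ch) != close_ch:
        return text
    # Check that the bracket opened at position 0 closes at the very end.
    depth = 0
    for i, ch in enumerate(stripped):
        if ch == open_ch:
            depth += 1
        elif ch == close_ch:
            depth -= 1
        if depth == 0:
            if i == len(stripped) - 1:
                return stripped[1:-1].strip()
            return text  # closes early, e.g. "(a) b (c)"
    return text  # unbalanced; leave untouched


def strip_note_brackets(root: ET.Element) -> None:
    """Strip enclosing brackets from text-only <note> elements, in place."""
    for el in root.iter():
        if not isinstance(el.tag, str):
            continue
        local_name = el.tag.rsplit("}", 1)[-1]  # works with or without a namespace
        if local_name == "note" and len(el) == 0 and el.text:
            el.text = strip_enclosing_brackets(el.text)


doc = ET.fromstring('<seg><note type="comment">(otevřením?)</note></seg>')
strip_note_brackets(doc)
print(doc.find("note").text)  # otevřením?
```

Notes with child elements are skipped here on purpose, since their brackets may be split across text and tail nodes and would need separate handling.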