Transcriber
Maarten Janssen edited this page Aug 23, 2020 · 3 revisions
Transcriber (.trs) is the file format used by the Transcriber tool. It is an XML-based format for transcribing spoken data that encodes some metadata, speaker turns, and their alignment to an audio/video file.
trs2teitok.pl
Command line options of the tool:
- debug: debugging mode
- output: name of the output file; if empty, output goes to STDOUT
- morerev: More revision statements
- file: name of the input file
The script converts Episode, Section, and Turn elements to ab, ug, and u respectively (where ug is an utterance group, which is modeled after lg and similar grouping elements but does not actually exist in TEI). Within each Turn, it converts the strings between Sync elements to tokens.
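The mapping described above can be illustrated with a minimal Python sketch. This is not the Perl code of trs2teitok.pl, only an approximation of the conversion as described here. The names `TAG_MAP`, `convert_turn`, `convert`, and `trs_to_tei` are hypothetical, and the output element `tok` with its `start` attribute, as well as the `who` attribute on `u`, are assumptions about what the result might look like.

```python
import xml.etree.ElementTree as ET

# Element mapping as described in the text: Episode -> ab, Section -> ug, Turn -> u
TAG_MAP = {"Episode": "ab", "Section": "ug", "Turn": "u"}


def convert_turn(turn):
    """Turn one Transcriber Turn into a u element with one token per Sync segment."""
    u = ET.Element("u")
    if turn.get("speaker"):
        u.set("who", turn.get("speaker"))  # attribute name is an assumption
    for sync in turn.iter("Sync"):
        # The text following a Sync element is stored in its tail
        text = (sync.tail or "").strip()
        if text:
            tok = ET.SubElement(u, "tok")  # element name is an assumption
            tok.set("start", sync.get("time", ""))
            tok.text = text
    return u


def convert(node):
    """Recursively convert Episode and Section containers; delegate Turns."""
    if node.tag == "Turn":
        return convert_turn(node)
    out = ET.Element(TAG_MAP[node.tag])
    for child in node:
        if child.tag in TAG_MAP:
            out.append(convert(child))
    return out


def trs_to_tei(trs_string):
    """Convert the Episode of a Transcriber document given as a string."""
    root = ET.fromstring(trs_string)
    return convert(root.find("Episode"))


SAMPLE = """<Trans>
  <Episode>
    <Section type="report" startTime="0" endTime="2.0">
      <Turn speaker="spk1" startTime="0" endTime="2.0">
        <Sync time="0"/>
        hello
        <Sync time="0.8"/>
        world
      </Turn>
    </Section>
  </Episode>
</Trans>"""

if __name__ == "__main__":
    print(ET.tostring(trs_to_tei(SAMPLE), encoding="unicode"))
```

The sketch ignores everything a real .trs file contains besides these four elements (speaker lists, events, comments), so text following such elements inside a Turn would be dropped.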
The format does not specify what the synchronisation elements synchronise; the script currently assumes that they mark word boundaries, but that assumption will not always be correct.
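One way to see whether the word-level assumption holds for a given file is to look for Sync segments that contain more than one whitespace-separated word. The following Python sketch does that; it is not part of trs2teitok.pl, and the function name `multiword_segments` and the sample data are hypothetical.

```python
import xml.etree.ElementTree as ET


def multiword_segments(trs_string):
    """Return (time, text) for each Sync segment holding more than one word."""
    root = ET.fromstring(trs_string)
    flagged = []
    for sync in root.iter("Sync"):
        # The text following a Sync element is stored in its tail
        words = (sync.tail or "").split()
        if len(words) > 1:
            flagged.append((sync.get("time"), " ".join(words)))
    return flagged


# Hypothetical file in which Sync marks phrases rather than words
PHRASE_LEVEL = """<Trans><Episode><Section><Turn speaker="spk1">
<Sync time="0"/>good morning everyone
<Sync time="1.5"/>welcome
</Turn></Section></Episode></Trans>"""

if __name__ == "__main__":
    for time, text in multiword_segments(PHRASE_LEVEL):
        print(f"Sync at {time} spans several words: {text}")
```

If this reports many segments, the Sync elements in the file are probably aligned to phrases or utterances, and each resulting token would hold a multi-word string rather than a single word.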
No export to the Transcriber format has been provided yet.