GPT is very prone to merging lines, at least in 3.5 - it took quite a few iterations to arrive at the current prompt, which more or less eliminates desyncs. Feeding it lines with a clear indication of where it should fill in the translation helps to keep it on track (it just has to fill in the blanks). Validation/retry could probably fix desyncs even with a looser format... but it more than doubles the token count for the batch, since the whole message chain has to be resent, so it is unlikely to be a net win on that front! :-) My long-term goal is to allow GPT to merge lines when that helps it produce a more fluent translation, then fix up the timings afterwards... it doesn't seem able to do that itself, unfortunately - GPT-4 might be able to, but it's so much more expensive that I haven't experimented with it much.
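A minimal sketch of the "fill in the blanks" idea described above - numbered slots with an explicit marker where the translation goes, plus a cheap count check that catches merged or dropped lines before accepting a batch. The marker names are hypothetical, not the actual gpt-subtrans prompt:

```python
import re

def build_prompt(lines):
    # One numbered slot per subtitle line; the model is asked to fill in
    # the blank after each "Translation>" marker. (Hypothetical format.)
    parts = []
    for i, text in enumerate(lines, start=1):
        parts.append(f"#{i}\nOriginal>\n{text}\nTranslation>")
    return "\n\n".join(parts)

def count_slots(response):
    # A desync shows up as a missing or merged slot: the number of
    # numbered markers in the reply no longer matches the batch size.
    return len(re.findall(r"^#\d+$", response, re.M))
```

If `count_slots(reply) != len(lines)`, the batch can be rejected and retried - the validation/retry path mentioned above.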
I got this idea from Subtitle Edit's "Auto-translate via copy-paste" function, which processes the .srt file so that it ends up like this:
Then you can just toss it into a translator like DeepL and get:
and the software maps the translation back to the original timestamps since it's a 1-1 mapping separated by each asterisk.
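The round trip can be sketched like this (a toy illustration of the asterisk-separated copy-paste idea, not Subtitle Edit's actual implementation):

```python
def srt_to_blocks(entries):
    # entries: list of (timestamp, text) pairs. Emit only the text, with
    # an asterisk line between entries, so order is the implicit key.
    return "\n*\n".join(text for _, text in entries)

def blocks_to_srt(entries, translated):
    # Split the translated blob on the asterisk separators and zip it
    # back onto the original timestamps (relies on a strict 1-1 mapping).
    parts = [p.strip() for p in translated.split("\n*\n")]
    if len(parts) != len(entries):
        raise ValueError("desync: translator merged or dropped a block")
    return [(ts, new) for (ts, _), new in zip(entries, parts)]
```

If the translator merges text across an asterisk, the split yields fewer parts than entries and the mapping fails - which is exactly the desync problem.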
I've been experimenting with this approach in ChatGPT. The translation is often flawless, but the problem is that it often combines lines across the asterisks, which desyncs the mapping back. Could subtrans's functionality - batching, validation, re-translation - help enough that this becomes a non-problem?
If this works, token consumption would be greatly reduced.