Skip to content
machinewrapped edited this page Dec 10, 2023 · 19 revisions

Welcome to the gpt-subtrans wiki!

How to use gpt-subtrans

The easiest and most flexible way to use gpt-subtrans is with the GUI, gui-subtrans. See the readme for installation and setup instructions.

How does it work?

gpt-subtrans takes subtitles in a source language, divides them up into batches (grouped into scenes) and politely asks ChatGPT to translate them to the target language.

It then attempts to extract the translated lines from ChatGPT's response and map them to the source file. Basic validation is applied to the translation to check for some of the errors ChatGPT tends to introduce, and if any are found a reply is sent noting the issues and asking ChatGPT to try again. Nine times out of ten it is able to correct its mistakes when they are pointed out.

Each batch is treated as a new conversation, with some details condensed from preceding batches to try to help ChatGPT understand the context of the lines it is translating. Additional context for the translation can be provided via project options, e.g. a short synopsis of the film and names of major characters.

The translation requests are constructed from the file instructions.txt (unless overridden), they have been tweaked to try to minimise the chances of ChatGPT desyncing or going off the rails and inventing its own plot. The prompts used for the translation are quite rigid, using an xml-like syntax that helps ChatGPT understand the request and provide a structured response that can be parsed for the translations. There is sure to be further room for improvement.

How good is it?

Whilst ChatGPT's ability to understand context marks a significant advance over other machine-translation tools, it is still far short of what a professional translator would produce. The intention is not to replace translators but to open up accessibility to films and shows that have been denied a professional translation. The goal is that the result should be good enough to allow you to follow the film, but it is likely many manual corrections will be needed if you want to share the translation more widely.

Apart from anything else, Chat GPT does not actually watch the video so it is wholly reliant on the quality of the source subtitles, and will still make mistakes about who is speaking or misinterpret a remark that needs visual context.

I recommend passing the output through Subtitle Edit's "Fix Common Errors" function, which can automatically fix things like line breaks and short durations to make the subtitles more readable.

How long does it take?

The answer is a very heavy "it depends". Firstly, if you are using a free/trial OpenAI account requests are severely restricted (three per minute at the time of writing, it seems to keep going down). Each batch of subtitles is one request, as are any retranslations or retries, so it's probably going to take hours to translate an average movie.

Paid accounts are much less restricted, though for the first 48 hours after signing up the rate limit is still somewhat restricted (after that there is a limit, but you are very unlikely to hit it with gpt-subtrans).

For users on a paid account the limiting factor will be how long ChatGPT takes to respond, which varies considerably depending on load and the size of a request. It might still only manage a few batches a minute, but whilst batches in a scene are always processed in sequence, multiple scenes can be processed in parallel (this can currently only be done in the GUI).

How much does it cost?

Cost depends on:

  • How many lines of subtitles?
  • How many batch requests?
  • Which model is used as the translator?

For an average movie of up to 1,000 lines of dialogue, translation with gpt-3.5-turbo should cost between $0.20 and $0.50. With GPT4 the cost is approximately ten times higher.

Is it safe?

All subtitles need to be sent to OpenAI's servers for processing, so don't use the app to translate anything you wouldn't want to be used in future GPT training data.

I make absolutely no guarantees about the accuracy or suitability of the translations it produces, and I can't guarantee that the app won't do something catastrophically stupid with your data so please make sure you have backups or work on a copy you can afford to lose. This is not a commercial product and does not come with any sort of warranty or guarantees.

Which model should I use?

For most users the gpt-3.5-turbo-16k model is probably the best choice, as it is fast and cheap and supports a larger batch size (around 100 lines for average movie dialogue).

The cost per token is somewhat higher than the smaller models, but since it can handle much larger batches it can generally translate the whole file with fewer requests, reducing the overhead of the instruction prompt sent for each request. If the source material is naturally composed of small batches it will end up costing more, and the lower per-token cost of the original gpt-3.5-turbo model may be more economical.

The application also supports the gpt-instruct model, which is fine-tuned for following instructions rather than conversation. In theory this could produce better translations, in practise the difference is hard to qualify - some lines come out better, some come out worse. Since it only supports the smaller 4k token window and has a higher per-token cost, I don't particularly recommend using it.

gpt-4 is better at understanding and summarizing content and has a better grasp of nuance and tone, so it may produce better translations of some material. In most cases the difference is likely to be minimal, so at ten times the cost it is probably better to save this model for selective retranslations when gpt-3.5-turbo is confused.

Instructions

The instructions tell GPT what its task is and how it should approach the task. The instructions are sent as a system message... you've probably heard of prompt engineering - this is it.

Several sets of instructions are provided, but the default instructions.txt should work for most use cases.

If you have specific needs or the source material has special characteristics, consider editing the instructions to guide GPT. The instructions can be edited on a per-project basis, but if you have multiple similar sets of subtitles to translate consider saving them as instructions (something).txt in the project root (the same location as instructions.txt) and they will be available to select in the new project options.

Keep in mind that LLMs work best when they have an example of how they should respond.

A couple of examples of custom instructions are included, e.g. "improve quality" that can be used to fix spelling and grammatical errors without changing the language (it's important to modify the prompt as well in this case).

Prompt

This is the instruction given to GPT at the start of each batch. The default prompt should work for most cases:

  • Please translate these subtitles[ for movie][ to language].

The for movie and to language tags are automatically replaced with the name and language specified for the project.

What do the other options do?

Scene threshold

Subtitles will be divided into scenes whenever a gap longer than this threshold is encountered, regardless of the number of lines. An appropriate threshold depends on the source material, as a rule of thumb aim for a threshold that produces around 10 scenes for a feature film as this will maximize the context available to later batches.

Maximum batch size

It is important to set a maximum batch size that is suitable for the model. There isn't a hard answer for the maximum size as it depends on the content of the subtitles, but empirically speaking these batch sizes work well:

gpt-3.5-turbo : 30 lines gpt-3.5-turbo-16k : 100 lines gpt-4 : 60 lines

Minimum batch size

This is less important as batches over the maximum size are divided at the longest gap to encourage coherence, but specifying a minimum size may prevent an excessive number of small batches if the source has sections of sparse dialogue. Around 8-10 lines is probably a good balance.

Allow retranslations

GPT can be unpredictable, and sometimes returns a response that can't be processed or that fails to complete the task. Some simple validations are performed on the response, and if any of them fail the batch will be retried if this setting is enabled. It is strongly advised.

Enforce line parity

This validates that the number of subtitles in the translation matches the number of lines in the source text. GPT has a tendency to merge lines and desync, so this check will catch that and trigger a retranslation request. When the problem is pointed out GPT can usually fix it, so it is strongly advised to enable this check.

Max characters

Any translated line that exceeds this number of characters will cause the batch to be sent back to GPT for retranslation. There is no guarantee that the result will respect the limit though, and if the source material has a lot of long lines it could just result in a lot of pointless retries, so consider raising the value to something that suggests a total failure (e.g. 250 characters) if you see this happening in the logs a lot.

Max newlines

More than 1 or 2 newlines generally indicates that the response was not in the correct format and multiple lines have been grouped together, so the batch will be retried. If the source material has more newlines it might be fine for the translation to follow suit though, so raise if necessary.

Max threads

It is usually best to process a translation sequentially (the "Play" button on the toolbar) since this means a summary of the previous scene will be available to pass as context when translating the next scene. If you're in a hurry and are not too worried about translation accuracy you can process multiple scenes in parallel (the "Fast Forward" button). 3 or 4 parallel translations is usually enough, there will probably be one or two long scenes that are the bottleneck and one or two extra threads will be able to finish the rest before they are completed.

How can I help?

Report any issues here on github and I'll do my best to investigate them. Even better, fix them yourself and create a pull request!

If there are any features or changes that would make the app easier to use or more effective the same logic applies.

Spread the word if you find gpt-subtrans helpful, or at least unhelpful in interesting ways.

Clone this wiki locally