
Shift postprocessing

Victor Efimov edited this page Oct 3, 2015 · 8 revisions

Postprocessing in Sushi works on chapter groups. These groups are either formed from the actual chapters or created automatically by grouping events with similar shifts. Grouping can be disabled with the --no-grouping switch, but doing so also disables all postprocessing except keyframe processing.

Border fixing

It is quite common to get incorrect results for events located very close to the audio start/end or near the chapter borders. Sushi tries to handle this case by detecting the lines with unusually high difference and linking them to the first valid line. There's no way to control this behavior.

Smoothing

Consider the following shifts:

1.0  1.0  1.0  1.0  5.0  1.0  1.0  1.0  1.0  1.0

It is very likely that the 5.0 value is wrong and should actually be 1.0. Sushi tries to remove outliers like this by applying a running median to the group shifts. You can control the radius of this median using the --smooth-radius parameter. Do note that this parameter sets the radius of the median and not its diameter. This means that the default value of 3 will result in every shift being replaced by the median of the 7 nearest values (including the current one).
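The following is a minimal sketch of a running median with a radius, not Sushi's actual code. The function name running_median is hypothetical, and shrinking the window at the edges of the list is an assumption, since this page does not say how Sushi treats the first and last few shifts.

```python
from statistics import median


def running_median(values, radius):
    """Replace each value with the median of up to 2 * radius + 1 neighbours."""
    smoothed = []
    for i in range(len(values)):
        # Assumption: the window is simply truncated at the list edges.
        lo = max(0, i - radius)
        hi = min(len(values), i + radius + 1)
        smoothed.append(median(values[lo:hi]))
    return smoothed


shifts = [1.0, 1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0]
# With radius 3 the window holds up to 7 values, so the lone 5.0 is outvoted
# and every shift becomes 1.0.
print(running_median(shifts, radius=3))
```

A larger radius removes longer runs of bad values but also blurs genuine changes in shift inside a group.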

Group averaging

The whole grouping idea comes from the simple fact that there are often distinct groups in the video. Common examples are "Part A", "Opening", "Ending" and so on. It is very likely that all events inside each group will share a single common shift. Indeed, it's hard to imagine an opening with half of its lines shifted forward by a second and the other half shifted backwards by two seconds.

Sushi uses this fact and averages the shift across all events in every group. The average is weighted by the similarity of the corresponding audio segments in the source and destination files, so that segments with a lower difference have a stronger impact on the resulting group shift.
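As a rough illustration, a weighted average of this kind could look like the sketch below. It assumes each event carries a difference score in the range 0 to 1 and that the weight is 1 minus that score; both are assumptions made for the example, and weighted_group_shift is a hypothetical name.

```python
def weighted_group_shift(shifts, diffs):
    """Average the shifts, giving low-difference events more influence."""
    # Assumption: diffs are normalised to [0, 1], where 0 is a perfect match.
    weights = [1.0 - d for d in diffs]
    total = sum(weights)
    if total == 0:
        # Every event matched equally badly: fall back to a plain mean.
        return sum(shifts) / len(shifts)
    return sum(s * w for s, w in zip(shifts, weights)) / total


# The third event matched poorly (diff 1.0), so its shift is ignored.
print(weighted_group_shift([1.0, 1.0, 2.0], [0.0, 0.0, 1.0]))
```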

Applying a single shift to every event in a group ensures that this stage does not introduce overlaps or blinking lines in the output, provided the groups themselves don't overlap with each other after shifting. Sushi does not perform any special checking for these cases.

Keyframes

Keyframe processing is the only step in Sushi that can change the duration of a line and hence produce overlaps. It runs only when all of the following hold: both --src-keyframes and --dst-keyframes are provided; Sushi can get the FPS of both files (either by extracting the timecodes automatically or from the --src-fps, --dst-fps, --src-timecodes and --dst-timecodes arguments); and --max-kf-distance is positive.

Sushi can make these keyframes automatically using the SCXvid-standalone app. There are two special values for the keyframes arguments. --dst-keyframes auto will first look for keyframes already made by Sushi and, if it finds them, use them (even if they don't match the video!); otherwise it makes the keyframes as usual. --dst-keyframes make skips the search step and always re-creates the keyframes. No default value is provided, so you probably want to set it to auto yourself.

There are two algorithms: keyframe correction and keyframe snapping. You can control which of them is performed using the --kf-mode argument. The default value, all, means that both are used.

Do note that the goal of keyframe processing is to make the script look like the source and not to improve keyframe snapping. This is done by taking into account the distance between source events and corresponding keyframes. For example, if the event was 2 frames away from the closest keyframe in the source, it will also be 2 frames away in the output (if everything goes well).

Keyframe correction

The first pass is keyframe correction, or shifting. During this step Sushi calculates the distance from every search group to the nearest keyframes that are no farther than --max-kf-distance from its start/end time. These distances are then (linearly) interpolated across all search groups.
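To make the interpolation idea concrete, here is a small sketch under stated assumptions. Groups with no keyframe within range are represented as None, interpolation runs over the group index rather than over time, and values outside the known range are held constant. None of these details is confirmed by this page, and interpolate_missing is a hypothetical helper.

```python
def interpolate_missing(values):
    """Fill None entries by linear interpolation between known neighbours."""
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        # No group had a keyframe nearby: no correction at all.
        return [0.0] * len(values)
    result = list(values)
    # Assumption: before the first and after the last known value, hold it.
    for i in range(known[0]):
        result[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):
        result[i] = values[known[-1]]
    # Between two known values, interpolate linearly over the group index.
    for a, b in zip(known, known[1:]):
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            result[i] = values[a] + t * (values[b] - values[a])
    return [float(v) for v in result]


# Distances for five groups; three had no keyframe within range.
print(interpolate_missing([None, 0.0, None, 4.0, None]))
```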

After the interpolation each group has two correction shifts: one at the start time and one at the end. If the group contains a single event, both shifts are applied as is, possibly changing the event duration. If the group contains multiple events, only one of the two shifts is applied: the one closest to the mean shift across all events. Sushi will warn you if any typesetting groups have different shifts at their start and end points because it usually means they're broken.

This step might introduce unwanted effects like disappearing lead-ins, or lead to worse timing in general, but it's vital when there are audio synchronization problems, framerate differences and so on. This step is enabled using the --kf-mode shift argument.

Keyframe snapping

Works exactly like you'd expect it to. If start/end time is closer than --max-kf-distance to the nearest keyframe, it gets snapped to that keyframe. Search groups with more than one event are not snapped.

This step is enabled using the --kf-mode snap argument.
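The snapping rule can be sketched as follows. This is an illustration only: it works on frame numbers, expects a sorted list of keyframes, and uses a strict "closer than" comparison as worded above. The name snap_to_keyframe and these choices are assumptions, not Sushi's internals.

```python
from bisect import bisect_left


def snap_to_keyframe(frame, keyframes, max_distance):
    """Return the nearest keyframe if it is closer than max_distance."""
    if not keyframes:
        return frame
    # keyframes must be sorted; only the two neighbours can be nearest.
    i = bisect_left(keyframes, frame)
    candidates = keyframes[max(0, i - 1):i + 1]
    nearest = min(candidates, key=lambda k: abs(k - frame))
    return nearest if abs(nearest - frame) < max_distance else frame


keyframes = [0, 100, 250]
print(snap_to_keyframe(101, keyframes, max_distance=2))  # snapped to 100
print(snap_to_keyframe(110, keyframes, max_distance=2))  # stays at 110
```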
