Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP][RFC] Ordered-chapters walkthrough/guide #54

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

OrangeChannel
Copy link
Collaborator

@OrangeChannel OrangeChannel commented Feb 15, 2020

This is still unfinished, but figured might as well post this here for more visibility for people not in the Discord.

TODO:

  • completely re-write the whole thing with the libbluray advice from thebombzen 😞

  • fix up the TODOs in the md

  • explain editions

  • use editions to swap between OP/ED and NCOP/NCED

  • explain how to create a playall file

TODO in actual script:

  • Allow handling of both audio and video while trimming/splicing
  • Write chapter timestamps to valid XML
  • Find a better way to support VFR input clips (probably should require timecodes file with fallback on acsuite's implementation)
  • Re-write tests to cover all new functionality
  • Check that multiple OP/ED work as expected (should but just incase because this isn't in the current tests)

Blockers:

  • mpv # 7963 about playall files not loading attachments properly
  • vapoursynth audio support branch for new frames method for VFR clips and possibly some audio stuff

Copy link
Member

@FichteFoll FichteFoll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did an initial review that took me much more time than I hoped to spend on it.

I generally miss a short paragraph or at least a TL;DR on why one would even want to use Ordered Chapters. It is very important to provide context to decisions and not just explain how to apply them.

as they allow for multiple subtitles tracks and chapters
which are a somewhat unique feature to the container.
The MPEG-4 Part 14 or MP4 is another container format
but it is limited in its usefulness for fansubbing.
Copy link
Member

@FichteFoll FichteFoll Feb 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Imo we have enough time to explain why that is the case, unless this is already mentioned somewhere else in the guide and we can just refer to it.

encoding/mkv.md Outdated Show resolved Hide resolved
encoding/mkv.md Outdated Show resolved Hide resolved
encoding/mkv.md Outdated Show resolved Hide resolved
encoding/mkv.md Outdated

## Extracting the audio

If the source files you are working with are not .m2ts,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
If the source files you are working with are not .m2ts,
If the source files you are working with are not `.m2ts`,


This can be done with the MKVToolNix GUI,
in which case the order of the tracks in the bottom list is important.
Although mpv will only warn that tracks are mismatched,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Although mpv will only warn that tracks are mismatched,
Although mpv only emits a warning when tracks are mismatched,

This can be done with the MKVToolNix GUI,
in which case the order of the tracks in the bottom list is important.
Although mpv will only warn that tracks are mismatched,
it is a good idea to keep the same track order in all of the mkv's.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty sure this is not how you pluralize abbreviations, but someone more knowledgeable should verify this.


for n, a in zip(videos, audios):
args = shlex.split('mkvmerge -o {0}.mkv {0}.264 {1} {0}.ass'.format(n, a))
subprocess.run(args)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as discussed in discord. 😉 Build a list and run that. Also, the internal variable name for the first parameter is cmd and that better fits what we're building here.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to be

def run(args: Union[bytes, str, Sequence[Union[bytes, str, PathLike]]], ...

so maybe what you're thinking of is some other library or an older version of subprocess?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I was confusing things, then. I still find "args" to be a weird choice because it obviously contains the executable as the first element, but my argument regarding the internal name seems to be debunked.


Using a small [script to change frame numbers into timestamps][f2ts.py],
we find that 2159 frames translates to `00:01:30.048291667`.
This can also be accomplished with this following line in a Python instance.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "Python instance" refer to exaclty?

This can also be accomplished with this following line in a Python instance.

```py
for _ in input().split():print('%02d:%02d:%09f'%((m:=(s:=int(_)/23.976)//60)//60,m%60,s%60))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should never use _ as a variable name because that has special meaning in interactive sessions and is used as a "drop this" convention in scripts.

@OrangeChannel
Copy link
Collaborator Author

Did an initial review that took me much more time than I hoped to spend on it.

I generally miss a short paragraph or at least a TL;DR on why one would even want to use Ordered Chapters. It is very important to provide context to decisions and not just explain how to apply them.

@FichteFoll


Ordered-chapters are a special feature of mkv files that allows
the player to reference other mkv files in the same folder
to create a virtual combined timeline.

The example use cases we will be looking at starts with the ability
to cut out the OP/ED from each episode in a series
(assuming the OP/ED stays consistent throughout the show).
This comes with at least four benefits that are unachievable
using normal chapters:

  1. We avoid having to encode the same OP/ED sequence for each episode
    saving us encoding time.
  2. We can use different audio/video settings for the OP/EDs
    than for the main episodes.
    • Using FLAC for the music and a more compressible lossy codec
      for the main episode audio tracks will save space and satisfy audiophiles
      that want lossless music.
    • Using a much lower CRF value for the OP/ED will still likely save space
      as we are only encoding each once.
  3. We can create a 'playall' file that creates a virtual timeline of
    all of the main episodes, skipping previews and
    only playing the OP/ED once over the entire timeline.
  4. Using editions, we can allow the viewer to switch between the no-credit (NC)
    versions of the OP/ED and the normal ones.

You think this is too long?

@OrangeChannel
Copy link
Collaborator Author

OrangeChannel commented Feb 17, 2020

@FichteFoll Regarding noteboxes, I can't find a plugin so I made a new class with some basic CSS (the dashed border is because I can't decide on colors for each theme right now).
image
image
image
The last (white theme) pic shows 100% width forced. In theory, it would be slightly different color than the background and there'd only be the border on the left. It uses the font-awesome font that we seem to already be loading on the website for other things.

EDIT: here's a rough mobile design
image

@OrangeChannel
Copy link
Collaborator Author

OrangeChannel commented Feb 17, 2020

Just for reference:

.infobox {
    display: inline-flex;
    border-left: .4em solid;
    border-top: .1em dashed;
    border-bottom: .1em dashed;
    border-bottom-left-radius: .2em;
    border-top-left-radius: .2em;
    border-bottom-right-radius: .2em;
    border-top-right-radius: .2em;
    border-color: inherit;
    opacity: 80%;
    border-right: .1em dashed;
    padding-left: 0;
    padding-top: .5em;
    padding-bottom: .5em;
    padding-right: .5em;
    margin-top: -.25em;
    margin-bottom: -.25em;
    background: 0 0;
    width: 100%
}

.infobox p {
    padding-top: .5em;
    font-size: .85em;
    font-style: italic;
    margin-bottom: 0;
    padding-bottom: .5em
}

.infobox::before {
    content: "";
    display: inline-flex;
    font: 14px/1 FontAwesome;
    font-size: 1.75em;
    margin: 0 .5em 0 .8em;
    align-self: center
}
<div class="infobox"><p>
If you are following along with the next few steps,
you don't need to extract the audio just yet.
The audio-cutting script will apply this FFmpeg command to the files
automatically.
</p></div>

@FichteFoll
Copy link
Member

You think this is too long?

No, I think that's great.

Info box also looks good, except for entire text being in italics and noticeably smaller font size. I would definitely prefer using the normal font size or at least larger than that and I'm not a fan of using italics like this because it makes the strokes thin and harder to read with no added benefit. We already visually mark the entire box, so we wouldn't need to do it for the text as well.


For the sake of reviweing the PR and general workflow, please don't force-push already reviewed commits, so I can focus on the changes after than and don't have to rely on Github providing direct comparisons with the lost head.

I'll review the changes later.

@OrangeChannel
Copy link
Collaborator Author

OrangeChannel commented Feb 17, 2020

You might've missed the conversation I had in Discord with thebombzen but pretty much this is getting re-re-written from scratch (it was in the checklist at the top of the PR before you reviewed) so a lot of the stuff you reviewed is probably going to become irrelevant as I continue working on this. But if it makes it cleaer I'll just commit on top of the previous for now on.

Edit: I'm going to send the notebox changes in another CSS-focused PR once I fix it up. Will mention its formatting in the CONTRIBUTING doc.

@FichteFoll
Copy link
Member

All right, I'll hold off until you give me a clear sign it's ready for re-review.

@FichteFoll FichteFoll added the content We can do better! label Mar 2, 2020
@averms
Copy link
Contributor

averms commented Jun 4, 2020

Please don't tell people how to use ordered chapters :(

@stuxcrystal
Copy link
Contributor

What's the status here?

@OrangeChannel
Copy link
Collaborator Author

Have to clean up the code of the ordered-chapters automation script a bit before I get to this, but I should probably be able to get around to this next week. Kind of forgot about this after I became too lazy to re-write with the BDMV info, and then got distracted by a million other things 😆 .

@OrangeChannel
Copy link
Collaborator Author

OrangeChannel commented Aug 2, 2020

So, the script (https://github.com/OrangeChannel/ocsuite) can now handle

  1. Trimming + splicing the audio for all episodes and repeated chapters
  2. Trimming + splicing the source clip for all episodes and repeated chapters
  3. Generating SUIDs for the repeated chapters
  4. Writing the chapters to valid Mastroka XMLs including adjusted timestamps and references to external segments using the generated SUIDs

so I can at least start on the re-write of Fluff's blog post now.

However, mpv-player/mpv#7963 is a thing that petzku brought up yesterday and I'll be waiting to see if this is a bug or not or if something should be done about it before updating the Caveats, Assorted Traps and Various Interesting Things to Note section since some of this stuff is outdated or doesn't apply to mpv.

EDIT: All of this stuff is limited to constant-frame-rate sources though sadly, the logic behind converting frame numbers to timestamps isn't difficult for VFR, but doing so while cutting out sections (repeated/ordered chapters) is not feasible really.

@Lypheo
Copy link
Contributor

Lypheo commented Aug 2, 2020

EDIT: All of this stuff is limited to constant-frame-rate sources though sadly, the logic behind converting frame numbers to timestamps isn't difficult for VFR, but doing so while cutting out sections (repeated/ordered chapters) is not feasible really.

Why not, exactly? Couldn’t you just extract the timestamps and look them up by frame number instead of calculating them?

@OrangeChannel
Copy link
Collaborator Author

Actually, since the script now returns the trimmed/spliced clips, vfr chapter timestamps are possible with vspipe's timecodes files, but is there a way to do this without creating a temporary script for each clip and then reading from a timecode file? I can technically do this in a really messy way with .get_frame(n)['_DurationNum'] without breaking out of the script but this would probably be pretty complicated/ugly. @Lypheo is there a better way to get timestamps from a vfr vs clip?

@Lypheo
Copy link
Contributor

Lypheo commented Aug 3, 2020

Not sure exactly how ocsuite operates but maybe sth like this?

src = core.ffms2.Source(r"anime_tiddies.mkv", timecodes="tc.txt")
tc = [int(float(x)) for x in open("tc.txt", "r").read().splitlines()[1:]]

def fu(n,f):
	fout = f.copy()
	fout.props["TimeStamp"] = tc[n]
	return fout
	
o = core.std.ModifyFrame(src, src, fu)

@OrangeChannel
Copy link
Collaborator Author

OrangeChannel commented Aug 3, 2020

I've implemented a pure Python solution in acsuite for now

def f2ts(f: int, /, *, precision: int = 3, src_clip: vs.VideoNode) -> str:
    """Converts frame number to a timestamp based on framerate.
    Can handle variable-frame-rate clips as well, using similar methods to that of vspipe --timecodes.
    Meant to be called as a functools.partial with 'src_clip' specified before-hand.
    """
    if precision not in [0, 3, 6, 9]:
        raise ValueError(f"f2ts: the precision {precision} must be a multiple of 3 (including 0)")
    if src_clip.fps != fractions.Fraction(0, 1):
        t = round(10 ** 9 * f * src_clip.fps ** -1)
        s = t / 10 ** 9
    else:
        s = clip_to_timecodes(src_clip)[f]

    m = s // 60
    s %= 60
    h = m // 60
    m %= 60

    if precision == 0:
        return f'{h:02.0f}:{m:02.0f}:{round(s):02}'
    elif precision == 3:
        return f'{h:02.0f}:{m:02.0f}:{s:06.3f}'
    elif precision == 6:
        return f'{h:02.0f}:{m:02.0f}:{s:09.6f}'
    elif precision == 9:
        return f'{h:02.0f}:{m:02.0f}:{s:012.9f}'


@functools.lru_cache
def clip_to_timecodes(src_clip: vs.VideoNode) -> collections.deque:
    """Cached function to return a list of timecodes for vfr clips."""
    timecodes = collections.deque([0.0], maxlen=src_clip.num_frames + 1)
    curr_time = fractions.Fraction()
    for frame in src_clip.frames():
        curr_time += fractions.Fraction(frame.props['_DurationNum'], frame.props['_DurationDen'])
        timecodes.append(float(curr_time))
    return timecodes

as this should be used only on source filter inputs, the speed is really limited by the current implementation of VideoNode.frames() which according to stux, will be much faster once the pending frames PR gets merged.

The speed is not really great (~50 seconds to get 100 timestamps from a 1 million long frame vfr clip created by core.std.BlankClip) and deque only seems to improve performance by around 2% at most, but this will hopefully be a lot faster when the new frames generator is implemented.

@Lypheo
Copy link
Contributor

Lypheo commented Aug 3, 2020

But why though 🤔 . With ffms2's timecodes, you incur no speed penalty at all, and afaict it should be identical in funtionality.

@OrangeChannel
Copy link
Collaborator Author

But why though 🤔 . With ffms2's timecodes, you incur no speed penalty at all, and afaict it should be identical in funtionality.

For the audio cutting script this is actually part of:
Hmmm, this does require the vfr clip to be directly read from a file though, so wouldn't work for clips that somehow become vfr via vapoursynth filtering, but thinking about it... this shouldn't really ever be a case since this is for cutting audio which will likely be synced with the source anyways.

Also, the way to implement this would require the user to supply a timecodes file that they would need to generate (assuming they have ffms2 installed and loaded), as the current functionality takes in a VideoNode as input, not a filename I could open with ffms2.Source(). I could make it so a provided timecodes file will take precedence over the clip_to_timecodes function, only using it as a fallback if only a clip is given. For the ordered chapters script, I could also just make the audio and write_to_xml methods take a timecodes filename as a argument, but I don't know if I want to force the user to re-generate an index + a timecodes file if they've used a different source filter to get the vfr clip in the first place.

For the chapter timestamp part of the ordered chapters creation script:
Now, on to the real reason using existing frame props here in acsuite is actually a decent solution for the ordered chapters script:
Once I have the vfr clips split along chapter boundaries, I stitch them back together using std.Splice, allowing me to just grab frame numbers and timestamps from the already adjusted clips, without having to do the math to adjust frame timestamps to begin at zero and be continuous across cut-out repeated chapters (although technically this wouldn't be that difficult, I think).

@OrangeChannel
Copy link
Collaborator Author

Just did some testing with an actual mkv clip (34k frames loaded via ffms2), and it is unusably slow 😞 , also noticed that ffms2 timecodes round to nearest ms, which is not ideal for chapter timings but I'm not sure the precision really matters at all at that level. Assuming the frames PR even gives this a massive speed-up, it would have to be like 2 orders of magnitude faster than it is now for this to be usable on very long clips (i.e. merged clips from a BDMV). Also found out Splice complains when trying to add a BlankClip with a loaded clip with different frame rates (using the + operator) but not for two clips with differing framerates created by std.BlankClip 🤔 . Looks like using a timecodes file is going to be necessary for the ordered chapters script, so I'm going to start refactoring stuff now so the timestamp calculation doesn't just use clip.fps and the functools.reduce operation won't error out for differing framerate segments.

@Lypheo
Copy link
Contributor

Lypheo commented Aug 3, 2020

Also found out Splice complains when trying to add a BlankClip with a loaded clip with different frame rates (using the + operator) but not for two clips with differing framerates created by std.BlankClip 🤔

The loaded clip probably differed in format or dimensions. There's a bug in Splice that I never bothered to report which causes it to state fps mismatch as the error cause even though the format/res was the problem.

@OrangeChannel
Copy link
Collaborator Author

The loaded clip probably differed in format or dimensions. There's a bug in Splice that I never bothered to report which causes it to state fps mismatch as the error cause even though the format/res was the problem.

Ahh, that makes a lot more sense actually, still giving Splice a list of segments is actually cleaner looking than a functools.reduce(operator.add, segments) anyways so I went ahead and changed it.

I still can't think of a good way to use timecodes files to calculate the new frame times once the repeated chapters are cut out besides the method I'm using now, which turns out to be extremely slow. VFR clips are now technically supported in both the audio and ordered chapters scripts, though it will be very slow to write the chapters to xml if the clip is vfr.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
content We can do better!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants