Starting clips on keyframes could make for a fast and useful sampler: seeking to a keyframe and decoding from there can be faster than decoding multiple P-frames to reach the sample point.
Ideas:
There can be an API option to return clips whose first frame is always a keyframe.
There can be an API option to have clips start on keyframes if they are "close enough" (in pts) to the sample point that the sampler wanted to choose. This can be used to balance performance and scene diversity.
This should work in approximate mode (i.e. it shouldn't require a full scan of the file to do this -- we could read the list of keyframes from the header)
There could potentially be an API call that returns the list of pts to the user, if they want to do something with it (they could look at it and decide whether it's worth doing keyframe only sampling or uniform sampling)
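To make the "close enough" idea concrete, here is a minimal sketch in plain Python. It is not torchcodec API: the function name `snap_to_keyframe` and the `tolerance` parameter are hypothetical, and the sorted list of keyframe pts is assumed to have been obtained elsewhere (for example from the container header).

```python
from bisect import bisect_left


def snap_to_keyframe(sample_pts, keyframe_pts, tolerance):
    """Return the nearest keyframe pts if it lies within `tolerance`
    of `sample_pts`; otherwise return `sample_pts` unchanged.

    `keyframe_pts` must be sorted in ascending order. All names here
    are hypothetical and only illustrate the idea from the issue.
    """
    # Index of the first keyframe at or after the sample point.
    i = bisect_left(keyframe_pts, sample_pts)
    # Candidates: the keyframe just before and the one at/after the sample.
    candidates = keyframe_pts[max(i - 1, 0): i + 1]
    if not candidates:
        return sample_pts
    nearest = min(candidates, key=lambda k: abs(k - sample_pts))
    return nearest if abs(nearest - sample_pts) <= tolerance else sample_pts


keyframes = [0.0, 2.0, 4.0, 6.0]
clip_starts = [snap_to_keyframe(p, keyframes, 0.5) for p in (2.3, 3.0, 5.8)]
```

A `tolerance` of zero only keeps sample points that already sit on a keyframe, so it behaves like plain uniform sampling; a very large value turns this into keyframe-only sampling. Values in between are one way to trade seek speed against scene diversity.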
@NicolasHug and I have also talked about a similar thing. We'll want to do some research to better understand the use-cases people want to cover. The biggest question to me is, should "key_frame_only" be a kind of seek mode, and then we use the samplers as-is? Or should the sampler API itself be aware of the concept of key frames? If the latter, what do we need to expose in the decoder to enable it?
@scotts - I'm interested in something like this as well.
Currently, I'm processing long videos using IterableDataset in PyTorch, where each worker handles a chunk of the video, and those chunks always start at an I-frame. To achieve this, I first identify the I-frames and use them to seek each worker to its assigned chunk. I use these keyframes for fast seeking, and then each worker samples frames in sequential order. Right now, I'm using PyAV for that, but I would like to see how I can achieve the same with torchcodec.
I'm sharing this in case it's useful, but I'm also open to hearing if there are alternative ways to parallelize the processing of long videos that I might not be considering.
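For reference, the chunk assignment described above might look roughly like the sketch below. It is plain Python rather than PyAV or torchcodec code: `keyframe_chunks` is a hypothetical helper, and the sorted keyframe pts list is assumed to have been collected beforehand. Each worker would seek to its `start` pts and decode sequentially until it reaches `end`.

```python
def keyframe_chunks(keyframe_pts, num_workers):
    """Split sorted keyframe pts into contiguous (start, end) ranges,
    one per worker, so that every range begins on a keyframe.

    `end` is the pts of the next chunk's first keyframe, or None for
    the last chunk (meaning: decode until the end of the stream).
    Hypothetical helper, shown only to illustrate the approach.
    """
    n = len(keyframe_pts)
    if n == 0 or num_workers <= 0:
        return []
    # Never create more chunks than there are keyframes.
    num_chunks = min(num_workers, n)
    base, extra = divmod(n, num_chunks)
    chunks = []
    start = 0
    for w in range(num_chunks):
        # The first `extra` chunks take one additional keyframe each.
        end = start + base + (1 if w < extra else 0)
        end_pts = keyframe_pts[end] if end < n else None
        chunks.append((keyframe_pts[start], end_pts))
        start = end
    return chunks


# Inside an IterableDataset, a worker would pick chunks[worker_id].
chunks = keyframe_chunks([0, 2, 4, 6, 8], num_workers=2)
```

This balances chunks by keyframe count, not by duration or frame count; if keyframes are unevenly spaced, splitting on pts ranges instead may give more even work per worker.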