Dynamic slicing and batch size #5671
Comments
Hello @rems75
Thanks for the answer @mzient. I'll be training things on H100s, which have 7 NVDECs. Will the ordinary external source in batch mode be able to leverage all of them? (In my case the batch will contain 20-30 1s videos.) Regarding the second question, any thoughts on extracting frames in the pipeline itself? Maybe a custom operator?
Hi @rems75, Thank you for reaching out. Now, only
Hi @JanuszL
Hi @rems75, You can try using Nsight Systems and explore its video profiling capabilities.
Thanks for the pointer @JanuszL, still looking into it.
Let me add this to our ToDo list.
Describe the question.
Hello everyone,
I'm trying to optimise a torch data loading pipeline that involves video decoding and thought I'd give DALI a try (already tried things like `pynvvideocodec` but that ended up quite slow). I have something more or less working but at the cost of some suboptimal decisions so I'm wondering whether I missed relevant options or whether DALI is not perfectly suited for my use case.
I have a set of N 1s videos, where N changes from batch to batch, and I want to extract a certain number of frames from those videos, where the indices of the frames differ from video to video. From reading other posts, it does seem at the frontier of what DALI was designed for.
I have set up an `ExternalInputCallable` class with `batch=False` (in order to leverage parallelism) where `__call__` returns a video and list of indices, and a pipeline based on `fn.experimental.decoders.video`.
The questions I have are the following:
`fn.element_extract` and decoding at the sample level in the pipeline? Right now I'm doing the slicing per sample on a torch tensor built from each tensorGPU returned by `pipeline.run()`, which feels very inefficient.
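
For readers who want to picture the setup described above, here is a minimal sketch of the two pieces involved: a per-sample callable that returns an encoded video plus its frame indices, and the per-sample frame selection done after decoding. It is not the code from this issue. The class name `VideoSampleSource`, the helper `select_frames`, and the in-memory list of encoded videos are all hypothetical, and NumPy arrays stand in for the decoded frames (a torch tensor can be indexed the same way). The sketch assumes, as I understand the DALI external source API, that a per-sample callable receives an object exposing `idx_in_epoch` and ends the epoch by raising `StopIteration`.

```python
import numpy as np


class VideoSampleSource:
    """Hypothetical per-sample source: one (encoded video, frame indices) pair per call."""

    def __init__(self, videos, frame_indices):
        # videos: list of bytes objects holding encoded files (assumed to be in memory)
        # frame_indices: one list of frame indices per video
        assert len(videos) == len(frame_indices)
        self.videos = videos
        self.frame_indices = frame_indices

    def __call__(self, sample_info):
        # Assumption: sample_info exposes idx_in_epoch, as in per-sample mode.
        i = sample_info.idx_in_epoch
        if i >= len(self.videos):
            raise StopIteration  # no more samples in this epoch
        encoded = np.frombuffer(self.videos[i], dtype=np.uint8)
        indices = np.asarray(self.frame_indices[i], dtype=np.int64)
        return encoded, indices


def select_frames(frames, indices):
    """Gather frames of shape (F, H, W, C) at `indices` with a single indexing call."""
    return frames[indices]
```

In this sketch the selection is a single gather per video rather than a Python loop over frames; whether the equivalent step can be moved inside the pipeline is exactly the open question above.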