Inference takes a long time for the first few images, and the time progressively decreases after that. Does engine generation optimize for the device? Is there any cache or setting that would make inference time constant across iterations?
I believe this is just an unfortunate limitation of GPU kernel initialization. There is not much you can do other than run dummy inputs for however many slow inferences there are before feeding in your real data (see the sketch below).
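A minimal sketch of that warm-up pattern, assuming a hypothetical `infer()` wrapper around your engine's execute call, an assumed input shape of `(1, 3, 224, 224)`, and an assumed warm-up count of 10 iterations; adjust these to match your actual model and however many slow iterations you observe:

```python
import time
import numpy as np

def infer(batch):
    """Placeholder for the real engine call; replace with your
    actual inference function (e.g. your execution context's run/execute)."""
    return batch.sum()  # stand-in computation for illustration only

WARMUP_ITERS = 10  # assumed count; tune to the number of slow iterations you see

# Pay the one-time kernel/initialization cost on dummy data up front.
dummy_input = np.zeros((1, 3, 224, 224), dtype=np.float32)  # assumed input shape
for _ in range(WARMUP_ITERS):
    infer(dummy_input)

# Subsequent inferences should now run at steady-state latency.
start = time.perf_counter()
infer(dummy_input)  # substitute a real image batch here
elapsed_ms = (time.perf_counter() - start) * 1000
print(f"inference latency: {elapsed_ms:.2f} ms")
```

Running the warm-up once at startup keeps the slow iterations out of your measured or user-facing path, at the cost of a slightly longer application start time.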