It's common practice to extract multiple frames from a video to create a multi-image input. While Ovis1.6 is primarily trained on single-image samples, it also supports multi-image inputs. An example is available at: #25
On the other hand, we are currently working on incorporating video data into our training process and plan to enhance video processing capabilities in future versions.
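For anyone who wants to try the frame-extraction approach mentioned above, here is a minimal sketch of sampling evenly spaced frames from a video. It is not code from the Ovis repository: the helper names `sample_frame_indices` and `extract_frames`, the default of 8 frames, and the use of OpenCV and Pillow are all assumptions made for illustration.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_frames: int) -> List[int]:
    """Pick `num_frames` indices spread evenly across `total_frames` frames.

    Hypothetical helper for illustration; not part of Ovis.
    """
    if total_frames <= 0 or num_frames <= 0:
        return []
    # Never ask for more frames than the video contains.
    num_frames = min(num_frames, total_frames)
    # Split the video into equal segments and take each segment's midpoint.
    step = total_frames / num_frames
    return [int(step * i + step / 2) for i in range(num_frames)]


def extract_frames(video_path: str, num_frames: int = 8):
    """Read evenly spaced frames as PIL images.

    Assumes opencv-python and Pillow are installed; hypothetical helper.
    """
    import cv2  # imported lazily so the index helper works without OpenCV
    from PIL import Image

    cap = cv2.VideoCapture(video_path)
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    frames = []
    for idx in sample_frame_indices(total, num_frames):
        # Seek to the target frame, then decode it.
        cap.set(cv2.CAP_PROP_POS_FRAMES, idx)
        ok, frame = cap.read()
        if not ok:
            continue  # skip frames that fail to decode
        # OpenCV returns BGR arrays; convert to RGB before building a PIL image.
        frames.append(Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)))
    cap.release()
    return frames
```

The resulting list of images could then be passed to the model as a multi-image input. The prompt format for that is shown in the example referenced in #25 and is not reproduced here.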
Ovis is really good. Could you please support video and audio?