Liam-lyr changed the title from "Frame concatenation methods are different between radio demo and fast_captioner_lmdeploy.py?" to "Frame concatenation methods are different between gradio demo and fast_captioner_lmdeploy.py?" on Aug 19, 2024
Hi, thanks for the great work!
I'm interested in the Fast Captioning pipeline, and I noticed a difference between the frame concatenation methods used in `captioner/fast_captioner_lmdeploy.py` and `captioner/app.py`.

In `captioner/fast_captioner_lmdeploy.py`, you use 30 frames per image to form a 5x6 grid and feed it to the pipeline (https://github.com/ShareGPT4Omni/ShareGPT4Video/blob/88426fd4a8386f3009368d424d5972881cdde311/captioner/fast_captioner_lmdeploy.py#L93C1-L97C84).

However, in `captioner/app.py`, you concatenate the frames in a list, as described in the original paper (https://github.com/ShareGPT4Omni/ShareGPT4Video/blob/88426fd4a8386f3009368d424d5972881cdde311/captioner/app.py#L155C1-L201C21).

Could you please explain why? Thanks a lot.
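To make the question concrete, here is a minimal sketch of the two input styles as I understand them. It is not the repository's code: `make_grid` and `as_frame_list` are hypothetical helper names, frames are stood in for by small nested lists of pixel values, and I am assuming that "5x6" means 5 rows by 6 columns filled in row-major order.

```python
from typing import List

Frame = List[List[int]]  # one frame as H x W pixel values (illustrative stand-in)


def make_grid(frames: List[Frame], rows: int, cols: int) -> Frame:
    """Tile equally sized frames into ONE image, row-major (assumed order)."""
    if not frames:
        raise ValueError("need at least one frame")
    if len(frames) > rows * cols:
        raise ValueError("more frames than grid cells")
    h, w = len(frames[0]), len(frames[0][0])
    # Blank canvas big enough for rows x cols frames.
    canvas = [[0] * (cols * w) for _ in range(rows * h)]
    for i, frame in enumerate(frames):
        r, c = divmod(i, cols)  # grid cell of the i-th frame
        for y in range(h):
            # Copy one row of the frame into its slot on the canvas.
            canvas[r * h + y][c * w:(c + 1) * w] = frame[y]
    return canvas


def as_frame_list(frames: List[Frame]) -> List[Frame]:
    """Keep every frame as its own image, passed on as a list."""
    return list(frames)


# 30 dummy 2x2 frames, each filled with its own index.
frames = [[[i] * 2 for _ in range(2)] for i in range(30)]
grid = make_grid(frames, rows=5, cols=6)   # a single 10 x 12 image
frame_list = as_frame_list(frames)         # 30 separate 2 x 2 images
```

In the grid style the model receives one large image in which each frame occupies a small cell, while in the list style it receives each frame as a separate image.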