I have a question about where the caption goes, and in what format, in the input of the instruction data. For example, the paper contains the sentence "A video of a Super-hero Movie." Is this sentence part of the text prompt, or does it need to be embedded by the ImageBind model and then fed into the LLM?