videoChatWithLLM
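Video chat with an LLM
About the project
Code: https://github.com/otoTree/videoChatWithLLM/tree/main
Model: InternLM-XComposer2-VL-7B
Workflow
The camera captures live video ----> when the Send button is clicked, the last frame is grabbed and combined with the text into a prompt ----> the model generates content and returns text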
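The workflow above can be sketched roughly as follows. This is a minimal illustration, not the code in the repository: `Frame`, `LatestFrameBuffer`, `build_prompt`, `on_send`, and `fake_model` are hypothetical names, the `<ImageHere>` placeholder is an assumption about how the model expects an image to be marked in the prompt (check the model card), and a stand-in function replaces the real model call so the example runs without a camera or GPU.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Assumed image placeholder token; verify against the model's documentation.
IMAGE_TOKEN = "<ImageHere>"


@dataclass
class Frame:
    """One captured camera frame (hypothetical container)."""
    index: int
    pixels: bytes


class LatestFrameBuffer:
    """Keeps only the most recent frame pushed by the camera loop."""

    def __init__(self) -> None:
        self._latest: Optional[Frame] = None

    def push(self, frame: Frame) -> None:
        # Older frames are discarded; only the last one is ever sent.
        self._latest = frame

    def snapshot(self) -> Frame:
        if self._latest is None:
            raise RuntimeError("no frame captured yet")
        return self._latest


def build_prompt(text: str) -> str:
    """Combine the image placeholder with the user's text."""
    return f"{IMAGE_TOKEN}{text.strip()}"


def on_send(buffer: LatestFrameBuffer, text: str,
            model: Callable[[str, Frame], str]) -> str:
    """Handle a click on Send: grab the last frame, build the prompt, query the model."""
    frame = buffer.snapshot()
    return model(build_prompt(text), frame)


def fake_model(prompt: str, frame: Frame) -> str:
    """Stand-in for the vision-language model; echoes its inputs."""
    return f"[frame {frame.index}] {prompt}"


if __name__ == "__main__":
    buf = LatestFrameBuffer()
    for i in range(3):  # simulate three frames arriving from the camera
        buf.push(Frame(index=i, pixels=b""))
    print(on_send(buf, "What is in front of the camera?", fake_model))
```

In a real setup, `fake_model` would be swapped for a wrapper around the actual model's chat call, and `push` would be driven by the camera capture loop.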
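Thoughts
Pseudo video chat is achieved through image-text understanding of a single frame.
If streaming input can be implemented in the future, real video chat could be achieved in the same way.
Todo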
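· Implement ASR and TTS
· Implement facial expressions to give the LLM a concrete persona
· Fine-tune the model to reduce its power consumption and speed up generation
· If all else fails, train a new video generation model (optimistic about Sora)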
Members
· Xiao Huang (otoTree)