Stars
A simple screen-parsing tool towards a pure vision-based GUI agent
Bring portraits to life in real time! ONNX/TensorRT support! Real-time portrait animation!
Memory-Guided Diffusion for Expressive Talking Video Generation
Official implementation of "Sonic: Shifting Focus to Global Audio Perception in Portrait Animation"
A modern GUI client based on Tauri, designed to run on Windows, macOS, and Linux for a tailored proxy experience
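A Python library for converting m3u8 streaming media files into mp4 video files. It uses multithreaded downloading to significantly improve download speed and conversion efficiency.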
R0NAM1 / aiortc
Forked from aiortc/aiortc. WebRTC and ORTC implementation for Python using asyncio
WebRTC and ORTC implementation for Python using asyncio
MDN samples server; used for samples that can't be hosted in-place on MDN, plus back-end server-side code for samples that need it.
An FFmpeg-based tool that pushes video and audio to a streaming server in real time and can be easily integrated into your Python DIP workflow to build live-streaming applications.
Official implementation of "Dynamic Frame Avatar with Non-autoregressive Diffusion Framework for Talking Head Video Generation"
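# Edge-TTS Web An online speech synthesis system based on the Microsoft Edge browser's TTS engine, with a simple, easy-to-use web interface. Features 🌍 Multilingual: 74 languages, including Chinese (Simplified, Traditional, Cantonese), English, and Japanese - 🎭 Rich voices: 318 different voice options - 🎛️ Flexible control: adjustable speech rate (0.25x-4x) - 📝 Subtitle support: automatically generates SRT subtitles - 🎯 Precise sync: audio and…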
An ultra-lightweight digital human model that can run in real time on mobile devices
Audio2Blendshape Model using the BEAT database
[ECCV'24] TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
洛曦 (Luoxi) digital human video player with an HTTP API; uses the Gradio API to connect to Easy-Wav2Lip, Sadtalker, GeneFacePlusPlus, and MuseTalk, and can also play local videos
An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…
Speech-to-text, text-to-speech, speaker diarization, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, HarmonyOS, Raspberry Pi, RISC…
A collection of tools and functions I frequently use in my work.
🤢 LipSick: Fast, High Quality, Low Resource Lipsync Tool 🤮