
A deep learning model that lip-syncs a given video to any given audio. It uses a GAN-style setup that combines a reconstruction loss with a pre-trained lip-sync discriminator.

snehitvaddi/Deepfake-using-Wave2Lip


Wav2Lip: Accurately Lip-sync Videos In Any Language.

This repository accompanies the paper A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild, published at ACM Multimedia 2020.

🧾 Official Paper · 📑 Project Page · 🔑 Original Repo · 💡 Colab link (updated code/notebook)

Download the pretrained weights from here. Make sure you add this folder to your Drive before executing the code.

Note: This project/paper is referenced entirely from Rudrabha's original work.

💡 I have made the training notebook DeepFake_Wav2Lip.ipynb private to avoid misuse; contact me at [email protected] for the complete directory ✌

🔑 Looking for B.Tech/M.Tech/academic projects? Ping me, I have a bunch.

🧠 Video Output:

👉 Trump Speaking in Telugu (An Indian language):

🗺 Architecture:

This approach generates accurate lip-sync by learning from an already well-trained lip-sync expert. Unlike previous works that employ only a reconstruction loss or train a discriminator from scratch in a GAN setup, this method uses a pre-trained discriminator that is already quite accurate at detecting lip-sync errors. Fine-tuning that expert on the noisy generated faces would reduce its capacity to measure lip-sync, degrading the generated lip shapes as well, so it is left frozen and only penalizes the generator.
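The objective described above can be sketched in a few lines. This is a minimal, hypothetical illustration (not the repository's actual code): it assumes an L1 reconstruction term plus a SyncNet-style expert penalty, and the function name, weighting, and constants are illustrative.

```python
import math

def generator_loss(recon_l1, sync_prob, sync_weight=0.03):
    """Illustrative Wav2Lip-style generator objective.

    recon_l1   -- mean L1 error between generated and ground-truth frames
    sync_prob  -- probability that audio and lips are in sync, as judged
                  by the FROZEN pre-trained lip-sync expert
    sync_weight-- trade-off between reconstruction and sync (illustrative)
    """
    # Expert sync penalty: negative log-probability of being in sync.
    sync_loss = -math.log(sync_prob + 1e-8)
    # The expert is never fine-tuned; its signal only shapes the generator,
    # which preserves its ability to measure lip-sync accurately.
    return (1.0 - sync_weight) * recon_l1 + sync_weight * sync_loss
```

A generator that reconstructs frames perfectly and fully convinces the frozen expert incurs (near) zero loss, while out-of-sync outputs are penalized even when they are photometrically close to the target.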

🔧 Try it yourself:

  • A base video that needs to be lip-synced.
  • An audio file, in any language, to mimic.
  • That's all you need to lip-sync.
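With those two inputs in hand, inference follows the upstream Rudrabha/Wav2Lip command-line interface. The flags below match that repository's `inference.py`; all file paths are placeholders, and the command is assembled and echoed here rather than executed so you can adapt it to your own layout.

```shell
# Placeholders -- point these at your own files inside the Wav2Lip repo root.
CKPT="checkpoints/wav2lip_gan.pth"   # pretrained generator weights
FACE="inputs/base_video.mp4"         # the video to be lip-synced
AUDIO="inputs/target_speech.wav"     # the speech to mimic (any language)

# Build the inference command (run it from the Wav2Lip repo root).
CMD="python inference.py --checkpoint_path $CKPT --face $FACE --audio $AUDIO"
echo "$CMD"
```

The result video is written to `results/` by the upstream script.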

⚡ Highlights:

  • Lip-sync videos to any target speech with high accuracy 💯
  • The audio source can be any file supported by FFMPEG containing audio data: *.wav, *.mp3 or even a video file, from which the code will automatically extract the audio.
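If you prefer to pre-extract the audio track yourself rather than letting the code pull it from a video, FFmpeg can do it in one command. This is a sketch with placeholder paths; the mono/16 kHz settings are a common choice for speech models, not a requirement stated by the project. The command is echoed rather than executed so it runs anywhere.

```shell
SRC="inputs/source_clip.mp4"     # any FFmpeg-readable file with audio
OUT="inputs/target_speech.wav"   # extracted audio, ready for lip-syncing

# -vn drops the video stream; -ac 1 and -ar 16000 produce mono 16 kHz audio.
CMD="ffmpeg -i $SRC -vn -ac 1 -ar 16000 $OUT"
echo "$CMD"
```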

☢ Ethics and Code of conduct

  • Deepfake is not for creating inappropriate content.
  • Deepfake is not for changing faces without consent or with the intent of hiding its use.
  • Deepfake is not for any illicit, unethical, or questionable purposes.
  • Deepfake exists to experiment and discover AI techniques, for social or political commentary, for movies, and for any number of ethical and reasonable uses.

⚠ Creator Disclaimer

All results from this open-source code or our demo website should be used for research/academic/personal purposes only. As the models are trained on the LRS2 dataset, any form of commercial use is strictly prohibited. Please contact us for all further queries.
