Visual Speech Enhancement

Implementation of the method described in the paper: Visual Speech Enhancement by Aviv Gabbay, Asaph Shamir and Shmuel Peleg.

Speech Enhancement Demo

Usage

Dependencies

python >= 2.7
mediaio
face-detection
keras >= 2.0.4
numpy >= 1.12.1
dlib >= 19.4.0
opencv >= 3.2.0
librosa >= 0.5.1

Getting started

Given an audio-visual dataset with the following directory structure:

├── speaker-1
|   ├── audio
|   |   ├── f1.wav
|   |   └── f2.wav
|   └── video
|       ├── f1.mp4
|       └── f2.mp4
├── speaker-2
|   ├── audio
|   |   ├── f1.wav
|   |   └── f2.wav
|   └── video
|       ├── f1.mp4
|       └── f2.mp4
...

and a noise directory containing audio files (*.wav) of noise samples, follow the steps below.
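Before preprocessing, it can help to confirm that a dataset matches this layout. The following is a minimal sketch, not part of this repository: the helper name `list_dataset_pairs` is hypothetical, and it assumes each audio file `<name>.wav` has a video file `<name>.mp4` with the same base name under the same speaker, as in the tree above.

```python
import os


def list_dataset_pairs(dataset_dir):
    """Return (speaker_id, audio_path, video_path) tuples found under dataset_dir.

    Hypothetical helper: assumes <speaker>/audio/<name>.wav is paired with
    <speaker>/video/<name>.mp4. Audio files without a matching video are skipped.
    """
    pairs = []
    for speaker_id in sorted(os.listdir(dataset_dir)):
        audio_dir = os.path.join(dataset_dir, speaker_id, "audio")
        video_dir = os.path.join(dataset_dir, speaker_id, "video")
        if not (os.path.isdir(audio_dir) and os.path.isdir(video_dir)):
            continue  # not a speaker directory with both sub-folders
        for audio_name in sorted(os.listdir(audio_dir)):
            stem, ext = os.path.splitext(audio_name)
            if ext.lower() != ".wav":
                continue  # ignore non-audio files
            video_path = os.path.join(video_dir, stem + ".mp4")
            if os.path.isfile(video_path):
                audio_path = os.path.join(audio_dir, audio_name)
                pairs.append((speaker_id, audio_path, video_path))
    return pairs
```

Calling `list_dataset_pairs("<dataset-dir-path>")` and comparing the number of pairs with the number of audio files is a quick way to spot missing or misnamed videos.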

Preprocess the training, validation and test datasets separately:

speech_enhancer.py --base_dir <output-dir-path> preprocess
    --data_name <preprocessed-data-name>
    --dataset_dir <dataset-dir-path>
    --noise_dirs <noise-dir-path> ...
    [--speakers <speaker-id> ...]
    [--ignored_speakers <speaker-id> ...]

Then, train the model by:

speech_enhancer.py --base_dir <output-dir-path> train
    --model <model-name>
    --train_data_names <preprocessed-training-data-name> ...
    --validation_data_names <preprocessed-validation-data-name> ...
    [--gpus <num-of-gpus>]

Finally, enhance the noisy test speech samples:

speech_enhancer.py --base_dir <output-dir-path> predict
    --model <model-name>
    --data_name <preprocessed-test-data-name>
    [--gpus <num-of-gpus>]
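The three steps above can be chained from a small driver script. The sketch below only builds the argument lists, using the flags documented above. The function name `build_pipeline_commands`, the data names `train`, `validation` and `test`, and the choice to launch the script through `sys.executable` are assumptions made for illustration, not part of this repository.

```python
import sys

# Order in which the datasets are preprocessed; these data names are illustrative.
SPLITS = ("train", "validation", "test")


def build_pipeline_commands(base_dir, dataset_dirs, noise_dir, model_name):
    """Build argv lists for the preprocess, train and predict steps.

    dataset_dirs maps each name in SPLITS to its dataset directory.
    Optional flags (--speakers, --ignored_speakers, --gpus) are left out.
    """
    script = [sys.executable, "speech_enhancer.py", "--base_dir", base_dir]
    commands = []
    # One preprocess call per dataset, as the steps above require.
    for split in SPLITS:
        commands.append(script + [
            "preprocess",
            "--data_name", split,
            "--dataset_dir", dataset_dirs[split],
            "--noise_dirs", noise_dir,
        ])
    commands.append(script + [
        "train",
        "--model", model_name,
        "--train_data_names", "train",
        "--validation_data_names", "validation",
    ])
    commands.append(script + [
        "predict",
        "--model", model_name,
        "--data_name", "test",
    ])
    return commands
```

Each returned list can be passed to `subprocess.check_call` in order, so that a failing step stops the pipeline before the next one starts.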

Citing

If you find this project useful for your research, please cite:

@inproceedings{gabbay2018visual,
  author    = {Aviv Gabbay and
               Asaph Shamir and
               Shmuel Peleg},
  title     = {Visual Speech Enhancement},
  booktitle = {Interspeech},
  pages     = {1170--1174},
  publisher = {{ISCA}},
  year      = {2018}
}
