Implementation of deep recurrent nonnegative matrix factorization (DR-NMF) for speech separation

stwisdom/dr-nmf


DR-NMF is a recurrent neural network constructed from the unfolded iterations of the iterative soft-thresholding algorithm (ISTA) applied to sparse NMF inference. Sparse NMF inference is the task of inferring the nonnegative sparse coefficients H given a nonnegative dictionary W such that WH approximates a nonnegative observation matrix X. For speech separation, the observation matrix X is the raw spectrogram of noisy audio, and the dictionary W is partitioned into speech and noise components. This partitioning of the dictionary W allows computation of an enhancement mask in the STFT domain.
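The inference step described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the repository's Theano/Keras implementation: it assumes a squared-error reconstruction loss with an L1 penalty `mu`, and it assumes the first `n_speech` columns of W are the speech atoms. The function names `ista_sparse_nmf` and `enhancement_mask` and all parameter names are hypothetical. DR-NMF unfolds iterations like these into network layers with trainable weights; the sketch performs plain, untrained inference.

```python
import numpy as np


def ista_sparse_nmf(X, W, mu=0.1, n_iter=200):
    """Infer nonnegative sparse H such that W.dot(H) approximates X.

    Assumed objective (an illustration, not taken from the repo):
        0.5 * ||X - W H||_F^2 + mu * sum(H),  subject to H >= 0
    """
    # Step size 1/L, where L is the Lipschitz constant of the gradient
    # of the quadratic term (largest singular value of W, squared).
    L = np.linalg.norm(W, 2) ** 2
    H = np.zeros((W.shape[1], X.shape[1]))
    for _ in range(n_iter):
        grad = W.T.dot(W.dot(H) - X)
        # Gradient step followed by nonnegative soft-thresholding.
        H = np.maximum(0.0, H - (grad + mu) / L)
    return H


def enhancement_mask(W, H, n_speech, eps=1e-8):
    """Ratio mask from a dictionary partitioned as [speech | noise].

    Assumes the first n_speech columns of W (and rows of H) are speech.
    """
    speech = W[:, :n_speech].dot(H[:n_speech])
    noise = W[:, n_speech:].dot(H[n_speech:])
    return speech / (speech + noise + eps)


# Toy usage with random nonnegative data (no real audio involved).
rng = np.random.RandomState(0)
W = rng.rand(20, 8)            # 20 frequency bins, 8 atoms (4 speech, 4 noise)
X = W.dot(rng.rand(8, 30))     # 30 frames of synthetic "spectrogram"
H = ista_sparse_nmf(X, W, mu=0.1, n_iter=200)
mask = enhancement_mask(W, H, n_speech=4)
X_enhanced = mask * X          # masked spectrogram estimate of the speech
```

In a real pipeline the mask would be applied to the complex STFT of the noisy audio before inverting back to a waveform; that step is omitted here.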

Read the paper here: https://arxiv.org/abs/1709.07124

Presentation slides from WASPAA 2017 (received best student paper award)

Download pretrained sparse NMF dictionaries and weights for the trained networks here

Instructions:

Uses the task 2 data from the 2nd CHiME Challenge, which is available from the LDC (free for 2017 members, $50 for non-members).

  1. Set up environment (updated 05-03-19). This code depends on some older versions of packages (see requirements.txt). To set up a conda environment, run these commands:
     conda create --name drnmf_orig3 cudnn=5.1 gxx_linux-64=5.4.0 python=2.7 theano=0.9.0 numpy=1.11 pygpu=0.6.9
     pip install keras==2.0.4 librosa==0.5.1 joblib==0.11.0 hickle jupyter
  2. Download required toolboxes by running download_toolboxes.sh.
  3. Generate taskfiles by setting the variable chime2_path in create_taskfiles.sh to your local CHiME2 path and then running create_taskfiles.sh.
  4. Use enhance.py to train, reconstruct, and score audio. Use the run_waspaa2017.sh script to replicate results from the WASPAA 2017 paper.

Uses code from several external toolboxes, which are automatically downloaded and unzipped by download_toolboxes.sh.
