Snowdar committed May 30, 2020
1 parent b5dbb7f commit f3dc35d
Showing 8 changed files with 180 additions and 116 deletions.
265 changes: 178 additions & 87 deletions
@@ -1,108 +1,199 @@
# ASV-Subtools
Copyright xmuspeech (Author: Snowdar 2020-02-27)
# ASV-Subtools: Open Source Tools for Speaker Recognition
> Copyright: XMU Speech Lab (Xiamen University, China)
> Apache 2.0
> Author : Miao Zhao (**Snowdar**), Jianfeng Zhou, Zheng Li, Hao Lu
> Co-author: Lin Li, Qingyang Hong

## Introduction
ASV-Subtools is developed on top of Pytorch and Kaldi for speaker recognition, language identification and related tasks.
A basic training framework is provided here, and the relationship between its parts is kept clear, so you can change any part to build a custom ASV-Subtools.

### Support List
- Multi-GPU Training Solution
+ [x] DistributedDataParallel (DDP) [Built-in function of Pytorch]
+ [x] Horovod

- Front-end
+ [x] Convenient Augmentation of Reverb, Noise, Music and Babble
+ [x] Inverted Specaugment

- Model
+ [x] Standard X-vector
+ [x] Extended X-vector
+ [x] Resnet1d
+ [x] Resnet2d
+ [ ] F-TDNN X-vector

- Components
+ [x] Attentive Statistics Pooling
+ [x] Learnable Dictionary Encoding (LDE) Pooling
+ [x] Squeeze and Excitation (SE) [An example of speaker recognition based on Resnet1d by Jianfeng Zhou]
+ [ ] Multi-head Attention Pooling

- Loss Functions
+ [x] Softmax Loss (Affine + Softmax + Cross-Entropy)
+ [x] AM-Softmax Loss
+ [x] AAM-Softmax Loss
+ [x] Double AM-Softmax Loss
+ [x] Ring Loss

- Optimizer [Beyond the Pytorch built-ins]
+ [x] Lookahead [A wrapper optimizer]
+ [x] RAdam
+ [x] Ralamb
+ [x] Novograd
+ [x] Gradient Centralization [An add-on bound to the optimizer]

- Training Strategies
+ [x] AdamW + WarmRestarts
+ [ ] SGD + ReduceLROnPlateau
+ [x] Training with Margin Decay Strategy
+ [x] Heated Up Strategy
+ [x] Multi-task Training with Phonetic Information (Kaldi) [Source codes were provided by Yi Liu. Thanks.]
+ [ ] Multi-task Training with Phonetic Information (Pytorch)
+ [ ] GAN

- Back-End
+ [x] LDA, Submean, Whiten (ZCA), Vector Length Normalization
+ [x] Cosine Similarity
+ [x] Classifiers: SVM, GMM, Logistic Regression (LR), PLDA, APLDA, CORAL, CORAL+, LIP, CIP
+ [x] Score Normalization: S-Norm, AS-Norm
+ [ ] Calibration
+ [x] Metric: EER, Cavg, minDCF

- Others
+ [x] Learning Rate Finder
+ [ ] Use **matplotlib** to Plot DET Curve Following the Format of DETware (Matlab Version) of NIST's Tools
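
Several back-end items in the list above (vector length normalization, cosine similarity, EER) are simple enough to illustrate in a few lines. The following is a minimal pure-Python sketch, not the code used in ASV-Subtools; the function names `length_normalize`, `cosine_score` and `equal_error_rate` are hypothetical, and the EER is approximated by sweeping each observed score as a threshold.

```python
import math


def length_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_score(a, b):
    """Cosine similarity: dot product of the two length-normalized vectors."""
    a, b = length_normalize(a), length_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by trying every observed score as a threshold.

    Hypothetical helper for illustration only: it returns the average of
    the miss rate and false-alarm rate at the threshold where they are closest.
    """
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # Miss: a target trial scored below the threshold.
        miss = sum(s < threshold for s in target_scores) / len(target_scores)
        # False alarm: a non-target trial scored at or above the threshold.
        false_alarm = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        if abs(miss - false_alarm) < best_gap:
            best_gap, eer = abs(miss - false_alarm), (miss + false_alarm) / 2
    return eer
```

In practice the scores would come from comparing enrollment and test x-vectors; a threshold sweep like this is fine for small trial lists, while large lists usually call for a sorted, vectorized implementation.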

### Project Structure
### Training Framework

### Data Pipeline

## Ready to Start
### 1\. Install Kaldi
Pytorch training depends little on Kaldi, but no other interface is provided yet for connecting acoustic features to training. So if you don't want to use Kaldi, it is easy to change the **libs.egs.egs.ChunkEgs** class, since the features are given to Pytorch only by []( Of course, you should also change the interface for extracting x-vectors after training is done. Most of the scripts that require Kaldi, such as subtools/ and subtools/ etc., would then be unavailable.

**If you prefer to use Kaldi, install Kaldi first as follows:**

    # Download Kaldi
    git clone kaldi --origin upstream
    cd kaldi
    # See the INSTALL file in the current directory for more details
    # Compile tools first
    cd tools
    sh extras/
    make -j 4
    # Configure src before compiling
    cd ../src
    ./configure --shared
    # Check dependencies and compile
    make depend -j 4
    make -j 4
    cd ..

### 2\. Create Project
Create your project with a **4-level name** relative to the Kaldi root directory (level 1), such as **kaldi/egs/xmuspeech/sre**. This matters for the environment setup. For more details, see subtools/

    # Suppose the current directory is the parent of the Kaldi root directory
    mkdir -p kaldi/egs/xmuspeech/sre
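
The 4-level rule can be checked before cloning anything into the project. Below is a small sketch, assuming the Kaldi root counts as level 1 and that paths are written POSIX-style; the helper names `project_depth` and `is_valid_project` are hypothetical and not part of ASV-Subtools.

```python
from pathlib import PurePosixPath


def project_depth(project_dir, kaldi_root="kaldi"):
    """Count path levels of project_dir, with the Kaldi root as level 1.

    Raises ValueError if project_dir is not located under kaldi_root.
    """
    relative = PurePosixPath(project_dir).relative_to(PurePosixPath(kaldi_root))
    return 1 + len(relative.parts)


def is_valid_project(project_dir, kaldi_root="kaldi"):
    """Return True only for a project exactly 4 levels deep under the Kaldi root."""
    try:
        return project_depth(project_dir, kaldi_root) == 4
    except ValueError:
        return False
```

For example, `is_valid_project("kaldi/egs/xmuspeech/sre")` is accepted, while `kaldi/egs/sre` (3 levels) is rejected.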

### 3\. Clone ASV-Subtools
ASV-Subtools can be seen as a set of tools like utils/steps of Kaldi, so there are only two extra stages to complete the installation:
+ Clone ASV-Subtools to your project.
+ Install the Python requirements (**Python3 is recommended**).

    # Clone asv-subtools from github
    cd kaldi/egs/xmuspeech/sre
    git clone

### 4\. Install Python Requirements
+ Pytorch>=1.2: ```pip3 install torch```
+ Other requirements: numpy, thop, pandas, progressbar2, matplotlib, scipy (optional), sklearn (optional)
```pip3 install -r subtools/requirements.txt```
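
After installing, it can help to confirm that the packages are visible to the interpreter. The sketch below uses only the standard library; the import names are assumptions (for instance, progressbar2 is imported as `progressbar`), and `missing_modules` is a hypothetical helper, not something shipped with ASV-Subtools.

```python
import importlib.util

# Import names assumed for the packages listed above.
REQUIRED = ["torch", "numpy", "thop", "pandas", "progressbar", "matplotlib"]
OPTIONAL = ["scipy", "sklearn"]


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("missing required:", missing_modules(REQUIRED))
    print("missing optional:", missing_modules(OPTIONAL))
```

An empty list for the required modules suggests the environment is ready; this only checks that modules can be located, not that their versions satisfy Pytorch>=1.2.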

### 5\. Support Multi-GPU Training
ASV-Subtools provides both **DDP (recommended)** and Horovod solutions to support multi-GPU training.

Subtools is a set of tools based on Pytorch + Kaldi for speaker recognition etc.
**For answers about how to use multi-GPU training, see subtools/pytorch/launcher/. It is very convenient and easy now.**

## Statement
Requirements List:
+ DDP: Pytorch, NCCL
+ Horovod: Pytorch, NCCL, Openmpi, Horovod

## Usage
#### An Example of Installing NCCL on Linux-Centos-7 with CUDA-10.2
# The simple way takes only three stages.
# [1] Download the rpm package from NVIDIA
## Speaker Recognition Recipe
# [2] Add the NVIDIA repo to yum (NOKEY could be ignored)
sudo rpm -i nvidia-machine-learning-repo-rhel7-1.0.0-1.x86_64.rpm
# [3] Install NCCL by yum
sudo yum install libnccl-2.6.4-1+cuda10.2 libnccl-devel-2.6.4-1+cuda10.2 libnccl-static-2.6.4-1+cuda10.2

## Clone
In the project directory, e.g. kaldi/egs/xmuspeech/sre, clone subtools:
These yum-clean commands can be very useful when you run into trouble with yum.

git clone
# Install yum-utils first
yum -y install yum-utils
## Update
yum clean all
cd subtools
git pull
yum-complete-transaction --cleanup-only
## Installing Dependencies
package-cleanup --cleandupes

If you want to install Openmpi and Horovod, see for more details.

yum -y install yum-utils
yum clean all
yum-complete-transaction --cleanup-only
package-cleanup --cleandupes
### 6\. Extra Installation (Optional)

[1] Basic dependencies
## Recipe
### Voxceleb Recipe [Speaker Recognition]
There are two recipes of Voxceleb:

pip3 install torch numpy pandas progressbar2
(1) see subtools/recipe/voxceleb/

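[2] Dependencies for multi-GPU training <solution = Horovod:>

+ Install NCCL <solution = install from the yum online repository:>
(2) see subtools/recipe/voxcelebSRC/

# Download the NVIDIA yum repo package (Centos7, cuda10.2):

# Install the NVIDIA yum repo (the NOKEY warning can be ignored)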
sudo rpm -i nvidia-machine-learning-repo-rhel7-1.0.0-1.x86_64.rpm
### AP-OLR 2020 Baseline Recipe [Language Identification]

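# Install NCCL (nccl2.6.4 + cuda10.2)
sudo yum install libnccl-2.6.4-1+cuda10.2 libnccl-devel-2.6.4-1+cuda10.2 libnccl-static-2.6.4-1+cuda10.2

+ Install Openmpi (high-performance communication package) <solution = download, compile and install:>
Kaldi baseline:

# Download the source code (version 3.1.2 works; higher versions may fail)
Pytorch baseline:

# Extract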
tar zxf openmpi-3.1.2.tar.gz
## Feedback
+ If you find bugs or have questions, please create an issue on GitHub so that everyone can see it and a good solution can be provided.
+ If you have any questions for me, you can also send an e-mail to [email protected] and I will reply in my free time.

# Configure, compile and install
cd openmpi-3.1.2

./configure --prefix=/usr/local

make -j 4

make install

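+ Install Horovod

# GCC version issue <solution = temporarily use the newer GCC-6.3:>
# Update the yum repo and install GCC-6.3
yum -y install centos-release-scl
yum -y install devtoolset-6-gcc devtoolset-6-gcc-c++ devtoolset-6-binutils
# Temporarily enable GCC-6.3 (effective only in the current terminal)
scl enable devtoolset-6 bash   # or: source /opt/rh/devtoolset-6/enable

# If, after installing this way, `import horovod.torch` fails with "/lib64/ version `GLIBCXX_3.4.20' not found" <solution = compile and install from source:>
# Download the source package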
tar xzf gcc-6.3.0.tar.gz
cd gcc-6.3.0
mkdir gcc-build-6.3.0
cd gcc-build-6.3.0
../configure --enable-checking=release --enable-languages=c,c++ --disable-multilib
make -j 4
make install

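Add the environment variable to /root/.bashrc: export LD_LIBRARY_PATH=/usr/local/lib64:$LD_LIBRARY_PATH

# Install the GPU-enabled version (depends on NCCL)

# Environment variables
Add the following to /etc/profile or /root/.bashrc:

export PATH=$PATH:/usr/local/python3/bin/

## Feedback
This is an open-source project. If you have related questions, please contact the author Snowdar [[email protected]]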
## Acknowledgement
+ Thanks to Kaldi, Pytorch, kaldi_io
+ Thanks to everyone who contributes their time and ideas to ASV-Subtools.
+ Thanks to myself also (\^_^).
2 changes: 1 addition & 1 deletion
@@ -1,6 +1,6 @@

# Copyright xmuspeech (Author:2018-09-14)
# Copyright xmuspeech (Author:Snowdar 2018-09-14)

# To avoid adding speaker-ids with an sp prefix in an lre or sre task, you can use this script to correct your datadir
extra_files= # you can add extra files you want to fix, but each needs the utt-id in its 1st field
Binary file added doc/ASV-Subtools-project-structure.png
Binary file added doc/pytorch-data-pipeline.png
Binary file added doc/pytorch-training-framework.png
26 changes: 0 additions & 26 deletions

This file was deleted.

2 changes: 1 addition & 1 deletion pytorch/libs/support/
@@ -368,7 +368,7 @@ def read_log_csv(csv_path:str):
def init_multi_gpu_training(gpu_id="", solution="ddp", port=29500):
num_gpu = len(parse_gpu_id_option(gpu_id))
if num_gpu > 1:
# The ddp solution is suggested.
# The DistributedDataParallel (DDP) solution is suggested.
if solution == "ddp":
if is_main_training():"DDP has been initialized.")
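
The dispatch logic in this hunk (count the GPUs, then pick a multi-GPU solution only when there is more than one) can be sketched without Pytorch. This is an illustration under stated assumptions, not the body of `init_multi_gpu_training`: the helper names `parse_gpu_ids` and `choose_solution` are hypothetical, and the accepted GPU id format (ids separated by commas or spaces) is assumed rather than taken from `parse_gpu_id_option`.

```python
def parse_gpu_ids(gpu_id=""):
    """Split an option string such as "0,1" or "0 1" into integer GPU ids.

    The separators accepted here are an assumption of this sketch.
    """
    return [int(token) for token in gpu_id.replace(",", " ").split()]


def choose_solution(gpu_id="", solution="ddp"):
    """Return "single" for at most one GPU, else the requested multi-GPU solution."""
    if len(parse_gpu_ids(gpu_id)) <= 1:
        return "single"
    if solution not in ("ddp", "horovod"):
        raise ValueError("Unknown multi-GPU solution: {}".format(solution))
    return solution
```

With two or more ids the sketch returns "ddp" by default, matching the comment that the DistributedDataParallel solution is the suggested one.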
1 change: 0 additions & 1 deletion requirements.txt
@@ -1,5 +1,4 @@