v1.0.0 beta: Complete Refactor, Reduced Error Rate, Advanced Training Features, Matching Mode
New Features
- Complete Refactor: Refactored nearly all the code to improve maintainability.
- Reduced Error Rate: Optimizations were made to the model structure, loss function, and decoding methods, reducing the error rate of phoneme alignment.
- Advanced Training Features:
  - Automatic Mixed Precision Training: Utilized PyTorch Lightning's built-in mixed precision training. Simply specify the `accelerator` hyperparameter in `train_config.yaml` (default is `bf16-mixed`).
  - Pretrained Models: Allows fine-tuning with a pretrained model. After downloading a pretrained model compatible with the current version from the release, ensure that the `model` hyperparameter in `train_config.yaml` matches the pretrained model you wish to use (this is generally provided on the release page). During training, use `python train.py -p path_to_your_pretrained_ckpt` to specify the pretrained model. For more advanced features, refer to the contents of `train_config.yaml`.
- Matching Mode: Allows for the identification of a continuous sequence segment that maximizes probability within a given sequence of phonemes during inference, without the necessity to use all phonemes, similar to LyricFA. To enable during inference, specify `-m`.
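To illustrate the idea behind Matching Mode, here is a minimal sketch of a Viterbi-style search with a free start and a free end. It is not this project's decoder. It assumes a hypothetical input `log_probs[t][p]` (per-frame log-probability of phoneme id `p`) and a hypothetical helper name `match_segment`. The alignment rules are also assumptions made for illustration: every frame is assigned to one phoneme, and the path either stays on a phoneme or advances to the next one.

```python
from typing import Sequence, Tuple


def match_segment(
    log_probs: Sequence[Sequence[float]],
    phonemes: Sequence[int],
) -> Tuple[int, int, float]:
    """Return (start, end, score) such that phonemes[start:end] is the
    contiguous slice whose monotonic alignment to all frames scores highest.

    Illustrative sketch only: the inputs and the scoring rule are assumptions.
    """
    n_frames, n_ph = len(log_probs), len(phonemes)
    if n_frames == 0 or n_ph == 0:
        raise ValueError("need at least one frame and one phoneme")

    # score[n]: best total log-prob of a path that is on phoneme n at the current frame.
    # origin[n]: index of the phoneme where that best path started.
    # Free start: the path may begin on any phoneme at frame 0.
    score = [log_probs[0][p] for p in phonemes]
    origin = list(range(n_ph))

    for t in range(1, n_frames):
        new_score = [0.0] * n_ph
        new_origin = [0] * n_ph
        for n, p in enumerate(phonemes):
            # Either advance from the previous phoneme or stay on the same one.
            if n > 0 and score[n - 1] > score[n]:
                prev, start = score[n - 1], origin[n - 1]
            else:
                prev, start = score[n], origin[n]
            new_score[n] = prev + log_probs[t][p]
            new_origin[n] = start
        score, origin = new_score, new_origin

    # Free end: the path may finish on any phoneme at the last frame.
    best_end = max(range(n_ph), key=lambda n: score[n])
    return origin[best_end], best_end + 1, score[best_end]


if __name__ == "__main__":
    hi, lo = -0.1, -5.0  # toy log-probabilities
    favoured = [1, 1, 2, 2]  # phoneme id each frame prefers
    frames = [[hi if p == f else lo for p in range(4)] for f in favoured]
    print(match_segment(frames, [0, 1, 2, 3]))
```

In the toy input, the frames favour phoneme ids 1, 1, 2, 2, so against the sequence `[0, 1, 2, 3]` the search selects the slice `[1:3]` and leaves phonemes 0 and 3 unused.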
Removed Features
- Breath Sound Detection: Because it is complex to implement, breath sound (aspiration) detection was not implemented in this release. This feature may be added in the future.