add emotional embedding and finetune Japanese, maybe add English. #50

Closed
wants to merge 123 commits

Commits
6e24916
Create emo_gen.py
Stardust-minus Oct 7, 2023
138c633
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 7, 2023
8b6f2dc
update server.py, fix bugs in func get_text() and infer(). (#52)
jiangyuxiaoxiao Oct 8, 2023
002ae5a
Extract get_text() and infer() from webui.py. (#53)
jiangyuxiaoxiao Oct 8, 2023
c4f0eff
add emo emb
Stardust-minus Oct 8, 2023
9c59816
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 8, 2023
62c38ce
init emo gen
Stardust-minus Oct 8, 2023
9a68343
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 8, 2023
abf97b5
init emo
Stardust-minus Oct 8, 2023
a669050
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 8, 2023
0acf3c8
init emo
Stardust-minus Oct 8, 2023
8716993
Delete bert/bert-base-japanese-v3 directory
Stardust-minus Oct 8, 2023
e8a158c
Create .gitkeep
Stardust-minus Oct 8, 2023
b98cbb9
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 8, 2023
db74663
Create add_punc.py
Stardust-minus Oct 8, 2023
b57dfd7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 8, 2023
dc48bb4
fix bug in bert_gen.py (#54)
jiangyuxiaoxiao Oct 9, 2023
61e95d2
Update README.md
Stardust-minus Oct 9, 2023
e4250e2
fix bug in models.py (#56)
jiangyuxiaoxiao Oct 10, 2023
d2210b2
Update models.py
Stardust-minus Oct 10, 2023
09e8146
Fix japanese cleaner (#61)
Akito-UzukiP Oct 12, 2023
c087f82
Merge branch 'master' into dev
Stardust-minus Oct 12, 2023
d3d0e78
Apply Code Formatter Change
Stardust-minus Oct 12, 2023
ec5ec86
Add config.yml for global configuration. (#62)
jiangyuxiaoxiao Oct 13, 2023
20ce2fc
Update webui.py (#65)
jiangyuxiaoxiao Oct 15, 2023
13bb441
Fix (#68)
Akito-UzukiP Oct 16, 2023
9e8c4a1
Update infer.py and webui.py. Supports loading and inference models …
jiangyuxiaoxiao Oct 16, 2023
5ffa506
Fix bug in translate.py (#69)
jiangyuxiaoxiao Oct 16, 2023
070f71c
Supports loading and inference of 1.1, 1.0.1, and 1.0 version models. (#70)
jiangyuxiaoxiao Oct 16, 2023
6440411
Update japanese.py (#71)
OedoSoldier Oct 16, 2023
989950b
Configure bert_gen.py, preprocess_text.py, and resample.py via the config file (#72)
jiangyuxiaoxiao Oct 17, 2023
513c4e1
Delete bert/bert-base-japanese-v3 directory
Stardust-minus Oct 17, 2023
aa5fbc8
Create config.json
Stardust-minus Oct 17, 2023
ecf81bb
Create tokenizer_config.json
Stardust-minus Oct 17, 2023
6098266
Create vocab.txt
Stardust-minus Oct 17, 2023
bdc6afe
Update server.py. Support multiple versions and multiple models (#76)
jiangyuxiaoxiao Oct 18, 2023
70392e7
Dev webui (#77)
Stardust-minus Oct 18, 2023
9582243
Create config.json
Stardust-minus Oct 18, 2023
55c23d1
Create preprocessor_config.json
Stardust-minus Oct 18, 2023
ad9cb2d
Create vocab.json
Stardust-minus Oct 18, 2023
3793a2d
Delete emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim/.gitkeep
Stardust-minus Oct 18, 2023
3059d57
Update emo_gen.py
Stardust-minus Oct 18, 2023
88b8221
Delete add_punc.py
Stardust-minus Oct 18, 2023
3b76fbd
add emotion_clustering.i
Stardust-minus Oct 18, 2023
0ca0160
Merge branch 'master' into dev
Stardust-minus Oct 18, 2023
969692a
Apply Code Formatter Change
Stardust-minus Oct 18, 2023
d4637b0
Update models.py
Stardust-minus Oct 18, 2023
d265262
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 18, 2023
02e4ed4
Update preprocess_text.py (#78)
AnyaCoder Oct 18, 2023
93f5774
Update preprocess_text.py. Detect duplicate and missing audio files (#79)
jiangyuxiaoxiao Oct 19, 2023
d83f70d
Handle Japanese long pronunciations (#80)
OedoSoldier Oct 19, 2023
69a7a6c
Use unified phonemes for Japanese long vowel (#82)
OedoSoldier Oct 19, 2023
a794154
Add a button that, when clicked, splits text by sentence and inserts "|" (#81)
YYuX-1145 Oct 19, 2023
05ad3ec
Fix phonemer bug (#83)
OedoSoldier Oct 19, 2023
f7f4bc8
Fix long vowel handler bug (#84)
OedoSoldier Oct 19, 2023
a87e77d
Add a feature from the all-in-one package manager: customizable pauses between sentences and paragraphs in long-text synthesis (#85)
YYuX-1145 Oct 19, 2023
d25c503
Update train_ms.py
Stardust-minus Oct 19, 2023
b917d54
fix
Stardust-minus Oct 19, 2023
676dec6
Update cleaner.py
Stardust-minus Oct 19, 2023
25a1823
add en
Stardust-minus Oct 19, 2023
b3275d5
add en
Stardust-minus Oct 19, 2023
0fcfab9
Update english.py
Stardust-minus Oct 19, 2023
351180c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 19, 2023
0a652b9
add en
Stardust-minus Oct 19, 2023
f049f0e
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 19, 2023
b1181b8
add en
Stardust-minus Oct 19, 2023
42c9a2d
add en
Stardust-minus Oct 19, 2023
f78794f
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 19, 2023
366cb5d
add en
Stardust-minus Oct 19, 2023
0439c40
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 19, 2023
b93eae8
Update README.md
Stardust-minus Oct 20, 2023
adfb708
Update README.md
Stardust-minus Oct 20, 2023
65f11b7
Update README.md
Stardust-minus Oct 20, 2023
ae8c7f1
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 20, 2023
77f51a2
Change phonemer to pyopenjtalk (#86)
OedoSoldier Oct 20, 2023
ee4a5de
Update english.py
Stardust-minus Oct 20, 2023
c42ce35
Fix english_bert_mock.py. (#87)
jiangyuxiaoxiao Oct 20, 2023
b4f96f7
Add punctuation exceptions (#88)
OedoSoldier Oct 20, 2023
db559aa
remove get bert
Stardust-minus Oct 21, 2023
0375448
Merge branch 'master' into dev
Stardust-minus Oct 21, 2023
6c681b2
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Oct 21, 2023
726b4d0
Fix bug in oldVersion. (#89)
jiangyuxiaoxiao Oct 22, 2023
97a933b
Update requirements.txt
Stardust-minus Oct 22, 2023
7ddd8be
change to large
Stardust-minus Oct 22, 2023
9e8109f
rollback requirements.txt
Stardust-minus Oct 22, 2023
2a1c183
Feat: Enable 1.1.1 models using fix-ver infer. (#91)
jiangyuxiaoxiao Oct 22, 2023
681569f
Add Japanese accent (high-low) (#90)
OedoSoldier Oct 22, 2023
59a032d
Do not replace iteration mark (#92)
OedoSoldier Oct 22, 2023
ac922dd
Fix: fix import error in oldVersion (#93)
jiangyuxiaoxiao Oct 23, 2023
59e676f
Refactor: reusing model loading in webui.py and server.py. (#94)
jiangyuxiaoxiao Oct 23, 2023
ea8764e
Feat: Enable using config.yml in train_ms.py (#96)
jiangyuxiaoxiao Oct 24, 2023
c3e9bc0
Update emo_gen.py
Stardust-minus Oct 24, 2023
89a585b
Change emo_gen.py (#97)
OedoSoldier Oct 24, 2023
327e5e3
Fix queue (#98)
OedoSoldier Oct 24, 2023
034efd6
Fix training bugs (#99)
OedoSoldier Oct 24, 2023
36c5ffc
Update infer.py (#100)
AnyaCoder Oct 25, 2023
1ce3bc4
Add reference audio (#101)
OedoSoldier Oct 25, 2023
51b8a9a
Fix: fix 1.1.1-fix (#102)
jiangyuxiaoxiao Oct 25, 2023
b408cfe
Fix infer bug (#103)
OedoSoldier Oct 25, 2023
45a12ce
Feat: Add server_fastapi.py. (#104)
jiangyuxiaoxiao Oct 25, 2023
d6f1aee
Fix: requirements.txt. (#105)
jiangyuxiaoxiao Oct 25, 2023
ee900d8
Switch to deberta-v3-large (#106)
OedoSoldier Oct 26, 2023
6e02141
Feat: Update config.py. (#107)
jiangyuxiaoxiao Oct 26, 2023
685e18a
Dev fix (#108)
AnyaCoder Oct 26, 2023
1000e9f
Revert "Dev fix (#108)" (#109)
Stardust-minus Oct 26, 2023
456302d
Dev fix (#110)
AnyaCoder Oct 26, 2023
d753d2b
Add emo vec quantizer (#111)
OedoSoldier Oct 26, 2023
baa273c
Clean req and gitignore (#112)
OedoSoldier Oct 26, 2023
60073b2
Switch to deberta-v2-large-japanese (#113)
OedoSoldier Oct 26, 2023
36bc21a
Fix emo bugs (#114)
OedoSoldier Oct 26, 2023
90948db
Fix english (#115)
OedoSoldier Oct 26, 2023
38a8415
Don't train codebook (#116)
OedoSoldier Oct 26, 2023
a1354bc
Update requirements.txt
Stardust-minus Oct 26, 2023
ff3d33e
Update english_bert_mock.py
Stardust-minus Oct 26, 2023
f7b6d85
Fix: server_fastapi.py (#118)
jiangyuxiaoxiao Oct 27, 2023
a8a717d
Fix: don't print debug logging. (#119)
jiangyuxiaoxiao Oct 27, 2023
d24b837
Merge branch 'master' into dev
Stardust-minus Oct 27, 2023
52211b9
Apply Code Formatter Change
Stardust-minus Oct 27, 2023
34a59be
Update and fix bugs (#121)
jiangyuxiaoxiao Oct 27, 2023
8ac9f28
Update train_ms.py
Stardust-minus Oct 28, 2023
b06cc0c
Fix emo_gen (#127)
OedoSoldier Oct 28, 2023
9ae0015
Update emo_gen.py (#129)
OedoSoldier Oct 28, 2023
3bc84e6
Update vq (#130)
OedoSoldier Oct 28, 2023
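
The emo_gen.py commits above add per-utterance emotion embeddings extracted with a pretrained speech-emotion model (the commits also create the emotional/wav2vec2-large-robust-12-ft-emotion-msp-dim directory, whose .bin weights are git-ignored in the diff below). A minimal sketch of that kind of extraction, assuming the audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim checkpoint and mean-pooled hidden states as the utterance vector — not necessarily the exact pipeline this PR ships:

```python
# Hedged sketch: per-utterance emotion embedding from a pretrained
# wav2vec2 emotion model. The checkpoint id and mean-pooling are
# assumptions; the PR's emo_gen.py may differ in detail.
import librosa
import torch
from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

MODEL_ID = "audeering/wav2vec2-large-robust-12-ft-emotion-msp-dim"

extractor = Wav2Vec2FeatureExtractor.from_pretrained(MODEL_ID)
model = Wav2Vec2Model.from_pretrained(MODEL_ID).eval()

def emotion_embedding(wav_path: str) -> torch.Tensor:
    # The emotion model expects 16 kHz mono audio.
    audio, sr = librosa.load(wav_path, sr=16000, mono=True)
    inputs = extractor(audio, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        hidden = model(inputs.input_values).last_hidden_state  # (1, T, 1024)
    return hidden.mean(dim=1).squeeze(0)  # fixed-size utterance vector

emb = emotion_embedding("sample.wav")
print(emb.shape)  # torch.Size([1024])
```

Later commits refine how these vectors are consumed: "Add emo vec quantizer (#111)" maps them onto a codebook, and "Don't train codebook (#116)" freezes that codebook during training.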
Files changed
11 changes: 11 additions & 0 deletions .gitignore
@@ -166,3 +166,14 @@ cython_debug/
filelists/*
!/filelists/esd.list
data/*
/config.yml
/Web/
/emotional/*/*.bin
/bert/*/*.bin
/bert/*/*.h5
/bert/*/*.model
/bert/*/*.safetensors
asr_transcript.py
extract_list.py
/Data
Data/*
Empty file added .gitmodules
Empty file.
12 changes: 3 additions & 9 deletions README.md
@@ -10,16 +10,8 @@ VITS2 Backbone with bert
[//]: # ()
[//]: # (This repo traces back to a friend sharing a video by AI Feng-ge; I was stunned by its results. After trying MassTTS myself I found that fs falls somewhat short of vits in audio quality and that its training pipeline is more complex, so following that idea I set out to combine bert)

[//]: # (with vits to obtain better prosody. We play with open-source projects purely out of interest, powered by love, and we never intended to clash with anyone. However, [MaxMax2016](https://github.com/MaxMax2016))

[//]: # (and his organization [PlayVoice](https://github.com/PlayVoice) have repeatedly come around with baseless accusations that this project plagiarized their code, even talking about going to court, so we state explicitly in this Readme that this project has)

[//]: # (nothing to do with [PlayVoice/vits_chinese](https://github.com/PlayVoice/vits_chinese); the idea of combining bert also comes entirely from MassTTS)


[//]: # (Appendix: the evidence the other side cites for the plagiarism claim, which you can review and judge for yourself: [the actual MassTTS code referenced by bert_vits2](https://github.com/PlayVoice/vits_chinese/tree/4781241520c6b9fdcf090fca289148719272e89f#bert_vits2%E5%BC%95%E7%94%A8%E7%9A%84masstts%E7%9A%84%E5%AE%9E%E9%99%85%E4%BB%A3%E7%A0%81) )

## Seasoned Travelers/Trailblazers/Captains/Doctors/sensei/Witchers/Meowmeow Lulu/V should consult the code and learn how to train on their own.

### Using this project for any purpose that violates the Constitution, the Criminal Law, the Public Security Administration Punishments Law, or the Civil Code of the People's Republic of China is strictly prohibited.
### Use for any politics-related purpose is strictly prohibited.
#### Video: https://www.bilibili.com/video/BV1hp4y1K78E
@@ -30,6 +22,8 @@ VITS2 Backbone with bert
+ [p0p4k/vits2_pytorch](https://github.com/p0p4k/vits2_pytorch)
+ [svc-develop-team/so-vits-svc](https://github.com/svc-develop-team/so-vits-svc)
+ [PaddlePaddle/PaddleSpeech](https://github.com/PaddlePaddle/PaddleSpeech)
+ [emotional-vits](https://github.com/innnky/emotional-vits)
+ [Bert-VITS2-en](https://github.com/xwan07017/Bert-VITS2-en)
## Thanks to all contributors for their efforts
<a href="https://github.com/fishaudio/Bert-VITS2/graphs/contributors" target="_blank">
<img src="https://contrib.rocks/image?repo=fishaudio/Bert-VITS2"/>
34 changes: 34 additions & 0 deletions bert/bert-base-japanese-v3/.gitattributes
@@ -0,0 +1,34 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
34 changes: 34 additions & 0 deletions bert/bert-large-japanese-v2/.gitattributes
@@ -0,0 +1,34 @@
*.7z filter=lfs diff=lfs merge=lfs -text
*.arrow filter=lfs diff=lfs merge=lfs -text
*.bin filter=lfs diff=lfs merge=lfs -text
*.bz2 filter=lfs diff=lfs merge=lfs -text
*.ckpt filter=lfs diff=lfs merge=lfs -text
*.ftz filter=lfs diff=lfs merge=lfs -text
*.gz filter=lfs diff=lfs merge=lfs -text
*.h5 filter=lfs diff=lfs merge=lfs -text
*.joblib filter=lfs diff=lfs merge=lfs -text
*.lfs.* filter=lfs diff=lfs merge=lfs -text
*.mlmodel filter=lfs diff=lfs merge=lfs -text
*.model filter=lfs diff=lfs merge=lfs -text
*.msgpack filter=lfs diff=lfs merge=lfs -text
*.npy filter=lfs diff=lfs merge=lfs -text
*.npz filter=lfs diff=lfs merge=lfs -text
*.onnx filter=lfs diff=lfs merge=lfs -text
*.ot filter=lfs diff=lfs merge=lfs -text
*.parquet filter=lfs diff=lfs merge=lfs -text
*.pb filter=lfs diff=lfs merge=lfs -text
*.pickle filter=lfs diff=lfs merge=lfs -text
*.pkl filter=lfs diff=lfs merge=lfs -text
*.pt filter=lfs diff=lfs merge=lfs -text
*.pth filter=lfs diff=lfs merge=lfs -text
*.rar filter=lfs diff=lfs merge=lfs -text
*.safetensors filter=lfs diff=lfs merge=lfs -text
saved_model/**/* filter=lfs diff=lfs merge=lfs -text
*.tar.* filter=lfs diff=lfs merge=lfs -text
*.tflite filter=lfs diff=lfs merge=lfs -text
*.tgz filter=lfs diff=lfs merge=lfs -text
*.wasm filter=lfs diff=lfs merge=lfs -text
*.xz filter=lfs diff=lfs merge=lfs -text
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
53 changes: 53 additions & 0 deletions bert/bert-large-japanese-v2/README.md
@@ -0,0 +1,53 @@
---
license: apache-2.0
datasets:
- cc100
- wikipedia
language:
- ja
widget:
- text: 東北大学で[MASK]の研究をしています。
---

# BERT large Japanese (unidic-lite with whole word masking, CC-100 and jawiki-20230102)

This is a [BERT](https://github.com/google-research/bert) model pretrained on texts in the Japanese language.

This version of the model processes input texts with word-level tokenization based on the Unidic 2.1.2 dictionary (available in the [unidic-lite](https://pypi.org/project/unidic-lite/) package), followed by WordPiece subword tokenization.
Additionally, the model is trained with whole word masking enabled for the masked language modeling (MLM) objective.

The code for the pretraining is available at [cl-tohoku/bert-japanese](https://github.com/cl-tohoku/bert-japanese/).
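
As a quick smoke test, the MLM head can be queried through a fill-mask pipeline — a hedged sketch, assuming the checkpoint is published on the Hugging Face hub (the tohoku-nlp/bert-large-japanese-v2 id is an assumption) and that fugashi and unidic-lite are installed:

```python
# Sketch: exercise the masked-LM objective with the widget text above.
from transformers import pipeline

fill = pipeline("fill-mask", model="tohoku-nlp/bert-large-japanese-v2")
for pred in fill("東北大学で[MASK]の研究をしています。")[:3]:
    print(pred["token_str"], round(pred["score"], 3))
```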

## Model architecture

The model architecture is the same as the original BERT large model: 24 layers, 1024-dimensional hidden states, and 16 attention heads.

## Training Data

The model is trained on the Japanese portion of the [CC-100 dataset](https://data.statmt.org/cc-100/) and the Japanese version of Wikipedia.
For Wikipedia, we generated a text corpus from the [Wikipedia Cirrussearch dump file](https://dumps.wikimedia.org/other/cirrussearch/) as of January 2, 2023.
The corpus files generated from CC-100 and Wikipedia are 74.3GB and 4.9GB in size and consist of approximately 392M and 34M sentences, respectively.

For the purpose of splitting texts into sentences, we used [fugashi](https://github.com/polm/fugashi) with the [mecab-ipadic-NEologd](https://github.com/neologd/mecab-ipadic-neologd) dictionary (v0.0.7).

## Tokenization

The texts are first tokenized by MeCab with the Unidic 2.1.2 dictionary and then split into subwords by the WordPiece algorithm.
The vocabulary size is 32768.

We used the [fugashi](https://github.com/polm/fugashi) and [unidic-lite](https://github.com/polm/unidic-lite) packages for the tokenization.
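
The two-stage tokenization can be inspected directly; a small sketch (same hub-id assumption as above, with fugashi and unidic-lite installed):

```python
# Sketch: MeCab/Unidic word segmentation followed by WordPiece.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("tohoku-nlp/bert-large-japanese-v2")
print(tok.tokenize("東北大学で自然言語処理の研究をしています。"))
# Words missing from the vocabulary fall apart into "##"-prefixed subwords.
```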

## Training

We trained the model first on the CC-100 corpus for 1M steps and then on the Wikipedia corpus for another 1M steps.
For training of the MLM (masked language modeling) objective, we introduced whole word masking in which all of the subword tokens corresponding to a single word (tokenized by MeCab) are masked at once.
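
Whole word masking of this kind can be approximated with the stock collator in transformers — a sketch, not the original pretraining code (which lives in cl-tohoku/bert-japanese):

```python
# Sketch: mask all subwords of a MeCab word together, BERT-WWM style.
from transformers import AutoTokenizer, DataCollatorForWholeWordMask

tok = AutoTokenizer.from_pretrained("tohoku-nlp/bert-large-japanese-v2")
collator = DataCollatorForWholeWordMask(tokenizer=tok, mlm_probability=0.15)

enc = tok("東北大学で自然言語処理の研究をしています。")
batch = collator([{"input_ids": enc["input_ids"]}])
# [MASK] now covers every subword of each selected word at once.
print(tok.convert_ids_to_tokens(batch["input_ids"][0].tolist()))
```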

For training of each model, we used a v3-8 instance of Cloud TPUs provided by [TPU Research Cloud](https://sites.research.google/trc/about/).

## Licenses

The pretrained models are distributed under the Apache License 2.0.

## Acknowledgments

This model is trained with Cloud TPUs provided by the [TPU Research Cloud](https://sites.research.google/trc/about/) program.
19 changes: 19 additions & 0 deletions bert/bert-large-japanese-v2/config.json
@@ -0,0 +1,19 @@
{
"architectures": [
"BertForPreTraining"
],
"attention_probs_dropout_prob": 0.1,
"hidden_act": "gelu",
"hidden_dropout_prob": 0.1,
"hidden_size": 1024,
"initializer_range": 0.02,
"intermediate_size": 4096,
"layer_norm_eps": 1e-12,
"max_position_embeddings": 512,
"model_type": "bert",
"num_attention_heads": 16,
"num_hidden_layers": 24,
"pad_token_id": 0,
"type_vocab_size": 2,
"vocab_size": 32768
}
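
For reference, the shape described by this config can be instantiated without weights — a sketch (the actual .bin/.safetensors weights are git-ignored above and downloaded separately):

```python
# Sketch: build a randomly initialized model matching this local config.
from transformers import BertConfig, BertModel

config = BertConfig.from_json_file("bert/bert-large-japanese-v2/config.json")
model = BertModel(config)  # random weights; real ones ship separately
print(config.hidden_size, config.num_hidden_layers, config.vocab_size)  # 1024 24 32768
```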
10 changes: 10 additions & 0 deletions bert/bert-large-japanese-v2/tokenizer_config.json
@@ -0,0 +1,10 @@
{
"tokenizer_class": "BertJapaneseTokenizer",
"model_max_length": 512,
"do_lower_case": false,
"word_tokenizer_type": "mecab",
"subword_tokenizer_type": "wordpiece",
"mecab_kwargs": {
"mecab_dic": "unidic_lite"
}
}
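
With vocab.txt and this file in place, the tokenizer loads from the local directory rather than the hub — a sketch, assuming the repository layout used here and fugashi/unidic-lite installed:

```python
# Sketch: load BertJapaneseTokenizer from the checked-in local files;
# tokenizer_config.json above selects the class and MeCab settings.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("./bert/bert-large-japanese-v2")
print(type(tok).__name__)  # BertJapaneseTokenizer
print(tok.tokenize("こんにちは、世界。"))
```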