This repository has been archived by the owner on Jan 18, 2024. It is now read-only.

Commit

Tiny word changes.
Tomiinek committed May 8, 2020
1 parent d35d430 commit 861c6e9
Showing 1 changed file with 2 additions and 2 deletions.
README.md: 4 changes (2 additions & 2 deletions)
@@ -9,7 +9,7 @@

_______

-This repository contains an implementation of **Tacotron 2** that supports **multilingual experiments** and that implements different approaches to **encoder parameter sharing**. Our model combines ideas from [Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning](https://google.github.io/tacotron/publications/multilingual/index.html), [End-to-End Code-Switched TTS with Mix of Monolingual Recordings](https://csttsdemo.github.io/), and [Contextual Parameter Generation for Universal Neural Machine Translation](https://arxiv.org/abs/1808.08493).
+This repository contains an implementation of **Tacotron 2** that supports **multilingual experiments** and that implements different approaches to **encoder parameter sharing**. It also presents a model combining ideas from [Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning](https://google.github.io/tacotron/publications/multilingual/index.html), [End-to-End Code-Switched TTS with Mix of Monolingual Recordings](https://csttsdemo.github.io/), and [Contextual Parameter Generation for Universal Neural Machine Translation](https://arxiv.org/abs/1808.08493).

<p>&nbsp;</p>

@@ -23,7 +23,7 @@ This repository contains an implementation of **Tacotron 2** that supports **mul

_______

-We compared **three multilingual text-to-speech models**. The first **shares the whole encoder** and uses an adversarial classifier to remove speaker-dependent information from the encoder. The second has **separate encoders** for each language. Finally, the third is our attempt to combine the best of both previous approaches, i.e., effective parameter sharing of the first method and flexibility of the second. It has a fully convolutional encoder with language-specific parameters generated by a **parameter generator**. It also makes use of an adversarial speaker classifier which follows principles of domain adversarial training. See the illustration above.
+We provide synthesized samples, training and evaluation data, source code, and parameters for comparison of **three multilingual text-to-speech models**. The first **shares the whole encoder** and uses an adversarial classifier to remove speaker-dependent information from the encoder. The second has **separate encoders** for each language. Finally, the third is our attempt to combine the best of both previous approaches, i.e., effective parameter sharing of the first method and flexibility of the second. It has a fully convolutional encoder with language-specific parameters generated by a **parameter generator**. It also makes use of an adversarial speaker classifier which follows principles of domain adversarial training. See the illustration above.

**Interactive demos** introducing code-switching abilities and joint multilingual training of the generated model (trained on an enhanced CSS10 dataset) are available [here](https://colab.research.google.com/github/Tomiinek/Multilingual_Text_to_Speech/blob/master/notebooks/code_switching_demo.ipynb) and [here](https://github.com/Tomiinek/Multilingual_Text_to_Speech/blob/master/notebooks/multi_training_demo.ipynb), respectively.

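The paragraph added in this commit mentions two architectural ideas: an adversarial speaker classifier trained in the style of domain adversarial training (i.e., through gradient reversal), and a parameter generator that produces language-specific encoder weights. Below is a minimal PyTorch-style sketch of both, assuming one language embedding per batch; the class names, dimensions, and wiring are illustrative assumptions and do not mirror the repository's actual modules.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; sign-flipped, scaled gradient in the backward pass."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x.clone()

    @staticmethod
    def backward(ctx, grad_output):
        # Gradient w.r.t. x is reversed; `scale` receives no gradient.
        return -ctx.scale * grad_output, None


class SpeakerAdversarialClassifier(nn.Module):
    """Predicts the speaker from encoder outputs through gradient reversal,
    pushing the encoder towards speaker-independent representations."""

    def __init__(self, encoder_dim, num_speakers, grad_scale=1.0):
        super().__init__()
        self.grad_scale = grad_scale
        self.classifier = nn.Sequential(
            nn.Linear(encoder_dim, 256), nn.ReLU(), nn.Linear(256, num_speakers)
        )

    def forward(self, encoder_outputs):
        reversed_features = GradReverse.apply(encoder_outputs, self.grad_scale)
        return self.classifier(reversed_features)


class GeneratedLinear(nn.Module):
    """Linear layer whose weights are produced by a shared generator from a
    language embedding, so each language gets its own effective parameters."""

    def __init__(self, lang_embedding_dim, in_features, out_features):
        super().__init__()
        self.in_features, self.out_features = in_features, out_features
        self.weight_generator = nn.Linear(lang_embedding_dim, in_features * out_features)
        self.bias_generator = nn.Linear(lang_embedding_dim, out_features)

    def forward(self, x, lang_embedding):
        # lang_embedding: a single vector of shape (lang_embedding_dim,),
        # i.e., one language per forward pass in this simplified sketch.
        weight = self.weight_generator(lang_embedding).view(self.out_features, self.in_features)
        bias = self.bias_generator(lang_embedding)
        return F.linear(x, weight, bias)
```

In a full model, the reversed speaker-classification loss would be added to the synthesis loss so that the encoder is penalized for encoding speaker identity, while generated layers of the `GeneratedLinear` kind would replace ordinary convolutions or linear layers inside the encoder.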
