From 861c6e92cb8dfd396873af114ace001e66eb1119 Mon Sep 17 00:00:00 2001
From: Tomas Nekvinda <tom@neqindi.cz>
Date: Fri, 8 May 2020 19:32:53 +0200
Subject: [PATCH] Tiny word changes.

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 2e9894c..4ccf801 100644
--- a/README.md
+++ b/README.md
@@ -9,7 +9,7 @@
 
 _______
 
-This repository contains an implementation of **Tacotron 2** that supports **multilingual experiments** and that implements different approaches to **encoder parameter sharing**. Our model combines ideas from [Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning](https://google.github.io/tacotron/publications/multilingual/index.html), [End-to-End Code-Switched TTS with Mix of Monolingual Recordings](https://csttsdemo.github.io/), and [Contextual Parameter Generation for Universal Neural Machine Translation](https://arxiv.org/abs/1808.08493).
+This repository contains an implementation of **Tacotron 2** that supports **multilingual experiments** and that implements different approaches to **encoder parameter sharing**. It also presents a model combining ideas from [Learning to speak fluently in a foreign language: Multilingual speech synthesis and cross-language voice cloning](https://google.github.io/tacotron/publications/multilingual/index.html), [End-to-End Code-Switched TTS with Mix of Monolingual Recordings](https://csttsdemo.github.io/), and [Contextual Parameter Generation for Universal Neural Machine Translation](https://arxiv.org/abs/1808.08493).
 
 <p> </p>
@@ -23,7 +23,7 @@ This repository contains an implementation of **Tacotron 2** that supports **mul
 
 _______
 
-We compared **three multilingual text-to-speech models**. The first **shares the whole encoder** and uses an adversarial classifier to remove speaker-dependent information from the encoder. The second has **separate encoders** for each language. Finally, the third is our attempt to combine the best of both previous approaches, i.e., effective parameter sharing of the first method and flexibility of the second. It has a fully convolutional encoder with language-specific parameters generated by a **parameter generator**. It also makes use of an adversarial speaker classifier which follows principles of domain adversarial training. See the illustration above.
+We provide synthesized samples, training and evaluation data, source code, and parameters for comparison of **three multilingual text-to-speech models**. The first **shares the whole encoder** and uses an adversarial classifier to remove speaker-dependent information from the encoder. The second has **separate encoders** for each language. Finally, the third is our attempt to combine the best of both previous approaches, i.e., effective parameter sharing of the first method and flexibility of the second. It has a fully convolutional encoder with language-specific parameters generated by a **parameter generator**. It also makes use of an adversarial speaker classifier which follows principles of domain adversarial training. See the illustration above.
 
 **Interactive demos** introducing code-switching abilities and joint multilingual training of the generated model (trained on an enhanced CSS10 dataset) are available [here](https://colab.research.google.com/github/Tomiinek/Multilingual_Text_to_Speech/blob/master/notebooks/code_switching_demo.ipynb) and [here](https://github.com/Tomiinek/Multilingual_Text_to_Speech/blob/master/notebooks/multi_training_demo.ipynb), respectively.
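The third model described in the patched paragraph rests on two mechanisms: a parameter generator that produces language-specific encoder weights, and a speaker classifier trained adversarially via gradient reversal. The PyTorch snippet below is a minimal sketch of both ideas, not code from this repository; the names `GeneratedConv1d` and `GradientReversal` and all dimensions are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


class GradientReversal(torch.autograd.Function):
    """Identity in the forward pass; negates gradients in the backward pass,
    so the encoder is pushed to remove speaker information while the
    classifier downstream still tries to predict the speaker."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.scale * grad_output, None


class GeneratedConv1d(torch.nn.Module):
    """A 1-D convolution whose weights and biases are generated from a
    language embedding: each language gets its own encoder parameters,
    while the generator itself is shared across all languages."""

    def __init__(self, lang_dim, in_channels, out_channels, kernel_size):
        super().__init__()
        self.weight_shape = (out_channels, in_channels, kernel_size)
        self.weight_generator = torch.nn.Linear(
            lang_dim, out_channels * in_channels * kernel_size)
        self.bias_generator = torch.nn.Linear(lang_dim, out_channels)

    def forward(self, x, lang_embedding):
        # x: (batch, in_channels, time); lang_embedding: (lang_dim,)
        weight = self.weight_generator(lang_embedding).view(self.weight_shape)
        bias = self.bias_generator(lang_embedding)
        return F.conv1d(x, weight, bias, padding=self.weight_shape[-1] // 2)


# Usage sketch: encode character features for one language, then route the
# encoder output through gradient reversal before the speaker classifier.
conv = GeneratedConv1d(lang_dim=8, in_channels=512, out_channels=512, kernel_size=5)
encoded = conv(torch.randn(2, 512, 100), torch.randn(8))
adversarial_input = GradientReversal.apply(encoded, 1.0)
```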