espnet-tts-serving

Serves ESPnet (version 2) TTS models. The Python code is packaged into a Docker container that runs PyTorch on CPU or GPU. The service performs PyTorch model inference only; no special frontend is defined here. Input is a list of phonemes: {"text":"a <space> b", "voice": "sample.v"}. Output is a base64-encoded spectrogram prediction: {"data":"T5CE ...<truncated>... AAA=="}.
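A minimal client-side sketch of the request and response shapes described above. The helper names `build_request` and `decode_response` are hypothetical, joining phonemes with single spaces is an assumption based on the sample input, and the byte layout of the decoded spectrogram is not specified here, so the example stops at raw bytes.

```python
import base64
import json


def build_request(phonemes, voice):
    """Build the JSON request body (assumes phonemes are space-separated)."""
    return json.dumps({"text": " ".join(phonemes), "voice": voice})


def decode_response(body):
    """Return the raw bytes of the base64-encoded spectrogram in `data`."""
    return base64.b64decode(json.loads(body)["data"])


# Round trip with a fake response; a real one would come from the service.
request_body = build_request(["a", "<space>", "b"], "sample.v")
fake_response = json.dumps(
    {"data": base64.b64encode(b"\x00\x01\x02").decode("ascii")}
)
spectrogram_bytes = decode_response(fake_response)
```

How the request is sent (URL, port, HTTP method) depends on the deployment; see the samples under deploy/cpu or deploy/gpu.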

Configuration

The service can serve several models. It takes a configuration file as input; see deploy/cpu/voices.yaml for a sample. The service loads the model for the requested voice name and keeps it in memory until a request for a different voice arrives. Several models can be held in memory at once by setting the WORKERS environment variable.
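The keep-until-another-voice-arrives behaviour can be pictured as a single-slot cache. This is only an illustrative sketch, not the service's actual code: the class `SingleModelCache` and the `loader` callable are hypothetical names, and the loader here returns a placeholder string instead of a real model.

```python
class SingleModelCache:
    """Keeps one loaded model and reloads only when the voice changes."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: voice name -> model
        self._voice = None
        self._model = None
        self.loads = 0  # number of (expensive) model loads performed

    def get(self, voice):
        if voice != self._voice:
            # A different voice was requested: replace the cached model.
            self._model = self._loader(voice)
            self._voice = voice
            self.loads += 1
        return self._model


# Placeholder loader standing in for real model loading.
cache = SingleModelCache(lambda voice: f"model-for-{voice}")
for name in ["sample.v", "sample.v", "other.v"]:
    cache.get(name)
```

In this picture, each worker would hold one such slot, which is one way to read how WORKERS allows several models to stay in memory.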

A simple load balancer is also implemented. It tries to keep a model in memory while many requests for the same voice are waiting.
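The actual balancing strategy is not described here. As a rough sketch of the general idea only, one could prefer waiting requests that match the voice already loaded. The function `pick_next` and the queue layout are assumptions made for illustration.

```python
from collections import deque


def pick_next(pending, loaded_voice):
    """Pop and return the next voice to serve from `pending` (a deque).

    Prefers the oldest request matching `loaded_voice` to avoid a reload;
    otherwise falls back to the oldest request overall.
    """
    for i, voice in enumerate(pending):
        if voice == loaded_voice:
            del pending[i]
            return voice
    return pending.popleft()


# Hypothetical queue of waiting requests, identified by voice name.
pending = deque(["b.v", "a.v", "a.v"])
order = []
loaded = "a.v"
while pending:
    loaded = pick_next(pending, loaded)
    order.append(loaded)
```

A policy this simple can starve less popular voices, so a real balancer would likely also bound how long a request may wait.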

Sample usage

See deploy/cpu or deploy/gpu for deployment and testing samples.

ESPnet version 1

For ESPnet (version 1), see the ESPnet1 branch.

License

Released under the 3-Clause BSD License.
