A comparison of TTS APIs and Open Source projects on price, speed and quality

yagudaev/tts-apis-comparison

Text to Speech APIs Comparison

A comparison of TTS APIs and Open Source projects on price, speed and quality.

Last Updated: June 11, 2024

This repo contains audio files and data comparing the major text-to-speech APIs and (soon) open source projects.

We use text from Paul Graham's essay as the input to convert and compare.

Default Alive or Default Dead?

October 2015

When I talk to a startup that's been operating for more than 8 or 9 months, the first thing I want to know is almost always the same. Assuming their expenses remain constant and their revenue growth is what it has been over the last several months, do they make it to profitability on the money they have left? Or to put it more dramatically, by default do they live or die?

The startling thing is how often the founders themselves don't know. Half the founders I talk to don't know whether they're default alive or default dead.

If you're among that number, Trevor Blackwell has made a handy calculator you can use to find out.

The reason I want to know first whether a startup is default alive or default dead is that the rest of the conversation depends on the answer. If the company is default alive, we can talk about ambitious new things they could do. If it's default dead, we probably need to talk about how to save it. We know the current trajectory ends badly. How can they get off that trajectory?

Why do so few founders know whether they're default alive or default dead? Mainly, I think, because they're not used to asking that. It's not a question that makes sense to ask early on, any more than it makes sense to ask a 3 year old how he plans to support himself. But as the company grows older, the question switches from meaningless to critical. That kind of switch often takes people by surprise.

I propose the following solution: instead of starting to ask too late whether you're default alive or default dead, start asking too early. It's hard to say precisely when the question switches polarity. But it's probably not that dangerous to start worrying too early that you're default dead, whereas it's very dangerous to start worrying too late.

The result is about 1.5 to 2 minutes of audio, depending on the pauses each model inserts.
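To make the price and speed axes concrete, here is a minimal sketch of two metrics such a comparison could record per model. It assumes per-character pricing and uses the real-time factor (generation time divided by audio length) as the speed measure; this repo does not state how its figures are computed, so both function names and both formulas are illustrative assumptions rather than the method used here.

```python
def cost_usd(text: str, usd_per_million_chars: float) -> float:
    """Cost of synthesizing `text`, assuming the provider bills per character."""
    return len(text) / 1_000_000 * usd_per_million_chars


def real_time_factor(generation_seconds: float, audio_seconds: float) -> float:
    """Generation time divided by audio length; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return generation_seconds / audio_seconds


if __name__ == "__main__":
    # Hypothetical numbers for illustration only, not measurements from this repo.
    sample_text = "a" * 2000
    print(cost_usd(sample_text, usd_per_million_chars=15.0))
    print(real_time_factor(generation_seconds=30.0, audio_seconds=120.0))
```

Because the audio length varies between models for the same text, dividing by the audio length keeps the speed figure comparable across models.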

Resulting Audio files

Spreadsheet Comparison

Benchmarks

  1. https://huggingface.co/blog/arena-tts - lets users rank two models by which sounds more natural, then calculates a score for each model.
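As a rough illustration of turning pairwise votes into a score, the sketch below applies a standard Elo update to a list of (winner, loser) votes. The description above does not say which scoring rule the benchmark uses, so Elo, the function name `elo_ratings`, and the `k` and `base` defaults are all assumptions chosen for the example.

```python
from collections import defaultdict


def elo_ratings(votes, k=32.0, base=1000.0):
    """Compute Elo-style ratings from an ordered list of (winner, loser) votes.

    Every model starts at `base`; `k` controls how far one vote moves a rating.
    """
    ratings = defaultdict(lambda: base)
    for winner, loser in votes:
        # Probability the winner was expected to win, given current ratings.
        expected_win = 1.0 / (1.0 + 10 ** ((ratings[loser] - ratings[winner]) / 400.0))
        delta = k * (1.0 - expected_win)
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)


if __name__ == "__main__":
    # Hypothetical votes between placeholder model names.
    votes = [("model_a", "model_b"), ("model_a", "model_c"), ("model_c", "model_b")]
    for name, score in sorted(elo_ratings(votes).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {score:.1f}")
```

Elo depends on the order in which votes arrive, so a leaderboard built this way can shift slightly if the same votes are replayed in a different order.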

Other Interesting Links

https://artificialanalysis.ai/text-to-speech - compares the quality and performance of TTS models
