A comparison of TTS APIs and Open Source projects on price, speed and quality.
Last Updated: June 11, 2024
This repo contains audio files and data from a comparison of all major text-to-speech (TTS) APIs and (soon) open source projects.
We use the text of a Paul Graham essay as the input to convert and compare.
Default Alive or Default Dead?
October 2015
When I talk to a startup that's been operating for more than 8 or 9 months, the first thing I want to know is almost always the same. Assuming their expenses remain constant and their revenue growth is what it has been over the last several months, do they make it to profitability on the money they have left? Or to put it more dramatically, by default do they live or die?
The startling thing is how often the founders themselves don't know. Half the founders I talk to don't know whether they're default alive or default dead.
If you're among that number, Trevor Blackwell has made a handy calculator you can use to find out.
The reason I want to know first whether a startup is default alive or default dead is that the rest of the conversation depends on the answer. If the company is default alive, we can talk about ambitious new things they could do. If it's default dead, we probably need to talk about how to save it. We know the current trajectory ends badly. How can they get off that trajectory?
Why do so few founders know whether they're default alive or default dead? Mainly, I think, because they're not used to asking that. It's not a question that makes sense to ask early on, any more than it makes sense to ask a 3 year old how he plans to support himself. But as the company grows older, the question switches from meaningless to critical. That kind of switch often takes people by surprise.
I propose the following solution: instead of starting to ask too late whether you're default alive or default dead, start asking too early. It's hard to say precisely when the question switches polarity. But it's probably not that dangerous to start worrying too early that you're default dead, whereas it's very dangerous to start worrying too late.
The resulting audio runs about 1.5 to 2 minutes, depending on the pauses each model inserts.
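Since the audio length varies by model, it can help to measure it directly. The following is a minimal sketch that reads the duration of a WAV file with Python's standard `wave` module. It assumes the audio is stored as uncompressed WAV, and the function name `wav_duration_seconds` is hypothetical; files in another format (for example MP3) would need converting first or a different library.

```python
import wave


def wav_duration_seconds(path):
    """Return the duration of a WAV file in seconds.

    Assumes an uncompressed PCM WAV file; other formats are not handled.
    """
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()
```

Dividing the character count of the input text by this duration gives a rough speaking rate for comparing models.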
- https://huggingface.co/blog/arena-tts - lets users rank two models by which sounds more natural, then calculates a score.
- https://artificialanalysis.ai/text-to-speech - compares the quality and performance of all TTS models.
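To illustrate how pairwise "which sounds more natural" votes can be turned into a score, here is a minimal sketch of an Elo-style rating update. This is an assumption about one common way to score such rankings, not necessarily the exact method either site uses. The function names, the starting rating of 1000, and the K-factor of 32 are all illustrative choices.

```python
from collections import defaultdict


def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_ratings(votes, k=32.0, initial=1000.0):
    """Compute ratings from an ordered list of (winner, loser) votes.

    k and initial are illustrative defaults, not values taken from any site.
    """
    ratings = defaultdict(lambda: initial)
    for winner, loser in votes:
        # The winner gains more when the win was unexpected.
        delta = k * (1.0 - expected_score(ratings[winner], ratings[loser]))
        ratings[winner] += delta
        ratings[loser] -= delta
    return dict(ratings)
```

For example, `elo_ratings([("model_a", "model_b")])` moves the two hypothetical models 16 points apart from the starting rating in each direction. Note that Elo results depend on the order in which votes are processed.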