v1.0.0 (06-24-2024)
A new major version release of the Cartesia Python client that overhauls the library structure.
Refer to the migration guide for more thorough details on changes.
Features
- Adds support for `model_id=sonic-multilingual` to generate multilingual audio
- Endpoint-specific methods for generating and streaming audio
Breaking changes
- Renames `CartesiaTTS` and `AsyncCartesiaTTS` -> `Cartesia` and `AsyncCartesia`
- Replaces `client.generate` with endpoint-specific methods for Text-to-Speech
- `output_format` must be specified as an `OutputFormat` object, which is a dict specifying the keys: `container`, `encoding` and `sample_rate`
- Both SSE and WebSocket requests no longer return `sampling_rate` in their output. They will respect the `sample_rate` corresponding to the `OutputFormat` passed in.
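
As a rough illustration of the `output_format` change, the sketch below builds a dict with the three keys named above and reads the sample rate back from it, since responses no longer carry `sampling_rate`. The `OutputFormat` class here is a local stand-in, the helper names `make_output_format` and `check_output_format` are hypothetical, and the values `"raw"`, `"pcm_f32le"` and `44100` are assumed for illustration only; consult the API reference for the formats actually supported.

```python
from typing import TypedDict


class OutputFormat(TypedDict):
    """Local stand-in for the dict shape described in these notes (assumed)."""

    container: str
    encoding: str
    sample_rate: int


REQUIRED_KEYS = {"container", "encoding", "sample_rate"}


def make_output_format(container: str, encoding: str, sample_rate: int) -> OutputFormat:
    """Hypothetical helper: build a dict with the three required keys."""
    return {"container": container, "encoding": encoding, "sample_rate": sample_rate}


def check_output_format(fmt: dict) -> None:
    """Hypothetical helper: raise ValueError if a required key is absent."""
    missing = REQUIRED_KEYS - set(fmt)
    if missing:
        raise ValueError(f"output_format is missing keys: {sorted(missing)}")


# Illustrative values only; not a statement of which formats are supported.
output_format = make_output_format("raw", "pcm_f32le", 44100)
check_output_format(output_format)

# Responses no longer include `sampling_rate`, so take the rate from the
# format that was passed in with the request.
sample_rate = output_format["sample_rate"]
```

Keeping the format dict around after the request is one simple way to recover the rate that earlier versions returned alongside the audio.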
Adds
- `client.tts.sse` methods for generating audio using Server-Sent Events
- `client.tts.websocket` methods for managing a WebSocket connection and generating audio
- `client.tts.get_output_format()` to obtain an `OutputFormat` object from an output format name
- `client.tts.get_sample_rate()` to obtain the `sample_rate` from an output format name
- `client.voices.list()` to fetch a list of all available voices
- `client.voices.get()` to fetch a `VoiceMetadata` object from a `voice_id`
- `client.voices.clone()` to clone a voice by specifying a filepath
- `client.voices.create()` to create a new voice
- Specifies `cartesia_version=2024-06-10` as the default header for HTTP and WS requests
Removes
- `client.get_voices()`
- `client.get_voice_embedding()`
- `client.generate()`
- Ability to specify a NumPy array as a return type. We recommend using `np.frombuffer` with the appropriate `dtype`.
- Ability to clone voices using a link
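
For code that relied on the removed NumPy return type, a minimal sketch of the `np.frombuffer` approach follows. The bytes here are generated locally with `struct` as a stand-in for an audio chunk returned by the client. The encoding names and the `ENCODING_TO_DTYPE` mapping are assumptions made for the example, and `bytes_to_array` is a hypothetical helper, not part of the library.

```python
import struct

import numpy as np

# Stand-in for raw audio bytes: four little-endian 32-bit floats.
# A real chunk would come from the client's audio output.
raw_chunk = struct.pack("<4f", 0.0, 0.5, -0.5, 1.0)

# Assumed mapping from encoding name to NumPy dtype (illustrative only).
ENCODING_TO_DTYPE = {
    "pcm_f32le": np.dtype("<f4"),  # 32-bit float, little-endian
    "pcm_s16le": np.dtype("<i2"),  # 16-bit signed integer, little-endian
}


def bytes_to_array(buffer: bytes, encoding: str) -> np.ndarray:
    """Hypothetical helper: view raw bytes as a 1-D array of samples."""
    return np.frombuffer(buffer, dtype=ENCODING_TO_DTYPE[encoding])


samples = bytes_to_array(raw_chunk, "pcm_f32le")
```

Note that `np.frombuffer` on a `bytes` object returns a read-only view; call `.copy()` on the result if the samples need to be modified in place.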