v1.0.0 (06-24-2024)

A new major version release of the Cartesia Python client that overhauls the library structure.

See the migration guide for full details on the changes.

Changed
Renames CartesiaTTS and AsyncCartesiaTTS to Cartesia and AsyncCartesia
Replaces client.generate with endpoint-specific methods for Text-to-Speech
output_format must be specified as an OutputFormat object, which is a dict with the keys container, encoding, and sample_rate
SSE and WebSocket requests no longer return sampling_rate in their output. The audio uses the sample_rate of the OutputFormat passed in.
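
As a rough illustration of the OutputFormat shape described above, the sketch below builds the dict by hand and uses its sample_rate to interpret a buffer's length, since responses no longer carry sampling_rate. OutputFormatSketch is a local stand-in, not the library's own class. The values "raw", "pcm_f32le", and 44100, as well as the mono, 4-bytes-per-sample layout, are assumptions made for the example.

```python
from typing import TypedDict


class OutputFormatSketch(TypedDict):
    """Local stand-in mirroring the three keys named in the release notes."""

    container: str
    encoding: str
    sample_rate: int


# Assumed example values; check the API docs for the supported combinations.
output_format: OutputFormatSketch = {
    "container": "raw",
    "encoding": "pcm_f32le",
    "sample_rate": 44100,
}


def duration_seconds(audio: bytes, fmt: OutputFormatSketch, bytes_per_sample: int = 4) -> float:
    """Duration of mono PCM audio, derived from the sample rate that was requested."""
    return len(audio) / bytes_per_sample / fmt["sample_rate"]


# Silent placeholder buffer standing in for one second of audio (not real API output).
one_second = bytes(4 * 44100)
print(duration_seconds(one_second, output_format))
```

Keeping the requested format next to the audio buffer is one way to replace code that previously read sampling_rate from each response.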

Added
client.tts.sse methods for generating audio using Server-Sent Events
client.tts.websocket methods for managing a WebSocket connection and generating audio
client.tts.get_output_format() to obtain an OutputFormat object from an output format name
client.tts.get_sample_rate() to obtain the sample_rate from an output format name
client.voices.list() to fetch a list of all available voices
client.voices.get() to fetch a VoiceMetadata object from a voice_id
client.voices.clone() to clone a voice by specifying a filepath
client.voices.create() to create a new voice
Specifies cartesia_version=2024-06-10 as the default header for HTTP and WebSocket requests
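
A minimal sketch of how a call to the removed client.generate() might be ported to the new methods. Only the method names client.voices.get and client.tts.sse come from these notes. The keyword arguments (id, model_id, transcript, voice_embedding, output_format, stream), the "embedding" and "audio" keys, and the "sonic-english" model name are assumptions that should be checked against the migration guide. The client is passed in rather than constructed so the helper stays independent of how it is created.

```python
from typing import Any, Dict


def synthesize_sse(
    client: Any,
    transcript: str,
    voice_id: str,
    output_format: Dict[str, Any],
    model_id: str = "sonic-english",  # assumed model name, not confirmed by these notes
) -> bytes:
    """Collect the audio chunks streamed by client.tts.sse into one buffer."""
    # Assumption: voices.get accepts `id` and returns a mapping with an "embedding" entry.
    voice = client.voices.get(id=voice_id)

    chunks = []
    # Assumption: sse yields mappings whose "audio" entry holds raw bytes.
    for output in client.tts.sse(
        model_id=model_id,
        transcript=transcript,
        voice_embedding=voice["embedding"],
        output_format=output_format,
        stream=True,
    ):
        chunks.append(output["audio"])
    return b"".join(chunks)
```

The returned bytes carry no sampling_rate of their own, so the caller keeps the output_format it passed in to interpret them.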

Removed
client.get_voices()
client.get_voice_embedding()
client.generate()
Ability to specify a NumPy array as a return type. We recommend using np.frombuffer with the appropriate dtype instead.
Ability to clone voices using a link
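
The np.frombuffer recommendation above can be sketched as follows. The buffer here is generated locally as a stand-in for audio bytes returned by the client, and the little-endian float32 dtype ("<f4") assumes a pcm_f32le encoding was requested. A different encoding would need a different dtype.

```python
import numpy as np

# Stand-in for audio bytes returned by the client: four float32 samples,
# little-endian, matching an assumed "pcm_f32le" encoding.
raw_audio = np.array([0.0, 0.5, -0.5, 1.0], dtype="<f4").tobytes()

# Reinterpret the bytes as samples without copying them.
samples = np.frombuffer(raw_audio, dtype="<f4")
print(samples.shape, samples.dtype)

# Arrays built from immutable bytes are read-only; copy before modifying.
editable = samples.copy()
editable *= 0.5
```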
