I put together this guide because I couldn't find an adequate explanation of how to do this anywhere else. Please read carefully as there are currently some issues arising from the way Piper is implemented in Home Assistant that make this process more confusing than it should be.
Note: this guide assumes you will run your voices on the same physical device (e.g. a Raspberry Pi) as Home Assistant.
- This option is best for people who plan to render short phrases or phrases that will be used repeatedly.
- To learn how to run a (GPU accelerated) piper docker container on another machine on your network, use this guide instead.
- Install the Piper Add-on (Settings > Add-ons > click the ADD-ON STORE button > search for Piper).
- Install the Wyoming Protocol integration (Settings > Devices & Services > Integrations > click the ADD INTEGRATION button > search for Wyoming Protocol).
- When prompted by Wyoming Protocol for host and port, you can use `core-piper` for host and `10200` for port. `core-piper` is the hostname provided by the Piper add-on's docker container; you can also use the IP address of the machine running the docker container. `10200` is the default port for Piper.
- If your voice files are named something like `en_US-bob_1234-medium.onnx`, you can use them as-is.
- Voices created with earlier versions of TextyMcSpeechy did not comply with Piper's expected naming convention. In order for your voices to be detected and appear in menus properly, you will need to rename them using the system prescribed by Piper. You may also need to edit the `.onnx.json` file.
- Instructions for doing this can be found here.
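
For orientation, Piper's filenames follow a `<language_code>-<voice_name>-<quality>` pattern, and matching values also live inside the `.onnx.json` file. Below is a minimal sketch of the fields that appear to correspond to the `en_US-bob_1234-medium` example above; this is an assumption based on the stock Piper voice configs, and your file will contain many other fields that should be left intact.

```json
{
  "dataset": "bob_1234",
  "audio": { "quality": "medium" },
  "language": { "code": "en_US" }
}
```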
- This is a bit challenging on Home Assistant OS, since you don't have permission to upload to this folder via the web UI.
- There are a variety of ways of accomplishing this which are beyond the scope of this guide.
- If you don't find a `piper` directory inside `/share`, create it yourself.
- I did this by installing the FTP Add-on (Settings > Add-ons > click the ADD-ON STORE button > search for FTP).
- After setting up credentials in the FTP add-on and ensuring it was up and running, I used Filezilla (FTP client software) on my PC to connect to the server and upload my model files to `/share/piper`.
Once your `.onnx` and `.onnx.json` files are in the `/share/piper` folder, restart the Piper add-on and reload the Wyoming Protocol integration; otherwise they won't know about your models.
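
If you have command-line access (e.g. via the SSH add-on), `ha addons restart core_piper` should restart Piper without a trip through the UI; the `core_piper` slug is an assumption based on the add-on's `core-piper` hostname, and the actual slug is visible in the add-on page's URL. The Wyoming Protocol integration can be reloaded from its entry under Settings > Devices & Services.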
- If you have done everything correctly so far, your voice will technically be ready to use, but there are some implementation issues you need to know about.
- Important: due to the way the Piper add-on currently gets its voice lists, if you go to `Settings > Add-ons > Piper > Configuration` and look in the voices dropdown menu, you won't find your custom voices there. I have opened an issue for this on GitHub.
- The only place you are currently able to see your custom voice in the web UI is Settings > Voice Assistants, after creating or modifying a voice assistant entity that uses Piper as its text-to-speech engine.
- There is a "Try Voice" button you can use to verify that your custom voice is working properly. I recommend doing this before proceeding further.
- If your custom model shows up with the wrong name, you probably need to change some fields in your `.onnx.json` file and restart Piper/Wyoming Protocol.
This guide explains how to create and call a service script that will render text using your custom voices. You should create one for each of the voices you intend to use.
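
A per-voice script might look roughly like the sketch below. Everything entity-specific here is an assumption for illustration: the `tts.piper` entity ID, the `media_player.living_room` target, and the voice name all need to be replaced with the values from your own system.

```yaml
# Hypothetical per-voice script (script.say_as_bob).
# tts.piper, media_player.living_room, and the voice name are placeholders.
alias: say_as_bob
description: Speak a message using the custom "bob" Piper voice
fields:
  message:
    description: Text to speak
    example: Hello world
sequence:
  # tts.speak sends the message to the chosen media player via the TTS entity
  - action: tts.speak
    target:
      entity_id: tts.piper
    data:
      media_player_entity_id: media_player.living_room
      message: "{{ message }}"
      options:
        voice: en_US-bob_1234-medium
mode: single
```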
- Create a dropdown list to contain the names of all your custom voice services.
  - In Settings > Devices & Services > Helpers tab, click the CREATE HELPER button and choose `Dropdown`.
  - In the `name` field, enter `tts_voices` (this tutorial will assume that the dropdown's entity name will be `input_select.tts_voices`).
  - In the `options` field, add the name of the first voice you want to be able to choose from, prefaced by `script.`, e.g. `script.say_as_bob`.
  - Add the rest of the voices and save the dropdown list. You can add more voices to this list later.
  - Important: immediately after your dropdown is created, there might not be any voice selected, and this would cause the script to fail. Click the `tts_voices` entity you just created and, in the popup window, choose one of your voices from the dropdown list.
- Create a text input box to hold the demo text you want the TTS engine to say.
  - In Settings > Devices & Services > Helpers tab, click the CREATE HELPER button and choose `Text`.
  - In the `name` field, enter `text_to_say` (this tutorial will assume that the text entity's name will be `input_text.text_to_say`).
  - Click on the newly created entity and, in the popup window, enter a test phrase in the text input box and save it.
- Create a button to say your test phrase in the selected voice. This will be configured later.
  - In Settings > Devices & Services > Helpers tab, click the CREATE HELPER button and choose `Button`.
  - In the `name` field, enter `say_it` (this tutorial will assume that the button entity's name will be `input_button.say_it`).
  - Click `Create`. (If you prefer defining helpers in YAML, see the sketch after this step.)
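
All three helpers can also be declared in `configuration.yaml` instead of being created through the UI. This is just a sketch using the entity names assumed in this tutorial; the voice options are placeholders for your own scripts, and the helpers can be reloaded from Developer Tools > YAML after editing.

```yaml
# YAML equivalent of the three helpers created above.
input_select:
  tts_voices:
    name: tts_voices
    options:                 # placeholders -- list your own voice scripts
      - script.say_as_bob
      - script.say_as_alice

input_text:
  text_to_say:
    name: text_to_say

input_button:
  say_it:
    name: say_it
```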
- Create a script that will say the text in your text box using the voice service selected in the dropdown list.
  - In Settings > Automations and Scenes > Scripts tab, click the CREATE SCRIPT button and choose Create new script.
  - From the kebab menu (three vertical dots in the top right corner), choose `Edit in YAML`.
  - Delete the line that says `sequence: []`.
  - Paste in the following script:
```yaml
sequence:
  - data_template:
      message: "{{ states('input_text.text_to_say') }}"
    action: "{{ states('input_select.tts_voices') }}"
alias: Test TTS
description: Say text in the input box
```
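
Both templated fields are resolved at run time: `action:` evaluates to whichever `script.say_as_*` entity is currently selected in `input_select.tts_voices`, and the contents of `input_text.text_to_say` are passed to that script as its `message` field (this assumes each voice script accepts a `message` field, as in the sketch earlier).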
- Click the SAVE SCRIPT button. Name this script `test_tts` (this tutorial will assume that this script's entity name will be `script.test_tts`).
- Once your `test_tts` script has been created, you should be able to test it. Click the kebab menu (three vertical dots) on the line associated with the `test_tts` script and choose `Run`. If you have done everything right, you should hear your test phrase spoken in the voice you have chosen.
- In the example above, an `Entities` card was created by putting the dashboard into edit mode, clicking the `+` button, and choosing the `Entities` card from the `BY CARD` tab. (A YAML sketch of both cards appears after this list.)
- In the popup window, under `Entities (required)`, delete the example entities, then:
  - Set the first entity to `input_select.tts_voices`
  - Set the second entity to `input_text.text_to_say`
  - Click `Save`
- To add the button, click the `+` button below the dropdown and text box you just added and, in the `BY CARD` tab, search for the `Button` card and click it.
- To configure the button, first replace anything in the `Entity` field with the button entity you created earlier (`input_button.say_it`).
- Optionally give your button a name and/or icon.
- In the `Tap behavior` field, choose `Perform action`.
- A new `Action` field will appear. Enter `script.test_tts` here to cause this button to trigger the script you created earlier.
- Change any additional layout options as you wish. The example above has `Full width card` turned on in the `Layout` tab.
- Click `SAVE` to save your button.
- Click the `DONE` button to stop editing your dashboard.
- You can now say anything in any voice via your Home Assistant dashboard. Enjoy!
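
For reference, the two cards described above look roughly like this in a view's `cards:` list when opened in the dashboard's YAML editor. This is a sketch: the button's name is an example, and the exact `tap_action` keys the editor generates can vary between Home Assistant versions.

```yaml
- type: entities            # card holding the voice dropdown and the text box
  entities:
    - input_select.tts_voices
    - input_text.text_to_say
- type: button              # card that runs the test_tts script when tapped
  entity: input_button.say_it
  name: Say it
  tap_action:
    action: call-service
    service: script.test_tts
```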
- Sometimes the `Say it` button will play the previous message that had been stored in the text box rather than the one you have just typed. This is because the text box doesn't pass its new value to the rest of the system until it loses focus. Until I find a way to resolve this, you can click or tap anywhere outside the box after entering a new message, before you click the button.