Commit f46c506

chore(usm-streaming): add Vapi docs content (#302)
* chore(usm-streaming): add Vapi docs content
* fix image paths
* chore: add Vapi to sidebar + unhide page
* chore: add Zack's recommendations
1 parent 348b2d1 commit f46c506

7 files changed, +47 -47 lines changed

fern/assets/img/vapi/Vapi-Step1.png

486 KB

fern/assets/img/vapi/Vapi-Step2.png

432 KB

fern/assets/img/vapi/Vapi-Step3.png

350 KB

fern/docs.yml

Lines changed: 5 additions & 1 deletion
@@ -68,6 +68,9 @@ navigation:
 - page: Pipecat
   path: pages/02-speech-to-text/pipecat-intro-guide.mdx
   slug: /voice-agents/pipecat-intro-guide
+- page: Vapi
+  path: pages/02-speech-to-text/universal-streaming/voice-agents/vapi.mdx
+  slug: /vapi
 - page: Introducing Slam-1
   path: pages/01-getting-started/slam-1.mdx
   slug: /getting-started/slam-1
@@ -484,6 +487,8 @@ navigation:
   path: pages/02-speech-to-text/universal-streaming/voice-agents/livekit.mdx
 - page: Pipecat
   path: pages/02-speech-to-text/universal-streaming/voice-agents/pipecat.mdx
+- page: Vapi
+  path: pages/02-speech-to-text/universal-streaming/voice-agents/vapi.mdx
 - section: LangChain
   path: pages/06-integrations/langchain.mdx
   slug: /langchain
@@ -643,7 +648,6 @@ navigation:
 - page: Vapi
   path: pages/02-speech-to-text/universal-streaming/voice-agents/vapi.mdx
   slug: /vapi
-  hidden: true
 - page: Turn detection
   path: pages/02-speech-to-text/universal-streaming/turn-detection.mdx
   slug: /turn-detection

fern/pages/02-speech-to-text/universal-streaming/universal-streaming.mdx

Lines changed: 3 additions & 2 deletions
@@ -1003,8 +1003,9 @@ Utilizing our ongoing transcriptions in this manner will allow you to achieve th
 <Card title="Pipecat" icon={<img src="https://assemblyaiassets.com/images/Pipecat.svg" alt="Pipecat logo"/>} href="/docs/speech-to-text/universal-streaming/pipecat">
 View our Pipecat integration guide.
 </Card>
-{/* <Card title="Vapi" icon={<img src="https://assemblyaiassets.com/images/Vapi.svg" alt="Vapi logo"/>} href="https://docs.vapi.ai/providers/transcriber/assembly-ai">
-View Vapi's AssemblyAI STT plugin documentation. TO DO: ADD VAPI */}
+<Card title="Vapi" icon={<img src="https://assemblyaiassets.com/images/Vapi.svg" alt="Vapi logo"/>} href="/docs/speech-to-text/universal-streaming/vapi">
+View our Vapi integration guide.
+</Card>
 </CardGroup>
 
 

Lines changed: 30 additions & 42 deletions
@@ -1,58 +1,46 @@
 ---
-title: "Turn detection"
-description: "Intelligent turn detection with Streaming Speech-to-Text"
+title: "Vapi"
+description: "Vapi voice agent integration"
 ---
 
-### Overview
+## Overview
 
-AssemblyAI's end-of-turn detection functionality is integrated into our Streaming STT model, leveraging both acoustic and semantic features, and is coupled with a traditional silence-based heuristic approach. Both mechanisms work jointly and either can trigger end-of-turn detection throughout the audio stream. This joint approach significantly enhances the speed and accuracy of end-of-turn detection while allowing this functionality to fall back to the traditional method when the model makes a misprediction.
+Vapi is a developer platform for building voice AI agents, they handle the complex backend of voice agents for you so you can focus on creating great voice experiences. In this guide, we'll show you how to integrate AssemblyAI's streaming speech-to-text model into your Vapi voice agent.
 
-<Note>
-End-of-turn and end-of-utterances refer to the same thing and may be used
-interchangeably in these docs.
-</Note>
+<Card
+title="Vapi"
+icon={<img src="https://assemblyaiassets.com/images/Vapi.svg" alt="Vapi logo"/>}
+href="https://docs.vapi.ai/providers/transcriber/assembly-ai"
+>
+View Vapi's AssemblyAI STT provider documentation.
+</Card>
 
-### Model-based detection
+## Quick start
 
-Triggers when **all** conditions are met:
+<Steps>
+**Head to the "Assistants" tab in your Vapi dashboard.**
 
-#### EOT token predicted
+<Frame>
+<img src="/assets/img/vapi/Vapi-Step1.png" />
+</Frame>
 
-- Model predicts semantic end-of-turn with a probability greater than `end_of_turn_confidence_threshold`
-- Default: 0.5 (user configurable)
+**Click on your assistant and then the "Transcriber" tab.**
 
-#### Minimum silence duration has passed
+<Frame>
+<img src="/assets/img/vapi/Vapi-Step2.png" />
+</Frame>
 
-- After the last non-silence word token, `min_end_of_turn_silence_when_confident` milliseconds must pass
-- Default: 2400ms (user configurable)
+**Select "assembly-ai" on the Provider dropdown.**
 
-#### Minimum speech duration spoken
+<Frame>
+<img src="/assets/img/vapi/Vapi-Step3.png" />
+</Frame>
+</Steps>
 
-- The user must speak for at least 80ms since the last end-of-turn (ensures at least one word)
-- Set to 80 ms (internal)
+Your voice agent now uses **AssemblyAI** for speech-to-text (STT) processing.
 
-#### Word finalized
+<Info>
+New to Vapi? Visit the [Quickstart Guide](https://docs.vapi.ai/quickstart/introduction) to explore various example voice agent workflows. For the easiest way to test a voice agent, follow this [simple phone-based guide](https://docs.vapi.ai/quickstart/phone). Vapi offers a wide range of example workflows to get you up and running quickly.
+</Info>
 
-- Last word in `turn.words` has been finalized
-- Internal configuration
-
-### Silence-based detection
-
-Triggers when **all** conditions are met:
-
-#### Minimum speech duration spoken
-
-- The user must speak for at least 80ms since the last end-of-turn (ensures at least one word)
-- Set to 80 ms (internal)
-
-#### Maximum silence duration has passed
-
-- After the last non-silence word token, `max_turn_silence` milliseconds must pass
-- Default: 2400ms (user configurable)
-
-### Important notes
-
-- Silence-based detection can override model-based detection even with high EOT confidence thresholds
-- Word finalization always takes precedence — endpointing won't occur until the last word is finalized
-- We define end-of-turn detection as the process of detecting the end of sustained speech activity, often called end-pointing in the Voice Agents context
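Reviewer note: the turn-detection text removed in the hunk above describes two triggers, each firing only when all of its conditions hold. The Python sketch below is one way to read those rules; it is not AssemblyAI's implementation. The `TurnState` class, its field names, and the function names are hypothetical. The parameter names and the 80 ms minimum speech duration come from the removed text.

```python
from dataclasses import dataclass

MIN_SPEECH_MS = 80  # internal minimum speech duration quoted in the removed text


@dataclass
class TurnState:
    """Hypothetical snapshot of the current turn (field names are illustrative)."""
    eot_probability: float     # model's semantic end-of-turn probability
    silence_ms: float          # silence since the last non-silence word token
    speech_ms: float           # speech since the previous end-of-turn
    last_word_finalized: bool  # whether the last word in turn.words is finalized


def model_based_eot(state: TurnState,
                    end_of_turn_confidence_threshold: float,
                    min_end_of_turn_silence_when_confident: float) -> bool:
    """Model-based trigger: all four conditions must be met."""
    return (
        state.eot_probability > end_of_turn_confidence_threshold
        and state.silence_ms >= min_end_of_turn_silence_when_confident
        and state.speech_ms >= MIN_SPEECH_MS
        and state.last_word_finalized
    )


def silence_based_eot(state: TurnState, max_turn_silence: float) -> bool:
    """Silence-based trigger: enough speech, then enough silence."""
    return state.speech_ms >= MIN_SPEECH_MS and state.silence_ms >= max_turn_silence


def end_of_turn(state: TurnState,
                end_of_turn_confidence_threshold: float,
                min_end_of_turn_silence_when_confident: float,
                max_turn_silence: float) -> bool:
    """Either trigger can end the turn, but never before the last word is finalized."""
    if not state.last_word_finalized:
        return False  # word finalization takes precedence over both triggers
    return (
        model_based_eot(state, end_of_turn_confidence_threshold,
                        min_end_of_turn_silence_when_confident)
        or silence_based_eot(state, max_turn_silence)
    )
```

Under this reading, a low-confidence turn still ends once `max_turn_silence` has elapsed, which matches the note that silence-based detection can override model-based detection.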

fern/pages/06-integrations/index.mdx

Lines changed: 9 additions & 2 deletions
@@ -5,18 +5,25 @@ AssemblyAI seamlessly integrates with a variety of tools and platforms to enhanc
 <CardGroup>
 <Card
 title="Livekit"
-icon="forward-fast"
+icon={<img src="https://assemblyaiassets.com/images/Livekit.svg" alt="Livekit logo"/>}
 href="/docs/integrations/livekit"
 >
 Use AssemblyAI with Livekit's voice agent orchestrator.
 </Card>
 <Card
 title="Pipecat"
-icon="cat"
+icon={<img src="https://assemblyaiassets.com/images/Pipecat.svg" alt="Pipecat logo"/>}
 href="/docs/integrations/pipecat"
 >
 Use AssemblyAI with Pipecat's voice agent orchestrator.
 </Card>
+<Card
+title="Vapi"
+icon={<img src="https://assemblyaiassets.com/images/Vapi.svg" alt="Vapi logo"/>}
+href="/docs/integrations/vapi"
+>
+Use AssemblyAI with Vapi's voice agent orchestrator.
+</Card>
 </CardGroup>
 
 ## No-Code Integrations
