Commit 1b7bc87

Merge branch 'main' into adnaan/dow-100-update-langchain-agent-to-use-agent-factory

2 parents 97b1349 + 5dc841a

13 files changed: +380 −141 lines

docs/mint.json (+2)

```diff
@@ -69,6 +69,7 @@
         "open-source/python-quickstart",
         "open-source/telephony",
         "open-source/create-your-own-agent",
+        "open-source/conversation-mechanics",
         "open-source/langchain-agent",
         "open-source/action-agents",
         "open-source/action-phrase-triggers",
@@ -80,6 +81,7 @@
         "open-source/playground",
         "open-source/turn-based-conversation",
         "open-source/language-support",
+        "open-source/logging-with-loguru",
         "open-source/agent-factory"
       ]
     },
```
docs/open-source/conversation-mechanics.mdx (new file, +53)

---
title: "Conversation Mechanics"
description: "How to tune the responsiveness in Vocode conversations"
---

Building two-way conversations with an AI is a highly use-case-specific task - how realistic the conversation feels depends greatly on the nature of the conversation itself. In this guide, we'll cover some of the dials you can turn to configure the mechanics of a conversation in Vocode.

# Endpointing

Endpointing is the process of determining when someone has finished speaking. The `EndpointingConfig` controls how this is done. There are a couple of different ways to configure endpointing:

We provide `DeepgramEndpointingConfig()`, which has reasonable defaults and knobs to suit most use-cases (but only works with the Deepgram transcriber).

```python
class DeepgramEndpointingConfig(EndpointingConfig, type="deepgram"):  # type: ignore
    vad_threshold_ms: int = 500
    utterance_cutoff_ms: int = 1000
    time_silent_config: Optional[TimeSilentConfig] = Field(default_factory=TimeSilentConfig)
    use_single_utterance_endpointing_for_first_utterance: bool = False
```

- `vad_threshold_ms`: translates to [Deepgram's `endpointing` feature](https://developers.deepgram.com/docs/endpointing#enable-feature)
- `utterance_cutoff_ms`: uses [Deepgram's Utterance End feature](https://developers.deepgram.com/docs/utterance-end)
- `time_silent_config`: a Vocode-specific parameter that marks an utterance final if we haven't seen any new words in X seconds
- `use_single_utterance_endpointing_for_first_utterance`: uses `is_final` instead of `speech_final` for endpointing on the first utterance (works really well for outbound conversations, where the user's first utterance is something like "Hello?") - see [this doc on Deepgram](https://developers.deepgram.com/docs/understand-endpointing-interim-results) for more info.
26+
27+
Endpointing is highly use-case specific - building a realistic experience for this greatly depends on the person speaking to the AI. Here are few paradigms that we've used to help you along the way:
28+
29+
- Time-based endpointing: This method considers the speaker to be finished when there is a certain duration of silence.
30+
- Punctuation-based endpointing: This method considers the speaker to be finished when there is a certain duration of silence after a punctuation mark.
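As an illustrative sketch of these two paradigms (this is not Vocode's implementation, and the silence thresholds are hypothetical):

```python
def time_based_is_finished(
    last_word_time: float, now: float, silence_threshold_s: float = 1.0
) -> bool:
    """Time-based: the speaker is done once enough silence has elapsed."""
    return (now - last_word_time) >= silence_threshold_s


def punctuation_based_is_finished(
    transcript: str, last_word_time: float, now: float, silence_threshold_s: float = 0.3
) -> bool:
    """Punctuation-based: require a sentence-ending mark plus a (shorter) silence."""
    ends_sentence = transcript.rstrip().endswith((".", "!", "?"))
    return ends_sentence and (now - last_word_time) >= silence_threshold_s
```

Punctuation-based endpointing can tolerate a shorter silence threshold because the transcriber's punctuation already signals a likely utterance boundary.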
# Interruptions

When the AI speaks in a `StreamingConversation`, it can be interrupted by the user. `AgentConfig` provides a parameter called `interrupt_sensitivity` that controls how sensitive the AI is to interruptions. Interrupt sensitivity has two options: low (the default) and high. Low sensitivity makes the bot ignore backchannels (e.g. "sure", "uh-huh") while the bot is speaking. High sensitivity makes the agent treat any word from the human as an interruption.

The implementation of this configuration is in `StreamingConversation.TranscriptionsWorker` - to tune it beyond these two options you may need to fork Vocode and override the behavior, but it provides a good starting place for most use-cases.
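The idea can be sketched as follows (illustrative only - the real logic lives in `TranscriptionsWorker`, and this backchannel word list is hypothetical):

```python
# Hypothetical set of backchannel words; the real list lives inside Vocode.
BACKCHANNELS = {"sure", "uh-huh", "mhm", "yeah", "right", "ok", "okay"}


def is_interruption(
    human_utterance: str, bot_is_speaking: bool, interrupt_sensitivity: str = "low"
) -> bool:
    """Decide whether the human's words should cut the bot off."""
    if not bot_is_speaking:
        return False
    if interrupt_sensitivity == "high":
        # High sensitivity: any word from the human interrupts the bot.
        return True
    # Low sensitivity: ignore utterances made up entirely of backchannels.
    words = human_utterance.lower().strip(".,!? ").split()
    return not all(word in BACKCHANNELS for word in words)
```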
Stay tuned, more dials to come here soon!

# Conversation Speed

`StreamingConversation` also exposes a parameter called `speed_coefficient`, which controls the length of endpointing pauses, i.e. how long the bot will wait before responding to the human. This includes normal utterances from the human as well as interruptions.

The amount of time the bot waits scales inversely with the `speed_coefficient` value: a bot with a `speed_coefficient` of 2 responds in half the time compared to a `speed_coefficient` of 1, and a `speed_coefficient` of 0.5 means the bot takes twice as long to respond.
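That inverse scaling can be sketched as a simple division (`base_delay_seconds` here is a hypothetical stand-in for Vocode's internal endpointing pause):

```python
def response_delay(base_delay_seconds: float, speed_coefficient: float) -> float:
    """The bot's wait time scales inversely with speed_coefficient."""
    return base_delay_seconds / speed_coefficient


response_delay(1.0, 2.0)  # half the wait of speed_coefficient=1
response_delay(1.0, 0.5)  # twice the wait
```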
```python
conversation = StreamingConversation(
    speed_coefficient=2,
    ...
)
```

Based on the speed of the user's speech (we calculate the WPM from each final utterance that goes through the pipeline), the `speed_coefficient` updates throughout the course of the conversation - see `vocode.streaming.utils.speed_manager` for the implementation!
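As a rough sketch of the idea (the real implementation is in `vocode.streaming.utils.speed_manager`; the 150 WPM baseline and smoothing factor here are assumptions for illustration):

```python
def estimate_wpm(num_words: int, duration_seconds: float) -> float:
    """Words per minute for a single final utterance."""
    return num_words / (duration_seconds / 60.0)


def update_speed_coefficient(
    current: float, observed_wpm: float, baseline_wpm: float = 150.0, alpha: float = 0.5
) -> float:
    """Exponentially smooth the coefficient toward the speaker's pace."""
    target = observed_wpm / baseline_wpm  # faster speech -> higher coefficient
    return (1 - alpha) * current + alpha * target


wpm = estimate_wpm(30, 10.0)  # 30 words in 10 seconds = 180 WPM
new_coefficient = update_speed_coefficient(1.0, wpm)  # moves toward 180/150 = 1.2
```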

docs/open-source/events-manager.mdx (+38 −65)
Original file line numberDiff line numberDiff line change
````diff
@@ -5,87 +5,60 @@ description: "How events are emitted and consumed."
 
 ## What is the Events Manager
 
-The Events Manager is a class designed to facilitate asynchronous handling of events in the application. It allows for non-blocking actions on events, such as processing transcripts, managing phone calls, and other tasks. The main components of the Events Manager are the `EventsManager` class and several `Event` subclasses representing various event types.
+The Events Manager consumes realtime events during conversations - it provides a framework to consume and take action on these events asynchronously.
 
-## EventsManager Class
-
-The `EventsManager` class is responsible for managing the event queue and handling events asynchronously. The class provides methods for publishing events, starting the event loop, handling events, and ending the event loop.
-
-### Initialization
+## Current Event Types
 
-```python
-def __init__(self, subscriptions: List[EventType] = []):
-    self.queue = asyncio.Queue()
-    self.subscriptions = set(subscriptions)
-    self.active = False
-```
+The current event types include:
 
-The `EventsManager` constructor accepts an optional list of `EventType` subscriptions. By default, it initializes an empty set of subscriptions, an asynchronous queue, and sets the `active` attribute to `False`.
+1. `TRANSCRIPT`: Indicates a partial transcript for the conversation has been received.
+2. `TRANSCRIPT_COMPLETE`: Indicates the transcript is complete (i.e. the conversation has ended).
+3. `ACTION`: Indicates that a Vocode action has begun or completed.
+4. `PHONE_CALL_CONNECTED`: Indicates a phone call has been connected (only sent during `PhoneConversation`s).
+5. `PHONE_CALL_ENDED`: Indicates a phone call has ended.
 
-### Publishing Events
+## Usage
 
-```python
-def publish_event(self, event: Event):
-    if event.type in self.subscriptions:
-        self.queue.put_nowait(event)
-```
+Using the events manager to take action when events fire requires that you subclass `vocode.streaming.utils.EventsManager` and override the `handle_event` method.
 
-The `publish_event` method takes an `Event` object as input and adds it to the queue if its type is in the set of subscribed event types.
+You can also configure which events your `EventsManager` is subscribed to by using the `subscriptions` property (see example).
 
-### Starting the Event Loop
+### Example
 
 ```python
-async def start(self):
-    self.active = True
-    while self.active:
-        try:
-            event: Event = await self.queue.get()
-        except asyncio.QueueEmpty:
-            await asyncio.sleep(1)
-        self.handle_event(event)
-```
+from vocode.streaming.models.events import Event, EventType
+from vocode.streaming.models.transcript import TranscriptCompleteEvent
+from vocode.streaming.utils.events_manager import EventsManager
 
-## Current Event Types
 
-The current event types include:
-
-1. `TRANSCRIPT`: Indicates a partial transcript for the conversation has been received.
-2. `TRANSCRIPT_COMPLETE`: Indicates the transcript is complete (ie conversation has ended).
-3. `PHONE_CALL_CONNECTED`: Indicates a phone call has been connected.
-4. `PHONE_CALL_ENDED`: Indicates a phone call has ended.
-5. `RECORDING`: (Vonage Only) Indicates a secure URL containing a recording of the call is available. Requires `recording=true` in `VonageConfig`.
+class CustomEventsManager(EventsManager):
+    def __init__(self):
+        super().__init__([EventType.TRANSCRIPT_COMPLETE])
 
-## Example Usage
+    async def handle_event(self, event: Event):
+        if isinstance(event, TranscriptCompleteEvent):
+            print("The call has finished, the transcript was", event.transcript.to_string())
+```
 
-The following example demonstrates how the `EventsManager` class can be used to consume the `TRANSCRIPT_COMPLETE` event and save the transcript to a file using the `add_transcript` method:
+In this example, we create a custom `EventsManager` subclass with a subscription to the `TRANSCRIPT_COMPLETE` event and print the transcript when we receive it.
 
-```python
-import logging
-from fastapi import FastAPI
-from vocode.streaming.models.events import Event, EventType, TranscriptCompleteEvent
-from vocode.streaming.utils import events_manager
-from call_transcript_utils import add_transcript
+To use `CustomEventsManager`, you can pass it into any Conversation, e.g.
 
-app = FastAPI(docs_url=None)
+```python
+...
+conversation = StreamingConversation(
+    ...,
+    events_manager=CustomEventsManager()
+)
+```
 
-logging.basicConfig()
-logger = logging.getLogger(__name__)
-logger.setLevel(logging.DEBUG)
+You can also pass it into a `TelephonyServer`, like:
 
-class CustomEventsManager(events_manager.EventsManager):
-    def __init__(self):
-        super().__init__(subscriptions=[EventType.TRANSCRIPT_COMPLETE])
-
-    def handle_event(self, event: Event):
-        if event.type == EventType.TRANSCRIPT_COMPLETE:
-            transcript_complete_event = typing.cast(TranscriptCompleteEvent, event)
-            add_transcript(
-                transcript_complete_event.conversation_id,
-                transcript_complete_event.transcript,
-            )
-
-events_manager_instance = CustomEventsManager()
-await events_manager_instance.start()
-```
-
-In this example, a custom `EventsManager` subclass is created with a subscription to the `TRANSCRIPT_COMPLETE` event. The `handle_event` method is overridden to save the transcript to a file using the `add_transcript` method when the `TRANSCRIPT_COMPLETE` event is received.
+```python
+server = TelephonyServer(
+    ...,
+    events_manager=CustomEventsManager()
+)
+```
````
docs/open-source/logging-with-loguru.mdx (new file, +73)

---
title: "Logging with Loguru"
description: "Make logging setup less painful for local and production usage!"
---

Loguru is a powerful and flexible logging library for Python that simplifies logging setup and usage. It provides a more intuitive and feature-rich alternative to Python's built-in `logging` module.

## Why Use Loguru?

Loguru offers several advantages over the standard `logging` module:

- **Ease of Use**: Loguru simplifies the process of setting up and using loggers.
- **Rich Features**: It provides advanced features like automatic exception catching, structured logging, and more.
- **Flexibility**: Loguru allows for easy configuration of different logging formats and destinations.
## Using the Vocode Implementation

The Vocode implementation of Loguru provides a seamless way to integrate logging into your application. It includes custom handlers and configuration functions to streamline the setup process. When you use the JSON logging configuration, it also pulls relevant context such as `conversation_id` into the JSON output for better production debugging!

### Setting Up Logging

To set up logging in your application, use the provided configuration functions. Here's how to configure pretty printing for local development and JSON logging for production:

#### Pretty Printing Locally

To enable pretty printing locally, use the `configure_pretty_logging` function. This sets up Loguru to output logs with colored formatting, making them easier to read during development.

```python
from vocode.logging import configure_pretty_logging

configure_pretty_logging()
```

#### JSON Logging in Production

For production environments, you may want to log in JSON format for better integration with logging systems and easier parsing. Use the `configure_json_logging` function to set this up.

```python
from vocode.logging import configure_json_logging

configure_json_logging()
```

### Why Use Different Setups?

Using different logging setups for local and production environments is beneficial for several reasons:

- **Readability**: Pretty printing makes logs easier to read during development, helping you quickly identify issues.
- **Structured Logging**: JSON logging produces structured logs that are easier to parse and analyze in production, especially with log aggregation and monitoring tools.

## Example Snippet

Here's an example of how you can set up logging in your application:

```python
import os

from vocode.logging import configure_json_logging, configure_pretty_logging

DEPLOYED_ENVIRONMENTS = ["production", "staging"]
ENVIRONMENT = os.environ.get("ENVIRONMENT", "development")


def configure_logging() -> None:  # pragma: no cover
    """Configures logging based on the environment."""
    if ENVIRONMENT in DEPLOYED_ENVIRONMENTS:
        configure_json_logging()
    else:
        configure_pretty_logging()


# Configure logging based on the environment
configure_logging()

# Your application code here
```

docs/open-source/react-quickstart.mdx (+15 −25)
````diff
@@ -13,7 +13,7 @@ Or, start from our [Replit template](https://replit.com/@vocode/Simple-Conversat
 
 ## Setting up the conversation
 
-Our self-hosted backend allows you to expose a websocket route in the same format that our hosted backend does. This allows you to deploy any agent you'd like into the conversation.
+Our self-hosted backend allows you to expose a websocket route that operates like `StreamingConversation`.
 
 To get started, clone the Vocode repo or copy the [client backend app](https://github.com/vocodedev/vocode-python/tree/main/apps/client_backend) directory.
 
@@ -56,35 +56,25 @@ uvicorn main:app --port 3000
 
 You now have a server with a Vocode websocket route at localhost:3000! You can now use the `useConversation` hook with your self-hosted backend as follows:
 
-```javascript
-const { status, start, stop, analyserNode } = useConversation({
+```typescript
+import { useConversation } from "vocode";
+
+const { status, start, stop, error, analyserNode } = useConversation({
   backendUrl: "<YOUR_BACKEND_URL>", // looks like ws://localhost:3000/conversation or wss://asdf1234.ngrok.app/conversation if using ngrok
   audioDeviceConfig: {},
 });
 ```
 
-# Demo installation and setup
-
-Clone the `vocode-react-demo` [repository](https://github.com/vocodedev/vocode-react-demo).
-
-```
-$ git clone https://github.com/vocodedev/vocode-react-demo.git
-```
-
-Run npm install inside the directory to download all of the dependencies.
-
-```
-$ npm install
-```
-
-Set your Client SDK key inside of your `.env`
+Use the `status`, `start`, and `stop` objects within your React components to control conversations with your self-hosted backend, e.g.
 
-```
-REACT_APP_VOCODE_API_KEY=YOUR KEY HERE
-```
-
-Start the application
+```jsx
+<>
+  {status === "idle" && <p>Press me to talk!</p>}
+  {status === "error" && error && <p>{error.message}</p>}
 
-```
-$ npm start
+  <button
+    disabled={["connecting"].includes(status)}
+    onClick={status === "connected" ? stop : start}
+  ></button>
+</>
 ```
````
