Commit 73d3a74

Add Docs on Creating and Using External Actions
1 parent dfa66d5 commit 73d3a74

2 files changed

+251
-0
lines changed

docs/mint.json

+1
@@ -72,6 +72,7 @@
7272
"open-source/langchain-agent",
7373
"open-source/action-agents",
7474
"open-source/action-phrase-triggers",
75+
"open-source/external-action",
7576
"open-source/local-conversation",
7677
"open-source/events-manager",
7778
"open-source/using-synthesizers",

docs/open-source/external-action.mdx

+250
@@ -0,0 +1,250 @@
1+
---
2+
title: "External Actions"
3+
description: "Have your agent communicate with an External API"
4+
---
5+
6+
External Actions allow Vocode agents to take actions outside the realm of a phone call. In particular, Vocode agents can decide to _push_ information to external systems via an API request, and _pull_ information from the API response in order to:
7+
8+
1. change the agent’s behavior based on the pulled information
9+
2. give the agent context to inform the rest of the phone call
10+
11+
## How it Works
12+
13+
### Configuring an External Action with `ExecuteExternalActionVocodeActionConfig`
14+
15+
After each turn of conversation, the Vocode Agent determines whether it's the right time to interact with the External API, based primarily on the configured External Action's `description` and `input_schema`.
16+
17+
#### Overview
18+
19+
- `processing_mode`: while it only supports `muted` right now, the intent is for this field to (in the future) enable hold music while processing the request, or to enable the agent to continue speaking until the request returns.
20+
21+
- `name`: the name of the external action -- this has no impact on the functionality of the action itself.
22+
23+
- `description`: Used by the Function Calling API to determine when to make the External Action call itself. See the [`description` field section below](/open-source/external-action#description-field) for more info!
24+
25+
- `url`: The API request is sent to this URL in the format
26+
defined below in [Responding to External Action API Requests](/open-source/external-action#responding-to-external-action-api-requests)
27+
28+
- `input_schema`: A [JSON Schema](https://json-schema.org/) object that instructs how to properly form a payload to send to the External API. See the [`input_schema` field section below](/open-source/external-action#input-schema-field) for more info!
29+
30+
- `speak_on_send`: if `True`, then the underlying LLM will generate a message to be spoken into the phone call as the
31+
API request is being sent.
32+
33+
- `speak_on_receive`: if `True`, then the Vocode Agent will invoke the underlying
34+
LLM to respond based on the result from the API Response or the Error encountered.
35+
36+
- `signature_secret`: a base64 encoded string to enable request validation, see the [Signature Validation section](/open-source/external-action#signature-validation) below for more info
37+
38+
#### `input_schema` Field
39+
40+
The `input_schema` field is a [JSON Schema](https://json-schema.org/) object that instructs how to properly form a payload to send to the External API.
41+
42+
For example, in the [Meeting Assistant Example](/open-source/external-action#meeting-assistant-example) below we use the following JSON Schema:
43+
44+
```json
{
  "type": "object",
  "properties": {
    "length": {
      "type": "string",
      "enum": ["30m", "1hr"]
    },
    "time": {
      "type": "string",
      "pattern": "^\\d{2}:\\d0[ap]m$"
    }
  }
}
```
59+
60+
This is stating the External API is expecting:
61+
62+
- Two fields
63+
- `length` (string): either "30m" or "1hr"
64+
- `time` (string): a time matching the regex pattern, i.e. a two-digit hour, minutes ending in a zero, and `am`/`pm` on the end, e.g. `10:30am`
65+
66+
<Card title="💡 Note" color="#ca8b04">
67+
If you’re noticing that this schema format looks very similar to OpenAI function calling, it is! Vocode treats OpenAI Function Calling as a first-class standard when the agent uses an OpenAI LLM.
68+
69+
The lone difference is that the top level `input_schema` JSON schema must be an `object` - this is so we can use JSON to send over parameters to the user’s API.
70+
71+
</Card>
72+
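
As a rough illustration of what the example `input_schema` above accepts, the following sketch checks a candidate payload using only Python's standard library. The helper name `matches_meeting_schema` is hypothetical and not part of Vocode, and it mirrors the `enum` and `pattern` constraints by hand instead of running a full JSON Schema validator.

```python
import re

# Constraints copied by hand from the example schema (not read from Vocode).
ALLOWED_LENGTHS = ("30m", "1hr")
TIME_PATTERN = re.compile(r"^\d{2}:\d0[ap]m$")


def matches_meeting_schema(payload: dict) -> bool:
    """Hypothetical helper: True if the payload satisfies the example constraints."""
    length = payload.get("length")
    time = payload.get("time")
    # The example schema has no "required" list, so missing fields are allowed.
    if length is not None and length not in ALLOWED_LENGTHS:
        return False
    if time is not None:
        # "type": "string" plus the regex pattern
        if not isinstance(time, str) or TIME_PATTERN.fullmatch(time) is None:
            return False
    return True


print(matches_meeting_schema({"length": "30m", "time": "10:30am"}))  # True
print(matches_meeting_schema({"length": "30m", "time": "10:35am"}))  # False
```

Note that `9:30am` would also be rejected, because the pattern requires a two-digit hour.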
73+
#### `description` Field
74+
75+
The `description` is best used to describe your External Action's purpose. Because it's passed directly to the LLM, it's the best way to convey instructions to the underlying Vocode Agent.
76+
77+
For example, in the [Meeting Assistant Example](/open-source/external-action#meeting-assistant-example) below we want to book a meeting for 30 minutes to an hour so we set the description as `Book a meeting for a 30 minute or 1 hour call.`
78+
79+
<Card title="💡 Note" color="#ca8b04">
80+
The `description` field is passed through and heavily affects how we do our
81+
function decisioning so we recommend treating it in the same way you would a
82+
prompt to an LLM!
83+
</Card>
84+
85+
### Responding to External Action API Requests
86+
87+
Once an External Action has been created, the Vocode Agent will issue API requests to the defined `url` during the course of a phone call based on the [configuration noted above](/open-source/external-action#configuring-the-external-action).
88+
The Vocode API will wait a maximum of _10 seconds_ before timing out the request.
89+
90+
In particular, Vocode will issue a POST request to `url` with a JSON payload that matches `input_schema`. Using the [Meeting Assistant Example](/open-source/external-action#meeting-assistant-example) below, the request looks like:
91+
92+
```http
POST url HTTP/1.1
Accept: application/json
Content-Type: application/json
x-vocode-signature: <encoded_signature>

{
  "payload": {
    "length": "30m",
    "time": "10:30am"
  }
}
```
105+
106+
#### Signature Validation
107+
108+
An HMAC-SHA256 signature of the request body, computed with the signature secret and base64-encoded, is included in the outbound request under the `x-vocode-signature` header so that the user’s API can verify that the request came from Vocode. The same signature secret is used on the receiving side to check the `x-vocode-signature` value.
109+
110+
The signature secret should be a base64-encoded string. We recommend a long random value; the following snippet generates one from 32 random bytes:
111+
112+
```python
import os
import base64

signature_secret = base64.b64encode(os.urandom(32)).decode()
```
118+
119+
Use the following code snippet to check the signature in an inbound request:
120+
121+
```python
import base64
import hashlib
import hmac


def is_valid_signature(
    request_signature_value: str, signature_secret: str, request_body: bytes
) -> bool:
    """
    Check that the `x-vocode-signature` header matches the request body.

    Args:
        request_signature_value (str): The base64-encoded `x-vocode-signature` header value.
        signature_secret (str): The base64-encoded signature secret configured on the action.
        request_body (bytes): The raw request body, exactly as received (before JSON parsing).

    Returns:
        bool: True if the signature is valid, False otherwise.
    """
    signature_secret_as_bytes = base64.b64decode(signature_secret)
    decoded_digest = base64.b64decode(request_signature_value)
    # hmac.new requires bytes, so sign the raw body rather than a parsed dict
    calculated_digest = hmac.new(signature_secret_as_bytes, request_body, hashlib.sha256).digest()
    # compare_digest runs in constant time to avoid leaking timing information
    return hmac.compare_digest(decoded_digest, calculated_digest)
```
146+
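
To try signature checking without a live call, it can help to produce a signature yourself and verify it in the same script. The sketch below is a minimal round trip under one assumption: that the signature is a base64-encoded HMAC-SHA256 digest of the raw request body bytes, keyed with the decoded signature secret. The helper names `sign_body` and `signature_matches` are hypothetical and not part of Vocode.

```python
import base64
import hashlib
import hmac
import json
import os


def sign_body(body: bytes, signature_secret: str) -> str:
    """Hypothetical helper: base64-encoded HMAC-SHA256 digest of `body`."""
    key = base64.b64decode(signature_secret)
    digest = hmac.new(key, body, hashlib.sha256).digest()
    return base64.b64encode(digest).decode()


def signature_matches(header_value: str, body: bytes, signature_secret: str) -> bool:
    """Hypothetical helper: recompute the digest and compare in constant time."""
    expected = base64.b64decode(sign_body(body, signature_secret))
    return hmac.compare_digest(base64.b64decode(header_value), expected)


# Generate a throwaway secret and sign an example body.
secret = base64.b64encode(os.urandom(32)).decode()
body = json.dumps({"payload": {"length": "30m", "time": "10:30am"}}).encode("utf-8")
header = sign_body(body, secret)

print(signature_matches(header, body, secret))         # True: body is untouched
print(signature_matches(header, body + b" ", secret))  # False: body was modified
```

Because the digest covers the exact bytes, even a whitespace change invalidates the signature, so verify against the raw body and not a re-serialized copy of the parsed JSON.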
147+
#### Response Formatting
148+
149+
Vocode expects responses from the user’s API in JSON in the following format:
150+
151+
```python
from typing import Any, Optional

from pydantic import BaseModel


class Response(BaseModel):
    result: Any
    agent_message: Optional[str] = None
```
157+
158+
- `result` is a payload containing the result of the action on the user’s side, and can be in any format
159+
- `agent_message` optionally contains a message that will be synthesized into audio and sent back to the phone call (see [Configuring the External Action](/open-source/external-action#configuring-the-external-action) above for more info)
160+
161+
In the [Meeting Assistant Example](/open-source/external-action#meeting-assistant-example) below, the user’s API could return a JSON response that looks like:
162+
163+
```json
{
  "result": {
    "success": true
  },
  "agent_message": "I've set up a calendar appointment at 10:30am tomorrow for 30 minutes"
}
```
171+
172+
## External Action Local Response Server Example:
173+
174+
The following quick-start example lets you test external actions locally.
175+
176+
Running `fastapi dev app.py` starts the server at `http://127.0.0.1:8000`, which can then be used as the target for external actions locally!
177+
178+
<CodeGroup>
```python app.py
import time

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ExternalActionRequest(BaseModel):
    class Payload(BaseModel):
        length: str
        time: str

    call_id: str
    payload: Payload

@app.post("/external_action")
def update_item(external_action_request: ExternalActionRequest):
    print(f"Received request:\n{external_action_request.model_dump_json(indent=2)}")
    # Simulate a slow downstream call (Vocode times out the request after 10 seconds)
    time.sleep(3)
    return {"result": {"success": True}}
```

```txt requirements.txt
pydantic==2.*
fastapi==0.111.*
```

</CodeGroup>
208+
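
To exercise the local server without placing a call, a request can be sent by hand. The sketch below uses only the standard library; the body shape (`call_id` plus `payload`) is taken from the `ExternalActionRequest` model in `app.py`, while the function names and the `test-call` ID are made up for illustration. It does not set `x-vocode-signature`, since the example server does not check it.

```python
import json
import urllib.request

# Local URL and route from the FastAPI example app.
URL = "http://127.0.0.1:8000/external_action"


def build_test_request(call_id: str, length: str, time: str) -> urllib.request.Request:
    """Hypothetical helper: build a POST request shaped like the example model."""
    body = json.dumps(
        {"call_id": call_id, "payload": {"length": length, "time": time}}
    ).encode("utf-8")
    return urllib.request.Request(
        URL,
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def send_test_request() -> dict:
    """Send the request; requires the example server to be running locally."""
    request = build_test_request("test-call", "30m", "10:30am")
    # Mirror the 10 second limit mentioned for Vocode's own requests.
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

With the server running, calling `send_test_request()` should return the JSON body produced by the handler after its simulated delay.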
209+
## Meeting Assistant Example:
210+
211+
This is an example of how to configure a Meeting Assistant Action, which will attempt to book a meeting for 30 minutes or an hour at any time whose minutes end in a zero (e.g. 10:30am is okay but 10:35am is not).
212+
213+
```python Python
import json
import base64
import os

from vocode.streaming.action.execute_external_action import (
    ExecuteExternalAction,
    ExecuteExternalActionVocodeActionConfig,
)

ACTION_INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "length": {
            "type": "string",
            "enum": ["30m", "1hr"],
        },
        "time": {
            "type": "string",
            # Raw string so that \d reaches the regex engine unescaped
            "pattern": r"^\d{2}:\d0[ap]m$",
        },
    },
}

action_config = {
    "name": "Meeting_Booking_Assistant",
    "description": "Book a meeting for a 30 minute or 1 hour call.",
    "url": "http://example.com/booking",
    "speak_on_send": True,
    "speak_on_receive": True,
    "input_schema": json.dumps(ACTION_INPUT_SCHEMA),
    "signature_secret": base64.b64encode(os.urandom(32)).decode(),
}

action = ExecuteExternalAction(
    action_config=ExecuteExternalActionVocodeActionConfig(**action_config),
)
```
