Commit a5626a2

update docs and fix integration
1 parent e34773c commit a5626a2

5 files changed

+104
-62
lines changed

docs/images/sentry.png

416 KB

docs/open-source/sentry.mdx

+68-38
@@ -16,7 +16,7 @@ Integrating Sentry into your application allows you to:
1616
- Gain insights into the user experience and identify bottlenecks.
1717
- Improve the overall reliability and stability of your application.
1818

19-
## Configuring Sentry SDK
19+
## Configuring Sentry
2020

2121
To integrate Sentry into your application, you need to initialize the Sentry SDK at the earliest instantiation point in your code. This ensures that Sentry starts capturing errors and performance data as soon as possible.
2222

@@ -32,10 +32,9 @@ sentry_sdk.init(
3232
dsn=os.getenv("SENTRY_DSN"),
3333
environment=os.getenv("ENVIRONMENT"),
3434
# Sample rate for transactions (performance).
35-
traces_sample_rate=float(os.getenv("SENTRY_TRACES_SAMPLE_RATE", 1.0)),
36-
profiles_sample_rate=float(os.getenv("SENTRY_TRACES_SAMPLE_RATE", 1.0)),
35+
traces_sample_rate=1.0,
3736
# Sample rate for exceptions / crashes.
38-
sample_rate=float(os.getenv("SENTRY_SAMPLE_RATE", 1.0)),
37+
sample_rate=1.0,
3938
max_request_body_size="always",
4039
integrations=[
4140
AsyncioIntegration(),
@@ -44,62 +43,93 @@ sentry_sdk.init(
4443
)
4544
```
4645

47-
### Explanation of Configuration Settings
46+
Head to [sentry.io](https://sentry.io) to get your free DSN. See the [Sentry Python configuration options](https://docs.sentry.io/platforms/python/configuration/options/) for more information about the options above.
4847

49-
- `dsn`: The Data Source Name (DSN) is a unique identifier for your Sentry project. It tells the SDK where to send the captured data.
50-
- `environment`: The environment in which your application is running (e.g., production, staging, development).
51-
- `traces_sample_rate`: The sample rate for capturing performance data (transactions). A value of `1.0` means 100% of transactions will be captured.
52-
- `profiles_sample_rate`: The sample rate for capturing profiling data. Similar to `traces_sample_rate`.
53-
- `sample_rate`: The sample rate for capturing errors and exceptions. A value of `1.0` means 100% of errors will be captured.
54-
- `max_request_body_size`: Configures the maximum size of request bodies to capture. Setting it to `"always"` ensures all request bodies are captured.
55-
- `integrations`: A list of integrations to use with Sentry. In this case, we're using `AsyncioIntegration` for asyncio support and `LoguruIntegration` for Loguru logging support.
48+
## Instrumenting your application
5649

57-
## Custom Sentry Spans
50+
Vocode exposes a set of custom spans that are automatically sent to Sentry during Vocode conversations. To use these spans, you'll need to manually attach a transaction to the current scope.
5851

59-
In addition to automatic error and performance monitoring, you can manually create and manage spans to gain deeper insights into specific parts of your application. The `CustomSentrySpans` class defines several custom spans that you can use to measure specific events and durations within your application. Here's a more human-readable explanation of each span and what it captures:
52+
### Example 1: Streaming Conversation
6053

61-
### Span Descriptions
54+
In `quickstarts/streaming_conversation.py`, replace the `main` function with the following code:
6255

63-
1. **Connected to First Send** _(`CONNECTED_TO_FIRST_SEND`)_: Measures the time from when a connection is established to when the first data is sent. This can help identify delays in the initial data transmission.
56+
```python
57+
import sentry_sdk
58+
from sentry_sdk.integrations.asyncio import AsyncioIntegration
59+
from sentry_sdk.integrations.loguru import LoguruIntegration
60+
from vocode import sentry_transaction
6461

65-
2. **Endpointing Latency** _(`ENDPOINTING_LATENCY`)_: Captures the latency involved in endpointing, which is the process of determining the end of a spoken phrase or sentence. This is crucial for applications involving speech recognition.
62+
sentry_sdk.init(
63+
...,
64+
integrations=[
65+
AsyncioIntegration(),
66+
LoguruIntegration(),
67+
],
68+
)
6669

67-
3. **First Send to First Receive** _(`FIRST_SEND_TO_FIRST_RECEIVE`)_: Measures the time from when the first data is sent to when the first response is received. This span helps in understanding the round-trip time for the initial communication.
70+
async def main():
71+
...
72+
await conversation.start()
73+
...
6874

69-
4. **Language Model Time to First Token** _(`LANGUAGE_MODEL_TIME_TO_FIRST_TOKEN`)_: Tracks the time taken by the language model to generate the first token (word or character) in its response. This is useful for evaluating the performance of language models like GPT-4o.
7075

71-
5. **Latency of Conversation** _(`LATENCY_OF_CONVERSATION`)_: Measures the overall latency of a conversation, from start to finish. This span provides insights into the responsiveness of the entire conversational flow.
76+
if __name__ == "__main__":
77+
with sentry_sdk.start_transaction(
78+
op="streaming_conversation", description="streaming_conversation"
79+
) as sentry_txn:
80+
sentry_transaction.set(sentry_txn)
81+
asyncio.run(main())
82+
```
7283

73-
6. **Latency of Transcription Start** _(`LATENCY_OF_TRANSCRIPTION_START`)_: Captures the time taken to start the transcription process after receiving audio input. This is important for applications that convert speech to text.
84+
Head to the Performance pane in Sentry and click into the trace. You should see something like this:
7485

75-
7. **LLM First Sentence Total** _(`LLM_FIRST_SENTENCE_TOTAL`)_: Measures the total time taken by the language model to generate the first complete sentence. This span helps in assessing the initial response time of the language model.
86+
![Sentry Transaction](/images/sentry.png)
7687

77-
8. **Start to Connection** _(`START_TO_CONNECTION`)_: Tracks the time from the start of an operation to the establishment of a connection. This can help identify delays in the connection setup phase.
88+
### Example 2: Telephony Server
7889

79-
9. **Synthesis Generate First Chunk** _(`SYNTHESIS_GENERATE_FIRST_CHUNK`)_: Measures the time taken to generate the first chunk of synthesized speech. This is crucial for applications that convert text to speech.
90+
Simply initialize the Sentry SDK at the top of the file, e.g. in `app/telephony_app/main.py`:
8091

81-
10. **Synthesis Time to First Token** _(`SYNTHESIS_TIME_TO_FIRST_TOKEN`)_: Captures the time taken to generate the first token (word or character) in the synthesized speech. This span helps in evaluating the performance of speech synthesis models.
92+
```python
93+
sentry_sdk.init(
94+
...
95+
)
8296

83-
11. **Time to First Token** _(`TIME_TO_FIRST_TOKEN`)_: Measures the overall time taken to generate the first token in any process, whether it's language modeling or speech synthesis. This span provides a general metric for initial response time.
97+
app = FastAPI(docs_url=None)
98+
```
8499

85-
12. **Synthesizer Synthesis Total** _(`SYNTHESIZER_SYNTHESIS_TOTAL`)_: Tracks the total time taken for the entire speech synthesis process. This span helps in understanding the overall performance of the synthesizer.
100+
## Custom Spans Overview
86101

87-
13. **Synthesizer Time to First Token** _(`SYNTHESIZER_TIME_TO_FIRST_TOKEN`)_: Measures the time taken by the synthesizer to generate the first token in the synthesized speech. This is useful for evaluating the initial response time of the synthesizer.
102+
### Latency of Conversation
88103

89-
14. **Synthesizer Create Speech** _(`SYNTHESIZER_CREATE_SPEECH`)_: Captures the time taken to create the entire speech output. This span provides insights into the performance of the speech creation process.
104+
**Latency of Conversation** _(`LATENCY_OF_CONVERSATION`)_ measures the overall latency of a conversation, from when the user finishes their utterance to when the agent begins its response. It is broken up into the following sub-spans:
90105

91-
#### Note on Synthesizer Spans
106+
- **[Deepgram Only] Endpointing Latency** _(`ENDPOINTING_LATENCY`)_: Captures the extra latency incurred while retrieving finalized transcripts from Deepgram before deciding to invoke the agent.
107+
- **Language Model Time to First Token** _(`LANGUAGE_MODEL_TIME_TO_FIRST_TOKEN`)_: Tracks the time taken by the language model to generate the first token (word or character) in its response.
108+
- **Synthesis Time to First Token** _(`SYNTHESIS_TIME_TO_FIRST_TOKEN`)_: Measures the time taken by the synthesizer to generate the first token in the synthesized speech. This is useful for evaluating the initial response time of the synthesizer.
92109

93-
The following spans will have the actual synthesizer's name prepended to them. For example, if the synthesizer is `ElevenLabsSynthesizer`, the span `SYNTHESIZER_SYNTHESIS_TOTAL` will be recorded as `ElevenLabsSynthesizer.synthesis_total`:
110+
### Deepgram
94111

95-
- **Synthesis Total** _(`SYNTHESIZER_SYNTHESIS_TOTAL`)_
96-
- **Time to First Token** _(`SYNTHESIZER_TIME_TO_FIRST_TOKEN`)_
97-
- **Create Speech** _(`SYNTHESIZER_CREATE_SPEECH`)_
112+
We capture the following spans in our Deepgram integration:
113+
114+
- **Connected to First Send** _(`CONNECTED_TO_FIRST_SEND`)_: Measures the time from when the Deepgram websocket connection is established to when the first data is sent.
115+
- **[Deepgram Only] First Send to First Receive** _(`FIRST_SEND_TO_FIRST_RECEIVE`)_: Measures the time from when the first data is sent to Deepgram to when the first response is received.
116+
- **[Deepgram Only] Start to Connection** _(`START_TO_CONNECTION`)_: Tracks the time it takes to establish the websocket connection with Deepgram.
117+
118+
### LLM
98119

99-
This naming convention helps in identifying and differentiating spans for various synthesizers in your application.
120+
For our OpenAI and Anthropic integrations, we capture:
100121

101-
## Wrap Up
122+
- **Time to First Token** _(`TIME_TO_FIRST_TOKEN`)_: Measures the time taken by the language model to generate the first token (word or character) in its response.
123+
- **LLM First Sentence Total** _(`LLM_FIRST_SENTENCE_TOTAL`)_: Measures the total time taken by the language model to generate the first complete sentence.
102124

103-
Integrating Sentry into your application provides valuable insights into errors, exceptions, and performance issues. By configuring the Sentry SDK and using custom spans, you can monitor the health of your application and improve its reliability and stability.
125+
### Synthesizer
104126

105-
For more information on Sentry and its features, refer to the [official documentation](https://docs.sentry.io/).
127+
For most of our synthesizer integrations, we capture:
128+
129+
- **Synthesis Generate First Chunk** _(`SYNTHESIS_GENERATE_FIRST_CHUNK`)_: Measures the time taken to generate the first chunk of synthesized speech.
130+
- **Synthesizer Synthesis Total** _(`SYNTHESIZER_SYNTHESIS_TOTAL`)_: Tracks the total time taken for the entire speech synthesis process. This span helps in understanding the overall performance of the synthesizer.
131+
132+
These spans will have the actual synthesizer's name prepended to them. For example, if the synthesizer is `ElevenLabsSynthesizer`, the span `SYNTHESIZER_SYNTHESIS_TOTAL` will be recorded as `ElevenLabsSynthesizer.synthesis_total`:
133+
134+
- **Synthesis Total** _(`SYNTHESIZER_SYNTHESIS_TOTAL`)_
135+
- **Time to First Token** _(`SYNTHESIZER_TIME_TO_FIRST_TOKEN`)_
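
The synthesizer naming convention documented above can be sketched as follows. This is a minimal illustration, not Vocode's implementation: `prefixed_span_op` and the empty `ElevenLabsSynthesizer` stand-in are hypothetical, and the only behavior taken from the docs is that `SYNTHESIZER_SYNTHESIS_TOTAL` is recorded as `ElevenLabsSynthesizer.synthesis_total`. The sketch assumes the prefix is simply the synthesizer's class name.

```python
class ElevenLabsSynthesizer:
    """Hypothetical stand-in; the real class lives in vocode."""


def prefixed_span_op(synthesizer: object, suffix: str) -> str:
    """Build a span name such as 'ElevenLabsSynthesizer.synthesis_total'.

    Assumption: the prefix is the synthesizer's class name.
    """
    return f"{type(synthesizer).__name__}.{suffix}"


if __name__ == "__main__":
    # Show the span name a dashboard filter would need to match.
    print(prefixed_span_op(ElevenLabsSynthesizer(), "synthesis_total"))
```

When filtering in Sentry, this means searching for the concrete synthesizer's name rather than a generic `synthesizer.` prefix.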

vocode/__init__.py

+1
@@ -83,4 +83,5 @@ def getenv(key, default=None):
8383
ContextVar("conversation_id", default=None),
8484
)
8585
sentry_span_tags: ContextWrapper = ContextWrapper(ContextVar("sentry_span_tags", default=None))
86+
sentry_transaction = ContextWrapper(ContextVar("sentry_transaction", default=None))
8687
get_serialized_ctx_wrappers = ContextWrapper.serialize_instances
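
The new `sentry_transaction` wrapper relies on `ContextVar` semantics. The sketch below shows why a value set before `asyncio.run(main())` (as in the streaming quickstart) is still readable inside the coroutine. The `ContextWrapper` here is a hypothetical stand-in that exposes only the `.set()` and `.value` members used in this commit; Vocode's real class may differ. `read_transaction` and `run_with_transaction` are illustrative names.

```python
import asyncio
from contextvars import ContextVar
from typing import Any, Optional


class ContextWrapper:
    """Hypothetical stand-in exposing only `.set()` and `.value`."""

    def __init__(self, ctx_var: ContextVar) -> None:
        self._ctx_var = ctx_var

    def set(self, value: Any) -> None:
        self._ctx_var.set(value)

    @property
    def value(self) -> Any:
        return self._ctx_var.get()


sentry_transaction = ContextWrapper(ContextVar("sentry_transaction", default=None))


async def read_transaction() -> Optional[str]:
    # asyncio.run() wraps this coroutine in a task, and a task copies the
    # context of the code that created it, so the value set by the caller
    # is visible here.
    return sentry_transaction.value


def run_with_transaction(txn: str) -> Optional[str]:
    # A plain string stands in for a real Sentry transaction object.
    sentry_transaction.set(txn)
    return asyncio.run(read_transaction())
```

Because the default is `None`, code that reads `sentry_transaction.value` without a prior `set()` gets `None` and can fall back gracefully.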

vocode/streaming/telephony/server/router/calls.py

+21-17
@@ -2,6 +2,7 @@
22

33
from fastapi import APIRouter, HTTPException, WebSocket
44
from loguru import logger
5+
import sentry_sdk
56

67
from vocode.streaming.agent.abstract_factory import AbstractAgentFactory
78
from vocode.streaming.agent.default_factory import DefaultAgentFactory
@@ -22,6 +23,7 @@
2223
from vocode.streaming.transcriber.default_factory import DefaultTranscriberFactory
2324
from vocode.streaming.utils.base_router import BaseRouter
2425
from vocode.streaming.utils.events_manager import EventsManager
26+
from vocode import sentry_transaction
2527

2628

2729
class CallsRouter(BaseRouter):
@@ -96,25 +98,27 @@ def _from_call_config(
9698
raise ValueError(f"Unknown call config type {call_config.type}")
9799

98100
async def connect_call(self, websocket: WebSocket, id: str):
99-
await websocket.accept()
100-
logger.debug("Phone WS connection opened for chat {}".format(id))
101-
call_config = await self.config_manager.get_config(id)
102-
if not call_config:
103-
raise HTTPException(status_code=400, detail="No active phone call")
101+
with sentry_sdk.start_transaction(op="connect_call") as sentry_txn:
102+
sentry_transaction.set(sentry_txn)
103+
await websocket.accept()
104+
logger.debug("Phone WS connection opened for chat {}".format(id))
105+
call_config = await self.config_manager.get_config(id)
106+
if not call_config:
107+
raise HTTPException(status_code=400, detail="No active phone call")
104108

105-
phone_conversation = self._from_call_config(
106-
base_url=self.base_url,
107-
call_config=call_config,
108-
config_manager=self.config_manager,
109-
conversation_id=id,
110-
transcriber_factory=self.transcriber_factory,
111-
agent_factory=self.agent_factory,
112-
synthesizer_factory=self.synthesizer_factory,
113-
events_manager=self.events_manager,
114-
)
109+
phone_conversation = self._from_call_config(
110+
base_url=self.base_url,
111+
call_config=call_config,
112+
config_manager=self.config_manager,
113+
conversation_id=id,
114+
transcriber_factory=self.transcriber_factory,
115+
agent_factory=self.agent_factory,
116+
synthesizer_factory=self.synthesizer_factory,
117+
events_manager=self.events_manager,
118+
)
115119

116-
await phone_conversation.attach_ws_and_start(websocket)
117-
logger.debug("Phone WS connection closed for chat {}".format(id))
120+
await phone_conversation.attach_ws_and_start(websocket)
121+
logger.debug("Phone WS connection closed for chat {}".format(id))
118122

119123
def get_router(self) -> APIRouter:
120124
return self.router
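
Setting the transaction inside `connect_call` keeps it scoped to that call. The sketch below illustrates the underlying behavior, assuming each connection is handled in its own asyncio task: a `ContextVar` set inside one task is not seen by another. `current_txn`, `handle_call`, and `serve_two_calls` are hypothetical names, and plain strings stand in for Sentry transactions.

```python
import asyncio
from contextvars import ContextVar
from typing import List, Optional

# Hypothetical stand-in for vocode's `sentry_transaction` context variable.
current_txn: ContextVar = ContextVar("current_txn", default=None)


async def handle_call(call_id: str) -> Optional[str]:
    # Each handler records its own "transaction" in the context variable.
    current_txn.set(f"txn-{call_id}")
    # Yield to the event loop so the other handler runs in between.
    await asyncio.sleep(0)
    # The value read back is still this task's own, not the other call's.
    return current_txn.get()


async def serve_two_calls() -> List[Optional[str]]:
    # gather() wraps each coroutine in a separate task with its own
    # copy of the context.
    return await asyncio.gather(handle_call("a"), handle_call("b"))
```

Running `asyncio.run(serve_two_calls())` gives each handler back its own value, which is the property that lets concurrent calls report spans to separate transactions.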

vocode/utils/sentry_utils.py

+14-7
@@ -5,7 +5,7 @@
55
from loguru import logger
66
from sentry_sdk.tracing import Span, Transaction, _SpanRecorder
77

8-
from vocode import get_serialized_ctx_wrappers
8+
from vocode import get_serialized_ctx_wrappers, sentry_transaction
99

1010
if TYPE_CHECKING:
1111
from vocode.streaming.synthesizer.base_synthesizer import BaseSynthesizer
@@ -163,8 +163,8 @@ def set_tags(span: Span) -> Span:
163163

164164
@sentry_configured
165165
def get_span_by_op(op_value):
166-
transaction: Transaction = sentry_sdk.Hub.current.scope.transaction
167-
if transaction is not None:
166+
transaction: Transaction = sentry_sdk.Hub.current.scope.transaction or sentry_transaction.value
167+
if transaction is not None and transaction._span_recorder is not None:
168168
# Probably not great accessing an internal variable but transaction spans aren't
169169
# exposed publicly so it is what it is.
170170
span_matches = [
@@ -180,18 +180,25 @@ def get_span_by_op(op_value):
180180
return set_tags(most_recent_span)
181181
else:
182182
# If no span with the matching op was found
183-
logger.error(f"No span found with op '{op_value}'.")
183+
logger.warning(f"No span found with op '{op_value}'.")
184184
return None
185185
else:
186-
logger.debug("No active transaction found.")
186+
if transaction and transaction._span_recorder is None:
187+
logger.warning(f"Transaction Span Recorder Missing -- {transaction}")
188+
else:
189+
logger.warning("No active transaction found.")
187190
return None
188191

189192

190193
@sentry_configured
191194
def complete_span_by_op(op_value):
192-
span = get_span_by_op(op_value)
195+
try:
196+
span = get_span_by_op(op_value)
197+
except Exception as e:
198+
logger.error(f"Error getting span by op '{op_value}': {e}")
199+
return None
193200
if span is None:
194-
logger.error(f"No span found with op '{op_value}'.")
201+
logger.warning(f"No span found with op '{op_value}'.")
195202
return None
196203
span.finish()
197204
