[docs sprint] Add Sentry Docs to OS (#20)

Mac Wilkinson · srhinos · ajar98 · web-flow · commit 2ea74b3b1b04 · 2024-06-13T17:42:32.000-07:00
* Add Sentry Docs to OS

* Remove Tracing

* update docs and fix integration

* remove free

---------

Co-authored-by: srhinos <6531393+srhinos@users.noreply.github.com>
Co-authored-by: Ajay Raj &lt;ajay.n.raj@gmail.com&gt;
diff --git a/docs/images/sentry.png b/docs/images/sentry.png
diff --git a/docs/mint.json b/docs/mint.json
@@ -77,6 +77,7 @@
         "open-source/playground",
         "open-source/turn-based-conversation",
         "open-source/language-support",
+        "open-source/sentry",
         "open-source/logging-with-loguru",
         "open-source/agent-factory"
       ]
diff --git a/docs/open-source/sentry.mdx b/docs/open-source/sentry.mdx
@@ -0,0 +1,135 @@
+---
+title: "Sentry SDK Integration"
+description: "Integrate Sentry for error tracking and performance monitoring"
+---
+
+## What is Sentry?
+
+[Sentry](https://sentry.io/) is an open-source error tracking tool that helps developers monitor and fix crashes in real time. It provides insight into the health of your applications by capturing and reporting errors, exceptions, and performance issues.
+
+## Why integrate Sentry?
+
+Integrating Sentry into your application allows you to:
+
+- Automatically capture and report errors and exceptions.
+- Monitor the performance of your application.
+- Gain insights into the user experience and identify bottlenecks.
+- Improve the overall reliability and stability of your application.
+
+## Configuring Sentry
+
+To integrate Sentry into your application, initialize the Sentry SDK as early as possible in your application's startup code. This ensures that Sentry captures errors and performance data from the start.
+
+Here's how you can configure the Sentry SDK:
+
+```python
+import os
+import sentry_sdk
+from sentry_sdk.integrations.asyncio import AsyncioIntegration
+from sentry_sdk.integrations.loguru import LoguruIntegration
+
+sentry_sdk.init(
+    dsn=os.getenv("SENTRY_DSN"),
+    environment=os.getenv("ENVIRONMENT"),
+    # Sample rate for transactions (performance).
+    traces_sample_rate=1.0,
+    # Sample rate for exceptions / crashes.
+    sample_rate=1.0,
+    max_request_body_size="always",
+    integrations=[
+        AsyncioIntegration(),
+        LoguruIntegration(),
+    ],
+)
+```
+
+Head to [sentry.io](https://sentry.io) to get your DSN. See the [configuration options reference](https://docs.sentry.io/platforms/python/configuration/options/) for more information about the options above.
+
+## Instrumenting your application
+
+Vocode exposes a set of custom spans that get automatically sent to Sentry during Vocode conversations. To use these spans, you'll need to manually attach a transaction to the current scope.
+
+### Example 1: Streaming Conversation
+
+In `quickstarts/streaming_conversation.py`, replace the `main` function with the following code:
+
+```python
+import sentry_sdk
+from sentry_sdk.integrations.asyncio import AsyncioIntegration
+from sentry_sdk.integrations.loguru import LoguruIntegration
+from vocode import sentry_transaction
+
+sentry_sdk.init(
+    ...,
+    integrations=[
+        AsyncioIntegration(),
+        LoguruIntegration(),
+    ],
+)
+
+async def main():
+    ...
+    await conversation.start()
+    ...
+
+
+if __name__ == "__main__":
+    with sentry_sdk.start_transaction(
+        op="streaming_conversation", description="streaming_conversation"
+    ) as sentry_txn:
+        sentry_transaction.set(sentry_txn)
+        asyncio.run(main())
+```
+
+Head to the Performance pane in Sentry and click into the trace. You should see something that looks like this:
+
+![Sentry Transaction](/images/sentry.png)
+
+### Example 2: Telephony Server
+
+Initialize the Sentry SDK at the top of the file, e.g. in `app/telephony_app/main.py`:
+
+```python
+sentry_sdk.init(
+    ...
+)
+
+app = FastAPI(docs_url=None)
+```
+
+## Custom Spans Overview
+
+### Latency of Conversation
+
+**Latency of Conversation** _(`LATENCY_OF_CONVERSATION`)_ measures the overall latency of a conversation, from when the user finishes their utterance to when the agent begins its response. It is broken up into the following sub-spans:
+
+- **[Deepgram Only] Endpointing Latency** _(`ENDPOINTING_LATENCY`)_: Captures the extra latency involved from retrieving finalized transcripts from Deepgram before deciding to invoke the agent.
+- **Language model Time to First Token** _(`LANGUAGE_MODEL_TIME_TO_FIRST_TOKEN`)_: Tracks the time taken by the language model to generate the first token (word or character) in its response.
+- **Synthesis Time to First Token** _(`SYNTHESIS_TIME_TO_FIRST_TOKEN`)_: Measures the time taken by the synthesizer to generate the first token in the synthesized speech. This is useful for evaluating the initial response time of the synthesizer.
+
+### Deepgram
+
+We capture the following spans in our Deepgram integration:
+
+- **Connected to First Send** _(`CONNECTED_TO_FIRST_SEND`)_: Measures the time from when the Deepgram websocket connection is established to when the first data is sent.
+- **[Deepgram Only] First Send to First Receive** _(`FIRST_SEND_TO_FIRST_RECEIVE`)_: Measures the time from when the first data is sent to Deepgram to when the first response is received.
+- **[Deepgram Only] Start to Connection** _(`START_TO_CONNECTION`)_: Tracks the time it takes to establish the websocket connection with Deepgram.
+
+### LLM
+
+For our OpenAI and Anthropic integrations, we capture:
+
+- **Time to First Token** _(`TIME_TO_FIRST_TOKEN`)_: Measures the time taken by the language model to generate the first token (word or character) in its response.
+- **LLM First Sentence Total** _(`LLM_FIRST_SENTENCE_TOTAL`)_: Measures the total time taken by the language model to generate the first complete sentence.
+
+### Synthesizer
+
+For most of our synthesizer integrations, we capture:
+
+- **Synthesis Generate First Chunk** _(`SYNTHESIS_GENERATE_FIRST_CHUNK`)_: Measures the time taken to generate the first chunk of synthesized speech.
+- **Synthesizer Synthesis Total** _(`SYNTHESIZER_SYNTHESIS_TOTAL`)_: Tracks the total time taken for the entire speech synthesis process. This span helps in understanding the overall performance of the synthesizer.
+
+These spans will have the actual synthesizer's name prepended to them. For example, if the synthesizer is `ElevenLabsSynthesizer`, the span `SYNTHESIZER_SYNTHESIS_TOTAL` will be recorded as `ElevenLabsSynthesizer.synthesis_total`:
+
+- **Synthesis Total** _(`SYNTHESIZER_SYNTHESIS_TOTAL`)_
+- **Time to First Token** _(`SYNTHESIZER_TIME_TO_FIRST_TOKEN`)_
diff --git a/vocode/__init__.py b/vocode/__init__.py
@@ -83,4 +83,5 @@ def getenv(key, default=None):
     ContextVar("conversation_id", default=None),
 )
 sentry_span_tags: ContextWrapper = ContextWrapper(ContextVar("sentry_span_tags", default=None))
+sentry_transaction = ContextWrapper(ContextVar("sentry_transaction", default=None))
 get_serialized_ctx_wrappers = ContextWrapper.serialize_instances
diff --git a/vocode/streaming/telephony/server/router/calls.py b/vocode/streaming/telephony/server/router/calls.py
@@ -2,6 +2,7 @@
 
 from fastapi import APIRouter, HTTPException, WebSocket
 from loguru import logger
+import sentry_sdk
 
 from vocode.streaming.agent.abstract_factory import AbstractAgentFactory
 from vocode.streaming.agent.default_factory import DefaultAgentFactory
@@ -22,6 +23,7 @@
 from vocode.streaming.transcriber.default_factory import DefaultTranscriberFactory
 from vocode.streaming.utils.base_router import BaseRouter
 from vocode.streaming.utils.events_manager import EventsManager
+from vocode import sentry_transaction
 
 
 class CallsRouter(BaseRouter):
@@ -96,25 +98,27 @@ def _from_call_config(
             raise ValueError(f"Unknown call config type {call_config.type}")
 
     async def connect_call(self, websocket: WebSocket, id: str):
-        await websocket.accept()
-        logger.debug("Phone WS connection opened for chat {}".format(id))
-        call_config = await self.config_manager.get_config(id)
-        if not call_config:
-            raise HTTPException(status_code=400, detail="No active phone call")
+        with sentry_sdk.start_transaction(op="connect_call") as sentry_txn:
+            sentry_transaction.set(sentry_txn)
+            await websocket.accept()
+            logger.debug("Phone WS connection opened for chat {}".format(id))
+            call_config = await self.config_manager.get_config(id)
+            if not call_config:
+                raise HTTPException(status_code=400, detail="No active phone call")
 
-        phone_conversation = self._from_call_config(
-            base_url=self.base_url,
-            call_config=call_config,
-            config_manager=self.config_manager,
-            conversation_id=id,
-            transcriber_factory=self.transcriber_factory,
-            agent_factory=self.agent_factory,
-            synthesizer_factory=self.synthesizer_factory,
-            events_manager=self.events_manager,
-        )
+            phone_conversation = self._from_call_config(
+                base_url=self.base_url,
+                call_config=call_config,
+                config_manager=self.config_manager,
+                conversation_id=id,
+                transcriber_factory=self.transcriber_factory,
+                agent_factory=self.agent_factory,
+                synthesizer_factory=self.synthesizer_factory,
+                events_manager=self.events_manager,
+            )
 
-        await phone_conversation.attach_ws_and_start(websocket)
-        logger.debug("Phone WS connection closed for chat {}".format(id))
+            await phone_conversation.attach_ws_and_start(websocket)
+            logger.debug("Phone WS connection closed for chat {}".format(id))
 
     def get_router(self) -> APIRouter:
         return self.router
diff --git a/vocode/utils/sentry_utils.py b/vocode/utils/sentry_utils.py
@@ -5,7 +5,7 @@
 from loguru import logger
 from sentry_sdk.tracing import Span, Transaction, _SpanRecorder
 
-from vocode import get_serialized_ctx_wrappers
+from vocode import get_serialized_ctx_wrappers, sentry_transaction
 
 if TYPE_CHECKING:
     from vocode.streaming.synthesizer.base_synthesizer import BaseSynthesizer
@@ -163,8 +163,8 @@ def set_tags(span: Span) -> Span:
 
 @sentry_configured
 def get_span_by_op(op_value):
-    transaction: Transaction = sentry_sdk.Hub.current.scope.transaction
-    if transaction is not None:
+    transaction: Transaction = sentry_sdk.Hub.current.scope.transaction or sentry_transaction.value
+    if transaction is not None and transaction._span_recorder is not None:
         # Probably not great accessing an internal variable but transaction spans aren't
         # exposed publicly so it is what it is.
         span_matches = [
@@ -180,18 +180,25 @@ def get_span_by_op(op_value):
                 return set_tags(most_recent_span)
         else:
             # If no span with the matching op was found
-            logger.error(f"No span found with op '{op_value}'.")
+            logger.warning(f"No span found with op '{op_value}'.")
             return None
     else:
-        logger.debug("No active transaction found.")
+        if transaction and transaction._span_recorder is None:
+            logger.warning(f"Transaction Span Recorder Missing -- {transaction}")
+        else:
+            logger.warning("No active transaction found.")
         return None
 
 
 @sentry_configured
 def complete_span_by_op(op_value):
-    span = get_span_by_op(op_value)
+    try:
+        span = get_span_by_op(op_value)
+    except Exception as e:
+        logger.error(f"Error getting span by op '{op_value}': {e}")
+        return None
     if span is None:
-        logger.error(f"No span found with op '{op_value}'.")
+        logger.warning(f"No span found with op '{op_value}'.")
         return None
     span.finish()
 
