Support bringing existing indexes and OpenAI resources (#27)
* Merge

* Support use of existing services

* Dont setup index for existing

* Match parameters to defaults, remove print
pamelafox authored Oct 18, 2024
1 parent 4bee5e7 commit 98a496e
Showing 11 changed files with 287 additions and 59 deletions.
29 changes: 21 additions & 8 deletions README.md
@@ -5,6 +5,15 @@

This repo contains an example of how to implement RAG support in applications that use voice as their user interface, powered by the GPT-4o realtime API for audio. We describe the pattern in more detail in [this blog post](https://aka.ms/voicerag), and you can see this sample app in action in [this short video](https://youtu.be/vXJka8xZ9Ko).

* [Features](#features)
* [Architecture Diagram](#architecture-diagram)
* [Getting Started](#getting-started)
* [GitHub Codespaces](#github-codespaces)
* [VS Code Dev Containers](#vs-code-dev-containers)
* [Local environment](#local-environment)
* [Deploying the app](#deploying-the-app)
* [Development server](#development-server)

## Features

* **Voice interface**: The app uses the browser's microphone to capture voice input, and sends it to the backend where it is processed by the Azure OpenAI GPT-4o Realtime API.
@@ -32,7 +41,7 @@ You can run this repo virtually by using GitHub Codespaces, which will open a we

Once the codespace opens (this may take several minutes), open a new terminal and proceed to [deploy the app](#deploying-the-app).

### VS Code Dev Containers

You can run the project in your local VS Code Dev Container using the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers):

@@ -42,7 +51,7 @@ You can run the project in your local VS Code Dev Conta
[![Open in Dev Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/azure-samples/aisearch-openai-rag-audio)
3. In the VS Code window that opens, once the project files show up (this may take several minutes), open a new terminal, and proceed to [deploying the app](#deploying-the-app).

### Local environment

1. Install the required tools:
* [Azure Developer CLI](https://aka.ms/azure-dev/install)
@@ -81,6 +90,8 @@ The steps below will provision Azure resources and deploy the application code t
Enter a name that will be used for the resource group.
This will create a new folder in the `.azure` folder, and set it as the active environment for any calls to `azd` going forward.

1. (Optional) If you want to re-use any existing resources, follow [these instructions](docs/existing_services.md) to set the appropriate `azd` environment variables.

1. Run this single command to provision the resources, deploy the code, and set up integrated vectorization for the sample data:

```shell
@@ -90,15 +101,15 @@ The steps below will provision Azure resources and deploy the application code t
* **Important**: Beware that the resources created by this command will incur immediate costs, primarily from the AI Search resource. These resources may accrue costs even if you interrupt the command before it is fully executed. You can run `azd down` or delete the resources manually to avoid unnecessary spending.
* You will be prompted to select two locations, one for the majority of resources and one for the OpenAI resource, which is currently a short list. That location list is based on the [OpenAI model availability table](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#global-standard-model-availability) and may become outdated as availability changes.
1. After the application has been successfully deployed you will see a URL printed to the console. Navigate to that URL to interact with the app in your browser. To try out the app, click the "Start conversation" button, say "Hello", and then ask a question about your data like "What is the whistleblower policy for Contoso Electronics?" You can also run the app locally by following the instructions in [the next section](#development-server).
## Development server
You can run this app locally using either the Azure services you provisioned by following the [deployment instructions](#deploying-the-app), or by pointing the local app at already [existing services](docs/existing_services.md).
1. If you deployed with `azd up`, you should see an `app/backend/.env` file with the necessary environment variables.
2. If you did *not* use `azd up`, you will need to create an `app/backend/.env` file with the following environment variables:
```shell
AZURE_OPENAI_ENDPOINT=wss://<your instance name>.openai.azure.com
@@ -127,8 +138,10 @@ You can run this app locally using either the Azure services you provisioned by
4. The app is available on [http://localhost:8765](http://localhost:8765).
Once the app is running, when you navigate to the URL above you should see the start screen of the app:
![app screenshot](docs/talktoyourdataapp.png)
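The development server reads its settings from the `app/backend/.env` file described above. As a rough illustration of the kind of `KEY=VALUE` loading involved (a simplified sketch only, with a made-up `demo.env` file and endpoint; the app itself may use a dotenv library):

```python
import os

def load_env_file(path: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Hypothetical example: write a tiny .env file and load it into the process.
with open("demo.env", "w") as f:
    f.write("# local development settings\n\n")
    f.write("AZURE_OPENAI_ENDPOINT=wss://example.openai.azure.com\n")

env = load_env_file("demo.env")
os.environ.update(env)
print(env["AZURE_OPENAI_ENDPOINT"])  # wss://example.openai.azure.com
```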
## Guidance
14 changes: 11 additions & 3 deletions app/backend/app.py
@@ -20,8 +20,6 @@ async def create_app():
llm_endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
llm_deployment = os.environ.get("AZURE_OPENAI_REALTIME_DEPLOYMENT")
llm_key = os.environ.get("AZURE_OPENAI_API_KEY")
search_key = os.environ.get("AZURE_SEARCH_API_KEY")

credential = None
@@ -45,7 +43,17 @@ async def create_app():
"1. Always use the 'search' tool to check the knowledge base before answering a question. \n" + \
"2. Always use the 'report_grounding' tool to report the source of information from the knowledge base. \n" + \
"3. Produce an answer that's as short as possible. If the answer isn't in the knowledge base, say you don't know."
    attach_rag_tools(rtmt,
        credentials=search_credential,
        search_endpoint=os.environ.get("AZURE_SEARCH_ENDPOINT"),
        search_index=os.environ.get("AZURE_SEARCH_INDEX"),
        semantic_configuration=os.environ.get("AZURE_SEARCH_SEMANTIC_CONFIGURATION") or "default",
        identifier_field=os.environ.get("AZURE_SEARCH_IDENTIFIER_FIELD") or "chunk_id",
        content_field=os.environ.get("AZURE_SEARCH_CONTENT_FIELD") or "chunk",
        embedding_field=os.environ.get("AZURE_SEARCH_EMBEDDING_FIELD") or "text_vector",
        title_field=os.environ.get("AZURE_SEARCH_TITLE_FIELD") or "title",
        use_vector_query=(os.environ.get("AZURE_SEARCH_USE_VECTOR_QUERY", "true") == "true")
        )

rtmt.attach_to_app(app, "/realtime")

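The `os.environ.get(...) or "default"` pattern above treats both a missing key and an empty string as unset. For boolean flags, comparing the raw string is the safer shape, since an expression like `(... == "true") or True` would always evaluate to `True`. A standalone illustration (variable names mirror the ones above; values are hypothetical):

```python
import os

# Ensure a clean slate for the demonstration.
os.environ.pop("AZURE_SEARCH_SEMANTIC_CONFIGURATION", None)
os.environ.pop("AZURE_SEARCH_USE_VECTOR_QUERY", None)

# `or` fallback: missing key and empty string both yield the default.
semantic_config = os.environ.get("AZURE_SEARCH_SEMANTIC_CONFIGURATION") or "default"

# Boolean flag: read with a default, then compare the string exactly once.
use_vector = os.environ.get("AZURE_SEARCH_USE_VECTOR_QUERY", "true") == "true"

print(semantic_config, use_vector)  # default True
```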
46 changes: 34 additions & 12 deletions app/backend/ragtools.py
@@ -1,10 +1,11 @@
import re
from typing import Any

from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from azure.search.documents.aio import SearchClient
from azure.search.documents.models import VectorizableTextQuery

from rtmt import RTMiddleTier, Tool, ToolResult, ToolResultDirection

_search_tool_schema = {
@@ -48,33 +49,45 @@
}
}

async def _search_tool(
    search_client: SearchClient,
    semantic_configuration: str,
    identifier_field: str,
    content_field: str,
    embedding_field: str,
    use_vector_query: bool,
    args: Any) -> ToolResult:
print(f"Searching for '{args['query']}' in the knowledge base.")
# Hybrid + Reranking query using Azure AI Search
vector_queries = []
if use_vector_query:
vector_queries.append(VectorizableTextQuery(text=args['query'], k_nearest_neighbors=50, fields=embedding_field))
search_results = await search_client.search(
search_text=args['query'],
query_type="semantic",
semantic_configuration_name=semantic_configuration,
top=5,
        vector_queries=vector_queries,
        select=", ".join([identifier_field, content_field])
        )
result = ""
async for r in search_results:
        result += f"[{r[identifier_field]}]: {r[content_field]}\n-----\n"
return ToolResult(result, ToolResultDirection.TO_SERVER)
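The loop above flattens each search hit into a `[id]: content` segment separated by `-----` markers before handing the text to the model. A standalone sketch of that formatting, using made-up rows and the default field names (`chunk_id` / `chunk`):

```python
# Hypothetical retrieved rows, shaped like Azure AI Search results.
rows = [
    {"chunk_id": "doc1_p1", "chunk": "Employees may report concerns anonymously."},
    {"chunk_id": "doc1_p2", "chunk": "Reports are reviewed within five days."},
]
identifier_field, content_field = "chunk_id", "chunk"

result = ""
for r in rows:
    result += f"[{r[identifier_field]}]: {r[content_field]}\n-----\n"

print(result)
```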

KEY_PATTERN = re.compile(r'^[a-zA-Z0-9_=\-]+$')

# TODO: move from sending all chunks used for grounding eagerly to only sending links to
# the original content in storage, it'll be more efficient overall
async def _report_grounding_tool(search_client: SearchClient, args: Any) -> None:
async def _report_grounding_tool(search_client: SearchClient, identifier_field: str, title_field: str, content_field: str, args: Any) -> None:
sources = [s for s in args["sources"] if KEY_PATTERN.match(s)]
    sources_query = " OR ".join(sources)  # avoid shadowing the built-in `list`
    print(f"Grounding source: {sources_query}")
    # Use search instead of filter to align with how default integrated vectorization indexes
    # are generated, where chunk_id is searchable with a keyword tokenizer, not filterable
    search_results = await search_client.search(search_text=sources_query,
                                                search_fields=[identifier_field],
                                                select=[identifier_field, title_field, content_field],
top=len(sources),
query_type="full")

@@ -84,13 +97,22 @@ async def _report_grounding_tool(search_client: SearchClient, args: Any) -> None

docs = []
async for r in search_results:
        docs.append({"chunk_id": r[identifier_field], "title": r[title_field], "chunk": r[content_field]})
return ToolResult({"sources": docs}, ToolResultDirection.TO_CLIENT)

def attach_rag_tools(rtmt: RTMiddleTier,
credentials: AzureKeyCredential | DefaultAzureCredential,
search_endpoint: str, search_index: str,
semantic_configuration: str,
identifier_field: str,
content_field: str,
embedding_field: str,
title_field: str,
use_vector_query: bool
) -> None:
if not isinstance(credentials, AzureKeyCredential):
credentials.get_token("https://search.azure.com/.default") # warm this up before we start getting requests
search_client = SearchClient(search_endpoint, search_index, credentials, user_agent="RTMiddleTier")

rtmt.tools["search"] = Tool(schema=_search_tool_schema, target=lambda args: _search_tool(search_client, semantic_configuration, identifier_field, content_field, embedding_field, use_vector_query, args))
rtmt.tools["report_grounding"] = Tool(schema=_grounding_tool_schema, target=lambda args: _report_grounding_tool(search_client, identifier_field, title_field, content_field, args))
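Before grounding sources are looked up, `_report_grounding_tool` filters the model-supplied IDs through `KEY_PATTERN` so only well-formed keys reach the search query. A standalone sketch of that sanitization step, with hypothetical source IDs:

```python
import re

# Same pattern as in ragtools.py: letters, digits, underscore, equals, hyphen.
KEY_PATTERN = re.compile(r'^[a-zA-Z0-9_=\-]+$')

# Hypothetical IDs reported by the model; the last two are rejected.
requested = ["doc1_p1", "doc2_p4", "nope; DROP TABLE", "a b"]
sources = [s for s in requested if KEY_PATTERN.match(s)]
query = " OR ".join(sources)

print(query)  # doc1_p1 OR doc2_p4
```

Restricting keys this way keeps untrusted model output from injecting arbitrary query syntax into the search call.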
11 changes: 9 additions & 2 deletions app/backend/setup_intvect.py
@@ -117,11 +117,11 @@ def setup_index(azure_credential, index_name, azure_search_endpoint, azure_stora
semantic_search=SemanticSearch(
configurations=[
SemanticConfiguration(
name="default",
prioritized_fields=SemanticPrioritizedFields(title_field=SemanticField(field_name="title"), content_fields=[SemanticField(field_name="chunk")])
)
],
default_configuration_name="default"
)
)
)
@@ -223,6 +223,13 @@ def upload_documents(azure_credential, indexer_name, azure_search_endpoint, azur

load_azd_env()

logger.info("Checking if we need to set up Azure AI Search index...")
if os.environ.get("AZURE_SEARCH_REUSE_EXISTING") == "true":
logger.info("Since an existing Azure AI Search index is being used, no changes will be made to the index.")
exit()
else:
logger.info("Setting up Azure AI Search index and integrated vectorization...")

# Used to name index, indexer, data source and skillset
AZURE_SEARCH_INDEX = os.environ["AZURE_SEARCH_INDEX"]
AZURE_OPENAI_EMBEDDING_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
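The gate added above skips index setup entirely when `AZURE_SEARCH_REUSE_EXISTING` is `"true"`. The same check can be expressed as a small testable predicate (a sketch, not the script's actual structure):

```python
import os

def should_setup_index() -> bool:
    """Skip index setup when an existing Azure AI Search index is being reused."""
    return os.environ.get("AZURE_SEARCH_REUSE_EXISTING") != "true"

os.environ["AZURE_SEARCH_REUSE_EXISTING"] = "true"
print(should_setup_index())  # False
```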
Expand Down
97 changes: 97 additions & 0 deletions docs/existing_services.md
@@ -0,0 +1,97 @@
# Connecting VoiceRAG to existing services

VoiceRAG can be connected to existing Azure services, such as Azure OpenAI and Azure Search. This guide will show you how to reuse existing services in your Azure subscription.

* [Reuse existing OpenAI real-time deployment](#reuse-existing-openai-real-time-deployment)
* [Reuse existing index from azure-search-openai-demo](#reuse-existing-index-from-azure-search-openai-demo)

## Reuse existing OpenAI real-time deployment

Run these commands _before_ running `azd up`:

1. Run this command to ensure that the [infrastructure](../infra/main.bicep) does not make a brand new OpenAI service:

```bash
azd env set AZURE_OPENAI_REUSE_EXISTING true
```

2. Run this command to ensure that the [infrastructure](../infra/main.bicep) assigns the proper RBAC roles for accessing the OpenAI resource:

```bash
azd env set AZURE_OPENAI_RESOURCE_GROUP <YOUR_RESOURCE_GROUP>
```

3. Run this command to point the app code at your Azure OpenAI endpoint:

```bash
azd env set AZURE_OPENAI_ENDPOINT https://<YOUR_OPENAI_SERVICE>.openai.azure.com
```

4. Run this command to point the app code at your Azure OpenAI real-time deployment. Note that the deployment name may be different from the model name:

```bash
azd env set AZURE_OPENAI_REALTIME_DEPLOYMENT <YOUR_REALTIME_DEPLOYMENT_NAME>
```

## Reuse existing index from azure-search-openai-demo

If you are using the popular RAG solution [azure-search-openai-demo](https://www.github.com/Azure-samples/azure-search-openai-demo), you can connect VoiceRAG to the existing index by setting the following `azd` environment variables.
Run these commands _before_ running `azd up`.

1. Run this command to ensure that the [infrastructure](../infra/main.bicep) does not make a brand new Azure Search service:

```bash
azd env set AZURE_SEARCH_REUSE_EXISTING true
```

2. Run this command to ensure that the [infrastructure](../infra/main.bicep) assigns the proper RBAC roles for accessing the Azure Search resource:

```bash
azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP <YOUR_RESOURCE_GROUP>
```

3. Run this command to point the app code at your Azure Search service:

```bash
azd env set AZURE_SEARCH_ENDPOINT https://<YOUR_SEARCH_SERVICE>.search.windows.net
```

4. Run these commands to point the app code at the existing index and fields:

```bash
azd env set AZURE_SEARCH_SEMANTIC_CONFIGURATION default
azd env set AZURE_SEARCH_IDENTIFIER_FIELD id
azd env set AZURE_SEARCH_CONTENT_FIELD content
azd env set AZURE_SEARCH_TITLE_FIELD sourcepage
azd env set AZURE_SEARCH_EMBEDDING_FIELD embedding
azd env set AZURE_SEARCH_REUSE_EXISTING true
azd env set AZURE_SEARCH_INDEX gptkbindex
```

5. (Optional) Run this command to disable vector search:

```bash
azd env set AZURE_SEARCH_USE_VECTOR_QUERY false
```

This variable is not needed if your search index has a built-in vectorizer,
which was added to the `azure-search-openai-demo` index setup in the October 17, 2024 release.

### Development server

Alternatively, you can first test the solution locally with the `azure-search-openai-demo` index by creating a `.env` file in `app/backend` with contents like the following:

```bash
AZURE_TENANT_ID=<YOUR-TENANT-ID>
AZURE_OPENAI_ENDPOINT=https://<YOUR_OPENAI_ENDPOINT>.openai.azure.com
AZURE_OPENAI_REALTIME_DEPLOYMENT=gpt-4o-realtime-preview
AZURE_SEARCH_ENDPOINT=https://<YOUR_SEARCH_SERVICE>.search.windows.net
AZURE_SEARCH_INDEX=gptkbindex
AZURE_SEARCH_SEMANTIC_CONFIGURATION=default
AZURE_SEARCH_IDENTIFIER_FIELD=id
AZURE_SEARCH_CONTENT_FIELD=content
AZURE_SEARCH_TITLE_FIELD=sourcepage
AZURE_SEARCH_EMBEDDING_FIELD=embedding
```

Then follow the steps in the project's [README](../README.md#development-server) to run the app locally.
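Before starting the dev server, it can help to verify that the key settings are present. A quick sanity-check sketch (the variable list mirrors the sample above and is illustrative, not an authoritative list of everything the app requires):

```python
import os

REQUIRED = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_REALTIME_DEPLOYMENT",
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_INDEX",
)

def missing_settings(env=os.environ):
    """Return the required settings that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]

# Hypothetical partially-filled configuration:
demo = {"AZURE_OPENAI_ENDPOINT": "wss://example.openai.azure.com"}
print(missing_settings(demo))
```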
