Support bringing existing indexes and OpenAI resources (#27)
* Merge

* Support use of existing services

* Dont setup index for existing

* Match parameters to defaults, remove print
pamelafox authored Oct 18, 2024
1 parent 4bee5e7 commit 98a496e
Showing 11 changed files with 287 additions and 59 deletions.
29 changes: 21 additions & 8 deletions README.md
@@ -5,6 +5,15 @@

This repo contains an example of how to implement RAG support in applications that use voice as their user interface, powered by the GPT-4o realtime API for audio. We describe the pattern in more detail in [this blog post](https://aka.ms/voicerag), and you can see this sample app in action in [this short video](https://youtu.be/vXJka8xZ9Ko).

* [Features](#features)
* [Architecture Diagram](#architecture-diagram)
* [Getting Started](#getting-started)
* [GitHub Codespaces](#github-codespaces)
* [VS Code Dev Containers](#vs-code-dev-containers)
* [Local environment](#local-environment)
* [Deploying the app](#deploying-the-app)
* [Development server](#development-server)

## Features

* **Voice interface**: The app uses the browser's microphone to capture voice input, and sends it to the backend where it is processed by the Azure OpenAI GPT-4o Realtime API.
@@ -32,7 +41,7 @@ You can run this repo virtually by using GitHub Codespaces, which will open a we

Once the codespace opens (this may take several minutes), open a new terminal and proceed to [deploy the app](#deploying-the-app).

### VS Code Dev Containers

You can run the project in your local VS Code Dev Container using the [Dev Containers extension](https://marketplace.visualstudio.com/items?itemName=ms-vscode-remote.remote-containers):

@@ -42,7 +51,7 @@ You can run the project in your local VS Code Dev Conta
[![Open in Dev Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/azure-samples/aisearch-openai-rag-audio)
3. In the VS Code window that opens, once the project files show up (this may take several minutes), open a new terminal, and proceed to [deploying the app](#deploying-the-app).

### Local environment

1. Install the required tools:
* [Azure Developer CLI](https://aka.ms/azure-dev/install)
@@ -81,6 +90,8 @@ The steps below will provision Azure resources and deploy the application code t
Enter a name that will be used for the resource group.
This will create a new folder in the `.azure` folder, and set it as the active environment for any calls to `azd` going forward.

1. (Optional) If you want to re-use any existing resources, follow [these instructions](docs/existing_services.md) to set the appropriate `azd` environment variables.

1. Run this single command to provision the resources, deploy the code, and set up integrated vectorization for the sample data:

```shell
@@ -90,15 +101,15 @@ The steps below will provision Azure resources and deploy the application code t
* **Important**: Beware that the resources created by this command will incur immediate costs, primarily from the AI Search resource. These resources may accrue costs even if you interrupt the command before it is fully executed. You can run `azd down` or delete the resources manually to avoid unnecessary spending.
* You will be prompted to select two locations, one for the majority of resources and one for the OpenAI resource, which is currently a short list. That location list is based on the [OpenAI model availability table](https://learn.microsoft.com/azure/ai-services/openai/concepts/models#global-standard-model-availability) and may become outdated as availability changes.
1. After the application has been successfully deployed you will see a URL printed to the console. Navigate to that URL to interact with the app in your browser. To try out the app, click the "Start conversation" button, say "Hello", and then ask a question about your data like "What is the whistleblower policy for Contoso Electronics?" You can also run the app locally by following the instructions in [the next section](#development-server).
## Development server
You can run this app locally using either the Azure services you provisioned by following the [deployment instructions](#deploying-the-app), or by pointing the local app at already [existing services](docs/existing_services.md).
1. If you deployed with `azd up`, you should see an `app/backend/.env` file with the necessary environment variables.
2. If you did *not* use `azd up`, you will need to create an `app/backend/.env` file with the following environment variables:
```shell
AZURE_OPENAI_ENDPOINT=wss://<your instance name>.openai.azure.com
@@ -127,8 +138,10 @@ You can run this app locally using either the Azure services you provisioned by
4. The app is available on [http://localhost:8765](http://localhost:8765).
Once the app is running, when you navigate to the URL above you should see the start screen of the app:
![app screenshot](docs/talktoyourdataapp.png)
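The development server reads its settings from the `app/backend/.env` file described above. As a rough illustration of the kind of `KEY=VALUE` loading involved (a simplified sketch only, with a made-up `demo.env` file and endpoint; the app itself may use a dotenv library):

```python
import os

def load_env_file(path: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and comments."""
    values = {}
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Hypothetical example: write a tiny .env file and load it into the process.
with open("demo.env", "w") as f:
    f.write("# local development settings\n\n")
    f.write("AZURE_OPENAI_ENDPOINT=wss://example.openai.azure.com\n")

env = load_env_file("demo.env")
os.environ.update(env)
print(env["AZURE_OPENAI_ENDPOINT"])  # wss://example.openai.azure.com
```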
## Guidance
14 changes: 11 additions & 3 deletions app/backend/app.py
@@ -20,8 +20,6 @@ async def create_app():
llm_endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
llm_deployment = os.environ.get("AZURE_OPENAI_REALTIME_DEPLOYMENT")
llm_key = os.environ.get("AZURE_OPENAI_API_KEY")
search_key = os.environ.get("AZURE_SEARCH_API_KEY")

credential = None
@@ -45,7 +43,17 @@ async def create_app():
"1. Always use the 'search' tool to check the knowledge base before answering a question. \n" + \
"2. Always use the 'report_grounding' tool to report the source of information from the knowledge base. \n" + \
"3. Produce an answer that's as short as possible. If the answer isn't in the knowledge base, say you don't know."
    attach_rag_tools(rtmt,
        credentials=search_credential,
        search_endpoint=os.environ.get("AZURE_SEARCH_ENDPOINT"),
        search_index=os.environ.get("AZURE_SEARCH_INDEX"),
        semantic_configuration=os.environ.get("AZURE_SEARCH_SEMANTIC_CONFIGURATION") or "default",
        identifier_field=os.environ.get("AZURE_SEARCH_IDENTIFIER_FIELD") or "chunk_id",
        content_field=os.environ.get("AZURE_SEARCH_CONTENT_FIELD") or "chunk",
        embedding_field=os.environ.get("AZURE_SEARCH_EMBEDDING_FIELD") or "text_vector",
        title_field=os.environ.get("AZURE_SEARCH_TITLE_FIELD") or "title",
        use_vector_query=(os.environ.get("AZURE_SEARCH_USE_VECTOR_QUERY", "true") == "true")
        )

rtmt.attach_to_app(app, "/realtime")

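The `os.environ.get(...) or "default"` pattern above treats both a missing key and an empty string as unset. For boolean flags, comparing the raw string is the safer shape, since an expression like `(... == "true") or True` would always evaluate to `True`. A standalone illustration (variable names mirror the ones above; values are hypothetical):

```python
import os

# Ensure a clean slate for the demonstration.
os.environ.pop("AZURE_SEARCH_SEMANTIC_CONFIGURATION", None)
os.environ.pop("AZURE_SEARCH_USE_VECTOR_QUERY", None)

# `or` fallback: missing key and empty string both yield the default.
semantic_config = os.environ.get("AZURE_SEARCH_SEMANTIC_CONFIGURATION") or "default"

# Boolean flag: read with a default, then compare the string exactly once.
use_vector = os.environ.get("AZURE_SEARCH_USE_VECTOR_QUERY", "true") == "true"

print(semantic_config, use_vector)  # default True
```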
46 changes: 34 additions & 12 deletions app/backend/ragtools.py
@@ -1,10 +1,11 @@
import re
from typing import Any

from azure.core.credentials import AzureKeyCredential
from azure.identity import DefaultAzureCredential
from azure.search.documents.aio import SearchClient
from azure.search.documents.models import VectorizableTextQuery

from rtmt import RTMiddleTier, Tool, ToolResult, ToolResultDirection

_search_tool_schema = {
@@ -48,33 +49,45 @@
}
}

async def _search_tool(
    search_client: SearchClient,
    semantic_configuration: str,
    identifier_field: str,
    content_field: str,
    embedding_field: str,
    use_vector_query: bool,
    args: Any) -> ToolResult:
print(f"Searching for '{args['query']}' in the knowledge base.")
# Hybrid + Reranking query using Azure AI Search
vector_queries = []
if use_vector_query:
vector_queries.append(VectorizableTextQuery(text=args['query'], k_nearest_neighbors=50, fields=embedding_field))
search_results = await search_client.search(
search_text=args['query'],
query_type="semantic",
semantic_configuration_name=semantic_configuration,
top=5,
        vector_queries=vector_queries,
        select=", ".join([identifier_field, content_field])
        )
result = ""
async for r in search_results:
        result += f"[{r[identifier_field]}]: {r[content_field]}\n-----\n"
return ToolResult(result, ToolResultDirection.TO_SERVER)
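The loop above flattens each search hit into a `[id]: content` segment separated by `-----` markers before handing the text to the model. A standalone sketch of that formatting, using made-up rows and the default field names (`chunk_id` / `chunk`):

```python
# Hypothetical retrieved rows, shaped like Azure AI Search results.
rows = [
    {"chunk_id": "doc1_p1", "chunk": "Employees may report concerns anonymously."},
    {"chunk_id": "doc1_p2", "chunk": "Reports are reviewed within five days."},
]
identifier_field, content_field = "chunk_id", "chunk"

result = ""
for r in rows:
    result += f"[{r[identifier_field]}]: {r[content_field]}\n-----\n"

print(result)
```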

KEY_PATTERN = re.compile(r'^[a-zA-Z0-9_=\-]+$')

# TODO: move from sending all chunks used for grounding eagerly to only sending links to
# the original content in storage, it'll be more efficient overall
async def _report_grounding_tool(search_client: SearchClient, args: Any) -> None:
async def _report_grounding_tool(search_client: SearchClient, identifier_field: str, title_field: str, content_field: str, args: Any) -> None:
sources = [s for s in args["sources"] if KEY_PATTERN.match(s)]
    sources_query = " OR ".join(sources)  # avoid shadowing the built-in `list`
    print(f"Grounding source: {sources_query}")
    # Use search instead of filter to align with how default integrated vectorization indexes
    # are generated, where chunk_id is searchable with a keyword tokenizer, not filterable
    search_results = await search_client.search(search_text=sources_query,
                                                search_fields=[identifier_field],
                                                select=[identifier_field, title_field, content_field],
top=len(sources),
query_type="full")

@@ -84,13 +97,22 @@ async def _report_grounding_tool(search_client: SearchClient, args: Any) -> None

docs = []
async for r in search_results:
        docs.append({"chunk_id": r[identifier_field], "title": r[title_field], "chunk": r[content_field]})
return ToolResult({"sources": docs}, ToolResultDirection.TO_CLIENT)

def attach_rag_tools(rtmt: RTMiddleTier,
credentials: AzureKeyCredential | DefaultAzureCredential,
search_endpoint: str, search_index: str,
semantic_configuration: str,
identifier_field: str,
content_field: str,
embedding_field: str,
title_field: str,
use_vector_query: bool
) -> None:
if not isinstance(credentials, AzureKeyCredential):
credentials.get_token("https://search.azure.com/.default") # warm this up before we start getting requests
search_client = SearchClient(search_endpoint, search_index, credentials, user_agent="RTMiddleTier")

rtmt.tools["search"] = Tool(schema=_search_tool_schema, target=lambda args: _search_tool(search_client, semantic_configuration, identifier_field, content_field, embedding_field, use_vector_query, args))
rtmt.tools["report_grounding"] = Tool(schema=_grounding_tool_schema, target=lambda args: _report_grounding_tool(search_client, identifier_field, title_field, content_field, args))
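Before grounding sources are looked up, `_report_grounding_tool` filters the model-supplied IDs through `KEY_PATTERN` so only well-formed keys reach the search query. A standalone sketch of that sanitization step, with hypothetical source IDs:

```python
import re

# Same pattern as in ragtools.py: letters, digits, underscore, equals, hyphen.
KEY_PATTERN = re.compile(r'^[a-zA-Z0-9_=\-]+$')

# Hypothetical IDs reported by the model; the last two are rejected.
requested = ["doc1_p1", "doc2_p4", "nope; DROP TABLE", "a b"]
sources = [s for s in requested if KEY_PATTERN.match(s)]
query = " OR ".join(sources)

print(query)  # doc1_p1 OR doc2_p4
```

Restricting keys this way keeps untrusted model output from injecting arbitrary query syntax into the search call.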
11 changes: 9 additions & 2 deletions app/backend/setup_intvect.py
@@ -117,11 +117,11 @@ def setup_index(azure_credential, index_name, azure_search_endpoint, azure_stora
semantic_search=SemanticSearch(
configurations=[
SemanticConfiguration(
name="default",
prioritized_fields=SemanticPrioritizedFields(title_field=SemanticField(field_name="title"), content_fields=[SemanticField(field_name="chunk")])
)
],
default_configuration_name="default"
)
)
)
@@ -223,6 +223,13 @@ def upload_documents(azure_credential, indexer_name, azure_search_endpoint, azur

load_azd_env()

logger.info("Checking if we need to set up Azure AI Search index...")
if os.environ.get("AZURE_SEARCH_REUSE_EXISTING") == "true":
logger.info("Since an existing Azure AI Search index is being used, no changes will be made to the index.")
exit()
else:
logger.info("Setting up Azure AI Search index and integrated vectorization...")

# Used to name index, indexer, data source and skillset
AZURE_SEARCH_INDEX = os.environ["AZURE_SEARCH_INDEX"]
AZURE_OPENAI_EMBEDDING_ENDPOINT = os.environ["AZURE_OPENAI_ENDPOINT"]
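The gate added above skips index setup entirely when `AZURE_SEARCH_REUSE_EXISTING` is `"true"`. The same check can be expressed as a small testable predicate (a sketch, not the script's actual structure):

```python
import os

def should_setup_index() -> bool:
    """Skip index setup when an existing Azure AI Search index is being reused."""
    return os.environ.get("AZURE_SEARCH_REUSE_EXISTING") != "true"

os.environ["AZURE_SEARCH_REUSE_EXISTING"] = "true"
print(should_setup_index())  # False
```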
Expand Down
97 changes: 97 additions & 0 deletions docs/existing_services.md
@@ -0,0 +1,97 @@
# Connecting VoiceRAG to existing services

VoiceRAG can be connected to existing Azure services, such as Azure OpenAI and Azure Search. This guide will show you how to reuse existing services in your Azure subscription.

* [Reuse existing OpenAI real-time deployment](#reuse-existing-openai-real-time-deployment)
* [Reuse existing index from azure-search-openai-demo](#reuse-existing-index-from-azure-search-openai-demo)

## Reuse existing OpenAI real-time deployment

Run these commands _before_ running `azd up`:

1. Run this command to ensure that the [infrastructure](../infra/main.bicep) does not make a brand new OpenAI service:

```bash
azd env set AZURE_OPENAI_REUSE_EXISTING true
```

2. Run this command to ensure that the [infrastructure](../infra/main.bicep) assigns the proper RBAC roles for accessing the OpenAI resource:

```bash
azd env set AZURE_OPENAI_RESOURCE_GROUP <YOUR_RESOURCE_GROUP>
```

3. Run this command to point the app code at your Azure OpenAI endpoint:

```bash
azd env set AZURE_OPENAI_ENDPOINT https://<YOUR_OPENAI_SERVICE>.openai.azure.com
```

4. Run this command to point the app code at your Azure OpenAI real-time deployment. Note that the deployment name may be different from the model name:

```bash
azd env set AZURE_OPENAI_REALTIME_DEPLOYMENT <YOUR_REALTIME_DEPLOYMENT_NAME>
```

## Reuse existing index from azure-search-openai-demo

If you are using the popular RAG solution [azure-search-openai-demo](https://www.github.com/Azure-samples/azure-search-openai-demo), you can connect VoiceRAG to the existing index by setting the following `azd` environment variables.
Run these commands _before_ running `azd up`.

1. Run this command to ensure that the [infrastructure](../infra/main.bicep) does not make a brand new Azure Search service:

```bash
azd env set AZURE_SEARCH_REUSE_EXISTING true
```

2. Run this command to ensure that the [infrastructure](../infra/main.bicep) assigns the proper RBAC roles for accessing the Azure Search resource:

```bash
azd env set AZURE_SEARCH_SERVICE_RESOURCE_GROUP <YOUR_RESOURCE_GROUP>
```

3. Run this command to point the app code at your Azure Search service:

```bash
azd env set AZURE_SEARCH_ENDPOINT https://<YOUR_SEARCH_SERVICE>.search.windows.net
```

4. Run these commands to point the app code at the existing index and fields:

```bash
azd env set AZURE_SEARCH_SEMANTIC_CONFIGURATION default
azd env set AZURE_SEARCH_IDENTIFIER_FIELD id
azd env set AZURE_SEARCH_CONTENT_FIELD content
azd env set AZURE_SEARCH_TITLE_FIELD sourcepage
azd env set AZURE_SEARCH_EMBEDDING_FIELD embedding
azd env set AZURE_SEARCH_REUSE_EXISTING true
azd env set AZURE_SEARCH_INDEX gptkbindex
```

5. (Optional) Run this command to disable vector search:

```bash
azd env set AZURE_SEARCH_USE_VECTOR_QUERY false
```

This variable is not needed if your search index has a built-in vectorizer,
which was added to the `azure-search-openai-demo` index setup in the October 17, 2024 release.

### Development server

Alternatively, you can first test the solution locally with the `azure-search-openai-demo` index by creating a `.env` file in `app/backend` with contents like the following:

```bash
AZURE_TENANT_ID=<YOUR-TENANT-ID>
AZURE_OPENAI_ENDPOINT=https://<YOUR_OPENAI_ENDPOINT>.openai.azure.com
AZURE_OPENAI_REALTIME_DEPLOYMENT=gpt-4o-realtime-preview
AZURE_SEARCH_ENDPOINT=https://<YOUR_SEARCH_SERVICE>.search.windows.net
AZURE_SEARCH_INDEX=gptkbindex
AZURE_SEARCH_SEMANTIC_CONFIGURATION=default
AZURE_SEARCH_IDENTIFIER_FIELD=id
AZURE_SEARCH_CONTENT_FIELD=content
AZURE_SEARCH_TITLE_FIELD=sourcepage
AZURE_SEARCH_EMBEDDING_FIELD=embedding
```

Then follow the steps in the project's [README](../README.md#development-server) to run the app locally.
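Before starting the dev server, it can help to verify that the key settings are present. A quick sanity-check sketch (the variable list mirrors the sample above and is illustrative, not an authoritative list of everything the app requires):

```python
import os

REQUIRED = (
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_REALTIME_DEPLOYMENT",
    "AZURE_SEARCH_ENDPOINT",
    "AZURE_SEARCH_INDEX",
)

def missing_settings(env=os.environ):
    """Return the required settings that are absent or empty."""
    return [name for name in REQUIRED if not env.get(name)]

# Hypothetical partially-filled configuration:
demo = {"AZURE_OPENAI_ENDPOINT": "wss://example.openai.azure.com"}
print(missing_settings(demo))
```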
