This sample showcases a powerful Tech Support Assistant that remembers your device details and past issues, eliminating the frustration of repeating yourself. By leveraging semantic memories, this agent delivers a more personalized and efficient support experience. It demonstrates using the Memory Module to extract semantic memories from the user's messages and use them to answer questions more efficiently.
Check out the LinkedIn post for a video of the sample in action.
An initial example with the tech support assistant.
A followup example with the tech support assistant using memory to answer questions more efficiently.
- Remembers your device details - No more repeating your device type, OS, or year
- Personalized support experience - Conversations that feel continuous, not disconnected
- Efficient problem-solving - Faster resolutions by building on past interactions
- Seamless memory integration - Demonstrates practical implementation of memory in AI assistants
See tech_assistant_agent for more details on the tech support assistant agent. Its prompts are especially helpful to understand how this agent works.
Adding memory to an agent has a number of benefits:
- Contextual Understanding: Memory allows agents to maintain the context of previous interactions, enabling them to make more informed decisions and provide more accurate responses. It reduces the need to supply and rehydrate context from scratch on every turn, as traditional chatbots require.
- Simpler interactions: Memory allows agents to remember information about the user, such as their name, preferences, and history, leading to more personalized and engaging experiences. In this sample, the agent can remember the user's device type, operating system, and device year so they don't have to ask for it every time.
- Enhanced User Experience: Because preferences and history persist across conversations, the experience feels continuous rather than starting over in every session.
In this sample, the agent is given access to the memory module as an explicit tool call to retrieve semantic memories when needed. Working memory, on the other hand, is fetched before the functional loop begins.
Note
The memory extraction happens in the background whenever messages are received or sent by the agent.
The sample is initialized with a list of topics that it cares about. These are topics that the agent wants to remember about the user. Specifically, they are:
- Device Type
- Operating System
- Device Year
See tools.py for the definition of the topics.
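As an illustration, a topic list along these lines might be declared as follows; the `Topic` class and its exact fields are assumptions here, and tools.py remains the authoritative definition.

```python
# Illustrative sketch only -- the Topic class and its fields are assumed here;
# see tools.py in the sample for the actual topic definitions.
from teams_memory import Topic  # assumed import path

TOPICS = [
    Topic(
        name="Device Type",
        description="The kind of device the user has, e.g. laptop, desktop, or phone.",
    ),
    Topic(
        name="Operating System",
        description="The operating system running on the user's device.",
    ),
    Topic(
        name="Device Year",
        description="The year the user's device was made or purchased.",
    ),
]
```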
When you initialize the `MemoryMiddleware`, it starts recording all messages that are incoming to or outgoing from the bot. These messages are then used by the agent as working memory and are also used for extracting long-term memories.
Setting up the middleware also gives us access to a scoped version of the `memory_module` from the `TurnContext`. This memory module is scoped to the conversation that the `TurnContext` is built for.
See bot.py for the initialization of the `MemoryMiddleware`.
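A rough sketch of what that wiring can look like is shown below; the `MemoryModuleConfig` fields (beyond `topics` and `timeout_seconds`) and the `TurnContext` accessor are assumptions, so treat bot.py as the source of truth.

```python
# Sketch only: apart from topics and timeout_seconds, the config fields and the
# TurnContext accessor are assumptions -- see bot.py for the real initialization.
from teams_memory import MemoryMiddleware, MemoryModuleConfig  # assumed imports


def add_memory_middleware(adapter, topics):
    config = MemoryModuleConfig(
        topics=topics,       # the topics defined in tools.py
        timeout_seconds=60,  # how often buffered messages are considered for extraction
    )
    middleware = MemoryMiddleware(config=config)
    # Registering the middleware on the bot adapter makes it record every
    # incoming and outgoing message for working memory and extraction.
    adapter.use(middleware)
    return middleware


# Inside a turn handler, the conversation-scoped memory module can then be
# read off the TurnContext (the exact accessor is an assumption):
#   memory_module = turn_context.get("memory_module")
```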
Tip
You'll notice that in this sample, `timeout_seconds` is set to 60. Extraction is deliberately aggressive (extract every minute if there is a message in the conversation) to demonstrate memory extraction, but a higher threshold is reasonable in practice.
The Memory Module can be set up to automatically extract long-term memories from the working memory. When the application server starts up, calling `memory_middleware.memory_module.listen()` starts triggering memory extraction based on the configuration passed when the `MemoryMiddleware` (or `MemoryModule`) was initialized. This work happens in a background thread and is non-blocking.
See app.py for the initialization of the `MemoryMiddleware`. Note that when `listen` is called, you should also call `shutdown` when the application is shutting down.
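In sketch form, the lifecycle hooks might look like this; the aiohttp wiring and the exact `listen`/`shutdown` signatures are assumptions, so refer to app.py for the actual startup code.

```python
# Sketch: start background extraction when the web server starts and stop it on
# shutdown. memory_middleware is assumed to be the instance created in bot.py,
# and the listen/shutdown signatures are assumptions.
from aiohttp import web

from bot import memory_middleware  # assumed module layout


async def on_startup(app: web.Application) -> None:
    # Begins extracting long-term memories in the background (non-blocking).
    # Await this call instead if the API exposes listen() as a coroutine.
    memory_middleware.memory_module.listen()


async def on_shutdown(app: web.Application) -> None:
    # Stop the background extraction work when the server shuts down.
    memory_middleware.memory_module.shutdown()


app = web.Application()
app.on_startup.append(on_startup)
app.on_shutdown.append(on_shutdown)
```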
Note
The alternative to automatic extraction is explicit extraction. This can be accomplished by calling `memory_module.process_messages`, which processes all the messages currently in the message buffer. Check out the Memory Module documentation for more details.
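A minimal sketch of such an explicit trigger, assuming `process_messages` takes a conversation identifier (check the Memory Module documentation for the exact signature):

```python
async def extract_memories_now(memory_module, conversation_id: str) -> None:
    # Explicitly drain the message buffer for a conversation and run
    # extraction immediately, instead of waiting on the background listener.
    # The conversation_id argument is an assumption; see the Memory Module
    # docs for the exact signature of process_messages.
    await memory_module.process_messages(conversation_id)
```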
The agent can use the conversational messages as working memory to build up context for its LLM calls. In addition to the incoming and outgoing messages, the agent can also add internal messages to the working memory.
See primary_agent.py for how working memory is used, and also how internal messages are added to the working memory.
Note
To demonstrate long-term memory, the working memory only includes the last 1 minute of messages. Feel free to configure this to be longer. See primary_agent.py for the configuration.
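Conceptually, building the LLM context from working memory looks something like the sketch below; the retrieval method name and its time-window parameter are assumptions, and primary_agent.py shows the actual call and configuration.

```python
from datetime import datetime, timedelta, timezone


async def build_llm_messages(memory_module):
    # Only pull recent messages as working memory (the sample uses a 1-minute
    # window). The method name and parameters here are assumptions; see
    # primary_agent.py for the real retrieval call and configuration.
    cutoff = datetime.now(timezone.utc) - timedelta(minutes=1)
    history = await memory_module.retrieve_conversation_history(after=cutoff)

    # Convert stored messages (incoming, outgoing, and any internal messages
    # the agent added) into chat-completion style messages for the LLM call.
    return [{"role": message.role, "content": message.content} for message in history]
```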
The tech support assistant can search for memories from a tool call (see get_memorized_fields). In this tool call, the agent searches memories for a given topic. Depending on whether memories are found, the agent can either continue asking the user for the information or proceed with the flow (like confirming the memories).
This usage of memory is explicit, as it requires the LLM to explicitly seek memories via a tool call. Another approach is implicit memory: similar to working memory, the application could automatically search for memories that it deems always necessary to the task and include them in the system prompt. The search in this case could be done for a particular topic or query.
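A simplified sketch of such an explicit tool, assuming a `search_memories` call that filters by topic (the sample's get_memorized_fields is the authoritative version):

```python
async def get_memorized_fields(memory_module, topic_name: str):
    # Explicit memory usage: the LLM chooses to call this tool when it needs
    # a field such as "Device Type". search_memories and the memory attributes
    # are assumptions; see the sample's get_memorized_fields for the real query.
    memories = await memory_module.search_memories(topic=topic_name)
    if not memories:
        # Nothing memorized yet -- the agent should ask the user instead.
        return None
    return [memory.content for memory in memories]
```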
If the agent finds memories that are relevant to the task at hand, the tech support assistant can ask for confirmation of the memories and cite the original sources of the memories.
See confirm_memorized_fields for the implementation of the tool call.
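A sketch of what confirmation with citations can look like; the memory attributes (`content`, `message_attributions`) and the `get_messages` lookup are assumptions, so defer to the sample's implementation.

```python
async def confirm_memorized_fields(memory_module, memories) -> str:
    # Build a confirmation message that cites the original messages each
    # memory was extracted from. The attribute names and get_messages call
    # are assumptions; confirm_memorized_fields in the sample shows the real shape.
    lines = []
    for memory in memories:
        sources = await memory_module.get_messages(memory.message_attributions)
        citations = "; ".join(f'"{source.content}"' for source in sources)
        lines.append(f"- {memory.content} (from: {citations})")
    return "I have the following on file. Is it still accurate?\n" + "\n".join(lines)
```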
Prerequisites
To run the template in your local dev machine, you will need:
- Python, version 3.8 to 3.11.
- Python extension, version v2024.0.1 or higher.
- Teams Toolkit Visual Studio Code Extension latest version or Teams Toolkit CLI.
- An account with OpenAI.
- Node.js (supported versions: 16, 18) for local debug in Test Tool.
- Create a `.env` file in the root folder. Copy the template below into it.
# AZURE CONFIG
AZURE_OPENAI_API_KEY=<API key>
AZURE_OPENAI_DEPLOYMENT=gpt-4o
AZURE_OPENAI_EMBEDDING_DEPLOYMENT=text-embedding-3-small
AZURE_OPENAI_API_BASE=https://<domain name>.openai.azure.com
AZURE_OPENAI_API_VERSION=<version number>
# OPENAI CONFIG
OPENAI_MODEL_NAME=gpt-4o
OPENAI_API_KEY=<API key>
OPENAI_EMBEDDING_MODEL_NAME=text-embedding-3-small
Remember that these are also used by the Memory Module to extract and retrieve memories.
Fill out only one of the Azure OpenAI and OpenAI configurations.
- Open a new terminal under the root folder.
- Run `npm install -g @microsoft/teamsapp-cli`
- Run `uv sync`
- Run `.venv\Scripts\Activate`
- Run `python src/app.py`

If successful, the server will start on http://localhost:3978.
- Open another new terminal under the root folder.
- Install the Teams app test tool (if you haven't already done that):
  - Run `mkdir -p src/devTools/teamsapptester` (or `New-Item -ItemType Directory -Path src/devTools/teamsapptester -Force` on PowerShell)
  - Run `npm i @microsoft/teams-app-test-tool --prefix "src/devTools/teamsapptester"`
- Run `node src/devTools/teamsapptester/node_modules/@microsoft/teams-app-test-tool/cli.js start`

If successful, a test website will show up.
- Open a new terminal under the root folder.
- Run `uv sync`
- Run `.venv\Scripts\Activate`
- Open this folder as a VSCode workspace.
- Navigate to the `Run and Debug` tab in VSCode, and select `Debug in Teams (Edge)`. This will start the flow to sideload the bot into Teams, start the server locally, and start the tunnel that exposes the server to the web.
Currently the scaffolding only supports Azure OpenAI-related configurations, but it can easily be updated to support an OpenAI configuration.
- Open a new terminal under the root folder.
- Run `uv sync`
- Run `.venv\Scripts\Activate`
- Build the memory module into a distribution file with `uv build packages/teams_memory`. This should create the artifact `dist/teams_memory-0.1.0.tar.gz`. Copy it into the `src/dist/` folder.
- Open this folder as a VSCode workspace.
- Copy the contents of the `.env` file and add them to the `env/.env.dev.user` file.
- Navigate to the Teams Toolkit extension in VSCode.
- Under `Lifecycle`, first click `Provision` to provision resources to Azure.
- Then click `Deploy`; this deploys the project to the Azure App Service instance and runs the start-up script.
- If the above two steps completed successfully, click `Publish`. This will create an app package in `./appPackage/build/appPackage.dev.zip`.
- Sideload the app package in Teams and start chatting with the bot.
Congratulations! 🎉 You are running an application that can now interact with users in Teams: