diff --git a/docs/ai/fastapi_gelai_searchbot.rst b/docs/ai/fastapi_gelai_searchbot.rst
new file mode 100644
index 00000000000..673648a04a7
--- /dev/null
+++ b/docs/ai/fastapi_gelai_searchbot.rst
@@ -0,0 +1,1716 @@
+.. _ref_guide_fastapi_gelai_searchbot:
+
+===================
+FastAPI (Searchbot)
+===================
+
+:edb-alt-title: Building a search bot with memory using FastAPI and Gel AI
+
+In this tutorial we're going to walk you through building a chat bot with search
+capabilities using Gel and `FastAPI <https://fastapi.tiangolo.com/>`_.
+
+FastAPI is a framework designed to help you build web apps *fast*. Gel is a
+data layer designed to help you figure out storage in your application - also
+*fast*. By the end of this tutorial, you will have tried out different aspects
+of using those two together.
+
+We will start by creating an app with FastAPI, adding web search capabilities,
+and then putting search results through a language model to get a
+human-friendly answer. After that, we'll use Gel to implement chat history so
+that the bot remembers previous interactions with the user. We'll finish it off
+with semantic search-based cross-chat memory.
+
+The end result is going to look something like this:
+
+.. image::
+    /docs/tutorials/placeholder.png
+    :alt: Placeholder
+    :width: 100%
+
+1. Initialize the project
+=========================
+
+.. edb:split-section::
+
+   We're going to start by installing `uv <https://docs.astral.sh/uv/>`_ - a Python
+   package manager that's going to simplify environment management for us. You can
+   follow their `installation instructions
+   <https://docs.astral.sh/uv/getting-started/installation/>`_ or simply run:
+
+   .. code-block:: bash
+
+      $ curl -LsSf https://astral.sh/uv/install.sh | sh
+
+.. edb:split-section::
+
+   Once that is done, we can use uv to create scaffolding for our project following
+   the `documentation <https://docs.astral.sh/uv/concepts/projects/>`_:
+
+   .. code-block:: bash
+
+      $ uv init searchbot \
+        && cd searchbot
+
+.. edb:split-section::
+
+   For now, we know we're going to need Gel and FastAPI, so let's add those
+   following uv's instructions on `managing dependencies
+   <https://docs.astral.sh/uv/concepts/projects/dependencies/>`_,
+   as well as FastAPI's `installation docs
+   <https://fastapi.tiangolo.com/#installation>`_. Running ``uv sync`` after
+   that will create our virtual environment in a ``.venv`` directory and ensure
+   it's ready. As the last step, we'll activate the environment and get started.
+
+   .. note::
+
+      Every time you open a new terminal session, you should source the
+      environment before running ``python``, ``gel`` or ``fastapi`` commands.
+
+   .. code-block:: bash
+
+      $ uv add "fastapi[standard]" \
+        && uv add gel \
+        && uv sync \
+        && source .venv/bin/activate
+
+
+2. Get started with FastAPI
+===========================
+
+.. edb:split-section::
+
+   At this stage we need to follow FastAPI's `tutorial
+   <https://fastapi.tiangolo.com/tutorial/>`_ to create the foundation of our app.
+
+   We're going to make a minimal web API with one endpoint that takes in a user
+   query as an input and echoes it as an output. First, let's make a directory
+   called ``app`` in our project root, and put an empty ``__init__.py`` there.
+
+   .. code-block:: bash
+
+      $ mkdir app && touch app/__init__.py
+
+.. edb:split-section::
+
+   Now let's create a file called ``main.py`` inside the ``app`` directory and put
+   the "Hello World" example in it:
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      from fastapi import FastAPI
+
+      app = FastAPI()
+
+
+      @app.get("/")
+      async def root():
+          return {"message": "Hello World"}
+
+
+.. edb:split-section::
+
+   To start the server, we'll run:
+
+   .. code-block:: bash
+
+      $ fastapi dev app/main.py
+
+
+.. edb:split-section::
+
+   Once the server gets up and running, we can make sure it works using FastAPI's
+   built-in UI at ``http://127.0.0.1:8000/docs``, or manually with ``curl``:
+
+   .. code-block:: bash
+
+      $ curl -X 'GET' \
+        'http://127.0.0.1:8000/' \
+        -H 'accept: application/json'
+
+      {"message":"Hello World"}
+
+
+.. edb:split-section::
+
+   Now, to create the search endpoint we mentioned earlier, we need to pass our
+   query as a parameter to it. We'd prefer to have it in the request's body
+   since user messages can be long.
+
+   In FastAPI land, this is done by creating a Pydantic schema and making it the
+   type of the input parameter. `Pydantic <https://docs.pydantic.dev/>`_ is
+   a data validation library for Python. It has many features, but we don't
+   actually need to know about them for now. All we need to know is that FastAPI
+   uses Pydantic types to automatically figure out schemas for `input
+   <https://fastapi.tiangolo.com/tutorial/body/>`_, as well as `output
+   <https://fastapi.tiangolo.com/tutorial/response-model/>`_.
+
+   Let's add the following to our ``main.py``:
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      from pydantic import BaseModel
+
+
+      class SearchTerms(BaseModel):
+          query: str
+
+      class SearchResult(BaseModel):
+          response: str | None = None
+
+
+.. edb:split-section::
+
+   Now, we can define our endpoint. We'll set the two classes we just created as
+   the new endpoint's argument and return type.
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      @app.post("/search")
+      async def search(search_terms: SearchTerms) -> SearchResult:
+          return SearchResult(response=search_terms.query)
+
+
+.. edb:split-section::
+
+   Same as before, we can test the endpoint using the UI, or by sending a request
+   with ``curl``:
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://127.0.0.1:8000/search' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{ "query": "string" }'
+
+      {
+        "response": "string"
+      }
+
+3. Implement web search
+=======================
+
+Now that we have our web app infrastructure in place, let's add some substance
+to it by implementing web search capabilities.
+
+.. edb:split-section::
+
+   There are many powerful, feature-rich products for LLM-driven web search, but
+   in this tutorial we're going to use a much more reliable source of real-world
+   information: comment threads on `Hacker News
+   <https://news.ycombinator.com/>`_. Their `web API
+   <https://hn.algolia.com/api>`_ is free of charge and doesn't require an
+   account. Below is a simple function that requests a full-text search for a
+   string query and extracts a nice sampling of comment threads from each of the
+   stories that came up in the result.
+
+   We are not going to cover this code sample in too much depth. Feel free to grab
+   it and save it to ``app/web.py``, or make your own.
+
+   Notice that we've created another Pydantic type called ``WebSource`` to store
+   our web search results. There's no framework-related reason for that; it's just
+   nicer than passing dictionaries around.
+
+   .. code-block:: python
+      :caption: app/web.py
+      :class: collapsible
+
+      import requests
+      from pydantic import BaseModel
+      from datetime import datetime
+      import html
+
+
+      class WebSource(BaseModel):
+          """Type that stores search results."""
+
+          url: str | None = None
+          title: str | None = None
+          text: str | None = None
+
+
+      def extract_comment_thread(
+          comment: dict,
+          max_depth: int = 3,
+          current_depth: int = 0,
+          max_children=3,
+      ) -> list[str]:
+          """
+          Recursively extract comments from a thread up to max_depth.
+          Returns a list of formatted comment strings.
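+          Each comment is rendered as "[YYYY-MM-DD HH:MM] author: text",
+          indented with one extra space per level of nesting.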
+ """ + if not comment or current_depth > max_depth: + return [] + + results = [] + + # Get timestamp, author and the body of the comment, + # then pad it with spaces so that it's offset appropriately for its depth + + if comment["text"]: + timestamp = datetime.fromisoformat(comment["created_at"].replace("Z", "+00:00")) + author = comment["author"] + text = html.unescape(comment["text"]) + formatted_comment = f"[{timestamp.strftime('%Y-%m-%d %H:%M')}] {author}: {text}" + results.append((" " * current_depth) + formatted_comment) + + # If there're children comments, we are going to extract them too, + # and add them to the list. + + if comment.get("children"): + for child in comment["children"][:max_children]: + child_comments = extract_comment_thread(child, max_depth, current_depth + 1) + results.extend(child_comments) + + return results + + + def fetch_web_sources(query: str, limit: int = 5) -> list[WebSource]: + """ + For a given query perform a full-text search for stories on Hacker News. + From each of the matched stories extract the comment thread and format it into a single string. + For each story return its title, url and comment thread. + """ + search_url = "http://hn.algolia.com/api/v1/search_by_date?numericFilters=num_comments>0" + + # Search for stories + response = requests.get( + search_url, + params={ + "query": query, + "tags": "story", + "hitsPerPage": limit, + "page": 0, + }, + ) + + response.raise_for_status() + search_result = response.json() + + # For each search hit fetch and process the story + web_sources = [] + for hit in search_result.get("hits", []): + item_url = f"https://hn.algolia.com/api/v1/items/{hit['story_id']}" + response = requests.get(item_url) + response.raise_for_status() + item_result = response.json() + + site_url = f"https://news.ycombinator.com/item?id={hit['story_id']}" + title = hit["title"] + comments = extract_comment_thread(item_result) + text = "\n".join(comments) if len(comments) > 0 else None + web_sources.append( + WebSource(url=site_url, title=title, text=text) + ) + + return web_sources + + + if __name__ == "__main__": + web_sources = fetch_web_sources("edgedb", limit=5) + + for source in web_sources: + print(source.url) + print(source.title) + print(source.text) + + +.. edb:split-section:: + + One more note: this snippet comes with an extra dependency called ``requests``, + which is a library for making HTTP requests. Let's add it by running: + + .. code-block:: bash + + $ uv add requests + + +.. edb:split-section:: + + Now, we can test our web search on its own by running it like this: + + .. code-block:: bash + + $ python3 app/web.py + + +.. edb:split-section:: + + It's time to reflect the new capabilities in our web app. + + .. code-block:: python + :caption: app/main.py + + from .web import fetch_web_sources, WebSource + + async def search_web(query: str) -> list[WebSource]: + raw_sources = fetch_web_sources(query, limit=5) + return [s for s in raw_sources if s.text is not None] + + +.. edb:split-section:: + + Now we can update the ``/search`` endpoint as follows: + + .. code-block:: python-diff + :caption: app/main.py + + class SearchResult(BaseModel): + response: str | None = None + + sources: list[WebSource] | None = None + + + @app.post("/search") + async def search(search_terms: SearchTerms) -> SearchResult: + + web_sources = await search_web(search_terms.query) + - return SearchResult(response=search_terms.query) + + return SearchResult( + + response=search_terms.query, sources=web_sources + + ) + + +4. 
+4. Connect to the LLM
+=====================
+
+Now that we're capable of scraping text from search results, we can forward
+those results to the LLM to get a nice-looking summary.
+
+.. edb:split-section::
+
+   There are a million different LLMs accessible via a web API (`one
+   `_, `two
+   `_, `three
+   `_, `four `_ to name
+   a few), feel free to choose whichever you prefer. In this tutorial we will
+   roll with OpenAI, primarily for how ubiquitous it is. To keep things somewhat
+   provider-agnostic, we're going to get completions via raw HTTP requests.
+   Let's grab API descriptions from OpenAI's `API documentation
+   <https://platform.openai.com/docs/api-reference/chat>`_, and set up
+   LLM generation like this:
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      import os
+      import requests
+      from dotenv import load_dotenv
+
+      _ = load_dotenv()
+
+
+      def get_llm_completion(system_prompt: str, messages: list[dict[str, str]]) -> str:
+          api_key = os.getenv("OPENAI_API_KEY")
+          url = "https://api.openai.com/v1/chat/completions"
+          headers = {"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"}
+
+          response = requests.post(
+              url,
+              headers=headers,
+              json={
+                  "model": "gpt-4o-mini",
+                  "messages": [
+                      {"role": "developer", "content": system_prompt},
+                      *messages,
+                  ],
+              },
+          )
+          response.raise_for_status()
+          result = response.json()
+          return result["choices"][0]["message"]["content"]
+
+
+.. edb:split-section::
+
+   Note that this cloud LLM API (and many others) requires a secret key to be
+   set as an environment variable. A common way to manage those is to use the
+   ``python-dotenv`` library in combination with a ``.env`` file. Feel free to
+   browse `the readme
+   <https://github.com/theskumar/python-dotenv>`_
+   to learn more. Create a file called ``.env`` in the root directory and put
+   your API key in there:
+
+   .. code-block:: .env
+      :caption: .env
+
+      OPENAI_API_KEY="sk-..."
+
+
+.. edb:split-section::
+
+   Don't forget to add the new dependency to the environment:
+
+   .. code-block:: bash
+
+      $ uv add python-dotenv
+
+
+.. edb:split-section::
+
+   And now we can integrate this LLM-related code with the rest of the app. First,
+   let's set up a function that prepares LLM inputs:
+
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      async def generate_answer(
+          query: str,
+          web_sources: list[WebSource],
+      ) -> SearchResult:
+          system_prompt = (
+              "You are a helpful assistant that answers user's questions"
+              + " by finding relevant information in Hacker News threads."
+              + " When answering the question, describe conversations that people have around the subject,"
+              + " provided to you as a context, or say i don't know if they are completely irrelevant."
+          )
+
+          prompt = f"User search query: {query}\n\nWeb search results:\n"
+
+          for i, source in enumerate(web_sources):
+              prompt += f"Result {i} (URL: {source.url}):\n"
+              prompt += f"{source.text}\n\n"
+
+          messages = [{"role": "user", "content": prompt}]
+
+          llm_response = get_llm_completion(
+              system_prompt=system_prompt,
+              messages=messages,
+          )
+
+          search_result = SearchResult(
+              response=llm_response,
+              sources=web_sources,
+          )
+
+          return search_result
+
+
+.. edb:split-section::
+
+   Then we can plug that function into the ``/search`` endpoint:
+
+   .. code-block:: python-diff
+      :caption: app/main.py
+
+        @app.post("/search")
+        async def search(search_terms: SearchTerms) -> SearchResult:
+            web_sources = await search_web(search_terms.query)
+      +     search_result = await generate_answer(search_terms.query, web_sources)
+      +     return search_result
+      -     return SearchResult(
+      -         response=search_terms.query, sources=web_sources
+      -     )
+
+
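+.. edb:split-section::
+
+   By the way, if the endpoint ever misbehaves, it can help to smoke-test
+   ``get_llm_completion`` on its own, without going through FastAPI. Here's a
+   quick one-liner for that (it assumes your ``OPENAI_API_KEY`` is set in
+   ``.env``, and the exact reply will obviously vary):
+
+   .. code-block:: bash
+
+      $ python3 -c 'from app.main import get_llm_completion; print(get_llm_completion("Be terse.", [{"role": "user", "content": "Say hello."}]))'
+
+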
+.. edb:split-section::
+
+   And now we can test the result as usual.
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://127.0.0.1:8000/search' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{ "query": "gel" }'
+
+
+5. Use Gel to implement chat history
+====================================
+
+So far we've built an application that can take in a query, fetch some Hacker
+News threads for it, sift through them using an LLM, and generate a nice
+summary.
+
+However, right now it's hardly user-friendly since you have to speak in
+keywords and basically start over every time you want to refine the query. To
+enable a more organic multi-turn interaction, we need to add chat history and
+infer the query from the context of the entire conversation.
+
+Now's a good time to introduce Gel.
+
+.. edb:split-section::
+
+   In case you need installation instructions, take a look at the :ref:`Quickstart
+   `. Once the Gel CLI is present on your system, initialize the
+   project like this:
+
+   .. code-block:: bash
+
+      $ gel project init --non-interactive
+
+
+This command is going to put some project scaffolding inside our app, spin up a
+local instance of Gel, and then link the two together. From now on, all
+Gel-related things that happen inside our project directory are going to be
+automatically run on the correct database instance, no need to worry about
+connection incantations.
+
+
+Defining the schema
+-------------------
+
+The database :ref:`schema ` in Gel is defined
+declaratively. The :ref:`gel project init `
+command has created a file called ``dbschema/default.esdl``, which we're going to
+use to define our types.
+
+.. edb:split-section::
+
+   We obviously want to keep track of the messages, so we need to represent
+   those in the schema. By convention established in the LLM space, each message
+   is going to have a role in addition to the message content itself. We can
+   also get Gel to automatically keep track of a message's creation time by adding
+   a property called ``timestamp`` and setting its :ref:`default value
+   ` to the output of the :ref:`datetime_current()
+   ` function. Finally, LLM messages in our search bot have
+   source URLs associated with them. Let's keep track of those too, by adding a
+   :ref:`multi-property `.
+
+   .. code-block:: sdl
+      :caption: dbschema/default.esdl
+
+      type Message {
+          role: str;
+          body: str;
+          timestamp: datetime {
+              default := datetime_current();
+          }
+          multi sources: str;
+      }
+
+
+.. edb:split-section::
+
+   Messages are grouped together into a chat, so let's add that entity to our
+   schema too.
+
+   .. code-block:: sdl
+      :caption: dbschema/default.esdl
+
+      type Chat {
+          multi messages: Message;
+      }
+
+
+.. edb:split-section::
+
+   And chats all belong to a certain user, making up their chat history. One other
+   thing we'd like to keep track of about our users is their username, and it
+   makes sense to ensure it's unique by using an ``exclusive``
+   :ref:`constraint `.
+
+   .. code-block:: sdl
+      :caption: dbschema/default.esdl
+
+      type User {
+          name: str {
+              constraint exclusive;
+          }
+          multi chats: Chat;
+      }
+
+
+.. edb:split-section::
+
+   We're going to keep our schema super simple. One cool thing about Gel is that
+   it will enable us to easily implement advanced features such as authentication
+   or AI down the road, but we'll come back to that later.
+
+   For now, this is the entire schema we came up with:
+
+   .. code-block:: sdl
+      :caption: dbschema/default.esdl
+
+      module default {
+          type Message {
+              role: str;
+              body: str;
+              timestamp: datetime {
+                  default := datetime_current();
+              }
+              multi sources: str;
+          }
+
+          type Chat {
+              multi messages: Message;
+          }
+
+          type User {
+              name: str {
+                  constraint exclusive;
+              }
+              multi chats: Chat;
+          }
+      }
+
+
+.. edb:split-section::
+
+   Let's use the :ref:`gel migration create ` CLI
+   command, followed by :ref:`gel migrate ` in order to
+   migrate to our new schema and proceed to writing some queries.
+
+   .. code-block:: bash
+
+      $ gel migration create
+      $ gel migrate
+
+
+.. edb:split-section::
+
+   Now that our schema is applied, let's quickly populate the database with some
+   fake data in order to be able to test the queries. We're going to explore
+   writing queries in a bit, but for now you can just run the following command in
+   the shell:
+
+   .. code-block:: bash
+      :class: collapsible
+
+      $ mkdir app/sample_data && cat << 'EOF' > app/sample_data/inserts.edgeql
+      # Create users first
+      insert User {
+          name := 'alice',
+      };
+      insert User {
+          name := 'bob',
+      };
+      # Insert chat histories for Alice
+      update User
+      filter .name = 'alice'
+      set {
+          chats := {
+              (insert Chat {
+                  messages := {
+                      (insert Message {
+                          role := 'user',
+                          body := 'What are the main differences between GPT-3 and GPT-4?',
+                          timestamp := <datetime>'2024-01-07T10:00:00Z',
+                          sources := {'arxiv:2303.08774', 'openai.com/research/gpt-4'}
+                      }),
+                      (insert Message {
+                          role := 'assistant',
+                          body := 'The key differences include improved reasoning capabilities, better context understanding, and enhanced safety features...',
+                          timestamp := <datetime>'2024-01-07T10:00:05Z',
+                          sources := {'openai.com/blog/gpt-4-details', 'arxiv:2303.08774'}
+                      })
+                  }
+              }),
+              (insert Chat {
+                  messages := {
+                      (insert Message {
+                          role := 'user',
+                          body := 'Can you explain what policy gradient methods are in RL?',
+                          timestamp := <datetime>'2024-01-08T14:30:00Z',
+                          sources := {'Sutton-Barto-RL-Book-Ch13', 'arxiv:1904.12901'}
+                      }),
+                      (insert Message {
+                          role := 'assistant',
+                          body := 'Policy gradient methods are a class of reinforcement learning algorithms that directly optimize the policy...',
+                          timestamp := <datetime>'2024-01-08T14:30:10Z',
+                          sources := {'Sutton-Barto-RL-Book-Ch13', 'spinning-up.openai.com'}
+                      })
+                  }
+              })
+          }
+      };
+      # Insert chat histories for Bob
+      update User
+      filter .name = 'bob'
+      set {
+          chats := {
+              (insert Chat {
+                  messages := {
+                      (insert Message {
+                          role := 'user',
+                          body := 'What are the pros and cons of different sharding strategies?',
+                          timestamp := <datetime>'2024-01-05T16:15:00Z',
+                          sources := {'martin-kleppmann-ddia-ch6', 'aws.amazon.com/sharding-patterns'}
+                      }),
+                      (insert Message {
+                          role := 'assistant',
+                          body := 'The main sharding strategies include range-based, hash-based, and directory-based sharding...',
+                          timestamp := <datetime>'2024-01-05T16:15:08Z',
+                          sources := {'martin-kleppmann-ddia-ch6', 'mongodb.com/docs/sharding'}
+                      }),
+                      (insert Message {
+                          role := 'user',
+                          body := 'Could you elaborate on hash-based sharding?',
+                          timestamp := <datetime>'2024-01-05T16:16:00Z',
+                          sources := {'mongodb.com/docs/sharding'}
+                      })
+                  }
+              })
+          }
+      };
+      EOF
+
+
+.. edb:split-section::
+
+   This created the ``app/sample_data/inserts.edgeql`` file, which we can now execute
+   using the CLI like this:
+
+   .. code-block:: bash
+
+      $ gel query -f app/sample_data/inserts.edgeql
+
+      {"id": "862de904-de39-11ef-9713-4fab09220c4a"}
+      {"id": "862e400c-de39-11ef-9713-2f81f2b67013"}
+      {"id": "862de904-de39-11ef-9713-4fab09220c4a"}
+      {"id": "862e400c-de39-11ef-9713-2f81f2b67013"}
+
+
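+.. edb:split-section::
+
+   While we're here, this is also a good moment to watch the ``exclusive``
+   constraint from our schema do its job. Try inserting a second ``alice`` in the
+   REPL (start it by typing ``gel``); Gel should reject it with something like the
+   following (error output paraphrased):
+
+   .. code-block:: edgeql-repl
+
+      searchbot:main> insert User { name := 'alice' };
+      gel error: ConstraintViolationError: name violates exclusivity constraint
+
+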
+.. edb:split-section::
+
+   The :ref:`gel query ` command is one of many ways we can
+   execute a query in Gel. Now that we've done it, there's stuff in the database.
+   Let's verify it by running:
+
+   .. code-block:: bash
+
+      $ gel query "select User { name };"
+
+      {"name": "alice"}
+      {"name": "bob"}
+
+
+Writing queries
+---------------
+
+With schema in place, it's time to focus on getting the data in and out of the
+database.
+
+In this tutorial we're going to write queries using :ref:`EdgeQL
+` and then use :ref:`codegen ` to
+generate typesafe functions that we can plug directly into our Python code. If
+you are completely unfamiliar with EdgeQL, now is a good time to check out the
+basics before proceeding.
+
+
+.. edb:split-section::
+
+   Let's move on. First, we'll create a directory inside ``app`` called
+   ``queries``. This is where we're going to put all of the EdgeQL-related stuff.
+
+   We're going to start by writing a query that fetches all of the users. In
+   ``queries`` create a file named ``get_users.edgeql`` and put the following query
+   in there:
+
+   .. code-block:: edgeql
+      :caption: app/queries/get_users.edgeql
+
+      select User { name };
+
+
+.. edb:split-section::
+
+   Now run the code generator from the shell:
+
+   .. code-block:: bash
+
+      $ gel-py
+
+
+.. edb:split-section::
+
+   It's going to automatically locate the ``.edgeql`` file and generate types for
+   it. We can inspect generated code in ``app/queries/get_users_async_edgeql.py``.
+   Once that is done, let's use those types to create the endpoint in ``main.py``:
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      from edgedb import create_async_client
+      from .queries.get_users_async_edgeql import get_users as get_users_query, GetUsersResult
+
+
+      gel_client = create_async_client()
+
+      @app.get("/users")
+      async def get_users() -> list[GetUsersResult]:
+          return await get_users_query(gel_client)
+
+
+.. edb:split-section::
+
+   Let's verify that it works as expected:
+
+   .. code-block:: bash
+
+      $ curl -X 'GET' \
+        'http://127.0.0.1:8000/users' \
+        -H 'accept: application/json'
+
+      [
+        {
+          "id": "862de904-de39-11ef-9713-4fab09220c4a",
+          "name": "alice"
+        },
+        {
+          "id": "862e400c-de39-11ef-9713-2f81f2b67013",
+          "name": "bob"
+        }
+      ]
+
+
+.. edb:split-section::
+
+   While we're at it, let's also implement the option to fetch a user by their
+   username. In order to do that, we need to write a new query in a separate file
+   ``app/queries/get_user_by_name.edgeql``:
+
+   .. code-block:: edgeql
+      :caption: app/queries/get_user_by_name.edgeql
+
+      select User { name }
+      filter .name = <str>$name;
+
+
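+.. edb:split-section::
+
+   After that, we will run the code generator again by calling ``gel-py``. For
+   reference, the module it generates for this query will look roughly like this.
+   This is an abridged sketch rather than the exact file; the real output contains
+   a bit more boilerplate and can differ between versions:
+
+   .. code-block:: python
+      :caption: app/queries/get_user_by_name_async_edgeql.py (sketch)
+
+      from __future__ import annotations
+
+      import dataclasses
+      import uuid
+
+      import gel
+
+
+      @dataclasses.dataclass
+      class GetUserByNameResult:
+          id: uuid.UUID
+          name: str | None
+
+
+      async def get_user_by_name(
+          executor: gel.AsyncIOExecutor,
+          *,
+          name: str,
+      ) -> GetUserByNameResult | None:
+          # The generated function simply runs the query we wrote,
+          # with typed arguments and a typed result.
+          return await executor.query_single(
+              """\
+              select User { name }
+              filter .name = <str>$name;\
+              """,
+              name=name,
+          )
+
+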
+.. edb:split-section::
+
+   In the app, we are going to reuse the same endpoint that fetches the list of all
+   users. From now on, if the user calls it without any arguments (e.g.
+   ``http://127.0.0.1/users``), they are going to receive the list of all users,
+   same as before. But if they pass a username as a query argument like this:
+   ``http://127.0.0.1/users?username=bob``, the system will attempt to fetch a user
+   named ``bob``.
+
+   In order to achieve this, we're going to need to add a ``Query``-type argument
+   to our endpoint function. You can learn more about how to configure this kind of
+   argument in `FastAPI's docs
+   <https://fastapi.tiangolo.com/tutorial/query-params-str-validations/>`_. Its
+   default value is going to be ``None``, which will enable us to implement our
+   conditional logic:
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      from fastapi import Query, HTTPException
+      from http import HTTPStatus
+      from .queries.get_user_by_name_async_edgeql import (
+          get_user_by_name as get_user_by_name_query,
+          GetUserByNameResult,
+      )
+
+
+      @app.get("/users")
+      async def get_users(
+          username: str = Query(None),
+      ) -> list[GetUsersResult] | GetUserByNameResult:
+          """List all users or get a user by their username"""
+          if username:
+              user = await get_user_by_name_query(gel_client, name=username)
+              if not user:
+                  raise HTTPException(
+                      HTTPStatus.NOT_FOUND,
+                      detail={"error": f"Error: user {username} does not exist."},
+                  )
+              return user
+          else:
+              return await get_users_query(gel_client)
+
+
+.. edb:split-section::
+
+   And once again, let's verify that everything works:
+
+   .. code-block:: bash
+
+      $ curl -X 'GET' \
+        'http://127.0.0.1:8000/users?username=alice' \
+        -H 'accept: application/json'
+
+      {
+        "id": "862de904-de39-11ef-9713-4fab09220c4a",
+        "name": "alice"
+      }
+
+
+.. edb:split-section::
+
+   Finally, let's also implement the option to add a new user. For this, just as
+   before, we'll create a new file ``app/queries/create_user.edgeql``, add a query
+   to it and run code generation.
+
+   Note that in this query we've wrapped the ``insert`` in a ``select`` statement.
+   This is a common pattern in EdgeQL that can be used whenever you'd like to get
+   back something other than the object's ID from an insert.
+
+   .. code-block:: edgeql
+      :caption: app/queries/create_user.edgeql
+
+      select(
+          insert User {
+              name := <str>$username
+          }
+      ) {
+          name
+      }
+
+
+
+.. edb:split-section::
+
+   In order to integrate this query into our app, we're going to add a new
+   endpoint. Note that this one has the same name ``/users``, but is for the POST
+   HTTP method.
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      from gel import ConstraintViolationError
+      from .queries.create_user_async_edgeql import (
+          create_user as create_user_query,
+          CreateUserResult,
+      )
+
+      @app.post("/users", status_code=HTTPStatus.CREATED)
+      async def post_user(username: str = Query()) -> CreateUserResult:
+          try:
+              return await create_user_query(gel_client, username=username)
+          except ConstraintViolationError:
+              raise HTTPException(
+                  status_code=HTTPStatus.BAD_REQUEST,
+                  detail={"error": f"Username '{username}' already exists."},
+              )
+
+
+.. edb:split-section::
+
+   Once more, let's verify that the new endpoint works as expected:
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://127.0.0.1:8000/users?username=charlie' \
+        -H 'accept: application/json' \
+        -d ''
+
+      {
+        "id": "20372a1a-ded5-11ef-9a08-b329b578c45c",
+        "name": "charlie"
+      }
+
+
+.. edb:split-section::
+
+   This wraps things up for our user-related functionality. Of course, we now need
+   to deal with Chats and Messages, too. We're not going to go in depth for those,
+   since the process would be quite similar to what we've just done. Instead, feel
+   free to implement those endpoints yourself as an exercise, or copy the code
+   below if you are in a rush.
+
+   .. code-block:: bash
+      :class: collapsible
+
+      $ echo 'select Chat {
+          messages: { role, body, sources },
+          user := .<chats[is User]
+      } filter .user.name = <str>$username;' > app/queries/get_chats.edgeql && echo 'select Chat {
+          messages: { role, body, sources },
+          user := .<chats[is User]
+      } filter .user.name = <str>$username and .id = <uuid>$chat_id;' > app/queries/get_chat_by_id.edgeql && echo 'with new_chat := (insert Chat)
+      select (
+          update User filter .name = <str>$username
+          set {
+              chats := assert_distinct(.chats union new_chat)
+          }
+      ) {
+          new_chat_id := new_chat.id
+      }' > app/queries/create_chat.edgeql && echo 'with
+          user := (select User filter .name = <str>$username),
+          chat := (
+              select Chat filter .<chats[is User] = user and .id = <uuid>$chat_id
+          )
+      select Message {
+          role,
+          body,
+          sources,
+          chat := .<messages[is Chat]
+      } filter .chat = chat
+      order by .timestamp;' > app/queries/get_messages.edgeql && echo 'with
+          user := (select User filter .name = <str>$username),
+      update Chat
+      filter .id = <uuid>$chat_id and .<chats[is User] = user
+      set {
+          messages += (select(insert Message {
+              role := <str>$message_role,
+              body := <str>$message_body,
+              sources := array_unpack(<array<str>>$sources)
+          }))
+      }' > app/queries/add_message.edgeql
+
+
+.. edb:split-section::
+
+   And these are the endpoint definitions, provided in bulk.
+
+   .. code-block:: python
+      :caption: app/main.py
+      :class: collapsible
+
+      from .queries.get_chats_async_edgeql import get_chats as get_chats_query, GetChatsResult
+      from .queries.get_chat_by_id_async_edgeql import (
+          get_chat_by_id as get_chat_by_id_query,
+          GetChatByIdResult,
+      )
+      from .queries.get_messages_async_edgeql import (
+          get_messages as get_messages_query,
+          GetMessagesResult,
+      )
+      from .queries.create_chat_async_edgeql import (
+          create_chat as create_chat_query,
+          CreateChatResult,
+      )
+      from .queries.add_message_async_edgeql import (
+          add_message as add_message_query,
+      )
+
+
+      @app.get("/chats")
+      async def get_chats(
+          username: str = Query(), chat_id: str = Query(None)
+      ) -> list[GetChatsResult] | GetChatByIdResult:
+          """List user's chats or get a chat by username and id"""
+          if chat_id:
+              chat = await get_chat_by_id_query(
+                  gel_client, username=username, chat_id=chat_id
+              )
+              if not chat:
+                  raise HTTPException(
+                      HTTPStatus.NOT_FOUND,
+                      detail={"error": f"Chat {chat_id} for user {username} does not exist."},
+                  )
+              return chat
+          else:
+              return await get_chats_query(gel_client, username=username)
+
+
+      @app.post("/chats", status_code=HTTPStatus.CREATED)
+      async def post_chat(username: str) -> CreateChatResult:
+          return await create_chat_query(gel_client, username=username)
+
+
+      @app.get("/messages")
+      async def get_messages(
+          username: str = Query(), chat_id: str = Query()
+      ) -> list[GetMessagesResult]:
+          """Fetch all messages from a chat"""
+          return await get_messages_query(gel_client, username=username, chat_id=chat_id)
+
+
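+.. edb:split-section::
+
+   With the sample data from earlier, we can poke at these endpoints right away.
+   Fetching Alice's chats should return the two conversations we inserted; your
+   IDs will differ, and the output below is abridged:
+
+   .. code-block:: bash
+
+      $ curl -X 'GET' \
+        'http://127.0.0.1:8000/chats?username=alice' \
+        -H 'accept: application/json'
+
+      [
+        { "id": "...", "messages": [ ... ] },
+        { "id": "...", "messages": [ ... ] }
+      ]
+
+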
+.. edb:split-section::
+
+   For the ``post_messages`` function we're going to do something a little bit
+   different though. Since this is now the primary way for the user to add their
+   queries to the system, it functionally supersedes the ``/search`` endpoint we
+   made before. To this end, this function is where we're going to handle saving
+   messages, retrieving chat history, invoking web search and generating the
+   answer.
+
+   .. code-block:: python-diff
+      :caption: app/main.py
+
+      - @app.post("/search")
+      - async def search(search_terms: SearchTerms) -> SearchResult:
+      -     web_sources = await search_web(search_terms.query)
+      -     search_result = await generate_answer(search_terms.query, web_sources)
+      -     return search_result
+      + @app.post("/messages", status_code=HTTPStatus.CREATED)
+      + async def post_messages(
+      +     search_terms: SearchTerms,
+      +     username: str = Query(),
+      +     chat_id: str = Query(),
+      + ) -> SearchResult:
+      +     chat_history = await get_messages_query(
+      +         gel_client, username=username, chat_id=chat_id
+      +     )
+      +
+      +     _ = await add_message_query(
+      +         gel_client,
+      +         username=username,
+      +         message_role="user",
+      +         message_body=search_terms.query,
+      +         sources=[],
+      +         chat_id=chat_id,
+      +     )
+      +
+      +     search_query = search_terms.query
+      +     web_sources = await search_web(search_query)
+      +
+      +     search_result = await generate_answer(
+      +         search_terms.query, chat_history, web_sources
+      +     )
+      +
+      +     _ = await add_message_query(
+      +         gel_client,
+      +         username=username,
+      +         message_role="assistant",
+      +         message_body=search_result.response,
+      +         sources=[s.url for s in search_result.sources],
+      +         chat_id=chat_id,
+      +     )
+      +
+      +     return search_result
+
+
+.. edb:split-section::
+
+   Let's not forget to modify the ``generate_answer`` function, so it can also be
+   history-aware.
+
+   .. code-block:: python-diff
+      :caption: app/main.py
+
+        async def generate_answer(
+            query: str,
+      +     chat_history: list[GetMessagesResult],
+            web_sources: list[WebSource],
+        ) -> SearchResult:
+            system_prompt = (
+                "You are a helpful assistant that answers user's questions"
+                + " by finding relevant information in Hacker News threads."
+                + " When answering the question, describe conversations that people have around the subject,"
+                + " provided to you as a context, or say i don't know if they are completely irrelevant."
+            )
+
+            prompt = f"User search query: {query}\n\nWeb search results:\n"
+
+            for i, source in enumerate(web_sources):
+                prompt += f"Result {i} (URL: {source.url}):\n"
+                prompt += f"{source.text}\n\n"
+
+      -     messages = [{"role": "user", "content": prompt}]
+      +     messages = [
+      +         {"role": message.role, "content": message.body} for message in chat_history
+      +     ]
+      +     messages.append({"role": "user", "content": prompt})
+
+            llm_response = get_llm_completion(
+                system_prompt=system_prompt,
+                messages=messages,
+            )
+
+            search_result = SearchResult(
+                response=llm_response,
+                sources=web_sources,
+            )
+
+            return search_result
+
+
+.. edb:split-section::
+
+   OK, this should be it for setting up the chat history. Let's test it. First, we
+   are going to start a new chat for our user:
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://127.0.0.1:8000/chats?username=charlie' \
+        -H 'accept: application/json' \
+        -d ''
+
+      {
+        "id": "20372a1a-ded5-11ef-9a08-b329b578c45c",
+        "new_chat_id": "544ef3f2-ded8-11ef-ba16-f7f254b95e36"
+      }
+
+
+.. edb:split-section::
+
+   Next, let's add a couple of messages and wait for the bot to respond:
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{
+          "query": "best database in existence"
+        }'
+
+      $ curl -X 'POST' \
+        'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{
+          "query": "gel"
+        }'
+
+
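+.. edb:split-section::
+
+   Each of these calls returns a ``SearchResult``. The actual summary will differ
+   from run to run, but the shape of the response looks roughly like this:
+
+   .. code-block:: json
+
+      {
+        "response": "People on Hacker News seem to...",
+        "sources": [
+          {
+            "url": "https://news.ycombinator.com/item?id=...",
+            "title": "...",
+            "text": "..."
+          }
+        ]
+      }
+
+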
+.. edb:split-section::
+
+   Finally, let's check that the messages we saw are in fact stored in the chat
+   history:
+
+   .. code-block:: bash
+
+      $ curl -X 'GET' \
+        'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
+        -H 'accept: application/json'
+
+
+In reality this workflow would've been handled by the frontend, providing the
+user with a nice interface to interact with. But even without one, our chatbot
+is almost fully functional by now.
+
+Generating a Google search query
+--------------------------------
+
+Congratulations! We just got done implementing multi-turn conversations for our
+search bot.
+
+However, there's still one crucial piece missing. Right now we're simply
+forwarding the user's message straight to the full-text search. But what happens
+if their message is a follow-up that cannot be used as a standalone search
+query?
+
+Ideally, we should infer the search query from the entire conversation, and use
+that to perform the search.
+
+Let's implement an extra step in which the LLM is going to produce a query for
+us based on the entire chat history. That way we can be sure we're progressively
+working on our query rather than rewriting it from scratch every time.
+
+
+.. edb:split-section::
+
+   This is what we need to do: every time the user submits a message, we need to
+   fetch the chat history, extract a search query from it using the LLM, and the
+   other steps are going to be the same as before. Let's make the following
+   modifications to ``main.py``: first we need to create a function that
+   prepares LLM inputs for the search query inference.
+
+
+   .. code-block:: python
+      :caption: app/main.py
+
+      async def generate_search_query(
+          query: str, message_history: list[GetMessagesResult]
+      ) -> str:
+          system_prompt = (
+              "You are a helpful assistant."
+              + " Your job is to extract a keyword search query"
+              + " from a chat between an AI and a human."
+              + " Make sure it's a single most relevant keyword to maximize matching."
+              + " Only provide the query itself as your response."
+          )
+
+          formatted_history = "\n---\n".join(
+              [
+                  f"{message.role}: {message.body} (sources: {message.sources})"
+                  for message in message_history
+              ]
+          )
+          prompt = f"Chat history: {formatted_history}\n\nUser message: {query} \n\n"
+
+          llm_response = get_llm_completion(
+              system_prompt=system_prompt, messages=[{"role": "user", "content": prompt}]
+          )
+
+          return llm_response
+
+
+.. edb:split-section::
+
+   And now we can use this function in ``post_messages`` in order to get our
+   search query:
+
+
+   .. code-block:: python-diff
+      :caption: app/main.py
+
+        class SearchResult(BaseModel):
+            response: str | None = None
+      +     search_query: str | None = None
+            sources: list[WebSource] | None = None
+
+
+        @app.post("/messages", status_code=HTTPStatus.CREATED)
+        async def post_messages(
+            search_terms: SearchTerms,
+            username: str = Query(),
+            chat_id: str = Query(),
+        ) -> SearchResult:
+            # 1. Fetch chat history
+            chat_history = await get_messages_query(
+                gel_client, username=username, chat_id=chat_id
+            )
+
+            # 2. Add incoming message to Gel
+            _ = await add_message_query(
+                gel_client,
+                username=username,
+                message_role="user",
+                message_body=search_terms.query,
+                sources=[],
+                chat_id=chat_id,
+            )
+
+            # 3. Generate a query and perform googling
+      -     search_query = search_terms.query
+      +     search_query = await generate_search_query(search_terms.query, chat_history)
+            web_sources = await search_web(search_query)
+
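+            # (a step 4 will appear here in section 6, when we add similarity search)
+
+            # 5. Generate answer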
+            search_result = await generate_answer(
+                search_terms.query,
+                chat_history,
+                web_sources,
+            )
+      +     search_result.search_query = search_query  # add search query to the output
+      +                                                # to see what the bot is searching for
+            # 6. Add LLM response to Gel
+            _ = await add_message_query(
+                gel_client,
+                username=username,
+                message_role="assistant",
+                message_body=search_result.response,
+                sources=[s.url for s in search_result.sources],
+                chat_id=chat_id,
+            )
+
+            # 7. Send result back to the client
+            return search_result
+
+
+.. edb:split-section::
+
+   Done! We've now fully integrated the chat history into our app and enabled
+   natural language conversations. As before, let's quickly test out the
+   improvements before moving on:
+
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://localhost:8000/messages?username=alice&chat_id=d4eed420-e903-11ef-b8a7-8718abdafbe1' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{
+          "query": "what are people saying about gel"
+        }'
+
+      $ curl -X 'POST' \
+        'http://localhost:8000/messages?username=alice&chat_id=d4eed420-e903-11ef-b8a7-8718abdafbe1' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{
+          "query": "do they like it or not"
+        }'
+
+
+6. Use Gel's advanced features to create a RAG
+==============================================
+
+At this point we have a decent search bot that can refine a search query over
+multiple turns of a conversation.
+
+It's time to add the final touch: we can make the bot remember previous similar
+interactions with the user using retrieval-augmented generation (RAG).
+
+To achieve this we need to implement similarity search across message history:
+we're going to create a vector embedding for every message in the database using
+a neural network. Every time we generate a Google search query, we're also going
+to use it to search for similar messages in the user's message history, and inject
+the corresponding chat into the prompt. That way the search bot will be able to
+quickly "remember" similar interactions with the user and use them to understand
+what they are looking for.
+
+Gel enables us to implement such a system with only minor modifications to the
+schema.
+
+
+.. edb:split-section::
+
+   We begin by enabling the ``ai`` extension by adding the following line at the
+   top of ``dbschema/default.esdl``:
+
+   .. code-block:: sdl-diff
+      :caption: dbschema/default.esdl
+
+      + using extension ai;
+
+
+.. edb:split-section::
+
+   ... and do the migration:
+
+
+   .. code-block:: bash
+
+      $ gel migration create
+      $ gel migrate
+
+
+.. edb:split-section::
+
+   Next, we need to configure the API key in Gel for whatever embedding provider
+   we're going to be using. As per documentation, let's open up the CLI by typing
+   ``gel`` and run the following command (assuming we're using OpenAI):
+
+   .. code-block:: edgeql-repl
+
+      searchbot:main> configure current database
+      insert ext::ai::OpenAIProviderConfig {
+        secret := 'sk-....',
+      };
+
+      OK: CONFIGURE DATABASE
+
+
+.. edb:split-section::
+
+   In order to get Gel to automatically keep track of creating and updating
+   message embeddings, all we need to do is create a deferred index like this.
+   Don't forget to run a migration one more time!
+
+   .. code-block:: sdl-diff
+
+        type Message {
+            role: str;
+            body: str;
+            timestamp: datetime {
+                default := datetime_current();
+            }
+            multi sources: str;
+
+      +     deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+      +         on (.body);
+        }
+
+
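+.. edb:split-section::
+
+   As promised, the new index means one more round of the same migration commands
+   we've been running all along:
+
+   .. code-block:: bash
+
+      $ gel migration create
+      $ gel migrate
+
+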
+.. edb:split-section::
+
+   And we're done! Gel is going to cook in the background for a while and generate
+   embedding vectors for our queries. To make sure nothing broke, we can follow
+   Gel's AI documentation and take a look at instance logs:
+
+   .. code-block:: bash
+
+      $ gel instance logs -I searchbot | grep api.openai.com
+
+      INFO 50121 searchbot 2025-01-30T14:39:53.364 httpx: HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
+
+
+.. edb:split-section::
+
+   It's time to create the second half of the similarity search - the search query.
+   The query needs to fetch ``k`` chats in which there are messages most similar
+   to our current message. This can be a little difficult to visualize in
+   your head, so here's the query itself:
+
+   .. code-block:: edgeql
+      :caption: app/queries/search_chats.edgeql
+
+      with
+          user := (select User filter .name = <str>$username),
+          chats := (
+              select Chat
+              filter .<chats[is User] = user and .id != <uuid>$current_chat_id
+          )
+
+      select chats {
+          distance := min(
+              ext::ai::search(
+                  .messages,
+                  <array<float32>>$embedding,
+              ).distance,
+          ),
+          messages: {
+              role, body, sources
+          }
+      }
+
+      order by .distance
+      limit <int64>$limit;
+
+
+.. edb:split-section::
+
+   .. note::
+
+      Before we can integrate this query into our Python app, we also need to add a
+      new dependency for the Python binding: ``httpx-sse``. It enables streaming
+      outputs, which we're not going to use right now, but we won't be able to
+      create the AI client without it. You can add it by running ``uv add httpx-sse``.
+
+   Let's place it in ``app/queries/search_chats.edgeql``, run the codegen and modify
+   our ``post_messages`` endpoint to keep track of those similar chats.
+
+   .. code-block:: python-diff
+      :caption: app/main.py
+
+      + from edgedb.ai import create_async_ai, AsyncEdgeDBAI
+      + from .queries.search_chats_async_edgeql import (
+      +     search_chats as search_chats_query,
+      + )
+
+        class SearchResult(BaseModel):
+            response: str | None = None
+            search_query: str | None = None
+            sources: list[WebSource] | None = None
+      +     similar_chats: list[str] | None = None
+
+
+        @app.post("/messages", status_code=HTTPStatus.CREATED)
+        async def post_messages(
+            search_terms: SearchTerms,
+            username: str = Query(),
+            chat_id: str = Query(),
+        ) -> SearchResult:
+            # 1. Fetch chat history
+            chat_history = await get_messages_query(
+                gel_client, username=username, chat_id=chat_id
+            )
+
+            # 2. Add incoming message to Gel
+            _ = await add_message_query(
+                gel_client,
+                username=username,
+                message_role="user",
+                message_body=search_terms.query,
+                sources=[],
+                chat_id=chat_id,
+            )
+
+            # 3. Generate a query and perform googling
+            search_query = await generate_search_query(search_terms.query, chat_history)
+            web_sources = await search_web(search_query)
+
+      +     # 4. Fetch similar chats
+      +     db_ai: AsyncEdgeDBAI = await create_async_ai(gel_client, model="gpt-4o-mini")
+      +     embedding = await db_ai.generate_embeddings(
+      +         search_query, model="text-embedding-3-small"
+      +     )
+      +     similar_chats = await search_chats_query(
+      +         gel_client,
+      +         username=username,
+      +         current_chat_id=chat_id,
+      +         embedding=embedding,
+      +         limit=1,
+      +     )
+
+            # 5. Generate answer
+            search_result = await generate_answer(
+                search_terms.query,
+                chat_history,
+                web_sources,
+      +         similar_chats,
+            )
+            search_result.search_query = search_query  # add search query to the output
+                                                       # to see what the bot is searching for
+            # 6. Add LLM response to Gel
+            _ = await add_message_query(
+                gel_client,
+                username=username,
+                message_role="assistant",
+                message_body=search_result.response,
+                sources=[s.url for s in search_result.sources],
+                chat_id=chat_id,
+            )
+
+            # 7. Send result back to the client
+            return search_result
+
+
+.. edb:split-section::
+
+   Finally, the answer generator needs to get updated one more time, since we need
+   to inject the additional messages into the prompt.
+
+   .. code-block:: python-diff
+      :caption: app/main.py
+
+        async def generate_answer(
+            query: str,
+            chat_history: list[GetMessagesResult],
+            web_sources: list[WebSource],
+      +     similar_chats: list[list[GetMessagesResult]],
+        ) -> SearchResult:
+            system_prompt = (
+                "You are a helpful assistant that answers user's questions"
+                + " by finding relevant information in Hacker News threads."
+                + " When answering the question, describe conversations that people have around the subject,"
+                + " provided to you as a context, or say i don't know if they are completely irrelevant."
+      +         + " You can reference previous conversations with the user that"
+      +         + " are provided to you, if they are relevant, by explicitly referring"
+      +         + " to them by saying as we discussed in the past."
+            )
+
+            prompt = f"User search query: {query}\n\nWeb search results:\n"
+
+            for i, source in enumerate(web_sources):
+                prompt += f"Result {i} (URL: {source.url}):\n"
+                prompt += f"{source.text}\n\n"
+
+      +     prompt += "Similar chats with the same user:\n"
+      +
+      +     formatted_chats = []
+      +     for i, chat in enumerate(similar_chats):
+      +         formatted_chat = f"Chat {i}: \n"
+      +         for message in chat.messages:
+      +             formatted_chat += f"{message.role}: {message.body}\n"
+      +         formatted_chats.append(formatted_chat)
+      +
+      +     prompt += "\n".join(formatted_chats)
+
+            messages = [
+                {"role": message.role, "content": message.body} for message in chat_history
+            ]
+            messages.append({"role": "user", "content": prompt})
+
+            llm_response = get_llm_completion(
+                system_prompt=system_prompt,
+                messages=messages,
+            )
+
+            search_result = SearchResult(
+                response=llm_response,
+                sources=web_sources,
+      +         similar_chats=formatted_chats,
+            )
+
+            return search_result
+
+
+.. edb:split-section::
+
+   And one last time, let's check to make sure everything works:
+
+   .. code-block:: bash
+
+      $ curl -X 'POST' \
+        'http://localhost:8000/messages?username=alice&chat_id=d4eed420-e903-11ef-b8a7-8718abdafbe1' \
+        -H 'accept: application/json' \
+        -H 'Content-Type: application/json' \
+        -d '{
+          "query": "remember that cool db i was talking to you about?"
+        }'
+
+
+Keep going!
+===========
+
+This tutorial is over, but this app surely could use way more features!
+
+Basic functionality like deleting messages, a user interface or real web
+search, sure. But also authentication or access policies -- Gel will let you
+set those up in minutes.
+
+Thanks!