diff --git a/docs/ai/fastapi_gelai_searchbot.rst b/docs/ai/fastapi_gelai_searchbot.rst
new file mode 100644
index 00000000000..673648a04a7
--- /dev/null
+++ b/docs/ai/fastapi_gelai_searchbot.rst
@@ -0,0 +1,1716 @@
+.. _ref_guide_fastapi_gelai_searchbot:
+
+===================
+FastAPI (Searchbot)
+===================
+
+:edb-alt-title: Building a search bot with memory using FastAPI and Gel AI
+
+In this tutorial we're going to walk you through building a chat bot with search
+capabilities using Gel and `FastAPI <https://fastapi.tiangolo.com>`_.
+
+FastAPI is a framework designed to help you build web apps *fast*. Gel is a
+data layer designed to help you figure out storage in your application - also
+*fast*. By the end of this tutorial, you will have tried out different aspects
+of using those two together.
+
+We will start by creating an app with FastAPI, adding web search capabilities,
+and then putting search results through a language model to get a
+human-friendly answer. After that, we'll use Gel to implement chat history so
+that the bot remembers previous interactions with the user. We'll finish it off
+with semantic search-based cross-chat memory.
+
+The end result is going to look something like this:
+
+.. image::
+ /docs/tutorials/placeholder.png
+ :alt: Placeholder
+ :width: 100%
+
+1. Initialize the project
+=========================
+
+.. edb:split-section::
+
+    We're going to start by installing `uv <https://docs.astral.sh/uv/>`_ - a
+    Python package manager that's going to simplify environment management for us.
+    You can follow their `installation instructions
+    <https://docs.astral.sh/uv/getting-started/installation/>`_ or simply run:
+
+ .. code-block:: bash
+
+ $ curl -LsSf https://astral.sh/uv/install.sh | sh
+
+.. edb:split-section::
+
+    Once that is done, we can use uv to create scaffolding for our project
+    following the `documentation <https://docs.astral.sh/uv/>`_:
+
+ .. code-block:: bash
+
+ $ uv init searchbot \
+ && cd searchbot
+
+.. edb:split-section::
+
+ For now, we know we're going to need Gel and FastAPI, so let's add those
+ following uv's instructions on `managing dependencies
+ `_,
+ as well as FastAPI's `installation docs
+ `_. Running ``uv sync`` after
+ that will create our virtual environment in a ``.venv`` directory and ensure
+ it's ready. As the last step, we'll activate the environment and get started.
+
+ .. note::
+
+ Every time you open a new terminal session, you should source the
+ environment before running ``python``, ``gel`` or ``fastapi`` commands.
+
+ .. code-block:: bash
+
+ $ uv add "fastapi[standard]" \
+ && uv add gel \
+ && uv sync \
+ && source .venv/bin/activate
+
+
+2. Get started with FastAPI
+===========================
+
+.. edb:split-section::
+
+    At this stage we need to follow FastAPI's `tutorial
+    <https://fastapi.tiangolo.com/tutorial/>`_ to create the foundation of our app.
+
+ We're going to make a minimal web API with one endpoint that takes in a user
+ query as an input and echoes it as an output. First, let's make a directory
+ called ``app`` in our project root, and put an empty ``__init__.py`` there.
+
+ .. code-block:: bash
+
+ $ mkdir app && touch app/__init__.py
+
+.. edb:split-section::
+
+ Now let's create a file called ``main.py`` inside the ``app`` directory and put
+ the "Hello World" example in it:
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ from fastapi import FastAPI
+
+ app = FastAPI()
+
+
+ @app.get("/")
+ async def root():
+ return {"message": "Hello World"}
+
+
+.. edb:split-section::
+
+ To start the server, we'll run:
+
+ .. code-block:: bash
+
+ $ fastapi dev app/main.py
+
+
+.. edb:split-section::
+
+    Once the server gets up and running, we can make sure it works using FastAPI's
+    built-in Swagger UI at http://127.0.0.1:8000/docs, or manually with ``curl``:
+
+ .. code-block:: bash
+
+ $ curl -X 'GET' \
+ 'http://127.0.0.1:8000/' \
+ -H 'accept: application/json'
+
+ {"message":"Hello World"}
+
+
+.. edb:split-section::
+
+ Now, to create the search endpoint we mentioned earlier, we need to pass our
+ query as a parameter to it. We'd prefer to have it in the request's body
+ since user messages can be long.
+
+    In FastAPI land, this is done by creating a Pydantic schema and making it the
+    type of the input parameter. `Pydantic <https://docs.pydantic.dev>`_ is
+    a data validation library for Python. It has many features, but we don't
+    actually need to know about them for now. All we need to know is that FastAPI
+    uses Pydantic types to automatically figure out schemas for `input
+    <https://fastapi.tiangolo.com/tutorial/body/>`_, as well as `output
+    <https://fastapi.tiangolo.com/tutorial/response-model/>`_.
+
+ Let's add the following to our ``main.py``:
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ from pydantic import BaseModel
+
+
+ class SearchTerms(BaseModel):
+ query: str
+
+ class SearchResult(BaseModel):
+ response: str | None = None
+
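If you're curious what this validation buys us, here's a quick standalone check of the same model shape (a sketch using Pydantic v2's ``model_validate``; the payloads are made up):

```python
from pydantic import BaseModel, ValidationError


class SearchTerms(BaseModel):
    query: str


# A well-formed payload validates and gives us typed attribute access
terms = SearchTerms.model_validate({"query": "hello"})
print(terms.query)  # hello

# A payload missing the required field is rejected with a ValidationError
try:
    SearchTerms.model_validate({})
except ValidationError:
    print("missing 'query' rejected")
```

This is exactly the check FastAPI performs on the request body before our endpoint code ever runs.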
+
+.. edb:split-section::
+
+ Now, we can define our endpoint. We'll set the two classes we just created as
+ the new endpoint's argument and return type.
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ @app.post("/search")
+ async def search(search_terms: SearchTerms) -> SearchResult:
+ return SearchResult(response=search_terms.query)
+
+
+.. edb:split-section::
+
+ Same as before, we can test the endpoint using the UI, or by sending a request
+ with ``curl``:
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://127.0.0.1:8000/search' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{ "query": "string" }'
+
+ {
+ "response": "string",
+ }
+
+3. Implement web search
+=======================
+
+Now that we have our web app infrastructure in place, let's add some substance
+to it by implementing web search capabilities.
+
+.. edb:split-section::
+
+    There are many powerful, feature-rich products for LLM-driven web search, but
+    in this tutorial we're going to use a much more reliable source of real-world
+    information: comment threads on `Hacker News
+    <https://news.ycombinator.com>`_. Their `web API
+    <https://hn.algolia.com/api>`_ is free of charge and doesn't require an
+    account. Below is a simple function that requests a full-text search for a
+    string query and extracts a nice sampling of comment threads from each of the
+    stories that came up in the result.
+
+    We are not going to cover this code sample in too much depth. Feel free to
+    grab it and save it to ``app/web.py``, or write your own.
+
+ Notice that we've created another Pydantic type called ``WebSource`` to store
+ our web search results. There's no framework-related reason for that, it's just
+ nicer than passing dictionaries around.
+
+ .. code-block:: python
+ :caption: app/web.py
+ :class: collapsible
+
+ import requests
+ from pydantic import BaseModel
+ from datetime import datetime
+ import html
+
+
+ class WebSource(BaseModel):
+ """Type that stores search results."""
+
+ url: str | None = None
+ title: str | None = None
+ text: str | None = None
+
+
+ def extract_comment_thread(
+ comment: dict,
+ max_depth: int = 3,
+ current_depth: int = 0,
+ max_children=3,
+ ) -> list[str]:
+ """
+ Recursively extract comments from a thread up to max_depth.
+ Returns a list of formatted comment strings.
+ """
+ if not comment or current_depth > max_depth:
+ return []
+
+ results = []
+
+ # Get timestamp, author and the body of the comment,
+ # then pad it with spaces so that it's offset appropriately for its depth
+
+ if comment["text"]:
+ timestamp = datetime.fromisoformat(comment["created_at"].replace("Z", "+00:00"))
+ author = comment["author"]
+ text = html.unescape(comment["text"])
+ formatted_comment = f"[{timestamp.strftime('%Y-%m-%d %H:%M')}] {author}: {text}"
+ results.append((" " * current_depth) + formatted_comment)
+
+ # If there're children comments, we are going to extract them too,
+ # and add them to the list.
+
+ if comment.get("children"):
+ for child in comment["children"][:max_children]:
+ child_comments = extract_comment_thread(child, max_depth, current_depth + 1)
+ results.extend(child_comments)
+
+ return results
+
+
+ def fetch_web_sources(query: str, limit: int = 5) -> list[WebSource]:
+ """
+ For a given query perform a full-text search for stories on Hacker News.
+ From each of the matched stories extract the comment thread and format it into a single string.
+ For each story return its title, url and comment thread.
+ """
+ search_url = "http://hn.algolia.com/api/v1/search_by_date?numericFilters=num_comments>0"
+
+ # Search for stories
+ response = requests.get(
+ search_url,
+ params={
+ "query": query,
+ "tags": "story",
+ "hitsPerPage": limit,
+ "page": 0,
+ },
+ )
+
+ response.raise_for_status()
+ search_result = response.json()
+
+ # For each search hit fetch and process the story
+ web_sources = []
+ for hit in search_result.get("hits", []):
+ item_url = f"https://hn.algolia.com/api/v1/items/{hit['story_id']}"
+ response = requests.get(item_url)
+ response.raise_for_status()
+ item_result = response.json()
+
+ site_url = f"https://news.ycombinator.com/item?id={hit['story_id']}"
+ title = hit["title"]
+ comments = extract_comment_thread(item_result)
+ text = "\n".join(comments) if len(comments) > 0 else None
+ web_sources.append(
+ WebSource(url=site_url, title=title, text=text)
+ )
+
+ return web_sources
+
+
+ if __name__ == "__main__":
+ web_sources = fetch_web_sources("edgedb", limit=5)
+
+ for source in web_sources:
+ print(source.url)
+ print(source.title)
+ print(source.text)
+
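As a quick sanity check of the formatting logic inside ``extract_comment_thread``, here is the unescape-and-timestamp step in isolation, run on a made-up comment dict shaped like the Algolia items API response:

```python
import html
from datetime import datetime

# A made-up comment in the shape returned by the Algolia items API
comment = {
    "created_at": "2024-01-07T10:00:00Z",
    "author": "pg",
    "text": "Gel &amp; FastAPI work well together",
}

# Same steps as in extract_comment_thread: parse the timestamp,
# unescape HTML entities, then format a single line
timestamp = datetime.fromisoformat(comment["created_at"].replace("Z", "+00:00"))
text = html.unescape(comment["text"])
formatted = f"[{timestamp.strftime('%Y-%m-%d %H:%M')}] {comment['author']}: {text}"

print(formatted)  # [2024-01-07 10:00] pg: Gel & FastAPI work well together
```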
+
+.. edb:split-section::
+
+ One more note: this snippet comes with an extra dependency called ``requests``,
+ which is a library for making HTTP requests. Let's add it by running:
+
+ .. code-block:: bash
+
+ $ uv add requests
+
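For reference, this is roughly the final URL that the ``params`` dict in ``fetch_web_sources`` expands to, built here with the standard library only (the ``query`` value is just an example):

```python
from urllib.parse import urlencode

# Same base URL and params as in fetch_web_sources
base = "http://hn.algolia.com/api/v1/search_by_date?numericFilters=num_comments>0"
params = {"query": "gel", "tags": "story", "hitsPerPage": 5, "page": 0}

# urlencode turns the dict into a query string, which requests
# appends to the existing query string of the base URL
full_url = f"{base}&{urlencode(params)}"
print(full_url)
```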
+
+.. edb:split-section::
+
+ Now, we can test our web search on its own by running it like this:
+
+ .. code-block:: bash
+
+ $ python3 app/web.py
+
+
+.. edb:split-section::
+
+ It's time to reflect the new capabilities in our web app.
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ from .web import fetch_web_sources, WebSource
+
+ async def search_web(query: str) -> list[WebSource]:
+ raw_sources = fetch_web_sources(query, limit=5)
+ return [s for s in raw_sources if s.text is not None]
+
+
+.. edb:split-section::
+
+ Now we can update the ``/search`` endpoint as follows:
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ class SearchResult(BaseModel):
+ response: str | None = None
+ + sources: list[WebSource] | None = None
+
+
+ @app.post("/search")
+ async def search(search_terms: SearchTerms) -> SearchResult:
+ + web_sources = await search_web(search_terms.query)
+ - return SearchResult(response=search_terms.query)
+ + return SearchResult(
+ + response=search_terms.query, sources=web_sources
+ + )
+
+
+4. Connect to the LLM
+=====================
+
+Now that we're capable of scraping text from search results, we can forward
+those results to the LLM to get a nice-looking summary.
+
+.. edb:split-section::
+
+    There are a million different LLMs accessible via a web API, and you should
+    feel free to choose whichever provider you prefer. In this tutorial we will
+    roll with OpenAI, primarily for how ubiquitous it is. To keep things somewhat
+    provider-agnostic, we're going to get completions via raw HTTP requests.
+    Let's grab the request format from OpenAI's `API documentation
+    <https://platform.openai.com/docs/api-reference/chat>`_, and set up
+    LLM generation like this:
+
+ .. code-block:: python
+ :caption: app/main.py
+
+        import os
+
+        import requests
+        from dotenv import load_dotenv
+
+ _ = load_dotenv()
+
+
+ def get_llm_completion(system_prompt: str, messages: list[dict[str, str]]) -> str:
+ api_key = os.getenv("OPENAI_API_KEY")
+ url = "https://api.openai.com/v1/chat/completions"
+ headers = {"Content-Type": "application/json", "Authorization": f"Bearer {api_key}"}
+
+ response = requests.post(
+ url,
+ headers=headers,
+ json={
+ "model": "gpt-4o-mini",
+ "messages": [
+ {"role": "developer", "content": system_prompt},
+ *messages,
+ ],
+ },
+ )
+ response.raise_for_status()
+ result = response.json()
+ return result["choices"][0]["message"]["content"]
+
+
+.. edb:split-section::
+
+    Note that this cloud LLM API (and many others) requires a secret key to be
+    set as an environment variable. A common way to manage those is to use the
+    ``python-dotenv`` library in combination with a ``.env`` file. Feel free to
+    browse `the readme <https://github.com/theskumar/python-dotenv>`_
+    to learn more. Create a file called ``.env`` in the root directory and put
+    your API key in there:
+
+    .. code-block:: text
+ :caption: .env
+
+ OPENAI_API_KEY="sk-..."
+
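Under the hood, all ``load_dotenv()`` does is read ``KEY=value`` lines from that file and export them into the process environment, roughly like this simplified sketch (the key value here is a made-up placeholder):

```python
import os

# One line from a hypothetical .env file
line = 'OPENAI_API_KEY="sk-test-placeholder"'

# Split on the first "=", strip surrounding quotes, and export.
# Note: python-dotenv by default does NOT override variables that are
# already set; we assign directly here only to keep the sketch simple.
key, _, value = line.partition("=")
os.environ[key] = value.strip('"')

print(os.getenv("OPENAI_API_KEY"))  # sk-test-placeholder
```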
+
+.. edb:split-section::
+
+ Don't forget to add the new dependency to the environment:
+
+ .. code-block:: bash
+
+        $ uv add python-dotenv
+
+
+.. edb:split-section::
+
+ And now we can integrate this LLM-related code with the rest of the app. First,
+ let's set up a function that prepares LLM inputs:
+
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ async def generate_answer(
+ query: str,
+ web_sources: list[WebSource],
+ ) -> SearchResult:
+            system_prompt = (
+                "You are a helpful assistant that answers user's questions"
+                + " by finding relevant information in Hacker News threads."
+                + " When answering the question, describe conversations that"
+                + " people have around the subject, provided to you as context,"
+                + " or say 'I don't know' if they are completely irrelevant."
+            )
+
+ prompt = f"User search query: {query}\n\nWeb search results:\n"
+
+ for i, source in enumerate(web_sources):
+ prompt += f"Result {i} (URL: {source.url}):\n"
+ prompt += f"{source.text}\n\n"
+
+ messages = [{"role": "user", "content": prompt}]
+
+ llm_response = get_llm_completion(
+ system_prompt=system_prompt,
+ messages=messages,
+ )
+
+ search_result = SearchResult(
+ response=llm_response,
+ sources=web_sources,
+ )
+
+ return search_result
+
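To make the prompt layout concrete, here is what ``generate_answer`` assembles for a couple of hypothetical web sources (plain dicts stand in for ``WebSource`` objects):

```python
# Hypothetical search results standing in for WebSource objects
web_sources = [
    {"url": "https://news.ycombinator.com/item?id=1", "text": "thread one"},
    {"url": "https://news.ycombinator.com/item?id=2", "text": "thread two"},
]
query = "gel"

# Same assembly loop as in generate_answer
prompt = f"User search query: {query}\n\nWeb search results:\n"
for i, source in enumerate(web_sources):
    prompt += f"Result {i} (URL: {source['url']}):\n"
    prompt += f"{source['text']}\n\n"

print(prompt)
```

The whole thing is sent as a single ``user`` message, with the system prompt carried separately.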
+
+.. edb:split-section::
+
+ Then we can plug that function into the ``/search`` endpoint:
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ @app.post("/search")
+ async def search(search_terms: SearchTerms) -> SearchResult:
+ web_sources = await search_web(search_terms.query)
+ + search_result = await generate_answer(search_terms.query, web_sources)
+ + return search_result
+ - return SearchResult(
+ - response=search_terms.query, sources=web_sources
+ - )
+
+
+.. edb:split-section::
+
+ And now we can test the result as usual.
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://127.0.0.1:8000/search' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{ "query": "gel" }'
+
+
+5. Use Gel to implement chat history
+====================================
+
+So far we've built an application that can take in a query, fetch some Hacker
+News threads for it, sift through them using an LLM, and generate a nice
+summary.
+
+However, right now it's hardly user-friendly since you have to speak in
+keywords and basically start over every time you want to refine the query. To
+enable a more organic multi-turn interaction, we need to add chat history and
+infer the query from the context of the entire conversation.
+
+Now's a good time to introduce Gel.
+
+.. edb:split-section::
+
+ In case you need installation instructions, take a look at the :ref:`Quickstart
+ `. Once Gel CLI is present in your system, initialize the
+ project like this:
+
+ .. code-block:: bash
+
+ $ gel project init --non-interactive
+
+
+This command is going to put some project scaffolding inside our app, spin up a
+local instance of Gel, and then link the two together. From now on, all
+Gel-related things that happen inside our project directory are going to be
+automatically run on the correct database instance, with no need to worry about
+connection incantations.
+
+
+Defining the schema
+-------------------
+
+The database :ref:`schema ` in Gel is defined
+declaratively. The :ref:`gel project init `
+command has created a file called ``dbschema/default.esdl``, which we're going to
+use to define our types.
+
+.. edb:split-section::
+
+ We obviously want to keep track of the messages, so we need to represent
+ those in the schema. By convention established in the LLM space, each message
+ is going to have a role in addition to the message content itself. We can
+    also get Gel to automatically keep track of each message's creation time by
+    adding a property called ``timestamp`` and setting its :ref:`default value
+ ` to the output of the :ref:`datetime_current()
+ ` function. Finally, LLM messages in our search bot have
+ source URLs associated with them. Let's keep track of those too, by adding a
+ :ref:`multi-property `.
+
+ .. code-block:: sdl
+ :caption: dbschema/default.esdl
+
+ type Message {
+ role: str;
+ body: str;
+ timestamp: datetime {
+ default := datetime_current();
+ }
+ multi sources: str;
+ }
+
+
+.. edb:split-section::
+
+ Messages are grouped together into a chat, so let's add that entity to our
+ schema too.
+
+ .. code-block:: sdl
+ :caption: dbschema/default.esdl
+
+ type Chat {
+ multi messages: Message;
+ }
+
+
+.. edb:split-section::
+
+    And chats all belong to a certain user, making up their chat history. One
+    other thing we'd like to keep track of about our users is their username, and
+    it would make sense for us to make sure that it's unique by using an
+    ``exclusive`` :ref:`constraint `.
+
+ .. code-block:: sdl
+ :caption: dbschema/default.esdl
+
+ type User {
+ name: str {
+ constraint exclusive;
+ }
+ multi chats: Chat;
+ }
+
+
+.. edb:split-section::
+
+    We're going to keep our schema super simple. One cool thing about Gel is that
+    it will enable us to easily implement advanced features such as authentication
+    or AI down the road, but we'll come back to that later.
+
+ For now, this is the entire schema we came up with:
+
+ .. code-block:: sdl
+ :caption: dbschema/default.esdl
+
+ module default {
+ type Message {
+ role: str;
+ body: str;
+ timestamp: datetime {
+ default := datetime_current();
+ }
+ multi sources: str;
+ }
+
+ type Chat {
+ multi messages: Message;
+ }
+
+ type User {
+ name: str {
+ constraint exclusive;
+ }
+ multi chats: Chat;
+ }
+ }
+
+
+.. edb:split-section::
+
+ Let's use the :ref:`gel migration create ` CLI
+ command, followed by :ref:`gel migrate ` in order to
+ migrate to our new schema and proceed to writing some queries.
+
+ .. code-block:: bash
+
+ $ gel migration create
+ $ gel migrate
+
+
+.. edb:split-section::
+
+ Now that our schema is applied, let's quickly populate the database with some
+ fake data in order to be able to test the queries. We're going to explore
+ writing queries in a bit, but for now you can just run the following command in
+ the shell:
+
+ .. code-block:: bash
+ :class: collapsible
+
+ $ mkdir app/sample_data && cat << 'EOF' > app/sample_data/inserts.edgeql
+ # Create users first
+ insert User {
+ name := 'alice',
+ };
+ insert User {
+ name := 'bob',
+ };
+ # Insert chat histories for Alice
+ update User
+ filter .name = 'alice'
+ set {
+ chats := {
+ (insert Chat {
+ messages := {
+ (insert Message {
+ role := 'user',
+ body := 'What are the main differences between GPT-3 and GPT-4?',
+                timestamp := <datetime>'2024-01-07T10:00:00Z',
+ sources := {'arxiv:2303.08774', 'openai.com/research/gpt-4'}
+ }),
+ (insert Message {
+ role := 'assistant',
+ body := 'The key differences include improved reasoning capabilities, better context understanding, and enhanced safety features...',
+                timestamp := <datetime>'2024-01-07T10:00:05Z',
+ sources := {'openai.com/blog/gpt-4-details', 'arxiv:2303.08774'}
+ })
+ }
+ }),
+ (insert Chat {
+ messages := {
+ (insert Message {
+ role := 'user',
+ body := 'Can you explain what policy gradient methods are in RL?',
+                timestamp := <datetime>'2024-01-08T14:30:00Z',
+ sources := {'Sutton-Barto-RL-Book-Ch13', 'arxiv:1904.12901'}
+ }),
+ (insert Message {
+ role := 'assistant',
+ body := 'Policy gradient methods are a class of reinforcement learning algorithms that directly optimize the policy...',
+                timestamp := <datetime>'2024-01-08T14:30:10Z',
+ sources := {'Sutton-Barto-RL-Book-Ch13', 'spinning-up.openai.com'}
+ })
+ }
+ })
+ }
+ };
+ # Insert chat histories for Bob
+ update User
+ filter .name = 'bob'
+ set {
+ chats := {
+ (insert Chat {
+ messages := {
+ (insert Message {
+ role := 'user',
+ body := 'What are the pros and cons of different sharding strategies?',
+                timestamp := <datetime>'2024-01-05T16:15:00Z',
+ sources := {'martin-kleppmann-ddia-ch6', 'aws.amazon.com/sharding-patterns'}
+ }),
+ (insert Message {
+ role := 'assistant',
+ body := 'The main sharding strategies include range-based, hash-based, and directory-based sharding...',
+                timestamp := <datetime>'2024-01-05T16:15:08Z',
+ sources := {'martin-kleppmann-ddia-ch6', 'mongodb.com/docs/sharding'}
+ }),
+ (insert Message {
+ role := 'user',
+ body := 'Could you elaborate on hash-based sharding?',
+                timestamp := <datetime>'2024-01-05T16:16:00Z',
+ sources := {'mongodb.com/docs/sharding'}
+ })
+ }
+ })
+ }
+ };
+ EOF
+
+
+.. edb:split-section::
+
+ This created the ``app/sample_data/inserts.edgeql`` file, which we can now execute
+ using the CLI like this:
+
+ .. code-block:: bash
+
+ $ gel query -f app/sample_data/inserts.edgeql
+
+ {"id": "862de904-de39-11ef-9713-4fab09220c4a"}
+ {"id": "862e400c-de39-11ef-9713-2f81f2b67013"}
+ {"id": "862de904-de39-11ef-9713-4fab09220c4a"}
+ {"id": "862e400c-de39-11ef-9713-2f81f2b67013"}
+
+
+.. edb:split-section::
+
+ The :ref:`gel query ` command is one of many ways we can
+ execute a query in Gel. Now that we've done it, there's stuff in the database.
+ Let's verify it by running:
+
+ .. code-block:: bash
+
+ $ gel query "select User { name };"
+
+ {"name": "alice"}
+ {"name": "bob"}
+
+
+Writing queries
+---------------
+
+With schema in place, it's time to focus on getting the data in and out of the
+database.
+
+In this tutorial we're going to write queries using :ref:`EdgeQL
+` and then use :ref:`codegen ` to
+generate typesafe functions that we can plug directly into our Python code. If
+you are completely unfamiliar with EdgeQL, now is a good time to check out the
+basics before proceeding.
+
+
+.. edb:split-section::
+
+ Let's move on. First, we'll create a directory inside ``app`` called
+ ``queries``. This is where we're going to put all of the EdgeQL-related stuff.
+
+ We're going to start by writing a query that fetches all of the users. In
+ ``queries`` create a file named ``get_users.edgeql`` and put the following query
+ in there:
+
+ .. code-block:: edgeql
+ :caption: app/queries/get_users.edgeql
+
+ select User { name };
+
+
+.. edb:split-section::
+
+ Now run the code generator from the shell:
+
+ .. code-block:: bash
+
+ $ gel-py
+
+
+.. edb:split-section::
+
+    It's going to automatically locate the ``.edgeql`` file and generate types for
+    it. We can inspect the generated code in ``app/queries/get_users_async_edgeql.py``.
+ Once that is done, let's use those types to create the endpoint in ``main.py``:
+
+ .. code-block:: python
+ :caption: app/main.py
+
+        from gel import create_async_client
+ from .queries.get_users_async_edgeql import get_users as get_users_query, GetUsersResult
+
+
+ gel_client = create_async_client()
+
+ @app.get("/users")
+ async def get_users() -> list[GetUsersResult]:
+ return await get_users_query(gel_client)
+
+
+.. edb:split-section::
+
+    Let's verify that it works as expected:
+
+ .. code-block:: bash
+
+ $ curl -X 'GET' \
+ 'http://127.0.0.1:8000/users' \
+ -H 'accept: application/json'
+
+ [
+ {
+ "id": "862de904-de39-11ef-9713-4fab09220c4a",
+ "name": "alice"
+ },
+ {
+ "id": "862e400c-de39-11ef-9713-2f81f2b67013",
+ "name": "bob"
+ }
+ ]
+
+
+.. edb:split-section::
+
+ While we're at it, let's also implement the option to fetch a user by their
+ username. In order to do that, we need to write a new query in a separate file
+ ``app/queries/get_user_by_name.edgeql``:
+
+ .. code-block:: edgeql
+ :caption: app/queries/get_user_by_name.edgeql
+
+        select User { name }
+        filter .name = <str>$name;
+
+
+.. edb:split-section::
+
+    After that, we will run the code generator again by calling ``gel-py``. In the
+    app, we are going to reuse the same endpoint that fetches the list of all users.
+    From now on, if the user calls it without any arguments (e.g.
+    ``http://127.0.0.1:8000/users``), they are going to receive the list of all
+    users, same as before. But if they pass a username as a query argument like
+    this: ``http://127.0.0.1:8000/users?username=bob``, the system will attempt to
+    fetch a user named ``bob``.
+
+    In order to achieve this, we're going to need to add a ``Query``-type argument
+    to our endpoint function. You can learn more about how to configure this type
+    of argument in `FastAPI's docs
+    <https://fastapi.tiangolo.com/tutorial/query-params-str-validations/>`_. Its
+    default value is going to be ``None``, which will enable us to implement our
+    conditional logic:
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ from fastapi import Query, HTTPException
+ from http import HTTPStatus
+ from .queries.get_user_by_name_async_edgeql import (
+ get_user_by_name as get_user_by_name_query,
+ GetUserByNameResult,
+ )
+
+
+ @app.get("/users")
+ async def get_users(
+ username: str = Query(None),
+ ) -> list[GetUsersResult] | GetUserByNameResult:
+ """List all users or get a user by their username"""
+ if username:
+ user = await get_user_by_name_query(gel_client, name=username)
+ if not user:
+ raise HTTPException(
+ HTTPStatus.NOT_FOUND,
+ detail={"error": f"Error: user {username} does not exist."},
+ )
+ return user
+ else:
+ return await get_users_query(gel_client)
+
+
+.. edb:split-section::
+
+ And once again, let's verify that everything works:
+
+ .. code-block:: bash
+
+ $ curl -X 'GET' \
+ 'http://127.0.0.1:8000/users?username=alice' \
+ -H 'accept: application/json'
+
+ {
+ "id": "862de904-de39-11ef-9713-4fab09220c4a",
+ "name": "alice"
+ }
+
+
+.. edb:split-section::
+
+ Finally, let's also implement the option to add a new user. For this, just as
+ before, we'll create a new file ``app/queries/create_user.edgeql``, add a query
+ to it and run code generation.
+
+    Note that in this query we've wrapped the ``insert`` in a ``select`` statement.
+    This is a common pattern in EdgeQL that can be used whenever you'd like to get
+    back something other than the object's ID from an insert.
+
+ .. code-block:: edgeql
+ :caption: app/queries/create_user.edgeql
+
+        select (
+            insert User {
+                name := <str>$username
+            }
+        ) {
+            name
+        }
+
+
+
+.. edb:split-section::
+
+ In order to integrate this query into our app, we're going to add a new
+ endpoint. Note that this one has the same name ``/users``, but is for the POST
+ HTTP method.
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ from gel import ConstraintViolationError
+ from .queries.create_user_async_edgeql import (
+ create_user as create_user_query,
+ CreateUserResult,
+ )
+
+ @app.post("/users", status_code=HTTPStatus.CREATED)
+ async def post_user(username: str = Query()) -> CreateUserResult:
+ try:
+ return await create_user_query(gel_client, username=username)
+ except ConstraintViolationError:
+ raise HTTPException(
+ status_code=HTTPStatus.BAD_REQUEST,
+ detail={"error": f"Username '{username}' already exists."},
+ )
+
+
+.. edb:split-section::
+
+ Once more, let's verify that the new endpoint works as expected:
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://127.0.0.1:8000/users?username=charlie' \
+ -H 'accept: application/json' \
+ -d ''
+
+ {
+ "id": "20372a1a-ded5-11ef-9a08-b329b578c45c",
+ "name": "charlie"
+ }
+
+
+.. edb:split-section::
+
+ This wraps things up for our user-related functionality. Of course, we now need
+ to deal with Chats and Messages, too. We're not going to go in depth for those,
+ since the process would be quite similar to what we've just done. Instead, feel
+ free to implement those endpoints yourself as an exercise, or copy the code
+    below if you are in a rush.
+
+ .. code-block:: bash
+ :class: collapsible
+
+        $ echo 'select Chat {
+            messages: { role, body, sources },
+        } filter .<chats[is User].name = <str>$username;' > app/queries/get_chats.edgeql && echo 'select Chat {
+            messages: { role, body, sources },
+        } filter .<chats[is User].name = <str>$username and .id = <uuid>$chat_id;' > app/queries/get_chat_by_id.edgeql && echo 'with new_chat := (insert Chat)
+        select (
+            update User filter .name = <str>$username
+            set {
+                chats := assert_distinct(.chats union new_chat)
+            }
+        ) {
+            new_chat_id := new_chat.id
+        }' > app/queries/create_chat.edgeql && echo 'with
+            user := (select User filter .name = <str>$username),
+            chat := (
+                select Chat filter .<chats[is User] = user and .id = <uuid>$chat_id
+            )
+        select Message {
+            role,
+            body,
+            sources,
+        } filter .<messages[is Chat] = chat;' > app/queries/get_messages.edgeql && echo 'with
+            user := (select User filter .name = <str>$username)
+        update Chat
+        filter .id = <uuid>$chat_id and .<chats[is User] = user
+        set {
+            messages += (select (insert Message {
+                role := <str>$message_role,
+                body := <str>$message_body,
+                sources := array_unpack(<array<str>>$sources)
+            }))
+        }' > app/queries/add_message.edgeql
+
+
+.. edb:split-section::
+
+ And these are the endpoint definitions, provided in bulk.
+
+ .. code-block:: python
+ :caption: app/main.py
+ :class: collapsible
+
+ from .queries.get_chats_async_edgeql import get_chats as get_chats_query, GetChatsResult
+ from .queries.get_chat_by_id_async_edgeql import (
+ get_chat_by_id as get_chat_by_id_query,
+ GetChatByIdResult,
+ )
+ from .queries.get_messages_async_edgeql import (
+ get_messages as get_messages_query,
+ GetMessagesResult,
+ )
+ from .queries.create_chat_async_edgeql import (
+ create_chat as create_chat_query,
+ CreateChatResult,
+ )
+ from .queries.add_message_async_edgeql import (
+ add_message as add_message_query,
+ )
+
+
+ @app.get("/chats")
+ async def get_chats(
+ username: str = Query(), chat_id: str = Query(None)
+ ) -> list[GetChatsResult] | GetChatByIdResult:
+ """List user's chats or get a chat by username and id"""
+ if chat_id:
+ chat = await get_chat_by_id_query(
+ gel_client, username=username, chat_id=chat_id
+ )
+ if not chat:
+ raise HTTPException(
+ HTTPStatus.NOT_FOUND,
+ detail={"error": f"Chat {chat_id} for user {username} does not exist."},
+ )
+ return chat
+ else:
+ return await get_chats_query(gel_client, username=username)
+
+
+ @app.post("/chats", status_code=HTTPStatus.CREATED)
+ async def post_chat(username: str) -> CreateChatResult:
+ return await create_chat_query(gel_client, username=username)
+
+
+ @app.get("/messages")
+ async def get_messages(
+ username: str = Query(), chat_id: str = Query()
+ ) -> list[GetMessagesResult]:
+ """Fetch all messages from a chat"""
+ return await get_messages_query(gel_client, username=username, chat_id=chat_id)
+
+
+.. edb:split-section::
+
+ For the ``post_messages`` function we're going to do something a little bit
+ different though. Since this is now the primary way for the user to add their
+    queries to the system, it functionally supersedes the ``/search`` endpoint we
+ made before. To this end, this function is where we're going to handle saving
+ messages, retrieving chat history, invoking web search and generating the
+ answer.
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ - @app.post("/search")
+ - async def search(search_terms: SearchTerms) -> SearchResult:
+ - web_sources = await search_web(search_terms.query)
+ - search_result = await generate_answer(search_terms.query, web_sources)
+ - return search_result
+
+ + @app.post("/messages", status_code=HTTPStatus.CREATED)
+ + async def post_messages(
+ + search_terms: SearchTerms,
+ + username: str = Query(),
+ + chat_id: str = Query(),
+ + ) -> SearchResult:
+ + chat_history = await get_messages_query(
+ + gel_client, username=username, chat_id=chat_id
+ + )
+
+ + _ = await add_message_query(
+ + gel_client,
+ + username=username,
+ + message_role="user",
+ + message_body=search_terms.query,
+ + sources=[],
+ + chat_id=chat_id,
+ + )
+
+ + search_query = search_terms.query
+ + web_sources = await search_web(search_query)
+
+ + search_result = await generate_answer(
+ + search_terms.query, chat_history, web_sources
+ + )
+
+ + _ = await add_message_query(
+ + gel_client,
+ + username=username,
+ + message_role="assistant",
+ + message_body=search_result.response,
+    +         sources=[s.url for s in search_result.sources if s.url],
+ + chat_id=chat_id,
+ + )
+
+ + return search_result
+
+
+.. edb:split-section::
+
+ Let's not forget to modify the ``generate_answer`` function, so it can also be
+ history-aware.
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ async def generate_answer(
+ query: str,
+ + chat_history: list[GetMessagesResult],
+ web_sources: list[WebSource],
+ ) -> SearchResult:
+ system_prompt = (
+ "You are a helpful assistant that answers user's questions"
+ + " by finding relevant information in HackerNews threads."
+ + " When answering the question, describe conversations that people have around the subject,"
+ + " provided to you as a context, or say i don't know if they are completely irrelevant."
+ )
+
+ prompt = f"User search query: {query}\n\nWeb search results:\n"
+
+ for i, source in enumerate(web_sources):
+ prompt += f"Result {i} (URL: {source.url}):\n"
+ prompt += f"{source.text}\n\n"
+
+ - messages = [{"role": "user", "content": prompt}]
+ + messages = [
+ + {"role": message.role, "content": message.body} for message in chat_history
+ + ]
+ + messages.append({"role": "user", "content": prompt})
+
+ llm_response = get_llm_completion(
+ system_prompt=system_prompt,
+ messages=messages,
+ )
+
+ search_result = SearchResult(
+ response=llm_response,
+ sources=web_sources,
+ )
+
+ return search_result
+
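+ To make the history plumbing concrete, here's a tiny self-contained sketch of
+ how stored messages map onto the chat-completions message list. The ``Msg``
+ class is a hypothetical stand-in for the generated ``GetMessagesResult`` type:
+
+ .. code-block:: python
+
+     from dataclasses import dataclass
+
+     @dataclass
+     class Msg:  # hypothetical stand-in for the codegen'd GetMessagesResult
+         role: str
+         body: str
+
+     def build_messages(chat_history: list[Msg], prompt: str) -> list[dict]:
+         """History goes first, then the freshly built prompt as the final user turn."""
+         messages = [{"role": m.role, "content": m.body} for m in chat_history]
+         messages.append({"role": "user", "content": prompt})
+         return messages
+
+     history = [Msg("user", "best database?"), Msg("assistant", "People mention Gel.")]
+     out = build_messages(history, "User search query: gel\n\nWeb search results:\n")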
+
+.. edb:split-section::
+
+ Ok, this should be it for setting up the chat history. Let's test it. First, we
+ are going to start a new chat for our user:
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://127.0.0.1:8000/chats?username=charlie' \
+ -H 'accept: application/json' \
+ -d ''
+
+ {
+ "id": "20372a1a-ded5-11ef-9a08-b329b578c45c",
+ "new_chat_id": "544ef3f2-ded8-11ef-ba16-f7f254b95e36"
+ }
+
+
+.. edb:split-section::
+
+ Next, let's add a couple of messages and wait for the bot to respond:
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "best database in existence"
+ }'
+
+ $ curl -X 'POST' \
+ 'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "gel"
+ }'
+
+
+.. edb:split-section::
+
+ Finally, let's check that the messages we saw are in fact stored in the chat
+ history:
+
+ .. code-block:: bash
+
+ $ curl -X 'GET' \
+ 'http://127.0.0.1:8000/messages?username=charlie&chat_id=544ef3f2-ded8-11ef-ba16-f7f254b95e36' \
+ -H 'accept: application/json'
+
+
+In reality this workflow would be handled by the frontend, providing the
+user with a nice interface to interact with. But even without one, our chatbot
+is almost fully functional by now.
+
+Generating a Google search query
+--------------------------------
+
+Congratulations! We just got done implementing multi-turn conversations for our
+search bot.
+
+However, there's still one crucial piece missing. Right now we're simply
+forwarding the user's message straight to the full-text search. But what happens
+if their message is a follow-up that cannot be used as a standalone search
+query?
+
+Ideally, we should infer the search query from the entire conversation and use
+that to perform the search.
+
+Let's implement an extra step in which the LLM is going to produce a query for
+us based on the entire chat history. That way we can be sure we're progressively
+working on our query rather than rewriting it from scratch every time.
+
+
+.. edb:split-section::
+
+ This is what we need to do: every time the user submits a message, we need to
+ fetch the chat history and extract a search query from it using the LLM; the
+ other steps are going to be the same as before. Let's make the following
+ modifications to ``main.py``: first, we need to create a function that
+ prepares LLM inputs for the search query inference.
+
+
+ .. code-block:: python
+ :caption: app/main.py
+
+ async def generate_search_query(
+ query: str, message_history: list[GetMessagesResult]
+ ) -> str:
+ system_prompt = (
+ "You are a helpful assistant."
+ + " Your job is to extract a keyword search query"
+ + " from a chat between an AI and a human."
+ + " Make sure it's a single most relevant keyword to maximize matching."
+ + " Only provide the query itself as your response."
+ )
+
+ formatted_history = "\n---\n".join(
+ [
+ f"{message.role}: {message.body} (sources: {message.sources})"
+ for message in message_history
+ ]
+ )
+ prompt = f"Chat history: {formatted_history}\n\nUser message: {query} \n\n"
+
+ llm_response = get_llm_completion(
+ system_prompt=system_prompt, messages=[{"role": "user", "content": prompt}]
+ )
+
+ return llm_response
+
+
+.. edb:split-section::
+
+ And now we can use this function in ``post_messages`` in order to get our
+ search query:
+
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ class SearchResult(BaseModel):
+ response: str | None = None
+ + search_query: str | None = None
+ sources: list[WebSource] | None = None
+
+
+ @app.post("/messages", status_code=HTTPStatus.CREATED)
+ async def post_messages(
+ search_terms: SearchTerms,
+ username: str = Query(),
+ chat_id: str = Query(),
+ ) -> SearchResult:
+ # 1. Fetch chat history
+ chat_history = await get_messages_query(
+ gel_client, username=username, chat_id=chat_id
+ )
+
+ # 2. Add incoming message to Gel
+ _ = await add_message_query(
+ gel_client,
+ username=username,
+ message_role="user",
+ message_body=search_terms.query,
+ sources=[],
+ chat_id=chat_id,
+ )
+
+ # 3. Generate a query and perform googling
+ - search_query = search_terms.query
+ + search_query = await generate_search_query(search_terms.query, chat_history)
+ + web_sources = await search_web(search_query)
+
+
+ # 5. Generate answer
+ search_result = await generate_answer(
+ search_terms.query,
+ chat_history,
+ web_sources,
+ )
+ + search_result.search_query = search_query # add search query to the output
+ + # to see what the bot is searching for
+ # 6. Add LLM response to Gel
+ _ = await add_message_query(
+ gel_client,
+ username=username,
+ message_role="assistant",
+ message_body=search_result.response,
+ sources=[s.url for s in search_result.sources],
+ chat_id=chat_id,
+ )
+
+ # 7. Send result back to the client
+ return search_result
+
+
+.. edb:split-section::
+
+ Done! We've now fully integrated the chat history into our app and enabled
+ natural language conversations. As before, let's quickly test out the
+ improvements before moving on:
+
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://localhost:8000/messages?username=alice&chat_id=d4eed420-e903-11ef-b8a7-8718abdafbe1' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "what are people saying about gel"
+ }'
+
+ $ curl -X 'POST' \
+ 'http://localhost:8000/messages?username=alice&chat_id=d4eed420-e903-11ef-b8a7-8718abdafbe1' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "do they like it or not"
+ }'
+
+
+6. Use Gel's advanced features to create a RAG
+==============================================
+
+At this point we have a decent search bot that can refine a search query over
+multiple turns of a conversation.
+
+It's time to add the final touch: we can make the bot remember previous similar
+interactions with the user using retrieval-augmented generation (RAG).
+
+To achieve this we need to implement similarity search across message history:
+we're going to create a vector embedding for every message in the database using
+a neural network. Every time we generate a Google search query, we're also going
+ to use it to search for similar messages in the user's message history, and inject
+the corresponding chat into the prompt. That way the search bot will be able to
+quickly "remember" similar interactions with the user and use them to understand
+what they are looking for.
+
+Gel enables us to implement such a system with only minor modifications to the
+schema.
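+
+Under the hood, similarity search ranks items by the distance between their
+embedding vectors. As a purely illustrative sketch (real embeddings have
+hundreds of dimensions; the toy vectors below are made up), this is the kind of
+cosine-distance ranking such an index enables:
+
+.. code-block:: python
+
+    import math
+
+    def cosine_distance(a: list[float], b: list[float]) -> float:
+        """0.0 for vectors pointing the same way, up to 2.0 for opposite ones."""
+        dot = sum(x * y for x, y in zip(a, b))
+        norm_a = math.sqrt(sum(x * x for x in a))
+        norm_b = math.sqrt(sum(x * x for x in b))
+        return 1.0 - dot / (norm_a * norm_b)
+
+    # Similar directions give a small distance; orthogonal ones a larger one.
+    assert cosine_distance([1.0, 0.0], [2.0, 0.0]) < 1e-9
+    assert abs(cosine_distance([1.0, 0.0], [0.0, 1.0]) - 1.0) < 1e-9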
+
+
+.. edb:split-section::
+
+ We begin by enabling the ``ai`` extension, adding the following line at the
+ top of ``dbschema/default.esdl``:
+
+ .. code-block:: sdl-diff
+ :caption: dbschema/default.esdl
+
+ + using extension ai;
+
+
+.. edb:split-section::
+
+ ... and do the migration:
+
+
+ .. code-block:: bash
+
+ $ gel migration create
+ $ gel migrate
+
+
+.. edb:split-section::
+
+ Next, we need to configure the API key in Gel for whatever embedding provider
+ we're going to be using. As per the documentation, let's open up the CLI by typing
+ ``gel`` and run the following command (assuming we're using OpenAI):
+
+ .. code-block:: edgeql-repl
+
+ searchbot:main> configure current database
+ insert ext::ai::OpenAIProviderConfig {
+ secret := 'sk-....',
+ };
+
+ OK: CONFIGURE DATABASE
+
+
+.. edb:split-section::
+
+ In order to get Gel to automatically create and update message embeddings,
+ all we need to do is add a deferred index like the one below.
+ Don't forget to run a migration one more time!
+
+ .. code-block:: sdl-diff
+
+ type Message {
+ role: str;
+ body: str;
+ timestamp: datetime {
+ default := datetime_current();
+ }
+ multi sources: str;
+
+ + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ + on (.body);
+ }
+
+
+.. edb:split-section::
+
+ And we're done! Gel is going to cook in the background for a while and generate
+ embedding vectors for our queries. To make sure nothing broke, we can follow
+ Gel's AI documentation and take a look at instance logs:
+
+ .. code-block:: bash
+
+ $ gel instance logs -I searchbot | grep api.openai.com
+
+ INFO 50121 searchbot 2025-01-30T14:39:53.364 httpx: HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
+
+
+.. edb:split-section::
+
+ It's time to create the second half of the similarity search - the search query.
+ The query needs to fetch the ``k`` chats that contain messages most similar to
+ our current message. This can be a little difficult to visualize in your head,
+ so here's the query itself:
+
+ .. code-block:: edgeql
+ :caption: app/queries/search_chats.edgeql
+
+ with
+ user := (select User filter .name = <str>$username),
+ chats := (
+ select Chat
+ filter .<chats[is User] = user and .id != <uuid>$current_chat_id
+ )
+
+ select chats {
+ distance := min(
+ ext::ai::search(
+ .messages,
+ <array<float32>>$embedding,
+ ).distance,
+ ),
+ messages: {
+ role, body, sources
+ }
+ }
+
+ order by .distance
+ limit <int64>$limit;
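+
+ The ``min(...)`` aggregation in the query ranks each chat by its single closest
+ message. In plain Python, the same ranking logic looks like this (the
+ per-message distances below are hypothetical):
+
+ .. code-block:: python
+
+     # Hypothetical embedding distances between each chat's messages and the query.
+     chats = {
+         "chat_a": [0.42, 0.13, 0.77],
+         "chat_b": [0.35, 0.58],
+     }
+
+     # Rank chats by their single closest message, mirroring the
+     # min(ext::ai::search(...).distance) expression in the EdgeQL query.
+     ranked = sorted(chats, key=lambda c: min(chats[c]))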
+
+
+.. edb:split-section::
+
+ .. note::
+
+ Before we can integrate this query into our Python app, we also need to add a
+ new dependency for the Python binding: ``httpx-sse``. It enables streaming
+ outputs, which we're not going to use right now, but we won't be able to
+ create the AI client without it.
+
+ Let's place it in ``app/queries/search_chats.edgeql``, run the codegen and modify
+ our ``post_messages`` endpoint to keep track of those similar chats.
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ + from edgedb.ai import create_async_ai, AsyncEdgeDBAI
+ + from .queries.search_chats_async_edgeql import (
+ + search_chats as search_chats_query,
+ + )
+
+ class SearchResult(BaseModel):
+ response: str | None = None
+ search_query: str | None = None
+ sources: list[WebSource] | None = None
+ + similar_chats: list[str] | None = None
+
+
+ @app.post("/messages", status_code=HTTPStatus.CREATED)
+ async def post_messages(
+ search_terms: SearchTerms,
+ username: str = Query(),
+ chat_id: str = Query(),
+ ) -> SearchResult:
+ # 1. Fetch chat history
+ chat_history = await get_messages_query(
+ gel_client, username=username, chat_id=chat_id
+ )
+
+ # 2. Add incoming message to Gel
+ _ = await add_message_query(
+ gel_client,
+ username=username,
+ message_role="user",
+ message_body=search_terms.query,
+ sources=[],
+ chat_id=chat_id,
+ )
+
+ # 3. Generate a query and perform googling
+ search_query = await generate_search_query(search_terms.query, chat_history)
+ web_sources = await search_web(search_query)
+
+ + # 4. Fetch similar chats
+ + db_ai: AsyncEdgeDBAI = await create_async_ai(gel_client, model="gpt-4o-mini")
+ + embedding = await db_ai.generate_embeddings(
+ + search_query, model="text-embedding-3-small"
+ + )
+ + similar_chats = await search_chats_query(
+ + gel_client,
+ + username=username,
+ + current_chat_id=chat_id,
+ + embedding=embedding,
+ + limit=1,
+ + )
+
+ # 5. Generate answer
+ search_result = await generate_answer(
+ search_terms.query,
+ chat_history,
+ web_sources,
+ + similar_chats,
+ )
+ search_result.search_query = search_query # add search query to the output
+ # to see what the bot is searching for
+ # 6. Add LLM response to Gel
+ _ = await add_message_query(
+ gel_client,
+ username=username,
+ message_role="assistant",
+ message_body=search_result.response,
+ sources=[s.url for s in search_result.sources],
+ chat_id=chat_id,
+ )
+
+ # 7. Send result back to the client
+ return search_result
+
+
+.. edb:split-section::
+
+ Finally, the answer generator needs to get updated one more time, since we need
+ to inject the additional messages into the prompt.
+
+ .. code-block:: python-diff
+ :caption: app/main.py
+
+ async def generate_answer(
+ query: str,
+ chat_history: list[GetMessagesResult],
+ web_sources: list[WebSource],
+ + similar_chats: list[list[GetMessagesResult]],
+ ) -> SearchResult:
+ system_prompt = (
+ "You are a helpful assistant that answers user's questions"
+ + " by finding relevant information in HackerNews threads."
+ + " When answering the question, describe conversations that people have around the subject, provided to you as a context, or say i don't know if they are completely irrelevant."
+ + + " You can reference previous conversations with the user that"
+ + + " are provided to you, if they are relevant, by explicitly referring"
+ + + " to them by saying as we discussed in the past."
+ )
+
+ prompt = f"User search query: {query}\n\nWeb search results:\n"
+
+ for i, source in enumerate(web_sources):
+ prompt += f"Result {i} (URL: {source.url}):\n"
+ prompt += f"{source.text}\n\n"
+
+ + prompt += "Similar chats with the same user:\n"
+
+ + formatted_chats = []
+ + for i, chat in enumerate(similar_chats):
+ + formatted_chat = f"Chat {i}: \n"
+ + for message in chat.messages:
+ + formatted_chat += f"{message.role}: {message.body}\n"
+ + formatted_chats.append(formatted_chat)
+
+ + prompt += "\n".join(formatted_chats)
+
+ messages = [
+ {"role": message.role, "content": message.body} for message in chat_history
+ ]
+ messages.append({"role": "user", "content": prompt})
+
+ llm_response = get_llm_completion(
+ system_prompt=system_prompt,
+ messages=messages,
+ )
+
+ search_result = SearchResult(
+ response=llm_response,
+ sources=web_sources,
+ + similar_chats=formatted_chats,
+ )
+
+ return search_result
+
+
+.. edb:split-section::
+
+ And one last time, let's check to make sure everything works:
+
+ .. code-block:: bash
+
+ $ curl -X 'POST' \
+ 'http://localhost:8000/messages?username=alice&chat_id=d4eed420-e903-11ef-b8a7-8718abdafbe1' \
+ -H 'accept: application/json' \
+ -H 'Content-Type: application/json' \
+ -d '{
+ "query": "remember that cool db i was talking to you about?"
+ }'
+
+
+Keep going!
+===========
+
+This tutorial is over, but this app could surely use many more features!
+
+Basic functionality like deleting messages, a user interface, or real web
+search, sure. But also authentication and access policies -- Gel will let you
+set those up in minutes.
+
+Thanks!
+
+
+
+
+
+
+
diff --git a/docs/ai/guide_edgeql.rst b/docs/ai/guide_edgeql.rst
new file mode 100644
index 00000000000..f21f57157c1
--- /dev/null
+++ b/docs/ai/guide_edgeql.rst
@@ -0,0 +1,286 @@
+.. _ref_ai_guide_edgeql:
+
+=========================
+Guide to Gel AI in EdgeQL
+=========================
+
+:edb-alt-title: How to set up Gel AI in EdgeQL
+
+
+|Gel| AI brings vector search capabilities and retrieval-augmented generation
+directly into the database.
+
+
+Enable and configure the extension
+==================================
+
+.. edb:split-section::
+
+ AI is a |Gel| extension. To enable it, we will need to add the extension
+ to the app’s schema:
+
+ .. code-block:: sdl
+
+ using extension ai;
+
+
+.. edb:split-section::
+
+ |Gel| AI uses external APIs in order to get vectors and LLM completions. For it
+ to work, we need to configure an API provider and specify their API key. Let's
+ open EdgeQL REPL and run the following query:
+
+ .. code-block:: edgeql
+
+ configure current database
+ insert ext::ai::OpenAIProviderConfig {
+ secret := 'sk-....',
+ };
+
+
+Now our |Gel| application can take advantage of OpenAI's API to implement AI
+capabilities.
+
+
+.. note::
+
+ |Gel| AI comes with its own :ref:`UI ` that can
+ be used to configure providers, set up prompts and test them in a sandbox.
+
+
+.. note::
+
+ Most API providers require you to set up an account and charge money for
+ model use.
+
+
+Add vectors and perform similarity search
+=========================================
+
+.. edb:split-section::
+
+ Before we start introducing AI capabilities, let's set up our database with a
+ schema and populate it with some data (we're going to be helping Komi-san keep
+ track of her friends).
+
+ .. code-block:: sdl
+
+ module default {
+ type Friend {
+ required name: str {
+ constraint exclusive;
+ };
+
+ summary: str; # A brief description of personality and role
+ relationship_to_komi: str; # Relationship with Komi
+ defining_trait: str; # Primary character trait or quirk
+ }
+ }
+
+.. edb:split-section::
+
+ Here's a shell command you can paste and run that will populate the
+ database with some sample data.
+
+ .. code-block:: bash
+ :class: collapsible
+
+ $ cat << 'EOF' > populate_db.edgeql
+ insert Friend {
+ name := 'Tadano Hitohito',
+ summary := 'An extremely average high school boy with a remarkable ability to read the atmosphere and understand others\' feelings, especially Komi\'s.',
+ relationship_to_komi := 'First friend and love interest',
+ defining_trait := 'Perceptiveness',
+ };
+
+ insert Friend {
+ name := 'Osana Najimi',
+ summary := 'An extremely outgoing person who claims to have been everyone\'s childhood friend. Gender: Najimi.',
+ relationship_to_komi := 'Second friend and social catalyst',
+ defining_trait := 'Universal childhood friend',
+ };
+
+ insert Friend {
+ name := 'Yamai Ren',
+ summary := 'An intense and sometimes obsessive classmate who is completely infatuated with Komi.',
+ relationship_to_komi := 'Self-proclaimed guardian and admirer',
+ defining_trait := 'Obsessive devotion',
+ };
+
+ insert Friend {
+ name := 'Katai Makoto',
+ summary := 'An intimidating-looking but shy student who shares many communication problems with Komi.',
+ relationship_to_komi := 'Fellow communication-challenged friend',
+ defining_trait := 'Scary appearance but gentle nature',
+ };
+
+ insert Friend {
+ name := 'Nakanaka Omoharu',
+ summary := 'A self-proclaimed wielder of dark powers who acts like an anime character and is actually just a regular gaming enthusiast.',
+ relationship_to_komi := 'Gaming buddy and chuunibyou friend',
+ defining_trait := 'Chuunibyou tendencies',
+ };
+ EOF
+ $ gel query -f populate_db.edgeql
+
+
+.. edb:split-section::
+
+ In order to get |Gel| to produce embedding vectors, we need to create a special
+ ``deferred index`` on the type we would like to perform similarity search on.
+ More specifically, we need to specify an EdgeQL expression that produces a
+ string that we're going to create an embedding vector for. This is how we would
+ set up an index if we wanted to perform similarity search on
+ ``Friend.summary``:
+
+ .. code-block:: sdl-diff
+
+ module default {
+ type Friend {
+ required name: str {
+ constraint exclusive;
+ };
+
+ summary: str; # A brief description of personality and role
+ relationship_to_komi: str; # Relationship with Komi
+ defining_trait: str; # Primary character trait or quirk
+
+ + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ + on (.summary);
+ }
+ }
+
+
+.. edb:split-section::
+
+ In our case, though, it would be better to run similarity search across all
+ properties at the same time. We can define the index on a more complex
+ expression - such as a concatenation of string properties - like this:
+
+
+ .. code-block:: sdl-diff
+
+ module default {
+ type Friend {
+ required name: str {
+ constraint exclusive;
+ };
+
+ summary: str; # A brief description of personality and role
+ relationship_to_komi: str; # Relationship with Komi
+ defining_trait: str; # Primary character trait or quirk
+
+ deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ - on (.summary);
+ + on (
+ + .name ++ ' ' ++ .summary ++ ' '
+ + ++ .relationship_to_komi ++ ' '
+ + ++ .defining_trait
+ + );
+ }
+ }
+
+
+.. edb:split-section::
+
+ Once we're done with the schema modifications, we need to apply them by
+ running a migration:
+
+ .. code-block:: bash
+
+ $ gel migration create
+ $ gel migrate
+
+
+.. edb:split-section::
+
+ That's it! |Gel| will make necessary API requests in the background and create an
+ index that will enable us to perform efficient similarity search like this:
+
+ .. code-block:: edgeql
+
+ select ext::ai::search(Friend, query_vector);
+
+
+.. edb:split-section::
+
+ Note that this function accepts an embedding vector as the second argument, not
+ a text string. This means that in order to similarity search for a string, we
+ need to create a vector embedding for it using the same model as we used to
+ create the index. |Gel| offers an HTTP endpoint ``/ai/embeddings`` that can
+ handle it for us. All we need to do is to pass the vector it produces into the
+ search query:
+
+ .. note::
+
+ Note that we're passing our login and password in order to authenticate the
+ request. We can find those using the CLI: ``gel instance credentials
+ --json``. Learn about all the other ways you can authenticate a request
+ :ref:`here `.
+
+ .. code-block:: bash
+
+ $ curl --user user:password \
+     --json '{"input": "Who helps Komi make friends?", "model": "text-embedding-3-small"}' \
+     http://localhost:<port>/branch/main/ai/embeddings |
+     jq -r '.data[0].embedding' |  # extract the embedding out of the JSON
+     tr -d '\n' |                  # remove newlines
+     sed 's/^\[//;s/\]$//' |       # remove square brackets
+     awk '{print "select ext::ai::search(Friend, <array<float32>>[" $0 "]);"}' |  # assemble the query
+     gel query --file -            # pass the query into Gel CLI
+
+
+
+Use the built-in RAG
+====================
+
+One more feature |Gel| AI offers is built-in retrieval-augmented generation, also
+known as RAG.
+
+.. edb:split-section::
+
+ |Gel| comes preconfigured to be able to process our text query, perform
+ similarity search across the index we just created, pass the results to an LLM
+ and return a response. We can access the built-in RAG using the ``/ai/rag``
+ HTTP endpoint:
+
+
+ .. code-block:: bash
+
+ $ curl --user user:password --json '{
+ "query": "Who helps Komi make friends?",
+ "model": "gpt-4-turbo-preview",
+ "context": {"query":"select Friend"}
+ }' http://localhost:<port>/branch/main/ai/rag
+
+
+.. edb:split-section::
+
+ We can also stream the response like this:
+
+
+ .. code-block:: bash-diff
+
+ $ curl --user user:password --json '{
+ "query": "Who helps Komi make friends?",
+ "model": "gpt-4-turbo-preview",
+ - "context": {"query":"select Friend"}
+ + "context": {"query":"select Friend"},
+ + "stream": true
+ }' http://localhost:<port>/branch/main/ai/rag
+
+
+Keep going!
+===========
+
+You are now sufficiently equipped to use |Gel| AI in your applications.
+
+If you'd like to build something on your own, make sure to check out the
+:ref:`Reference manual ` in order to learn the details
+about using different APIs and models, configuring prompts or using the UI.
+Make sure to also check out the |Gel| AI bindings in Python and JavaScript if
+those languages are relevant to you.
+
+And if you would like more guidance for how |Gel| AI can be fit into an
+application, take a look at the FastAPI Gel AI Tutorial, where we're building a
+search bot using features you learned about above.
+
diff --git a/docs/ai/guide_python.rst b/docs/ai/guide_python.rst
new file mode 100644
index 00000000000..aed9004bd66
--- /dev/null
+++ b/docs/ai/guide_python.rst
@@ -0,0 +1,369 @@
+.. _ref_ai_guide_python:
+
+=========================
+Guide to Gel AI in Python
+=========================
+
+:edb-alt-title: How to set up Gel AI in Python
+
+.. edb:split-section::
+
+ |Gel| AI brings vector search capabilities and retrieval-augmented
+ generation directly into the database. It's integrated into the |Gel|
+ Python binding via the ``gel.ai`` module.
+
+ .. code-block:: bash
+
+ $ pip install 'gel[ai]'
+
+
+Enable and configure the extension
+==================================
+
+.. edb:split-section::
+
+ AI is a |Gel| extension. To enable it, we will need to add the extension
+ to the app’s schema:
+
+ .. code-block:: sdl
+
+ using extension ai;
+
+
+.. edb:split-section::
+
+ |Gel| AI uses external APIs in order to get vectors and LLM completions.
+ For it to work, we need to configure an API provider and specify their API
+ key. Let's open EdgeQL REPL and run the following query:
+
+ .. code-block:: edgeql
+
+ configure current database
+ insert ext::ai::OpenAIProviderConfig {
+ secret := 'sk-....',
+ };
+
+
+Now our |Gel| application can take advantage of OpenAI's API to implement AI
+capabilities.
+
+
+.. note::
+
+ |Gel| AI comes with its own :ref:`UI ` that can
+ be used to configure providers, set up prompts and test them in a sandbox.
+
+
+.. note::
+
+ Most API providers require you to set up an account and charge money for
+ model use.
+
+
+Add vectors
+===========
+
+.. edb:split-section::
+
+ Before we start introducing AI capabilities, let's set up our database with a
+ schema and populate it with some data (we're going to be helping Komi-san keep
+ track of her friends).
+
+ .. code-block:: sdl
+
+ module default {
+ type Friend {
+ required name: str {
+ constraint exclusive;
+ };
+
+ summary: str; # A brief description of personality and role
+ relationship_to_komi: str; # Relationship with Komi
+ defining_trait: str; # Primary character trait or quirk
+ }
+ }
+
+.. edb:split-section::
+
+ Here's a shell command you can paste and run that will populate the
+ database with some sample data.
+
+ .. code-block:: bash
+ :class: collapsible
+
+ $ cat << 'EOF' > populate_db.edgeql
+ insert Friend {
+ name := 'Tadano Hitohito',
+ summary := 'An extremely average high school boy with a remarkable ability to read the atmosphere and understand others\' feelings, especially Komi\'s.',
+ relationship_to_komi := 'First friend and love interest',
+ defining_trait := 'Perceptiveness',
+ };
+
+ insert Friend {
+ name := 'Osana Najimi',
+ summary := 'An extremely outgoing person who claims to have been everyone\'s childhood friend. Gender: Najimi.',
+ relationship_to_komi := 'Second friend and social catalyst',
+ defining_trait := 'Universal childhood friend',
+ };
+
+ insert Friend {
+ name := 'Yamai Ren',
+ summary := 'An intense and sometimes obsessive classmate who is completely infatuated with Komi.',
+ relationship_to_komi := 'Self-proclaimed guardian and admirer',
+ defining_trait := 'Obsessive devotion',
+ };
+
+ insert Friend {
+ name := 'Katai Makoto',
+ summary := 'An intimidating-looking but shy student who shares many communication problems with Komi.',
+ relationship_to_komi := 'Fellow communication-challenged friend',
+ defining_trait := 'Scary appearance but gentle nature',
+ };
+
+ insert Friend {
+ name := 'Nakanaka Omoharu',
+ summary := 'A self-proclaimed wielder of dark powers who acts like an anime character and is actually just a regular gaming enthusiast.',
+ relationship_to_komi := 'Gaming buddy and chuunibyou friend',
+ defining_trait := 'Chuunibyou tendencies',
+ };
+ EOF
+ $ gel query -f populate_db.edgeql
+
+
+.. edb:split-section::
+
+ In order to get |Gel| to produce embedding vectors, we need to create a
+ special ``deferred index`` on the type we would like to perform similarity
+ search on. More specifically, we need to specify an EdgeQL expression that
+ produces a string that we're going to create an embedding vector for. This
+ is how we would set up an index if we wanted to perform similarity search
+ on ``Friend.summary``:
+
+ .. code-block:: sdl-diff
+
+ module default {
+ type Friend {
+ required name: str {
+ constraint exclusive;
+ };
+
+ summary: str; # A brief description of personality and role
+ relationship_to_komi: str; # Relationship with Komi
+ defining_trait: str; # Primary character trait or quirk
+
+ + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ + on (.summary);
+ }
+ }
+
+
+.. edb:split-section::
+
+ In our case, though, it would be better to run similarity search across all
+ properties at the same time. We can define the index on a more complex
+ expression - such as a concatenation of string properties - like this:
+
+
+ .. code-block:: sdl-diff
+
+ module default {
+ type Friend {
+ required name: str {
+ constraint exclusive;
+ };
+
+ summary: str; # A brief description of personality and role
+ relationship_to_komi: str; # Relationship with Komi
+ defining_trait: str; # Primary character trait or quirk
+
+ deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ - on (.summary);
+ + on (
+ + .name ++ ' ' ++ .summary ++ ' '
+ + ++ .relationship_to_komi ++ ' '
+ + ++ .defining_trait
+ + );
+ }
+ }
+
+
+.. edb:split-section::
+
+ Once we're done with the schema modifications, we need to apply them by
+ running a migration:
+
+ .. code-block:: bash
+
+ $ gel migration create
+ $ gel migrate
+
+
+That's it! |Gel| will make necessary API requests in the background and create an
+index that will enable us to perform efficient similarity search.
+
+
+Perform similarity search in Python
+===================================
+
+.. edb:split-section::
+
+ In order to run queries against the index we just created, we need to create a
+ |Gel| client and pass it to a |Gel| AI instance.
+
+ .. code-block:: python
+
+ import gel
+ import gel.ai
+
+ gel_client = gel.create_client()
+ gel_ai = gel.ai.create_rag_client(gel_client)
+
+
+.. edb:split-section::
+
+ We are going to execute a query that calls a single function:
+ ``ext::ai::search(<object type>, <embedding vector>)``. That function accepts an
+ embedding vector as the second argument, not a text string. This means that in
+ order to similarity search for a string, we need to create a vector embedding
+ for it using the same model as we used to create the index. The |Gel| AI binding
+ in Python comes with a ``generate_embeddings`` function that does exactly that:
+
+
+ .. code-block:: python-diff
+
+ import gel
+ import gel.ai
+
+ gel_client = gel.create_client()
+ gel_ai = gel.ai.create_rag_client(gel_client)
+
+ + text = "Who helps Komi make friends?"
+ + vector = gel_ai.generate_embeddings(
+ + text,
+ + "text-embedding-3-small",
+ + )
+
+
+.. edb:split-section::
+
+ Now we can plug that vector directly into our query to get similarity search
+ results:
+
+
+ .. code-block:: python-diff
+
+ import gel
+ import gel.ai
+
+ gel_client = gel.create_client()
+ gel_ai = gel.ai.create_rag_client(gel_client)
+
+ text = "Who helps Komi make friends?"
+ vector = gel_ai.generate_embeddings(
+ text,
+ "text-embedding-3-small",
+ )
+
+ + gel_client.query(
+ + "select ext::ai::search(Friend, <array<float32>>$embedding_vector)",
+ + embedding_vector=vector,
+ + )
+
+
+Use the built-in RAG
+====================
+
+One more feature |Gel| AI offers is built-in retrieval-augmented generation,
+also known as RAG.
+
+.. edb:split-section::
+
+ |Gel| comes preconfigured to process our text query, perform similarity
+ search across the index we just created, pass the results to an LLM, and
+ return a response. In order to access the built-in RAG, we start by
+ selecting an LLM and passing its name to the |Gel| AI client constructor:
+
+
+ .. code-block:: python-diff
+
+ import gel
+ import gel.ai
+
+ gel_client = gel.create_client()
+ gel_ai = gel.ai.create_rag_client(
+ gel_client,
+ + model="gpt-4-turbo-preview"
+ )
+
+
+.. edb:split-section::
+
+ Now we can access the RAG using the ``query_rag`` function like this:
+
+
+ .. code-block:: python-diff
+
+ import gel
+ import gel.ai
+
+ gel_client = gel.create_client()
+ gel_ai = gel.ai.create_rag_client(
+ gel_client,
+ model="gpt-4-turbo-preview"
+ )
+
+ + gel_ai.query_rag(
+ + "Who helps Komi make friends?",
+ + context=gel.ai.QueryContext(query="select Friend"),
+ + )
+
+
+.. edb:split-section::
+
+ We can also stream the response like this:
+
+
+ .. code-block:: python-diff
+
+ import gel
+ import gel.ai
+
+ gel_client = gel.create_client()
+ gel_ai = gel.ai.create_rag_client(
+ gel_client,
+ model="gpt-4-turbo-preview"
+ )
+
+ - gel_ai.query_rag(
+ + gel_ai.stream_rag(
+ "Who helps Komi make friends?",
+ context=gel.ai.QueryContext(query="select Friend"),
+ )
+
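+.. edb:split-section::
+
+ Streaming matters because LLM responses arrive token by token, and
+ rendering partial output keeps an app feeling responsive. On the consuming
+ side, the job is just to stitch the chunks together as they arrive. Here is
+ a minimal sketch of that loop, with a stubbed-out generator standing in for
+ ``stream_rag`` (the real call's chunk shape may differ, so treat this as an
+ illustration rather than the binding's exact API):
+
+
+ .. code-block:: python
+
+ def fake_stream():
+     # Stand-in for gel_ai.stream_rag(...): yields text deltas one by one.
+     yield from ["Tadano", " helps", " Komi", " make", " friends."]
+
+ parts = []
+ for chunk in fake_stream():
+     parts.append(chunk)  # in a real app, render each chunk as it arrives
+
+ answer = "".join(parts)
+ print(answer)  # Tadano helps Komi make friends.
+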
+
+Keep going!
+===========
+
+You are now sufficiently equipped to use |Gel| AI in your applications.
+
+If you'd like to build something on your own, make sure to check out the
+:ref:`Reference manual ` in order to learn the details
+about using different APIs and models, configuring prompts or using the UI.
+
+And if you would like more guidance on how |Gel| AI fits into an
+application, take a look at the FastAPI Gel AI tutorial, where we build a
+search bot using the features you learned about above.
+
+
diff --git a/docs/ai/images/ui_playground.png b/docs/ai/images/ui_playground.png
new file mode 100644
index 00000000000..5fa19839aee
Binary files /dev/null and b/docs/ai/images/ui_playground.png differ
diff --git a/docs/ai/images/ui_prompts.png b/docs/ai/images/ui_prompts.png
new file mode 100644
index 00000000000..2a5b2d50784
Binary files /dev/null and b/docs/ai/images/ui_prompts.png differ
diff --git a/docs/ai/images/ui_providers.png b/docs/ai/images/ui_providers.png
new file mode 100644
index 00000000000..cf1adfb6c8e
Binary files /dev/null and b/docs/ai/images/ui_providers.png differ
diff --git a/docs/ai/index.rst b/docs/ai/index.rst
index 88134092fdf..a23c768bd26 100644
--- a/docs/ai/index.rst
+++ b/docs/ai/index.rst
@@ -1,267 +1,39 @@
.. _ref_ai_overview:
-==
-AI
-==
+======
+Gel AI
+======
.. toctree::
:hidden:
:maxdepth: 3
+ quickstart_fastapi_ai
+ reference_extai
+ reference_http
+ reference_python
javascript
- python
- reference
+ guide_edgeql
+ guide_python
:edb-alt-title: Using Gel AI
-|Gel| AI allows you to ship AI-enabled apps with practically no effort. It
-automatically generates embeddings for your data. Works with OpenAI, Mistral
-AI, Anthropic, and any other provider with a compatible API.
+|Gel| AI is a set of tools designed to enable you to ship AI-enabled apps with
+practically no effort. This is what comes in the box:
+1. ``ext::ai``: this Gel extension automatically generates embeddings for your
+ data. Works with OpenAI, Mistral AI, Anthropic, and any other provider with a
+ compatible API.
-Enable extension in your schema
-===============================
+2. ``ext::vectorstore``: this extension is designed to replicate workflows that
+ might be familiar to you from vectorstore-style databases. Powered by
+ ``pgvector``, it allows you to store and search for embedding vectors, and
+ integrates with popular AI frameworks.
-AI is a |Gel| extension. To enable it, you will need to add the extension
-to your app's schema:
+3. Python library: ``gel.ai``. Access all Gel AI features straight from your
+ Python application.
-.. code-block:: sdl
+4. JavaScript library: ``@gel/ai``.
- using extension ai;
-Extension configuration
-=======================
-
-The AI extension may be configured via our UI or via EdgeQL. To use the
-built-in UI, access it by running :gelcmd:`ui`. If you have the extension
-enabled in your schema as shown above and have migrated that schema change, you
-will see the "AI Admin" icon in the left-hand toolbar.
-
-.. image:: images/ui-ai.png
- :alt: The Gel local development server UI highlighting the AI admin
- icon in the left-hand toolbar. The icon is two stars, one larger and
- one smaller, the smaller being a light pink color and the larger
- being a light blue when selected.
- :width: 100%
-
-The default tab "Playground" allows you to test queries against your data after
-you first configure the model, prompt, and context query in the right sidebar.
-
-The "Prompts" tab allows you to configure prompts for use in the playground.
-The "Providers" tab must be configured for the API you want to use for
-embedding generation and querying. We currently support OpenAI, Mistral AI, and
-Anthropic.
-
-
-Configuring a provider
-----------------------
-
-To configure a provider, you will first need to obtain an API key for your
-chosen provider, which you may do from their respective sites:
-
-* `OpenAI API keys `__
-* `Mistral API keys `__
-* `Anthropic API keys `__
-
-With your API key, you may now configure in the UI by clickin the "Add
-Provider" button, selecting the appropriate API, and pasting your key in the
-"Secret" field.
-
-.. image:: images/ui-ai-add-provider.png
- :alt: The "Add Provider" form of the Gel local development server UI.
- On the left, the sidebar navigation for the view showing Playground,
- Prompts, and Providers options, with Provider selected (indicated
- with a purple border on the left). The main content area shows a
- heading Providers with a form under it. The form contains a dropdown
- to select the API. (Anthropic is currently selected.) The form
- contains two fields: an optional Client ID and a Secret. The Secret
- field is filled with your-api-key-here. Under the fields to the
- right, the form has a gray button to cancel and a purple Add Provider
- button.
- :width: 100%
-
-You may alternatively configure a provider via EdgeQL:
-
-.. code-block:: edgeql
-
- configure current branch
- insert ext::ai::OpenAIProviderConfig {
- secret := 'sk-....',
- };
-
-This object has other properties as well, including ``client_id`` and
-``api_url``, which can be set as strings to override the defaults for the
-chosen provider.
-
-We have provider config types for each of the three supported APIs:
-
-* ``OpenAIProviderConfig``
-* ``MistralProviderConfig``
-* ``AnthropicProviderConfig``
-
-
-Usage
-=====
-
-Using |Gel| AI requires some changes to your schema.
-
-
-Add an index
-------------
-
-To start using |Gel| AI on a type, create an index:
-
-.. code-block:: sdl-diff
-
- module default {
- type Astronomy {
- content: str;
- + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
- + on (.content);
- }
- };
-
-In this example, we have added an AI index on the ``Astronomy`` type's
-``content`` property using the ``text-embedding-3-small`` model. Once you have
-the index in your schema, :ref:`create ` and
-:ref:`apply ` your migration, and you're ready
-to start running queries!
-
-.. note::
-
- The particular embedding model we've chosen here
- (``text-embedding-3-small``) is an OpenAI model, so it will require an
- OpenAI provider to be configured as described above.
-
- You may use any of :ref:`our pre-configured embedding generation models
- `.
-
-You may want to include multiple properties in your AI index. Fortunately, you
-can define an AI index on an expression:
-
-.. code-block:: sdl
-
- module default {
- type Astronomy {
- climate: str;
- atmosphere: str;
- deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
- on (.climate ++ ' ' ++ .atmosphere);
- }
- };
-
-.. note:: When AI indexes aren't working…
-
- If you find your queries are not returning the expected results, try
- inspecting your instance logs. On a |Gel| Cloud instance, use the "Logs"
- tab in your instance dashboard. On local or :ref:`CLI-linked remote
- instances `, use :gelcmd:`instance logs -I
- `. You may find the problem there.
-
- Providers impose rate limits on their APIs which can often be the source of
- AI index problems. If index creation hits a rate limit, |Gel| will wait
- the ``indexer_naptime`` (see the docs on :ref:`ext::ai configuration
- `) and resume index creation.
-
- If your indexed property contains values that exceed the token limit for a
- single request, you may consider truncating the property value in your
- index expression. You can do this with a string by slicing it:
-
- .. code-block:: sdl
-
- module default {
- type Astronomy {
- content: str;
- deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
- on (.content[0:10000]);
- }
- };
-
- This example will slice the first 10,000 characters of the ``content``
- property for indexing.
-
- Tokens are not equivalent to characters. For OpenAI embedding generation,
- you may test values via `OpenAI's web-based tokenizer
- `__. You may alternatively download
- the library OpenAI uses for tokenization from that same page if you prefer.
- By testing, you can get an idea how much of your content can be sent for
- indexing.
-
-
-Run a semantic similarity query
--------------------------------
-
-Once your index has been migrated, running a query against the embeddings is
-super simple:
-
-.. code-block:: edgeql
-
- select ext::ai::search(Astronomy, query)
-
-Simple, but you'll still need to generate embeddings from your query or pass in
-existing embeddings. If your ultimate goal is retrieval-augmented generation
-(i.e., RAG), we've got you covered.
-
-.. _ref_ai_overview_rag:
-
-Use RAG via HTTP
-----------------
-
-By making an HTTP request to
-``https://<host>:<port>/branch/<branch-name>/ai/rag``, you can generate
-text via the generative AI API of your choice within the context of a type with
-a deferred embedding index.
-
-.. note::
-
- Making HTTP requests to |Gel| requires :ref:`authentication
- `.
-
-.. code-block:: bash
-
- $ curl --json '{
- "query": "What color is the sky on Mars?",
- "model": "gpt-4-turbo-preview",
- "context": {"query":"select Astronomy"}
- }' https://<host>:<port>/branch/<branch-name>/ai/rag
- {"response": "The sky on Mars is red."}
-
-Since LLMs are often slow, it may be useful to stream the response. To do this,
-add ``"stream": true`` to your request JSON.
-
-.. note::
-
- The particular text generation model we've chosen here
- (``gpt-4-turbo-preview``) is an OpenAI model, so it will require an OpenAI
- provider to be configured as described above.
-
- You may use any of our supported :ref:`text generation models
- `.
-
-
-Use RAG via JavaScript
-----------------------
-
-``@gel/ai`` offers a convenient wrapper around ``ext::ai``. Install it with
-``npm install @gel/ai`` (or via your package manager of choice) and
-implement it like this example:
-
-.. code-block:: typescript
-
- import { createClient } from "gel";
- import { createAI } from "@gel/ai";
-
- const client = createClient();
-
- const gpt4AI = createAI(client, {
- model: "gpt-4-turbo-preview",
- });
-
- const blogAI = gpt4AI.withContext({
- query: "select Astronomy"
- });
-
- console.log(await blogAI.queryRag(
- "What color is the sky on Mars?"
- ));
diff --git a/docs/ai/javascript.rst b/docs/ai/javascript.rst
index e1822023cef..fc5e7f6305f 100644
--- a/docs/ai/javascript.rst
+++ b/docs/ai/javascript.rst
@@ -41,7 +41,7 @@ model):
});
You may use any of the supported :ref:`text generation models
-`. Add your query as context:
+`. Add your query as context:
.. code-block:: typescript
diff --git a/docs/ai/quickstart_fastapi_ai.rst b/docs/ai/quickstart_fastapi_ai.rst
new file mode 100644
index 00000000000..9708e1fc9ae
--- /dev/null
+++ b/docs/ai/quickstart_fastapi_ai.rst
@@ -0,0 +1,346 @@
+.. _ref_quickstart_ai:
+
+======================
+Using the built-in RAG
+======================
+
+.. edb:split-section::
+
+ In this section we'll use |Gel|'s built-in vector search and
+ retrieval-augmented generation capabilities to decorate our flashcard app
+ with a couple of AI features. We're going to create a ``/fetch_similar``
+ endpoint that's going to look up flashcards similar to a text search query,
+ as well as a ``/fetch_rag`` endpoint that's going to enable us to talk to
+ an LLM about the content of our flashcard deck.
+
+ We're going to start with the same schema we left off with in the primary
+ quickstart.
+
+
+ .. code-block:: sdl
+ :caption: dbschema/default.gel
+
+ module default {
+ abstract type Timestamped {
+ required created_at: datetime {
+ default := datetime_of_statement();
+ };
+ required updated_at: datetime {
+ default := datetime_of_statement();
+ };
+ }
+
+ type Deck extending Timestamped {
+ required name: str;
+ description: str;
+ cards := (
+ select .<deck[is Card]
+ order by .order
+ );
+ }
+
+ type Card extending Timestamped {
+ required order: int64;
+ required front: str;
+ required back: str;
+ required deck: Deck;
+ }
+ }
+
+
+.. edb:split-section::
+
+ In order to generate embeddings, we first need to enable the ``ai``
+ extension by adding the following line to the top of the schema file, and
+ then creating and applying a migration:
+
+
+ .. code-block:: sdl
+ :caption: dbschema/default.gel
+
+ using extension ai;
+
+
+.. edb:split-section::
+
+ The extension also needs credentials for the embedding provider. We'll use
+ OpenAI in this quickstart, so grab an API key from your OpenAI account and
+ pass it to the extension's provider config by running the following query:
+
+
+ .. code-block:: edgeql
+
+ configure current database
+ insert ext::ai::OpenAIProviderConfig {
+ secret := 'sk-....',
+ };
+
+
+.. edb:split-section::
+
+ One last thing before we move on. Let's add some sample data to give the
+ embedding model something to work with. You can copy and run this command
+ in the terminal, or come up with your own sample data.
+
+
+ .. code-block:: bash
+ :class: collapsible
+
+ $ cat << 'EOF' | gel query --file -
+ with deck := (
+ insert Deck {
+ name := 'Smelly Cheeses',
+ description := 'To impress everyone with stinky cheese trivia.'
+ }
+ )
+ for card_data in {(
+ 1,
+ 'Époisses de Bourgogne',
+ 'Known as the "king of cheeses", this French cheese is so pungent it\'s banned on public transport in France. Washed in brandy, it becomes increasingly funky as it ages. Orange-red rind, creamy interior.'
+ ), (
+ 2,
+ 'Vieux-Boulogne',
+ 'Officially the smelliest cheese in the world according to scientific studies. This northern French cheese has a reddish-orange rind from being washed in beer. Smooth, creamy texture with a powerful aroma.'
+ ), (
+ 3,
+ 'Durian Cheese',
+ 'This Malaysian creation combines durian fruit with cheese, creating what some consider the ultimate "challenging" dairy product. Combines the pungency of blue cheese with durian\'s notorious aroma.'
+ ), (
+ 4,
+ 'Limburger',
+ 'German cheese famous for its intense smell, often compared to foot odor due to the same bacteria. Despite its reputation, has a surprisingly mild taste with notes of mushroom and grass.'
+ ), (
+ 5,
+ 'Roquefort',
+ 'The "king of blue cheeses", aged in limestone caves in southern France. Contains Penicillium roqueforti mold. Strong, tangy, and salty with a crumbly texture. Legend says it was discovered when a shepherd left his lunch in a cave.'
+ ), (
+ 6,
+ 'What makes washed-rind cheeses so smelly?',
+ 'The process of washing cheese rinds in brine, alcohol, or other solutions promotes the growth of Brevibacterium linens, the same bacteria responsible for human body odor. This bacteria contributes to both the orange color and distinctive aroma.'
+ ), (
+ 7,
+ 'Stinking Bishop',
+ 'Named after the Stinking Bishop pear (not a religious figure). This English cheese is washed in perry made from these pears. Known for its powerful aroma and sticky, pink-orange rind. Gained fame after being featured in Wallace & Gromit.'
+ )}
+ union (
+ insert Card {
+ deck := deck,
+ order := card_data.0,
+ front := card_data.1,
+ back := card_data.2
+ }
+ );
+ EOF
+
+
+.. edb:split-section::
+
+ Now we can finally start producing embedding vectors. Since |Gel| is fully
+ aware of when your data gets inserted, updated and deleted, it's perfectly
+ equipped to handle all the tedious work of keeping those vectors up to
+ date. All that's left for us is to create a special ``deferred index`` on
+ the data we would like to perform similarity search on.
+
+
+ .. code-block:: sdl-diff
+ :caption: dbschema/default.gel
+
+ using extension ai;
+
+ module default {
+ abstract type Timestamped {
+ required created_at: datetime {
+ default := datetime_of_statement();
+ };
+ required updated_at: datetime {
+ default := datetime_of_statement();
+ };
+ }
+
+ type Deck extending Timestamped {
+ required name: str;
+ description: str;
+ cards := (
+ select .<deck[is Card]
+ order by .order
+ );
+ }
+
+ type Card extending Timestamped {
+ required order: int64;
+ required front: str;
+ required back: str;
+ required deck: Deck;
+ + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ + on (.front ++ ' ' ++ .back);
+ }
+ }
+
+
+.. edb:split-section::
+
+ Once the index is in the schema, create and apply a migration. From this
+ point on, |Gel| will generate embeddings for our cards and keep them up to
+ date in the background.
+
+
+ .. code-block:: bash
+
+ $ gel migration create
+ $ gel migrate
+
+
+.. edb:split-section::
+
+ Now let's implement the ``/fetch_similar`` endpoint. We generate an
+ embedding vector for the query string using the same model we used in the
+ index, and then pass that vector to the ``ext::ai::search`` function:
+
+
+ .. code-block:: python-diff
+ :caption: main.py
+
+ import gel
+ import gel.ai
+
+ from fastapi import FastAPI
+
+
+ client = gel.create_async_client()
+
+ app = FastAPI()
+
+
+ + @app.get("/fetch_similar")
+ + async def fetch_similar_cards(query: str):
+ + rag = await gel.ai.create_async_rag_client(client, model="gpt-4-turbo-preview")
+ + embedding_vector = await rag.generate_embeddings(
+ + query, model="text-embedding-3-small"
+ + )
+ +
+ + similar_cards = await client.query(
+ + "select ext::ai::search(Card, <array<float32>>$embedding_vector)",
+ + embedding_vector=embedding_vector,
+ + )
+ +
+ + return similar_cards
+
+
+.. edb:split-section::
+
+ Let's test the endpoint to see that everything works the way we expect.
+
+
+ .. code-block:: bash
+
+ $ curl -X 'GET' \
+ 'http://localhost:8000/fetch_similar?query=the%20stinkiest%20cheese' \
+ -H 'accept: application/json'
+
+
+.. edb:split-section::
+
+ Finally, let's create the second endpoint we mentioned, called
+ ``/fetch_rag``. We'll be able to use this one to, for example, ask an LLM
+ to quiz us on the contents of our deck.
+
+ The RAG feature is represented in the Python binding by the ``query_rag``
+ method of the RAG client. To use it, we're going to create the client and
+ call the method... And that's it!
+
+
+ .. code-block:: python-diff
+ :caption: main.py
+
+ import gel
+ import gel.ai
+
+ from fastapi import FastAPI
+
+
+ client = gel.create_async_client()
+
+ app = FastAPI()
+
+
+ @app.get("/fetch_similar")
+ async def fetch_similar_cards(query: str):
+ rag = await gel.ai.create_async_rag_client(client, model="gpt-4-turbo-preview")
+ embedding_vector = await rag.generate_embeddings(
+ query, model="text-embedding-3-small"
+ )
+
+ similar_cards = await client.query(
+ "select ext::ai::search(Card, <array<float32>>$embedding_vector)",
+ embedding_vector=embedding_vector,
+ )
+
+ return similar_cards
+
+
+ + @app.get("/fetch_rag")
+ + async def fetch_rag_response(query: str):
+ + rag = await gel.ai.create_async_rag_client(client, model="gpt-4-turbo-preview")
+ + response = await rag.query_rag(
+ + message=query,
+ + context=gel.ai.QueryContext(query="select Card"),
+ + )
+ + return response
+
+
+.. edb:split-section::
+
+ Let's test the endpoint to see if it works:
+
+
+ .. code-block:: bash
+
+ $ curl -X 'GET' \
+ 'http://localhost:8000/fetch_rag?query=what%20cheese%20smells%20like%20feet' \
+ -H 'accept: application/json'
+
+
+.. edb:split-section::
+
+ Congratulations! We've now implemented AI features in our flashcards app.
+ Of course, there's more to learn when it comes to using the AI extension.
+ Make sure to check out the Reference manual, or build an LLM-powered search
+ bot from the ground up with the FastAPI Gel AI tutorial.
diff --git a/docs/ai/reference.rst b/docs/ai/reference.rst
deleted file mode 100644
index 85557ff33fb..00000000000
--- a/docs/ai/reference.rst
+++ /dev/null
@@ -1,671 +0,0 @@
-.. _ref_ai_reference:
-
-=======
-ext::ai
-=======
-
-To activate |Gel| AI functionality, you can use the :ref:`extension
-` mechanism:
-
-.. code-block:: sdl
-
- using extension ai;
-
-
-.. _ref_ai_reference_config:
-
-Configuration
-=============
-
-Use the ``configure`` command to set configuration for the AI extension. Update
-the values using the ``configure session`` or the ``configure current branch``
-command depending on the scope you prefer:
-
-.. code-block:: edgeql-repl
-
- db> configure current branch
- ... set ext::ai::Config::indexer_naptime := '0:00:30';
- OK: CONFIGURE DATABASE
-
-The only property available currently is ``indexer_naptime`` which specifies
-the minimum delay between deferred ``ext::ai::index`` indexer runs on any given
-branch.
-
-Examine the ``extensions`` link of the ``cfg::Config`` object to check the
-current config values:
-
-.. code-block:: edgeql-repl
-
- db> select cfg::Config.extensions[is ext::ai::Config]{*};
- {
- ext::ai::Config {
- id: 1a53f942-d7ce-5610-8be2-c013fbe704db,
- indexer_naptime: '0:00:30'
- }
- }
-
-You may also restore the default config value using ``configure session
-reset`` if you set it on the session or ``configure current branch reset``
-if you set it on the branch:
-
-.. code-block:: edgeql-repl
-
- db> configure current branch reset ext::ai::Config::indexer_naptime;
- OK: CONFIGURE DATABASE
-
-
-Providers
----------
-
-Provider configs are required for AI indexes (for embedding generation) and for
-RAG (for text generation). They may be added via :ref:`ref_cli_gel_ui` or by
-via EdgeQL:
-
-.. code-block:: edgeql
-
- configure current branch
- insert ext::ai::OpenAIProviderConfig {
- secret := 'sk-....',
- };
-
-The extension makes available types for each provider and for a custom provider
-compatible with one of the supported API styles.
-
-* ``ext::ai::OpenAIProviderConfig``
-* ``ext::ai::MistralProviderConfig``
-* ``ext::ai::AnthropicProviderConfig``
-* ``ext::ai::CustomProviderConfig``
-
-All provider types require the ``secret`` property be set with a string
-containing the secret provided by the AI vendor. Other properties may
-optionally be set:
-
-* ``name``- A unique provider name
-* ``display_name``- A human-friendly provider name
-* ``api_url``- The provider's API URL
-* ``client_id``- ID for the client provided by model API vendor
-
-In addition to the required ``secret`` property,
-``ext::ai::CustomProviderConfig requires an ``api_style`` property be set.
-Available values are ``ext::ai::ProviderAPIStyle.OpenAI`` and
-``ext::ai::ProviderAPIStyle.Anthropic``.
-
-Prompts
--------
-
-You may add prompts either via :ref:`ref_cli_gel_ui` or via EdgeQL. Here's
-an example of how you might add a prompt with a single message:
-
-.. code-block:: edgeql
-
- insert ext::ai::ChatPrompt {
- name := 'test-prompt',
- messages := (
- insert ext::ai::ChatPromptMessage {
- participant_role := ext::ai::ChatParticipantRole.System,
- content := "Your message content"
- }
- )
- };
-
-``participant_role`` may be any of these values:
-
-* ``ext::ai::ChatParticipantRole.System``
-* ``ext::ai::ChatParticipantRole.User``
-* ``ext::ai::ChatParticipantRole.Assistant``
-* ``ext::ai::ChatParticipantRole.Tool``
-
-``ext::ai::ChatPromptMessage`` also has a ``participant_name`` property which
-is an optional ``str``.
-
-
-.. _ref_guide_ai_reference_index:
-
-Index
-=====
-
-The ``ext::ai::index`` creates a deferred semantic similarity index of an
-expression on a type.
-
-.. code-block:: sdl-diff
-
- module default {
- type Astronomy {
- content: str;
- + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
- + on (.content);
- }
- };
-
-It can accept several named arguments:
-
-* ``embedding_model``- The name of the model to use for embedding generation as
- a string.
-
- .. _ref_ai_reference_embedding_models:
-
- You may use any of these pre-configured embedding generation models:
-
- **OpenAI**
-
- * ``text-embedding-3-small``
- * ``text-embedding-3-large``
- * ``text-embedding-ada-002``
-
- `Learn more about the OpenAI embedding models `__
-
- **Mistral**
-
- * ``mistral-embed``
-
- `Learn more about the Mistral embedding model `__
-* ``distance_function``- The function to use for determining semantic
- similarity. Default: ``ext::ai::DistanceFunction.Cosine``
-
- The distance function may be any of these:
-
- * ``ext::ai::DistanceFunction.Cosine``
- * ``ext::ai::DistanceFunction.InnerProduct``
- * ``ext::ai::DistanceFunction.L2``
-* ``index_type``- The type of index to create. Currently the only option is the
- default: ``ext::ai::IndexType.HNSW``.
-* ``index_parameters``- A named tuple of additional index parameters:
-
- * ``m``- The maximum number of edges of each node in the graph. Increasing
- can increase the accuracy of searches at the cost of index size. Default:
- ``32``
- * ``ef_construction``- Dictates the depth and width of the search when
- building the index. Higher values can lead to better connections and more
- accurate results at the cost of time and resource usage when building the
- index. Default: ``100``
-
-
-When indexes aren't working…
-----------------------------
-
-If you find your queries are not returning the expected results, try
-inspecting your instance logs. On a |Gel| Cloud instance, use the "Logs"
-tab in your instance dashboard. On local or :ref:`CLI-linked remote
-instances `, use :gelcmd:`instance logs -I
-`. You may find the problem there.
-
-Providers impose rate limits on their APIs which can often be the source of
-AI index problems. If index creation hits a rate limit, Gel will wait
-the ``indexer_naptime`` (see the docs on :ref:`ext::ai configuration
-`) and resume index creation.
-
-If your indexed property contains values that exceed the token limit for a
-single request, you may consider truncating the property value in your
-index expression. You can do this with a string by slicing it:
-
-.. code-block:: sdl
-
- module default {
- type Astronomy {
- content: str;
- deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
- on (.content[0:10000]);
- }
- };
-
-This example will slice the first 10,000 characters of the ``content``
-property for indexing.
-
-Tokens are not equivalent to characters. For OpenAI embedding generation,
-you may test values via `OpenAI's web-based tokenizer
-`__. You may alternatively download
-the library OpenAI uses for tokenization from that same page if you prefer.
-By testing, you can get an idea how much of your content can be sent for
-indexing.
-
-
-Functions
-=========
-
-.. list-table::
- :class: funcoptable
-
- * - :eql:func:`ext::ai::to_context`
- - :eql:func-desc:`ext::ai::to_context`
-
- * - :eql:func:`ext::ai::search`
- - :eql:func-desc:`ext::ai::search`
-
-
-------------
-
-
-.. eql:function:: ext::ai::to_context(object: anyobject) -> str
-
- Evaluates the expression of an :ref:`ai::index
- ` on the passed object and returns it.
-
- This can be useful for confirming the basis of embedding generation for a
- particular object or type.
-
- Given this schema:
-
- .. code-block:: sdl
-
- module default {
- type Astronomy {
- topic: str;
- content: str;
- deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
- on (.topic ++ ' ' ++ .content);
- }
- };
-
- and with these inserts:
-
- .. code-block:: edgeql-repl
-
- db> insert Astronomy {
- ... topic := 'Mars',
- ... content := 'Skies on Mars are red.'
- ... }
- db> insert Astronomy {
- ... topic := 'Earth',
- ... content := 'Skies on Earth are blue.'
- ... }
-
- ``to_context`` returns these results:
-
- .. code-block:: edgeql-repl
-
- db> select ext::ai::to_context(Astronomy);
- {'Mars Skies on Mars are red.', 'Earth Skies on Earth are blue.'}
- db> select ext::ai::to_context((select Astronomy limit 1));
- {'Mars Skies on Mars are red.'}
-
-
-------------
-
-
-.. eql:function:: ext::ai::search( \
- object: anyobject, \
- query: array<float32> \
- ) -> optional tuple<object: anyobject, distance: float64>
-
- Search an object using its :ref:`ai::index `
- index.
-
- Returns objects that match the specified semantic query and the
- similarity score.
-
- .. note::
-
- The ``query`` argument should *not* be a textual query but the
- embeddings generated *from* a textual query. To have |Gel| generate
- the query for you along with a text response, try :ref:`our built-in
- RAG `.
-
- .. code-block:: edgeql-repl
-
- db> with query := <array<float32>>$query
- ... select ext::ai::search(Knowledge, query);
- {
- (
- object := default::Knowledge {id: 9af0d0e8-0880-11ef-9b6b-4335855251c4},
- distance := 0.20410746335983276
- ),
- (
- object := default::Knowledge {id: eeacf638-07f6-11ef-b9e9-57078acfce39},
- distance := 0.7843298847773637
- ),
- (
- object := default::Knowledge {id: f70863c6-07f6-11ef-b9e9-3708318e69ee},
- distance := 0.8560434728860855
- ),
- }
-
-
-HTTP endpoints
-==============
-
-Use the AI extension's HTTP endpoints to perform retrieval-augmented generation
-using your AI indexes or to generate embeddings against a model of your choice.
-
-.. note::
-
- All |Gel| server HTTP endpoints require :ref:`authentication
- `. By default, you may use `HTTP Basic Authentication
- `_
- with your Gel username and password.
-
-
-RAG
----
-
-``POST``: ``https://<host>:<port>/branch/<branch-name>/ai/rag``
-
-Responds with text generated by the specified text generation model in response
-to the provided query.
-
-
-Request
-^^^^^^^
-
-Make a ``POST`` request to the endpoint with a JSON body. The body may have
-these properties:
-
-* ``model`` (string, required): The name of the text generation model to use.
-
- .. _ref_ai_reference_text_generation_models:
-
- You may use any of these text generation models:
-
- **OpenAI**
-
- * ``gpt-3.5-turbo``
- * ``gpt-4-turbo-preview``
-
- `Learn more about the OpenAI text generation models `__
-
- **Mistral**
-
- * ``mistral-small-latest``
- * ``mistral-medium-latest``
- * ``mistral-large-latest``
-
- `Learn more about the Mistral text generation models `__
-
- **Anthropic**
-
- * ``claude-3-haiku-20240307``
- * ``claude-3-sonnet-20240229``
- * ``claude-3-opus-20240229``
-
- `Learn more about the Athropic text generation models `__
-
-* ``query`` (string, required): The query string use as the basis for text
- generation.
-
-* ``context`` (object, required): Settings that define the context of the
- query.
-
- * ``query`` (string, required): Specifies an expression to determine the
- relevant objects and index to serve as context for text generation. You may
- set this to any expression that produces a set of objects, even if it is
- not a standalone query.
-
- * ``variables`` (object, optional): A dictionary of variables for use in the
- context query.
-
- * ``globals`` (object, optional): A dictionary of globals for use in the
- context query.
-
- * ``max_object_count`` (int, optional): Maximum number of objects to return;
- default is 5.
-
-* ``stream`` (boolean, optional): Specifies whether the response should be
- streamed. Defaults to false.
-
-* ``prompt`` (object, optional): Settings that define a prompt. Omit to use the
- default prompt.
-
- You may specify an existing prompt by its ``name`` or ``id``, you may define
- a custom prompt inline by sending an array of objects, or you may do both to
- augment an existing prompt with additional custom messages.
-
- * ``name`` (string, optional) or ``id`` (string, optional): The ``name`` or
- ``id`` of an existing custom prompt to use. Provide only one of these if
- you want to use or start from an existing prompt.
-
- * ``custom`` (array of objects, optional): Custom prompt messages, each
- containing a ``role`` and ``content``. If no ``name`` or ``id`` was
- provided, the custom messages provided here become the prompt. If one of
- those was provided, these messages will be added to that existing prompt.
-
-**Example request**
-
-.. code-block::
-
- curl --user <username>:<password> --json '{
- "query": "What color is the sky on Mars?",
- "model": "gpt-4-turbo-preview",
- "context": {"query":"Knowledge"}
- }' http://<host>:<port>/branch/main/ai/rag
-
-
-Response
-^^^^^^^^
-
-**Example successful response**
-
-* **HTTP status**: 200 OK
-* **Content-Type**: application/json
-* **Body**:
-
- .. code-block:: json
-
- {"response": "The sky on Mars is red."}
-
-**Example error response**
-
-* **HTTP status**: 400 Bad Request
-* **Content-Type**: application/json
-* **Body**:
-
- .. code-block:: json
-
- {
- "message": "missing required 'query' in request 'context' object",
- "type": "BadRequestError"
- }
-
-
-Streaming response (SSE)
-^^^^^^^^^^^^^^^^^^^^^^^^
-
-When the ``stream`` parameter is set to ``true``, the server uses `Server-Sent
-Events
-`__
-(SSE) to stream responses. Here is a detailed breakdown of the typical
-sequence and structure of events in a streaming response:
-
-* **HTTP Status**: 200 OK
-* **Content-Type**: text/event-stream
-* **Cache-Control**: no-cache
-
-The stream consists of a sequence of five events, each encapsulating part of
-the response in a structured format:
-
-1. **Message start**
-
- * Event type: ``message_start``
-
- * Data: Starts a message, specifying identifiers and roles.
-
- .. code-block:: json
-
- {
- "type": "message_start",
- "message": {
- "id": "",
- "role": "assistant",
- "model": ""
- }
- }
-
-2. **Content block start**
-
- * Event type: ``content_block_start``
-
- * Data: Marks the beginning of a new content block.
-
- .. code-block:: json
-
- {
- "type": "content_block_start",
- "index": 0,
- "content_block": {
- "type": "text",
- "text": ""
- }
- }
-
-3. **Content block delta**
-
- * Event type: ``content_block_delta``
-
- * Data: Incrementally updates the content, appending more text to the
- message.
-
- .. code-block:: json
-
- {
- "type": "content_block_delta",
- "index": 0,
- "delta": {
- "type": "text_delta",
- "text": "The"
- }
- }
-
- Subsequent ``content_block_delta`` events add more text to the message.
-
-4. **Content block stop**
-
- * Event type: ``content_block_stop``
-
- * Data: Marks the end of a content block.
-
- .. code-block:: json
-
- {
- "type": "content_block_stop",
- "index": 0
- }
-
-5. **Message stop**
-
- * Event type: ``message_stop``
-
- * Data: Marks the end of the message.
-
- .. code-block:: json
-
- {"type": "message_stop"}
-
-Each event is sent as a separate SSE message, formatted as shown above. The
-connection is closed after all events are sent, signaling the end of the
-stream.
-
-**Example SSE response**
-
-.. code-block::
-
- event: message_start
- data: {"type": "message_start", "message": {"id": "chatcmpl-9MzuQiF0SxUjFLRjIdT3mTVaMWwiv", "role": "assistant", "model": "gpt-4-0125-preview"}}
-
- event: content_block_start
- data: {"type": "content_block_start","index":0,"content_block":{"type":"text","text":""}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "The"}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " skies"}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " on"}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " Mars"}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " are"}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " red"}}
-
- event: content_block_delta
- data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "."}}
-
- event: content_block_stop
- data: {"type": "content_block_stop","index":0}
-
- event: message_delta
- data: {"type": "message_delta", "delta": {"stop_reason": "stop"}}
-
- event: message_stop
- data: {"type": "message_stop"}
-
-
-Embeddings
-----------
-
-``POST``: ``https://<host>:<port>/branch/<branch-name>/ai/embeddings``
-
-Responds with embeddings generated by the specified embeddings model in
-response to the provided input.
-
-Request
-^^^^^^^
-
-Make a ``POST`` request to the endpoint with a JSON body. The body may have
-these properties:
-
-* ``input`` (array of strings or a single string, required): The text to use as
- the basis for embeddings generation.
-
-* ``model`` (string, required): The name of the embedding model to use. You may
- use any of the supported :ref:`embedding models
- `.
-
-**Example request**
-
-.. code-block::
-
-   curl --user <username>:<password> --json '{
- "input": "What color is the sky on Mars?",
- "model": "text-embedding-3-small"
- }' http://localhost:10931/branch/main/ai/embeddings
-
-
-Response
-^^^^^^^^
-
-**Example successful response**
-
-* **HTTP status**: 200 OK
-* **Content-Type**: application/json
-* **Body**:
-
-
-.. code-block:: json
-
- {
- "object": "list",
- "data": [
- {
- "object": "embedding",
- "index": 0,
- "embedding": [-0.009434271, 0.009137661]
- }
- ],
- "model": "text-embedding-3-small",
- "usage": {
- "prompt_tokens": 8,
- "total_tokens": 8
- }
- }
-
-.. note::
-
- The ``embedding`` property is shown here with only two values for brevity,
- but an actual response would contain many more values.
-
-**Example error response**
-
-* **HTTP status**: 400 Bad Request
-* **Content-Type**: application/json
-* **Body**:
-
- .. code-block:: json
-
- {
- "message": "missing or empty required \"model\" value in request",
- "type": "BadRequestError"
- }
diff --git a/docs/ai/reference_extai.rst b/docs/ai/reference_extai.rst
new file mode 100644
index 00000000000..69c0ab34453
--- /dev/null
+++ b/docs/ai/reference_extai.rst
@@ -0,0 +1,491 @@
+.. _ref_ai_extai_reference:
+
+============
+AI Extension
+============
+
+This reference documents the |Gel| AI extension's components, configuration
+options, and APIs.
+
+
+Enabling the Extension
+======================
+
+The AI extension can be enabled using the extension mechanism:
+
+.. code-block:: sdl
+
+ using extension ai;
+
+Configuration
+=============
+
+The AI extension can be configured using ``configure session`` or ``configure current branch``:
+
+.. code-block:: edgeql
+
+ configure current branch
+ set ext::ai::Config::indexer_naptime := 'PT30S';
+
+Configuration Properties
+------------------------
+
+* ``indexer_naptime``: Duration
+  Specifies the minimum delay between deferred ``ext::ai::index`` indexer runs.
+
+View current configuration:
+
+.. code-block:: edgeql
+
+ select cfg::Config.extensions[is ext::ai::Config]{*};
+
+Reset configuration:
+
+.. code-block:: edgeql
+
+ configure current branch reset ext::ai::Config::indexer_naptime;
+
+
+.. _ref_ai_extai_reference_ui:
+
+UI
+==
+
+The AI section of the UI can be accessed via the sidebar after the extension
+has been enabled in the schema. It provides ways to manage provider
+configurations and RAG prompts, as well as try out different settings in the
+playground.
+
+Playground tab
+--------------
+
+Provides an interactive environment for testing and configuring the built-in
+RAG.
+
+.. image:: images/ui_playground.png
+ :alt: Screenshot of the Playground tab of the UI depicting an empty message window and three input fields set with default values.
+ :width: 100%
+
+Components:
+
+* Message window: Displays conversation history between the user and the LLM.
+* Model: Dropdown menu for selecting the text generation model.
+* Prompt: Dropdown menu for selecting the RAG prompt template.
+* Context Query: Input field for entering an EdgeQL expression returning a set of objects with AI indexes.
+
+
+Prompts tab
+-----------
+
+Provides ways to manage system prompts used in the built-in RAG.
+
+.. image:: images/ui_prompts.png
+ :alt: Screenshot of the Prompts tab of the UI depicting an expanded prompt configuration menu.
+ :width: 100%
+
+Providers tab
+-------------
+
+Enables management of API configurations for AI API providers.
+
+.. image:: images/ui_providers.png
+ :alt: Screenshot of the Providers tab of the UI depicting an expanded provider configuration menu.
+ :width: 100%
+
+
+.. _ref_ai_extai_reference_index:
+
+Index
+=====
+
+The ``ext::ai::index`` creates a deferred semantic similarity index of an
+expression on a type.
+
+.. code-block:: sdl-diff
+
+ module default {
+ type Astronomy {
+ content: str;
+ + deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ + on (.content);
+ }
+ };
+
+
+Parameters:
+
+* ``embedding_model`` - The name of the model to use for embedding generation
+  as a string.
+* ``distance_function`` - The function to use for determining semantic
+  similarity. Default: ``ext::ai::DistanceFunction.Cosine``
+* ``index_type`` - The type of index to create. Currently the only option is
+  the default: ``ext::ai::IndexType.HNSW``.
+* ``index_parameters`` - A named tuple of additional index parameters:
+
+  * ``m`` - The maximum number of edges of each node in the graph. Increasing
+    this can increase the accuracy of searches at the cost of index size.
+    Default: ``32``
+  * ``ef_construction`` - Dictates the depth and width of the search when
+    building the index. Higher values can lead to better connections and more
+    accurate results at the cost of time and resource usage when building the
+    index. Default: ``100``
+
+* ``dimensions`` - Optional ``int64`` specifying the number of embedding
+  dimensions.
+* ``truncate_to_max`` - A ``bool``. Default: ``false``
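Putting the parameters together, a fully spelled-out declaration might look like the sketch below. It simply makes the documented defaults explicit; in practice you would tune ``m`` and ``ef_construction`` to your workload:

```sdl
module default {
  type Astronomy {
    content: str;
    deferred index ext::ai::index(
      embedding_model := 'text-embedding-3-small',
      distance_function := ext::ai::DistanceFunction.Cosine,
      index_type := ext::ai::IndexType.HNSW,
      index_parameters := (m := 32, ef_construction := 100)
    ) on (.content);
  }
};
```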
+
+
+Built-in resources
+==================
+
+.. _ref_ai_extai_reference_embedding_models:
+
+Embedding models
+----------------
+
+OpenAI (`documentation `__)
+
+* ``text-embedding-3-small``
+* ``text-embedding-3-large``
+* ``text-embedding-ada-002``
+
+Mistral (`documentation `__)
+
+* ``mistral-embed``
+
+
+.. _ref_ai_extai_reference_text_generation_models:
+
+Text generation models
+----------------------
+
+OpenAI (`documentation `__)
+
+* ``gpt-3.5-turbo``
+* ``gpt-4-turbo-preview``
+
+Mistral (`documentation `__)
+
+* ``mistral-small-latest``
+* ``mistral-medium-latest``
+* ``mistral-large-latest``
+
+Anthropic (`documentation `__)
+
+* ``claude-3-haiku-20240307``
+* ``claude-3-sonnet-20240229``
+* ``claude-3-opus-20240229``
+
+
+Functions
+=========
+
+.. list-table::
+ :class: funcoptable
+
+ * - :eql:func:`ext::ai::to_context`
+ - :eql:func-desc:`ext::ai::to_context`
+
+ * - :eql:func:`ext::ai::search`
+ - :eql:func-desc:`ext::ai::search`
+
+
+------------
+
+
+.. eql:function:: ext::ai::to_context(object: anyobject) -> str
+
+ Returns the indexed expression value for an object with an ``ext::ai::index``.
+
+ **Example**:
+
+ Schema:
+
+ .. code-block:: sdl
+
+ module default {
+ type Astronomy {
+ topic: str;
+ content: str;
+ deferred index ext::ai::index(embedding_model := 'text-embedding-3-small')
+ on (.topic ++ ' ' ++ .content);
+ }
+ };
+
+ Data:
+
+ .. code-block:: edgeql-repl
+
+ db> insert Astronomy {
+ ... topic := 'Mars',
+ ... content := 'Skies on Mars are red.'
+ ... }
+ db> insert Astronomy {
+ ... topic := 'Earth',
+ ... content := 'Skies on Earth are blue.'
+ ... }
+
+ Results of calling ``to_context``:
+
+ .. code-block:: edgeql-repl
+
+ db> select ext::ai::to_context(Astronomy);
+
+ {'Mars Skies on Mars are red.', 'Earth Skies on Earth are blue.'}
+
+
+------------
+
+
+.. eql:function:: ext::ai::search( \
+                    object: anyobject, \
+                    query: array<float32> \
+                  ) -> optional tuple<object: anyobject, distance: float64>
+
+   Searches objects using their :ref:`ai::index
+   <ref_ai_extai_reference_index>`.
+
+ Returns tuples of (object, distance).
+
+ .. note::
+
+ The ``query`` argument should *not* be a textual query but the
+ embeddings generated *from* a textual query.
+
+ .. code-block:: edgeql-repl
+
+    db> with query := <array<float32>>$query
+    ... select ext::ai::search(Knowledge, query);
+
+ {
+ (
+ object := default::Knowledge {id: 9af0d0e8-0880-11ef-9b6b-4335855251c4},
+ distance := 0.20410746335983276
+ ),
+ (
+ object := default::Knowledge {id: eeacf638-07f6-11ef-b9e9-57078acfce39},
+ distance := 0.7843298847773637
+ ),
+ (
+ object := default::Knowledge {id: f70863c6-07f6-11ef-b9e9-3708318e69ee},
+ distance := 0.8560434728860855
+ ),
+ }
+
+
+Types
+=====
+
+Provider Configuration Types
+----------------------------
+
+.. list-table::
+ :class: funcoptable
+
+ * - :eql:type:`ext::ai::ProviderAPIStyle`
+ - Enum defining supported API styles
+
+ * - :eql:type:`ext::ai::ProviderConfig`
+ - Abstract base configuration for AI providers.
+
+
+Provider configurations are required for AI indexes and RAG functionality.
+
+Example provider configuration:
+
+.. code-block:: edgeql
+
+    configure current branch
+ insert ext::ai::OpenAIProviderConfig {
+ secret := 'sk-....',
+ };
+
+.. note::
+
+ All provider types require the ``secret`` property be set with a string
+ containing the secret provided by the AI vendor.
+
+
+.. note::
+
+   ``ext::ai::CustomProviderConfig`` requires the ``api_style`` property to be set.
+
+
+---------
+
+
+.. eql:type:: ext::ai::ProviderAPIStyle
+
+ Enum defining supported API styles:
+
+ * ``OpenAI``
+ * ``Anthropic``
+
+
+---------
+
+
+.. eql:type:: ext::ai::ProviderConfig
+
+ Abstract base configuration for AI providers.
+
+ Properties:
+
+ * ``name``: str (Required) - Unique provider identifier
+ * ``display_name``: str (Required) - Human-readable name
+ * ``api_url``: str (Required) - Provider API endpoint
+ * ``client_id``: str (Optional) - Provider-supplied client ID
+ * ``secret``: str (Required) - Provider API secret
+ * ``api_style``: ProviderAPIStyle (Required) - Provider's API style
+
+ Provider-specific types:
+
+ * ``ext::ai::OpenAIProviderConfig``
+ * ``ext::ai::MistralProviderConfig``
+ * ``ext::ai::AnthropicProviderConfig``
+ * ``ext::ai::CustomProviderConfig``
+
+ Each inherits from :eql:type:`ext::ai::ProviderConfig` with provider-specific defaults.
+
+
+Model Types
+-----------
+
+.. list-table::
+ :class: funcoptable
+
+ * - :eql:type:`ext::ai::Model`
+ - Abstract base type for AI models.
+
+ * - :eql:type:`ext::ai::EmbeddingModel`
+ - Abstract type for embedding models.
+
+ * - :eql:type:`ext::ai::TextGenerationModel`
+ - Abstract type for text generation models.
+
+---------
+
+.. eql:type:: ext::ai::Model
+
+ Abstract base type for AI models.
+
+ Annotations:
+ * ``model_name`` - Model identifier
+ * ``model_provider`` - Provider identifier
+
+---------
+
+.. eql:type:: ext::ai::EmbeddingModel
+
+ Abstract type for embedding models.
+
+ Annotations:
+ * ``embedding_model_max_input_tokens`` - Maximum tokens per input
+ * ``embedding_model_max_batch_tokens`` - Maximum tokens per batch
+ * ``embedding_model_max_output_dimensions`` - Maximum embedding dimensions
+ * ``embedding_model_supports_shortening`` - Input shortening support flag
+
+
+---------
+
+.. eql:type:: ext::ai::TextGenerationModel
+
+ Abstract type for text generation models.
+
+ Annotations:
+ * ``text_gen_model_context_window`` - Model's context window size
+
+
+Indexing Types
+--------------
+
+.. list-table::
+ :class: funcoptable
+
+ * - :eql:type:`ext::ai::DistanceFunction`
+ - Enum for similarity metrics.
+
+ * - :eql:type:`ext::ai::IndexType`
+ - Enum for index implementations.
+
+---------
+
+.. eql:type:: ext::ai::DistanceFunction
+
+ Enum for similarity metrics.
+
+ * ``Cosine``
+ * ``InnerProduct``
+ * ``L2``
+
+---------
+
+.. eql:type:: ext::ai::IndexType
+
+ Enum for index implementations.
+
+ * ``HNSW``
+
+
+
+Prompt Types
+------------
+
+.. list-table::
+ :class: funcoptable
+
+ * - :eql:type:`ext::ai::ChatParticipantRole`
+ - Enum for chat roles.
+
+ * - :eql:type:`ext::ai::ChatPromptMessage`
+ - Type for chat prompt messages.
+
+ * - :eql:type:`ext::ai::ChatPrompt`
+ - Type for chat prompt configuration.
+
+Example custom prompt configuration:
+
+.. code-block:: edgeql
+
+ insert ext::ai::ChatPrompt {
+ name := 'test-prompt',
+ messages := (
+ insert ext::ai::ChatPromptMessage {
+ participant_role := ext::ai::ChatParticipantRole.System,
+ content := "Your message content"
+ }
+ )
+ };
+
+
+---------
+
+.. eql:type:: ext::ai::ChatParticipantRole
+
+ Enum for chat roles.
+
+ * ``System``
+ * ``User``
+ * ``Assistant``
+ * ``Tool``
+
+---------
+
+.. eql:type:: ext::ai::ChatPromptMessage
+
+ Type for chat prompt messages.
+
+ Properties:
+ * ``participant_role``: ChatParticipantRole (Required)
+ * ``participant_name``: str (Optional)
+ * ``content``: str (Required)
+
+---------
+
+.. eql:type:: ext::ai::ChatPrompt
+
+ Type for chat prompt configuration.
+
+ Properties:
+ * ``name``: str (Required)
+ * ``messages``: set of ChatPromptMessage (Required)
+
diff --git a/docs/ai/reference_http.rst b/docs/ai/reference_http.rst
new file mode 100644
index 00000000000..44d86a95bd4
--- /dev/null
+++ b/docs/ai/reference_http.rst
@@ -0,0 +1,382 @@
+.. _ref_ai_http_reference:
+
+=====================
+AI HTTP API Reference
+=====================
+
+:edb-alt-title: AI Extension HTTP API
+
+.. note::
+
+ All |Gel| server HTTP endpoints require :ref:`authentication
+ `, such as `HTTP Basic Authentication
+ `_
+ with Gel username and password.
+
+
+Embeddings
+==========
+
+``POST``: ``https://<host>:<port>/branch/<branch-name>/ai/embeddings``
+
+Generates text embeddings using the specified embeddings model.
+
+
+Request headers
+---------------
+
+* ``Content-Type: application/json`` (required)
+
+
+Request body
+------------
+
+.. code-block:: json
+
+   {
+     "model": string,             // Required: Name of the embedding model
+     "input": string | string[],  // Required: Text(s) to embed
+     "dimensions": number,        // Optional: Number of dimensions to truncate to
+     "user": string               // Optional: User identifier
+   }
+
+* ``input`` (array of strings or a single string, required): The text to use as
+ the basis for embeddings generation.
+
+* ``model`` (string, required): The name of the embedding model to use. You may
+  use any of the supported :ref:`embedding models
+  <ref_ai_extai_reference_embedding_models>`.
+
+
+Example request
+---------------
+
+.. code-block:: bash
+
+   curl --user <username>:<password> --json '{
+ "input": "What color is the sky on Mars?",
+ "model": "text-embedding-3-small"
+ }' http://localhost:10931/branch/main/ai/embeddings
+
+
+Response
+--------
+
+* **HTTP status**: 200 OK
+* **Content-Type**: application/json
+* **Body**:
+
+
+.. code-block:: json
+
+ {
+ "object": "list",
+ "data": [
+ {
+ "object": "embedding",
+ "index": 0,
+ "embedding": [-0.009434271, 0.009137661]
+ }
+ ],
+ "model": "text-embedding-3-small",
+ "usage": {
+ "prompt_tokens": 8,
+ "total_tokens": 8
+ }
+ }
+
+.. note::
+
+ The ``embedding`` property is shown here with only two values for brevity,
+ but an actual response would contain many more values.
+
+
+Error response
+--------------
+
+* **HTTP status**: 400 Bad Request
+* **Content-Type**: application/json
+* **Body**:
+
+ .. code-block:: json
+
+ {
+ "message": "missing or empty required \"model\" value in request",
+ "type": "BadRequestError"
+ }
+
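For application code that talks to this endpoint directly, the request body and response shape above can be handled with a couple of small helpers. This is a sketch, not part of any Gel client library; pair it with the HTTP client of your choice:

```python
import json

def embeddings_payload(model: str, text) -> str:
    """Build the JSON body for POST .../ai/embeddings.

    `text` may be a single string or a list of strings, mirroring the
    `input` field described above.
    """
    if not model:
        raise ValueError('missing or empty required "model" value in request')
    return json.dumps({"model": model, "input": text})

def extract_embeddings(response_json: str) -> list[list[float]]:
    """Unpack the vectors from the response shape shown above."""
    body = json.loads(response_json)
    # `data` holds one object per input, ordered by `index`.
    items = sorted(body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]
```

Any HTTP client can then send the payload with ``Content-Type: application/json`` and basic authentication, and feed the raw response body to ``extract_embeddings``.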
+RAG
+===
+
+``POST``: ``https://<host>:<port>/branch/<branch-name>/ai/rag``
+
+Performs retrieval-augmented text generation using the specified model based on
+the provided text query and the database content selected using similarity
+search.
+
+
+Request headers
+---------------
+
+* ``Content-Type: application/json`` (required)
+
+
+Request body
+------------
+
+.. code-block:: json
+
+ {
+ "context": {
+ "query": string, // Required: EdgeQL query for context retrieval
+ "variables": object, // Optional: Query variables
+ "globals": object, // Optional: Query globals
+ "max_object_count": number // Optional: Max objects to retrieve (default: 5)
+ },
+ "model": string, // Required: Name of the generation model
+ "query": string, // Required: User query
+ "stream": boolean, // Optional: Enable streaming (default: false)
+ "prompt": {
+ "name": string, // Optional: Name of predefined prompt
+ "id": string, // Optional: ID of predefined prompt
+ "custom": [ // Optional: Custom prompt messages
+ {
+ "role": string, // "system"|"user"|"assistant"|"tool"
+ "content": string|object,
+ "tool_call_id": string,
+ "tool_calls": array
+ }
+ ]
+ },
+ "temperature": number, // Optional: Sampling temperature
+ "top_p": number, // Optional: Nucleus sampling parameter
+ "max_tokens": number, // Optional: Maximum tokens to generate
+ "seed": number, // Optional: Random seed
+ "safe_prompt": boolean, // Optional: Enable safety features
+ "top_k": number, // Optional: Top-k sampling parameter
+ "logit_bias": object, // Optional: Token biasing
+ "logprobs": number, // Optional: Return token log probabilities
+ "user": string // Optional: User identifier
+ }
+
+
+* ``model`` (string, required): The name of the text generation model to use.
+
+
+* ``query`` (string, required): The query string to use as the basis for text
+  generation.
+
+* ``context`` (object, required): Settings that define the context of the
+ query.
+
+ * ``query`` (string, required): Specifies an expression to determine the
+ relevant objects and index to serve as context for text generation. You may
+ set this to any expression that produces a set of objects, even if it is
+ not a standalone query.
+
+ * ``variables`` (object, optional): A dictionary of variables for use in the
+ context query.
+
+ * ``globals`` (object, optional): A dictionary of globals for use in the
+ context query.
+
+ * ``max_object_count`` (int, optional): Maximum number of objects to return;
+ default is 5.
+
+* ``stream`` (boolean, optional): Specifies whether the response should be
+ streamed. Defaults to false.
+
+* ``prompt`` (object, optional): Settings that define a prompt. Omit to use the
+ default prompt.
+
+  You may specify an existing prompt by its ``name`` or ``id``; you may define
+  a custom prompt inline by sending an array of objects; or you may do both to
+  augment an existing prompt with additional custom messages.
+
+ * ``name`` (string, optional) or ``id`` (string, optional): The ``name`` or
+ ``id`` of an existing custom prompt to use. Provide only one of these if
+ you want to use or start from an existing prompt.
+
+ * ``custom`` (array of objects, optional): Custom prompt messages, each
+ containing a ``role`` and ``content``. If no ``name`` or ``id`` was
+ provided, the custom messages provided here become the prompt. If one of
+ those was provided, these messages will be added to that existing prompt.
+
+
+Example request
+---------------
+
+.. code-block::
+
+   curl --user <username>:<password> --json '{
+     "query": "What color is the sky on Mars?",
+     "model": "gpt-4-turbo-preview",
+     "context": {"query":"Knowledge"}
+   }' http://<host>:<port>/branch/main/ai/rag
+
+
+Response
+--------
+
+* **HTTP status**: 200 OK
+* **Content-Type**: application/json
+* **Body**:
+
+ .. code-block:: json
+
+ {"response": "The sky on Mars is red."}
+
+Error response
+--------------
+
+* **HTTP status**: 400 Bad Request
+* **Content-Type**: application/json
+* **Body**:
+
+ .. code-block:: json
+
+ {
+ "message": "missing required 'query' in request 'context' object",
+ "type": "BadRequestError"
+ }
+
+
+Streaming response (SSE)
+------------------------
+
+When the ``stream`` parameter is set to ``true``, the server uses `Server-Sent
+Events
+`__
+(SSE) to stream responses. Here is a detailed breakdown of the typical
+sequence and structure of events in a streaming response:
+
+* **HTTP Status**: 200 OK
+* **Content-Type**: text/event-stream
+* **Cache-Control**: no-cache
+
+The stream consists of a sequence of events, each encapsulating part of the
+response in a structured format:
+
+1. **Message start**
+
+ * Event type: ``message_start``
+
+ * Data: Starts a message, specifying identifiers and roles.
+
+ .. code-block:: json
+
+ {
+ "type": "message_start",
+ "message": {
+ "id": "<message_id>",
+ "role": "assistant",
+ "model": "<model_name>"
+ }
+ }
+
+2. **Content block start**
+
+ * Event type: ``content_block_start``
+
+ * Data: Marks the beginning of a new content block.
+
+ .. code-block:: json
+
+ {
+ "type": "content_block_start",
+ "index": 0,
+ "content_block": {
+ "type": "text",
+ "text": ""
+ }
+ }
+
+3. **Content block delta**
+
+ * Event type: ``content_block_delta``
+
+ * Data: Incrementally updates the content, appending more text to the
+ message.
+
+ .. code-block:: json
+
+ {
+ "type": "content_block_delta",
+ "index": 0,
+ "delta": {
+ "type": "text_delta",
+ "text": "The"
+ }
+ }
+
+ Subsequent ``content_block_delta`` events add more text to the message.
+
+4. **Content block stop**
+
+ * Event type: ``content_block_stop``
+
+ * Data: Marks the end of a content block.
+
+ .. code-block:: json
+
+ {
+ "type": "content_block_stop",
+ "index": 0
+ }
+
+5. **Message delta**
+
+   * Event type: ``message_delta``
+
+   * Data: Reports final message metadata, such as the reason generation
+     stopped.
+
+     .. code-block:: json
+
+        {"type": "message_delta", "delta": {"stop_reason": "stop"}}
+
+6. **Message stop**
+
+   * Event type: ``message_stop``
+
+   * Data: Marks the end of the message.
+
+     .. code-block:: json
+
+        {"type": "message_stop"}
+
+Each event is sent as a separate SSE message, formatted as shown above. The
+connection is closed after all events are sent, signaling the end of the
+stream.
+
+**Example SSE response**
+
+.. code-block::
+ :class: collapsible
+
+ event: message_start
+ data: {"type": "message_start", "message": {"id": "chatcmpl-9MzuQiF0SxUjFLRjIdT3mTVaMWwiv", "role": "assistant", "model": "gpt-4-0125-preview"}}
+
+ event: content_block_start
+ data: {"type": "content_block_start","index":0,"content_block":{"type":"text","text":""}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "The"}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " skies"}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " on"}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " Mars"}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " are"}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": " red"}}
+
+ event: content_block_delta
+ data: {"type": "content_block_delta","index":0,"delta":{"type": "text_delta", "text": "."}}
+
+ event: content_block_stop
+ data: {"type": "content_block_stop","index":0}
+
+ event: message_delta
+ data: {"type": "message_delta", "delta": {"stop_reason": "stop"}}
+
+ event: message_stop
+ data: {"type": "message_stop"}
+
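A stream like the one above can be reassembled client-side by concatenating the ``text_delta`` payloads. A minimal parser sketch, assuming one ``data:`` line per event as in the example:

```python
import json

def collect_sse_text(stream: str) -> str:
    """Join the text deltas from an /ai/rag SSE stream into the full answer."""
    parts = []
    for line in stream.splitlines():
        if not line.startswith("data:"):
            continue  # skip "event:" lines and blank event separators
        payload = json.loads(line[len("data:"):])
        if payload.get("type") == "content_block_delta":
            parts.append(payload["delta"]["text"])
    return "".join(parts)
```

Feeding it the example stream above yields the complete sentence in one string.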
+
diff --git a/docs/ai/python.rst b/docs/ai/reference_python.rst
similarity index 84%
rename from docs/ai/python.rst
rename to docs/ai/reference_python.rst
index 19d0af9a791..fc0b1cecb87 100644
--- a/docs/ai/python.rst
+++ b/docs/ai/reference_python.rst
@@ -1,87 +1,61 @@
-.. _ref_ai_python:
+.. _ref_ai_python_reference:
-======
-Python
-======
+=============
+AI Python API
+=============
-:edb-alt-title: Gel AI's Python package
+:edb-alt-title: AI Extension Python API
The ``gel.ai`` package is an optional binding of the AI extension in |Gel|.
-To use the AI binding, you need to install ``gel`` Python package with the
-``ai`` extra dependencies:
.. code-block:: bash
$ pip install 'gel[ai]'
-Usage
-=====
+Blocking and async API
+======================
-Start by importing ``gel`` and ``gel.ai``:
+The AI binding is built on top of the regular |Gel| client objects, providing
+both blocking and asynchronous versions of its API.
+
+**Blocking client example**:
.. code-block:: python
import gel
import gel.ai
-
-Blocking
---------
-
-The AI binding is built on top of the regular |Gel| client objects, providing
-both blocking and asynchronous versions of its API. For example, a blocking AI
-client is initialized like this:
-
-.. code-block:: python
-
client = gel.create_client()
- gpt4ai = gel.ai.create_ai(
+ gpt4ai = gel.ai.create_rag_client(
client,
model="gpt-4-turbo-preview"
)
-Add your query as context:
-
-.. code-block:: python
-
astronomy_ai = gpt4ai.with_context(
query="Astronomy"
)
-The default text generation prompt will ask your selected provider to limit
-answer to information provided in the context and will pass the queried
-objects' AI index as context along with that prompt.
-
-Call your AI client's ``query_rag`` method, passing in a text query.
-
-.. code-block:: python
-
print(
astronomy_ai.query_rag("What color is the sky on Mars?")
);
-or stream back the results by using ``stream_rag`` instead:
-
-.. code-block:: python
-
for data in astronomy_ai.stream_rag("What color is the sky on Mars?"):
print(data)
-Async
------
-
-To use an async client instead, do this:
+**Async client example**:
.. code-block:: python
- import asyncio # alongside the Gel imports
+ import gel
+ import gel.ai
+ import asyncio
client = gel.create_async_client()
async def main():
- gpt4ai = await gel.ai.create_async_ai(
+ gpt4ai = await gel.ai.create_async_rag_client(
client,
model="gpt-4-turbo-preview"
)
@@ -100,12 +74,12 @@ To use an async client instead, do this:
asyncio.run(main())
-API reference
-=============
+Factory functions
+=================
-.. py:function:: create_ai(client, **kwargs) -> GelAI
+.. py:function:: create_rag_client(client, **kwargs) -> RAGClient
- Creates an instance of ``GelAI`` with the specified client and options.
+ Creates an instance of ``RAGClient`` with the specified client and options.
This function ensures that the client is connected before initializing the
AI with the specified options.
@@ -114,7 +88,7 @@ API reference
A |Gel| client instance.
:param kwargs:
- Keyword arguments that are passed to the ``AIOptions`` data class to
+ Keyword arguments that are passed to the ``RAGOptions`` data class to
configure AI-specific options. These options are:
* ``model``: The name of the model to be used. (required)
@@ -122,9 +96,9 @@ API reference
``None`` will result in the client using the default prompt.
(default: ``None``)
-.. py:function:: create_async_ai(client, **kwargs) -> AsyncGelAI
+.. py:function:: create_async_rag_client(client, **kwargs) -> AsyncRAGClient
- Creates an instance of ``AsyncGelAI`` w/ the specified client & options.
+    Creates an instance of ``AsyncRAGClient`` with the specified client and
+    options.
This function ensures that the client is connected asynchronously before
initializing the AI with the specified options.
@@ -133,21 +107,17 @@ API reference
An asynchronous |Gel| client instance.
:param kwargs:
- Keyword arguments that are passed to the ``AIOptions`` data class to
+ Keyword arguments that are passed to the ``RAGOptions`` data class to
configure AI-specific options. These options are:
* ``model``: The name of the model to be used. (required)
* ``prompt``: An optional prompt to guide the model's behavior. (default: None)
-AI client classes
------------------
-
+Core classes
+============
-BaseGelAI
-^^^^^^^^^
-
-.. py:class:: BaseGelAI
+.. py:class:: BaseRAGClient
The base class for |Gel| AI clients.
@@ -158,7 +128,7 @@ BaseGelAI
these methods are available on an AI client of either type.
:ivar options:
- An instance of :py:class:`AIOptions`, storing the AI options.
+ An instance of :py:class:`RAGOptions`, storing the RAG options.
:ivar context:
An instance of :py:class:`QueryContext`, storing the context for AI
@@ -210,10 +180,7 @@ BaseGelAI
objects returned by the query.
-GelAI
-^^^^^
-
-.. py:class:: GelAI
+.. py:class:: RAGClient
A synchronous class for creating |Gel| AI clients.
@@ -253,11 +220,17 @@ GelAI
the query. If not provided, uses the default context of this AI client
instance.
+.. py:method:: generate_embeddings(*inputs: str, model: str) -> list[float]
+
+ Generates embeddings for input texts.
+
+ :param inputs:
+ Input texts.
+ :param model:
+        The embedding model to use.
-AsyncGelAI
-^^^^^^^^^^
-.. py:class:: AsyncGelAI
+.. py:class:: AsyncRAGClient
An asynchronous class for creating |Gel| AI clients.
@@ -301,9 +274,19 @@ AsyncGelAI
the query. If not provided, uses the default context of this AI client
instance.
+.. py:method:: generate_embeddings(*inputs: str, model: str) -> list[float]
+ :noindex:
-Other classes
--------------
+ Generates embeddings for input texts.
+
+ :param inputs:
+ Input texts.
+ :param model:
+        The embedding model to use.
+
+
+Configuration classes
+=====================
.. py:class:: ChatParticipantRole
@@ -343,9 +326,9 @@ Other classes
role-specific content within the prompt.
-.. py:class:: AIOptions
+.. py:class:: RAGOptions
- A data class for AI options, specifying model and prompt settings.
+ A data class for RAG options, specifying model and prompt settings.
:ivar model:
The name of the AI model.
@@ -354,8 +337,8 @@ Other classes
the model.
:method derive(kwargs):
- Creates a new instance of :py:class:`AIOptions` by merging existing options
- with provided keyword arguments. Returns a new :py:class:`AIOptions`
+ Creates a new instance of :py:class:`RAGOptions` by merging existing options
+ with provided keyword arguments. Returns a new :py:class:`RAGOptions`
instance with updated attributes.
:param kwargs:
@@ -414,3 +397,4 @@ Other classes
:method to_httpx_request():
Converts the RAGRequest into a dictionary suitable for making an HTTP
request using the httpx library.
+
diff --git a/docs/datamodel/access_policies.rst b/docs/datamodel/access_policies.rst
index 184994804ad..e94e14013ea 100644
--- a/docs/datamodel/access_policies.rst
+++ b/docs/datamodel/access_policies.rst
@@ -7,12 +7,23 @@ Access Policies
.. index:: access policy, object-level security, row-level security, RLS,
allow, deny, using
-Object types can contain security policies that restrict the set of objects
-that can be selected, inserted, updated, or deleted by a particular query.
-This is known as *object-level security* and it is similar in function to SQL's
-row-level security.
+Object types in |Gel| can contain security policies that restrict the set of
+objects that can be selected, inserted, updated, or deleted by a particular
+query. This is known as *object-level security* and is similar in function
+to SQL's row-level security.
-Let's start with a simple schema for a blog without any access policies.
+When no access policies are defined, object-level security is not activated:
+any properly authenticated client can carry out any operation on any object
+in the database. Access policies allow you to ensure that the database itself
+handles access control logic rather than having to implement it in every
+application or service that connects to your database.
+
+Access policies can greatly simplify your backend code, centralizing access
+control logic in a single place. They can also be extremely useful for
+implementing AI agentic flows, where you want to have guardrails around
+your data that agents can't break.
+
+We'll illustrate access policies in this document with this simple schema:
.. code-block:: sdl
@@ -25,37 +36,43 @@ Let's start with a simple schema for a blog without any access policies.
required author: User;
}
-When no access policies are defined, object-level security is not activated.
-Any properly authenticated client can carry out any operation on any object
-in the database. At the moment, we would need to ensure that the app handles
-the logic to restrict users from accessing other users' posts. Access
-policies allow us to ensure that the database itself handles this logic,
-thereby freeing us up from implementing access control in each and every
-piece of software that accesses the data.
.. warning::
- Once a policy is added to a particular object type, **all operations**
- (``select``, ``insert``, ``delete``, ``update``, etc.) on any object of
- that type are now *disallowed by default* unless specifically allowed by an
- access policy! See the subsection on resolution order below for details.
-
-Defining a global
-^^^^^^^^^^^^^^^^^
-
-Global variables are the a convenient way to provide the context needed to
-determine what sort of access should be allowed for a given object, as they
-can be set and reset by the application as needed.
-
-To start, we'll add two global variables to our schema. We'll use one global
-``uuid`` to represent the identity of the user executing the query, and an
-enum for the other to represent the type of country that the user is currently
-in. The enum represents three types of countries: those where the service has
-not been rolled out, those with read-only access, and those with full access.
-A global makes sense in this case because a user's current country is
-context-specific: the same user who can access certain content in one country
-might not be able to in another country due to different legal frameworks
-(such as copyright length).
+ Once a policy is added to a particular object type, **all operations**
+ (``select``, ``insert``, ``delete``, ``update``, etc.) on any object of
+ that type are now *disallowed by default* unless specifically allowed by
+ an access policy! See :ref:`resolution order `
+ below for details.
+
+Global variables
+================
+
+Global variables are a convenient way to set up the context for your access
+policies. Gel's global variables are tightly integrated with Gel's data
+model, client APIs, EdgeQL and SQL, and the tooling around them.
+
+Global variables in Gel are not pre-defined. Users are free to define
+as many globals in their schema as they want to represent the business
+logic of their application.
+
+A common scenario is storing a ``current_user`` global representing
+the user executing queries. To show that you can use more than one global
+variable, let's build a slightly more complex example:
+
+* We'll use one *global* ``uuid`` to represent the identity of the user
+ executing the query.
+* We'll have the ``Country`` *enum* to represent the type of country
+ that the user is currently in. The enum represents three types of
+ countries: those where the service has not been rolled out, those with
+ read-only access, and those with full access.
+* We'll use the ``current_country`` *global* to represent the user's
+ current country. In our *example schema*, *country* is context-specific:
+ the same user who can access certain content in one country might not
+ be able to access it in another (imagine that's due to different
+ country-specific legal frameworks).
+
+Here is an illustration:
.. code-block:: sdl-diff
@@ -74,8 +91,7 @@ might not be able to in another country due to different legal frameworks
required author: User;
}
-The value of these global variables is attached to the *client* you use to
-execute queries. The exact API depends on which client library you're using:
+You can set and reset these globals in Gel client libraries, for example:
.. tabs::
@@ -83,11 +99,17 @@ execute queries. The exact API depends on which client library you're using:
import createClient from 'gel';
- const client = createClient().withGlobals({
+ const client = createClient();
+
+ // 'authedClient' will share the network connection with 'client',
+ // but will have the 'current_user' global set.
+ const authedClient = client.withGlobals({
current_user: '2141a5b4-5634-4ccc-b835-437863534c51',
});
- await client.query(`select global current_user;`);
+ const result = await authedClient.query(
+ `select global current_user;`);
+ console.log(result);
.. code-tab:: python
@@ -100,6 +122,7 @@ execute queries. The exact API depends on which client library you're using:
result = client.query("""
select global current_user;
""")
+ print(result)
.. code-tab:: go
@@ -166,66 +189,53 @@ execute queries. The exact API depends on which client library you're using:
.expect("Returning value");
-Defining a policy
-^^^^^^^^^^^^^^^^^
+Defining policies
+=================
-Let's add two policies to our sample schema.
+A policy example for our simple blog schema might look like:
.. code-block:: sdl-diff
- global current_user: uuid;
- required global current_country: Country {
- default := Country.None
- }
- scalar type Country extending enum<None, ReadOnly, Full>;
+ global current_user: uuid;
+ required global current_country: Country {
+ default := Country.None
+ }
+ scalar type Country extending enum<None, ReadOnly, Full>;
- type User {
- required email: str { constraint exclusive; }
- }
+ type User {
+ required email: str { constraint exclusive; }
+ }
- type BlogPost {
- required title: str;
- required author: User;
+ type BlogPost {
+ required title: str;
+ required author: User;
- + access policy author_has_full_access
- + allow all
- + using (global current_user ?= .author.id
- + and global current_country ?= Country.Full) {
- + errmessage := "User does not have full access";
- + }
- + access policy author_has_read_access
- + allow select
- + using (global current_user ?= .author.id
- + and global current_country ?= Country.ReadOnly);
- }
+ + access policy author_has_full_access
+ + allow all
+ + using (global current_user ?= .author.id
+ + and global current_country ?= Country.Full) {
+ + errmessage := "User does not have full access";
+ + }
-Let's break down the access policy syntax piece-by-piece. These policies grant
-full read-write access (``all``) to the ``author`` of each ``BlogPost``, if
-the author is in a country that allows full access to the service. Otherwise,
-the same author will be restricted to either read-only access or no access at
-all, depending on the country.
+ + access policy author_has_read_access
+ + allow select
+ + using (global current_user ?= .author.id
+ + and global current_country ?= Country.ReadOnly);
+ }
-.. note::
+Explanation:
+
+- ``access policy <name>`` introduces a new policy in an object type.
+- ``allow all`` grants ``select``, ``insert``, ``update``, and ``delete``
+ access if the condition passes. We also used a separate policy to allow
+ only ``select`` in some cases.
+- ``using (<expr>)`` is a boolean filter restricting the set of objects to
+ which the policy applies. (We used the coalescing operator ``?=`` to
+ handle empty sets gracefully.)
+- ``errmessage`` is an optional custom message to display in case of a write
+ violation.
- We're using the *coalescing equality* operator ``?=`` because it returns
- ``false`` even if one of its arguments is an empty set.
-
-- ``access policy``: The keyword used to declare a policy inside an object
- type.
-- ``author_has_full_access`` and ``author_has_read_access``: The names of these
- policies; could be any string.
-- ``allow``: The kind of policy; could be ``allow`` or ``deny``
-- ``all``: The set of operations being allowed/denied; a comma-separated list
- of any number of the following: ``all``, ``select``, ``insert``, ``delete``,
- ``update``, ``update read``, and ``update write``.
-- ``using ()``: A boolean expression. Think of this as a ``filter``
- expression that defines the set of objects to which the policy applies.
-- ``errmessage``: Here we have added an error message that will be shown in
- case the policy expression returns ``false``. We could have added other
- annotations of our own inside this code block instead of, or in addition
- to ``errmessage``.
-
-Let's do some experiments.
+Let's run some experiments in the REPL:
.. code-block:: edgeql-repl
@@ -242,58 +252,22 @@ Let's do some experiments.
... };
{default::BlogPost {id: e76afeae-03db-11ed-b346-fbb81f537ca6}}
-We've created a ``User``, set the value of ``current_user`` to its ``id``, the
-country to ``Country.Full``, and created a new ``BlogPost``. When we try to
-select all ``BlogPost`` objects, we'll see the post we just created.
-
-.. code-block:: edgeql-repl
-
- db> select BlogPost;
- {default::BlogPost {id: e76afeae-03db-11ed-b346-fbb81f537ca6}}
- db> select count(BlogPost);
- {1}
-
-Next, let's test what happens when the same user is in two other countries:
-one that allows read-only access to our app, and another where we haven't
-yet been given permission to roll out our service.
+Because the user is in a "full access" country and the current user ID
+matches the author, the new blog post is permitted. When the same user sets
+``global current_country := Country.ReadOnly;``:
.. code-block:: edgeql-repl
db> set global current_country := Country.ReadOnly;
OK: SET GLOBAL
db> select BlogPost;
- {default::BlogPost {id: dd274432-94ff-11ee-953e-0752e8ad3010}}
+ {default::BlogPost {id: e76afeae-03db-11ed-b346-fbb81f537ca6}}
db> insert BlogPost {
... title := "My second post",
... author := (select User filter .id = global current_user)
... };
gel error: AccessPolicyError: access policy violation on
insert of default::BlogPost (User does not have full access)
- db> set global current_country := Country.None;
- OK: SET GLOBAL
- db> select BlogPost;
- {}
-
-Note that for a ``select`` operation, the access policy works as a filter
-by simply returning an empty set. Meanwhile, when attempting an ``insert``
-operation, the operation may or may not work and thus we have provided a
-helpful error message in the access policy to give users a heads up on what
-went wrong.
-
-Now let's move back to a country with full access, but set the
-``global current_user`` to some other id: a new user that has yet to write
-any blog posts. Now the number of ``BlogPost`` objects returned via
-the ``count`` function is zero:
-
-.. code-block:: edgeql-repl
-
- db> set global current_country := Country.Full;
- OK: SET GLOBAL
- db> set global current_user :=
- ... 'd1c64b84-8e3c-11ee-86f0-d7ddecf3e9bd';
- OK: SET GLOBAL
- db> select count(BlogPost);
- {0}
Finally, let's unset ``current_user`` and see how many blog posts are returned
when we count them.
@@ -314,76 +288,43 @@ When ``current_user`` has no value or has a different value from the
But thanks to ``Country`` being set to ``Country.Full``, this user will be
able to write a new blog post.
-The access policies use global variables to define a "subgraph" of data that
-is visible to a particular query.
+**The bottom line:** access policies use global variables to define a
+"subgraph" of data that is visible to your queries.
+
Policy types
-^^^^^^^^^^^^
+============
-.. index:: accesss policy, select, insert, delete, update, update read,
+.. index:: access policy, select, insert, delete, update, update read,
update write, all
-For the most part, the policy types correspond to EdgeQL's *statement types*:
-
-- ``select``: Applies to all queries; objects without a ``select`` permission
- cannot be modified either.
-- ``insert``: Applies to insert queries; executed *post-insert*. If an
- inserted object violates the policy, the query will fail.
-- ``delete``: Applies to delete queries.
-- ``update``: Applies to update queries.
-
-Additionally, the ``update`` operation can be broken down into two
-sub-policies: ``update read`` and ``update write``.
+The types of policy rules map to the statement type in EdgeQL:
-- ``update read``: This policy restricts *which* objects can be updated. It
- runs *pre-update*; that is, this policy is executed before the updates have
- been applied. As a result, an empty set is returned on an ``update read``
- when a query lacks access to perform the operation.
-- ``update write``: This policy restricts *how* you update the objects; you
- can think of it as a *post-update* validity check. As a result, an error
- is returned on an ``update write`` when a query lacks access to perform
- the operation. Preventing a ``User`` from transferring a ``BlogPost`` to
- another ``User`` is one example of an ``update write`` access policy.
-
-Finally, there's an umbrella policy that can be used as a shorthand for all
-the others.
-
-- ``all``: A shorthand policy that can be used to allow or deny full read/
- write permissions. Exactly equivalent to ``select, insert, update, delete``.
+- ``select``: Controls which objects are visible to any query.
+- ``insert``: Post-insert check. If the inserted object violates the policy,
+ the operation fails.
+- ``delete``: Controls which objects can be deleted.
+- ``update read``: Pre-update check on which objects can be updated at all.
+- ``update write``: Post-update check for how objects can be updated.
+- ``all``: Shorthand for granting or denying ``select, insert, update,
+ delete``.
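+
+To make the ``update read`` / ``update write`` distinction concrete, here is
+a sketch (a fragment of the blog schema above; a ``select`` policy is still
+needed in practice, since unselectable objects can't be modified) that lets
+authors update only their own posts and rejects updates that transfer a post
+to another user:
+
+.. code-block:: sdl
+
+    type BlogPost {
+        required title: str;
+        required author: User;
+
+        # Pre-update: only the author's own posts can be updated at all.
+        access policy author_can_update
+            allow update read
+            using (global current_user ?= .author.id);
+
+        # Post-update: the author must still match after the update,
+        # so a post can't be reassigned to a different user.
+        access policy no_transfer
+            allow update write
+            using (global current_user ?= .author.id);
+    }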
Resolution order
-^^^^^^^^^^^^^^^^
-
-An object type can contain an arbitrary number of access policies, including
-several conflicting ``allow`` and ``deny`` policies. |Gel| uses a particular
-algorithm for resolving these policies.
-
-.. figure:: images/ols.png
-
- The access policy resolution algorithm, explained with Venn diagrams.
+================
-1. When no policies are defined on a given object type, all objects of that
- type can be read or modified by any appropriately authenticated connection.
+If multiple policies apply (some are ``allow`` and some are ``deny``), the
+logic is:
-2. Gel then applies all ``allow`` policies. Each policy grants a
- *permission* that is scoped to a particular *set of objects* as defined by
- the ``using`` clause. Conceptually, these permissions are merged with
- the ``union`` / ``or`` operator to determine the set of allowable actions.
+1. If there are no policies, access is allowed.
+2. All ``allow`` policies collectively form a *union* / *or* of allowed sets.
+3. All ``deny`` policies *subtract* from that union, overriding allows!
+4. Steps 2 and 3 are resolved separately for each operation:
+ ``select``, ``insert``, ``update read``, ``update write``, and ``delete``.
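+
+As a sketch of steps 2 and 3 (assuming a hypothetical ``published``
+property), an object readable under an ``allow select`` policy can still be
+hidden by a ``deny select`` policy:
+
+.. code-block:: sdl
+
+    type BlogPost {
+        required title: str;
+        required published: bool { default := false };
+
+        # Allow set: every post is readable...
+        access policy anyone_can_read
+            allow select;
+
+        # ...except drafts, which the deny policy subtracts.
+        access policy hide_drafts
+            deny select
+            using (not .published);
+    }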
-3. After the ``allow`` policies are resolved, the ``deny`` policies can be
- used to carve out exceptions to the ``allow`` rules. Deny rules *supersede*
- allow rules! As before, the set of objects targeted by the policy is
- defined by the ``using`` clause.
-
-4. This results in the final access level: a set of objects targetable by each
- of ``select``, ``insert``, ``update read``, ``update write``, and
- ``delete``.
-
-Currently, by default the access policies affect the values visible
-in expressions of *other* access
-policies. This means that they can affect each other in various ways. Because
-of this, great care needs to be taken when creating access policies based on
-objects other than the ones they are defined on. For example:
+By default, once you define any policy on an object type, you must explicitly
+allow the operations you need. This is a common **pitfall** when you are
+starting out with access policies (but you will develop an intuition for this
+quickly). Let's look at an example:
.. code-block:: sdl
@@ -410,45 +351,34 @@ objects other than the ones they are defined on. For example:
using (global current_user ?= .author.id);
}
-In the above schema only the admin will see a non-empty ``author`` link,
-because only the admin can see any user objects at all. This means that
-instead of making ``BlogPost`` visible to its author, all non-admin authors
-won't be able to see their own posts. The above issue can be remedied by
-making the current user able to see their own ``User`` record.
-
-.. _ref_datamodel_access_policies_nonrecursive:
-.. _nonrecursive:
+In the above schema only admins will see a non-empty ``author`` link when
+running ``select BlogPost { author }``. Why? Because only admins can see
+``User`` objects at all: the ``admin_only`` policy is the only one defined on
+the ``User`` type!
-.. note::
+This means that instead of making ``BlogPost`` visible to its author, all
+non-admin authors won't be able to see their own posts. The above issue can be
+remedied by making the current user able to see their own ``User`` record.
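+
+For instance (a sketch; the exact admin check on ``User`` depends on your
+schema), adding a self-visibility policy lets every user ``select`` their
+own record, so ``.author`` resolves for non-admins too:
+
+.. code-block:: sdl
+
+    type User {
+        required email: str { constraint exclusive; }
+
+        # ... the admin_only policy from the example above ...
+
+        # Non-admins can read (but not modify) their own record.
+        access policy self_can_read
+            allow select
+            using (global current_user ?= .id);
+    }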
- Starting with |EdgeDB| 3.0, access policy restrictions will **not** apply to
- any access policy expression. This means that when reasoning about access
- policies it is no longer necessary to take other policies into account.
- Instead, all data is visible for the purpose of *defining* an access
- policy.
- This change is being made to simplify reasoning about access policies and
- to allow certain patterns to be express efficiently. Since those who have
- access to modifying the schema can remove unwanted access policies, no
- additional security is provided by applying access policies to each
- other's expressions.
+Interaction between policies
+============================
- It is possible (and recommended) to enable this :ref:`future
- ` behavior in |EdgeDB| 2.6 and later by adding the
- following to the schema: ``using future nonrecursive_access_policies;``
+Policy expressions themselves do not take other policies into account
+(since |EdgeDB| 3). This makes it easier to reason about policies.
Custom error messages
-^^^^^^^^^^^^^^^^^^^^^
+=====================
.. index:: access policy, errmessage, using
-When you run a query that attempts a write and is restricted by an access
-policy, you will get a generic error message.
+When an ``insert`` or ``update write`` violates an access policy, Gel will
+raise a generic ``AccessPolicyError``:
.. code-block::
- gel error: AccessPolicyError: access policy violation on insert of
-
+ gel error: AccessPolicyError: access policy violation
+ on insert of <type>
.. note::
@@ -458,9 +388,8 @@ policy, you will get a generic error message.
simply won't get the data that is being restricted. Other operations
(``insert`` and ``update write``) will return an error message.
-If you have multiple access policies, it can be useful to know which policy is
-restricting your query and provide a friendly error message. You can do this
-by adding a custom error message to your policy.
+If multiple policies are in effect, it can be helpful to define a distinct
+``errmessage`` in your policy:
.. code-block:: sdl-diff
@@ -499,134 +428,126 @@ will receive this error:
gel error: AccessPolicyError: access policy violation on insert of
default::User (Only admins may query Users)
+
Disabling policies
-^^^^^^^^^^^^^^^^^^
+==================
.. index:: apply_access_policies
You may disable all access policies by setting the ``apply_access_policies``
:ref:`configuration parameter ` to ``false``.
-You may also toggle access policies using the "Disable Access Policies"
-checkbox in the "Config" dropdown in the Gel UI (accessible by running
-the CLI command :gelcmd:`ui` from inside your project). This is the most
-convenient way to temporarily disable access policies since it applies only to
-your UI session.
+You may also temporarily disable access policies via the configuration
+checkbox in the Gel UI (opened with :gelcmd:`ui`), which applies only to
+your UI session.
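+
+For example, to toggle the parameter for the current session in the REPL
+(the exact output may vary):
+
+.. code-block:: edgeql-repl
+
+    db> configure session set apply_access_policies := false;
+    OK: CONFIGURE SESSION
+    db> configure session reset apply_access_policies;
+    OK: CONFIGURE SESSION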
+More examples
+=============
-Examples
-^^^^^^^^
+Here are some additional patterns:
-Blog posts are publicly visible if ``published`` but only writable by the
-author.
+1. Publicly visible blog posts, only writable by the author:
-.. code-block:: sdl-diff
-
- global current_user: uuid;
+ .. code-block:: sdl-diff
- type User {
- required email: str { constraint exclusive; }
- }
+ global current_user: uuid;
- type BlogPost {
- required title: str;
- required author: User;
- + required published: bool { default := false };
+ type User {
+ required email: str { constraint exclusive; }
+ }
- access policy author_has_full_access
- allow all
- using (global current_user ?= .author.id);
- + access policy visible_if_published
- + allow select
- + using (.published);
- }
+ type BlogPost {
+ required title: str;
+ required author: User;
+ + required published: bool { default := false };
-Blog posts are visible to friends but only modifiable by the author.
+ access policy author_has_full_access
+ allow all
+ using (global current_user ?= .author.id);
+ + access policy visible_if_published
+ + allow select
+ + using (.published);
+ }
-.. code-block:: sdl-diff
+2. Visible to friends, only modifiable by the author:
- global current_user: uuid;
+ .. code-block:: sdl-diff
- type User {
- required email: str { constraint exclusive; }
- + multi friends: User;
- }
+ global current_user: uuid;
- type BlogPost {
- required title: str;
- required author: User;
+ type User {
+ required email: str { constraint exclusive; }
+ + multi friends: User;
+ }
- access policy author_has_full_access
- allow all
- using (global current_user ?= .author.id);
- + access policy friends_can_read
- + allow select
- + using ((global current_user in .author.friends.id) ?? false);
- }
+ type BlogPost {
+ required title: str;
+ required author: User;
-Blog posts are publicly visible except to users that have been ``blocked`` by
-the author.
+ access policy author_has_full_access
+ allow all
+ using (global current_user ?= .author.id);
+ + access policy friends_can_read
+ + allow select
+ + using ((global current_user in .author.friends.id) ?? false);
+ }
-.. code-block:: sdl-diff
+3. Publicly visible except to those blocked by the author:
- type User {
- required email: str { constraint exclusive; }
- + multi blocked: User;
- }
+ .. code-block:: sdl-diff
- type BlogPost {
- required title: str;
- required author: User;
+ type User {
+ required email: str { constraint exclusive; }
+ + multi blocked: User;
+ }
- access policy author_has_full_access
- allow all
- using (global current_user ?= .author.id);
- + access policy anyone_can_read
- + allow select;
- + access policy exclude_blocked
- + deny select
- + using ((global current_user in .author.blocked.id) ?? false);
- }
+ type BlogPost {
+ required title: str;
+ required author: User;
+ access policy author_has_full_access
+ allow all
+ using (global current_user ?= .author.id);
+ + access policy anyone_can_read
+ + allow select;
+ + access policy exclude_blocked
+ + deny select
+ + using ((global current_user in .author.blocked.id) ?? false);
+ }
-"Disappearing" posts that become invisible after 24 hours.
+4. "Disappearing" posts that become invisible after 24 hours:
-.. code-block:: sdl-diff
+ .. code-block:: sdl-diff
- type User {
- required email: str { constraint exclusive; }
- }
+ type User {
+ required email: str { constraint exclusive; }
+ }
- type BlogPost {
- required title: str;
- required author: User;
- + required created_at: datetime {
- + default := datetime_of_statement() # non-volatile
- + }
+ type BlogPost {
+ required title: str;
+ required author: User;
+ + required created_at: datetime {
+ + default := datetime_of_statement() # non-volatile
+ + }
- access policy author_has_full_access
- allow all
- using (global current_user ?= .author.id);
- + access policy hide_after_24hrs
- + allow select
- + using (datetime_of_statement() - .created_at < '24 hours');
- }
+ access policy author_has_full_access
+ allow all
+ using (global current_user ?= .author.id);
+ + access policy hide_after_24hrs
+ + allow select
+ + using (
+ + datetime_of_statement() - .created_at < '24 hours'
+ + );
+ }
Super constraints
-*****************
-
-Access policies support arbitrary EdgeQL and can be used to define "super
-constraints". Policies on ``insert`` and ``update write`` can
-be thought of as post-write "validity checks"; if the check fails, the write
-will be rolled back.
-
-.. note::
+=================
- Due to an underlying Postgres limitation, :ref:`constraints on object types
- ` can only reference properties, not
- links.
+Access policies can act like "super constraints." For instance, a policy on
+``insert`` or ``update write`` can do a post-write validity check, rejecting
+the operation if a certain condition is not met.
-Here's a policy that limits the number of blog posts a ``User`` can post.
+E.g. here's a policy that limits the number of blog posts a
+``User`` can post:
.. code-block:: sdl-diff
@@ -647,9 +568,236 @@ Here's a policy that limits the number of blog posts a ``User`` can post.
+ using (count(.author.posts) > 500);
}
-.. list-table::
- :class: seealso
+.. _ref_eql_sdl_access_policies:
+.. _ref_eql_sdl_access_policies_syntax:
+
+Declaring access policies
+=========================
+
+This section describes the syntax to declare access policies in your schema.
+
+Syntax
+------
+
+.. sdl:synopsis::
+
+ access policy <name>
+ [ when (<condition>) ]
+ { allow | deny } <action> [, <action> ... ]
+ [ using (<expr>) ]
+ [ "{"
+ [ errmessage := value ; ]
+ [ <annotation-declarations> ]
+ "}" ] ;
+
+ # where <action> is one of
+ all
+ select
+ insert
+ delete
+ update [{ read | write }]
+
+Where:
+
+:eql:synopsis:`<name>`
+ The name of the access policy.
+
+:eql:synopsis:`when (<condition>)`
+ Specifies which objects this policy applies to. The
+ :eql:synopsis:`<condition>` has to be a :eql:type:`bool` expression.
+
+ When omitted, it is assumed that this policy applies to all objects of a
+ given type.
+
+:eql:synopsis:`allow`
+ Indicates that qualifying objects should allow access under this policy.
+
+:eql:synopsis:`deny`
+ Indicates that qualifying objects should *not* allow access under this
+ policy. This flavor supersedes any :eql:synopsis:`allow` policy and can
+ be used to selectively deny access to a subset of objects that would
+ otherwise be allowed.
+
+:eql:synopsis:`all`
+ Apply the policy to all actions. It is exactly equivalent to listing
+ :eql:synopsis:`select`, :eql:synopsis:`insert`, :eql:synopsis:`delete`,
+ :eql:synopsis:`update` actions explicitly.
+
+:eql:synopsis:`select`
+ Apply the policy to all selection queries. Note that any object that
+ cannot be selected, cannot be modified either. This makes
+ :eql:synopsis:`select` the most basic "visibility" policy.
+
+:eql:synopsis:`insert`
+ Apply the policy to all inserted objects. If a newly inserted object would
+ violate this policy, an error is produced instead.
+
+:eql:synopsis:`delete`
+ Apply the policy to all objects about to be deleted. If an object does not
+ allow access under this kind of policy, it is not going to be considered
+ by any :eql:stmt:`delete` command.
+
+ Note that any object that cannot be selected, cannot be modified either.
+
+:eql:synopsis:`update read`
+ Apply the policy to all objects selected for an update. If an object does
+ not allow access under this kind of policy, it is not visible and cannot
+ be updated.
+
+ Note that any object that cannot be selected, cannot be modified either.
+
+:eql:synopsis:`update write`
+ Apply the policy to all objects at the end of an update. If an updated
+ object violates this policy, an error is produced instead.
+
+ Note that any object that cannot be selected, cannot be modified either.
+
+:eql:synopsis:`update`
+ This is just a shorthand for :eql:synopsis:`update read` and
+ :eql:synopsis:`update write`.
+
+ Note that any object that cannot be selected, cannot be modified either.
+
+:eql:synopsis:`using (<expr>)`
+ Specifies what the policy is with respect to a given eligible (based on
+ the :eql:synopsis:`when` clause) object. The :eql:synopsis:`<expr>` has to
+ be
+ a :eql:type:`bool` expression. The specific meaning of this value also
+ depends on whether this policy flavor is :eql:synopsis:`allow` or
+ :eql:synopsis:`deny`.
+
+ The expression must be :ref:`Stable `.
+
+ When omitted, it is assumed that this policy applies to all eligible
+ objects of a given type.
+
+:eql:synopsis:`set errmessage := <value>`
+ Set a custom error message of :eql:synopsis:`<value>` that is displayed
+ when this access policy prevents a write action.
+
+:sdl:synopsis:`<annotation-declarations>`
+ Set access policy :ref:`annotation `
+ to a given *value*.
+
+Any sub-type extending a type inherits all of its access policies.
+You can define additional access policies on sub-types.
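+
+For example (a sketch building on the blog schema), a sub-type keeps the
+parent's policies and can layer its own on top:
+
+.. code-block:: sdl
+
+    type BlogPost {
+        required title: str;
+        required author: User;
+
+        access policy author_has_full_access
+            allow all
+            using (global current_user ?= .author.id);
+    }
+
+    # PinnedPost inherits author_has_full_access
+    # and additionally makes itself readable by anyone.
+    type PinnedPost extending BlogPost {
+        access policy anyone_can_read
+            allow select;
+    }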
+
+
+.. _ref_eql_ddl_access_policies:
+
+DDL commands
+============
+
+This section describes the low-level DDL commands for creating, altering, and
+dropping access policies. You typically don't need to use these commands
+directly, but knowing about them is useful for reviewing migrations.
+
+Create access policy
+--------------------
+
+:eql-statement:
+
+Define a new object access policy on a type:
+
+.. eql:synopsis::
+
+ [ with <with-item> [, ...] ]
+ { create | alter } type <TypeName> "{"
+ [ ... ]
+ create access policy <name>
+ [ when (<condition>) ; ]
+ { allow | deny } <action> [, <action> ... ; ]
+ [ using (<expr>) ; ]
+ [ "{"
+ [ set errmessage := value ; ]
+ [ create annotation <annotation-name> := value ; ]
+ "}" ]
+ "}"
+
+ # where <action> is one of
+ all
+ select
+ insert
+ delete
+ update [{ read | write }]
+
+See the meaning of each parameter in the `Declaring access policies`_ section.
+
+The following subcommands are allowed in the ``create access policy`` block:
+
+:eql:synopsis:`set errmessage := <value>`
+ Set a custom error message of :eql:synopsis:`<value>` that is displayed
+ when this access policy prevents a write action.
+
+:eql:synopsis:`create annotation <annotation-name> := <value>`
+ Set access policy annotation :eql:synopsis:`<annotation-name>` to
+ :eql:synopsis:`<value>`.
+
+ See :eql:stmt:`create annotation` for details.
+
+
+Alter access policy
+-------------------
+
+:eql-statement:
+
+Modify an existing access policy:
+
+.. eql:synopsis::
+
+ [ with <with-item> [, ...] ]
+ alter type <TypeName> "{"
+ [ ... ]
+ alter access policy <name> "{"
+ [ when (<condition>) ; ]
+ [ reset when ; ]
+ { allow | deny } <action> [, <action> ... ; ]
+ [ using (<expr>) ; ]
+ [ set errmessage := value ; ]
+ [ reset expression ; ]
+ [ create annotation <annotation-name> := <value> ; ]
+ [ alter annotation <annotation-name> := <value> ; ]
+ [ drop annotation <annotation-name> ; ]
+ "}"
+ "}"
+
+You can change the policy's condition, actions, or error message, or add/drop
+annotations.
+
+The parameters describing the action policy are identical to the parameters
+used by ``create access policy``. There are a handful of additional
+subcommands that are allowed in the ``alter access policy`` block:
+
+:eql:synopsis:`reset when`
+ Clear the :eql:synopsis:`when (<condition>)` so that the policy applies to
+ all objects of a given type. This is equivalent to ``when (true)``.
+
+:eql:synopsis:`reset expression`
+ Clear the :eql:synopsis:`using (<expr>)` so that the policy always
+ passes. This is equivalent to ``using (true)``.
+
+:eql:synopsis:`alter annotation <annotation-name>;`
+ Alter access policy annotation :eql:synopsis:`<annotation-name>`.
+ See :eql:stmt:`alter annotation` for details.
+
+:eql:synopsis:`drop annotation <annotation-name>;`
+ Remove access policy annotation :eql:synopsis:`<annotation-name>`.
+ See :eql:stmt:`drop annotation` for details.
+
+
+All the subcommands allowed in the ``create access policy`` block are also
+valid subcommands for the ``alter access policy`` block.
+
+Drop access policy
+------------------
+
+:eql-statement:
+
+Remove an existing policy:
+
+.. eql:synopsis::
- * - **See also**
- * - :ref:`SDL > Access policies `
- * - :ref:`DDL > Access policies `
+ [ with <with-item> [, ...] ]
+ alter type <TypeName> "{"
+ [ ... ]
+ drop access policy <name> ;
+ "}"
diff --git a/docs/datamodel/aliases.rst b/docs/datamodel/aliases.rst
index fc56a1dd911..587c9eb4c3f 100644
--- a/docs/datamodel/aliases.rst
+++ b/docs/datamodel/aliases.rst
@@ -6,85 +6,99 @@ Aliases
.. index:: alias, virtual type
-.. important::
+You can think of *aliases* as a way to give schema names to arbitrary EdgeQL
+expressions. You can later refer to aliases in queries and in other aliases.
- This section assumes a basic understanding of EdgeQL. If you aren't familiar
- with it, feel free to skip this page for now.
+Aliases are functionally equivalent to expression aliases defined in EdgeQL
+statements in :ref:`with block <ref_eql_statements_with>`, but are available
+to all queries using the schema and can be introspected.
+Like computed properties, the aliased expression is evaluated on the fly
+whenever the alias is referenced.
-An **alias** is a *pointer* to a set of values. This set is defined with an
-arbitrary EdgeQL expression.
-Like computed properties, this expression is evaluated on the fly whenever the
-alias is referenced in a query. Unlike computed properties, aliases are
-defined independent of an object type; they are standalone expressions.
-As such, aliases are fairly open ended. Some examples are:
-
-**Scalar alias**
+Scalar alias
+============
.. code-block:: sdl
+ # in your schema:
alias digits := {0,1,2,3,4,5,6,7,8,9};
-**Object type alias**
+Later, in some query:
+
+.. code-block:: edgeql
+
+ select count(digits);
+
+
+Object type alias
+=================
The name of a given object type (e.g. ``User``) is itself a pointer to the *set
of all User objects*. After declaring the alias below, you can use ``User`` and
-``UserAlias`` interchangably.
+``UserAlias`` interchangeably:
.. code-block:: sdl
alias UserAlias := User;
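+
+Later, in some query:
+
+.. code-block:: edgeql
+
+    # identical to `select User { id }`
+    select UserAlias { id };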
-**Object type alias with computeds**
+Object type alias with computeds
+================================
-Object type aliases can include a *shape* that declare additional computed
-properties or links.
+Object type aliases can include a *shape* that declares additional computed
+properties or links:
.. code-block:: sdl
- type Post {
- required title: str;
- }
+ type Post {
+ required title: str;
+ }
+
+ alias PostWithTrimmedTitle := Post {
+ trimmed_title := str_trim(.title)
+ }
- alias PostAlias := Post {
- trimmed_title := str_trim(.title)
- }
+Later, in some query:
-In effect, this creates a *virtual subtype* of the base type, which can be
-referenced in queries just like any other type.
+.. code-block:: edgeql
-**Other arbitrary expressions**
+ select PostWithTrimmedTitle {
+ trimmed_title
+ };
+
+Arbitrary expressions
+=====================
Aliases can correspond to any arbitrary EdgeQL expression, including entire
queries.
.. code-block:: sdl
- # Tuple alias
- alias Color := ("Purple", 128, 0, 128);
-
- # Named tuple alias
- alias GameInfo := (
- name := "Li Europan Lingues",
- country := "Iceland",
- date_published := 2023,
- creators := (
- (name := "Bob Bobson", age := 20),
- (name := "Trina Trinadóttir", age := 25),
- ),
- );
-
- type BlogPost {
- required title: str;
- required is_published: bool;
- }
-
- # Query alias
- alias PublishedPosts := (
- select BlogPost
- filter .is_published = true
- );
+ # Tuple alias
+ alias Color := ("Purple", 128, 0, 128);
+
+ # Named tuple alias
+ alias GameInfo := (
+ name := "Li Europan Lingues",
+ country := "Iceland",
+ date_published := 2023,
+ creators := (
+ (name := "Bob Bobson", age := 20),
+ (name := "Trina Trinadóttir", age := 25),
+ ),
+ );
+
+ type BlogPost {
+ required title: str;
+ required is_published: bool;
+ }
+
+ # Query alias
+ alias PublishedPosts := (
+ select BlogPost
+ filter .is_published = true
+ );
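+
+Later, in some query:
+
+.. code-block:: edgeql
+
+    select PublishedPosts { title };
+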
.. note::
@@ -92,11 +106,127 @@ queries.
`.
+.. _ref_eql_sdl_aliases:
+.. _ref_eql_sdl_aliases_syntax:
+
+Defining aliases
+================
+
+Syntax
+------
+
+Define a new alias corresponding to the :ref:`more explicit DDL
+commands <ref_eql_ddl_aliases>`.
+
+.. sdl:synopsis::
+
+    alias <alias-name> := <alias-expr> ;
+
+    alias <alias-name> "{"
+        using <alias-expr> ;
+        [ <annotation-declarations> ]
+    "}" ;
+
+Where:
+
+:eql:synopsis:`<alias-name>`
+    The name (optionally module-qualified) of an alias to be created.
+
+:eql:synopsis:`<alias-expr>`
+    The aliased expression. Must be a :ref:`Stable <ref_reference_volatility>`
+    EdgeQL expression.
+
+The valid SDL sub-declarations are listed below:
+
+:sdl:synopsis:`<annotation-declarations>`
+    Set alias :ref:`annotation <ref_eql_sdl_annotations>`
+    to a given *value*.
+
+
+.. _ref_eql_ddl_aliases:
+
+DDL commands
+============
+
+This section describes the low-level DDL commands for creating and
+dropping aliases. You typically don't need to use these commands
+directly, but knowing about them is useful for reviewing migrations.
+
+Create alias
+------------
+
+:eql-statement:
+:eql-haswith:
+
+Define a new alias in the schema.
+
+.. eql:synopsis::
+
+    [ with <with-item> [, ...] ]
+    create alias <alias-name> := <alias-expr> ;
+
+    [ with <with-item> [, ...] ]
+    create alias <alias-name> "{"
+        using <alias-expr> ;
+        [ create annotation <annotation-name> := <value> ; ... ]
+    "}" ;
+
+    # where <with-item> is:
+
+    [ <module-alias> := ] module <module-name>
+
+Parameters
+^^^^^^^^^^
+
+Most sub-commands and options of this command are identical to the
+:ref:`SDL alias declaration <ref_eql_sdl_aliases_syntax>`, with some
+additional features listed below:
+
+:eql:synopsis:`[ <module-alias> := ] module <module-name>`
+ An optional list of module alias declarations to be used in the
+ alias definition.
+
+:eql:synopsis:`create annotation <annotation-name> := <value>;`
+ An optional list of annotation values for the alias.
+ See :eql:stmt:`create annotation` for details.
+
+Example
+^^^^^^^
+
+Create a new alias:
+
+.. code-block:: edgeql
+
+ create alias Superusers := (
+ select User filter User.groups.name = 'Superusers'
+ );
+
+
+Drop alias
+----------
+
+:eql-statement:
+:eql-haswith:
+
+Remove an alias from the schema.
+
+.. eql:synopsis::
+
+    [ with <with-item> [, ...] ]
+    drop alias <alias-name> ;
+
+Parameters
+^^^^^^^^^^
+
+:eql:synopsis:`<alias-name>`
+    The name (optionally qualified with a module name) of an existing
+    expression alias.
+
+Example
+^^^^^^^
+
+Remove an alias:
-.. list-table::
- :class: seealso
+.. code-block:: edgeql
- * - **See also**
- * - :ref:`SDL > Aliases `
- * - :ref:`DDL > Aliases `
- * - :ref:`Cheatsheets > Aliases `
+    drop alias Superusers;
diff --git a/docs/datamodel/annotations.rst b/docs/datamodel/annotations.rst
index 4983081d205..1ddb58929cb 100644
--- a/docs/datamodel/annotations.rst
+++ b/docs/datamodel/annotations.rst
@@ -1,46 +1,51 @@
.. _ref_datamodel_annotations:
+.. _ref_eql_sdl_annotations:
===========
Annotations
===========
-.. index:: annotation, title, description, deprecated
+.. index:: annotation
-*Annotations* are named values associated with schema items and
-are designed to hold arbitrary schema-level metadata represented as a
-:eql:type:`str`.
+*Annotations* are named values associated with schema items and are
+designed to hold arbitrary schema-level metadata represented as a
+:eql:type:`str` (unstructured text).
+
+Users can store JSON-encoded data in annotations if they need to store
+more complex metadata.
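+
+For instance, a hypothetical ownership record could be kept as JSON text
+inside the standard ``description`` annotation:
+
+.. code-block:: sdl
+
+    type Report {
+        annotation description := '{"owner": "data-team", "pii": true}';
+    }
+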
Standard annotations
---------------------
+====================
+
+.. index:: title, description, deprecated
-There are a number of annotations defined in the standard library.
-The following are the annotations which can be set on any schema item:
+There are a number of annotations defined in the standard library. The
+following are the annotations which can be set on any schema item:
-- ``title``
-- ``description``
-- ``deprecated``
+- ``std::title``
+- ``std::description``
+- ``std::deprecated``
For example, consider the following declaration:
.. code-block:: sdl
- type Status {
- annotation title := 'Activity status';
- annotation description := 'All possible user activities';
+ type Status {
+ annotation title := 'Activity status';
+ annotation description := 'All possible user activities';
- required name: str {
- constraint exclusive
- }
+ required name: str {
+ constraint exclusive
}
+ }
-The ``deprecated`` annotation is used to mark deprecated items (e.g.
-:eql:func:`str_rpad`) and to provide some information such as what
+And the ``std::deprecated`` annotation can be used to mark deprecated items
+(e.g., :eql:func:`str_rpad`) and to provide some information such as what
should be used instead.
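+
+For example, a deprecated type in a user's schema might be marked like
+this (``LegacyStatus`` being a hypothetical type):
+
+.. code-block:: sdl
+
+    type LegacyStatus {
+        annotation deprecated := 'Use Status instead.';
+    }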
-
User-defined annotations
-------------------------
+========================
.. index:: abstract annotation
@@ -58,12 +63,328 @@ and code generation.
}
+.. _ref_eql_sdl_annotations_syntax:
+
+Declaring annotations
+=====================
+
+This section describes the syntax to use annotations in your schema.
+
+Syntax
+------
+
+.. sdl:synopsis::
+
+    # Abstract annotation form:
+    abstract [ inheritable ] annotation <name>
+    [ "{" <annotation-declarations>; [...] "}" ] ;
+
+    # Concrete annotation (same as <annotation-declarations>) form:
+    annotation <annotation-name> := <value> ;
+
+Description
+^^^^^^^^^^^
+
+There are two forms of annotation declarations: abstract and concrete.
+The *abstract annotation* form is used for declaring new kinds of
+annotation in a module. The *concrete annotation* declarations are
+used as sub-declarations for all other declarations in order to
+actually annotate them.
+
+The annotation declaration options are as follows:
+
+:eql:synopsis:`abstract`
+ If specified, the annotation will be *abstract*.
+
+:eql:synopsis:`inheritable`
+    If specified, the annotation will be *inheritable*. Annotations are
+    non-inheritable by default: if a schema item has an annotation
+    defined on it, the descendants of that schema item will not
+    automatically inherit it. Normal inheritance behavior can be turned
+    on by declaring the annotation with the ``inheritable`` qualifier.
+    This is only valid for *abstract annotations*.
+
+:eql:synopsis:`<name>`
+    The name (optionally module-qualified) of the annotation.
+
+:eql:synopsis:`<value>`
+    Any string value that the specified annotation is intended to have
+    for the given context.
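+
+To illustrate the ``inheritable`` qualifier described above, a minimal
+sketch (the annotation and type names are invented):
+
+.. code-block:: sdl
+
+    abstract inheritable annotation admin_note;
+
+    type ContentItem {
+        annotation admin_note := 'Review before publishing';
+    }
+
+    # Post inherits admin_note from ContentItem
+    type Post extending ContentItem;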
+
+The only valid SDL sub-declarations are *concrete annotations*:
+
+:sdl:synopsis:`