Merge pull request #5889 from EnterpriseDB/release-2024-07-30a
Release 2024-07-30a
djw-m authored Jul 30, 2024
2 parents 0e9d94c + e7bc3cf commit 02ce1c2
Showing 120 changed files with 5,668 additions and 1,076 deletions.
22 changes: 22 additions & 0 deletions .github/workflows/check-links.yml
@@ -0,0 +1,22 @@
name: check links on PR
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  check-links:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4
        with:
          lfs: true
          ref: ${{ github.event.pull_request.head.sha }}

      - name: setup node
        uses: actions/setup-node@v4

      - name: install dependencies
        run: npm --prefix ./tools/automation/actions/link-check ci

      - name: check links
        uses: ./tools/automation/actions/link-check
26 changes: 26 additions & 0 deletions .vscode/launch.json
@@ -0,0 +1,26 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"type": "node",
"request": "launch",
"name": "Launch update links to renames",
"skipFiles": [
"<node_internals>/**"
],
"program": "${workspaceFolder}/tools/user/reorg/links/update-links-to-renames.js"
},
{
"type": "node",
"request": "launch",
"name": "Launch link-check",
"skipFiles": [
"<node_internals>/**"
],
"program": "${workspaceFolder}/tools/automation/actions/link-check/index.js"
}
]
}
4 changes: 2 additions & 2 deletions advocacy_docs/community/contributing/styleguide.mdx
@@ -389,11 +389,11 @@ Information about managing authentication is also available in the [Postgres co

If you're referring to a guide on Docs 2.0, the label is the name of the guide and in italics. For example:

For information about modifying the `pg_hba.conf` file, see the [_PEM Administrator's Guide_](https://www.enterprisedb.com/docs/pem/latest/pem_admin/).
For information about modifying the `pg_hba.conf` file, see the [_PEM Administrator's Guide_](/pem/latest/).

Link capitalization can be either title or sentence case:

* **Use title case** and _italics_ when referring to the linked doc by name. For example. “For information about modifying the `pg_hba.conf` file, see the [_PEM Administrator's Guide_](https://www.enterprisedb.com/docs/pem/latest/pem_admin/).”).
* **Use title case** and _italics_ when referring to the linked doc by name. For example, “For information about modifying the `pg_hba.conf` file, see the [_PEM Administrator's Guide_](/pem/latest/).”

* **Use sentence case** when linking in the middle of a sentence. For example, “\[\] follow the identifier rules when creating \[\].”

8 changes: 4 additions & 4 deletions advocacy_docs/edb-postgres-ai/ai-ml/index.mdx
Expand Up @@ -3,7 +3,7 @@ title: EDB Postgres AI - AI/ML
navTitle: AI/ML
indexCards: simple
iconName: BrainCircuit
description: How to make use of EDB Postgres AI for AI/ML workloads and using the pgai extension.
description: How to make use of EDB Postgres AI for AI/ML workloads and using the aidb extension.
navigation:
- overview
- install-tech-preview
@@ -12,11 +12,11 @@ navigation:

EDB Postgres® AI Database is designed to solve all AI data management needs, including storing, searching, and retrieving AI data. This up-levels Postgres to a database that manages and serves all types of data modalities directly, combining that capability with its battle-proven strengths as an established enterprise system of record for high-value business data.

In this tech preview, you can use the pgai extension to build a simple retrieval augmented generation (RAG) application in Postgres.
In this tech preview, you can use the aidb extension to build a simple retrieval augmented generation (RAG) application in Postgres.

An [overview](overview) of the pgai extension gives you a high-level understanding of the major functionality available to date.
An [overview](overview) of the aidb extension gives you a high-level understanding of the major functionality available to date.

To get started, you will need to [install the pgai tech preview](install-tech-preview) and then you can start [using the pgai tech preview](using-tech-preview) to build your RAG application.
To get started, you will need to [install the aidb tech preview](install-tech-preview) and then you can start [using the aidb tech preview](using-tech-preview) to build your RAG application.



38 changes: 19 additions & 19 deletions advocacy_docs/edb-postgres-ai/ai-ml/install-tech-preview.mdx
@@ -1,64 +1,64 @@
---
title: EDB Postgres AI AI/ML - Installing the pgai tech preview
title: EDB Postgres AI AI/ML - Installing the aidb tech preview
navTitle: Installing
description: How to install the EDB Postgres AI AI/ML pgai tech preview and run the container image.
description: How to install the EDB Postgres AI AI/ML aidb tech preview and run the container image.
prevNext: true
---

The preview release of pgai is distributed as a self-contained Docker container that runs PostgreSQL and includes all of the pgai dependencies.
The preview release of aidb is distributed as a self-contained Docker container that runs PostgreSQL and includes all of the aidb dependencies.

## Configuring and running the container image

If you haven't already, sign up for an EDB account and log in to the EDB container registry.

Log in to Docker with the username tech-preview and your EDB Repo 2.0 subscription token as your password:
Log in to Docker with the username tech-preview and your EDB Repos 2.0 subscription token as your password:

```shell
docker login docker.enterprisedb.com -u tech-preview -p <your_EDB_repo_token>
__OUTPUT__
Login Succeeded
```

Download the pgai container image:
Download the aidb container image:

```shell
docker pull docker.enterprisedb.com/tech-preview/pgai
docker pull docker.enterprisedb.com/tech-preview/aidb
__OUTPUT__
...
Status: Downloaded newer image for docker.enterprisedb.com/tech-preview/pgai:latest
docker.enterprisedb.com/tech-preview/pgai:latest
Status: Downloaded newer image for docker.enterprisedb.com/tech-preview/aidb:latest
docker.enterprisedb.com/tech-preview/aidb:latest
```

Specify a password to use for Postgres in the environment variable PGPASSWORD. The tech preview container will set up Postgres with this password and use it to connect to it. In bash or zsh set it as follows:
Specify a password to use for Postgres in the environment variable PGPASSWORD. The tech preview container sets up Postgres with this password and uses it to connect. In bash or zsh, set it as follows:

```shell
export PGPASSWORD=<your_password>
```

You can use the pgai extension with encoder LLMs in Open AI or with open encoder LLMs from HuggingFace. If you want to use Open AI you also must provide your API key for that in the OPENAI_API_KEY environment variable:
You can use the aidb extension with encoder LLMs in OpenAI or with open encoder LLMs from HuggingFace. If you want to use OpenAI, you must also provide your API key in the OPENAI_API_KEY environment variable:

```shell
export OPENAI_API_KEY=<your_openai_key>
```

You can use the pgai extension with AI data stored in Postgres tables or on S3 compatible object storage. To work with object storage you need to specify the ACCESS_KEY and SECRET_KEY environment variables:.
You can use the aidb extension with AI data stored in Postgres tables or on S3 compatible object storage. To work with object storage you need to specify the ACCESS_KEY and SECRET_KEY environment variables:

```shell
export ACCESS_KEY=<your_access_key>
export SECRET_KEY=<your_secret_key>
```

Start the pgai tech preview container with the following command. It makes the tech preview PostgreSQL database available on local port 15432:
Start the aidb tech preview container with the following command. It makes the tech preview PostgreSQL database available on local port 15432:

```shell
docker run -d --name pgai \
docker run -d --name aidb \
-e ACCESS_KEY=$ACCESS_KEY \
-e SECRET_KEY=$SECRET_KEY \
-e OPENAI_API_KEY=$OPENAI_API_KEY \
-e POSTGRES_PASSWORD=$PGPASSWORD \
-e PGDATA=/var/lib/postgresql/data/pgdata \
-p 15432:5432 \
docker.enterprisedb.com/tech-preview/pgai:latest
docker.enterprisedb.com/tech-preview/aidb:latest
```


@@ -70,7 +70,7 @@ If you haven't yet, install the Postgres command-line tools. If you're on a Mac,
brew install libpq
```

Connect to the tech preview PostgreSQL running in the container. Note that this relies on $PGPASSWORD being set - if you're using a different terminal for this part, make sure you re-export the password:
Connect to the tech preview PostgreSQL running in the container. Note that this relies on setting the PGPASSWORD environment variable. If you're using a different terminal for this part, make sure you re-export the password:

```shell
psql -h localhost -p 15432 -U postgres postgres
@@ -82,10 +82,10 @@ postgres=#
```


Install the pgai extension:
Install the aidb extension:

```sql
create extension pgai cascade;
create extension aidb cascade;
__OUTPUT__
NOTICE: installing required extension "plpython3u"
NOTICE: installing required extension "vector"
@@ -99,9 +99,9 @@ __OUTPUT__
List of installed extensions
Name | Version | Schema | Description
------------+---------+------------+------------------------------------------------------
pgai | 0.0.1 | public | An extension to do the AIs
aidb | 0.0.2 | public | An extension to do the AIs
plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
plpython3u | 1.0 | pg_catalog | PL/Python3U untrusted procedural language
vector | 0.6.0 | public | vector data type and ivfflat and hnsw access methods
vector | 0.7.2 | public | vector data type and ivfflat and hnsw access methods
(4 rows)
```
22 changes: 11 additions & 11 deletions advocacy_docs/edb-postgres-ai/ai-ml/overview.mdx
@@ -1,29 +1,29 @@
---
title: EDB Postgres AI AI/ML - Overview
navTitle: Overview
description: Where to start with EDB Postgres AI AI/ML and the pgai tech preview.
description: Where to start with EDB Postgres AI AI/ML and the aidb tech preview.
prevNext: True
---

At the heart of EDB Postgres® AI is the EDB Postgres AI database (pgai). This builds on Postgres's flexibility and extends its capability to include storing the vector data of embeddings.
At the heart of EDB Postgres® AI is the EDB Postgres AI database (aidb). This builds on Postgres's flexibility and extends its capabilities to include storing vector embeddings.

The pgai extension is currently available as a tech preview. It will be continuously extended with new functions. This overview presents the functionality available to date.
The aidb extension is currently available as a tech preview. It will be continuously extended with new functions. This overview presents the functionality available to date.

![PGAI Overview](images/pgai-overview-withbackground.png)
![AIDB Overview](images/aidb-overview-withbackground.png)

pgai introduces the concept of a “retriever” that you can create for a given type and location of AI data. Currently pgai supports unstructured plain text documents as well as a set of image formats. This data can either reside in regular columns of a Postgres table or it can reside in an S3 compatible object storage bucket.
aidb introduces the concept of a “retriever” that you can create for a given type and location of AI data. Currently aidb supports unstructured plain text documents as well as a set of image formats. This data can reside either in regular columns of a Postgres table or in an S3 compatible object storage bucket.

A retriever encapsulates all processing that is needed to make the AI data in the provided source location searchable and retrievable through similarity. The application just needs to create a retriever via the `pgai.create_retriever()` function. When `auto_embedding=TRUE` is specified the pgai extension will automatically generate embeddings for all the data in the source location.
A retriever encapsulates all processing that is needed to make the AI data in the provided source location searchable and retrievable through similarity. The application just needs to create a retriever via the `aidb.create_retriever()` function. When `auto_embedding=TRUE` is specified, the aidb extension automatically generates embeddings for all the data in the source location.

Otherwise it will be up to the application to request a bulk generation of embeddings using `pgai.refresh_retriever()`.
Otherwise it will be up to the application to request a bulk generation of embeddings using `aidb.refresh_retriever()`.

Auto embedding is currently supported for AI data stored in Postgres tables and it automates the embedding updates using Postgres triggers. You can also combine the two options by using pgai.refresh_retriever() to embed all previously existing data and also setting `auto_embedding=TRUE` to generate embeddings for all new and changed data from now on.
Auto embedding is currently supported for AI data stored in Postgres tables and it automates the embedding updates using Postgres triggers. You can also combine the two options by using `aidb.refresh_retriever()` to embed all previously existing data and also setting `auto_embedding=TRUE` to generate embeddings for all new and changed data from now on.

All embedding generation, storage, indexing and management is handled by the pgai extension internally. The application just has to specify the encoder LLM that the retriever should be using for this specific data and use case.
All embedding generation, storage, indexing, and management is handled by the aidb extension internally. The application just has to specify the encoder LLM that the retriever should be using for this specific data and use case.

Once a retriever is created and all embeddings are up to date, the application can just use pgai.retrieve() to run a similarity search and retrieval by providing a query input. When the retriever is created for text data, the query input is also a text term. For image retrievers the query input is an image. The pgai retriever makes sure to use the same encoder LLM for the query input, conducts a similarity search and finally returns the ranked list of similar data from the source location.
Once a retriever is created and all embeddings are up to date, the application can just use `aidb.retrieve()` to run a similarity search and retrieval by providing a query input. When the retriever is created for text data, the query input is also a text term. For image retrievers, the query input is an image. The aidb retriever makes sure to use the same encoder LLM for the query input, conducts a similarity search, and finally returns the ranked list of similar data from the source location.

pgai currently supports a broad list of open encoder LLMs from HuggingFace as well as a set of OpenAI encoders. Consult the list of supported encoder LLMs in the pgai.encoders meta table. HuggingFace LLMs are running locally on the Postgres node, while OpenAI encoders involve a call out to the OpenAI cloud service.
aidb currently supports a broad list of open encoder LLMs from HuggingFace as well as a set of OpenAI encoders. Consult the list of supported encoder LLMs in the `aidb.encoders` meta table. HuggingFace LLMs run locally on the Postgres node, while OpenAI encoders involve a call out to the OpenAI cloud service.
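Put together, the retriever flow described above can be sketched end to end. The function names (`aidb.create_retriever()`, `aidb.refresh_retriever()`, `aidb.retrieve()`) come from this overview, but the argument names and order shown below are illustrative assumptions; check the tech preview reference for the exact signatures.

```sql
-- Illustrative sketch only: the function names are from this overview,
-- but the arguments are assumptions, not the exact tech preview signatures.

-- Create a retriever over a hypothetical Postgres table of product
-- descriptions, generating embeddings automatically as rows change:
SELECT aidb.create_retriever(
    'products_retriever',       -- hypothetical retriever name
    'all-MiniLM-L6-v2',         -- an open HuggingFace encoder LLM
    'products', 'description',  -- hypothetical source table and column
    auto_embedding => TRUE
);

-- With auto_embedding => FALSE, you would instead embed existing rows in bulk:
SELECT aidb.refresh_retriever('products_retriever');

-- Run a similarity search and get back a ranked list of matches:
SELECT * FROM aidb.retrieve('waterproof hiking boots', 'products_retriever');
```

Because the retriever records which encoder LLM it was created with, the same model is applied to the query input at retrieval time, which is what makes the similarity comparison meaningful.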



@@ -1,15 +1,15 @@
---
title: Additional functions and standalone embedding in pgai
title: Additional functions and standalone embedding in aidb
navTitle: Additional functions
description: Other pgai extension functions and how to generate embeddings for images and text.
description: Other aidb extension functions and how to generate embeddings for images and text.
---

## Standalone embedding

Use the `generate_single_image_embedding` function to get embeddings for the given image. Currently, `model_provider` can only be `openai` or `huggingface`. You can check the list of valid embedding models and model providers from the Encoders Supported PGAI section.
Use the `generate_single_image_embedding` function to get embeddings for the given image. Currently, `model_provider` can only be `openai` or `huggingface`. You can check the list of valid embedding models and model providers in the Supported encoders section.

```sql
SELECT pgai.generate_single_image_embedding(
SELECT aidb.generate_single_image_embedding(
'clip-vit-base-patch32', -- embedding model name
'openai', -- model provider
'https://s3.us-south.cloud-object-storage.appdomain.cloud', -- S3 endpoint
@@ -26,7 +26,7 @@ __OUTPUT__
Use the `generate_text_embedding` function to get embeddings for the given text. Currently, the `model_provider` can only be `openai` or `huggingface`.

```sql
SELECT pgai.generate_text_embedding(
SELECT aidb.generate_text_embedding(
'text-embedding-3-small', -- embedding model name
'openai', -- model provider
0, -- dimensions, setting 0 will replace with the default value in encoder's table
@@ -41,10 +41,10 @@ __OUTPUT__

## Supported encoders

You can check the list of valid embedding models and model providers from pgai.encoders table
You can check the list of valid embedding models and model providers in the `aidb.encoders` table:

```sql
SELECT provider, count(*) encoder_model_count FROM pgai.encoders group by (provider);
SELECT provider, count(*) encoder_model_count FROM aidb.encoders group by (provider);
__OUTPUT__
provider | encoder_model_count
-------------+---------------------
@@ -55,11 +55,11 @@ __OUTPUT__

## Available functions

You can find the complete list of currently available functions of the pgai extension by selecting from `information_schema.routines` any `routine_name` belonging to the pgai routine schema:
You can find the complete list of currently available functions of the aidb extension by selecting from `information_schema.routines` any `routine_name` whose `routine_schema` is `aidb`:


```sql
SELECT routine_name from information_schema.routines WHERE routine_schema='pgai';
SELECT routine_name from information_schema.routines WHERE routine_schema='aidb';
__OUTPUT__
routine_name
---------------------------------
16 changes: 10 additions & 6 deletions advocacy_docs/edb-postgres-ai/ai-ml/using-tech-preview/index.mdx
@@ -1,16 +1,20 @@
---
title: EDB Postgres AI AI/ML - Using the pgai tech preview
title: EDB Postgres AI AI/ML - Using the aidb tech preview
navTitle: Using
description: Using the EDB Postgres AI AI/ML tech preview to build a simple retrieval augmented generation (RAG) application in Postgres.
navigation:
- working-with-ai-data-in-postgres
- working-with-ai-data-in-s3
- standard-encoders
- additional_functions
---

This section shows how you can use your [newly installed pgai tech preview](install-tech-preview) to retrieve and generate AI data in Postgres.
This section shows how you can use your [newly installed aidb tech preview](../install-tech-preview) to retrieve and generate AI data in Postgres.

* [Working with AI data in Postgres](working-with-ai-data-in-postgres) details how to use the aidb extension to work with AI data stored in Postgres tables.

* [Working with AI data in S3](working-with-ai-data-in-S3) covers how to use the aidb extension to work with AI data stored in S3 compatible object storage.

* [Additional functions](additional_functions) notes other aidb extension functions and how to generate standalone embeddings for images and text.


* [Working with AI data in Postgres](working-with-ai-data-in-postgres) details how to use the pgai extension to work with AI data stored in Postgres tables.
* [Working with AI data in S3](working-with-ai-data-in-s3) covers how to use the pgai extension to work with AI data stored in S3 compatible object storage.
* [Standard encoders](standard-encoders) goes through the standard encoder LLMs that are supported by the pgai extension.


2 comments on commit 02ce1c2
