oso-rag-chatbot

A CLI demo app that uses Oso to authorize context vector embeddings before sending them to a RAG chatbot.

Prerequisites

Docker
Node.js
An OpenAI API key to generate embeddings and chatbot responses.
- NOTE: The requests to OpenAI will incur charges, but they're low. For reference, I made 53 requests to the API while testing this app and incurred USD $0.01 in charges.
Supabase local environment (installed via npm/docker)
Oso Dev Server (installed via docker)

Quickstart

Install prerequisites

Docker Desktop
Node.js

Oso Dev Server

docker run -p '8080:8080' public.ecr.aws/osohq/dev-server:latest

Local Supabase environment
```
npx supabase init
npx supabase start
```
An OpenAI API Key

Clone this repository

SSH

git clone git@github.com:osohq/oso-rag-chatbot.git

HTTPS

git clone https://github.com/osohq/oso-rag-chatbot.git

Install repo dependencies

cd oso-rag-chatbot
npm install

Set required environment variables

export DATABASE_URL="postgresql://postgres:postgres@localhost:54322/postgres"
export OPENAI_API_KEY=YOUR_OPENAI_API_KEY

Initialize the database and Oso Dev Server

npx supabase db reset
npm run initialize

Run the chatbot

npm run start

Scenario

The app models a company chatbot. It can accept context from internal company documents, which are stored in the Supabase PostgreSQL database you created above.

The chatbot is aware of two users:

Bob: An engineering employee who received a bad review. Part of the review feedback was from Alice, a coworker who said that he's "horrible to work with."
Diane: An HR employee who has access to everyone's review data

The chatbot can answer two questions using the context that's available in the database:

Why did Bob get a bad review?
When are the company holidays?

When Bob asks the chatbot "Why did Bob get a bad review?", he should only see the generalized feedback from the review his manager shared with him.

When Diane asks the chatbot "Why did Bob get a bad review?", she should see both the generalized feedback and Alice's pointed remark.

Any user can see the company holidays.
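
The expected outcomes above can be made concrete with a small in-memory sketch. This is not how the app enforces access (it delegates that decision to Oso). The `users` and `scenarioBlocks` objects, the `visibility` labels, the placeholder holiday text, and the `canRead` and `authorizedBlocks` functions are all hypothetical names introduced only for illustration.

```javascript
// Hypothetical in-memory model of the scenario. The real app asks Oso instead.
const users = {
  Bob: { name: "Bob", department: "engineering" },
  Diane: { name: "Diane", department: "hr" },
};

// `visibility` is an illustrative label, not a column in the app's database.
const scenarioBlocks = [
  { id: 1, subject: "Bob", visibility: "hr-only",
    content: "Alice says that Bob is horrible to work with" },
  { id: 2, subject: "Bob", visibility: "shared-with-subject",
    content: "Bob should work on being more collaborative." },
  { id: 3, visibility: "public",
    content: "(placeholder text for the company holiday list)" },
];

function canRead(user, block) {
  if (block.visibility === "public") return true; // any user can read public content
  if (user.department === "hr") return true;      // HR can read everyone's review data
  // Everyone else sees only the feedback that was shared with them.
  return block.visibility === "shared-with-subject" && block.subject === user.name;
}

function authorizedBlocks(userName) {
  const user = users[userName];
  if (!user) throw new Error(`Unknown user: ${userName}`);
  return scenarioBlocks.filter((block) => canRead(user, block));
}

console.log(authorizedBlocks("Bob").map((b) => b.id));   // [ 2, 3 ]
console.log(authorizedBlocks("Diane").map((b) => b.id)); // [ 1, 2, 3 ]
```

Bob never receives block 1, so Alice's remark cannot reach the chatbot's context when he asks about his review.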

Usage

Environment variables

There are two required environment variables and three optional ones:

DATABASE_URL: (required) The connection string to the PostgreSQL instance that contains the chatbot data. If you're using the default Supabase setup, then set this to postgresql://postgres:postgres@localhost:54322/postgres
OPENAI_API_KEY: (required) The API key used for OpenAI LLM operations. You can get a key from OpenAI after creating an account.
OSO_URL: (optional) The URL of the Oso Cloud instance used for authorization. Defaults to http://localhost:8080
OSO_AUTH: (optional) The API key used for Oso Cloud operations. Defaults to the Local Dev Server key: e_0123456789_12345_osotesttoken01xiIn
DEBUG: (optional) Used to emit various debugging information. Unset by default (no debug logging). Can be set to any combination of the following as a comma-delimited string:
- main: Emit the context sent to the chatbot as plain text.
- authz: Emit information about authorization operations.
- data: Emit information about database operations.
- llm: Emit information about LLM operations.
- embedding: Emit the vector embeddings generated by OpenAI.

The application supports setting environment variables with dotenv. If you place a file named .env in the repository root directory, the app will recognize any environment variables you define in that file.

The .env file is listed in .gitignore, so it won't be committed to version control. It's safe to put your API keys and database connection information in it.
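
As a rough illustration of how these variables and their defaults fit together, here is a minimal sketch. `loadConfig` is a hypothetical helper, not a function from this repository, and the app's actual startup code may differ. The default values are the ones listed above.

```javascript
// Defaults taken from the variable descriptions above.
const DEFAULT_OSO_URL = "http://localhost:8080";
const DEFAULT_OSO_AUTH = "e_0123456789_12345_osotesttoken01xiIn";

// Hypothetical helper: validates required variables and applies defaults.
function loadConfig(env) {
  const missing = ["DATABASE_URL", "OPENAI_API_KEY"].filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    databaseUrl: env.DATABASE_URL,
    openaiApiKey: env.OPENAI_API_KEY,
    osoUrl: env.OSO_URL || DEFAULT_OSO_URL,
    osoAuth: env.OSO_AUTH || DEFAULT_OSO_AUTH,
    // DEBUG is a comma-delimited list; unset means no debug logging.
    debug: (env.DEBUG || "")
      .split(",")
      .map((name) => name.trim())
      .filter(Boolean),
  };
}

// In a real program you would pass process.env here.
const exampleConfig = loadConfig({
  DATABASE_URL: "postgresql://postgres:postgres@localhost:54322/postgres",
  OPENAI_API_KEY: "YOUR_OPENAI_API_KEY",
  DEBUG: "main,authz",
});
console.log(exampleConfig.osoUrl); // http://localhost:8080
console.log(exampleConfig.debug);  // [ 'main', 'authz' ]
```

Failing fast on a missing required variable gives a clearer error than a connection failure later on.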

Operation

When you start the chatbot with npm start, it will ask you who you are. Enter Bob or Diane.

❯ npm start

> start
> node cli.js start

? Who are you? Diane

Next, it will prompt you for a question. Enter one of the following:

Why did Bob get a bad review?
When are the company holidays?

? What would you like to ask? Why did Bob get a bad review?

The response from the chatbot will depend on your identity and the question you asked. For example, when Diane asks about Bob's review, the response will be something like this:

It seems that Bob received a bad review primarily due to feedback from Alice, who mentioned that he is "horrible to work with." This suggests there are significant issues related to his behavior or collaboration skills in the workplace. Additionally, it was pointed out that Bob should work on being more collaborative, which indicates that perhaps he isn't engaging effectively with his team or contributing positively to group dynamics. Furthermore, there's a suggestion that he needs to contribute more to design and architecture discussions, meaning that his input in critical areas of project development may be lacking. Overall, it looks like the feedback highlights a need for improvement in teamwork and participation in discussions. If you have any more questions about the review process or how to provide support for Bob, feel free to ask!

Debug logging

By setting the DEBUG environment variable, you can emit various information about what the chatbot is doing. This can be useful if you'd like more insight into what's going on under the hood without adding a bunch of console.log() statements. For example, if you set DEBUG to main,authz,data, then you'll see the context, authorization operations, and database operations. Examples of each type of logging follow.

main:

  main I'll send the following additional context: +0ms
  main (Similarity: 0.548) Alice says that Bob is horrible to work with +0ms
  main (Similarity: 0.412) Bob should work on being more collaborative. +0ms
  main (Similarity: 0.355) Bob should contribute more to design and architectur

authz:

  authz Authorization filter query from Oso: +0ms
  authz id IN (WITH RECURSIVE
  authz c0(arg0, arg2) AS NOT MATERIALIZED (
  authz SELECT id, document_id FROM "block"
  authz ),
  authz c1(arg0, arg2) AS NOT MATERIALIZED (
  authz SELECT id, folder_id FROM "document"
  authz ),
  authz c2(arg0) AS NOT MATERIALIZED (
  authz SELECT id FROM "folder" WHERE is_public=true
  authz )
  authz  SELECT f0.arg0
  authz FROM c0 AS f0, c1 AS f1
  authz WHERE f0.arg2 = f1.arg0 and f1.arg2 = '1'
  authz  UNION SELECT f0.arg0
  authz FROM c0 AS f0, c1 AS f1
  authz WHERE f0.arg2 = f1.arg0
  authz  UNION SELECT f0.arg0
  authz FROM c0 AS f0, c1 AS f1, c2 AS f2
  authz WHERE f0.arg2 = f1.arg0 and f1.arg2 = f2.arg0
  authz ) +0ms

data:

  data Authorized similarity search: +0ms
  data SELECT
  data       id,
  data       document_id,
  data       content,
  data       1 - (embedding::vector <=> promptEmbedding::vector) as similarity
  data     FROM block
  data     WHERE id IN ([2,3,5,4,1])
  data     AND (1 - (embedding::vector <=> promptEmbedding::vector)) > 0.3 +0ms
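
In pgvector, `<=>` is the cosine distance operator, so `1 - (embedding <=> promptEmbedding)` in the query above is the cosine similarity between a block's embedding and the prompt's embedding. The sketch below reproduces the same filter in plain JavaScript on tiny made-up vectors. The function names, the sample data, and the in-memory approach are assumptions for illustration only. The app itself runs this logic in PostgreSQL.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|), which equals 1 - cosine distance.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Keep only blocks that are both authorized and similar enough to the prompt,
// mirroring `WHERE id IN (...) AND similarity > 0.3`.
function authorizedSimilaritySearch(blocks, authorizedIds, promptEmbedding, threshold = 0.3) {
  const allowed = new Set(authorizedIds);
  return blocks
    .filter((block) => allowed.has(block.id))
    .map((block) => ({
      id: block.id,
      content: block.content,
      similarity: cosineSimilarity(block.embedding, promptEmbedding),
    }))
    .filter((row) => row.similarity > threshold);
}

// Made-up two-dimensional embeddings. Real embeddings have many more dimensions.
const sampleBlocks = [
  { id: 1, content: "points the same way as the prompt", embedding: [1, 0] },
  { id: 2, content: "orthogonal to the prompt", embedding: [0, 1] },
  { id: 3, content: "similar, but not authorized", embedding: [1, 1] },
];

const rows = authorizedSimilaritySearch(sampleBlocks, [1, 2], [1, 0]);
console.log(rows.map((row) => row.id)); // [ 1 ]
```

Block 3 is similar enough to pass the threshold but is dropped because its ID is not in the authorized list, which is the point of filtering before the context reaches the LLM.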

llm:

  llm Model: text-embedding-3-small +0ms
  llm Prompt: Why did Bob get a bad review? +0ms

embedding:

  embedding Embedding: +0ms
  embedding [
  embedding     -0.02280535,     0.01509199, -0.038275734,  0.027753543, -0.0067892126,
  embedding     0.004817212,   -0.036267348,   0.01919608,  0.017915372,  -0.041040897,
  embedding    -0.023154635,   -0.058854394, -0.013090882,   0.06193974,  -0.011453613,
  embedding    -0.006676423,    0.047851942, 0.0011242586, -0.028757736,    0.04531963,
  embedding     0.035481457,    0.054371916,  0.015615917,  0.018060906,  -0.015819665,
  embedding     0.030504158,    0.021277232, -0.027375152,  0.033647716,   0.009634424,
  embedding   -0.0140950745,   -0.026239978,  0.045930877, -0.009707191,  -0.061241172,
  embedding     -0.04203054,    0.009139605, -0.028481219,   0.06589829,   0.015484935,
  embedding    -0.011701022,    0.013738514, 0.0015290282, -0.025963463,   0.005453928,
  embedding     0.035335924, -0.00019772309,  0.017682515,  0.030853441,   0.019385276,
  embedding    -0.031930402,    0.038828764, -0.027477028, -0.024537219, -0.0075023347,
  embedding    -0.035685208,   -0.028088275, -0.016358146,  0.028641308, -0.0072549246,
  embedding     0.049656577,   -0.014597171, -0.055507086,  0.016736537,   -0.08132502,
  embedding     -0.04133197,   -0.018090013,  -0.03964376,  0.008266394,   0.001715495,
  embedding   -0.0009896387,     0.07270934,  0.004828127, -0.009685361,   0.013134543,
  embedding   -0.0015326665,     0.07439754,   0.03216326, 0.0047226143,  0.0103184385,
  embedding    -0.010121967,    0.037751805, -0.015193865, 0.0016481851,   -0.00926331,
  embedding      0.00856474,    -0.03830484, -0.012421421,  0.017682515,  -0.036325563,
  embedding    -0.040575188,     0.02558507,  0.046600338,   0.01737689,   0.019923756,
  embedding    -0.053382274,   -0.061765097,  0.017726175,  -0.02105893,   0.008528357,
  embedding   ... 1436 more items
  embedding ] +0ms
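
The namespace selection described in this section can be sketched in a few lines. The log format in the examples above resembles the output of the `debug` npm package, but this README does not say which logger the app uses, so `createDebug` below is a hypothetical stand-in that shows only the comma-delimited matching, not the timing suffix.

```javascript
// Hypothetical helper: returns a logger that prints only when `namespace`
// appears in the comma-delimited DEBUG value.
function createDebug(namespace, debugValue, sink = console.error) {
  const enabled = (debugValue || "")
    .split(",")
    .map((name) => name.trim())
    .includes(namespace);
  const log = (message) => {
    if (enabled) sink(`  ${namespace} ${message}`);
  };
  log.enabled = enabled;
  return log;
}

// Collect output in an array to show which namespaces are active.
const capturedLines = [];
for (const namespace of ["main", "authz", "data", "llm", "embedding"]) {
  const log = createDebug(namespace, "main,authz,data", (line) => capturedLines.push(line));
  log("example message");
}
console.log(capturedLines.length); // 3
```

With DEBUG set to main,authz,data, the llm and embedding loggers stay silent.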
