
Feat/onboarding bot #18

Conversation

Keyrxng (Member) commented Sep 18, 2024

Resolves #17
Requires #16

@Keyrxng Keyrxng marked this pull request as ready for review September 18, 2024 01:00
@Keyrxng Keyrxng mentioned this pull request Sep 21, 2024
Keyrxng (Member, Author) commented Oct 1, 2024

Some QA: ubq-testing/ask-plugin#2

Keep in mind that my DB only has embeddings for READMEs. The current approach uses #16 with sshivaditya2019's original DB function, with the current_id param removed, so it compares just the query embedding against all stored embeddings. I use a similarity threshold of 0.6, direct prompts, and only the top-ranked embedding section is injected into the existing ctx.
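
For reference, a rough sketch of what that lookup looks like from supabase-js; the RPC name (`find_similar_embeddings`), its params, and the returned `content` column are placeholders here, not the actual function from #16:

```ts
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

// Placeholder sketch: compare the query embedding against every stored
// embedding and keep only the single best match above the 0.6 threshold.
async function topMatchingSection(queryEmbedding: number[]): Promise<string | null> {
  const { data, error } = await supabase.rpc("find_similar_embeddings", {
    query_embedding: queryEmbedding, // embedding of the user's prompt
    similarity_threshold: 0.6,       // matches below 0.6 are discarded
    match_count: 1,                  // only the top-ranked section goes into ctx
  });
  if (error) throw error;
  return data?.[0]?.content ?? null;
}
```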

Once the DB starts to scale, the signal-to-noise ratio will drop, so I intend to implement a couple of additional search functions:

  • type: essentially a classification of the prompt (setup_instructions, etc.). Requires one zero-shot GPT-4 classification.
  • metadata: we can index using JSON keys, as mentioned by mentlegen/whilefoo I think, so with each embedding we store a metadata interface something like:
```ts
type Metadata = {
  repoNodeId: string;
  issueNodeId: string;
  authorAssociation: string;
};
```

Then we create a similar search fn that indexes on those keys, which lets us easily restrict the scope we search for embedding context based on (see the sketch after this list):

  • webhook payload details: repo, issue
  • categorization of text: setup_instructions | complete task summary | task spec | etc...
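
Roughly what I have in mind, with the RPC name and filter params as placeholders; the key point is passing the Metadata keys plus the classified type so only in-scope embeddings get ranked:

```ts
import { createClient } from "@supabase/supabase-js";

const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_KEY!);

type Category = "setup_instructions" | "task_summary" | "task_spec" | "source_code";

// Placeholder RPC: the same similarity search as before, but the DB function
// would first narrow rows by the JSONB metadata keys and the classified type.
async function searchScopedEmbeddings(
  queryEmbedding: number[],
  filters: { repoNodeId?: string; issueNodeId?: string; type?: Category },
) {
  const { data, error } = await supabase.rpc("find_similar_embeddings_scoped", {
    query_embedding: queryEmbedding,
    similarity_threshold: 0.6,
    match_count: 5,
    metadata_filter: filters, // indexed JSONB metadata + classified type
  });
  if (error) throw error;
  return data as { content: string; similarity: number }[];
}
```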

scenario:

Help me start this task? I'm stuck on this issue...

classify > [setup, summary, spec] > obtain payload meta > 3x embedding search, one in each category, use the best > gptDecideContext() > gptContext + prompt

I'm stuck on this PR, Review this PR for me...

classify > [summary, spec, sourceCode] (diff and source kept separate) > meta > 3x search > gptDecideContext() > gptContext + prompt
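
Rough sketch of that flow in TypeScript; classifyPrompt, embed, completeChat and the result shapes are placeholder stand-ins, searchScopedEmbeddings is the scoped search from above, and gptDecideContext is the distillation fn discussed below:

```ts
type Category = "setup_instructions" | "task_summary" | "task_spec" | "source_code";

// All helpers are placeholder stand-ins for illustration only.
declare function classifyPrompt(prompt: string): Promise<Category[]>; // one zero-shot GPT-4 call
declare function embed(text: string): Promise<number[]>;
declare function searchScopedEmbeddings(
  queryEmbedding: number[],
  filters: { repoNodeId: string; issueNodeId: string; type: Category },
): Promise<{ content: string; similarity: number }[]>;
declare function gptDecideContext(fullCtx: string, question: string): Promise<string>;
declare function completeChat(context: string, prompt: string): Promise<string>;

async function answer(prompt: string, meta: { repoNodeId: string; issueNodeId: string }) {
  // 1. classify the prompt into the categories worth searching
  const categories = await classifyPrompt(prompt);

  // 2. one scoped embedding search per category, using the webhook payload meta
  const queryEmbedding = await embed(prompt);
  const perCategory = await Promise.all(
    categories.map((type) => searchScopedEmbeddings(queryEmbedding, { ...meta, type })),
  );

  // 3. keep the best-scoring sections, then let the model distil them down
  const best = perCategory
    .flat()
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, 3)
    .map((r) => r.content)
    .join("\n\n");
  const gptContext = await gptDecideContext(best, prompt);

  // 4. final completion: distilled context + original prompt
  return completeChat(gptContext, prompt);
}
```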

Like shiv said about context distillation: we used to have the gptDecideContext fn, where we'd feed it the entire ctx and have it truncate it. I think that would be a good thing to bring back.
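
Minimal sketch of that distillation step with the openai SDK; the model name and prompt wording are placeholders, the point is just feeding the whole ctx in and getting back only the relevant parts:

```ts
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// Placeholder version of the old gptDecideContext idea: hand the model the
// full gathered ctx and ask it to return only what helps answer the question.
async function gptDecideContext(fullCtx: string, question: string): Promise<string> {
  const res = await openai.chat.completions.create({
    model: "gpt-4o", // placeholder; the thread only says "GPT4"
    messages: [
      {
        role: "system",
        content:
          "You prune context. Return only the parts of the provided context that help answer the question. Do not answer the question itself.",
      },
      { role: "user", content: `Question:\n${question}\n\nContext:\n${fullCtx}` },
    ],
  });
  return res.choices[0].message.content ?? "";
}
```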

Idk if it's overkill, but we do have the entire convo history, so we could fetch embeddings based on conversational context gathered prior to truncating. If we aren't already on Supabase premium, I think we'll have to be shortly lol.
