Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: issue_comments linting added issue_comments:edited, created and … #1

Merged

Conversation

sshivaditya2019
Copy link
Collaborator

@sshivaditya2019 sshivaditya2019 commented Aug 27, 2024

Resolves #32

  • Created webhook listener for issue_comments:created and issue_comments:edited.
  • Uses OpenAI embeddings.
  • Embeddings are stored in Supabase DB
  • New OpenAI adapter can be expanded in future to support text generation and question answering using RAG.
  • issue_comments:deleted, deletes the comment from the DB
  • Expects theComment ID , Issue Body in the payload.
  • OpenAI Key, Supabase Key and URL have to be added in the envs.
  • When deployed a new table has to be created with the associated supabase account with the following schema. For Vector type, the pgvector extension must be enabled in the postgres extensions.
issue_comment
id: int8
issuebody: Text,
commentbody: Text,
embedding: Vector(3072)

@sshivaditya2019 sshivaditya2019 marked this pull request as ready for review August 27, 2024 04:29
@sshivaditya2019
Copy link
Collaborator Author

@0x4007

Copy link
Contributor

github-actions bot commented Aug 27, 2024

Unlisted binaries (2)

Filename binaries
package.json supabase
cross-env-shell

Unused types (4)

Filename types
src/types/database.ts Tables
TablesInsert
TablesUpdate
Enums

.vscode/settings.json Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
src/adapters/openai/helpers/embedding.ts Outdated Show resolved Hide resolved
src/adapters/openai/helpers/embedding.ts Outdated Show resolved Hide resolved
src/main.ts Outdated Show resolved Hide resolved
src/plugin.ts Show resolved Hide resolved
src/types/env.ts Show resolved Hide resolved
tests/main.test.ts Outdated Show resolved Hide resolved
@0x4007
Copy link
Member

0x4007 commented Aug 27, 2024

Fix Resolves #32 by placing the full URL of the issue in place of the number.

package.json Outdated Show resolved Hide resolved
@gentlementlegen
Copy link
Member

gentlementlegen commented Aug 28, 2024

@sshivaditya2019 It seems that you are storing the vectors within Supabase but I see no Supabase folder for setup. Also, typings should be used so Supabase offers auto-complete. Please fix the unit tests as well.

@sshivaditya2019
Copy link
Collaborator Author

@sshivaditya2019 It seems that you are storing the vectors within Supabase but I see no Supabase folder for setup. Also, typings should be used so Supabase offers auto-complete. Please fix the unit tests as well.

Fixed added schema for table. Unit tests work now. Schema typing should be available in the project. Embedding type now expects a vector array of length 3072

wrangler.toml Outdated Show resolved Hide resolved
Co-authored-by: Mentlegen <[email protected]>
@gentlementlegen
Copy link
Member

@sshivaditya2019 Thanks a lot for the changes, appreciated. Since you seem to use Supabase, I assume that we should spin up a new instance specially for this project because this data would not make much sense in our current database, @0x4007 please confirm.

As such could you please add a migration script that would:

  • run migrations
  • update database types

when code is pushed to the main branch? Here is an example to get you started: https://github.com/ubiquibot/user-activity-watcher/blob/development/.github/workflows/database.yml#L35
It would be also nice to have code generation commands within the package.json, I don't think I saw them. Here is an example: https://github.com/ubiquibot/user-activity-watcher/blob/development/package.json#L21

@0x4007
Copy link
Member

0x4007 commented Aug 28, 2024

Sure new database is fine to keep things simple.

@sshivaditya2019
Copy link
Collaborator Author

@sshivaditya2019 Thanks a lot for the changes, appreciated. Since you seem to use Supabase, I assume that we should spin up a new instance specially for this project because this data would not make much sense in our current database, @0x4007 please confirm.

As such could you please add a migration script that would:

  • run migrations
  • update database types

when code is pushed to the main branch? Here is an example to get you started: https://github.com/ubiquibot/user-activity-watcher/blob/development/.github/workflows/database.yml#L35 It would be also nice to have code generation commands within the package.json, I don't think I saw them. Here is an example: https://github.com/ubiquibot/user-activity-watcher/blob/development/package.json#L21

Added Github Workflow, and fixed the package.json.

@gentlementlegen
Copy link
Member

Please fix the Knip issues, I will try to set up this on my repo and give it a run.

@sshivaditya2019
Copy link
Collaborator Author

Knip should be fixed in the latest commit.

Copy link
Member

@gentlementlegen gentlementlegen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a horrendous time trying this locally and had to rely on a GitHub deployment to test it. A few major changes to consider:

  • remove the commands section from the manifest otherwise the plugin will never run
  • inside compute.yml rename the content properly (plugin-name)
  • in compute.yml add the GitHub secrets properly or the run will crash (I assume this will run as an Action because it's a long running task)
  • update README.md properly to help future collaborators

Related run: https://github.com/Meniole/issue-comment-embeddings/actions/runs/10612546023/job/29414425736#step:5:8

package.json Outdated Show resolved Hide resolved
manifest.json Outdated Show resolved Hide resolved
@sshivaditya2019
Copy link
Collaborator Author

I had a horrendous time trying this locally and had to rely on a GitHub deployment to test it. A few major changes to consider:

  • remove the commands section from the manifest otherwise the plugin will never run
  • inside compute.yml rename the content properly (plugin-name)
  • in compute.yml add the GitHub secrets properly or the run will crash (I assume this will run as an Action because it's a long running task)
  • update README.md properly to help future collaborators

Related run: https://github.com/Meniole/issue-comment-embeddings/actions/runs/10612546023/job/29414425736#step:5:8

  • I expected this to run on worker, since action takes too long for each message.
  • Its working for me as expected locally, might be some issue in .ubiquibot-config.yml. Could you please let me know what issue you were facing?
  • Updated theREADME.md

.github/.ubiquibot-config.yml Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
package.json Outdated Show resolved Hide resolved
wrangler.toml Outdated Show resolved Hide resolved
tests/__mocks__/helpers.ts Outdated Show resolved Hide resolved
@@ -10,7 +10,6 @@ import { plugin } from "./plugin";
*/
export async function run() {
const payload = github.context.payload.inputs;

const env = Value.Decode(envSchema, payload.env);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will always crash on run:
https://github.com/Meniole/issue-comment-embeddings/actions/runs/10628609152/job/29463778679#step:5:45

The environment is populated either from GitHub or the Worker. Since this runs as an action, it should be loaded directly within process.env, which you can add inside payload.env if you want. In the case of a worker, it is send within the payload by Cloudflare.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But, as mentioned before, I expect this to run in a Cloudflare worker environment. The env variables are sent within the payload.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think running as worker is fine.

Copy link
Member

@gentlementlegen gentlementlegen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some more changes might be needed to be able to run this one, but getting closer (c.f. my last comment).


Testing actions is tedious. I would advise you to create your own organization (also works on your personal account but usage restrictions are way lower) and set up the bot so you can try some runs and make sure it works as intended.

@sshivaditya2019
Copy link
Collaborator Author

sshivaditya2019 commented Aug 30, 2024

Could you please confirm if the .ubiquibot-config.yml file isn't working for you when you try using it? I tested it again by creating a new issue and adding comments, and everything worked fine on my end.

Is there any particular reason you keep trying to set it up in Github Action environment. The plugin could be made for a Cloudflare Worker or Standalone Github action right ? I don't want this to run in Github Action environment and is intended to be setup on Worker env.

@sshivaditya2019
Copy link
Collaborator Author

Screen.Recording.2024-08-30.at.4.39.35.AM.mov

@0x4007
Copy link
Member

0x4007 commented Aug 31, 2024

Very lovely QA testing video thank you.

Copy link
Member

@0x4007 0x4007 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull looks great. I just have some cosmetic changes

@@ -1,4 +1,4 @@
name: "the name of the plugin"
name: "@ubiquibot/issue-comment-embeddings"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
name: "@ubiquibot/issue-comment-embeddings"
name: "@ubiquity-os/comment-vector-embeddings"
  • Shouldn't this eventually cover all comments?
  • Ubiquibot is deprecated
  • ideally this plugin should be generalized to allow us to pass in anything to generate embeddings. Maybe we can generate a unique ID based on where it's coming in from (GitHub, Telegram) and the project it's posted to (repository/issue, or group chat id) etc

The purpose of this is to serve as the foundation of the system's awareness across all organization operations.

id: test-app
uses:
- plugin: https://ubiquibot-issue-comment-embeddings.sshivaditya.workers.dev
runsOn: [ "issue_comment.created", "issue_comment.edited", "issue_comment.deleted" ]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't we do "issue_comment" to be concise? @gentlementlegen

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not with the current implementation because it wouldn't match the event and wouldn't be triggered.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file should be removed, otherwise no plugin will run inside of this repo.

manifest.json Outdated
"description": "Command 1 with an argument."
}
}
"name": "@ubiquibot/issue-comment-embeddings",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ubiquity-os

}
"name": "@ubiquibot/issue-comment-embeddings",
"description": "Issue comment plugin for Ubiquibot. It enables the storage, updating, and deletion of issue comment embeddings.",
"ubiquity:listeners": ["issue_comment.created", "issue_comment.edited", "issue_comment.deleted"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly issue_comment only

@@ -10,7 +10,6 @@ import { plugin } from "./plugin";
*/
export async function run() {
const payload = github.context.payload.inputs;

const env = Value.Decode(envSchema, payload.env);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think running as worker is fine.

wrangler.toml Outdated
@@ -1,4 +1,4 @@
name = "your-plugin-name"
name = "ubiquibot-issue-comment-embeddings"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UbiquiBot term is deprecated in favor of UbiquityOS (or when lowercase, ubiquity-os)

@gentlementlegen
Copy link
Member

@0x4007 My concern about having this as a worker is the following:

  • we don't need to handle commands so I don't see the need for instant feedback
  • workers are limited to 50 network calls and this plugin will most likely hit the DB a lot (each supabase action is a network call)
  • it's a long running process

But I will test it as a worker and see if it works.

@sshivaditya2019
Copy link
Collaborator Author

sshivaditya2019 commented Aug 31, 2024

@0x4007 My concern about having this as a worker is the following:

  • we don't need to handle commands so I don't see the need for instant feedback

  • workers are limited to 50 network calls and this plugin will most likely hit the DB a lot (each supabase action is a network call)

  • it's a long running process

But I will test it as a worker and see if it works.

  • If there are multiple comments simultaneously they might create a lot of action.
  • I am not sure about this but we would loose a lot of build minutes for this.
  • As far as I know, it's 50 outbound sub-requests per request. This could be avoided if we use Cloudflare D1.
  • Yeah, it's supposed to be a long process running it again and again doesn't make sense

@0x4007
Copy link
Member

0x4007 commented Aug 31, 2024

GitHub actions are totally free for us. Workers can be costly.

It is only a problem to do workers if we have to spend money

@sshivaditya2019
Copy link
Collaborator Author

sshivaditya2019 commented Aug 31, 2024

GitHub actions are totally free for us. Workers can be costly.

It is only a problem to do workers if we have to spend money

  • I think there should be a third option apart from workers and actions. Features like this are better suited for a standalone web server.

  • That being said total average execution time for this worker is 1ms CPU time and the free plan includes 30 Million ms CPU time.

@gentlementlegen
Copy link
Member

Tested thsi on my Cloudflare, works fine, thanks for all the bug fixes. In the end switching this plugin from Worker to Action should be as simple as changing a URL to an Action path.

Please remove .ubiquibot-config.yml and good to go.

@sshivaditya2019
Copy link
Collaborator Author

@gentlementlegen Removed .ubiquibot-config.yml. Changed @ubiquibot to ubiquity-os

Copy link
Member

@gentlementlegen gentlementlegen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good with me. @0x4007 if you can create the Supabase instance and set the secrets, I do not have permission to create new instances on the company's account, thanks.

@0x4007
Copy link
Member

0x4007 commented Aug 31, 2024

I think create and we can transfer later? I don't have a lot of time on my computer these days due to the upcoming conference.

@0x4007 0x4007 merged commit c6dd05e into ubiquity-os-marketplace:development Aug 31, 2024
2 checks passed
@Keyrxng
Copy link
Member

Keyrxng commented Aug 31, 2024

This is great stuff gg @sshivaditya2019

@gentlementlegen @0x4007 I think it might be a good idea as another issue to improve the db structure so that conversations are grouped together easily. This way when working with the embeddings it's easier to get all of the context specific to a certain task or PR review discussion etc. For instance telegram chat embeddings would be grouped with the issue it relates to in order to give a fuller picture when the embeddings are used.

I guess using the task body is possible but we tend to avoid using arbitrary strings for situations like this

Comment on lines 8 to +13
/**
* Update `manifest.json` with any events you want to support like so:
*
* ubiquity:listeners: ["issue_comment.created", ...]
*/
export type SupportedEventsU = "issue_comment.created";
export type SupportedEventsU = "issue_comment.created" | "issue_comment.deleted" | "issue_comment.edited";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Keyrxng can't this type be dynamically generated by importing the manifest.json file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately not without trickery because string arrays type as string and inference goes out of the window.

The easiest way to do it would be to convert them into keys inside the manifest as opposed to a string array this way we could easily extract string literal types from the object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

issues_comment.created vector embeddings
5 participants