Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: config event handler #18

Merged
merged 24 commits into from
Mar 6, 2024
Merged

feat: config event handler #18

merged 24 commits into from
Mar 6, 2024

Conversation

whilefoo
Copy link
Contributor

@whilefoo whilefoo commented Feb 7, 2024

No description provided.

export const configSchema = T.Object({
handlers: T.Object({
commands: T.Record(T.Enum(Commands), handlerSchema, { default: {} }),
events: T.Record(T.Enum(GitHubEvent), handlerSchema, { default: {} }),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid using our own enum for events but the problem is that octokit provides only a readonly array of all events or a typescript union type of all events but I haven't figured out how to use them with typebox

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An out of the box idea is to generate the enum file dynamically based on the dependency contents at build time.

Given the time constraints for getting this stable, we can consider breaking that task off into a separate issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just reading through some of the newer code and spotted this IDK what exactly the issue with getting it to work with typebox is but the below helpers won't raise any type-level errors, and will output a typebox workable enum. I haven't tried compile or runtime but idk what the issue is to begin with. Hope it helps

  type EventKeys = (typeof emitterEventNames)[number] & string;

  type RecordFromKeys = Record<EventKeys, EventKeys>;

  const eventsEnum: RecordFromKeys = emitterEventNames.reduce((acc, cur) => {
    acc[cur] = cur;
    return acc;
  }, {} as RecordFromKeys);
 eventsEnum {
   branch_protection_rule: 'branch_protection_rule',    
  'branch_protection_rule.created': 'branch_protection_rule.created',
  'branch_protection_rule.deleted': 'branch_protection_rule.deleted',

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get how you can use eventsEnum with Typebox? Can you provide an example?

For reference I've already tried making Union from Literals like this:

const eventNames = emitterEventNames.map((value) => T.Literal(value));
...
events: T.Record(T.Union(eventNames), handlerSchema, { default: {} })

and while this does work, the typescript type of events becomes just {}

Copy link
Member

@Keyrxng Keyrxng Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @whilefoo probably useless now seeing as this was all the way back in February, please tag me again if this ever happens. As far as I understand to get enums to work you just build a object of key:key as that's what a TS enum boils down to in JS essentially and then use as const on the object and feed it to T.Enum(). I don't use TS enums myself so I'm not sure how to get those to work

image

image

@whilefoo
Copy link
Contributor Author

Previous version of workflow dispatch defined exact inputs, but I'm wondering is that really necessary. Can't we just send the whole webhook event along with settings from the config?

@0x4007
Copy link
Member

0x4007 commented Feb 14, 2024

I know that there are limits around how many input parameters we can have. I assume that there are limits on character length per input parameter.

Let's try passing in everything in a single input parameter and if it hits a character limit then we can break it apart across several input parameters.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be sure to check #20 (comment)

@whilefoo
Copy link
Contributor Author

I know that there are limits around how many input parameters we can have. I assume that there are limits on character length per input parameter.

I've just tried invoking a workflow with event input which is stringified JSON webhook, and settings which is also stringified and it worked!

@0x4007
Copy link
Member

0x4007 commented Feb 15, 2024

Is there an issue we can associate with this pull request? I'm a bit lost on context after reviewing tens of notifications.

Also looks like you can add some of those words to the cspell dictionary. There seems to be at least one genuine type "WATING"

Comment on lines +73 to +80
async getToken(installationId: number) {
const auth = createAppAuth({
appId: this._appId,
privateKey: this._privateKey,
});
const token = await auth({ type: "installation", installationId });
return token.token;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I assume that this is the kernel logic to generate the token with the intent to pass it along to the plugin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does the kernel need repository dispatch event handler for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that plugins will use repository dispatch to return data to the kernel, which will call the next plugin in the chain or stop if it's the last one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool. If that approach works then its great! I'm not sure because I never ran the stable kernel, so I never tested any plugin interfacing.

return {
owner: matches[1],
repo: matches[2],
workflowId: matches[3] || "compute.yml",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will default to action.yml in the project root of the plugin. See my conversation rewards(work in progress) as an example: https://github.com/ubiquibot/conversation-rewards/blob/development/action.yml

This is to conform to the existing approach that GitHub Actions takes, and to enable compatibility with all existing GitHub Actions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you can invoke workflows in the root directory because those are meant to be published to the marketplace and used in a workflow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can just make it a pass through action. Basically the only logic will be to invoke a workflow in the workflow folder.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we should default to action.yml in the root.

@whilefoo @gentlementlegen rfc

wrangler.toml Outdated
@@ -2,6 +2,11 @@ name = "ubiquibot-worker"
main = "src/worker.ts"
compatibility_date = "2023-12-06"

[env.dev]
kv_namespaces = [
{ binding = "PLUGIN_CHAIN_STATE", id = "a50ef0a143e449c087b6be613a84e5ba" },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is my expectation that we will handle all storage from within plugins. What is the kernel seeking to remember? Can you elaborate on your decision here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For one event type there can be multiple plugin chains, so when the plugins return data to the kernel, the kernel needs to know which plugin chain is running and which plugin in the chain is currently executing.
For example:

 issues_label.added:

    - name: "Help Menu"
      description: "This will parse the config and display the help menu for slash commands."
      command: "^\/help$"
      example: "/help"
      uses:
        - ubiquibot/plugin-A
        - ubiquibot/plugin-B

    - name: "Assistive Pricing"
      description: "Set pricing based on Time and Priority labels."
      uses:
        - ubiquibot/plugin-B
        - ubiquibot/plugin-C
        - ubiquibot/plugin-A

Another reason is because the config can change when the plugin is running so we need to save the plugin chain at the invocation of the first plugin in the chain.

It also has a nice side feature of preventing plugins from sending data to the kernel without any invocation from the kernel

@whilefoo
Copy link
Contributor Author

I'm still not sure about the plugin input and output data.

Currently the first plugin receives this data:

{
    id: string; // reference to kernel KV
    eventName: string;
    event: string; // event payload
    settings: string;
    authToken: string;
    ref: string;
}

The plugin needs to return the following data:

{
    id: string; // reference to kernel KV
    owner: string; // owner of the repo that triggered the event
    repo: string;  // repo that triggered the event
    output: any; // output of the plugin
}

Should we let the first plugin dictate the input of the second plugin but some properties are essential for every plugin. For example the id, settings of the plugin, ref and maybe eventName and name too.

I'm thinking about a few different ways:

  1. We merge output data of the previous plugin with the essential data like this:
{
    ...pluginOutput.output,
    id: pluginOutput.id,
    settings: JSON.stringify(nextPlugin.with),
    ref: nextPlugin.plugin.ref,
}
  1. We have a separate field for that
{
    id: pluginOutput.id,
    settings: JSON.stringify(nextPlugin.with),
    ref: nextPlugin.plugin.ref,
    previousPluginOutput: pluginOutput.output,
}
  1. Same inputs for all plugins but additional data for output of the previous plugin:
{
    id: string;
    eventName: string;
    event: string; // event payload
    settings: string;
    authToken: string;
    ref: string;
    previousPluginOutput: any, // this will be null for the first plugin
}

@0x4007
Copy link
Member

0x4007 commented Feb 28, 2024

I can't answer specifically because I don't feel that I have enough context (I didn't run the code)

Generally speaking here are some thoughts:

  1. I learned that public repo actions can have unlimited inputs, and private is 10 inputs (we don't need to worry about this for now)

To me this means that we do not have to be conservative, so feel free to pass all the data you might need (especially for this v1)

  1. Intuitively I feel that each plugin should respond to the kernel, and then the kernel should invoke the next plugin; instead of a plugin daisy chaining directly into another plugin. Because that means that the plugin developers would need to actually redundantly include the daisy chain logic (and some plugins may never be used in chains!)

  2. In order to track state (for when the bot receives a plugin response) I'm wondering if it's sufficient to utilize 1. Installation ID (represents which partner is using the bot) and 2. The repo ids that host each plugin

With those combined two values perhaps that's enough context for the bot to pick up where the plugins left off.

Otherwise if more information is needed (like repo, and who invoked the event) possibly just capturing and passing around the entire webhook event that invoked the whole chain of events. It's a large object but it has details around everything relevant to what invoked the event, like org, repo, comment url, user etc

@0x4007
Copy link
Member

0x4007 commented Mar 1, 2024

Hey @whilefoo its 1 March. How's this looking for merging in?

@whilefoo
Copy link
Contributor Author

whilefoo commented Mar 1, 2024

I just need to add the code to get outputs from previous plugin to the input of next plugin and make sure everything is working and we can merge

@0x4007
Copy link
Member

0x4007 commented Mar 1, 2024

When will you have that finished by?

}

// eslint-disable-next-line sonarjs/cognitive-complexity
function checkPluginChains(config: Config) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactor this function to reduce its Cognitive Complexity from 58 to the 15 allowed.
I feel like this rule is too strict, what do you think?

Copy link
Member

@0x4007 0x4007 Mar 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just break the main function apart into smaller functions. I think the rule is fine.

58 is wild.

console.log("Dispatching next plugin", nextPlugin);

const token = await context.eventHandler.getToken(state.eventPayload.installation.id);
const ref = nextPlugin.plugin.ref ?? (await getDefaultBranch(context, nextPlugin.plugin.owner, nextPlugin.plugin.repo));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expression produces a union type that is too complex to represent.
This function also gets called in another part of the code and in that case there's no error like this, so I'm not sure what's too complex...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider type guards. I was struggling with this same error when dealing with all the webhook types. I even tried hacking tsc and maxing memory limits to no avail.

Copy link
Contributor

github-actions bot commented Mar 6, 2024

Coverage Report (0%) 
File% Stmts% Branch% Funcs% LinesUncovered Line #s

Copy link
Contributor

github-actions bot commented Mar 6, 2024

Unused files (1)

src/github/types/webhook-events.ts

Unused dependencies (4)

Filename dependencies
package.json @octokit/webhooks-types
create-cloudflare
octokit
universal-github-app-jwt

Unused devDependencies (6)

Filename devDependencies
package.json @mswjs/data
@types/jest
esbuild
eslint-config-prettier
eslint-plugin-prettier
ts-node

Unlisted dependencies (8)

Filename unlisted
src/github/github-client.ts @octokit/core
@octokit/types
@octokit/plugin-paginate-rest
@octokit/plugin-rest-endpoint-methods
@octokit/plugin-retry
@octokit/plugin-throttling
@octokit/auth-app
tests/main.test.ts @jest/globals

Unlisted binaries (4)

Filename binaries
package.json lsof
awk
.github/workflows/build.yml build
.github/workflows/cspell.yml format:cspell

@whilefoo whilefoo marked this pull request as ready for review March 6, 2024 22:58
@whilefoo whilefoo merged commit 73ae1ad into development Mar 6, 2024
5 of 8 checks passed
@0x4007
Copy link
Member

0x4007 commented Mar 7, 2024

Previous version of workflow dispatch defined exact inputs, but I'm wondering is that really necessary. Can't we just send the whole webhook event along with settings from the config?

This was a naive implementation so you would know better.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants