
PR: evaluate code review comments with OpenAI #79

Merged: 30 commits into development, Sep 22, 2024

Conversation

@EresDev (Contributor) commented Aug 8, 2024

Resolves #45
Depends on #55
QA: EresDevOrg/ubiquibot-issues#15 (comment)

See exactly which types of review comments are scored by this PR: #45 (comment)

In the prompt, we ask OpenAI to score review comments, which is not exactly the relevance score used for issue comments. For code review comments, the score reflects how much the comment improves the offered solution and the code quality, which made more sense to me. Because of this, OpenAI is strict in its scoring here: only code review comments with good detail get a good score. OpenAI receives the issue specification, the part of the code change the comment refers to, and the comment itself; it has no further context. This could be improved in the future by including the entire conversation of a single review, or perhaps the entire code file, but that is a lot more work. What we have here is, I think, a good first attempt.

Edit:

Please note that OpenAI is not as strict in evaluating the code as I said above. That was a mistake and it is fixed now; I think it is much better now. Latest QA: EresDevOrg/ubiquibot-issues#15 (comment)
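For reference, a minimal sketch of the kind of evaluation call described above. The function name, model choice, and prompt wording are illustrative assumptions, not the actual content-evaluator-module code:

```typescript
import OpenAI from "openai";

const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Hypothetical helper: score one code review comment given the issue spec
// and the diff hunk it refers to, as described in the comment above.
async function evaluateReviewComment(issueSpec: string, diffHunk: string, comment: string): Promise<number> {
  const response = await client.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content:
          "Score the following code review comment from 0 to 1 based on how much it " +
          "improves the offered solution and the code quality. Reply with a number only.",
      },
      {
        role: "user",
        content: `Issue specification:\n${issueSpec}\n\nCode change:\n${diffHunk}\n\nReview comment:\n${comment}`,
      },
    ],
  });

  // Parse the numeric reply; fall back to 0 if the model did not return a number.
  const score = Number(response.choices[0].message.content?.trim());
  return Number.isFinite(score) ? score : 0;
}
```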

@EresDev marked this pull request as ready for review August 13, 2024 12:42
@gentlementlegen changed the base branch from development to main August 15, 2024 05:12
@gentlementlegen changed the base branch from main to development August 15, 2024 05:12
src/parser/data-purge-module.ts (outdated review thread, resolved)
src/parser/content-evaluator-module.ts (3 outdated review threads, resolved)
@gentlementlegen (Member) commented

@EresDev Could you resolve conflicts so we can get this in please?

@EresDev (Contributor, Author) commented Sep 10, 2024

> Could you resolve conflicts so we can get this in please?

@gentlementlegen
I have resolved the conflicts. Here is the latest QA.

@0x4007 (Member) commented Sep 11, 2024

> > Could you resolve conflicts so we can get this in please?
>
> @gentlementlegen
>
> I have resolved the conflicts. Here is the latest QA.

Honestly it's a bit difficult to understand the results in detail, especially from mobile, but if you think it looks as expected we can merge.

I'm a bit confused why many are scored zero

@gentlementlegen (Member) commented

I'd be in favor of replacing the - for relevances with 0, because to me it feels like no relevance was evaluated at all. I will test this on some previously closed tasks.

@0x4007 (Member) commented Sep 11, 2024

Interesting point. If relevance evaluation was skipped due to the config, then - makes sense. If it evaluated to 0 then we should write 0. Let's do this.
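A minimal sketch of that display rule (the helper name is hypothetical; the plugin's actual rendering code may differ):

```typescript
// Hypothetical helper: keep "-" only when relevance evaluation never ran
// (e.g. it is disabled in the config); otherwise print the numeric score,
// including a genuine 0.
function formatRelevance(relevance: number | undefined, evaluationEnabled: boolean): string {
  if (!evaluationEnabled || relevance === undefined) {
    return "-"; // evaluation was skipped
  }
  return relevance.toString(); // a real score, even if it is 0
}

console.log(formatRelevance(undefined, false)); // "-"
console.log(formatRelevance(0, true)); // "0"
```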

@gentlementlegen (Member) commented

@EresDev I gave it a try here and got a relevance of 1 for all pull-request commands. Perhaps the configuration should be changed as well?

@EresDev (Contributor, Author) commented Sep 11, 2024

> @EresDev I gave it a try here and got a relevance of 1 for all pull-request commands. Perhaps the configuration should be changed as well?

It appears to be running on an old commit; the latest code of this PR isn't being used.

@0x4007 (Member) commented Sep 11, 2024

> > @EresDev I gave it a try here and got a relevance of 1 for all pull-request commands. Perhaps the configuration should be changed as well?
>
> It appears to be running on an old commit; the latest code of this PR isn't being used.

@gentlementlegen change the package version to `*` to waste less time on housekeeping

@EresDev (Contributor, Author) commented Sep 11, 2024

> I'm a bit confused why many are scored zero

Those zero scores appear correct to me. They are mostly comments I wrote when I made a change to the bot and wanted to see its result, for example "updated config". A comment like that has no relevance to the original issue specification, so OpenAI scored its relevance 0. I only wrote it there to keep track of the QA.

@0x4007 mentioned this pull request Sep 11, 2024 (Closed)
@0x4007 (Member) left a comment

It seems fine, although to be honest it's sometimes a bit difficult to tell what's going on from our results table, especially from mobile.

@gentlementlegen (Member) commented

> > > @EresDev I gave it a try here and got a relevance of 1 for all pull-request commands. Perhaps the configuration should be changed as well?
> >
> > It appears to be running on an old commit; the latest code of this PR isn't being used.
>
> @gentlementlegen change the package version to `*` to waste less time on housekeeping

Not sure how this is relevant to the problem; I probably just didn't merge / use the latest commit properly when updating it in my org.

src/parser/content-evaluator-module.ts (outdated review thread, resolved)
@0x4007 (Member) commented Sep 17, 2024

@EresDev Solve merge conflict and you can merge

@0x4007 (Member) commented Sep 22, 2024

Looks like there's a ton of changes. Perhaps you should ensure it works by posting QA

@EresDev (Contributor, Author) commented Sep 22, 2024

> Looks like there's a ton of changes. Perhaps you should ensure it works by posting QA

Here is the latest QA. It looks good to me. If it looks good to you as well, you can merge it. @0x4007

@0x4007 merged commit 30fed26 into ubiquity-os-marketplace:development Sep 22, 2024
3 checks passed
@ubiquity-os bot mentioned this pull request Sep 22, 2024
Successfully merging this pull request may close these issues: Relevance Adjustment