PR: evaluate code review comments with OpenAI #79
Update test issue to one that has PR code review comments
128 tokens handle roughly 10 code review comments
@EresDev Could you resolve conflicts so we can get this in please?
Mostly updates expected output to include review comments
@gentlementlegen |
Honestly, it's a bit difficult to understand the results in detail, especially from mobile, but if you think they look as expected we can merge. I'm a bit confused why many are scored zero.
I'd be in favor to replace the |
Interesting point. If relevance evaluation was skipped due to the config, then "-" makes sense. If it evaluated to 0, then we should write 0. Let's do this.
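A minimal sketch of the rule agreed on above, assuming a skipped evaluation is represented as `undefined`. The function name `formatRelevance` and that representation are hypothetical and not taken from the bot's code:

```typescript
// Hypothetical helper: the name and the use of `undefined` for a skipped
// evaluation are assumptions made for illustration only.
function formatRelevance(relevance: number | undefined): string {
  // Relevance evaluation was skipped (for example, disabled in the config).
  if (relevance === undefined) {
    return "-";
  }
  // An evaluated score is always printed as a number, including 0.
  return relevance.toString();
}
```

With this distinction, a comment that OpenAI actually scored as irrelevant shows `0`, while a comment that was never evaluated shows `-`.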
@gentlementlegen change the package version to |
They appear very good to me. They are mostly comments I wrote when I made a change to the bot and wanted to see its result, for example "updated config". That comment has no relevance to the original issue specification, and OpenAI scored its relevance as 0. I only wrote it there to keep track of the QA.
It seems fine although to be honest sometimes it's a bit difficult to tell what's going on from our results table, especially from mobile.
Not sure how this is relevant to the problem. I probably just didn't merge or use the latest commit properly when updating in my org.
@EresDev Resolve the merge conflict and you can merge.
Looks like there's a ton of changes. Perhaps you should ensure it works by posting QA.
Resolves #45
Depends on #55
QA: EresDevOrg/ubiquibot-issues#15 (comment)
See exactly which type of review comments are being scored by this PR: #45 (comment)
In the prompt, we ask OpenAI to score review comments, which is not exactly the same relevance score used for issue comments. For code review comments, the score reflects how much the comment improves the offered solution and the code quality. This made more sense to me. Because of this, OpenAI is strict in scoring here: only code review comments with good detail get a good score. OpenAI receives the issue specification, the part of the suggested code change the comment refers to, and the comment itself. It has no further information. This could be improved in the future by including the entire conversation of a single review, or maybe the entire code file, but that is a lot more work. I think what we have here is a good first try.
Edit:
Please note that OpenAI isn't as strict in evaluating the code as I said above. It was a mistake and it is fixed now. I think it is much better now. Latest QA: EresDevOrg/ubiquibot-issues#15 (comment)
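For illustration, the three inputs the description says OpenAI receives (issue specification, part of the code change, and the comment itself) could be assembled into a prompt roughly as below. This is only a sketch: the interface, the function name, and the prompt wording are hypothetical and are not the prompt used in this PR.

```typescript
// Hypothetical shape of the data sent for one code review comment.
// Field names are assumptions made for this sketch.
interface ReviewCommentInput {
  issueSpecification: string; // body of the issue the PR resolves
  diffHunk: string; // the part of the code change the comment refers to
  commentBody: string; // the review comment being scored
}

// Builds a single prompt string from the three inputs.
// The instruction text is illustrative, not the PR's actual prompt.
function buildReviewCommentPrompt(input: ReviewCommentInput): string {
  return [
    "Score the code review comment below between 0 and 1.",
    "Judge how much it improves the offered solution and the code quality.",
    "",
    "Issue specification:",
    input.issueSpecification,
    "",
    "Code change under review:",
    input.diffHunk,
    "",
    "Review comment:",
    input.commentBody,
  ].join("\n");
}
```

Passing a larger context, such as the whole review conversation or the full file, would only require adding fields to the hypothetical `ReviewCommentInput`, at the cost of more tokens per request.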