Analyze and correct eventual discrepancies with the old bot #26

Closed
gentlementlegen opened this issue May 30, 2024 · 19 comments · Fixed by #55

Comments

@gentlementlegen
Member

gentlementlegen commented May 30, 2024

We should keep comparing the old bot results with the new one, to make sure they behave similarly.

Issues spotted:

Old bot configuration: https://github.com/ubiquity/ubiquibot-config/blob/development/.github/ubiquibot-config.yml

Related results:

@EresDev
Contributor

EresDev commented Jul 9, 2024

/start

ubiquibot bot commented Jul 9, 2024

Warning! This task was created over 40 days ago. Please confirm that this issue specification is accurate before starting.
Deadline Wed, Jul 10, 11:25 AM UTC
Registered Wallet 0xE7a9fdf596D869AF34a130fa9607178B2B9800D9
Tips:
  • Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
  • Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
  • Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.

ubiquibot bot commented Aug 4, 2024

@gentlementlegen @EresDev the deadline is at 2024-08-05T09:30:02.552Z

ubiquibot bot commented Aug 13, 2024

+ Evaluating results. Please wait...

ubiquibot-dev bot commented Aug 13, 2024

[ 361.8 WXDAI ]

@gentlementlegen
Contributions Overview
View Contribution Count Reward
Issue Task 0.5 300
Issue Specification 1 33.6
Review Comment 11 28.2
Conversation Incentives
Comment Formatting Relevance Reward
We should keep comparing the old bot results with the new one, t…
33.6
content:
  p:
    count: 20
    score: 1
  ul:
    count: 90
    score: 0
  li:
    count: 90
    score: 1
  code:
    count: 2
    score: 1
wordValue: 0.1
formattingMultiplier: 3
1 33.6
Maybe these are not needed anymore.
0.6
content:
  p:
    count: 6
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.6
Wouldn't this be redundant in the configuration?
0.7
content:
  p:
    count: 7
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.7
Maybe remove logs or use the logger
0.7
content:
  p:
    count: 7
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.7
From my understanding, any comment within the pull request shoul…
4.7
content:
  p:
    count: 46
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 4.7
@EresDev Could you please fix the conflicts? This Pr is quite cr…
2.1
content:
  p:
    count: 21
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 2.1
Also I am trying to get this PR in, that youwill probably need t…
2.1
content:
  p:
    count: 21
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 2.1
I meant that maybe it would be better to have this one merged fi…
2.4
content:
  p:
    count: 24
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 2.4
@EresDev I'll run lots of tests and come back to you.
1.1
content:
  p:
    count: 11
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.1
@EresDev I just ran it against https://github.com/ubiquity/pay.u…
6.8
content:
  h2:
    count: 45
    score: 1
  p:
    count: 23
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 6.8
Latest QA (tests conducted with my fixes as well): - https://gi…
4.6
content:
  p:
    count: 9
    score: 1
  ul:
    count: 37
    score: 0
  li:
    count: 37
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 4.6
@EresDev on the issues I mentioned it should be yes. Actually ri…
2.4
content:
  p:
    count: 24
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 2.4

[ 670 WXDAI ]

@EresDev
Contributions Overview
View Contribution Count Reward
Issue Task 0.5 300
Review Comment 26 370
Conversation Incentives
Comment Formatting Relevance Reward
Resolves #26 ## QA - [part-1](https://github.com/EresDevOrg/ub…
0
content:
  p:
    count: 28
    score: 1
  h2:
    count: 1
    score: 1
  ul:
    count: 21
    score: 0
  li:
    count: 21
    score: 1
  a:
    count: 3
    score: 1
  h3:
    count: 4
    score: 1
wordValue: 0
formattingMultiplier: 0
0.4 -
Ok, I was wondering what the benefit of switching to JSON would …
56.4
content:
  p:
    count: 141
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 56.4
Here is a quickly written sample prompt for jSON. I will improve…
5.2
content:
  p:
    count: 13
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 2
1 5.2
Here is the response using the above prompt. ![image](https://…
3.2
content:
  p:
    count: 8
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 2
1 3.2
I am about to finish the PR and this one is blocking. Let me kno…
31.6
content:
  p:
    count: 79
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 31.6
I went ahead and implemented this. Specifications and comments a…
16.8
content:
  p:
    count: 42
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 16.8
I think you asked for it in the comment Some of these configure…
22.8
content:
  p:
    count: 57
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 22.8
Here is an example of our use case. Seems like helping with data…
12.8
content:
  p:
    count: 32
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 2
1 12.8
resolved by using typebox
1.6
content:
  p:
    count: 4
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 1.6
Ok, any issues with evalutation or openai API propagates the exc…
8.4
content:
  p:
    count: 21
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 8.4
Yeah that was dumb of me. Some leftovers of strickly typed langu…
12.8
content:
  p:
    count: 32
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 12.8
So this is still present. Just to let you know. If it is importa…
10.4
content:
  p:
    count: 26
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 10.4
removed this line.
1.2
content:
  p:
    count: 3
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 1.2
fixed
0.4
content:
  p:
    count: 1
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 0.4
resolved by https://github.com/ubiquibot/conversation-rewards/pu…
4.4
content:
  p:
    count: 9
    score: 1
  a:
    count: 2
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 4.4
resolved by https://github.com/ubiquibot/conversation-rewards/pu…
4.4
content:
  p:
    count: 9
    score: 1
  a:
    count: 2
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 4.4
It wasn't just h5 missing from permit comment, there were other …
35.2
content:
  p:
    count: 86
    score: 1
  code:
    count: 2
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 2
1 35.2
Question: Review comments need `relevance: 1` as given…
17.2
content:
  p:
    count: 39
    score: 1
  code:
    count: 4
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 17.2
@gentlementlegen I see tests failing here, and on also the devel…
14.4
content:
  p:
    count: 36
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 2
1 14.4
I have made it work by changing the expected output.
4
content:
  p:
    count: 10
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 4
Thanks. I was thinking the same to start a new issue for releva…
34.8
content:
  p:
    count: 87
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 34.8
The "Should evaluate content" is covering this test case. The co…
18.8
content:
  p:
    count: 47
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 18.8
I am in the process of resolving conflicts. But I didn't get wha…
13.6
content:
  p:
    count: 34
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 13.6
I have run a 2nd round of tests. All seems ok except the precis…
28
content:
  p:
    count: 64
    score: 1
  img:
    count: 1
    score: 0
  ul:
    count: 3
    score: 0
  li:
    count: 3
    score: 1
  a:
    count: 3
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 28
The floating point problem is fixed https://github.com/EresDevOr…
6
content:
  p:
    count: 15
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 6
I have merged your PR. I believe everything is fixed for this PR…
5.6
content:
  p:
    count: 14
    score: 1
wordValue: 0.2
formattingMultiplier: 2
1 5.6

[ 18.4 WXDAI ]

@whilefoo
Contributions Overview
View Contribution Count Reward
Review Comment 9 18.4
Conversation Incentives
Comment Formatting Relevance Reward
it'd be a good idea to merge latest changes because there were s…
2.7
content:
  p:
    count: 26
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 2.7
```suggestion const commentType = Type.Union([...Ob…
1.8
content:
  pre:
    count: 10
    score: 0
  code:
    count: 10
    score: 1
  p:
    count: 8
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.8
Why is casting to string needed?
0.6
content:
  p:
    count: 6
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.6
seems weird to return `{}`, if AI evaluation failed we s…
1.5
content:
  p:
    count: 14
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.5
it might be a good idea to validate the response with typebox in…
1.8
content:
  p:
    count: 18
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.8
For example one time I made a prompt that instructed it to outpu…
7.1
content:
  p:
    count: 70
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 7.1
usually typebox schemas are in types folder
0.7
content:
  p:
    count: 7
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.7
Consider using `Value.Decode` because it checks the valu…
1.9
content:
  p:
    count: 18
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.9
Then it's fine
0.3
content:
  p:
    count: 3
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.3

[ 10.1 WXDAI ]

@0x4007
Contributions Overview
View Contribution Count Reward
Review Comment 6 10.1
Conversation Incentives
Comment Formatting Relevance Reward
I think there's an official way to tell the API to return a JSON…
3.5
content:
  p:
    count: 35
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 3.5
Reference docs https://community.openai.com/t/how-do-i-use-the-n…
0.3
content:
  p:
    count: 3
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.3
I would imagine that relevance scoring should be a number.
1
content:
  p:
    count: 10
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1
Can you provide an example of what types of mistakes typebox can…
1.4
content:
  p:
    count: 14
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.4
Defaults should be set to `0` value.
0.8
content:
  p:
    count: 7
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.8
Good catch on this task. Why don't you address that in a new pul…
3.1
content:
  p:
    count: 31
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 3.1

ubiquibot bot commented Aug 13, 2024

[ 7.9 WXDAI ]

@0x4007
Contributions Overview
View Contribution Count Reward
Review Comment 2 7.9
Conversation Incentives
Comment Formatting Relevance Reward
> The bot scores 1 to tags not html listed in config. Is this...
1.7
code:
  count: 1
  score: "1"
  words: 1
0.67 1.7
> Question: Review comments need `relevance: 1` as gi...
6.2
code:
  count: 3
  score: "3"
  words: 6
0.78 6.2

[ 474.8 WXDAI ]

@gentlementlegen
Contributions Overview
View Contribution Count Reward
Issue Specification 1 62
Issue Task 0.5 600
Review Comment 8 75.2
Review Comment 8 37.6
Conversation Incentives
Comment Formatting Relevance Reward
We should keep comparing the old bot results with the new one, t...
62
li:
  count: 16
  score: "16"
  words: 187
code:
  count: 2
  score: "2"
  words: 2
1 62
From my understanding, any comment within the pull request shoul...
11.6
code:
  count: 1
  score: "2"
  words: 1
0.76 11.6
@EresDev Could you please fix the conflicts? This Pr is quite cr...
4.2 0.76 4.2
Also I am trying to get this PR in, that youwill probably need t...
5.6 0.68 5.6
> > Also I am trying to get this PR in, that youwill proba...
4.8 0.76 4.8
@EresDev I'll run lots of tests and come back to you....
2.4 0.79 2.4
@EresDev I just ran it against https://github.com/ubiquity/pay.u...
20.4
hr:
  count: 1
  score: "2"
  words: 0
0.81 20.4
Latest QA (tests conducted with my fixes as well): - https://gi...
21.2
li:
  count: 3
  score: "6"
  words: 67
0.86 21.2
@EresDev on the issues I mentioned it should be yes. Actually ri...
5 0.84 5
From my understanding, any comment within the pull request shoul...
5.8
code:
  count: 1
  score: "1"
  words: 1
0.76 5.8
@EresDev Could you please fix the conflicts? This Pr is quite cr...
2.1 0.76 2.1
Also I am trying to get this PR in, that youwill probably need t...
2.8 0.68 2.8
> > Also I am trying to get this PR in, that youwill proba...
2.4 0.76 2.4
@EresDev I'll run lots of tests and come back to you....
1.2 0.79 1.2
@EresDev I just ran it against https://github.com/ubiquity/pay.u...
10.2
hr:
  count: 1
  score: "1"
  words: 0
0.81 10.2
Latest QA (tests conducted with my fixes as well): - https://gi...
10.6
li:
  count: 3
  score: "3"
  words: 67
0.86 10.6
@EresDev on the issues I mentioned it should be yes. Actually ri...
2.5 0.84 2.5

[ 418.2 WXDAI ]

@EresDev
Contributions Overview
View Contribution Count Reward
Issue Task 0.5 600
Review Comment 10 59.1
Review Comment 10 59.1
Conversation Incentives
Comment Formatting Relevance Reward
It wasn't just h5 missing from permit comment, there were other ...
9.7
code:
  count: 1
  score: "1"
  words: 1
0.74 9.7
Question: Review comments need `relevance: 1` as given...
7.8
code:
  count: 3
  score: "3"
  words: 6
0.74 7.8
@gentlementlegen I see tests failing here, and on also the devel...
4.3 0.74 4.3
> @gentlementlegen I see tests failing here, and on also the ...
1 0.68 1
> > Question: Review comments need `relevance: 1` ...
12.6
code:
  count: 3
  score: "3"
  words: 6
0.74 12.6
> consider adding a test case which tests fixed relevance

...

4.8 0.81 4.8
> Also I am trying to get this PR in, that youwill probably n...
3.5 0.79 3.5
I have run a 2nd round of tests. All seems ok except the precis...
11.7

a:
  count: 2
  score: "2"
  words: 4
li:
  count: 2
  score: "2"
  words: 30
0.71 11.7
> All seems ok except the precision float in the value of for...
2.3 0.81 2.3
> Since this is urgently needed I opened a PR against your re...
1.4 0.88 1.4
It wasn't just h5 missing from permit comment, there were other ...
9.7
code:
  count: 1
  score: "1"
  words: 1
0.74 9.7
Question: Review comments need `relevance: 1` as given...
7.8
code:
  count: 3
  score: "3"
  words: 6
0.74 7.8
@gentlementlegen I see tests failing here, and on also the devel...
4.3 0.74 4.3
> @gentlementlegen I see tests failing here, and on also the ...
1 0.68 1
> > Question: Review comments need `relevance: 1` ...
12.6
code:
  count: 3
  score: "3"
  words: 6
0.74 12.6
> consider adding a test case which tests fixed relevance

...

4.8 0.81 4.8
> Also I am trying to get this PR in, that youwill probably n...
3.5 0.79 3.5
I have run a 2nd round of tests. All seems ok except the precis...
11.7

a:
  count: 2
  score: "2"
  words: 4
li:
  count: 2
  score: "2"
  words: 30
0.71 11.7
> All seems ok except the precision float in the value of for...
2.3 0.81 2.3
> Since this is urgently needed I opened a PR against your re...
1.4 0.88 1.4

[ 0.4 WXDAI ]

@whilefoo
Contributions Overview
View Contribution Count Reward
Review Comment 1 0.4
Conversation Incentives
Comment Formatting Relevance Reward
> The "Should evaluate content" is covering this test case. T...
0.4 0.79 0.4

@0x4007
Member

0x4007 commented Aug 13, 2024

@gentlementlegen @EresDev what do you guys think about the rewards comparing old to new? Old suggests you guys put in similar work, and new suggests that eres put in almost double. I'm under the impression that eres did put in about double the work (or more technically speaking, provided more value towards the completion of this project) but curious to hear your opinions.

I think we should start gently adjusting levers to align as accurately as possible for this.

@gentlementlegen
Member Author

gentlementlegen commented Aug 13, 2024

@0x4007 Biggest difference is that the new one picked up all the comments within the pull request (26 of them), whereas the old version only saw 10. Some multipliers seem to differ (e.g. code has a value of 2 in the old version and a value of 1 in the new). Plus, all of these comments are counted with a relevance of 1, which was one of the latest changes. This should also be adjusted after #79 is merged, which might make the results more accurate.
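
The per-comment numbers in the new bot's output earlier in this thread appear to follow one formula: the sum of `count * score` over the listed tags, multiplied by `wordValue`, `formattingMultiplier`, and the relevance. The sketch below is inferred from those posted numbers only, not from the plugin source; the type and function names are assumptions made for illustration.

```typescript
// Sketch only: the formula is inferred from the numbers the new bot posted in
// this thread, not copied from the plugin's source code.
interface TagScore {
  count: number; // value the bot lists as `count` for a tag
  score: number; // per-tag score from the configuration (0 disables the tag)
}

interface CommentScore {
  content: Record<string, TagScore>;
  wordValue: number;
  formattingMultiplier: number;
  relevance: number;
}

// Round to a few decimals to hide floating-point noise.
function roundTo(value: number, decimals: number): number {
  const factor = Math.pow(10, decimals);
  return Math.round(value * factor) / factor;
}

function commentReward(comment: CommentScore): number {
  let formattingSum = 0;
  for (const tag of Object.keys(comment.content)) {
    formattingSum += comment.content[tag].count * comment.content[tag].score;
  }
  const raw =
    formattingSum * comment.wordValue * comment.formattingMultiplier * comment.relevance;
  return roundTo(raw, 3);
}

// Values copied from the issue specification entry in the new bot's output.
const issueSpecification: CommentScore = {
  content: {
    p: { count: 20, score: 1 },
    ul: { count: 90, score: 0 },
    li: { count: 90, score: 1 },
    code: { count: 2, score: 1 },
  },
  wordValue: 0.1,
  formattingMultiplier: 3,
  relevance: 1,
};

console.log(commentReward(issueSpecification)); // 33.6
```

With the issue specification entry (a formatting sum of 112, `wordValue` 0.1, `formattingMultiplier` 3, relevance 1) this gives 33.6, which matches the reward the new bot posted for that comment.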

@EresDev
Contributor

EresDev commented Aug 14, 2024

> Old suggests you guys put in similar work, and new suggests that eres put in almost double. I'm under the impression that eres did put in about double the work (or more technically speaking, provided more value towards the completion of this project) but curious to hear your opinions.

I think this is true. The old & new bot both divide the task equally. The new ubiquibot provides some relief with comment incentives. But I see distribution of task incentives should be improved.
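
The equal division mentioned here amounts to splitting the task price by the number of assignees. A minimal sketch, assuming a task price of 600 (which would be consistent with the 0.5 share and 300 reward each assignee shows in the bot output); the function name is hypothetical and this is not code from either bot.

```typescript
// Sketch only: names are hypothetical, and the equal split is the behaviour
// described in this thread, not code taken from either bot.
function splitTaskReward(taskPrice: number, assignees: string[]): Record<string, number> {
  if (assignees.length === 0) {
    throw new Error("At least one assignee is required");
  }
  // Every assignee receives the same share, regardless of how much they changed.
  const share = taskPrice / assignees.length;
  const result: Record<string, number> = {};
  for (const assignee of assignees) {
    result[assignee] = share;
  }
  return result;
}

const taskShares = splitTaskReward(600, ["gentlementlegen", "EresDev"]);
console.log(taskShares.gentlementlegen, taskShares.EresDev); // 300 300
```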

@0x4007
Member

0x4007 commented Aug 15, 2024

Curious to hear if you guys have any proposals to solve for this scenario.

@gentlementlegen
Member Author

Let's see if analyzing the code with #79 helps get more accurate results about the improvements introduced by the pull request and its comments.

@0x4007
Member

0x4007 commented Aug 16, 2024

> But I see distribution of task incentives should be improved.

Not all lines of code are equal so I think this is tough to solve for. We could consider compiling the diff from every code contributor and then ask ChatGPT how impactful the sum total of their changes are in order to try and be more nuanced with the task reward.
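
One way the idea above could be prototyped is to group each contributor's patches and assemble a single prompt from them. The sketch below only builds the prompt text; `FileChange`, `groupPatchesByAuthor`, `buildImpactPrompt`, and the prompt wording are all assumptions for illustration, and the call to the model is deliberately left out.

```typescript
// Sketch only: every name here is hypothetical. It assembles the prompt text;
// sending it to a model and parsing the answer are not shown.
interface FileChange {
  author: string; // GitHub login of the commit author
  filename: string;
  patch: string; // unified diff text for this file
}

// Collects every patch under the login of the contributor who authored it.
function groupPatchesByAuthor(changes: FileChange[]): Record<string, string[]> {
  const byAuthor: Record<string, string[]> = {};
  for (const change of changes) {
    if (!byAuthor[change.author]) {
      byAuthor[change.author] = [];
    }
    byAuthor[change.author].push("--- " + change.filename + "\n" + change.patch);
  }
  return byAuthor;
}

// Builds one prompt with a section per contributor.
function buildImpactPrompt(changes: FileChange[]): string {
  const byAuthor = groupPatchesByAuthor(changes);
  const sections = Object.keys(byAuthor).map(
    (author) => "## " + author + "\n" + byAuthor[author].join("\n")
  );
  const instructions = [
    "Rate how impactful each contributor's combined changes are, from 0 to 1.",
    "Reply with a JSON object mapping each login to a number.",
  ];
  return instructions.concat(sections).join("\n\n");
}
```

The returned ratings could then be used as weights for the task reward instead of an equal split, though whether the model's ratings are reliable enough for that would need testing.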

@EresDev
Contributor

EresDev commented Aug 16, 2024

> Curious to hear if you guys have any proposals to solve for this scenario.

I have been thinking about this but couldn't find a decent solution. What's already present is the best solution I think. Dividing it equally is ok. Unfair distribution will be rare. The second developer usually joins for reasons that make the distribution fair.

> We could consider compiling the diff from every code contributor and then ask ChatGPT how impactful the sum total of their changes are to try and be more nuanced with the task reward.

I have concerns that relying on ChatGPT for this will result in a higher occurrence of unfair distributions. But it is something that can be tried. Some pull requests are huge, and sending their code to OpenAI may hit OpenAI's token or API limits.

What would be better than ChatGPT is a solid formula to measure this. I haven't found one so far.
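
The token-limit concern could be handled by splitting a large diff into chunks before sending it. A rough sketch, assuming about 4 characters per token (a common rule of thumb, not an exact tokenizer count); the names and the budget are hypothetical.

```typescript
// Sketch only: ~4 characters per token is a rough rule of thumb, not an exact
// count; a real tokenizer would be needed for precise limits.
const APPROX_CHARS_PER_TOKEN = 4;

function estimateTokens(text: string): number {
  return Math.ceil(text.length / APPROX_CHARS_PER_TOKEN);
}

// Splits a diff on line boundaries so each chunk stays within the budget.
// A single line larger than the budget still becomes its own (oversized) chunk.
function chunkDiff(diff: string, maxTokensPerChunk: number): string[] {
  const chunks: string[] = [];
  let current: string[] = [];
  let currentTokens = 0;
  for (const line of diff.split("\n")) {
    const lineTokens = estimateTokens(line + "\n");
    // Start a new chunk when adding this line would exceed the budget.
    if (current.length > 0 && currentTokens + lineTokens > maxTokensPerChunk) {
      chunks.push(current.join("\n"));
      current = [];
      currentTokens = 0;
    }
    current.push(line);
    currentTokens += lineTokens;
  }
  if (current.length > 0) {
    chunks.push(current.join("\n"));
  }
  return chunks;
}
```

Each chunk could be evaluated separately and the results combined, at the cost of the model never seeing the whole change at once.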

@EresDev
Contributor

EresDev commented Aug 16, 2024

I haven't been able to claim the reward of this issue so far because, by the time I got here, the permit wallets were empty for the ubiquibot upgrade. They are still empty, even though claims are working on other issues. Closing and reopening this issue would also generate a higher permit for me based on the new ubiquibot, I think. Please suggest a solution to this. @0x4007

0x4007 reopened this Aug 16, 2024
0x4007 closed this as completed Aug 16, 2024

ubiquibot bot commented Aug 16, 2024

+ Evaluating results. Please wait...

@0x4007
Member

0x4007 commented Aug 16, 2024

Just take the new permit then

ubiquityos bot commented Aug 16, 2024

[ 361.24 WXDAI ]

@gentlementlegen
Contributions Overview
View Contribution Count Reward
Issue Task 0.5 300
Issue Specification 1 33.6
Issue Comment 2 19.24
Review Comment 13 8.4
Conversation Incentives
Comment Formatting Relevance Reward
We should keep comparing the old bot results with the new one, t…
33.6
content:
  p:
    count: 20
    score: 1
  ul:
    count: 90
    score: 0
  li:
    count: 90
    score: 1
  code:
    count: 2
    score: 1
wordValue: 0.1
formattingMultiplier: 3
1 33.6
@0x4007 Biggest difference is that the new one picked up all the…
17.8
content:
  p:
    count: 85
    score: 1
  code:
    count: 4
    score: 1
wordValue: 0.2
formattingMultiplier: 1
0.9 16.02
Let's see if analyzing the code with https://github.com/ubiquibo…
4.6
content:
  p:
    count: 23
    score: 1
wordValue: 0.2
formattingMultiplier: 1
0.7 3.22
Maybe these are not needed anymore.
0.15
content:
  p:
    count: 6
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.15
Wouldn't this be redundant in the configuration?
0.175
content:
  p:
    count: 7
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.175
Maybe remove logs or use the logger
0.175
content:
  p:
    count: 7
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.175
The nesting is die to `userExtractor` being its own modu…
0.95
content:
  p:
    count: 36
    score: 1
  code:
    count: 2
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.95
It should not be but according to GitHub types `body` ca…
0.4
content:
  p:
    count: 14
    score: 1
  code:
    count: 2
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.4
From my understanding, any comment within the pull request shoul…
1.175
content:
  p:
    count: 46
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 1.175
@EresDev Could you please fix the conflicts? This Pr is quite cr…
0.525
content:
  p:
    count: 21
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.525
Also I am trying to get this PR in, that youwill probably need t…
0.525
content:
  p:
    count: 21
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.525
I meant that maybe it would be better to have this one merged fi…
0.6
content:
  p:
    count: 24
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.6
@EresDev I'll run lots of tests and come back to you.
0.275
content:
  p:
    count: 11
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.275
@EresDev I just ran it against https://github.com/ubiquity/pay.u…
1.7
content:
  h2:
    count: 45
    score: 1
  p:
    count: 23
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 1.7
Latest QA (tests conducted with my fixes as well): - https://gi…
1.15
content:
  p:
    count: 9
    score: 1
  ul:
    count: 37
    score: 0
  li:
    count: 37
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 1.15
@EresDev on the issues I mentioned it should be yes. Actually ri…
0.6
content:
  p:
    count: 24
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.6

[ 302.53 WXDAI ]

@EresDev
Contributions Overview
View Contribution Count Reward
Issue Task 0.5 300
Issue Comment 3 2.53
Review Comment 27 0
Conversation Incentives
Comment Formatting Relevance Reward
I think this is true. The old & new bot both divide the task…
0.85
content:
  p:
    count: 34
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
0.6 0.51
I have been thinking about this but couldn't find a decent solut…
2.65
content:
  p:
    count: 106
    score: 1
  br:
    count: 1
    score: 0
wordValue: 0.1
formattingMultiplier: 0.25
0.7 1.855
I haven't been able to claim the reward of this issue so far bec…
1.65
content:
  p:
    count: 66
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
0.1 0.165
Resolves #26 ## QA - [part-1](https://github.com/EresDevOrg/ub…
0
content:
  p:
    count: 28
    score: 1
  h2:
    count: 1
    score: 1
  ul:
    count: 21
    score: 0
  li:
    count: 21
    score: 1
  a:
    count: 3
    score: 1
  h3:
    count: 4
    score: 1
wordValue: 0
formattingMultiplier: 0
0.5 -
Ok, I was wondering what the benefit of switching to JSON would …
0
content:
  p:
    count: 141
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
Here is a quickly written sample prompt for jSON. I will improve…
0
content:
  p:
    count: 13
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 0
1 -
Here is the response using the above prompt. ![image](https://…
0
content:
  p:
    count: 8
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 0
1 -
I am about to finish the PR and this one is blocking. Let me kno…
0
content:
  p:
    count: 79
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
I went ahead and implemented this. Specifications and comments a…
0
content:
  p:
    count: 42
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
I think you asked for it in the comment Some of these configure…
0
content:
  p:
    count: 57
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
Here is an example of our use case. Seems like helping with data…
0
content:
  p:
    count: 32
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 0
1 -
resolved by using typebox
0
content:
  p:
    count: 4
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
Ok, any issues with evalutation or openai API propagates the exc…
0
content:
  p:
    count: 21
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
Yeah that was dumb of me. Some leftovers of strickly typed langu…
0
content:
  p:
    count: 32
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
So this is still present. Just to let you know. If it is importa…
0
content:
  p:
    count: 26
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
removed this line.
0
content:
  p:
    count: 3
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
fixed
0
content:
  p:
    count: 1
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
resolved by https://github.com/ubiquibot/conversation-rewards/pu…
0
content:
  p:
    count: 9
    score: 1
  a:
    count: 2
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
resolved by https://github.com/ubiquibot/conversation-rewards/pu…
0
content:
  p:
    count: 9
    score: 1
  a:
    count: 2
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
The requirement was to set the relevance of issue specifications…
0
content:
  p:
    count: 94
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
It wasn't just h5 missing from permit comment, there were other …
0
content:
  p:
    count: 86
    score: 1
  code:
    count: 2
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 0
1 -
Question: Review comments need `relevance: 1` as given…
0
content:
  p:
    count: 39
    score: 1
  code:
    count: 4
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
@gentlementlegen I see tests failing here, and on also the devel…
0
content:
  p:
    count: 36
    score: 1
  img:
    count: 1
    score: 0
wordValue: 0.2
formattingMultiplier: 0
1 -
I have made it work by changing the expected output.
0
content:
  p:
    count: 10
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
Thanks. I was thinking the same to start a new issue for releva…
0
content:
  p:
    count: 87
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
The "Should evaluate content" is covering this test case. The co…
0
content:
  p:
    count: 47
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
I am in the process of resolving conflicts. But I didn't get wha…
0
content:
  p:
    count: 34
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
I have run a 2nd round of tests. All seems ok except the precis…
0
content:
  p:
    count: 64
    score: 1
  img:
    count: 1
    score: 0
  ul:
    count: 3
    score: 0
  li:
    count: 3
    score: 1
  a:
    count: 3
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
The floating point problem is fixed https://github.com/EresDevOr…
0
content:
  p:
    count: 15
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -
I have merged your PR. I believe everything is fixed for this PR…
0
content:
  p:
    count: 14
    score: 1
wordValue: 0.2
formattingMultiplier: 0
1 -

[ 24.57 WXDAI ]

@0x4007
Contributions Overview
View Contribution Count Reward
Issue Comment 4 10.37
Review Comment 9 14.2
Conversation Incentives
Comment Formatting Relevance Reward
@gentlementlegen @EresDev what do you guys think about the rewar…
7.9
content:
  p:
    count: 79
    score: 1
wordValue: 0.1
formattingMultiplier: 1
0.8 6.32
Curious to hear if you guys have any proposals to solve for this…
1.4
content:
  p:
    count: 14
    score: 1
wordValue: 0.1
formattingMultiplier: 1
0.3 0.42
Not all lines of code are equal so I think this is tough to solv…
5.1
content:
  p:
    count: 51
    score: 1
wordValue: 0.1
formattingMultiplier: 1
0.7 3.57
Just take the new permit then
0.6
content:
  p:
    count: 6
    score: 1
wordValue: 0.1
formattingMultiplier: 1
0.1 0.06
I think there's an official way to tell the API to return a JSON…
3.5
content:
  p:
    count: 35
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 3.5
Reference docs https://community.openai.com/t/how-do-i-use-the-n…
0.3
content:
  p:
    count: 3
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.3
I would imagine that relevance scoring should be a number.
1
content:
  p:
    count: 10
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1
Can you provide an example of what types of mistakes typebox can…
1.4
content:
  p:
    count: 14
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.4
Sets these to relevance 1? This is not clear to me.
1.1
content:
  p:
    count: 11
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.1
What is this? We should just make it a single property without n…
1.4
content:
  p:
    count: 14
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.4
I'm surprised that empty comments have been found in testing. I …
1.6
content:
  p:
    count: 16
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 1.6
Defaults should be set to `0` value.
0.8
content:
  p:
    count: 7
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 0.8
Good catch on this task. Why don't you address that in a new pul…
3.1
content:
  p:
    count: 31
    score: 1
wordValue: 0.1
formattingMultiplier: 1
1 3.1

[ 4.6 WXDAI ]

@whilefoo
Contributions Overview
View Contribution Count Reward
Review Comment 9 4.6
Conversation Incentives
Comment Formatting Relevance Reward
it'd be a good idea to merge latest changes because there were s…
0.675
content:
  p:
    count: 26
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.675
```suggestion const commentType = Type.Union([...Ob…
0.45
content:
  pre:
    count: 10
    score: 0
  code:
    count: 10
    score: 1
  p:
    count: 8
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.45
Why is casting to string needed?
0.15
content:
  p:
    count: 6
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.15
seems weird to return `{}`, if AI evaluation failed we s…
0.375
content:
  p:
    count: 14
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.375
it might be a good idea to validate the response with typebox in…
0.45
content:
  p:
    count: 18
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.45
For example one time I made a prompt that instructed it to outpu…
1.775
content:
  p:
    count: 70
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 1.775
usually typebox schemas are in types folder
0.175
content:
  p:
    count: 7
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.175
Consider using `Value.Decode` because it checks the valu…
0.475
content:
  p:
    count: 18
    score: 1
  code:
    count: 1
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.475
Then it's fine
0.075
content:
  p:
    count: 3
    score: 1
wordValue: 0.1
formattingMultiplier: 0.25
1 0.075
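
The figures in the table above are consistent with a simple product: the sum of `count * score` over all tags, times `wordValue`, times `formattingMultiplier`, times the relevance. The following is a minimal TypeScript sketch under that assumption. It reproduces the printed values but is not taken from the bot's source, and the names `TagScore`, `FormattingDetails` and `commentReward` are hypothetical.

```typescript
// Hypothetical shapes mirroring the YAML printed in the Formatting column.
interface TagScore {
  count: number; // "count" as printed for the tag
  score: number; // "score" as printed for the tag
}

interface FormattingDetails {
  content: { [tag: string]: TagScore };
  wordValue: number;
  formattingMultiplier: number;
}

// Assumed formula (inferred from the table, not from the bot's code):
// sum(count * score) * wordValue * formattingMultiplier * relevance
function commentReward(details: FormattingDetails, relevance: number): number {
  let weighted = 0;
  for (const tag of Object.keys(details.content)) {
    const entry = details.content[tag];
    weighted += entry.count * entry.score;
  }
  const raw = weighted * details.wordValue * details.formattingMultiplier * relevance;
  // Round to 3 decimals to hide binary floating-point noise.
  return Math.round(raw * 1000) / 1000;
}

// Row "```suggestion const commentType = ..." from the table above.
const suggestionRow: FormattingDetails = {
  content: {
    pre: { count: 10, score: 0 },
    code: { count: 10, score: 1 },
    p: { count: 8, score: 1 },
  },
  wordValue: 0.1,
  formattingMultiplier: 0.25,
};
console.log(commentReward(suggestionRow, 1)); // (0 + 10 + 8) * 0.1 * 0.25
```

The rounding step is a design choice of this sketch only; the thread does not say how the bot handles float precision.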

ubiquibot bot commented Aug 16, 2024

[ 23 WXDAI ]

@0x4007
Contributions Overview
View Contribution Count Reward
Issue Comment 4 15.1
Review Comment 2 7.9
Conversation Incentives
Comment Formatting Relevance Reward
@gentlementlegen @EresDev what do you guys think about the rewar...
8 0.805 8
Curious to hear if you guys have any proposals to solve for this...
1.4 0.675 1.4
> But I see distribution of task incentives should be improve...
5.1 0.665 5.1
Just take the new permit then ...
0.6 0.635 0.6
> The bot scores 1 to tags not html listed in config. Is this...
1.7
code:
  count: 1
  score: "1"
  words: 1
0.73 1.7
> Question: Review comments need `relevance: 1` as gi...
6.2
code:
  count: 3
  score: "3"
  words: 6
0.8 6.2
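
As a cross-check, the per-comment rewards for @0x4007 can be summed by contribution type and compared with the Contributions Overview and the 23 WXDAI header. The sketch below assumes each fused row reads as formatting, relevance, reward (for example `1.40.6751.4` as 1.4, 0.675, 1.4). The names `RewardRow` and `totalsByType` are hypothetical.

```typescript
// One row of the Conversation Incentives table, reduced to what the sum needs.
interface RewardRow {
  type: string; // contribution type from the overview
  reward: number; // last cell of the row (assumed parse of the fused cells)
}

// Rewards for @0x4007 as read from the rows above.
const rows0x4007: RewardRow[] = [
  { type: "Issue Comment", reward: 8 },
  { type: "Issue Comment", reward: 1.4 },
  { type: "Issue Comment", reward: 5.1 },
  { type: "Issue Comment", reward: 0.6 },
  { type: "Review Comment", reward: 1.7 },
  { type: "Review Comment", reward: 6.2 },
];

// Round to one decimal, matching the precision printed in the overview.
function round1(value: number): number {
  return Math.round(value * 10) / 10;
}

// Sum rewards per contribution type.
function totalsByType(rows: RewardRow[]): { [type: string]: number } {
  const totals: { [type: string]: number } = {};
  for (const row of rows) {
    totals[row.type] = round1((totals[row.type] || 0) + row.reward);
  }
  return totals;
}

const totals0x4007 = totalsByType(rows0x4007);
const grandTotal0x4007 = round1(
  Object.keys(totals0x4007).reduce((sum, type) => sum + totals0x4007[type], 0)
);
console.log(totals0x4007, grandTotal0x4007);
```

Under this reading the sums agree with the overview rows (15.1 and 7.9) and with the header total.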

[ 504 WXDAI ]

@gentlementlegen
Contributions Overview
View Contribution Count Reward
Issue Specification 1 62
Issue Task 0.5 600
Issue Comment 2 29.2
Issue Comment 2 0
Review Comment 8 75.2
Review Comment 8 37.6
Conversation Incentives
Comment Formatting Relevance Reward
We should keep comparing the old bot results with the new one, t...
62
li:
  count: 16
  score: "16"
  words: 187
code:
  count: 2
  score: "2"
  words: 2
1 62
@0x4007 Biggest difference is that the new one picked up all the...
22.8
code:
  count: 4
  score: "4"
  words: 4
0.855 22.8
Let's see if analyzing the code with https://github.com/ubiquibo...
6.4 0.675 6.4
@0x4007 Biggest difference is that the new one picked up all the...
-
code:
  count: 4
  score: "0"
  words: 4
0.855 -
Let's see if analyzing the code with https://github.com/ubiquibo...
- 0.675 -
From my understanding, any comment within the pull request shoul...
11.6
code:
  count: 1
  score: "2"
  words: 1
0.84 11.6
@EresDev Could you please fix the conflicts? This PR is quite cr...
4.2 0.75 4.2
Also I am trying to get this PR in, that you will probably need t...
5.6 0.86 5.6
> > Also I am trying to get this PR in, that you will proba...
4.8 0.75 4.8
@EresDev I'll run lots of tests and come back to you....
2.4 0.81 2.4
@EresDev I just ran it against https://github.com/ubiquity/pay.u...
20.4
hr:
  count: 1
  score: "2"
  words: 0
0.79 20.4
Latest QA (tests conducted with my fixes as well): - https://gi...
21.2
li:
  count: 3
  score: "6"
  words: 67
0.81 21.2
@EresDev on the issues I mentioned it should be yes. Actually ri...
5 0.9 5
From my understanding, any comment within the pull request shoul...
5.8
code:
  count: 1
  score: "1"
  words: 1
0.84 5.8
@EresDev Could you please fix the conflicts? This PR is quite cr...
2.1 0.75 2.1
Also I am trying to get this PR in, that you will probably need t...
2.8 0.86 2.8
> > Also I am trying to get this PR in, that you will proba...
2.4 0.75 2.4
@EresDev I'll run lots of tests and come back to you....
1.2 0.81 1.2
@EresDev I just ran it against https://github.com/ubiquity/pay.u...
10.2
hr:
  count: 1
  score: "1"
  words: 0
0.79 10.2
Latest QA (tests conducted with my fixes as well): - https://gi...
10.6
li:
  count: 3
  score: "3"
  words: 67
0.81 10.6
@EresDev on the issues I mentioned it should be yes. Actually ri...
2.5 0.9 2.5

[ 439.1 WXDAI ]

@EresDev
Contributions Overview
View Contribution Count Reward
Issue Task 0.5 600
Issue Comment 3 0
Issue Comment 3 20.9
Review Comment 10 59.1
Review Comment 10 59.1
Conversation Incentives
Comment Formatting Relevance Reward
> Old suggests you guys put in similar work, and new suggests...
- 0.68 -
> Curious to hear if you guys have any proposals to solve for...
- 0.55 -
I haven't been able to claim the reward of this issue so far bec...
- 0.575 -
> Old suggests you guys put in similar work, and new suggests...
3.3 0.68 3.3
> Curious to hear if you guys have any proposals to solve for...
10.9 0.55 10.9
I haven't been able to claim the reward of this issue so far bec...
6.7 0.575 6.7
It wasn't just h5 missing from permit comment, there were other ...
9.7
code:
  count: 1
  score: "1"
  words: 1
0.74 9.7
Question: Review comments need `relevance: 1` as given...
7.8
code:
  count: 3
  score: "3"
  words: 6
0.83 7.8
@gentlementlegen I see tests failing here, and on also the devel...
4.3 0.73 4.3
> @gentlementlegen I see tests failing here, and on also the ...
1 0.68 1
> > Question: Review comments need `relevance: 1` ...
12.6
code:
  count: 3
  score: "3"
  words: 6
0.71 12.6
> consider adding a test case which tests fixed relevance

...

4.8 0.81 4.8
> Also I am trying to get this PR in, that you will probably n...
3.5 0.84 3.5
I have run a 2nd round of tests. All seems ok except the precis...
11.7

a:
  count: 2
  score: "2"
  words: 4
li:
  count: 2
  score: "2"
  words: 30
0.75 11.7
> All seems ok except the precision float in the value of for...
2.3 0.87 2.3
> Since this is urgently needed I opened a PR against your re...
1.4 0.86 1.4
It wasn't just h5 missing from permit comment, there were other ...
9.7
code:
  count: 1
  score: "1"
  words: 1
0.74 9.7
Question: Review comments need `relevance: 1` as given...
7.8
code:
  count: 3
  score: "3"
  words: 6
0.83 7.8
@gentlementlegen I see tests failing here, and on also the devel...
4.3 0.73 4.3
> @gentlementlegen I see tests failing here, and on also the ...
1 0.68 1
> > Question: Review comments need `relevance: 1` ...
12.6
code:
  count: 3
  score: "3"
  words: 6
0.71 12.6
> consider adding a test case which tests fixed relevance

...

4.8 0.81 4.8
> Also I am trying to get this PR in, that you will probably n...
3.5 0.84 3.5
I have run a 2nd round of tests. All seems ok except the precis...
11.7

a:
  count: 2
  score: "2"
  words: 4
li:
  count: 2
  score: "2"
  words: 30
0.75 11.7
> All seems ok except the precision float in the value of for...
2.3 0.87 2.3
> Since this is urgently needed I opened a PR against your re...
1.4 0.86 1.4

[ 0.4 WXDAI ]

@whilefoo
Contributions Overview
View Contribution Count Reward
Review Comment 1 0.4
Conversation Incentives
Comment Formatting Relevance Reward
> The "Should evaluate content" is covering this test case. T...
0.4 0.84 0.4

@0x4007
Member

0x4007 commented Aug 16, 2024

> Some pull requests are huge and sending their code to OpenAI probably may hit OpenAI token or API limit.

Unlikely. The new context lengths are crazy large.
