Analyze and correct eventual discrepancies with the old bot #26

gentlementlegen · 2024-05-30T03:00:04Z

We should keep comparing the old bot results with the new one, to make sure they behave similarly.

Issues spotted:

- specification should have a relevance of 1 (c.f. Relevance Adjustment #45)
- all the items should be listed (case of missing h5)
- details of the applied multiplier should be displayed (supposedly already implemented, but not showing up, probably not the latest deployment)
- review comments should be considered
- relevance should be set at 1 in case of bug during ChatGPT evaluation
- specification doesn't seem to be taken into account
- comments are ignored (c.f. Analyze and correct eventual discrepancies with the old bot #26)
- should properly display all the comments: Change ts-retry to Octokit plugin retry #66 (comment)
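
Two of the items above (a fixed relevance of 1 for the specification, and a fallback of 1 when the ChatGPT evaluation fails) can be sketched as follows. This is only a minimal illustration of the stated requirement, not the bot's code: `CommentKind`, `resolveRelevance`, and the `evaluate` callback are hypothetical names.

```typescript
// Hypothetical classification of a comment; the real bot may model this differently.
type CommentKind = "specification" | "comment";

// Returns the relevance to apply to a comment.
// `evaluate` stands in for the ChatGPT evaluation and may throw (assumption).
function resolveRelevance(kind: CommentKind, evaluate: () => number): number {
  // The issue specification always gets a relevance of 1.
  if (kind === "specification") {
    return 1;
  }
  try {
    return evaluate();
  } catch (error) {
    // If the evaluation fails, fall back to a relevance of 1.
    return 1;
  }
}

console.log(resolveRelevance("specification", () => 0.2));
console.log(resolveRelevance("comment", () => 0.7));
console.log(
  resolveRelevance("comment", () => {
    throw new Error("evaluation failed");
  })
);
```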

Old bot configuration: https://github.com/ubiquity/ubiquibot-config/blob/development/.github/ubiquibot-config.yml

Tips:

Use /wallet 0x0000...0000 if you want to update your registered payment wallet address.
Be sure to open a draft pull request as soon as possible to communicate updates on your progress.
Be sure to provide timely updates to us when requested, or you will be automatically unassigned from the task.

ubiquibot · 2024-08-04T09:30:02Z

@gentlementlegen @EresDev the deadline is at 2024-08-05T09:30:02.552Z

ubiquibot · 2024-08-13T04:55:02Z

+ Evaluating results. Please wait...

ubiquibot-dev · 2024-08-13T04:55:10Z

[ 361.8 WXDAI ]

@gentlementlegen

Contributions Overview

View	Contribution	Count	Reward
Issue	Task	0.5	300
Issue	Specification	1	33.6
Review	Comment	11	28.2

Conversation Incentives

Comment	Formatting	Relevance	Reward
We should keep comparing the old bot results with the new one, t…	33.6 content: p: count: 20 score: 1 ul: count: 90 score: 0 li: count: 90 score: 1 code: count: 2 score: 1 wordValue: 0.1 formattingMultiplier: 3	1	33.6
Maybe these are not needed anymore.	0.6 content: p: count: 6 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.6
Wouldn't this be redundant in the configuration?	0.7 content: p: count: 7 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.7
Maybe remove logs or use the logger	0.7 content: p: count: 7 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.7
From my understanding, any comment within the pull request shoul…	4.7 content: p: count: 46 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	4.7
@EresDev Could you please fix the conflicts? This Pr is quite cr…	2.1 content: p: count: 21 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	2.1
Also I am trying to get this PR in, that youwill probably need t…	2.1 content: p: count: 21 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	2.1
I meant that maybe it would be better to have this one merged fi…	2.4 content: p: count: 24 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	2.4
@EresDev I'll run lots of tests and come back to you.	1.1 content: p: count: 11 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.1
@EresDev I just ran it against https://github.com/ubiquity/pay.u…	6.8 content: h2: count: 45 score: 1 p: count: 23 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	6.8
Latest QA (tests conducted with my fixes as well): - https://gi…	4.6 content: p: count: 9 score: 1 ul: count: 37 score: 0 li: count: 37 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	4.6
@EresDev on the issues I mentioned it should be yes. Actually ri…	2.4 content: p: count: 24 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	2.4

[ 670 WXDAI ]

@EresDev

Contributions Overview

View	Contribution	Count	Reward
Issue	Task	0.5	300
Review	Comment	26	370

Conversation Incentives

Comment	Formatting	Relevance	Reward
Resolves #26 ## QA - [part-1](https://github.com/EresDevOrg/ub…	0 content: p: count: 28 score: 1 h2: count: 1 score: 1 ul: count: 21 score: 0 li: count: 21 score: 1 a: count: 3 score: 1 h3: count: 4 score: 1 wordValue: 0 formattingMultiplier: 0	0.4	-
Ok, I was wondering what the benefit of switching to JSON would …	56.4 content: p: count: 141 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	56.4
Here is a quickly written sample prompt for jSON. I will improve…	5.2 content: p: count: 13 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 2	1	5.2
Here is the response using the above prompt. ![image](https://…	3.2 content: p: count: 8 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 2	1	3.2
I am about to finish the PR and this one is blocking. Let me kno…	31.6 content: p: count: 79 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	31.6
I went ahead and implemented this. Specifications and comments a…	16.8 content: p: count: 42 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	16.8
I think you asked for it in the comment Some of these configure…	22.8 content: p: count: 57 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	22.8
Here is an example of our use case. Seems like helping with data…	12.8 content: p: count: 32 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 2	1	12.8
resolved by using typebox	1.6 content: p: count: 4 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	1.6
Ok, any issues with evalutation or openai API propagates the exc…	8.4 content: p: count: 21 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	8.4
Yeah that was dumb of me. Some leftovers of strickly typed langu…	12.8 content: p: count: 32 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	12.8
So this is still present. Just to let you know. If it is importa…	10.4 content: p: count: 26 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	10.4
removed this line.	1.2 content: p: count: 3 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	1.2
fixed	0.4 content: p: count: 1 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	0.4
resolved by https://github.com/ubiquibot/conversation-rewards/pu…	4.4 content: p: count: 9 score: 1 a: count: 2 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	4.4
resolved by https://github.com/ubiquibot/conversation-rewards/pu…	4.4 content: p: count: 9 score: 1 a: count: 2 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	4.4
It wasn't just h5 missing from permit comment, there were other …	35.2 content: p: count: 86 score: 1 code: count: 2 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 2	1	35.2
Question: Review comments need `relevance: 1` as given…	17.2 content: p: count: 39 score: 1 code: count: 4 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	17.2
@gentlementlegen I see tests failing here, and on also the devel…	14.4 content: p: count: 36 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 2	1	14.4
I have made it work by changing the expected output.	4 content: p: count: 10 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	4
Thanks. I was thinking the same to start a new issue for releva…	34.8 content: p: count: 87 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	34.8
The "Should evaluate content" is covering this test case. The co…	18.8 content: p: count: 47 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	18.8
I am in the process of resolving conflicts. But I didn't get wha…	13.6 content: p: count: 34 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	13.6
I have run a 2nd round of tests. All seems ok except the precis…	28 content: p: count: 64 score: 1 img: count: 1 score: 0 ul: count: 3 score: 0 li: count: 3 score: 1 a: count: 3 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	28
The floating point problem is fixed https://github.com/EresDevOr…	6 content: p: count: 15 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	6
I have merged your PR. I believe everything is fixed for this PR…	5.6 content: p: count: 14 score: 1 wordValue: 0.2 formattingMultiplier: 2	1	5.6

[ 18.4 WXDAI ]

@whilefoo

Contributions Overview

View	Contribution	Count	Reward
Review	Comment	9	18.4

Conversation Incentives

Comment	Formatting	Relevance	Reward
it'd be a good idea to merge latest changes because there were s…	2.7 content: p: count: 26 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	2.7
```suggestion const commentType = Type.Union([...Ob…	1.8 content: pre: count: 10 score: 0 code: count: 10 score: 1 p: count: 8 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.8
Why is casting to string needed?	0.6 content: p: count: 6 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.6
seems weird to return `{}`, if AI evaluation failed we s…	1.5 content: p: count: 14 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.5
it might be a good idea to validate the response with typebox in…	1.8 content: p: count: 18 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.8
For example one time I made a prompt that instructed it to outpu…	7.1 content: p: count: 70 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	7.1
usually typebox schemas are in types folder	0.7 content: p: count: 7 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.7
Consider using `Value.Decode` because it checks the valu…	1.9 content: p: count: 18 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.9
Then it's fine	0.3 content: p: count: 3 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.3

[ 10.1 WXDAI ]

@0x4007

Contributions Overview

View	Contribution	Count	Reward
Review	Comment	6	10.1

Conversation Incentives

Comment	Formatting	Relevance	Reward
I think there's an official way to tell the API to return a JSON…	3.5 content: p: count: 35 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	3.5
Reference docs https://community.openai.com/t/how-do-i-use-the-n…	0.3 content: p: count: 3 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.3
I would imagine that relevance scoring should be a number.	1 content: p: count: 10 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1
Can you provide an example of what types of mistakes typebox can…	1.4 content: p: count: 14 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.4
Defaults should be set to `0` value.	0.8 content: p: count: 7 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.8
Good catch on this task. Why don't you address that in a new pul…	3.1 content: p: count: 31 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	3.1
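
Reading the Formatting column in the tables above, each reward appears consistent with summing `count × score` over the listed HTML tags, then multiplying by `wordValue`, `formattingMultiplier`, and the relevance. This is inferred from the displayed figures, not taken from the bot's source, and the type and function names below are hypothetical.

```typescript
// Hypothetical shape of one entry in the "Formatting" column (not the bot's actual types).
interface TagStat {
  count: number; // value shown as "count" for this HTML tag
  score: number; // per-tag score shown in the table (0 disables the tag)
}

interface FormattingDetails {
  content: Record<string, TagStat>;
  wordValue: number;
  formattingMultiplier: number;
}

// Assumed formula: sum(count * score) * wordValue * formattingMultiplier * relevance.
function commentReward(details: FormattingDetails, relevance: number): number {
  let weighted = 0;
  for (const tag of Object.keys(details.content)) {
    const stat = details.content[tag];
    weighted += stat.count * stat.score;
  }
  return weighted * details.wordValue * details.formattingMultiplier * relevance;
}

// The specification row: p 20, ul 90 (score 0), li 90, code 2, wordValue 0.1, multiplier 3.
const specification: FormattingDetails = {
  content: {
    p: { count: 20, score: 1 },
    ul: { count: 90, score: 0 },
    li: { count: 90, score: 1 },
    code: { count: 2, score: 1 },
  },
  wordValue: 0.1,
  formattingMultiplier: 3,
};

// Approximately 33.6 (up to floating-point error), matching the table row.
console.log(commentReward(specification, 1).toFixed(2));
```

The same arithmetic reproduces other rows, for example `141 × 0.2 × 2 = 56.4` for the first long review comment by @EresDev.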

ubiquibot · 2024-08-13T04:55:44Z

[ 7.9 WXDAI ]

@0x4007

Contributions Overview

View	Contribution	Count	Reward
Review	Comment	2	7.9

Conversation Incentives

Comment	Formatting	Relevance	Reward
> The bot scores 1 to tags not html listed in config. Is this...	1.7 code: count: 1 score: "1" words: 1	0.67	1.7
> Question: Review comments need `relevance: 1` as gi...	6.2 code: count: 3 score: "3" words: 6	0.78	6.2

[ 474.8 WXDAI ]

@gentlementlegen

Contributions Overview

View	Contribution	Count	Reward
Issue	Specification	1	62
Issue	Task	0.5	600
Review	Comment	8	75.2
Review	Comment	8	37.6

Conversation Incentives

Comment	Formatting	Relevance	Reward
We should keep comparing the old bot results with the new one, t...	62 li: count: 16 score: "16" words: 187 code: count: 2 score: "2" words: 2	1	62
From my understanding, any comment within the pull request shoul...	11.6 code: count: 1 score: "2" words: 1	0.76	11.6
@EresDev Could you please fix the conflicts? This Pr is quite cr...	4.2	0.76	4.2
Also I am trying to get this PR in, that youwill probably need t...	5.6	0.68	5.6
> > Also I am trying to get this PR in, that youwill proba...	4.8	0.76	4.8
@EresDev I'll run lots of tests and come back to you....	2.4	0.79	2.4
@EresDev I just ran it against https://github.com/ubiquity/pay.u...	20.4 hr: count: 1 score: "2" words: 0	0.81	20.4
Latest QA (tests conducted with my fixes as well): - https://gi...	21.2 li: count: 3 score: "6" words: 67	0.86	21.2
@EresDev on the issues I mentioned it should be yes. Actually ri...	5	0.84	5
From my understanding, any comment within the pull request shoul...	5.8 code: count: 1 score: "1" words: 1	0.76	5.8
@EresDev Could you please fix the conflicts? This Pr is quite cr...	2.1	0.76	2.1
Also I am trying to get this PR in, that youwill probably need t...	2.8	0.68	2.8
> > Also I am trying to get this PR in, that youwill proba...	2.4	0.76	2.4
@EresDev I'll run lots of tests and come back to you....	1.2	0.79	1.2
@EresDev I just ran it against https://github.com/ubiquity/pay.u...	10.2 hr: count: 1 score: "1" words: 0	0.81	10.2
Latest QA (tests conducted with my fixes as well): - https://gi...	10.6 li: count: 3 score: "3" words: 67	0.86	10.6
@EresDev on the issues I mentioned it should be yes. Actually ri...	2.5	0.84	2.5

[ 418.2 WXDAI ]

@EresDev

Contributions Overview

View	Contribution	Count	Reward
Issue	Task	0.5	600
Review	Comment	10	59.1
Review	Comment	10	59.1

Conversation Incentives

Comment	Formatting	Relevance	Reward
It wasn't just h5 missing from permit comment, there were other ...	9.7 code: count: 1 score: "1" words: 1	0.74	9.7
Question: Review comments need `relevance: 1` as given...	7.8 code: count: 3 score: "3" words: 6	0.74	7.8
@gentlementlegen I see tests failing here, and on also the devel...	4.3	0.74	4.3
> @gentlementlegen I see tests failing here, and on also the ...	1	0.68	1
> > Question: Review comments need `relevance: 1` ...	12.6 code: count: 3 score: "3" words: 6	0.74	12.6
> consider adding a test case which tests fixed relevance ...	4.8	0.81	4.8
> Also I am trying to get this PR in, that youwill probably n...	3.5	0.79	3.5
I have run a 2nd round of tests. All seems ok except the precis...	11.7 a: count: 2 score: "2" words: 4 li: count: 2 score: "2" words: 30	0.71	11.7
> All seems ok except the precision float in the value of for...	2.3	0.81	2.3
> Since this is urgently needed I opened a PR against your re...	1.4	0.88	1.4
It wasn't just h5 missing from permit comment, there were other ...	9.7 code: count: 1 score: "1" words: 1	0.74	9.7
Question: Review comments need `relevance: 1` as given...	7.8 code: count: 3 score: "3" words: 6	0.74	7.8
@gentlementlegen I see tests failing here, and on also the devel...	4.3	0.74	4.3
> @gentlementlegen I see tests failing here, and on also the ...	1	0.68	1
> > Question: Review comments need `relevance: 1` ...	12.6 code: count: 3 score: "3" words: 6	0.74	12.6
> consider adding a test case which tests fixed relevance ...	4.8	0.81	4.8
> Also I am trying to get this PR in, that youwill probably n...	3.5	0.79	3.5
I have run a 2nd round of tests. All seems ok except the precis...	11.7 a: count: 2 score: "2" words: 4 li: count: 2 score: "2" words: 30	0.71	11.7
> All seems ok except the precision float in the value of for...	2.3	0.81	2.3
> Since this is urgently needed I opened a PR against your re...	1.4	0.88	1.4

[ 0.4 WXDAI ]

@whilefoo

Contributions Overview

View	Contribution	Count	Reward
Review	Comment	1	0.4

Conversation Incentives

Comment	Formatting	Relevance	Reward
> The "Should evaluate content" is covering this test case. T...	0.4	0.79	0.4

0x4007 · 2024-08-13T17:43:29Z

@gentlementlegen @EresDev what do you guys think about the rewards comparing old to new? Old suggests you guys put in similar work, and new suggests that eres put in almost double. I'm under the impression that eres did put in about double the work (or more technically speaking, provided more value towards the completion of this project) but curious to hear your opinions.

I think we should start gently adjusting levers to align as accurately as possible for this.

gentlementlegen · 2024-08-13T18:23:45Z

@0x4007 Biggest difference is that the new one picked up all the comments within the pull-request (26 of them) against the old version only seeing 10. Some multipliers seem to differ (e.g. code has a value of 2 in the old version, and a value of 1 in the new). Plus, all of these comments are counted with a relevance of 1, which was one of the latest changes. This should also be adjusted after #79 is merged, which might make the results more accurate.

EresDev · 2024-08-14T10:51:21Z

> Old suggests you guys put in similar work, and new suggests that eres put in almost double. I'm under the impression that eres did put in about double the work (or more technically speaking, provided more value towards the completion of this project) but curious to hear your opinions.

I think this is true. The old & new bot both divide the task equally. The new ubiquibot provides some relief with comments incentive. But I see distribution of task incentives should be improved.

0x4007 · 2024-08-15T01:00:25Z

Curious to hear if you guys have any proposals to solve for this scenario.

gentlementlegen · 2024-08-15T05:21:43Z

Let's see if analyzing the code with #79 helps get more accurate results about the improvements introduced by the pull-request and its comments.

0x4007 · 2024-08-16T05:00:46Z

> But I see distribution of task incentives should be improved.

Not all lines of code are equal so I think this is tough to solve for. We could consider compiling the diff from every code contributor and then ask ChatGPT how impactful the sum total of their changes are in order to try and be more nuanced with the task reward.
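
One way to picture a more nuanced task reward, purely as a sketch: if some evaluation (ChatGPT or otherwise) produced an impact weight per contributor, the task reward could be split in proportion to those weights instead of equally. The weights, the contributor names, and the equal-split fallback below are assumptions for illustration. Nothing here is implemented in the bot.

```typescript
// Splits a task reward in proportion to hypothetical per-contributor impact weights.
// Negative weights are treated as 0. If every weight is 0, the split is equal.
function splitTaskReward(total: number, impact: Record<string, number>): Record<string, number> {
  const users = Object.keys(impact);
  let sum = 0;
  for (const user of users) {
    sum += Math.max(impact[user], 0);
  }
  const shares: Record<string, number> = {};
  for (const user of users) {
    shares[user] = sum > 0 ? (total * Math.max(impact[user], 0)) / sum : total / users.length;
  }
  return shares;
}

// Illustrative weights only: one contributor judged twice as impactful as the other.
const shares = splitTaskReward(600, { contributorA: 2, contributorB: 1 });
console.log(shares); // { contributorA: 400, contributorB: 200 }
```

With equal weights this reduces to the equal split that both bots currently apply to the task reward.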

EresDev · 2024-08-16T10:05:05Z

> Curious to hear if you guys have any proposals to solve for this scenario.

I have been thinking about this but couldn't find a decent solution. What's already present is the best solution I think. Dividing it equally is ok. Unfair distribution will be rare. The second developer usually joins for reasons that make the distribution fair.

> We could consider compiling the diff from every code contributor and then ask ChatGPT how impactful the sum total of their changes are to try and be more nuanced with the task reward.

I have concerns that relying on ChatGPT for this will result in a higher occurrence of unfair distributions. But it is something that can be tried. Some pull requests are huge, and sending their code to OpenAI may hit OpenAI's token or API limits.

What is better than ChatGPT is having a solid formula to measure. I couldn't find one so far.

EresDev · 2024-08-16T10:42:59Z

I haven't been able to claim the reward of this issue so far because by the time I got here, permit wallets were empty for ubiquibot upgrade. They are still empty, even though claims are working on other issues. Closing and reopening this issue will also generate a higher permit for me based on the new ubiquibot, I think. Please suggest a solution to this. @0x4007

ubiquibot · 2024-08-16T10:44:00Z

+ Evaluating results. Please wait...

0x4007 · 2024-08-16T10:44:00Z

Just take the new permit then

ubiquityos · 2024-08-16T10:44:07Z

[ 361.24 WXDAI ]

@gentlementlegen

Contributions Overview

View	Contribution	Count	Reward
Issue	Task	0.5	300
Issue	Specification	1	33.6
Issue	Comment	2	19.24
Review	Comment	13	8.4

Conversation Incentives

Comment	Formatting	Relevance	Reward
We should keep comparing the old bot results with the new one, t…	33.6 content: p: count: 20 score: 1 ul: count: 90 score: 0 li: count: 90 score: 1 code: count: 2 score: 1 wordValue: 0.1 formattingMultiplier: 3	1	33.6
@0x4007 Biggest difference is that the new one picked up all the…	17.8 content: p: count: 85 score: 1 code: count: 4 score: 1 wordValue: 0.2 formattingMultiplier: 1	0.9	16.02
Let's see if analyzing the code with https://github.com/ubiquibo…	4.6 content: p: count: 23 score: 1 wordValue: 0.2 formattingMultiplier: 1	0.7	3.22
Maybe these are not needed anymore.	0.15 content: p: count: 6 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.15
Wouldn't this be redundant in the configuration?	0.175 content: p: count: 7 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.175
Maybe remove logs or use the logger	0.175 content: p: count: 7 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.175
The nesting is die to `userExtractor` being its own modu…	0.95 content: p: count: 36 score: 1 code: count: 2 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.95
It should not be but according to GitHub types `body` ca…	0.4 content: p: count: 14 score: 1 code: count: 2 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.4
From my understanding, any comment within the pull request shoul…	1.175 content: p: count: 46 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	1.175
@EresDev Could you please fix the conflicts? This Pr is quite cr…	0.525 content: p: count: 21 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.525
Also I am trying to get this PR in, that youwill probably need t…	0.525 content: p: count: 21 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.525
I meant that maybe it would be better to have this one merged fi…	0.6 content: p: count: 24 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.6
@EresDev I'll run lots of tests and come back to you.	0.275 content: p: count: 11 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.275
@EresDev I just ran it against https://github.com/ubiquity/pay.u…	1.7 content: h2: count: 45 score: 1 p: count: 23 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	1.7
Latest QA (tests conducted with my fixes as well): - https://gi…	1.15 content: p: count: 9 score: 1 ul: count: 37 score: 0 li: count: 37 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	1.15
@EresDev on the issues I mentioned it should be yes. Actually ri…	0.6 content: p: count: 24 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.6

[ 302.53 WXDAI ]

@EresDev

Contributions Overview

View	Contribution	Count	Reward
Issue	Task	0.5	300
Issue	Comment	3	2.53
Review	Comment	27	0

Conversation Incentives

Comment	Formatting	Relevance	Reward
I think this is true. The old & new bot both divide the task…	0.85 content: p: count: 34 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	0.6	0.51
I have been thinking about this but couldn't find a decent solut…	2.65 content: p: count: 106 score: 1 br: count: 1 score: 0 wordValue: 0.1 formattingMultiplier: 0.25	0.7	1.855
I haven't been able to claim the reward of this issue so far bec…	1.65 content: p: count: 66 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	0.1	0.165
Resolves #26 ## QA - [part-1](https://github.com/EresDevOrg/ub…	0 content: p: count: 28 score: 1 h2: count: 1 score: 1 ul: count: 21 score: 0 li: count: 21 score: 1 a: count: 3 score: 1 h3: count: 4 score: 1 wordValue: 0 formattingMultiplier: 0	0.5	-
Ok, I was wondering what the benefit of switching to JSON would …	0 content: p: count: 141 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
Here is a quickly written sample prompt for jSON. I will improve…	0 content: p: count: 13 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 0	1	-
Here is the response using the above prompt. ![image](https://…	0 content: p: count: 8 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 0	1	-
I am about to finish the PR and this one is blocking. Let me kno…	0 content: p: count: 79 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
I went ahead and implemented this. Specifications and comments a…	0 content: p: count: 42 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
I think you asked for it in the comment Some of these configure…	0 content: p: count: 57 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
Here is an example of our use case. Seems like helping with data…	0 content: p: count: 32 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 0	1	-
resolved by using typebox	0 content: p: count: 4 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
Ok, any issues with evalutation or openai API propagates the exc…	0 content: p: count: 21 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
Yeah that was dumb of me. Some leftovers of strickly typed langu…	0 content: p: count: 32 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
So this is still present. Just to let you know. If it is importa…	0 content: p: count: 26 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
removed this line.	0 content: p: count: 3 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
fixed	0 content: p: count: 1 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
resolved by https://github.com/ubiquibot/conversation-rewards/pu…	0 content: p: count: 9 score: 1 a: count: 2 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
resolved by https://github.com/ubiquibot/conversation-rewards/pu…	0 content: p: count: 9 score: 1 a: count: 2 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
The requirement was to set the relevance of issue specifications…	0 content: p: count: 94 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
It wasn't just h5 missing from permit comment, there were other …	0 content: p: count: 86 score: 1 code: count: 2 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 0	1	-
Question: Review comments need `relevance: 1` as given…	0 content: p: count: 39 score: 1 code: count: 4 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
@gentlementlegen I see tests failing here, and on also the devel…	0 content: p: count: 36 score: 1 img: count: 1 score: 0 wordValue: 0.2 formattingMultiplier: 0	1	-
I have made it work by changing the expected output.	0 content: p: count: 10 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
Thanks. I was thinking the same to start a new issue for releva…	0 content: p: count: 87 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
The "Should evaluate content" is covering this test case. The co…	0 content: p: count: 47 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
I am in the process of resolving conflicts. But I didn't get wha…	0 content: p: count: 34 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
I have run a 2nd round of tests. All seems ok except the precis…	0 content: p: count: 64 score: 1 img: count: 1 score: 0 ul: count: 3 score: 0 li: count: 3 score: 1 a: count: 3 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
The floating point problem is fixed https://github.com/EresDevOr…	0 content: p: count: 15 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-
I have merged your PR. I believe everything is fixed for this PR…	0 content: p: count: 14 score: 1 wordValue: 0.2 formattingMultiplier: 0	1	-

[ 24.57 WXDAI ]

@0x4007

Contributions Overview

View	Contribution	Count	Reward
Issue	Comment	4	10.37
Review	Comment	9	14.2

Conversation Incentives

Comment	Formatting	Relevance	Reward
@gentlementlegen @EresDev what do you guys think about the rewar…	7.9 content: p: count: 79 score: 1 wordValue: 0.1 formattingMultiplier: 1	0.8	6.32
Curious to hear if you guys have any proposals to solve for this…	1.4 content: p: count: 14 score: 1 wordValue: 0.1 formattingMultiplier: 1	0.3	0.42
Not all lines of code are equal so I think this is tough to solv…	5.1 content: p: count: 51 score: 1 wordValue: 0.1 formattingMultiplier: 1	0.7	3.57
Just take the new permit then	0.6 content: p: count: 6 score: 1 wordValue: 0.1 formattingMultiplier: 1	0.1	0.06
I think there's an official way to tell the API to return a JSON…	3.5 content: p: count: 35 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	3.5
Reference docs https://community.openai.com/t/how-do-i-use-the-n…	0.3 content: p: count: 3 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.3
I would imagine that relevance scoring should be a number.	1 content: p: count: 10 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1
Can you provide an example of what types of mistakes typebox can…	1.4 content: p: count: 14 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.4
Sets these to relevance 1? This is not clear to me.	1.1 content: p: count: 11 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.1
What is this? We should just make it a single property without n…	1.4 content: p: count: 14 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.4
I'm surprised that empty comments have been found in testing. I …	1.6 content: p: count: 16 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	1.6
Defaults should be set to `0` value.	0.8 content: p: count: 7 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	0.8
Good catch on this task. Why don't you address that in a new pul…	3.1 content: p: count: 31 score: 1 wordValue: 0.1 formattingMultiplier: 1	1	3.1

[ 4.6 WXDAI ]

@whilefoo

Contributions Overview

View	Contribution	Count	Reward
Review	Comment	9	4.6

Conversation Incentives

Comment	Formatting	Relevance	Reward
it'd be a good idea to merge latest changes because there were s…	0.675 content: p: count: 26 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.675
```suggestion const commentType = Type.Union([...Ob…	0.45 content: pre: count: 10 score: 0 code: count: 10 score: 1 p: count: 8 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.45
Why is casting to string needed?	0.15 content: p: count: 6 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.15
seems weird to return `{}`, if AI evaluation failed we s…	0.375 content: p: count: 14 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.375
it might be a good idea to validate the response with typebox in…	0.45 content: p: count: 18 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.45
For example one time I made a prompt that instructed it to outpu…	1.775 content: p: count: 70 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	1.775
usually typebox schemas are in types folder	0.175 content: p: count: 7 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.175
Consider using `Value.Decode` because it checks the valu…	0.475 content: p: count: 18 score: 1 code: count: 1 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.475
Then it's fine	0.075 content: p: count: 3 score: 1 wordValue: 0.1 formattingMultiplier: 0.25	1	0.075

ubiquibot · 2024-08-16T10:44:40Z

[ 23 WXDAI ]

@0x4007

Contributions Overview

View	Contribution	Count	Reward
Issue	Comment	4	15.1
Review	Comment	2	7.9

Conversation Incentives

Comment	Formatting	Relevance	Reward
@gentlementlegen @EresDev what do you guys think about the rewar...	8	0.805	8
Curious to hear if you guys have any proposals to solve for this...	1.4	0.675	1.4
> But I see distribution of task incentives should be improve...	5.1	0.665	5.1
Just take the new permit then ...	0.6	0.635	0.6
> The bot scores 1 to tags not html listed in config. Is this...	1.7 code: count: 1 score: "1" words: 1	0.73	1.7
> Question: Review comments need `relevance: 1` as gi...	6.2 code: count: 3 score: "3" words: 6	0.8	6.2

[ 504 WXDAI ]

@gentlementlegen

Contributions Overview

View	Contribution	Count	Reward
Issue	Specification	1	62
Issue	Task	0.5	600
Issue	Comment	2	29.2
Issue	Comment	2	0
Review	Comment	8	75.2
Review	Comment	8	37.6

Conversation Incentives

Comment	Formatting	Relevance	Reward
We should keep comparing the old bot results with the new one, t...	62 li: count: 16 score: "16" words: 187 code: count: 2 score: "2" words: 2	1	62
@0x4007 Biggest difference is that the new one picked up all the...	22.8 code: count: 4 score: "4" words: 4	0.855	22.8
Let's see if analyzing the code with https://github.com/ubiquibo...	6.4	0.675	6.4
@0x4007 Biggest difference is that the new one picked up all the...	- code: count: 4 score: "0" words: 4	0.855	-
Let's see if analyzing the code with https://github.com/ubiquibo...	-	0.675	-
From my understanding, any comment within the pull request shoul...	11.6 code: count: 1 score: "2" words: 1	0.84	11.6
@EresDev Could you please fix the conflicts? This Pr is quite cr...	4.2	0.75	4.2
Also I am trying to get this PR in, that youwill probably need t...	5.6	0.86	5.6
> > Also I am trying to get this PR in, that youwill proba...	4.8	0.75	4.8
@EresDev I'll run lots of tests and come back to you....	2.4	0.81	2.4
@EresDev I just ran it against https://github.com/ubiquity/pay.u...	20.4 hr: count: 1 score: "2" words: 0	0.79	20.4
Latest QA (tests conducted with my fixes as well): - https://gi...	21.2 li: count: 3 score: "6" words: 67	0.81	21.2
@EresDev on the issues I mentioned it should be yes. Actually ri...	5	0.9	5
From my understanding, any comment within the pull request shoul...	5.8 code: count: 1 score: "1" words: 1	0.84	5.8
@EresDev Could you please fix the conflicts? This Pr is quite cr...	2.1	0.75	2.1
Also I am trying to get this PR in, that youwill probably need t...	2.8	0.86	2.8
> > Also I am trying to get this PR in, that youwill proba...	2.4	0.75	2.4
@EresDev I'll run lots of tests and come back to you....	1.2	0.81	1.2
@EresDev I just ran it against https://github.com/ubiquity/pay.u...	10.2 hr: count: 1 score: "1" words: 0	0.79	10.2
Latest QA (tests conducted with my fixes as well): - https://gi...	10.6 li: count: 3 score: "3" words: 67	0.81	10.6
@EresDev on the issues I mentioned it should be yes. Actually ri...	2.5	0.9	2.5

[ 439.1 WXDAI ]

@EresDev

Contributions Overview

View	Contribution	Count	Reward
Issue	Task	0.5	600
Issue	Comment	3	0
Issue	Comment	3	20.9
Review	Comment	10	59.1
Review	Comment	10	59.1

Conversation Incentives

Comment	Formatting	Relevance	Reward
> Old suggests you guys put in similar work, and new suggests...	-	0.68	-
> Curious to hear if you guys have any proposals to solve for...	-	0.55	-
I haven't been able to claim the reward of this issue so far bec...	-	0.575	-
> Old suggests you guys put in similar work, and new suggests...	3.3	0.68	3.3
> Curious to hear if you guys have any proposals to solve for...	10.9	0.55	10.9
I haven't been able to claim the reward of this issue so far bec...	6.7	0.575	6.7
It wasn't just h5 missing from permit comment, there were other ...	9.7 code: count: 1 score: "1" words: 1	0.74	9.7
Question: Review comments need `relevance: 1` as given...	7.8 code: count: 3 score: "3" words: 6	0.83	7.8
@gentlementlegen I see tests failing here, and on also the devel...	4.3	0.73	4.3
> @gentlementlegen I see tests failing here, and on also the ...	1	0.68	1
> > Question: Review comments need `relevance: 1` ...	12.6 code: count: 3 score: "3" words: 6	0.71	12.6
> consider adding a test case which tests fixed relevance ...	4.8	0.81	4.8
> Also I am trying to get this PR in, that youwill probably n...	3.5	0.84	3.5
I have run a 2nd round of tests. All seems ok except the precis...	11.7 a: count: 2 score: "2" words: 4 li: count: 2 score: "2" words: 30	0.75	11.7
> All seems ok except the precision float in the value of for...	2.3	0.87	2.3
> Since this is urgently needed I opened a PR against your re...	1.4	0.86	1.4
It wasn't just h5 missing from permit comment, there were other ...	9.7 code: count: 1 score: "1" words: 1	0.74	9.7
Question: Review comments need `relevance: 1` as given...	7.8 code: count: 3 score: "3" words: 6	0.83	7.8
@gentlementlegen I see tests failing here, and on also the devel...	4.3	0.73	4.3
> @gentlementlegen I see tests failing here, and on also the ...	1	0.68	1
> > Question: Review comments need `relevance: 1` ...	12.6 code: count: 3 score: "3" words: 6	0.71	12.6
> consider adding a test case which tests fixed relevance ...	4.8	0.81	4.8
> Also I am trying to get this PR in, that youwill probably n...	3.5	0.84	3.5
I have run a 2nd round of tests. All seems ok except the precis...	11.7 a: count: 2 score: "2" words: 4 li: count: 2 score: "2" words: 30	0.75	11.7
> All seems ok except the precision float in the value of for...	2.3	0.87	2.3
> Since this is urgently needed I opened a PR against your re...	1.4	0.86	1.4

[ 0.4 WXDAI ]

@whilefoo

Contributions Overview

View	Contribution	Count	Reward
Review	Comment	1	0.4

Conversation Incentives

Comment	Formatting	Relevance	Reward
> The "Should evaluate content" is covering this test case. T...	0.4	0.84	0.4

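As a cross-check of the three Contributions Overview tables above, here is a minimal sketch that recomputes each headline total from its rows. It assumes, based only on the numbers shown, that the Task row lists the full task price and that its Count (0.5) is the share actually paid, while every other row's Reward is added as-is. The names `OVERVIEWS` and `total_reward` are hypothetical and are not part of the bot.

```python
from decimal import Decimal

# Rows copied from the Contributions Overview tables above:
# (view, contribution, count, reward). Values are kept as strings so that
# Decimal arithmetic stays exact.
OVERVIEWS = {
    "gentlementlegen": [
        ("Issue", "Specification", "1", "62"),
        ("Issue", "Task", "0.5", "600"),
        ("Issue", "Comment", "2", "29.2"),
        ("Issue", "Comment", "2", "0"),
        ("Review", "Comment", "8", "75.2"),
        ("Review", "Comment", "8", "37.6"),
    ],
    "EresDev": [
        ("Issue", "Task", "0.5", "600"),
        ("Issue", "Comment", "3", "0"),
        ("Issue", "Comment", "3", "20.9"),
        ("Review", "Comment", "10", "59.1"),
        ("Review", "Comment", "10", "59.1"),
    ],
    "whilefoo": [
        ("Review", "Comment", "1", "0.4"),
    ],
}


def total_reward(rows):
    """Sum the rewards of one contributor's overview rows."""
    total = Decimal("0")
    for _view, contribution, count, reward in rows:
        amount = Decimal(reward)
        if contribution == "Task":
            # Assumption: the Task row shows the full price and Count is the
            # contributor's share of it.
            amount *= Decimal(count)
        total += amount
    return total


if __name__ == "__main__":
    for user, rows in OVERVIEWS.items():
        print(user, total_reward(rows))
```

Under that assumption the sums match the totals posted by the bot (504, 439.1 and 0.4 WXDAI), which also suggests that the repeated Comment rows are separate line items and not display duplicates.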
0x4007 · 2024-08-16T10:44:47Z

> Some pull requests are huge and sending their code to OpenAI probably may hit OpenAI token or API limit.

Unlikely. The new context lengths are crazy large
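
To make the concern above concrete, here is a rough sketch of a pre-flight size check for a pull request diff. It is not how the bot works. It assumes a rule of thumb of about four characters per token, which is only an approximation (real tokenizers differ, especially on code), and the 128,000-token limit and 4,000-token reply reserve are hypothetical defaults that would need to be replaced with the limits of the model actually used.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of a text.

    Assumption: about four characters per token. This is a heuristic, not a
    real tokenizer.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    text: str,
    context_limit: int = 128_000,      # hypothetical model context size
    reserved_for_reply: int = 4_000,   # hypothetical room for the answer
) -> bool:
    """Return True if the estimated prompt size leaves room for a reply."""
    return estimate_tokens(text) + reserved_for_reply <= context_limit


if __name__ == "__main__":
    small_diff = "+ const relevance = 1;\n" * 100
    huge_diff = "+ const relevance = 1;\n" * 100_000
    print(fits_in_context(small_diff))
    print(fits_in_context(huge_diff))
```

With these assumed numbers, only diffs in the range of several hundred thousand characters would fail the check, which is consistent with the reply that hitting the limit is unlikely.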

gentlementlegen added the Time: <1 Day and Priority: 3 (High) labels Jul 7, 2024

ubiquibot bot added the Price: 600 USD label Jul 7, 2024

ubiquibot bot mentioned this issue Jul 7, 2024

Analyze and correct eventual discrepancies with the old bot ubiquity/devpool-directory#1228

Closed

ubiquibot bot assigned EresDev Jul 9, 2024

gentlementlegen mentioned this issue Jul 11, 2024

Post a waiting message to show that the conversation is being processed #29

Closed

EresDev mentioned this issue Jul 11, 2024

PR: correct discrepancies with the old bot #55

Merged

This was referenced Jul 17, 2024

Include review comments in the final result #61

Closed

Removing permit from URL on mobile only ubiquity/pay.ubq.fi#256

Closed

gentlementlegen mentioned this issue Jul 24, 2024

Add a runs_on section to the configuration and the manifests ubiquity-os/ubiquity-os-kernel#73

Closed

gentlementlegen self-assigned this Aug 4, 2024

gentlementlegen closed this as completed in #55 Aug 13, 2024

0x4007 reopened this Aug 16, 2024

0x4007 closed this as completed Aug 16, 2024

test-app-ubo bot mentioned this issue Nov 17, 2024

Consolidate comments sshivaditya2019/test-public#210

Open


