Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fixed win-rate calculation #13

Merged
merged 3 commits into from
Nov 18, 2023
Merged

fixed win-rate calculation #13

merged 3 commits into from
Nov 18, 2023

Conversation

weiweiy
Copy link
Contributor

@weiweiy weiweiy commented Nov 14, 2023

This new calculation down weight the same scores that appears later in the list.
use scenario based win-rate calculation

…es, switch to scenario based win-rate caculation
@weiweiy weiweiy requested review from lantiga, msaroufim and kaleab-k and removed request for lantiga, msaroufim and kaleab-k November 14, 2023 05:59
@weiweiy
Copy link
Contributor Author

weiweiy commented Nov 14, 2023

@msaroufim @lantiga , please check

Copy link
Collaborator

@lantiga lantiga left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A comment on the formula

elif lower_is_better and v < vv:
win_rate[i] += 1
elif v == vv:
win_rate[i] += 1.0/counts[v]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is subtly incorrect.

For the case in the comment the value 1 will take a win rate of 3/4, not 1/4.
That’s because 1/4 will be added to win rate 3 times out of four (only i == j will be skipped).

The correct expression here should be
win_rate[i] = 1.0 / (counts[v] * (counts[v] - 1))
if I’m not mistaken.

Copy link
Collaborator

@drisspg drisspg Nov 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this comment that the first element will get a score of 3/4.. but tbh from the comment I am not sure what score it should take, not sure what count(lower_rank) should equal too

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it needs to take a value that is lower the more ties there are. So if there are 4 ties, each should take 1/4.
Otherwise if you have 80 ties, you'll get 79/80 which is pretty high, while in terms of win-rate you should get a meh value.

@weiweiy
Copy link
Contributor Author

weiweiy commented Nov 18, 2023

@lantiga , made the change you suggest, merging this branch now. Thanks!

@weiweiy weiweiy merged commit d59dd1d into neurips_eval Nov 18, 2023
3 of 6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants