Leaderboard 2.0: added performance x n_parameters plot + more benchmark info #1437

Merged
merged 9 commits into from
Nov 12, 2024

x-tabdeveloping
Collaborator

I added an interactive performance vs. number of parameters plot as the first thing people see when selecting a benchmark (#1396).
I also added some info on the benchmarks to the benchmark description, as Niklas requested in #1317.

Here's a screenshot:
image

I also bumped the Gradio version, as I thought it might fix certain things, but I still have two pressing problems, for which I opened issues in Gradio (gradio-app/gradio#9938, gradio-app/gradio#9937).

Collaborator

@isaac-chung isaac-chung left a comment

This is beautiful! Nice work. I love the current layout.
I see that there's already an open issue on the formatting. Hope we get a response soon.

@Samoed
Collaborator

Samoed commented Nov 11, 2024

That's very beautiful!
Here are a few small UI suggestions: maybe move the citation section to the bottom of the page (it could even be collapsible), and switch the order of the table and plot so the table appears at the top, after the search bar. What do you think?

@x-tabdeveloping
Collaborator Author

Hey @Samoed, thanks! I have been there :D After deliberation I thought having the citation up close to the benchmark description makes more sense, since it is more visually linked to the specific benchmark and also fills a gap that would otherwise be there. I also prefer having the plot before the table, since it communicates the same information while being easier to interpret visually.

I'm open to changing it if enough people think we should rearrange things.

Contributor

@Muennighoff Muennighoff left a comment

Looks amazing!

@x-tabdeveloping
Collaborator Author

@Muennighoff @Samoed @isaac-chung @KennethEnevoldsen I would also like to hear your take on whether we should default to a dark or light theme, because if we want to go dark I can also make the plot with a dark background and light text.

@Samoed
Collaborator

Samoed commented Nov 12, 2024

I thought Gradio used the system theme by default, which I think is the better option. If not, I would prefer a dark theme.

@x-tabdeveloping
Collaborator Author

Alright, we can stick with the default. It just looks a bit weird to have a light plot against a dark background and vice versa.

@x-tabdeveloping x-tabdeveloping merged commit 76c2112 into main Nov 12, 2024
10 checks passed
@Muennighoff
Contributor

FYI, I somehow got the error below when trying to start the LB in a Space, but maybe it's just me?

Traceback (most recent call last):
  File "/home/user/app/app.py", line 5, in <module>
    from mteb.leaderboard.app import demo
  File "/usr/local/lib/python3.10/site-packages/mteb/leaderboard/__init__.py", line 3, in <module>
    from mteb.leaderboard.app import demo
  File "/usr/local/lib/python3.10/site-packages/mteb/leaderboard/app.py", line 79, in <module>
    summary_table, per_task_table = scores_to_tables(default_scores)
  File "/usr/local/lib/python3.10/site-packages/mteb/leaderboard/table.py", line 138, in scores_to_tables
    model_metas.map(lambda m: format_n_parameters(m.n_parameters)),
  File "/usr/local/lib/python3.10/site-packages/pandas/core/series.py", line 4700, in map
    new_values = self._map_values(arg, na_action=na_action)
  File "/usr/local/lib/python3.10/site-packages/pandas/core/base.py", line 921, in _map_values
    return algorithms.map_array(arr, mapper, na_action=na_action, convert=convert)
  File "/usr/local/lib/python3.10/site-packages/pandas/core/algorithms.py", line 1743, in map_array
    return lib.map_infer(values, mapper, convert=convert)
  File "lib.pyx", line 2972, in pandas._libs.lib.map_infer
  File "/usr/local/lib/python3.10/site-packages/mteb/leaderboard/table.py", line 138, in <lambda>
    model_metas.map(lambda m: format_n_parameters(m.n_parameters)),
  File "/usr/local/lib/python3.10/site-packages/mteb/leaderboard/table.py", line 36, in format_n_parameters
    n_zeros = math.log10(n_million)
ValueError: math domain error

@x-tabdeveloping
Collaborator Author

Hmm, strange. Maybe some model had a model size of -1 or None? Can you open an issue for this? @Muennighoff

@x-tabdeveloping
Collaborator Author

Nvm I got this, will fix in next PR
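
For anyone hitting the same traceback: `math.log10` raises `ValueError` for zero or negative input, so a parameter count that is missing, a placeholder such as -1, or small enough to floor to zero million would trigger it. The following is only a minimal sketch of a guarded formatter, not the fix that was actually merged; the name `format_n_parameters_safe`, the `"<1M"` label, and the empty-string fallback are assumptions made for illustration.

```python
import math
from typing import Optional


def format_n_parameters_safe(n_parameters: Optional[float]) -> str:
    """Format a parameter count as e.g. '110M' or '7.0B' (hypothetical helper)."""
    # Missing or placeholder sizes (None, NaN, 0, -1) get an empty label
    # instead of reaching any logarithm.
    if n_parameters is None:
        return ""
    n = float(n_parameters)
    if math.isnan(n) or n <= 0:
        return ""
    n_million = n / 1e6
    # Models under one million parameters would floor to 0 million,
    # which is the log10(0) case; label them explicitly (assumed label).
    if n_million < 1:
        return "<1M"
    # Compare magnitudes directly rather than via log10.
    if round(n_million) >= 1000:
        return f"{n_million / 1000:.1f}B"
    return f"{n_million:.0f}M"
```

Comparing against thresholds avoids the logarithm entirely, which removes the domain error rather than catching it. If the formatter is applied through `Series.map`, as in the traceback, the same guard covers `None` and NaN entries.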
