Test PR 2 #2

Open
wants to merge 52 commits into base: main

52 commits
8445a97
add bundle test
HarshTrivedi Nov 26, 2024
50c2f3c
minor
HarshTrivedi Nov 27, 2024
bc99ef7
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
64e3279
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
f3f5524
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
6d0f7ca
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
4c43b27
minor
HarshTrivedi Nov 27, 2024
f4c202f
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
441aa72
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
cbd8268
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
a3ae098
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
305f1b5
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
84efdb2
add leaderboard entry
github-actions[bot] Nov 27, 2024
6fb030f
Merge branch 'test-pr' of https://github.com/stonybrooknlp/appworld-l…
HarshTrivedi Nov 27, 2024
b8b6bb9
add leaderboard entry
github-actions[bot] Nov 27, 2024
c94562b
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
81cd42b
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
0efafeb
add leaderboard entry
github-actions[bot] Nov 27, 2024
f3cf617
minor
HarshTrivedi Nov 27, 2024
c6446e2
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 27, 2024
3c3e6da
add leaderboard entry
github-actions[bot] Nov 27, 2024
cdee508
add leaderboard entry
github-actions[bot] Nov 27, 2024
2c7e507
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 28, 2024
3e2654f
add leaderboard entry
github-actions[bot] Nov 28, 2024
d2ed522
add leaderboard entry
github-actions[bot] Nov 28, 2024
741cf77
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 28, 2024
13adc27
add leaderboard entry
github-actions[bot] Nov 28, 2024
d1594b6
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 28, 2024
26387c0
remove leaderboard entry
github-actions[bot] Nov 28, 2024
5e5f800
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 28, 2024
4bbf0b1
add leaderboard entry
github-actions[bot] Nov 28, 2024
4d51326
remove leaderboard entry
github-actions[bot] Nov 28, 2024
f7deb38
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 28, 2024
0d21a00
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 28, 2024
3ebbfa4
add leaderboard entry
github-actions[bot] Nov 28, 2024
48edacc
remove leaderboard entry
github-actions[bot] Nov 28, 2024
efa6db7
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
ffdb401
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
d16957b
minor
HarshTrivedi Nov 29, 2024
dffc60a
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
2c5ec4d
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
816c9c4
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
d5f7f43
add leaderboard entry
github-actions[bot] Nov 29, 2024
7392aa4
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
f99bba2
minor
HarshTrivedi Nov 29, 2024
54e2766
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
4b7891f
minor
HarshTrivedi Nov 29, 2024
832b558
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
8025108
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
5f9db71
Merge branch 'main' of https://github.com/stonybrooknlp/appworld-lead…
HarshTrivedi Nov 29, 2024
40e61cf
minor
HarshTrivedi Nov 29, 2024
cc3232f
add leaderboard entry
github-actions[bot] Nov 29, 2024
57 changes: 57 additions & 0 deletions experiments/outputs/_leaderboard.json
@@ -796,5 +796,62 @@
        "interactions": 26.61
      }
    }
  },
  {
    "id": "5509efd4-3e4bf4e2",
    "method": {
      "name": "temp",
      "tooltip": "temp"
    },
    "llm": {
      "name": "temp",
      "tooltip": "temp"
    },
    "url": "temp",
    "date": "2024-11-29",
    "test_normal": {
      "all": {
        "task_goal_completion": 9.5,
        "scenario_goal_completion": 7.1,
        "interactions": 3.32
      },
      "level 1": {
        "task_goal_completion": 28.1,
        "scenario_goal_completion": 21.1,
        "interactions": 3.32
      },
      "level 2": {
        "task_goal_completion": 0.0,
        "scenario_goal_completion": 0.0,
        "interactions": 3.32
      },
      "level 3": {
        "task_goal_completion": 0.0,
        "scenario_goal_completion": 0.0,
        "interactions": 3.32
      }
    },
    "test_challenge": {
      "all": {
        "task_goal_completion": 5.5,
        "scenario_goal_completion": 4.3,
        "interactions": 4.39
      },
      "level 1": {
        "task_goal_completion": 26.4,
        "scenario_goal_completion": 20.8,
        "interactions": 4.39
      },
      "level 2": {
        "task_goal_completion": 2.7,
        "scenario_goal_completion": 2.0,
        "interactions": 4.39
      },
      "level 3": {
        "task_goal_completion": 0.0,
        "scenario_goal_completion": 0.0,
        "interactions": 4.39
      }
    }
  }
]
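
For reviewers who want to sanity-check the shape of the entry added in this hunk, here is a minimal Python sketch. It rests on assumptions not stated in the PR: that the keys visible in this diff are all required, and that the two completion metrics are percentages between 0 and 100. `check_entry` and `scores` are hypothetical helpers written for illustration; they are not part of the repository.

```python
# Keys observed in the entry added by this diff.
# Treating them as mandatory is an assumption, not a documented schema.
SPLITS = ("test_normal", "test_challenge")
LEVELS = ("all", "level 1", "level 2", "level 3")
METRICS = ("task_goal_completion", "scenario_goal_completion", "interactions")


def check_entry(entry: dict) -> list[str]:
    """Return a list of problems found in one leaderboard entry (empty if none)."""
    problems = []
    for key in ("id", "method", "llm", "url", "date"):
        if key not in entry:
            problems.append(f"missing key: {key}")
    for split in SPLITS:
        for level in LEVELS:
            level_scores = entry.get(split, {}).get(level)
            if level_scores is None:
                problems.append(f"missing {split}/{level}")
                continue
            for metric in METRICS:
                value = level_scores.get(metric)
                if not isinstance(value, (int, float)):
                    problems.append(f"non-numeric {split}/{level}/{metric}")
                # Assumption: completion metrics are percentages in [0, 100].
                elif metric != "interactions" and not 0.0 <= value <= 100.0:
                    problems.append(f"out of range {split}/{level}/{metric}")
    return problems


def scores(tgc: float, sgc: float, interactions: float) -> dict:
    """Build one per-level score block with the three metric keys."""
    return {
        "task_goal_completion": tgc,
        "scenario_goal_completion": sgc,
        "interactions": interactions,
    }


# The entry from this diff, rebuilt with the values shown above.
example = {
    "id": "5509efd4-3e4bf4e2",
    "method": {"name": "temp", "tooltip": "temp"},
    "llm": {"name": "temp", "tooltip": "temp"},
    "url": "temp",
    "date": "2024-11-29",
    "test_normal": {
        "all": scores(9.5, 7.1, 3.32),
        "level 1": scores(28.1, 21.1, 3.32),
        "level 2": scores(0.0, 0.0, 3.32),
        "level 3": scores(0.0, 0.0, 3.32),
    },
    "test_challenge": {
        "all": scores(5.5, 4.3, 4.39),
        "level 1": scores(26.4, 20.8, 4.39),
        "level 2": scores(2.7, 2.0, 4.39),
        "level 3": scores(0.0, 0.0, 4.39),
    },
}

print(check_entry(example))  # prints []
```

The check only reports structural problems. It does not flag the placeholder `"temp"` values for `method`, `llm`, and `url`, nor the identical `interactions` value repeated across every level, both of which a reviewer may still want to question.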
3 changes: 3 additions & 0 deletions experiments/outputs/temp_test_test_normal/leaderboard.bundle
Git LFS file not shown