Commit 4717b56

Merge pull request #27 from scicode-bench/zilinghan/doc-update
add Deepseek R1 and OpenAI o3-mini performance
2 parents e99bc4e + 798237d commit 4717b56

1 file changed, +5 -3 lines changed

README.md

@@ -31,9 +31,11 @@ SciCode sources challenging and realistic research-level coding problems across
 
 | Models | Main Problem Resolve Rate | <span style="color:grey">Subproblem</span> |
 |--------------------------|-------------------------------------|-------------------------------------|
-| 🥇 OpenAI o1-preview | <div align="center">**7.7**</div> | <div align="center" style="color:grey">28.5</div> |
-| 🥈 Claude3.5-Sonnet | <div align="center">**4.6**</div> | <div align="center" style="color:grey">26.0</div> |
-| 🥉 Claude3.5-Sonnet (new) | <div align="center">**4.6**</div> | <div align="center" style="color:grey">25.3</div> |
+| 🥇 OpenAI o3-mini | <div align="center">**9.2**</div> | <div align="center" style="color:grey">33.0</div> |
+| 🥈 OpenAI o1-preview | <div align="center">**7.7**</div> | <div align="center" style="color:grey">28.5</div> |
+| 🥉 Deepseek-R1 | <div align="center">**4.6**</div> | <div align="center" style="color:grey">28.5</div> |
+| Claude3.5-Sonnet | <div align="center">**4.6**</div> | <div align="center" style="color:grey">26.0</div> |
+| Claude3.5-Sonnet (new) | <div align="center">**4.6**</div> | <div align="center" style="color:grey">25.3</div> |
 | Deepseek-v3 | <div align="center">**3.1**</div> | <div align="center" style="color:grey">23.7</div> |
 | Deepseek-Coder-v2 | <div align="center">**3.1**</div> | <div align="center" style="color:grey">21.2</div> |
 | GPT-4o | <div align="center">**1.5**</div> | <div align="center" style="color:grey">25.0</div> |
