diff --git a/README.md b/README.md
index e618a49..e5688e9 100644
--- a/README.md
+++ b/README.md
@@ -31,9 +31,11 @@ SciCode sources challenging and realistic research-level coding problems across
 
 | Models | Main Problem Resolve Rate | Subproblem |
 |--------------------------|-------------------------------------|-------------------------------------|
-| 🥇 OpenAI o1-preview | **7.7** | 28.5 |
-| 🥈 Claude3.5-Sonnet | **4.6** | 26.0 |
-| 🥉 Claude3.5-Sonnet (new) | **4.6** | 25.3 |
+| 🥇 OpenAI o3-mini | **9.2** | 33.0 |
+| 🥈 OpenAI o1-preview | **7.7** | 28.5 |
+| 🥉 Deepseek-R1 | **4.6** | 28.5 |
+| Claude3.5-Sonnet | **4.6** | 26.0 |
+| Claude3.5-Sonnet (new) | **4.6** | 25.3 |
 | Deepseek-v3 | **3.1** | 23.7 |
 | Deepseek-Coder-v2 | **3.1** | 21.2 |
 | GPT-4o | **1.5** | 25.0 |