diff --git a/.gitignore b/.gitignore
index 91edc7b..d97f68c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -9,7 +9,7 @@ logs/**
 **/logs/**
 **/tmp/**
 integration/**
-
+test.sh
 # -------
 # Created by https://www.toptal.com/developers/gitignore/api/python
diff --git a/README.md b/README.md
index 4434012..af10a89 100644
--- a/README.md
+++ b/README.md
@@ -33,9 +33,11 @@ SciCode sources challenging and realistic research-level coding problems across
 | Models | Main Problem Resolve Rate | Subproblem |
 |--------------------------|-------------------------------------|-------------------------------------|
-| 🥇 OpenAI o3-mini | **9.2** | 33.0 |
-| 🥈 OpenAI o1-preview | **7.7** | 28.5 |
-| 🥉 Deepseek-R1 | **4.6** | 28.5 |
+| 🥇 OpenAI o3-mini-low | **10.8** | 33.3 |
+| 🥈 OpenAI o3-mini-high | **9.2** | 34.4 |
+| 🥉 OpenAI o3-mini-medium | **9.2** | 33.0 |
+| OpenAI o1-preview | **7.7** | 28.5 |
+| Deepseek-R1 | **4.6** | 28.5 |
 | Claude3.5-Sonnet | **4.6** | 26.0 |
 | Claude3.5-Sonnet (new) | **4.6** | 25.3 |
 | Deepseek-v3 | **3.1** | 23.7 |