diff --git a/docs/index.html b/docs/index.html
index c2e6c94..2c3098e 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -149,7 +149,7 @@

3Insource Services Inc, 4Teaching Lab, 5Allen Institute for AI
- NeurIPS 2024, Math AI Workshop
+ NeurIPS 2024, Math-AI Workshop
@@ -337,7 +337,7 @@

Introduction

Leaderboard on DrawEduMath

- Accuracy Scores on the
+ Accuracy scores on the Logo DrawEduMath dataset.
@@ -383,7 +383,7 @@

Leaderboard on DrawEduMath

- The leaderboard scores are based on the judgements using Mixtral 8x22B model.
+ The leaderboard scores are based on similarity judgements of VLMs' answers to gold ones obtained using a Mixtral 8x22B model.

🚨 To submit your results to the leaderboard, please send to this email with your result json files.

@@ -425,9 +425,7 @@

Overview

src="main_static/images/logos/assistments_a_logo.png" style="width:1.5em;vertical-align: middle" alt="Logo" />ASSISTments online learning platform, where students receive feedback from teachers on assigned work.
- The problems that accompany each student response are drawn from three overlapping1 open educational
- resources (OER): Eureka Math, Open Up
- Resources, and Illustrative Math.
+ The problems that accompany each student response are drawn from three overlapping open educational resources (OER): Eureka Math, Open Up Resources, and Illustrative Math.

@@ -451,8 +449,8 @@

Overview

- You can download the dataset on Hugging Face Dataset.
+ In the future, we will release the dataset on Hugging Face, but in the meantime, fill out this Google form to express interest.

@@ -488,9 +486,8 @@

Overview

Examples

- Examples of teacher’s answers to a question asking about possible errors in students’ responses to math
- problems. All three examples of students’ hand-drawn responses are for the same math problem asking students
- to
+ Here are examples of teachers' answers to a question asking about possible errors in students’ responses to math
+ problems. All three examples of students’ hand-drawn responses are for the same math problem asking students to
draw and shade units on fraction strips to show 4 thirds, shown on the left.

Example of teachers' answers to question about erro

Statistics

Overall question types in our VQA benchmark
- Qualitative examples of the most common question types in our
+ Examples of the most common question types in our
Logo
- DrawEduMath benchmark, categornized by type.
+ DrawEduMath benchmark, categorized by type.

@@ -538,7 +535,7 @@

Statistics

- Experiment Results
+ Experimental Results