Welcome to AutoGrader, a groundbreaking graduation project aimed at transforming traditional handwriting grading using Artificial Intelligence. This project, led by a team of 9 dedicated members, is divided into two main components: Handwriting Recognition and Automatic Grading. Each component has undergone extensive development to ensure accuracy, efficiency, and applicability in real-world scenarios.
This project focuses on developing robust models for handwriting recognition and automatic short answer grading (ASAG). We utilize various datasets and advanced preprocessing techniques to enhance the quality of handwritten text recognition and grading models. The primary goal is to improve the accuracy and efficiency of recognizing and grading handwritten content in educational settings.
Handwriting recognition involves converting handwritten text into machine-readable text. Our approach includes multiple phases:
- Data Collection:
- We trained on the IAM Handwriting Database 3.0 to ensure comprehensive coverage of handwriting styles.
- For testing, we used two datasets: the Egyptian Handwriting Dataset (EHD) and the CVL Database.
- Image Pre-Processing:
- Image Binarization: Converts grayscale images to binary, enhancing feature extraction.
- Noise Removal: Eliminates unwanted artifacts to improve image clarity.
- Dilation: Enhances stroke width in handwritten images.
- Deskewing: Aligns handwritten text horizontally.
- Deslanting: Corrects the slant in handwritten text.
- Rescaling: Standardizes image dimensions.
- Image Inversion: Enhances feature contrast by inverting image colors.
- Evaluation Metrics: Our evaluation pipeline covers the following steps, ending with the Character Error Rate (CER) as the reported metric:
- Model Inference
- Test Data Preparation
- Comparison of Images
- Comparison of Labels
- Vector Inputs
- Generation of Predictions
- Decoding Predictions
- Comparison with Ground Truth
- Character Error Rate (CER)
- Model Training:
- We used the pre-trained TrOCR model as the starting point for training.
- Model Optimization:
- Fine-tuning models to enhance performance and accuracy.
- Error Analysis:
- Identifying and analyzing common mistakes to improve model robustness.
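Two of the steps above are easy to illustrate in a few lines: binarization and inversion from the pre-processing phase, and the Character Error Rate from the evaluation phase. The sketch below is not the project's code. It assumes a grayscale image stored as nested lists of 0-255 values, a fixed threshold of 128, and CER defined as the character-level edit distance divided by the length of the ground-truth text. The function names `binarize`, `invert`, `edit_distance`, and `cer` are hypothetical.

```python
from typing import List

# Assumption: a grayscale image is a list of rows of 0-255 pixel values.
Image = List[List[int]]


def binarize(image: Image, threshold: int = 128) -> Image:
    """Map each pixel to 0 or 255 using a fixed threshold (an assumed value)."""
    return [[255 if px >= threshold else 0 for px in row] for row in image]


def invert(image: Image) -> Image:
    """Swap dark and light pixels."""
    return [[255 - px for px in row] for row in image]


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character Error Rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must be non-empty")
    return edit_distance(ref, hyp) / len(ref)
```

For instance, `cer("hello", "hallo")` counts one substitution over five reference characters. Because insertions are counted, CER can exceed 1.0 when the prediction is much longer than the ground truth.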
ASAG involves grading handwritten or typed short answers automatically. Our approach includes multiple phases:
- Data Collection:
- Collecting datasets of short answers, each with four columns (Question, Model Answer, Student Answer, and Grade).
- We used the Mohler, PT-ASAG, and AR-ASAG datasets for training and testing.
- Data Pre-Processing:
- Exploratory Data Analysis
- Cleaning and preparing the text data for analysis.
- Column Analysis
- Data Analysis:
- Extracting relevant features from the text and understanding patterns.
- Analysis was divided into two types: Single Column Analysis and Relations between Columns Analysis.
- Model Training:
- Trying different pre-trained models that use semantic similarity for grading, such as knowledge-based models, Sentence Transformers, and BERT.
- Model Optimization:
- Adjusting model parameters to enhance grading accuracy.
- Error Analysis:
- Conducting thorough error analysis to identify areas for improvement.
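The idea behind similarity-based grading can be sketched as follows: compare the student answer to the model answer, then scale the similarity score to the grading range. This sketch is an illustration only. It swaps in a simple bag-of-words cosine similarity where the project used pre-trained models such as Sentence Transformers or BERT, and both the `grade_answer` function and the default `max_grade` of 5.0 are assumptions rather than the project's actual settings.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def grade_answer(model_answer: str, student_answer: str,
                 max_grade: float = 5.0) -> float:
    """Scale similarity to [0, max_grade]; max_grade is an assumed range."""
    # Stand-in for embedding similarity: token counts instead of embeddings.
    sim = cosine_similarity(Counter(tokenize(model_answer)),
                            Counter(tokenize(student_answer)))
    return round(sim * max_grade, 2)
```

A token-count vector only rewards shared words, so a correct paraphrase scores low. That limitation is the reason to replace `Counter(tokenize(...))` with sentence embeddings from a pre-trained model while keeping the same scaling step.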
To demonstrate the capabilities of our handwriting recognition and ASAG models, we have developed a user-friendly website. The website allows users to upload handwritten text and receive recognition results, as well as submit short answers for automatic grading.
- Frontend:
- Designing a user-friendly interface for easy interaction.
- Backend:
- Developing the server-side logic to handle data processing and model inference.
- Computer Vision and Customizing Exam Sheets:
- Implementing computer vision techniques for recognizing handwritten text and customizing exam sheets.
- Upload Handwritten Text: Users can upload images of handwritten text for recognition.
- Submit Short Answers: Users can submit short answers for automatic grading.
- Real-Time Feedback: Instant results and feedback on handwriting recognition and grading.
The project is organized into the following directories and files:
- data: Contains the datasets used for training and evaluation.
- handwriting_recognition/data_collection: Scripts and tools for collecting handwriting samples.
- notebooks: Python notebooks for experiments, model training, and evaluation.
- scr: Source code for the project, including models, preprocessing scripts, and utilities.
- README.md: This file, providing an overview of the project.
The development of advanced models for handwriting recognition and automatic short answer grading (ASAG) represents a significant step forward in educational technology. By meticulously collecting diverse datasets and employing sophisticated preprocessing techniques, our models achieve high accuracy and reliability. These models can revolutionize the way educational assessments are conducted, offering fast and consistent grading of handwritten and short answer responses.
Our user-friendly website further demonstrates the practical applications of these models, allowing users to experience the capabilities firsthand. With real-time feedback and intuitive interfaces, the website showcases the potential for widespread adoption in educational settings.