This project is inspired by a feature commonly found on smartphones: you draw letters or digits on the keyboard, and the phone recognizes what you are trying to write. I wanted to replicate this with a web app where users draw digits directly on a website and a machine learning model, trained on the MNIST dataset, guesses which number they wrote in real time. The app is built with Next.js (React), TensorFlow.js for image processing, and ONNX Runtime for running the digit recognition model. The model itself is trained with PyTorch in draft.ipynb.
- Handwritten Digit Recognition:
  - A pre-trained ONNX model (loaded via `onnxruntime-web`) recognizes digits drawn by users in real time.
- Canvas Drawing Interface:
  - Users can draw a digit using their mouse or finger on a responsive canvas in the browser.
- Real-time Predictions:
  - After drawing a digit, users can click "Predict" to get an instant prediction. The app displays the result below the canvas.
- Clear Button:
  - Resets the canvas so users can redraw as many times as they like.
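
Drawing on a responsive canvas usually means translating pointer positions from the on-screen size of the element to the canvas's internal pixel grid. The sketch below shows one way to do that mapping; it is not taken from the app's source. `toCanvasPoint` and `Rect` are hypothetical names, `Rect` mirrors the shape returned by `getBoundingClientRect()`, and the default of 280 is the canvas size mentioned in this README.

```typescript
// Hypothetical helper: maps a pointer position to canvas pixel coordinates.
// Rect mirrors the fields of the DOMRect returned by getBoundingClientRect().
interface Rect {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

function clamp(value: number, min: number, max: number): number {
  return Math.min(max, Math.max(min, value));
}

function toCanvasPoint(
  clientX: number,
  clientY: number,
  rect: Rect,
  canvasSize: number = 280 // internal canvas resolution (assumed square)
): Point {
  if (rect.width <= 0 || rect.height <= 0) {
    throw new Error("canvas has no visible size");
  }
  // The canvas may be displayed smaller or larger than its internal size,
  // so scale from CSS pixels to canvas pixels.
  const scaleX = canvasSize / rect.width;
  const scaleY = canvasSize / rect.height;
  return {
    x: clamp((clientX - rect.left) * scaleX, 0, canvasSize),
    y: clamp((clientY - rect.top) * scaleY, 0, canvasSize),
  };
}
```

In a mouse or touch handler, this would typically be called with the event's `clientX` and `clientY` and the result of `canvas.getBoundingClientRect()`.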
To get the app running locally, follow these steps:
Make sure you have the following installed:
- Node.js
- Python (for model training, if needed)
- TensorFlow.js (for processing image data)
- ONNX Runtime for Web (for model inference in the browser)
- Clone the repository:

      git clone https://github.com/your-username/mnist-digit-recognition-app.git
      cd mnist-digit-recognition-app

- Install frontend dependencies:

      cd frontend
      npm install

- Model Setup:
  - Download or place your trained ONNX model (`model.onnx`) in the correct directory.
- Run the app:

      npm run dev

- Visit http://localhost:3000 in your browser to interact with the app.

How it works:

- Canvas Drawing:
  - The app allows users to draw digits on a 280x280 canvas using mouse or touch events.
- Image Processing:
  - The drawing is resized to 28x28 pixels, converted to grayscale, and normalized using TensorFlow.js.
- Model Prediction:
  - The processed image is sent to an ONNX model running in the browser, and the model predicts which digit was drawn.
- Displaying Results:
  - The predicted digit is shown below the canvas.
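
The image processing and prediction steps above can be sketched in plain TypeScript. The app itself uses TensorFlow.js for resizing and normalization and ONNX Runtime for inference; this sketch swaps in simple block averaging so the example has no dependencies. It assumes the pixels arrive in the layout of `ImageData.data` (RGBA, row-major), that "normalized" means scaling to the range [0, 1], and that the model returns 10 scores, one per digit. `preprocess` and `argmax` are hypothetical names, not functions from the app.

```typescript
// Sketch only: plain TypeScript stands in for the TensorFlow.js ops the app uses.
const CANVAS_SIZE = 280; // canvas side length stated in this README
const MODEL_SIZE = 28;   // MNIST input side length

/**
 * Downsamples RGBA canvas pixels to a 28x28 grayscale image scaled to [0, 1].
 * Assumes `rgba` has the layout of ImageData.data (4 bytes per pixel).
 */
function preprocess(
  rgba: Uint8ClampedArray,
  size: number = CANVAS_SIZE
): Float32Array {
  if (rgba.length !== size * size * 4) {
    throw new Error("unexpected pixel buffer length");
  }
  if (size % MODEL_SIZE !== 0) {
    throw new Error("canvas size must be a multiple of 28 for block averaging");
  }
  const block = size / MODEL_SIZE; // 10 source pixels per output pixel side
  const out = new Float32Array(MODEL_SIZE * MODEL_SIZE);
  for (let y = 0; y < MODEL_SIZE; y++) {
    for (let x = 0; x < MODEL_SIZE; x++) {
      let sum = 0;
      for (let dy = 0; dy < block; dy++) {
        for (let dx = 0; dx < block; dx++) {
          const i = ((y * block + dy) * size + (x * block + dx)) * 4;
          // Standard luminance weights for RGB to grayscale conversion.
          sum += 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
        }
      }
      // Average the block, then scale from [0, 255] to [0, 1].
      out[y * MODEL_SIZE + x] = sum / (block * block) / 255;
    }
  }
  return out;
}

/** Returns the index of the largest score, taken here as the predicted digit. */
function argmax(scores: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return best;
}
```

The 784 values returned by `preprocess` would then be wrapped in a tensor and passed to the ONNX model, with `argmax` applied to the output scores. The stroke and background colors need to match what the model saw during training (MNIST digits are light strokes on a dark background); this sketch does not invert the image.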

Possible future improvements:

- Improve accuracy with additional training.
- Add cloud deployment (AWS/GCP/Azure).
- Improve mobile responsiveness.

This project is licensed under the MIT License. See the LICENSE file for details.