SoundSigns: Speech to Sign Language Translator

Overview

SoundSigns is a comprehensive web-based application that translates spoken English into International Sign Language (ISL) in real-time. The system captures speech, converts it to text, translates it into ISL gloss, and displays the translation through a 3D animated avatar using pre-rendered video clips.

Features

Real-time Speech Recognition: Browser-based speech-to-text conversion using Web Speech API
ISL Gloss Translation: Converts English text to International Sign Language gloss using ChatGPT API
3D Avatar Animation: Visual sign language representation through pre-rendered MP4 video clips
Video Assembly: Seamless concatenation of individual sign videos into coherent sentences
Interactive Interface: Clean, responsive UI with microphone controls and video playback
Multi-format Support: Covers alphabet letters (A-Z), numbers (0-9), and common vocabulary
Download Functionality: Save translated sign language videos for offline use
Cross-browser Compatibility: Works on modern browsers supporting Web Speech API

Architecture

The application follows a modular three-tier architecture:

Frontend: React.js with Tailwind CSS handling user interaction and video processing
Backend: Flask server managing API communications and text-to-gloss conversion
Dataset: Curated collection of ~150 pre-rendered ISL sign videos

Prerequisites

Python 3.8+
Node.js 14+
OpenAI API Key
Modern web browser with Web Speech API support (Chrome, Edge recommended)

Installation

Backend Setup

Install Python dependencies:

pip install sounddevice numpy openai flask flask-cors python-dotenv

Create a .env file in the backend/ directory:

OPENAI_API_KEY=your_openai_key_here

Security Note: Never commit the .env file to version control.

Frontend Setup

Navigate to the frontend directory and install dependencies:

cd frontend
npm install

Running the Application

Start the frontend development server:

cd frontend
npm run dev

In a separate terminal, start the backend server from the project root:

py backend/transcription.py

Access the application at http://localhost:3000 (or the port specified by your dev server)

Usage

Voice Input: Click the microphone button and speak clearly in English
Transcription: View the real-time speech-to-text conversion
Translation: See the ISL gloss translation displayed
Video Playback: Watch the 3D avatar perform the signed translation
Controls: Use play, replay, and download buttons to control video playback

Project Structure

project-root/
├── backend/
│   ├── .env                 # Environment variables (not in version control)
│   └── transcription.py     # Flask server and API logic
├── frontend/
│   ├── src/
│   │   ├── components/      # React components
│   │   └── App.jsx         # Main application file
│   └── assets/
│       └── videos/         # Pre-rendered sign language videos
│           ├── letters/    # A-Z alphabet signs
│           ├── numbers/    # 0-9 numerical signs
│           └── words/      # Common vocabulary signs

Technologies Used

Frontend: React.js, Tailwind CSS, Web Speech API
Backend: Python, Flask, Flask-CORS
Translation: OpenAI GPT-3.5-turbo API
Video Processing: Browser-based video concatenation
Dataset: Pre-rendered MP4 videos with 3D ISL avatar

System Requirements

Browser: Chrome, Edge, or other browsers with Web Speech API support
Microphone: Required for speech input
Internet Connection: Required for OpenAI API access

Known Limitations

Limited vocabulary dataset (~150 signs)
Words not in the dataset are finger-spelled letter by letter
Translation accuracy depends on ChatGPT's ISL gloss generation
Requires a quiet environment for optimal speech recognition
System latency of 3-5 seconds for the complete translation process

Contributing

This project was developed as a completed academic capstone project and is no longer under active development.
At this time, we are not accepting contributions or pull requests.

Thank you for your interest and understanding.

Dataset Attribution

The sign language video dataset is sourced from the open-source "Text-Speech to Sign Language Generator" project by JS-Coderr (2024), available on GitHub.

License

This project is the intellectual property of Ahmad Ataba, Waseem Saleem, and Braude Engineering College.
It was developed as a capstone project for academic purposes.
All rights reserved. Redistribution or commercial use is not permitted without explicit permission from the authors or the institution.

Our Team

Ahmad Ataba
Waseem Saleem

Support

For technical issues or questions about the application, please refer to the project documentation or contact the development team.

Note: This application is designed for educational and accessibility purposes. For critical communication needs, professional sign language interpretation is recommended.

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
.github/workflows		.github/workflows
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

SoundSigns: Speech to Sign Language Translator

Overview

Features

Architecture

Prerequisites

Installation

Backend Setup

Frontend Setup

Running the Application

Usage

Project Structure

Technologies Used

System Requirements

Known Limitations

Contributing

Dataset Attribution

License

Our Team

Support

About

Uh oh!

Releases

Packages

Contributors 2

Languages

Ataba29/SoundSigns

Folders and files

Latest commit

History

Repository files navigation

SoundSigns: Speech to Sign Language Translator

Overview

Features

Architecture

Prerequisites

Installation

Backend Setup

Frontend Setup

Running the Application

Usage

Project Structure

Technologies Used

System Requirements

Known Limitations

Contributing

Dataset Attribution

License

Our Team

Support

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages