This project is a Streamlit application running a Keras model trained on our team's voices for speaker recognition. It allows users to sign up, log in, and use features such as recording live audio, uploading audio files, predicting the speaker with a pre-trained model, and retrieving past transcriptions. The application also uses a database to store user information and transcription history.
- User Authentication:
  - Signup and login functionality using bcrypt for password hashing.
  - User data stored in a SQLite database.
- Speaker Recognition:
  - Record live audio or upload audio files for speaker recognition.
  - Transcriptions generated using Google's Speech Recognition API.
  - Speaker prediction using a pre-trained Keras model.
- Transcription History:
  - Store and retrieve past transcriptions.
  - Download transcriptions as a text file.
- Python 3.6+
- Streamlit
- NumPy
- PyAudio
- Librosa
- TensorFlow
- Scikit-learn
- Soundfile
- SpeechRecognition
- Bcrypt
- SQLite3
git clone https://github.com/yourusername/speaker-recognition-app.git
cd speaker-recognition-app
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
streamlit run app.py
- Select the "Sign Up" option.
- Enter your email and password.
- Click "Sign Up" to create your account.
- Select the "Login" option.
- Enter your email and password.
- Click "Login" to access the application features.
- After logging in, choose an option to identify the speaker:
  - Upload Audio File: Upload .wav files for prediction.
  - Record Live Audio: Record live audio using your microphone.
  - Retrieve Past Transcriptions: View and download previous transcriptions.
- Follow the on-screen instructions to upload or record audio.
- View the predicted speaker and transcription.
- Download transcriptions if needed.
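
Storing and retrieving transcriptions for download might look roughly like the sketch below. It assumes the transcription text is saved as UTF-8 bytes in the history table's transcription_file column; the actual encoding used by app.py is not stated here, and the helper names `save_transcription` and `load_transcriptions` are hypothetical.

```python
import sqlite3


def save_transcription(conn, user_id, name, text):
    """Store one transcription as UTF-8 bytes (assumed encoding)."""
    conn.execute(
        "INSERT INTO history (user_id, name, transcription_file) VALUES (?, ?, ?)",
        (user_id, name, text.encode("utf-8")),
    )
    conn.commit()


def load_transcriptions(conn, user_id):
    """Return (name, text) pairs for one user, in insertion order."""
    rows = conn.execute(
        "SELECT name, transcription_file FROM history WHERE user_id = ? ORDER BY id",
        (user_id,),
    ).fetchall()
    # SQLite returns BLOB columns as bytes; decode them back to text.
    return [(name, bytes(blob).decode("utf-8")) for name, blob in rows]
```

In a Streamlit page, the returned text could then be passed to `st.download_button` to offer it as a text file.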
- app.py: Main Streamlit application file.
- setup_database.py: Script to set up the SQLite database.
- requirements.txt: List of required Python packages.
- model_one.keras: Pre-trained Keras model for speaker prediction.
- label_encoder_one.npy: Label encoder for the model.
users table:
- id: INTEGER, primary key
- email: VARCHAR(50), unique
- password: VARCHAR(60)
- status: VARCHAR(20)
- created_dt: DATETIME, default current timestamp
history table:
- id: INTEGER, primary key
- user_id: INTEGER, foreign key references users(id)
- name: TEXT
- transcription_file: BLOB
- created_dt: DATETIME, default current timestamp
- updated_dt: DATETIME, default current timestamp
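
The two column lists above could be created with Python's built-in sqlite3 module along these lines. This is a sketch derived from the column descriptions, not the contents of the repository's own setup script; the `create_tables` function name is hypothetical, and the default file name voice_db.db is taken from the database setup notes.

```python
import sqlite3

# DDL reconstructed from the documented columns (an assumption, not the repo's script).
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    email VARCHAR(50) UNIQUE,
    password VARCHAR(60),
    status VARCHAR(20),
    created_dt DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS history (
    id INTEGER PRIMARY KEY,
    user_id INTEGER,
    name TEXT,
    transcription_file BLOB,
    created_dt DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_dt DATETIME DEFAULT CURRENT_TIMESTAMP,
    FOREIGN KEY (user_id) REFERENCES users(id)
);
"""


def create_tables(db_path="voice_db.db"):
    """Create both tables if they do not exist and return the open connection."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    conn.commit()
    return conn
```

Because the statements use IF NOT EXISTS, running the function a second time leaves existing data untouched.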
- Ensure the model_one.keras and label_encoder_one.npy files are in the project directory.
- The app uses Google's Speech Recognition API, which requires an internet connection.
- Audio recording functionality requires a working microphone.
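
A quick way to confirm that the model files are in place before launching the app is sketched below. The `missing_files` helper is hypothetical and only checks for the two file names listed in the notes above.

```python
from pathlib import Path

# File names taken from the notes above.
REQUIRED_FILES = ("model_one.keras", "label_encoder_one.npy")


def missing_files(project_dir="."):
    """Return the required files that are not present in project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```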
To manually set up the SQLite database for user management and history tracking, follow these steps:
- Make Database: Use make_db.py to create the SQLite database with the required tables (users and history).
  python make_db.py
- Check Database (Optional): Use check_db.py to check the existing tables and records in the database.
  python check_db.py
Note: These scripts assume the database file (voice_db.db) is created in the same directory as the scripts. Adjust the database file path if necessary.
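
For a sense of what checking the database involves, the sketch below lists each table and its row count using the built-in sqlite3 module. It is an illustration only, not the contents of check_db.py, and the `table_row_counts` function name is hypothetical.

```python
import sqlite3


def table_row_counts(db_path="voice_db.db"):
    """Map each user-defined table name to its number of rows."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master lists every table; skip SQLite's internal ones.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__":
    for table, count in table_row_counts().items():
        print(f"{table}: {count} rows")
```

Note that `sqlite3.connect` creates an empty database file when the path does not exist, so an empty result usually means the path is wrong or the tables have not been created yet.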
To train on new or different data, see training.md.