Enhancing the Omi experience with advanced AI-powered audio analysis
- The hosted version for the take-home can be found here: https://aggie.server.bardia.app
- Demo Account with Loaded Audio Files + Connected to GCP Bucket
- Email: [email protected]
- Password: bardia
Omi Friend is a companion application designed to work with the Omi device, a continuous audio recorder. The Omi device records everything it hears and sends the audio to your phone. This application aims to improve upon the existing Omi app by providing more accurate transcriptions, better speaker detection, and advanced analysis through various plugins.
The Omi app automatically writes the raw audio files to a GCP bucket (using the same credentials configured in the Omi app under developer mode). The web app picks them up when you click the refresh button in the top right. There is also an option to upload your own audio files through the UI, but that feature is still a little buggy :(.
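The refresh step can be pictured as a diff between what is in the bucket and what has already been imported. The Go sketch below is only an illustration under assumptions: the helper name `newAudioObjects`, the list of accepted extensions, and the idea of tracking imported object names in a set are all hypothetical and not taken from the project's backend.

```go
package main

import (
	"fmt"
	"path"
	"sort"
	"strings"
)

// audioExts is an assumed set of audio extensions; the real app may accept others.
var audioExts = map[string]bool{".wav": true, ".mp3": true, ".m4a": true, ".ogg": true}

// newAudioObjects returns the bucket object names that look like audio files
// and have not been imported yet, sorted for stable output.
// Hypothetical helper: in practice the inputs would come from the bucket
// listing and from the database of already processed recordings.
func newAudioObjects(bucketObjects []string, imported map[string]bool) []string {
	var fresh []string
	for _, name := range bucketObjects {
		ext := strings.ToLower(path.Ext(name))
		if audioExts[ext] && !imported[name] {
			fresh = append(fresh, name)
		}
	}
	sort.Strings(fresh)
	return fresh
}

func main() {
	objects := []string{"rec_002.wav", "rec_001.wav", "notes.txt"}
	imported := map[string]bool{"rec_001.wav": true}
	fmt.Println(newAudioObjects(objects, imported))
}
```

Keeping the comparison in a pure function like this makes the refresh logic easy to test without touching GCP at all.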
Key features include:
- Improved transcription using Gladia's Whisper-Zero model
- Advanced conversation analysis using Gladia's
- Intelligent conversation detection and separation
- Easy playback of recordings
- Music detection and song identification
- Various analysis plugins: sentiment analyzer, bias detector, action item creator, reminder setter, calendar item creator, etc.
- 🔐 User authentication system
- 📁 GCP bucket integration for audio file retrieval
- 🗣️ Basic audio transcription using Gladia API
- 📊 Simple conversation view with audio player
- 🧩 Plugins marketplace concept (UI only)
- ⚙️ Settings management for GCP and Gladia credentials
- 🥴 Still a few bugs in the Quick Info section (I'll fix them soon I promise)
- 🤖 AI chat functionality
- 🎯 Improved transcription accuracy
- 👂 Enhanced speaker detection
- 🔍 Advanced search functionality across all transcripts
- 📈 Sentiment analysis and bias detection plugins
- 🎵 Music detection and analysis
- 📱 Mobile responsive design
- 🔗 API integrations with popular communication platforms
- Frontend: Next.js, React, ShadCN, Tailwind CSS
- Backend: Go
- Database: MongoDB
- Cloud Storage: Google Cloud Platform (GCP)
- Authentication: JWT
- APIs: Gladia
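Since the stack lists JWT for authentication and the backend expects a `JWT_SECRET`, here is a minimal sketch of how an HS256 token can be issued and verified using only the Go standard library. It is an assumption-heavy illustration: the `claims` fields and the function names `issueToken` and `verifyToken` are hypothetical, and the actual backend may well use a JWT library and different claims.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"fmt"
	"strings"
	"time"
)

// claims is a hypothetical payload; the real backend may store other fields.
type claims struct {
	Sub string `json:"sub"` // user identifier
	Exp int64  `json:"exp"` // expiry as Unix seconds
}

var b64 = base64.RawURLEncoding

// sign returns the base64url-encoded HMAC-SHA256 of input.
func sign(input string, secret []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(input))
	return b64.EncodeToString(mac.Sum(nil))
}

// issueToken builds header.payload.signature for the given user and expiry.
func issueToken(userID string, secret []byte, expUnix int64) (string, error) {
	header := b64.EncodeToString([]byte(`{"alg":"HS256","typ":"JWT"}`))
	body, err := json.Marshal(claims{Sub: userID, Exp: expUnix})
	if err != nil {
		return "", err
	}
	signingInput := header + "." + b64.EncodeToString(body)
	return signingInput + "." + sign(signingInput, secret), nil
}

// verifyToken checks the signature and expiry and returns the user ID.
// It always verifies with HS256 and ignores the algorithm named in the header.
func verifyToken(token string, secret []byte, nowUnix int64) (string, error) {
	parts := strings.Split(token, ".")
	if len(parts) != 3 {
		return "", errors.New("malformed token")
	}
	expected := sign(parts[0]+"."+parts[1], secret)
	if !hmac.Equal([]byte(expected), []byte(parts[2])) {
		return "", errors.New("bad signature")
	}
	raw, err := b64.DecodeString(parts[1])
	if err != nil {
		return "", err
	}
	var c claims
	if err := json.Unmarshal(raw, &c); err != nil {
		return "", err
	}
	if nowUnix >= c.Exp {
		return "", errors.New("token expired")
	}
	return c.Sub, nil
}

func main() {
	secret := []byte("your_jwt_secret")
	tok, err := issueToken("demo-user", secret, time.Now().Add(time.Hour).Unix())
	if err != nil {
		fmt.Println("issue failed:", err)
		return
	}
	sub, err := verifyToken(tok, secret, time.Now().Unix())
	fmt.Println(sub, err)
}
```

Passing the current time in as a parameter keeps expiry checks deterministic in tests.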
To run this project locally, follow these steps:
- Clone the repository:
    git clone https://github.com/TheLickIn13Keys/omi-friend.git
    cd omi-friend
- Set up the backend:
    cd backend
    go mod download
- Set up environment variables: Create a .env file in the backend directory with the following variables:
    MONGO_URI=your_mongodb_connection_string
    JWT_SECRET=your_jwt_secret
- Start the backend server:
    go run main.go
- Set up the frontend:
    cd ../frontend
    npm install
- Start the frontend development server:
    npm run dev
- Open your browser and navigate to http://localhost:3000
Note: You'll need to have Go, Node.js, and MongoDB installed on your system.
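If the backend fails to start, a missing environment variable is a likely culprit. The sketch below shows one way a Go service could validate `MONGO_URI` and `JWT_SECRET` at startup. It assumes the variables are already present in the process environment; how the project actually loads the .env file is not described here, and the `Config` type and `loadConfig` function are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Config holds the two settings named in the .env example.
type Config struct {
	MongoURI  string
	JWTSecret string
}

// loadConfig reads the settings through getenv (normally os.Getenv)
// and reports the first required variable that is missing.
func loadConfig(getenv func(string) string) (Config, error) {
	cfg := Config{MongoURI: getenv("MONGO_URI"), JWTSecret: getenv("JWT_SECRET")}
	if cfg.MongoURI == "" {
		return Config{}, errors.New("MONGO_URI is not set")
	}
	if cfg.JWTSecret == "" {
		return Config{}, errors.New("JWT_SECRET is not set")
	}
	return cfg, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Fprintln(os.Stderr, "config error:", err)
		os.Exit(1)
	}
	fmt.Println("config loaded; Mongo URI length:", len(cfg.MongoURI))
}
```

Failing fast with a named variable in the error message is usually easier to debug than a connection error later on.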
All contributions are welcome! There is no contribution guide at the moment.
This project is licensed under the MIT License - see the LICENSE file for details.
- Omi for the inspiration and raw audio data
- Gladia for their transcription API (currently used, to be replaced)
- OpenAI for Whisper and ChatGPT (future implementation)
Made with ❤️ by Bardia Anvari for Aggie Works!