This project is an intelligent voice-based chat system that leverages AI to analyze user speech and respond in real-time using natural language. It integrates voice recognition, speech-to-text processing, and conversational AI models to enable seamless human-computer interaction through spoken dialogue.

HyperBuildX/Voice-AI-Agent

Multi-LLM-Agent-With_RAG (🤖)

This is a Retrieval-Augmented Generation (RAG)-based multi-modal AI assistant that gives context-aware responses to text, image, code, and voice input. The model wrappers live in the models/ directory listed under Project Structure: llama.py, phi_vision.py, granite.py, and whisper_asr.py.

Features

  • Text Assistance: Handle general text-based queries.
  • Code Assistance: Provide coding assistance and help with code-related queries.
  • Image Analysis: Analyze and describe images.
  • Voice Recognition: Convert spoken language into text.
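
One way to picture how the four features fit together is a small dispatcher that sends each input to a handler based on its type. The sketch below is illustrative only: `Query`, `route`, and the placeholder handlers are hypothetical names, not taken from this repository, and the real application may wire its chains differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Query:
    kind: str     # "text", "code", "image", or "voice"
    payload: str  # the query text, or a file path for image/voice input


# Placeholder handlers (hypothetical). A real app would call its
# language, code, vision, and speech models here instead.
def handle_text(q: Query) -> str:
    return f"[text assistant] {q.payload}"


def handle_code(q: Query) -> str:
    return f"[code assistant] {q.payload}"


def handle_image(q: Query) -> str:
    return f"[vision assistant] describing {q.payload}"


def handle_voice(q: Query) -> str:
    return f"[speech recognition] transcribing {q.payload}"


HANDLERS: Dict[str, Callable[[Query], str]] = {
    "text": handle_text,
    "code": handle_code,
    "image": handle_image,
    "voice": handle_voice,
}


def route(query: Query) -> str:
    """Send the query to the handler registered for its kind."""
    try:
        handler = HANDLERS[query.kind]
    except KeyError:
        raise ValueError(f"unsupported input kind: {query.kind!r}") from None
    return handler(query)
```

Keeping the mapping in a dictionary means a new input type only needs one new handler and one new entry.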

Project Structure

Generative agent/
├── models/
│   ├── llama.py
│   ├── phi_vision.py
│   ├── granite.py
│   └── whisper_asr.py
├── chains/
│   ├── language_assistant.py
│   ├── code_assistant.py
│   └── vision_assistant.py
├── utils/
│   └── image_processor.py
├── agent/
│   ├── tools/
│   │   └── uml_to_code.py
│   ├── prompt_templates.py
│   └── llm_agent.py
└── app.py

Getting Started

Prerequisites

  • Python 3.8 or higher
  • Streamlit
  • Required Python packages listed in requirements.txt

Installation

  1. Clone the repository:

    git clone https://github.com/ganeshnehru/RAG-Multi-Modal-Generative-AI-Agent.git
    cd RAG-Multi-Modal-Generative-AI-Agent
  2. Create a virtual environment and activate it:

    python3 -m venv venv
    source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  3. Install the required packages:

    pip install -r requirements.txt
  4. Set up environment variables:

    • Create a .env file in the root directory.
    • Add your NVIDIA_API_KEY and OPENAI_API_KEY.
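
Before launching the app, it can help to confirm that the .env file actually defines both keys. The following is a minimal, standard-library-only sketch; `parse_env_file` and `missing_keys` are hypothetical helpers written for this check, and how the application itself loads the file is not shown in this README.

```python
from pathlib import Path
from typing import Dict, List

# The two keys named in the setup step above.
REQUIRED_KEYS = ("NVIDIA_API_KEY", "OPENAI_API_KEY")


def parse_env_file(path: str) -> Dict[str, str]:
    """Read simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not values.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    if not env_path.exists():
        print("No .env file found in the current directory.")
    else:
        missing = missing_keys(parse_env_file(str(env_path)))
        print("Missing keys:", ", ".join(missing) if missing else "none")
```

The parser only understands plain `KEY=VALUE` lines; it is meant as a quick sanity check, not a full .env implementation.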

Running the Application

  1. Run the Streamlit application:

    streamlit run app.py
  2. Open your browser and navigate to the provided URL to interact with Agent-Nesh.

Usage

  • Text Queries: Type your text queries in the provided input box and get responses from the language model.
  • Code Assistance: Enter your coding queries to receive code assistance.
  • Image Analysis: Upload images for analysis and description.
  • Voice Input: Use the voice input feature to transcribe spoken language into text.
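
The voice path can be read as two steps: transcribe the audio, then treat the transcript like any other text query. The sketch below assumes that chaining; `answer_spoken_query` and its two callables are hypothetical stand-ins, not functions from this repository.

```python
from typing import Callable, Dict


def answer_spoken_query(
    audio_path: str,
    transcribe: Callable[[str], str],
    answer: Callable[[str], str],
) -> Dict[str, str]:
    """Transcribe an audio file, then pass the transcript to a text handler.

    `transcribe` and `answer` are injected so that any speech-to-text
    model and any language model can be plugged in.
    """
    transcript = transcribe(audio_path).strip()
    if not transcript:
        # Nothing recognizable was said, so skip the language model call.
        return {"transcript": "", "response": "No speech detected."}
    return {"transcript": transcript, "response": answer(transcript)}


if __name__ == "__main__":
    # Stub callables for demonstration only.
    result = answer_spoken_query(
        "question.wav",
        transcribe=lambda path: "what is retrieval augmented generation",
        answer=lambda text: f"You asked: {text}",
    )
    print(result["transcript"])
    print(result["response"])
```

Passing the models in as callables keeps the function easy to test without loading any real model.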
