Skip to content

neatnettech/ai_voice_cloning

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

AI Voice Cloning

alpha - wip

This project aims to clone a voice and use it to generate speech from written text. The goal is to create a proof of concept (PoC) where a user's voice is cloned and used to read text input.

Table of Contents

Introduction

This project demonstrates how to clone a voice and generate speech using that cloned voice. The core idea is to train a Text-to-Speech (TTS) model on a dataset of recorded voice and use it to synthesize speech from text.

Features

  • Voice Cloning
  • Text-to-Speech (TTS) Conversion
  • High-quality audio output
  • Customizable and scalable deployment

Technology Stack

  • Programming Language: Python
  • Deep Learning Frameworks: PyTorch or TensorFlow
  • Models:
    • Tacotron 2: For TTS
    • WaveGlow/MelGAN: For vocoder
  • Pre-built Libraries/Frameworks:
    • Real-Time-Voice-Cloning or Coqui TTS

Installation

  1. Clone the repository:

    git clone https://github.com/yourusername/ai-voice-cloning.git
    cd ai-voice-cloning
  2. Install the required Python packages:

    pip install -r requirements.txt
  3. Set up any additional dependencies as required by the chosen framework or model.

Usage

  1. Data Collection:

    • Record voice samples and save them in the data/raw directory.
    • Ensure each audio file is transcribed and aligned with the text.
  2. Preprocess Data:

    • Run the preprocessing script to normalize audio and extract features.
    python preprocess.py

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages