Skip to content

RohanM/onesided

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

One-sided

One-sided is a tool to process spoken audio (eg. podcasts) and selectively remove specific speakers. It's ideal for when you want to eliminate extraneous chatter, enabling you to focus on the crux of an interview.

Install

# Install ffmpeg
apt-get install ffmpeg # Ubuntu
brew install ffmpeg # MacOS

# Install python packages and open shell
poetry install
poetry shell

Getting started

Download your podcast audio from YouTube:

yt-dlp -xv --audio-format wav --restrict-filenames -o "%(title)s.%(ext)s" -- https://www.youtube.com/watch?v=xxxx

Segment the audio by speaker, producing a CSV. This process is compute-intensive, and performs best with a GPU:

python diarise.py my-podcast.wav

# outputs my-podcast.ds.csv

Produce sample audio to determine the identity of each speaker:

python remix.py --speakers --ds my-podcast.ds.csv --input my-podcast.wav
# outputs my-podcast.speakers.wav

Listen to the sample audio. This will consist of a segment for each speaker, separated by a chime. Make note of which speakers you'd like to retain and/or remove.

Output the processed audio:

# Include only specified speakers
python remix.py --ds my-podcast.ds.csv --input my-podcast.wav --include-speakers=1,3

# Exclude some speakers, retain the others:
python remix.py --ds my-podcast.ds.csv --input my-podcast.wav --exclude-speakers=2,4

# Either will output my-podcast.onesided.wav

Credits

About

Extract a single speaker from a podcast

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages