[CONTRIBUTION] Speech Dataset Generator #534

davidmartinrius · 2024-02-23T18:48:06Z

Hi everyone!

You can now create speech datasets automatically from any audio file or list of audio files.

I hope you find it useful.

Here are the key functionalities of the project:

Dataset Generation: Creates datasets that include a Mean Opinion Score (MOS).
Silence Removal: Removes silences from audio files to improve overall quality.
Sound Quality Improvement: Enhances the quality of the audio.
Audio Segmentation: Splits audio files into segments whose length falls within a specified range of seconds.
Transcription: Transcribes each audio segment to text.
Gender Identification: Identifies the gender of each speaker in the audio.
Pyannote Embeddings: Uses pyannote embeddings to detect speakers across multiple audio files.
Automatic Speaker Naming: Automatically assigns names to speakers detected across multiple audio files.
Multiple Speaker Detection: Detects multiple speakers within each audio file.
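
To give a rough idea of what the silence removal and segmentation steps involve, here is a minimal sketch in plain Python. It is not the project's actual code: the function names (`find_voiced_regions`, `segment_regions`), the RMS energy threshold, and the 20 ms frame size are all assumptions chosen for illustration, and a real pipeline would more likely rely on a voice activity detection model.

```python
import math


def rms(frame):
    """Root-mean-square energy of a list of float samples in [-1.0, 1.0]."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def find_voiced_regions(samples, sample_rate, frame_ms=20, threshold=0.01):
    """Return (start, end) sample indices of runs of non-silent frames.

    Hypothetical helper: a frame counts as silent when its RMS energy is
    below `threshold` (an assumed value, not taken from the project).
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    regions = []
    start = None
    for i in range(0, len(samples), frame_len):
        voiced = rms(samples[i:i + frame_len]) >= threshold
        if voiced and start is None:
            start = i                      # a voiced run begins
        elif not voiced and start is not None:
            regions.append((start, i))     # the voiced run ends
            start = None
    if start is not None:
        regions.append((start, len(samples)))
    return regions


def segment_regions(regions, sample_rate, min_s, max_s):
    """Cut voiced regions into segments between min_s and max_s seconds long.

    Leftover pieces shorter than min_s are dropped.
    """
    min_len = int(min_s * sample_rate)
    max_len = int(max_s * sample_rate)
    segments = []
    for start, end in regions:
        pos = start
        while end - pos >= min_len:
            seg_end = min(pos + max_len, end)
            segments.append((pos, seg_end))
            pos = seg_end
    return segments


# Synthetic signal at 1 kHz: 1 s of silence, 5 s of "speech", 1 s of silence.
sample_rate = 1000
samples = [0.0] * 1000 + [0.5] * 5000 + [0.0] * 1000

regions = find_voiced_regions(samples, sample_rate)
segments = segment_regions(regions, sample_rate, min_s=1.0, max_s=2.0)
print(regions)   # [(1000, 6000)]
print(segments)  # [(1000, 3000), (3000, 5000), (5000, 6000)]
```

Each resulting segment could then be passed on to the transcription and speaker identification steps.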