Speech-Text Alignment

Scripts to align speech audio with its text transcription in time.

READMEs for individual scripts and modules are found in their respective directories.

Setup

Clone Repository

git clone git@github.com:anilkeshwani/speech-text-alignment.git &&
    cd speech-text-alignment &&
    git submodule update --init --recursive --progress

Set Up Environment

Ensure the necessary binary requirements are installed:

apt install sox ffmpeg
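
As a quick sanity check after installation, a sketch like the following can confirm that both binaries are on the PATH. The helper name `check_binaries` is hypothetical and is not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): report any of the
# given binaries that are missing from PATH.
# Returns 0 if all are found, 1 if at least one is missing.
check_binaries() {
    missing=0
    for bin in "$@"; do
        if ! command -v "$bin" >/dev/null 2>&1; then
            echo "missing: $bin" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Check the two binary requirements listed above.
if check_binaries sox ffmpeg; then
    echo "sox and ffmpeg found"
fi
```

Any binary reported as missing should be installed before continuing with the environment setup.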

Install the package together with all of its dependencies, including the development dependencies, which are selected via the "dev" option to pip install.

conda create -n sardalign python=3.10.6 -y &&
    conda activate sardalign &&
    pip install pip==24.0 &&
    pip install -e ".[dev]" &&
    pre-commit install --install-hooks

Note: The fairseq MMS README calls for installing the dataclasses library. We skip this step because dataclasses ships out of the box with Python 3.10.6.

Note: When running on Artemis / Poseidon, ensure support for CUDA is provided.

At the time of writing, NVIDIA / CUDA drivers were:

NVIDIA-SMI: 525.89.02
Driver Version: 525.89.02
CUDA Version: 12.0
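
To compare a machine against the versions above, a minimal sketch such as the following prints the installed driver version. It assumes `nvidia-smi` is the tool used to inspect the driver; the helper name `report_nvidia_driver` is hypothetical and is not part of this repository.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): print the installed
# NVIDIA driver version, or a notice if nvidia-smi is not on PATH.
report_nvidia_driver() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Prints one driver version line per visible GPU.
        nvidia-smi --query-gpu=driver_version --format=csv,noheader
    else
        echo "nvidia-smi not found; CUDA support is likely unavailable"
        return 1
    fi
}

# Do not abort the calling shell if no driver is present.
report_nvidia_driver || true
```

Running plain `nvidia-smi` additionally shows the CUDA version in its header.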

Data Processing and Performing Tasks

Documentation for performing data processing steps or tasks is found in scripts/README.md.
