Montreal Forced Aligner -- Dutch

This repository applies the Montreal Forced Aligner (MFA) to Dutch. The my_corpus directory contains the input corpus, and my_corpus_aligned contains the alignment results.

Note: example-0001 and example-0002 were generated by Whisper; example-0003 was annotated manually.

Installation

See the official installation guide for details: https://montreal-forced-aligner.readthedocs.io/en/latest/installation.html#general-installation

conda create -n aligner -c conda-forge montreal-forced-aligner
conda activate aligner
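
Before moving on, it can help to confirm that the mfa executable is reachable from the activated environment. The helper below is a minimal sketch; check_mfa is a hypothetical name, not part of MFA.

```shell
# check_mfa (hypothetical helper): print the MFA version if the `mfa`
# executable is on PATH, otherwise print a hint and return non-zero.
check_mfa() {
  if command -v mfa >/dev/null 2>&1; then
    mfa version
  else
    echo "mfa not found on PATH; is the aligner environment active?" >&2
    return 1
  fi
}
```

Running check_mfa after conda activate aligner should print a version string if the installation succeeded.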

Download Pretrained Models

# Usage: mfa model download [OPTIONS] [TYPE] [MODEL_NAME]
mfa model download --ignore_cache acoustic dutch_cv
mfa model download --ignore_cache dictionary dutch_cv

Train G2P Model

# Usage: mfa train_g2p [OPTIONS] DICTIONARY_PATH OUTPUT_MODEL_PATH
mfa train_g2p --clean -j 16 dutch_cv ~/Documents/MFA/btamm/g2p/dutch_cv.zip
# mfa train_g2p --clean --phonetisaurus -j 16 dutch_cv ~/Documents/MFA/btamm/g2p/dutch_cv_phonetisaurus.zip --alignment_separator "°"

The --alignment_separator "°" argument avoids the following error, although I haven't tested whether the resulting model works properly:

The symbol ";" is reserved for "alignment_separator", but is found in the graphemes or phonemes of your dictionary.
Please re-run and specify another symbol that is not used in your dictionary with the "--alignment_separator" flag.
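
One way to pick a safe separator is to check that the candidate symbol does not occur anywhere in the dictionary file. The sketch below does this with grep; separator_is_free is a hypothetical helper, and the dictionary path in the usage note is an assumption about where MFA stores downloaded models. Checking the whole file (not only the grapheme and phoneme columns) is deliberately conservative.

```shell
# separator_is_free (hypothetical helper): succeed (exit 0) only if the
# candidate separator string occurs nowhere in the dictionary file.
separator_is_free() {
  local sep="$1" dict="$2"
  if [ ! -r "$dict" ]; then
    echo "cannot read $dict" >&2
    return 2
  fi
  # -F: match the separator literally; -q: report via exit status only
  if grep -qF -- "$sep" "$dict"; then
    return 1
  fi
  return 0
}
```

For example, separator_is_free "°" ~/Documents/MFA/pretrained_models/dictionary/dutch_cv.dict returns success if "°" is unused, assuming the dictionary was saved at that location.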

Validate Model on Corpus

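This step finds out-of-vocabulary (OOV) words in the corpus, i.e. words that appear in the transcripts but are missing from the dictionary.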

# Usage: mfa validate [OPTIONS] CORPUS_DIRECTORY DICTIONARY_PATH
mfa validate --clean --ignore_acoustics ./my_corpus dutch_cv
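
To get a quick overview of what validation found, the OOV list can be counted and previewed. This is a rough sketch that assumes the OOV file holds one word per line; summarize_oovs is a hypothetical helper.

```shell
# summarize_oovs (hypothetical helper): print the number of non-empty
# lines in the OOV file, followed by the first ten entries.
summarize_oovs() {
  local oov_file="$1" count
  if [ ! -r "$oov_file" ]; then
    echo "cannot read $oov_file" >&2
    return 2
  fi
  # NF is non-zero only for lines that contain at least one field
  count=$(awk 'NF { n++ } END { print n + 0 }' "$oov_file")
  echo "$count OOV word(s)"
  awk 'NF' "$oov_file" | head -n 10
}
```

With the paths used in this README, the call would be summarize_oovs ~/Documents/MFA/my_corpus/oovs_found_dutch_cv.txt.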

Generate OOV Pronunciations

# Usage: mfa g2p [OPTIONS] INPUT_PATH G2P_MODEL_PATH OUTPUT_PATH
mfa g2p --clean ~/Documents/MFA/my_corpus/oovs_found_dutch_cv.txt ~/Documents/MFA/btamm/g2p/dutch_cv.zip ~/Documents/MFA/my_corpus/g2p_oovs.txt --dictionary_path dutch_cv
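
Before adding the generated entries to the dictionary, it may be worth checking that every OOV word actually received a pronunciation. The sketch below assumes the G2P output is tab-separated (word, then pronunciation) and that the OOV file has one word per line; missing_pronunciations is a hypothetical helper.

```shell
# missing_pronunciations (hypothetical helper): print every word from the
# OOV list that has no non-empty pronunciation in the G2P output.
missing_pronunciations() {
  local oov_file="$1" g2p_file="$2"
  awk -F'\t' '
    # First file (G2P output): remember words that have a pronunciation.
    FILENAME == ARGV[1] { if (NF >= 2 && $2 != "") have[$1] = 1; next }
    # Second file (OOV list): print non-empty words that were not seen.
    NF && !($1 in have)
  ' "$g2p_file" "$oov_file"
}
```

An empty result from missing_pronunciations ~/Documents/MFA/my_corpus/oovs_found_dutch_cv.txt ~/Documents/MFA/my_corpus/g2p_oovs.txt would suggest that all OOV words were covered.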

Add Pronunciations To Dictionary

# Usage: mfa model add_words [OPTIONS] DICTIONARY_PATH NEW_PRONUNCIATIONS_PATH
mfa model add_words --clean dutch_cv ~/Documents/MFA/my_corpus/g2p_oovs.txt

Align Corpus

# Usage: mfa align [OPTIONS] CORPUS_DIRECTORY DICTIONARY_PATH ACOUSTIC_MODEL_PATH OUTPUT_DIRECTORY
mfa align --clean --output_format json ./my_corpus dutch_cv dutch_cv ./my_corpus_aligned
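
After alignment, a simple sanity check is to list input audio files that have no corresponding output file. The sketch below rests on two assumptions that should be verified against the actual corpus: the audio files end in .wav, and the output directory mirrors the relative paths of the input with a .json extension. unaligned_files is a hypothetical helper.

```shell
# unaligned_files (hypothetical helper): list .wav files under the corpus
# directory that have no matching .json file under the output directory.
unaligned_files() {
  local corpus="${1%/}" aligned="${2%/}" wav rel
  find "$corpus" -type f -name '*.wav' | sort | while IFS= read -r wav; do
    rel="${wav#"$corpus"/}"   # path relative to the corpus root
    if [ ! -f "$aligned/${rel%.wav}.json" ]; then
      echo "$rel"
    fi
  done
}
```

For this repository the call would be unaligned_files ./my_corpus ./my_corpus_aligned; no output means every .wav file has a matching .json file under those assumptions.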
