ThokNaath is an open-source project aimed at developing Text-to-Speech (TTS) and Speech-to-Text (STT) systems for the Nuer language. The goal is to empower communication, education, and cultural preservation by making Nuer language technology accessible to everyone.
- data/: Datasets and preprocessing scripts.
- models/: Pre-trained and fine-tuned models.
- notebooks/: Jupyter notebooks for experimentation.
- src/: Source code for TTS and STT systems.
- tests/: Unit and integration tests.
- docs/: Project documentation.
The repository is organized as follows to support iterative development for both TTS and STT:
thoknaath/
├── .github/  # GitHub-specific files
│   ├── workflows/  # CI/CD pipelines (e.g., testing, deployment)
│   ├── ISSUE_TEMPLATE/  # Templates for GitHub issues
│   └── PULL_REQUEST_TEMPLATE.md  # Template for PRs
├── data/  # Datasets and preprocessing scripts
│   ├── tts/  # TTS-specific data
│   │   ├── raw/  # Raw text and audio files
│   │   ├── processed/  # Cleaned and preprocessed data
│   │   ├── metadata.csv  # LJSpeech-style metadata
│   │   └── preprocess.py  # Script for preprocessing TTS data
│   └── stt/  # STT-specific data
│       ├── raw/  # Raw audio and transcriptions
│       ├── processed/  # Cleaned and preprocessed data
│       ├── metadata.csv  # CommonVoice-style metadata
│       └── preprocess.py  # Script for preprocessing STT data
├── models/  # Pre-trained and fine-tuned models
│   ├── tts/  # TTS models
│   │   ├── pretrained/  # Pre-trained TTS models
│   │   └── finetuned/  # Fine-tuned Nuer TTS models
│   └── stt/  # STT models
│       ├── pretrained/  # Pre-trained STT models
│       └── finetuned/  # Fine-tuned Nuer STT models
├── notebooks/  # Jupyter notebooks for experimentation
│   ├── tts/  # TTS notebooks
│   │   ├── data_exploration.ipynb  # Explore TTS dataset
│   │   └── model_training.ipynb  # Train TTS models
│   └── stt/  # STT notebooks
│       ├── data_exploration.ipynb  # Explore STT dataset
│       └── model_training.ipynb  # Train STT models
├── src/  # Source code
│   ├── tts/  # TTS-related scripts
│   │   ├── preprocess.py  # Data preprocessing for TTS
│   │   ├── train.py  # Model training for TTS
│   │   ├── api.py  # TTS API
│   │   └── utils.py  # Utility functions for TTS
│   └── stt/  # STT-related scripts
│       ├── preprocess.py  # Data preprocessing for STT
│       ├── train.py  # Model training for STT
│       ├── api.py  # STT API
│       └── utils.py  # Utility functions for STT
├── tests/  # Unit and integration tests
│   ├── tts/  # TTS tests
│   │   ├── test_preprocess.py  # Test TTS preprocessing
│   │   ├── test_train.py  # Test TTS training
│   │   └── test_api.py  # Test TTS API
│   └── stt/  # STT tests
│       ├── test_preprocess.py  # Test STT preprocessing
│       ├── test_train.py  # Test STT training
│       └── test_api.py  # Test STT API
├── docs/  # Documentation
│   ├── setup.md  # Setup instructions
│   ├── tts_guide.md  # TTS development guide
│   ├── stt_guide.md  # STT development guide
│   ├── contributing.md  # Contribution guidelines
│   └── roadmap.md  # Project roadmap
├── README.md  # Project overview
├── requirements.txt  # Python dependencies
└── LICENSE  # Project license (e.g., MIT)
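The tree describes data/tts/metadata.csv as LJSpeech-style. In the LJSpeech convention each row is pipe-delimited with no header: a clip ID, the raw transcription, and optionally a normalized transcription. The sketch below shows one way such a file could be read. The function name `read_ljspeech_metadata`, the `wav_dir` parameter, and the `<wav_dir>/<id>.wav` layout are assumptions for illustration, not something this project confirms, and the sample rows are English placeholders rather than real Nuer data.

```python
import csv
import io
from pathlib import Path
from typing import Dict, Iterable, List


def read_ljspeech_metadata(lines: Iterable[str], wav_dir: str = "wavs") -> List[Dict[str, str]]:
    """Parse pipe-delimited rows of the form id|text[|normalized_text]."""
    # QUOTE_NONE treats quote characters inside transcriptions as plain text.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    entries = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        clip_id, text = row[0].strip(), row[1].strip()
        # Fall back to the raw text when no normalized column is present.
        normalized = row[2].strip() if len(row) > 2 and row[2].strip() else text
        entries.append({
            "id": clip_id,
            "audio_path": str(Path(wav_dir) / f"{clip_id}.wav"),  # assumed layout
            "text": text,
            "normalized_text": normalized,
        })
    return entries


if __name__ == "__main__":
    # Placeholder rows; a real file would be opened with encoding="utf-8".
    sample = io.StringIO(
        "clip_0001|Example sentence one.|example sentence one\n"
        "clip_0002|Example sentence two.\n"
    )
    for entry in read_ljspeech_metadata(sample):
        print(entry)
```

Passing an open file object works the same way as the `StringIO` sample, since `csv.reader` accepts any iterable of lines.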
- Python 3.8 or higher
- Git
- A GPU (recommended for training models)
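A quick way to check these prerequisites from Python is sketched below. The function name `check_prerequisites` is hypothetical, and the PyTorch check is an assumption based on PyTorch being named among the tools this project builds on; it only reports whether the package is importable, not whether a GPU is present.

```python
import importlib.util
import shutil
import sys
from typing import Dict


def check_prerequisites() -> Dict[str, bool]:
    """Report which prerequisites appear to be satisfied on this machine."""
    return {
        # The README asks for Python 3.8 or higher.
        "python_ok": sys.version_info >= (3, 8),
        # Git must be on PATH to clone the repository.
        "git_ok": shutil.which("git") is not None,
        # Assumption: PyTorch is used for training; this does not detect a GPU.
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

Once PyTorch is installed, `torch.cuda.is_available()` reports whether a CUDA-capable GPU can be used.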
- Clone the repository:
git clone https://github.com/thai22/thoknaath.git
- Install dependencies:
pip install -r requirements.txt
- For TTS, follow the TTS Guide (docs/tts_guide.md).
- For STT, follow the STT Guide (docs/stt_guide.md).
Follow the guides in docs/ to set up the TTS and STT systems.
- Preprocess the data:
python src/tts/preprocess.py
- Train the model:
python src/tts/train.py
- Run the TTS API:
python src/tts/api.py
- Send a POST request to /tts with Nuer text to synthesize speech.
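As a rough sketch of the last step, the client below builds and sends a POST request to /tts using only the standard library. The README does not specify the host, port, payload shape, or response format, so the base URL `http://localhost:8000`, the JSON body with a `text` field, and the assumption that the response body is raw audio are all placeholders to adjust against the actual API in src/tts/api.py.

```python
import json
import urllib.request


def build_tts_request(text: str, base_url: str = "http://localhost:8000") -> urllib.request.Request:
    """Build a POST request for /tts. The JSON payload shape is an assumption."""
    # ensure_ascii=False keeps non-ASCII Nuer characters readable in the payload.
    body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/tts",
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


def synthesize(text: str, out_path: str = "output.wav",
               base_url: str = "http://localhost:8000") -> str:
    """Send the request and save the response body, assumed to be audio bytes."""
    request = build_tts_request(text, base_url)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With the API running, `synthesize("<Nuer text>")` would write the returned audio to `output.wav`, provided the server matches the assumptions above.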
- Preprocess the data:
python src/stt/preprocess.py
- Train the model:
python src/stt/train.py
- Run the STT API:
python src/stt/api.py
- Send a POST request to /stt with an audio file to transcribe speech.
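The upload step could look roughly like the sketch below, which encodes an audio file as multipart/form-data with the standard library and posts it to /stt. The README does not say how the file should be sent, so the base URL `http://localhost:8000`, the form field name `file`, the `audio/wav` content type, and a UTF-8 text response are assumptions to check against src/stt/api.py.

```python
import uuid
import urllib.request


def build_stt_request(audio_bytes: bytes, filename: str = "clip.wav",
                      base_url: str = "http://localhost:8000",
                      field_name: str = "file") -> urllib.request.Request:
    """Build a multipart/form-data POST for /stt. Field name is an assumption."""
    boundary = uuid.uuid4().hex  # random delimiter between form parts
    disposition = f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"'
    body = b"\r\n".join([
        f"--{boundary}".encode("ascii"),
        disposition.encode("utf-8"),
        b"Content-Type: audio/wav",
        b"",  # blank line separates part headers from content
        audio_bytes,
        f"--{boundary}--".encode("ascii"),  # closing delimiter
        b"",
    ])
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/stt",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(audio_path: str, base_url: str = "http://localhost:8000") -> str:
    """Upload an audio file and return the response body as text."""
    with open(audio_path, "rb") as handle:
        request = build_stt_request(handle.read(), filename=audio_path, base_url=base_url)
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8")
```

If the API expects raw audio bytes or a JSON payload instead of a form upload, only `build_stt_request` needs to change.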
- Add raw text and audio files to data/tts/raw/ and data/stt/raw/.
- Use the preprocessing scripts (preprocess.py) to clean and prepare the data.
- Use the training scripts (train.py) to fine-tune pre-trained models on the Nuer dataset.
- Monitor training progress using TensorBoard or logs.
- Run unit tests for TTS and STT:
pytest tests/tts/
pytest tests/stt/
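For contributors new to pytest, the sketch below shows the general shape of a test file it would collect, such as tests/tts/test_preprocess.py. The `normalize_text` helper is hypothetical and defined inline so the example is self-contained; the project's real preprocessing functions are not described in this README. Unicode NFC normalization is included on the assumption that text with diacritics may arrive in a mix of composed and decomposed forms.

```python
import unicodedata


def normalize_text(text: str) -> str:
    """Hypothetical helper: NFC-normalize and collapse runs of whitespace."""
    return " ".join(unicodedata.normalize("NFC", text).split())


def test_collapses_whitespace():
    # Leading, trailing, and repeated whitespace reduce to single spaces.
    assert normalize_text("  a \t b\n") == "a b"


def test_composes_decomposed_characters():
    decomposed = "a\u0308"  # 'a' followed by a combining diaeresis
    # NFC merges the pair into the single precomposed code point U+00E4.
    assert normalize_text(decomposed) == "\u00e4"
```

pytest discovers functions whose names start with `test_` in files named `test_*.py`, so no extra registration is needed.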
We welcome contributions! Please read our Contribution Guidelines (docs/contributing.md) for details on how to:
- Report issues.
- Submit pull requests.
- Add new features or improve documentation.
Check out our Roadmap (docs/roadmap.md) to see the planned features and milestones for ThokNaath.
This project is licensed under the MIT License. See LICENSE for details.
- The Nuer-speaking community for their support and contributions.
- Open-source tools like Coqui TTS, Hugging Face, and PyTorch for enabling this project.
- For questions or feedback, please open an issue or contact the project maintainers.