My pet project to generate bilingual sentence pairs for training corpus-based machine translation systems. Previously hosted on Bitbucket to ensure anonymous conference submissions.

ypeels/bitext-maker

This is an attempt to generate multilingual parallel corpora from human-scalable multilingual resources using Python 3.

  • main.py: run the generator
  • check*.py: various data set checks
  • template_maker.py: pre-generate custom templates from a more concise format
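
To make the idea of generating parallel sentences from templates concrete, here is a minimal sketch in Python 3. It is not taken from this repository: the template syntax, the lexicon layout, the language codes, and the `generate_pairs` helper are all hypothetical and chosen only for illustration. The actual formats used by main.py and template_maker.py may differ substantially.

```python
import itertools

# Hypothetical data: neither the template syntax nor the lexicon layout
# comes from this repository; they are assumptions for illustration only.
TEMPLATES = {"en": "the {noun} {verb}", "es": "el {noun} {verb}"}
LEXICON = {
    "noun": [{"en": "dog", "es": "perro"}, {"en": "cat", "es": "gato"}],
    "verb": [{"en": "sleeps", "es": "duerme"}, {"en": "eats", "es": "come"}],
}


def generate_pairs(templates, lexicon, src="en", tgt="es"):
    """Yield (source, target) sentences for every combination of slot fillers."""
    slots = sorted(lexicon)  # fixed slot order keeps the output deterministic
    for choice in itertools.product(*(lexicon[s] for s in slots)):
        fill = dict(zip(slots, choice))
        # Fill the same slots in both languages so the two sentences stay aligned.
        src_sentence = templates[src].format(**{s: e[src] for s, e in fill.items()})
        tgt_sentence = templates[tgt].format(**{s: e[tgt] for s, e in fill.items()})
        yield src_sentence, tgt_sentence


pairs = list(generate_pairs(TEMPLATES, LEXICON))
for source, target in pairs:
    print(f"{source}\t{target}")
```

Because both sides are filled from the same lexicon entries, every generated sentence comes with its translation, with no separate alignment step. A real generator would also need to handle agreement, word order, and morphology, which this sketch ignores.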
