Translation

Jonathan Lee edited this page May 26, 2018 · 11 revisions

Malasakit is a multilingual application that was designed to scale well with many languages. This involves translating two classes of text:

  • Comments (also known as ideas or suggestions) should be translated by creating a new comment:
    1. From the admin site home page, click "Add" for the comment model.
    2. Copy all attributes from the original comment.
    3. Translate the message and set the language as needed.
    4. Set the "original" attribute to the comment in its original language.
  • Static text (for instance, questions and instructions on each page) is translated using Django's built-in localization system, which the remainder of this document addresses. Unlike comments, static text is not user-generated, so translation is limited in scope and can be done ahead of time.

Overview

Django extends the GNU xgettext utility to prepare strings in the code and templates for translation. This preparation involves crawling through the source tree, gathering strings marked for translation, and exporting them to text files (called message files) for a translator to edit. Once the translator has finished translating each message, you compile the text files into production-ready data files. These data files are loaded into memory when the application starts, so translation simply becomes a cheap key-value lookup operation.

Enabling a Language

The languages Malasakit supports are listed in malasakit-django/cafe/settings.py under the LANGUAGES setting, like so:

LANGUAGES = (
    ('en', _('English')),
    ('tl', _('Filipino')),
)

Each entry is a tuple of two elements: a language code (such as en) and a translated version of the language's full name. (See below for why the language names are wrapped with the _(...) function.) To begin adding a new language, add another entry to LANGUAGES, then run

$ make preparetrans

from the top of the repository to generate the desired message files. Then, run

$ python manage.py makemigrations
$ python manage.py migrate

from malasakit-django. This will update the language fields of the Comment and Respondent models to be compatible with the new LANGUAGES.

Marking Strings for Translation

In Python sources, strings are marked for translation by wrapping them inside a function, such as ugettext or ugettext_lazy (the u prefix stands for Unicode). These functions are often aliased as _ for brevity, so you will often see code such as

from django.utils.translation import ugettext_lazy as _

...

translated_hello_world = _('Hello world')

Once Django detects what language a user wants to use, the _ function will return the appropriate translation. For Malasakit, the preferred language is detected by examining the URL for a language code. (For instance, /tl/landing/ tells all calls to _ to return the Tagalog translation for the given strings.)

Similarly, for templates, strings are marked for translation with Django-specific tags:

  • Short static strings are often marked like {% trans 'Hello world' %}.
  • Longer strings or strings containing placeholders for input may be marked as
{% blocktrans trimmed %}
  {{ count }} have responded.
{% endblocktrans %}

Note that Django considers English the "base language". You can think of each language as having a key-value store that maps English phrases to phrases in that language. Then, at runtime, ugettext uses the English phrase argument to pick the translation out of the key-value store of the inferred language.
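The two ideas above (language detection from the URL, then a key-value lookup) can be sketched in plain Python. This is only an illustration: language_from_path, translate, and CATALOGS are hypothetical names, the toy catalog holds a single entry borrowed from the message file example later on this page, and Django ships its own URL-prefix handling and catalog loading, which Malasakit relies on instead of anything like this.

```python
# Hypothetical stand-ins: real catalogs are loaded from compiled message
# files, and Django provides its own URL-prefix handling.
SUPPORTED_LANGUAGES = ('en', 'tl')
BASE_LANGUAGE = 'en'

CATALOGS = {
    'tl': {
        '"%(option)s" is not a valid option':
            '"%(option)s" ay wala sa mga pagpipilian',
    },
}


def language_from_path(path):
    """Return the language code in the first URL segment, else the base."""
    segments = [segment for segment in path.split('/') if segment]
    if segments and segments[0] in SUPPORTED_LANGUAGES:
        return segments[0]
    return BASE_LANGUAGE


def translate(message, language):
    """Look up the English message; return it unchanged if untranslated."""
    return CATALOGS.get(language, {}).get(message, message)


language = language_from_path('/tl/landing/')
print(translate('"%(option)s" is not a valid option', language))
# "%(option)s" ay wala sa mga pagpipilian
print(translate('Hello world', language))
# Hello world (no entry in the toy catalog, so the English text is returned)
```

Falling back to the English text when no translation exists mirrors how gettext behaves, which is why an untranslated page still renders in the base language.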

Gathering Translatable Strings

Once you are ready to gather the translatable strings, from the repository top level, run

$ make preparetrans

to prepare for translation. This target does several things:

  1. Run a custom manage.py command called makedbtrans. Django, by default, only pulls translatable strings from .py, .html, .txt, and .js files in the source tree. This means that text that lives in the database, such as question prompts, is excluded from the crawling process. makedbtrans was written to address this issue.
  2. Run an augmented version of makemessages, Django's command for crawling the source tree. Malasakit extends makemessages to work with makedbtrans.
  3. Delete intermediary files created from running makedbtrans.

After this process, for every language specified in LANGUAGES except for the base language en, there will be new message files django.po and djangojs.po in malasakit-django/locale/<locale-code>/LC_MESSAGES. (Lists of standard locale codes are specified by ISO.) django.po contains entries for translatable strings from Python source files, templates, and the database, while djangojs.po, as the name implies, contains entries for translatable strings from JavaScript source files. These text files are ready to be handed off to a translator.
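To make the role of database strings more concrete, here is a minimal sketch of how a string stored in the database could be written out as an untranslated message file entry. It is not the makedbtrans implementation, which this page does not show; po_entry and escape are hypothetical helpers, and the Model.field:id location format is simply copied from the sample entries in the next section.

```python
def escape(text):
    """Escape backslashes, double quotes, and newlines for a PO string."""
    return (text.replace('\\', '\\\\')
                .replace('"', '\\"')
                .replace('\n', '\\n'))


def po_entry(location, message):
    """Format one untranslated entry with a location comment."""
    return '#: {}\nmsgid "{}"\nmsgstr ""\n'.format(location, escape(message))


print(po_entry('QuantitativeQuestion.prompt:1',
               'I have suffered the consequence of a typhoon or flood.'))
```

The printed text has the same shape as the first entry in the sample message file shown in the next section.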

Message File Syntax

The format of a new message file looks like this:

#: QuantitativeQuestion.prompt:1
msgid "I have suffered the consequence of a typhoon or flood."
msgstr ""

#: pcari/templates/rate-comments.html:17
msgid ""
"Each icon below is a comment made by another participant. Next to each icon "
"is the topic of the comment, or <span class=\"no-tag\">(?)</span> if no "
"topic was found. Comments that are closer together were authored by "
"participants who answered the quantitative questions similarly."
msgstr ""

#: pcari/models.py:431
#, python-format
msgid "\"%(option)s\" is not a valid option"
msgstr ""

...

The lines starting with # are comments for the translator. Typically, comments denote where the string was found. Here, the first string came from the prompt field of the QuantitativeQuestion instance with an id of 1, the second from line 17 of the rate-comments.html template, and the third from line 431 of pcari/models.py. A comment starting with #, (such as #, python-format) is a flag that describes the entry rather than its location.

The msgid "..." line gives the English text to be translated, and the msgstr "" line should give the translated text of the msgid line it is paired with. Translators should also be aware of the following notes on message file syntax:

  • You will notice that long messages can be broken up onto several separate lines. When the message file is compiled, these lines are concatenated together. For instance, the second msgid in the example above would become
msgid "Each icon below is a comment made by another participant. Next to each icon is the topic of the comment, or <span class=\"no-tag\">(?)</span> if no topic was found. Comments that are closer together were authored by participants who answered the quantitative questions similarly."
  • The second msgid above also contains markup tags: <span class="no-tag">(?)</span> (the \ before " escapes the double-quote). Anything in <...> or </...> should not be translated. However, you should translate the text in between the tags, so for <div id="message">Hello</div>, you should translate Hello while keeping the tags verbatim in the msgstr.

  • The third msgid above contains a format specifier %(option)s. The % starts a placeholder, the (option) gives the placeholder the name of "option", and s means the placeholder is a string. Translators should repeat the placeholder as-is in the msgstr.

  • Take care not to change the msgid because the strings in the source are matched exactly to those in the message file. If a msgid needs to be corrected, update the string in the source or database first, then run make preparetrans again, which will update the message files.
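The note on format specifiers can be checked directly in Python, since %(option)s is ordinary named-placeholder formatting. The sketch below uses the msgid and msgstr from this page; the value 'maybe' and the renamed placeholder opsyon are made up for illustration.

```python
english = '"%(option)s" is not a valid option'
tagalog = '"%(option)s" ay wala sa mga pagpipilian'
values = {'option': 'maybe'}  # hypothetical value supplied by the code

print(english % values)  # "maybe" is not a valid option
print(tagalog % values)  # "maybe" ay wala sa mga pagpipilian

# A translation that renames the placeholder (hypothetical mistake) fails
# when the application tries to fill it in.
broken = '"%(opsyon)s" ay wala sa mga pagpipilian'
try:
    broken % values
except KeyError as error:
    print('missing placeholder value:', error)
```

This is why the placeholder must be repeated as-is: the code supplies a value under the name option, and a msgstr that uses any other name cannot be filled in.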

For the example given at the start of this section, the completed message file for the Tagalog locale, located at malasakit-django/locale/tl/LC_MESSAGES/django.po, should look like this:

#: QuantitativeQuestion.prompt:1
msgid "I have suffered the consequence of a typhoon or flood."
msgstr "Malawak at mabigat ang epekto ng bagyo o pagbabaha sa akin."

#: pcari/templates/rate-comments.html:17
msgid ""
"Each icon below is a comment made by another participant. Next to each icon "
"is the topic of the comment, or <span class=\"no-tag\">(?)</span> if no "
"topic was found. Comments that are closer together were authored by "
"participants who answered the quantitative questions similarly."
msgstr ""
"Ang bawat imahe sa ibaba ay naglalaman ng ibinigay na komento ng ibang kalahok. "
"Sa tabi ng bawat imahe ay ang paksa ng komento, o <span class=\"no-tag\">"
"(?)</span> kung walang natagpuang paksa. Ang mga komentong malapit sa isa’t "
"isa ay ibinigay ng mga kalahok na sumagot sa katulad na mga kuwantitatibong "
"tanong."

#: pcari/models.py:431
#, python-format
msgid "\"%(option)s\" is not a valid option"
msgstr "\"%(option)s\" ay wala sa mga pagpipilian"

...

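Sometimes, an entry will start with the comment #, fuzzy (this sometimes occurs when the msgid has changed slightly and an earlier translation was carried over as a guess). By default, gettext skips fuzzy entries when compiling message files, so the translator needs to review the entry, correct the msgstr if necessary, and then remove the #, fuzzy comment.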
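Before handing a message file back for compilation, it can help to list the entries that still need attention. The following is a rough sketch of such a check, not a tool that ships with Malasakit or Django: find_pending and unquote are hypothetical names, the parser only understands the simple msgid/msgstr layout shown above (no plural forms, and only the \" escape), and the sample marks one entry as fuzzy purely for demonstration.

```python
import re


def unquote(literal):
    """Strip the surrounding double quotes and undo the \\" escape."""
    return literal.strip()[1:-1].replace('\\"', '"')


def find_pending(po_text):
    """Return msgids whose entry is fuzzy or has an empty msgstr."""
    pending = []
    # Entries are separated by blank lines.
    for block in re.split(r'\n\s*\n', po_text.strip()):
        fuzzy = False
        parts = {'msgid': '', 'msgstr': ''}
        current = None
        for line in block.splitlines():
            line = line.strip()
            if line.startswith('#,'):
                fuzzy = fuzzy or 'fuzzy' in line
            elif line.startswith('msgid '):
                current = 'msgid'
                parts[current] = unquote(line[len('msgid '):])
            elif line.startswith('msgstr '):
                current = 'msgstr'
                parts[current] = unquote(line[len('msgstr '):])
            elif line.startswith('"') and current:
                # Continuation lines are concatenated onto the current field.
                parts[current] += unquote(line)
        if parts['msgid'] and (fuzzy or not parts['msgstr']):
            pending.append(parts['msgid'])
    return pending


SAMPLE = r'''
#: QuantitativeQuestion.prompt:1
msgid "I have suffered the consequence of a typhoon or flood."
msgstr "Malawak at mabigat ang epekto ng bagyo o pagbabaha sa akin."

#: pcari/models.py:431
#, fuzzy, python-format
msgid "\"%(option)s\" is not a valid option"
msgstr "\"%(option)s\" ay wala sa mga pagpipilian"
'''

print(find_pending(SAMPLE))  # ['"%(option)s" is not a valid option']
```

Entries with an empty msgid (such as the header that real message files begin with) are ignored, as are lines starting with other comment prefixes.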

Compiling Message Files

Once the translations have been completed, from the top level of the repository, run

$ make compiletrans

For each locale, Django will create compiled django.mo and djangojs.mo files from their respective message files. These compiled files are what Django loads at runtime.
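A compiled .mo file is a standard gettext catalog, so it can also be inspected outside of Django with Python's built-in gettext module. The sketch below is one way to spot-check a compiled file; load_translations is a hypothetical helper, the locale directory is an assumption based on the paths mentioned on this page, and Django performs its own loading at startup rather than calling anything like this.

```python
import gettext


def load_translations(locale_dir, language, domain='django'):
    """Load <locale_dir>/<language>/LC_MESSAGES/<domain>.mo if present."""
    return gettext.translation(
        domain,
        localedir=locale_dir,
        languages=[language],
        fallback=True,  # no .mo file found: msgids are returned unchanged
    )


# Assumed path, relative to the top of the repository.
catalog = load_translations('malasakit-django/locale', 'tl')
print(catalog.gettext('I have suffered the consequence of a typhoon or flood.'))
```

If the compiled file exists and contains the entry, the Tagalog text is printed; otherwise the English msgid comes back unchanged, which is a quick sign that compilation has not been run or the entry is missing.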

Revising Translations

When making message files, Django automatically merges existing translations with any new messages, so rerunning make preparetrans will not overwrite or lose translations you already have. Translations whose msgid no longer appears in the source are kept in the message file but commented out with the #~ prefix.

Additional References