Translation
Malasakit is a multilingual application that was designed to scale well with many languages. This involves translating two classes of text:
- Comments (also known as ideas or suggestions) should be translated by creating a new comment:
- From the admin site home page, click "Add" for the comment model.
- Copy all attributes from the original comment.
- Translate the message and set the language as needed.
- Set the "original" attribute to the comment in its original language.
- Static text (for instance, questions and instructions on each page) is translated using Django's built-in localization system, which the remainder of this document addresses. Unlike comments, static text is not user-generated, so translation is limited in scope and can be done ahead of time.
Django extends the GNU xgettext utility to prepare strings in the code and templates for translation.
This preparation involves crawling through the source tree, gathering strings marked for translation, and exporting them to text files (called message files) for a translator to edit.
Once the translator has finished translating each message, you compile the text files into production-ready data files.
These data files are loaded into memory when the application starts, so translation simply becomes a cheap key-value lookup operation.
The languages Malasakit supports are listed in malasakit-django/cafe/settings.py under the LANGUAGES setting, like so:
LANGUAGES = (
('en', _('English')),
('tl', _('Filipino')),
)
Each entry is a tuple of two elements: a language code (such as en) and a translated version of the language's full name.
(See below for why the language names are wrapped with the _(...) function.)
To begin adding a new language, you would add another entry to LANGUAGES, then run
$ make preparetrans
from the top of the repository to generate the desired message files. Then, run
$ python manage.py makemigrations
$ python manage.py migrate
from malasakit-django.
This will update the language fields of the Comment and Respondent models to be compatible with the new LANGUAGES.
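As a rough sketch of the first step, the fragment below shows what LANGUAGES might look like after adding one entry. The Cebuano entry and its ceb code are hypothetical and only illustrate the shape of a new tuple; the lambda is a stand-in so the fragment runs on its own, whereas a Django settings file would take its marking function from django.utils.translation.

```python
# Stand-in marking function so this fragment is self-contained (an assumption);
# in a Django project this would come from django.utils.translation.
_ = lambda text: text

LANGUAGES = (
    ('en', _('English')),
    ('tl', _('Filipino')),
    ('ceb', _('Cebuano')),  # hypothetical new entry: (language code, language name)
)
```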
In Python sources, strings are marked for translation by wrapping them inside a function, such as ugettext or ugettext_lazy (the u prefix stands for Unicode).
These functions are often aliased as _ for brevity, so you will often see code such as
from django.utils.translation import ugettext_lazy as _
...
translated_hello_world = _('Hello world')
Once Django detects what language a user wants to use, the _ function will return the appropriate translation.
For Malasakit, the preferred language is detected by examining the URL for a language code.
(For instance, /tl/landing/ tells all calls to _ to return the Tagalog translation for given strings.)
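The idea of reading a language code from the URL can be illustrated with a small sketch. The helper language_from_path and the SUPPORTED_LANGUAGES tuple are hypothetical names for illustration only; Django's own detection is more involved and is not reproduced here.

```python
# Hypothetical helper, not part of Malasakit or Django.
SUPPORTED_LANGUAGES = ('en', 'tl')  # mirrors the codes listed in LANGUAGES


def language_from_path(path, default='en'):
    """Return the leading language code of a URL path, or the default."""
    segments = [segment for segment in path.split('/') if segment]
    if segments and segments[0] in SUPPORTED_LANGUAGES:
        return segments[0]
    return default
```

With this sketch, language_from_path('/tl/landing/') yields 'tl', while a path without a recognized prefix falls back to the default.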
Similarly, for templates, strings are marked for translation with Django-specific tags:
- Short static strings are often marked like {% trans 'Hello world' %}.
- Longer strings or strings containing placeholders for input may be marked as
{% blocktrans trimmed %}
{{ count }} have responded.
{% endblocktrans %}
Note that Django considers English the "base language".
You can think of each language as having a key-value store that maps English phrases to phrases in that language.
Then, at runtime, ugettext uses the English phrase argument to pick the translation out of the key-value store of the inferred language.
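The key-value picture can be made concrete with a minimal sketch. The CATALOGS dictionary and the translate function are illustrative stand-ins, not Django's implementation; the single Tagalog entry is taken from the message file example later in this document. A phrase with no translation falls back to the English text itself.

```python
# Illustrative catalogs; a real deployment loads these from compiled files.
CATALOGS = {
    'tl': {
        '"%(option)s" is not a valid option':
            '"%(option)s" ay wala sa mga pagpipilian',
    },
}


def translate(message, language):
    """Look up an English message, falling back to the English text itself."""
    return CATALOGS.get(language, {}).get(message, message)
```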
Once you are ready to gather the translatable strings, from the repository top level, run
$ make preparetrans
to prepare for translation. This target does several things:
- Run a custom manage.py command called makedbtrans. Django, by default, only pulls translatable strings from .py, .html, .txt, and .js files in the source tree. This means that text that lives in the database, such as question prompts, is excluded from the crawling process. makedbtrans was written to address this issue.
- Run an augmented version of makemessages, Django's command for crawling the source tree. Malasakit extends makemessages to work with makedbtrans.
- Delete intermediary files created from running makedbtrans.
After this process, for every language specified in LANGUAGES except for the base language en, there will be new message files django.po and djangojs.po in malasakit/locale/<locale-code>/LC_MESSAGES.
(Lists of standard locale codes are specified by ISO.)
django.po contains entries for translated strings in Python source files and templates, while djangojs.po, as the name implies, contains translated strings in JavaScript source files.
These text files are ready to be handed off to a translator.
The format of a new message file looks like this:
#: QuantitativeQuestion.prompt:1
msgid "I have suffered the consequence of a typhoon or flood."
msgstr ""
#: pcari/templates/rate-comments.html:17
msgid ""
"Each icon below is a comment made by another participant. Next to each icon "
"is the topic of the comment, or <span class=\"no-tag\">(?)</span> if no "
"topic was found. Comments that are closer together were authored by "
"participants who answered the quantitative questions similarly."
msgstr ""
#: pcari/models.py:431
#, python-format
msgid "\"%(option)s\" is not a valid option"
msgstr ""
...
The lines starting with # are comments to the translator.
Typically, comments are used to denote where the string was found.
Here, the first string came from the prompt field of a QuantitativeQuestion instance with an id of 1, and the second string came from line 17 of the rate-comments.html template.
The msgid "..." line gives the English text to be translated, and the msgstr "" line should give the translated text of the msgid line it is paired with.
Translators should also be aware of the following notes on message file syntax:
- You will notice that long messages can be broken up onto several separate lines.
When the message file is compiled, these lines are concatenated together.
For instance, the second msgid in the example above would become
msgid "Each icon below is a comment made by another participant. Next to each icon is the topic of the comment, or <span class=\"no-tag\">(?)</span> if no topic was found. Comments that are closer together were authored by participants who answered the quantitative questions similarly."
- The second msgid above also contains markup tags: <span class="no-tag">(?)</span> (the \ before " escapes the double-quote). Anything in <...> or </...> should not be translated. However, you should translate the text in between the tags, so for <div id="message">Hello</div>, you should translate Hello while keeping the tags verbatim in the msgstr.
- The third msgid above contains a format specifier %(option)s. The % starts a placeholder, the (option) gives the placeholder the name "option", and s means the placeholder is a string. Translators should repeat the placeholder as-is in the msgstr.
- Take care not to change the msgid because the strings in the source are matched exactly to those in the message file. If a msgid needs to be corrected, update the string in the source or database first, then run preparetrans again, which will update the message files.
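Two of the notes above lend themselves to a quick check before handing a message file back: continuation lines are joined into one string, and named placeholders must survive translation. The sketch below is a hypothetical helper, not part of Malasakit or Django; join_lines only strips the surrounding quotes and does not unescape sequences such as \".

```python
import re

# Matches the name inside a Python named placeholder such as %(option)s.
PLACEHOLDER = re.compile(r'%\((\w+)\)')


def join_lines(lines):
    """Concatenate the quoted continuation lines of one msgid or msgstr."""
    return ''.join(line.strip()[1:-1] for line in lines)


def placeholders_match(msgid, msgstr):
    """Check that both strings use the same set of named placeholders."""
    return set(PLACEHOLDER.findall(msgid)) == set(PLACEHOLDER.findall(msgstr))
```

For the third entry in the example, placeholders_match returns True as long as %(option)s appears unchanged in the msgstr.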
For the example given at the start of this section, the completed message file for the Tagalog locale, located at malasakit-django/locale/tl/LC_MESSAGES/django.po, should look like this:
#: QuantitativeQuestion.prompt:1
msgid "I have suffered the consequence of a typhoon or flood."
msgstr "Malawak at mabigat ang epekto ng bagyo o pagbabaha sa akin."
#: pcari/templates/rate-comments.html:17
msgid ""
"Each icon below is a comment made by another participant. Next to each icon "
"is the topic of the comment, or <span class=\"no-tag\">(?)</span> if no "
"topic was found. Comments that are closer together were authored by "
"participants who answered the quantitative questions similarly."
msgstr ""
"Ang bawat imahe sa ibaba ay naglalaman ng ibinigay na komento ng ibang kalahok. "
"Sa tabi ng bawat imahe ay ang paksa ng komento, o <span class=\"no-tag\">"
"(?)</span> kung walang natagpuang paksa. Ang mga komentong malapit sa isa’t "
"isa ay ibinigay ng mga kalahok na sumagot sa katulad na mga kuwantitatibong "
"tanong."
#: pcari/models.py:431
#, python-format
msgid "\"%(option)s\" is not a valid option"
msgstr "\"%(option)s\" ay wala sa mga pagpipilian"
...
Sometimes, an entry will start with the comment #, fuzzy (this sometimes occurs when the msgid has changed slightly).
In this case, the translator will simply need to correct that entry, if necessary, and remove the comment.
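To find entries that still carry the fuzzy flag, a simple scan of the message file is often enough. The function below is a hypothetical sketch rather than a full parser of the message file format: it reports only the first msgid line of each flagged entry, so a multi-line msgid shows up as "".

```python
def find_fuzzy_msgids(po_text):
    """Return the first msgid line of each entry flagged as fuzzy."""
    fuzzy_msgids = []
    is_fuzzy = False
    for line in po_text.splitlines():
        line = line.strip()
        if line.startswith('#,') and 'fuzzy' in line:
            # Flag comments such as "#, fuzzy" or "#, fuzzy, python-format".
            is_fuzzy = True
        elif line.startswith('msgid '):
            if is_fuzzy:
                fuzzy_msgids.append(line[len('msgid '):])
            is_fuzzy = False
    return fuzzy_msgids
```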
Once the translations have been completed, from the top level of the repository, run
$ make compiletrans
For each locale, Django will create compiled django.mo and djangojs.mo files from their respective message files.
These compiled files are what Django will use at runtime.
When making message files, Django will automatically merge existing translations with any new messages, so existing translations are not lost when the message files are regenerated (translations that are no longer used are merely commented out using #~).