User Interface Internationalisation

Introduction

This document is targeted at developers (core, extension and user interface developers). If you are looking for instructions on configuring your Foswiki to work with your local language, see InternationalizationSupplement.

Internationalisation (I18N) is a generalisation process. It aims to make an application capable of interacting with the user in various languages, without hard-coded support for specific languages. This page covers a key aspect of I18N, namely enabling the translation of text in the user interface into several languages.

This topic documents Foswiki's user interface I18N, and presents guidelines for internationalising templates and Foswiki code by ensuring that any English language text is extracted into a message catalogue that can then be easily translated. Support for international characters in WikiWords and URLs using 8-bit character sets, and the translation support described in this page, have been available since the first release of Foswiki.

See also:

System requirements for user interface I18N

UserInterfaceInternationalisation requires the following Perl modules:

  • CPAN:Locale::Maketext::Lexicon (debian apt: liblocale-maketext-lexicon-perl): module supporting PO files (among other formats) for storing translations
  • CPAN:Locale::Maketext (debian apt: liblocale-maketext-perl): module supporting internationalisation of user interface text
  • CPAN:Encode (part of Perl's core since 5.8) for converting strings in PO files (encoded in UTF-8) to {Site}{Charset}.
Once these modules are installed, UserInterfaceInternationalisation just works. There is no setting or preference that needs to be configured in order to make it work. Note, however, that all translated text is stored in UTF-8 and is converted to the character encoding specified in {Site}{Locale} for display to the user, so you must set a {Site}{Locale}.
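
Before relying on the feature, it can help to confirm that the three modules listed above are actually loadable. The following is a minimal sketch only; check_modules is a hypothetical helper written for this page, not a Foswiki API.

```perl
use strict;
use warnings;

# Modules named in the requirements list above.
my @required = qw(Locale::Maketext::Lexicon Locale::Maketext Encode);

# check_modules is a hypothetical helper, not part of Foswiki.
# Returns a hash mapping each module name to 1 (loadable) or 0 (missing).
sub check_modules {
    my @modules = @_;
    my %status;
    for my $module (@modules) {
        # Turn Some::Module into Some/Module.pm so require accepts it.
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        $status{$module} = eval { require $file; 1 } ? 1 : 0;
    }
    return %status;
}

my %status = check_modules(@required);
for my $module (@required) {
    printf "%-30s %s\n", $module, $status{$module} ? 'ok' : 'MISSING';
}
```

Any module reported as MISSING needs to be installed from CPAN or from the distribution packages named in the list.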

Localisation and internationalisation work better on newer versions of Perl; a minimum of Perl 5.18 is preferred. Note also that "Taint checking" should be disabled if locales are going to be enabled. (Foswiki 1.2 will ship with Taint checking disabled.)

Support for non-English character encodings

All 8-bit character encodings are supported (e.g. ISO-8859-*, KOI8-R, etc), including non-Roman alphabets such as Cyrillic.

This support enables use of international characters in WikiWords, form field names, tables of contents, and so on. East Asian languages are supported as long as they use Unicode (UTF-8) - although this support does not include use of WikiWords (you must use explicit links to Foswiki pages), it is quite usable, and there are many Foswiki sites in Chinese, Japanese and probably other East Asian locales.

Language detection

Language is detected in the following way:

  • If the LANGUAGE variable is set (either as a preference or as a session variable), it's assumed to represent the desired language (setting it to a non-existing language causes a fallback to English). The "Change language" feature uses a session variable.
  • Otherwise, language is detected from the Accept-Language header sent by the browser: the available language that has the highest priority for the user (as reported by the browser) is used. In this case Foswiki uses CPAN:Locale::Maketext's language detection.
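
The two rules above can be sketched in a few lines of Perl. This is an illustration under stated assumptions, not Foswiki's code: pick_language is a hypothetical helper, the Accept-Language parsing is deliberately simplified, and Foswiki itself leaves the second step to Locale::Maketext.

```perl
use strict;
use warnings;

# pick_language is a hypothetical helper illustrating the two rules above.
#   $preferred - value of the LANGUAGE setting, or undef if unset
#   $accept    - raw Accept-Language header, or undef
#   $available - array ref of language tags that have translations
sub pick_language {
    my ( $preferred, $accept, $available ) = @_;
    my %have = map { lc($_) => 1 } @$available;

    # Rule 1: an explicit LANGUAGE setting wins; unknown values fall back to English.
    if ( defined $preferred && length $preferred ) {
        return $have{ lc $preferred } ? lc $preferred : 'en';
    }

    # Rule 2: walk the browser's list in descending q-value order.
    my @ranked;
    for my $item ( split /\s*,\s*/, $accept // '' ) {
        my ( $tag, $q ) = $item =~ /^([A-Za-z0-9*-]+)(?:\s*;\s*q=([0-9.]+))?$/
          or next;
        push @ranked, [ lc $tag, $q // 1 ];
    }
    for my $entry ( sort { $b->[1] <=> $a->[1] } @ranked ) {
        return $entry->[0] if $have{ $entry->[0] };
    }
    return 'en';
}

print pick_language( undef, 'pt-br;q=0.9, fr;q=0.8, en;q=0.5', [qw(en fr de)] ), "\n";
print pick_language( 'de', 'fr', [qw(en fr de)] ), "\n";
print pick_language( 'xx', 'fr', [qw(en fr de)] ), "\n";
```

The three calls print fr, de and en respectively: the browser's first choice has no translation, an explicit setting overrides the browser, and an unknown setting falls back to English.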

Foswiki Topics and templates I18N

See the ... macro: VarMAKETEXT

tools/xgettext (see below) extracts translatable strings from topics shipped with Foswiki.

Guidelines for internationalising Foswiki topics and templates with ...

  • Don't ever put Foswiki %VARIABLES% inside translatable strings. Write
    %MAKETEXT{"Attachments in topic [_1]" args="%TOPIC%"}%
    instead of
    %MAKETEXT{"Attachments in topic %TOPIC%"}%
  • When the string is inside an HTML attribute, be sure the attribute is delimited using single quotes to avoid confusion. Write
    <input type="submit" name="action" value='%MAKETEXT{"Save"}%'/>
    instead of
    <input type="submit" name="action" value="%MAKETEXT{"Save"}%"/>
  • Use \" for double quotes inside translatable strings. Example:
    %MAKETEXT{"Click on \"Save\" to record your changes."}%
  • Try hard to keep HTML out of translated strings, as this makes life harder for translators. (But sometimes, there is no way of doing it).
  • If you need to write something inside square brackets, escape it with tildes (this is a CPAN:Locale::Maketext restriction). Example:
    %MAKETEXT{"To save changes: Press the ~[Save Changes~] button."}%
  • Always put punctuation inside the translated string, never outside: Example:
    %MAKETEXT{"Users list:"}%, instead of
    %MAKETEXT{"Users list"}%:
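
The tilde escape mentioned in the list above can be tried directly against Locale::Maketext. A minimal sketch, assuming a throwaway project class Demo::I18N (hypothetical, not part of Foswiki) whose English lexicon uses _AUTO so that each key translates to itself:

```perl
use strict;
use warnings;
use Locale::Maketext;

# Demo::I18N is a hypothetical project class used only for this sketch.
package Demo::I18N;
our @ISA = ('Locale::Maketext');

package Demo::I18N::en;
our @ISA = ('Demo::I18N');
# _AUTO makes every unknown key translate to itself.
our %Lexicon = ( _AUTO => 1 );

package main;

my $lh = Demo::I18N->get_handle('en') or die "no language handle";

# ~[ and ~] produce literal square brackets in the output.
my $text = $lh->maketext('To save changes: Press the ~[Save Changes~] button.');
print "$text\n";
```

Without the tildes, Locale::Maketext would treat the bracketed text as a method call or placeholder, which is why the escape is needed.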

General guidelines for user interface text I18N

There are some guidelines to be followed when internationalising an application. Try to follow all of them to make it easier to translate Foswiki into local languages:

Translating strings in Perl modules: interpolation instead of concatenation

Instead of:
$session->i18n->maketext("Found ") . $number . $session->i18n->maketext(" items.")

Write:
$session->i18n->maketext("Found [_1] items.", $number)

The same is valid in templates. Instead of:
%MAKETEXT{"This is the "}% %WEB% %MAKETEXT{"web"}%

Write:

%MAKETEXT{"This is the [_1] web" args="%WEB%"}%
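
The reason interpolation matters is that a translator may need to move the placeholder. The sketch below shows this with plain Locale::Maketext. The Demo::I18N classes and the Portuguese string are illustrative assumptions, not Foswiki code or an official translation.

```perl
use strict;
use warnings;
use Locale::Maketext;

# Hypothetical project classes, defined inline for the sketch.
package Demo::I18N;
our @ISA = ('Locale::Maketext');

package Demo::I18N::en;
our @ISA = ('Demo::I18N');
our %Lexicon = ( _AUTO => 1 );    # English keys translate to themselves

package Demo::I18N::pt;
our @ISA = ('Demo::I18N');
# Illustrative translation: the number moves to the front of the sentence.
our %Lexicon = ( 'Found [_1] items.' => '[_1] itens encontrados.' );

package main;

for my $lang (qw(en pt)) {
    my $lh = Demo::I18N->get_handle($lang) or die "no handle for $lang";
    print "$lang: ", $lh->maketext( 'Found [_1] items.', 3 ), "\n";
}
```

With concatenated fragments ("Found " . $number . " items.") the Portuguese word order could not be expressed, because the number would always sit between the two translated pieces.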

When dealing with plurals, put them fully inside some context

Depending on the context, plurals can be translated differently in some languages.

Instead of:

maketext("Found") . $number . ( ($number > 1) ? maketext("items") : maketext("item") )

Write:

($number > 1) ? maketext("Found [_1] items", $number) : maketext("Found [_1] item", $number)

In fact, the rules for inflecting (typically modifying the endings of) nouns in the presence of numbers can be very complicated, or not exist at all. CPAN:Locale::Maketext solves several of them for us, but for now just avoid plurals when you can.
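
As one example of what Locale::Maketext can do here, its bracket notation includes a quant method that picks a singular or plural form from the number. A minimal sketch, assuming a hypothetical Demo::I18N class with an _AUTO English lexicon; whether a given Foswiki string should use quant is a separate decision for its maintainers.

```perl
use strict;
use warnings;
use Locale::Maketext;

# Hypothetical project classes used only for this sketch.
package Demo::I18N;
our @ISA = ('Locale::Maketext');

package Demo::I18N::en;
our @ISA = ('Demo::I18N');
our %Lexicon = ( _AUTO => 1 );

package main;

my $lh = Demo::I18N->get_handle('en') or die "no language handle";

# [quant,_1,item] emits the number followed by "item" or "items".
for my $n ( 1, 3 ) {
    print $lh->maketext( 'Found [quant,_1,item].', $n ), "\n";
}
```

The default English rule simply appends "s" for anything other than 1. Language classes for other languages can override quant, which is how more complicated inflection rules are accommodated.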

Generating the PO files

Attention: this procedure assumes you are working with Foswiki sources from git.

tools/xgettext is a utility for extracting all strings inside Foswiki's code (Perl code, templates, and topics) into Foswiki.pot and <language>.po files. Don't confuse tools/xgettext with the system utility of the same name. tools/xgettext is a Perl script which performs a similar function, but customised for the Foswiki installation. See below.

tools/xgettext requirements:

  • Locale::Maketext::Lexicon perl package
  • GNU gettext
To extract the strings, just run tools/xgettext from the root of the Foswiki sources:

[somebody@somehost:~/src/Foswiki]$ ./core/tools/xgettext

tools/xgettext will extract strings from all Perl source files, Foswiki topics and templates listed in MANIFEST and add the strings to locale/Foswiki.pot. If there is already a locale/Foswiki.pot file, the extracted strings are added into the existing Foswiki.pot, i.e. your existing comments in Foswiki.pot are preserved. After the strings are extracted into locale/Foswiki.pot, they are then merged into all of the existing locale/<language>.po files. Translations and comments already done are preserved.

The merging process will try to guess similar sentences. This happens in two situations:

  • A string which was already translated in the PO file had a small change in source code. The old translation is kept.
  • A new string is somewhat similar to another one which was already translated. The translation of the older string is also used for the new string.
In both cases, the new strings will be marked as "fuzzy" to indicate that the string needs "human review". Translation maintainers have to check those strings and remove their fuzzy tags from the PO file, so Foswiki knows that they are correctly translated.
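
For a translation maintainer, the practical task is finding the entries that carry the fuzzy flag. The sketch below scans PO text for them. fuzzy_msgids is a hypothetical helper and the PO sample is made up for illustration; it handles only single-line msgid entries.

```perl
use strict;
use warnings;

# fuzzy_msgids is a hypothetical helper: it returns the msgid of every entry
# flagged "fuzzy" in the given PO text. Only single-line msgids are handled.
sub fuzzy_msgids {
    my ($po_text) = @_;
    my @fuzzy;
    my $is_fuzzy = 0;
    for my $line ( split /\n/, $po_text ) {
        if ( $line =~ /^#,.*\bfuzzy\b/ ) {
            $is_fuzzy = 1;    # flag comment applies to the next msgid
        }
        elsif ( $line =~ /^msgid\s+"(.*)"\s*$/ ) {
            push @fuzzy, $1 if $is_fuzzy;
            $is_fuzzy = 0;
        }
    }
    return @fuzzy;
}

# Illustrative PO content, not taken from a real Foswiki language file.
my $po = <<'PO';
msgid "Save"
msgstr "Salvar"

#, fuzzy
msgid "Found [_1] items."
msgstr "[_1] item encontrado."
PO

print "needs review: $_\n" for fuzzy_msgids($po);
```

After reviewing and correcting such an entry, the maintainer removes the "#, fuzzy" line so the translation is treated as valid.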

Note: tools/xgettext is named after the utility with the same name provided by GNU gettext. Foswiki's xgettext doesn't use GNU gettext's xgettext; it was written specifically for Foswiki using the CPAN:Locale::Maketext module's Locale::Maketext::Extract module for handling translatable strings in Foswiki templates and topics (CPAN:Locale::Maketext::Extract already handles Perl source code). Some GNU gettext utilities like msgmerge and msguniq are used in tools/xgettext, and msgfmt can be used for checking translations.

Extracted strings

tools/xgettext will extract strings basically in two forms:

  • %MAKETEXT{"My text" ...}% or %MAKETEXT{string="My text" ...}% :
    for regular user interface element translation.
  • $percntMAKETEXT{\"My text\" ...}$percnt or $percntMAKETEXT{string=\"My text\" ...}$percnt :
    for extracting MAKETEXT when used in %SEARCH{...}% formats. Note that this second form requires strings to be escaped, since they are supposed to be inside an already double-quoted string, the format parameter for %SEARCH{...}%.
The actual work for extracting the strings is done by the Foswiki::I18N::Extract class.
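
To give a feel for what extraction involves, here is a much simplified sketch that pulls the first form out of topic text with a regular expression. extract_maketext is a hypothetical helper, not the Foswiki::I18N::Extract implementation, and it ignores the escaped $percnt form.

```perl
use strict;
use warnings;

# extract_maketext is a hypothetical, simplified stand-in for an extractor;
# it is not how Foswiki::I18N::Extract is implemented.
sub extract_maketext {
    my ($text) = @_;
    my @strings;
    # Match %MAKETEXT{"..." or %MAKETEXT{string="..."; \" is allowed inside.
    while ( $text =~ /%MAKETEXT\{\s*(?:string=)?"((?:[^"\\]|\\.)*)"/g ) {
        ( my $s = $1 ) =~ s/\\"/"/g;    # unescape \" back to "
        push @strings, $s;
    }
    return @strings;
}

my $topic = <<'TOPIC';
<input type="submit" value='%MAKETEXT{"Save"}%'/>
%MAKETEXT{string="Click on \"Save\" to record your changes."}%
TOPIC

print "$_\n" for extract_maketext($topic);
```

The two strings printed are the msgid values a translator would see: Save, and Click on "Save" to record your changes.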

Get translation status

You can check translation status by using tools/check_translations within core. The Translation status is also reported by the Weblate interface.

Compression of PO files

Tasks.Item9845 added optional compression of .po files into .mo files in Release 1.2; it is disabled by default. If $Foswiki::cfg{UserInterfaceInternationalisation} and $Foswiki::cfg{LanguageFileCompression} are both enabled in the configuration, the next configuration save will use Locale::Msgfmt to compile the .po files for all enabled languages into .mo files. When developing language files, it is best to disable compression by turning off the expert parameter $Foswiki::cfg{LanguageFileCompression}, so that Foswiki does not use stale compressed files. Note: configure will detect stale .mo files if compression is disabled and generate a warning. (Compression is disabled by default due to some inconsistencies reported with some versions of Perl.)
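
For language file development, the two settings named above would look roughly like this in the site configuration. This is a sketch of a configuration fragment; these values are normally set through configure rather than edited by hand.

```perl
# Fragment of the site configuration (normally written by configure).
# Translations stay enabled...
$Foswiki::cfg{UserInterfaceInternationalisation} = 1;
# ...but compression is off, so edited .po files are not shadowed by stale .mo files.
$Foswiki::cfg{LanguageFileCompression} = 0;
```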

References

  • Locale::Maketext::TPJ13 -- article about software localization (try perldoc Locale::Maketext::TPJ13 on your system).
  • Web Localization in Perl, by Autrijus Tang (adjust your browser to UTF-8, if needed)

-- AntonioTerceiro - 14 Jan 2006


Discussion

Excellent to see this work progressing! It would be useful to have some reference to the existing InternationalisationEnhancements, which mainly focus on correct handling of Foswiki pages and WikiWords in various languages. This UserInterfaceInternationalisation page is really covering internationalisation support for message text, which is a step beyond the existing work.

-- RichardDonkin - 11 Sep 2005

I've modified the intro a bit to reflect the fact that this page covers one part of I18N, not the whole problem. Also, I would like to see this page renamed since it covers only internationalisation of message text in the user interface, which is not the whole of I18N of course - how about UserInterfaceInternationalisation or MessageTextInternationalisation? The English-style spelling of Internationalisation is already quite prevalent in Foswiki page names (yes, I am English) so we should either change those over to US spelling or change this page to use the 'isation' ending.

Re the Perl code I18N section, I've inserted a reference to the existing InternationalisationGuidelines which cover I18N (in the non-message-text sense) of core and plugin code. This section should be merged into that page once it's matured a bit.

Another issue is that we seem to have four separate pages covering localisation framework activity... Some refactoring would be good.

-- RichardDonkin - 18 Sep 2005

Not sure why Perl 5.6 doesn't work for character encoding conversion - see my comment on Bug 482 for a bit more.

Also renamed this topic!

-- RichardDonkin - 26 Sep 2005

Hi, Richard. Thank you for your comment. See my comment on Bug 482.

-- AntonioTerceiro - 27 Sep 2005

Antonio - just to keep hassling you about Perl versions, I think that CPAN:Unicode::MapUTF8 should work on Perl 5.5 (5.005_03) as well as 5.6... Of course, if we decide to change the general Foswiki FoswikiSystemRequirements that might be OK - not sure how common 5.5 is these days.

-- RichardDonkin - 03 Oct 2005

tools/xgettext does not run on Mac OS X 10.4. I get this feedback:

I: scanning sources, it may take some time...
Can't call method "extract_file" on an undefined value at ./tools/xgettext line 47.

-- ArthurClemens - 18 Oct 2005

Arthur - could you provide testenv output as per SupportGuidelines, and create a bug entry over on the Develop Branch Foswiki?

-- RichardDonkin - 19 Oct 2005

Just made a general review of this document.

-- AntonioTerceiro - 14 Jan 2006

Unfortunately in my installation the TranslationUserInterface didn't "just work". Finally I found out that, in addition to Locale::Maketext, the I18N::LangTags module is also required. Now everything works fine.

-- BertShome - 19 Mar 2006

I18N::LangTags is a Locale::Maketext requirement. And in Perl 5.8, AFAICT, both modules are part of the Perl core.

-- AntonioTerceiro - 21 Mar 2006

Added a link to the updated InternationalisationGuidelines and a few other minor edits. Also added this note above, which should help avoid people mis-configuring Foswiki for I18N - already in InstallationWithI18N.

  • NOTE: It is incorrect to set {Site}{Charset} directly in configure, in 99% of all cases - the correct approach is to set {Site}{Locale}, which includes the character encoding on the end (e.g. .iso-8859-1), and also sets the 'locale' needed for other purposes such as WikiWord I18N. The only time you should set the {Site}{Charset} is to override the character encoding when (1) the locale specifies character encoding X and (2) the web browser will only accept a different spelling of X, e.g. iso8859-9 vs iso-8859-9 or Latin-9 (this is a made-up example only but it can occasionally happen).
-- RichardDonkin - 10 Nov 2006

Broken link: Foswiki.InstallationWithI18N (Foswiki web doesn't exist). What would be the correct link?

-- MartinMutilva - 18 Nov 2010

Added paragraph on trunk option to compress PO files into MO files.

-- GeorgeClark - 12 Mar 2011
Topic revision: r11 - 05 Mar 2015, GeorgeClark