User Interface Internationalisation

Introduction

This document is aimed at developers of core code, plugin code, and the user interface. If you are looking for instructions on configuring your Foswiki to work with your local language, see InstallationWithI18N.

Internationalisation (I18N) is a generalisation process: it aims to make an application capable of interacting with the user in various languages, without hard-coded support for specific languages. This page covers one key aspect of I18N, namely enabling the translation of user interface text into several languages.

This topic documents Foswiki's user interface I18N and presents guidelines for internationalising templates and Foswiki code, so that all English-language text is extracted into a message catalogue that can then be easily translated. Earlier I18N work on Foswiki ensured that international characters work in WikiWords and URLs using 8-bit character sets, as described in InternationalisationEnhancements. The translation support described on this page has been available since FoswikiRelease04x00x00 (DakarRelease).

System requirements for user interface I18N

UserInterfaceInternationalisation requires the following Perl modules:

  • CPAN:Locale::Maketext::Lexicon (debian apt: liblocale-maketext-lexicon-perl): module supporting PO files (among other formats) for storing translations
  • CPAN:Locale::Maketext (debian apt: liblocale-maketext-perl): module supporting internationalisation of user interface text
  • For Perl 5.8:
    • CPAN:Encode (part of Perl's core since 5.8) for converting strings in PO files (encoded in UTF-8) to {Site}{Charset}.
  • For Perl 5.6:
Once these modules are installed, UserInterfaceInternationalisation just works: no setting or preference needs to be configured. Note, however, that all translated text is stored in UTF-8 and is converted to the character encoding specified in {Site}{Locale} for display to the user, so you must set {Site}{Locale}.
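
A quick way to confirm that the modules listed above are visible to the Perl that runs Foswiki is a small script such as the following. This is only a sketch: the helper name module_available is made up for this example and is not part of Foswiki.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the named module can be loaded by this Perl.
sub module_available {
    my ($module) = @_;
    ( my $file = "$module.pm" ) =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Report on the modules named in the requirements list.
for my $module (qw(Locale::Maketext Locale::Maketext::Lexicon Encode)) {
    printf "%-28s %s\n", $module,
      module_available($module) ? 'installed' : 'MISSING';
}
```

Run it with the same perl binary that your web server uses for Foswiki, since different Perl installations can have different module paths.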

Support for non-English character encodings

All 8-bit character encodings are supported (e.g. ISO-8859-*, KOI8-R), including those for non-Roman alphabets such as Cyrillic. This support is in production releases including FoswikiRelease03Sep2004, but requires some patches highlighted in InternationalisationIssues. It works on any Perl version from Perl 5.005_03 upwards.

This support enables use of international characters in WikiWords, form field names, tables of contents, and so on. East Asian languages are supported as long as they use Unicode (UTF-8) - although this support does not include use of WikiWords (you must use explicit links to Foswiki pages), it is quite usable, and there are many Foswiki sites in Chinese, Japanese and probably other East Asian locales. See InternationalisationEnhancements for background on this work.

Language detection

Language is detected in the following way:

  • If the LANGUAGE variable is set (either as a preference or as a session variable), it is assumed to represent the desired language (setting it to a non-existent language causes a fallback to English). The "Change language" feature uses a session variable.
  • Otherwise, the language is detected from the Accept-Language header sent by the browser: the available language that has the highest priority for the user (as reported by the browser) is used. In this case Foswiki uses CPAN:Locale::Maketext's language detection.
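
The selection order described above can be sketched in Perl roughly as follows. This is not Foswiki's actual implementation (which relies on CPAN:Locale::Maketext for the browser case); the function pick_language, its arguments, and the final fallback to English when nothing in Accept-Language matches are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: choose a language tag from an explicit LANGUAGE
# setting, or else from the browser's Accept-Language header.
sub pick_language {
    my ( $language_pref, $accept_header, @available ) = @_;
    my %have = map { lc($_) => 1 } @available;

    # 1. An explicit LANGUAGE setting wins; unknown values fall back to English.
    if ( defined $language_pref && length $language_pref ) {
        my $tag = lc $language_pref;
        return $have{$tag} ? $tag : 'en';
    }

    # 2. Otherwise rank the browser's languages by q-value (default 1).
    my @ranked;
    my $header = defined $accept_header ? $accept_header : '';
    for my $item ( split /\s*,\s*/, $header ) {
        my ( $tag, @params ) = split /\s*;\s*/, $item;
        next unless defined $tag && length $tag;
        my $q = 1;
        for my $param (@params) {
            $q = $1 if $param =~ /^q=(\d(?:\.\d+)?)$/;
        }
        push @ranked, [ lc $tag, $q ];
    }
    for my $entry ( sort { $b->[1] <=> $a->[1] } @ranked ) {
        return $entry->[0] if $have{ $entry->[0] };
    }

    return 'en';    # assumed fallback when nothing matches
}

# Browser prefers pt-BR, then pt, then en; only en, pt and de are available.
print pick_language( '', 'pt-BR,pt;q=0.8,en;q=0.5', qw(en pt de) ), "\n";

# LANGUAGE is set to something that does not exist: fall back to English.
print pick_language( 'xx', 'de', qw(en pt de) ), "\n";
```

The first call prints pt and the second prints en.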

Foswiki Topics and templates I18N

See the %MAKETEXT{...}% variable in the post-DakarRelease FoswikiVariables topic.

tools/xgettext (see below) extracts translatable strings from topics shipped with Foswiki.

User-created topics will be handled after DakarRelease.

Guidelines for internationalising Foswiki topics and templates with %MAKETEXT{...}%

  • Never put Foswiki %VARIABLES% inside translatable strings. Write
    %MAKETEXT{"Attachments in topic [_1]" args="%TOPIC%"}%
    instead of
    %MAKETEXT{"Attachments in topic %TOPIC%"}%
  • When the string is inside an HTML attribute, delimit the attribute with single quotes to avoid confusion with the double quotes around the translatable string. Write
    <input type="submit" name="action" value='%MAKETEXT{"Save"}%'/>
    instead of
    <input type="submit" name="action" value="%MAKETEXT{"Save"}%"/>
  • Use \" for double quotes inside translatable strings. Example:
    %MAKETEXT{"Click on \"Save\" to record your changes."}%
  • Try hard to keep HTML out of translatable strings, as it makes life harder for translators (although sometimes it cannot be avoided).
  • If you need to write something inside square brackets, escape the brackets with tildes (this is a CPAN:Locale::Maketext restriction). Example:
    %MAKETEXT{"To save changes: Press the ~[Save Changes~] button."}%
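
For code that calls CPAN:Locale::Maketext directly, the rules about arguments and square brackets look roughly like this. The MyApp::I18N classes and the topic name are hypothetical; the _AUTO lexicon simply echoes the English text so that the example runs without any PO file.

```perl
use strict;
use warnings;

# Hypothetical project base class, plus an English lexicon that accepts any key.
package MyApp::I18N;
use parent 'Locale::Maketext';

package MyApp::I18N::en;
use parent -norequire, 'MyApp::I18N';
our %Lexicon = ( _AUTO => 1 );

package main;

my $lh = MyApp::I18N->get_handle('en') or die "no language handle";

# The variable part is passed as an argument, not embedded in the string.
my $attachments = $lh->maketext( 'Attachments in topic [_1]', 'WebHome' );

# Literal square brackets are escaped with tildes.
my $hint = $lh->maketext('To save changes: Press the ~[Save Changes~] button.');

print "$attachments\n$hint\n";
```

Without the tildes, Locale::Maketext would try to interpret [Save Changes] as bracket notation and fail.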

General guidelines for user interface text I18N

There are some guidelines to be followed when internationalising an application. Try to follow all of them to make it easier to translate Foswiki into local languages:

Use interpolation instead of concatenation

Instead of:

maketext("Found ") . $number . maketext(" items.")

Write:

maketext("Found [_1] items.", $number)

The same is valid in templates. Instead of:

%_{"This is the "}% %WEB% %_{"web"}%

Write:

%_{"This is the %WEB% web"}%
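
To see why interpolation matters, consider a translation that needs a different word order. In this sketch the MyApp::I18N classes are hypothetical and the Portuguese entry is written inline instead of coming from a PO file; the point is that the translator is free to move [_1], which concatenation would not allow.

```perl
use strict;
use warnings;

# Hypothetical project base class.
package MyApp::I18N;
use parent 'Locale::Maketext';

# Portuguese lexicon: the number moves to the front of the sentence.
package MyApp::I18N::pt;
use parent -norequire, 'MyApp::I18N';
our %Lexicon = ( 'Found [_1] items.' => '[_1] itens encontrados.' );

package main;

my $lh = MyApp::I18N->get_handle('pt') or die "no language handle";

# The whole sentence is one translatable unit; only the number is an argument.
my $message = $lh->maketext( 'Found [_1] items.', 42 );
print "$message\n";    # 42 itens encontrados.
```

With the concatenated form, "Found " and " items." would be translated separately and the number could only ever appear between them.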

When dealing with plurals, put them fully inside some context

Depending on the context, plurals can be translated differently in some languages.

Instead of:
maketext("Found") . $number . (($number > 1) ? maketext("items") : maketext("item"))

Write:
($number > 1) ? maketext("Found [_1] items", $number) : maketext("Found [_1] item", $number)

In fact, the rules for inflecting nouns (typically modifying their endings) in the presence of numbers can be very complicated, or may not exist at all. CPAN:Locale::Maketext solves several of these cases for us, but for now just avoid plurals when you can.
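
As an illustration of what CPAN:Locale::Maketext offers here, its bracket notation includes a quant method that picks the singular or plural form from the number. The sketch below uses hypothetical MyApp::I18N classes and an _AUTO English lexicon; whether tools/xgettext and every translation handle such strings well is not covered by this page, so treat it as an experiment rather than a recommendation.

```perl
use strict;
use warnings;

# Hypothetical project base class, plus an English lexicon that accepts any key.
package MyApp::I18N;
use parent 'Locale::Maketext';

package MyApp::I18N::en;
use parent -norequire, 'MyApp::I18N';
our %Lexicon = ( _AUTO => 1 );

package main;

my $lh = MyApp::I18N->get_handle('en') or die "no language handle";

# [quant,_1,singular,plural] emits the number followed by the matching form.
my @messages;
for my $count ( 1, 3 ) {
    push @messages, $lh->maketext( 'Found [quant,_1,item,items].', $count );
}
print "$_\n" for @messages;
```

This prints "Found 1 item." and "Found 3 items."; a language class can override quant to implement its own plural rules.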

Generating the PO files

Attention: this procedure assumes you are working with Foswiki sources from svn.

tools/xgettext is a utility that extracts all translatable strings from Foswiki's Perl code, templates, and topics into a po/Foswiki.pot file. Copy that file to start a new Foswiki translation.

tools/xgettext requirements:

  • Locale::Maketext::Lexicon perl package
  • GNU gettext
To extract the strings, just run tools/xgettext from Foswiki sources root:

[somebody@somehost:~/src/Foswiki]$ ./core/tools/xgettext

tools/xgettext will extract strings from all Perl source files, Foswiki topics and templates listed in tools/MANIFEST and add the strings to po/Foswiki.pot. If there is already a po/Foswiki.pot file, the extracted strings are added to it, i.e. your existing comments in po/Foswiki.pot are preserved.

Extracted strings are also merged into existing translations. Translations and comments already done are preserved.

The merging process will try to guess similar sentences. This happens in two situations:

  • A string which was already translated in the PO file had a small change in source code. The old translation is kept.
  • A new string is somewhat similar to another one which was already translated. The translation of the older string is also used for the new string.
In both cases, the new strings will be marked as "fuzzy" to indicate that the string needs "human review". Translation maintainers have to check those strings and remove their fuzzy tags from the PO file, so Foswiki knows that they are correctly translated.
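
When reviewing a merged PO file it can help to list the entries still flagged as fuzzy. The following is a minimal sketch, not part of Foswiki's tools: fuzzy_msgids is a hypothetical helper, the PO fragment is invented for the example, and only single-line msgid entries are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: return the msgid of every entry flagged "#, fuzzy".
sub fuzzy_msgids {
    my ($po_text) = @_;
    my @fuzzy;
    for my $entry ( split /\n\s*\n/, $po_text ) {    # entries are blank-line separated
        next unless $entry =~ /^#,.*\bfuzzy\b/m;     # flag line of this entry
        push @fuzzy, $1 if $entry =~ /^msgid\s+"(.*)"\s*$/m;
    }
    return @fuzzy;
}

# Invented PO fragment: the second entry was guessed by the merge step.
my $po = <<'PO';
#: templates/edit.tmpl
msgid "Save"
msgstr "Salvar"

#: templates/view.tmpl
#, fuzzy
msgid "Save changes"
msgstr "Salvar"
PO

print "needs review: $_\n" for fuzzy_msgids($po);
```

Once a translation has been checked, removing its "#, fuzzy" line from the PO file marks it as reviewed.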

Note: tools/xgettext is named after the utility of the same name provided by GNU gettext, but it does not use GNU gettext's xgettext. It was written specifically for Foswiki, building on the Locale::Maketext::Extract module from the CPAN:Locale::Maketext::Lexicon distribution to handle translatable strings in Foswiki templates and topics (CPAN:Locale::Maketext::Extract already handles Perl source code). Some GNU gettext utilities such as msgmerge and msguniq are used by tools/xgettext, and msgfmt can be used for checking translations.

Extracted strings

tools/xgettext will extract strings basically in two forms:

  • %MAKETEXT{"My text" ...}% or %MAKETEXT{string="My text" ...}% :
    for regular user interface element translation.
  • $percntMAKETEXT{\"My text\" ...}$percnt or $percntMAKETEXT{string=\"My text\" ...}$percnt :
    for extracting MAKETEXT when used in %SEARCH{...}% formats. Note that this second form requires the quotes to be escaped, since the string sits inside an already double-quoted string, the format parameter of %SEARCH{...}%.
The actual work of extracting the strings is done by the Foswiki::I18N::Extract class.

Get translation status

You can check translation status by using tools/check_translations within core.

Compression of PO files

Tasks.Item9845 added compression of .po files into .mo files (as of SVN rev 11011, on trunk only). If $Foswiki::cfg{UserInterfaceInternationalisation} is enabled in the configuration, the next save will use Locale::Msgfmt to compile the .po files for all enabled languages into compressed .mo files. When developing language files, it is best to disable compression so that Foswiki does not use stale compressed files: disable the expert parameter $Foswiki::cfg{LanguageFileCompression}. Note: if compression is disabled, configure will detect stale .mo files and generate a warning.
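
While developing language files, the relevant settings might look like this in lib/LocalSite.cfg. This is only a sketch of the two settings named above; they are normally changed through configure, and the values shown are assumptions for a development setup.

```perl
# Fragment of lib/LocalSite.cfg (assumed development setup)
$Foswiki::cfg{UserInterfaceInternationalisation} = 1;    # translated UI enabled
$Foswiki::cfg{LanguageFileCompression}           = 0;    # read .po files directly, no .mo files
```

Remember to re-enable compression afterwards if you want the .mo files to be used again.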

References

  • Locale::Maketext::TPJ13 -- article about software localization (try perldoc Locale::Maketext::TPJ13 on your system).
  • Web Localization in Perl, by Autrijus Tang (adjust your browser to UTF-8, if needed)

-- AntonioTerceiro - 14 Jan 2006


Discussion

Excellent to see this work progressing! It would be useful to have some reference to the existing InternationalisationEnhancements, which mainly focus on correct handling of Foswiki pages and WikiWords in various languages. This UserInterfaceInternationalisation page is really covering internationalisation support for message text, which is a step beyond the existing work.

-- RichardDonkin - 11 Sep 2005

I've modified the intro a bit to reflect the fact that this page covers one part of I18N, not the whole problem. Also, I would like to see this page renamed since it covers only internationalisation of message text in the user interface, which is not the whole of I18N of course - how about UserInterfaceInternationalisation or MessageTextInternationalisation? The English-style spelling of Internationalisation is already quite prevalent in Foswiki page names (yes, I am English) so we should either change those over to US spelling or change this page to use the 'isation' ending.

Re the Perl code I18N section, I've inserted a reference to the existing InternationalisationGuidelines which cover I18N (in the non-message-text sense) of core and plugin code. This section should be merged into that page once it's matured a bit.

Another issue is that we seem to have four separate pages covering localisation framework activity... Some refactoring would be good.

-- RichardDonkin - 18 Sep 2005

Not sure why Perl 5.6 doesn't work for character encoding conversion - see my comment on Bug 482 for a bit more.

Also renamed this topic!

-- RichardDonkin - 26 Sep 2005

Hi, Richard. Thank you for your comment. See my comment on Bug 482.

-- AntonioTerceiro - 27 Sep 2005

Antonio - just to keep hassling you about Perl versions, I think that CPAN:Unicode::MapUTF8 should work on Perl 5.5 (5.005_03) as well as 5.6... Of course, if we decide to change the general Foswiki FoswikiSystemRequirements that might be OK - not sure how common 5.5 is these days.

-- RichardDonkin - 03 Oct 2005

tools/xgettext does not run on Mac OS X 10.4. I get this feedback:
I: scanning sources, it may take some time...
Can't call method "extract_file" on an undefined value at ./tools/xgettext line 47.

-- ArthurClemens - 18 Oct 2005

Arthur - could you provide testenv output as per SupportGuidelines, and create a bug entry over on the Develop Branch Foswiki?

-- RichardDonkin - 19 Oct 2005

Just made a general review of this document.

-- AntonioTerceiro - 14 Jan 2006

Unfortunately, in my installation the TranslationUserInterface didn't "just work". I finally found out that, in addition to Locale::Maketext, the I18N::LangTags module is also required. Now everything works fine.

-- BertShome - 19 Mar 2006

I18N::LangTags is a Locale::Maketext requirement. And in Perl 5.8, AFAICT, both modules are part of the Perl core.

-- AntonioTerceiro - 21 Mar 2006

Added a link to the updated InternationalisationGuidelines and a few other minor edits. Also added this note above, which should help avoid people mis-configuring Foswiki for I18N - already in InstallationWithI18N.

  • NOTE: It is incorrect to set {Site}{Charset} directly in configure, in 99% of all cases - the correct approach is to set {Site}{Locale}, which includes the character encoding on the end (e.g. .iso-8859-1), and also sets the 'locale' needed for other purposes such as WikiWord I18N. The only time you should set the {Site}{Charset} is to override the character encoding when (1) the locale specifies character encoding X and (2) the web browser will only accept a different spelling of X, e.g. iso8859-9 vs iso-8859-9 or Latin-9 (this is a made-up example only but it can occasionally happen).
-- RichardDonkin - 10 Nov 2006

Broken link: Foswiki.InstallationWithI18N (Foswiki web doesn't exist). What would be the correct link?

-- MartinMutilva - 18 Nov 2010

Added paragraph on trunk option to compress PO files into MO files.

-- GeorgeClark - 12 Mar 2011
Topic revision: r8 - 12 Mar 2011, GeorgeClark