Internationalization and localization are terms used to describe the effort to make WordPress (and other such projects) available in languages other than English, for people from different locales, who use different dialects and local preferences.
The process of localizing a program has two steps. The first step is when the program's developers provide a mechanism and method for the eventual translation of the program and its interface to suit local preferences and languages for users worldwide. WordPress developers have done this, so in theory, WordPress can be used in any language.
The second step is the actual localization, the process by which the text on the page and other settings are translated and adapted to another language and culture, using the framework prescribed by the developers of the software. WordPress has already been localized into many other languages (see WordPress in Your Language for more information).
This article explains how translators (bi- or multi-lingual WordPress users) can go about localizing WordPress to more languages.
Before you start translating WordPress, check WordPress in Your Language (and resources cited there) to see if a translation of WordPress into your language already exists. It is also possible that someone (or a team) is already working on translating WordPress into your language, but they haven't finished yet. To find out, subscribe to the wp-polyglots mailing list, introduce yourself, and ask if there's anyone translating into your language. There is also a list of localization teams and localization teams currently forming, which you can check to see if a translation is in progress.
Assuming that a WordPress translation into your language does not already exist or have someone working on it, you may want to volunteer to create a public translation of WordPress into your language. If so, here are the qualifications you will need:
A locale is a combination of language and regional dialect. Usually locales correspond to countries, as is the case with Portuguese (Portugal) and Portuguese (Brazil).
You can do a translation for any locale you wish, even other English locales such as Canadian English or Australian English, to adjust for regional spelling and idioms.
The default locale of WordPress is U.S. English.
WordPress's developers chose to use the GNU gettext localization framework to provide localization infrastructure to WordPress. gettext is a mature, widely used framework for modular translation of software, and is the de facto standard for localization in the open source/free software realm.
gettext uses message-level translation: every "message" displayed to users is translated individually, whether it is a paragraph or a single word. In WordPress, such messages are marked for translation and looked up by the WordPress PHP files via two PHP functions. __() returns the translated message, so it is used when the message is passed as an argument to another function; _e() writes the translated message directly to the page.
Note that if you are internationalizing a Theme or Plugin, you should use a "Text Domain". See Writing a Plugin for more information on how to do this for a plugin; themes are similar.
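The difference between returning and echoing a translation can be sketched outside of WordPress. The following Python snippet is only an illustration, not WordPress code: the dictionary CATALOG stands in for a loaded translation file, and translate() and translate_and_echo() are hypothetical names that mirror the roles of __() and _e().

```python
import sys

# Hypothetical in-memory catalog standing in for a loaded translation file.
CATALOG = {
    "Post": "Artikkeli",
}

def translate(message):
    """Role of __(): return the translation, or the original if none exists."""
    return CATALOG.get(message, message)

def translate_and_echo(message, out=sys.stdout):
    """Role of _e(): write the translation directly to the output."""
    out.write(translate(message))

# A returned translation can be passed on to other code.
heading = translate("Post").upper()

# The echoing variant writes straight to the page instead.
translate_and_echo("Post")
```

A message with no entry in the catalog falls back to the original English text, which is why an incomplete translation still produces a usable interface.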
The gettext framework takes care of most of WordPress. However, there are a few places in the WordPress distribution where gettext cannot be used; see Files For Direct Translation for more information on how to translate these spots.
There are three types of files used in the gettext translation framework: POT (Portable Object Template) files, PO (Portable Object) files, and MO (Machine Object) files. These files are used and/or generated by translation tools during the translation process.
There are various tools available to aid in translating. You may use whichever you prefer.
We have a separate page with instructions for translating WordPress at Launchpad.
This section is incomplete.
At the beginning of the PO file is something called the header. This gives information about what package and version the translation is for, who the translator was, and when it was created. Certain portions of this header should be universal for all WordPress translations:
    # LANGUAGE (LOCALE) translation for WordPress.
    # Copyright (C) YEAR WordPress contributors.
    # This file is distributed under the same license as the WordPress package.
    # FIRST AUTHOR <EMAIL@ADDRESS>, YEAR.
    #
    #, fuzzy
    msgid ""
    msgstr ""
    "Project-Id-Version: WordPress VERSION\n"
    "Report-Msgid-Bugs-To: \n"
    "POT-Creation-Date: 2005-02-27 17:11-0600\n"
    "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
    "Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
    "Language-Team: LANGUAGE <LL@li.org>\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=CHARSET\n"
    "Content-Transfer-Encoding: 8bit\n"
Fill in the rest of the capitalized text with the appropriate values.
The remainder of the file will be in a format as follows:
    #: wp-comments-post.php:13
    msgid "Sorry, comments are closed for this item."
    msgstr ""

    #: wp-comments-post.php:29
    msgid "Sorry, you must be logged in to post a comment."
    msgstr ""

    #: wp-comments-post.php:35
    msgid "Error: please fill the required fields (name, email)."
    msgstr ""
The first line of each message contains the location of the message in the WordPress code. In the case of these messages, they're all located in wp-comments-post.php, on lines 13, 29, and 35, respectively. Occasionally you will come across a message for which you will need to check its context; look at the appropriate line or lines in the WordPress core, and you should be able to figure out when and where the message is displayed, and even reproduce it yourself using your web browser. Some messages will also appear with the same text in multiple locations; in that case, there may be more than one line giving a file and line location.
The next line, msgid, is the source message. This is the string that WordPress passes to its __() or _e() functions, and the message you will need to translate.
The final line, msgstr, is a blank string where you will fill in your translation.
Here's how the same few lines would look after being translated, using the French (France) locale as an example:
    #: wp-comments-post.php:13
    msgid "Sorry, comments are closed for this item."
    msgstr "L'ajout de commentaire n'est pas ou plus possible pour cet article."

    #: wp-comments-post.php:29
    msgid "Sorry, you must be logged in to post a comment."
    msgstr "Vous devez être connecté pour rédiger un commentaire."

    #: wp-comments-post.php:35
    msgid "Error: please fill the required fields (name, email)."
    msgstr "Erreur : veuillez remplir les champs obligatoires vides (nom, e-mail)."
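To make the structure of these entries concrete, here is a minimal Python sketch that reads simple PO text into a dictionary of source messages and translations. It is an illustration only: real translation tools handle far more (plural forms, flags such as fuzzy, obsolete entries), and the names parse_po, unquote, and SAMPLE are hypothetical.

```python
import re

# Backslash escapes handled by this sketch; anything else is left untouched.
_ESCAPES = {"n": "\n", "t": "\t", '"': '"', "\\": "\\"}

def unquote(token):
    """Strip the surrounding quotes and resolve common backslash escapes."""
    body = token.strip()[1:-1]
    return re.sub(r"\\(.)", lambda m: _ESCAPES.get(m.group(1), m.group(0)), body)

def parse_po(text):
    """Return a {msgid: msgstr} dict from simple PO text (no plural forms)."""
    entries = []
    active = None  # which field a continuation line belongs to
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no message text
        if line.startswith("msgid "):
            entries.append({"msgid": unquote(line[len("msgid "):]), "msgstr": ""})
            active = "msgid"
        elif line.startswith("msgstr ") and entries:
            entries[-1]["msgstr"] = unquote(line[len("msgstr "):])
            active = "msgstr"
        elif line.startswith('"') and active:
            # A quoted line on its own continues the previous field.
            entries[-1][active] += unquote(line)
    return {entry["msgid"]: entry["msgstr"] for entry in entries}

SAMPLE = '''
#: wp-comments-post.php:13
msgid "Sorry, comments are closed for this item."
msgstr "L'ajout de commentaire n'est pas ou plus possible pour cet article."

#: wp-comments-post.php:29
msgid "Sorry, you must be logged in to post a comment."
msgstr ""
'''

catalog = parse_po(SAMPLE)
untranslated = [msgid for msgid, msgstr in catalog.items() if not msgstr]
```

Here untranslated lists the messages whose msgstr is still empty, which is one simple way to see how much of a file remains to be translated.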
Labels are often used in the context of HTML <label>, <legend>, <a>, or <select> tags. They are short and precise descriptors of the purpose of a UI element. These can be very difficult to translate at times, especially if they are single words, and if the word used in English can be interpreted as either a noun or imperative verb. With most labels you will need to do some searching through the code to find the context of its use before coming up with an appropriate translation.
Because so many of the messages are part of the WordPress administration interface, labels are probably the most frequent type of message to translate.
    msgid "Post"
    msgstr "Artikkeli"
"Post" could be interpreted as an imperative verb, but in this context it's a noun. The noun form of "post" in English can be difficult to translate, and the most appropriate translation has been difficult for some teams to decide upon. Many translations use their language's equivalent to the English "Article," as this one does. (From the Finnish (Finland) translation.)
    #: wp-login.php:79 wp-login.php:233 wp-register.php:166
    #: wp-includes/template-functions-general.php:46
    msgid "Register"
    msgstr "रजिस्टर"
From the Hindi translation.
    #: wp-admin/admin-functions.php:357
    msgid "- Select -"
    msgstr " - Dewis -"
Items like the surrounding dashes in this example can be eliminated or replaced if they might be confusing to users in your target locale, or if there are different established conventions for your locale. From the Welsh translation.
Another frequent type of message, the informational message is usually composed of full sentences, and conveys information or requests an action of the user. Since these tend to be longer than labels, they tend to be slightly easier to translate. However, with the longer messages comes more variation in the level of formality (or informality), which is something translators need to be aware of.
    #: wp-login.php:146
    msgid "Your new password is in the mail."
    msgstr "Вашата нова парола е в електронната ви поща."
This particular message contains a modified English formulaic expression ("the check/cheque is in the mail"), which contributes to its informality. (From the Bulgarian (Bulgaria) translation.)
    #: wp-includes/functions.php:1636
    msgid "<strong>Error</strong>: Incorrect password."
    msgstr "<strong>FEL</strong>: Felaktigt lösenord."
Error messages tend to be more formal, simply because they're short and concise. (From the Swedish (Sweden) translation.)
    #: wp-includes/functions-post.php:467
    msgid "Sorry, you can only post a new comment once every 15 seconds. Slow down cowboy."
    msgstr "Leider kannst du nur alle 15 Sekunden einen neuen Kommentar eingeben. Immer locker bleiben."
Of course, not all error messages are formal. (From the German (Germany) translation.)
If a string contains a vertical bar |, the part on the right of | is a description. Its purpose is to help you translate the string, placing it in certain context or providing additional information.
    #: wp-includes/locale.php:186
    msgid ""
    "number_format_decimal_point|$dec_point argument for http://php.net/number_format, default is ."
    msgstr ","
The description suggests that you look at a web page in order to translate the string.
Rather than using PHP's built-in locale switching features, which are not configured for very many languages on most hosts, WordPress uses the gettext translation module to accomplish date and time translations and formatting.
WordPress translates the following:
    #: wp-includes/locale.php:42 wp-includes/locale.php:57
    msgid "May"
    msgstr "Květen"
(From the Czech (Czech Republic) translation.)
    #: wp-includes/locale.php:57
    msgid "May_May_abbreviation"
    msgstr "Mag"
Note the unusual msgid. These messages should NOT be translated literally: they are a hack to get around the fact that in English, the full name and abbreviation for May are the same, which Gettext would erroneously combine into one entry. (From the Italian (Italy) translation.)
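The idea behind these artificial msgids can be sketched in Python. This is an illustration under assumptions, not the WordPress implementation: it assumes that when no translation exists the artificial suffix is stripped so the plain English abbreviation is shown, and the names month_abbreviation and lookup are hypothetical.

```python
import re

def month_abbreviation(full_name, abbreviation, translate):
    """Look up an abbreviation through its disambiguated msgid.

    The msgid combines abbreviation and full name, so "May" (full name)
    and "May" (abbreviation) remain two separate catalog entries.
    """
    key = f"{abbreviation}_{full_name}_abbreviation"
    translated = translate(key)
    # Assumption: if the key came back untranslated, drop the artificial
    # suffix so that only the English abbreviation remains.
    return re.sub(r"_.+_abbreviation$", "", translated)

# Hypothetical Italian catalog with separate entries for name and abbreviation.
italian = {"May": "Maggio", "May_May_abbreviation": "Mag"}

def lookup(catalog):
    """Build a translate function that falls back to the msgid itself."""
    return lambda msgid: catalog.get(msgid, msgid)

translated_abbrev = month_abbreviation("May", "May", lookup(italian))
fallback_abbrev = month_abbreviation("May", "May", lookup({}))
```

With the Italian catalog the lookup yields the abbreviation entry rather than the full month name; with an empty catalog it falls back to the English abbreviation.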
    #: wp-includes/locale.php:7
    #: wp-includes/locale.php:18
    #: wp-includes/locale.php:31
    msgid "Tuesday"
    msgstr "火曜日"
(From the Japanese (Japan) translation.)
    #: wp-includes/locale.php:31
    msgid "Tue"
    msgstr "Уто"
(From the Serbian (Serbia) translation.)
    #: wp-includes/locale.php:18
    msgid "T_Tuesday_initial"
    msgstr "ti"
The weekday initials are for WordPress's calendar feature, and use the same hack as the month abbreviations to get around the fact that in English Tuesday and Thursday share the same first letter. Not all locales use single-letter abbreviations for all days: in this example, Norwegian Bokmål uses an extra letter to distinguish tirsdag (Tuesday) and torsdag (Thursday). (From the Norwegian Bokmål (Norway) translation.)
These are PHP date() formatting strings, and they allow you to change the formatting of the date and time for your locale.
WordPress uses the translations elsewhere in the localization file for month names, weekday names, etc. This special string is for the selection of which elements to include in the date & time, as well as the order in which they're presented.
Take this msgid from the theme.pot file:
    #: archive.php:40 search.php:19 single.php:22
    msgid "l, F jS, Y"
    msgstr ""
In English, this gets formatted as:
Sunday, February 27th, 2005
However, different locales format their dates differently. In Danish, for example, dates are written:
søndag, 27. februar 2005
To accomplish this, the msgid above would be translated to:
    #: archive.php:40 search.php:19 single.php:22
    msgid "l, F jS, Y"
    msgstr "l, j. F Y"
To use another example, one way to format dates in Chinese and Japanese is as follows:
2005年2月27日
This would be accomplished in the translation like this:
    #: archive.php:40 search.php:19 single.php:22
    msgid "l, F jS, Y"
    msgstr "Y年n月j日"
Lastly, if you need to include literal alphabetic characters in your date format, as sometimes occurs in Spanish, you can backslash them:
    #: archive.php:40 search.php:19 single.php:22
    msgid "l, F jS, Y"
    msgstr "l j \d\e F \d\e Y "
This would output:
domingo 27 de febrero de 2005
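How a translated format string combines with translated month and weekday names can be sketched in Python. The snippet below is an illustration, not PHP's date() itself: it handles only the format characters used in the examples above (l, F, j, n, S, Y, and backslash escapes), and format_date and the name lists are hypothetical.

```python
from datetime import date

EN_DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
           "Saturday", "Sunday"]
EN_MONTHS = ["January", "February", "March", "April", "May", "June", "July",
             "August", "September", "October", "November", "December"]
DA_DAYS = ["mandag", "tirsdag", "onsdag", "torsdag", "fredag",
           "lørdag", "søndag"]
DA_MONTHS = ["januar", "februar", "marts", "april", "maj", "juni", "juli",
             "august", "september", "oktober", "november", "december"]

def ordinal_suffix(day):
    """English ordinal suffix for a day of the month (st, nd, rd, th)."""
    if 11 <= day % 100 <= 13:
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")

def format_date(fmt, day, weekdays, months):
    """Format a date using a small subset of PHP date() format characters."""
    out = []
    chars = iter(fmt)
    for ch in chars:
        if ch == "\\":
            out.append(next(chars, ""))  # escaped character is copied literally
        elif ch == "l":
            out.append(weekdays[day.weekday()])  # weekday() is 0 for Monday
        elif ch == "F":
            out.append(months[day.month - 1])
        elif ch == "j":
            out.append(str(day.day))
        elif ch == "n":
            out.append(str(day.month))
        elif ch == "S":
            out.append(ordinal_suffix(day.day))
        elif ch == "Y":
            out.append(str(day.year))
        else:
            out.append(ch)  # punctuation, spaces, and unsupported characters
    return "".join(out)

sample = date(2005, 2, 27)
english = format_date("l, F jS, Y", sample, EN_DAYS, EN_MONTHS)
danish = format_date("l, j. F Y", sample, DA_DAYS, DA_MONTHS)
```

The format string decides which elements appear and in what order, while the names themselves come from the separate month and weekday translations.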
To translate a date, for example inside your plugin, use the WordPress function mysql2date(dateformat, datestring). It uses the translated month and weekday names to return the date.
Many messages contain special PHP formatting placeholders, which allow the insertion of untranslatable dynamic content into the message after it is translated. The PHP placeholders come in two different formats: simple placeholders such as %s, which are filled in the order the values are supplied, and numbered placeholders such as %1$s and %2$s, which name the value to insert and can therefore be reordered in the translation.
    #: wp-login.php:116
    msgid "The e-mail was sent successfully to %s's e-mail address."
    msgstr "El e-mail fue enviado satisfactoriamente a la dirección e-mail de %s"
This message inserts the username of the user to which an email has been sent. (From the Spanish (Spain) translation.)
    #: wp-admin/upload.php:96
    #, php-format
    msgid "File %1$s of type %2$s is not allowed."
    msgstr "类型为%2$s的文件%1$s不允许被上传。"
This message reverses the order in which the file name and type are used in the translation. (From the Chinese (China) translation.)
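The effect of numbered placeholders can be sketched in Python. This is a simplified imitation of PHP's placeholder substitution for illustration only: it supports just %s, numbered forms such as %1$s, and %% for a literal percent sign, and php_style_format and the sample file name are hypothetical.

```python
import re

def php_style_format(template, *args):
    """Fill %s and %N$s placeholders; %% yields a literal percent sign."""
    position = 0  # next argument for un-numbered %s placeholders

    def substitute(match):
        nonlocal position
        if match.group(0) == "%%":
            return "%"
        if match.group(1):
            # Numbered placeholder: %1$s refers to the first argument.
            return str(args[int(match.group(1)) - 1])
        value = args[position]
        position += 1
        return str(value)

    return re.sub(r"%%|%(?:(\d+)\$)?s", substitute, template)

source = "File %1$s of type %2$s is not allowed."
translation = "类型为%2$s的文件%1$s不允许被上传。"

# The calling code always passes the arguments in the same order;
# only the translated template decides where each one appears.
english = php_style_format(source, "photo.exe", "exe")
chinese = php_style_format(translation, "photo.exe", "exe")
```

Because the arguments are identified by number, the translator can move them freely without any change to the code that supplies them.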
The WordPress Localization Repository at http://svn.automattic.com/wordpress-i18n/ is a Subversion repository where official WordPress translations are maintained. Various teams collaborate on translations for their native language, and team maintainers commit updates and changes to the repository.
Participation in the repository is open to anyone. Simply subscribe to the wp-polyglots mailing list, introduce yourself, and let everyone know what translation you'd like to work on. If there is already a team for your language and locale, they'll let you know and you can join them. If not, you can either volunteer to be a maintainer for your language and locale, or simply contribute your localization and the repository maintainers will add it.
Note: these guidelines are subject to change as the system evolves; repository maintainers will be happy to assist you in updating the files you maintain in the repository should these guidelines change.
All localizations should have at least a UTF-8 version, but may optionally add versions in other character encodings popular for that locale.
PHP does not support Byte Order Marks (BOMs), so be sure the UTF-8 encoded files you contribute do not begin with one.
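A file can be checked for a leading UTF-8 BOM before it is contributed. The following Python sketch shows one way to do this on raw bytes; the function names are hypothetical, and many editors offer an equivalent "save without BOM" option.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """Return True if the data starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)

def strip_utf8_bom(data: bytes) -> bytes:
    """Return the data without a leading UTF-8 byte order mark, if present."""
    return data[len(codecs.BOM_UTF8):] if has_utf8_bom(data) else data

# Example: the same content with and without the three BOM bytes.
with_bom = codecs.BOM_UTF8 + "msgid \"\"".encode("utf-8")
cleaned = strip_utf8_bom(with_bom)
```

To check a file on disk, read it in binary mode (for example with open(path, "rb")) and pass the resulting bytes to has_utf8_bom().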
With a few exceptions (noted below), all translations should be written literally, rather than escaping accented and special characters with HTML character entities.
Some characters must be escaped to avoid conflict with XHTML markup: angle brackets (< and >, written as &lt; and &gt;) and ampersands (&, written as &amp;). In addition, a few other characters are better written as entities, such as non-breaking spaces (&nbsp;), angle quotes (&laquo; and &raquo;), curly apostrophes (&rsquo;), and curly quotes.
For more information about the W3C's best practices involving character encodings and character entities, see the following references:
The repository contains directories for each locale, which are named as follows:
Within each locale's directory are the regular Subversion versioning directories: branches/, tags/, and trunk/.
Inside the appropriate versioning directory are the following subdirectories:
This directory contains the Gettext MO and PO files for the locale. Message files are named after the locale name.
In the kubrick folder you should put the translation (using exactly the same PO/MO filename as above) of the i18n-ed default theme, residing at the wordpress-i18n svn repository. There is also another way of translating the default theme:
This directory contains all files in the WordPress distribution that cannot be Gettexted, which have been translated into the target locale.
If the locale has only a UTF-8 translation of the files, the dist/ directory may be populated with them directly, and the structure within dist should mirror the structure of the wordpress root directory:
If the locale contains more than just a UTF-8 character encoding, then dist/ should contain subdirectories for each encoding:
It is better to translate the i18n-ed kubrick (see the messages/ part above), instead of using theme/.
Similarly to the dist/ dir, theme/ contains hard-translated theme files. If only a UTF-8 translation is present, the directory can be populated with subdirectories for each theme translated. These subdirectories contain all of the same files as the original theme (except that they're translated), and are named the same as the original theme:
Just as with the dist/ directory, if there are multiple character encodings represented, theme/ should contain a subdirectory for each character encoding, which in turn would contain subdirectories for each theme translated.
Ryan Boren's Localizing Plugins and Themes
Translating WordPress into another language (themes and plugins too)