Chapter 15. Internationalization and localization

This chapter covers
  • The differences between translating languages and idioms
  • Using Zend_Locale to translate idioms
  • Using Zend_Translate to translate languages
  • Integrating Zend_Translate into a Zend Framework application

Most websites are written in a single language for a single country, and this makes life easier for both the designers and developers. Some projects, however, require more than this. Some countries have more than one language (in Wales, both English and Welsh are used) and some websites are intended to target all the countries that the company operates in. To create a website targeted at different countries and cultures, significant changes to the application are required to support multiple languages and the different formats for dates, times, currency, and so on, that each country uses.

We’re going to look at what needs to be done to make a multilingual website, then look at how the Zend_Locale and Zend_Translate components of Zend Framework help to make the process easier. Finally, we’ll implement a second language into the Places website to show how to create a localized application.

15.1. Translating Languages and Idioms

Before making a multilingual website, we first need to consider how language and customs affect a website. Intuitively, most people think about changing the language when they consider supporting another country on their website. Clearly, we need to display all the text in the correct language, but for some locales, there are cultural issues to consider too. The most common are the formatting of dates and currency.

For dates, the most infamous issue is that the U.S. uses mm/dd/yy whereas the UK uses dd/mm/yy, so it gets tricky determining what date 02/03/08 actually is. Is it the second of March or the third of February? Currency raises similar issues. In France, the comma is used where the UK uses a decimal point, and a space is used where the UK would use a comma. To make a French user feel at home, €1,234.56 should be displayed as €1 234,56.

The key control on a computer system for this is called the locale. The locale is a string that defines the current language and region used by the user. For example, the locale “en_GB” means English language in the Great Britain region. Similarly, “es_PR” is Spanish language in Puerto Rico. Generally, for language localization, only the first part of the locale is used, because it’s rare to find a website that provides both U.S. and UK English.

Let’s look first at what’s involved in translating languages in web applications, and then we’ll look at handling idioms.

15.1.1. Translating Languages

Translating languages involves making changes to both the HTML design and build of the website and to the PHP code that runs it. The most obvious change that’s required is that every string displayed to the user has to be in the correct language, so correct sizing of areas for text is required in the design. This also includes any text that is embedded in a graphical image, so to provide for multiple languages, the image files need to be separated out into generic and language-specific ones if text is used in images.

There are multiple methods of doing the actual translation, but they all boil down to the same thing. Every string displayed on the site needs to be mapped to a string in the target language. It follows that it’s important that the strings are rewritten using a professional translator, because there is rarely a one-to-one mapping from one language to another. Industry-standard systems, such as gettext(), have many tools dedicated to making it simple for a translator to perform the translation of phrases used in the application without having to know anything about programming.

15.1.2. Translating Idioms

The most obvious idioms that need translation are the formatting of currency and dates. PHP has locale support built in that is set using the setlocale() function. Once the locale is set, all the locale-aware functions will use it. This means that strftime() will use the correct language for the months of the year and money_format() will use the comma and period characters in the right places for the language involved. One gotcha is that setlocale() isn’t thread-safe, and the strings you need to set are inconsistent across operating systems, so care must be taken using it. Also, some functions like money_format() aren’t available on all operating systems, such as Windows. The Zend_Locale component is intended to mitigate these issues, and it also provides additional functionality, like normalization.

Now that we know a little about what localization and internationalization are, let’s look at what Zend Framework provides to make the process of creating an international website easier. We’ll start by looking at Zend_Locale’s ability to convert numbers and dates before moving on to investigate Zend_Translate’s functionality for providing translated text.

15.2. Using Zend_Locale and Zend_Translate

Zend_Locale and Zend_Translate are the key components of Zend Framework for providing a multilingual, worldwide website. Other components that are locale-aware are Zend_Date and Zend_Currency. Let’s look at Zend_Locale first.

15.2.1. Setting the Locale with Zend_Locale

Selecting the correct locale is as easy as this:

  $locale = new Zend_Locale('en_GB');

This will create a locale object for the English language, Great Britain region. This means that a locale always contains two parts: the language and the region. We need to know both before we can specify the locale string when creating an instance of Zend_Locale. We can also create a Zend_Locale object for the locale of the user’s browser, like this:

  $locale = new Zend_Locale();

The locale object can then be used for translating lists of common strings, such as countries, units of measurement, and time information, such as month names and days of the week. We can also retrieve the language and region using this code:

  $language = $locale->getLanguage();
  $region = $locale->getRegion();

Clearly, we can then use this information to provide websites in the correct language with the right formatting of dates, times, and currencies, which will make our user feel right at home. Let’s look at numbers first.

Dealing With Numbers

The most significant regional problem with numbers is that some countries use the comma to separate the decimal places from the whole number, and some countries use the period character. If your website allows the user to enter a number, you may have to convert it appropriately. This is known as normalization.

Consider a form that asks someone to enter her monthly insurance costs in order to try to provide a cheaper quotation. A German user might type in the number 3.637,34 (three thousand, six hundred, and thirty-seven euros, and thirty-four cents), which you need to normalize to 3637.34. This is achieved using the code shown in listing 15.1.

Listing 15.1. Number normalization with Zend_Locale
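A minimal sketch of this normalization, assuming Zend Framework 1.x is on the include path and that the static Zend_Locale_Format::getNumber() method is the call intended; the variable names are illustrative.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Locale.php';
require_once 'Zend/Locale/Format.php';

// The locale of the user who typed the number.
$locale = new Zend_Locale('de_DE');

// Convert the German-formatted input into a plain, machine-readable number.
$number = Zend_Locale_Format::getNumber('3.637,34', array(
    'locale'    => $locale,
    'precision' => 2,
));

print $number . "\n"; // the text above expects 3637.34
```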

We can then process the number as appropriate, and we may need to display a number to the user. In this case, we again need to format the number appropriately for the user’s location, and we can use Zend_Locale’s toNumber() function to do this, as shown in listing 15.2.

Listing 15.2. Number localization with Zend_Locale
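A minimal sketch of the reverse operation, assuming Zend Framework 1.x is on the include path and that the toNumber() function referred to is the static Zend_Locale_Format::toNumber() method; the variable names are illustrative.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Locale.php';
require_once 'Zend/Locale/Format.php';

$locale = new Zend_Locale('de_DE');

// Format a plain number for display to a German user.
// 'precision' rounds the value to the given number of decimal places.
$display = Zend_Locale_Format::toNumber(3637.34, array(
    'locale'    => $locale,
    'precision' => 2,
));

print $display . "\n"; // for de_DE this should use a period for thousands and a comma for decimals
```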

The precision parameter is optional and is used to round the provided number to the given number of decimal places.

This covers the basics of what Zend_Locale can do with numbers, but Zend_Locale provides complete number handling, including translation of numbers between different numeral systems, such as from Arabic to Latin. There is also support for integer and floating-point number normalization and localization. The manual gives full information on these functions.

Date and Time With Zend_Locale

Handling the formatting of dates and times is also within the province of Zend_Locale. This class operates in conjunction with Zend_Date to provide comprehensive support for reading and writing dates. Let’s start by looking at normalizing dates because, like numbers, different regions of the world write dates in different formats, and residents obviously use their local language for the names of the months and days of the week.

Consider the date 2 March 2007. In the UK, this may be written as 2/3/2007; in the U.S. it would be written as 3/2/2007. To use the date supplied by our users, we need to normalize it, and getDate() is the function to use, as shown in listing 15.3.

Listing 15.3. Date normalization with Zend_Locale
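A minimal sketch of date normalization, assuming Zend Framework 1.x is on the include path and that getDate() refers to the static Zend_Locale_Format::getDate() method. The input string is an assumed example of what a U.S. visitor might type.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Locale.php';
require_once 'Zend/Locale/Format.php';

$locale = new Zend_Locale('en_US');

// Split the user-supplied date into its components using the locale's rules.
$date = Zend_Locale_Format::getDate('3/2/2007', array('locale' => $locale));

// The result is an array with entries such as 'day', 'month', and 'year'.
$month = $date['month'];
print "Month: $month\n";
```

Swapping the locale for en_GB would cause the same string to be read day-first.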

As usual, we create a locale object for the correct language and region and use it with the getDate() function. We’ve used the en_US locale for the U.S., so getDate() correctly determines that the month is March. If we changed the locale to en_GB, the month would be February.

Similarly, we can use checkDateFormat() to ensure that the date string received is valid for the locale, and once we have the date information separated into its components, we can manipulate it in any way we like.

Now that we know the basics of using Zend_Locale to help our international visitors feel at home, let’s have a look at Zend_Translate’s ability to help us present our site in different languages.

15.2.2. Translating with Zend_Translate

As we’ve already seen, website translation requires, at a minimum, ensuring that every string that is displayed has a translated version. The most common way to do this is with gettext(), which is powerful but fairly complicated. Zend_Translate supports the gettext() format but also supports other popular formats such as arrays, CSV, TBX, Qt, XLIFF, and XmlTm. Zend_Translate is also thread-safe, which can be very helpful if you’re running a multithreaded web server, such as IIS.

Zend_Translate supports multiple input formats using an adapter system. This approach is very common in Zend Framework and allows for further adapters to be added as required. We’ll look at the array adapter first, because that’s a very simple format and very quick to learn. Its most common use is with Zend_Cache, to cache the translations from one of the other input formats.

Using Zend_Translate is simple enough. In listing 15.4, we output text using pretty ordinary PHP and repeat the exercise in listing 15.5 using Zend_Translate’s array adapter. In this case, we make life easier by “translating” to uppercase, but we could equally have translated to German (if any of us knew enough German to avoid embarrassing ourselves).

Listing 15.4. Standard PHP output
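A minimal sketch of the kind of untranslated script being described: three lines of plain output with the current date embedded. The exact wording of the lines is an assumption chosen for illustration.

```php
<?php
// Three lines of plain text, e.g. for a command-line script.
$lines = array(
    'Welcome',
    '=======',
    "Today's date is " . date('d/m/Y'),
);

foreach ($lines as $line) {
    print $line . "\n";
}
```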

Listing 15.4 is a very simple piece of code that displays three lines of text, maybe for a command-line script. To provide a translation, we need to create an array of translation data for the target language. The array consists of identifying keys mapped against the actual text to be displayed. The keys can be anything, but it makes things easier if they’re essentially the same as the source-language text.

Listing 15.5. Translated version of Listing 15.4
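A minimal sketch of the same output routed through Zend_Translate’s array adapter, assuming Zend Framework 1.x is on the include path. The keys and the uppercase “translations” are illustrative, not the book’s exact data.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Translate.php';

// Keys are the source-language strings; values are the "translated" text.
$data = array(
    'Welcome'               => 'WELCOME',
    '======='               => '=======',
    "Today's date is %1\$s" => "TODAY'S DATE IS %1\$s",
);

// 'array' selects the array adapter; 'en' is the locale these strings belong to.
$translate = new Zend_Translate('array', $data, 'en');

print $translate->_('Welcome') . "\n";
print $translate->_('=======') . "\n";

// The translated string still carries the printf() placeholder,
// so the date can land wherever the target language needs it.
printf($translate->_("Today's date is %1\$s") . "\n", date('d/m/Y'));
```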

In this example, we use the _() function to do the translation. This is a very common function name in many programming languages and frameworks for translation. It’s a very frequently used function, and it’s less distracting in the source code if it’s short. As you can see, the _() function also supports the use of printf() placeholders so you can embed dynamic text into the correct place within a string. The current date is a good example, because in English we say, “Today’s date is {date},” whereas in another language the idiom may be, “{date} is today’s date.” By using printf() placeholders, we’re able to move the dynamic data to the correct place for the language construct used.

The array adapter is mainly useful for very small projects, where the PHP developers update the translation strings. For a large project, the gettext() format or CSV format is much more useful. For gettext(), the translation text is stored in .po files, which are best managed using a specialized editor, such as the open source poEdit application. These applications provide a list of the source language strings, and next to each one the translator can type the target-language equivalent string. This makes creating translation files relatively easy and completely independent of the website source code.

The process of using gettext() source files with Zend_Translate is as simple as picking a different adapter, as shown in listing 15.6.

Listing 15.6. Zend_Translate using the gettext() adapter
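A minimal sketch of switching to the gettext adapter, assuming Zend Framework 1.x is on the include path. The file path is hypothetical, and note that the adapter reads the compiled binary .mo catalogue that gettext tools generate from the .po source.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Translate.php';

// Hypothetical location of a compiled gettext catalogue for German.
$file = '/path/to/translations/de.mo';

if (is_readable($file)) {
    // Only the adapter name and data source differ from the array version.
    $translate = new Zend_Translate('gettext', $file, 'de');

    print $translate->_('Welcome') . "\n";
    printf($translate->_("Today's date is %1\$s") . "\n", date('d/m/Y'));
}
```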

As you can see, the use of the translation object is exactly the same, regardless of the translation source adapter used.

Now let’s look at integrating what we’ve learned into a real application. We’ll use our Places application and adapt it to support two languages.

15.3. Adding a Second Language to the Places Application

Our initial goal for making Places multilingual is to present the user interface in the language of the viewer. We’ll use the same view templates for all languages and ensure that all phrases are translated appropriately. Each language will have its own translation file stored, in our case using the array adapter to keep things simple. If the user requests a language for which we don’t have a translation, we’ll use English. The result can be seen in figure 15.1, which shows the German version of Places. (Note that most of the translation text was done using Google Translate—a professional translator would do a much better job of it!)

Figure 15.1. The text on the German version of Places is translated, but the same view templates are used to ensure that adding additional languages doesn’t require too much work.

These are the key steps we’ll be taking to make Places multilingual:

1.  Change the default router to support a language element.

2.  Create a front controller plug-in to create a Zend_Translate object and load the correct language file. It will also create a Zend_Locale object.

3.  Update the controllers and views to translate text.

We’ll start by looking at how to make the front controller’s router multi-language aware so that the user can select a language.

15.3.1. Selecting the Language

The first decision to make is how to determine the user’s language choice. The easiest solution is to ask the web browser by using Zend_Locale’s getLanguage() function, and then to store this into the session. There are a few problems with this approach. First, sessions rely on cookies, so the user would have to have cookies enabled in order to view the site in another language. Second, creating a session for every user involves overhead that we may not want to bear. Third, search engines like Google would see only the English version of the site.

To solve these problems, the code for the language of choice should be held within the URL, and there are two places we can put it: the domain or the path. To use the domain, we’d need to buy the relevant domains, such as placestotakethekids.de, placestotakethekids.fr, and so on. These country-specific domains offer a very simple solution and, for a commercial operation, can show your customers that you’re serious about doing business in their country. Problems that may arise are that the domain name may not be available for the country of choice (for example, apple.co.uk isn’t owned by Apple Inc.) and for some country-specific domains you need to have proof of business incorporation within that country in order to purchase the domain name. One alternative is to include the language code as part of the path, such as www.placestotakethekids.com/fr for French and www.placestotakethekids.com/de for the German language. We’ll use this approach.

Because we wish to use full locales for each language code, we need to map from the language code used in the URL to the full locale code. For example, /en will be mapped to en_GB, /fr to fr_FR, and so on for all supported languages. We’ll use our configuration INI file to store this mapping, as shown in listing 15.7.

Listing 15.7. Setting locale information in config.ini
languages.en = en_GB
languages.fr = fr_FR
languages.de = de_DE

The list of valid language codes and their associated locales are now available in the $config object that was loaded in the Bootstrap class and stored in the Zend_Registry. We can retrieve the list of supported language codes like this:

  $config = Zend_Registry::get('config');
  $languages = array_keys($config->languages->toArray());

To use the language codes within the address, we need to alter the routing system to account for the additional parameter. The standard router interprets paths of the form

  /{module}/{controller}/{action}/{other_parameters}

where {module} is optional. A typical path for Places is

  /place/index/id/4

This calls the index action of the place controller with the id parameter set to 4. For our multilingual site, we need to introduce the language as the first parameter, so that the path now looks like this:

  /{language}/{controller}/{action}/{other_parameters}

We’ll use the standard two-character codes for the language, so that a typical path for the German language version of Places is

  /de/place/index/id/4

To change this, we need to implement a new routing rule and replace the default route with it. The front controller will then be able to do its magic and ensure that the correct controller and action are called. This is done in the Bootstrap class’s runApp() method, as shown in listing 15.8.

Listing 15.8. Implementing a new routing rule for language support
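A minimal sketch of such a routing rule, assuming Zend Framework 1.x is on the include path. The parameter name lang and the way the front controller is obtained are assumptions made for illustration.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Locale.php';
require_once 'Zend/Controller/Front.php';
require_once 'Zend/Controller/Router/Route.php';

$frontController = Zend_Controller_Front::getInstance();

// With no argument, Zend_Locale tries to detect the visitor's locale.
$locale = new Zend_Locale();

// ':name' marks a variable segment; '*' collects all remaining key/value pairs.
$route = new Zend_Controller_Router_Route(
    ':lang/:controller/:action/*',
    array(
        'lang'       => $locale->getLanguage(), // default when no language is in the URL
        'controller' => 'index',
        'action'     => 'index',
    )
);

// Registering the route under the name 'default' replaces the standard route.
$frontController->getRouter()->addRoute('default', $route);
```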

We use Zend_Controller_Router_Route to define the route using the colon (:) character to define the variable parts with language first, then the controller, the action, and the asterisk, which means “all other parameters”.

We also define the defaults for each part of the route, for when it’s missing. Because we’re replacing the default route with our new route, we keep the same defaults so that the index action of the index controller is called for an empty address. We set the default language to the browser’s default language, as determined by Zend_Locale’s getLanguage() method. However, setting this as a default means that if the user chooses a specific language, the choice will take precedence. This allows people using a Spanish browser to view the site in English, for instance.

Now that we have routing working, we need to load the translation files. We need to do this after the routing has happened, but before we get to the action methods. The dispatchLoopStartup() method hook of a front controller plug-in is the ideal vehicle to perform this work.

15.3.2. The LanguageSetup Front Controller Plug-in

A front controller plug-in has a number of different hooks into the various stages of the dispatch process. In this case, we’re interested in the dispatchLoopStartup() hook, because we want to load the language files after routing has happened, but we only need it to be called once per request. Our plug-in, LanguageSetup, will be stored in the Places library and follows Zend Framework naming guidelines in order to take advantage of Zend_Loader. The full class name is Places_Controller_Plugin_LanguageSetup, and it’s stored in library/Places/Controller/Plugin/LanguageSetup.php. These are the main functions it performs:

  • Loads language file containing array of translations
  • Instantiates Zend_Translate object for the selected language
  • Assigns language string and Zend_Translate object to the controller and view

All this is done in the dispatchLoopStartup() method. In order to do its work, it will need to know the directory where the language files are stored and also the list of languages available. This information is in the Bootstrap class, so we pass it to the plug-in via its constructor.

The plug-in’s constructor takes two parameters: the directory where the translation files can be found and the list of languages from the config file. We could get the plug-in to figure out these values, but we prefer to leave the specific knowledge of the directory system to the Bootstrap class so that if we change anything, all changes will be in one place. Similarly, the plug-in could retrieve the list of languages from the Zend_Config object directly, but this would introduce a coupling to this component that isn’t needed.

Let’s start building the LanguageSetup plug-in now. The first part is the constructor, shown in listing 15.9.

Listing 15.9. LanguageSetup front controller plug-in
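A minimal sketch of the plug-in’s constructor, assuming Zend Framework 1.x is on the include path. The member variable names are assumptions; only the class name and the two constructor arguments come from the text.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Controller/Plugin/Abstract.php';

class Places_Controller_Plugin_LanguageSetup
    extends Zend_Controller_Plugin_Abstract
{
    /** @var string Directory containing the translation files */
    protected $_languagesDirectory;

    /** @var array Map of language code to full locale, e.g. 'de' => 'de_DE' */
    protected $_languages;

    public function __construct($languagesDirectory, array $languages)
    {
        // Store both values for use later in dispatchLoopStartup().
        $this->_languagesDirectory = $languagesDirectory;
        $this->_languages = $languages;
    }
}
```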

We need to register the new plug-in with the front controller in the runApp() method of the Bootstrap class. This is done in the same way as registering the ActionSetup plug-in back in Chapter 3, and it looks like this:

  $frontController->registerPlugin(
     new Places_Controller_Plugin_LanguageSetup(
        ROOT_DIR . '/application/configuration/translations',
        $config->languages->toArray()));

We take advantage of the ROOT_DIR to absolutely specify the translations directory.

Now that we’ve registered the plug-in and have stored the required data in local member variables, we can write the dispatchLoopStartup() method that sets up the locale and translation objects. This is shown in listing 15.10.

Listing 15.10. The LanguageSetup’s dispatchLoopStartup() method
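A minimal sketch of the complete plug-in, assuming Zend Framework 1.x is on the include path. The registry keys lang and Zend_Locale, the member names, and the exception message are assumptions; the Zend_Translate key is the one the translate view helper looks for.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Controller/Plugin/Abstract.php';
require_once 'Zend/Locale.php';
require_once 'Zend/Translate.php';
require_once 'Zend/Registry.php';

class Places_Controller_Plugin_LanguageSetup
    extends Zend_Controller_Plugin_Abstract
{
    protected $_languagesDirectory;
    protected $_languages;

    public function __construct($languagesDirectory, array $languages)
    {
        $this->_languagesDirectory = $languagesDirectory;
        $this->_languages = $languages;
    }

    public function dispatchLoopStartup(Zend_Controller_Request_Abstract $request)
    {
        // The router stored the language segment as a request parameter.
        $lang = $request->getParam('lang');

        // Fall back to English for any language we don't support.
        if (!is_string($lang) || !array_key_exists($lang, $this->_languages)) {
            $lang = 'en';
            $request->setParam('lang', $lang);
        }

        $localeString = $this->_languages[$lang];
        $file = $this->_languagesDirectory . '/' . $localeString . '.php';

        // The language file is expected to define $translationStrings.
        $translationStrings = null;
        if (file_exists($file)) {
            include $file;
        }
        if (!is_array($translationStrings)) {
            throw new Exception("No translation strings found for '$lang'");
        }

        $locale = new Zend_Locale($localeString);
        $translate = new Zend_Translate('array', $translationStrings, $lang);

        // Make everything available to controllers, views, and forms.
        Zend_Registry::set('lang', $lang);
        Zend_Registry::set('Zend_Locale', $locale);
        Zend_Registry::set('Zend_Translate', $translate);
    }
}
```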

The code in dispatchLoopStartup() consists of as much error checking as actual code, which isn’t uncommon. First, we collect the language that the user has chosen. This is in the Request object, so it is accessible via getParam().

Before we load a language file, we first check that the selected language is available, because the user could, in theory, type any text within the language element of the address path. Because we only have a limited number of language files in the translations directory, we check that the user’s choice is available. If not, we pick English instead. Similarly, if the language is valid, we double-check that the file actually exists, in order to avoid errors later, and we load the file using a simple include statement. An assumption is made that the language file defines an array named $translationStrings. The array contains the actual translations, so we throw an exception if this array doesn’t exist.

Having completed our error checking, we instantiate a new Zend_Translate object. Finally, we register everything with Zend_Registry so that the information can be used later. Registering the Zend_Translate object with the registry also means that Zend_Form and Zend_Validate will use it for translating form labels, form buttons, and validation error messages.

As we discovered when we looked at Zend_Translate, the system supports multiple adapters to allow for a variety of input sources for the translation strings. For Places, we’ve chosen arrays, because they’re the simplest to get going, but if the site grows significantly, moving to gettext() would be easy to do and would require changes to just this dispatchLoopStartup() method.

The next stage is to use the translate object to translate our website.

15.3.3. Translating the View

For our website to be multilingual, we need every piece of English text on every page to be changed so it runs through the _() method of Zend_Translate. Zend Framework provides the translate view helper to make this easy. This view helper needs access to a Zend_Translate object, and the easiest way to give it one is to register one with Zend_Registry using the key “Zend_Translate”, as we did in listing 15.10.

In the view scripts, we can change the original <h2>Recent reviews</h2> on the home page to <h2><?php echo $this->translate('Recent reviews'); ?></h2>. Let’s look at it in use on the home page. Listing 15.11 shows the top part of the index.phtml view template for the home page before localization.

Listing 15.11. The top part of the non-localized index.phtml
<h1><?php echo $this->escape($this->title);?></h1>

<p>Welcome to <em>Places to take the kids</em>! This site will
help you to plan a good day out for you and your children. Every
place featured on this site has been reviewed by people like you,
so you'll be able to make informed decisions with no marketing
waffle!</p>

<h2>Recent reviews</h2>

As you can see, with the exception of the title, all the text in the view template is hard-coded directly, so we have to change this, as shown in listing 15.12.

Listing 15.12. The localized version of index.phtml
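A minimal sketch of the localized template, written as a fragment of a Zend_View script, so $this is the view object. It assumes that a Zend_Translate object is registered under the Zend_Translate registry key and that the title is itself a key in the translation file.

```php
<?php /* Fragment of index.phtml; runs inside Zend_View, not standalone. */ ?>
<h1><?php echo $this->escape($this->translate($this->title)); ?></h1>

<?php /* The long paragraph uses a short key; its value contains markup, so it isn't escaped. */ ?>
<p><?php echo $this->translate('welcome-body'); ?></p>

<h2><?php echo $this->translate('Recent reviews'); ?></h2>
```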

With the localized template, every string is passed through the translate() view helper. For the very long body of text, we use a simple key (welcome-body) in order to make the template easier to understand and the language file simpler.

The relevant parts of the language files for English and German are shown in listings 15.13 and 15.14. The full files are provided in the accompanying source code.

Listing 15.13. The English translation file, en_GB.php
<?php
$translationStrings = array(
   'Welcome to Places to take the kids!' =>
      'Welcome to Places to take the kids!',
   // The long text is concatenated so no stray line breaks end up in the string
   'welcome-body' => 'Welcome to <i>Places to take the kids</i>! '
      . 'This site will help you to plan a good day out for you and your '
      . 'children. Every place featured on this site has been reviewed by '
      . 'people like you, so you\'ll be able to make informed decisions '
      . 'with no marketing waffle!',
   'Recent reviews' => 'Recent reviews',
);

And the equivalent file in German is shown in listing 15.14.

Listing 15.14. The German translation file, de_DE.php
<?php
$translationStrings = array(
   'Welcome to Places to take the kids!' =>
      'Willkommen bei Places to take the kids!',
   'welcome-body' => 'Willkommen bei <i>Places to take the kids</i>! '
      . 'Diese Website wird Ihnen helfen, einen guten Tag für Sie und Ihre '
      . 'Kinder zu planen. Jeder auf dieser Website präsentierte Ort wurde '
      . 'von Menschen wie Sie es sind geprüft, damit Sie in der Lage sind, '
      . 'fundierte Entscheidungen treffen zu können, ohne den Marketing '
      . 'Quatsch!',
);

We had a little help with the translation of welcome-body. Thomas Wiedner looked at what Google Translate had come up with and provided the correct translation, because Google’s wasn’t very good! But it’s quite clear when looking at listings 15.13 and 15.14 how easy array-based translations are to create.

The final area we need to look at is creating links from one page to another. Fortunately, Zend Framework provides a URL builder for us in the shape of the url() view helper. Listing 15.15 shows how we can use it.

Listing 15.15. Creating URLs using the url() view helper
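A minimal sketch of building a language-aware link with the url() view helper, written as a fragment of a Zend_View script. The view variable $this->lang, the target page, and the link text are hypothetical; the route name default matches the replaced default route.

```php
<?php
// Fragment of a view script; runs inside Zend_View, not standalone.
// $this->lang is a hypothetical view variable holding the current language code.
$url = $this->url(
    array(
        'lang'       => $this->lang,
        'controller' => 'place',
        'action'     => 'index',
        'id'         => 4,
    ),
    'default', // name of the route used to assemble the URL
    true       // $reset: ignore parameters from the current request
);
?>
<a href="<?php echo $url; ?>"><?php echo $this->translate('Recent reviews'); ?></a>
```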

As can be seen in listing 15.15, creating a URL is simplified by using the url() view helper. It can be further simplified by not passing true as the last parameter, which is called $reset. When $reset is false (the default) the helper will “remember” the state of any parameters you don’t override. Because the lang parameter would never be overridden, it only has to be specified if the $reset parameter is set to true.

We’ve now provided comprehensive support for all aspects of translation using Zend_Translate, and adding languages can be done very easily by adding the language to config.ini and writing a translation file.

It’s traditional to provide a mechanism for allowing users to choose the language that they wish to view the site in. One common mechanism is the flag, because it works in all languages, although it can annoy UK citizens when they see the U.S. flag indicating English. An alternative is to use text in each language, but that can be quite hard to integrate into a site’s design.

To complete our conversion of Places into a multilingual website and make our visitors feel at home, we need to ensure that we translate the dates displayed throughout the site using Zend_Locale.

15.3.4. Displaying the Correct Date with Zend_Locale

If you look closely at Figure 15.1, you’ll notice that the dates are in the wrong format and are displaying the English month name rather than the German one. This is due to our view helper, displayDate(), which isn’t correctly localized. We’ll now look at how to localize dates. As a reminder, the original code for displaying dates is shown in listing 15.16.

Listing 15.16. Naïve localization of dates
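A minimal sketch of a strftime()-based date helper of the kind being described. The class name, default format, and use of strtotime() are assumptions. Note that strftime() is deprecated as of PHP 8.1, so this reflects the PHP 5 era in which the chapter was written.

```php
<?php
// Hypothetical view helper; the name follows the default Zend_View helper prefix.
class Zend_View_Helper_DisplayDate
{
    /**
     * Format a date string for display.
     * strftime() uses whatever locale the server process currently has set.
     */
    public function displayDate($timestamp, $format = '%d %B %Y')
    {
        return strftime($format, strtotime($timestamp));
    }
}

$helper = new Zend_View_Helper_DisplayDate();
print $helper->displayDate('2007-03-02') . "\n";
```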

We’re using strftime() in listing 15.16, which, according to the PHP manual, is locale aware. Unfortunately, although we know which locale we want, strftime() uses the locale of the server unless you tell it otherwise. There are two solutions: use setlocale() or Zend_Date.

At first glance, using setlocale() is very tempting. We just need to add the following line to our LanguageSetup front controller plug-in:

  setlocale(LC_TIME, $lang);

However, this doesn’t work as expected. The first problem with setlocale() is that it isn’t thread-safe. This means that if you’re using a threaded web server, you need to call setlocale() before every use of a locale-aware PHP function. The second issue is that on Windows, the string that is passed to the setlocale() function isn’t the same as the string we’re using in our config.ini file. That is, we’re using “de” for German, but the Windows version of setlocale() expects “deu”. Zend_Date, as a member of Zend Framework, works more predictably for us.

Zend_Date is a class that comprehensively handles date and time manipulation and display. It’s also locale-aware, and if you pass it a Zend_Locale object, it will use it to translate all date- and time-related strings. Our immediate requirement is simply to retrieve dates in the correct format for the user, so we modify our displayDate() view helper, which is stored in views/helpers/DisplayDate.php, as shown in listing 15.17.

Listing 15.17. Localization of dates using Zend_Date
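A minimal sketch of the helper rewritten around Zend_Date, assuming Zend Framework 1.x is on the include path and that the plug-in stored the locale under the registry key Zend_Locale. Converting the input with strtotime() first is an assumption of this sketch, made so that Zend_Date receives an unambiguous Unix timestamp.

```php
<?php
// Assumption: Zend Framework 1.x is available on the include path.
require_once 'Zend/Date.php';
require_once 'Zend/Locale.php';
require_once 'Zend/Registry.php';

class Zend_View_Helper_DisplayDate
{
    public function displayDate($timestamp, $format = Zend_Date::DATE_LONG)
    {
        // Locale object stored earlier (registry key is an assumption).
        $locale = Zend_Registry::get('Zend_Locale');

        // Build the date from a Unix timestamp so parsing isn't locale-dependent.
        $date = new Zend_Date(strtotime($timestamp), Zend_Date::TIMESTAMP, $locale);

        // Zend_Date constants such as DATE_LONG are rendered per locale.
        return $date->get($format);
    }
}

Zend_Registry::set('Zend_Locale', new Zend_Locale('de_DE'));
$helper = new Zend_View_Helper_DisplayDate();
print $helper->displayDate('2007-03-02') . "\n";
```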

The LanguageSetup front controller plug-in stored the locale object into the registry, so we can simply retrieve it for use with Zend_Date. Displaying a locale-aware date is simply a matter of creating the Zend_Date object from the $timestamp to be displayed, then calling get(). The get() method takes a format parameter that can be a string or a Zend_Date constant. These constants are locale-aware, so Zend_Date::DATE_LONG will display the month name in the correct language. Similarly, Zend_Date::DATE_SHORT knows that the format of the short date is dd/mm/yy in the UK and mm/dd/yy in the U.S. In general, we recommend staying away from short dates, because they can confuse users who aren’t used to locale-aware websites.

The German version of Places, including localized dates, can be seen in figure 15.2.

Figure 15.2. Using the locale-aware Zend_Date, we can display the date in the correct language

As you can see in figure 15.2, the date string created by Zend_Date::DATE_LONG for our German users is as they’d expect. As a result, our German guests realize that they’re first-class citizens on this site, and not an afterthought.

15.4. Summary

Zend_Locale and Zend_Translate make creating multilingual, locale-aware websites much easier. Of course, creating a website in multiple languages isn’t easy, because care needs to be taken to ensure that the text fits in the spaces provided, and you need to make sure you have translations that work!

Zend_Locale is the heart of localization in Zend Framework. It allows for normalizing numbers and dates written in different formats so that they can be stored consistently and used within the application.

Zend_Translate translates text into a different language using the _() method. It supports multiple input formats, including the widespread gettext() format, so that you can choose the most appropriate format for your project. Small projects will use array and CSV formats, whereas larger projects are more likely to use gettext, TBX, Qt, XLIFF, and XmlTm. Through the flexibility of Zend_Translate’s adapter system, migrating from a simpler system to a more robust one doesn’t affect the main code base at all.

That covers the internationalization and localization of applications, so we’ll move on to a different sort of translation: output formats. Although all websites can be printed, it’s sometimes easier and better to provide a PDF version of a page. Zend_Pdf enables the creation and editing of PDF documents with a minimum of fuss, so we’ll look at it in the next chapter.
