Multilingual websites and webapplications using PHP and Smarty, part 2: dictionary based templates

Joor Loohuis, July 26, 2009, 30271 views.

One of the components of a multilingual website or web application is support for different languages. Typically, the translations are not provided by web developers but by less technically oriented people, who prefer a low-tech approach to translating. This article demonstrates a dictionary-based translation model for Smarty templates that is easy to implement, maintain, and extend.


As explained in the first article in this series, multilingual websites and web applications are becoming the standard. We've seen how the language and locale preferences of the visitors are passed in each request, and now it is time to put this information to work. In this episode we'll demonstrate a strategy for setting up a multilingual website. We'll also use Smarty functionality to set up easily maintainable dictionaries that translate interfaces automatically. The objective is to allow maintenance of these dictionaries by people who are not necessarily technically oriented.

Directory based language selection

Before delving into the main topic, a little attention should be given to translating static website content. The method for separating a site into directories per language demonstrated in the first article allows you to write separate static pages for each language. There is a very good reason for choosing this directory based approach: it allows the visitor to select a different language if the initial selection is not correct (imagine a Dutch person visiting China, who wants to update her/his blog in an internet cafe). It also allows for deep linking to content in a specific language, regardless of the browser preferences. In short, you should only detect the language for the initial request to the homepage, and always provide some way of selecting one of the other languages.
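A language selector along these lines could be placed in a shared header template. This is only a sketch: it assumes that every language lives in its own top-level directory (here the hypothetical /en/ and /nl/), and that each page exists under the same file name in every language directory.

```smarty
{* Sketch of a language selector; the /en/ and /nl/ directories are assumptions *}
{* Strip the leading language directory, so /nl/contact.php becomes contact.php *}
{assign var='page' value=$smarty.server.SCRIPT_NAME|regex_replace:'#^/\w+/#':''}
<ul>
  <li><a href="/en/{$page|escape}">English</a></li>
  <li><a href="/nl/{$page|escape}">Nederlands</a></li>
</ul>
```

Because the links point at the same page in another language directory, the visitor stays on the page she or he was reading instead of being sent back to the homepage.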

Application interfaces

It gets interesting when we need to display an interface that is defined in a Smarty template. As a simple example, suppose we have a contact form in our multilingual site, for which we want to translate the field labels. The form could look something like this:

<form>
  <label for="name">Your name: <input id="name" type="text" name="name"></label>
  <label for="email">Your email: <input id="email" type="text" name="email"></label>
  <textarea name="question">Your question</textarea>
  <input type="submit" name="submit" value="Send">
</form>

What we want is to replace all labels and other text that the user sees with the correct translation for each language we intend to support. We do that here with an undervalued feature of Smarty: config files. Config files are an easy way of defining constants for use in templates, and they can be managed centrally, which is exactly what we want to achieve. Config files are similar to Windows .ini files, in the sense that they consist of symbols to which values are assigned. Like .ini files, config files can be divided into sections, and a template can load a specific section explicitly. You probably understand where this is going: we use a distinct section for each language we need to support. An example dictionary for the form above is:

# dictionary.cfg
# default language is English
contact_name = Your name
contact_email = Your email
contact_question = Your question
contact_submit = Send

# Dutch translation
[nl]
contact_name = Je naam
contact_email = Je email
contact_question = Je vraag
contact_submit = Verzenden

This dictionary consists of a global section at the top with all the symbols and values in the default language, and a separate section for the Dutch translations. Each section is identified by a label in brackets at the start of the section, and we use the language code as that label. Extending this dictionary with a new language is as simple as copying the global section, adding a section label with the correct language, and translating all values assigned to the symbols. The way this dictionary is applied to a template is as follows:

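{* Derive the language from the first directory in the request path, e.g. /nl/contact.php gives nl *}
{assign var='lang' value=$smarty.server.SCRIPT_NAME|regex_replace:'#(^/|\W.*)#':''}
{* Load the global (default) section plus the section for this language *}
{config_load file='dictionary.cfg' section=$lang}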
<form>
  <label for="name">{#contact_name#}: <input id="name" type="text" name="name"></label>
  <label for="email">{#contact_email#}: <input id="email" type="text" name="email"></label>
  <textarea name="question">{#contact_question#}</textarea>
  <input type="submit" name="submit" value="{#contact_submit#}">
</form>

The detection of the language based on the directory should be familiar by now. The language is used to load the appropriate section of the dictionary. For the rest, all occurrences of literal strings are replaced by symbolic references. A useful side effect of this way of working is that if a symbol is missing from a language section, or if an entire language section is missing, the system will fall back to the default language defined in the global section.
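The fallback can be checked with a small experiment. The fragment below is a sketch that assumes the dictionary.cfg shown earlier, and uses 'de' as a hypothetical language for which no section has been written yet.

```smarty
{* 'de' is a hypothetical language that has no section in dictionary.cfg *}
{config_load file='dictionary.cfg' section='de'}
{* The global section still supplies contact_submit, so the English value is used *}
<input type="submit" name="submit" value="{#contact_submit#}">
```

Given the behaviour described above, the button should be rendered with the English label from the global section. The same applies when a section exists but a single symbol has been left out of it.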

How you organize your dictionaries depends on the complexity of your application and your personal preferences. You could simply place all symbols in a single file, but if the application becomes more complex, it is probably wise to use a separate config file for each template. You could even adapt the code above to load a separate config file for each language, but then you would lose the big advantage of the system falling back to a default language if a symbol is omitted or misspelled.
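One possible way to split the dictionaries is sketched below: a template loads a shared dictionary and then its own one. The file names common.cfg and contact.cfg, and the symbol site_title, are hypothetical and only serve as illustration; $lang is assumed to be set from the directory as in the contact form template.

```smarty
{* common.cfg and contact.cfg are hypothetical file names *}
{config_load file='common.cfg' section=$lang}
{config_load file='contact.cfg' section=$lang}
{* site_title is assumed to be defined in common.cfg, contact_name in contact.cfg *}
<h1>{#site_title#}</h1>
<label for="name">{#contact_name#}: <input id="name" type="text" name="name"></label>
```

Each file keeps its own global section, so the fallback to the default language keeps working per file.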

Caveats

As always, there are some limitations that apply:

  • The principal downside of the approach described above is that it is simplistic: it doesn't support a fundamental property of language, the distinction between singular and plural. Depending on the complexity of your interfaces you might be able to work around this with a few simple tests and separate symbols for singular and plural. If the interfaces become too complex for this, you should look into Gettext support, but this makes maintenance and extension a lot more complicated.
  • Another downside is that this approach works well for left-to-right languages, and probably for right-to-left languages, but probably not when both need to be supported. I'm not even mentioning support for vertical text orientation. Of course, if you need to support more than one text orientation, you have much bigger design challenges than just adding language support.
  • Also note that the method described here translates the interface itself, but not the data. So if you have, say, a talkback feature on your blog that switches to the correct language for all interface elements, the actual posts will of course still be in the language the original posters used.
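The workaround for singular and plural mentioned in the first caveat could look something like the sketch below. It assumes that the PHP script assigns a $count variable, that $lang has been set from the directory as before, and that the dictionary defines the hypothetical symbols comments_none, comments_one and comments_many in every language section.

```smarty
{* $count and the comments_* symbols are assumptions made for this example *}
{config_load file='dictionary.cfg' section=$lang}
{if $count == 0}
  {#comments_none#}
{elseif $count == 1}
  {#comments_one#}
{else}
  {$count} {#comments_many#}
{/if}
```

This only covers languages that distinguish one singular and one plural form, and that place the number before the noun. Languages with more plural forms need more tests and more symbols, which is the point where Gettext becomes the better choice.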

Finalizing

As you will probably have guessed by now, this blog uses translation dictionaries for the site and application templates, while the content on the pages other than the blog content pages is static. But this doesn't cover everything related to supporting native languages. One other major aspect is that we also need to support local variations in notation of numbers, dates, etc. Support for locales, as these are called, is the topic of the next episode in this series.
