- Multilingual websites and webapplications using PHP and Smarty, part 1: detecting languages and locales
- Multilingual websites and webapplications using PHP and Smarty, part 2: dictionary based templates
- Multilingual websites and webapplications using PHP and Smarty, part 3: locales
- Unintentional side effects of using microformats
- A Google Map with company logo and travel directions
Finding the right UTF-8 symbol
April 29, 2010
Occasionally, you might need a special symbol in HTML. Whether it's a simple arrow, a foreign currency symbol, or something more exotic, it's probably part of Unicode, the character set that UTF-8 encodes. Here's how to find the correct entity.
If you go back a little while in computing in general, and in building websites or manipulating documents in particular, you will remember the endless mess you got into when you wanted to include one or two 'special' characters. One of the problem spots was that fonts and their 8-bit encodings typically supported at most 256 different characters, and some of these were reserved for special purposes. This probably wasn't much of a problem if your only language was English, but other languages use many more characters. The way to use such a language was to switch to the right encoding, but that proved to be a source of trouble, for example when documents had to be shared with people who might not have fonts in the right encoding.
Unicode was supposed to be the answer to all these problems, because it provides room for many more characters in a single character set: 1,114,112 code points to be precise, all of which can be encoded in UTF-8. In Unicode, blocks have been assigned to specific purposes, such as the various alphabets in use in the world, mathematical symbols, and so on. For designers this provides a lot of possibilities, since many symbols can now be written as entities that are rendered as text in the current font, rather than as fixed-size images that don't scale or change color. One problem is how to find these symbols among the thousands that are available in Unicode. The good people at fileformat.info have gathered a very extensive reference of which symbols reside where in Unicode. The best starting point is the list of supported symbol blocks, from where you can browse the available symbols.
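Once you know the official Unicode name of a symbol, turning it into an HTML entity is mechanical: the entity is just the code point written as `&#x…;` (hexadecimal) or `&#…;` (decimal). The following is a minimal sketch in Python, using only the standard library. The helper name `html_entity` is made up for this illustration and is not part of any library.

```python
import html
import unicodedata


def html_entity(unicode_name: str, decimal: bool = False) -> str:
    """Return the numeric HTML entity for a character, given its Unicode name.

    Hypothetical helper for illustration only.
    """
    # unicodedata.lookup raises KeyError if the name is not a known character name.
    char = unicodedata.lookup(unicode_name)
    code_point = ord(char)
    if decimal:
        return f"&#{code_point};"
    return f"&#x{code_point:X};"


if __name__ == "__main__":
    for name in ("RIGHTWARDS ARROW", "EURO SIGN", "BLACK STAR"):
        entity = html_entity(name)
        # html.unescape turns the entity back into the character it stands for.
        print(f"{name}: {entity} renders as {html.unescape(entity)}")
```

The hexadecimal form is usually the more convenient one, since reference sites list code points in hexadecimal (for example U+2192 for the rightwards arrow).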
There is still one fly in this alphabet soup. There are thousands of fonts available with Unicode support, but many do not provide all the characters that Unicode defines. This is to be expected, since creating all the glyphs would require an enormous amount of work. The consequence is that if you use special symbols in your designs, you should verify that each one is supported by all font families you use in your design. You can tell that a symbol is not supported when it is represented by a little square with the hexadecimal character code in it (for example 0AE5). If your font size is small enough, these will look like little dominoes.
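If you do run into one of these little squares, the hexadecimal code inside it tells you which character the font is missing. As a rough sketch, again in Python with only the standard library, you can translate such a code back into a character and its Unicode name. The function name `describe_code_point` is an assumption made for this example, and the code only identifies the character; it does not check whether any particular font contains a glyph for it.

```python
import unicodedata


def describe_code_point(hex_code: str) -> tuple[str, str]:
    """Return the character and Unicode name for a hexadecimal code such as '2192'.

    Hypothetical helper for illustration only.
    """
    code_point = int(hex_code, 16)
    # chr raises ValueError for values outside the Unicode range (above 10FFFF).
    char = chr(code_point)
    # Code points without a name (unassigned, control characters) get a placeholder.
    name = unicodedata.name(char, "<no name: unassigned or control character>")
    return char, name


if __name__ == "__main__":
    char, name = describe_code_point("2192")
    print(f"U+2192 is {name}: {char}")
```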