Making your servers a bit more "green" with smart system administration, part 3: the right tool for the right job
September 26, 2009
"Green IT" can only work if you have a combined approach of optimization on all levels. System administration can play an important role. Even more important is choosing the right programs to solve your problem.
The line between administration and development is sometimes a bit fuzzy. Developers should at least be aware that their software needs to be deployed and might impact a system's performance. Administrators should know that the systems they administer are not there for their own sake: other people need to run certain software on them to get their work done. Ideally administrators and developers work together when troubleshooting programs that misbehave. Here at Loco we like to work right in the crossfire of administration and development, so this article will focus a bit more on development, using an example from the Loco hall of shame.
Choose the right software
Once we had to write a system to mangle real estate data. The feed we got was in XML format and needed to be parsed and put into a database so it could be displayed on a website. The file was a bit large, but not huge (70 megabytes). The data contained a lot of junk that did not conform to the specifications we were given, so a lot of cleanup was needed as well. The first script I wrote used the normal Python DOM implementation, together with some XSLT operations. The script ran for about 20 minutes and used quite a bit of memory and CPU, which was absolutely unacceptable. Switching to the Python minidom implementation (we did nothing spectacular DOM-wise) without rewriting any of the code cut execution time by about 75%. That made it OK for everyday use, although it left the nagging feeling that there was a suboptimal script running on the server.
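So by just switching libraries we saved quite a large amount of resources. In retrospect we would probably have written the script using a SAX parser, which processes the file as a stream instead of first building the whole document tree in memory.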
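To give an idea of what the SAX approach looks like, here is a minimal sketch using the xml.sax module from the Python standard library. It is not the script we ran: the element name "listing", the flat child elements and the on_listing callback are all made up for illustration, and the real feed needed far more cleanup.

```python
import io
import xml.sax


class ListingHandler(xml.sax.ContentHandler):
    """Builds one dict per <listing> element while the file streams past.

    Assumes (hypothetically) that each listing only has flat child
    elements such as <city> or <price>, without further nesting.
    """

    def __init__(self, on_listing):
        super().__init__()
        self.on_listing = on_listing  # callback, e.g. a database insert
        self.current = None           # the listing being built, if any
        self.field = None             # name of the child element being read
        self.chunks = []              # text fragments of that child element

    def startElement(self, name, attrs):
        if name == "listing":
            self.current = {}
        elif self.current is not None:
            self.field = name
            self.chunks = []

    def characters(self, content):
        # SAX may deliver the text of one element in several pieces.
        if self.field is not None:
            self.chunks.append(content)

    def endElement(self, name):
        if name == "listing":
            # Hand the finished record over and forget it again, so memory
            # use does not grow with the size of the feed.
            self.on_listing(self.current)
            self.current = None
        elif name == self.field:
            self.current[name] = "".join(self.chunks).strip()
            self.field = None


def parse_feed(source, on_listing):
    """Parse a file name or file object, calling on_listing per record."""
    xml.sax.parse(source, ListingHandler(on_listing))


if __name__ == "__main__":
    sample = io.BytesIO(
        b"<feed><listing><city> Utrecht </city><price>250000</price>"
        b"</listing></feed>"
    )
    parse_feed(sample, print)
```

The point of the design is that only one record lives in memory at a time, so the cleanup and the database insert can happen in the callback while the rest of the file is still being read.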
There are probably millions of programs on servers that do their job, but are written in a rather suboptimal way, for example by using an ill-suited data structure, or an algorithm that works really well with a small input but is disastrous when processing large amounts of data. Using more or faster hardware in such cases will only help a little.
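A small, made-up example of such a data structure choice: checking which incoming IDs are not yet known. Both functions below return the same result, but the first one scans a list for every lookup, so the work grows with the product of both input sizes, while the second uses a set with (on average) constant-time lookups. The function names and the ID data are hypothetical.

```python
def find_new_ids_slow(incoming, known):
    """Membership test against a list: every lookup scans the whole list."""
    known_list = list(known)
    return [i for i in incoming if i not in known_list]


def find_new_ids_fast(incoming, known):
    """Same result, but a set makes each lookup cheap on average."""
    known_set = set(known)
    return [i for i in incoming if i not in known_set]


if __name__ == "__main__":
    known = range(0, 1000, 2)       # the even numbers below 1000
    incoming = [1, 2, 3, 4, 999, 1000]
    print(find_new_ids_slow(incoming, known))
    print(find_new_ids_fast(incoming, known))
```

With a few hundred items you will not see a difference. With a few hundred thousand items on both sides the list version is the kind of script that faster hardware will not rescue.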
Don't do what you don't have to do
A program that we did not write, but merely sent in a patch for, is the generate_tiles.py script that Mapnik uses. generate_tiles.py is a Python script that uses the Mapnik library to query geographical databases and generate tiles for use in a digital map, like OpenStreetMap, or a tile layer for Google Maps. For every zoom level (typically 18 in total with Google Maps and OpenStreetMap) a set of tiles is generated. Within every zoom level there is a hierarchy as well. The directory path for a tile contains the zoom level and an x and y component, with the structure of the directory being zoomlevel/x/y/tile.png. Generating tiles is done using a few nested loops (one for all zoom levels, one for the x component and a final loop for the y component). Before the script writes a tile to the file system, it checks whether the directory already exists and, if not, creates it.
Originally these checks were done in the innermost loop. If you only generate tiles for the lowest zoom levels you will not notice, but at zoom level 18 you have several million tiles. This means several million checks whether the directory for zoom level 18 exists instead of just one. By moving the check to the outermost loop (the one for zoom levels) the number of stat operations went from possibly millions to 18 at most. The patch was trivial and made within a minute (although it won't be noticed in the overall performance of generate_tiles.py).
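The idea of the patch can be sketched as follows. This is not the actual generate_tiles.py code: the function names, the write_tile callback and the simplified layout (one directory per zoom level and per x, one file per y, the same x and y range at every zoom level) are assumptions made to keep the example short. The counter only exists to show how many times the file system is asked about a directory.

```python
import os


def ensure_dir(path, counter):
    """Create path if it is missing and count the existence check."""
    counter["checks"] += 1
    if not os.path.isdir(path):
        os.mkdir(path)


def generate_naive(base, zooms, xs, ys, write_tile):
    """Directory checks inside the innermost loop: repeated for every tile."""
    counter = {"checks": 0}
    for z in zooms:
        for x in xs:
            for y in ys:
                ensure_dir(os.path.join(base, str(z)), counter)
                ensure_dir(os.path.join(base, str(z), str(x)), counter)
                write_tile(os.path.join(base, str(z), str(x), "%d.png" % y))
    return counter["checks"]


def generate_hoisted(base, zooms, xs, ys, write_tile):
    """Same tiles, but each directory is checked in the loop that owns it."""
    counter = {"checks": 0}
    for z in zooms:
        zoom_dir = os.path.join(base, str(z))
        ensure_dir(zoom_dir, counter)       # once per zoom level
        for x in xs:
            x_dir = os.path.join(zoom_dir, str(x))
            ensure_dir(x_dir, counter)      # once per x column
            for y in ys:
                write_tile(os.path.join(x_dir, "%d.png" % y))
    return counter["checks"]
```

In this sketch the naive version performs two checks per tile, while the hoisted version performs one check per zoom level plus one per x column, no matter how many tiles each column contains.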
Choose the right hardware
If you really want to get the best out of your programs you need to take hardware into account. A lot of people don't do this and just buy something, but it is actually quite essential. Hardware for programs is like a pan for cooking: choose the wrong pan and your food might burn, or it might become as tender as an 80-year-old sole. Choose the wrong hardware and your application might underperform because memory, CPU, disk and network are not properly balanced: the program will not run, but wait for a resource to become available.
One time we were asked to write specifications for a server architecture which had to serve a lot of web pages in a short amount of time. The program was quite an advanced PHP program, with dynamic content from databases and a lot of static content. We clearly stated that we thought it would be wise to focus on more memory to increase performance. The manager from the other company had a soft spot for CPU performance, so when we got the machine for installation it had a really fast CPU, but just a measly 512 MB of RAM (this was in the autumn of 2007).
It helps to know whether your programs are CPU bound, memory bound, or (network) I/O bound. Profiling the application in various environments helps a lot to determine where you should focus first, because inevitably, once you have fixed a certain bottleneck and your program scales to another level, the next bottleneck will pop up.
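For Python programs a first profiling step could look like the sketch below, which uses the cProfile and pstats modules from the standard library. The helper profile_call and the workload build_report are hypothetical stand-ins. Note that this only shows in which functions the time is spent; whether the machine itself is short on memory, disk or network is something you would still check with the tools of your operating system.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    # Sort by cumulative time and keep only the ten most expensive entries.
    stats.sort_stats("cumulative").print_stats(10)
    return result, report.getvalue()


def build_report(n):
    """Stand-in workload (hypothetical): CPU-bound string building."""
    return ",".join(str(i * i) for i in range(n))


if __name__ == "__main__":
    _, text = profile_call(build_report, 100000)
    print(text)
```

Running the same profile again after each fix is the practical way to see which bottleneck pops up next.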
More performance tweaks
In a LAMP environment, like the one at Loco, a big performance bottleneck is the database, which will be discussed in a future blog article.