"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil."
--Donald Knuth
This chapter is about optimization and provides a set of general principles and profiling techniques. It gives the three rules of optimization every developer should be aware of and provides guidelines on optimization. Last, it focuses on how to find bottlenecks.
Optimization has a price, no matter what the results are. When a piece of code works, it is sometimes better to leave it alone than to try to make it faster at all costs. There are a few rules to keep in mind when doing any kind of optimization:
A very common mistake is to try to optimize the code while you are writing it. This is mostly pointless because the real bottlenecks are often located where you would have never thought they would be.
An application is usually composed of very complex interactions, and it is impossible to get a full picture of what is going on before it is really used.
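To illustrate how a bottleneck can hide in an unexpected place, here is a minimal sketch that uses the standard library's cProfile and pstats modules. The functions parse, deduplicate, main, and profile_report are hypothetical names invented for this example: a developer might suspect the parsing step, whereas the harmless-looking deduplication step is quadratic and is likely to dominate the run.

```python
import cProfile
import io
import pstats


def parse(lines):
    # Hypothetical step that a developer might suspect of being slow.
    return [line.split(",") for line in lines]


def deduplicate(rows):
    # Hypothetical step that looks harmless but is quadratic:
    # "row not in seen" scans the whole list on every iteration.
    seen = []
    for row in rows:
        if row not in seen:
            seen.append(row)
    return seen


def main():
    # Made-up input data, just large enough to make the cost visible.
    lines = ["%d,%d" % (i % 500, i % 7) for i in range(2000)]
    return deduplicate(parse(lines))


def profile_report(func, limit=5):
    """Run func under cProfile and return (result, report text)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


if __name__ == "__main__":
    rows, report = profile_report(main)
    print(report)
```

Reading the report, rather than guessing, is what tells you which of the two steps deserves attention. The exact timings depend on the machine and the Python version, so none are shown here.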
Of course, this is no excuse for writing a function or a method carelessly. You should still keep its complexity as low as possible and avoid useless repetition. But the first goal is to make it work, and optimization efforts should not hinder that goal.
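As a small illustration of lowering complexity without making the code harder to write, the sketch below compares two ways of finding the items common to two sequences. Both function names are hypothetical, and the second version assumes that the items are hashable.

```python
def common_items_slow(first, second):
    # Roughly O(len(first) * len(second)): each "in" test on a list
    # may scan the whole list.
    return [item for item in first if item in second]


def common_items_fast(first, second):
    # Roughly O(len(first) + len(second)): the set is built once, and
    # each membership test is then a hash lookup (constant time on
    # average). Assumes the items are hashable.
    lookup = set(second)
    return [item for item in first if item in lookup]


if __name__ == "__main__":
    print(common_items_slow([1, 2, 3, 4], [3, 4, 5]))
    print(common_items_fast([1, 2, 3, 4], [3, 4, 5]))
```

Both versions return the same result and are equally easy to get working. Choosing the second one is ordinary care about complexity, not premature optimization.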
For line-level code, the Python philosophy is that there should be one, and preferably only one, obvious way to do it. So, as long as you stick with a Pythonic syntax, described in Chapter 2, Syntax Best Practices – below the Class Level, and Chapter 3, Syntax Best Practices – above the Class Level, your code should be fine. Often, writing less code is better and faster than writing more code.
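The following sketch shows what "less code" can look like in practice. The two functions are hypothetical and compute the same list of squares. In CPython, the comprehension is usually at least as fast as the explicit loop, although the difference depends on the interpreter version and should be measured, not assumed.

```python
def squares_loop(numbers):
    # Index-based loop: more code, more names, more room for mistakes.
    result = []
    for i in range(len(numbers)):
        result.append(numbers[i] * numbers[i])
    return result


def squares_comprehension(numbers):
    # Pythonic equivalent: one expression that states the intent directly.
    return [n * n for n in numbers]


if __name__ == "__main__":
    print(squares_loop([1, 2, 3]))
    print(squares_comprehension([1, 2, 3]))
```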
Don't do any of these things until you have gotten your code working and you are ready to profile:
For very specialized areas, such as scientific calculation or games, the usage of specialized libraries and externalization might be unavoidable from the beginning. On the other hand, using libraries like NumPy might ease the development of specific features and produce simpler and faster code at the end. Furthermore, you should not rewrite a function if there is a good library that does it for you.
For instance, Soya 3D, which is a game engine on top of OpenGL (see http://home.gna.org/oomadness/en/soya3d/index.html), uses C and Pyrex for fast matrix operations when rendering real-time 3D.
I have seen teams working on optimizing the startup time of an application server that ran perfectly well once it was up. Once they had finished speeding it up, they promoted that work to their customers. They were a bit frustrated to notice that the customers didn't really care about it. This was because the speed-up work was motivated not by user feedback but by the developers' point of view. The people who built the system were launching the server multiple times every day, so the startup time meant a lot to them but not to their customers.
While making a program start faster is a good thing from an absolute point of view, teams should be careful to prioritize the optimization work and ask themselves the following questions:
Remember that optimization has a cost and that the developer's point of view is meaningless to customers, unless you are writing a framework or a library and the customer is a developer too.
Even if Python tries to make the common code patterns the fastest, optimization work might obfuscate your code and make it really hard to read. There's a balance to keep between producing readable, and therefore maintainable, code and defacing it in order to make it faster.
When you have reached 90% of your optimization objectives and the 10% left to be done makes your code completely unreadable, it might be a good idea to stop the work there or to look for other solutions.