Saving Time and Memory

"It's not the daily increase but daily decrease. Hack away at the unessential."
– Bruce Lee

I love this quote from Bruce Lee. He was such a wise man! The second part in particular, "hack away at the unessential", is to me what makes a computer program elegant. After all, if there is a better way of doing things so that we don't waste time or memory, why not use it?

Sometimes, there are valid reasons for not pushing our code to the limit: for example, to achieve a negligible improvement, we may have to sacrifice readability or maintainability. Does it make any sense to have a web page served in 1 second with unreadable, complicated code, when we can serve it in 1.05 seconds with readable, clean code? No, it makes no sense.

On the other hand, sometimes it's perfectly reasonable to try to shave off a millisecond from a function, especially when the function is meant to be called thousands of times. Every millisecond you save there means one second saved for every thousand calls, and this could be meaningful for your application.

In light of these considerations, the focus of this chapter will not be to give you the tools to push your code to the absolute limits of performance and optimization "no matter what," but rather, to enable you to write efficient, elegant code that reads well, runs fast, and doesn't waste resources in an obvious way.

In this chapter, we are going to cover the following:

  • The map, zip, and filter functions
  • Comprehensions
  • Generators

I will perform several measurements and comparisons, and cautiously draw some conclusions. Please do keep in mind that on a different box with a different setup or a different operating system, results may vary. Take a look at this code:

# squares.py
def square1(n):
    return n ** 2  # squaring through the power operator

def square2(n):
    return n * n  # squaring through multiplication

Both functions return the square of n, but which is faster? From a simple benchmark I ran on them, it looks like the second is slightly faster. If you think about it, it makes sense: calculating the power of a number involves multiplication and therefore, whatever algorithm you may use to perform the power operation, it's not likely to beat a simple multiplication such as the one in square2.
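A comparison like this can be sketched with the standard library's timeit module. This is only a minimal sketch and not necessarily how the benchmark mentioned above was run: the benchmark helper, the value being squared, and the repetition counts are all assumptions chosen for illustration, and the timings will differ from machine to machine.

```python
# benchmark_squares.py
import timeit


def square1(n):
    return n ** 2  # squaring through the power operator


def square2(n):
    return n * n  # squaring through multiplication


def benchmark(func, value=12345, number=200_000, repeat=5):
    """Hypothetical helper: best total time (seconds) over `repeat` runs.

    Each run calls func(value) `number` times. Taking the minimum
    reduces the noise caused by other processes on the machine.
    """
    timer = timeit.Timer(lambda: func(value))
    return min(timer.repeat(repeat=repeat, number=number))


if __name__ == "__main__":
    for func in (square1, square2):
        print(f"{func.__name__}: {benchmark(func):.4f}s")
```

The lambda wrapper adds the same small overhead to both measurements, so the comparison between the two functions stays fair even though the absolute numbers are slightly inflated.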

Do we care about this result? In most cases, no. If you're coding an e-commerce website, chances are you won't ever even need to raise a number to the second power, and if you do, it's likely to be a sporadic operation. You don't need to concern yourself with saving a fraction of a microsecond on a function you call a few times.

So, when does optimization become important? One very common case is when you have to deal with huge collections of data. If you're applying the same function on a million customer objects, then you want your function to be tuned up to its best. Gaining 1/10 of a second per call on a function called one million times saves you 100,000 seconds, which is about 27.8 hours. That's no longer negligible, right? So, let's focus on collections, and let's see which tools Python gives you to handle them with efficiency and grace.
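The arithmetic behind that figure can be checked in a few lines. The variable names below are just illustrative, and integers are used throughout to avoid floating point rounding.

```python
# savings.py
calls = 1_000_000          # number of times the function is called
saved_ms_per_call = 100    # 1/10 of a second, expressed in milliseconds

total_seconds = calls * saved_ms_per_call // 1000

# break the total down into hours, minutes, and seconds
hours, remainder = divmod(total_seconds, 3600)
minutes, seconds = divmod(remainder, 60)

print(f"{total_seconds:,} seconds saved")  # 100,000 seconds saved
print(f"{hours}h {minutes}m {seconds}s")   # 27h 46m 40s
```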

Many of the concepts we will see in this chapter are based on those of the iterator and iterable. Simply put, they describe the ability of an object to return its next element when asked, and to raise a StopIteration exception when exhausted. We'll see how to code custom iterator and iterable objects in Chapter 6, OOP, Decorators, and Iterators.
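Until then, the behavior can be observed with the built-in iter and next functions. The following is a minimal sketch that uses a plain list as the iterable; the variable names are only illustrative.

```python
# iterator.demo.py
letters = ["a", "b"]       # a list is an iterable
iterator = iter(letters)   # ask the iterable for an iterator

first = next(iterator)     # 'a'
second = next(iterator)    # 'b'

try:
    next(iterator)         # nothing left to return
    exhausted = False
except StopIteration:
    exhausted = True       # the iterator signals that it is exhausted

print(first, second, exhausted)  # a b True
```

This is, roughly, what a for loop does for you behind the scenes: it keeps asking for the next element and stops quietly when StopIteration is raised.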

Due to the nature of the objects we're going to explore in this chapter, I was often forced to wrap the code in a list constructor. This is because passing an iterator/generator to list(...) exhausts it and puts all the generated items in a newly created list, which I can easily print to show you its content. This technique hinders readability, so let me introduce an alias for list:

# alias.py
>>> range(7)
range(0, 7)
>>> list(range(7)) # put all elements in a list to view them
[0, 1, 2, 3, 4, 5, 6]
>>> _ = list # create an "alias" to list
>>> _(range(7)) # same as list(range(7))
[0, 1, 2, 3, 4, 5, 6]

There are three things to notice in this snippet: the first is the list(range(7)) call we need in order to show what would be generated by range(7); the second is the moment when I create the alias to list (I chose the hopefully unobtrusive underscore); and the third is the equivalent call, when I use the alias instead of list.

Hopefully readability will benefit from this, and please keep in mind that I will assume this alias to have been defined for all the code in this chapter.