Chapter 3. Functions

At the core of any programming language is the notion of functions, but we tend to take them for granted. Sure, there's the obvious fact that functions allow code to be encapsulated into individual units, which can be reused rather than being duplicated all over the place. But Python takes this beyond just the notion of code, with functions being full-fledged objects that can be passed around in data structures, wrapped up in other functions or replaced entirely by new implementations.

In fact, Python provides enough flexibility with functions that there are actually several different types of functions, reflecting the various forms of declaration and purposes. Understanding each of these types of functions will help you decide which is appropriate for each situation you encounter while working with your own code. This chapter will explain each of them in turn, as well as a variety of features you can take advantage of to extend the value of each function you create, regardless of its type.

At their core, all functions are essentially equal, regardless of which of the following sections they fit into. The built-in function type forms their basis, containing all the attributes necessary for Python to understand how to use them.

>>> def example():
...     pass
...
>>> type(example)
<class 'function'>
>>> example
<function example at 0x...>

Of course, there are still a number of different types of functions and as many different ways of declaring them. First off, let's examine one of the most universal aspects of functions.

Arguments

Most functions need to take some number of arguments in order to do anything useful. Normally, that means defining them in order in the function signature, then supplying them in the same order when calling that function later. Python supports that model, but also supports passing keyword arguments and even arguments that won't be known until the function is called.

One of the most common advantages of Python's keyword arguments is that you can pass arguments in a different order than the way they were defined in the function. You can even skip arguments entirely, as long as they have a default value defined. This flexibility helps encourage the use of functions that support lots of arguments with default values.
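As a quick illustration, here's a hypothetical function with a couple of default values; the names are made up purely for demonstration.

def connect(host, port=8080, timeout=30, secure=False):
    pass

# Keyword arguments can be supplied in any order, and any
# argument with a default value can be skipped entirely.
connect('example.com', secure=True, timeout=10)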

Planning for Flexibility

Argument planning is particularly important for functions intended to be called by someone who didn't write them, such as those in distributed applications. If you don't know the exact needs of the users who will eventually be using your code, it's best to move any assumptions you may have into arguments that can be overridden later.

As an extremely simple example, consider a function that assigns a prefix to a string:

def add_prefix(string):
    """Adds a 'pro_' prefix before the string provided."""
    return 'pro_' + string

The 'pro_' prefix here may make sense for the application it was written for, but what happens when anything else wants to use it? Right now, the prefix is hard-coded into the body of the function itself, so there's no available alternative. Moving that assumption into an argument makes for an easy way to customize the function later.

def add_prefix(string, prefix='pro_'):
    """Adds a prefix ('pro_' by default) before the string provided."""
    return prefix + string

The default function call—without the prefix argument—doesn't need to change, so existing code works just fine. The section on preloading arguments later in this chapter shows how even the prefix can be changed and still be used by code that doesn't know about it.
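Here's how both forms of the call behave:

>>> add_prefix('duction')
'pro_duction'
>>> add_prefix('fessional', prefix='pro')
'professional'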

Of course, this example is far too simple to provide much real-world value, but the functions illustrated throughout the rest of this book will take advantage of plenty of optional arguments, showing their value in each situation.

Variable Positional Arguments

Most functions are designed to work on a specific set of arguments, but some can handle any number of objects, acting on each in turn. These could be passed into a single argument as a tuple, list or other iterable, but that makes for an awkward call when the calling code already has the items as individual objects.

Take a typical shopping cart, for example. Adding items to the cart could be done one at a time or in batches. Here's how it could be done, using a standard argument.

class ShoppingCart:
    def add_to_cart(self, items):
        self.items.extend(items)

That would certainly do the trick, but now consider what that means for all the code that has to call it. The common case would be to add just a single item, but since the function always accepts a full list, it would end up looking something like this.

cart.add_to_cart([item])

So we've basically sabotaged the majority case in order to support the minority. Worse yet, if add_to_cart() originally supported just one item and was changed to support multiples, this syntax would break any existing calls, requiring you to rewrite them just to avoid a TypeError.

Ideally, the method should support the standard syntax for single arguments, while still supporting multiple arguments. By adding an asterisk before an argument name, you can specify that it should accept all positional arguments that didn't get assigned to anything before it. In this case, there are no other arguments, so variable positional arguments can make up the entire argument list.

def add_to_cart(self, *items):
    self.items.extend(items)

Now, the method can be called with any number of positional arguments, rather than having to group those arguments first into a tuple or list. The extra arguments are bundled up into a tuple automatically before the function starts executing. This cleans up the common case, while still enabling more arguments as needs require. Here are a few examples of how the method could be called.

cart.add_to_cart(item)
cart.add_to_cart(item1, item2)
cart.add_to_cart(item1, item2, item3, item4, item5)

There is still one more way to call this function that allows the calling code to support any number of items as well, but it's not specific to functions that are designed to accept variable arguments. See the section on invoking functions with variable arguments for all the details.

Variable Keyword Arguments

Other times, functions may need to take extra configuration options, particularly if passing those options to some other library further down the line. The obvious approach would be to accept a dictionary, which can map configuration names to their values.

class ShoppingCart:
    def __init__(self, options):
        self.options = options

Unfortunately, that ends up with a problem similar to the one we encountered with positional arguments described in the previous section. The simple case where you only override one or two values gets fairly complicated. Here are two ways the function call could look, depending on preference.

options = {'currency': 'USD'}
cart = ShoppingCart(options)

cart = ShoppingCart({'currency': 'USD'})

Of course, this approach doesn't scale any better than the list in the positional argument problem from the previous section, and it poses a similar compatibility risk: if the function was previously set up to accept explicit keyword arguments, switching to a single dictionary argument would break any existing calls.

Instead, Python offers the ability to use variable keyword arguments by adding two asterisks before the name of the argument that will accept them. This allows for the much friendlier keyword argument syntax, while still allowing for a fully dynamic function call.

def __init__(self, **options):
    self.options = options

Now consider what that same call from earlier would look like, given that the function now accepts arbitrary keyword arguments.

cart = ShoppingCart(currency='USD')

Warning

When working with variable arguments, there's one difference between positional and keyword arguments that can cause problems. Positional arguments are grouped into a tuple, which is immutable, while keyword arguments are placed into a dictionary, which is mutable. That property of dictionaries can be useful, but if you're not careful, you can accidentally lose data.
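As a minimal sketch of how that data loss can happen, consider popping known options out of the dictionary while processing it:

def __init__(self, **options):
    # pop() mutates the dictionary in place; once known options
    # are removed, anything that isn't stored somewhere is gone.
    self.currency = options.pop('currency', 'USD')
    self.options = options  # only the remaining entries survive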

Combining Different Kinds of Arguments

These options for variable arguments combine with the standard options, such as required and optional arguments. In order to make sure everything meshes nicely, Python has some fairly specific rules about how arguments are laid out in the function definition. There are only four types of arguments, listed here in the order they generally appear in functions.

  • Required arguments

  • Optional arguments

  • Variable positional arguments

  • Variable keyword arguments

Putting the required arguments first in the list ensures that positional arguments satisfy the required arguments prior to getting into the optional arguments. Variable arguments can only pick up values that didn't fit into anything else, so they naturally get defined at the end. Here's how this would look in a typical function definition.

def create_element(name, editable=True, *children, **attributes):

This same ordering can be used when calling functions as well, but it has one shortcoming. In this example, you'd have to supply a value for editable as a positional argument in order to pass in any children at all. It'd be better to be able to supply them right after the name, avoiding the optional editable argument entirely most of the time.

To support this, Python also allows variable positional arguments to be placed among standard arguments. Both required and optional arguments can be positioned after the variable argument, but they must then be passed by keyword. All the arguments are still available, but the less common ones become truly optional when they're not needed and explicitly named when they are.

An added feature of this behavior is that explicit arguments placed after variable positional arguments can still be required. The only real difference between the two types of placement is the requirement of using keyword arguments; whether the argument requires a value still depends on whether you define a default value.

>>> def join_with_prefix(prefix, *segments, delimiter):
...     return delimiter.join(prefix + segment for segment in segments)
...
>>> join_with_prefix('P', 'ro', 'ython')
Traceback (most recent call last):
  ...
TypeError: join_with_prefix() missing 1 required keyword-only argument: 'delimiter'
>>> join_with_prefix('P', 'ro', 'ython', ' ')
Traceback (most recent call last):
  ...
TypeError: join_with_prefix() missing 1 required keyword-only argument: 'delimiter'
>>> join_with_prefix('P', 'ro', 'ython', delimiter=' ')
'Pro Python'

Note

If you want to accept keyword-only arguments, but you don't have a good use for variable positional arguments, simply specify a single asterisk without an argument name. This will tell Python that everything after the asterisk is keyword-only, without also accepting potentially long sets of positional arguments. One caveat is that if you also accept variable keyword arguments, you must supply at least one explicit keyword argument. Otherwise, there's really no point in using the bare asterisk notation, and Python will raise a SyntaxError.
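Here's what that looks like in practice; the function name and options here are hypothetical.

def render(template, *, uppercase=False, strip=True):
    """Everything after the bare asterisk must be passed by keyword."""
    output = template.strip() if strip else template
    return output.upper() if uppercase else output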

In fact, remember that the ordering requirements of required and optional arguments are solely intended for the case of positional arguments. With the ability to define arguments as being keyword-only, you're now free to define them as required and optional in any order, without any complaints from Python. Ordering isn't important when calling the function, so it's also not important when defining the function. Consider rewriting the previous example to require the prefix as a keyword argument, while also making the delimiter optional.

>>> def join_with_prefix(*segments, delimiter=' ', prefix):
...     return delimiter.join(prefix + segment for segment in segments)
...
>>> join_with_prefix('ro', 'ython', prefix='P')
'Pro Python'

Warning

Be careful taking advantage of this level of flexibility because it's not very straightforward compared to how Python code is typically written. It's certainly possible, but it runs contrary to what most Python programmers will expect, which can make it difficult to maintain in the long run.

In all cases, though, variable keyword arguments must be positioned at the end of the list, after all other types of arguments.

Invoking Functions with Variable Arguments

In addition to being able to define arguments that can accept any number of values, the same syntax can be used to pass values into a function call. The big advantage to this is that it's not restricted to arguments that were defined to be variable in nature. Instead, you can pass variable arguments into any function, regardless of how it was defined.

The same asterisk notation is used to specify variable arguments, which are then expanded into a function call as if all the arguments were specified directly. A single asterisk specifies positional arguments, while two asterisks specify keyword arguments. This is especially useful when passing in the return value of a function call directly as an argument, without assigning it to individual variables first.

>>> value = 'ro ython'
>>> join_with_prefix(*value.split(' '), prefix='P')
'Pro Python'

This example seems obvious on its own, because it's a variable argument being passed in to a variable argument, but the same process works just fine on other types of functions as well. Since the arguments get expanded before getting passed to the function, it can be used with any function, regardless of how its arguments were specified. It can even be used with built-in functions and those defined by extensions written in C.
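The double-asterisk form works the same way for keyword arguments. Using the join_with_prefix() function defined earlier:

>>> options = {'delimiter': '-', 'prefix': 'P'}
>>> join_with_prefix('ro', 'ython', **options)
'Pro-Python'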

Note

In versions of Python before 3.5, you can only pass in one set of variable positional arguments and one set of variable keyword arguments in a function call; if you have two lists of positional arguments, you'll need to join them together yourself and pass the combined list into the function. As of Python 3.5, PEP 448 lifts that restriction, allowing multiple unpackings in a single call.
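Here's a quick sketch of both approaches, reusing join_with_prefix():

first = ['ro']
second = ['ython']

# Works on any version: combine the lists, then unpack once
join_with_prefix(*(first + second), prefix='P')

# Python 3.5 and later: unpack each list separately
join_with_prefix(*first, *second, prefix='P')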

Preloading Arguments

When you start adding a number of arguments to functions, many of which are optional, it becomes fairly common to know some of the argument values that will need to be passed, even if it's still long before the function will actually be called. Rather than having to pass in all the arguments at the time the call is made, it can be quite useful to apply some of those arguments in advance, so fewer can be applied later.

This concept is officially called partial application of the function, but the function doesn't get called at all yet, so it's really more a matter of preloading some of the arguments in advance. When the preloaded function is called later, any arguments passed along are added to those that were provided earlier.

This behavior is provided as part of the built-in functools module, by way of its partial() function. By passing in a callable and any number of positional and keyword arguments, it will return a new callable that can be used later to apply those arguments.

>>> import os
>>> def load_file(file, base_path='/', mode='rb'):
...     return open(os.path.join(base_path, file), mode)
...
>>> f = load_file('example.txt')
>>> f.mode
'rb'
>>> f.close()

>>> import functools
>>> load_writable = functools.partial(load_file, mode='w')
>>> f = load_writable('example.txt')
>>> f.mode
'w'
>>> f.close()

Note

Preloading arguments is specific to the partial() function shown here, but the broader technique of passing one function into another and getting a new function back is generally known as a decorator. Decorators, as you'll see later in this chapter, can perform any number of tasks when called; preloading arguments is just one example.

This is commonly used to customize a more flexible function into something simpler, so it can be passed into an API that doesn't know how to access that flexibility. By preloading the custom arguments beforehand, the code behind the API can call your function with the arguments it knows how to use, but all the arguments will still come into play.
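For instance, imagine a hypothetical API that expects an opener callable taking just a filename; partial() can adapt load_file() to fit that interface. The process_files() function here is made up purely for illustration.

import functools

def process_files(filenames, opener):
    # The API only knows how to call opener(filename)
    for filename in filenames:
        with opener(filename) as f:
            print(f.read())

opener = functools.partial(load_file, base_path='/tmp', mode='r')
process_files(['example.txt'], opener)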

Warning

When using functools.partial(), keyword arguments that were preloaded can be overridden simply by supplying new values in the later call, but arguments preloaded positionally can't be replaced: supplying the same argument again by keyword raises a TypeError for providing multiple values, just as it would in a single function call. For an alternative approach that addresses this issue, see the Decorators section of this chapter.
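To make the distinction concrete, assuming the load_file() and load_writable() examples from earlier:

>>> f = load_writable('example.txt', mode='a')  # keyword preloads can be overridden
>>> f.mode
'a'
>>> f.close()
>>> open_example = functools.partial(load_file, 'example.txt')
>>> open_example(file='other.txt')
Traceback (most recent call last):
  ...
TypeError: load_file() got multiple values for argument 'file'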

Introspection

Python is very transparent, allowing code to inspect many aspects of objects at run-time. Since functions are objects like any others, there are several things that your code can glean from them, including the argument specification. Obtaining a function's arguments directly requires going through a fairly complicated set of attributes that describe Python's bytecode structures, but thankfully Python also provides some functions to make it easier.

Many of Python's introspection features are available as part of the standard inspect module, with its getfullargspec() function being of use for function arguments. It accepts the function to be inspected and returns a named tuple of information about that function's arguments. The returned tuple contains values for every aspect of an argument specification.

  • args—A list of explicit argument names

  • varargs—The name of the variable positional argument

  • varkw—The name of the variable keyword argument

  • defaults—A tuple of default values for explicit arguments

  • kwonlyargs—A list of keyword-only argument names

  • kwonlydefaults—A dictionary of default values for keyword-only arguments

  • annotations—A dictionary of argument annotations, which will be explained later in this chapter

To better illustrate what values are present in each part of the tuple, here's how it maps out to a basic function declaration.

>>> def example(a:int, b=1, *c, d, e=2, **f) -> str:
...     pass
...
>>> import inspect
>>> inspect.getfullargspec(example)
FullArgSpec(args=['a', 'b'], varargs='c', varkw='f', defaults=(1,),
            kwonlyargs=['d', 'e'], kwonlydefaults={'e': 2},
            annotations={'a': <class 'int'>, 'return': <class 'str'>})

Example: Identifying Argument Values

Sometimes it can be useful to log what arguments a function will receive, regardless of which function it is or what its arguments look like. This behavior often comes into play in systems that generate argument lists based on something other than a Python function call. Some examples include instructions from a template language and regular expressions that parse text input.

Unfortunately, positional arguments present a bit of a problem because their values don't include the name of the argument they'll be sent to. Default values also pose a problem because the function call doesn't need to include any values at all. Since the log should include all the values that will be given to the function, both of these problems will need to be addressed.

First, the easy part. Any argument values passed by keyword don't need to be matched up with anything manually, since the argument names are provided right with the values. Rather than concerning ourselves with logging at the outset, let's start with a function to get all the arguments in a dictionary that can be logged. The function accepts a function, a tuple of positional arguments and a dictionary of keyword arguments.

def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """

    arguments = kwargs.copy()
    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'f': 4}

That really was easy. The function makes a copy of the keyword arguments instead of just returning it directly because we'll be adding entries to that dictionary soon enough. Next, we have to deal with positional arguments. The trick is to identify which argument names map to the positional argument values, so those values can be added to the dictionary with the appropriate names. This is where inspect.getfullargspec() comes into play, using zip() to do the heavy lifting.

import inspect
def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """

    arguments = kwargs.copy()
    spec = inspect.getfullargspec(func)
    arguments.update(zip(spec.args, args))

    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'a': 1, 'f': 4}

Now that the positional arguments have been dealt with, let's move on to figuring out default values. If there are any default values that weren't overridden by the arguments provided, the defaults should be added to the argument dictionary, since they will be sent to the function.

import inspect

def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """

    arguments = kwargs.copy()
    spec = inspect.getfullargspec(func)
    arguments.update(zip(spec.args, args))

    if spec.defaults:
        for i, name in enumerate(spec.args[-len(spec.defaults):]):
            if name not in arguments:
                arguments[name] = spec.defaults[i]

    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'a': 1, 'b': 1, 'f': 4}

Since optional arguments must come after required arguments, this addition uses the size of the defaults tuple to determine the names of the optional arguments. Looping over them, it then assigns only those values that weren't already provided. Unfortunately, this is only half of the default value situation. Because keyword-only arguments can take default values as well, getfullargspec() returns a separate dictionary for those values.

import inspect

def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """
    arguments = kwargs.copy()
    spec = inspect.getfullargspec(func)
    arguments.update(zip(spec.args, args))

    if spec.defaults:
        for i, name in enumerate(spec.args[-len(spec.defaults):]):
            if name not in arguments:
                arguments[name] = spec.defaults[i]

    if spec.kwonlydefaults:
        for name, value in spec.kwonlydefaults.items():
            if name not in arguments:
                arguments[name] = value

    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'a': 1, 'b': 1, 'e': 2, 'f': 4}

Since default values for keyword-only arguments also come in dictionary form, it's much easier to apply those because the argument names are known in advance. With that in place, get_arguments() can produce a more complete dictionary of arguments that will be passed to the function. Unfortunately, because this returns a dictionary and variable positional arguments have no names, there's no way to add them to the dictionary. This limits its usefulness a bit, but it's still valid for a great many function definitions.

Example: A More Concise Version

The previous example is certainly functional, but it's a bit more code than is really necessary. In particular, it takes a fair amount of work supplying default values when explicit values aren't provided. That's not very intuitive, though, because we usually think about default values the other way around: they're provided first, then overridden by explicit arguments.

The get_arguments() function can be rewritten with that in mind by bringing the default values out of the function declaration first, before replacing them with any values passed in as actual arguments. This avoids a lot of the checks that have to be made to make sure nothing gets overwritten accidentally.

The first step is to get the default values out. Because the defaults and kwonlydefaults attributes of the argument specification are set to None if no default values were specified, we actually have to start by setting up an empty dictionary to update. Then, the default values for positional arguments can be added in.

Because this version only needs to update a dictionary, without regard for what might be in it already, it's a bit easier to use a different technique to get the positional defaults. Rather than using a complex slice that's fairly difficult to read, we can use a zip() similar to the one that matched up the explicit argument values. By first reversing the argument list and the default values, they still match up starting from the end.

import inspect

def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """

    arguments = {}
    spec = inspect.getfullargspec(func)

    if spec.defaults:
        arguments.update(zip(reversed(spec.args), reversed(spec.defaults)))

    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'b': 1}

Adding default values for keyword arguments is much easier because the argument specification already supplies them as a dictionary. We can just pass that straight into an update() of the argument dictionary and move on.

import inspect

def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """

    arguments = {}
    spec = inspect.getfullargspec(func)

    if spec.defaults:
        arguments.update(zip(reversed(spec.args), reversed(spec.defaults)))
    if spec.kwonlydefaults:
        arguments.update(spec.kwonlydefaults)

    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'b': 1, 'e': 2}

Now all that's left is to add in the explicit argument values that were passed in. The same techniques used in the earlier version of this function will work here, with the only exception being that keyword arguments are passed in an update() instead of being copied to form the argument dictionary in the first place.

import inspect

def get_arguments(func, args, kwargs):
    """
    Given a function and a set of arguments, return a dictionary
    of argument values that will be sent to the function.
    """

    arguments = {}
    spec = inspect.getfullargspec(func)

    if spec.defaults:
        arguments.update(zip(reversed(spec.args), reversed(spec.defaults)))
    if spec.kwonlydefaults:
        arguments.update(spec.kwonlydefaults)
    arguments.update(zip(spec.args, args))
    arguments.update(kwargs)

    return arguments

>>> get_arguments(example, (1,), {'f': 4})
{'a': 1, 'b': 1, 'e': 2, 'f': 4}

With that, we have a much more concise function that works the way we normally think of default argument values. This type of refactoring is fairly common after you get more familiar with the advanced techniques available to you. It's always useful to look over old code to see if there's an easier, more straightforward way to go about the task at hand. This will often make your code faster as well as more readable and maintainable going forward.

Example: Validating Arguments

Unfortunately, that doesn't mean that the arguments returned by get_arguments() are capable of being passed into the function without errors. As it stands, get_arguments() assumes that any keyword arguments supplied are in fact valid arguments for the function, but that isn't always the case. In addition, any required arguments that didn't get a value will cause an error when the function is called. Ideally, we should be able to validate the arguments as well.

We can start with get_arguments(), so we have a dictionary of all the values that will be passed to the function, then we have two validation tasks: make sure all arguments have values and make sure no arguments were provided that the function doesn't know about. The function itself may impose additional requirements on the argument values, but as a generic utility, we can't make any assumptions about the content of any of the provided values.

Let's start off with making sure all the necessary values were provided. We don't have to worry as much about required or optional arguments this time around, since get_arguments() already makes sure optional arguments have their default values. Any argument left without a value is therefore required.

import inspect

def validate_arguments(func, args, kwargs):
    """
    Given a function and its arguments, return a dictionary
    with any errors that are posed by the given arguments.
    """

    arguments = get_arguments(func, args, kwargs)
    spec = inspect.getfullargspec(func)
    declared_args = spec.args[:]
    declared_args.extend(spec.kwonlyargs)
    errors = {}

    for name in declared_args:
        if name not in arguments:
            errors[name] = "Required argument not provided."

    return errors

With the basics in place to validate that all required arguments have values, the next step is to make sure the function knows how to deal with all the arguments that were provided. Any arguments passed in that aren't defined in the function should be considered an error, unless the function also accepts variable keyword arguments, which will soak up anything extra.

import inspect

def validate_arguments(func, args, kwargs):
    """
    Given a function and its arguments, return a dictionary
    with any errors that are posed by the given arguments.
    """

    arguments = get_arguments(func, args, kwargs)
    spec = inspect.getfullargspec(func)
    declared_args = spec.args[:]
    declared_args.extend(spec.kwonlyargs)
    errors = {}

    for name in declared_args:
        if name not in arguments:
            errors[name] = "Required argument not provided."

    if not spec.varkw:
        # Unknown names are only a problem if the function
        # can't accept arbitrary keyword arguments.
        for name in arguments:
            if name not in declared_args:
                errors[name] = "Unknown argument provided."

    return errors
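Running it against the example() function from earlier shows the missing keyword-only argument; note that f=4 isn't flagged, because example() accepts variable keyword arguments.

>>> validate_arguments(example, (1,), {'f': 4})
{'d': 'Required argument not provided.'}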

Of course, because this relies on get_arguments(), it inherits the same limitation of variable positional arguments. This means validate_arguments() may sometimes return an incomplete dictionary of errors. Variable positional arguments present an additional challenge that can't be addressed with this function. A more comprehensive solution is provided in the section on function annotations.

Decorators

When dealing with a large codebase, it's very common to have a set of tasks that need to be performed by many different functions, usually before or after doing something more specific to the function at hand. The nature of these tasks is as varied as the projects that use them, but here are some of the more common examples of where decorators are used.

  • Access control

  • Cleanup of temporary objects

  • Error handling

  • Caching

  • Logging

In all of these cases, there's some boilerplate code that needs to be executed before or after what the function's really trying to do. Rather than copying that code out into each function, it'd be better if it could be written once and simply applied to each function that needs it. This is where decorators come in.

Technically, decorators are just simple functions designed with one purpose: accept a function and return a function. The function returned can be the same as the one passed in or it could be completely replaced by something else along the way. The most common way to apply a decorator is using a special syntax designed just for this purpose. Here's how you could apply a decorator designed to suppress any errors during the execution of a function.

import datetime
from myapp import suppress_errors

@suppress_errors
def log_error(message, log_file='errors.log'):
    """Log an error message to a file."""

    with open(log_file, 'a') as log:
        log.write('%s\t%s\n' % (datetime.datetime.now(), message))

This syntax tells Python to pass the log_error() function as an argument to the suppress_errors() function, which then returns a replacement to use instead. It's easier to understand what happens behind the scenes by examining the process used in older versions of Python, before the @ syntax was introduced in Python 2.4.

import datetime
from myapp import suppress_errors

def log_error(message, log_file='errors.log'):
    """Log an error message to a file."""

    with open(log_file, 'a') as log:
        log.write('%s\t%s\n' % (datetime.datetime.now(), message))

log_error = suppress_errors(log_error)

The older option is still available and behaves identically to the @ syntax. The only real difference is that the @ syntax is only available when defining the function in the source file. If you want to decorate a function that was imported from elsewhere, you'll have to pass it into the decorator manually, so it's important to remember both ways it can work.

from myapp import log_error, suppress_errors

log_error = suppress_errors(log_error)

To understand what commonly goes on inside decorators like log_error(), it's first necessary to examine one of the most misunderstood and underutilized features of Python—and many other languages as well—closures.

Closures

Despite their usefulness, closures can seem to be an intimidating topic. Most explanations assume prior knowledge of things like lexical scope, free variables, upvalues and variable extent. Also, because so much can be done without ever learning about closures, the topic often seems mysterious and magical, as if it's the domain of experts, unsuitable for the rest of us. Thankfully, closures really aren't as difficult to understand as the terminology may suggest.

In a nutshell, a closure is a function that's defined inside another function, but is then passed outside that function where it can be used by other code. There are some other details to learn as well, but it's still fairly abstract at this point, so here's a simple example of a closure.

def multiply_by(factor):
    """Return a function that multiplies values by the given factor"""

    def multiply(value):
        """Multiply the given value by the factor already provided"""

        return value * factor

    return multiply

As you can see, when you call multiply_by() with a value to use as a multiplication factor, the inner multiply() gets returned to be used later on. Here's how it would actually be used, which may help explain why this is useful.

>>> times2 = multiply_by(2)
>>> times2(5)
10
>>> times2(10)
20
>>> times3 = multiply_by(3)
>>> times3(5)
15
>>> times2(times3(5))
30

This behavior looks a bit like the argument preloading feature of functools.partial(), but here there's no need for a single function that accepts both arguments at once. The interesting part about how this works is that the inner function doesn't need to accept a factor argument of its own; it essentially inherits that argument from the outer function.

The fact that an inner function can reference the values of an outer function often seems perfectly normal when looking at the code, but there are a couple of rules about how it works that might not be as obvious. First, the inner function must be defined within the outer function; simply passing in a function as an argument won't work.

import functools

def multiply(value):
    return value * factor

def custom_operator(func, factor):
    return func

multiply_by = functools.partial(custom_operator, multiply)

On the surface, this looks mostly equivalent to the working example shown previously, but with the added benefit of being able to provide a callable at run-time. After all, the inner function gets placed inside the outer function and gets returned for use by other code. The problem is that closures only work if the inner function is actually defined inside the outer function, not just anything that gets passed in.

>>> times2 = multiply_by(2)
>>> times2(5)
Traceback (most recent call last):
  ...
NameError: name 'factor' is not defined

This almost contradicts the functionality of functools.partial(), which works much like the custom_operator() function described here, but remember that partial() accepts all the arguments at the same time as it accepts the callable to be bundled with them. It doesn't try to pull in any arguments from anywhere else.

Wrappers

Closures come into play heavily in the construction of wrappers, the most common use of decorators. Wrappers are functions designed to contain another function, adding some extra behavior before or after the wrapped function executes. In the context of the closure discussion, a wrapper is the inner function, while the wrapped function is passed in as an argument to the outer function. Here's the code behind the suppress_errors() decorator shown in the previous section.

def suppress_errors(func):
    """Automatically silence any errors that occur within a function"""

    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            pass

    return wrapper

There are a few things going on here, but most of them have already been covered. The decorator takes a function as its only argument, which isn't executed until the inner wrapper function executes. By returning the wrapper instead of the original function, we form a closure, which allows the same function handle to be used even after suppress_errors() is done.

Since the wrapper has to be called as if it were the original function, regardless of how that function was defined, it must accept all possible argument combinations. This is achieved by using variable positional and keyword arguments together and passing them straight into the original function internally. This is a very common practice with wrappers because it allows maximum flexibility, without caring what type of function it's applied to.

The actual work in the wrapper is quite simple: just execute the original function inside a try/except block to catch any errors that are raised. In the event of any errors, it just continues merrily along, implicitly returning None instead of doing anything interesting. It also makes sure to return any value returned by the original function, so that everything meaningful about the wrapped function is maintained.

In this case, the wrapper function is fairly simple, but the basic idea works for many more complex situations as well. There could be several lines of code both before and after the original function gets called, perhaps with some decisions about whether it gets called at all. Authorization wrappers, for instance, will typically return or raise an exception without ever calling the wrapped function, if the authorization failed for any reason.
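As a sketch of that idea (the names here are hypothetical, not a standard API):

import functools

def require_user(func):
    """Only call the wrapped function if a user was provided."""
    @functools.wraps(func)
    def wrapper(user, *args, **kwargs):
        if user is None:
            # Authorization failed, so the wrapped function never runs
            raise PermissionError('A user is required')
        return func(user, *args, **kwargs)
    return wrapper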

Unfortunately, wrapping a function means some potentially useful information is lost. Chapter 5 shows how Python has access to certain attributes of a function, such as its name, docstring and argument list. By replacing the original function with a wrapper, we've actually replaced all of that other information as well. In order to bring some of it back, we turn to a decorator in the functools module called wraps.

It may seem odd to use a decorator inside a decorator, but it really just solves the same problem as anything else: there's a common need that shouldn't require duplicate code everywhere it takes place. The functools.wraps() decorator copies the name, docstring and some other information over to the wrapped function, so at least some of it gets retained. It can't copy over the argument list, but it's better than nothing.

import functools

def suppress_errors(func):
    """Automatically silence any errors that occur within a function"""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception:
            pass

    return wrapper

What may seem most odd about this construction is that functools.wraps() takes an argument besides the function it's applied to. In this case, that argument is the function to copy attributes from, which is specified on the line with the decorator itself. This is often useful for customizing decorators for specific tasks, so next we'll examine how to take advantage of custom arguments in your own decorators.

Decorators with Arguments

Ordinarily, decorators only take a single argument, the function to be decorated. Behind the scenes, though, Python evaluates the @ line as an expression before applying it as a decorator. The result of the expression is what's actually used as a decorator. In the simple case, the decorator expression is just a single function, so it evaluates easily. Adding arguments in the form used by functools.wraps() makes the whole statement evaluate like this.

wrapper = functools.wraps(func)(wrapper)

Looking at it this way, the solution becomes clear: one function returns another. The first function accepts the extra arguments and returns another function, which is used as the decorator. This makes implementing arguments on a decorator more complex because it adds another layer to the whole process, but it's easy to deal with once you see it in context. Here's how everything works together in the longest chain you're likely to see.

  • A function to accept and validate arguments

  • A decorator to accept a user-defined function

  • A wrapper to add extra behavior

  • The original function that was decorated

Not all of that will happen for every decorator, but that's the general approach of the most complex scenarios. Anything more complicated is simply an expansion of one of those four steps. As you'll notice, three of the four have already been covered, so the extra layer imposed by decorator arguments is really the only thing left to discuss.

This new outermost function accepts all the arguments for the decorator, optionally validates them and returns a new function as a closure over the argument variables. That new function must take a single argument, functioning as the decorator. Here's how the suppress_errors() decorator might look if it instead accepted a logger function to report the errors to, rather than completely silencing them.

import functools

def suppress_errors(log_func=None):
    """Automatically silence any errors that occur within a function"""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as e:
                if log_func is not None:
                    log_func(str(e))

        return wrapper

    return decorator

This layering allows suppress_errors() to accept arguments prior to being used as a decorator, but it removes the ability to call it without any arguments. Since that was the previous behavior, we've now introduced a backward incompatibility. The closest we can get to the original syntax is to actually call suppress_errors() first, but without any arguments.

Here's an example function that processes updated files in a given directory. This is a task that's often performed on an automated schedule, so that if something goes wrong, it can just stop running and try again at the next appointed time.

import datetime
import os
import time
from myapp import suppress_errors

@suppress_errors()
def process_updated_files(directory, process, since=None):
    """
    Processes any new files in a `directory` using the `process` function.
    If provided, `since` is a date after which files are considered updated.

    The process function passed in must accept a single argument: the absolute
    path to the file that needs to be processed.
    """

    if since is not None:
        # Get a threshold that we can compare to the modification time later
        threshold = time.mktime(since.timetuple()) + since.microsecond / 1000000
    else:
        threshold = 0

    for filename in os.listdir(directory):
        path = os.path.abspath(os.path.join(directory, filename))
        if os.stat(path).st_mtime > threshold:
            process(path)

Unfortunately, this is still a strange situation to end up with, and it really doesn't look like anything Python programmers are used to. Clearly we need a better solution.

Decorators with—or without—Arguments

Ideally, a decorator with optional arguments would be able to be called without parentheses if no arguments are provided, while still being able to provide the arguments when necessary. This means supporting two different flows in a single decorator, which can get tricky if you're not careful. The main problem is that the outermost function must be able to accept arbitrary arguments or a single function, and it must be able to tell the difference between the two and behave accordingly.

That brings us to the first task: determining which flow to use when the outer function is called. One option would be to inspect the first positional argument and check to see if it's a function, since decorators always receive the function as a positional argument. But since things like functools.wraps() accept a function as a non-decorator argument, that method falls apart pretty quickly.

Interestingly, a pretty good distinction can be made based on something mentioned briefly in the previous paragraph. Decorators always receive the decorated function as a positional argument, so we can use that as its distinguishing factor. For all other arguments, we can instead rely on keyword arguments, which are generally more explicit anyway, thus making it more readable as well.

We could do this by way of using *args and **kwargs, but since we know the positional argument list is just a fixed single argument, it's easier to just make that the first argument and make it optional. Then, any additional keyword arguments can be placed after it. They'll all need default values, of course, but the whole point here is that all arguments are optional, so that's not a problem.

With the argument distinction squared away, all that's left is to branch into a different code block if arguments are provided, rather than a function to be decorated. By having an optional first positional argument, we can simply test for its presence to determine which branch to go through.

import functools

def suppress_errors(func=None, log_func=None):
    """Automatically silence any errors that occur within a function"""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception as e:
                if log_func is not None:
                    log_func(str(e))

        return wrapper

    if func is None:
        return decorator
    else:
        return decorator(func)

This now allows suppress_errors() to be called with or without arguments, but it's still important to remember that the arguments must be passed with keywords; an argument passed positionally looks identical to a function being decorated, and there's no way to tell the difference by examining the values themselves.
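For reference, here's what both supported forms look like when used correctly; the function names are placeholders.

@suppress_errors
def update_catalog():
    pass  # errors are silently swallowed

@suppress_errors(log_func=print)
def update_inventory():
    pass  # errors are passed to print() as messages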

If a logger function is provided as a positional argument, the decorator will assume it's the function to be decorated, so it'll actually execute the logger immediately, with the function to be decorated as its argument. In essence, you'll end up logging the function you wanted to decorate. Worse yet, the value you're left with after decorating the function is actually the return value from the logger, not a decorated function. Since most loggers don't return anything, it'll probably be None—that's right, your function has vanished.

>>> def print_logger(message):
...     print(message)
...
>>> @suppress_errors(print_logger)
... def example():
...     return variable_which_does_not_exist
...
<function example at 0x...>
>>> example
>>>

This is a side-effect of the way the decorator works, and there's little to be done besides documenting it and making sure you always specify keywords when applying arguments.

Example: Memoization

To demonstrate how decorators can copy out common behavior into any function you like, consider what could be done to improve the efficiency of deterministic functions. Deterministic functions always return the same result given the same set of arguments, no matter how many times they're called. Given such a function, it should be possible to cache the results of a given function call, so if it's called with the same arguments again, the result can be looked up without having to call the function again.

Using a cache, a decorator can store the result of a function using the argument list as its key. Dictionaries can't be used as keys in a dictionary, so only positional arguments can be taken into account when populating the cache. Thankfully, most functions that would take advantage of memoization are simple mathematical operations, which are typically called with positional arguments anyway.

def memoize(func):
    """
    Cache the results of the function so it doesn't need to be called
    again, if the same arguments are provided a second time.
    """
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        if args in cache:
            return cache[args]

        # This line is for demonstration only.
        # Remove it before using it for real.
        print('Calling %s()' % func.__name__)

        result = func(*args)
        cache[args] = result
        return result

    return wrapper

Now, whenever you define a deterministic function, you can use the memoize() decorator to automatically cache its result for future use. Here's how it would work for some simple mathematical operations.

>>> @memoize
... def multiply(x, y):
...     return x * y
...
>>> multiply(6, 7)
Calling multiply()
42
>>> multiply(6, 7)
42
>>> multiply(4, 3)
Calling multiply()
12
>>> @memoize
... def factorial(x):
...     result = 1
...     for i in range(x):
...         result *= i + 1
...     return result
...
>>> factorial(5)
Calling factorial()
120
>>> factorial(5)
120
>>> factorial(7)
Calling factorial()
5040

Warning

Memoization is best suited for functions with a few arguments, which are called with relatively few variations in the argument values. Functions that are called with a large number of arguments or have a lot of variety in the argument values that are used will quickly fill up a lot of memory with the cache. This can slow down the entire system, with the only benefit being the minority of cases where arguments are reused. Also, functions that aren't truly deterministic will actually cause problems because the function won't be called every time.
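If memoization is what you're after in practice, note that the standard library ships a bounded variant: functools.lru_cache() caps the cache at a maximum size and evicts the least recently used entries, which addresses the memory concern above. Here's the factorial() example again using it:

import functools

@functools.lru_cache(maxsize=128)
def factorial(x):
    result = 1
    for i in range(x):
        result *= i + 1
    return result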

Example: A Decorator to Create Decorators

Astute readers will have noticed something of a contradiction in the descriptions of the more complex decorator constructs. The purpose of decorators is to avoid a lot of boilerplate code and simplify functions, but the decorators themselves end up getting quite complicated just to support features like optional arguments. Ideally, we could put that boilerplate into a decorator as well, simplifying the process for new decorators.

Since decorators are Python functions, just like those they decorate, this is quite possible. As with the other situations, though, there's something that needs to be taken into account. In this case, the function you define as a decorator will need to distinguish between the arguments meant for the decorator and those meant for the function it decorates.

def decorator(declared_decorator):
    """Create a decorator out of a function, which will be used as a wrapper."""

    @functools.wraps(declared_decorator)
    def final_decorator(func=None, **kwargs):
        # This will be exposed to the rest
        # of your application as a decorator

        def decorated(func):
            # This will be exposed to the rest
            # of your application as a decorated
            # function, regardless how it was called
            @functools.wraps(func)
            def wrapper(*a, **kw):
                # This is used when actually executing
                # the function that was decorated
                return declared_decorator(func, a, kw, **kwargs)

            return wrapper

        if func is None:
            # The decorator was called with arguments,
            # rather than a function to decorate
            return decorated
        else:
            # The decorator was called without arguments,
            # so the function should be decorated immediately
            return decorated(func)

    return final_decorator

With this in place, you can define your decorators in terms of the wrapper function directly; then just apply this decorator to manage all the overhead behind the scenes. Your declared functions must always accept three arguments now, with any additional arguments added on beyond that. The three required arguments are shown in the following list.

  • The function that will be decorated, which should be called if appropriate

  • A tuple of positional arguments that were supplied to the decorated function

  • A dictionary of keyword arguments that were supplied to the decorated function

With these arguments in mind, here's how you might define the suppress_errors() decorator described previously in this chapter.

>>> @decorator
... def suppress_errors(func, args, kwargs, log_func=None):
...     try:
...         return func(*args, **kwargs)
...     except Exception as e:
...         if log_func is not None:
...             log_func(str(e))
...
>>> @suppress_errors
... def example():
...     return variable_which_does_not_exist
...
>>> example() # Doesn't raise any errors
>>> def print_logger(message):
...     print(message)
...
>>> @suppress_errors(log_func=print_logger)
... def example():
...     return variable_which_does_not_exist
...
>>> example()
name 'variable_which_does_not_exist' is not defined

Function Annotations

There are typically three aspects of a function that don't deal with the code within it: a name, a set of arguments and an optional docstring. Sometimes, though, that's not quite enough to fully describe how the function works or how it should be used. Static-typed languages—like Java, for example—also include details about what type of values are allowed for each of the arguments, as well as what type can be expected for the return value.

Python's response to this need is the concept of function annotations. Each argument, as well as the return value, can have an expression attached to it, which describes a detail that can't be conveyed otherwise. This could be as simple as a type, such as int or str, which is analogous to static-typed languages, as shown in the following example.

def prepend_rows(rows:list, prefix:str) -> list:
    return [prefix + row for row in rows]

The biggest difference between this example and traditional static-typed languages isn't a matter of syntax; it's that in Python, annotations can be any expression, not just a type or a class. You could annotate your arguments with descriptive strings, calculated values or even inline functions—see this chapter's section on lambdas for details. Here's what the previous example might look like if annotated with strings as additional documentation.

def prepend_rows(rows:"a list of strings to add to the prefix",
                 prefix:"a string to prepend to each row provided",
                 ) -> "a new list of strings prepended with the prefix":
    return [prefix + row for row in rows]

Of course, this flexibility might make you wonder about the "intended" use for function annotations, but there isn't one, and that's deliberate. Officially, the intent behind annotations is to encourage experimentation in frameworks and other third-party libraries. The two examples shown here could be valid for use with type checking and documentation libraries, respectively.

Example: Type Safety

To illustrate how annotations can be used by a library, consider a basic implementation of a type safety library that can understand and utilize the function described previously. It would expect argument annotations to specify a valid type for any incoming arguments, while the return annotation would be able to validate the value returned by the function.

Since type safety involves verifying values before and after the function is executed, a decorator is the most suitable option for the implementation. Also, since all the type hinting information is provided in the function declaration, we don't need to worry about any additional arguments, so a simple decorator will suffice. The first task, though, is to validate the annotations themselves, since they must be valid Python types in order for the rest of the decorator to work properly.

import inspect

def typesafe(func):
    """
    Verify that the function is called with the right argument types and
    that it returns a value of the right type, according to its annotations
    """

    spec = inspect.getfullargspec(func)

    for name, annotation in spec.annotations.items():
        if not isinstance(annotation, type):
            raise TypeError("The annotation for '%s' is not a type." % name)

    return func

So far, this doesn't do anything to the function, but it does check to see that each annotation provided is a valid type, which can then be used to verify the type of the arguments referenced by the annotations. This uses isinstance(), which compares an object to the type it's expected to be. More information on isinstance() and on types and classes in general can be found in Chapter 4.
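Even at this stage, the decorator catches mistakes as soon as the function is defined, rather than waiting for it to be called. Here's a brief, hypothetical session showing what happens when an annotation isn't a type.

>>> @typesafe
... def example(a: 'not a type'):
...     pass
...
Traceback (most recent call last):
  ...
TypeError: The annotation for 'a' is not a type.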

Now that we can be sure all the annotations are valid, it's time to start validating some arguments. Given how many kinds of arguments there are, let's take them one at a time. Keyword arguments are the easiest to start with, since they already come with their names and values tied together, so that's one less thing to worry about. With a name, we can get the associated annotation and validate the value against it. This would also be a good time to start factoring some code out, since we'll end up reusing several pieces before we're done. Here's how the wrapper looks to begin with.

import functools
import inspect

def typesafe(func):
    """
    Verify that the function is called with the right argument types and
    that it returns a value of the right type, according to its annotations
    """

    spec = inspect.getfullargspec(func)
    annotations = spec.annotations

    for name, annotation in annotations.items():
        if not isinstance(annotation, type):
            raise TypeError("The annotation for '%s' is not a type." % name)

    error = "Wrong type for %s: expected %s, got %s."

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Deal with keyword arguments
        for name, arg in kwargs.items():
            if name in annotations and not isinstance(arg, annotations[name]):
                raise TypeError(error % (name,
                                         annotations[name].__name__,
                                         type(arg).__name__))

        return func(*args, **kwargs)
    return wrapper

By now, this should be fairly self-explanatory. Any keyword arguments provided will be checked to see if there's an associated annotation. If there is, the provided value is checked to make sure it's an instance of the type found in the annotation. The error message is factored out because it'll get reused a few more times before we're done.
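To see the effect so far, here's a quick session with a hypothetical example() function; keyword arguments are validated, while positional arguments still slip through unchecked.

>>> @typesafe
... def example(a: int):
...     return a
...
>>> example(a='spam')
Traceback (most recent call last):
  ...
TypeError: Wrong type for a: expected int, got str.
>>> example('spam')  # Positional arguments aren't checked yet
'spam'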

Next up is dealing with positional arguments. Once again, we can rely on zip() to line up the positional argument names with the values that were provided. Since zip() produces the same kind of two-item tuples as the items() method of dictionaries, we can use chain() from the itertools module to link them together into a single loop.
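To see why that works, notice that both sources yield name/value pairs, so chain() can treat them as one continuous stream. Here's a small illustrative session, using hypothetical names and values.

>>> from itertools import chain
>>> names = ['a', 'b']
>>> positional = (1, 2)
>>> keywords = {'c': 3}
>>> list(chain(zip(names, positional), keywords.items()))
[('a', 1), ('b', 2), ('c', 3)]

With that in mind, extending the decorator takes just a change to that one loop.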

import functools
import inspect
from itertools import chain

def typesafe(func):
    """
    Verify that the function is called with the right argument types and
    that it returns a value of the right type, according to its annotations
    """
    spec = inspect.getfullargspec(func)
    annotations = spec.annotations

    for name, annotation in annotations.items():
        if not isinstance(annotation, type):
            raise TypeError("The annotation for '%s' is not a type." % name)

    error = "Wrong type for %s: expected %s, got %s."

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Deal with positional and keyword arguments
        for name, arg in chain(zip(spec.args, args), kwargs.items()):
            if name in annotations and not isinstance(arg, annotations[name]):
                raise TypeError(error % (name,
                                         annotations[name].__name__,
                                         type(arg).__name__))

        return func(*args, **kwargs)
    return wrapper

Even though that takes care of both positional and keyword arguments, it's not everything. Since variable arguments can also accept annotations, we have to account for argument values that don't line up as nicely with defined argument names. Unfortunately, there's something else that must be dealt with before we can do much of anything on that front.

If you're paying really close attention, you might notice a very subtle bug in the code as it stands. In order to make the code a bit easier to follow and to account for any arguments that are passed by keyword, the wrapper iterates over the kwargs dictionary in its entirety, checking for associated annotations. Unfortunately, that leaves us with the possibility of an unintentional name conflict.

To illustrate how the bug could be triggered, first consider what would be expected when dealing with variable arguments. Since we can only apply a single annotation to the variable argument name itself, that annotation must be assumed to apply to all arguments that fall under that variable argument, whether passed positionally or by keyword. Without explicit support for that behavior yet, variable arguments should just be ignored, but here's what happens with the code as it stands.

>>> @typesafe
... def example(*args:int, **kwargs:str):
...    pass
...
>>> example(spam='eggs')
>>> example(kwargs='spam')
>>> example(args='spam')
Traceback (most recent call last):
  ...
TypeError: Wrong type for args: expected int, got str.

Interestingly, everything works fine unless the function call includes a keyword argument with the same name as the variable positional argument. Though it may not seem obvious at first, the problem is actually in the set of values to iterate over in the wrapper's only loop. It assumes that the names of all the keyword arguments line up nicely with annotations.

Basically, the problem is that keyword arguments that are meant for the variable argument end up getting matched up with annotations from other arguments. For the most part, this is acceptable because two of the three types of arguments won't ever cause problems. Matching it with an explicit argument name simply duplicates what Python already does, so using the associated annotation is fine, and matching the variable keyword argument name ends up using the same annotation that we were planning on using anyway.

So the problem only crops up when a keyword argument matches the variable positional argument name because that association never makes sense. Sometimes if the annotation is the same as that of the variable keyword argument, the problem might never show up, but it's still there, regardless. Since the code for the wrapper function is still fairly minimal, it's not too difficult to see where the problem is occurring.

In the main loop, the second part of the iteration chain is the list of items in the kwargs dictionary. That means everything passed in by keyword is checked against named annotations, which clearly isn't always what we want. Instead, we only want to loop over the explicit arguments at this point, while still supporting both positional and keyword arguments. That means we'll have to construct a new dictionary based on the function definition, rather than taking the easy way out and relying on kwargs, as we are now. The outer typesafe() function has been removed from the listing here to make the code easier to digest in print.

def wrapper(*args, **kwargs):
    # Populate a dictionary of explicit arguments passed positionally
    explicit_args = dict(zip(spec.args, args))

    # Add all explicit arguments passed by keyword
    for name in chain(spec.args, spec.kwonlyargs):
        if name in kwargs:
            explicit_args[name] = kwargs[name]

    # Deal with explicit arguments
    for name, arg in explicit_args.items():
        if name in annotations and not isinstance(arg, annotations[name]):
            raise TypeError(error % (name,
                                     annotations[name].__name__,
                                     type(arg).__name__))

    return func(*args, **kwargs)

With that bug out of the way, we can focus on properly supporting variable arguments. Since keyword arguments have names but positional arguments don't, we can't manage both types in one pass like we could with the explicit arguments. The processes are fairly similar to the explicit arguments, but the values to iterate over are different in each case. The biggest difference, though, is that the annotations aren't referenced by the name of the arguments.

In order to loop over just the truly variable positional arguments, we can simply use the number of explicit arguments as the beginning of a slice on the positional arguments tuple. This gets us all positional arguments provided after the explicit arguments, or an empty sequence if only explicit arguments were provided.
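A quick session shows how that slice behaves, using hypothetical values in place of the real function spec.

>>> arg_names = ['a', 'b']              # Stands in for spec.args
>>> args = ('one', 'two', 'extra1', 'extra2')
>>> args[len(arg_names):]               # Just the variable positional arguments
('extra1', 'extra2')
>>> ('one', 'two')[len(arg_names):]     # Nothing extra yields an empty tuple
()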

For keyword arguments, we have to be a bit more creative. Since the function already loops over all the explicitly declared arguments at the beginning, we can use that same loop to exclude any matching items from a copy of the kwargs dictionary. Then we can iterate over what's left over to account for all the variable keyword arguments.

def wrapper(*args, **kwargs):
    # Populate a dictionary of explicit arguments passed positionally
    explicit_args = dict(zip(spec.args, args))
    keyword_args = kwargs.copy()

    # Add all explicit arguments passed by keyword
    for name in chain(spec.args, spec.kwonlyargs):
        if name in kwargs:
            explicit_args[name] = keyword_args.pop(name)

    # Deal with explicit arguments
    for name, arg in explicit_args.items():
        if name in annotations and not isinstance(arg, annotations[name]):
            raise TypeError(error % (name,
                                     annotations[name].__name__,
                                     type(arg).__name__))

    # Deal with variable positional arguments
    if spec.varargs and spec.varargs in annotations:
        annotation = annotations[spec.varargs]
        for i, arg in enumerate(args[len(spec.args):]):
            if not isinstance(arg, annotation):
                raise TypeError(error % ('variable argument %s' % (i + 1),
                                         annotation.__name__,
                                         type(arg).__name__))

    # Deal with variable keyword arguments
    if spec.varkw and spec.varkw in annotations:
        annotation = annotations[spec.varkw]
        for name, arg in keyword_args.items():
            if not isinstance(arg, annotation):
                raise TypeError(error % (name,
                                         annotation.__name__,
                                         type(arg).__name__))

    return func(*args, **kwargs)

Now we've covered all explicit arguments as well as variable arguments passed in by position and keyword. The only thing left is to validate the value returned by the target function. Thus far, the wrapper just calls the original function directly, without regard for what it returns, but by now, it should be easy to see what needs to be done.

def wrapper(*args, **kwargs):
    # Populate a dictionary of explicit arguments passed positionally
    explicit_args = dict(zip(spec.args, args))
    keyword_args = kwargs.copy()

    # Add all explicit arguments passed by keyword
    for name in chain(spec.args, spec.kwonlyargs):
        if name in kwargs:
            explicit_args[name] = keyword_args.pop(name)

    # Deal with explicit arguments
    for name, arg in explicit_args.items():
        if name in annotations and not isinstance(arg, annotations[name]):
            raise TypeError(error % (name,
                                     annotations[name].__name__,
                                     type(arg).__name__))

    # Deal with variable positional arguments
    if spec.varargs and spec.varargs in annotations:
        annotation = annotations[spec.varargs]
        for i, arg in enumerate(args[len(spec.args):]):
            if not isinstance(arg, annotation):
                raise TypeError(error % ('variable argument %s' % (i + 1),
                                         annotation.__name__,
                                         type(arg).__name__))

    # Deal with variable keyword arguments
    if spec.varkw and spec.varkw in annotations:
        annotation = annotations[spec.varkw]
        for name, arg in keyword_args.items():
            if not isinstance(arg, annotation):
                raise TypeError(error % (name,
                                         annotation.__name__,
                                         type(arg).__name__))

    r = func(*args, **kwargs)
    if 'return' in annotations and not isinstance(r, annotations['return']):
        raise TypeError(error % ('the return value',
                                 annotations['return'].__name__,
                                 type(r).__name__))
    return r

With that, we have a fully functional type safety decorator, which can validate all arguments to a function as well as its return value. There's one additional safeguard we can include to find errors even more quickly, though. Similarly to how the outer typesafe() function already validates that the annotations are types, that part of the function is also capable of validating the default values for all provided arguments. Since variable arguments can't have default values, this is much simpler than dealing with the function call itself.

import functools
import inspect
from itertools import chain

def typesafe(func):
    """
    Verify that the function is called with the right argument types and
    that it returns a value of the right type, according to its annotations
    """
    spec = inspect.getfullargspec(func)
    annotations = spec.annotations

    for name, annotation in annotations.items():
        if not isinstance(annotation, type):
            raise TypeError("The annotation for '%s' is not a type." % name)

    error = "Wrong type for %s: expected %s, got %s."
    defaults = spec.defaults or ()
    defaults_zip = zip(spec.args[-len(defaults):], defaults)
    kwonlydefaults = spec.kwonlydefaults or {}

    for name, value in chain(defaults_zip, kwonlydefaults.items()):
        if name in annotations and not isinstance(value, annotations[name]):
            raise TypeError(error % ('default value of %s' % name,
                                     annotations[name].__name__,
                                     type(value).__name__))

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Populate a dictionary of explicit arguments passed positionally
        explicit_args = dict(zip(spec.args, args))
        keyword_args = kwargs.copy()

        # Add all explicit arguments passed by keyword
        for name in chain(spec.args, spec.kwonlyargs):
            if name in kwargs:
                explicit_args[name] = keyword_args.pop(name)

        # Deal with explicit arguments
        for name, arg in explicit_args.items():
            if name in annotations and not isinstance(arg, annotations[name]):
                raise TypeError(error % (name,
                                         annotations[name].__name__,
                                         type(arg).__name__))

        # Deal with variable positional arguments
        if spec.varargs and spec.varargs in annotations:
            annotation = annotations[spec.varargs]
            for i, arg in enumerate(args[len(spec.args):]):
                if not isinstance(arg, annotation):
                    raise TypeError(error % ('variable argument %s' % (i + 1),
                                             annotation.__name__,
                                             type(arg).__name__))

        # Deal with variable keyword arguments
        if spec.varkw and spec.varkw in annotations:
            annotation = annotations[spec.varkw]
            for name, arg in keyword_args.items():
                if not isinstance(arg, annotation):
                    raise TypeError(error % (name,
                                             annotation.__name__,
                                             type(arg).__name__))

        r = func(*args, **kwargs)
        if 'return' in annotations and not isinstance(r, annotations['return']):
            raise TypeError(error % ('the return value',
                                     annotations['return'].__name__,
                                     type(r).__name__))
        return r
    return wrapper
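Here's a brief, hypothetical session showing the finished decorator at work, validating explicit arguments and the return value alike.

>>> @typesafe
... def combine(a: int, b: str = 'spam') -> str:
...     return b * a
...
>>> combine(2)
'spamspam'
>>> combine(2, b=42)
Traceback (most recent call last):
  ...
TypeError: Wrong type for b: expected str, got int.
>>> combine('2')
Traceback (most recent call last):
  ...
TypeError: Wrong type for a: expected int, got str.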

Factoring Out the Boilerplate

Looking over the code as it stands, you'll notice a lot of repetition. Each form of annotation ends up doing the same things: checking if the value is appropriate and raising an exception if it's not. Ideally, we'd be able to factor that out into a separate function that can focus solely on the actual task of validation. The rest of the code is really just boilerplate, managing the details of finding the different types of annotations.

Since the common code will be going into a new function, the obvious way to tie it into the rest of the code is to create a new decorator. This new decorator will be placed on a function that will process the annotation for each value, so we'll call it annotation_decorator(). The function passed into annotation_decorator() will then be used for each of the annotation types throughout the existing code.

import functools
import inspect
from itertools import chain

def annotation_decorator(process):
    """
    Creates a decorator that processes annotations for each argument passed
    into its target function, raising an exception if there's a problem.
    """

    @functools.wraps(process)
    def decorator(func):
        spec = inspect.getfullargspec(func)
        annotations = spec.annotations

        defaults = spec.defaults or ()
        defaults_zip = zip(spec.args[-len(defaults):], defaults)
        kwonlydefaults = spec.kwonlydefaults or {}

        for name, value in chain(defaults_zip, kwonlydefaults.items()):
            if name in annotations:
                process(value, annotations[name])

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Populate a dictionary of explicit arguments passed positionally
            explicit_args = dict(zip(spec.args, args))
            keyword_args = kwargs.copy()

            # Add all explicit arguments passed by keyword
            for name in chain(spec.args, spec.kwonlyargs):
                if name in kwargs:
                    explicit_args[name] = keyword_args.pop(name)

            # Deal with explicit arguments
            for name, arg in explicit_args.items():
                if name in annotations:
                    process(arg, annotations[name])

            # Deal with variable positional arguments
            if spec.varargs and spec.varargs in annotations:
                annotation = annotations[spec.varargs]
                for arg in args[len(spec.args):]:
                    process(arg, annotation)

            # Deal with variable keyword arguments
            if spec.varkw and spec.varkw in annotations:
                annotation = annotations[spec.varkw]
                for name, arg in keyword_args.items():
                    process(arg, annotation)

            r = func(*args, **kwargs)
            if 'return' in annotations:
                process(r, annotations['return'])
            return r

        return wrapper

    return decorator

Note

Because we're making it a bit more generic, you'll notice that the initial portion of the decorator no longer checks that the annotations are valid types. The decorator itself no longer cares what logic you apply to the argument values, since that's all done in the decorated function.

Now we can apply this new decorator to a much simpler function to provide a new typesafe() decorator, which functions just like the one in the previous section.

@annotation_decorator
def typesafe(value, annotation):
    """
    Verify that the function is called with the right argument types and
    that it returns a value of the right type, according to its annotations
    """
    if not isinstance(value, annotation):
        raise TypeError("Expected %s, got %s." % (annotation.__name__,
                                                  type(value).__name__))

The benefit of doing this is that it's much easier to modify the behavior of the decorator in the future. In addition, you can now use annotation_decorator() to create new types of decorators that use annotations for different purposes, such as type coercion.

Example: Type Coercion

Rather than strictly requiring that the arguments all be of the types specified when they're passed into the function, another approach is to coerce them to the required types inside the function itself. Many of the same types that are used to validate values can also be used to coerce values directly, simply by calling the type with the value as its argument. If a value can't be coerced, the type raises an exception, usually a TypeError or a ValueError, much like our validation function does.
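These behaviors are just standard Python; calling a type performs the conversion, and incompatible values raise an exception on their own.

>>> str(42)
'42'
>>> int('42')
42
>>> int('spam')
Traceback (most recent call last):
  ...
ValueError: invalid literal for int() with base 10: 'spam'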

The decorator presented in the previous section provides a good starting point for adding this behavior to a new decorator, and we can use it to modify the incoming value according to the annotation that was provided along with it. Since we're relying on the type constructor to do all the necessary type checking and raise exceptions appropriately, this new decorator can be much simpler. In fact, it can be expressed in just one actual instruction.

@annotation_decorator
def coerce_arguments(value, annotation):
    return annotation(value)

In fact, this is so simple that it doesn't even require that the annotation be a type at all. Any function or class that returns an object will work just fine, and the value returned will be passed into the function decorated by coerce_arguments(). Or will it? If you look back at the annotation_decorator() function as it stands, there's a minor problem that prevents it from working the way this new decorator would need it to.

The problem is that, in the lines that call the process() function that was passed into the outer decorator, the return value is thrown away. If you try to use coerce_arguments() with the existing decorator, all you'll get is the exception-raising aspect of the code, not the value coercion aspect. So in order to work properly, we'll need to go back and add that feature to annotation_decorator().

There are a few things that need to be done overall, though. Because the annotation processor will be modifying the arguments that will eventually be sent to the decorated function, we'll need to set up a new list for positional arguments and a new dictionary for keyword arguments. Then we have to split up the explicit argument handling, so that we can distinguish between positional and keyword arguments. Without that, the function wouldn't be able to apply variable positional arguments correctly. And since not every argument will necessarily have an annotation, any value without one should simply be passed through unchanged.

def wrapper(*args, **kwargs):
    new_args = []
    new_kwargs = {}
    keyword_args = kwargs.copy()

    # Deal with explicit arguments passed positionally
    for name, arg in zip(spec.args, args):
        if name in annotations:
            new_args.append(process(arg, annotations[name]))
        else:
            # Unannotated arguments pass through unchanged
            new_args.append(arg)

    # Deal with explicit arguments passed by keyword
    for name in chain(spec.args, spec.kwonlyargs):
        if name in kwargs:
            if name in annotations:
                new_kwargs[name] = process(keyword_args.pop(name),
                                           annotations[name])
            else:
                new_kwargs[name] = keyword_args.pop(name)

    # Deal with variable positional arguments
    if spec.varargs and spec.varargs in annotations:
        annotation = annotations[spec.varargs]
        for arg in args[len(spec.args):]:
            new_args.append(process(arg, annotation))
    else:
        new_args.extend(args[len(spec.args):])

    # Deal with variable keyword arguments
    if spec.varkw and spec.varkw in annotations:
        annotation = annotations[spec.varkw]
        for name, arg in keyword_args.items():
            new_kwargs[name] = process(arg, annotation)
    else:
        new_kwargs.update(keyword_args)

    r = func(*new_args, **new_kwargs)
    if 'return' in annotations:
        r = process(r, annotations['return'])
    return r

With those changes in place, the new coerce_arguments() decorator will be able to replace the arguments on the fly, passing the replacements into the original function. Unfortunately, if you're still using typesafe() from before, this new behavior causes problems because typesafe() doesn't return a value. Fixing that is a simple matter of returning the original value, unchanged, if the type check was satisfactory.

@annotation_decorator
def typesafe(value, annotation):
    """
    Verify that the function is called with the right argument types and
    that it returns a value of the right type, according to its annotations
    """
    if not isinstance(value, annotation):
        raise TypeError("Expected %s, got %s." % (annotation.__name__,
                                                  type(value).__name__))
    return value
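With both pieces updated, a hypothetical session shows coerce_arguments() converting values on their way into the function, while the return annotation converts the result as well. The add_prices() function here is made up purely for illustration.

>>> @coerce_arguments
... def add_prices(a: float, b: float) -> float:
...     return a + b
...
>>> add_prices('1.5', 2)
3.5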

Annotating with Decorators

The natural question to ask is: what happens if you want to use two libraries together? One might expect you to supply valid types, while the other expects a string to use for documentation. They're completely incompatible with each other, which forces you to use one or the other, rather than both. Furthermore, any attempt to merge the two, using a dictionary or some other combined data type, would have to be agreed on by both libraries, since each would need to know how to get at the information it cares about.

Once you consider how many other frameworks and libraries might take advantage of these annotations, you can see how quickly the official function annotations fall apart. It's still too early to see which applications will actually use it or how they'll work together, but it's certainly worth considering other options which can bypass the problems completely.

Since decorators can take arguments of their own, it's possible to use them to provide annotations for the arguments of the functions they decorate. This way, the annotations are separate from the function itself and provided directly to the code that makes sense of them. And since multiple decorators can be stacked together on a single function, it's already got a built-in way of managing multiple frameworks.

Example: Type Safety as a Decorator

To illustrate the decorator-based approach to function annotations, let's consider the type safety example from earlier. It already relied on a decorator, so we can extend that to take arguments, using the same types that the annotations provided previously. Essentially, it'll look something like this.

>>> @typesafe(str, str)
... def combine(a, b):
...     return a + b
...
>>> combine('spam', 'alot')
'spamalot'
>>> combine('fail', 1)
Traceback (most recent call last):
  ...
TypeError: Expected str, got int.

It works almost exactly like the true annotated version, except that the annotations are supplied to the decorator directly. In order to accept arguments, we're going to just change the first portion of the code a bit, so we can get the annotations from the arguments instead of inspecting the function itself.

Since annotations come in through arguments to the decorator, we have a new outer wrapper for receiving them. When the next layer receives the function to be decorated, it can match up the annotations with the function's signature, providing names for any annotations passed positionally. Once all the available annotations have been given the right names, they can be used by the rest of the inner decorator, without any further modifications.

import functools
import inspect
from itertools import chain

def annotation_decorator(process):
    """
    Creates a decorator that processes annotations for each argument passed
    into its target function, raising an exception if there's a problem.
    """

    def annotator(*args, **kwargs):
        annotations = kwargs.copy()

        @functools.wraps(process)
        def decorator(func):
            spec = inspect.getfullargspec(func)
            annotations.update(zip(spec.args, args))

            defaults = spec.defaults or ()
            defaults_zip = zip(spec.args[-len(defaults):], defaults)
            kwonlydefaults = spec.kwonlydefaults or {}

            for name, value in chain(defaults_zip, kwonlydefaults.items()):
                if name in annotations:
                    process(value, annotations[name])

            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                new_args = []
                new_kwargs = {}
                keyword_args = kwargs.copy()

                # Deal with explicit arguments passed positionally
                for name, arg in zip(spec.args, args):
                    if name in annotations:
                        new_args.append(process(arg, annotations[name]))
                    else:
                        # Unannotated arguments pass through unchanged
                        new_args.append(arg)

                # Deal with explicit arguments passed by keyword
                for name in chain(spec.args, spec.kwonlyargs):
                    if name in kwargs:
                        if name in annotations:
                            new_kwargs[name] = process(keyword_args.pop(name),
                                                       annotations[name])
                        else:
                            new_kwargs[name] = keyword_args.pop(name)

                # Deal with variable positional arguments
                if spec.varargs and spec.varargs in annotations:
                    annotation = annotations[spec.varargs]
                    for arg in args[len(spec.args):]:
                        new_args.append(process(arg, annotation))
                else:
                    new_args.extend(args[len(spec.args):])

                # Deal with variable keyword arguments
                if spec.varkw and spec.varkw in annotations:
                    annotation = annotations[spec.varkw]
                    for name, arg in keyword_args.items():
                        new_kwargs[name] = process(arg, annotation)
                else:
                    new_kwargs.update(keyword_args)

                r = func(*new_args, **new_kwargs)
                if 'return' in annotations:
                    r = process(r, annotations['return'])
                return r

            return wrapper

        return decorator

    return annotator
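Because the annotator accepts keyword arguments as well, you can annotate just some of a function's arguments by name and leave the rest unchecked; unannotated values now pass through untouched. Here's a brief, hypothetical session.

>>> @typesafe(b=str)
... def combine(a, b):
...     return a + b
...
>>> combine('spam', 'alot')
'spamalot'
>>> combine('spam', 42)
Traceback (most recent call last):
  ...
TypeError: Expected str, got int.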

That handles most of the situation, but it doesn't handle return values yet. If you try to supply a return value annotation using the obvious name, return, you'll get a syntax error because return is a reserved Python keyword. Trying to provide it alongside the other annotations would require each call to pass annotations using an actual dictionary, where you could provide the return annotation without upsetting Python's syntax.

Instead, we'll need to provide the return value annotation in a separate function call, where it can be the sole argument, without any reserved name issues. When working with most types of decorators, this would be easy to do: just create a new decorator that checks the return value and be done with it. Unfortunately, since the eventual decorator we're working with is created outside the control of our code, it's not so easy.

If we completely detached the return value processing from the argument processing, the programmer who's actually writing something like the typesafe() decorator would have to write it twice: once to create the argument-processing decorator and again to create the return-value-processing decorator. Since that's a clear violation of DRY, let's see if we can reuse as much of that work as possible.

Here's where some design comes into play. We're going beyond just a simple decorator, so we need to figure out which approach will make the most sense to those who have to use it. Thinking through the available options, one solution springs to mind fairly quickly. If we could add the extra annotation function as an attribute of the final decorator, you'd be able to write the return value annotator on the same line as the other decorator, right afterward, in its own function call. Here's what it might look like, if we went that route.

@typesafe(int, int).returns(int)
def add(a, b):
    return a + b

Unfortunately, this isn't actually an option, for reasons that can be demonstrated without even adding the necessary code to support it. The trouble is, this form isn't allowed as Python syntax (at least not prior to Python 3.9, which relaxed the decorator grammar). If typesafe() hadn't taken any arguments, it would work, but there's no support here for calling two separate functions as part of a single decorator. Instead of supplying the return value annotation in the decorator itself, let's look somewhere else.

Another option is to use the generated typesafe() decorator to add a function as an attribute to the wrapper around the add() function. This places the return value annotation at the end of the function definition, closer to where the return value is specified. In addition, it helps clarify the fact that you can use typesafe() to supply argument decorators without bothering to check the return value, if you want to. Here's how it would look.

@typesafe(int, int)
def add(a, b):
    return a + b
add.returns(int)

It's still very clear and perhaps even more explicit than the syntax that doesn't work anyway. As an added bonus, the code to support it is very simple, requiring just a few lines be added to the end of the inner decorator() function.

def decorator(func):
    spec = inspect.getfullargspec(func)
    annotations.update(zip(spec.args, args))

    defaults = spec.defaults or ()
    defaults_zip = zip(spec.args[-len(defaults):], defaults)
    kwonlydefaults = spec.kwonlydefaults or {}

    for name, value in chain(defaults_zip, kwonlydefaults.items()):
        if name in annotations:
            process(value, annotations[name])

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        new_args = []
        new_kwargs = {}
        keyword_args = kwargs.copy()

        # Deal with explicit arguments passed positionally
        for name, arg in zip(spec.args, args):
            if name in annotations:
                new_args.append(process(arg, annotations[name]))
            else:
                # Unannotated arguments pass through unchanged
                new_args.append(arg)

        # Deal with explicit arguments passed by keyword
        for name in chain(spec.args, spec.kwonlyargs):
            if name in kwargs:
                if name in annotations:
                    new_kwargs[name] = process(keyword_args.pop(name),
                                               annotations[name])
                else:
                    new_kwargs[name] = keyword_args.pop(name)

        # Deal with variable positional arguments
        if spec.varargs and spec.varargs in annotations:
            annotation = annotations[spec.varargs]
            for arg in args[len(spec.args):]:
                new_args.append(process(arg, annotation))
        else:
            new_args.extend(args[len(spec.args):])

        # Deal with variable keyword arguments
        if spec.varkw and spec.varkw in annotations:
            annotation = annotations[spec.varkw]
            for name, arg in keyword_args.items():
                new_kwargs[name] = process(arg, annotation)
        else:
            new_kwargs.update(keyword_args)

        r = func(*new_args, **new_kwargs)
        if 'return' in annotations:
            r = process(r, annotations['return'])
        return r

    def return_annotator(annotation):
        annotations['return'] = annotation
    wrapper.returns = return_annotator

    return wrapper

Since this new returns() function will be called before the decorated function itself ever will, it can simply add a new annotation to the existing dictionary. Then, when the function does get called later, the internal wrapper can just continue working like it always did. This just changes the way the return value annotation is supplied, which is all that was necessary.

Because all of this behavior was factored out into a separate decorator, you can apply this decorator to coerce_arguments() or any other similarly purposed function. The resulting function will work the same way as typesafe(), only swapping out the argument handling with whatever the new decorator needs to do.
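As a quick, hypothetical demonstration, pairing coerce_arguments() with the argument-passing interface looks like this; the add() function is made up for the occasion.

>>> @annotation_decorator
... def coerce_arguments(value, annotation):
...     return annotation(value)
...
>>> @coerce_arguments(int, int)
... def add(a, b):
...     return a + b
...
>>> add('1', '2')
3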

Generators

Chapter 2 introduced the concept of generator expressions and stressed the importance of iteration. While generator expressions are useful for simple situations, you'll often need more sophisticated logic to determine how the iteration should work. You may need finer-grained control over the duration of the loop, the items getting returned, possible side-effects that get triggered along the way or any number of other concerns you may have.

Essentially, you need a real function, but with the benefits of a proper iterator and without the cognitive overhead of creating the iterator yourself. This is where generators come in. By allowing you to define a function that can produce individual values one at a time, rather than just a single return value, you have the added flexibility of a function and the performance of an iterator.

Generators are set aside from other functions by their use of the yield statement. This is somewhat of an analog to the typical return statement, except that yield doesn't cause the function to stop executing completely. It pushes one value out of the function, which gets consumed by the loop that called the generator, then when that loop starts over, the generator starts back up again. It picks up right where it left off, running until it finds another yield statement or the function simply finishes executing.
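That push-and-resume behavior can be seen directly by driving a trivial generator by hand with the built-in next() function, which is essentially what a for loop does behind the scenes. The count_down() function here is purely illustrative.

>>> def count_down(n):
...     while n > 0:
...         yield n
...         n -= 1
...
>>> it = count_down(3)
>>> next(it)
3
>>> next(it)
2
>>> next(it)
1
>>> next(it)
Traceback (most recent call last):
  ...
StopIteration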

For a more realistic example, consider a simple generator that produces the values in the Fibonacci sequence. The sequence begins with 0 and 1; each following number is produced by adding up the two numbers before it in the sequence. Therefore, the function only ever needs to keep two numbers in memory at a time, no matter how high the sequence goes. In order to keep it from continuing on forever, though, it's best to require a maximum number of values it should return, making a total of three values to keep track of.

It's tempting to set up the first two values as special cases, yielding them one at a time before even starting into the main loop that would return the rest of the sequence. That adds some extra complexity, though, which can make it pretty easy to accidentally introduce an infinite loop. Instead, we'll use a couple other seed values, −1 and 1, which can be fed right into the main loop directly. They'll generate 0 and 1 correctly when the loop's logic is applied.

Next, we can add a loop for all the values in the sequence, up until the count is reached. Since the seed values generate 0 and 1 from within the loop itself, every value, including the first two, comes out of that one loop; all we have to do is decrement count on each iteration until it runs out.

def fibonacci(count):
    # These seed values generate 0 and 1 when fed into the loop
    a, b = -1, 1

    while count > 0:
        # Yield the value for this iteration
        c = a + b
        yield c

        # Update values for the next iteration
        a, b = b, c
        count -= 1

With the generator in place, you can iterate over the values it produces, simply by treating it like you would any other sequence. Generators are iterable automatically, so a standard for loop already knows how to activate it and retrieve its values.

>>> for x in fibonacci(3):
...     print(x)
...
0
1
1
>>> for x in fibonacci(7):
...     print(x)
...
0
1
1
2
3
5
8

Unfortunately, the main benefit of generators can also, at times, be somewhat of a burden. Because there's no complete sequence in memory at any given time, generators always have to pick up where they left off. Most of the time, though, you'll completely exhaust the generator when you iterate over it the first time, so when you try to put it into another loop, you won't get anything back at all.

>>> fib = fibonacci(7)
>>> list(fib)
[0, 1, 1, 2, 3, 5, 8]
>>> list(fib)
[]

This behavior can seem a bit misleading at first, but most of the time, it's the only behavior that makes sense. Generators are often used in places where the entire sequence isn't even known in advance or it may change after you iterate over it. For example, you might use a generator to iterate over the users currently accessing a system. Once you've identified all the users, the generator automatically becomes stale and you need to create a new one, which refreshes the list of users.

Note

If you've used the built-in range() function (or xrange() prior to Python 3.0) often enough, you may have noticed that it does restart itself if accessed multiple times. That behavior is provided by moving one level lower in the iteration process and implementing the iterator protocol explicitly. It can't be achieved with simple generators, but Chapter 5 shows how you can have greater control over iteration for the objects you create.

Lambdas

In addition to providing features on their own, functions are often called upon to provide some extra minor bit of functionality to some other feature. For example, when sorting a list, you can configure Python's behavior by supplying a function that accepts a list item and returns a value that should be used for comparison. This way, given a list of House objects, for instance, you can sort by price.

>>> def get_price(house):
...     return house.price
...
>>> houses.sort(key=get_price)

Unfortunately, this seems like a bit of a waste of the function's abilities, plus it requires a couple of extra lines of code and a name that never gets used outside of the sort() method call. A better approach would be if you could specify the key function directly inline with the method call. This not only makes it more concise, it also places the body of the function right where it will be used, so it's a lot more readable for these types of simple behaviors.

In these situations, Python's lambda form is extremely valuable. Python provides a separate syntax, identified by the keyword lambda. This allows you to define a function without a name as a single expression, with a much simpler feature set. Before diving into the details of the syntax, here's what it looks like in the house sorting example.

>>> houses.sort(key=lambda h: h.price)

As you can see, this is a considerably compressed form of a function definition. Following the lambda keyword is a list of arguments, separated by commas. In the sort example, only one argument is needed, and it can be named anything you like, like any other function. They can even have default values if necessary, using the same syntax as regular functions. Arguments are followed by a colon, which notes the beginning of the lambda's body. If no arguments are involved, the colon can be placed immediately after the lambda keyword.

>>> a = lambda: 'example'
>>> a
<function <lambda> at 0x...>
>>> a()
'example'
>>> b = lambda x, y=3: x + y
>>> b()
Traceback (most recent call last):
  ...
TypeError: <lambda>() takes at least 1 positional argument (0 given)
>>> b(5)
8
>>> b(5, 1)
6

As you'll have likely discovered by now, the body of the lambda is really just its return value. There's no explicit return statement, so the entire body of the function is a single expression used to return a value. That's a big part of what makes the lambda form so concise, yet easily readable, but it comes at a price: only a single expression is allowed. You can't use any control structures, such as try, with or while blocks; you can't assign variables inside the function body; and you can't perform multiple operations unless they can all be tied into one overall expression.

This may seem extremely limiting, but in order to still be readable, the function body must be kept as simple as possible. In situations where you need the additional control flow features, you'll find it much more readable to specify it in a standard function, anyway. Then, you can pass that function in where you might otherwise use the lambda. Alternatively, if you have a portion of the behavior that's provided by some other function, but not all of it, you're free to call out to other functions as part of the expression.

Introspection

One of the primary advantages of Python is that nearly everything can be examined at run-time, from object attributes and module contents to documentation and even generated bytecode. Peeking at this information is called introspection, and it permeates nearly every aspect of Python. The following sections define some of the more general introspection features that are available, while more specific details are given in the remaining chapters.

The most obvious introspective aspect of any function is its name. It's also one of the simplest, made available as the __name__ attribute, whose value is the name that was used when defining the function. In the case of lambdas, which have no names, the __name__ attribute is populated with the standard string '<lambda>'.

>>> def example():
...     pass
...
>>> example.__name__
'example'
>>> (lambda: None).__name__
'<lambda>'

Identifying Object Types

Python's dynamic nature can sometimes make it seem difficult to ensure you're getting the right type of value or to even know what type of value it is. Python does provide some options for accessing that information, but it's necessary to realize those are two separate tasks, so Python uses two different approaches.

The most obvious requirement is to identify what type of object your code was given. For this, Python supplies its built-in type() function, which accepts an object to identify. The return value is the Python class that was used to create the given object, even if that creation was done implicitly, by way of a literal value.

>>> type('example')
<class 'str'>
>>> class Test:
...     pass
...
>>> type(Test)
<class 'type'>
>>> type(Test())
<class '__main__.Test'>

Chapter 4 explains in detail what you can do with that class object once you have it, but the more common case is to compare an object against a particular type you expect to receive. This is a different situation because it doesn't really matter exactly what type the object is. As long as the value is an instance of the right type, you can make correct assumptions about how it behaves.

There are a number of different utility functions available for this purpose, most of which will be covered in Chapter 4. This section and the next chapter will make use of one of them fairly frequently, so it merits some explanation here. The isinstance() function accepts two arguments: the object to check and the type you're expecting it to be. The result is a simple True or False, making it suitable for if blocks.

>>> def test(value):
...     if isinstance(value, int):
...         print('Found an integer!')
...
>>> test('0')
>>> test(0)
Found an integer!
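One detail worth knowing is that isinstance() also accepts a tuple of types as its second argument, returning True if the object is an instance of any of them.

>>> isinstance(10, (int, float))
True
>>> isinstance('10', (int, float))
False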

Modules and Packages

Functions and classes that are defined in Python are placed inside of modules, which in turn are often part of a package structure. Accessing this structure when importing code is easy enough, using documentation or even just peeking at the source files on disk. Given a piece of code, however, it's often useful to identify where it was defined in the source code.

For this reason, all functions and classes have a __module__ attribute, which contains the import location of the module where the code was defined. Rather than just supplying the bare name of the module, the __module__ string includes the full dotted import path to where the module resides. Essentially, it's enough information for you to pass it straight into any of the dynamic importing features shown in Chapter 2.
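For code imported from a real module, that's simply the dotted path you'd use in an import statement.

>>> import inspect
>>> inspect.getdoc.__module__
'inspect'
>>> import xml.sax.handler
>>> xml.sax.handler.ContentHandler.__module__
'xml.sax.handler'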

Working with the interactive interpreter is something of a special case because there's no named source file to work with. Any functions or classes defined there will have the special name '__main__' returned from the __module__ attribute.

>>> def example():
...     pass
...
>>> example
<function example at 0x...>
>>> example.__module__
'__main__'

Docstrings

Since you can document your functions with docstrings included right alongside the code itself, Python also stores those strings as part of the function object. By accessing the __doc__ attribute of a function, you can read a docstring from within your code, which can be useful for generating a library's documentation on the fly. Consider the following example, showing simple docstring access on a simple function.

>>> def example():
...     """This is just an example to illustrate docstring access."""
...     pass
...
>>> example.__doc__
'This is just an example to illustrate docstring access.'
>>> def divide(x, y):
...     """
...     divide(integer, integer) -> floating point
...
...     This is a more complex example, with more comprehensive documentation.
...     """
...     return float(x) / y # Use float() for compatibility prior to 3.0
...
>>> divide.__doc__
'\n    divide(integer, integer) -> floating point\n\n    This is a more complex example, with more comprehensive documentation.\n    '
>>> print(divide.__doc__)

    divide(integer, integer) -> floating point

    This is a more complex example, with more comprehensive documentation.

>>>

As you can see, simple docstrings are easy to handle just by reading in __doc__ and using it however you need to. Unfortunately, more complex docstrings will retain all whitespace, including newlines, making them more challenging to work with. Worse yet, your code can't know which type of docstring you're looking at without scanning it for certain characters. Even if you're just printing it out to the interactive prompt, you still have an extra line before and after the real documentation, as well as the same indentation as was present in the file.

To more gracefully handle complex docstrings like the one shown in the example, the inspect module mentioned previously also has a getdoc() function, designed to retrieve and format docstrings. It strips out whitespace both before and after the documentation, as well as any indentation that was used to line up the docstring with the code around it. Here's that same docstring again, but formatted with inspect.getdoc().

>>> import inspect
>>> print(inspect.getdoc(divide))
divide(integer, integer) -> floating point

This is a more complex example, with more comprehensive documentation.
>>>

We still have to use print() at the interactive prompt because the newline character is still retained in the result string. All inspect.getdoc() strips out is the whitespace that was used to make the docstring look right alongside the code for the function. In addition to trimming the space at the beginning and end of the docstring, getdoc() uses a simple technique to identify and remove whitespace used for indentation.

Essentially, getdoc() counts the number of spaces at the beginning of each line of the docstring, even if the answer is 0. Then it determines the lowest of those counts and removes that many characters from each line that remains after the leading and trailing whitespace has been stripped. This allows you to keep other indentation in the docstring intact, as long as it's greater than what you need to align the text with the surrounding code.
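Here's a rough sketch of that algorithm. It's only an illustration of the approach, not the actual implementation inside the inspect module, and the dedent() name is made up for the occasion.

import sys

def dedent(docstring):
    # Illustrative sketch only; inspect.getdoc() does the real work
    lines = docstring.strip().split('\n')

    # Find the smallest indentation across the continuation lines,
    # ignoring any that are entirely blank
    margin = sys.maxsize
    for line in lines[1:]:
        stripped = line.lstrip()
        if stripped:
            margin = min(margin, len(line) - len(stripped))

    # Strip that much indentation from each remaining line
    if margin < sys.maxsize:
        lines[1:] = [line[margin:] for line in lines[1:]]
    return '\n'.join(lines)

With that in mind, here's an example of an even more complex docstring, so you can see how inspect.getdoc() handles it.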

>>> def clone(obj, count=1):
...     """
...    clone(obj, count=1) -> list of cloned objects
...
...    Clone an object a specified number of times, returning the cloned
...    objects as a list. This is just a shallow copy only.
...
...    obj
...        Any Python object
...    count
...        Number of times the object will be cloned
...
...      >>> clone(object(), 2)
...      [<object object at 0x12345678>, <object object at 0x87654321>]
...    """
...    import copy
...    return [copy.copy(obj) for x in range(count)]
...
>>> print(inspect.getdoc(clone))
clone(obj, count=1) -> list of cloned objects

Clone an object a specified number of times, returning the cloned
objects as a list. This is just a shallow copy only.

obj
    Any Python object
count
    Number of times the object will be cloned

  >>> clone(object(), 2)
  [<object object at 0x12345678>, <object object at 0x87654321>]
>>>

Notice how the descriptions of each argument are still indented four spaces, just as they were relative to the rest of the docstring in the function definition. The shortest lines had just four spaces at the beginning, while the argument descriptions had eight, so Python stripped out the first four and left the rest intact. Likewise, the example interpreter session was indented by two extra spaces, so the resulting string maintains a two-space indentation.

Oh, and don't worry too much about the copy utility just yet. Chapter 6 will describe in detail how to make and manage copies of objects when necessary.

Taking It With You

Although Python functions may seem to be quite simple on the surface, you now know how to define and manage them in ways that really fit your needs. Of course, you're probably looking to incorporate functions into a more comprehensive object-oriented program, and for that, we'll need to look at how Python's classes work.
