CHAPTER 2

Django Is Python

Django, like other frameworks, is built on an underlying programming language—in this case, Python—to do its work. Many people who are new to Django are also new to Python, and Python’s natural-feeling syntax combined with Django’s energy-saving features can make Django seem like it uses some kind of metalanguage, which isn’t the case.

A proper understanding of what can be done in Django must begin with the knowledge that Django is simply Python, as are all of your applications. Anything that can be done in Python can be done in Django, which makes the possibilities nearly limitless.

This also means that Django applications have access not only to the entire Python standard library, but also to an immense collection of third-party libraries and utilities. Interfaces to some of these are provided along with Django itself, so for many cases, the existing code and documentation will be sufficient to quickly get an application up and running.

Later in this book, some additional utilities are covered, along with some tips on how to integrate them into a Django application. The possibilities aren’t limited to the options outlined in this book, so feel free to look around for Python utilities that will help support your business plan, and use the techniques listed in this book to integrate them into your application.

Though learning Python is beyond the scope of this book, Django uses some of its advanced features. In this chapter, I’ll discuss many of those features to help you understand how Python can contribute to the goal of making things easier for everyone.

How Python Builds Classes

Some of the most advanced Python techniques that Django relies on are related to how Python constructs its classes. This process is often taken for granted by most developers—as well it should be—but since it’s at the heart of Django, it forms the basis of this exploration.

When the Python interpreter encounters a class definition, it reads its contents just as it would any other code. Python then creates a new namespace for the class and executes all the code within it, writing any variable assignments to that new namespace. Class definitions generally contain variables, methods and other classes, all of which are basically assignments to the namespace for the class. However, nearly any valid code is allowed here, including printing to console output, writing to log files or even triggering GUI interaction.

Once the contents have finished executing, Python will have a class object that is ordinarily placed in the namespace where it was defined (usually the global namespace for the module), where it is then passed around or called to create instances of that class.

>>> class NormalClass:
...     print('Loading NormalClass')
...     spam = 'eggs'
...     print('Done loading')
...
Loading NormalClass
Done loading
>>> NormalClass
<class '__main__.NormalClass'>
>>> NormalClass.spam
'eggs'

As you can see, code executes within the class definition, with any assigned variables showing up as class attributes once the class is ready.

Building a Class Programmatically

The process described in the previous section is used for any source-declared class, but the way Python goes about it offers the possibility of something far more interesting. Behind the scenes, details about the class declaration are sent off to the built-in type object, which takes care of creating an appropriate Python object for the class. This happens automatically, for every class, immediately when it finishes parsing the contents of the class declaration.

The constructor for type accepts three arguments, which represent the entire class declaration.

  • name—The name provided for the class, as a string
  • bases—A tuple of classes in the inheritance chain of the class; may be empty
  • attrs—A dictionary of the class namespace

COMPATIBILITY: NEW-STYLE CLASSES IN PYTHON 2

The process described in this section is true for new-style Python classes, a distinction introduced in Python 2.2. Old-style classes have been completely removed from Python 3, but if you're working with Python 2, you’ll need to make sure to force new-style classes. To do so, simply make sure that the class inherits from the built-in object type somewhere in its inheritance chain.

All the classes Django provides to be subclassed will already derive from object, so any further derivatives will automatically be new-style classes, without any extra effort on your part. Still, it’s important to keep the difference in mind, so that any custom classes your application may need will exhibit the behaviors outlined in this chapter.

Like any Python object, a new type can be instantiated at any time, from any block of code. This means that your code can construct a new class based on data collected at runtime. The following code demonstrates a way to declare a class at runtime, which is functionally equivalent to the example provided in the previous section.

>>> DynamicClass = type('DynamicClass', (), {'spam': 'eggs'})
>>> DynamicClass
<class '__main__.DynamicClass'>
>>> DynamicClass.spam
'eggs'

A WARNING ABOUT TYPE()

Using type() manually makes it easy to create classes with duplicate names, and even the module location can be customized by providing a __module__ key in the attrs dictionary. Although these features can be useful, as will be demonstrated later in this book, they can lead to problems with introspection.

You could reasonably have two different classes with the same name and module, but your code won’t be able to tell the difference between them. This may not be a problem in some situations, but it’s something to be aware of.
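To see the problem concretely, consider the following sketch. The module name used here is purely hypothetical; the point is that once two classes share a name and a __module__ value, their representations become indistinguishable.

```python
# Two distinct classes created with type(), sharing a name and module.
# 'fake.module' is a made-up value for illustration only.
ClassA = type('Duplicate', (), {'__module__': 'fake.module'})
ClassB = type('Duplicate', (), {'__module__': 'fake.module'})

# Their representations are identical...
print(ClassA)             # <class 'fake.module.Duplicate'>
print(ClassB)             # <class 'fake.module.Duplicate'>

# ...even though they are entirely separate objects.
print(ClassA is ClassB)   # False
```

Any tool that relies on the name and module to identify a class, such as a debugger or a serialization library, would have no way to tell these two apart.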

Metaclasses Change It Up

type is actually a metaclass—a class that creates other classes—and what we’ve been engaging in is called metaprogramming. In essence, metaprogramming creates or modifies code at runtime rather than at programming time. Python allows you to customize this process by allowing a class to define a different metaclass to perform its work.

If a class definition includes a separate class for its metaclass option, that metaclass will be called to create the class, rather than the built-in type object. This allows your code to read, modify or even completely replace the declared class to further customize its functionality. The metaclass option could technically be given any valid Python callable, but most metaclasses are subclasses of type. The metaclass receives the new class as its first argument and provides access to the class object along with the details regarding its declaration.

To help illustrate how the metaclass arguments are derived from a class definition, take the following code as an example.

>>> class MetaClass(type):
...     def __init__(cls, name, bases, attrs):
...         print('Defining %s' % cls)
...         print('Name: %s' % name)
...         print('Bases: %s' % (bases,))
...         print('Attributes:')
...         for (name, value) in attrs.items():
...             print('    %s: %r' % (name, value))
...
>>> class RealClass(object, metaclass=MetaClass):
...     spam = 'eggs'
...
Defining <class '__main__.RealClass'>
Name: RealClass
Bases: (<class 'object'>,)
Attributes:
    spam: 'eggs'
    __module__: '__main__'
    __qualname__: 'RealClass'
>>> RealClass
<class '__main__.RealClass'>

Notice that the class wasn’t instantiated at any time; the simple act of creating the class triggered execution of the metaclass. Notice __module__ in the list of attributes: this attribute is a standard part of all Python classes.

While this example uses the __init__ method to perform special processing on the newly created class, there is another, somewhat more powerful method called __new__, with the potential for a different set of possibilities. As described in later chapters, Django uses __new__ when configuring many of its classes.
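The key difference is that __new__ runs before the class object exists, so it can alter the attribute dictionary itself. The following is a minimal sketch of that idea; the filtering rule is purely illustrative and is not anything Django actually does.

```python
# A metaclass that intercepts the class declaration in __new__,
# modifying the attrs dictionary before the class is built.
class AutoStrip(type):
    def __new__(cls, name, bases, attrs):
        # Illustrative rule: drop any attribute that was set to None.
        attrs = {key: value for key, value in attrs.items()
                 if value is not None}
        return super().__new__(cls, name, bases, attrs)

class Example(metaclass=AutoStrip):
    keep = 'value'
    drop = None

print(hasattr(Example, 'keep'))  # True
print(hasattr(Example, 'drop'))  # False
```

Because the filtering happens before the class is created, the discarded attribute never exists on the finished class at all, something __init__ could only approximate by deleting it afterward.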

COMPATIBILITY: METACLASSES IN PYTHON 2

Python 3 introduced the ability to pass arguments into a class definition, as shown here with the metaclass option. In Python 2, metaclasses were assigned to a class variable named __metaclass__. The effect is identical in both versions; it’s only a syntax change.

Using a Base Class with a Metaclass

Metaclasses can be quite useful, but the metaclass option is an implementation detail, which shouldn’t need to be part of the process when defining classes. Another problem is that while each class gets processed by the metaclass, they don’t inherit from any concrete class. This means that any additional functionality, such as common methods or attributes, would have to be provided during metaclass processing in order to be of any use.

With a bit of care, a concrete Python class can use a metaclass to solve both of these problems. Since subclasses inherit attributes from their parents, the metaclass option is automatically provided for all subclasses of a class that defines it. This is a simple, effective way to provide metaclass processing for arbitrary classes, without requiring that each class define the metaclass option. Following the example from the previous section, look what happens when we subclass RealClass.

>>> class SubClass(RealClass):  # Notice there's no metaclass here.
...     pass
...
Defining <class '__main__.SubClass'>
Name: SubClass
Bases: (<class '__main__.RealClass'>,)
Attributes:
    __module__: '__main__'

Notice how the subclass here doesn’t have to worry about the fact that there’s a metaclass in use behind the scenes. By just specifying a base class, it inherits all the benefits. Django uses this behavior to implement one of its most prominent features, described in the next section.

Declarative Syntax

Some of Django’s more prominent tools feature a “declarative syntax” that is simple to read, write and understand. This syntax is designed to minimize repetitive “boilerplate” and provide elegant, readable code. For example, here’s what a typical Django model might look like:

class Contact(models.Model):
    """
    Contact information provided when sending messages to the owner of the site.
    """
    name = models.CharField(max_length=255)
    email = models.EmailField()

This declarative syntax has become an identifying feature of Django code, so many third-party applications that supply additional frameworks are written to use a syntax similar to that of Django itself. This helps developers easily understand and utilize new code by making it all feel more cohesive. Once you understand how to create a class using declarative syntax, you’ll easily be able to create classes using many Django features, both official and community-provided.

Looking at declarative syntax on its own will demonstrate how easy it is to create an entirely new framework for Django that fits with this pattern. Using declarative syntax in your own code will help you and your colleagues more easily adapt to the code, ensuring greater productivity. After all, developer efficiency is a primary goal of Django and of Python itself.

While the next few sections describe declarative syntax in general, the examples shown are for Django’s object-relational mapper (ORM), detailed in Chapter 3.

Centralized Access

Typically, a package will supply a single module from which applications can access all the necessary utilities. This module may pull the individual classes and functions from elsewhere in its tree, so they can still use maintainable namespaces, but they will all be collected into one central location.

from django.db import models

Once imported, this module provides at least one class intended as the base class for subclasses based on the framework. Additional classes are provided to be used as attributes of the new subclass. Together, these objects will combine to control how the new class will work.

The Base Class

Each feature starts with at least one base class. There may be more, depending on the needs of the framework, but at least one will always be required in order to make this syntax possible. Without it, every class you ask your users to define will have to include a metaclass explicitly, which is an implementation detail most users shouldn’t need to know about.

class Contact(models.Model):

In addition to inspecting the defined attributes, this base class will provide a set of methods and attributes that the subclass will automatically inherit. Like any other class, it can be as simple or complex as necessary to provide whatever features the framework requires.

Attribute Classes

The module supplying the base class will also provide a set of classes to be instantiated, often with optional arguments to customize their behavior and assigned as attributes of a new class.

class Contact(models.Model):
    name = models.CharField(max_length=255)
    email = models.EmailField()

The features these objects provide will vary greatly across frameworks, and some may behave quite differently from a standard attribute. Often they will combine with the metaclass to provide some additional, behind-the-scenes functionality beyond simply assigning an attribute. Options to these attribute classes are usually read by the metaclass when creating this extra functionality.

For example, Django’s Model uses the names and options of field attributes to describe an underlying database table, which can then be created automatically in the database itself. Field names are used to access individual columns in that table, while the attribute class and options convert native Python data types to the appropriate database values automatically. More information on how Django handles model classes and fields is available in the next chapter.

Ordering Class Attributes

One potential point of confusion when using declarative syntax is that Python dictionaries traditionally made no guarantee about ordering, rather than respecting the order in which their values were assigned. (Dictionaries preserve insertion order as of Python 3.7, but code supporting earlier versions can’t rely on that.) Ordinarily this wouldn’t be a problem, but when inspecting a namespace dictionary it’s impossible to determine the order in which the keys were declared. If a framework needs to iterate through its special attributes, or display them to a user or programmer, it’s often useful to access these attributes in the same order they were defined. This gives the programmer final control over the order of the attributes, rather than some arbitrary ordering decided by the programming language.

A simple solution to this is to have the attributes themselves keep track of the instantiation sequence; the metaclass can then order them accordingly. This process works by having all attribute classes inherit from a particular base class, which can count how many times the class is instantiated and assign a number to each instance.

class BaseAttribute(object):
    creation_counter = 1
    def __init__(self):
        self.creation_counter = BaseAttribute.creation_counter
        BaseAttribute.creation_counter += 1

Object instances have a different namespace than classes, so all instances of this class will have a creation_counter, which can be used to sort the objects according to the order in which they were instantiated. This isn’t the only solution to this problem, but it’s how Django sorts fields for both models and forms.
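Putting the pieces together, a metaclass can collect these counted attributes and sort them back into declaration order. The following sketch uses hypothetical names (DeclarativeMeta, _fields), not Django’s actual internals:

```python
# Attribute instances count their own creation order.
class BaseAttribute(object):
    creation_counter = 1
    def __init__(self):
        self.creation_counter = BaseAttribute.creation_counter
        BaseAttribute.creation_counter += 1

# The metaclass gathers the attributes and sorts them by counter.
class DeclarativeMeta(type):
    def __init__(cls, name, bases, attrs):
        fields = [(key, value) for key, value in attrs.items()
                  if isinstance(value, BaseAttribute)]
        fields.sort(key=lambda pair: pair[1].creation_counter)
        cls._fields = [key for key, value in fields]
        super().__init__(name, bases, attrs)

# A concrete base class hides the metaclass from subclasses.
class Declarative(metaclass=DeclarativeMeta):
    pass

class Contact(Declarative):
    name = BaseAttribute()
    email = BaseAttribute()

print(Contact._fields)  # ['name', 'email']
```

No matter how the namespace dictionary happens to order its keys, the counters guarantee that _fields reflects the order the attributes were written in the class body.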

Class Declaration

With all of these classes in a module, creating an application class is as simple as defining a subclass and some attributes. Different frameworks will have different names for the attribute classes, and will have different requirements as to which classes are required or the combinations in which they may be applied. They may even have reserved names that will cause conflicts if you define an attribute with that name, but such problems are rare, and reserving names should generally be discouraged when developing new frameworks for use with this syntax. The general rule is to allow developers to be as flexible as they’d need to be, without the framework getting in the way.

from django.db import models
 
class Contact(models.Model):
    """
    Contact information provided when sending messages to the owner of the site.
    """
    name = models.CharField(max_length=255)
    email = models.EmailField()

This simple code alone is enough to allow the framework to imbue the new class with a wealth of additional functionality, without requiring the programmer to deal with that process manually. Also note how all the attribute classes are provided from that same base module and are instantiated when assigned to the model.

A class declaration is never limited to only those features provided by the framework. Since any valid Python code is allowed, your classes may contain a variety of methods and other attributes, intermingled with a framework’s provided features.

Common Duck Typing Protocols

You’ve probably heard the old adage, “If it walks like a duck and talks like a duck, it’s a duck.” Shakespeare played on this idea a bit more romantically when he wrote in Romeo and Juliet, “That which we call a rose by any other name would smell as sweet.” The recurring theme here is that the name given to an object has no bearing on its true nature. The idea is that, regardless of labels, you can be reasonably sure what something is just by looking at its behavior.

In Python, and in some other languages, this concept is extended to refer to object types. Rather than relying on some base class or interface to define what an object can do, it simply implements the attributes and methods necessary to behave as expected. A common example of this in Python is a file-like object, which is any object that implements at least some of the same methods as a Python file object. In this way, many libraries may return their own objects that can be passed to other functions that expect a file object, while retaining special abilities, such as being read-only, compressed, encrypted, pulled from an Internet-connected source or any number of other possibilities.

Also, like interfaces in other languages, Python objects can be more than one type of duck at a time. It’s not uncommon, for instance, to have an object that can behave as a dictionary in some respects, while behaving like a list in others. Django’s HttpResponse object exhibits both of these behaviors, as well as mimicking an open file object.
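A toy example can show how one object answers to both kinds of lookup. This Hybrid class (an invented name, unrelated to anything in Django) responds to integer indexes like a list and string keys like a dictionary, both through a single __getitem__ method:

```python
# An object that is two kinds of duck at once: list-like for
# integer indexes, dict-like for string keys.
class Hybrid(object):
    def __init__(self, *pairs):
        self._pairs = list(pairs)
    def __getitem__(self, key):
        if isinstance(key, int):
            return self._pairs[key]      # list-like positional access
        for k, v in self._pairs:         # dict-like key lookup
            if k == key:
                return v
        raise KeyError(key)

h = Hybrid(('spam', 'eggs'), ('foo', 'bar'))
print(h[0])        # ('spam', 'eggs')
print(h['spam'])   # 'eggs'
```

Code that receives this object doesn’t need to know, or care, which behavior it’s relying on; it just uses whichever style of access it expects.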

In Django, many features utilize duck typing by not providing a particular base class. Instead, each feature defines a protocol of sorts, a set of methods and attributes that an object must provide in order to function properly. Many of these protocols are presented in the official Django documentation, and this book will cover many more. You will also see some of the special abilities that can be provided by using this technique.

The following sections describe a few common Python protocols that you’ll see throughout Django, and indeed throughout any large Python library.

Callables

Python allows code to be executed from a number of sources, and anything that can be executed in the same manner as a typical function is designated as callable. All functions, classes and methods are automatically callable, as would be expected, but instances of arbitrary object classes can be designated as callable as well, by providing a single method.

__call__(self[, …])

This method will be executed when the instantiated object is called as a function. It works just like any other member function, differing only in the manner in which it’s called.

>>> class Multiplier(object):
...     def __init__(self, factor):
...         self.factor = factor
...     def __call__(self, value):
...         return value * self.factor
...
>>> times2 = Multiplier(2)
>>> times2(5)
10
>>> times2(10)
20
>>> times3 = Multiplier(3)
>>> times3(10)
30

Python also provides a built-in function to assist in the identification of callable objects. The callable() function takes a single argument, returning True or False, indicating whether the object can be called as a function.

>>> class Basic(object):
...     pass
...
>>> class Callable(object):
...     def __call__(self):
...         return "Executed!"
...
>>> b = Basic()
>>> callable(b)
False
>>> c = Callable()
>>> callable(c)
True

Dictionaries

A dictionary is a mapping between keys and values within a single object. Most programming languages have dictionaries in some form; other languages call them “hashes,” “maps” or “associative arrays.” In addition to simple access to values by specifying a key, dictionaries in Python provide a number of methods for more fine-grained manipulation of the underlying mapping. To behave even more like a true dictionary, an object may provide other methods, documented in the Python Library Reference.

__contains__(self, key)

Used by the in operator, this returns True if the specified key is present in the underlying mapping, and returns False otherwise. This should never raise an exception.

__getitem__(self, key)

This returns the value referenced by the specified key, if it exists. If the key is not present in the underlying mapping, it should raise a KeyError.

__setitem__(self, key, value)

This stores the specified value to be referenced later by the specified key. This should overwrite any existing value referenced by the same key, if such a mapping is already present.

>>> class CaseInsensitiveDict(dict):
...     def __init__(self, **kwargs):
...         for key, value in kwargs.items():
...             self[key.lower()] = value
...     def __contains__(self, key):
...         return super(CaseInsensitiveDict, self).__contains__(key.lower())
...     def __getitem__(self, key):
...         return super(CaseInsensitiveDict, self).__getitem__(key.lower())
...     def __setitem__(self, key, value):
...         super(CaseInsensitiveDict, self).__setitem__(key.lower(), value)
...
>>> d = CaseInsensitiveDict(SpAm='eggs')
>>> 'spam' in d
True
>>> d['SPAM']
'eggs'
>>> d['sPaM'] = 'burger'
>>> d['SpaM']
'burger'

Dictionaries are also expected to be iterable, with the list of keys used when code loops over a dictionary’s contents. Refer to the upcoming “Iterables” section for more information.

Files

As mentioned previously, files are a common way to access information, and many Python libraries provide file-like objects for use with other file-related functions. A file-like object doesn’t need to supply all of the following methods, just those that are necessary to function properly. In the case of the file protocol, objects are free to implement read access, write access or both. Not all methods are listed here, only the most common. A full list of file methods is available in the Python standard library documentation, so be sure to check there for more details.

read(self[, size])

This retrieves data from the object or its source of information. The optional size argument contains the number of bytes to be retrieved. Without this argument, the method should return as many bytes as possible (often the entire file, if available, or perhaps all the bytes available on a network interface).

write(self, str)

This writes the specified str to the object or its source of information.

close(self)

This closes the file so it can no longer be accessed. This can be used to free any memory resources that have been allocated, to commit the object’s contents to disk or simply to satisfy the protocol. Even if this method provides no special functionality, it should be provided to avoid unnecessary errors.
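As a concrete sketch, here is a minimal read-only file-like object that wraps a string in memory. It implements only read() and close(), just enough of the protocol to be passed to code that expects an open file for reading:

```python
# A read-only, in-memory file-like object.
class StringFile(object):
    def __init__(self, data):
        self._data = data
        self._position = 0

    def read(self, size=None):
        # Return up to `size` characters, or everything remaining
        # when no size is given, advancing the position either way.
        if size is None:
            chunk = self._data[self._position:]
        else:
            chunk = self._data[self._position:self._position + size]
        self._position += len(chunk)
        return chunk

    def close(self):
        # Nothing to release, but provided to satisfy the protocol.
        self._data = ''

f = StringFile('spam and eggs')
print(f.read(4))   # 'spam'
print(f.read())    # ' and eggs'
f.close()
```

Because it doesn’t implement write(), any attempt to write to it simply raises an AttributeError, which is exactly the behavior described in the sidebar that follows.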

A VERY LOOSE PROTOCOL

File-like objects come in many varieties, because this protocol is one of the loosest defined in all of Python. There are quite a few features, from buffering output to allowing random access to data, that are inappropriate in some situations, so objects designed for those situations will typically just not implement the corresponding methods. For example, Django’s HttpResponse object, described in Chapter 7, only allows writes in sequence, so it doesn’t implement read(), seek() or tell(), causing errors when used with certain file-manipulation libraries.

The common approach in situations like this is to simply leave any inappropriate methods unimplemented so that trying to access them raises an AttributeError. In other cases, a programmer may decide it’s more useful to implement them but simply raise a NotImplementedError to display a more descriptive message. Just make sure to always document how much of the protocol your object obeys, so users aren’t surprised if these errors occur while trying to use them as standard files, especially in third-party libraries.

Iterables

An object is considered iterable if passing it to the built-in iter() returns an iterator. iter() is often called implicitly, as in a for loop. All lists, tuples and dictionaries are iterable, and any new-style class can be made iterable by defining the following method.

__iter__(self)

This method is called implicitly by iter() and is responsible for returning an iterator that Python can use to retrieve items from the object. The iterator returned is often implied by defining this method as a generator function, described in the upcoming “Generators” section.

>>> class Fibonacci(object):
...     def __init__(self, count):
...         self.count = count
...     def __iter__(self):
...         a, b = 0, 1
...         for x in range(self.count):
...             if x < 2:
...                 yield x
...             else:
...                 c = a + b
...                 yield c
...                 a, b = b, c
...
>>> for x in Fibonacci(5):
...     print(x)
...
0
1
1
2
3
>>> for x in Fibonacci(10):
...     print(x)
...
0
1
1
2
3
5
8
13
21
34

Iterators

When iter() is called with an object, it’s expected to return an iterator, which can then be used to retrieve items for that object in sequence. Iterators are a simple method of one-way travel through the available items, returning just one at a time until there are no more to use. For large collections, accessing items one by one is much more efficient than first gathering them all into a list.

__next__(self)

The only method required for an iterator, this returns a single item. How that item is retrieved will depend on what the iterator is designed for, but it must return just one item. After that item has been processed by whatever code called the iterator, __next__() will be called again to retrieve the next item.

Once there are no more items to be returned, __next__() is also responsible for telling Python to stop using the iterator and to move on after the loop. This is done by raising the StopIteration exception. Python will continue calling __next__() until an exception is raised, so an iterator that never raises one produces an infinite loop. StopIteration should be used to end the loop gracefully; any other exception indicates a more serious problem.

class FibonacciIterator(object):
    def __init__(self, count):
        self.a = 0
        self.b = 1
        self.count = count
        self.current = 0
 
    def __next__(self):
        self.current += 1
        if self.current > self.count:
            raise StopIteration
        if self.current < 3:
            return self.current - 1
        c = self.a + self.b
        self.a = self.b
        self.b = c
        return c
    next = __next__
 
    def __iter__(self):
        # Since it's already an iterator, this can return itself.
        return self
 
class Fibonacci(object):
    def __init__(self, count):
        self.count = count
 
    def __iter__(self):
        return FibonacciIterator(self.count)

Note that iterators don’t explicitly need to define __iter__() in order to be used properly, but including that method allows the iterator to be used directly in loops.

COMPATIBILITY: ITERATORS IN PYTHON 2

There’s only one very minor change to iterators in Python 3. The __next__() method shown here used to be called next(). Note the missing underscores. This was changed to respect Python’s convention of identifying magic methods like this with double underscores before and after the name of the method.

If you need to support Python 2 and 3 together, the solution is fairly simple. After you define __next__() as shown in our Fibonacci example, you can simply assign that method to the name next in the class body: next = __next__. This can be done anywhere inside the class definition, but it’s usually best placed right after the end of the __next__() method, to keep things tidy.

Generators

As illustrated in the Fibonacci examples, generators are a convenient shortcut to create simple iterators without having to define a separate class. Python uses the presence of the yield statement to identify a function as a generator, which makes it behave a bit differently from other functions.

When calling a generator function, Python doesn’t execute any of its code immediately. Instead, it returns an iterator whose next() method will then call the body of the function, up to the point where the first yield statement occurs. The expression given to the yield statement is used as the next() method’s return value, allowing whatever code called the generator to get a value to work with.

The next time next() is called on the iterator, Python continues executing the generator function right where it left off, with all of its variables intact. This repeats as long as Python encounters yield statements, typically with the function using a loop to keep yielding values. Whenever the function finishes without yielding a value, the iterator automatically raises StopIteration to indicate that the loop should be ended and the rest of the code can continue.
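This means the iterator classes from the previous section can be collapsed into a single function. The following generator reproduces the Fibonacci sequence from the earlier examples:

```python
# A generator version of the earlier Fibonacci examples. The yield
# statement makes Python return an iterator instead of running the
# body immediately.
def fibonacci(count):
    a, b = 0, 1
    for x in range(count):
        if x < 2:
            yield x
        else:
            c = a + b
            yield c
            a, b = b, c

print(list(fibonacci(5)))  # [0, 1, 1, 2, 3]
```

When the for loop in the function body finishes, the function returns without yielding, and the iterator raises StopIteration automatically, so the caller’s loop ends cleanly.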

Sequences

While iterables simply describe an object that retrieves one value at a time, these values are often all known in advance and collected on a single object. This is a sequence. The most common types are lists and tuples. As iterables, sequences also use the __iter__() method to return their values one by one, but since these values are also known in advance, some extra features are available.

__len__(self)

With all the values available, sequences have a specific length, which can be determined using the built-in len() function. Behind the scenes, len() checks to see if the object it’s given has a __len__() method and uses that to get the length of the sequence. To accomplish this, __len__() should return an integer containing the number of items in the sequence.

Technically, __len__() doesn’t require that all the values be known in advance, just how many there are. And since there can’t be partial items—an item either exists or it doesn’t—__len__() should always return an integer. If it returns anything else, len() will raise a TypeError.

>>> class FibonacciLength(Fibonacci):
...     def __len__(self):
...         return self.count
...
>>> len(FibonacciLength(10))
10
>>> len(FibonacciLength(2048))
2048

__getitem__(self, index) and __setitem__(self, index, value)

All the values in a sequence are already ordered as well, so it’s possible to access individual values by their index within the sequence. Since the syntax used for this type of access is identical to that of dictionary keys, Python reuses the same two methods that were previously described for dictionaries. This allows a sequence to customize how individual values are accessed or perhaps restrict setting new values to the sequence, making it read-only.
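As an illustration of that last point, here is a sketch of a read-only sequence: indexed reads work through __getitem__, while __setitem__ refuses every assignment. The class name is invented for this example.

```python
# A sequence that supports reading by index but rejects writes.
class ReadOnlySequence(object):
    def __init__(self, *values):
        self._values = list(values)
    def __len__(self):
        return len(self._values)
    def __getitem__(self, index):
        return self._values[index]
    def __setitem__(self, index, value):
        raise TypeError('This sequence is read-only.')

seq = ReadOnlySequence('spam', 'eggs')
print(seq[0])    # 'spam'
print(len(seq))  # 2
try:
    seq[0] = 'ham'
except TypeError as e:
    print(e)     # This sequence is read-only.
```

Tuples behave much the same way in the standard library: they implement indexed access and length but simply refuse item assignment.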

Augmenting Functions

In addition to standard declarations and calls, Python provides options that allow you to invoke functions in interesting ways. Django uses these techniques to help with efficient code reuse. You can use these same techniques in your applications as well; they are standard parts of Python.

Excess Arguments

It’s not always possible to know what arguments will be provided to a function at runtime. This is often the case in Django, where class methods are defined in source even before a subclass itself is customized appropriately. Another common situation is a function that can act on any number of objects. In still other cases, the function call itself can be made into a sort of API for other applications to utilize.

For these situations, Python provides two special ways to define function arguments, which allow the function to accept excess arguments not handled by the explicitly declared arguments. These “extra” arguments are explained next.

Note that the names args and kwargs are merely Python conventions. As with any function argument, you may name them whatever you like, but consistency with standard Python idioms makes your code more accessible to other programmers.

Positional Arguments

Using a single asterisk before an argument name allows the function to accept any number of positional arguments.

>>> def multiply(*args):
...     total = 1
...     for arg in args:
...         total *= arg
...     return total
...
>>> multiply(2, 3)
6
>>> multiply(2, 3, 4, 5, 6)
720

Python collects the arguments into a tuple, which is then accessible as the variable args. If no positional arguments are provided beyond those explicitly declared, this argument will be populated with an empty tuple.

Keyword Arguments

Python uses two asterisks before the argument name to support arbitrary keyword arguments.

>>> def accept(**kwargs):
...     for keyword, value in kwargs.items():
...         print("%s -> %r" % (keyword, value))
...
>>> accept(foo='bar', spam='eggs')
foo -> 'bar'
spam -> 'eggs'

Notice that kwargs is a normal Python dictionary containing the argument names and values. If no extra keyword arguments are provided, kwargs will be an empty dictionary.

Mixing Argument Types

Arbitrary positional and keyword arguments may be used with other standard argument declarations. Mixing them requires some care, as their order is important to Python. Arguments can be classified into four categories, and while not all categories are required, they must be defined in the following order, skipping any that are unused.

  • Required arguments
  • Optional arguments
  • Excess positional arguments
  • Excess keyword arguments

def complex_function(a, b=None, *c, **d):

This order is required because *args and **kwargs only receive values that couldn’t be placed in any other arguments. Without this ordering, when you call a function with positional arguments, Python would be unable to determine which values are intended for the declared arguments and which should be treated as excess positional arguments.

Also note that, while functions can accept any number of required and optional arguments, they may only define one of each of the excess argument types.
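To see how Python distributes values among these four categories, consider calling a function like the one just shown; the return value here simply echoes where each value landed:

```python
def complex_function(a, b=None, *c, **d):
    # a is required, b is optional, c collects excess positional
    # values, and d collects excess keyword values
    return a, b, c, d

print(complex_function(1))            # (1, None, (), {})
print(complex_function(1, 2, 3, 4))   # (1, 2, (3, 4), {})
print(complex_function(1, b=2, x=3))  # (1, 2, (), {'x': 3})
```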

Passing Argument Collections

In addition to functions being able to receive arbitrary collections of arguments, Python code may call functions with any number of arguments, using the asterisk notation previously described. Arguments passed in this way are expanded by Python into a normal list of arguments, so that the function being called doesn’t need to plan for excess arguments in order to be called like this. Any Python callable may be called using this notation, and it may be combined with standard arguments using the same ordering rules.

>>> def add(a, b, c):
...     return a + b + c
...
>>> add(1, 2, 3)
6
>>> add(a=4, b=5, c=6)
15
>>> args = (2, 3)
>>> add(1, *args)
6
>>> kwargs = {'b': 8, 'c': 9}
>>> add(a=7, **kwargs)
24
>>> add(a=7, *args)
Traceback (most recent call last):
  ...
TypeError: add() got multiple values for keyword argument 'a'
>>> add(1, 2, a=7)
Traceback (most recent call last):
  ...
TypeError: add() got multiple values for keyword argument 'a'

As illustrated in the final lines of this example, take special care if explicitly passing any keyword arguments while also passing a tuple as excess positional arguments. Since Python expands the excess arguments using the ordering rules, the positional arguments come first. In the example, each of the last two calls ends up supplying two values for a, so Python can’t determine which value to use and raises a TypeError.

Decorators

Another common way to alter the way a function behaves is to “decorate” it with another function. This is also often called “wrapping” a function, as decorators are designed to execute additional code before or after the original function gets called.

The key principle behind decorators is that they accept callables and return new callables. The function returned by the decorator is the one that will be executed when the decorated function is called later. Care must be taken to make sure that the original function isn’t lost in the process, as there wouldn’t be any way to get it back without reloading the module.

Decorators can be applied in a number of ways, either to a function you’re defining directly or to a function that was defined elsewhere. As of Python 2.4, decorators on newly defined functions can use a special syntax. In previous versions of Python, a slightly different syntax is necessary, but the same code can be used in both cases; the only difference is the syntax used to apply the decorator to the intended function.

>>> def decorate(func):
...     print('Decorating %s...' % func.__name__)
...     def wrapped(*args, **kwargs):
...         print("Called wrapped function with args:", args)
...         return func(*args, **kwargs)
...     print('done!')
...     return wrapped
...
 
# Syntax for Python 2.4 and higher
 
>>> @decorate
... def test(a, b):
...     return a + b
...
Decorating test...
done!
>>> test(13, 72)
Called wrapped function with args: (13, 72)
85
 
# Syntax for Python 2.3
 
>>> def test(a, b):
...     return a + b
...
>>> test = decorate(test)
Decorating test...
done!
>>> test(13, 72)
Called wrapped function with args: (13, 72)
85

The older syntax in this example is another technique for decorating functions, which can be used in situations where the @ syntax isn’t available. Consider a function that’s been declared elsewhere but would benefit from being decorated. Such a function can be passed to a decorator, which then returns a new function with everything all wrapped up. Using this technique, any callable, regardless of where it comes from or what it does, can be wrapped in any decorator.

Decorating with Extra Arguments

Sometimes, a decorator needs additional information to determine what it should do with the function it receives. Using the older decorator syntax, or when decorating arbitrary functions, this task is fairly easy to perform. Simply declare the decorator to accept additional arguments for the required information so they can be supplied along with the function to be wrapped.

>>> def test(a, b):
...     return a + b
...
>>> def decorate(func, prefix='Decorated'):
...     def wrapped(*args, **kwargs):
...         return '%s: %s' % (prefix, func(*args, **kwargs))
...     return wrapped
...
>>> simple = decorate(test)
>>> customized = decorate(test, prefix='Custom')
>>> simple(30, 5)
'Decorated: 35'
>>> customized(27, 15)
'Custom: 42'

However, the Python 2.4 decorator syntax complicates things. When using this new syntax, the decorator always receives just one argument: the function to be wrapped. There is a way to get extra arguments into decorators, but first we’ll need to digress a bit and talk about “partials.”

Partial Application of Functions

Typically, functions are called with all the necessary arguments at the time the function should be executed. Sometimes, however, arguments may be known in advance, long before the function will be called. In these cases, a function can have one or more of its arguments applied beforehand so that the function can be called with fewer arguments.

For this purpose, Python 2.5 includes the partial object as part of its functools module. It accepts a callable along with any number of additional arguments and returns a new callable, which behaves just like the original, except that the preloaded arguments no longer need to be supplied when it’s called.

>>> import functools
>>> def add(a, b):
...     return a + b
...
>>> add(4, 2)
6
>>> plus3 = functools.partial(add, 3)
>>> plus5 = functools.partial(add, 5)
>>> plus3(4)
7
>>> plus3(7)
10
>>> plus5(10)
15

For versions of Python older than 2.5, Django provides its own implementation of partial in the curry function, which lives in django.utils.functional. This function works on Python 2.3 and greater.
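The behavior behind partial and curry is straightforward to sketch in plain Python. The following simple_partial is an illustrative stand-in, not Django’s actual implementation:

```python
def simple_partial(func, *args, **kwargs):
    """A simplified sketch of functools.partial."""
    def inner(*more_args, **more_kwargs):
        # Preloaded positional arguments come first, followed by any
        # supplied later; later keyword arguments override preloaded ones
        final_kwargs = dict(kwargs, **more_kwargs)
        return func(*(args + more_args), **final_kwargs)
    return inner

def add(a, b):
    return a + b

plus3 = simple_partial(add, 3)
print(plus3(4))  # 7
```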

Back to the Decorator Problem

As mentioned previously, decorators using the Python 2.4 syntax present a problem if they accept additional arguments, since that syntax only provides a single argument on its own. Using the partial application technique, it’s possible to preload arguments even on a decorator. Given the decorator described earlier, the following example uses curry (described in Chapter 9) to provide arguments for decorators using the newer Python 2.4 syntax.

>>> from django.utils.functional import curry
>>> @curry(decorate, prefix='Curried')
... def test(a, b):
...     return a + b
...
>>> test(30, 5)
'Curried: 35'
>>> test(27, 15)
'Curried: 42'

This is still rather inconvenient, since the function needs to be run through curry every time it’s used to decorate another function. A better way would be to supply this functionality directly in the decorator itself. This requires some extra code on the part of the decorator, but including that code makes it easier to use.

The trick is to define the decorator inside another function, which will accept the arguments. This new outer function then returns the decorator, which is then used by Python’s standard decorator handling. The decorator, in turn, returns a function that will be used by the rest of the program after the decoration process is complete.

As this is all fairly abstract, consider the following, which provides the same functionality as in previous examples but without relying on curry, making it easier to deal with.

>>> def decorate(prefix='Decorated'):
...     # The prefix passed in here will be
...     # available to all the inner functions
...     def decorator(func):
...         # This is called with func being the
...         # actual function being decorated
...         def wrapper(*args, **kwargs):
...             # This will be called each time
...             # the real function is executed
...             return '%s: %s' % (prefix, func(*args, **kwargs))
...         # Send the wrapped function
...         return wrapper
...     # Provide the decorator for Python to use
...     return decorator
...
>>> @decorate('Easy')
... def test(a, b):
...     return a + b
...
>>> test(13, 17)
'Easy: 30'
>>> test(89, 121)
'Easy: 210'

This technique makes the most sense in situations where arguments are expected. Even if the decorator is applied without any arguments, the parentheses are still required for it to work properly.

>>> @decorate()
... def test(a, b):
...     return a + b
...
>>> test(13, 17)
'Decorated: 30'
>>> test(89, 121)
'Decorated: 210'
>>> @decorate
... def test(a, b):
...     return a + b
...
>>> test(13, 17)
Traceback (most recent call last):
  ...
TypeError: decorator() takes exactly 1 argument (2 given)

The second example fails because we didn’t first call decorate. Instead, decorate received test directly and returned the inner decorator function, so all subsequent calls to test send their arguments to decorator instead of test. Since decorator accepts exactly one argument, Python raises a TypeError. This situation can be a bit difficult to debug, because the exact exception raised depends on how the improperly decorated function ends up being called.

A Decorator with or without Arguments

One other option for decorators is to provide a single decorator that can function in both of the previous situations: with arguments and without. This is more complex but worth exploring.

The goal is to allow the decorator to be called with or without arguments so it’s safe to assume that all arguments are optional; any decorator with required arguments can’t use this technique. With that in mind, the basic idea is to add an extra optional argument at the beginning of the list, which will receive the function to be decorated. Then, the decorator structure includes the necessary logic to determine whether it’s being called to add arguments or to decorate the target function.

>>> def decorate(func=None, prefix='Decorated'):
...     def decorated(func):
...         # This returns the final, decorated
...         # function, regardless of how it was called
...         def wrapper(*args, **kwargs):
...             return '%s: %s' % (prefix, func(*args, **kwargs))
...         return wrapper
...     if func is None:
...         # The decorator was called with arguments
...         def decorator(func):
...             return decorated(func)
...         return decorator
...     # The decorator was called without arguments
...     return decorated(func)
...
>>> @decorate
... def test(a, b):
...     return a + b
...
>>> test(13, 17)
'Decorated: 30'
>>> @decorate(prefix='Arguments')
... def test(a, b):
...     return a + b
...
>>> test(13, 17)
'Arguments: 30'

This requires that all arguments passed to the decorator be passed as keyword arguments, which generally makes for more readable code. One downside is how much boilerplate would have to be repeated for each decorator that uses this approach.

Thankfully, like most boilerplate in Python, it’s possible to factor it out into a reusable form, so new decorators can be defined more easily, using yet another decorator. The following function can be used to decorate other functions, providing all the functionality necessary to accept arguments, or it can be used without them.

>>> def optional_arguments_decorator(real_decorator):
...     def decorator(func=None, **kwargs):
...         # This is the decorator that will be
...         # exposed to the rest of your program
...         def decorated(func):
...             # This returns the final, decorated
...             # function, regardless of how it was called
...             def wrapper(*a, **kw):
...                 return real_decorator(func, a, kw, **kwargs)
...             return wrapper
...         if func is None:
...             # The decorator was called with arguments
...             def decorator(func):
...                 return decorated(func)
...             return decorator
...         # The decorator was called without arguments
...         return decorated(func)
...     return decorator
...
>>> @optional_arguments_decorator
... def decorate(func, args, kwargs, prefix='Decorated'):
...     return '%s: %s' % (prefix, func(*args, **kwargs))
...
>>> @decorate
... def test(a, b):
...     return a + b
...
>>> test(13, 17)
'Decorated: 30'
>>> test = decorate(test, prefix='Decorated again')
>>> test(13, 17)
'Decorated again: Decorated: 30'

This makes the definition of individual decorators much simpler and more straightforward. The resulting decorator behaves exactly like the one in the previous example, but it can be used with or without arguments. The most notable change that this new technique requires is that the real decorator being defined will receive the following three values:

  • func—The function that was decorated using the newly generated decorator
  • args—A tuple containing positional arguments that were passed to the function
  • kwargs—A dictionary containing keyword arguments that were passed to the function

An important thing to realize, however, is that the args and kwargs that the decorator receives are passed as positional arguments, without the usual asterisk notation. Then, when passing them on to the wrapped function, the asterisk notation must be used to make sure the function receives them without having to know about how the decorator works.

Descriptors

Ordinarily, referencing an attribute on an object accesses the attribute’s value directly, without any complications. Getting and setting attributes directly affects the value in the object’s instance namespace. Sometimes, though, additional work has to be done when accessing these values, such as:

  • Retrieving data from a complicated source, such as a database or configuration file
  • Transforming a simple value to a complicated object or data structure
  • Customizing a value for the object it’s attached to
  • Converting a value to a storage-ready format before saving to a database

In some programming languages, this type of behavior is made possible by creating extra instance methods for accessing those attributes that need it. While functional, this approach leads to a few problems. For starters, these behaviors are typically more associated with the type of data stored in the attribute than some aspect of the instance it’s attached to. By requiring that the object supply additional methods for accessing this data, every object that contains this behavior will have to provide the necessary code in its instance methods.

One other significant issue is what happens when an attribute that used to be simple suddenly needs this more advanced behavior. When changing from a simple attribute to a method, all references to that attribute also need to be changed. To avoid this, programmers in these languages have adopted a standard practice of always creating methods for attribute access so that any changes to the underlying implementation won’t affect any existing code.

It’s never fun to touch that much of your code for a change to how one attribute is accessed, so Python provides a different approach to the problem. Rather than requiring the object to be responsible for special access to its attributes, the attributes themselves can provide this behavior. Descriptors are a special type of object that, when attached to a class, can intervene when the attribute is accessed, providing any necessary additional behavior.

>>> import datetime
>>> class CurrentDate(object):
...     def __get__(self, instance, owner):
...         return datetime.date.today()
...     def __set__(self, instance, value):
...         raise NotImplementedError("Can't change the current date.")
...
>>> class Example(object):
...     date = CurrentDate()
...
>>> e = Example()
>>> e.date
datetime.date(2008, 11, 24)
>>> e.date = datetime.date.today()
Traceback (most recent call last):
  ...
NotImplementedError: Can't change the current date.

Creating a descriptor is as simple as creating a standard new-style class (by inheriting from object under Python 2.x), and specifying at least one of the following methods. The descriptor class can include any other attributes or methods as necessary to perform the tasks it’s responsible for, while the following methods constitute a kind of protocol that enables this special behavior.

__get__(self, instance, owner)

When retrieving the value of an attribute (value = obj.attr), this method will be called instead, allowing the descriptor to do some extra work before returning the value. In addition to the usual self representing the descriptor object, this getter method receives two arguments.

  • instance—The instance object containing the attribute that was referenced. If the attribute was referenced as an attribute of a class rather than an instance, this will be None.
  • owner—The class where the descriptor was assigned. This will always be a class object.

The instance argument can be used to determine whether the descriptor was accessed from an object or its class. If instance is None, the attribute was accessed from the class rather than an instance. This can be used to raise an exception if the descriptor is being accessed in a way that it shouldn’t.
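As a sketch, a descriptor could use that check to insist on being accessed through an instance; the InstanceOnly class here is hypothetical:

```python
class InstanceOnly(object):
    """A descriptor that refuses to be accessed on the class itself."""
    def __get__(self, instance, owner):
        if instance is None:
            # Accessed as Example.attr rather than through an instance
            raise AttributeError('This attribute is only available on instances.')
        return 'instance value'

class Example(object):
    attr = InstanceOnly()

print(Example().attr)  # 'instance value'
```

Accessing Example.attr directly on the class raises the AttributeError instead.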

Also, by defining this method, you make the descriptor responsible for retrieving and returning a value to the code that requested it. If __get__ doesn’t explicitly return a value, the attribute access will simply produce None, a function’s default return value.

Note that, by default, descriptors don’t know what name they were given when declared as attributes. Django models provide a way to get around this, which is described in Chapter 3, but apart from that, descriptors only know about their data, not their names.

__set__(self, instance, value)

When setting a value to a descriptor (obj.attr = value), this method is called so that a more specialized process can take place. Like __get__, this method receives two arguments in addition to the standard self.

  • instance—The instance object containing the attribute that was referenced. This will never be None.
  • value—The value being assigned.

Also note that the __set__ method of descriptors will only be called when the attribute is assigned on an object and will never be called when assigning the attribute on the class where the descriptor was first assigned. This behavior is by design, and prohibits the descriptor from taking complete control over its access. External code can still replace the descriptor by assigning a value to the class where it was first assigned.

Also note that the return value from __set__ is irrelevant. The method itself is solely responsible for storing the supplied value appropriately.
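The following sketch demonstrates both behaviors: assignment through an instance is routed through __set__, while assignment on the class simply replaces the descriptor object.

```python
class Protected(object):
    """A descriptor that rejects assignment through instances."""
    def __get__(self, instance, owner):
        return 'managed value'
    def __set__(self, instance, value):
        raise AttributeError("This attribute can't be changed.")

class Example(object):
    attr = Protected()

e = Example()
print(e.attr)  # 'managed value'
try:
    e.attr = 5  # routed through __set__, which raises
except AttributeError:
    print('Blocked by __set__')
Example.attr = 5  # replaces the descriptor itself; __set__ never runs
print(e.attr)     # 5
```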

Keeping Track of Instance Data

Since descriptors short-circuit attribute access, you need to take care when setting values on the attached object. You can’t simply set the value on the object using setattr; attempting to do so will call the descriptor again, resulting in infinite recursion.

Python provides another way to access an object’s namespace: the __dict__ attribute. Available on all Python objects, __dict__ is a dictionary representing all values in the object’s namespace. Accessing this dictionary directly bypasses all of Python’s standard handling with regard to attributes, including descriptors. Using this, a descriptor can set a value on an object without triggering itself. Consider the following example.

>>> class Descriptor(object):
...     def __init__(self, name):
...         self.name = name
...     def __get__(self, instance, owner):
...         return instance.__dict__[self.name]
...     def __set__(self, instance, value):
...         instance.__dict__[self.name] = value
...
>>> class TestObject(object):
...     attr = Descriptor('attr')
...
>>> test = TestObject()
>>> test.attr = 6
>>> test.attr
6

Unfortunately, this technique requires giving the attribute’s name to the descriptor explicitly. You can work around this with some metaclass tricks; Django’s model system (discussed in Chapter 3) shows one possible workaround.
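One possible workaround, sketched here with an illustrative NamingMeta metaclass (not Django’s actual code), is to have the metaclass scan the class namespace and hand each descriptor the name it was assigned to:

```python
class NamedDescriptor(object):
    """Stores values in the instance namespace under its assigned name."""
    name = None  # filled in by the metaclass below

    def __get__(self, instance, owner):
        return instance.__dict__[self.name]

    def __set__(self, instance, value):
        instance.__dict__[self.name] = value

class NamingMeta(type):
    def __init__(cls, name, bases, attrs):
        # Tell each descriptor which attribute name it was given
        for attr_name, value in attrs.items():
            if isinstance(value, NamedDescriptor):
                value.name = attr_name
        super(NamingMeta, cls).__init__(name, bases, attrs)

class TestObject(metaclass=NamingMeta):
    attr = NamedDescriptor()

test = TestObject()
test.attr = 6
print(test.attr)  # 6
```

For what it’s worth, later versions of Python added a __set_name__() hook on descriptors that solves the same problem without a metaclass.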

Introspection

Many Python objects carry metadata beyond the code they execute. This information can be quite useful when working with a framework or writing your own.

Python’s introspection tools can help greatly when trying to develop reusable applications, as they allow Python code to retrieve information about what a programmer wrote without requiring the programmer to write it all over again.

Some of the features described in this section rely on a powerful standard library module, inspect. The inspect module provides convenient functions to perform advanced introspection.

Only some of inspect’s many uses will be detailed here, as they hold the most value to applications written using Django. For full details of the many other options available in this module, consult the Python Standard Library documentation.5

MORE ON OLD-STYLE CLASSES

The examples shown in this section are all for new-style classes, which, as described earlier in this chapter, behave differently from old-style classes, especially with regard to introspection. The exact differences are beyond the scope of this book, since the usual recommendation is to simply use new-style classes.

If any of your code seems to behave differently than what’s described here, make sure that all your classes inherit from object, which will make them proper new-style classes.

Common Class and Function Attributes

All classes and functions provide a few common attributes that can be used to identify them.

  • __name__—The name that was used to declare the class or function
  • __doc__—The docstring that was declared for the function
  • __module__—The import path of the module where the class or function was declared

In addition, all objects contain a special attribute, __class__, which is the actual class object used to create the object. This attribute can be used for a variety of purposes, such as testing to see whether the class provided a particular attribute or if it was set on the object itself.

>>> class ValueClass(object):
...     source = 'The class'
...
>>> value_instance = ValueClass()
>>> value_instance.source = 'The instance'
>>> value_instance.__class__
<class '__main__.ValueClass'>
>>> value_instance.source
'The instance'
>>> value_instance.__class__.source
'The class'

Identifying Object Types

Since Python uses dynamic typing, any variable could be an object of any available type. While the common principle of duck typing recommends that objects simply be tested for support of a particular protocol, it’s often useful to identify what type of object you’re dealing with. There are a few ways to handle this.

Getting Arbitrary Object Types

It’s easy to determine the type of any Python object using the built-in type described earlier. Calling type with a single argument will return a type object, often a class, which was instantiated to produce the object.

>>> type('this is a string')
<type 'str'>
>>> type(42)
<type 'int'>
>>> class TestClass(object):
...     pass
...
>>> type(TestClass)
<type 'type'>
>>> obj = TestClass()
>>> type(obj)
<class '__main__.TestClass'>

This approach usually isn’t the best way to determine the type of an object, particularly if you’re trying to decide which branch of execution to follow based on an object’s type. It only tells you the one specific class that was used, even though subclasses should likely be considered for the same branch of execution. Instead, this approach is best used in situations where the object’s type isn’t driving a decision but rather is being output somewhere, perhaps to the user or to a log file.

For example, when reporting exceptions, it’s quite useful to include the exception’s type along with its value. In these situations, type can be used to return the class object, and its __name__ attribute can then be included in the log, easily identifying the exception’s type.
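A minimal sketch of that reporting pattern; the format_exception helper here is hypothetical:

```python
def format_exception(error):
    """Formats an exception's type name alongside its message, as for a log."""
    return '%s: %s' % (type(error).__name__, error)

try:
    int('not a number')
except ValueError as error:
    print(format_exception(error))
```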

Checking for Specific Types

More often, you’ll need to check for the influence of a particular type, whether a class descends from it or whether an object is an instance of it. This is a much more robust solution than using type, as it takes class inheritance into account when determining success or failure.

Python provides two built-in functions for this purpose.

  • issubclass(cls, base)—Returns True if cls and base are the same, or if cls inherits from base somewhere in its ancestry
  • isinstance(obj, base)—Tests if the object is an instance of base or any of its ancestors

>>> class CustomDict(dict):
...     pass    # Pretend there's something more useful here
...
>>> issubclass(CustomDict, dict)
True
>>> issubclass(CustomDict, CustomDict)
True
>>> my_dict = CustomDict()
>>> isinstance(my_dict, dict)
True
>>> isinstance(my_dict, CustomDict)
True

There’s a clear relationship between issubclass and isinstance: isinstance(obj, SomeClass) is equivalent to issubclass(obj.__class__, SomeClass).

Function Signatures

As described earlier in this chapter, Python functions can be declared in a number of ways, and it can be quite useful to have access to information about their declarations directly inside your code.

Of particular importance when inspecting functions is inspect.getargspec(), a function that returns information about what arguments a function accepts. It accepts a single argument, the function object to be inspected, and returns a tuple of the following values:

  • args—A list of all argument names specified for the function. If the function doesn’t accept any arguments, this will be an empty list.
  • varargs—The name of the variable used for excess positional arguments, as described previously. If the function doesn’t accept excess positional arguments, this will be None.
  • varkwargs—The name of the variable used for excess keyword arguments, as described previously. If the function doesn’t accept excess keyword arguments, this will be None. (In the named tuple returned by getargspec(), this field is called keywords, as the example below shows.)
  • defaults—A tuple of all default values specified for the function’s arguments. If none of the arguments specify a default value, this will be None rather than an empty tuple.

Together, these values represent everything necessary to know how to call the function in any way possible. This can be useful when receiving a function and calling it with just the arguments that are appropriate for it.

>>> def test(a, b, c=True, d=False, *e, **f):
...     pass
...
>>> import inspect
>>> inspect.getargspec(test)
ArgSpec(args=['a', 'b', 'c', 'd'], varargs='e', keywords='f', defaults=(True, False))
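For instance, a utility could call an arbitrary function with only the keyword arguments that function actually declares. This call_with_supported helper is illustrative, and uses inspect.getfullargspec(), which replaced getargspec() in later versions of Python:

```python
import inspect

def call_with_supported(func, **kwargs):
    """Calls func with only the keyword arguments it declares."""
    spec = inspect.getfullargspec(func)
    if spec.varkw is not None:
        # The function accepts arbitrary keywords, so pass everything
        return func(**kwargs)
    # Otherwise, keep only the keywords matching declared argument names
    supported = dict((k, v) for k, v in kwargs.items() if k in spec.args)
    return func(**supported)

def greet(name, greeting='Hello'):
    return '%s, %s!' % (greeting, name)

print(call_with_supported(greet, name='World', flavor='spam'))  # 'Hello, World!'
```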

Handling Default Values

As the previous example illustrates, default values are returned in a separate list from argument names, so it may not seem obvious how to tell which arguments specify which defaults. However, there’s a relatively simple way to handle this situation, based on a minor detail from the earlier discussion of excess arguments: required arguments must always be declared before any optional arguments.

This is key because it means the arguments and their defaults are specified in the order they were declared in the function. So in the previous example, the fact that there are two default values means that the last two arguments are optional, and the defaults line up with them in order. The following code could be used to create a dictionary mapping the optional argument names to the default values declared for them.

>>> def get_defaults(func):
...     args, varargs, varkwargs, defaults = inspect.getargspec(func)
...     defaults = defaults or ()  # defaults is None when no argument has one
...     index = len(args) - len(defaults)  # Index of the first optional argument
...     return dict(zip(args[index:], defaults))
...
>>> get_defaults(test)
{'c': True, 'd': False}

Docstrings

As mentioned previously, classes and functions all have a special __doc__ attribute, which contains the actual string specified as the code’s docstring. Unfortunately, this is formatted exactly as it was in the original source file, including extra line breaks and unnecessary indentation.

To format docstrings in a more readable manner, Python’s inspect module provides another useful function, getdoc(). It removes unnecessary line breaks, as well as any extra indentation that was a side effect of where the docstring was written.

The removal of indentation merits a bit of explanation. Essentially, getdoc() finds the leftmost non-whitespace character in the string, counts up all the whitespace between that character and the start of the line it’s in, and removes that amount of whitespace from all the other lines in the docstring. This way, the resulting string is left-justified but retains any additional indents that exist for the sake of formatting the documentation.

>>> def func(arg):
...     """
...     Performs a function on an argument and returns the result.
...
...     arg
...         The argument to be processed
...     """
...     pass
...
>>> print(func.__doc__)
 
    Performs a function on an argument and returns the result.
 
    arg
        The argument to be processed
 
>>> print(inspect.getdoc(func))
Performs a function on an argument and returns the result.
 
arg
    The argument to be processed

In situations where docstrings should be displayed to users, such as automated documentation or help systems, getdoc() provides a useful alternative to the raw docstring.

Applied Techniques

There are innumerable combinations of Python features that can be used to accomplish a vast multitude of tasks, so the few shown here should by no means be considered an exhaustive list of what can be done by combining the many features of Python. However, these are useful tactics in terms of Django, and serve as a solid basis for the other techniques listed throughout this book.

Tracking Subclasses

Consider an application that must, at any given time, have access to a list of all subclasses of a particular class. Metaclasses are a terrific way to go about this, but they have one problem. Remember, every class that uses the metaclass will be processed, including the new base class itself, which doesn’t need to be registered (only its subclasses should be). This requires some extra handling, but it’s fairly straightforward:

>>> class SubclassTracker(type):
...     def __init__(cls, name, bases, attrs):
...         try:
...             if TrackedClass not in bases:
...                 return
...         except NameError:
...             return
...         TrackedClass._registry.append(cls)
...
>>> class TrackedClass(metaclass=SubclassTracker):
...     _registry = []
...
>>> class ClassOne(TrackedClass):
...     pass
...
>>> TrackedClass._registry
[<class '__main__.ClassOne'>]
>>> class ClassTwo(TrackedClass):
...     pass
...
>>> TrackedClass._registry
[<class '__main__.ClassOne'>, <class '__main__.ClassTwo'>]

The metaclass performs two functions. First, the try block makes sure that the parent class, TrackedClass, has already been defined. If it hasn’t been, a NameError is raised, indicating that the metaclass is currently processing TrackedClass itself. Here, more processing could be done for TrackedClass, but the example simply ignores it, allowing it to bypass the registration.

In addition, the if clause makes sure that another class hasn’t specified SubclassTracker explicitly as its metaclass option. The application only wants to register subclasses of TrackedClass, not other classes that might not fit the proper requirements for the application.

Any application author who wants to use a declarative syntax similar to Django’s could use this technique to provide a common base class, from which specific classes can be created. Django uses this process for both its models and its forms so that its declarative syntax can be fairly consistent throughout the framework.

If Python makes it through those tests without bailing out early, the class is added to the registry, where all subclasses of TrackedClass can be retrieved at any time. Any subclasses of TrackedClass will show up in this registry, regardless of where the subclass is defined. Executing the class definition will be sufficient to register it; that way, the application can import any modules that might have the necessary classes and the metaclass does the rest.

Though its registry provides many more features than a simple list, Django uses an extension of this technique to register models, since they must each extend a common base class.
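To see how such a registry might be consumed, here’s a brief sketch that reuses the SubclassTracker metaclass from above. The Importer, Exporter and run_all() names are invented for illustration; they aren’t part of Django or the example application.

```python
class SubclassTracker(type):
    def __init__(cls, name, bases, attrs):
        try:
            # Skip any class that isn't a direct subclass of TrackedClass
            if TrackedClass not in bases:
                return
        except NameError:
            # TrackedClass itself is being processed; don't register it
            return
        TrackedClass._registry.append(cls)

class TrackedClass(metaclass=SubclassTracker):
    _registry = []

class Importer(TrackedClass):
    def run(self):
        return 'importing'

class Exporter(TrackedClass):
    def run(self):
        return 'exporting'

def run_all():
    # Every subclass defined anywhere is available here, with no
    # explicit registration call at the definition site
    return [cls().run() for cls in TrackedClass._registry]

print(run_all())
```

The registry preserves definition order, so subclasses are processed in the order their modules were executed.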

A Simple Plugin Architecture

In reusable applications, it’s usually desirable to have a well-defined core set of features, combined with the ability to extend those features through the use of plugins. While this may seem like a tall order requiring an extensive plugin framework, it can be done quite simply and entirely in your own code. After all, a successful, loosely coupled plugin architecture comes down to providing just three things:

  • A clear, readable way to declare a plugin and make it available to code that needs to use it
  • A simple way to access all the plugins that have been declared
  • A way to define a neutral point between plugins and the code that uses them, where the plugins should be registered and accessed

Armed with this simple list of requirements and a healthy understanding of what Python has to offer, a few simple lines of code can combine to fulfill these requirements.

class PluginMount(type):
    def __init__(cls, name, bases, attrs):
        if not hasattr(cls, 'plugins'):
            # This branch only executes when processing the mount point itself.
            # So, since this is a new plugin type, not an implementation, this
            # class shouldn't be registered as a plugin. Instead, it sets up a
            # list where plugins can be registered later.
            cls.plugins = []
        else:
            # This must be a plugin implementation, which should be registered.
            # Simply appending it to the list is all that's needed to keep
            # track of it later.
            cls.plugins.append(cls)

That’s all it takes to get the whole thing working, keeping track of registered plugins and storing them in a list on the plugins attribute. All that’s left is to work out how to achieve each of the points listed earlier. For the following examples, we’ll create an application for validating the strength of a user’s password.

The first step will be the neutral access point, which I’ll call a mount point, from which each side of the equation can access the other. As mentioned before, this relies on metaclasses, so that’s a good place to start.

class PasswordValidator(metaclass=PluginMount):
    """
    Plugins extending this class will be used to validate passwords.
    Valid plugins must provide the following method.
 
    validate(self, password)
        Receives a password to test, and either finishes silently or raises a
        ValueError if the password was invalid. The exception may be displayed
        to the user, so make sure it adequately describes what's wrong.
    """

You could add more to this if you want, but what’s here is the only part that’s essential to get the process working properly. When looking to add more to it, just know that individual plugins will subclass it and will thus inherit anything else you define on this class. It’s a handy way of providing additional attributes or helper methods that would be useful for all the plugins to have available. Individual plugins can override them anyway, so nothing would be set in stone.

Also note that the plugin mount point should contain documentation relating to how plugins will be expected to behave. While this isn’t expressly required, it’s a good practice to get into, as doing so will make it easier for others to implement plugins. The system only works if all the registered plugins conform to a specified protocol; make sure it’s specified.

Next, set up your code to access any plugins that were registered, using them in whatever way makes sense for the application. Since the mount point already maintains its own list of known plugins, all it takes is to cycle through the plugins and use whatever attributes or methods are appropriate for the task at hand.

def is_valid_password(password):
    """
    Returns True if the password was fine, False if there was a problem.
    """
    for plugin in PasswordValidator.plugins:
        try:
            plugin().validate(password)
        except ValueError:
            return False
    return True
 
def get_password_errors(password):
    """
    Returns a list of messages indicating any problems that were found
    with the password. If it was fine, this returns an empty list.
    """
    errors = []
    for plugin in PasswordValidator.plugins:
        try:
            plugin().validate(password)
        except ValueError as e:
            errors.append(str(e))
    return errors

These examples are a bit more complicated than most, since they require error handling, but it’s still a very simple process. Simply iterating over the list will provide each of the plugins for use. All that’s left is to build some plugins to provide this validation behavior.

class MinimumLength(PasswordValidator):
    def validate(self, password):
        "Raises ValueError if the password is too short."
        if len(password) < 6:
            raise ValueError('Passwords must be at least 6 characters.')
 
class SpecialCharacters(PasswordValidator):
    def validate(self, password):
        "Raises ValueError if the password doesn't contain any special characters."
        if password.isalnum():
            raise ValueError('Passwords must contain at least one special character.')

Yes, it really is that easy! Here’s how these plugins would look in practice.

>>> for password in ('pass', 'password', 'p@ssword!'):
...     print(('Checking %r...' % password), end=' ')
...     if is_valid_password(password):
...         print('valid!')
...     else:
...         print()  # Force a new line
...         for error in get_password_errors(password):
...             print('  %s' % error)
...
Checking 'pass'...
  Passwords must be at least 6 characters.
  Passwords must contain at least one special character.
Checking 'password'...
  Passwords must contain at least one special character.
Checking 'p@ssword!'... valid!
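One consequence worth noting: because registration happens while the class statement is being executed, a new validator can be defined anywhere, at any time, without touching the core code. Here’s a condensed sketch of that, repeating the mount point from above; the NoWhitespace rule is an invented example, not one of the validators shown earlier.

```python
class PluginMount(type):
    def __init__(cls, name, bases, attrs):
        if not hasattr(cls, 'plugins'):
            # Mount point itself: just set up an empty registry
            cls.plugins = []
        else:
            # Plugin implementation: register it
            cls.plugins.append(cls)

class PasswordValidator(metaclass=PluginMount):
    """Plugins extending this class will be used to validate passwords."""

# Defining this class anywhere is enough to register it; no other code changes
class NoWhitespace(PasswordValidator):
    def validate(self, password):
        "Raises ValueError if the password contains whitespace."
        if any(c.isspace() for c in password):
            raise ValueError('Passwords may not contain whitespace.')

def get_password_errors(password):
    errors = []
    for plugin in PasswordValidator.plugins:
        try:
            plugin().validate(password)
        except ValueError as e:
            errors.append(str(e))
    return errors

print(get_password_errors('bad pass'))
```

As soon as the module containing NoWhitespace is imported, is_valid_password() and get_password_errors() pick it up automatically, which is exactly the loose coupling the three requirements called for.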

Now What?

With a solid understanding of what Python has to offer, you’re ready to dive into some of the ways Django uses these tools for many of its features and how you can apply the same techniques in your own code. Forming the foundation of most Django applications, models make use of many of these advanced Python features.

