CHAPTER 11

Enhancing Applications

Once a site has a set of basic applications in working order, the next step is to add more advanced functionality to complement the existing behavior. This can sometimes be a matter of simply adding more applications, each providing new features for users and employees alike. Other times, there are ways of enhancing your existing applications so they grow new features directly, without a separate application that can stand on its own.

These “meta-applications” or “sub-frameworks” are built with the goal of easily integrating into an existing application, using hooks that are already provided. This book has illustrated many such hooks, and they can be used in combination to great effect. It’s often possible to write a tool that performs a lot of tasks but only requires adding a single line of code to an existing application.

Adding an API

These days, most sites have an API to allow programmers to interact with the site's content and functionality without requiring a user or even a web browser. The goal is to give code a simple way to work with your data, using structured, reliable techniques. Django's class-based views provide a number of ways to customize a view's behavior, which can be quite useful for generating an API without having to write a lot of new code yourself.

Building an API requires a few decisions, but not all of them need to be made right away. Here’s a small sample, showing some of the common questions that need to be answered when designing an API.

  • What format should be used to transfer the data?
  • What types of data should be exposed?
  • How should that data be structured?
  • Do users need to be authenticated to access data?
  • Can users customize which data they retrieve?
  • Can users modify data through the API?
  • Are there separate permissions for different API endpoints?

This chapter will answer some of those questions, within the context of the real estate site outlined in Chapter 10. Better yet, you’ll see an example of a simple framework that can add the necessary API features without requiring you to add much directly to your apps. Reusability is key to the long-term success of features like these, so it’s ideal to produce a configurable tool for such tasks.

Caution  Not all of these questions will be addressed in this chapter. In particular, the examples shown throughout the chapter don’t use any authentication or authorization whatsoever. Django’s standard session-based authentication isn’t well suited for use with an API, but you have several choices. You could simply let your web server handle authentication, implement a full OAuth provider, or use some other approach that makes sense for your site. Those decisions and associated instructions are outside the scope of this book.

Serializing Data

A good first place to start is establishing a format for your data to use. These days, the de facto standard is JSON, JavaScript Object Notation. It originated as an easy way to use data inside of a browser, because browsers natively understand JavaScript objects, but it has since come into its own as a simple, readable and reliable cross-platform data format. Python has its own tools for reading and writing it directly, as do many other programming languages.

In fact, Django even has its own tools for writing model instances using JSON and reading them back in again. Because it takes an in-memory object and converts it to a sequence of characters that can be sent over a network, this process is called serialization. Django’s serialization tools are located in django.core.serializers. Getting a JSON serializer is fairly simple, using the get_serializer() function.

To get a serializer, just pass in the name of the serialization method you’d like to use and you’ll get a class that can be used to serialize objects. Django supports three serialization formats.

  • json—JavaScript Object Notation
  • xml—Extensible Markup Language
  • yaml—YAML Ain’t a Markup Language, which is available if you have PyYAML installed

>>> from django.core import serializers
>>> JSONSerializer = serializers.get_serializer('json')

The serializer class you get back from this works the same way, regardless of which format you choose. The remainder of this chapter will use JSON, but you should be able to use XML or YAML with relatively minor modifications.

Usage of the serializer is a bit different than the simpler json module provided directly with Python. Rather than using dumps() and loads() methods, Django’s serializers provide serialize() and deserialize() methods to transform data to and from JSON, respectively. Also, these methods work with proper Django model instances, rather than merely with native data structures like lists and dictionaries. For now, we’ll just look at the serialize() method, to get data out of your application so others can use it.
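For comparison, here is a minimal sketch of the standard library's json module at work. The contact dictionary is made-up sample data, not output from the application; the point is only that dumps() and loads() deal in plain dictionaries and lists, with no knowledge of models.

```python
import json

# Hypothetical sample data standing in for a contact record
contact = {"city": "Los Angeles", "state": "CA", "zip_code": "90210"}

# dumps() turns native Python structures into a JSON string
encoded = json.dumps(contact, sort_keys=True)

# loads() reverses the process, yielding plain dictionaries and lists
decoded = json.loads(encoded)
```

Django's serialize() and deserialize() play the same two roles, but they start from and return to model instances instead.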

>>> serializer = JSONSerializer()
>>> from contacts.models import Contact
>>> serializer.serialize(Contact.objects.all())
'[{...}, {...}, {...}]'

If you look at the actual output for each of the serialized contacts, you’ll notice some additional information you might not have expected. Django’s serialization tools are intended to produce output that can be deserialized without knowing in advance what models and instances were serialized originally. To do that, the output includes some information about the model itself, as well as the primary key of each instance. Each object will look something like this.

{
    "pk": 1,
    "model": "contacts.contact",
    "fields": {
        "user": 1,
        "address": "123 Main Street",
        "city": "Los Angeles",
        "state": "CA",
        "zip_code": "90210",
        "phone_number": "123-456-7890"
    }
}

For an API, you’ll already know what model and ID you’re working with, because they’ll be mapped as part of the URL. Ideally, we can get the fields dictionary out of this and send just that back and forth instead. Though not documented, Django’s serializers do offer a way to override their behavior to do just that. It starts by realizing that the result of get_serializer() is actually a class, not an instance object. This allows you to create a subclass before instantiating it, yielding all the benefits of overriding individual methods on the subclass.

We’ll put this code in a file called serializers.py, and because we’ll also be adding some more files later in this chapter, we’ll create this in a package called api. In the end, we’ll be able to import this code as api.serializers.

from django.core import serializers
 
class QuerySetSerializer(serializers.get_serializer('json')):
    pass

Understanding how to override the serializer requires knowing a bit about how the serializers work. The serialize() method accepts a QuerySet or any iterable that yields model instances. It iterates over that and for each object it finds, it iterates over its fields, outputting values at each step. The entire process can be easily seen by looking at the methods that are called along the way.

  • start_serialization()—This sets up the list to hold the objects that will be output in the stream.
  • start_object(obj)—This sets up a dictionary to collect information about an individual object.
  • handle_field(obj, field)—Each field gets added to the object’s dictionary individually.
  • handle_fk_field(obj, field)—Foreign key relationships are handled using a separate method.
  • handle_m2m_field(obj, field)—Like foreign keys, many-to-many relationships are handled using their own method.
  • end_object(obj)—Once all the fields have been handled, the object gets a chance to finalize the dictionary for its data. This is where the model information and primary key value are added to the fields, yielding the output shown previously.
  • get_dump_object(obj)—Called internally within end_object(), this is responsible for defining the actual structure of each object that gets serialized.
  • end_serialization()—Once all the objects have been processed, this method finalizes the stream.
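To make that sequence easier to picture, here is a simplified stand-in written in plain Python. It is not Django's code: ToySerializer is a hypothetical class, the objects are ordinary dictionaries rather than model instances, and the relationship handlers are left out. It only mimics the order in which the hooks listed above are called.

```python
class ToySerializer:
    """Simplified stand-in that mimics the order of the hooks."""

    def serialize(self, objects, fields):
        self.start_serialization()
        for obj in objects:
            self.start_object(obj)
            for name in fields:
                self.handle_field(obj, name)
            self.end_object(obj)
        self.end_serialization()
        return self.objects

    def start_serialization(self):
        self.objects = []      # list that collects every object

    def start_object(self, obj):
        self._current = {}     # dictionary for the object in progress

    def handle_field(self, obj, name):
        self._current[name] = obj[name]

    def end_object(self, obj):
        self.objects.append(self.get_dump_object(obj))
        self._current = None

    def get_dump_object(self, obj):
        # Decides the final structure of each serialized object
        return {"pk": obj["id"], "fields": self._current}

    def end_serialization(self):
        pass                   # nothing left to finalize in this toy


rows = [{"id": 1, "city": "Los Angeles", "state": "CA"}]
result = ToySerializer().serialize(rows, ["city", "state"])
```

Because each step is its own method, a subclass can change one part of the process, such as the structure returned by get_dump_object(), without touching the rest.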

The first customization we’ll apply is to simplify the structure of the output. Because we’ll use the URL to indicate what type of object we’re working with and what ID it has, all we need in the serialized output is the collection of fields. As hinted at in the process we’ve just looked at, this is handled by the get_dump_object() method. In addition to the object it’s provided, get_dump_object() also has access to the current data that’s already been assembled by the field handling methods. That data is stored in the _current attribute.

The default implementation of get_dump_object() wraps the field data in a dictionary, along with the object’s primary key and the path and name of its model. All we need to do in our overridden method is to return just the current field data.

class QuerySetSerializer(serializers.get_serializer('json')):
    def get_dump_object(self, obj):
        return self._current

With just this simple method in place, you can already see an improvement in the output.

{
    "user": 1,
    "address": "123 Main Street",
    "city": "Los Angeles",
    "state": "CA",
    "zip_code": "90210",
    "phone_number": "123-456-7890"
}

Outputting a Single Object

The example shown at the end of the previous section would be just one entry in a list, because serialize() operates exclusively on iterables. For an API, you’re perhaps even more likely to be outputting a single object from a detail view, which shouldn't be wrapped in a list. Instead of a QuerySetSerializer, we need a SingleObjectSerializer. It'll still be based on QuerySetSerializer, so we can reuse all the functionality we’re adding there, but with just enough modifications to handle individual objects.

The first thing to override is the serialize() method, so it can accept individual objects, but in order to reuse all the serialization behavior, it will need to call its parent method with a list instead of the single object. This is a fairly simple override.

class SingleObjectSerializer(QuerySetSerializer):
    def serialize(self, obj, **options):
        # Wrap the object in a list in order to use the standard serializer
        return super(SingleObjectSerializer, self).serialize([obj], **options)

Unfortunately, because that wraps the object in a list and returns the output without any other changes, this will in fact still output a list in the JSON string. In order to output just the values from the one object in that list, it’s necessary to strip out the list characters around it. The result is a string, so these characters can be removed using the string’s strip() method.
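The effect of strip() on a one-item list can be checked with ordinary strings. This sketch uses the standard json module and made-up data in place of real serializer output, and it assumes the characters to remove are the two brackets and the space.

```python
import json

# A one-item list, similar to what the parent serialize() would produce
wrapped = json.dumps([{"city": "Los Angeles", "tags": []}])

# strip() only removes characters from the two ends of the string,
# so the empty list inside the object is left untouched
single = wrapped.strip('[] ')
```

Since strip() stops at the first character that isn't in its argument, the braces around the object protect everything inside it.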

We could place this code directly in the serialize() method, after calling the parent method but before returning the string at the end, but Django’s serializers have one other point of customization that we haven’t looked into yet. Once all the objects have been assembled into a format that the serializer can work with, the getvalue() method is asked to return the fully serialized output. This is a much better place to put our customizations, as it matches the intent of the original methods. Aligning your overrides with the intent of the original implementation is a good way to ensure that future changes won’t break your code in unexpected ways.

class SingleObjectSerializer(QuerySetSerializer):
    def serialize(self, obj, **options):
        # Wrap the object in a list in order to use the standard serializer
        return super(SingleObjectSerializer, self).serialize([obj], **options)
 
    def getvalue(self):
        # Strip off the outer list for just a single item
        value = super(SingleObjectSerializer, self).getvalue()
        return value.strip('[] ')

And that’s all we need to get a new serializer that’s fully capable of handling individual objects. Now, you can serialize an object on its own and get just that object’s output in return.

>>> serializer = SingleObjectSerializer()
>>> from contacts.models import Contact
>>> serializer.serialize(Contact.objects.get(pk=1))
'{...}'

Handling Relationships

Looking at the output as it currently stands, you’ll notice that the user associated with the contact is represented only by its primary key. Because Django’s serializers are intended for reconstituting models one at a time, they contain only the data necessary for each individual object. For an API, it’s more useful to include some details of the related object as well, preferably in a nested dictionary.

This part of the output is managed by the handle_fk_field() method, with the default implementation simply outputting the numeric ID value. We can override this to provide details of the object instead, but it requires a bit of an interesting approach because of a problem you might not expect.

Django’s serializers wrap more generic serializers and add on the behavior necessary to work with Django models, but that added behavior only applies to the first level of data. Any Django model that you try to serialize outside the first-level iterable will raise a TypeError indicating that it’s not a serializable object.

At first blush, it may seem like the answer would be to serialize the related objects separately, then attach them to the rest of the structure. The problem on that end is that the output of the serializer is a string. If you attach that output to the self._current dictionary, it’ll get serialized as a single string that just happens to contain another serialized object inside of it.
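The problem is easy to reproduce with the standard json module alone. In this sketch, the dictionaries are made-up stand-ins for a contact and its user; no Django code is involved.

```python
import json

# Pretend this string came back from serializing the related user
user_json = json.dumps({"username": "admin"})

# Attaching the string and serializing again escapes it as plain text
contact_json = json.dumps({"city": "Los Angeles", "user": user_json})

decoded = json.loads(contact_json)
```

After decoding, the "user" value is still a string that a client would have to parse a second time, rather than the nested dictionary we're after.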

So we can’t leave the object unserialized, and we also can’t fully serialize it. Thankfully, Django offers a path between the two by way of yet another serializer that’s not normally documented. The 'python' serializer can take Django objects and produce native Python lists and dictionaries, rather than strings. These lists and dictionaries can be used anywhere in a serializable structure and will produce what you would expect.

We need two serializers now: one for outputting the overall structure, including related objects, and another for outputting the string as JSON or whichever other format you prefer. The Python serializer will do the bulk of the work, and we can build a more useful serializer by combining that with the basic JSON serializer. Here’s how our existing implementation would look.

class DataSerializer(serializers.get_serializer('python')):
    def get_dump_object(self, obj):
        return self._current
 
class QuerySetSerializer(DataSerializer, serializers.get_serializer('json')):
    pass  # Behavior is now inherited from DataSerializer

Notice that get_dump_object() moves into the new DataSerializer, because it doesn't actually have anything to do with JSON output. Its sole purpose is to define the structure of the output, which is applicable to any of the output formats. That’s also where the overridden handle_fk_field() belongs. It has three tasks to perform.

  • Retrieve the related object
  • Transform it into a native Python structure
  • Add it to the main object’s data dictionary

The first and third points are straightforward, but it’s that middle one that looks a little tricky. We can’t just call self.serialize() because each serializer maintains state throughout the process, by way of the _current attribute. We’ll need to instantiate a new serializer instead, but we also need to make sure to always use DataSerializer instead of accidentally getting an instance of the JSON serializer. That’s the only way to be sure that it outputs native Python objects, rather than strings.

class DataSerializer(serializers.get_serializer('python')):
    def get_dump_object(self, obj):
        return self._current
 
    def handle_fk_field(self, obj, field):
        # Include content from the related object
        related_obj = getattr(obj, field.name)
        value = DataSerializer().serialize([related_obj])
        self._current[field.name] = value[0]

The only other interesting thing to note about this new method is that it wraps the related object in a list before serializing it. Django’s serializers only operate on iterables, so when processing a single object, you’ll always need to wrap it in an iterable, such as a list. In the case of the Python serializer, the output is also a list, so when assigning it back to self._current, we need to only get the first item out of that list.

With that in place, the serialized output of a typical contact looks something like the following.

{
    "user": {
        "username": "admin",
        "first_name": "Admin",
        "last_name": "User",
        "is_active": true,
        "is_superuser": true,
        "is_staff": true,
        "last_login": "2013-07-17T12:00:00.000Z",
        "groups": [],
        "user_permissions": [],
        "password": "pbkdf2_sha256$10000$...",
        "email": "[email protected]",
        "date_joined": "2012-12-04T17:46:00.000Z"
    },
    "address": "123 Main Street",
    "city": "Los Angeles",
    "state": "CA",
    "zip_code": "90210",
    "phone_number": "123-456-7890"
}

With just those few extra lines of code in one method, we now have the ability to nest objects within others, and because it utilizes DataSerializer, they can be nested as many levels deep as you’d like. But there’s a lot of information in a User object, most of which doesn’t really need to be included in the API, and some of it—such as the password hash—should never be revealed.

Controlling Output Fields

Django once again accommodates this situation, this time by offering a fields argument to the serialize() method. Simply pass in a list of field names, and only those fields will be processed by the handle_*_field() methods. For example, we could simplify the output of our Contact model by excluding the user from it entirely.

SingleObjectSerializer().serialize(Contact.objects.get(pk=1), fields=[
    'phone_number',
    'address',
    'city',
    'state',
    'zip_code',
])

With this in place, the output certainly gets simpler.

{
    "address": "123 Main Street",
    "city": "Los Angeles",
    "state": "CA",
    "zip_code": "90210",
    "phone_number": "123-456-7890"
}

Of course, removing the user from the output doesn’t really help at all. What we really need to do is limit the fields on the user object, not the contact. Unfortunately, this is another situation where the intent of Django’s serializers hampers us a little bit. Just like we had to intercept the serialization of the user object in handle_fk_field(), that’s also where we’d have to supply the fields argument to its call to the serialize() method. But specifying the fields there would require overriding the method each time we want to do it, and special-casing each model we want to handle.

A more general solution would be to create a registry of models and their associated fields. The handle_fk_field() method could then check this registry for each object it receives, using the field list it finds or falling back to a standard serialization if the model wasn’t registered. Setting up the registry is pretty simple, as is the function to register the model and field list combinations.

field_registry = {}

def serialize_fields(model, fields):
    field_registry[model] = set(fields)

Note  Fields can be passed in as any iterable, but are explicitly placed into a set internally. The order of the fields doesn’t matter for the serialization process, and a set can be a bit smaller and faster because it doesn’t worry about ordering. Also, later in this chapter we’ll be able to take advantage of a specific behavior of sets to make parts of the implementation a bit easier to work with.

With this in place, we can import it wherever we need to specify the field list for a model, and simply call it once with the appropriate mapping that will need to be used later. The code to use this registry won’t actually go in handle_fk_field() though, because that would only apply it for related objects, not the outer-most objects themselves. In order to make a more consistent usage pattern, it would be ideal if you could specify the fields in the registry and use those registered fields for every object you serialize, whether it was a relationship or not.
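A quick way to see the registry at work without a database is to register a placeholder class. In this sketch, FakeContact is hypothetical and merely stands in for a real model; the registry and function are repeated so the example runs on its own.

```python
field_registry = {}

def serialize_fields(model, fields):
    field_registry[model] = set(fields)

class FakeContact:
    """Placeholder for a real model class."""

# Any iterable works: a tuple here, and the duplicate name is harmless
serialize_fields(FakeContact, ('city', 'state', 'city'))
```

Registering the same model again simply replaces its earlier entry, so the most recent call wins.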

To support this more general use case, the code for reading the field registry can go in the serialize() method instead. It’s the primary entry point for the main objects and related objects alike, so it’s a great place to provide this extra behavior.

The first task it needs to do is determine the model being used. Because you can pass in either a QuerySet or a standard iterable, there are two ways to get the model of the objects that were passed in. The most straightforward approach takes advantage of the fact that QuerySets support indexing just like lists, so you can just get the first item, as long as there is at least one.

class DataSerializer(serializers.get_serializer('python')):
    def serialize(self, queryset, **options):
        model = queryset[0].__class__
 
        return super(DataSerializer, self).serialize(queryset, **options)
 
    # Other methods previously described

This certainly works for both cases, but it will make an extra query for each QuerySet passed in, because fetching the first record is actually a different operation from iterating over all of the results. Sure, we need to do this anyway for non-QuerySet inputs, but QuerySets have a bit of extra information that we can use instead. Each QuerySet also has a model attribute on it, which already contains the model that was used to query the records, so if that attribute is present, we can just use that instead.

class DataSerializer(serializers.get_serializer('python')):
    def serialize(self, queryset, **options):
        if hasattr(queryset, 'model'):
            model = queryset.model
        else:
            model = queryset[0].__class__
 
        return super(DataSerializer, self).serialize(queryset, **options)
 
    # Other methods previously described

And because this isn’t checking specifically for a QuerySet object, but rather just checks to see if the model attribute exists, it will also work correctly if you happen to have some other iterable that yields models, as long as the iterable also has a model attribute on it.
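The same check can be tried outside of Django. In this sketch, get_model() is a hypothetical helper that isolates the lookup, while Contact and ContactList are placeholders for a model and for any container that advertises its model the way a QuerySet does.

```python
def get_model(objects):
    # Prefer the model attribute when the iterable offers one
    if hasattr(objects, 'model'):
        return objects.model
    return objects[0].__class__

class Contact:
    """Placeholder for a real model class."""

class ContactList(list):
    # Hypothetical container that advertises its model, like a QuerySet
    model = Contact

from_attribute = get_model(ContactList())   # works even when empty
from_first_item = get_model([Contact()])    # falls back to the first item
```

One practical difference between the two paths is that the attribute lookup still works for an empty container, while the fallback needs at least one item to inspect.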

With the model in hand, it’s easy to perform a lookup in the field list registry, but it’s important that we do so only if the fields argument itself wasn’t provided. Global registries like this should always be easy to override when necessary for specific cases. If provided, the fields argument will be present in the options dictionary, and that’s also where we’d put the field list we found from the registry. So adding this part of the process gets pretty simple as well.

class DataSerializer(serializers.get_serializer('python')):
    def serialize(self, queryset, **options):
        if hasattr(queryset, 'model'):
            model = queryset.model
        else:
            model = queryset[0].__class__
 
        if options.get('fields') is None and model in field_registry:
            options['fields'] = field_registry[model]
 
        return super(DataSerializer, self).serialize(queryset, **options)
 
    # Other methods previously described

Note  The second if block can look very strange at a glance, but it can’t simply check to see if 'fields' exists in the options dictionary. In some cases, a None could be passed in explicitly, which should behave the same as if the argument was omitted entirely. To account for this, we use get(), which falls back to None if not found, then we check for None manually in order to make sure we’re catching all the right cases. In particular, supplying an empty list should still override any registered fields, so we can’t just use a Boolean not.
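The cases described in that note can be checked in isolation. Here, resolve_fields() is a hypothetical helper that copies just the None check from serialize(), leaving out the registry lookup, so each kind of input can be tried on its own.

```python
def resolve_fields(options, registered):
    # Same test as in serialize(): only None triggers the registry
    if options.get('fields') is None:
        return registered
    return options['fields']

registered = {'username', 'email'}

omitted = resolve_fields({}, registered)
explicit_none = resolve_fields({'fields': None}, registered)
empty_list = resolve_fields({'fields': []}, registered)
custom = resolve_fields({'fields': ['username']}, registered)
```

A plain truth test would have treated the empty list the same as None and quietly replaced it with the registered fields.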

Now serialize() will automatically inject a field list for any model it already knows about, unless overridden by a custom fields argument. This means that you’ll have to make sure to register your field lists before trying to serialize anything, but as you’ll see later in this chapter, that’s easily done in your URL configuration. Also note that if the model hasn’t been assigned a field list and you didn’t specify one yourself, this update will simply leave the fields argument unspecified, falling back to the default behavior we saw previously.

With all this in place, we can easily customize the field lists for our Contact and User models. We don’t need to customize Contact specifically, because we want to include all of its fields, but it’s included here as well, for demonstration purposes. Besides, explicit is better than implicit, and specifying everything here helps document the output of your API.

from api import serialize_fields
from contacts.models import Contact
from django.contrib.auth.models import User
 
serialize_fields(Contact, [
    'phone_number',
    'address',
    'city',
    'state',
    'zip_code',
    'user',
])
serialize_fields(User, [
    'username',
    'first_name',
    'last_name',
    'email',
])

Interestingly, these field lists mostly match the fields that were already provided to the forms we created in Chapter 10. Forms also keep a list of their own fields, so we can actually rewrite these registrations using the form field names instead, which helps us avoid repeating ourselves. This lets the forms be primarily responsible for what fields are useful to end-users, with the API simply following suit. The only change we’ll need to make is to add the Contact model’s user attribute back in, because that was handled differently in the form scenarios.

from api import serialize_fields
from contacts.models import Contact
from django.contrib.auth.models import User
from contacts.forms import ContactEditorForm, UserEditorForm
 
serialize_fields(Contact, list(ContactEditorForm.base_fields) + ['user'])
serialize_fields(User, list(UserEditorForm.base_fields))

And now, when we go to serialize a Contact object using the SingleObjectSerializer with these new changes in place, it finally looks like you would expect.

{
    "user": {
        "username": "admin",
        "first_name": "Admin",
        "last_name": "User",
        "email": "[email protected]"
    },
    "address": "123 Main Street",
    "city": "Los Angeles",
    "state": "CA",
    "zip_code": "90210",
    "phone_number": "123-456-7890"
}

Many-to-Many Relationships

So far, the API will output just about everything you might need, with the major missing feature being many-to-many relationships. The handle_fk_field() method will only handle simple foreign keys that point to a single object per record, while many-to-many relationships would result in a list of related objects that all need to be serialized and inserted into the JSON string.

As outlined earlier in this chapter, serializers also have a handle_m2m_field() method, which we can use to customize how they behave with regard to these more complex relationships. Technically, these relationships are already handled slightly, but only in the same way as foreign keys were originally. Each related object will simply yield its primary key value and nothing else. We’ll need to apply some of the same steps that were done for foreign keys in order to get more information from these relationships.

The first change from our foreign key handling is that the attribute that references related objects isn’t an object or QuerySet itself; it’s a QuerySet manager. That means it’s not iterable on its own, and thus can’t be serialized directly, so we’ll have to call its all() method to get a QuerySet to work with. Then, rather than wrapping it in a list, we can just pass it through the standard serialize() method on its own.

class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def handle_m2m_field(self, obj, field):
        # Include content from all related objects
        related_objs = getattr(obj, field.name).all()
        values = DataSerializer().serialize(related_objs)
        self._current[field.name] = values

With this in place, here’s what a contact would look like if we add the 'groups' field to the registry for the User object.

{
    "user": {
        "username": "admin",
        "first_name": "Admin",
        "last_name": "User",
        "email": "[email protected]",
        "groups": [
            {
                "name": "Agents",
                "permission_set": [...]
            },
            {
                "name": "Buyers",
                "permission_set": [...]
            },
            {
                "name": "Sellers",
                "permission_set": [...]
            }
        ]
    },
    "address": "123 Main Street",
    "city": "Los Angeles",
    "state": "CA",
    "zip_code": "90210",
    "phone_number": "123-456-7890"
}

Of course, the permissions don’t make much sense in this context, so you’d want to remove those from the User object’s field list, but other than that, it looks like another pretty simple solution that will get us on our way. Unfortunately, many-to-many relationships in Django have one other feature that makes things considerably more complicated for us.

When you specify a many-to-many relationship, you can optionally specify a “through” model, which can contain some additional information about the relationship. This information isn’t attached to either model directly, but is instead truly part of the relationship between the two. The simple approach we just applied for handle_m2m_field() completely ignores this feature, so none of that extra information will ever be included in our output.

Remember from Chapter 10 that our Property model is related to Feature and Contact by way of many-to-many relationships, and each of them used the through argument to include some extra information fields. Here’s what you’d see if you tried to serialize a Property object with the code we have in place at the moment.

{
    "status": 2,
    "address": "123 Main St.",
    "city": "Anywhere",
    "state": "CA",
    "zip": "90909",
    "features": [
        {
            "slug": "shed",
            "title": "Shed",
            "definition": "Small outdoor storage building"
        },
        {
            "slug": "porch",
            "title": "Porch",
            "definition": "Outdoor entryway"
        }
    ],
    "price": 130000,
    "acreage": 0.25,
    "square_feet": 1248
}

As you can see, the features listed only include information about the general types of features being mentioned. The definition simply explains what a shed and porch are in general terms, but there’s nothing specific about the features on this particular property. Those details are only present in the PropertyFeature relationship table, which is currently being ignored.

Let’s take a look at what we’re hoping to do with this in order to better understand how to get there. The fields we’re looking for are stored in an intermediate PropertyFeature model, but we want to have them included as if they were directly on the Feature model. That requires both the Feature instance and the PropertyFeature instance, with their attributes merged into a single dictionary.
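In terms of plain dictionaries, the result we're aiming for looks like the sketch below. The feature values come from the output shown earlier, while the "description" key and its value are hypothetical stand-ins for whatever extra fields the relationship model carries.

```python
# Values from the Feature instance, as seen in the earlier output
feature_data = {
    "slug": "shed",
    "title": "Shed",
    "definition": "Small outdoor storage building",
}

# Hypothetical extra field stored on the PropertyFeature instance
through_data = {"description": "8x10 wooden shed in the back yard"}

# Merge both into one dictionary, as if every value lived on Feature
merged = dict(feature_data)
merged.update(through_data)
```

Copying the feature data before updating it leaves the original dictionary untouched, which matters if the same feature appears on more than one property.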

Getting the Appropriate Fields

The first problem we run into is that the PropertyFeature model has more fields on it than we really want to include. It includes two ForeignKey fields that relate to Property and Feature, which are really only there to support the relationship and don’t add any useful information that we couldn’t have gotten with the simple approach shown earlier. We don’t want to include those, or its automatic primary key either. Everything else is useful information, but we’ll need a way to identify which fields are useful and which ones aren’t.

To get this information, we’ll start with a helper method that can look through the fields in the PropertyFeature model and organize them according to their purpose. There are four types of fields, each of which can be identified by different things that we can introspect in code.

  • The auto-incrementing primary key will be an instance of AutoField. This field doesn’t do anything useful for us, so once it’s found, it can be safely ignored.
  • One foreign key points to the main model we’re working with. In this case, it’s the one that points to the Property model. This can be identified as an instance of ForeignKey, with the value of its rel.to attribute matching the class of the object that was passed in. We’ll call this the source field.
  • There’s also a foreign key that points to the related model, which is Feature in this example. This can be identified as an instance of ForeignKey, with the value of its rel.to attribute matching the rel.to attribute on the ManyToMany field that was passed into handle_m2m_field(). Let’s call this one the target field.
  • Lastly, any other fields that don’t fall into the other three categories contain information about the relationship itself, and these are the ones we’re working to gather. We’ll call these the extra fields.

With these rules in place, a new get_through_fields() method is rather straightforward. It just has to look through all the fields on the relationship model and identify each one according to those rules, returning the ones we need to work with in handle_m2m_field().

from django.db.models import AutoField, ForeignKey
 
class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def get_through_fields(self, obj, field):
        extra = set()
 
        for f in field.rel.through._meta.fields:
            if isinstance(f, AutoField):
                # Nothing to do with AutoFields, so just ignore it
                continue
 
            if isinstance(f, ForeignKey):
                # The source will refer to the model of our primary object
                if f.rel.to == obj.__class__:
                    source = f.name
                    continue
 
                # The target will be the same as on the ManyToManyField
                if f.rel.to == field.rel.to:
                    target = f.name
                    continue
 
            # Otherwise this is a standard field
            extra.add(f.name)
 
        return source, target, extra
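To see what those rules produce without a database, here’s a Django-free sketch of the same classification. Everything in it is a hypothetical stand-in: the tiny Field classes only mimic the parts of Django’s fields that matter here, models are represented by plain strings, and the field names on the relationship model are assumed for illustration.

```python
# Hypothetical stand-ins for Django's field classes; real fields carry
# far more state, but this is enough to show the classification rules
class Field(object):
    def __init__(self, name):
        self.name = name

class AutoField(Field):
    pass

class ForeignKey(Field):
    def __init__(self, name, to):
        super(ForeignKey, self).__init__(name)
        self.to = to  # The model this key points to

def classify(fields, source_model, target_model):
    source = target = None
    extra = set()
    for f in fields:
        if isinstance(f, AutoField):
            continue  # The primary key is never useful here
        if isinstance(f, ForeignKey):
            if f.to == source_model:
                source = f.name
                continue
            if f.to == target_model:
                target = f.name
                continue
        extra.add(f.name)  # Anything else describes the relationship
    return source, target, extra

# Assumed layout of the relationship model, with models named by strings
through_fields = [
    AutoField('id'),
    ForeignKey('property', to='Property'),
    ForeignKey('feature', to='Feature'),
    Field('description'),
]
result = classify(through_fields, 'Property', 'Feature')
```

Unlike the method above, this sketch starts source and target at None, so a relationship model that lacks one of the expected foreign keys yields None rather than raising an error at the return statement.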

Getting Information About the Relationship

Now that we have the fields we need, the meat of the process is about finding each relationship individually and pulling the appropriate information out of it. We’ll build up this new version of handle_m2m_field() a few lines at a time, so it’s a bit easier to see all the pieces come together along the way.

First, we need to retrieve all the field information applicable to this task. The previous section set the stage for getting information from the relationship model, but we’ll also need the list of fields to include in the serialized output. We’re not serializing PropertyFeature using the standard process, so it can’t use the field registry the same way other models do. Besides, we’ll be returning all the data from Feature and PropertyFeature together in the structure referenced by Feature, so it’s better if we allow the configuration to specify all the fields for both models when specifying Feature. For example, to pull in just the title of the feature and its description for the current property, we could register them both in a single line.

api.serialize_fields(Feature, ['title', 'description'])

The title field will come from the Feature model, while description comes from PropertyFeature instead, but this allows that implementation detail to be better hidden from view. Calling get_through_fields() is easy enough, and retrieving the registered field list is the same as it was in handle_fk_field(), with one minor exception. If there’s no field list already registered, we can just use the extra fields returned from our call to get_through_fields(). We have to make sure to specify something by default, because otherwise the automatic primary key and those two extra foreign keys would be serialized as well, even though they’re not useful here.

class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def handle_m2m_field(self, obj, field):
        source, target, extra_fields = self.get_through_fields(obj, field)
        fields = field_registry.get(field.rel.to, extra_fields)

Next, we prepare to iterate over all the relationships for the current object. This works a bit differently than it might seem, because the simple way to access many-to-many relationships doesn’t return any of the extra relationship information. To get that, we need to query the relationship model directly, filtering for only the results where the source refers to the object that was passed into handle_m2m_field(). While we’re at it, we can also set up a list to store the data retrieved from these relationships.

class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def handle_m2m_field(self, obj, field):
        source, target, extra_fields = self.get_through_fields(obj, field)
        fields = field_registry.get(field.rel.to, extra_fields)
 
        # Find all the relationships for the object passed into this method
        relationships = field.rel.through._default_manager.filter(**{source: obj})
 
        objects = []

Now we’re in position to start iterating through the relationships and pull the necessary information out of each one. The first step is to actually serialize the related model according to the field list we found earlier. This will add the title data from the Feature model, for example.

class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def handle_m2m_field(self, obj, field):
        source, target, extra_fields = self.get_through_fields(obj, field)
        fields = field_registry.get(field.rel.to, extra_fields)
 
        # Find all the relationships for the object passed into this method
        relationships = field.rel.through._default_manager.filter(**{source: obj})
 
        objects = []
        for relation in relationships.select_related():
            # Serialize the related object first
            related_obj = getattr(relation, target)
            data = DataSerializer().serialize([related_obj])[0]

Notice here that we need to wrap the object in a list to serialize it, then grab the first item out of the resulting list. We created a SingleObjectSerializer earlier, but that’s only designed to work with JSON output as a more public interface. We’re only doing this in one method, so it’s not really worth creating another single-object variation that’s designed to work with native Python data structures.

The source and target fields have already proven useful, and we now have a data dictionary that contains some of what we need. In order to get the rest of the information from the relationship model, we look at the extra fields. We don’t necessarily want all of them, though. We need to get only the ones that were also included in the field list registry. This is where it becomes really useful that we have both of them stored as sets. We can perform a simple intersection operation using the & operator to get only the fields that are in both places. For each one we find, we just add it to the data dictionary alongside the other values.

class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def handle_m2m_field(self, obj, field):
        source, target, extra_fields = self.get_through_fields(obj, field)
        fields = field_registry.get(field.rel.to, extra_fields)
 
        # Find all the relationships for the object passed into this method
        relationships = field.rel.through._default_manager.filter(**{source: obj})
 
        objects = []
        for relation in relationships.select_related():
            # Serialize the related object first
            related_obj = getattr(relation, target)
            data = DataSerializer().serialize([related_obj])[0]
 
            # Then add in the relationship data, but only
            # those that were specified in the field list
            for f in fields & extra_fields:
                data[f] = getattr(relation, f)
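The intersection step is easy to try on its own. In this small sketch, the field names are made up for illustration, with date_added standing in for an extra relationship field that wasn’t registered for output.

```python
# Fields registered for output; title lives on the related model
fields = {'title', 'description'}

# Extra fields found on the relationship model (date_added is hypothetical)
extra_fields = {'description', 'date_added'}

# Only names present in both sets survive the intersection
selected = fields & extra_fields

# The & operator needs a set on both sides; a list has to be converted first
selected_from_list = set(['title', 'description']) & extra_fields
```

If the registered field list were stored as a plain list, applying & to it directly would raise a TypeError, which is why the conversion to a set matters.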

Now all that’s left is to add all this data to the list of objects and add that whole collection to the dictionary the serializer is using to keep track of the current object.

class DataSerializer(serializers.get_serializer('python')):
    # Other methods previously described
 
    def handle_m2m_field(self, obj, field):
        source, target, extra_fields = self.get_through_fields(obj, field)
        fields = field_registry.get(field.rel.to, extra_fields)
 
        # Find all the relationships for the object passed into this method
        relationships = field.rel.through._default_manager.filter(**{source: obj})
 
        objects = []
        for relation in relationships.select_related():
            # Serialize the related object first
            related_obj = getattr(relation, target)
            data = DataSerializer().serialize([related_obj])[0]
 
            # Then add in the relationship data, but only
            # those that were specified in the field list
            for f in fields & extra_fields:
                data[f] = getattr(relation, f)
 
            objects.append(data)
        self._current[field.name] = objects

Now, when we serialize that same Property object we looked at earlier, you can see that it has information from both Feature and PropertyFeature included in a single dictionary, making it a much more useful representation of the data in our system.

{
    "status": 2,
    "address": "123 Main St.",
    "city": "Anywhere",
    "state": "CA",
    "zip": "90909",
    "features": [
        {
            "title": "Shed",
            "description": "Small woodshed near the back fence"
        },
        {
            "title": "Porch",
            "description": "Beautiful wrap-around porch facing north and east"
        }
    ],
    "price": 130000,
    "acreage": 0.25,
    "square_feet": 1248
}
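From a client’s point of view, the merged structure means there’s only one place to look for each feature. Here’s a brief sketch of how a consumer might read it, using only the standard json module. The payload is a trimmed copy of the output above, and the variable names are illustrative.

```python
import json

# A trimmed copy of the serialized property shown above
payload = '''
{
    "address": "123 Main St.",
    "features": [
        {"title": "Shed",
         "description": "Small woodshed near the back fence"},
        {"title": "Porch",
         "description": "Beautiful wrap-around porch facing north and east"}
    ]
}
'''

data = json.loads(payload)

# Each feature carries its general title and its property-specific
# description side by side, so no second lookup is required
descriptions = {f['title']: f['description'] for f in data['features']}
```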

Retrieving Data

With our data structure in place, the next logical step is to retrieve data from the database and present it appropriately. This is a job for views, and we can get a lot of mileage out of class-based views in particular. Internally, class-based views are made up of several mix-ins, which are combined as necessary to build up more useful classes. We’ll start by creating a mix-in of our own, called ResourceView. This code will go inside the api package in a new file called views.py, alongside the serialization code from the previous sections.

ResourceView

As with most class-based views, most of the work here is about allowing its behavior to be customized for each individual use case. Because the purpose of a ResourceView will be to serialize one or more objects, we can give it the ability to accept a serializer that can be used to perform that step. For good measure, we’ll also make it a little easier to use by adding in its own serialize() method, so you don’t have to worry about accessing the serializer directly.

from django.views.generic import View
 
class ResourceView(View):
    serializer = None
 
    def serialize(self, value):
        return self.serializer.serialize(value)

Notice that serializer is set to None by default, rather than to any particular serializer. We have different serializers for QuerySets and individual objects, and at this point, there’s no way to know which one to use. Rather than flip a coin to decide which use will be more common, it’s better to just leave it undefined for now and require it to be specified in subclasses or individual URL configurations.

One thing that’s missing from the serialize() call right now is the ability to specify the output fields. A fair amount of our serialization code is designed to support that feature, so ResourceView should expose that behavior to individual URLs for customization. Using None as a default value here will automatically serialize all available fields on whatever model is provided.

from django.views.generic import View
 
class ResourceView(View):
    serializer = None
    fields = None
 
    def get_fields(self):
        return self.fields
 
    def serialize(self, value):
        return self.serializer.serialize(value, fields=self.get_fields())

We use a get_fields() method here instead of just raw attribute access because mix-ins are intended to be subclassed in ways we might not expect. In a later section of this chapter, you’ll see a subclass that needs to change how the fields are retrieved by adding a fallback to use if fields wasn’t specified. We could consider using a property in the subclass instead of a method, but that causes its own set of problems if a future subclass of that needs to override that behavior yet again, particularly if it wants to build on the behavior of its parent class. In general, methods are a much more straightforward way to handle subclassing behavior, so they work very well for mix-in classes like this.
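Here’s a minimal sketch of the kind of override that a method makes easy. The class names and the default_fields attribute are hypothetical, and the base class is a plain Python stand-in for ResourceView, so it runs without Django.

```python
class ResourceBase(object):
    """Plain stand-in for ResourceView, reduced to field handling."""
    fields = None

    def get_fields(self):
        return self.fields

class FallbackFieldsResource(ResourceBase):
    """Hypothetical subclass that supplies a fallback field list."""
    default_fields = ['title']

    def get_fields(self):
        # Build on the parent's behavior rather than replacing it
        fields = super(FallbackFieldsResource, self).get_fields()
        if fields is None:
            return self.default_fields
        return fields

class ExplicitFieldsResource(FallbackFieldsResource):
    # An explicit setting still wins over the fallback
    fields = ['title', 'description']

fallback = FallbackFieldsResource().get_fields()
explicit = ExplicitFieldsResource().get_fields()
```

Because each level calls super(), another subclass can add its own rule on top without needing to know how its parents arrived at their answer.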

ResourceListView

Now we can start working on a real view that’s actually useful on its own. Because ResourceView is just a mix-in that provides a couple new options and methods, we can combine it with virtually any other Django view we’d like to use. For the most basic case, we can use Django’s own ListView to provide a collection of objects and simply serialize them on the way out instead of rendering a template.

Because ListView builds on Django’s TemplateResponseMixin, it already contains a method, render_to_response(), intended to turn a given context dictionary into an HttpResponse by way of rendering a template. We’re not rendering a template, but we do need to return an HttpResponse, and the context already gives us everything we need to do so. This allows us to override render_to_response() with a version that uses a JSON serializer in place of a template renderer to get the right results.

To start, we’ll need to specify the serializer we want to use, as the default ResourceView doesn’t have one assigned to it.

from django.views.generic import View, ListView
 
from api import serializers
 
# ResourceView is defined here
 
class ResourceListView(ResourceView, ListView):
    serializer = serializers.QuerySetSerializer()

Next, we can override the render_to_response() method. This will need to perform three steps:

  • Get the list of objects out of the provided context
  • Serialize those objects into a JSON string
  • Return an appropriate HttpResponse

The first two steps are easy to do, given the features we already have in place. The last step can’t be just a standard HttpResponse though. We’ll need to customize its Content-Type to indicate to the HTTP client that the content consists of a JSON value. The value we need is application/json, and it can be set using the content_type argument of HttpResponse. All these steps combine into a pretty short function.

from django.http import HttpResponse
from django.views.generic import View, ListView
 
from api import serializers
 
# ResourceView is defined here
 
class ResourceListView(ResourceView, ListView):
    serializer = serializers.QuerySetSerializer()
 
    def render_to_response(self, context):
        return HttpResponse(self.serialize(context['object_list']),
            content_type='application/json')

Believe it or not, that’s all it takes to offer up a list of JSON objects through your new API. All the options available to ListView are available here as well, with the only difference being that the output will be JSON instead of HTML. Here’s what an associated URL configuration might look like, so you can see how the various serialization features combine to make this happen.

from django.conf.urls import *
from django.contrib.auth.models import User, Group
 
from api.serializers import serialize_fields
from api.views import ResourceListView
from contacts.models import Contact
from contacts import forms
 
serialize_fields(Contact, list(forms.ContactEditorForm.base_fields) + ['user'])
serialize_fields(User, forms.UserEditorForm.base_fields.keys())
serialize_fields(Group, ['name'])
 
urlpatterns = patterns('',
    url(r'^$',
        ResourceListView.as_view(
            queryset=Contact.objects.all(),
        ), name='contact_list_api'),
)
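To see why overriding a single method is enough, here’s a Django-free sketch of the same arrangement. All of the class names are hypothetical, the stand-in view returns a simple tuple instead of an HttpResponse, and the standard json module takes the place of our serializer.

```python
import json

class TemplateListView(object):
    """Hypothetical stand-in for ListView: build a context, then render it."""
    object_list = ()

    def get(self):
        context = {'object_list': list(self.object_list)}
        return self.render_to_response(context)

    def render_to_response(self, context):
        # The default behavior would render a template to HTML
        return ('text/html', '<ul>...</ul>')

class JSONResourceMixin(object):
    """Hypothetical mix-in that replaces only the rendering step."""

    def render_to_response(self, context):
        return ('application/json', json.dumps(context['object_list']))

# Listing the mix-in first places its method ahead of the default one
class JSONListView(JSONResourceMixin, TemplateListView):
    object_list = [{'title': 'Shed'}, {'title': 'Porch'}]

content_type, body = JSONListView().get()
```

The base class keeps control of how the context is built, while the mix-in decides only how that context leaves the view. That’s the same division of labor ResourceListView relies on by listing ResourceView ahead of ListView.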

ResourceDetailView

Next up, we need to provide a detail view for our models, and it works much the same way as the list view described in the previous section. In fact, there are only three differences:

  • We need to subclass DetailView instead of ListView.
  • We use the SingleObjectSerializer instead of QuerySetSerializer.
  • The context variable we need is named 'object' instead of 'object_list'.

With those three changes in place, here’s ResourceListView and ResourceDetailView together in the api package’s views.py file.

from django.http import HttpResponse
from django.views.generic import View, ListView, DetailView
 
from api import serializers
 
# ResourceView is defined here
 
class ResourceListView(ResourceView, ListView):
    serializer = serializers.QuerySetSerializer()
 
    def render_to_response(self, context):
        return HttpResponse(self.serialize(context['object_list']),
            content_type='application/json')
 
class ResourceDetailView(ResourceView, DetailView):
    serializer = serializers.SingleObjectSerializer()
 
    def render_to_response(self, context):
        return HttpResponse(self.serialize(context['object']),
            content_type='application/json')

And for good measure, here’s the continuation of the URL configuration from the previous section, extended to include a reference to ResourceDetailView as well.

from django.conf.urls import *
from django.contrib.auth.models import User, Group
 
from api.serializers import serialize_fields
from api.views import ResourceListView, ResourceDetailView
from contacts.models import Contact
from contacts import forms
 
serialize_fields(Contact, list(forms.ContactEditorForm.base_fields) + ['user'])
serialize_fields(User, forms.UserEditorForm.base_fields.keys())
serialize_fields(Group, ['name'])
 
urlpatterns = patterns('',
    url(r'^$',
        ResourceListView.as_view(
            queryset=Contact.objects.all(),
        ), name='contact_list_api'),
    url(r'^(?P<slug>[\w-]+)/$',
        ResourceDetailView.as_view(
            queryset=Contact.objects.all(),
            slug_field='user__username',
        ), name='contact_detail_api'),
)

Now What?

The API shown in this chapter is just the beginning, providing anonymous read-only access to a few models. You could extend this in many different directions, adding things like authentication, authorization of external applications, and even updating data through the API using Django’s forms.

The tools and techniques discussed in this book go well beyond the official Django documentation, but there’s still a lot left unexplored. There are plenty of other innovative ways to use Django and Python.

As you work your way through your own applications, be sure to consider giving back to the Django community. The framework is available because others decided to distribute it freely; by doing the same, you can help even more people uncover more possibilities.

1 http://prodjango.com/remote-user

2 http://prodjango.com/oauth

3 http://prodjango.com/pyyaml
