Structural patterns

This section contains several design patterns that can help you design and structure your models.

Pattern – normalized models

Problem: By design, model instances have duplicated data that cause data inconsistencies.

Solution: Break down your models into smaller models through normalization. Connect these models with logical relationships between them.

Problem details

Imagine if someone designed our Post table (omitting certain columns) in the following way:

Superhero Name    | Message                      | Posted on
------------------|------------------------------|-----------------
Captain Temper    | Has this posted yet?         | 2012/07/07 07:15
Professor English | It should be 'Is' not 'Has'. | 2012/07/07 07:17
Captain Temper    | Has this posted yet?         | 2012/07/07 07:18
Capt. Temper      | Has this posted yet?         | 2012/07/07 07:19

I hope you noticed the inconsistent superhero naming in the last row (and the Captain's consistent lack of patience).

Looking at the first column, we cannot tell which spelling is correct: Captain Temper or Capt. Temper. This is the kind of data redundancy we would like to eliminate through normalization.

Solution details

Before we take a look at the fully normalized solution, let's have a brief primer on database normalization in the context of Django models.

Three steps of normalization

Normalization helps you efficiently store data. Once your models are fully normalized, they will not have redundant data, and each model should contain data that is only logically related to it.

To give a quick example, if we were to normalize the Post table so that we can unambiguously refer to the superhero who posted that message, then we need to isolate the user details in a separate table. Django already creates the user table by default. So, you only need to refer to the ID of the user who posted the message in the first column, as shown in the following table:

User ID | Message                      | Posted on
--------|------------------------------|-----------------
12      | Has this posted yet?         | 2012/07/07 07:15
8       | It should be 'Is' not 'Has'. | 2012/07/07 07:17
12      | Has this posted yet?         | 2012/07/07 07:18
12      | Has this posted yet?         | 2012/07/07 07:19

Now, it is not only clear that there were three messages posted by the same user (with an arbitrary user ID), but we can also find that user's correct name by looking up the user table.
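As a rough illustration of this lookup, here is a minimal sketch using Python's built-in sqlite3 module rather than Django. The table names users and posts are assumptions made for this example, and the mapping of IDs 8 and 12 to names is inferred from the two tables above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE posts (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        message TEXT NOT NULL,
        posted_on TEXT NOT NULL
    );
""")
# Each name is stored exactly once (ID-to-name mapping is assumed).
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)",
                 [(8, "Professor English"), (12, "Captain Temper")])
conn.executemany(
    "INSERT INTO posts (user_id, message, posted_on) VALUES (?, ?, ?)",
    [
        (12, "Has this posted yet?", "2012-07-07 07:15"),
        (8, "It should be 'Is' not 'Has'.", "2012-07-07 07:17"),
        (12, "Has this posted yet?", "2012-07-07 07:18"),
        (12, "Has this posted yet?", "2012-07-07 07:19"),
    ],
)

# Every post by user 12 resolves to a single spelling of the name.
rows = conn.execute("""
    SELECT users.name, COUNT(*)
    FROM posts
    JOIN users ON users.id = posts.user_id
    GROUP BY users.id
    ORDER BY users.id
""").fetchall()
print(rows)
```

Because posts store only the user ID, a misspelling such as "Capt. Temper" has nowhere to creep in.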

Generally, you will design your models to be in their fully normalized form and then selectively denormalize them for performance reasons. In databases, Normal Forms are a set of guidelines that can be applied to a table to ensure that it is normalized. Commonly found normal forms are first, second, and third normal forms, although they could go up to the fifth normal form.

In the next example, we will normalize a table and create the corresponding Django models. Imagine a spreadsheet called 'Sightings' that lists the first time someone spots a superhero using a power or superhuman ability. Each entry mentions the known origins, super powers, and location of first sighting, including latitude and longitude.

Name      | Origin      | Power       | First Used At (Lat, Lon, Country, Time)
----------|-------------|-------------|-----------------------------------------
Blitz     | Alien       | Freeze      | +40.75, -73.99; USA; 2014/07/03 23:12
          |             | Flight      | +34.05, -118.24; USA; 2013/03/12 11:30
Hexa      | Scientist   | Telekinesis | +35.68, +139.73; Japan; 2010/02/17 20:15
          |             | Flight      | +31.23, +121.45; China; 2010/02/19 20:30
Traveller | Billionaire | Time travel | +43.62, +1.45; France; 2010/11/10 08:20

The preceding geographic data has been extracted from http://www.golombek.com/locations.html.

First normal form (1NF)

To conform to the first normal form, a table must have:

  • No attribute (cell) with multiple values
  • A primary key defined as a single column or a set of columns (composite key)

Let's try to convert our spreadsheet into a database table. Evidently, our 'Power' column breaks the first rule.

The updated table here satisfies the first normal form. The primary key (marked with a *) is a combination of 'Name' and 'Power', which should be unique for each row.

Name*     | Origin      | Power*      | Latitude  | Longitude  | Country | Time
----------|-------------|-------------|-----------|------------|---------|-----------------
Blitz     | Alien       | Freeze      | +40.75170 | -73.99420  | USA     | 2014/07/03 23:12
Blitz     | Alien       | Flight      | +40.75170 | -73.99420  | USA     | 2013/03/12 11:30
Hexa      | Scientist   | Telekinesis | +35.68330 | +139.73330 | Japan   | 2010/02/17 20:15
Hexa      | Scientist   | Flight      | +35.68330 | +139.73330 | Japan   | 2010/02/19 20:30
Traveller | Billionaire | Time travel | +43.61670 | +1.45000   | France  | 2010/11/10 08:20
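The flattening step can be sketched in plain Python. This is only an illustration, not part of the Django models: the dictionary layout and the function name to_first_normal_form are assumptions, and the location columns are left out to keep the sketch short.

```python
# Spreadsheet rows with multi-valued cells (hypothetical in-memory layout).
spreadsheet = [
    {"name": "Blitz", "origin": "Alien",
     "powers": ["Freeze", "Flight"],
     "first_used": ["2014/07/03 23:12", "2013/03/12 11:30"]},
    {"name": "Hexa", "origin": "Scientist",
     "powers": ["Telekinesis", "Flight"],
     "first_used": ["2010/02/17 20:15", "2010/02/19 20:30"]},
    {"name": "Traveller", "origin": "Billionaire",
     "powers": ["Time travel"],
     "first_used": ["2010/11/10 08:20"]},
]


def to_first_normal_form(rows):
    """Emit one flat row per (name, power) pair, so no cell holds a list."""
    flat = []
    for row in rows:
        for power, time in zip(row["powers"], row["first_used"]):
            flat.append({
                "name": row["name"],
                "origin": row["origin"],
                "power": power,
                "time": time,
            })
    return flat


flat_rows = to_first_normal_form(spreadsheet)

# The composite key (name, power) must be unique across all rows.
keys = [(r["name"], r["power"]) for r in flat_rows]
assert len(keys) == len(set(keys))
```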

Second normal form (2NF)

The second normal form must satisfy all the conditions of the first normal form. In addition, it must satisfy the condition that all non-primary key columns must be dependent on the entire primary key.

In the previous table, notice that 'Origin' depends only on the superhero, that is, 'Name'. It doesn't matter which Power we are talking about. So, Origin is not entirely dependent on the composite primary key—Name and Power.

Let's extract just the origin information into a separate table called 'Origins' as shown here:

Name*     | Origin
----------|------------
Blitz     | Alien
Hexa      | Scientist
Traveller | Billionaire

Now our Sightings table, updated to comply with the second normal form, looks like this:

Name*     | Power*      | Latitude  | Longitude  | Country | Time
----------|-------------|-----------|------------|---------|-----------------
Blitz     | Freeze      | +40.75170 | -73.99420  | USA     | 2014/07/03 23:12
Blitz     | Flight      | +40.75170 | -73.99420  | USA     | 2013/03/12 11:30
Hexa      | Telekinesis | +35.68330 | +139.73330 | Japan   | 2010/02/17 20:15
Hexa      | Flight      | +35.68330 | +139.73330 | Japan   | 2010/02/19 20:30
Traveller | Time travel | +43.61670 | +1.45000   | France  | 2010/11/10 08:20
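The partial-dependency test behind this step can be sketched in plain Python. The helper names depends_only_on and split_out are hypothetical, and the rows are a trimmed copy of the 1NF table (name, origin, power, and time only).

```python
rows = [
    {"name": "Blitz", "origin": "Alien", "power": "Freeze",
     "time": "2014/07/03 23:12"},
    {"name": "Blitz", "origin": "Alien", "power": "Flight",
     "time": "2013/03/12 11:30"},
    {"name": "Hexa", "origin": "Scientist", "power": "Telekinesis",
     "time": "2010/02/17 20:15"},
    {"name": "Hexa", "origin": "Scientist", "power": "Flight",
     "time": "2010/02/19 20:30"},
    {"name": "Traveller", "origin": "Billionaire", "power": "Time travel",
     "time": "2010/11/10 08:20"},
]


def depends_only_on(rows, column, key_part):
    """Return True if `column` is fully determined by `key_part` alone."""
    seen = {}
    for row in rows:
        # Remember the first value seen for this key part, then compare.
        value = seen.setdefault(row[key_part], row[column])
        if value != row[column]:
            return False
    return True


def split_out(rows, column, key_part):
    """Move `column` into its own lookup keyed by `key_part`."""
    lookup = {row[key_part]: row[column] for row in rows}
    remaining = [{k: v for k, v in row.items() if k != column}
                 for row in rows]
    return lookup, remaining


# 'origin' depends on 'name' alone, so it violates 2NF and is split out.
origins, sightings = split_out(rows, "origin", "name")
```

Here depends_only_on(rows, "origin", "name") is true while the same check for "time" is false, which is why only Origin moves to its own table.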

Third normal form (3NF)

In third normal form, the tables must satisfy the second normal form and should additionally satisfy the condition that all non-primary key columns must be directly dependent on the entire primary key and must be independent of each other.

Think about the Country column for a moment. Given the Latitude and Longitude, you can easily derive the Country column. Even though the country where a superpower was sighted is dependent on the Name-Power composite primary key, it is only indirectly dependent on it.

So, let's separate the location details into a separate Locations table as follows:

Location ID | Latitude* | Longitude* | Country
------------|-----------|------------|--------
1           | +40.75170 | -73.99420  | USA
2           | +35.68330 | +139.73330 | Japan
3           | +43.61670 | +1.45000   | France

Now our Sightings table in its third normal form looks like this:

User ID* | Power*      | Location ID | Time
---------|-------------|-------------|-----------------
2        | Freeze      | 1           | 2014/07/03 23:12
2        | Flight      | 1           | 2013/03/12 11:30
4        | Telekinesis | 2           | 2010/02/17 20:15
4        | Flight      | 2           | 2010/02/19 20:30
7        | Time travel | 3           | 2010/11/10 08:20

As before, we have replaced the superhero's name with the corresponding User ID that can be used to reference the user table.
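The extraction of location details can be sketched in plain Python as follows. The function name extract_locations and the dictionary layout are assumptions for illustration. Coordinates are kept as strings to mirror the tables, and hero names are used instead of user IDs to keep the sketch self-contained.

```python
sightings_2nf = [
    {"name": "Blitz", "power": "Freeze", "latitude": "+40.75170",
     "longitude": "-73.99420", "country": "USA", "time": "2014/07/03 23:12"},
    {"name": "Blitz", "power": "Flight", "latitude": "+40.75170",
     "longitude": "-73.99420", "country": "USA", "time": "2013/03/12 11:30"},
    {"name": "Hexa", "power": "Telekinesis", "latitude": "+35.68330",
     "longitude": "+139.73330", "country": "Japan", "time": "2010/02/17 20:15"},
    {"name": "Hexa", "power": "Flight", "latitude": "+35.68330",
     "longitude": "+139.73330", "country": "Japan", "time": "2010/02/19 20:30"},
    {"name": "Traveller", "power": "Time travel", "latitude": "+43.61670",
     "longitude": "+1.45000", "country": "France", "time": "2010/11/10 08:20"},
]


def extract_locations(sightings):
    """Split location details into their own table, referenced by ID."""
    location_ids = {}  # (latitude, longitude) -> surrogate ID
    locations = []
    normalized = []
    for s in sightings:
        key = (s["latitude"], s["longitude"])
        if key not in location_ids:
            # First time this coordinate pair is seen: assign the next ID.
            location_ids[key] = len(locations) + 1
            locations.append({
                "id": location_ids[key],
                "latitude": s["latitude"],
                "longitude": s["longitude"],
                "country": s["country"],
            })
        normalized.append({
            "name": s["name"],
            "power": s["power"],
            "location_id": location_ids[key],
            "time": s["time"],
        })
    return locations, normalized


locations, sightings_3nf = extract_locations(sightings_2nf)
```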

Django models

We can now take a look at how these normalized tables can be represented as Django models. Composite primary keys were not directly supported in Django at the time of writing. The solution used here is to rely on surrogate keys (the automatic id column) and specify the unique_together property in the Meta class:

from django.conf import settings
from django.db import models


class Origin(models.Model):
    # on_delete is a required argument from Django 2.0 onwards
    superhero = models.ForeignKey(settings.AUTH_USER_MODEL,
                                  on_delete=models.CASCADE)
    origin = models.CharField(max_length=100)


class Location(models.Model):
    latitude = models.FloatField()
    longitude = models.FloatField()
    country = models.CharField(max_length=100)

    class Meta:
        unique_together = ("latitude", "longitude")


class Sighting(models.Model):
    superhero = models.ForeignKey(settings.AUTH_USER_MODEL,
                                  on_delete=models.CASCADE)
    power = models.CharField(max_length=100)
    location = models.ForeignKey(Location, on_delete=models.CASCADE)
    sighted_on = models.DateTimeField()

    class Meta:
        unique_together = ("superhero", "power")

Performance and denormalization

Normalization can adversely affect performance. As the number of models increases, the number of joins needed to answer a query also increases. For instance, to find the number of superheroes with the Freeze capability in the USA, you will need to join multiple tables. Prior to normalization, any information could be found by querying a single table.
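To make the cost concrete, here is a rough sketch of the Freeze-in-USA query using Python's built-in sqlite3 module instead of the Django ORM. The table and column names are assumptions loosely modelled on the Location and Sighting models, and the sketch joins only the two tables needed for the count itself. Showing hero names would require a further join with the user table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (
        id INTEGER PRIMARY KEY,
        latitude REAL, longitude REAL, country TEXT,
        UNIQUE (latitude, longitude)
    );
    CREATE TABLE sighting (
        id INTEGER PRIMARY KEY,
        superhero_id INTEGER NOT NULL,
        power TEXT NOT NULL,
        location_id INTEGER NOT NULL REFERENCES location(id),
        sighted_on TEXT NOT NULL,
        UNIQUE (superhero_id, power)
    );
""")
conn.executemany("INSERT INTO location VALUES (?, ?, ?, ?)", [
    (1, 40.7517, -73.9942, "USA"),
    (2, 35.6833, 139.7333, "Japan"),
    (3, 43.6167, 1.45, "France"),
])
conn.executemany(
    "INSERT INTO sighting (superhero_id, power, location_id, sighted_on) "
    "VALUES (?, ?, ?, ?)", [
        (2, "Freeze", 1, "2014-07-03 23:12"),
        (2, "Flight", 1, "2013-03-12 11:30"),
        (4, "Telekinesis", 2, "2010-02-17 20:15"),
        (4, "Flight", 2, "2010-02-19 20:30"),
        (7, "Time travel", 3, "2010-11-10 08:20"),
    ])

# The country lives in another table, so answering the question needs a JOIN.
freeze_in_usa = conn.execute("""
    SELECT COUNT(DISTINCT sighting.superhero_id)
    FROM sighting
    JOIN location ON location.id = sighting.location_id
    WHERE sighting.power = ? AND location.country = ?
""", ("Freeze", "USA")).fetchone()[0]
print(freeze_in_usa)
```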

You should design your models to keep the data normalized. This will maintain data integrity. However, if your site faces scalability issues, then you can selectively derive data from those models to create denormalized data.

Tip

Best Practice

Normalize while designing but denormalize while optimizing.

For instance, if counting the sightings in a certain country is very common, then add the count as an additional field to the Location model. Unlike a cached value, this field can then be used in other queries through the Django object-relational mapper (ORM).

However, you need to update this count each time you add or remove a sighting. You need to add this computation to the save method of Sighting, add a signal handler, or even compute using an asynchronous job.
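The bookkeeping involved can be sketched without Django. The classes below are plain Python stand-ins, not real models, and the sighting_count attribute and the _saved flag are assumptions made for illustration.

```python
class Location:
    """Stand-in for the Location model with a denormalized counter."""

    def __init__(self, country):
        self.country = country
        self.sighting_count = 0  # denormalized value kept in sync by Sighting


class Sighting:
    """Stand-in for the Sighting model."""

    def __init__(self, location, power):
        self.location = location
        self.power = power
        self._saved = False

    def save(self):
        # Count only the first save, much like checking `created`
        # in a post_save signal handler.
        if not self._saved:
            self.location.sighting_count += 1
            self._saved = True

    def delete(self):
        # Removing a sighting must undo its contribution to the count.
        if self._saved:
            self.location.sighting_count -= 1
            self._saved = False


usa = Location("USA")
freeze = Sighting(usa, "Freeze")
flight = Sighting(usa, "Flight")
freeze.save()
freeze.save()   # saving again must not double count
flight.save()
count_after_saves = usa.sighting_count
freeze.delete()
count_after_delete = usa.sighting_count
```

In a real Django model, the increment would usually be written with an F() expression so that concurrent requests do not overwrite each other's updates.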

If you have a complex query spanning several tables, such as a count of superpowers by country, then you may need to create a separate denormalized table. As before, you need to update this denormalized table every time the data in your normalized models changes.

Denormalization is surprisingly common in large websites because it is a tradeoff between speed and space. Today, space is cheap but speed is crucial to user experience. So, if your queries are taking too long to respond, then you might want to consider it.

Should we always normalize?

Too much normalization is not necessarily a good thing. Sometimes, it can introduce an unnecessary table that can complicate updates and lookups.

For example, your User model might have several fields for their home address. Strictly speaking, you can normalize these fields into an Address model. However, in many cases, it would be unnecessary to introduce an additional table to the database.

Rather than aiming for the most normalized design, carefully weigh each opportunity to normalize and consider the tradeoffs before refactoring.

Pattern – model mixins

Problem: Distinct models have the same fields and/or methods duplicated, violating the DRY (Don't Repeat Yourself) principle.

Solution: Extract common fields and methods into various reusable model mixins.

Problem details

While designing models, you might find certain common attributes or behaviors shared across model classes. For example, the Post and Comment models both need to keep track of their created and modified dates. Manually copy-pasting the fields and their associated methods is not a very DRY approach.

Since Django models are classes, object-oriented approaches such as composition and inheritance are possible solutions. However, composition (by having a property that contains an instance of the shared class) needs an additional level of indirection to access fields.

Inheritance can get tricky. We can use a common base class for Post and Comment. However, there are three kinds of inheritance in Django: concrete, abstract, and proxy.

Concrete inheritance works by deriving from the base class just like you normally would in Python classes. However, in Django, this base class will be mapped into a separate table. Each time you access base fields, an implicit join is needed. This can noticeably hurt performance.

Proxy inheritance can only add new behavior to the parent class. You cannot add new fields. Hence, it is not very useful for this situation.

Finally, we are left with abstract inheritance.

Solution details

Abstract base classes are elegant solutions used to share data and behavior among models. When you define an abstract class, it does not create any corresponding table in the database. Instead, these fields are created in the derived non-abstract classes.

Accessing abstract base class fields doesn't need a JOIN statement. The resulting tables are also self-contained with managed fields. Due to these advantages, most Django projects use abstract base classes to implement common fields or methods.

Limitations of abstract models are as follows:

  • Other models cannot have a foreign key or many-to-many field pointing to them
  • They cannot be instantiated or saved
  • They cannot be directly used in a query since they don't have a manager

Here is how the post and comment classes can be initially designed with an abstract base class:

class Postable(models.Model):
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)
    message = models.TextField(max_length=500)

    class Meta:
        abstract = True


class Post(Postable):
    ...


class Comment(Postable):
    ...

To turn a model into an abstract base class, you need to set abstract = True in its inner Meta class. Here, Postable is an abstract base class. However, it is not very reusable.

In fact, if there were a class with just the created and modified fields, we could reuse that timestamp functionality in nearly any model needing a timestamp. In such cases, we usually define a model mixin.

Model mixins

Model mixins are abstract classes that can be added as a parent class of a model. Python supports multiple inheritance, unlike other languages such as Java. Hence, you can list any number of parent classes for a model.

Mixins ought to be orthogonal and easily composable. Drop a mixin into the list of base classes and it should work. In this regard, they behave more like composition than inheritance.

Smaller mixins are better. Whenever a mixin becomes large and violates the Single Responsibility Principle, consider refactoring it into smaller classes. Let a mixin do one thing and do it well.

In our previous example, the model mixin used to update the created and modified time can be easily factored out, as shown in the following code:

class TimeStampedModel(models.Model):
    created = models.DateTimeField(auto_now_add=True)
    modified = models.DateTimeField(auto_now=True)

    class Meta:
        abstract = True

class Postable(TimeStampedModel):
    message = models.TextField(max_length=500)
    ... 

    class Meta:
        abstract = True

class Post(Postable):
    ...

class Comment(Postable):
    ...

We have two base classes now. However, the functionality is clearly separated. The mixin can be separated into its own module and reused in other contexts.
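The composability of mixins is not specific to Django. The following framework-free sketch uses plain Python classes to show two orthogonal mixins being dropped into one class. SluggedMixin, Article, and the touch method are hypothetical names invented for this illustration. In a Django project, each mixin would instead be an abstract model as shown above.

```python
from datetime import datetime, timezone


class TimeStampedMixin:
    """Tracks creation and modification times; knows nothing about slugs."""

    def touch(self):
        now = datetime.now(timezone.utc)
        if getattr(self, "created", None) is None:
            self.created = now  # set only on the first call
        self.modified = now


class SluggedMixin:
    """Builds a URL-friendly slug from a `title` attribute."""

    def slug(self):
        return "-".join(self.title.lower().split())


class Article(TimeStampedMixin, SluggedMixin):
    """Each mixin does one thing; the class simply lists both as bases."""

    def __init__(self, title):
        self.title = title


article = Article("Hello Big World")
article.touch()
print(article.slug(), article.created == article.modified)
```

Neither mixin refers to the other, so either one can be removed from the list of base classes without touching the rest of the class.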

Pattern – user profiles

Problem: Every website stores a different set of user profile details. However, Django's built-in User model is meant for authentication details.

Solution: Create a user profile class with a one-to-one relation with the user model.

Problem details

Out of the box, Django provides a pretty decent User model. You can use it when you create a super user or log in to the admin interface. It has a few basic fields, such as first name, last name, username, and e-mail.

However, most real-world projects keep a lot more information about users, such as their address, favorite movies, or their superpower abilities. From Django 1.5 onwards, the default User model can be extended or replaced. However, official docs strongly recommend storing only authentication data even in a custom user model (it belongs to the auth app, after all).

Certain projects need multiple types of users. For example, SuperBook can be used by superheroes and non-superheroes. There might be common fields and some distinctive fields based on the type of user.

Solution details

The officially recommended solution is to create a user profile model. It should have a one-to-one relation with your user model. All the additional user information is stored in this model:

class Profile(models.Model):
    # on_delete is a required argument from Django 2.0 onwards
    user = models.OneToOneField(settings.AUTH_USER_MODEL,
                                on_delete=models.CASCADE,
                                primary_key=True)

It is recommended that you set the primary_key explicitly to True to prevent concurrency issues in some database backends such as PostgreSQL. The rest of the model can contain any other user details, such as birthdate, favorite color, and so on.

While designing the profile model, it is recommended that all the profile detail fields be nullable or have default values. Intuitively, we can understand that users cannot fill out all their profile details while signing up. Additionally, we will ensure that the signal handler doesn't pass any initial parameters while creating the profile instance.

Signals

Ideally, every time a user model instance is created, a corresponding user profile instance must be created as well. This is usually done using signals.

For example, we can listen for the post_save signal from the user model using the following signal handler:

# signals.py
from django.db.models.signals import post_save
from django.dispatch import receiver
from django.conf import settings 
from . import models

@receiver(post_save, sender=settings.AUTH_USER_MODEL)
def create_profile_handler(sender, instance, created, **kwargs):
    if not created:
        return
    # Create the profile object, only if it is newly created
    profile = models.Profile(user=instance)
    profile.save()

Note that the profile is created with no initial parameters other than the user instance.

Previously, there was no specific place for initializing the signal code. Typically, signal handlers were imported or implemented in models.py (which was unreliable). However, with the app-loading refactor in Django 1.7, the application initialization code location is well defined.

First, in your application package's __init__.py file, point to your app's ProfileConfig (newer Django releases discover the AppConfig subclass in apps.py automatically, which makes this line unnecessary there):

default_app_config = "profiles.apps.ProfileConfig"

Next, subclass AppConfig as ProfileConfig in apps.py and set up the signal in the ready method:

# apps.py
from django.apps import AppConfig

class ProfileConfig(AppConfig):
    name = "profiles"
    verbose_name = 'User Profiles'

    def ready(self):
        from . import signals

With your signals set up, accessing user.profile should return a Profile object for all users, even the newly created ones.

Admin

Now, a user's details will be in two different places within the admin: the authentication details in the usual user admin page and the same user's additional profile details in a separate profile admin page. This gets very cumbersome.

For convenience, the profile admin can be made inline to the default user admin by defining a custom UserAdmin as follows:

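# admin.py
from django.contrib import admin
from django.contrib.auth.admin import UserAdmin as BaseUserAdmin
from django.contrib.auth.models import User

from .models import Profile


class UserProfileInline(admin.StackedInline):
    model = Profile


# UserAdmin lives in django.contrib.auth.admin, not django.contrib.admin
class UserAdmin(BaseUserAdmin):
    inlines = [UserProfileInline]


# Replace the default user admin with the one that shows the profile inline
admin.site.unregister(User)
admin.site.register(User, UserAdmin)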

Multiple profile types

Assume that you need several kinds of user profiles in your application. There needs to be a field to track which type of profile the user has. The profile data itself needs to be stored in separate models or a unified model.

An aggregate profile approach is recommended since it gives the flexibility to change the profile types without loss of profile details and minimizes complexity. In this approach, the profile model contains a superset of all profile fields from all profile types.

For example, SuperBook will need a SuperHero type profile and an Ordinary (non-superhero) profile. It can be implemented using a single unified profile model as follows:

class BaseProfile(models.Model):
    USER_TYPES = (
        (0, 'Ordinary'),
        (1, 'SuperHero'),
    )
    user = models.OneToOneField(settings.AUTH_USER_MODEL,
                                on_delete=models.CASCADE,
                                primary_key=True)
    # IntegerField ignores max_length, so it is omitted here
    user_type = models.IntegerField(null=True, choices=USER_TYPES)
    bio = models.CharField(max_length=200, blank=True, null=True)

    def __str__(self):
        return "{}: {:.20}".format(self.user, self.bio or "")

    class Meta:
        abstract = True

class SuperHeroProfile(models.Model):
    origin = models.CharField(max_length=100, blank=True, null=True)

    class Meta:
        abstract = True

class OrdinaryProfile(models.Model):
    address = models.CharField(max_length=200, blank=True, null=True)

    class Meta:
        abstract = True

class Profile(SuperHeroProfile, OrdinaryProfile, BaseProfile):
    pass

We grouped the profile details into several abstract base classes to separate concerns. The BaseProfile class contains all the common profile details irrespective of the user type. It also has a user_type field that keeps track of the user's active profile.

The SuperHeroProfile class and OrdinaryProfile class contain the profile details specific to superhero and non-hero users respectively. Finally, the Profile class derives from all these base classes to create a superset of profile details.

Some details to take care of while using this approach are as follows:

  • All profile fields that belong to the class or its abstract base classes must be nullable or have defaults.
  • This approach might consume more database space per user but gives immense flexibility.
  • The active and inactive fields for a profile type need to be managed outside the model. For example, a form to edit the profile must show the appropriate fields based on the currently active user type.
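One way to manage the active fields outside the model is a small lookup from user type to field names, which a form could consult when deciding what to display. This is only a sketch: the constants and the active_fields helper are hypothetical, while the field names bio, origin, and address come from the profile classes above.

```python
ORDINARY, SUPERHERO = 0, 1  # mirrors the USER_TYPES choices

# Fields shared by every profile type.
COMMON_FIELDS = ("bio",)

# Fields that are only meaningful for one profile type.
TYPE_FIELDS = {
    ORDINARY: ("address",),
    SUPERHERO: ("origin",),
}


def active_fields(user_type):
    """Return the profile field names a form should show for user_type."""
    # Unknown or unset types (user_type is nullable) get common fields only.
    return COMMON_FIELDS + TYPE_FIELDS.get(user_type, ())


print(active_fields(SUPERHERO))
```

A profile edit form could then restrict its fields to active_fields(profile.user_type), leaving the inactive columns untouched in the database.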

Pattern – service objects

Problem: Models can get large and unmanageable. Testing and maintenance get harder as a model does more than one thing.

Solution: Refactor out a set of related methods into a specialized Service object.

Problem details

Fat models, thin views is an adage commonly told to Django beginners. Ideally, your views should not contain anything other than presentation logic.

However, over time, pieces of code that cannot be placed anywhere else tend to go into models. Soon, models become a dumping ground for code.

Some of the tell-tale signs that your model can use a Service object are as follows:

  1. Interactions with external services, for example, checking whether the user is eligible to get a SuperHero profile with a web service.
  2. Helper tasks that do not deal with the database, for example, generating a short URL or random captcha for a user.
  3. Short-lived objects without a database state, for example, creating a JSON response for an AJAX call.
  4. Long-running tasks involving multiple instances such as Celery tasks.

Models in Django follow the Active Record pattern. Ideally, they encapsulate both application logic and database access. However, keep the application logic minimal.

While testing, if we find ourselves unnecessarily mocking the database even while not using it, then we need to consider breaking up the model class. A Service object is recommended in such situations.

Solution details

Service objects are plain old Python objects (POPOs) that encapsulate a 'service' or interactions with a system. They are usually kept in a separate file named services.py or utils.py.

For example, checking a web service is sometimes dumped into a model method as follows:

class Profile(models.Model):
    ...

    def is_superhero(self):
        # webclient stands for whatever HTTP client module the project uses
        url = "http://api.herocheck.com/?q={0}".format(
            self.user.username)
        return webclient.get(url)

This method can be refactored to use a service object as follows:

from .services import SuperHeroWebAPI

class Profile(models.Model):
    ...

    def is_superhero(self):
        return SuperHeroWebAPI.is_hero(self.user.username)

The service object can be now defined in services.py as follows:

API_URL = "http://api.herocheck.com/?q={0}"

class SuperHeroWebAPI:
    ...
    @staticmethod
    def is_hero(username):
        url = API_URL.format(username)
        return webclient.get(url)

In most cases, methods of a Service object are stateless, that is, they perform the action solely based on the function arguments without using any class properties. Hence, it is better to explicitly mark them as static methods (as we have done for is_hero).

Consider refactoring your business logic or domain logic out of models into service objects. This way, you can use them outside your Django application as well.

Imagine there is a business reason to blacklist certain users from becoming superhero types based on their username. Our service object can be easily modified to support this:

class SuperHeroWebAPI:
    ...
    @staticmethod
    def is_hero(username):
        blacklist = {"syndrome", "kcka$$", "superfake"}
        url = API_URL.format(username)
        return username not in blacklist and webclient.get(url)

Ideally, service objects are self-contained. This makes them easy to test without mocking, say, the database. They can be also easily reused.
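To illustrate the testing benefit, here is a minimal sketch of the service object exercised without a database or a network. It departs from the code above in one respect, which is an assumption of this sketch: the HTTP client is passed in as a parameter instead of being a module-level webclient, so that a fake can be substituted. FakeClient is a hypothetical test double.

```python
API_URL = "http://api.herocheck.com/?q={0}"
BLACKLIST = frozenset(["syndrome", "kcka$$", "superfake"])


class SuperHeroWebAPI:
    @staticmethod
    def is_hero(username, client):
        # Blacklisted usernames are rejected before any web call is made.
        if username in BLACKLIST:
            return False
        return bool(client.get(API_URL.format(username)))


class FakeClient:
    """Stand-in for the real HTTP client; records URLs instead of calling out."""

    def __init__(self, answer):
        self.answer = answer
        self.requested = []

    def get(self, url):
        self.requested.append(url)
        return self.answer


client = FakeClient(answer=True)
blitz_is_hero = SuperHeroWebAPI.is_hero("blitz", client)
syndrome_is_hero = SuperHeroWebAPI.is_hero("syndrome", client)
```

No model instance, database, or web service is involved, so the behavior of the blacklist can be checked in isolation.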

In Django, time-consuming services are typically executed asynchronously using task queues such as Celery. Typically, the service object actions are run as Celery tasks. Such tasks can be run periodically or after a delay.
