This section contains design patterns that deal with accessing model properties or performing queries on them.
Problem: Models have attributes that are implemented as methods. However, these attributes should not be persisted to the database.
Solution: Use the property decorator on such methods.
Model fields store per-instance attributes, such as first name, last name, birthday, and so on. They are also stored in the database. However, we also need to access some derived attributes, such as full name or age.
They can be easily calculated from the database fields, hence need not be stored separately. In some cases, they can just be a conditional check such as eligibility for offers based on age, membership points, and active status.
A straightforward way to implement this is to define a function, such as get_age, similar to the following:
import datetime
from django.db import models

class BaseProfile(models.Model):
    birthdate = models.DateField()
    #...
    def get_age(self):
        today = datetime.date.today()
        # Subtract one if this year's birthday has not happened yet
        return (today.year - self.birthdate.year) - int(
            (today.month, today.day) <
            (self.birthdate.month, self.birthdate.day))
Calling profile.get_age() would return the user's age by calculating the difference in years, adjusted by one depending on the month and day.
However, it is much more readable (and Pythonic) to write profile.age.
Python classes can treat a function as an attribute using the property decorator. Django models can use it as well. In the previous example, replace the function definition line with:
    @property
    def age(self):
Now, we can access the user's age with profile.age. Notice that the function's name is shortened as well.
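As a minimal sketch of the method-to-property change, the following uses a plain Python class as a stand-in for the model so that it runs without Django or a database. The Profile class and the age_on helper (which takes the reference date as a parameter so the result is predictable) are assumptions made for illustration.

```python
import datetime


class Profile:
    """Plain-Python stand-in for the model, so the sketch runs without Django."""

    def __init__(self, birthdate):
        self.birthdate = birthdate

    def age_on(self, today):
        # Subtract the years, then take one off if the birthday has not
        # happened yet in the reference year (True converts to 1).
        birthday_pending = (today.month, today.day) < (
            self.birthdate.month,
            self.birthdate.day,
        )
        return today.year - self.birthdate.year - int(birthday_pending)

    @property
    def age(self):
        # Read like an attribute: profile.age, not profile.age()
        return self.age_on(datetime.date.today())


profile = Profile(datetime.date(1990, 6, 15))
print(profile.age_on(datetime.date(2020, 6, 14)))  # 29, day before the birthday
print(profile.age_on(datetime.date(2020, 6, 15)))  # 30, on the birthday
print(profile.age)                                 # attribute access, no parentheses
```

Keeping the date arithmetic in a helper that accepts the date makes the property itself a one-liner and keeps the calculation easy to test.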
An important shortcoming of a property is that it is invisible to the ORM, just like model methods are. You cannot use it in a QuerySet object. For example, this will not work: Profile.objects.exclude(age__lt=18).
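One possible workaround, sketched here under assumptions, is to translate the condition on the derived value into a condition on the stored field. The helper latest_birthdate_for is a hypothetical name; it computes the latest birthdate that still yields a given minimum age, and the result could then be used in a lookup on the birthdate field. The ORM call appears only as a comment, since it needs a configured Django project.

```python
import datetime


def age_on(birthdate, today):
    # Same calculation as get_age, with "today" passed in for testability.
    return today.year - birthdate.year - int(
        (today.month, today.day) < (birthdate.month, birthdate.day)
    )


def latest_birthdate_for(min_age, today):
    """Latest birthdate that still gives an age of at least min_age on today."""
    try:
        return today.replace(year=today.year - min_age)
    except ValueError:
        # today is 29 February and the target year is not a leap year
        return today.replace(year=today.year - min_age, day=28)


cutoff = latest_birthdate_for(18, datetime.date(2024, 3, 10))
print(cutoff)  # 2006-03-10

# Hypothetical ORM usage, equivalent in intent to exclude(age__lt=18):
#   Profile.objects.filter(birthdate__lte=cutoff)
```

The filter then runs in the database on a real column, while the age property remains available for display on individual instances.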
It might also be a good idea to define a property to hide the details of internal classes. This is formally known as the Law of Demeter. Simply put, the law states that you should only access your own direct members or "use only one dot".
For example, rather than accessing profile.birthdate.year, it is better to define a profile.birthyear property. This hides the underlying structure of the birthdate field.
An undesirable side effect of this law is that it leads to the creation of several wrapper properties in the model. This could bloat up models and make them hard to maintain. Use the law to improve your model's API and reduce coupling wherever it makes sense.
Each time we call a property, we are recalculating a function. If it is an expensive calculation, we might want to cache the result. This way, the next time the property is accessed, the cached value is returned.
from django.utils.functional import cached_property
    #...
    @cached_property
    def full_name(self):
        # Expensive operation e.g. external service call
        return "{0} {1}".format(self.firstname, self.lastname)
The cached value will be saved as a part of the Python instance. As long as the instance exists, the same value will be returned.
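As a failsafe mechanism, you might want to force the execution of the expensive operation to ensure that stale values are not returned. A property cannot accept arguments, so a keyword argument such as cached=False only works on a regular method; with cached_property, deleting the attribute from the instance (del profile.full_name) discards the cached value so that the next access recomputes it.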
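The caching and invalidation behaviour can be tried without Django by using functools.cached_property from the standard library (Python 3.8+), which likewise stores the computed value on the instance. This is a stand-in for Django's decorator, not the same implementation; the Profile class and its calls counter are assumptions added to make the effect visible.

```python
from functools import cached_property  # standard-library counterpart (Python 3.8+)


class Profile:
    def __init__(self, firstname, lastname):
        self.firstname = firstname
        self.lastname = lastname
        self.calls = 0  # counts how often the expensive body actually runs

    @cached_property
    def full_name(self):
        self.calls += 1
        return "{0} {1}".format(self.firstname, self.lastname)


profile = Profile("Ada", "Lovelace")
profile.full_name
profile.full_name         # served from the instance, body not re-run
print(profile.calls)      # 1

profile.lastname = "King"
print(profile.full_name)  # still "Ada Lovelace": the cached value is stale

del profile.full_name     # discard the cached value
print(profile.full_name)  # "Ada King", recomputed on this access
```

The stale read in the middle is the risk described above: the cache is not aware of changes to the fields it was computed from.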
Problem: Certain queries on models are defined and accessed repeatedly throughout the code, violating the DRY principle.
Solution: Define custom managers to give meaningful names to common queries.
Every Django model has a default manager called objects. Invoking objects.all() will return all the entries for that model in the database. Usually, we are interested in only a subset of all entries.
We apply various filters to find out the set of entries we need. The criterion to select them is often our core business logic. For example, we can find the posts accessible to the public by the following code:
public = Post.objects.filter(privacy="public")
This criterion might change in the future. Say, we might want to also check whether the post was marked for editing. This change might look like this:
public = Post.objects.filter(privacy=POST_PRIVACY.Public, draft=False)
However, this change needs to be made everywhere a public post is needed. This can get very frustrating. There needs to be only one place to define such commonly used queries without 'repeating oneself'.
QuerySets are an extremely powerful abstraction. They are lazily evaluated only when needed. Hence, building longer QuerySets by method-chaining (a form of fluent interface) does not affect the performance.
In fact, as more filtering is applied, the result dataset shrinks. This usually reduces the memory consumption of the result.
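The idea of lazy, chainable filtering can be illustrated with a toy class. This is only a conceptual model and not how Django implements QuerySets (which build a SQL query rather than scanning rows in Python); the LazyQuery name and its dictionary rows are assumptions made for the sketch.

```python
class LazyQuery:
    """Toy model of lazy, chainable filtering; not Django's implementation."""

    def __init__(self, rows, predicates=()):
        self._rows = rows
        self._predicates = tuple(predicates)

    def filter(self, **conditions):
        # Record the condition and return a new object; nothing is scanned yet.
        def predicate(row):
            return all(row.get(k) == v for k, v in conditions.items())

        return LazyQuery(self._rows, self._predicates + (predicate,))

    def __iter__(self):
        # Evaluation happens only here, when the results are consumed.
        for row in self._rows:
            if all(p(row) for p in self._predicates):
                yield row


rows = [
    {"privacy": "public", "draft": False},
    {"privacy": "public", "draft": True},
    {"privacy": "private", "draft": False},
]
query = LazyQuery(rows).filter(privacy="public").filter(draft=False)  # no work yet
print(list(query))  # evaluated now: [{'privacy': 'public', 'draft': False}]
```

Each call to filter only returns a new object carrying one more condition, which is why adding links to the chain costs nothing until the results are actually read.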
A model manager is a convenient interface for a model to get its QuerySet object. In other words, managers help you use Django's ORM to access the underlying database. In fact, managers are implemented as very thin wrappers around a QuerySet object. Notice the identical interface:
>>> Post.objects.filter(posted_by__username="a")
[<Post: a: Hello World>, <Post: a: This is Private!>]
>>> Post.objects.get_queryset().filter(posted_by__username="a")
[<Post: a: Hello World>, <Post: a: This is Private!>]
The default manager created by Django, objects, has several methods, such as all, filter, or exclude, that return QuerySets. However, they only form a low-level API to your database.
Custom managers are used to create a domain-specific, higher-level API. This is not only more readable but less affected by implementation details. Thus, you are able to work at a higher level of abstraction closely modeled to your domain.
Our previous example for public posts can be easily converted into a custom manager as follows:
# managers.py
from django.db.models.query import QuerySet

class PostQuerySet(QuerySet):
    def public_posts(self):
        return self.filter(privacy="public")

PostManager = PostQuerySet.as_manager
This convenient shortcut for creating a custom manager from a QuerySet object appeared in Django 1.7. Unlike earlier approaches, this PostManager object is chainable like the default objects manager.
It sometimes makes sense to replace the default objects manager with our custom manager, as shown in the following code:
from .managers import PostManager

class Post(Postable):
    ...
    objects = PostManager()
By doing this, the code to access public_posts is considerably simplified to the following:
public = Post.objects.public_posts()
Since the returned value is a QuerySet, it can be further filtered:
public_apology = Post.objects.public_posts().filter(
    message__startswith="Sorry")
QuerySets have several interesting properties. In the next few sections, we will take a look at some common patterns that involve combining QuerySets.
True to their name (or the latter half of their name), QuerySets support a lot of (mathematical) set operations. For the sake of illustration, consider two QuerySets that contain the user objects:
>>> q1 = User.objects.filter(username__in=["a", "b", "c"])
>>> q1
[<User: a>, <User: b>, <User: c>]
>>> q2 = User.objects.filter(username__in=["c", "d"])
>>> q2
[<User: c>, <User: d>]
Some set operations that you can perform on them are as follows:
Union: q1 | q2 to get [<User: a>, <User: b>, <User: c>, <User: d>]
Intersection: q1 & q2 to get [<User: c>]
Difference: q1.exclude(pk__in=q2) to get [<User: a>, <User: b>]
The same operations can be done using the Q objects:
from django.db.models import Q
# Union
>>> User.objects.filter(Q(username__in=["a", "b", "c"]) | Q(username__in=["c", "d"]))
[<User: a>, <User: b>, <User: c>, <User: d>]
# Intersection
>>> User.objects.filter(Q(username__in=["a", "b", "c"]) & Q(username__in=["c", "d"]))
[<User: c>]
# Difference
>>> User.objects.filter(Q(username__in=["a", "b", "c"]) & ~Q(username__in=["c", "d"]))
[<User: a>, <User: b>]
Note that the difference is implemented using & (AND) and ~ (Negation). The Q objects are very powerful and can be used to build very complex queries.
However, the set analogy is not perfect. QuerySets, unlike mathematical sets, are ordered. So, they are closer to Python's list data structure in that respect.
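To make both the analogy and its limit concrete, here are the same three operations on Python's built-in set type, using usernames in place of user objects. The variable names are assumptions for illustration; no database is involved.

```python
q1_names = {"a", "b", "c"}
q2_names = {"c", "d"}

print(sorted(q1_names | q2_names))  # union: ['a', 'b', 'c', 'd']
print(sorted(q1_names & q2_names))  # intersection: ['c']
print(sorted(q1_names - q2_names))  # difference: ['a', 'b']

# Sets carry no order, hence the sorted() calls above; a list does.
ordered = ["c", "a", "b"]
print(ordered == ["a", "b", "c"])       # False: order matters for lists
print(set(ordered) == {"a", "b", "c"})  # True: order is irrelevant for sets
```

Note that sets have a dedicated difference operator, whereas the QuerySet version shown earlier relies on exclude.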
So far, we have been combining QuerySets of the same type belonging to the same base class. However, we might need to combine QuerySets from different models and perform operations on them.
For example, a user's activity timeline contains all their posts and comments in reverse chronological order. The previous methods of combining QuerySets won't work. A naïve solution would be to convert them to lists, concatenate, and sort them, like this:
>>> recent = list(posts) + list(comments)
>>> sorted(recent, key=lambda e: e.modified, reverse=True)[:3]
[<Post: user: Post1>, <Comment: user: Comment1>, <Post: user: Post0>]
Unfortunately, this operation evaluates both lazy QuerySets. The combined memory usage of the two lists can be overwhelming. Besides, it can be quite slow to convert large QuerySets into lists.
A better solution uses iterators to avoid building the intermediate lists. Use the itertools.chain function to combine multiple QuerySets as follows. Note that sorted() still has to consume every element before slicing, so the saving is limited to the intermediate lists:
>>> from itertools import chain
>>> recent = chain(posts, comments)
>>> sorted(recent, key=lambda e: e.modified, reverse=True)[:3]
Evaluating a QuerySet hits the database, which can be costly. So, it is important to delay evaluation as long as possible by performing only operations that return unevaluated QuerySets.
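For the timeline example, one alternative that is not part of the approach above is to merge sources that are already sorted, using heapq.merge from the standard library, and stop after the first few items with itertools.islice. The sketch below uses named tuples as hypothetical stand-ins for Post and Comment instances and assumes each source is already ordered newest first. Whether this saves memory with real QuerySets depends on how they are iterated (for example, through iterator()), which is not shown here.

```python
import heapq
from collections import namedtuple
from itertools import islice

# Hypothetical stand-ins for model instances with a "modified" field.
Entry = namedtuple("Entry", ["kind", "title", "modified"])

# Each source is already sorted newest first, as order_by("-modified") would give.
posts = [Entry("Post", "Post1", 5), Entry("Post", "Post0", 2)]
comments = [Entry("Comment", "Comment1", 4), Entry("Comment", "Comment0", 1)]

# heapq.merge yields items lazily; islice stops after the first three.
merged = heapq.merge(posts, comments, key=lambda e: e.modified, reverse=True)
recent = list(islice(merged, 3))
print([e.title for e in recent])  # ['Post1', 'Comment1', 'Post0']
```

Unlike sorted(), the merge never needs to hold all entries at once, but it is only correct when every input is already sorted in the same direction.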