Chapter 10. Deployment and Maintenance Strategies

One of the major challenges in modern web development is managing deployment and maintenance tasks. Python and Django, as well as other frameworks, can simplify application development, but leave us somewhat on our own when it comes to deployment. Fortunately, a growing array of steadily improving tools can assist us with this task. In this chapter we will examine some of these tools, including:

  • The mod_wsgi extension for Apache to load our Python applications
  • Automated remote deployment using Fabric
  • Python environment isolation and project building with zc.buildout
  • Virtual environments to easily develop and deploy multiple projects
  • Python module distribution using distutils

Knowledge of these tools, at the very least, is important for any Django developer. If you're interested in boosting your productivity right away, as well as avoiding potential future headaches, you can do so easily by switching to virtual environments. In the author's opinion, learning virtualenv is one of the best investments in process you can make, and virtualenv itself is one of the most productive development tools in the Django and Python community.

Note also that many of these tools are changing fast. Fabric, for example, is expected to release a new 1.0 version sometime soon. We have done our best to cover the very latest versions of the tools in this chapter and throughout the book, but minor variations to syntax in new versions are likely unavoidable. In addition, these are the top tools available today, but developers are constantly creating, improving, and releasing new tools.

Apache and mod_wsgi

There are numerous methods of integrating Django applications with web servers. Each has specific advantages and disadvantages, and your options are sometimes limited by your server, hosting provider, and operating system. The new, official recommendation, at least for Apache servers, is to use mod_wsgi.

WSGI (pronounced wizgy) stands for Web Server Gateway Interface, and it is designed to be a standard means of connecting Python web applications with the servers that run them. This means that any WSGI-compliant Python web framework or application can run on any web server that implements the WSGI standard. Apache is a very popular open source web server that supports the WSGI standard through its mod_wsgi extension.
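To make the interface concrete, here is a minimal sketch of a WSGI application and of the call a server makes to it. It assumes a modern Python interpreter and uses only the standard library's wsgiref package. The names application and call_app are illustrative: call_app is a hypothetical stand-in for the server side and is not part of mod_wsgi or Django.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A WSGI application is a callable taking environ and start_response."""
    body = b"Hello from a WSGI application\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    # Report the status line and headers, then return an iterable of bytes.
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Hypothetical helper: invoke app roughly the way a WSGI server would."""
    environ = {}
    setup_testing_defaults(environ)  # fill in the required CGI-style keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body
```

Any server that implements the standard performs essentially the same call, which is why the same application object can move between servers unchanged.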

Web server configurations, especially those involving Apache, are a complex business. Web servers are like snowflakes and no two are ever the same. It's easy to argue about performance of module X over module Y using framework Z, but in reality there is a trade-off happening in every configuration. Sometimes you give up a little memory to gain some speed, and sometimes vice versa.

If you have a choice in the matter, mod_wsgi is generally an all-around solid and highly recommended deployment option. It may be possible to get improved performance out of another option, but if your situation warrants something more sophisticated, you likely don't need us to tell you.

How does mod_wsgi compare to other Django server options, such as mod_python? In general, mod_wsgi is faster and uses less memory. By faster, we simply mean that mod_wsgi's measured request throughput is often greater than mod_python's under comparable load on similar servers. The memory difference is easier to explain, in part because mod_wsgi is written strictly to support WSGI applications. The mod_python option is designed not just for running Python-based web applications, but also to provide Python modules for interacting with the Apache server. As a result, additional memory is required to accommodate the additional modules. This memory usage claim holds up even under the simplest load analysis (try setting up two identical servers, automating a load test, and running top).

There are two modes of configuration for mod_wsgi, embedded and daemon. Daemon mode is recommended in almost all cases. The difference is that embedded mode, like mod_python, runs the Python interpreter and WSGI application within an Apache process. Daemon mode allows us to run our WSGI applications in a separate process from Apache requests—a special WSGI-only process. This process can be threaded and has other special properties, such as running under an arbitrary user account. Though performance in both configurations can be made comparable, Apache's default setup makes embedded mode difficult to optimize without a deep understanding of Apache configuration. As a result, daemon mode is recommended just because it usually works.

These few paragraphs about performance are at best crude sketches of a topic that is very complex. For the vast majority of applications, server-level configuration, beyond the basics we've outlined, is not going to be the source of problems. Usually, bottlenecks and other performance issues will result from the Django application itself.

A great deal of additional information on mod_wsgi is available at the project's homepage on Google Code: http://code.google.com/p/modwsgi/.

A Django WSGI script

Django includes built-in support for WSGI, but in order to deploy our Django projects under mod_wsgi, we need to write a simple WSGI script. This is just some Python code that sets up our environment and kicks off Django's WSGI handler. Here is an example script file, called django.wsgi:

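import os
import sys

# Tell Django which settings module to load (required).
os.environ['DJANGO_SETTINGS_MODULE'] = 'coleman.settings'

# Make additional packages importable; $PYTHONPATH is not inherited here.
sys.path.append('/home/jesse/src/my-site-packages')

import django.core.handlers.wsgi
application = django.core.handlers.wsgi.WSGIHandler()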

Specifying our settings module manually is required in all cases (it's a Django requirement). We can also modify the value of sys.path, if needed, to include our Django or other Python packages. The WSGI script will run under mod_wsgi and will therefore not receive any additions to the $PYTHONPATH environment variable we might use in our local, command-line development environment.
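Because the script cannot rely on $PYTHONPATH, any path adjustment has to happen in Python. As a minimal sketch, the following helper adds a directory to a search path only if it is not already there, so that running the same code twice does not pile up duplicate entries. The name add_site_dir is hypothetical and is not part of Django or mod_wsgi.

```python
import sys


def add_site_dir(directory, search_path=None):
    """Append directory to search_path (default: sys.path) if it is missing.

    Returns True when the directory was added, False when already present.
    """
    if search_path is None:
        search_path = sys.path
    if directory in search_path:
        return False
    search_path.append(directory)
    return True
```

In a WSGI script, a call such as add_site_dir('/home/jesse/src/my-site-packages') could take the place of the bare sys.path.append call.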

An example httpd.conf

Once we've put together our WSGI script, we need to write an Apache configuration file. The primary Apache directive to know is WSGIScriptAlias. This is similar to the more common ScriptAlias directive. An example VirtualHost configuration appears below:

<VirtualHost *:80>
ServerName coleman.localhost
WSGIScriptAlias / /Users/jesse/src/coleman/apache/django.wsgi

<Directory /Users/jesse/src/coleman/apache>
  Order deny,allow
  Allow from all
</Directory>

Alias /media/ /Users/jesse/src/coleman/media/

<Directory /Users/jesse/src/coleman/media>
  Order deny,allow
  Allow from all
</Directory>
</VirtualHost>

Configuring daemon mode

By default, this configuration uses mod_wsgi in embedded mode. For development or testing purposes this may not make much of a difference, but as we mentioned earlier, it is generally recommended to use daemon mode for any WSGI application, unless you're willing and able to tune Apache yourself. To enable daemon mode, we use the WSGIDaemonProcess and WSGIProcessGroup directives:

<VirtualHost *:80>
WSGIDaemonProcess book-site user=jesse threads=25
WSGIProcessGroup book-site
ServerName coleman.localhost
WSGIScriptAlias / /Users/jesse/src/coleman/apache/django.wsgi

<Directory /Users/jesse/src/coleman/apache>
  Order deny,allow
  Allow from all
</Directory>

Alias /media/ /Users/jesse/src/coleman/media/

<Directory /Users/jesse/src/coleman/media>
  Order deny,allow
  Allow from all
</Directory>
</VirtualHost>

The addition of these two directives changes things quite significantly. Let's examine WSGIDaemonProcess first.

The WSGIDaemonProcess directive activates mod_wsgi's daemon mode. The first parameter, which is required, is a unique name that will be assigned to the WSGI daemon process. The remaining options are not required, but fine-tune mod_wsgi's behavior. The above example uses two options. The first, user, specifies the user account that our WSGI process will run under. The second, threads, specifies how many threads to start in this process.

WSGIDaemonProcess supports establishing multiple processes as well, using the processes option. Use of this option in combination with threads will control how many total request handlers will be available for your WSGI application. The default number of threads is 15, so using processes=2 would spawn 30 total request handlers. These options are tunable and depend a lot on your specific application, server, hardware, and so on.
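The arithmetic behind these two options is simple enough to sketch in a few lines of Python. The function name total_request_handlers is hypothetical and exists only for illustration. The default of 15 threads comes from the text above, and the default of a single process is an assumption about how mod_wsgi behaves when the processes option is omitted.

```python
DEFAULT_THREADS = 15   # mod_wsgi's default thread count, as stated in the text
DEFAULT_PROCESSES = 1  # assumption: one daemon process when processes is omitted


def total_request_handlers(processes=DEFAULT_PROCESSES, threads=DEFAULT_THREADS):
    """Return how many requests the daemon process group can handle at once."""
    if processes < 1 or threads < 1:
        raise ValueError("processes and threads must both be at least 1")
    return processes * threads
```

With processes=2 and the default thread count, the function returns 30, matching the figure above. The earlier example configuration, with threads=25 and a single process, yields 25.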

The WSGIDaemonProcess directive simply creates a WSGI process according to our desired options. To assign this process to a specific WSGI application, we need to use WSGIProcessGroup. This directive can occur in our top-level Apache configuration or within a VirtualHost context. In our example case, the WSGI application defined for the virtual host will be assigned to the WSGI process named book-site. A WSGI application can be bound to any WSGI process defined in the server or virtual host. In addition, multiple applications can be assigned to the same process in this same manner.

Whole books are dedicated to Apache configuration, and the mod_wsgi site alone includes dozens of wiki pages with details and discussion about the module's use. In general, most applications will not need much more than a simple, straightforward configuration. It is important to understand the basics, however, because configuration details can become important as your application grows.

Thread-safety

An issue that arises in our configuration above is thread-safety. Considerable debate has followed the development of Django with regard to this topic, but as of the most recent releases, the framework is generally considered to be thread-safe. This means it is reasonable to run Django applications using multi-threaded WSGI processes or under a multi-threaded web server (for example, Apache's 'worker' MPM).

Undiscovered threading bugs are always possible, albeit unlikely. More important, however, is to consider the implications of your own application code and how it works in a multi-threaded environment. Typically this sort of problem is introduced when application code shares global variables between threads. Particularly important is the fact that Django QuerySet objects are not thread-safe. Attempting to share a global QuerySet instance between threads in application code will have unpredictable results.
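When application code genuinely needs state that is shared between request-handling threads, one common approach is to guard every access with a lock. The following is a minimal sketch using only the standard library's threading module. The names HitCounter and simulate_requests are hypothetical, and the worker threads merely stand in for the threads a WSGI process would use to serve requests.

```python
import threading


class HitCounter(object):
    """A counter that is safe to share between request-handling threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # The lock makes the read-modify-write step atomic across threads.
        with self._lock:
            self._count += 1

    @property
    def value(self):
        with self._lock:
            return self._count


def simulate_requests(counter, worker_count=8, hits_per_worker=1000):
    """Run worker_count threads that each increment the counter repeatedly."""
    def worker():
        for _ in range(hits_per_worker):
            counter.increment()

    workers = [threading.Thread(target=worker) for _ in range(worker_count)]
    for thread in workers:
        thread.start()
    for thread in workers:
        thread.join()
    return counter.value
```

Note that a lock like this only protects state within a single process. When several processes serve the same application, each process holds its own copy of any module-level object, so truly shared state belongs in a database or cache rather than in a global variable.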

We should also mention that versions of Django prior to 1.0 had known thread-safety problems. If you're running an earlier Django version, for code compatibility reasons perhaps, you could potentially encounter threading bugs when running under Apache's worker MPM, multi-threaded WSGI processes, any Apache server running on the Windows operating system, or any other multi-threaded environment.
