If you are coming from the "I don't test" section and are now convinced to adopt test-driven development, then congratulations! You know the basics of test-driven development, but there are a few more things you should learn before you can use this methodology efficiently.
This section describes a few problems developers bump into when they write tests and some ways to solve them. It also provides a quick review of popular test runners and tools available in the Python community.
The unittest module was introduced in Python 2.1 and has been massively used by developers since then. But some alternative test frameworks were created in the community by people who were frustrated with the weaknesses and limitations of unittest.
These are the common criticisms that are often made:
- You have to write all your tests in subclasses of TestCase
- You have to prefix the method names with test
- You are encouraged to use the assertion methods provided on TestCase instead of plain assert statements, and the existing methods may not cover every use case
- The setUp and tearDown facilities are tied to the TestCase level, though they run once per test. In other words, if a test fixture concerns many test modules, it is not simple to organize its creation and cleanup
- The test runner (python -m unittest) indeed provides some test discovery but does not provide enough filtering capabilities. In practice, extra scripts have to be written to collect the tests, aggregate them, and then run them in a convenient way

A lighter approach is needed to write tests without suffering from the rigidity of a framework that looks too much like its big Java brother, JUnit. Since Python does not require working with a 100% class-based environment, it is preferable to provide a more Pythonic test framework that is not based on subclassing.
A common approach would be to allow any function or class to be marked as a test, to extend the framework through a plugin system, to provide complete test fixture support for all levels (the whole campaign, a group of tests at the module level, and a single test), and to provide a test runner based on test discovery, with an extensive set of options.
Some third-party tools try to solve the problems just mentioned by providing extra features in the shape of unittest extensions.

The Python wiki provides a very long list of various testing utilities and frameworks (refer to https://wiki.python.org/moin/PythonTestingToolsTaxonomy), but there are just two projects that are especially popular:
- nose: http://nose.readthedocs.org
- py.test: http://pytest.org

nose is mainly a test runner with powerful discovery features. It has extensive options that allow running all kinds of test campaigns in a Python application.
It is not a part of the standard library, but is available on PyPI and can be easily installed with pip:
pip install nose
After installing nose, a new command called nosetests
is available at the prompt. Running the tests presented in the first section of the chapter can be done directly with it:
nosetests -v
test_true (test_primes.OtherTests) ... ok
test_is_prime (test_primes.PrimesTests) ... ok
builds the test suite. ... ok
----------------------------------------------------------------------
Ran 3 tests in 0.009s

OK
nose takes care of discovering the tests by recursively browsing the current directory and building a test suite on its own. At first glance, the preceding example does not look like any improvement over the simple python -m unittest. The difference becomes noticeable if you run this command with the --help switch. You will notice that nose provides dozens of parameters that allow you to control test discovery and execution.
nose goes a step further by running all classes and functions whose names match the regular expression ((?:^|[\b_\.-])[Tt]est), located in modules that match it too. Roughly, all callables that start with test and are located in a module whose name matches the pattern will be executed as tests.
For instance, this test_ok.py
module will be recognized and run by nose
:
$ more test_ok.py
def test_ok():
    print('my test')

$ nosetests -v
test_ok.test_ok ... ok
-----------------------------------------------------------------
Ran 1 test in 0.071s

OK
Regular TestCase
classes and doctests
are executed as well.
Lastly, nose provides assertion functions that are similar to TestCase methods, but they are provided as functions that follow the PEP 8 naming conventions rather than the Java convention that unittest uses (refer to http://nose.readthedocs.org/).
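For instance, assuming nose is installed, a few of these PEP 8-style helpers from nose.tools can be used in plain test functions (a minimal sketch):

from nose.tools import assert_equal, assert_raises, assert_true


def test_assertions():
    assert_equal(2 + 2, 4)                    # counterpart of assertEqual()
    assert_true(isinstance([], list))         # counterpart of assertTrue()
    with assert_raises(ZeroDivisionError):    # counterpart of assertRaises()
        1 / 0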
nose supports three levels of fixtures:

- Package level: setup and teardown functions can be added in the __init__.py module of a test's package containing all test modules
- Module level: a test module can have its own setup and teardown functions
- Test level: a test callable can also have its own fixture functions, using the with_setup decorator provided by nose

For instance, to set a test fixture at the module and test level, use this code:
from nose.tools import with_setup


def setup():
    # setup code, launched for the whole module
    ...


def teardown():
    # teardown code, launched for the whole module
    ...


def set_ok():
    # setup code, launched only for test_ok
    ...


@with_setup(set_ok)
def test_ok():
    print('my test')
Last, nose
integrates smoothly with setuptools
and so the test
command can be used with it (python setup.py test
). This integration is done by adding the test_suite
metadata in the setup.py
script:
setup(
    # ...
    test_suite='nose.collector',
)
nose also uses setuptools' entry point machinery for developers to write nose plugins. This allows you to override or modify every aspect of the tool, from test discovery to output formatting.
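As a rough sketch (the plugin and distribution names here are hypothetical, and the entry point group name should be double-checked against the nose plugin documentation), registering a plugin boils down to exposing a nose.plugins.Plugin subclass through an entry point in setup.py:

from setuptools import setup

setup(
    name='nose-myplugin',      # hypothetical distribution name
    py_modules=['myplugin'],   # module containing a nose.plugins.Plugin subclass
    entry_points={
        # entry point group used by the nose plugin API
        'nose.plugins.0.10': [
            'myplugin = myplugin:MyPlugin',
        ],
    },
)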
A list of nose
plugins is maintained at https://nose-plugins.jottit.com.
nose
is a complete testing tool that fixes many of the issues unittest
has. It is still designed to use implicit prefix names for tests, which remains a constraint for some developers. While this prefix can be customized, it still requires one to follow a convention.
This convention over configuration approach is not bad and is a lot better than the boilerplate code required in unittest. But using explicit decorators, for example, could be a nice way to get rid of the test prefix.
Also, the ability to extend nose with plugins makes it very flexible and allows developers to customize the tool to meet their needs.
If your testing workflow requires overriding a lot of nose parameters, you can easily add a .noserc
or a nose.cfg
file in your home directory or project root. It will specify the default set of options for the nosetests
command. For instance, a good practice is to automatically look for doctests during test runs. An example of the nose
configuration file that enables running doctests is as follows:
[nosetests]
with-doctest=1
doctest-extension=.txt
py.test
is very similar to nose
. In fact, the latter was inspired by py.test
, so we will focus mainly on details that make these tools different from each other. The tool was born as part of a larger package called py
but now these are developed separately.
Like every third-party package mentioned in this book, py.test
is available on PyPI and can be installed with pip
as pytest
:
$ pip install pytest
From there, a new py.test command is available at the prompt that can be used exactly like nosetests. The tool uses a similar pattern-matching and test discovery algorithm to catch the tests to be run. The pattern is stricter than the one that nose uses and will only catch:
- Classes that start with Test, in a file whose name starts with test
- Functions or methods that start with test, in a file whose name starts with test
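A minimal sketch of what this means in practice (file and function names here are only illustrative) is the following test_something.py file, where only the names matching the pattern are collected:

# test_something.py -- collected because the file name starts with "test"


def test_addition():            # collected: the function name starts with "test"
    assert 1 + 1 == 2


class TestNumbers:              # collected: the class name starts with "Test"
    def test_subtraction(self):
        assert 2 - 1 == 1


def helper():                   # not collected: the name does not match the pattern
    return 42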
The advantages of py.test are:

- The ability to easily disable some tests under certain conditions
- A flexible and original mechanism for dealing with fixtures
- The ability to distribute the tests among several computers
py.test supports two mechanisms to deal with fixtures. The first one, modeled after the xUnit framework, is similar to nose. Of course, the semantics differ a bit. py.test will look for three levels of fixtures in each test module, as shown in the following example:
def setup_module(module):
    """Set up any state specific to the execution
    of the given module.
    """


def teardown_module(module):
    """Teardown any state that was previously setup
    with a setup_module method.
    """


def setup_class(cls):
    """Set up any state specific to the execution
    of the given class (which usually contains tests).
    """


def teardown_class(cls):
    """Teardown any state that was previously setup
    with a call to setup_class.
    """


def setup_method(self, method):
    """Set up any state tied to the execution of the given
    method in a class. setup_method is invoked for every
    test method of a class.
    """


def teardown_method(self, method):
    """Teardown any state that was previously setup
    with a setup_method call.
    """
Each function will get the current module, class, or method as an argument. The test fixture will, therefore, be able to work on the context without having to look for it, as with nose
.
The alternative mechanism for writing fixtures with py.test builds on the concept of dependency injection, which allows you to maintain the test state in a more modular and scalable way. These non-xUnit-style fixtures always have unique names and need to be explicitly activated by declaring their use in test functions, methods, classes, and modules.
The simplest implementation of fixtures takes the form of a named function declared with the pytest.fixture()
decorator. To mark a fixture as used in the test, it needs to be declared as a function or method argument. To make it more clear, consider the previous example of the test module for the is_prime
function rewritten with the use of py.test
fixtures:
import pytest

from primes import is_prime


@pytest.fixture()
def prime_numbers():
    return [3, 5, 7]


@pytest.fixture()
def non_prime_numbers():
    return [8, 0, 1]


@pytest.fixture()
def negative_numbers():
    return [-1, -3, -6]


def test_is_prime_true(prime_numbers):
    for number in prime_numbers:
        assert is_prime(number)


def test_is_prime_false(non_prime_numbers, negative_numbers):
    for number in non_prime_numbers:
        assert not is_prime(number)

    for number in negative_numbers:
        assert not is_prime(number)
py.test provides a simple mechanism to disable some tests under certain conditions. This is called skipping, and the pytest package provides the .skipif decorator for that purpose. If a single test function or a whole test class needs to be skipped under certain conditions, it has to be defined with this decorator together with some value that verifies whether the expected condition was met. Here is an example from the official documentation that skips running the whole test case class on Windows:
import sys

import pytest


@pytest.mark.skipif(
    sys.platform == 'win32',
    reason="does not run on windows"
)
class TestPosixCalls:

    def test_function(self):
        """will not be setup or run under 'win32' platform"""
You can, of course, predefine the skipping conditions in order to share them across your testing modules:
import sys

import pytest


skip_windows = pytest.mark.skipif(
    sys.platform == 'win32',
    reason="does not run on windows"
)


@skip_windows
class TestPosixCalls:

    def test_function(self):
        """will not be setup or run under 'win32' platform"""
If a test is marked in such a way, it will not be executed at all. In some cases, however, you want to run a test even though you know it is expected to fail under certain conditions. For this purpose, a different decorator is provided. It is @mark.xfail, and it ensures that the test is always run, but it should fail at some point if the predefined condition occurs:
import sys

import pytest


@pytest.mark.xfail(
    sys.platform == 'win32',
    reason="does not run on windows"
)
class TestPosixCalls:

    def test_function(self):
        """it must fail under windows"""
Using xfail is much stricter than skipif. The test is always executed, and if it does not fail when it is expected to, then the whole py.test run will result in a failure.
An interesting feature of py.test
is its ability to distribute the tests across several computers. As long as the computers are reachable through SSH, py.test
will be able to drive each computer by sending tests to be performed.
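This distribution feature is provided by the pytest-xdist plugin, so it has to be installed separately. Assuming it is available and that the host names below are only placeholders, an ad hoc distributed run could look roughly like this:

$ pip install pytest-xdist
$ py.test -d \
    --tx ssh=user@host1//python=python3.5 \
    --tx ssh=user@host2//python=python3.5 \
    --rsyncdir mypackage mypackage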
However, this feature relies on the network; if the connection is broken, the slave will not be able to continue working since it is fully driven by the master.
Buildbot or another continuous integration tool is preferable when a project has long test campaigns. But the py.test distributed model can be used for the ad hoc distribution of tests when you are working on an application where running the tests consumes a lot of resources.
py.test
is very similar to nose
since no boilerplate code is needed to aggregate the tests in it. It also has a good plugin system and there are a great number of extensions available on PyPI.
Lastly, py.test
focuses on making the tests run fast and is truly superior compared to the other tools in this area. The other notable feature is the original approach to fixtures that really helps in managing a reusable library of fixtures. Some people may argue that there is too much magic involved but it really streamlines the development of a test suite. This single advantage of py.test
makes it my tool of choice, so I really recommend it.
Code coverage is a very useful metric that provides objective information on how well project code is tested. It is simply a measurement of how many and which lines of code are executed during all test executions. It is often expressed as a percentage and 100% coverage means that every line of code was executed during tests.
The most popular code coverage tool is called simply coverage and is freely available on PyPI. The usage is very simple and consists only of two steps. The first step is to run the coverage run command in your shell with the path to your script/program that runs all the tests as an argument:
$ coverage run --source . `which py.test` -v
===================== test session starts ======================
platform darwin -- Python 3.5.1, pytest-2.8.7, py-1.4.31, pluggy-0.3.1 -- /Users/swistakm/.envs/book/bin/python3
cachedir: .cache
rootdir: /Users/swistakm/dev/book/chapter10/pytest, inifile:
plugins: capturelog-0.7, codecheckers-0.2, cov-2.2.1, timeout-1.0.0
collected 6 items

primes.py::pyflakes PASSED
primes.py::pep8 PASSED
test_primes.py::pyflakes PASSED
test_primes.py::pep8 PASSED
test_primes.py::test_is_prime_true PASSED
test_primes.py::test_is_prime_false PASSED

========= 6 passed, 1 pytest-warnings in 0.10 seconds ==========
The coverage run command also accepts the -m parameter, which specifies a runnable module name instead of a program path; this may be better for some testing frameworks:
$ coverage run -m unittest
$ coverage run -m nose
$ coverage run -m pytest
The next step is to generate a human-readable report of your code coverage from the results cached in the .coverage file. The coverage package supports a few output formats, and the simplest one just prints an ASCII table in your terminal:
$ coverage report
Name             Stmts   Miss  Cover
------------------------------------
primes.py            7      0   100%
test_primes.py      16      0   100%
------------------------------------
TOTAL               23      0   100%
The other useful coverage report format is HTML, which can be browsed in your web browser:
$ coverage html
The default output folder of this HTML report is htmlcov/ in your working directory. The real advantage of the coverage html output is that you can browse annotated sources of your project with highlighted parts that have missing test coverage (as shown in Figure 1).
You should remember that while you should always strive to ensure 100% test coverage, it is never a guarantee that code is tested perfectly and there is no place where code can break. It means only that every line of code was reached during execution, but not necessarily every possible condition was tested. In practice, it may be relatively easy to ensure full code coverage, but it is really hard to make sure that every branch of code was reached. This is especially true for the testing of functions that may have multiple combinations of if
statements and specific language constructs like list
/dict
/set
comprehensions. You should always care for good test coverage, but you should never treat its measurement as the final answer of how good your testing suite is.
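A short, hypothetical example makes the difference between statement coverage and branch coverage clear:

def absolute(value):
    # a single test calling absolute(-5) executes every line of this
    # function and yields 100% statement coverage, yet the implicit
    # "else" branch (a non-negative argument) is never exercised
    if value < 0:
        value = -value
    return value

Running the suite with coverage run --branch would report the missed branch even though plain statement coverage shows 100%.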
Writing unit tests presupposes that you isolate the unit of code that is being tested. Tests usually feed the function or method with some data and verify its return value and/or the side effects of its execution. This is mainly to make sure that the tests concern an atomic part of the application and that they provide deterministic, reproducible results.
Sometimes, the proper isolation of the program component is not obvious. For instance, if the code sends e-mails, it will probably call Python's smtplib
module, which will work with the SMTP server through a network connection. If we want our tests to be reproducible and are just testing if e-mails have the desired content, then probably this should not happen. Ideally, unit tests should run on any computer with no external dependencies and side effects.
Thanks to Python's dynamic nature, it is possible to use monkey patching to modify the runtime code from the test fixture (that is, modify software dynamically at runtime without touching the source code) to fake the behavior of a third-party code or library.
A fake behavior in the tests can be created by discovering the minimal set of interactions needed for the tested code to work with the external parts. Then, the output is returned manually, or real data that has been previously recorded is used.
This is done by starting with an empty class or function and using it as a replacement. The test is then launched, and the fake is iteratively updated until it behaves correctly. This is possible thanks to the nature of the Python type system. An object is considered compatible with a given type as long as it behaves as that type is expected to behave; it does not need to be its ancestor via subclassing. This approach to typing in Python is called duck typing: if something behaves like a duck, it can be treated like a duck.
Let's take an example with a function called send
in a module called mailer
that sends e-mails:
import smtplib
import email.message


def send(
    sender, to,
    subject='None',
    body='None',
    server='localhost'
):
    """sends a message."""
    message = email.message.Message()
    message['To'] = to
    message['From'] = sender
    message['Subject'] = subject
    message.set_payload(body)

    server = smtplib.SMTP(server)
    try:
        return server.sendmail(sender, to, message.as_string())
    finally:
        server.quit()
The corresponding test can be:
from mailer import send


def test_send():
    res = send(
        '[email protected]',
        '[email protected]',
        'topic',
        'body'
    )

    assert res == {}
This test will pass and work as long as there is an SMTP server on the local host. If not, it will fail like this:
$ py.test --tb=short
========================= test session starts =========================
platform darwin -- Python 3.5.1, pytest-2.8.7, py-1.4.31, pluggy-0.3.1
rootdir: /Users/swistakm/dev/book/chapter10/mailer, inifile:
plugins: capturelog-0.7, codecheckers-0.2, cov-2.2.1, timeout-1.0.0
collected 5 items

mailer.py ..
test_mailer.py ..F

============================== FAILURES ===============================
______________________________ test_send ______________________________
test_mailer.py:10: in test_send
    'body'
mailer.py:19: in send
    server = smtplib.SMTP(server)
.../smtplib.py:251: in __init__
    (code, msg) = self.connect(host, port)
.../smtplib.py:335: in connect
    self.sock = self._get_socket(host, port, self.timeout)
.../smtplib.py:306: in _get_socket
    self.source_address)
.../socket.py:711: in create_connection
    raise err
.../socket.py:702: in create_connection
    sock.connect(sa)
E   ConnectionRefusedError: [Errno 61] Connection refused
======== 1 failed, 4 passed, 1 pytest-warnings in 0.17 seconds ========
A patch can be added to fake the SMTP class:
import smtplib

import pytest

from mailer import send


class FakeSMTP(object):
    pass


@pytest.yield_fixture()
def patch_smtplib():
    # setup step: monkey patch smtplib
    old_smtp = smtplib.SMTP
    smtplib.SMTP = FakeSMTP

    yield

    # teardown step: bring back smtplib to
    # its former state
    smtplib.SMTP = old_smtp


def test_send(patch_smtplib):
    res = send(
        '[email protected]',
        '[email protected]',
        'topic',
        'body'
    )

    assert res == {}
In the preceding code, we have used a new pytest.yield_fixture()
decorator. It allows us to use a generator syntax to provide both setup and teardown procedures in a single fixture function. Now our test suite can be run again with the patched version of smtplib
:
$ py.test --tb=short -v
======================== test session starts ========================
platform darwin -- Python 3.5.1, pytest-2.8.7, py-1.4.31, pluggy-0.3.1 -- /Users/swistakm/.envs/book/bin/python3
cachedir: .cache
rootdir: /Users/swistakm/dev/book/chapter10/mailer, inifile:
plugins: capturelog-0.7, codecheckers-0.2, cov-2.2.1, timeout-1.0.0
collected 5 items

mailer.py::pyflakes PASSED
mailer.py::pep8 PASSED
test_mailer.py::pyflakes PASSED
test_mailer.py::pep8 PASSED
test_mailer.py::test_send FAILED

============================= FAILURES ==============================
_____________________________ test_send _____________________________
test_mailer.py:29: in test_send
    'body'
mailer.py:19: in send
    server = smtplib.SMTP(server)
E   TypeError: object() takes no parameters
======= 1 failed, 4 passed, 1 pytest-warnings in 0.09 seconds =======
As we can see from the preceding transcript, our FakeSMTP
class implementation is not complete. We need to update its interface to match the original SMTP class. According to the duck typing principle, we need only to provide interfaces that are required by the tested send()
function:
class FakeSMTP(object):
    def __init__(self, *args, **kw):
        # arguments are not important in our example
        pass

    def quit(self):
        pass

    def sendmail(self, *args, **kw):
        return {}
Of course, the fake class can evolve with new tests to provide more complex behaviors. But it should be as short and simple as possible. The same principle can be used with more complex outputs, by recording them to serve them back through the fake API. This is often done for third-party servers such as LDAP or SQL.
It is important to know that special care should be taken when monkey patching any built-in or third-party module. If not done properly, such an approach might leave unwanted side effects that will propagate between tests. Fortunately, many testing frameworks and tools provide proper utilities that make the patching of any code units safe and easy. In our example, we did everything manually and provided a custom patch_smtplib()
fixture function with separated setup and teardown steps. A typical solution in py.test
is much easier. This framework comes with a built-in monkey patch fixture that should satisfy most of our patching needs:
import smtplib

from mailer import send


class FakeSMTP(object):
    def __init__(self, *args, **kw):
        # arguments are not important in our example
        pass

    def quit(self):
        pass

    def sendmail(self, *args, **kw):
        return {}


def test_send(monkeypatch):
    monkeypatch.setattr(smtplib, 'SMTP', FakeSMTP)

    res = send(
        '[email protected]',
        '[email protected]',
        'topic',
        'body'
    )

    assert res == {}
You should know that fakes have real limitations. If you decide to fake an external dependency, you might introduce bugs or unwanted behaviors the real server wouldn't have or vice versa.
Mock objects are generic fake objects that can be used to isolate the tested code. They automate the building process of the object's input and output. There is a greater use of mock objects in statically typed languages, where monkey patching is harder, but they are still useful in Python to shorten the code to mimic external APIs.
There are a lot of mock libraries available in Python, but the most recognized one is unittest.mock, which is provided in the standard library. It was initially created as a third-party package rather than as a part of the Python distribution, but was shortly afterwards included in the standard library as a provisional package (refer to https://docs.python.org/dev/glossary.html#term-provisional-api). For Python versions older than 3.3, you will need to install it from PyPI:

pip install mock
In our example, using unittest.mock
to patch SMTP is way simpler than creating a fake from scratch:
import smtplib
from unittest.mock import MagicMock

from mailer import send


def test_send(monkeypatch):
    smtp_mock = MagicMock()
    smtp_mock.sendmail.return_value = {}

    monkeypatch.setattr(
        smtplib, 'SMTP', MagicMock(return_value=smtp_mock)
    )

    res = send(
        '[email protected]',
        '[email protected]',
        'topic',
        'body'
    )

    assert res == {}
The return_value argument of a mock object or method allows you to define what value is returned by the call. When a mock object is used, every time an attribute is accessed by the code, a new mock object for that attribute is created on the fly. Thus, no exception is raised. This is the case (for instance) for the quit method we wrote earlier, which does not need to be defined anymore.
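To illustrate this behavior in isolation from our mailer example, here is a short sketch of how a MagicMock instance responds to attribute access and calls:

from unittest.mock import MagicMock

smtp = MagicMock()
smtp.sendmail.return_value = {}

smtp.sendmail('from', 'to', 'message')   # returns {} as configured above
smtp.quit()                              # attribute created on the fly; returns a new MagicMock
smtp.quit.assert_called_once_with()      # recorded calls can also be verified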
In the preceding example, we have in fact created two mocks:

- The first one mocks the SMTP class object itself rather than its instance, so the patched class can be called regardless of its expected __init__() method. Mocks by default return new Mock() objects if treated as callable. This is why we needed to provide another mock as its return_value keyword argument to have control over the instance interface.
- The second one mocks the actual instance returned on the patched smtplib.SMTP() call. In this mock, we control the behavior of the sendmail() method.

In our example, we have used the monkey-patching utility available from the py.test framework, but unittest.mock provides its own patching utilities. In some situations (such as patching class objects), it may be simpler and faster to use them instead of your framework-specific tools. Here is an example of monkey patching with the patch() context manager provided by the unittest.mock module:
from unittest.mock import patch

from mailer import send


def test_send():
    with patch('smtplib.SMTP') as mock:
        instance = mock.return_value
        instance.sendmail.return_value = {}

        res = send(
            '[email protected]',
            '[email protected]',
            'topic',
            'body'
        )

        assert res == {}
The importance of environment isolation has already been mentioned in this book many times. By isolating your execution environment on both application level (virtual environments) and system level (system virtualization), you are able to ensure that your tests run under repeatable conditions. This way, you protect yourself from rare and obscure problems caused by broken dependencies.
The best way to allow the proper isolation of a test environment is to use good continuous integration systems that support system virtualization. There are good free solutions for open source projects such as Travis CI (Linux and OS X) or AppVeyor (Windows), but if you need such a thing for testing proprietary software, it is very likely that you will need to spend some time on building such a solution by yourself on top of some existing open source CI tools (GitLab CI, Jenkins, and Buildbot).
Testing matrixes for open source Python projects in most cases focus only on different Python versions and rarely on different operating systems. Not doing your tests and builds on different systems is completely OK for projects that are purely Python and where there are no expected system interoperability issues. But some projects, especially when distributed as compiled Python extensions, should be definitely tested on various target operating systems. For open source projects, you may even be forced to use a few independent CI systems to provide builds for just the three most popular ones (Windows, Linux, and Mac OS X). If you are looking for a good example, you can take a look at the small pyrilla project (refer to https://github.com/swistakm/pyrilla) that is a simple C audio extension for Python. It uses both Travis CI and AppVeyor in order to provide compiled builds for Windows and Mac OS X and a large range of CPython versions.
But dimensions of test matrixes do not end on systems and Python versions. Packages that provide integration with other software such as caches, databases, or system services very often should be tested on various versions of integrated applications. A good tool that makes such testing easy is tox (refer to http://tox.readthedocs.org). It provides a simple way to configure multiple testing environments and run all tests with a single tox
command. It is a very powerful and flexible tool but is also very easy to use. The best way to present its usage is to show you an example of a configuration file that is in fact the core of tox. Here is the tox.ini
file from the django-userena project (refer to https://github.com/bread-and-pepper/django-userena):
[tox]
downloadcache = {toxworkdir}/cache/
envlist =
    ; py26 support was dropped in django1.7
    py26-django{15,16},
    ; py27 still has the widest django support
    py27-django{15,16,17,18,19},
    ; py32, py33 support was officially introduced in django1.5
    ; py32, py33 support was dropped in django1.9
    py32-django{15,16,17,18},
    py33-django{15,16,17,18},
    ; py34 support was officially introduced in django1.7
    py34-django{17,18,19}
    ; py35 support was officially introduced in django1.8
    py35-django{18,19}

[testenv]
usedevelop = True
deps =
    django{15,16}: south
    django{15,16}: django-guardian<1.4.0
    django15: django==1.5.12
    django16: django==1.6.11
    django17: django==1.7.11
    django18: django==1.8.7
    django19: django==1.9
    coverage: django==1.9
    coverage: coverage==4.0.3
    coverage: coveralls==1.1
basepython =
    py35: python3.5
    py34: python3.4
    py33: python3.3
    py32: python3.2
    py27: python2.7
    py26: python2.6
commands = {envpython} userena/runtests/runtests.py userenaumessages {posargs}

[testenv:coverage]
basepython = python2.7
passenv = TRAVIS TRAVIS_JOB_ID TRAVIS_BRANCH
commands =
    coverage run --source=userena userena/runtests/runtests.py userenaumessages {posargs}
    coveralls
This configuration allows you to test django-userena
on five different versions of Django and six versions of Python. Not every Django version will work on every Python version and the tox.ini
file makes it relatively easy to define such dependency constraints. In practice, the whole build matrix consists of 21 unique environments (including a special environment for code coverage collection). It would require tremendous effort to create each testing environment manually or even using shell scripts.
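Once such a tox.ini file is in place, the whole matrix, a listing of its environments, or a single selected environment can be run from the command line (the environment name below is just one of those defined in the preceding file):

$ tox                     # run every environment from the envlist
$ tox -l                  # list all environments defined in tox.ini
$ tox -e py27-django18    # run the tests in a single environment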
Tox is great, but its usage gets more complicated if we want to change other elements of the testing environment that are not plain Python dependencies. This is the situation when we need to test under different versions of system packages and backing services. The best way to solve this problem is again to use good continuous integration systems that allow you to easily define matrices of environment variables and install system software on virtual machines. A good example of doing that using Travis CI is provided by the ianitor project (refer to https://github.com/ClearcodeHQ/ianitor/), which was already mentioned in Chapter 9, Documenting Your Project. It is a simple utility for the Consul discovery service. The Consul project has a very active community, and many new versions of its code are released every year. This makes it very reasonable to test against various versions of that service, to make sure that the ianitor project is still up to date with the latest version of that software but also does not break compatibility with previous Consul versions. Here is the content of the .travis.yml
configuration file for Travis CI that allows you to test against three different Consul versions and four Python interpreter versions:
language: python
install: pip install tox --use-mirrors
env:
  matrix:
    # consul 0.4.1
    - TOX_ENV=py27 CONSUL_VERSION=0.4.1
    - TOX_ENV=py33 CONSUL_VERSION=0.4.1
    - TOX_ENV=py34 CONSUL_VERSION=0.4.1
    - TOX_ENV=py35 CONSUL_VERSION=0.4.1

    # consul 0.5.2
    - TOX_ENV=py27 CONSUL_VERSION=0.5.2
    - TOX_ENV=py33 CONSUL_VERSION=0.5.2
    - TOX_ENV=py34 CONSUL_VERSION=0.5.2
    - TOX_ENV=py35 CONSUL_VERSION=0.5.2

    # consul 0.6.4
    - TOX_ENV=py27 CONSUL_VERSION=0.6.4
    - TOX_ENV=py33 CONSUL_VERSION=0.6.4
    - TOX_ENV=py34 CONSUL_VERSION=0.6.4
    - TOX_ENV=py35 CONSUL_VERSION=0.6.4

    # coverage and style checks
    - TOX_ENV=pep8 CONSUL_VERSION=0.4.1
    - TOX_ENV=coverage CONSUL_VERSION=0.4.1

before_script:
  - wget https://releases.hashicorp.com/consul/${CONSUL_VERSION}/consul_${CONSUL_VERSION}_linux_amd64.zip
  - unzip consul_${CONSUL_VERSION}_linux_amd64.zip
  - start-stop-daemon --start --background --exec `pwd`/consul -- agent -server -data-dir /tmp/consul -bootstrap-expect=1

script:
  - tox -e $TOX_ENV
The preceding example provides 14 unique test environments (including pep8
and coverage
builds) for the ianitor
code. This configuration also uses tox to create the actual testing virtual environments on Travis VMs. This is actually a very popular approach for integrating tox with different CI systems. By moving as much of the test environment configuration as possible to tox, you reduce the risk of locking yourself to a single vendor. Things like the installation of new services or the definition of system environment variables are supported by most of Travis CI's competitors, so it should be relatively easy to switch to a different service provider if a better product becomes available on the market or if Travis changes its pricing model for open source projects.
Doctests are a real advantage in Python compared to other languages. The fact that documentation can use code examples that are also runnable as tests changes the way TDD can be done. For instance, a part of the documentation can be done through doctests
during the development cycle. This approach also ensures that the provided examples are always up to date and really working.
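A minimal sketch of the idea (the cube() function and module name are made up for illustration) is a docstring whose examples the standard doctest runner can execute directly:

# cube.py
def cube(x):
    """Return the cube of a number.

    >>> cube(2)
    8
    >>> cube(-3)
    -27
    """
    return x ** 3

Such examples can be verified with python -m doctest -v cube.py, and both nose and py.test can collect them automatically when their doctest support is enabled.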
Building software through doctests rather than regular unit tests is called Document-Driven Development (DDD). Developers explain what the code is doing in plain English while they are implementing it.
Writing doctests in DDD is done by building a story about how a piece of code works and should be used. The principles are described in plain English and then a few code usage examples are distributed throughout the text. A good practice is to start to write text on how the code works and then add some code examples.
To see an example of doctests in practice, let's look at the atomisator
package (refer to https://bitbucket.org/tarek/atomisator). The documentation text for its atomisator.parser
subpackage (under packages/atomisator.parser/atomisator/parser/docs/README.txt
) is as follows:
=================
atomisator.parser
=================

The parser knows how to return a feed content, with
the `parse` function, available as a top-level function::

    >>> from atomisator.parser import Parser

This function takes the feed url and returns an iterator
over its content. A second parameter can specify a maximum
number of entries to return. If not given, it is fixed to 10::

    >>> import os
    >>> res = Parser()(os.path.join(test_dir, 'sample.xml'))
    >>> res
    <itertools.imap ...>

Each item is a dictionary that contain the entry::

    >>> entry = res.next()
    >>> entry['title']
    u'CSSEdit 2.0 Released'

The keys available are:

    >>> keys = sorted(entry.keys())
    >>> list(keys)
    ['id', 'link', 'links', 'summary', 'summary_detail',
     'tags', 'title', 'title_detail']

Dates are changed into datetime::

    >>> type(entry['date'])
    >>>
Later, the doctest will evolve to take into account new elements or the required changes. This doctest is also a good documentation for developers who want to use the package and should be changed with this usage in mind.
A common pitfall in writing tests in a document is to transform it into an unreadable piece of text. If this happens, it should not be considered as part of the documentation anymore.
That said, some developers who work exclusively through doctests often group their doctests into two categories: the ones that are readable and usable so that they can be a part of the package documentation, and the ones that are unreadable and are just used to build and test the software.
Many developers think that doctests should be dropped for the latter in favor of regular unit tests. Others even use dedicated doctests for bug fixes.
So, the balance between doctests and regular tests is a matter of taste and is up to the team, as long as the published part of the doctests is readable.