Chapter 10. Test-Driven Development

Test-Driven Development (TDD) is a simple technique for producing high-quality software. It is widely used in the Python community and is also very popular in other communities.

Testing is especially important in Python due to its dynamic nature. Python does not enforce static typing, so many errors, even minute ones, won't be noticed until the code is run and each of its lines is executed. But the problem is not only how types in Python work. Remember that most bugs are not related to bad syntax usage, but rather to logical errors and subtle misunderstandings that can lead to major failures.

This chapter is split into two parts:

  • I don't test, which advocates TDD and quickly describes how to do it with the standard library
  • I do test, which is intended for developers who practice tests and wish to get more out of them

I don't test

If you are already convinced of the value of TDD, you should move to the next section. It will focus on advanced techniques and tools for making your life easier when working with tests. This part is mainly intended for those who are not using this approach, and it tries to advocate its usage.

Test-driven development principles

The test-driven development process, in its simplest form, consists of three steps:

  1. Writing automated tests for a new functionality or improvement that has not been implemented yet.
  2. Providing minimal code that just passes all the defined tests.
  3. Refactoring code to meet the desired quality standards.

The most important fact to remember about this development cycle is that tests should be written before implementation. It is not an easy task for inexperienced developers, but it is the only approach that guarantees that the code you are going to write will be testable.

For example, a developer who is asked to write a function that checks whether a given number is prime first writes a few examples of how to use it and what the expected results are:

assert is_prime(5)
assert is_prime(7)
assert not is_prime(8)

The developer that implements the feature does not need to be the only one responsible for providing tests. The examples can be provided by another person as well. For instance, very often the official specifications of network protocols or cryptography algorithms provide test vectors that are intended to verify correctness of implementation. These are a perfect basis for test cases.

From there, the function can be implemented until the preceding examples work:

def is_prime(number):
    for element in range(2, number):
        if number % element == 0:
            return False
    return True

A bug or an unexpected result is a new example of usage the function should be able to deal with:

>>> assert not is_prime(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

The code can be changed accordingly, until the new test passes:

def is_prime(number):
    if number in (0, 1):
        return False

    for element in range(2, number):
        if number % element == 0:
            return False

    return True

And more cases show that the implementation is still incomplete:

>>> assert not is_prime(-3) 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AssertionError

The updated code is as follows:

def is_prime(number):
    if number < 0 or number in (0, 1):
        return False

    for element in range(2, number):
        if number % element == 0:
            return False

    return True

From there, all tests can be gathered in a test function, which is run every time the code evolves:

def test_is_prime():
    assert is_prime(5)
    assert is_prime(7)

    assert not is_prime(8)
    assert not is_prime(0)
    assert not is_prime(1)

    assert not is_prime(-1)
    assert not is_prime(-3)
    assert not is_prime(-6)

Every time we come up with a new requirement, the test_is_prime() function should be updated first to define the expected behavior of the is_prime() function. Then, the test is run to check whether the implementation delivers the desired results. The code of the tested function needs to be updated only if the tests are known to be failing.
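The third step of the cycle, refactoring, is where the gathered tests pay off. The following is only a sketch and not part of the original example: it assumes Python 3.8 or later for math.isqrt() and replaces the full-range loop with one that stops at the integer square root. Re-running the unchanged test_is_prime() function is what confirms that the refactoring did not alter the behavior.

```python
import math


def is_prime(number):
    """Refactored variant (an assumption of this sketch, not the chapter's code).

    Only candidate divisors up to the integer square root are tried.
    """
    if number < 2:
        return False

    for element in range(2, math.isqrt(number) + 1):
        if number % element == 0:
            return False

    return True


def test_is_prime():
    # Same assertions as before the refactoring.
    assert is_prime(5)
    assert is_prime(7)

    assert not is_prime(8)
    assert not is_prime(0)
    assert not is_prime(1)

    assert not is_prime(-1)
    assert not is_prime(-3)
    assert not is_prime(-6)


# Raises AssertionError if the refactoring broke any known case.
test_is_prime()
```

If the call at the bottom finishes silently, the refactored function still satisfies every requirement collected so far.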

Test-driven development provides a lot of benefits:

  • It helps to prevent software regression
  • It improves software quality
  • It provides a kind of low-level documentation of code behavior
  • It allows you to produce robust code faster in short development cycles

The best convention for dealing with tests is to gather all of them in a single module or package (usually named tests) and have an easy way to run the whole suite using a single shell command. Fortunately, there is no need to build whole test tool chains all by yourself. Both the Python standard library and the Python Package Index come with plenty of test frameworks and utilities that allow you to build, discover, and run tests in a convenient way. We will discuss the most notable examples of such packages and modules later in this chapter.

Preventing software regression

We all face software regression issues in our developer lives. Software regression is a new bug introduced by a change. It manifests when features or functionalities that were known to be working in the previous versions of the software get broken and stop working at some point during project development.

The main reason for regressions is the high complexity of software. At some point, it is impossible to guess what a single change in the codebase might lead to. Changing some code might break some other features and sometimes lead to vicious side effects, such as silently corrupting data. And high complexity is not only a problem of huge codebases. There is, of course, an obvious correlation between the amount of code and its complexity, but even small projects (a few hundred or a few thousand lines of code) may have such a convoluted architecture that it is hard to predict all the consequences of relatively small changes.

To avoid regression, the whole set of features the software provides should be tested every time a change occurs. Without this, you are not able to reliably tell the difference between bugs that have always existed in your software and new ones introduced to parts that were working correctly just some time ago.

Opening up a codebase to several developers amplifies the problem, since each person will not be fully aware of all the development activities. While having a version control system prevents conflicts, it does not prevent all unwanted interactions.

TDD helps reduce software regression. The whole software can be automatically tested after each change. This will work as long as each feature has the proper set of tests. When TDD is properly done, the testbase grows together with the codebase.

Since a full test campaign can last for quite a long time, it is a good practice to delegate it to some continuous integration system which can do the work in the background. We discussed such solutions already in Chapter 8, Managing Code. Nevertheless, the local re-launching of the tests should be performed manually by the developer too, at least for the concerned modules. Relying only on continuous integration will have a negative effect on the developers' productivity. Programmers should be able to run selections of tests easily in their environments. This is the reason why you should carefully choose testing tools for the project.

Improving code quality

When a new module, class, or function is written, the developer focuses on how to write it and how to produce the best possible piece of code. But while concentrating on algorithms, the developer might lose the user's point of view: How and when will the function be used? Are the arguments easy and logical to use? Is the name of the API right?

These questions are partly answered by applying the tips described in the previous chapters, such as Chapter 4, Choosing Good Names. But the only way to answer them efficiently is to write usage examples. This is the moment when the developer realizes whether the code is logical and easy to use. Often, the first refactoring occurs right after the module, class, or function is finished.

Writing tests, which are use cases for the code, helps in keeping the user's point of view. Developers will, therefore, often produce better code when they use TDD. It is difficult to test gigantic functions and huge monolithic classes, so code that is written with testing in mind tends to be architected more cleanly and modularly.

Providing the best developer documentation

Tests are the best place for a developer to learn how software works. They are the use cases the code was primarily created for. Reading them provides a quick and deep insight into how the code works. Sometimes an example is worth a thousand words.

The fact that these tests are always up to date with the codebase makes them the best developer documentation that a piece of software can have. Tests don't go stale in the same way documentation does, otherwise they would fail.

Producing robust code faster

Writing without testing leads to long debugging sessions. A consequence of a bug in one module might manifest itself in a completely different part of the software. Since you don't know who to blame, you spend an inordinate amount of time debugging. It's better to fight small bugs one at a time when a test fails, because you'll have a better clue as to where the real problem is. And testing is often more fun than debugging because it is coding.

If you measure the time taken to fix the code together with the time taken to write it, it will usually be longer than the time a TDD approach would take. This is not obvious when you start a new piece of code. This is because the time taken to set up a test environment and write the first few tests is extremely long compared to the time taken just to write the first pieces of code.

But there are some test environments that are really hard to set up. For instance, when your code interacts with an LDAP or an SQL server, writing tests is not obvious at all. This is covered in the Fakes and mocks section in this chapter.

What kind of tests?

There are several kinds of tests that can be made on any software. The main ones are acceptance tests (or functional tests) and unit tests, and these are the ones that most people think of when discussing the topic of software testing. But there are a few other kinds of tests that you can use in your project. We will discuss some of them shortly in this section.

Acceptance tests

An acceptance test focuses on a feature and deals with the software like a black box. It just makes sure that the software really does what it is supposed to do, using the same interface as the users and checking the output. These tests are usually written outside of the development cycle to validate that the application meets the requirements. They are usually run as a checklist over the software. Often, these tests are not done through TDD and are built by managers, QA staff, or even customers. In that case, they are often called user acceptance tests.

Still, they can and they should be done with TDD principles. Tests can be provided before the features are written. Developers get a pile of acceptance tests, usually made out of the functional specifications, and their job is to make sure the code will pass all of them.

The tools used to write those tests depend on the user interface the software provides. Some popular tools used by Python developers are:

Application type             Tool
Web application              Selenium (for Web UI with JavaScript)
Web application              zope.testbrowser (doesn't test JS)
WSGI application             paste.test.fixture (doesn't test JS)
Gnome Desktop application    dogtail
Win32 Desktop application    pywinauto

Note

For an extensive list of functional testing tools, Grig Gheorghiu maintains a wiki page at https://wiki.python.org/moin/PythonTestingToolsTaxonomy.

Unit tests

Unit tests are low-level tests that perfectly fit test-driven development. As the name suggests, they focus on testing software units. A software unit can be understood as the smallest testable piece of the application code. Depending on the application, the size may vary from whole modules to a single method or function, but usually unit tests are written for the smallest fragments of code possible. Unit tests usually isolate the tested unit (module, class, function, and so on) from the rest of the application and other units. When external dependencies are required, such as web APIs or databases, they are often replaced by fake objects or mocks.
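To make the idea of isolating a unit more concrete, here is a minimal sketch that uses unittest.mock from the standard library. The get_display_name() function, the fetch_user() method, and the payload layout are all hypothetical and exist only for illustration; the point is that the test never touches a real web API.

```python
import unittest
from unittest import mock


def get_display_name(client, user_id):
    """Hypothetical unit under test: formats a name fetched from a web API."""
    payload = client.fetch_user(user_id)
    return payload["name"].title()


class DisplayNameTests(unittest.TestCase):
    def test_name_is_title_cased(self):
        # The mock stands in for the real API client (an assumed interface).
        fake_client = mock.Mock()
        fake_client.fetch_user.return_value = {"name": "ada lovelace"}

        self.assertEqual(get_display_name(fake_client, 42), "Ada Lovelace")
        # The unit is expected to query the dependency exactly once.
        fake_client.fetch_user.assert_called_once_with(42)


# Run the case programmatically so the sketch works when pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DisplayNameTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the dependency is passed in as an argument, replacing it with a mock requires no patching at all, which is one reason code written with testing in mind tends to take this shape.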

Functional tests

Functional tests focus on whole features and functionalities instead of small code units. They are similar in their purpose to acceptance tests. The main difference is that functional tests do not necessarily need to use the same interface that a user does. For instance, when testing web applications, some of the user interactions (or its consequences) can be simulated by synthetic HTTP requests or direct database access, instead of simulating real page loading and mouse clicks.
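A synthetic request can be sketched with nothing but the standard library. The example below assumes a tiny, hypothetical WSGI application (app) and a hypothetical helper (call_app); only wsgiref.util.setup_testing_defaults() comes from Python itself. The application is called directly with a hand-built environment, so no server, browser, or mouse click is involved.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Hypothetical WSGI application standing in for a real project."""
    if environ["PATH_INFO"] == "/ping":
        status, body = "200 OK", b"pong"
    else:
        status, body = "404 Not Found", b"not found"
    start_response(status, [("Content-Type", "text/plain")])
    return [body]


def call_app(path):
    """Send a synthetic GET request to the app (illustrative helper)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def test_ping():
    assert call_app("/ping") == ("200 OK", b"pong")
    assert call_app("/missing")[0] == "404 Not Found"


test_ping()
```

The test exercises a whole feature (a request and its response) but skips the layers a real user would go through, which is exactly the trade-off described above.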

This approach is often easier and faster than testing with tools used in user acceptance tests. The downside of limited functional tests is that they tend not to cover enough parts of the application where different abstraction layers and components meet. Tests that focus on such meeting points are often called integration tests.

Integration tests

Integration tests represent a higher level of testing than unit tests. They test bigger parts of code and focus on situations where many application layers or components meet and interact with each other. The form and scope of integration tests varies depending on the project's architecture and complexity. For example, in small and monolithic projects, this may be as simple as running more complex functional tests and allowing them to interact with real backing services (databases, caches, and so on) instead of mocking or faking them. For complex scenarios or products that are built from multiple services, the real integration tests may be very extensive and even require running the whole project in a big distributed environment that mirrors the production.

Integration tests are often very similar to functional tests and the border between them is very blurry. It is very common that integration tests are also logically testing separate functionalities and features.

Load and performance testing

Load tests and performance tests provide objective information about code efficiency rather than its correctness. Some people use the terms load testing and performance testing interchangeably, but the first in fact refers to a limited aspect of performance. Load testing focuses on measuring how code behaves under some artificial demand (load). This is a very popular way of testing web applications, where load is understood as web traffic from real users or programmatic clients. It is important to note that load tests tend to cover whole requests to the application, so they are very similar to integration and functional tests. This makes it important to verify beforehand that the tested application components work correctly. Performance tests are, more generally, all the tests that aim to measure code performance, and they can target even small units of code. So, load tests are only a specific subtype of performance tests.

They are a special kind of test because they do not provide binary results (failure/success) but only some measurement of performance quality. This means that single results need to be interpreted and/or compared with the results of different test runs. In some cases, the project requirements may set some hard time or resource constraints on the code, but this does not change the fact that there is always some arbitrary interpretation involved in these kinds of testing approaches.
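A small performance test that targets a single unit of code could look like the following sketch, which relies on the timeit module from the standard library. The two function variants, the measure() helper, and the chosen argument are assumptions made for illustration. Note that the output is a pair of timings to compare, not a pass/fail verdict, and the numbers will differ between machines and runs.

```python
import timeit


def is_prime_full_scan(number):
    """Tries every candidate divisor below number."""
    if number < 2:
        return False
    return all(number % element for element in range(2, number))


def is_prime_half_scan(number):
    """Hypothetical variant that stops at number // 2."""
    if number < 2:
        return False
    return all(number % element for element in range(2, number // 2 + 1))


def measure(func, argument=7919, repeat=5, number=200):
    """Return the best time in seconds over several repeated runs."""
    timer = timeit.Timer(lambda: func(argument))
    return min(timer.repeat(repeat=repeat, number=number))


full = measure(is_prime_full_scan)
half = measure(is_prime_half_scan)
print(f"full scan: {full:.6f}s, half scan: {half:.6f}s")
```

Taking the minimum over several repetitions is a common way to reduce noise from other processes, but interpreting the difference between the two numbers is still up to the reader.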

Load and performance tests are a great tool during the development of any software that needs to fulfill some Service Level Agreements because they help to reduce the risk of compromising the performance of critical code paths. Still, they should not be overused.

Code quality testing

Code quality does not have an objective scale that would say definitively whether it is bad or good. Unfortunately, the abstract concept of code quality cannot be measured directly and expressed in the form of numbers. Instead, we can measure various metrics of the software that are known to be highly correlated with the quality of code. To name a few:

  • The number of code style violations
  • The amount of documentation
  • Complexity metrics, such as McCabe's cyclomatic complexity
  • The number of static code analysis warnings

Many projects use code quality testing in their continuous integration workflows. A good and popular approach is to test at least the basic metrics (static code analysis and code style violations) and not allow merging any code into the main branch that makes these metrics worse.
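As an illustration of how such a metric can be computed, the sketch below estimates the amount of documentation as the share of functions and classes that have a docstring. It uses the ast module from the standard library; the docstring_ratio() helper and the sample source are hypothetical, and dedicated tools measure this far more thoroughly.

```python
import ast

# Sample input for the sketch; a real check would read project files.
SOURCE = '''
def documented():
    """Has a docstring."""


def undocumented():
    pass
'''


def docstring_ratio(source):
    """Return the fraction of functions and classes that have a docstring."""
    tree = ast.parse(source)
    definitions = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not definitions:
        return 1.0  # nothing to document

    documented = sum(1 for node in definitions if ast.get_docstring(node))
    return documented / len(definitions)


ratio = docstring_ratio(SOURCE)
print(f"documented definitions: {ratio:.0%}")
```

A continuous integration job could compare this number against the value from the main branch and reject changes that lower it.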

Python standard test tools

Python provides two main modules in the standard library to write tests: unittest and doctest.

unittest

unittest basically provides what JUnit does for Java. It offers a base class called TestCase, which has an extensive set of methods to verify the output of function calls and statements.
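A few of these verification methods are shown in the sketch below. The test class and the values being checked are made up for illustration; the assertion methods themselves (assertEqual, assertIn, assertAlmostEqual, and assertRaises) are part of TestCase.

```python
import unittest


class AssertionExamples(unittest.TestCase):
    def test_equality(self):
        self.assertEqual(sorted([3, 1, 2]), [1, 2, 3])

    def test_membership(self):
        self.assertIn(7, {2, 3, 5, 7})

    def test_float_comparison(self):
        # Floating-point sums are compared up to a number of decimal places.
        self.assertAlmostEqual(0.1 + 0.2, 0.3, places=7)

    def test_expected_exception(self):
        # The block passes only if the expected exception is raised inside it.
        with self.assertRaises(ZeroDivisionError):
            1 / 0


# Run the case programmatically so the sketch works when pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssertionExamples)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Compared with a bare assert statement, these methods report both the expected and the actual value when a check fails.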

This module was created to write unit tests, but acceptance tests can also be written with it as long as the test uses the user interface. For instance, some testing frameworks provide helpers to drive tools such as Selenium on top of unittest.

Writing a simple unit test for a module using unittest is done by subclassing TestCase and writing methods with the test prefix. The final example from the Test-driven development principles section will look like this:

import unittest

from primes import is_prime


class MyTests(unittest.TestCase):
    def test_is_prime(self):
        self.assertTrue(is_prime(5))
        self.assertTrue(is_prime(7))

        self.assertFalse(is_prime(8))
        self.assertFalse(is_prime(0))
        self.assertFalse(is_prime(1))

        self.assertFalse(is_prime(-1))
        self.assertFalse(is_prime(-3))
        self.assertFalse(is_prime(-6))


if __name__ == "__main__":
    unittest.main()

The unittest.main() function is a utility that makes the whole module executable as a test suite:

$ python test_is_prime.py -v
test_is_prime (__main__.MyTests) ... ok

----------------------------------------------------------------------
Ran 1 test in 0.000s

OK

The unittest.main() function scans the context of the current module and looks for classes that subclass TestCase. It instantiates them, then runs all methods that start with the test prefix.

A good test suite follows the common and consistent naming conventions. For instance, if the is_prime function is included in the primes.py module, the test class could be called PrimesTests and put into the test_primes.py file:

import unittest

from primes import is_prime


class PrimesTests(unittest.TestCase):
    def test_is_prime(self):
        self.assertTrue(is_prime(5))
        self.assertTrue(is_prime(7))

        self.assertFalse(is_prime(8))
        self.assertFalse(is_prime(0))
        self.assertFalse(is_prime(1))

        self.assertFalse(is_prime(-1))
        self.assertFalse(is_prime(-3))
        self.assertFalse(is_prime(-6))


if __name__ == '__main__':
    unittest.main()

From there, every time the primes module evolves, the test_primes module gets more tests.
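As the number of checked values grows, a single test method full of assertions stops at the first failure and hides the rest. One possible way to organize such cases, sketched here under the assumption of Python 3.4 or later, is the subTest() context manager of TestCase. The is_prime() function is inlined only to keep the sketch runnable on its own; the split into two test methods is an illustrative choice.

```python
import unittest


def is_prime(number):
    """Copy of the chapter's final implementation, inlined for the sketch."""
    if number < 0 or number in (0, 1):
        return False

    for element in range(2, number):
        if number % element == 0:
            return False

    return True


class PrimesTests(unittest.TestCase):
    def test_primes(self):
        for number in (5, 7):
            # Each value is reported separately if it fails.
            with self.subTest(number=number):
                self.assertTrue(is_prime(number))

    def test_non_primes(self):
        for number in (8, 0, 1, -1, -3, -6):
            with self.subTest(number=number):
                self.assertFalse(is_prime(number))


# Run the case programmatically so the sketch works when pasted anywhere.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(PrimesTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With this layout, adding a new requirement usually means adding one value to a tuple rather than another assertion line.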

In order to work, the test_primes module needs to have the primes module available in the context. This can be achieved either by having both modules in the same package or by adding the tested module explicitly to the Python path. In practice, the develop command of setuptools is very helpful here.

Running tests over the whole application presupposes that you have a script that builds a test campaign out of all test modules. unittest provides a TestSuite class that can aggregate tests and run them as a test campaign, as long as they are all instances of TestCase or TestSuite.

In the past, there was a convention that a test module provides a test_suite function that returns a TestSuite instance. That instance was either used in the __main__ section, when the module was called from the command line, or used by a test runner:

import unittest

from primes import is_prime


class PrimesTests(unittest.TestCase):
    def test_is_prime(self):
        self.assertTrue(is_prime(5))

        self.assertTrue(is_prime(7))

        self.assertFalse(is_prime(8))
        self.assertFalse(is_prime(0))
        self.assertFalse(is_prime(1))

        self.assertFalse(is_prime(-1))
        self.assertFalse(is_prime(-3))
        self.assertFalse(is_prime(-6))


class OtherTests(unittest.TestCase):
    def test_true(self):
        self.assertTrue(True)


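def test_suite():
    """Build the test suite."""
    # unittest.makeSuite() was removed in Python 3.13; the loader method
    # used here is the equivalent call and also works on older versions.
    loader = unittest.defaultTestLoader
    suite = unittest.TestSuite()
    suite.addTests(loader.loadTestsFromTestCase(PrimesTests))
    suite.addTests(loader.loadTestsFromTestCase(OtherTests))

    return suite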


if __name__ == '__main__':
    unittest.main(defaultTest='test_suite')

Running this module from the shell will print the test campaign output:

$ python test_primes.py -v
test_is_prime (__main__.PrimesTests) ... ok
test_true (__main__.OtherTests) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK

The preceding approach was required in the older versions of Python, when the unittest module did not have proper test discovery utilities. Usually, all tests were run by a global script that browsed the code tree looking for tests and ran them. This is called test discovery and will be covered more extensively later in this chapter. For now, you should only know that unittest provides a simple command that can discover all tests from modules and packages with a test prefix:

$ python -m unittest -v
test_is_prime (test_primes.PrimesTests) ... ok
test_true (test_primes.OtherTests) ... ok

----------------------------------------------------------------------
Ran 2 tests in 0.001s

OK

If you use the preceding command, then there is no requirement to manually define the __main__ sections and invoke the unittest.main() function.
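The same discovery mechanism is also available from Python code through the discover() method of TestLoader. The sketch below is only an illustration: it writes a throwaway test module (the file name test_discovery_demo.py and its content are made up) into a temporary directory and lets the loader find and run it.

```python
import pathlib
import tempfile
import unittest

# Content of a throwaway test module, written to disk only for this sketch.
TEST_MODULE = '''
import unittest


class DemoTests(unittest.TestCase):
    def test_addition(self):
        self.assertEqual(1 + 1, 2)
'''


def discover_and_run(directory):
    """Find every module matching test*.py in directory and run its tests."""
    loader = unittest.TestLoader()
    suite = loader.discover(str(directory), pattern="test*.py")
    return unittest.TextTestRunner(verbosity=0).run(suite)


with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "test_discovery_demo.py").write_text(TEST_MODULE)
    result = discover_and_run(tmp)

print(f"ran {result.testsRun} test(s), success: {result.wasSuccessful()}")
```

In everyday work, the shell command is more convenient; the programmatic form is mostly useful when a project needs a custom test runner script.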

doctest

doctest is a module that extracts snippets in the form of interactive prompt sessions from docstrings or text files and replays them to check whether the example output is the same as the real one.

For instance, the text file with the following content could be run as a test:

Check addition of integers works as expected::

>>> 1 + 1
2

Let's assume this documentation file is stored in the filesystem under the name test.rst. The doctest module provides some functions to extract and run the tests from such documentation:

>>> import doctest
>>> doctest.testfile('test.rst', verbose=True)
Trying:
    1 + 1
Expecting:
    2
ok
1 items passed all tests:
   1 tests in test.rst
1 tests in 1 items.
1 passed and 0 failed.
Test passed.
TestResults(failed=0, attempted=1)
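Prompt sessions embedded in docstrings are replayed in the same way. The following sketch puts a few examples into the docstring of is_prime() and runs them with the parser and runner classes of doctest. The run_docstring() helper is hypothetical and exists only to keep the example self-contained.

```python
import doctest


def is_prime(number):
    """Tell whether number is a prime.

    >>> is_prime(7)
    True
    >>> is_prime(8)
    False
    >>> is_prime(-3)
    False
    """
    if number < 0 or number in (0, 1):
        return False

    for element in range(2, number):
        if number % element == 0:
            return False

    return True


def run_docstring(func):
    """Replay the prompt sessions found in the docstring of func."""
    parser = doctest.DocTestParser()
    # Arguments: text, globals for the examples, test name, filename, lineno.
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_docstring(is_prime)
print(results)
```

In an ordinary module there is no need for such a helper: calling doctest.testmod() or running python -m doctest on the file checks every docstring at once.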

Using doctest has many advantages:

  • Packages can be documented and tested through examples
  • Documentation examples are always up to date
  • Using examples in doctests to write a package helps to maintain the user's point of view

However, doctests do not make unit tests obsolete; they should be used only to provide human-readable examples in documents. In other words, when the tests are concerning low-level matters or need complex test fixtures that would obfuscate the document, they should not be used.

Some Python frameworks such as Zope use doctests extensively, and they are at times criticized by people who are new to the code. Some doctests are really hard to read and understand, since the examples break one of the rules of technical writing—they cannot be taken and run in a simple prompt, and they need extensive knowledge. So, documents that are supposed to help newcomers are really hard to read because the code examples, which are doctests built through TDD, are based on complex test fixtures or even specific test APIs.

Note

As explained in Chapter 9, Documenting Your Project, when you use doctests that are part of the documentation of your packages, be careful to follow the seven rules of technical writing.

At this stage, you should have a good overview of what TDD brings. If you are still not convinced, you should give it a try over a few modules. Write a package using TDD and measure the time spent on building, debugging, and then refactoring. You should find out quickly that it is truly superior.
