Chapter 9. Testing

Writing an application is only part of the process; it's also important to check that all the code works as it should. You can visually inspect the code, but it's better to execute it in a variety of situations that may arise in the real world and make sure it behaves properly in each one. This process is called unit testing because the goal is to test the smallest available units of execution.

Typically, the smallest unit is a function or method, many of which combine to form a full application. By breaking it down into individual units, you can minimize how much each test is responsible for. This way, a failure of any particular unit doesn't involve hundreds of lines of code, so it's easier to track down exactly what's going wrong.

Testing each individual unit can be a lengthy process for large applications, though, given how many scenarios you may need to take into account. Rather than try to get through all of it manually, you can automate the process by letting your code do the heavy lifting. Writing a test suite allows you to easily try all the different paths your code might take, verifying that each behaves as it should.

Test-Driven Development (TDD)

One of the more extreme examples of automated testing is the practice of test-driven development, often referred to simply as TDD. As the name implies, this practice uses automated testing to drive the development process. Whenever a new feature is written, tests for that feature are written first—tests that will fail right away. Once the tests are in place, you would write code to make sure those tests pass.

One value of this approach is that it encourages you to understand the desired behavior more thoroughly before setting out to write the code. For example, a function that processes text might have a number of common input strings, each with a desired output. Writing the test first encourages you to think about the output string for each available input string, without regard to how the string is processed internally. By shifting the focus away from code at the outset, it's easier to see the big picture.

The more obvious advantage, though, is that it ensures that every piece of code in an application has a set of tests associated with it. When code comes first, it's all too easy to run a few basic scenarios manually, then move on to coding the next feature. Tests can get lost in the shuffle, even though they're essential to the long-term health of the project. Getting in the habit of writing tests first is a good way to make sure they do get written.

Unfortunately, many developers find test-driven development far too strict for practical work. As long as the tests get written as comprehensively as possible, though, your code will reap the benefits. One of the easiest ways to do this is to write doctests.

Doctests

The topic of documentation was well covered in the previous chapter, but one particular aspect of it can be useful for testing. Since Python supports docstrings that can be processed by code, rather than just by people, the content within those strings can be used to perform basic tests as well.

In order to play double-duty alongside regular documentation, doctests must look like documentation, while still being something that can be parsed, executed and verified for correctness. One format fits that bill very conveniently, and it's been in use throughout this book. Doctests are formatted as interactive interpreter sessions, which already contain both input and output in an easily identifiable format.

Formatting Code

Even though the overall format of a doctest is identical to the interpreter sessions shown throughout this book, there are some specific details that are important to identify. Each line of code to execute begins with three right angle brackets (>>>) and a single space, followed by the code itself.

>>> a = 2

Just like the interactive interpreter, any code that extends beyond one line is indicated by new lines beginning with three periods (...) rather than brackets. You can include as many of these as necessary to complete multi-line structures, such as lists and dictionaries, as well as function and class definitions.

>>> b = ('example',
... 'value')
>>> def test():
...     return b * a

All the lines that start with periods like this are combined with the last line that started with angle brackets, and they're all evaluated together. That means you can leave extra lines if necessary, anywhere in the structure or even after it. This is useful for mimicking the output of an actual interpreter session, which requires a blank line to indicate when indented structures, such as functions or classes, are completed.

>>> b = ('example',
...
... 'value')
>>> def test():
...     return b * a
...

Representing Output

With the code in place, we just need to verify that its output matches what's expected. In keeping with the interpreter format, output is presented beneath one or more lines of input code. The exact formatting of the output will depend on the code being executed, but it's the same as you'd see when typing the code into the interpreter directly.

>>> a
2
>>> b
('example', 'value')
>>> test()
('example', 'value', 'example', 'value')

In these examples, the output string is equivalent to passing the return value from the expression into the built-in repr() function. Therefore, strings will always be quoted, and many specific types will have a different format than if you print them directly. Testing the output of str() can be achieved simply by calling str() in the line of code. Alternatively, the print() function is also supported and works just as you'd expect.

>>> for value in test():
...     print(value)
example
value
example
value
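The difference between the repr() and str() forms can be checked with a small docstring of its own. The following is only a sketch: greeting() is a hypothetical function invented for this illustration, and the docstring is run through doctest's parser and runner classes directly so the block works on its own, rather than through the testmod() function introduced later in this chapter.

```python
import doctest

def greeting():
    """
    Return a short greeting.

    A bare expression is displayed the way repr() would show it, quotes included.

    >>> greeting()
    'hello'

    Printing the value shows the str() form instead.

    >>> print(greeting())
    hello
    """
    return 'hello'

# Parse the docstring into a DocTest object and run its examples.
test = doctest.DocTestParser().get_doctest(
    greeting.__doc__, {'greeting': greeting}, 'greeting', None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)

# The results object carries the number of failed and attempted examples.
print(results.failed, results.attempted)
```

Each input/output pair counts as one attempted example, so a docstring like this one reports two attempts.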

In examples like this, all lines of the output are checked against what was actually returned or printed by the code provided. This provides a very readable way to deal with sequences, as shown here. For longer sequences, as well as situations where output is allowed to change from one run to another, output may also include three periods as an ellipsis, indicating a place where additional content should be ignored. Doctest only honors the ellipsis when its ELLIPSIS option is enabled, which can be done for a single example with a directive comment.

>>> for value in test():  # doctest: +ELLIPSIS
...     print(value)
example
...
value
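An ellipsis in ordinary output is only treated as a wildcard when doctest's ELLIPSIS option is turned on. Here is a minimal sketch of one way to do that, using a directive comment on the example itself. The countdown() function is hypothetical, and the docstring is executed with doctest's parser and runner classes so that the block is self-contained.

```python
import doctest

def countdown():
    """
    Print the numbers from 5 down to 1.

    >>> countdown()  # doctest: +ELLIPSIS
    5
    ...
    1
    """
    for number in range(5, 0, -1):
        print(number)

# Parse the docstring and run the single example it contains.
test = doctest.DocTestParser().get_doctest(
    countdown.__doc__, {'countdown': countdown}, 'countdown', None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```

Without the directive, the three periods would be compared literally against the printed numbers and the example would fail.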

This form is particularly useful when testing exceptions, because the interpreter output includes file paths, which will nearly always change from one system to another, and aren't relevant to most tests. In these cases, what's important to test is that the exception is raised, that it's the correct type and that its value, if any, is correct.

>>> for value in test:
...     print(value)
Traceback (most recent call last):
  ...
TypeError: 'function' object is not iterable

As the output format here suggests, the doctest will verify the first and last lines of the exception output, while ignoring the entire traceback in between. Since the traceback details are typically irrelevant to the documentation as well, this format is also much more readable.
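The same idea can be tried on a function of your own. In this sketch, halve() is a hypothetical function whose docstring checks one ordinary result and one exception. Doctest handles the body of a traceback specially, so no option flag is needed for this form. As before, the docstring is run with doctest's parser and runner classes to keep the block self-contained.

```python
import doctest

def halve(value):
    """
    Divide a number by two.

    >>> halve(10)
    5.0
    >>> halve('ten')
    Traceback (most recent call last):
      ...
    TypeError: unsupported operand type(s) for /: 'str' and 'int'
    """
    return value / 2

# Only the header line and the final exception line are compared.
test = doctest.DocTestParser().get_doctest(
    halve.__doc__, {'halve': halve}, 'halve', None, 0)
results = doctest.DocTestRunner(verbose=False).run(test)
print(results.failed, results.attempted)
```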

Integrating With Documentation

Since the tests are meant to be built into documentation, there needs to be a way to make sure that only the tests get executed. In order to distinguish between the two without interrupting the flow of documentation, tests are set aside by nothing more than an extra newline. You'd always have to use one newline to avoid them all running together on a single line, so adding an extra simply leaves one blank line between the two.

"""
This is an example of placing documentation alongside tests in a single string.

>>> print('Hello, world!')
Hello, world!

Additional documentation can be placed between snippets of code, and it won't
disturb the behavior or validity of the tests.
"""

Running Tests

The actual execution of doctests is provided by the doctest module. In the simplest form, you can run a single function to test an entire module. This is useful when writing a set of tests for a file that was already written because you can easily test the file individually after writing new tests. Simply import doctest and run its testmod() function to test the module. Here's an example module that contains a couple types of doctests.

def times2(value):
    """
    Multiplies the provided value by two. Because input objects can override
    the behavior of multiplication, the result can be different depending on
    the type of object passed in.

    >>> times2(5)
    10
    >>> times2('test')
    'testtest'
    >>> times2(('a', 1))
    ('a', 1, 'a', 1)
    """
    return value * 2

if __name__ == '__main__':
    import doctest
    doctest.testmod()

The docstring in the times2() function includes tests and, because it's available as a module-level function, testmod() can see it and execute the tests. This simple construct allows you to call the module directly from the command line and see the results of all doctests in the module. For example, if this module was called times2.py, you could invoke it from the command line as follows.

$ python times2.py
$

By default, the output only contains errors and failures, so if all the tests pass, there won't be any output at all. Failures are reported on individual tests, with each input/output combination being considered a unique test. This provides fine-grained details about the nature of the tests that were attempted and how they failed. If the final line in the example doctest were to read just ('a', 1) instead, here's what would happen.

$ python times2.py
**********************************************************************
File "...", line 11, in __main__.times2
Failed example:
    times2(('a', 1))
Expected:
    ('a', 1)
Got:
    ('a', 1, 'a', 1)
**********************************************************************
1 items had failures:
   1 of   3 in __main__.times2
***Test Failed*** 1 failures.
$

When working with more complicated applications and frameworks, though, the simple input/output paradigm of doctests breaks down fairly quickly. In those situations, Python provides a more powerful alternative, the unittest module.

The unittest module

Unlike doctests, which require your tests be formatted in a very specific way, unittest offers much more flexibility by allowing you to write your tests in real Python code. As is often the case, this extra power requires more control over how your tests are defined. In the case of unit tests, this control is provided by way of an object-oriented API for defining individual tests, test suites and data fixtures for use with tests.

After importing the unittest module, the first place to start is the TestCase class, which forms the base of most of the module's features. It doesn't do much on its own, but when subclassed, it offers a rich set of tools to help define and control your tests. These tools are a combination of existing methods that you can use to perform individual tests and new methods you can define to control how your tests work. It all starts by creating a subclass of the TestCase class.

import unittest

class MultiplicationTestCase(unittest.TestCase):
    pass

Setting Up

The starting point for most test cases is the setUp() method, which you can define to perform some tasks at the start of all the tests that will be defined on the class. Common setup tasks include defining static values that will be compared later, opening connections to databases, opening files and loading data to analyze.

This method takes no arguments and doesn't return anything. If you need to control its behavior with any parameters, you'll need to define those in a way that setUp() can access without them being passed in as arguments. A common technique is to check os.environ for specific values that affect the behavior of the tests. Another option is to have a customizable settings module that can be imported in setUp(), which can then modify the test behavior.

Likewise, any values that setUp() defines for later use can't be passed back through its return value. Instead, they can be stored on the TestCase object itself, which will be instantiated prior to running setUp(). The next section will show that individual tests are defined as methods on that same object, so any attributes stored during setup will be available for use by the tests when they execute.

import unittest

class MultiplicationTestCase(unittest.TestCase):
    def setUp(self):
        self.factor = 2
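The os.environ technique might look something like the following sketch. The variable name TEST_FACTOR is purely hypothetical, chosen for this illustration, and the single test method anticipates the next section. The suite is run programmatically with a loader and runner, rather than from the command line, so that the result can be inspected afterward.

```python
import os
import unittest

class ConfiguredTestCase(unittest.TestCase):
    def setUp(self):
        # TEST_FACTOR is a hypothetical environment variable; fall back to 2.
        self.factor = int(os.environ.get('TEST_FACTOR', '2'))

    def testFactorIsPositive(self):
        # Attributes stored by setUp() are available on self in every test.
        self.assertTrue(self.factor > 0)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ConfiguredTestCase)
result = unittest.TextTestRunner().run(suite)
```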

Note

If you look at PEP-8, you'll notice that the name setUp doesn't follow standard Python naming conventions. The capitalization style here is based on the Java testing framework, JUnit. Python's unit testing system was ported from Java, and some of its style carried over as well.

Writing Tests

With the setup in place, you can write some tests to verify whatever behavior you're working with. Like setUp(), these are implemented as custom methods on your test case class. Unlike setUp(), though, there's no single specific method that must implement all the tests. Instead, the test framework will look at your test case class for any methods whose names begin with the word test.

For each method that it finds, the test framework executes setUp() before executing the test method. This helps ensure that each method can rely on a consistent environment, regardless of how many methods there are, what they each do or in what order they're executed. Completely ensuring consistency requires one other step, but that will be covered in the next section.

When writing the body of a test method, the TestCase class offers some utility methods to describe how your code is supposed to work. These are designed in such a way that each represents a condition that must be true in order to continue. There are several of these methods, with each covering a specific type of assertion. If the given assertion passes, the test will continue to the next line of code; otherwise, the test halts immediately and a failure message will be generated. Each method provides a default message to use in case of a failure but also accepts an argument to customize that message.

  • assertTrue(expr, msg=None)—This method tests that the given expression evaluates to True. This is the simplest assertion available, mirroring the built-in assert statement. Using this method ties failures into the test framework, though, so it should be used instead. Older code may spell this method as assert_(), but that alias was deprecated and has been removed as of Python 3.12.

  • assertFalse(expr, msg=None)—The inverse of assertTrue(), this test will only pass if the provided expression evaluates to False.

  • fail(msg=None)—This method generates a failure message explicitly. This is useful if the conditions of the failure are more complex than the built-in methods provide for on their own. Generating a failure is preferable to raising an exception because it indicates that the code failed in a way that the test understands, rather than being unknown.
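As a rough sketch of how these three methods read in practice, the following test case uses each of them once. The class and method names are invented for this example, and the suite is run with a loader and runner so the block works on its own.

```python
import unittest

class BasicAssertionsTestCase(unittest.TestCase):
    def testTruth(self):
        # The optional msg argument replaces the default failure message.
        self.assertTrue('test'.islower(), msg='expected a lowercase string')
        self.assertFalse('test'.isdigit())

    def testExplicitFailure(self):
        # fail() reports a failure for conditions the other methods don't cover.
        for value in (2, 4, 6):
            if value % 2:
                self.fail('%r is not even' % value)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(BasicAssertionsTestCase)
result = unittest.TextTestRunner().run(suite)
```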

These methods alone provide a basic palette for the rest of your tests. To begin converting the earlier doctest to a unit test, we can provide a testNumber() method to simulate the first test that was performed previously. Like doctest, the unittest module also provides a simple function to run all the tests found in the given module; this time, it's called main().

import unittest
import times2

class MultiplicationTestCase(unittest.TestCase):
    def setUp(self):
        self.factor = 2

    def testNumber(self):
        self.assertTrue(times2.times2(5) == 10)

if __name__ == '__main__':
    unittest.main()

Tests are typically stored in a module called tests.py. After saving this file, we can execute it just like the doctest example shown previously.

$ python tests.py
.
----------------------------------------------------------------------
Ran 1 test in 0.001s

OK

Unlike doctests, unit testing does show some statistics by default. Each period represents a single test that was run, so complex applications with dozens, hundreds or even thousands of tests can easily fill several screens with results. Failures and errors are also represented here, using E for errors and F for failures. In addition, each failure will produce a block of text to describe what went wrong. Look what happens when we change the test expression.

import unittest
import times2

class MultiplicationTestCase(unittest.TestCase):
    def setUp(self):
        self.factor = 2

    def testNumber(self):
        self.assertTrue(times2.times2(5) == 42)

if __name__ == '__main__':
    unittest.main()

$ python tests.py
F
======================================================================
FAIL: testNumber (__main__.MultiplicationTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests.py", line 9, in testNumber
    self.assertTrue(times2.times2(5) == 42)
AssertionError: False is not True
----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

As you can see, it shows exactly which test method generated the failure, with a traceback to help track down the code flow that led to the failure. In addition, the failure itself is shown as an AssertionError, with the assertion shown plainly.

In this case, though, the failure message isn't as useful as it could be. All it reports is that False is not True. That's a correct report, of course, but it doesn't really tell the whole story. In order to better track down what went wrong, it'd be useful to know what the function actually returned.

To provide more information about the values involved, we'll need to use a test method that can identify the different values individually. If they're not equal, the test fails just like the standard assertion, but the failure message can now include the two distinct values so you can see how they're different. That can be a valuable tool in determining how and where the code went wrong—which is, after all, the whole point of testing.

  • assertEqual(obj1, obj2, msg=None)—This checks that both objects that were passed in evaluate as equal, utilizing the comparison features shown in Chapter 5 if applicable.

  • assertNotEqual(obj1, obj2, msg=None)—This is similar to assertEqual() except that this method will fail if the two objects are equal.

  • assertAlmostEqual(obj1, obj2, *, places=7, msg=None)—Specifically for numeric values, this method rounds the difference between the two values to the given number of decimal places and checks that the result is zero. This helps account for rounding errors and other problems due to floating point arithmetic.

  • assertNotAlmostEqual(obj1, obj2, *, places=7, msg=None)—The inverse of the previous method, this test fails if the two numbers are equal when rounded to the specified number of digits.
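Floating point arithmetic is where the "almost equal" methods earn their keep. The following sketch, with names invented for this example, shows a sum that fails strict equality but passes once rounding is taken into account.

```python
import unittest

class FloatingPointTestCase(unittest.TestCase):
    def testSum(self):
        total = 0.1 + 0.2
        # Binary floating point leaves a tiny error, so strict equality fails.
        self.assertNotEqual(total, 0.3)
        # The difference rounds to zero at the default seven decimal places.
        self.assertAlmostEqual(total, 0.3)
        # At two places, the values still differ by about a hundredth.
        self.assertNotAlmostEqual(total, 0.31, places=2)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(FloatingPointTestCase)
result = unittest.TextTestRunner().run(suite)
```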

With assertEqual() available, we can change testNumber() to produce a more useful message in the event that the assertion fails.

import unittest
import times2

class MultiplicationTestCase(unittest.TestCase):
    def setUp(self):
        self.factor = 2

    def testNumber(self):
        self.assertEqual(times2.times2(5), 42)

if __name__ == '__main__':
    unittest.main()

$ python tests.py
F
======================================================================
FAIL: testNumber (__main__.MultiplicationTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests.py", line 9, in testNumber
    self.assertEqual(times2.times2(5), 42)
AssertionError: 10 != 42
----------------------------------------------------------------------
Ran 1 test in 0.001s

FAILED (failures=1)

Behind the scenes, assertEqual() does a couple of interesting things to be as flexible and powerful as possible. First, by using the == operator, it can compare the two objects using whatever more efficient method the objects themselves may define. Second, the formatting of the output can be configured by supplying a custom comparison method. Several of these customized methods are provided in the unittest module.

  • assertSetEqual(set1, set2, msg=None)—Because unordered sequences are typically implemented as sets, this method is designed specifically for sets, using the first set's difference() method to determine whether any items are different between the two.

  • assertDictEqual(dict1, dict2, msg=None)—This method is designed specifically for dictionaries, in order to take their values into account as well as their keys.

  • assertListEqual(list1, list2, msg=None)—Similar to assertEqual(), this method is targeted specifically at lists.

  • assertTupleEqual(tuple1, tuple2, msg=None)—Like assertListEqual(), this is a customized equality check, but this time tailored for use with tuples.

  • assertSequenceEqual(seq1, seq2, msg=None)—If you're not working with a list, tuple or a subclass of one of them, this method can be used to do the same job on any object that acts as a sequence.

In addition to these methods provided out of the box, you can add your own to the test framework, so that assertEqual() can more effectively work with your own types. By passing a type and a comparison function into the addTypeEqualityFunc() method, you can register it for use with assertEqual() later on.

Using addTypeEqualityFunc() effectively can be tricky, because it's valid for the entire test case class, no matter how many tests there may be inside it. It may be tempting to add the equality function in the setUp() method, but remember that setUp() gets called once for each test method that was found on the TestCase class. If the equality function will be registered for all tests on that class, there's no point registering it before each one.

A better solution would be to add the addTypeEqualityFunc() call to the __init__() method of the test case class. This also has the additional benefit that you can subclass your own test case class to provide a more suitable base for other tests to work with. That process is explained in more detail later in this chapter.
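Here is a minimal sketch of registering an equality function from __init__(). The Point class and the assertPointEqual() method are hypothetical, written only for this illustration. The registered function receives the two objects along with a msg keyword argument and is expected to report a failure when they differ.

```python
import unittest

class Point:
    # A hypothetical value type used only for this sketch.
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return (self.x, self.y) == (other.x, other.y)

class PointTestCase(unittest.TestCase):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # From now on, assertEqual() delegates to this method for two Points.
        self.addTypeEqualityFunc(Point, self.assertPointEqual)

    def assertPointEqual(self, first, second, msg=None):
        if first != second:
            standard = 'Point(%r, %r) != Point(%r, %r)' % (
                first.x, first.y, second.x, second.y)
            self.fail(msg or standard)

    def testEqualPoints(self):
        self.assertEqual(Point(1, 2), Point(1, 2))

    def testFailureMessage(self):
        # Capture the failure to confirm that the custom message was used.
        with self.assertRaises(self.failureException) as caught:
            self.assertEqual(Point(1, 2), Point(3, 4))
        self.assertIn('Point(1, 2) != Point(3, 4)', str(caught.exception))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(PointTestCase)
result = unittest.TextTestRunner().run(suite)
```

The custom function is only consulted when both arguments to assertEqual() are of exactly the registered type; other comparisons fall back to the default behavior.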

Other Comparisons

Beyond simple equality, unittest.TestCase includes a few other methods that can be used to compare two values. Aimed primarily at numbers, these address the question of whether a tested value is less than or greater than what was expected.

  • assertGreater(obj1, obj2, msg=None)—Similar to the tests for equality, this tests whether the first object is greater than the second. Like equality, this also delegates to methods on the two objects, if applicable.

  • assertGreaterEqual(obj1, obj2, msg=None)—This works just like assertGreater(), except that the test also passes if the two objects compare as equal.

  • assertLess(obj1, obj2, msg=None)—This test passes if the first object compares as less than the second object.

  • assertLessEqual(obj1, obj2, msg=None)—Like assertLess(), this tests whether the first object is less than the second but also passes if both are equal.
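A short sketch shows the four comparison methods side by side. The times2() function is reproduced inline from earlier in the chapter so that the block runs on its own; the test case name is invented for this example.

```python
import unittest

def times2(value):
    # Copied from the chapter's example module.
    return value * 2

class ComparisonTestCase(unittest.TestCase):
    def testOrdering(self):
        self.assertGreater(times2(5), 5)
        self.assertGreaterEqual(times2(5), 10)
        self.assertLess(len('test'), len(times2('test')))
        # Tuples compare item by item, so equal tuples satisfy "less or equal".
        self.assertLessEqual((1, 2), (1, 2))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ComparisonTestCase)
result = unittest.TextTestRunner().run(suite)
```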

Testing Strings and Other Sequence Content

Sequences present an interesting challenge because they're made up of multiple individual values. Any value in a sequence could determine the success or failure of a given test, so it's necessary to have tools to work with them specifically. First, there are two methods designed for strings, where simple equality may not always be sufficient.

  • assertMultiLineEqual(obj1, obj2, msg=None)—This is a specialized form of assertEqual(), designed for multi-line strings. Equality works like any other string, but the default failure message is optimized to show the differences between the values.

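  • assertRegexpMatches(text, regexp, msg=None)—This tests whether the given regular expression matches the text provided. Python 3.2 renamed this method to assertRegex(), and the older name has been removed as of Python 3.12.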

More generally, tests for sequences need to make sure that certain items are present in the sequence in order to pass. The equality methods shown previously will only work if the entire sequence must be equal. In the event that some items in the sequence are important but the rest can be different, we'll need to use some other methods to verify that.

  • assertIn(obj, seq, msg=None)—This tests whether the object is present in the given sequence.

  • assertNotIn(obj, seq, msg=None)—This works like assertIn() except that it fails if the object exists as part of the given sequence.

  • assertDictContainsSubset(dict1, dict2, msg=None)—This method takes the functionality of assertIn() and applies it specifically to dictionaries. Like the assertDictEqual() method, this specialization allows it to also take the values into account instead of just the keys.

  • assertSameElements(seq1, seq2, msg=None)—This tests all the items in two sequences and passes only if the items in both sequences are identical. This only tests for the presence of individual items, not their order within each sequence. This will also accept two dictionaries but will treat it as any other sequence, so it will only look at the keys in the dictionary, not their associated values. Later versions of Python dropped this method in favor of assertCountEqual(), which also compares how many times each item appears.
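The following sketch applies several of these methods to the output of times2(), which is reproduced inline so the block is self-contained. It assumes Python 3.2 or later, so it uses assertRegex() and assertCountEqual(), the current names for the regular expression and unordered comparison checks.

```python
import unittest

def times2(value):
    # Copied from the chapter's example module.
    return value * 2

class SequenceTestCase(unittest.TestCase):
    def testContents(self):
        result = times2(('a', 1))
        self.assertIn('a', result)
        self.assertNotIn('b', result)
        # Order is ignored, but each item must occur the same number of times.
        self.assertCountEqual(result, [1, 'a', 1, 'a'])

    def testText(self):
        self.assertRegex(times2('test'), r'^(test)+$')
        self.assertMultiLineEqual(times2('line\n'), 'line\nline\n')

suite = unittest.defaultTestLoader.loadTestsFromTestCase(SequenceTestCase)
result = unittest.TextTestRunner().run(suite)
```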

Testing Exceptions

So far, all the test methods have taken a positive approach, where the test verifies that a successful outcome really is successful. It's just as important to verify unsuccessful outcomes, though, because they still need to be reliable. Many functions are expected to raise exceptions in certain situations, and unit testing is just as useful in verifying that behavior.

  • assertRaises(exception, callable, *args, **kwargs)—Rather than checking a specific value, this method tests a callable to see that it raises a particular exception. In addition to the exception type and the callable to test, it also accepts any number of positional and keyword arguments. These extra arguments will be passed to the callable that was supplied, so that multiple flows can be tested.

  • assertRaisesRegexp(exception, regex, callable, *args, **kwargs)—This method is slightly more specific than assertRaises() because it also accepts a regular expression that must match the exception's string value in order to pass. The expression can be passed in as a string or as a compiled regular expression object. Python 3.2 renamed this method to assertRaisesRegex(), and the older name has been removed as of Python 3.12.

In our times2 example, there are many types of values that can't be multiplied by an integer. Those situations can be part of the explicit behavior of the function, as long as they're handled consistently. The typical response would be to raise a TypeError, as Python does by default. Using the assertRaises() method, we can test for this as well.

import unittest
import times2

class MultiplicationTestCase(unittest.TestCase):
    def setUp(self):
        self.factor = 2

    def testNumber(self):
        self.assertEqual(times2.times2(5), 10)

    def testInvalidType(self):
        self.assertRaises(TypeError, times2.times2, {})

Some situations are a bit more complicated, which can cause difficulties with testing. One common example is an object that overrides one of the standard operators. You could call the overridden method by name, but it would be more readable to simply use the operator itself. Unfortunately, the normal form of assertRaises() requires a callable, rather than just an expression.

To address this, both of these methods can act as context managers using a with block. In this form, you don't supply a callable or arguments, but rather just pass in the exception type and, if using assertRaisesRegexp(), a regular expression. Then, in the body of the with block, you can add the code that must raise the given exception. This can also be more readable than the standard version, even for situations that wouldn't otherwise require it.

import unittest
import times2

class MultiplicationTestCase(unittest.TestCase):
    def setUp(self):
        self.factor = 2

    def testNumber(self):
        self.assertEqual(times2.times2(5), 10)

    def testInvalidType(self):
        with self.assertRaises(TypeError):
            times2.times2({})
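The context manager form is what makes it practical to test an operator directly. This sketch multiplies a dictionary by an integer inside the with block and, in the second test, also checks part of the error message. It assumes Python 3.2 or later, where the regular expression variant is spelled assertRaisesRegex(); the class name is invented for this example.

```python
import unittest

class OperatorTestCase(unittest.TestCase):
    def testInvalidOperand(self):
        # The expression itself goes in the block; no callable is required.
        with self.assertRaises(TypeError):
            {} * 2

    def testMessage(self):
        # The pattern is searched for within the exception's string value.
        with self.assertRaisesRegex(TypeError, 'unsupported operand'):
            {} * 2

suite = unittest.defaultTestLoader.loadTestsFromTestCase(OperatorTestCase)
result = unittest.TextTestRunner().run(suite)
```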

Testing Identity

The last group contains methods for testing the identity of objects. Rather than just checking to see if their values are equivalent, these methods check to see if two objects are in fact the same. One common scenario for this test is when your code caches values for use later. By testing for identity, you can verify that a value returned from cache is the same value that was placed in the cache to begin with, rather than simply an equivalent copy.

  • assertIs(obj1, obj2, msg=None)—This method checks to see if the two arguments both refer to the same object. The test is performed using the identity of the objects, so objects that might compare as equal will still fail if they're not actually the same object.

  • assertIsNot(obj1, obj2, msg=None)—This inversion of assertIs() will only pass if the two arguments refer to two different objects. Even if they would otherwise compare as equal, this test requires them to have different identities.

  • assertIsNone(obj, msg=None)—This is a simple shortcut for a common case of assertIs(), where an object is compared to the built-in None object.

  • assertIsNotNone(obj, msg=None)—The inversion of assertIsNone() will pass only if the object provided is not the built-in None object.
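The caching scenario can be sketched as follows. The cached_list() helper and its module-level dictionary are hypothetical, written only to give the identity methods something to check.

```python
import unittest

_cache = {}

def cached_list(key):
    # Hypothetical helper: build a list once per key, then reuse it.
    if key not in _cache:
        _cache[key] = [key]
    return _cache[key]

class IdentityTestCase(unittest.TestCase):
    def testCacheReturnsSameObject(self):
        first = cached_list('a')
        second = cached_list('a')
        self.assertIs(first, second)
        # An equivalent list compares as equal but is a different object.
        self.assertEqual(first, ['a'])
        self.assertIsNot(first, ['a'])

    def testNone(self):
        self.assertIsNone(_cache.get('missing'))
        self.assertIsNotNone(cached_list('b'))

suite = unittest.defaultTestLoader.loadTestsFromTestCase(IdentityTestCase)
result = unittest.TextTestRunner().run(suite)
```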

Tearing Down

Just as setUp() gets called before each individual test is carried out, the TestCase object also calls a tearDown() method to clean up any initialized values after testing is carried out. This is used quite often in tests that need to create and store information outside of Python during testing. Examples of such information are database rows and temporary files. Once the tests are complete, that information is no longer necessary, so it makes good sense to clean up after they've completed.

Typically, a set of tests that works with files will have to create temporary files along the way, to verify that they get accessed and modified properly. These files can be created in setUp() and deleted in tearDown(), ensuring that each test has a fresh copy when it runs. The same can be done with databases or other data structures.
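A minimal sketch of that pattern uses the standard tempfile module. The test case and its contents are invented for this example. Because setUp() and tearDown() run around every test, the second test starts from the original file contents no matter what the first test did.

```python
import os
import tempfile
import unittest

class TemporaryFileTestCase(unittest.TestCase):
    def setUp(self):
        # Create a fresh temporary file before each test.
        handle, self.path = tempfile.mkstemp()
        with os.fdopen(handle, 'w') as output:
            output.write('example')

    def tearDown(self):
        # Remove the file after each test, whether it passed or failed.
        os.remove(self.path)

    def testRead(self):
        with open(self.path) as source:
            self.assertEqual(source.read(), 'example')

    def testAppend(self):
        with open(self.path, 'a') as output:
            output.write(' value')
        with open(self.path) as source:
            self.assertEqual(source.read(), 'example value')

suite = unittest.defaultTestLoader.loadTestsFromTestCase(TemporaryFileTestCase)
result = unittest.TextTestRunner().run(suite)
```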

Note

The key value of setUp() and tearDown() is that they can prepare a clean environment for each individual test. If you need to set up an environment for all the tests to share or revert some changes after all tests have completed, you'll need to do so before or after starting the testing process.

Providing a Custom Test Class

Because unittest is built around a TestCase class that's designed to be subclassed, you can write your own class on top of it for your tests to use instead. This is a different process than writing tests because you're providing more tools for your tests to use. You can override any of the existing methods that are available on TestCase itself or add any others that are useful to your code.

The most common way to extend the usefulness of TestCase is to add new methods to test different functionality than the original class was designed for. A file-handling framework might include extra methods for testing the size of a given file or perhaps some details about its contents. A framework for retrieving web content could include methods to check HTTP status codes or look for individual tags in HTML documents. The possibilities are endless.
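As one possible illustration of the file-handling case, the following sketch adds an assertFileSize() method to a base class and then uses it from a subclass. Both class names and the new method are hypothetical; nothing like assertFileSize() exists in unittest itself.

```python
import os
import tempfile
import unittest

class FileTestCase(unittest.TestCase):
    """Hypothetical base class that adds a file-oriented assertion."""

    def assertFileSize(self, path, size, msg=None):
        actual = os.path.getsize(path)
        if actual != size:
            standard = '%s is %d bytes, expected %d' % (path, actual, size)
            self.fail(msg or standard)

class ReportTestCase(FileTestCase):
    def setUp(self):
        handle, self.path = tempfile.mkstemp()
        with os.fdopen(handle, 'w') as output:
            output.write('example')

    def tearDown(self):
        os.remove(self.path)

    def testSize(self):
        # 'example' is seven characters, written without a trailing newline.
        self.assertFileSize(self.path, 7)

suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReportTestCase)
result = unittest.TextTestRunner().run(suite)
```

Following the shape of the built-in methods, with an optional msg argument and a call to fail(), keeps a custom assertion consistent with the rest of the framework.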

Changing Test Behavior

Another powerful technique available when creating a testing class is the ability to change how the tests themselves are performed. The most obvious way to do this is to override the existing assertion methods, which can change how those tests are performed. There are a few other ways to alter the standard behavior, without overriding the assertion methods.

These additional overrides can be managed in the __init__() method of your custom class because, unlike setUp(), the __init__() method will only be called once per TestCase object. That makes it good for those customizations that need to affect all tests but won't be affected by any of the tests as they run. One such example, mentioned previously in this chapter, is the ability to add custom equality comparison methods, which are registered with the addTypeEqualityFunc() method.

Another modification you can make to the test class is to define what type of exception is used to identify failures. Normally, all test failures raise an AssertionError behind the scenes—the same exception used when an assert statement fails. If you need to change that for any reason, such as to better integrate with a higher-level testing framework, you can assign a new exception type to the failureException class attribute.

As a side-effect of using the failureException attribute to generate failures, you can raise it explicitly using self.failureException to generate a test failure. This is essentially the same as simply calling self.fail(), but it can be more readable in some cases to raise an exception rather than call a method.

Taking It With You

The tools described in this chapter are just the basis of a functional test suite. As you write an application, you'll need to fill in the gaps with the important facets of how your code should work. Always remember, though, that tests aren't just for you. By making sure that new code doesn't break existing code, you can provide a much better guarantee for your users once you distribute your code to the public. The next chapter will show how you can get your code to the masses.
