Chapter 10. Testing and Logging – Preparing for Bugs

When programming, most developers plan a bit and immediately continue writing code. After all, we all expect to write bug-free code! Unfortunately, we don't. At some point, an incorrect assumption, a misinterpretation, or just a silly mistake is bound to happen. Debugging (covered in Chapter 11, Debugging – Solving the Bugs) will always be required at some point, but there are several methods that you can use to prevent bugs or, at the very least, make it much easier to solve them when they do occur.

To prevent bugs from occurring in the first place, test-driven development or, at the very least, functional/regression/unit tests are very useful. The standard Python installation alone offers several options such as the doctest, unittest, and test modules. The doctest module allows you to combine tests with example documentation. The unittest module allows you to easily write regression tests. The test module is meant for internal usage only, so unless you are planning to modify the Python core, you probably won't need this one.

The test modules we will discuss in this chapter are:

  • doctest
  • py.test (and why it's more convenient than unittest)
  • unittest.mock

The py.test module has roughly the same purpose as the unittest module, but it's much more convenient to use and has a few extra options.

After learning how to avoid the bugs, it's time to take a look at logging so that we can inspect what is happening in our program and why. The logging module in Python is highly configurable and can be adjusted for just about any use case. If you've ever written Java code, you should feel right at home with the logging module, as its design is largely based on the log4j module and is very similar in both implementation and naming. The latter makes it a bit of an odd module in Python as well, as it is one of the few modules that do not follow the pep8 naming standards.

This chapter will explain the following topics:

  • Combining documentation with tests using doctest
  • Regression and unit tests using py.test and unittest
  • Testing with fake objects using unittest.mock
  • Using the logging module effectively
  • Combining logging and py.test

Using examples as tests with doctest

The doctest module is one of the most useful modules within Python. It allows you to combine documenting your code with tests to make sure that it keeps working as it is supposed to.

A simple doctest example

Let's start with a quick example: a function that squares the input. The following example is a fully functional command-line application, containing not only code but also functioning tests. The first few tests cover how the function is supposed to behave when executing normally, followed by a few tests to demonstrate the expected errors:

def square(n):
    '''
    Returns the input number, squared

    >>> square(0)
    0
    >>> square(1)
    1
    >>> square(2)
    4
    >>> square(3)
    9
    >>> square()
    Traceback (most recent call last):
    ...
    TypeError: square() missing 1 required positional argument: 'n'
    >>> square('x')
    Traceback (most recent call last):
    ...
    TypeError: can't multiply sequence by non-int of type 'str'

    Args:
        n (int): The number to square

    Returns:
        int: The squared result
    '''
    return n * n

if __name__ == '__main__':
    import doctest
    doctest.testmod()

It can be executed as any Python script, but the regular command won't give any output as all tests are successful. The doctest.testmod function takes verbosity parameters, luckily:

# python square.py -v
Trying:
    square(0)
Expecting:
    0
ok
Trying:
    square(1)
Expecting:
    1
ok
Trying:
    square(2)
Expecting:
    4
ok
Trying:
    square(3)
Expecting:
    9
ok
Trying:
    square()
Expecting:
    Traceback (most recent call last):
    ...
    TypeError: square() missing 1 required positional argument: 'n'
ok
Trying:
    square('x')
Expecting:
    Traceback (most recent call last):
    ...
    TypeError: can't multiply sequence by non-int of type 'str'
ok
1 items had no tests:
    __main__
1 items passed all tests:
   6 tests in __main__.square
6 tests in 2 items.
6 passed and 0 failed.
Test passed.

Additionally, since it uses the Google syntax (as discussed in Chapter 9, Documentation – How to Use Sphinx and reStructuredText, the documentation chapter), we can generate pretty documentation using Sphinx:

A simple doctest example

However, the code is not always correct, of course. What would happen if we modify the code so that the tests do not pass anymore?

This time, instead of n * n, we use n ** 2. Both square a number right? So the results must be identical. Right? These are the types of assumptions that create bugs, and the types of assumptions that are trivial to catch using a few basic tests:

def square(n):
    '''
    Returns the input number, squared

    >>> square(0)
    0
    >>> square(1)
    1
    >>> square(2)
    4
    >>> square(3)
    9
    >>> square()
    Traceback (most recent call last):
    ...
    TypeError: square() missing 1 required positional argument: 'n'
    >>> square('x')
    Traceback (most recent call last):
    ...
    TypeError: can't multiply sequence by non-int of type 'str'

    Args:
        n (int): The number to square

    Returns:
        int: The squared result
    '''
    return n ** 2

if __name__ == '__main__':
    import doctest
    doctest.testmod()

So let's execute the test again and see what happens this time. For brevity, we will skip the verbosity flag this time:

# python square.py
**********************************************************************
File "square.py", line 17, in __main__.square
Failed example:
    square('x')
Expected:
    Traceback (most recent call last):
    ...
    TypeError: can't multiply sequence by non-int of type 'str'
Got:
    Traceback (most recent call last):
      File "doctest.py", line 1320, in __run
        compileflags, 1), test.globs)
      File "<doctest __main__.square[5]>", line 1, in <module>
        square('x')
      File "square.py", line 28, in square
        return n ** 2
    TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'
**********************************************************************
1 items had failures:
   1 of   6 in __main__.square
***Test Failed*** 1 failures.

The only modification we made to the code was replacing n * n with n ** 2, which translates to the power function. Since multiplication is not the same as taking the power of a number, the results are slightly different but similar enough in practice that most programmers wouldn't notice the difference.

The only difference caused by the code change was that we now have a different exception—an innocent mistake, only breaking the tests in this case. But it shows how useful these tests are. When rewriting code, an incorrect assumption is easily made, and that is where tests are most useful—knowing you are breaking code as soon as you break it instead of finding out months later.

Writing doctests

Perhaps, you have noticed from the preceding examples that the syntax is very similar to the regular Python console, and that is exactly the point. The doctest input is nothing more than the output of a regular Python shell session. This is what makes testing with this module so intuitive; simply write the code in the Python console and copy the output into a docstring to get tests. Here is an example:

# python
>>> from square import square
>>> square(5)
25
>>> square()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: square() missing 1 required positional argument: 'n'

That's why this is probably the easiest way to test code. With almost no effort, you can check whether your code is working as you would expect it, add tests, and add documentation at the same time. Simply copy the output from the interpreter to your function or class documentation and you have functioning doctests.

Testing with pure documentation

The docstrings in functions, classes, and modules are usually the most obvious way to add doctests to your code, but they are not the only way. The Sphinx documentation, as we discussed in the previous chapter, also supports the doctest module. You might remember that when creating the Sphinx project, we enabled the doctest module:

> doctest: automatically test code snippets in doctest blocks (y/n) [n]:y

This flag enables the sphinx.ext.doctest extension in Sphinx, which tells Sphinx to run those tests as well. Since not all the examples in the code are useful, let's see whether we can split them between the ones that are actually useful and the ones that are only relevant for documentation. Moreover, to see the results, we will add an error to the documentation:

square.py

def square(n):
    '''
    Returns the input number, squared

    >>> square(2)
    4

    Args:
        n (int): The number to square

    Returns:
        int: The squared result
    '''
    return n * n

if __name__ == '__main__':
    import doctest
    doctest.testmod()

square.rst

square module
=============

.. automodule:: square
    :members:
    :undoc-members:
    :show-inheritance:

Examples:

.. testsetup::

    from square import square

.. doctest::

    >>> square(100)

    >>> square(0)
    0
    >>> square(1)
    1
    >>> square(3)
    9
    >>> square()
    Traceback (most recent call last):
    ...
    TypeError: square() missing 1 required positional argument: 'n'
    >>> square('x')
    Traceback (most recent call last):
    ...
    TypeError: can't multiply sequence by non-int of type 'str'

Now, it's time to execute the tests. In the case of Sphinx, there is a specific command for this:

# make doctest
sphinx-build -b doctest -d _build/doctrees . _build/doctest
Running Sphinx v1.3.3
loading translations [en]... done
loading pickled environment... done
building [mo]: targets for 0 po files that are out of date
building [doctest]: targets for 3 source files that are out of date
updating environment: 0 added, 0 changed, 0 removed
looking for now-outdated files... none found
running tests...

Document: square
----------------
**********************************************************************
File "square.rst", line 16, in default
Failed example:
    square(100)
Expected nothing
Got:
    10000
**********************************************************************
1 items had failures:
   1 of   7 in default
7 tests in 1 items.
6 passed and 1 failed.
***Test Failed*** 1 failures.

Doctest summary
===============
    7 tests
    1 failure in tests
    0 failures in setup code
    0 failures in cleanup code
build finished with problems.
make: *** [doctest] Error 1

As expected, we are getting an error for the incomplete doctest, but beyond that, all tests executed correctly. To make sure that the tests know what square is, we had to add the testsetup directive, and this still generates a pretty output:

Testing with pure documentation

The doctest flags

The doctest module features several option flags. They affect how doctest processes the tests. These option flags can be passed globally using your test suite, through command-line parameters while running the tests, and through inline commands. For this book, I have globally enabled the following option flags through a pytest.ini file (we will cover more about py.test later in this chapter):

doctest_optionflags = ELLIPSIS NORMALIZE_WHITESPACE

Without these option flags, some of the examples in this book will not function properly. This is because they have to be reformatted to fit. The next few paragraphs will cover the following option flags:

  • DONT_ACCEPT_TRUE_FOR_1
  • NORMALIZE_WHITESPACE
  • ELLIPSIS

There are several other option flags available with varying degrees of usefulness, but these are better left to the Python documentation:

https://docs.python.org/3/library/doctest.html#option-flags

True and False versus 1 and 0

Having True evaluating to 1 and False evaluating to 0 is useful in most cases, but it can give unexpected results. To demonstrate the difference, we have these lines:

'''
>>> False
0
>>> True
1
>>> False  # doctest: +DONT_ACCEPT_TRUE_FOR_1
0
>>> True  # doctest: +DONT_ACCEPT_TRUE_FOR_1
1
'''

if __name__ == '__main__':
    import doctest
    doctest.testmod()

Here are the results of the DONT_ACCEPT_TRUE_FOR_1 flag:

# python test.py
**********************************************************************
File "test.py", line 6, in __main__
Failed example:
    False  # doctest: +DONT_ACCEPT_TRUE_FOR_1
Expected:
    0
Got:
    False
**********************************************************************
File "test.py", line 8, in __main__
Failed example:
    True  # doctest: +DONT_ACCEPT_TRUE_FOR_1
Expected:
    1
Got:
    True
**********************************************************************
1 items had failures:
   2 of   4 in __main__
***Test Failed*** 2 failures.

As you can see, the DONT_ACCEPT_TRUE_FOR_1 flag makes doctest reject 1 as a valid response for True as well as 0 for False.

Normalizing whitespace

Since doctests are used for both documentation and test purposes, it is pretty much a requirement to keep them readable. Without normalizing whitespace, this can be tricky, however. Consider the following example:

>>> [list(range(5)) for i in range(5)]
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]

While not all that bad, this output isn't the best for readability. With whitespace normalizing, here is what we can do instead:

>>> [list(range(5)) for i in range(5)]  # doctest: +NORMALIZE_WHITESPACE
[[0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4],
 [0, 1, 2, 3, 4]]

Formatting the output in this manner is both more readable and convenient for keeping your line length less.

Ellipsis

The ELLIPSIS flag is very useful but also a bit dangerous, as it can easily lead to incorrect matches. It makes ... match any substring, which is very useful for exceptions but dangerous in other cases:

>>> {10: 'a', 20: 'b'}  # doctest: +ELLIPSIS
{...}
>>> [True, 1, 'a']  # doctest: +ELLIPSIS
[...]
>>> True,  # doctest: +ELLIPSIS
(...)
>>> [1, 2, 3, 4]  # doctest: +ELLIPSIS
[1, ..., 4]
>>> [1, 0, 0, 0, 0, 0, 4]  # doctest: +ELLIPSIS
[1, ..., 4]

These cases are not too useful in real scenarios, but they demonstrate how the ELLIPSIS option flag functions. They also indicate the danger. Both [1, 2, 3, 4] and [1, 0, ... , 4] match the [1, ..., 4] test, which is probably unintentional, so be very careful while using ELLIPSIS.

A more useful case is when documenting class instances:

>>> class Spam(object):
...     pass
>>> Spam()  # doctest: +ELLIPSIS
<__main__.Spam object at 0x...>

Without the ELLIPSIS flag, the memory address (the 0x... part) would never be what you expect. Let's demonstrate an actual run in a normal CPython instance:

Failed example:
    Spam()
Expected:
    <__main__.Spam object at 0x...>
Got:
    <__main__.Spam object at 0x10d9ad160>

Doctest quirks

The three option flags discussed earlier take care of quite a few quirks found in doctests, but there are several more cases that require care. In these cases, you just need to be a bit careful and work around the limitations of the doctest module. The doctest module effectively uses the representation string, and those are not always consistent.

The most important cases are floating-point inaccuracies, dictionaries, and random values, such as timers. The following example will fail most of the time because certain types in Python have no consistent ordering and depend on external variables:

>>> dict.fromkeys('spam')
{'s': None, 'p': None, 'a': None, 'm': None}
>>> 1./7.
0.14285714285714285

>>> import time
>>> time.time() - time.time()
-9.5367431640625e-07

All the problems have several possible solutions, which differ mostly in style and your personal preference.

Testing dictionaries

The problem with dictionaries is that they are internally implemented as hash tables, resulting in an effectively random representation order. Since the doctest system requires a representation string that is identical in meaning (save for certain doctest flags, of course) to the docstring, this does not work. Naturally, there are several workaround options available and all have some advantages and disadvantages.

The first is using the pprint library to format it in a pretty way:

>>> import pprint
>>> data = dict.fromkeys('spam')
>>> pprint.pprint(data)
{'a': None, 'm': None, 'p': None, 's': None}

Since the pprint library always sorts the items before outputting, this solves the problem with random representation orders. However, it does require an extra import and function call, which some people prefer to avoid.

Another option is manual sorting of the items:

>>> data = dict.fromkeys('spam')
>>> sorted(data.items())
[('a', None), ('m', None), ('p', None), ('s', None)]

The downside here is that it is not visible from the output that data is a dictionary, which makes the output less readable.

Lastly, comparing the dict with a different dict comprised of the same elements works as well:

>>> data = dict.fromkeys('spam')
>>> data == {'a': None, 'm': None, 'p': None, 's': None}
True

A perfectly okay solution, of course! But True is not really the clearest output, especially if the comparison doesn't work:

Failed example:
    data == {'a': None, 'm': None, 'p': None}
Expected:
    True
Got:
    False

On the other hand, the other options presented previously show both the expected value and the returned value correctly:

Failed example:
    sorted(data.items())
Expected:
    [('a', None), ('m', None), ('p', None)]
Got:
    [('a', None), ('m', None), ('p', None), ('s', None)]

Failed example:
    pprint.pprint(data)
Expected:
    {'a': None, 'm': None, 'p': None}
Got:
    {'a': None, 'm': None, 'p': None, 's': None}

Personally, out of the solutions presented, I would recommend using pprint, as I find it the most readable solution, but all the solutions have some merits to them.

Testing floating-point numbers

For the same reason as a floating-point comparison can be problematic (that is, 1/3 == 0.333), a representation string comparison is also problematic. The easiest solution is to simply add some rounding/clipping to your code, but the ELLIPSIS flag is also an option here. Here is a list of several solutions:

>>> 1/3  # doctest: +ELLIPSIS
0.333...
>>> '%.3f' % (1/3)
'0.333'
>>> '{:.3f}'.format(1/3)
'0.333'
>>> round(1/3, 3)
0.333
>>> 0.333 < 1/3 < 0.334
True

When the ELLIPSIS option flag is enabled globally anyhow, that would be the most obvious solution. In other cases, I recommend one of the alternative solutions.

Times and durations

For timings, the problems that you will encounter are quite similar to the floating-point issues. When measuring the duration execution time of a code snippet, there will always be some variation present. That's why the most stable solution for tests, including time, is limiting the precision, although even that is no guarantee. Regardless, the simplest solution checks whether the delta between the two times is smaller than a certain number, as follows:

>>> import time
>>> a = time.time()
>>> b = time.time()
>>> (b - a) < 0.01
True

For the timedelta objects, however, it's slightly more complicated. Yet, this is where the ELLIPSIS flag definitely comes in handy again:

>>> import datetime
>>> a = datetime.datetime.now()
>>> b = datetime.datetime.now()
>>> str(b - a)  # doctest: +ELLIPSIS
'0:00:00.000...

The alternative to the ELLIPSIS option flag would be comparing the days, hours, minutes, and microseconds in timedelta separately.

In a later paragraph, we will see a completely stable solution for problems like these using mock objects. For doctests, however, that is generally overkill.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset