Writing extensions

As already said, writing extensions is not a simple task, but in exchange for your hard work it can give you a lot of advantages. The easiest and recommended approach to writing your own extensions is to use tools such as Cython or Pyrex, or simply to integrate existing dynamic libraries with ctypes or cffi. These projects will increase your productivity and also make your code easier to develop, read, and maintain.

Anyway, if you are new to this topic, it is good to know that you can start your adventure with extensions by writing one using nothing more than bare C code and Python/C API. This will improve your understanding of how extensions work and will also help you to appreciate the advantages of alternative solutions. For the sake of simplicity, we will take a simple algorithmic problem as an example and try to implement it using three different approaches:

  • Writing a pure C extension
  • Using Cython
  • Using Pyrex

Our problem will be finding the nth number of the Fibonacci sequence. It is very unlikely that you would create a compiled extension solely for this problem, but it is simple enough to serve as a good example of wiring any C function to Python/C API. Our only goals are clarity and simplicity, so we won't try to provide the most efficient solution. With that in mind, our reference implementation of the Fibonacci function in Python looks as follows:

"""Python module that provides fibonacci sequence function"""


def fibonacci(n):
    """Return nth Fibonacci sequence number computed recursively.
    """
    if n < 2:
        return 1
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

Note that this is one of the simplest implementations of the fibonacci() function and a lot of improvements could be applied to it. We deliberately do not improve our implementation (using a memoization pattern, for instance) because this is not the purpose of our example. In the same manner, we won't optimize our code later when discussing implementations in C or Cython, even though the compiled code gives many more possibilities to do so.
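As a quick sanity check of the reference implementation, the following sketch repeats the function and lists its first ten values. Note that, with the base case used here, the sequence starts with 1, 1, 2 rather than 0, 1, 1. The first_ten name is only an illustrative helper and is not part of the original module.

```python
def fibonacci(n):
    """Return nth Fibonacci sequence number computed recursively."""
    if n < 2:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


# Values for n = 0..9; the same list is used later to check the C extension.
first_ten = [fibonacci(n) for n in range(10)]
print(first_ten)  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```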

Pure C extensions

Before we fully dive into the code examples of Python extensions written in C, here is a huge warning. If you want to extend Python with C, you need to already know both of these languages well. This is especially true for C. Lack of proficiency with it can lead to real disasters because it can be easily mishandled.

If you have decided that you need to write a C extension for Python, I assume that you already know the C language to a level that will allow you to fully understand the examples that are presented. Nothing other than Python/C API details will be explained here. This book is about Python and not any other language. If you don't know C at all, you should definitely not try to write your own Python extensions in C until you gain enough experience and skills. Leave it to others and stick with Cython or Pyrex because they are a lot safer from the beginner's perspective. This is mostly because Python/C API, despite having been crafted with great care, is definitely not a good introduction to C.

As proposed earlier, we will try to port the fibonacci() function to C and expose it to Python code as an extension. The bare implementation without the wiring to Python/C API that is analogous to the previous Python example could be roughly as follows:

long long fibonacci(unsigned int n) {
    if (n < 2) {
        return 1;
    } else {
        return fibonacci(n - 2) + fibonacci(n - 1);
    }
}

And here is the example of a complete, fully functional extension that exposes this single function in a compiled module:

#include <Python.h>


long long fibonacci(unsigned int n) {
    if (n < 2) {
        return 1;
    } else {
        return fibonacci(n-2) + fibonacci(n-1);
    }
}


static PyObject* fibonacci_py(PyObject* self, PyObject* args) {
    PyObject *result = NULL;
    long n;

    if (PyArg_ParseTuple(args, "l", &n)) {
        result = Py_BuildValue("L", fibonacci((unsigned int)n));
    }

    return result;
}


static char fibonacci_docs[] =
    "fibonacci(n): Return nth Fibonacci sequence number "
    "computed recursively\n";


static PyMethodDef fibonacci_module_methods[] = {
    {"fibonacci", (PyCFunction)fibonacci_py,
     METH_VARARGS, fibonacci_docs},
    {NULL, NULL, 0, NULL}
};


static struct PyModuleDef fibonacci_module_definition = {
    PyModuleDef_HEAD_INIT,
    "fibonacci",
    "Extension module that provides fibonacci sequence function",
    -1,
    fibonacci_module_methods
};


PyMODINIT_FUNC PyInit_fibonacci(void) {
    return PyModule_Create(&fibonacci_module_definition);
}

The preceding example might be a bit overwhelming at first glance because we had to add four times more code just to make the fibonacci() C function accessible from Python. We will discuss every bit of that code later, so don't worry. But before we do that, let's see how it can be packaged and executed in Python. The minimal setuptools configuration for our module needs to use the setuptools.Extension class to describe how our extension should be compiled:

from setuptools import setup, Extension


setup(
    name='fibonacci',
    ext_modules=[
        Extension('fibonacci', ['fibonacci.c']),
    ]
)

The build process for the extension can be started with the python setup.py build command, but it will also be performed automatically on package installation. The following transcript presents the result of installation in development mode and a simple interactive session where our compiled fibonacci() function is inspected and executed:

$ ls -1a
fibonacci.c
setup.py

$ pip install -e .
Obtaining file:///Users/swistakm/dev/book/chapter7
Installing collected packages: fibonacci
  Running setup.py develop for fibonacci
Successfully installed fibonacci

$ ls -1ap
build/
fibonacci.c
fibonacci.cpython-35m-darwin.so
fibonacci.egg-info/
setup.py

$ python
Python 3.5.1 (v3.5.1:37a07cee5969, Dec  5 2015, 21:12:44) 
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import fibonacci
>>> help(fibonacci.fibonacci)

Help on built-in function fibonacci in fibonacci:

fibonacci.fibonacci = fibonacci(...)
    fibonacci(n): Return nth Fibonacci sequence number computed recursively

>>> [fibonacci.fibonacci(n) for n in range(10)]
[1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
>>> 

A closer look at Python/C API

Since we know how to properly package, compile, and install custom C extensions and we are sure that it works as expected, now it is the right time to discuss our code in detail.

The extension module starts with a single C preprocessor directive that includes the Python.h header file:

#include <Python.h>

This pulls in the whole Python/C API and is everything you need to include to be able to write your extensions. In more realistic cases, your code will require a lot more preprocessor directives to benefit from C standard library functions or to integrate other source files. Our example was simple, so no more directives were required.

Next we have the core of our module:

long long fibonacci(unsigned int n) {
    if (n < 2) {
        return 1;
    } else {
        return fibonacci(n - 2) + fibonacci(n - 1);
    }
}

The preceding fibonacci() function is the only part of our code that does something useful. It is a pure C implementation that Python by default can't understand. The rest of our example will create the interface layer that will expose it through Python/C API.

The first step of exposing this code to Python is the creation of the C function that is compatible with the CPython interpreter. In Python, everything is an object. This means that C functions called in Python also need to return real Python objects. Python/C APIs provide a PyObject type and every callable must return the pointer to it. The signature of our function is:

static PyObject* fibonacci_py(PyObject* self, PyObject* args)

Note that the preceding signature does not specify the exact list of arguments but only PyObject* args, which holds the pointer to the structure that contains the tuple of the provided values. The actual validation of the argument list must be performed inside of the function body and this is exactly what fibonacci_py() does. It parses the args argument list assuming it contains a single long value, casts that value to unsigned int, and uses it as an argument to the fibonacci() function to retrieve the Fibonacci sequence element:

static PyObject* fibonacci_py(PyObject* self, PyObject* args) {
    PyObject *result = NULL;
    long n;

    if (PyArg_ParseTuple(args, "l", &n)) {
        result = Py_BuildValue("L", fibonacci((unsigned int)n));
    }

    return result;
}

Note

The preceding example function has a serious bug, which the eyes of an experienced developer should spot very easily. Try to find it as an exercise in working with C extensions. For now, we leave it as it is for the sake of brevity. We will fix it later when discussing details of dealing with errors in the Exception handling section.

The "l" string in the PyArg_ParseTuple(args, "l", &n) call means that we expect args to contain only a single long value. In case of failure, PyArg_ParseTuple() returns 0 (false) and stores information about the exception in the per-thread interpreter state; our function then returns NULL to its caller. The details of exception handling will be described a bit later in the Exception handling section.

The actual signature of the parsing function is int PyArg_ParseTuple(PyObject *args, const char *format, ...) and what goes after the format string is a variable-length list of arguments that represents parsed value output (as pointers). This is analogous to how the scanf() function from the C standard library works. If our assumption fails and the user provides an incompatible argument list, then PyArg_ParseTuple() will set the proper exception. This is a very convenient way to encode function signatures once you get used to it, but it has a huge downside when compared to plain Python code. Python call signatures that are implicitly defined by PyArg_ParseTuple() calls cannot be easily inspected inside of the Python interpreter. You need to remember this fact when using code provided as extensions.

As already said, Python expects objects to be returned from callables. This means that we cannot return a raw long long value obtained from the fibonacci() function as a result of fibonacci_py(). Such an attempt would be rejected, or at least flagged, by the compiler, and there is no automatic casting of basic C types to Python objects. The Py_BuildValue(const char *format, ...) function must be used instead. It is the counterpart of PyArg_ParseTuple() and accepts a similar set of format strings. The main difference is that the list of arguments is not a function output but an input, so actual values must be provided instead of pointers.

After fibonacci_py() is defined, most of the heavy work is done. The last step is to perform module initialization and add metadata to our function that will make usage a bit simpler for users. This is the boilerplate part of our extension code that, for simple examples such as this one, can take more space than the actual functions that we want to expose. In most cases, it simply consists of some static structures and one initialization function that will be executed by the interpreter on module import.

First, we create a static string that will be the content of the Python docstring for the fibonacci_py() function:

static char fibonacci_docs[] =
    "fibonacci(n): Return nth Fibonacci sequence number "
    "computed recursively\n";

Note that this could be inlined somewhere later in fibonacci_module_methods, but it is a good practice to have docstrings separated and stored in close proximity to the actual function definition that they refer to.

The next part of our definition is the array of the PyMethodDef structures that define methods (functions) that will be available in our module. This structure contains exactly four fields:

  • char* ml_name: This is the name of the method.
  • PyCFunction ml_meth: This is the pointer to the C implementation of the function.
  • int ml_flags: This includes the flags indicating either the calling convention or binding convention. The latter is applicable only for definition of class methods.
  • char* ml_doc: This is the pointer to the content of method/function docstring.

Such an array must always end with a sentinel value of {NULL, NULL, 0, NULL} that indicates its end. In our simple case, we created the static PyMethodDef fibonacci_module_methods[] array that contains only two elements (including the sentinel value):

static PyMethodDef fibonacci_module_methods[] = {
    {"fibonacci", (PyCFunction)fibonacci_py,
     METH_VARARGS, fibonacci_docs},
    {NULL, NULL, 0, NULL}
};

And this is how the first entry maps to the PyMethodDef structure:

  • ml_name = "fibonacci": Here, the fibonacci_py() C function will be exposed as a Python function under the fibonacci name
  • ml_meth = (PyCFunction)fibonacci_py: Here, the casting to PyCFunction is simply required by Python/C API and is dictated by the call convention defined later in ml_flags
  • ml_flags = METH_VARARGS: Here, the METH_VARARGS flag indicates that the calling convention of our function accepts a variable list of arguments and no keyword arguments
  • ml_doc = fibonacci_docs: Here, the Python function will be documented with the content of the fibonacci_docs string

When an array of function definitions is complete, we can create another structure that contains the definition of the whole module. It is described using the PyModuleDef type and contains multiple fields. Some of them are useful only for more complex scenarios, where fine-grained control over the module initialization process is required. Here we are interested only in the first five of them:

  • PyModuleDef_Base m_base: This should always be initialized with PyModuleDef_HEAD_INIT.
  • char* m_name: This is the name of the newly created module. In our case it is fibonacci.
  • char* m_doc: This is the pointer to the docstring content for the module. We usually have only a single module defined in one C source file, so it is OK to inline our documentation string in the whole structure.
  • Py_ssize_t m_size: This is the size of the memory allocated to keep the module state. This is only used when support for multiple subinterpreters or multiphase initialization is required. In most cases, you don't need that and it gets the value -1.
  • PyMethodDef* m_methods: This is a pointer to the array containing module-level functions described by the PyMethodDef values. It could be NULL if the module does not expose any functions. In our case, it is fibonacci_module_methods.

The other fields are explained in detail in the official Python documentation (refer to https://docs.python.org/3/c-api/module.html) but are not needed in our example extension. They should be set to NULL if not required and they will be initialized with that value implicitly when not specified. This is why our module description contained in the fibonacci_module_definition variable can take this simple five-element form:

static struct PyModuleDef fibonacci_module_definition = {
    PyModuleDef_HEAD_INIT,
    "fibonacci",
    "Extension module that provides fibonacci sequence function",
    -1,
    fibonacci_module_methods
};

The last piece of code that crowns our work is the module initialization function. This must follow a very specific naming convention, so the Python interpreter can easily pick it when the dynamic/shared library is loaded. It should be named PyInit_name, where name is your module name. So it is exactly the same string that was used as the m_name field in the PyModuleDef definition and as the first argument of the setuptools.Extension() call. If you don't require a complex initialization process for the module, it takes a very simple form, exactly like in our example:

PyMODINIT_FUNC PyInit_fibonacci(void) {
    return PyModule_Create(&fibonacci_module_definition);
}

The PyMODINIT_FUNC macro is a preprocessor macro that will declare the return type of this initialization function as PyObject* and add any special linkage declarations if required by the platform.
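The naming convention can be inspected from the Python side. The following sketch, which assumes an ASCII module name, builds the expected initialization symbol and lists the file name suffixes that the running interpreter accepts for extension modules. The init_symbol() helper is hypothetical and exists only for illustration.

```python
import importlib.machinery


def init_symbol(module_name):
    """Return the init function name expected for an ASCII module name."""
    return "PyInit_" + module_name


print(init_symbol("fibonacci"))

# File name suffixes recognized for compiled extension modules on this
# platform, for example '.cpython-35m-darwin.so' in the earlier transcript.
print(importlib.machinery.EXTENSION_SUFFIXES)
```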

Calling and binding conventions

As explained in the A closer look at Python/C API section, the ml_flags bitfield of the PyMethodDef structure contains flags for calling and binding conventions. Calling convention flags are:

  • METH_VARARGS: This is a typical convention for the Python function or method that only accepts positional arguments. The type provided as the ml_meth field for such a function should be PyCFunction. The function will be provided with two arguments of the PyObject* type. The first is either the self object (for methods) or the module object (for module functions). The second is the tuple of positional arguments. A typical signature for the C function with that calling convention is PyObject* function(PyObject* self, PyObject* args).
  • METH_KEYWORDS: This is the convention for the Python function that accepts keyword arguments when called. Its associated C type is PyCFunctionWithKeywords. The C function must accept three arguments of the PyObject* type: self, args, and a dictionary of keyword arguments. If combined with METH_VARARGS, the first two arguments have the same meaning as for the previous calling convention, otherwise args will be NULL. The typical C function signature is: PyObject* function(PyObject* self, PyObject* args, PyObject* keywds).
  • METH_NOARGS: This is the convention for Python functions that do not accept any arguments. The C function should be of the PyCFunction type, so the signature is the same as that of the METH_VARARGS convention (two arguments: self and args). The only difference is that args will always be NULL, so there is no need to call PyArg_ParseTuple(). This cannot be combined with any other calling convention flag.
  • METH_O: This is the shorthand for functions and methods accepting single object arguments. The type of C function is again PyCFunction, so it accepts two PyObject* arguments: self and args. Its difference from METH_VARARGS is that there is no need to call PyArg_ParseTuple() because PyObject* provided as args will already represent the single argument provided in the Python call to that function. This also cannot be combined with any other calling convention flag.

A function that accepts keywords is described either with METH_KEYWORDS or a bitwise combination of calling convention flags in the form of METH_VARARGS | METH_KEYWORDS. If so, it should parse its arguments with PyArg_ParseTupleAndKeywords() instead of PyArg_ParseTuple() or PyArg_UnpackTuple(). Here is an example module with a single function that returns None and accepts two named keyword arguments that are printed on standard output:

#include <Python.h>

static PyObject* print_args(PyObject *self, PyObject *args, PyObject *keywds)
{
    char *first;
    char *second;

    static char *kwlist[] = {"first", "second", NULL};

    if (!PyArg_ParseTupleAndKeywords(args, keywds, "ss", kwlist,
                                     &first, &second))
        return NULL;

    printf("%s %s\n", first, second);

    Py_INCREF(Py_None);
    return Py_None;
}


static PyMethodDef module_methods[] = {
    {"print_args", (PyCFunction)print_args,
     METH_VARARGS | METH_KEYWORDS,
     "print provided arguments"},
    {NULL, NULL, 0, NULL}
};


static struct PyModuleDef module_definition = {
    PyModuleDef_HEAD_INIT,
    "kwargs",
    "Keyword argument processing example",
    -1,
    module_methods
};


PyMODINIT_FUNC PyInit_kwargs(void) {
    return PyModule_Create(&module_definition);
}

Argument parsing in Python/C API is very flexible and is extensively described in the official documentation at https://docs.python.org/3.5/c-api/arg.html. The format argument in PyArg_ParseTuple() and PyArg_ParseTupleAndKeywords() allows fine-grained control over the number and types of arguments. Every advanced calling convention known in Python can be coded in C with this API, including:

  • Functions with default values for arguments
  • Functions with arguments specified as keyword-only
  • Functions with a variable number of arguments
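For reference, these are the signature shapes listed above, written as plain Python functions. This is only a sketch of what a C function would have to emulate through its format string and keyword list; the function names are made up for the example.

```python
import inspect


def with_default(first, second="fallback"):
    """Second argument is optional and has a default value."""
    return first, second


def keyword_only(first, *, second):
    """Second argument can only be passed by keyword."""
    return first, second


def variadic(*args):
    """Accepts any number of positional arguments."""
    return args


for function in (with_default, keyword_only, variadic):
    # Unlike signatures implied by PyArg_ParseTuple*() calls,
    # these can be inspected directly at runtime.
    print(function.__name__, inspect.signature(function))
```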

The binding convention flags are METH_CLASS, METH_STATIC, and METH_COEXIST. They are reserved for methods and cannot be used to describe module functions. The first two are quite self-explanatory. They are the C counterparts of the classmethod and staticmethod decorators and change the meaning of the self argument passed to the C function.

METH_COEXIST allows loading a method in place of the existing definition. It is rarely useful. It is mostly needed when you would like to provide your own implementation of a C method that would otherwise be generated automatically from the other features of the type that was defined. Python documentation gives an example of the __contains__() wrapper method that would be generated if the type has the sq_contains slot defined. Unfortunately, defining your own classes and types using Python/C API is beyond the scope of this introductory chapter. We will cover creating your own types in extensions later when discussing Cython because doing that in pure C requires way too much boilerplate code and leaves a lot of room for making mistakes.

Exception handling

C, unlike Python or even C++, does not have syntax for raising and catching exceptions. Error handling is usually done with function return values and optional global state that stores details explaining the cause of the last failure.

Exception handling in Python/C API is built around that simple principle. There is a global per-thread indicator of the last error that occurred in a Python/C API function. It is set to describe the cause of the problem. There is also a standardized way to inform the caller of a function that this state was changed during the call:

  • If the function is supposed to return a pointer, it returns NULL
  • If the function is supposed to return an int type, it returns -1

The only exceptions from the preceding rules in Python/C API are the PyArg_*() functions that return 1 to indicate success and 0 to indicate failure.
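The convention can be observed from Python without compiling anything, assuming CPython, where ctypes.pythonapi exposes Python/C API functions. PyLong_AsLong() returns a C long, so it signals failure with -1 together with the error indicator; ctypes checks that indicator after the call and re-raises the stored exception. The call_as_long() helper is hypothetical and serves only this illustration.

```python
import ctypes

# PyLong_AsLong() converts a Python int to a C long. On failure it
# returns -1 and sets the per-thread error indicator.
as_long = ctypes.pythonapi.PyLong_AsLong
as_long.argtypes = [ctypes.py_object]
as_long.restype = ctypes.c_long


def call_as_long(value):
    """Return the converted value, or the exception set by the C API."""
    try:
        return as_long(value)
    except TypeError as error:
        # ctypes.pythonapi turned the error indicator into an exception.
        return error


ok = call_as_long(42)
failed = call_as_long("not a number")
print(ok, type(failed).__name__)
```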

To see how this works in practice, let's recall our fibonacci_py() function from the example in the previous sections:

static PyObject* fibonacci_py(PyObject* self, PyObject* args) {
    PyObject *result = NULL;
    long n;

    if (PyArg_ParseTuple(args, "l", &n)) {
      result = Py_BuildValue("L", fibonacci((unsigned int) n));
    }

    return result;
}

Three parts of this function take part in error handling: the initialization of result, the PyArg_ParseTuple() check, and the Py_BuildValue() call. It starts at the very beginning with the initialization of the result variable that is supposed to store the return value of our function. It is initialized with NULL that, as we already know, is an indicator of error. And this is how you will usually code your extensions, assuming that error is the default state of your code.

Later we have the PyArg_ParseTuple() call that will set error info in case of an exception and return 0. This is part of the if statement and in that case we don't do anything more and return NULL. Whoever calls our function will be notified about the error.

Py_BuildValue() can also raise an exception. It is supposed to return PyObject* (pointer), so in case of failure it gives NULL. We can simply store it as our result variable and pass further as a return value.

But our job does not end with caring for exceptions raised by Python/C API calls. It is very probable that you will need to inform the extension user that some other kind of error or failure occurred. Python/C API has multiple functions that help you to raise an exception, but the most common one is PyErr_SetString(). It sets an error indicator with the given exception type with an additional string provided as the error cause explanation. The full signature of this function is:

void PyErr_SetString(PyObject* type, const char* message)

I have already said that the implementation of our fibonacci_py() function has a serious bug. Now is the right time to fix it. Fortunately, we have the proper tools to do that. The problem lies in the insecure casting of the long type to unsigned int in the following lines:

    if (PyArg_ParseTuple(args, "l", &n)) {
      result = Py_BuildValue("L", fibonacci((unsigned int) n));
    }

Thanks to the PyArg_ParseTuple() call, the first and only argument will be interpreted as a long type (the "l" specifier) and stored in the local n variable. Then it is cast to unsigned int so the issue will occur if the user calls the fibonacci() function from Python with a negative value. For instance, -1, as a signed 32-bit integer, will be interpreted as 4294967295 when cast to an unsigned 32-bit integer. Such a value will cause deep recursion and will result in stack overflow and a segmentation fault. Note that the same may happen if the user gives an arbitrarily large positive argument. We cannot fix this without a complete redesign of the C fibonacci() function, but we can at least try to ensure that argument that is passed meets some preconditions. Here we check if the value of the n argument is greater than or equal to zero and we raise a ValueError exception if that's not true:

static PyObject* fibonacci_py(PyObject* self, PyObject* args) {
    PyObject *result = NULL;
    long n;

    if (PyArg_ParseTuple(args, "l", &n)) {
        if (n<0) {
            PyErr_SetString(PyExc_ValueError,
                            "n must not be less than 0");
        } else {
            result = Py_BuildValue("L", fibonacci((unsigned int)n));
        }
    }

    return result;
}
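The wraparound that this check guards against can be reproduced from Python with ctypes, assuming a 32-bit unsigned int as in the explanation above:

```python
import ctypes

# -1 reinterpreted as an unsigned 32-bit integer wraps around to 2**32 - 1.
wrapped = ctypes.c_uint32(-1).value
print(wrapped)  # 4294967295
```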

The last note is that the global error state does not clear by itself. Some of the errors can be handled gracefully in your C functions (same as using the try ... except clause in Python) and you need to be able to clear the error indicator if it is no longer valid. The function for that is PyErr_Clear().

Releasing GIL

I have already mentioned that extensions can be a way to bypass Python's Global Interpreter Lock (GIL). It is a famous limitation of the CPython implementation that only one thread at a time can execute Python code. While multiprocessing is the suggested approach to circumvent this problem, it may not be a good solution for some highly parallelizable algorithms due to the resource overhead of running additional processes.

Because extensions are mostly used in cases where a bigger part of the work is performed in pure C without any calls to Python/C API, it is possible (even advisable) to release GIL in some application sections. Thanks to this, you can still benefit from having multiple CPU cores and multithreaded application design. The only thing you need to do is to wrap blocks of code that are known to not use any of Python/C API calls or Python structures with specific macros provided by Python/C API. These two preprocessor macros are provided to simplify the whole procedure of releasing and reacquiring the Global Interpreter Lock:

  • Py_BEGIN_ALLOW_THREADS: This declares the hidden local variable where the current thread state is saved and it releases GIL
  • Py_END_ALLOW_THREADS: This reacquires GIL and restores the thread state from the local variable declared with the previous macro

When we look carefully at our fibonacci extension example, we can clearly see that the fibonacci() function does not execute any Python code and does not touch any of the Python structures. This means that the fibonacci_py() function that simply wraps the fibonacci(n) execution could be updated to release GIL around that call:

static PyObject* fibonacci_py(PyObject* self, PyObject* args) {
    PyObject *result = NULL;
    long n;
    long long fib;

    if (PyArg_ParseTuple(args, "l", &n)) {
        if (n<0) {
            PyErr_SetString(PyExc_ValueError,
                            "n must not be less than 0");
        } else {
            Py_BEGIN_ALLOW_THREADS;
            fib = fibonacci((unsigned int)n);
            Py_END_ALLOW_THREADS;

            result = Py_BuildValue("L", fib);
        }
    }

    return result;
}
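The effect of releasing the GIL can be observed without compiling anything. The following sketch uses time.sleep() as a stand-in for a C function that releases the GIL while it works; this is an assumption of the example and not part of the fibonacci extension. Four threads that each block for 0.2 seconds should finish in roughly 0.2 seconds in total rather than 0.8.

```python
import threading
import time


def blocking_call(seconds):
    # time.sleep() is implemented in C and releases the GIL while waiting.
    time.sleep(seconds)


def run_in_threads(count, seconds):
    """Run blocking_call() in `count` threads and return the elapsed time."""
    threads = [
        threading.Thread(target=blocking_call, args=(seconds,))
        for _ in range(count)
    ]
    start = time.perf_counter()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.perf_counter() - start


elapsed = run_in_threads(4, 0.2)
print("elapsed: %.2fs" % elapsed)
```

A function that holds the GIL for its whole duration, such as the first version of fibonacci_py(), would not overlap in this way.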

Reference counting

Finally, we come to the important topic of memory management in Python. Python has its own garbage collector, but it is designed only to solve the issue of cyclic references in the reference counting algorithm. Reference counting is the primary method of managing the deallocation of objects that are no longer needed.

Python/C API documentation introduces an ownership of references to explain how it deals with deallocation of objects. Objects in Python are never owned and they are always shared. The actual creation of objects is managed by Python's memory manager. It is the component of CPython interpreter that is responsible for allocating and deallocating memory for objects that are stored in a private heap. What can be owned instead is a reference to the object.

Every object in Python that is represented by a reference (PyObject* pointer) has an associated reference count. When it goes to zero, it means that no one holds any valid reference to the object and the deallocator associated with its type can be called. Python/C API provides two macros for increasing and decreasing reference counts: Py_INCREF() and Py_DECREF(). But before we discuss their details, we need to understand a few more terms related to reference ownership:

  • Passing of ownership: Whenever we say that the function passes the ownership over a reference, it means that it has already increased the reference count and it is the responsibility of the caller to decrease the count when the reference to the object is no longer needed. Most of the functions that return the newly created objects, such as Py_BuildValue, do that. If that object is going to be returned from our function to another caller, then the ownership is passed again. We do not decrease the reference count in that case because it is no longer our responsibility. This is why the fibonacci_py() function does not call Py_DECREF() on the result variable.
  • Borrowed references: The borrowing of references happens when the function receives a reference to some Python object as an argument. The reference count for such a reference should never be decreased in that function unless it was explicitly increased in its scope. In our fibonacci_py() function the self and args arguments are such borrowed references and thus we do not call Py_DECREF() on them. Some of the Python/C API functions may also return borrowed references. The notable examples are PyTuple_GetItem() and PyList_GetItem(). It is often said that such references are unprotected. There is no need to dispose of their ownership unless they will be returned as a function's return value. In most cases, extra care should be taken if we use such borrowed references as arguments of other Python/C API calls. It may be necessary in some circumstances to protect such references with an additional Py_INCREF() before using them as arguments to other functions and then to call Py_DECREF() when they are no longer needed.
  • Stolen references: It is also possible for the Python/C API function to steal the reference instead of borrowing it when provided as a call argument. This is the case of exactly two functions: PyTuple_SetItem() and PyList_SetItem(). They fully take over the responsibility of the reference passed to them. They do not increase the reference count by themselves but will call Py_DECREF() when the reference is no longer needed.

Keeping an eye on the reference counts is one of the hardest things when writing complex extensions. Some of the not-so-obvious issues may not be noticed until the code is run in a multithreaded setup.
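Reference counts can be observed from Python with sys.getrefcount(). The following sketch is specific to CPython and only illustrates the counting itself; the reported number is one higher than you might expect because the call temporarily holds its own reference to the argument.

```python
import sys

value = object()
before = sys.getrefcount(value)

alias = value  # a new reference to the same object
after_alias = sys.getrefcount(value)

del alias  # the extra reference is dropped again
after_del = sys.getrefcount(value)

print(before, after_alias, after_del)
```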

The other common problem is caused by the very nature of Python's object model and the fact that some functions return borrowed references. When the reference count goes to zero, the deallocation function is executed. For user-defined classes, it is possible to define a __del__() method that will be called at that moment. This can be any Python code and it is possible that it will affect other objects and their reference counts. The official Python documentation gives the following example of code that may be affected by this problem:

void bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);

    PyList_SetItem(list, 1, PyLong_FromLong(0L));
    PyObject_Print(item, stdout, 0); /* BUG! */
}

It looks completely harmless, but the problem is that we cannot know what elements the list object contains. When PyList_SetItem() sets a new value at the list[1] index, the ownership of the object that was previously stored at that index is disposed of. If it was the only existing reference, the reference count will become 0 and the object will be deallocated. It is possible that it was an instance of some user-defined class with a custom implementation of the __del__() method. A serious issue will occur if, as a result of such a __del__() execution, list[0] is removed from the list. Note that PyList_GetItem() returns a borrowed reference! It does not call Py_INCREF() before returning a reference. So in that code, it is possible that PyObject_Print() will be called with a reference to an object that no longer exists. This is likely to cause a segmentation fault and crash the Python interpreter.
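The dangerous side effect can be reproduced from the Python side. The following is a minimal sketch that relies on CPython's reference counting to run __del__() immediately; Payload, Evil, and run_demo are hypothetical names used only for illustration:

```python
import weakref


class Payload:
    """Plain object stored at list[0]."""


class Evil:
    """Object whose finalizer mutates the list that used to hold it."""

    def __init__(self, target):
        self.target = target

    def __del__(self):
        # Runs as soon as the last reference to this instance disappears.
        del self.target[0]


def run_demo():
    items = [Payload(), None]
    items[1] = Evil(items)
    probe = weakref.ref(items[0])  # observes items[0] without owning it

    items[1] = 0  # drops the only reference to Evil, so __del__() runs here
    return items, probe() is None


items_after, payload_gone = run_demo()
print(items_after, payload_gone)
```

After the assignment, the list contains only the new value and the object that used to be at index 0 has already been deallocated. A borrowed pointer to it held in C code would now be dangling.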

The proper approach is to protect borrowed references for the whole time we need them, because any call in between may cause the deallocation of any other object, even a seemingly unrelated one:

void no_bug(PyObject *list) {
    PyObject *item = PyList_GetItem(list, 0);

    Py_INCREF(item);
    PyList_SetItem(list, 1, PyLong_FromLong(0L));
    PyObject_Print(item, stdout, 0);
    Py_DECREF(item);
}

Cython

Cython is both an optimizing static compiler and the name of a programming language that is a superset of Python. As a compiler, it can perform source to source compilation of native Python code and its Cython dialect to Python C extensions using Python/C API. It allows you to combine the power of Python and C without the need to manually deal with Python/C API.

Cython as a source to source compiler

For extensions created using Cython, the major advantage you will get is using the superset language that it provides. Anyway, it is possible to create extensions from plain Python code using source to source compilation. This is the simplest approach to Cython because it requires almost no changes to the code and can give some significant performance improvements at a very low development cost.

Cython provides a simple cythonize utility function that allows you to easily integrate the compilation process with distutils or setuptools. Let's assume that we would like to compile a pure Python implementation of our fibonacci() function to a C extension. If it is located in the fibonacci module, the minimal setup.py script could be as follows:

from setuptools import setup
from Cython.Build import cythonize

setup(
    name='fibonacci',
    ext_modules=cythonize(['fibonacci.py'])
)

Cython used as a source compilation tool for the Python language has another benefit. Source to source compilation to extensions can be a fully optional part of source distribution installation process. If the environment where the package needs to be installed does not have Cython or any other building prerequisites, it can be installed as a normal pure Python package. The user should not notice any functional difference in the behavior of code distributed that way.

A common approach for distributing extensions built with Cython is to include both Python/Cython sources and C code that would be generated from these source files. This way the package can be installed in three different ways depending on the existence of building prerequisites:

  • If the installation environment has Cython available, the extension C code is generated from the Python/Cython sources that are provided
  • If Cython is not available but there are available building prerequisites (C compiler, Python/C API headers), the extension is built from distributed pre-generated C files
  • If neither of the preceding prerequisites is available but the extension is created from pure Python sources, the modules are installed like ordinary Python code, and the compilation step is skipped
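The decision between these three installation paths can be sketched as a small helper. This is only an illustration of the logic described above; the function name, its parameters, and the returned labels are assumptions and not part of Cython or setuptools:

```python
def choose_build_mode(has_cython, has_c_toolchain, has_pure_python_sources):
    """Pick how the package gets installed (illustrative labels only)."""
    if has_cython:
        return "cythonize"          # generate C from .py/.pyx, then compile
    if has_c_toolchain:
        return "compile-shipped-c"  # build the pre-generated .c files
    if has_pure_python_sources:
        return "pure-python"        # skip the compilation step entirely
    raise RuntimeError("no way to build or install the extension")


print(choose_build_mode(False, True, True))
```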

Note that the Cython documentation says that including generated C files as well as Cython sources is the recommended way of distributing Cython extensions. The same documentation says that Cython compilation should be disabled by default because users may not have the required version of Cython in their environment, and this may result in unexpected compilation issues. Anyway, with the advent of environment isolation, this seems to be a less worrying problem today. Also, Cython is a valid Python package that is available on PyPI, so it can easily be defined as your project requirement in a specific version. Including such a prerequisite is, of course, a decision with serious implications and should be considered very carefully. The safer solution is to declare Cython as an optional dependency using the extras_require feature of the setuptools package, and to let users decide whether they want to use Cython by setting a specific environment variable:

import os

# extras_require is a setuptools feature, so setup() must come from there
from setuptools import Extension, setup

try:
    # cython source to source compilation available
    # only when Cython is available
    import Cython
    # and specific environment variable says
    # explicitly that Cython should be used
    # to generate C sources
    USE_CYTHON = bool(os.environ.get("USE_CYTHON"))

except ImportError:
    USE_CYTHON = False

ext = '.pyx' if USE_CYTHON else '.c'

extensions = [Extension("fibonacci", ["fibonacci"+ext])]

if USE_CYTHON:
    from Cython.Build import cythonize
    extensions = cythonize(extensions)

setup(
    name='fibonacci',
    ext_modules=extensions,
    extras_require={
        # Cython will be set in that specific version
        # as a requirement if package will be installed
        # with '[with-cython]' extra feature
        'with-cython': ['cython==0.23.4']
    }
)

The pip installation tool supports the installation of packages with the extras option by adding the [extra-name] suffix to the package name. For the preceding example, the optional Cython requirement and compilation during the installation from local sources can be enabled using the following command:

$ USE_CYTHON=1 pip install .[with-cython]

Cython as a language

Cython is not only a compiler but also a superset of the Python language. Superset means that any valid Python code is allowed and that it can be further extended with additional features, such as support for calling C functions or declaring C types on variables and class attributes. So any code written in Python is also valid Cython code. This explains why ordinary Python modules can be so easily compiled to C using the Cython compiler.

But we won't stop on that simple fact. Instead of saying that our reference fibonacci() function is also code for valid extensions in this superset of Python, we will try to improve it a bit. This won't be any real optimization to our function design but some minor updates that will allow it to benefit from being written in Cython.

Cython sources use a different file extension: .pyx instead of .py. Let's assume that we still want to implement our Fibonacci sequence. The content of fibonacci.pyx might look like this:

"""Cython module that provides fibonacci sequence function."""


def fibonacci(unsigned int n):
    """Return nth Fibonacci sequence number computed recursively."""
    if n < 2:
        return n
    else:
        return fibonacci(n - 1) + fibonacci(n - 2)

As you can see, the main thing that has changed is the signature of the fibonacci() function. Thanks to optional static typing in Cython, we can declare the n argument as unsigned int, and this should slightly improve the way our function works. Note also that this version returns n rather than 1 in the base case, so fibonacci(0) is 0. The generated extension also does a lot more than we did previously when writing extensions by hand. If the argument of a Cython function is declared with a static type, then the extension will automatically handle conversion and overflow errors by raising proper exceptions:

>>> from fibonacci import fibonacci
>>> fibonacci(5)
5
>>> fibonacci(-1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fibonacci.pyx", line 21, in fibonacci.fibonacci (fibonacci.c:704)
OverflowError: can't convert negative value to unsigned int
>>> fibonacci(10 ** 10)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "fibonacci.pyx", line 21, in fibonacci.fibonacci (fibonacci.c:704)
OverflowError: value too large to convert to unsigned int
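To make the automatic check more concrete, here is a rough pure Python emulation of what happens to the argument before the function body runs. It is a sketch under stated assumptions and not the code that Cython generates: as_unsigned_int is a hypothetical helper, and the real conversion also accepts other integer-like objects:

```python
import ctypes

# Largest value of the platform's C unsigned int (usually 2**32 - 1).
UINT_MAX = 2 ** (8 * ctypes.sizeof(ctypes.c_uint)) - 1


def as_unsigned_int(value):
    """Emulate the range check applied to an 'unsigned int' argument."""
    if not isinstance(value, int):
        raise TypeError("an integer is required")
    if value < 0:
        raise OverflowError("can't convert negative value to unsigned int")
    if value > UINT_MAX:
        raise OverflowError("value too large to convert to unsigned int")
    return value


print(as_unsigned_int(5))
```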

We already know that Cython compiles only source to source and the generated code uses the same Python/C API that we would use when writing C code for extensions by hand. Note that fibonacci() is a recursive function, so it calls itself very often. This will mean that although we declared a static type for input argument, during the recursive call it will treat itself like any other Python function. So n-1 and n-2 will be packed back into the Python object and then passed to the hidden wrapper layer of the internal fibonacci() implementation that will again bring it back to the unsigned int type. This will happen again and again until we reach the final depth of recursion. This is not necessarily a problem but involves a lot more argument processing than really required.
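To get a feeling for how much argument processing is involved, we can count how many times the function is entered. The following pure Python sketch mirrors the recursion of the fibonacci() function from fibonacci.pyx; count_calls is a hypothetical helper introduced only for this illustration:

```python
def count_calls(n):
    """Return (fib(n), number of function invocations needed to compute it)."""
    if n < 2:
        return n, 1
    a, calls_a = count_calls(n - 1)
    b, calls_b = count_calls(n - 2)
    return a + b, calls_a + calls_b + 1


for n in (5, 10, 20):
    value, calls = count_calls(n)
    print(n, value, calls)
```

For n equal to 20, the function is entered 21,891 times, and in the typed def version each of those calls converts the argument to a Python object and back to unsigned int.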

We can cut off the overhead of Python function calls and argument processing by delegating more of the work to a pure C function that does not know anything about Python structures. We did this previously when creating C extensions with pure C and we can do that in Cython too. We can use the cdef keyword to declare C-style functions that accept and return only C types:

cdef long long fibonacci_cc(unsigned int n):
    if n < 2:
        return n
    else:
        return fibonacci_cc(n - 1) + fibonacci_cc(n - 2)


def fibonacci(unsigned int n):
    """ Return nth Fibonacci sequence number computed recursively
    """
    return fibonacci_cc(n)

We can go even further. With the plain C example, we finally showed how to release the GIL during the call of our pure C function, so that the extension was a bit nicer for multithreaded applications. There, we used the Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS preprocessor macros from the Python/C API headers to mark a section of code as free from Python calls. The Cython syntax is a lot shorter and easier to remember. The GIL can be released around a section of code using a simple with nogil statement:

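def fibonacci(unsigned int n):
    """ Return nth Fibonacci sequence number computed recursively
    """
    # C-typed local, so it can be assigned without holding the GIL
    cdef long long result

    with nogil:
        result = fibonacci_cc(n)

    # return the value computed above instead of calling fibonacci_cc() again
    return result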

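A function called inside a with nogil block must itself be declared as safe to call without the GIL. You can mark the whole C-style function this way by adding nogil to its signature: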

cdef long long fibonacci_cc(unsigned int n) nogil:
    if n < 2:
        return n
    else:
        return fibonacci_cc(n - 1) + fibonacci_cc(n - 2)

It is important to know that such functions cannot have Python objects as arguments or return types. Whenever a function marked as nogil needs to perform any Python/C API call, it must acquire the GIL using the with gil statement.
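Whether releasing the GIL pays off can be checked with a simple threaded timing harness. The following is a minimal sketch: timed_in_threads is a hypothetical helper, and py_fibonacci is a pure Python stand-in so that the snippet runs without a compiled module. In practice, you would pass the compiled fibonacci() function instead:

```python
from concurrent.futures import ThreadPoolExecutor
import time


def py_fibonacci(n):
    """Pure Python stand-in; swap in the compiled fibonacci() if available."""
    return n if n < 2 else py_fibonacci(n - 1) + py_fibonacci(n - 2)


def timed_in_threads(func, arg, workers=4):
    """Run func(arg) in several threads; return (results, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(func, [arg] * workers))
    return results, time.perf_counter() - start


results, elapsed = timed_in_threads(py_fibonacci, 18)
print(results, f"{elapsed:.3f}s")
```

On a multicore machine, a function that releases the GIL would be expected to finish all calls in roughly the time of a single call, while a function that holds the GIL runs them one after another.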
