Thanks to ctypes (a module in the standard library) or cffi (an external package), you can integrate almost any compiled dynamic or shared library with Python, no matter what language it was written in. You can do that in pure Python without any compilation step, which makes it an interesting alternative to writing extensions in C.
This does not mean that you can skip learning C. Both solutions require a reasonable understanding of C and of how dynamic libraries work in general. On the other hand, they remove the burden of dealing with Python reference counting and greatly reduce the risk of making painful mistakes. Interfacing with C code through ctypes or cffi is also more portable than writing and compiling a C extension module.
ctypes is the most popular module for calling functions from dynamic or shared libraries without writing custom C extensions. The reason is obvious: it is part of the standard library, so it is always available and requires no external dependencies. It is a foreign function interface (FFI) library and provides an API for creating C-compatible datatypes.
There are four types of dynamic library loaders available in ctypes and two conventions for using them. The classes that represent dynamic and shared libraries are ctypes.CDLL, ctypes.PyDLL, ctypes.OleDLL, and ctypes.WinDLL. The last two are only available on Windows, so we won't discuss them here. The differences between CDLL and PyDLL are as follows:
- ctypes.CDLL: This class represents loaded shared libraries. The functions in these libraries use the standard calling convention and are assumed to return int. The GIL is released during the call.
- ctypes.PyDLL: This class works like CDLL, but the GIL is not released during the call. After execution, the Python error flag is checked and an exception is raised if it is set. It is only useful when directly calling functions from the Python/C API.
To load a library, you can either instantiate one of the preceding classes with proper arguments or call the LoadLibrary() method of the loader object associated with a specific class:
- ctypes.cdll.LoadLibrary() for ctypes.CDLL
- ctypes.pydll.LoadLibrary() for ctypes.PyDLL
- ctypes.windll.LoadLibrary() for ctypes.WinDLL
- ctypes.oledll.LoadLibrary() for ctypes.OleDLL
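The two loading conventions can be sketched side by side as follows. This is a minimal sketch that assumes a POSIX system (Linux or macOS); it relies on the find_library() helper from ctypes.util to obtain the library name, and on ctypes.pythonapi, a PyDLL instance that ctypes creates for the Python/C API. The variable names are illustrative only.

```python
import ctypes
from ctypes.util import find_library

# find_library() may return None on some POSIX systems. Passing None to a
# loader then opens the symbols already loaded into the current process,
# which include the C standard library.
libc_path = find_library("c")

# Convention 1: instantiate the loader class directly.
libc_a = ctypes.CDLL(libc_path)

# Convention 2: go through the loader object that matches the class.
libc_b = ctypes.cdll.LoadLibrary(libc_path)

# ctypes.pythonapi is a ready-made PyDLL instance, so the GIL stays held
# while functions of the Python/C API are called through it.
get_version = ctypes.pythonapi.Py_GetVersion
get_version.restype = ctypes.c_char_p  # the default int would truncate a pointer
python_version = get_version().decode()
```

Both conventions produce an instance of the same class, so the choice between them is mostly a matter of taste.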
The main challenge when loading shared libraries is finding them in a portable way. Different systems use different suffixes for shared libraries (.dll on Windows, .dylib on macOS, .so on Linux) and search for them in different places. The main offender in this area is Windows, which does not have a predefined naming scheme for libraries. Because of that, we won't discuss the details of loading libraries with ctypes on this system and will concentrate mainly on Linux and macOS, which deal with this problem in a consistent and similar way. If you are interested in the Windows platform anyway, refer to the official ctypes documentation, which has plenty of information about supporting that system (refer to https://docs.python.org/3.5/library/ctypes.html).
Both library loading conventions (the LoadLibrary() method and the specific library-type classes) require you to use the full library name. This means that all the predefined library prefixes and suffixes need to be included. For example, to load the C standard library on Linux, you need to write the following:
>>> import ctypes
>>> ctypes.cdll.LoadLibrary('libc.so.6')
<CDLL 'libc.so.6', handle 7f0603e5f000 at 7f0603d4cbd0>
On macOS, this would be:
>>> import ctypes
>>> ctypes.cdll.LoadLibrary('libc.dylib')
Fortunately, the ctypes.util submodule provides a find_library() function that locates a library by its name without any prefixes or suffixes, so the result can be passed straight to a loader. It works on any system that has a predefined scheme for naming shared libraries:
>>> import ctypes
>>> from ctypes.util import find_library
>>> ctypes.cdll.LoadLibrary(find_library('c'))
<CDLL '/usr/lib/libc.dylib', handle 7fff69b97c98 at 0x101b73ac8>
>>> ctypes.cdll.LoadLibrary(find_library('bz2'))
<CDLL '/usr/lib/libbz2.dylib', handle 10042d170 at 0x101b6ee80>
>>> ctypes.cdll.LoadLibrary(find_library('AGL'))
<CDLL '/System/Library/Frameworks/AGL.framework/AGL', handle 101811610 at 0x101b73a58>
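Because find_library() returns None when it cannot locate a library, it is usually worth checking its result before loading. The following is a minimal sketch of that check; load_optional() and the library name passed to it are hypothetical names made up for this illustration.

```python
import ctypes
from ctypes.util import find_library


def load_optional(name):
    """Return a CDLL for the named library, or None if it cannot be found.

    Hypothetical helper used only for this illustration.
    """
    path = find_library(name)
    if path is None:
        # find_library() signals "not found" by returning None
        return None
    return ctypes.CDLL(path)


# A made-up name that should not match any installed library.
missing = load_optional("no_such_library_for_this_example")
```

Without the check, passing None to a loader on a POSIX system silently opens the current process instead of failing, which can hide a missing dependency.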
When the library is successfully loaded, the common pattern is to store it as a module-level variable with the same name as the library. The functions can be accessed as object attributes, so calling them is like calling a Python function from any other imported module:
>>> import ctypes
>>> from ctypes.util import find_library
>>>
>>> libc = ctypes.cdll.LoadLibrary(find_library('c'))
>>>
>>> libc.printf(b"Hello world!\n")
Hello world!
13
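Since ctypes assumes that every function returns int, it is often safer to describe the function signature explicitly through the argtypes and restype attributes. The following sketch does this for strlen() from the C standard library; it assumes a POSIX system where find_library('c') can be passed to CDLL.

```python
import ctypes
from ctypes.util import find_library

libc = ctypes.CDLL(find_library("c"))

# Describe the C signature: size_t strlen(const char *s);
strlen = libc.strlen
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

length = strlen(b"Hello world!")

# With argtypes declared, ctypes rejects arguments of the wrong type
# before the C function is ever called.
error = None
try:
    strlen("a str, not bytes")
except ctypes.ArgumentError as exc:
    error = exc
```

Declaring argtypes turns many mistakes that would otherwise crash the interpreter into ordinary Python exceptions.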
Unfortunately, all the built-in Python types except None, integers, strings, and bytes are incompatible with C datatypes and thus must be wrapped in the corresponding classes provided by the ctypes module. Here is the full list of compatible datatypes that comes from the ctypes documentation:
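ctypes type | C type | Python type |
---|---|---|
c_bool | _Bool | bool |
c_char | char | 1-character bytes object |
c_wchar | wchar_t | 1-character string |
c_byte | char | int |
c_ubyte | unsigned char | int |
c_short | short | int |
c_ushort | unsigned short | int |
c_int | int | int |
c_uint | unsigned int | int |
c_long | long | int |
c_ulong | unsigned long | int |
c_longlong | __int64 or long long | int |
c_ulonglong | unsigned __int64 or unsigned long long | int |
c_size_t | size_t | int |
c_ssize_t | ssize_t or Py_ssize_t | int |
c_float | float | float |
c_double | double | float |
c_longdouble | long double | float |
c_char_p | char * (NUL terminated) | bytes object or None |
c_wchar_p | wchar_t * (NUL terminated) | string or None |
c_void_p | void * | int or None |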
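The types from the table can be tried out directly. The following is a small sketch, assuming a POSIX system and the default C numeric locale, that wraps a few Python values and passes one of them to atof() from the C standard library.

```python
import ctypes
from ctypes.util import find_library

libc = ctypes.CDLL(find_library("c"))

# Wrapping a Python value creates a C-compatible object; the Python
# value is available again through the .value attribute.
number = ctypes.c_int(42)

# ctypes does no overflow checking: 300 does not fit in an unsigned
# char, so only the low 8 bits are kept.
wrapped = ctypes.c_ubyte(300)

# double atof(const char *nptr);
atof = libc.atof
atof.argtypes = [ctypes.c_char_p]
atof.restype = ctypes.c_double  # without this, the result would be read as int

text = ctypes.c_char_p(b"2.5")
parsed = atof(text)
```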
As you can see, the preceding table does not contain dedicated types that would reflect any of the Python collections as C arrays. The recommended way to create types for C arrays is to simply use the multiplication operator with the desired basic ctypes
type:
>>> import ctypes
>>> IntArray5 = ctypes.c_int * 5
>>> c_int_array = IntArray5(1, 2, 3, 4, 5)
>>> FloatArray2 = ctypes.c_float * 2
>>> c_float_array = FloatArray2(0, 3.14)
>>> c_float_array[1]
3.140000104904175
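In practice, the array length usually comes from an existing Python collection. The following sketch, with illustrative variable names, builds a C array from a list and shows that the result supports indexing, len(), and iteration like a fixed-size sequence.

```python
import ctypes

values = [3, 1, 2]

# The array type is created on the fly from the length of the list.
IntArray = ctypes.c_int * len(values)
c_values = IntArray(*values)

# Elements can be read and assigned by index.
c_values[0] = 10

# Iterating over the array yields plain Python integers.
as_list = list(c_values)

# sizeof() reports the size of the whole array in bytes.
total_bytes = ctypes.sizeof(c_values)
```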
It is a very popular design pattern to delegate part of the work of a function's implementation to custom callbacks provided by the user. The best-known function from the C standard library that accepts such callbacks is qsort(), a generic sorting routine that is traditionally implemented with the Quicksort algorithm. It is rather unlikely that you would want to use it instead of the default Python Timsort, which is better suited for sorting Python collections. Still, qsort() is the canonical example, found in many programming books, of both an efficient sorting algorithm and a C API that uses the callback mechanism. This is why we will use it as an example of passing a Python function as a C callback.
The ordinary Python function type is not compatible with the callback function type required by the qsort() specification. Here is the signature of qsort() from the BSD man page, which also contains the type of the accepted callback (the compar argument):
void qsort(void *base, size_t nel, size_t width, int (*compar)(const void *, const void *));
So in order to execute qsort() from libc, you need to pass:
- base: This is the array that needs to be sorted, as a void* pointer.
- nel: This is the number of elements, as size_t.
- width: This is the size of a single element in the array, as size_t.
- compar: This is the pointer to a function that returns int and accepts two void* pointers. It points to the function that compares two elements being sorted.
We already know from the Calling C functions using ctypes section how to construct a C array from other ctypes types using the multiplication operator. nel should be size_t, which maps to Python int, so it does not require any additional wrapping and can be passed as len(iterable). The width value can be obtained using the ctypes.sizeof() function once we know the type of our base array. The last thing we need to know is how to create a pointer to a Python function that is compatible with the compar argument.
The ctypes module contains a CFUNCTYPE() factory function that allows us to wrap Python functions and represent them as C callable function pointers. The first argument is the C return type that the wrapped function should return. It is followed by the variable list of C types that the function accepts as its arguments. The function type compatible with the compar argument of qsort() will be:
CMPFUNC = ctypes.CFUNCTYPE(
    # return type
    ctypes.c_int,
    # first argument type
    ctypes.POINTER(ctypes.c_int),
    # second argument type
    ctypes.POINTER(ctypes.c_int),
)
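A function type created this way can be tried out before it is handed to any C code, because its instances are also callable from Python. The following is a minimal sketch; compare_ints() and the variable names are illustrative, not part of ctypes.

```python
import ctypes

CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int,                  # return type
    ctypes.POINTER(ctypes.c_int),  # first argument type
    ctypes.POINTER(ctypes.c_int),  # second argument type
)


def compare_ints(a, b):
    # Both arguments arrive as pointers to int, so [0] dereferences them.
    return a[0] - b[0]


# Wrapping the Python function produces a C-callable function pointer.
c_compare = CMPFUNC(compare_ints)

left = ctypes.c_int(1)
right = ctypes.c_int(5)

# Calling the wrapper from Python goes through the same C-level
# entry point that a library such as libc would use.
result = c_compare(ctypes.pointer(left), ctypes.pointer(right))
```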
CFUNCTYPE() uses the cdecl calling convention, so it is compatible only with the CDLL and PyDLL shared libraries. The dynamic libraries on Windows that are loaded with WinDLL or OleDLL use the stdcall calling convention. This means that a different factory must be used to wrap Python functions as C callable function pointers for them. In ctypes, it is WINFUNCTYPE().
To wrap everything up, let's assume that we want to sort a randomly shuffled list of integers with the qsort() function from the standard C library. Here is an example script that shows how to do that using everything we have learned about ctypes so far:
from random import shuffle

import ctypes
from ctypes.util import find_library

libc = ctypes.cdll.LoadLibrary(find_library('c'))

CMPFUNC = ctypes.CFUNCTYPE(
    # return type
    ctypes.c_int,
    # first argument type
    ctypes.POINTER(ctypes.c_int),
    # second argument type
    ctypes.POINTER(ctypes.c_int),
)


def ctypes_int_compare(a, b):
    # arguments are pointers so we access using [0] index
    print(" %s cmp %s" % (a[0], b[0]))

    # according to qsort specification this should return:
    # * less than zero if a < b
    # * zero if a == b
    # * more than zero if a > b
    return a[0] - b[0]


def main():
    numbers = list(range(5))
    shuffle(numbers)
    print("shuffled: ", numbers)

    # create new type representing array with length
    # same as the length of numbers list
    NumbersArray = ctypes.c_int * len(numbers)
    # create new C array using a new type
    c_array = NumbersArray(*numbers)

    libc.qsort(
        # pointer to the sorted array
        c_array,
        # length of the array
        len(c_array),
        # size of single array element
        ctypes.sizeof(ctypes.c_int),
        # callback (pointer to the C comparison function)
        CMPFUNC(ctypes_int_compare)
    )
    print("sorted: ", list(c_array))


if __name__ == "__main__":
    main()
The comparison function provided as a callback contains an additional print() call, so we can see how it is executed during the sorting process:
$ python ctypes_qsort.py
shuffled: [4, 3, 0, 1, 2]
 4 cmp 3
 4 cmp 0
 3 cmp 0
 4 cmp 1
 3 cmp 1
 0 cmp 1
 4 cmp 2
 3 cmp 2
 1 cmp 2
sorted: [0, 1, 2, 3, 4]
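The script above lets ctypes guess how to convert each argument. A somewhat more defensive variant, sketched below under the assumption of a POSIX system, declares the qsort() signature up front, keeps a reference to the callback for the duration of the call, and uses a comparison that cannot overflow the int return type. The sort_ints() and compare_ints() names are hypothetical.

```python
import ctypes
from ctypes.util import find_library

libc = ctypes.CDLL(find_library("c"))

CMPFUNC = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.POINTER(ctypes.c_int),
    ctypes.POINTER(ctypes.c_int),
)

# Declare the C signature so that nel and width are passed as size_t.
libc.qsort.argtypes = [
    ctypes.c_void_p,   # base
    ctypes.c_size_t,   # nel
    ctypes.c_size_t,   # width
    CMPFUNC,           # compar
]
libc.qsort.restype = None  # qsort() returns void


def compare_ints(a, b):
    # Yields -1, 0, or 1; unlike a[0] - b[0], the result always fits
    # in a C int, even for values near the limits of the type.
    return (a[0] > b[0]) - (a[0] < b[0])


def sort_ints(numbers):
    """Return a sorted copy of a list of C-int-sized integers."""
    c_array = (ctypes.c_int * len(numbers))(*numbers)
    # Keep a named reference so the callback object stays alive
    # for as long as C code may call it.
    callback = CMPFUNC(compare_ints)
    libc.qsort(c_array, len(c_array), ctypes.sizeof(ctypes.c_int), callback)
    return list(c_array)
```

Holding the callback in a variable matters most for C APIs that store the pointer and call it later; for qsort(), which only calls it during the sort, it simply makes the lifetime explicit.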
CFFI is a Foreign Function Interface for Python that is an interesting alternative to ctypes. It is not a part of the standard library but is easily available as the cffi package on PyPI. It differs from ctypes in that it puts more emphasis on reusing plain C declarations instead of providing extensive Python APIs in a single module. It is considerably more complex and also has a feature that allows you to automatically compile some parts of your integration layer into extensions using a C compiler. So it can be used as a hybrid solution that fills the gap between C extensions and ctypes.
Because it is a very large project, it is impossible to introduce it properly in a few paragraphs. On the other hand, it would be a shame not to say something more about it. We have already discussed one example of integrating the qsort() function from the standard library using ctypes. So, the best way to show the main differences between these two solutions is to re-implement the same example with cffi. I hope that one block of code is worth more than a few paragraphs of text:
from random import shuffle

from cffi import FFI

ffi = FFI()

ffi.cdef("""
void qsort(void *base, size_t nel, size_t width,
           int (*compar)(const void *, const void *));
""")
C = ffi.dlopen(None)


@ffi.callback("int(void*, void*)")
def cffi_int_compare(a, b):
    # Callback signature requires exact matching of types.
    # This involves less magic than in ctypes
    # but also makes you more specific and requires
    # explicit casting
    int_a = ffi.cast('int*', a)[0]
    int_b = ffi.cast('int*', b)[0]
    print(" %s cmp %s" % (int_a, int_b))

    # according to qsort specification this should return:
    # * less than zero if a < b
    # * zero if a == b
    # * more than zero if a > b
    return int_a - int_b


def main():
    numbers = list(range(5))
    shuffle(numbers)
    print("shuffled: ", numbers)

    c_array = ffi.new("int[]", numbers)

    C.qsort(
        # pointer to the sorted array
        c_array,
        # length of the array
        len(c_array),
        # size of single array element
        ffi.sizeof('int'),
        # callback (pointer to the C comparison function)
        cffi_int_compare,
    )
    print("sorted: ", list(c_array))


if __name__ == "__main__":
    main()