Metaprogramming

There may be a good definition of metaprogramming in some academic paper that could be cited here, but this is a book about good software craftsmanship rather than about computer science theory. This is why we will use a simple definition:

"Metaprogramming is a technique of writing computer programs that can treat themselves as data, so you can introspect, generate, and/or modify itself while running."

Using this definition, we can distinguish two major approaches to metaprogramming in Python.

The first approach concentrates on the language's ability to introspect its basic elements, such as functions, classes, or types, and to create or modify them on the fly. Python gives developers a lot of tools in this area. The easiest ones are decorators, which allow you to add extra functionality to existing functions, methods, or classes. Next are the special methods of classes that allow you to interfere with the class instance creation process. The most powerful are metaclasses, which allow programmers to completely redesign Python's implementation of the object-oriented programming paradigm.

The second approach allows programmers to work directly with code, either in its raw plain text format or in the more programmatically accessible Abstract Syntax Tree (AST) form. Here too, Python offers a good selection of tools. This second approach is of course more complicated and difficult to work with, but it allows for truly extraordinary things, such as extending Python's language syntax or even creating your own Domain Specific Language (DSL).

Decorators – a method of metaprogramming

The decorator syntax is explained in Chapter 2, Syntax Best Practices – below the Class Level, as a simple pattern:

def decorated_function():
    pass
decorated_function = some_decorator(decorated_function)

This clearly shows what the decorator does. It takes a function object and modifies it at run time. As a result, a new function (or anything else) is created based on the previous function object with the same name. This may even be a complex operation that performs some introspection to give different results depending on how the original function is implemented. All this means that decorators can be considered a metaprogramming tool.

This is good news. Decorators are relatively easy to grasp and in most cases make code shorter, easier to read, and also cheaper to maintain. Other metaprogramming tools available in Python are more difficult to understand and master. Also, they might not make the code simple at all.
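To illustrate the introspective side of decorators, here is a small sketch of a decorator that inspects the signature of the decorated function and adapts its behavior accordingly (the logged name and its exact behavior are made up for this example):

import functools
import inspect


def logged(function):
    """Print calls; mention arguments only if the function accepts any."""
    has_args = bool(inspect.signature(function).parameters)

    @functools.wraps(function)
    def wrapped(*args, **kwargs):
        if has_args:
            print('calling', function.__name__, 'with', args, kwargs)
        else:
            print('calling', function.__name__)
        return function(*args, **kwargs)

    return wrapped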

Class decorators

One of the lesser-known syntax features of Python is the class decorator. Its syntax and the way it works are exactly the same as with the function decorators mentioned in Chapter 2, Syntax Best Practices – below the Class Level. The only difference is that a class decorator is expected to return a class instead of a function object. Here is an example class decorator that modifies the __repr__() method to return a printable object representation shortened to some arbitrary number of characters:

def short_repr(cls):
    cls.__repr__ = lambda self: super(cls, self).__repr__()[:8]
    return cls


@short_repr
class ClassWithRelativelyLongName:
    pass

The following is what you will see in the output:

>>> ClassWithRelativelyLongName()
<ClassWi

Of course, the preceding code snippet is not an example of good code by any means because it is too cryptic. Still, it shows how multiple language features explained in this chapter can be used together:

  • Not only instances but also class objects can be modified at runtime
  • Functions are descriptors too, so they can be added to a class at runtime because the actual binding to the instance is performed on attribute lookup as part of the descriptor protocol
  • The super() call can be used outside of a class definition scope as long as proper arguments are provided
  • Finally, class decorators can be used on class definitions

The other aspects of writing function decorators apply to class decorators as well. Most importantly, they can use closures and be parametrized. Taking advantage of these facts, the previous example can be rewritten into a more readable and maintainable form:

def parametrized_short_repr(max_width=8):
    """Parametrized decorator that shortens representation"""
    def parametrized(cls):
        """Inner wrapper function that is actual decorator"""
        class ShortlyRepresented(cls):
            """Subclass that provides decorated behavior"""
            def __repr__(self):
                return super().__repr__()[:max_width]

        return ShortlyRepresented

    return parametrized

The major drawback of using closures this way in class decorators is that the resulting objects are no longer instances of the class that was decorated but instances of a subclass created dynamically in the decorator function. Among other things, this will affect the class's __name__ and __doc__ attributes:

@parametrized_short_repr(10)
class ClassWithLittleBitLongerLongName:
    pass

Such usage of class decorators will result in the following changes to the class metadata:

>>> ClassWithLittleBitLongerLongName().__class__
<class 'ShortlyRepresented'>
>>> ClassWithLittleBitLongerLongName().__doc__
'Subclass that provides decorated behavior'

Unfortunately, this cannot be fixed as simply as explained in the Introspection Preserving Decorators section of Chapter 2, Syntax Best Practices – below the Class Level, using the additional wraps decorator. This makes the use of class decorators in this form limited in some circumstances. If no additional work is performed to preserve the old class's metadata, this can break the results of many automated documentation generation tools.
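If preserving that metadata matters, a bit of manual work inside the decorator, done in the spirit of the wraps decorator, can mitigate the problem. The following is only a sketch of that idea:

def parametrized_short_repr(max_width=8):
    """Parametrized decorator that shortens representation"""
    def parametrized(cls):
        class ShortlyRepresented(cls):
            def __repr__(self):
                return super().__repr__()[:max_width]

        # manually copy the most commonly inspected metadata
        # from the decorated class to the dynamic subclass
        ShortlyRepresented.__name__ = cls.__name__
        ShortlyRepresented.__qualname__ = cls.__qualname__
        ShortlyRepresented.__doc__ = cls.__doc__
        return ShortlyRepresented

    return parametrized

Note that the resulting objects are still instances of a different class, so checks based on __class__ identity remain affected.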

Still, despite this single caveat, class decorators are a simple and lightweight alternative to the popular mixin class pattern.

A mixin in Python is a class that is not meant to be instantiated, but is instead used to provide some reusable API or functionality to other existing classes. Mixin classes are almost always added using multiple inheritance in the form of:

class SomeConcreteClass(MixinClass, SomeBaseClass):
    pass

Mixins are a useful design pattern that is used in many libraries. To name one, Django is a framework that uses them extensively. While useful and popular, mixins can cause some trouble if not designed well because, in most cases, they require the developer to rely on multiple inheritance. As was said earlier, Python handles multiple inheritance relatively well, thanks to the MRO. Still, it may be better to avoid subclassing multiple classes if doing so does not require too much additional work and makes the code simpler. This is why class decorators may be a good replacement for mixins.
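As a sketch of this substitution, consider a hypothetical serialization feature implemented first as a mixin and then as a class decorator (all names here are invented for the example):

# mixin-based approach: the feature comes from multiple inheritance
class DictExportMixin:
    def to_dict(self):
        return dict(vars(self))


class UserWithMixin(DictExportMixin):
    def __init__(self, name):
        self.name = name


# decorator-based approach: the same feature, no extra base class
def dict_export(cls):
    cls.to_dict = lambda self: dict(vars(self))
    return cls


@dict_export
class UserWithDecorator:
    def __init__(self, name):
        self.name = name

Both UserWithMixin('John').to_dict() and UserWithDecorator('John').to_dict() return {'name': 'John'}, but the decorated class keeps a flat inheritance hierarchy.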

Using the __new__() method to override the instance creation process

The special method __new__() is a static method responsible for creating class instances. It is special-cased, so there is no need to declare it as static using the staticmethod decorator. This __new__(cls[, ...]) method is called prior to the __init__() initialization method. Typically, the implementation of an overridden __new__() invokes its superclass version using super().__new__() with suitable arguments and modifies the instance before returning it:

class InstanceCountingClass:
    instances_created = 0
    def __new__(cls, *args, **kwargs):
        print('__new__() called with:', cls, args, kwargs)
        instance = super().__new__(cls)
        instance.number = cls.instances_created
        cls.instances_created += 1

        return instance

    def __init__(self, attribute):
        print('__init__() called with:', self, attribute)
        self.attribute = attribute

Here is the log of an example interactive session that shows how our InstanceCountingClass implementation works:

>>> instance1 = InstanceCountingClass('abc')
__new__() called with: <class '__main__.InstanceCountingClass'> ('abc',) {}
__init__() called with: <__main__.InstanceCountingClass object at 0x101259e10> abc
>>> instance2 = InstanceCountingClass('xyz')
__new__() called with: <class '__main__.InstanceCountingClass'> ('xyz',) {}
__init__() called with: <__main__.InstanceCountingClass object at 0x101259dd8> xyz
>>> instance1.number, instance1.instances_created
(0, 2)
>>> instance2.number, instance2.instances_created
(1, 2)

The __new__() method should usually return an instance of the featured class, but it is also possible for it to return an instance of another class. If this happens (an instance of a different class is returned), then the call to the __init__() method is skipped. This fact is useful when there is a need to modify the creation behavior of immutable class instances such as some of Python's built-in types:

class NonZero(int):
    def __new__(cls, value):
        return super().__new__(cls, value) if value != 0 else None

    def __init__(self, skipped_value):
        # implementation of __init__() could be skipped in this case
        # but it is left here to show that it may not get called
        print("__init__() called")
        super().__init__()

Let's see this in the interactive session:

>>> type(NonZero(-12))
__init__() called
<class '__main__.NonZero'>
>>> type(NonZero(0))
<class 'NoneType'>
>>> NonZero(-3.123)
__init__() called
-3

So, when to use __new__()? The answer is simple: only when __init__() is not enough. One such case was already mentioned; this is subclassing of immutable built-in Python types such as int, str, float, frozenset, and so on. This is because there is no way to modify such an immutable object instance in the __init__() method once it is created.

Some programmers may argue that __new__() is useful for performing important object initialization that could be missed if the user forgets to use the super().__init__() call in the overridden initialization method. While it sounds reasonable, this has a major drawback. With such an approach, it becomes harder for the programmer to explicitly skip previous initialization steps if that is already the desired behavior. It also breaks the unspoken rule of all initialization being performed in __init__().

Because __new__() is not constrained to return the same class instance, it can be easily abused. Irresponsible usage of this method might do a lot of harm to the code, so it should always be used carefully and backed with extensive documentation. Generally, it is better to search for other solutions that may be available for the given problem, instead of affecting object creation in a way that will break basic programmers' expectations. Even overridden initialization of non-mutable types mentioned earlier can be replaced with more predictable and well-established design patterns, such as the Factory Method, which is described in Chapter 14, Useful Design Patterns.
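To sketch that idea, the earlier NonZero example could be restated with an explicit factory classmethod instead of a custom __new__() (the create() name is arbitrary):

class NonZero(int):
    @classmethod
    def create(cls, value):
        # the special case is now an explicit, documented part of
        # the factory's contract instead of a __new__() surprise
        if value == 0:
            return None
        return cls(value)

Here, NonZero(0) still behaves like a plain int subclass, and only NonZero.create(0) returns None.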

There is at least one aspect of Python programming where extensive usage of the __new__() method is well justified. These are metaclasses that are described in the next section.

Metaclasses

Metaclass is a Python feature that is considered by many as one of the most difficult things in the language and thus avoided by a great number of developers. In reality, it is not as complicated as it sounds once you understand a few basic concepts. As a reward, knowing this feature grants the ability to do things that are not possible using other approaches.

A metaclass is a type (class) that defines other types (classes). The most important thing to know in order to understand how metaclasses work is that classes that define object instances are objects too. So, if they are objects, then they have an associated class. The basic type of every class definition is simply the built-in type class. Here is a simple diagram that should make this clear:

Figure 3: How classes are typed
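This relationship is easy to verify in an interactive session:

>>> class Foo:
...     pass
... 
>>> type(Foo())
<class '__main__.Foo'>
>>> type(Foo)
<class 'type'>
>>> type(type)
<class 'type'>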

In Python, it is possible to substitute the metaclass of a class object with your own type. Usually, the new metaclass is still a subclass of the type class (refer to Figure 4) because not doing so would make the resulting classes highly incompatible with other classes in terms of inheritance.

Figure 4: Usual implementation of custom metaclasses

The general syntax

The call to the built-in type() class can be used as a dynamic equivalent of the class statement. It creates a new class object given its name, its base classes, and a mapping containing its attributes:

def method(self):
    return 1

klass = type('MyClass', (object,), {'method': method})

The following is the output:

>>> instance = klass()
>>> instance.method()
1

This is equivalent to the explicit definition of the class:

class MyClass:
    def method(self):
        return 1

Here is what you will get:

>>> instance = MyClass()
>>> instance.method()
1

Every class created with the class statement implicitly uses type as its metaclass. This default behavior can be changed by providing the metaclass keyword argument to the class statement:

class ClassWithAMetaclass(metaclass=type):
    pass

The value provided as a metaclass argument is usually another class object, but it can be any other callable that accepts the same arguments as the type class and is expected to return another class object. The call signature is type(name, bases, namespace), which is explained as follows:

  • name: This is the name of the class that will be stored in the __name__ attribute
  • bases: This is the list of parent classes that will become the __bases__ attribute and will be used to construct the MRO of a newly created class
  • namespace: This is a namespace (mapping) with definitions for the class body that will become the __dict__ attribute
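As a quick illustration of how these three arguments map to class attributes, consider the following short sketch (the class names are arbitrary):

Base = type('Base', (), {})
Point = type(
    'Point',             # stored as Point.__name__
    (Base,),             # stored as Point.__bases__ and used for the MRO
    {'dimensions': 2},   # contents end up in Point.__dict__
)

assert Point.__name__ == 'Point'
assert Point.__bases__ == (Base,)
assert Point.dimensions == 2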

One way of thinking about metaclasses is as the __new__() method, but working at a higher level: the creation of class objects rather than class instances.

Despite the fact that functions that explicitly call type() can be used in place of metaclasses, the usual approach is to use a different class that inherits from type for this purpose. The common template for a metaclass is as follows:

class Metaclass(type):
    def __new__(mcs, name, bases, namespace):
        return super().__new__(mcs, name, bases, namespace)

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return super().__prepare__(name, bases, **kwargs)

    def __init__(cls, name, bases, namespace, **kwargs):
        super().__init__(name, bases, namespace)

    def __call__(cls, *args, **kwargs):
        return super().__call__(*args, **kwargs)

The name, bases, and namespace arguments have the same meaning as in the type() call explained earlier, but each of these four methods can have different purposes:

  • __new__(mcs, name, bases, namespace): This is responsible for the actual creation of the class object, in the same way as it is for ordinary classes. The first positional argument is a metaclass object. In the preceding example, it would simply be Metaclass. Note that mcs is the popular naming convention for this argument.
  • __prepare__(mcs, name, bases, **kwargs): This creates an empty namespace object. By default, it returns an empty dict, but it can be overridden to return any other mapping type. Note that it does not accept namespace as an argument because the namespace does not exist before it is called.
  • __init__(cls, name, bases, namespace, **kwargs): This is not commonly seen in metaclass implementations but has the same meaning as in ordinary classes. It can perform additional class object initialization once the class has been created with __new__(). The first positional argument is now named cls by convention to mark that this is already a created class object (metaclass instance) and not a metaclass object. When __init__() is called, the class has already been constructed, so this method can do fewer things than the __new__() method. Implementing such a method is very similar to using class decorators, but the main difference is that __init__() will be called for every subclass, while class decorators are not called for subclasses.
  • __call__(cls, *args, **kwargs): This is called when an instance of a metaclass is called. The instance of a metaclass is a class object (refer to Figure 3); it is invoked when you create new instances of a class. This can be used to override the default way in which class instances are created and initialized.

Each of the preceding methods can accept additional keyword arguments, represented here by **kwargs. These arguments can be passed to the metaclass object using extra keyword arguments in the class definition, as in the following code:

class Klass(metaclass=Metaclass, extra="value"):
    pass
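A minimal sketch of a metaclass that actually consumes such an extra argument could look as follows (TaggedMeta and the extra attribute are hypothetical names used only for illustration):

class TaggedMeta(type):
    def __new__(mcs, name, bases, namespace, extra=None):
        cls = super().__new__(mcs, name, bases, namespace)
        # store the consumed keyword argument on the new class
        cls.extra = extra
        return cls

    def __init__(cls, name, bases, namespace, extra=None):
        # type.__init__() does not understand 'extra', so it must be
        # consumed here as well and not passed along
        super().__init__(name, bases, namespace)


class Klass(metaclass=TaggedMeta, extra="value"):
    pass

After that, Klass.extra is simply "value".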

This amount of information can be overwhelming at the beginning without proper examples, so let's trace the creation of metaclasses, classes, and instances with some print() calls:

class RevealingMeta(type):
    def __new__(mcs, name, bases, namespace, **kwargs):
        print(mcs, "__new__ called")
        return super().__new__(mcs, name, bases, namespace)

    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        print(mcs, "__prepare__ called")
        return super().__prepare__(name, bases, **kwargs)

    def __init__(cls, name, bases, namespace, **kwargs):
        print(cls, "__init__ called")
        super().__init__(name, bases, namespace)

    def __call__(cls, *args, **kwargs):
        print(cls, "__call__ called")
        return super().__call__(*args, **kwargs)

Using RevealingMeta as a metaclass to create a new class definition will give the following output in the Python interactive session:

>>> class RevealingClass(metaclass=RevealingMeta):
...     def __new__(cls):
...         print(cls, "__new__ called")
...         return super().__new__(cls)
...     def __init__(self):
...         print(self, "__init__ called")
...         super().__init__()
... 
<class 'RevealingMeta'> __prepare__ called
<class 'RevealingMeta'> __new__ called
<class 'RevealingClass'> __init__ called
>>> instance = RevealingClass()
<class 'RevealingClass'> __call__ called
<class 'RevealingClass'> __new__ called
<RevealingClass object at 0x1032b9fd0> __init__ called

New Python 3 syntax for metaclasses

Metaclasses are not a new feature and have been available in Python since version 2.2. Still, the syntax changed significantly between the major Python versions, and this change is neither backward nor forward compatible. While the new syntax is:

class ClassWithAMetaclass(metaclass=type):
    pass

In Python 2, this must be written as follows:

class ClassWithAMetaclass(object):
    __metaclass__ = type

Class statements in Python 2 do not accept keyword arguments, so the Python 3 syntax for defining metaclasses will raise a SyntaxError exception on import. It is still possible to write code using metaclasses that will run on both Python versions, but it requires some extra work. Fortunately, compatibility-related packages such as six provide simple and reusable solutions to this problem:

from six import with_metaclass


class Meta(type):
    pass


class Base(object):
    pass


class MyClass(with_metaclass(Meta, Base)):
    pass

The other important difference is the lack of the __prepare__() hook in Python 2 metaclasses. Implementing such a function will not raise any exceptions under Python 2, but it is pointless because it will not be called to provide a clean namespace object. This is why packages that need to maintain Python 2 compatibility have to rely on more complex tricks if they want to achieve things that are a lot easier to implement using __prepare__(). For instance, the Django REST Framework (http://www.django-rest-framework.org) uses the following approach to preserve the order in which attributes are added to a class:

class SerializerMetaclass(type):
    @classmethod
    def _get_declared_fields(cls, bases, attrs):
        fields = [(field_name, attrs.pop(field_name))
                  for field_name, obj in list(attrs.items())
                  if isinstance(obj, Field)]
        fields.sort(key=lambda x: x[1]._creation_counter)

        # If this class is subclassing another Serializer, add 
        # that Serializer's fields. 
        # Note that we loop over the bases in *reverse*. 
        # This is necessary in order to maintain the 
        # correct order of fields.
        for base in reversed(bases):
            if hasattr(base, '_declared_fields'):
                fields = list(base._declared_fields.items()) + fields

        return OrderedDict(fields)

    def __new__(cls, name, bases, attrs):
        attrs['_declared_fields'] = cls._get_declared_fields(
            bases, attrs
        )
        return super(SerializerMetaclass, cls).__new__(
            cls, name, bases, attrs
        )

This is a workaround for the fact that the default namespace type, dict, does not guarantee to preserve the order of the key-value tuples. The _creation_counter attribute is expected to be present in every instance of the Field class. This Field._creation_counter attribute is created in the same way as InstanceCountingClass.instances_created, which was presented in the section about the __new__() method. This is a rather complex solution that breaks the single responsibility principle by sharing its implementation across two different classes only to ensure a trackable order of attributes. In Python 3, this could be simpler because __prepare__() can return other mapping types such as OrderedDict:

from collections import OrderedDict


class OrderedMeta(type):
    @classmethod
    def __prepare__(cls, name, bases, **kwargs):
        return OrderedDict()

    def __new__(mcs, name, bases, namespace):
        namespace['order_of_attributes'] = list(namespace.keys())
        return super().__new__(mcs, name, bases, namespace)


class ClassWithOrderedAttributes(metaclass=OrderedMeta):
    first = 8
    second = 2

Here is what you will see:

>>> ClassWithOrderedAttributes.order_of_attributes
['__module__', '__qualname__', 'first', 'second']
>>> ClassWithOrderedAttributes.__dict__.keys()
dict_keys(['__dict__', 'first', '__weakref__', 'second', 'order_of_attributes', '__module__', '__doc__'])

Note

For more examples, there's a great introduction to metaclass programming in Python 2 by David Mertz, which is available at http://www.onlamp.com/pub/a/python/2003/04/17/metaclasses.html.

Metaclass usage

Metaclasses, once mastered, are a powerful feature, but they always complicate the code. They might also make code that is intended to work on any kind of class less robust. For instance, you might encounter bad interactions when slots are used in the class, or when some base class already implements a metaclass that conflicts with what yours does. Metaclasses simply do not compose well.

For simple things like changing the read/write attributes or adding new ones, metaclasses can be avoided in favor of simpler solutions such as properties, descriptors, or class decorators.

It is also true that metaclasses can often be replaced with other, simpler approaches, but there are situations where things cannot be easily done without them. For instance, it is hard to imagine Django's ORM implementation built without extensive use of metaclasses. It could be possible, but it is rather unlikely that the resulting solution would be similarly easy to use. Frameworks are the place where metaclasses are really well-suited. They usually contain a lot of complex internal solutions that are not easy to understand and follow, but eventually allow other programmers to write more condensed and readable code that operates on a higher level of abstraction.

Metaclass pitfalls

Like some other advanced Python features, metaclasses are very flexible and can easily be abused. While the call signature of a metaclass is rather strict, Python does not enforce the type of its return value. It can be anything as long as it accepts incoming arguments on calls and has the required attributes whenever they are needed.

One such object that can be anything-anywhere is an instance of the Mock class provided in the unittest.mock module. Mock is not a metaclass and does not inherit from the type class. It also does not return a class object on instantiation. Still, it can be included as the metaclass keyword argument in a class definition, and this will not raise any issues, even though it is pointless to do so:

>>> from unittest.mock import Mock
>>> class Nonsense(metaclass=Mock):  # pointless, but illustrative
...     pass
... 
>>> Nonsense
<Mock spec='str' id='4327214664'>

The preceding example, of course, makes no sense at all and will fail on any attempt to instantiate such a Nonsense pseudo-class. It is still important to know that such things are possible because issues with metaclass types that do not result in the creation of a type subclass are sometimes very hard to spot and understand. As proof, here is the traceback of the exception raised when we try to create a new instance of the Nonsense class presented earlier:

>>> Nonsense()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/unittest/mock.py", line 917, in __call__
    return _mock_self._mock_call(*args, **kwargs)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/unittest/mock.py", line 976, in _mock_call
    result = next(effect)
StopIteration

Some tips on code generation

As already mentioned, dynamic code generation is the most difficult approach to metaprogramming. There are tools in Python that allow you to generate and execute code or even make some modifications to already compiled code objects. A complete book could be written about this, and even that would not exhaust the topic completely.

Various projects, such as Hy (mentioned later), show that even whole languages can be re-implemented in Python using code generation techniques. This proves that the possibilities are practically limitless. Knowing how vast this topic is and how badly it is riddled with various pitfalls, I won't even try to give detailed suggestions on how to create code this way or to provide useful code samples.

Anyway, knowing what is possible may be useful if you plan to study this field more deeply by yourself. So, treat this section only as a short summary of possible starting points for further learning. Most of it is flavored with many warnings, in case you would like to eagerly jump into calling exec() and eval() in your own project.

exec, eval, and compile

Python provides three built-in functions to manually execute, evaluate, and compile arbitrary Python code:

  • exec(object, globals, locals): This allows you to dynamically execute Python code. object should be a string or a code object (see the compile() function). The globals and locals arguments provide global and local namespaces for the executed code and are optional. If they are not provided, then the code is executed in the current scope. If provided, globals must be a dictionary, while locals may be any mapping object. exec() always returns None.
  • eval(expression, globals, locals): This is used to evaluate the given expression, returning its value. It is similar to exec(), except that expression should be a single Python expression and not a sequence of statements. It returns the value of the evaluated expression.
  • compile(source, filename, mode): This compiles the source into a code object or AST object. The code to be compiled is provided as a string in the source argument. The filename should be the file from which the code was read. If the code has no file associated with it because its source was created dynamically, then '<string>' is the value commonly used. mode should be either 'exec' (a sequence of statements), 'eval' (a single expression), or 'single' (a single interactive statement, such as in a Python interactive session).
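To make the relationship between these three functions more tangible, here is a short interactive session that uses only the documented built-in behavior:

>>> code = compile('a + b', '<string>', 'eval')
>>> eval(code, {'a': 1, 'b': 2})
3
>>> namespace = {}
>>> exec('def double(x):\n    return x * 2', namespace)
>>> namespace['double'](21)
42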

The exec() and eval() functions are the easiest to start with when trying to dynamically generate code because they can operate on strings. If you already know how to program in Python, then you may know how to correctly generate working source code programmatically. I hope you do.

The most useful function in the context of metaprogramming is obviously exec() because it allows you to execute any sequence of Python statements. And the word any should be alarming for you. Even eval(), which allows only the evaluation of expressions, can lead to serious security holes in the hands of a skillful attacker when fed with user input. Note that crashing the Python interpreter is the least scary scenario you should be afraid of. Introducing a vulnerability to remote execution exploits due to irresponsible use of exec() and eval() can cost you your image as a professional developer, or even your job.

Even if used with trusted input, there is a long list of little details about exec() and eval() that might affect how your application works in ways you would not expect. Armin Ronacher has a good article that lists the most important of them, called Be careful with exec and eval in Python (refer to http://lucumr.pocoo.org/2011/2/1/exec-in-python/).

Despite all these frightening warnings, there are situations where the usage of exec() and eval() is really justified. The popular statement about when you have to use them is: you will know. In other words, in the case of even the tiniest doubt, you should not use them and should try to find a different solution.

Tip

eval() and untrusted input

The signature of the eval() function might make you think that if you provide empty globals and locals namespaces and wrap it with proper try ... except statements, then it will be reasonably safe. Nothing could be further from the truth. Ned Batchelder has written a very good article in which he shows how to cause an interpreter segmentation fault with an eval() call even with access to all Python built-ins erased (http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html). This is a single proof that both exec() and eval() should never be used with untrusted input.

Abstract Syntax Tree

Python syntax is converted into an Abstract Syntax Tree (AST) before it is compiled into byte code. This is a tree representation of the abstract syntactic structure of the source code. The processing of Python grammar is available thanks to the built-in ast module. A raw AST of Python code can be created using the compile() function with the ast.PyCF_ONLY_AST flag, or by using the ast.parse() helper. Direct translation in reverse is not that simple, and there is no built-in function for it. Some projects, such as PyPy, do such things though.

The ast module provides some helper functions that allow working with the AST:

>>> import ast
>>> tree = ast.parse('def hello_world(): print("hello world!")')
>>> tree
<_ast.Module object at 0x00000000038E9588>
>>> ast.dump(tree)
"Module(
    body=[
        FunctionDef(
           name='hello_world', 
           args=arguments(
               args=[], 
               vararg=None, 
               kwonlyargs=[], 
               kw_defaults=[], 
               kwarg=None, 
               defaults=[]
           ), 
           body=[
               Expr(
                   value=Call(
                       func=Name(id='print', ctx=Load()), 
                       args=[Str(s='hello world!')], 
                       keywords=[]
                   )
               )
           ], 
           decorator_list=[], 
           returns=None
       )
    ]
)"

The output of ast.dump() in the preceding example was reformatted to increase readability and to better show the tree-like structure of the AST. It is important to know that the AST can be modified before being passed to the compile() call, which opens up many new possibilities. For instance, new syntax nodes can be inserted for additional instrumentation, such as test coverage measurement. It is also possible to modify the existing code tree in order to add new semantics to the existing syntax. Such a technique is used by the MacroPy project (https://github.com/lihaoyi/macropy) to add syntactic macros to Python using the already existing syntax (refer to Figure 5).

Figure 5: How MacroPy adds syntactic macros to Python modules on import
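To give a taste of such a modification, the following sketch parses a snippet, rewrites its string literals in place, and compiles the changed tree. Note that ast.Str is the node type used by the Python versions featured in this book; newer versions represent literals with ast.Constant:

import ast

tree = ast.parse('print("hello world!")')

# rewrite every string literal in place; source locations stay valid
for node in ast.walk(tree):
    if isinstance(node, ast.Str):
        node.s = node.s.upper()

exec(compile(tree, '<ast>', 'exec'))  # prints: HELLO WORLD!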

An AST can also be created in a purely artificial manner, and there is no need to parse any source at all. This gives Python programmers the ability to create Python byte code for custom domain-specific languages, or even to completely implement other existing programming languages on top of the Python VM.
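Here is a minimal sketch of such a purely artificial tree, equivalent to print(42), again using the node types of the Python versions featured in this book:

import ast

tree = ast.Module(body=[
    ast.Expr(value=ast.Call(
        func=ast.Name(id='print', ctx=ast.Load()),
        args=[ast.Num(n=42)],
        keywords=[],
    )),
])

# synthetic nodes lack line numbers, so they must be filled in
ast.fix_missing_locations(tree)
exec(compile(tree, '<synthetic>', 'exec'))  # prints: 42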

Import hooks

Taking advantage of MacroPy's ability to modify the original AST would not be as easy as using the import macropy.activate statement if it did not somehow override the Python import behavior. Fortunately, Python provides a way to intercept imports using the following two kinds of import hooks:

  • Meta hooks: These are called before any other import processing has occurred. Using meta hooks, you can override the way sys.path is processed, or even the handling of frozen and built-in modules. To add a new meta hook, a new meta path finder object must be added to the sys.meta_path list (a minimal sketch of such a finder follows this list).
  • Import path hooks: These are called as part of sys.path processing. They are used if the path item associated with the given hook is encountered. Import path hooks are added by extending the sys.path_hooks list with a new path finder object.
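The following is a minimal sketch of a meta hook that merely observes import attempts and defers to the default machinery (the NoisyFinder name is made up for this example):

import sys


class NoisyFinder:
    """Meta path finder that only logs import attempts."""
    def find_spec(self, fullname, path, target=None):
        print('import attempted:', fullname)
        return None  # defer to the remaining finders


sys.meta_path.insert(0, NoisyFinder())

After registering the finder, every subsequent import statement will first print the name of the module being imported.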

The details of implementing both path finders and meta path finders are extensively documented in the official Python documentation (https://docs.python.org/3/reference/import.html). The official documentation should be your primary resource if you want to interact with imports on that level, because the import machinery in Python is rather complex and any attempt to summarize it in a few paragraphs would inevitably fail. Treat this section rather as a note that such things are possible and as a reference to more detailed information.

Projects using code generation patterns

It is hard to find a really usable implementation of a library that relies on code generation patterns and is not merely an experiment or a simple proof of concept. The reasons for that situation are fairly obvious:

  • Deserved fear of the exec() and eval() functions because if used irresponsibly they can cause real disasters
  • Successful code generation is simply very difficult because it requires a deep understanding of the featured language and exceptional programming skills in general

Despite these difficulties, there are some projects that successfully take this approach either to improve performance or achieve things that would be impossible by other means.

Falcon's compiled router

Falcon (http://falconframework.org/) is a minimalist Python WSGI web framework for building fast and lightweight APIs. It strongly encourages the REST architectural style that is currently very popular around the Web. It is a good alternative to other rather heavy frameworks, such as Django or Pyramid. It is also a strong competitor to other micro-frameworks that aim for simplicity, such as Flask, Bottle, and web2py.

One of its features is its very simple routing mechanism. It is not as complex as the routing provided by Django urlconf and does not provide as many features, but in most cases it is just enough for any API that follows the REST architectural design. What is most interesting about Falcon's routing is that the actual router is implemented using code generated from the list of routes provided to the object that defines the API configuration. This effort is made to make routing fast.

Consider this very short API example taken from Falcon's web documentation:

# sample.py
import falcon
import json
 
class QuoteResource:
    def on_get(self, req, resp):
        """Handles GET requests"""
        quote = {
            'quote': "I've always been more interested in "
                     "the future than in the past.",
            'author': 'Grace Hopper'
        }

        resp.body = json.dumps(quote)
 
api = falcon.API()
api.add_route('/quote', QuoteResource())

The call to the api.add_route() method, in brief, translates to updating the whole dynamically generated router code tree, compiling it using compile(), and generating the new route-finding function using eval(). Looking at the __code__ attribute of the api._router._find() function shows that it was generated from a string and that it changes with every call to api.add_route():

>>> api._router._find.__code__
<code object find at 0x00000000033C29C0, file "<string>", line 1>
>>> api.add_route('/none', None)
>>> api._router._find.__code__
<code object find at 0x00000000033C2810, file "<string>", line 1>

Hy

Hy (http://docs.hylang.org/) is a dialect of Lisp written entirely in Python. Many similar projects that implement other languages in Python usually try only to tokenize the plain form of code, provided either as a file-like object or as a string, and interpret it as a series of explicit Python calls. Unlike others, Hy can be considered a language that runs fully in the Python runtime environment, just as Python does. Code written in Hy can use the existing built-in modules and external packages and vice versa. Code written with Hy can even be imported back into Python.

In order to embed Lisp in Python, Hy translates Lisp code directly into the Python Abstract Syntax Tree. Import interoperability is achieved using an import hook that is registered once the Hy module is imported in Python. Every module with the .hy extension is treated as a Hy module and can be imported like an ordinary Python module. Thanks to this fact, the following "hello world" program can be written in this Lisp dialect:

;; hyllo.hy
(defn hello [] (print "hello world!"))

It can be imported and executed by the following Python code:

>>> import hy
>>> import hyllo
>>> hyllo.hello()
hello world!

If we dig deeper and try to disassemble hyllo.hello using the built-in dis module, we will notice that the byte code of the Hy function does not differ significantly from its pure Python counterpart:

>>> import dis
>>> dis.dis(hyllo.hello)
  2           0 LOAD_GLOBAL        0 (print)
              3 LOAD_CONST         1 ('hello world!')
              6 CALL_FUNCTION      1 (1 positional, 0 keyword pair)
              9 RETURN_VALUE
>>> def hello(): print("hello world!")
>>> dis.dis(hello)
  1           0 LOAD_GLOBAL        0 (print)
              3 LOAD_CONST         1 ('hello world!')
              6 CALL_FUNCTION      1 (1 positional, 0 keyword pair)
              9 POP_TOP
             10 LOAD_CONST         0 (None)
             13 RETURN_VALUE