Accessing methods from superclasses

super is a built-in class that can be used to access an attribute belonging to an object's superclass.

Note

The Python official documentation lists super as a built-in function. But it's a built-in class, even if it is used like a function:

>>> super
<class 'super'>

Its usage is a bit confusing when you are used to accessing a class attribute or method by calling the parent class directly and passing self as the first argument. This is really old pattern but still can be found in some codebases (especially in legacy projects). See the following code:

class Mama:  # this is the old way
    def says(self):
        print('do your homework')

         
class Sister(Mama):
    def says(self):
        Mama.says(self)
        print('and clean your bedroom')

When run in an interpreter session it gives following result:

>>> Sister().says()
do your homework
and clean your bedroom

Look particularly at the line Mama.says(self), where we use the technique just described to call the says() method of the superclass (that is, the Mama class), and pass self as the argument. This means, the says() method belonging to Mama will be called. But the instance on which it will be called is provided as the self argument, which is an instance of Sister in this case.

Instead, the super usage would be:

class Sister(Mama):def says(self):
        super(Sister, self).says()
        print('and clean your bedroom')

Alternatively, you can also use the shorter form of the super() call:

class Sister(Mama):def says(self):
        super().says()
        print('and clean your bedroom')

The shorter form of super (without passing any arguments) is allowed inside the methods but super is not limited to methods. It can be used in any place of code where a call to the given instance superclass method implementation is required. Still, if super is not used inside the method, then its arguments are mandatory:

>>> anita = Sister()
>>> super(anita.__class__, anita).says()
do your homework

The last and most important thing that should be noted about super is that its second argument is optional. When only the first argument is provided, then super returns an unbounded type. This is especially useful when working with classmethod:

class Pizza:
    def __init__(self, toppings):
        self.toppings = toppings

    def __repr__(self):
        return "Pizza with " + " and ".join(self.toppings)

    @classmethod
    def recommend(cls):
        """Recommend some pizza with arbitrary toppings,"""
        return cls(['spam', 'ham', 'eggs'])


class VikingPizza(Pizza):
    @classmethod
    def recommend(cls):
        """Use same recommendation as super but add extra spam"""
        recommended = super(VikingPizza).recommend()
        recommended.toppings += ['spam'] * 5
        return recommended

Note that the zero-argument super() form is also allowed for methods decorated with the classmethod decorator. super() called without arguments in such a method is treated as having only the first argument defined.

The use cases presented earlier are very simple to follow and understand, but when you face a multiple inheritance schema, it becomes hard to use super. Before explaining these problems, understanding when super should be avoided and how the Method Resolution Order (MRO) works in Python is important.

Old-style classes and super in Python 2

super() in Python 2 works almost exactly the same. The only difference in call signature is that the shorter, zero-argument form is not available, so at least one of the expected arguments must be provided always.

Another important thing for programmers who want to write cross-version compatible code is that super in Python 2 works only for new-style classes. The earlier versions of Python did not have a common ancestor for all classes in the form of object. The old behavior was left in every Python 2.x branch release for backwards compatibility, so in those versions, if the class definition has no ancestor specified, it is interpreted as an old-style class and it cannot use super:

class OldStyle1:
    pass


class OldStyle2():
    pass

The new-style class in Python 2 must explicitly inherit from the object or other new-style class:

class NewStyleClass(object):
    pass


class NewStyleClassToo(NewStyleClass):
    pass

Python 3 no longer maintains the concept of old-style classes, so any class that does not inherit from any other class implicitly inherits from object. This means that explicitly stating that a class inherits from object may seem redundant. The general good practice is to not include redundant code, but removing such redundancy in this case is a good approach only for projects that no longer target any of the Python 2 versions. Code that aims for cross-version compatibility of Python must always include object as an ancestor of base classes even if this is redundant in Python 3. Not doing so will result in such classes being interpreted as old style, and this will eventually lead to issues that are very hard to diagnose.

Understanding Python's Method Resolution Order

Python's Method Resolution Order is based on C3, the MRO built for the Dylan programming language (http://opendylan.org). The reference document, written by Michele Simionato, is located at http://www.python.org/download/releases/2.3/mro. It describes how C3 builds the linearization of a class, also called precedence, which is an ordered list of the ancestors. This list is used to seek an attribute. The C3 algorithm is described in more detail later in this section.

The MRO change was made to resolve an issue introduced with the creation of a common base type (object). Before the change to the C3 linearization method, if a class had two ancestors (refer to Figure 1), the order in which methods were resolved was quite simple to compute and track for simple cases that do not use the multiple inheritance model. Here is an example of code that under Python 2 would not use C3 as a Method Resolution Order:

class Base1:
    pass
    

class Base2:
    def method(self):
        print('Base2')
        

class MyClass(Base1, Base2):
    pass

The following transcript from interactive session shows this method resolution at work:

>>> MyClass().method()
Base2

When MyClass().method() is called, the interpreter looks for the method in MyClass, then Base1, and then eventually finds it in Base2:

Understanding Python's Method Resolution Order

Figure 1 Classical hierarchy

When we introduce some CommonBase class on top of the two base classes (both Base1 and Base2 inherit from it, refer to Figure 2), things get more complicated. As a result, the simple resolution order that behaves according to the left to right depth first rule is getting back to the top through the Base1 class before looking into the Base2 class. This algorithm results in a counterintuitive output. In some cases, the method that is executed may not be the one that is the closest in the inheritance tree.

Such an algorithm is still available in Python 2 when old-style classes (not inheriting from object) are used. Here is an example of the old method resolution in Python 2 using old-style classes:

class CommonBase:
    def method(self):
        print('CommonBase')


class Base1(CommonBase):
    pass


class Base2(CommonBase):
    def method(self):
        print('Base2')


class MyClass(Base1, Base2):
    pass

The following transcript from interactive session shows that Base2.method() will not be called despite Base2 is closer in the class hierarchy than CommonBase:

>>> MyClass().method()
CommonBase
Understanding Python's Method Resolution Order

Figure 2 The Diamond class hierarchy

Such an inheritance scenario is extremely uncommon, so this is more a problem of theory than practice. The standard library does not structure the inheritance hierarchies in this way, and many developers think it is a bad practice. But with the introduction of object at the top of the types hierarchy, the multiple inheritance problem pops up on the C side of the language, resulting in conflicts when doing subtyping. Note also that every class in Python 3 has now the same common ancestor. Since making it work properly with the existing MRO involved too much work, a new MRO was a simpler and quicker solution.

So, the same example run under Python 3 gives a different result:

class CommonBase:
    def method(self):
        print('CommonBase')


class Base1(CommonBase):
    pass


class Base2(CommonBase):
    def method(self):
        print('Base2')


class MyClass(Base1, Base2):
    pass

And here is usage showing that C3 serialization will pick method of the closest ancestor:

>>> MyClass().method()
Base2

Tip

Note that the above behavior cannot be replicated in Python 2 without the CommonBase class explicitly inheriting from object. The reasons why it may be useful to specify object as a class ancestor in Python 3 even if this is redundant were mentioned in the previous section, Old-style classes and super in Python 2.

The Python MRO is based on a recursive call over the base classes. To summarize the Michele Simionato paper referenced at the beginning of this section, the C3 symbolic notation applied to our example is:

L[MyClass(Base1, Base2)] =
        MyClass + merge(L[Base1], L[Base2], Base1, Base2)

Here, L[MyClass] is the linearization of the MyClass class, and merge is a specific algorithm that merges several linearization results.

So, a synthetic description would be, as Simionato says:

"The linearization of C is the sum of C plus the merge of the linearizations of the parents and the list of the parents"

The merge algorithm is responsible for removing the duplicates and preserving the correct ordering. It is described in the paper like this (adapted to our example):

"Take the head of the first list, that is, L[Base1][0]; if this head is not in the tail of any of the other lists, then add it to the linearization of MyClass and remove it from the lists in the merge, otherwise look at the head of the next list and take it, if it is a good head.

Then, repeat the operation until all the classes are removed or it is impossible to find good heads. In this case, it is impossible to construct the merge, Python 2.3 will refuse to create the class MyClass and will raise an exception."

The head is the first element of a list and the tail contains the rest of the elements. For example, in (Base1, Base2, ..., BaseN), Base1 is the head, and (Base2, ..., BaseN) the tail.

In other words, C3 does a recursive depth lookup on each parent to get a sequence of lists. Then it computes a left-to-right rule to merge all lists with a hierarchy disambiguation, when a class is involved in several lists.

So the result is:

def L(klass):
    return [k.__name__ for k in klass.__mro__]


>>> L(MyClass)
['MyClass', 'Base1', 'Base2', 'CommonBase', 'object']

Tip

The __mro__ attribute of a class (which is read-only) stores the result of the linearization computation, which is done when the class definition is loaded.

You can also call MyClass.mro() to compute and get the result. This is another reason why classes in Python 2 should be taken with extra case. While old-style classes in Python 2 have a defined order in which methods are resolved, they do not provide the __mro__ attribute and the mro() method. So, despite the order of resolution, it is wrong to say that they have MRO. In most cases, whenever someone refers to MRO in Python, it means that they refer to the C3 algorithm described in this section.

super pitfalls

Back to super. Its usage, when using the multiple inheritance hierarchy, can be quite dangerous, mainly because of the initialization of classes. In Python, the base classes are not implicitly called in __init__(), and so it is up to the developer to call them. We will see a few examples.

Mixing super and explicit class calls

In the following example taken from James Knight's website (http://fuhm.net/super-harmful), a C class that calls its base classes using the __init__() method will make the B class be called twice:

class A:
    def __init__(self):
        print("A", end=" ")
        super().__init__()


class B:
    def __init__(self):
        print("B", end=" ")
        super().__init__()


class C(A, B):
    def __init__(self):
        print("C", end=" ")
        A.__init__(self)
        B.__init__(self)

Here is the output:

>>> print("MRO:", [x.__name__ for x in C.__mro__])
MRO: ['C', 'A', 'B', 'object']
>>> C()
C A B B <__main__.C object at 0x0000000001217C50>

This happens due to the A.__init__(self) call, which is made with the C instance, thus making the super(A, self).__init__() call the B.__init__() method. In other words, super should be used in the whole class hierarchy. The problem is that sometimes a part of this hierarchy is located in third-party code. Many related pitfalls on the hierarchy calls introduced by multiple inheritances can be found on James's page.

Unfortunately, you cannot be sure that external packages use super() in their code. Whenever you need to subclass some third-party class, it is always a good approach to take a look inside of its code and code of other classes in the MRO. This may be tedious, but as a bonus you get some information about the quality of code provided by such a package and more understanding of its implementation. You may learn something new that way.

Heterogeneous arguments

Another issue with super usage is the argument passing in initialization. How can a class call its base class __init__() code if it doesn't have the same signature? This leads to the following problem:

class CommonBase:
    def __init__(self):
        print('CommonBase')
        super().__init__()


class Base1(CommonBase):
    def __init__(self):
        print('Base1')
        super().__init__()


class Base2(CommonBase):
    def __init__(self, arg):
        print('base2')
        super().__init__()


class MyClass(Base1 , Base2):
    def __init__(self, arg):
        print('my base')
        super().__init__(arg)

An attempt to create a MyClass instance will raise TypeError due to the mismatch of the parent classes' __init__() signatures:

>>> MyClass(10)
my base
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in __init__
TypeError: __init__() takes 1 positional argument but 2 were given

One solution would be to use arguments and keyword arguments packed with *args and **kwargs magic so that all constructors pass along all the parameters even if they do not use them:

class CommonBase:
    def __init__(self, *args, **kwargs):
        print('CommonBase')
        super().__init__()

class Base1(CommonBase):
    def __init__(self, *args, **kwargs):
        print('Base1')
        super().__init__(*args, **kwargs)

class Base2(CommonBase):
    def __init__(self, *args, **kwargs):
        print('base2')
        super().__init__(*args, **kwargs)

class MyClass(Base1 , Base2):
    def __init__(self, arg):
        print('my base')
        super().__init__(arg)

With this approach the parent class signatures will always match:

>>> _ = MyClass(10)
my base
Base1
base2
CommonBase

This is an awful fix though, because it makes all constructors accept any kind of parameter. It leads to weak code, since anything can be passed and gone through. Another solution is to use the explicit __init__() calls of specific classes in MyClass, but this would lead to the first pitfall.

Best practices

To avoid all the mentioned problems, and until Python evolves in this field, we need to take into consideration the following points:

  • Multiple inheritance should be avoided: It can be replaced with some design patterns presented in Chapter 14, Useful Design Patterns.
  • super usage has to be consistent: In a class hierarchy, super should be used everywhere or nowhere. Mixing super and classic calls is a confusing practice. People tend to avoid super, for their code to be more explicit.
  • Explicitly inherit from object in Python 3 if you target Python 2 too: Classes without any ancestor specified are recognized as old-style classes in Python 2. Mixing old-style classes with new-style classes should be avoided in Python 2.
  • Class hierarchy has to be looked over when a parent class is called: To avoid any problems, every time a parent class is called, a quick glance at the involved MRO (with __mro__) has to be done.
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset