super
is a built-in class that can be used to access an attribute belonging to an object's superclass.
Its usage is a bit confusing when you are used to accessing a class attribute or method by calling the parent class directly and passing self
as the first argument. This is really old pattern but still can be found in some codebases (especially in legacy projects). See the following code:
class Mama: # this is the old way def says(self): print('do your homework') class Sister(Mama): def says(self): Mama.says(self) print('and clean your bedroom')
When run in an interpreter session it gives following result:
>>> Sister().says() do your homework and clean your bedroom
Look particularly at the line Mama.says(self)
, where we use the technique just described to call the says()
method of the superclass (that is, the Mama
class), and pass self
as the argument. This means, the says()
method belonging to Mama
will be called. But the instance on which it will be called is provided as the self
argument, which is an instance of Sister
in this case.
Instead, the super
usage would be:
class Sister(Mama):def says(self):
super(Sister, self).says()
print('and clean your bedroom')
Alternatively, you can also use the shorter form of the super()
call:
class Sister(Mama):def says(self):
super().says()
print('and clean your bedroom')
The shorter form of super
(without passing any arguments) is allowed inside the methods but super
is not limited to methods. It can be used in any place of code where a call to the given instance superclass method implementation is required. Still, if super
is not used inside the method, then its arguments are mandatory:
>>> anita = Sister() >>> super(anita.__class__, anita).says() do your homework
The last and most important thing that should be noted about super
is that its second argument is optional. When only the first argument is provided, then super
returns an unbounded type. This is especially useful when working with classmethod
:
class Pizza: def __init__(self, toppings): self.toppings = toppings def __repr__(self): return "Pizza with " + " and ".join(self.toppings) @classmethod def recommend(cls): """Recommend some pizza with arbitrary toppings,""" return cls(['spam', 'ham', 'eggs']) class VikingPizza(Pizza): @classmethod def recommend(cls): """Use same recommendation as super but add extra spam""" recommended = super(VikingPizza).recommend() recommended.toppings += ['spam'] * 5 return recommended
Note that the zero-argument super()
form is also allowed for methods decorated with the classmethod
decorator. super()
called without arguments in such a method is treated as having only the first argument defined.
The use cases presented earlier are very simple to follow and understand, but when you face a multiple inheritance schema, it becomes hard to use super
. Before explaining these problems, understanding when super
should be avoided and how the Method Resolution
Order (MRO) works in Python is important.
super()
in Python 2 works almost exactly the same. The only difference in call signature is that the shorter, zero-argument form is not available, so at least one of the expected arguments must be provided always.
Another important thing for programmers who want to write cross-version compatible code is that super
in Python 2 works only for new-style classes. The earlier versions of Python did not have a common ancestor for all classes in the form of object
. The old behavior was left in every Python 2.x branch release for backwards compatibility, so in those versions, if the class definition has no ancestor specified, it is interpreted as an old-style class and it cannot use super
:
class OldStyle1: pass class OldStyle2(): pass
The new-style class in Python 2 must explicitly inherit from the object or other new-style class:
class NewStyleClass(object): pass class NewStyleClassToo(NewStyleClass): pass
Python 3 no longer maintains the concept of old-style classes, so any class that does not inherit from any other class implicitly inherits from object
. This means that explicitly stating that a class inherits from object
may seem redundant. The general good practice is to not include redundant code, but removing such redundancy in this case is a good approach only for projects that no longer target any of the Python 2 versions. Code that aims for cross-version compatibility of Python must always include object
as an ancestor of base classes even if this is redundant in Python 3. Not doing so will result in such classes being interpreted as old style, and this will eventually lead to issues that are very hard to diagnose.
Python's Method Resolution Order is based on C3, the MRO built for the Dylan programming language (http://opendylan.org). The reference document, written by Michele Simionato, is located at http://www.python.org/download/releases/2.3/mro. It describes how C3 builds the linearization of a class, also called precedence, which is an ordered list of the ancestors. This list is used to seek an attribute. The C3 algorithm is described in more detail later in this section.
The MRO change was made to resolve an issue introduced with the creation of a common base type (object
). Before the change to the C3 linearization method, if a class had two ancestors (refer to Figure 1), the order in which methods were resolved was quite simple to compute and track for simple cases that do not use the multiple inheritance model. Here is an example of code that under Python 2 would not use C3 as a Method Resolution Order:
class Base1: pass class Base2: def method(self): print('Base2') class MyClass(Base1, Base2): pass
The following transcript from interactive session shows this method resolution at work:
>>> MyClass().method() Base2
When MyClass().method()
is called, the interpreter looks for the method in MyClass
, then Base1
, and then eventually finds it in Base2
:
When we introduce some CommonBase
class on top of the two base classes (both Base1
and Base2
inherit from it, refer to Figure 2), things get more complicated. As a result, the simple resolution order that behaves according to the left to right depth first rule is getting back to the top through the Base1
class before looking into the Base2
class. This algorithm results in a counterintuitive output. In some cases, the method that is executed may not be the one that is the closest in the inheritance tree.
Such an algorithm is still available in Python 2 when old-style classes (not inheriting from object
) are used. Here is an example of the old method resolution in Python 2 using old-style classes:
class CommonBase: def method(self): print('CommonBase') class Base1(CommonBase): pass class Base2(CommonBase): def method(self): print('Base2') class MyClass(Base1, Base2): pass
The following transcript from interactive session shows that Base2.method()
will not be called despite Base2
is closer in the class hierarchy than CommonBase
:
>>> MyClass().method() CommonBase
Such an inheritance scenario is extremely uncommon, so this is more a problem of theory than practice. The standard library does not structure the inheritance hierarchies in this way, and many developers think it is a bad practice. But with the introduction of object
at the top of the types hierarchy, the multiple inheritance problem pops up on the C side of the language, resulting in conflicts when doing subtyping. Note also that every class in Python 3 has now the same common ancestor. Since making it work properly with the existing MRO involved too much work, a new MRO was a simpler and quicker solution.
So, the same example run under Python 3 gives a different result:
class CommonBase: def method(self): print('CommonBase') class Base1(CommonBase): pass class Base2(CommonBase): def method(self): print('Base2') class MyClass(Base1, Base2): pass
And here is usage showing that C3 serialization will pick method of the closest ancestor:
>>> MyClass().method() Base2
Note that the above behavior cannot be replicated in Python 2 without the CommonBase
class explicitly inheriting from object
. The reasons why it may be useful to specify object
as a class ancestor in Python 3 even if this is redundant were mentioned in the previous section, Old-style classes and super in Python 2.
The Python MRO is based on a recursive call over the base classes. To summarize the Michele Simionato paper referenced at the beginning of this section, the C3 symbolic notation applied to our example is:
L[MyClass(Base1, Base2)] = MyClass + merge(L[Base1], L[Base2], Base1, Base2)
Here, L[MyClass]
is the linearization of the MyClass
class, and merge
is a specific algorithm that merges several linearization results.
So, a synthetic description would be, as Simionato says:
"The linearization of C is the sum of C plus the merge of the linearizations of the parents and the list of the parents"
The merge
algorithm is responsible for removing the duplicates and preserving the correct ordering. It is described in the paper like this (adapted to our example):
"Take the head of the first list, that is, L[Base1][0]; if this head is not in the tail of any of the other lists, then add it to the linearization of MyClass and remove it from the lists in the merge, otherwise look at the head of the next list and take it, if it is a good head.
Then, repeat the operation until all the classes are removed or it is impossible to find good heads. In this case, it is impossible to construct the merge, Python 2.3 will refuse to create the class MyClass and will raise an exception."
The head
is the first element of a list and the tail
contains the rest of the elements. For example, in (Base1, Base2, ..., BaseN), Base1
is the head
, and (Base2, ..., BaseN)
the tail
.
In other words, C3 does a recursive depth lookup on each parent to get a sequence of lists. Then it computes a left-to-right rule to merge all lists with a hierarchy disambiguation, when a class is involved in several lists.
So the result is:
def L(klass): return [k.__name__ for k in klass.__mro__] >>> L(MyClass) ['MyClass', 'Base1', 'Base2', 'CommonBase', 'object']
The __mro__
attribute of a class (which is read-only) stores the result of the linearization computation, which is done when the class definition is loaded.
You can also call MyClass.mro()
to compute and get the result. This is another reason why classes in Python 2 should be taken with extra case. While old-style classes in Python 2 have a defined order in which methods are resolved, they do not provide the __mro__
attribute and the mro()
method. So, despite the order of resolution, it is wrong to say that they have MRO. In most cases, whenever someone refers to MRO in Python, it means that they refer to the C3 algorithm described in this section.
Back to super
. Its usage, when using the multiple inheritance hierarchy, can be quite dangerous, mainly because of the initialization of classes. In Python, the base classes are not implicitly called in __init__()
, and so it is up to the developer to call them. We will see a few examples.
In the following example taken from James Knight's website (http://fuhm.net/super-harmful), a C
class that calls its base classes using the __init__()
method will make the B
class be called twice:
class A: def __init__(self): print("A", end=" ") super().__init__() class B: def __init__(self): print("B", end=" ") super().__init__() class C(A, B): def __init__(self): print("C", end=" ") A.__init__(self) B.__init__(self)
Here is the output:
>>> print("MRO:", [x.__name__ for x in C.__mro__]) MRO: ['C', 'A', 'B', 'object'] >>> C() C A B B <__main__.C object at 0x0000000001217C50>
This happens due to the A.__init__(self)
call, which is made with the C
instance, thus making the super(A, self).__init__()
call the B.__init__()
method. In other words, super
should be used in the whole class hierarchy. The problem is that sometimes a part of this hierarchy is located in third-party code. Many related pitfalls on the hierarchy calls introduced by multiple inheritances can be found on James's page.
Unfortunately, you cannot be sure that external packages use super()
in their code. Whenever you need to subclass some third-party class, it is always a good approach to take a look inside of its code and code of other classes in the MRO. This may be tedious, but as a bonus you get some information about the quality of code provided by such a package and more understanding of its implementation. You may learn something new that way.
Another issue with super
usage is the argument passing in initialization. How can a class call its base class __init__()
code if it doesn't have the same signature? This leads to the following problem:
class CommonBase: def __init__(self): print('CommonBase') super().__init__() class Base1(CommonBase): def __init__(self): print('Base1') super().__init__() class Base2(CommonBase): def __init__(self, arg): print('base2') super().__init__() class MyClass(Base1 , Base2): def __init__(self, arg): print('my base') super().__init__(arg)
An attempt to create a MyClass
instance will raise TypeError
due to the mismatch of the parent classes' __init__()
signatures:
>>> MyClass(10) my base Traceback (most recent call last): File "<stdin>", line 1, in <module> File "<stdin>", line 4, in __init__ TypeError: __init__() takes 1 positional argument but 2 were given
One solution would be to use arguments and keyword arguments packed with *args
and **kwargs
magic so that all constructors pass along all the parameters even if they do not use them:
class CommonBase: def __init__(self, *args, **kwargs): print('CommonBase') super().__init__() class Base1(CommonBase): def __init__(self, *args, **kwargs): print('Base1') super().__init__(*args, **kwargs) class Base2(CommonBase): def __init__(self, *args, **kwargs): print('base2') super().__init__(*args, **kwargs) class MyClass(Base1 , Base2): def __init__(self, arg): print('my base') super().__init__(arg)
With this approach the parent class signatures will always match:
>>> _ = MyClass(10) my base Base1 base2 CommonBase
This is an awful fix though, because it makes all constructors accept any kind of parameter. It leads to weak code, since anything can be passed and gone through. Another solution is to use the explicit __init__()
calls of specific classes in MyClass
, but this would lead to the first pitfall.
To avoid all the mentioned problems, and until Python evolves in this field, we need to take into consideration the following points:
super
should be used everywhere or nowhere. Mixing super
and classic calls is a confusing practice. People tend to avoid super
, for their code to be more explicit.__mro__
) has to be done.