If you haven’t quite gotten all of Python OOP yet, don’t worry; now that we’ve had a quick tour, we’re going to dig a bit deeper and study the concepts introduced earlier in further detail. In this and the following chapter, we’ll take another look at class mechanics. Here, we’re going to study classes, methods, and inheritance, formalizing and expanding on some of the coding ideas introduced in Chapter 26. Because the class is our last namespace tool, we’ll summarize Python’s namespace concepts here as well.
The next chapter continues this in-depth second pass over class mechanics by covering one specific aspect: operator overloading. Besides presenting the details, this chapter and the next also give us an opportunity to explore some larger classes than those we have studied so far.
Although the Python class statement may seem similar to tools in other OOP languages on the surface, on closer inspection, it is quite different from what some programmers are used to. For example, as in C++, the class statement is Python’s main OOP tool, but unlike in C++, Python’s class is not a declaration. Like a def, a class statement is an object builder, and an implicit assignment: when run, it generates a class object and stores a reference to it in the name used in the header. Also like a def, a class statement is true executable code: your class doesn’t exist until Python reaches and runs the class statement that defines it (typically while importing the module it is coded in, but not before).
class is a compound statement, with a body of indented statements typically appearing under the header. In the header, superclasses are listed in parentheses after the class name, separated by commas. Listing more than one superclass leads to multiple inheritance (which we’ll discuss more formally in Chapter 30). Here is the statement’s general form:
class <name>(superclass,...):       # Assign to name
    data = value                    # Shared class data
    def method(self,...):           # Methods
        self.member = value         # Per-instance data
Within the class statement, any assignments generate class attributes, and specially named methods overload operators; for instance, a function called __init__ is called at instance object construction time, if defined.
As we’ve seen, classes are mostly just namespaces: that is, tools for defining names (i.e., attributes) that export data and logic to clients. So, how do you get from the class statement to a namespace?
Here’s how. Just like in a module file, the statements nested in a class statement body create its attributes. When Python executes a class statement (not a call to a class), it runs all the statements in its body, from top to bottom. Assignments that happen during this process create names in the class’s local scope, which become attributes in the associated class object. Because of this, classes resemble both modules and functions:
Like functions, class statements are local scopes where names created by nested assignments live.
Like names in a module, names assigned in a class statement become attributes in a class object.
The main distinction for classes is that their namespaces are also the basis of inheritance in Python; reference attributes that are not found in a class or instance object are fetched from other classes.
Because class is a compound statement, any sort of statement can be nested inside its body: print, =, if, def, and so on. All the statements inside the class statement run when the class statement itself runs (not when the class is later called to make an instance). Assigning names inside the class statement makes class attributes, and nested defs make class methods, but other assignments make attributes, too.
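As a minimal sketch of this point (the class name Demo, the log list, and the attribute names are hypothetical, not taken from the text), the following shows that ordinary statements in a class body run once, when the class statement itself runs, and that every name assigned there ends up as a class attribute:

```python
log = []                              # Records when class-body code runs

class Demo:                           # Hypothetical class, for illustration only
    log.append('class body ran')      # Any statement may appear in the body
    count = 0                         # Simple assignment: class data attribute
    if count == 0:
        status = 'empty'              # Assigned inside an if: also an attribute
    def show(self):                   # def assigns a function to the name show
        return self.status

print(log)                            # Body already ran, before any instance exists
a, b = Demo(), Demo()                 # Calling the class does not rerun the body
print(log)                            # Still a single entry
print(sorted(n for n in Demo.__dict__ if not n.startswith('__')))
```

The last line filters out the underscore names Python adds on its own, leaving just the three names assigned in the body: count, show, and status.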
For example, assignments of simple nonfunction objects to class attributes produce data attributes, shared by all instances:
>>> class SharedData:
...     spam = 42                      # Generates a class data attribute
...
>>> x = SharedData()                   # Make two instances
>>> y = SharedData()
>>> x.spam, y.spam                     # They inherit and share 'spam'
(42, 42)
Here, because the name spam is assigned at the top level of a class statement, it is attached to the class and so will be shared by all instances. We can change it by going through the class name, and we can refer to it through either instances or the class.[63]
>>> SharedData.spam = 99
>>> x.spam, y.spam, SharedData.spam
(99, 99, 99)
Such class attributes can be used to manage information that spans all the instances, such as a counter of the number of instances generated (we’ll expand on this idea by example in Chapter 31). Now, watch what happens if we assign the name spam through an instance instead of the class:
>>> x.spam = 88
>>> x.spam, y.spam, SharedData.spam
(88, 99, 99)
Assignments to instance attributes create or change the names in the instance, rather than in the shared class. More generally, inheritance searches occur only on attribute references, not on assignment: assigning to an object’s attribute always changes that object, and no other.[64] For example, y.spam is looked up in the class by inheritance, but the assignment to x.spam attaches a name to x itself.
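The instance counter mentioned above can be sketched roughly as follows. The class name Tracked and its created attribute are hypothetical; the point is only that assigning through the class updates the single shared copy, while assigning through an instance creates a separate name in that instance:

```python
class Tracked:                        # Hypothetical class, for illustration only
    created = 0                       # Class attribute: one copy, shared by all

    def __init__(self):
        Tracked.created += 1          # Assign through the class: shared copy changes

a, b, c = Tracked(), Tracked(), Tracked()
print(Tracked.created, a.created)     # 3 3: both fetch the class attribute

a.created = 100                       # Assign through an instance: new name in a only
print(a.created, b.created, Tracked.created)                # 100 3 3
print('created' in a.__dict__, 'created' in b.__dict__)     # True False
```

Had __init__ used self.created += 1 instead, each instance would have received its own created attribute and the class-level count would never have changed.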
Here’s a more comprehensive example of this behavior that stores the same name in two places. Suppose we run the following class:
class MixedNames:                            # Define class
    data = 'spam'                            # Assign class attr
    def __init__(self, value):               # Assign method name
        self.data = value                    # Assign instance attr
    def display(self):
        print(self.data, MixedNames.data)    # Instance attr, class attr
This class contains two defs, which bind class attributes to method functions. It also contains an = assignment statement; because this assignment assigns the name data inside the class, it lives in the class’s local scope and becomes an attribute of the class object. Like all class attributes, this data is inherited and shared by all instances of the class that don’t have data attributes of their own.
When we make instances of this class, the name data is attached to those instances by the assignment to self.data in the constructor method:
>>> x = MixedNames(1)                  # Make two instance objects
>>> y = MixedNames(2)                  # Each has its own data
>>> x.display(); y.display()           # self.data differs, MixedNames.data is the same
1 spam
2 spam
The net result is that data lives in two places: in the instance objects (created by the self.data assignment in __init__), and in the class from which they inherit names (created by the data assignment in the class). The class’s display method prints both versions, by first qualifying the self instance, and then the class.
By using these techniques to store attributes in different objects, we determine their scope of visibility. When attached to classes, names are shared; in instances, names record per-instance data, not shared behavior or data. Although inheritance searches look up names for us, we can always get to an attribute anywhere in a tree by accessing the desired object directly.
In the preceding example, for instance, specifying x.data or self.data will return an instance name, which normally hides the same name in the class; however, MixedNames.data grabs the class name explicitly. We’ll see various roles for such coding patterns later; the next section describes one of the most common.
Because you already know about functions, you also know about methods in classes. Methods are just function objects created by def statements nested in a class statement’s body. From an abstract perspective, methods provide behavior for instance objects to inherit. From a programming perspective, methods work in exactly the same way as simple functions, with one crucial exception: a method’s first argument always receives the instance object that is the implied subject of the method call.
In other words, Python automatically maps instance method calls to class method functions as follows. Method calls made through an instance, like this:
instance.method(args...)
are automatically translated to class method function calls of this form:
class.method(instance, args...)
where the class is determined by locating the method name using Python’s inheritance search procedure. In fact, both call forms are valid in Python.
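A quick way to convince yourself that the two call forms are interchangeable is to run both against the same instance. The Greeter class and its greet method below are hypothetical names used only for this sketch:

```python
class Greeter:                        # Hypothetical class, for illustration only
    def greet(self, name):
        return 'hello ' + name

g = Greeter()
via_instance = g.greet('world')            # instance.method(args...)
via_class = Greeter.greet(g, 'world')      # class.method(instance, args...)
via_type = type(g).greet(g, 'world')       # Same, with the class fetched from g
print(via_instance == via_class == via_type)      # True
```

In the instance form Python fills in self for us; in the class forms we pass the instance ourselves.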
Besides the normal inheritance of method attribute names, the special first argument is the only real magic behind method calls. In a class method, the first argument is usually called self by convention (technically, only its position is significant, not its name). This argument provides methods with a hook back to the instance that is the subject of the call: because classes generate many instance objects, they need to use this argument to manage data that varies per instance.
C++ programmers may recognize Python’s self argument as being similar to C++’s this pointer. In Python, though, self is always explicit in your code: methods must always go through self to fetch or change attributes of the instance being processed by the current method call. This explicit nature of self is by design: the presence of this name makes it obvious that you are using instance attribute names in your script, not names in the local or global scope.
To clarify these concepts, let’s turn to an example. Suppose we define the following class:
class NextClass:                       # Define class
    def printer(self, text):           # Define method
        self.message = text            # Change instance
        print(self.message)            # Access instance
The name printer references a function object; because it’s assigned in the class statement’s scope, it becomes a class object attribute and is inherited by every instance made from the class. Normally, because methods like printer are designed to process instances, we call them through instances:
>>> x = NextClass()                    # Make instance
>>> x.printer('instance call')         # Call its method
instance call
>>> x.message                          # Instance changed
'instance call'
When we call the method by qualifying an instance like this, printer is first located by inheritance, and then its self argument is automatically assigned the instance object (x); the text argument gets the string passed at the call ('instance call'). Notice that because Python automatically passes the first argument to self for us, we only actually have to pass in one argument. Inside printer, the name self is used to access or set per-instance data because it refers back to the instance currently being processed.
Methods may be called in one of two ways: through an instance, or through the class itself. For example, we can also call printer by going through the class name, provided we pass an instance to the self argument explicitly:
>>> NextClass.printer(x, 'class call')     # Direct class call
class call
>>> x.message                              # Instance changed again
'class call'
Calls routed through the instance and the class have the exact same effect, as long as we pass the same instance object ourselves in the class form. By default, in fact, you get an error message if you try to call a method without any instance:
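>>> NextClass.printer('bad call')
TypeError: unbound method printer() must be called with NextClass instance...
The message shown here is the one Python 2.6 issues. In Python 3.X the same call still fails with a TypeError, but the complaint is about a missing argument, because the string is simply passed to self.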
Methods are normally called through instances. Calls to methods through a class, though, do show up in a variety of special roles. One common scenario involves the constructor method. The __init__ method, like all attributes, is looked up by inheritance. This means that at construction time, Python locates and calls just one __init__. If subclass constructors need to guarantee that superclass construction-time logic runs, too, they generally must call the superclass’s __init__ method explicitly through the class:
class Super:
    def __init__(self, x):
        ...default code...

class Sub(Super):
    def __init__(self, x, y):
        Super.__init__(self, x)        # Run superclass __init__
        ...custom code...              # Do my init actions

I = Sub(1, 2)
This is one of the few contexts in which your code is likely to call an operator overloading method directly. Naturally, you should only call the superclass constructor this way if you really want it to run; without the call, the subclass replaces it completely. For a more realistic illustration of this technique in action, see the Manager class example in the prior chapter’s tutorial.[65]
This pattern of calling methods through a class is the general basis of extending (instead of completely replacing) inherited method behavior. In Chapter 31, we’ll also meet a new option added in Python 2.2, static methods, that allow you to code methods that do not expect instance objects in their first arguments. Such methods can act like simple instanceless functions, with names that are local to the classes in which they are coded, and may be used to manage class data. A related concept, the class method, receives a class when called instead of an instance and can be used to manage per-class data. These are advanced and optional extensions, though; normally, you must always pass an instance to a method, whether it is called through an instance or a class.
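As a brief preview only, here is a hedged sketch of what static and class methods look like in practice. The Counter class and its method names are hypothetical, and the @ decorator syntax used here is one way to declare these methods in recent Pythons:

```python
class Counter:                        # Hypothetical class, for illustration only
    instances = 0                     # Per-class data

    def __init__(self):
        Counter.instances += 1

    @staticmethod
    def describe():                   # No instance or class is passed in
        return 'counts its instances'

    @classmethod
    def made(cls):                    # Receives the class, not an instance
        return cls.instances

Counter(); Counter()                  # Make two instances
print(Counter.describe())             # Called through the class, no instance needed
print(Counter.made())                 # 2
print(Counter().made())               # 3: also callable through a (new) instance
```

Neither method declares a self argument; describe gets nothing extra at all, and made gets the class itself.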
The whole point of a namespace tool like the class statement is to support name inheritance. This section expands on some of the mechanisms and roles of attribute inheritance in Python.
In Python, inheritance happens when an object is qualified, and it involves searching an attribute definition tree (one or more namespaces). Every time you use an expression of the form object.attr (where object is an instance or class object), Python searches the namespace tree from bottom to top, beginning with object, looking for the first attr it can find. This includes references to self attributes in your methods. Because lower definitions in the tree override higher ones, inheritance forms the basis of specialization.
Figure 28-1 summarizes the way namespace trees are constructed and populated with names. Generally:
Instance attributes are generated by assignments to self attributes in methods.
Class attributes are created by statements (assignments) in class statements.
Superclass links are made by listing classes in parentheses in a class statement header.
The net result is a tree of attribute namespaces that leads from an instance, to the class it was generated from, to all the superclasses listed in the class header. Python searches upward in this tree, from instances to superclasses, each time you use qualification to fetch an attribute name from an instance object.[66]
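The three rules above can be seen at work in a small tree. The class names Top, Middle, and Bottom and their attributes are hypothetical; this is a sketch of the search order, not code from the text:

```python
class Top:                            # Hypothetical three-level tree
    a = 'Top.a'
    b = 'Top.b'
    c = 'Top.c'

class Middle(Top):                    # Superclass link made in the header
    b = 'Middle.b'                    # Overrides Top.b for everything below

class Bottom(Middle):
    def setc(self):
        self.c = 'instance c'         # Instance attribute, made by assigning to self

x = Bottom()
print(x.a, x.b, x.c)                  # Top.a Middle.b Top.c
x.setc()
print(x.c)                            # instance c: found first, in the instance
print(Bottom.c)                       # Top.c: the class tree itself is unchanged
```

Each fetch stops at the lowest object that has the name, which is why Middle.b hides Top.b and the instance’s c hides the one in Top.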
The tree-searching model of inheritance just described turns out to be a great way to specialize systems. Because inheritance finds names in subclasses before it checks superclasses, subclasses can replace default behavior by redefining their superclasses’ attributes. In fact, you can build entire systems as hierarchies of classes, which are extended by adding new external subclasses rather than changing existing logic in-place.
The idea of redefining inherited names leads to a variety of specialization techniques. For instance, subclasses may replace inherited attributes completely, provide attributes that a superclass expects to find, and extend superclass methods by calling back to the superclass from an overridden method. We’ve already seen replacement in action. Here’s an example that shows how extension works:
>>> class Super:
...     def method(self):
...         print('in Super.method')
...
>>> class Sub(Super):
...     def method(self):                    # Override method
...         print('starting Sub.method')     # Add actions here
...         Super.method(self)               # Run default action
...         print('ending Sub.method')
...
Direct superclass method calls are the crux of the matter here. The Sub class replaces Super’s method function with its own specialized version, but within the replacement, Sub calls back to the version exported by Super to carry out the default behavior. In other words, Sub.method just extends Super.method’s behavior, rather than replacing it completely:
>>> x = Super()                        # Make a Super instance
>>> x.method()                         # Runs Super.method
in Super.method
>>> x = Sub()                          # Make a Sub instance
>>> x.method()                         # Runs Sub.method, calls Super.method
starting Sub.method
in Super.method
ending Sub.method
This extension coding pattern is also commonly used with constructors; see the section Methods for an example.
Extension is only one way to interface with a superclass. The file shown in this section, specialize.py, defines multiple classes that illustrate a variety of common techniques:
Super
Defines a method function and a delegate that expects an action in a subclass.
Inheritor
Doesn’t provide any new names, so it gets everything defined in Super.
Replacer
Overrides Super’s method with a version of its own.
Extender
Customizes Super’s method by overriding and calling back to run the default.
Provider
Implements the action method expected by Super’s delegate method.
Study each of these subclasses to get a feel for the various ways they customize their common superclass. Here’s the file:
class Super:
    def method(self):
        print('in Super.method')           # Default behavior
    def delegate(self):
        self.action()                      # Expected to be defined

class Inheritor(Super):                    # Inherit method verbatim
    pass

class Replacer(Super):                     # Replace method completely
    def method(self):
        print('in Replacer.method')

class Extender(Super):                     # Extend method behavior
    def method(self):
        print('starting Extender.method')
        Super.method(self)
        print('ending Extender.method')

class Provider(Super):                     # Fill in a required method
    def action(self):
        print('in Provider.action')

if __name__ == '__main__':
    for klass in (Inheritor, Replacer, Extender):
        print('\n' + klass.__name__ + '...')
        klass().method()
    print('\nProvider...')
    x = Provider()
    x.delegate()
A few things are worth pointing out here. First, the self-test code at the end of this example creates instances of three different classes in a for loop. Because classes are objects, you can put them in a tuple and create instances generically (more on this idea later). Classes also have the special __name__ attribute, like modules; it’s preset to a string containing the name in the class header. Here’s what happens when we run the file:
% python specialize.py
Inheritor...
in Super.method
Replacer...
in Replacer.method
Extender...
starting Extender.method
in Super.method
ending Extender.method
Provider...
in Provider.action
Notice how the Provider class in the prior example works. When we call the delegate method through a Provider instance, two independent inheritance searches occur:
On the initial x.delegate call, Python finds the delegate method in Super by searching the Provider instance and above. The instance x is passed into the method’s self argument as usual.
Inside the Super.delegate method, self.action invokes a new, independent inheritance search of self and above. Because self references a Provider instance, the action method is located in the Provider subclass.
This “filling in the blanks” sort of coding structure is typical of OOP frameworks. At least in terms of the delegate method, the superclass in this example is what is sometimes called an abstract superclass: a class that expects parts of its behavior to be provided by its subclasses. If an expected method is not defined in a subclass, Python raises an undefined name exception (an AttributeError) when the inheritance search fails.
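To see the failure mode concretely, the following sketch defines a subclass that forgets to supply the expected method. The Incomplete class is hypothetical; Super and delegate follow the naming used in this section, and the exact wording of the error message varies by Python version, so only the exception type is relied on here:

```python
class Super:
    def delegate(self):
        self.action()                 # Expected to be defined lower in the tree

class Incomplete(Super):              # Hypothetical subclass that omits action
    pass

try:
    Incomplete().delegate()           # Search for action fails in the whole tree
except AttributeError as exc:
    error = exc                       # Keep a reference for later inspection
    print('AttributeError:', exc)
```

Note that the error appears only when delegate is actually called, not when the class or the instance is created.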
Class coders sometimes make such subclass requirements more obvious with assert statements, or by raising the built-in NotImplementedError exception with raise statements (we’ll study statements that may trigger exceptions in depth in the next part of this book). As a quick preview, here’s the assert scheme in action:
class Super:
    def delegate(self):
        self.action()
    def action(self):
        assert False, 'action must be defined!'     # If this version is called

>>> X = Super()
>>> X.delegate()
AssertionError: action must be defined!
We’ll meet assert in Chapters 32 and 33; in short, if its first expression evaluates to false, it raises an exception with the provided error message. Here, the expression is always false so as to trigger an error message if a method is not redefined, and inheritance locates the version here. Alternatively, some classes simply raise a NotImplementedError exception directly in such method stubs to signal the mistake:
class Super:
    def delegate(self):
        self.action()
    def action(self):
        raise NotImplementedError('action must be defined!')

>>> X = Super()
>>> X.delegate()
NotImplementedError: action must be defined!
For instances of subclasses, we still get the exception unless the subclass provides the expected method to replace the default in the superclass:
>>> class Sub(Super): pass
...
>>> X = Sub()
>>> X.delegate()
NotImplementedError: action must be defined!

>>> class Sub(Super):
...     def action(self): print('spam')
...
>>> X = Sub()
>>> X.delegate()
spam
For a somewhat more realistic example of this section’s concepts in action, see the “Zoo animal hierarchy” exercise (exercise 8) at the end of Chapter 31, and its solution in Part VI, Classes and OOP in Appendix B. Such taxonomies are a traditional way to introduce OOP, but they’re a bit removed from most developers’ job descriptions.
As of Python 2.6 and 3.0, the prior section’s abstract superclasses (a.k.a. “abstract base classes”), which require methods to be filled in by subclasses, may also be implemented with special class syntax. The way we code this varies slightly depending on the version. In Python 3.0, we use a keyword argument in a class header, along with special @ decorator syntax, both of which we’ll study in detail later in this book:
from abc import ABCMeta, abstractmethod

class Super(metaclass=ABCMeta):
    @abstractmethod
    def method(self, ...):
        pass
But in Python 2.6, we use a class attribute instead:
class Super:
    __metaclass__ = ABCMeta
    @abstractmethod
    def method(self, ...):
        pass
Either way, the effect is the same—we can’t make an instance unless the method is defined lower in the class tree. In 3.0, for example, here is the special syntax equivalent of the prior section’s example:
>>> from abc import ABCMeta, abstractmethod
>>>
>>> class Super(metaclass=ABCMeta):
...     def delegate(self):
...         self.action()
...     @abstractmethod
...     def action(self):
...         pass
...
>>> X = Super()
TypeError: Can't instantiate abstract class Super with abstract methods action

>>> class Sub(Super): pass
...
>>> X = Sub()
TypeError: Can't instantiate abstract class Sub with abstract methods action

>>> class Sub(Super):
...     def action(self): print('spam')
...
>>> X = Sub()
>>> X.delegate()
spam
Coded this way, a class with an abstract method cannot be instantiated (that is, we cannot create an instance by calling it) unless all of its abstract methods have been defined in subclasses. Although this requires more code, the advantage of this approach is that errors for missing methods are issued when we attempt to make an instance of the class, not later when we try to call a missing method. This feature may also be used to define an expected interface, automatically verified in client classes.
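The difference in timing between the two schemes can be sketched side by side. This example assumes Python 3.X class-header syntax; all four class names and the events list are hypothetical, and only the exception types (not the message text, which varies by version) are relied on:

```python
from abc import ABCMeta, abstractmethod

class StubSuper:                          # Hypothetical: stub that raises when called
    def action(self):
        raise NotImplementedError('action must be defined!')

class AbstractSuper(metaclass=ABCMeta):   # Hypothetical: abstract method declared
    @abstractmethod
    def action(self):
        pass

class StubSub(StubSuper): pass            # Neither subclass defines action
class AbstractSub(AbstractSuper): pass

events = []
obj = StubSub()                           # Instance creation succeeds...
events.append('StubSub instance made')
try:
    obj.action()                          # ...the mistake surfaces only at the call
except NotImplementedError:
    events.append('StubSub failed at call')

try:
    AbstractSub()                         # The mistake surfaces at instance creation
except TypeError:
    events.append('AbstractSub failed at creation')

print(events)
```

With the stub scheme a flawed subclass can be instantiated and passed around for some time before the missing method is noticed; with the abstract scheme it cannot be instantiated at all.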
Unfortunately, this scheme also relies on two advanced language tools we have not met yet—function decorators, introduced in Chapter 31 and covered in depth in Chapter 38, as well as metaclass declarations, mentioned in Chapter 31 and covered in Chapter 39—so we will finesse other facets of this option here. See Python’s standard manuals for more on this, as well as precoded abstract superclasses Python provides.
Now that we’ve examined class and instance objects, the Python namespace story is complete. For reference, I’ll quickly summarize all the rules used to resolve names here. The first things you need to remember are that qualified and unqualified names are treated differently, and that some scopes serve to initialize object namespaces:
Unqualified names (e.g., X) deal with scopes.
Qualified attribute names (e.g., object.X) use object namespaces.
Some scopes initialize object namespaces (for modules and classes).
Unqualified simple names follow the LEGB lexical scoping rule outlined for functions in Chapter 17:
Assignment (X = value)
Makes names local: creates or changes the name X in the current local scope, unless declared global.
Reference (X)
Looks for the name X in the current local scope, then any and all enclosing functions, then the current global scope, then the built-in scope.
Qualified attribute names refer to attributes of specific objects and obey the rules for modules and classes. For class and instance objects, the reference rules are augmented to include the inheritance search procedure:
Assignment (object.X = value)
Creates or alters the attribute name X in the namespace of the object being qualified, and none other. Inheritance-tree climbing happens only on attribute reference, not on attribute assignment.
Reference (object.X)
For class-based objects, searches for the attribute name X in object, then in all accessible classes above it, using the inheritance search procedure. For nonclass objects such as modules, fetches X from object directly.
With distinct search procedures for qualified and unqualified names, and multiple lookup layers for both, it can sometimes be difficult to tell where a name will wind up going. In Python, the place where you assign a name is crucial—it fully determines the scope or object in which a name will reside. The file manynames.py illustrates how this principle translates to code and summarizes the namespace ideas we have seen throughout this book:
# manynames.py

X = 11                       # Global (module) name/attribute (X, or manynames.X)

def f():
    print(X)                 # Access global X (11)

def g():
    X = 22                   # Local (function) variable (X, hides module X)
    print(X)

class C:
    X = 33                   # Class attribute (C.X)
    def m(self):
        X = 44               # Local variable in method (X)
        self.X = 55          # Instance attribute (instance.X)
This file assigns the same name, X, five times. Because this name is assigned in five different locations, though, all five Xs in this program are completely different variables. From top to bottom, the assignments to X here generate: a module attribute (11), a local variable in a function (22), a class attribute (33), a local variable in a method (44), and an instance attribute (55). Although all five are named X, the fact that they are all assigned at different places in the source code or to different objects makes all of these unique variables.
You should take the time to study this example carefully because it collects ideas we’ve been exploring throughout the last few parts of this book. When it makes sense to you, you will have achieved a sort of Python namespace nirvana. Of course, an alternative route to nirvana is to simply run the program and see what happens. Here’s the remainder of this source file, which makes an instance and prints all the Xs that it can fetch:
# manynames.py, continued

if __name__ == '__main__':
    print(X)                 # 11: module (a.k.a. manynames.X outside file)
    f()                      # 11: global
    g()                      # 22: local
    print(X)                 # 11: module name unchanged

    obj = C()                # Make instance
    print(obj.X)             # 33: class name inherited by instance

    obj.m()                  # Attach attribute name X to instance now
    print(obj.X)             # 55: instance
    print(C.X)               # 33: class (a.k.a. obj.X if no X in instance)

    #print(C.m.X)            # FAILS: only visible in method
    #print(g.X)              # FAILS: only visible in function
The outputs that are printed when the file is run are noted in the comments in the code; trace through them to see which variable named X is being accessed each time. Notice in particular that we can go through the class to fetch its attribute (C.X), but we can never fetch local variables in functions or methods from outside their def statements. Locals are visible only to other code within the def, and in fact only live in memory while a call to the function or method is executing.
Some of the names defined by this file are visible outside the file to other modules, but recall that we must always import before we can access names in another file—that is the main point of modules, after all:
# otherfile.py

import manynames

X = 66
print(X)                     # 66: the global here
print(manynames.X)           # 11: globals become attributes after imports

manynames.f()                # 11: manynames's X, not the one here!
manynames.g()                # 22: local in other file's function

print(manynames.C.X)         # 33: attribute of class in other module
I = manynames.C()
print(I.X)                   # 33: still from class here
I.m()
print(I.X)                   # 55: now from instance!
Notice here how manynames.f() prints the X in manynames, not the X assigned in this file: scopes are always determined by the position of assignments in your source code (i.e., lexically) and are never influenced by what imports what or who imports whom. Also, notice that the instance’s own X is not created until we call I.m(): attributes, like all variables, spring into existence when assigned, and not before. Normally we create instance attributes by assigning them in class __init__ constructor methods, but this isn’t the only option.
Finally, as we learned in Chapter 17, it’s also possible for a function to change names outside itself, with global and (in Python 3.0) nonlocal statements. These statements provide write access, but also modify assignment’s namespace binding rules:
X = 11                       # Global in module

def g1():
    print(X)                 # Reference global in module

def g2():
    global X
    X = 22                   # Change global in module

def h1():
    X = 33                   # Local in function
    def nested():
        print(X)             # Reference local in enclosing scope

def h2():
    X = 33                   # Local in function
    def nested():
        nonlocal X           # Python 3.0 statement
        X = 44               # Change local in enclosing scope
Of course, you generally shouldn’t use the same name for every variable in your script—but as this example demonstrates, even if you do, Python’s namespaces will work to keep names used in one context from accidentally clashing with those used in another.
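To watch global and nonlocal rebind names outside the function that assigns them, here is a runnable variation on the g2 and h2 functions shown earlier. It assumes Python 3.X and is run as a top-level script; the calls to nested and the return statement are additions made only so that the effects can be observed:

```python
X = 11                                # Global in module

def g2():
    global X
    X = 22                            # Rebinds the module-level X

def h2():
    X = 33                            # Local in h2
    def nested():
        nonlocal X                    # Python 3.X only
        X = 44                        # Rebinds X in h2's scope, not the global
    nested()                          # Added here so the change takes effect
    return X                          # Added here so the change can be observed

before = X
g2()
after = X
print(before, after)                  # 11 22: the global was changed by g2
print(h2(), X)                        # 44 22: nested changed h2's X, global untouched
```

Without the global and nonlocal declarations, both assignments would simply create new local names and leave the outer X alone.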
In Chapter 22, we learned that module namespaces are actually implemented as dictionaries and exposed with the built-in __dict__ attribute. The same holds for class and instance objects: attribute qualification is really a dictionary indexing operation internally, and attribute inheritance is just a matter of searching linked dictionaries. In fact, instance and class objects are mostly just dictionaries with links inside Python. Python exposes these dictionaries, as well as the links between them, for use in advanced roles (e.g., for coding tools).
To help you understand how attributes work internally, let’s work through an interactive session that traces the way namespace dictionaries grow when classes are involved. We saw a simpler version of this type of code in Chapter 26, but now that we know more about methods and superclasses, let’s embellish it here. First, let’s define a superclass and a subclass with methods that will store data in their instances:
>>> class super:
...     def hello(self):
...         self.data1 = 'spam'
...
>>> class sub(super):
...     def hola(self):
...         self.data2 = 'eggs'
...
When we make an instance of the subclass, the instance starts out with an empty namespace dictionary, but it has links back to the class for the inheritance search to follow. In fact, the inheritance tree is explicitly available in special attributes, which you can inspect. Instances have a __class__ attribute that links to their class, and classes have a __bases__ attribute that is a tuple containing links to higher superclasses (I’m running this on Python 3.0; name formats and some internal attributes vary slightly in 2.6):
>>> X = sub()
>>> X.__dict__                         # Instance namespace dict
{}
>>> X.__class__                        # Class of instance
<class '__main__.sub'>
>>> sub.__bases__                      # Superclasses of class
(<class '__main__.super'>,)
>>> super.__bases__                    # () empty tuple in Python 2.6
(<class 'object'>,)
As classes assign to self attributes, they populate the instance objects: that is, attributes wind up in the instances’ attribute namespace dictionaries, not in the classes’. An instance object’s namespace records data that can vary from instance to instance, and self is a hook into that namespace:
>>> Y = sub()
>>> X.hello()
>>> X.__dict__
{'data1': 'spam'}
>>> X.hola()
>>> X.__dict__
{'data1': 'spam', 'data2': 'eggs'}
>>> list(sub.__dict__.keys())          # list() needed to display keys in 3.0
['__module__', '__doc__', 'hola']
>>> list(super.__dict__.keys())
['__dict__', '__module__', '__weakref__', 'hello', '__doc__']
>>> Y.__dict__
{}
Notice the extra underscore names in the class dictionaries;
Python sets these automatically. Most are not used in typical
programs, but there are tools that use some of them (e.g., __doc__
holds the docstrings discussed in
Chapter 15).
Also, observe that Y, a second instance made at the start of this series, still has an empty namespace dictionary at the end, even though X’s dictionary has been populated by assignments in methods. Again, each instance has an independent namespace dictionary, which starts out empty and can record completely different attributes than those recorded by the namespace dictionaries of other instances of the same class.
Because attributes are actually dictionary keys inside Python, there are really two ways to fetch and assign their values—by qualification, or by key indexing:
>>> X.data1, X.__dict__['data1']
('spam', 'spam')
>>> X.data3 = 'toast'
>>> X.__dict__
{'data1': 'spam', 'data3': 'toast', 'data2': 'eggs'}
>>> X.__dict__['data3'] = 'ham'
>>> X.data3
'ham'
This equivalence applies only to attributes actually attached to the instance, though. Because attribute fetch qualification also performs an inheritance search, it can access attributes that namespace dictionary indexing cannot. The inherited attribute X.hello, for instance, cannot be accessed by X.__dict__['hello'].
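The limitation just described can be sketched in a few lines. This is only an illustration: the class names Base and Derived are hypothetical stand-ins, not the super and sub classes of the session above.

```python
class Base:
    def hello(self):
        self.data1 = 'spam'

class Derived(Base):
    pass

x = Derived()
x.hello()

# Attributes assigned through self live in the instance's own dictionary
in_instance = 'data1' in x.__dict__

# Qualification finds the inherited method by searching the class tree...
found_by_search = callable(x.hello)

# ...but indexing the instance dictionary does not search, so it fails
try:
    x.__dict__['hello']
    found_by_index = True
except KeyError:
    found_by_index = False

# The method is actually stored in the superclass's dictionary
in_base = 'hello' in Base.__dict__

print(in_instance, found_by_search, found_by_index, in_base)
```

Under these assumptions, only the dictionary index on the instance fails; the other three checks succeed.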
Finally, here is the built-in dir function we met in Chapters 4 and 15 at work on class and instance objects. This function works on anything with attributes: dir(object) is similar to an object.__dict__.keys() call. Notice, though, that dir sorts its list and includes some system attributes. As of Python 2.2, dir also collects inherited attributes automatically, and in 3.0 it includes names inherited from the object class that is an implied superclass of all classes:[67]
>>> X.__dict__, Y.__dict__
({'data1': 'spam', 'data3': 'ham', 'data2': 'eggs'}, {})
>>> list(X.__dict__.keys())          # Need list in 3.0
['data1', 'data3', 'data2']

# In Python 2.6:

>>> dir(X)
['__doc__', '__module__', 'data1', 'data2', 'data3', 'hello', 'hola']
>>> dir(sub)
['__doc__', '__module__', 'hello', 'hola']
>>> dir(super)
['__doc__', '__module__', 'hello']

# In Python 3.0:

>>> dir(X)
['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__',
...more omitted...
'data1', 'data2', 'data3', 'hello', 'hola']
>>> dir(sub)
['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__',
...more omitted...
'hello', 'hola']
>>> dir(super)
['__class__', '__delattr__', '__dict__', '__doc__', '__eq__', '__format__',
...more omitted...
'hello']
Experiment with these special attributes on your own to get a better feel for how namespaces actually do their attribute business. Even if you will never use these in the kinds of programs you write, seeing that they are just normal dictionaries will help demystify the notion of namespaces in general.
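As one possible starting point for such experiments, the following sketch contrasts an instance's own namespace dictionary with what dir reports under Python 3.X. The class names Base and Derived are hypothetical, and filtering out double-underscore names is just one convenient way to hide the system attributes.

```python
class Base:
    def hello(self):
        self.data1 = 'spam'

class Derived(Base):
    def hola(self):
        self.data2 = 'eggs'

x = Derived()
x.hello()
x.hola()

# Names attached to the instance itself
own = sorted(x.__dict__)

# dir also collects inherited names; drop the __X__ system names for display
visible = [name for name in dir(x) if not name.startswith('__')]

print(own)        # ['data1', 'data2']
print(visible)    # ['data1', 'data2', 'hello', 'hola']
```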
The prior section introduced the special __class__
and __bases__
instance
and class attributes, without really explaining why you might care
about them. In short, these attributes allow you to inspect
inheritance hierarchies within your own code. For example, they can
be used to display a class tree, as in the following example:
# classtree.py

"""
Climb inheritance trees using namespace links,
displaying higher superclasses with indentation
"""

def classtree(cls, indent):
    print('.' * indent + cls.__name__)        # Print class name here
    for supercls in cls.__bases__:            # Recur to all superclasses
        classtree(supercls, indent+3)         # May visit super > once

def instancetree(inst):
    print('Tree of %s' % inst)                # Show instance
    classtree(inst.__class__, 3)              # Climb to its class

def selftest():
    class A:      pass
    class B(A):   pass
    class C(A):   pass
    class D(B,C): pass
    class E:      pass
    class F(D,E): pass
    instancetree(B())
    instancetree(F())

if __name__ == '__main__': selftest()
The classtree function in this script is recursive: it prints a class’s name using __name__, then climbs up to the superclasses by calling itself. This allows the function to traverse arbitrarily shaped class trees; the recursion climbs to the top, and stops at root superclasses that have empty __bases__ attributes. When using recursion, each active level of a function gets its own copy of the local scope; here, this means that cls and indent are different at each classtree level.
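To make the recursion easier to inspect, here is a sketch of a variant that returns the indented lines in a list instead of printing them. The function name classtree_lines is hypothetical and not part of classtree.py; the output shown assumes Python 3.X, where object is the implied root.

```python
def classtree_lines(cls, indent=0):
    """Return the class tree above cls as a list of indented names."""
    lines = ['.' * indent + cls.__name__]          # This class first
    for supercls in cls.__bases__:                 # Then each superclass,
        lines.extend(classtree_lines(supercls, indent + 3))   # one level deeper
    return lines

class A: pass
class B(A): pass
class C(A): pass
class D(B, C): pass

for line in classtree_lines(D):
    print(line)

# Prints under Python 3.X:
# D
# ...B
# ......A
# .........object
# ...C
# ......A
# .........object
```

Returning a list rather than printing keeps the traversal separate from the display, which may help if the output format is changed later.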
Most of this file is self-test code. When run standalone in Python 2.6, it builds an empty class tree, makes two instances from it, and prints their class tree structures:
C:\misc> c:\python26\python classtree.py
Tree of <__main__.B instance at 0x02557328>
...B
......A
Tree of <__main__.F instance at 0x02557328>
...F
......D
.........B
............A
.........C
............A
......E
When run under Python 3.0, the tree includes the implied
object
superclasses that are
automatically added above standalone classes, because all classes
are “new style” in 3.0 (more on this change in Chapter 31):
C:\misc> c:\python30\python classtree.py
Tree of <__main__.B object at 0x02810650>
...B
......A
.........object
Tree of <__main__.F object at 0x02810650>
...F
......D
.........B
............A
...............object
.........C
............A
...............object
......E
.........object
Here, indentation marked by periods is used to denote class tree height. Of course, we could improve on this output format, and perhaps even sketch it in a GUI display. Even as is, though, we can import these functions anywhere we want a quick class tree display:
C:\misc> c:\python30\python
>>> class Emp: pass
...
>>> class Person(Emp): pass
...
>>> bob = Person()
>>> import classtree
>>> classtree.instancetree(bob)
Tree of <__main__.Person object at 0x028203B0>
...Person
......Emp
.........object
Regardless of whether you will ever code or use such tools, this example demonstrates one of the many ways that you can make use of special attributes that expose interpreter internals. You’ll see another when we code the lister.py general-purpose class display tools in the section Multiple Inheritance: “Mix-in” Classes—there, we will extend this technique to also display attributes in each object in a class tree. And in the last part of this book, we’ll revisit such tools in the context of Python tool building at large, to code tools that implement attribute privacy, argument validation, and more. While not for every Python programmer, access to internals enables powerful development tools.
The last section’s example includes a docstring for its module, but remember that docstrings can be used for class components as well. Docstrings, which we covered in detail in Chapter 15, are string literals that show up at the top of various structures and are automatically saved by Python in the corresponding objects’ __doc__ attributes. This works for module files, function defs, and classes and methods.
Now that we know more about classes and methods, the following file, docstr.py, provides a quick but comprehensive example that summarizes the places where docstrings can show up in your code. All of these can be triple-quoted blocks:
"I am: docstr.__doc__"

def func(args):
    "I am: docstr.func.__doc__"
    pass

class spam:
    "I am: spam.__doc__ or docstr.spam.__doc__"
    def method(self, arg):
        "I am: spam.method.__doc__ or self.method.__doc__"
        pass
The main advantage of documentation
strings is that they stick around at runtime. Thus, if it’s been coded
as a docstring, you can qualify an object with its __doc__
attribute to fetch its
documentation:
>>> import docstr
>>> docstr.__doc__
'I am: docstr.__doc__'
>>> docstr.func.__doc__
'I am: docstr.func.__doc__'
>>> docstr.spam.__doc__
'I am: spam.__doc__ or docstr.spam.__doc__'
>>> docstr.spam.method.__doc__
'I am: spam.method.__doc__ or self.method.__doc__'
A discussion of the PyDoc tool, which knows
how to format all these strings in reports, appears in Chapter 15. Here it is running on our
code under Python 2.6 (Python 3.0 shows additional attributes
inherited from the implied object
superclass in the new-style class model—run this on your own to see
the 3.0 extras, and watch for more about this difference in Chapter 31):
>>> help(docstr)
Help on module docstr:

NAME
    docstr - I am: docstr.__doc__

FILE
    c:\misc\docstr.py

CLASSES
    spam

    class spam
     |  I am: spam.__doc__ or docstr.spam.__doc__
     |
     |  Methods defined here:
     |
     |  method(self, arg)
     |      I am: spam.method.__doc__ or self.method.__doc__

FUNCTIONS
    func(args)
        I am: docstr.func.__doc__
Documentation strings are available at runtime, but they are
less flexible syntactically than #
comments (which can appear anywhere in a program). Both forms are
useful tools, and any program documentation is good (as long as it’s
accurate, of course!). As a best-practice rule of thumb, use
docstrings for functional documentation (what your objects do) and
hash-mark comments for more micro-level documentation (how arcane
expressions work).
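As a small illustration of that rule of thumb, the sketch below uses a docstring to say what a function does and a hash-mark comment to explain one detail of how. The function average is a made-up example, not code from this chapter.

```python
def average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    # float() keeps the division from truncating under Python 2.X
    return sum(values) / float(len(values))

print(average.__doc__)            # The docstring is available at runtime
print(average([1, 2, 3, 4]))      # 2.5
```

The comment is visible only to someone reading the source, while the docstring can also be fetched by tools such as help.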
Let’s wrap up this chapter by briefly comparing the topics of this book’s last two parts: modules and classes. Because they’re both about namespaces, the distinction can be confusing. In short:
Modules
Are data/logic packages
Are created by writing Python files or C extensions
Are used by being imported
Classes
Implement new objects
Are created by class statements
Are used by being called
Always live within a module
Classes also support extra features that modules don’t, such as operator overloading, multiple instance generation, and inheritance. Although both classes and modules are namespaces, you should be able to tell by now that they are very different things.
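One way to see the difference in how the two are used is the following sketch, which imports a standard library module twice and calls a class twice. The Counter class is hypothetical and exists only for this comparison.

```python
import math
import math as again              # A second import returns the same module object

class Counter:
    step = 1                      # Class data, shared by all instances
    def __init__(self):
        self.count = 0            # Per-instance data
    def bump(self):
        self.count += self.step

same_module = math is again       # Importing gives one shared namespace

a, b = Counter(), Counter()       # Calling makes a new namespace each time
a.bump()

print(same_module, a.count, b.count)    # True 1 0
```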
This chapter took us on a second, more in-depth tour of the OOP mechanisms of the Python language. We learned more about classes, methods, and inheritance, and we wrapped up the namespace story in Python by extending it to cover its application to classes. Along the way, we looked at some more advanced concepts, such as abstract superclasses, class data attributes, namespace dictionaries and links, and manual calls to superclass methods and constructors.
Now that we’ve learned all about the mechanics of coding classes in Python, Chapter 29 turns to a specific facet of those mechanics: operator overloading. After that we’ll explore common design patterns, looking at some of the ways that classes are commonly used and combined to optimize code reuse. Before you read ahead, though, be sure to work through the usual chapter quiz to review what we’ve covered here.
What is an abstract superclass?
What happens when a simple assignment statement appears at the top level of a class statement?
Why might a class need to manually call the __init__ method in a superclass?
How can you augment, instead of completely replacing, an inherited method?
What...was the capital of Assyria?
An abstract superclass is a class that calls a method, but does not inherit or define it—it expects the method to be filled in by a subclass. This is often used as a way to generalize classes when behavior cannot be predicted until a more specific subclass is coded. OOP frameworks also use this as a way to dispatch to client-defined, customizable operations.
When a simple assignment statement (X = Y) appears at the top level of a class statement, it attaches a data attribute to the class (Class.X). Like all class attributes, this will be shared by all instances; data attributes are not callable method functions, though.
A class must manually call the __init__ method in a superclass if it defines an __init__ constructor of its own and still wants the superclass’s construction code to run. Python itself automatically runs just one constructor: the lowest one in the tree. Superclass constructors are called through the class name, passing in the self instance manually: Superclass.__init__(self, ...).
To augment instead of completely replacing an inherited method, redefine it in a subclass, but call back to the superclass’s version of the method manually from the new version of the method in the subclass. That is, pass the self instance to the superclass’s version of the method manually: Superclass.method(self, ...).
Ashur (or Qalat Sherqat), Calah (or Nimrud), the short-lived Dur Sharrukin (or Khorsabad), and finally Nineveh.
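The first four answers above can be sketched together in one short example. All names here (Base, Derived, delegate, action, describe, count) are hypothetical and chosen only for illustration.

```python
class Base:
    count = 0                          # Top-level assignment: shared class data

    def __init__(self, name):
        self.name = name

    def delegate(self):
        return self.action()           # Abstract: action must come from a subclass

    def describe(self):
        return 'Base ' + self.name

class Derived(Base):
    def __init__(self, name, extra):
        Base.__init__(self, name)      # Run the superclass constructor manually
        self.extra = extra

    def action(self):                  # Fill in the expected method
        return 'Derived.action'

    def describe(self):                # Augment, rather than replace
        return Base.describe(self) + ' + ' + self.extra

x = Derived('bob', 'more')
print(x.delegate())                    # Derived.action
print(x.describe())                    # Base bob + more
print(x.count)                         # 0
```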
[63] If you’ve used C++ you may recognize this as similar to the notion of C++’s “static” data members, that is, members that are stored in the class, independent of instances. In Python, it’s nothing special: all class attributes are just names assigned in the class statement, whether they happen to reference functions (C++’s “methods”) or something else (C++’s “members”). In Chapter 31, we’ll also meet Python static methods (akin to those in C++), which are just self-less functions that usually process class attributes.
[64] Unless the class has redefined the attribute assignment operation to do something unique with the __setattr__ operator overloading method (discussed in Chapter 29).
[65] On a somewhat related note, you can also code multiple __init__ methods within the same class, but only the last definition will be used; see Chapter 30 for more details on multiple method definitions.
[66] This description isn’t 100% complete, because we can also create instance and class attributes by assigning to objects outside class statements, but that’s a much less common and sometimes more error-prone approach (changes aren’t isolated to class statements). In Python, all attributes are always accessible by default. We’ll talk more about attribute name privacy in Chapter 29 when we study __setattr__, in Chapter 30 when we meet __X names, and again in Chapter 38, where we’ll implement it with a class decorator.
[67] As you can see, the contents of attribute dictionaries and dir call results may change over time. For example, because Python now allows built-in types to be subclassed like classes, the contents of dir results for built-in types have expanded to include operator overloading methods, just like our dir results here for user-defined classes under Python 3.0. In general, attribute names with leading and trailing double underscores are interpreter-specific. Type subclasses will be discussed further in Chapter 31.