This chapter expands on the attribute interception techniques introduced earlier, introduces another, and employs them in a handful of larger examples. Like everything in this part of the book, this chapter is classified as an advanced topic and optional reading, because most applications programmers don’t need to care about the material discussed here—they can fetch and set attributes on objects without concern for attribute implementations. Especially for tools builders, though, managing attribute access can be an important part of flexible APIs.
Object attributes are central to most Python programs: they are where we often store information about the entities our scripts process. Normally, attributes are simply names for objects; a person’s name attribute, for example, might be a simple string, fetched and set with basic attribute syntax:
person.name                 # Fetch attribute value
person.name = value         # Change attribute value
In most cases, the attribute lives in the object itself, or is inherited from a class from which it derives. That basic model suffices for most programs you will write in your Python career.
Sometimes, though, more flexibility is required. Suppose you’ve written a program to use a name attribute directly, but then your requirements change; for example, you decide that names should be validated with logic when set or mutated in some way when fetched. It’s straightforward to code methods to manage access to the attribute’s value (valid and transform are abstract here):
class Person:
    def getName(self):
        if not valid():
            raise TypeError('cannot fetch name')
        else:
            return self.name.transform()
    def setName(self, value):
        if not valid(value):
            raise TypeError('cannot change name')
        else:
            self.name = transform(value)

person = Person()
person.getName()
person.setName('value')
However, this also requires changing all the places where names are used in the entire program—a possibly nontrivial task. Moreover, this approach requires the program to be aware of how values are exported: as simple names or called methods. If you begin with a method-based interface to data, clients are immune to changes; if you do not, they can become problematic.
This issue can crop up more often than you might expect. The value of a cell in a spreadsheet-like program, for instance, might begin its life as a simple discrete value, but later mutate into an arbitrary calculation. Since an object’s interface should be flexible enough to support such future changes without breaking existing code, switching to methods later is less than ideal.
A better solution would allow you to run code automatically on attribute access, if needed. At various points in this book, we’ve met Python tools that allow our scripts to dynamically compute attribute values when fetching them and validate or change attribute values when storing them. In this chapter, we’re going to expand on the tools already introduced, explore other available tools, and study some larger use-case examples in this domain. Specifically, this chapter presents:
The __getattr__ and __setattr__ methods, for routing undefined attribute fetches and all attribute assignments to generic handler methods.
The __getattribute__ method, for routing all attribute fetches to a generic handler method in new-style classes in 2.6 and all classes in 3.0.
The property built-in, for routing specific attribute access to get and set handler functions, known as properties.
The descriptor protocol, for routing specific attribute accesses to instances of classes with arbitrary get and set handler methods.
The first and third of these were briefly introduced in Part VI; the others are new topics introduced and covered here.
As we’ll see, all four techniques share goals to some degree, and it’s usually possible to code a given problem using any one of them. They do differ in some important ways, though. For example, the last two techniques listed here apply to specific attributes, whereas the first two are generic enough to be used by delegation-based classes that must route arbitrary attributes to wrapped objects. As we’ll see, all four schemes also differ in both complexity and aesthetics, in ways you must see in action to judge for yourself.
Besides studying the specifics behind the four attribute interception techniques listed in this section, this chapter also presents an opportunity to explore larger programs than we’ve seen elsewhere in this book. The CardHolder case study at the end, for example, should serve as a self-study example of larger classes in action. We’ll also be using some of the techniques outlined here in the next chapter to code decorators, so be sure you have at least a general understanding of these topics before you move on.
The property protocol allows us to route a specific attribute’s get and set operations to functions or methods we provide, enabling us to insert code to be run automatically on attribute access, intercept attribute deletions, and provide documentation for the attributes if desired.
Properties are created with the property built-in and are assigned to class attributes, just like method functions. As such, they are inherited by subclasses and instances, like any other class attributes. Their access-interception functions are provided with the self instance argument, which grants access to state information and class attributes available on the subject instance.
A property manages a single, specific attribute; although it can’t catch all attribute accesses generically, it allows us to control both fetch and assignment accesses and enables us to change an attribute from simple data to a computation freely, without breaking existing code. As we’ll see, properties are strongly related to descriptors; they are essentially a restricted form of them.
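As a rough illustration of changing an attribute from simple data to a computation without touching client code, consider the following sketch. The PlainCell and FormulaCell classes and the report function are hypothetical names invented for this example; they loosely echo the spreadsheet cell scenario mentioned earlier and are not part of this chapter’s code.

```python
# Hypothetical cell classes: the names PlainCell, FormulaCell, and report
# are illustrative assumptions, not code from this chapter.

class PlainCell:
    def __init__(self, value):
        self.value = value                  # Simple stored data

class FormulaCell:
    def __init__(self, left, right):
        self.left = left
        self.right = right
    def getValue(self):
        return self.left + self.right       # Computed on each fetch
    value = property(getValue)              # Same attribute name as before

def report(cell):
    return 'cell holds %s' % cell.value     # Client: plain attribute fetch

print(report(PlainCell(10)))
print(report(FormulaCell(4, 6)))
```

The report function uses the same attribute syntax for both classes; only the class that owns the attribute knows whether value is stored or computed.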
A property is created by assigning the result of a built-in function to a class attribute:
attribute = property(fget, fset, fdel, doc)
None of this built-in’s arguments are required, and all default to None if not passed; an operation whose handler is left as None is not supported, and attempting it raises an exception. When using them, we pass fget a function for intercepting attribute fetches, fset a function for assignments, and fdel a function for attribute deletions; the doc argument receives a documentation string for the attribute, if desired (otherwise the property copies the docstring of fget, if provided, which defaults to None). fget returns the computed attribute value, and fset and fdel return nothing (really, None).
This built-in call returns a property object, which we assign to the name of the attribute to be managed in the class scope, where it will be inherited by every instance.
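To make the effect of omitted arguments concrete, here is a minimal sketch in which only fget is passed. The Circle class and the rejected list are hypothetical names used only for illustration; the behavior shown (AttributeError on unsupported operations) is that of the built-in in Python 3.

```python
class Circle:                               # Hypothetical example class
    def __init__(self, radius):
        self._radius = radius
    def getRadius(self):
        return self._radius
    radius = property(getRadius)            # fset and fdel default to None

rejected = []                               # Records which operations failed
c = Circle(2)
print(c.radius)                             # Fetch is supported

try:
    c.radius = 5                            # No fset: assignment is rejected
except AttributeError:
    rejected.append('set')

try:
    del c.radius                            # No fdel: deletion is rejected
except AttributeError:
    rejected.append('delete')

print(rejected)
```

Leaving out fset is therefore enough to make a property-managed attribute read-only.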
To demonstrate how this translates to working code, the following class uses a property to trace access to an attribute named name; the actual stored data is named _name so it does not clash with the property:
class Person:                       # Use (object) in 2.6
    def __init__(self, name):
        self._name = name
    def getName(self):
        print('fetch...')
        return self._name
    def setName(self, value):
        print('change...')
        self._name = value
    def delName(self):
        print('remove...')
        del self._name
    name = property(getName, setName, delName, "name property docs")

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs getName
bob.name = 'Robert Smith'           # Runs setName
print(bob.name)
del bob.name                        # Runs delName

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
print(Person.name.__doc__)          # Or help(Person.name)
Properties are available in both 2.6 and 3.0, but they require new-style object derivation in 2.6 to work correctly for assignments: add object as a superclass here to run this in 2.6 (you can list the superclass in 3.0 too, but it’s implied and not required).
This particular property doesn’t do much—it simply intercepts and traces an attribute—but it serves to demonstrate the protocol. When this code is run, two instances inherit the property, just as they would any other attribute attached to their class. However, their attribute accesses are caught:
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name property docs
Like all class attributes, properties are inherited by both instances and lower subclasses. For example, if we change our code as follows:
class Super:
    ...the original Person class code...
    name = property(getName, setName, delName, 'name property docs')

class Person(Super):
    pass                            # Properties are inherited

bob = Person('Bob Smith')
...rest unchanged...
the output is the same: the Person subclass inherits the name property from Super, and the bob instance gets it from Person. In terms of inheritance, properties work the same as normal methods; because they have access to the self instance argument, they can access instance state information like methods, as the next section demonstrates.
The example in the prior section simply traces attribute accesses. Usually, though, properties do much more—computing the value of an attribute dynamically when fetched, for example. The following example illustrates:
class PropSquare:
    def __init__(self, start):
        self.value = start
    def getX(self):                 # On attr fetch
        return self.value ** 2
    def setX(self, value):          # On attr assign
        self.value = value
    X = property(getX, setX)        # No delete or docs

P = PropSquare(3)                   # 2 instances of class with property
Q = PropSquare(32)                  # Each has different state information

print(P.X)                          # 3 ** 2
P.X = 4
print(P.X)                          # 4 ** 2
print(Q.X)                          # 32 ** 2
This class defines an attribute X that is accessed as though it were static data, but really runs code to compute its value when fetched. The effect is much like an implicit method call. When the code is run, the value is stored in the instance as state information, but each time we fetch it via the managed attribute, its value is automatically squared:
9
16
1024
Notice that we’ve made two different instances. Because property methods automatically receive a self argument, they have access to the state information stored in instances. In our case, this means the fetch computes the square of the subject instance’s data.
Although we’re saving additional details until the next chapter, we introduced function decorator basics earlier, in Chapter 31. Recall that the function decorator syntax:
@decorator
def func(args): ...
is automatically translated to this equivalent by Python, to rebind the function name to the result of the decorator callable:
def func(args): ...
func = decorator(func)
Because of this mapping, it turns out that the property built-in can serve as a decorator, to define a function that will run automatically when an attribute is fetched:
class Person:
    @property
    def name(self): ...             # Rebinds: name = property(name)
When run, the decorated method is automatically passed to the first argument of the property built-in. This is really just alternative syntax for creating a property and rebinding the attribute name manually:
class Person:
    def name(self): ...
    name = property(name)
As of Python 2.6, property objects also have getter, setter, and deleter methods that assign the corresponding property accessor methods and return a copy of the property itself. We can use these to specify components of properties by decorating normal methods too, though the getter component is usually filled in automatically by the act of creating the property itself:
class Person:
    def __init__(self, name):
        self._name = name

    @property
    def name(self):                 # name = property(name)
        "name property docs"
        print('fetch...')
        return self._name

    @name.setter
    def name(self, value):          # name = name.setter(name)
        print('change...')
        self._name = value

    @name.deleter
    def name(self):                 # name = name.deleter(name)
        print('remove...')
        del self._name

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs name getter (name 1)
bob.name = 'Robert Smith'           # Runs name setter (name 2)
print(bob.name)
del bob.name                        # Runs name deleter (name 3)

print('-'*20)
sue = Person('Sue Jones')           # sue inherits property too
print(sue.name)
print(Person.name.__doc__)          # Or help(Person.name)
In fact, this code is equivalent to the first example in this section—decoration is just an alternative way to code properties in this case. When it’s run, the results are the same:
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name property docs
Compared to manual assignment of property results, in this case using decorators to code properties requires just three extra lines of code (a negligible difference). As is so often the case with alternative tools, the choice between the two techniques is largely subjective.
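The decorator form also suits the validation and transformation scenario that motivated this chapter. The following is only a sketch under assumed rules: the checks (a nonempty string) and the transformations (strip, lowercase on store, title case on fetch) are stand-ins chosen for illustration, since the valid and transform operations in the opening example were left abstract.

```python
class Person:
    def __init__(self, name):
        self.name = name                    # Runs the setter below

    @property
    def name(self):
        return self._name.title()           # Assumed transform on fetch

    @name.setter
    def name(self, value):
        # Assumed validation rule: names must be nonempty strings
        if not isinstance(value, str) or not value.strip():
            raise TypeError('cannot change name')
        self._name = value.strip().lower()  # Assumed normalization on store

bob = Person('  bob SMITH ')
print(bob.name)

try:
    bob.name = 42                           # Rejected by the setter
except TypeError as exc:
    print(exc)
```

Clients still use plain person.name syntax; the validation logic lives entirely in the class.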
Descriptors provide an alternative way to intercept attribute access; they are strongly related to the properties discussed in the prior section. In fact, a property is a kind of descriptor. Technically speaking, the property built-in is just a simplified way to create a specific type of descriptor that runs method functions on attribute accesses.
Functionally speaking, the descriptor protocol allows us to route a specific attribute’s get and set operations to methods of a separate class object that we provide: they provide a way to insert code to be run automatically on attribute access, and they allow us to intercept attribute deletions and provide documentation for the attributes if desired.
Descriptors are created as independent classes, and they are assigned to class attributes just like method functions. Like any other class attribute, they are inherited by subclasses and instances. Their access-interception methods are provided with both a self for the descriptor itself, and the instance of the client class. Because of this, they can retain and use state information of their own, as well as state information of the subject instance. For example, a descriptor may call methods available in the client class, as well as descriptor-specific methods it defines.
Like a property, a descriptor manages a single, specific attribute; although it can’t catch all attribute accesses generically, it provides control over both fetch and assignment accesses and allows us to change an attribute freely from simple data to a computation without breaking existing code. Properties really are just a convenient way to create a specific kind of descriptor, and as we shall see, they can be coded as descriptors directly.
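The claim that a property is itself a descriptor can be checked interactively. The short sketch below fetches the property object out of the class namespace dictionary and calls its __get__ method by hand; the descriptor method names are covered in detail in the sections that follow, and the Person class here is a pared-down assumption rather than the full earlier example.

```python
class Person:
    def __init__(self, name):
        self._name = name
    def getName(self):
        return self._name
    name = property(getName)

bob = Person('Bob Smith')
prop = Person.__dict__['name']              # The property object itself

# Property objects carry all three descriptor methods
print([hasattr(prop, m) for m in ('__get__', '__set__', '__delete__')])

# Calling __get__ manually gives the same result as bob.name
print(prop.__get__(bob, Person))
```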
Whereas properties are fairly narrow in scope, descriptors provide a more general solution. For instance, because they are coded as normal classes, descriptors have their own state, may participate in descriptor inheritance hierarchies, can use composition to aggregate objects, and provide a natural structure for coding internal methods and attribute documentation strings.
As mentioned previously, descriptors are coded as separate classes and provide specially named accessor methods for the attribute access operations they wish to intercept—get, set, and deletion methods in the descriptor class are automatically run when the attribute assigned to the descriptor class instance is accessed in the corresponding way:
class Descriptor:
    "docstring goes here"
    def __get__(self, instance, owner): ...        # Return attr value
    def __set__(self, instance, value): ...        # Return nothing (None)
    def __delete__(self, instance): ...            # Return nothing (None)
Classes with any of these methods are considered descriptors, and their methods are special when one of their instances is assigned to another class’s attribute: when the attribute is accessed, they are automatically invoked. If any of these methods are absent, it generally means that the corresponding type of access is not supported. Unlike with properties, however, omitting a __set__ allows the name to be redefined in an instance, thereby hiding the descriptor. To make an attribute read-only, you must define __set__ to catch assignments and raise an exception.
Before we code anything realistic, let’s take a brief look at some fundamentals. All three descriptor methods outlined in the prior section are passed both the descriptor class instance (self) and the instance of the client class to which the descriptor instance is attached (instance).
The __get__ access method additionally receives an owner argument, specifying the class to which the descriptor instance is attached. Its instance argument is either the instance through which the attribute was accessed (for instance.attr), or None when the attribute is accessed through the owner class directly (for class.attr). The former of these generally computes a value for instance access, and the latter usually returns self if descriptor object access is supported.
For example, in the following, when X.attr is fetched, Python automatically runs the __get__ method of the Descriptor class to which the Subject.attr class attribute is assigned (as with properties, in Python 2.6 we must derive from object to use descriptors here; in 3.0 this is implied, but doesn’t hurt):
>>> class Descriptor(object):
...     def __get__(self, instance, owner):
...         print(self, instance, owner, sep=' ')
...
>>> class Subject:
...     attr = Descriptor()          # Descriptor instance is class attr
...
>>> X = Subject()
>>> X.attr
<__main__.Descriptor object at 0x0281E690> <__main__.Subject object at 0x028289B0> <class '__main__.Subject'>
>>> Subject.attr
<__main__.Descriptor object at 0x0281E690> None <class '__main__.Subject'>
Notice the arguments automatically passed in to the __get__ method in the first attribute fetch. When X.attr is fetched, it’s as though the following translation occurs (though the Subject.attr here doesn’t invoke __get__ again):
X.attr -> Descriptor.__get__(Subject.attr, X, Subject)
The descriptor knows it is being accessed directly when its instance argument is None.
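A common way to act on that None test is to return the descriptor object itself for class-level access and compute a value only for instance access. The following is a minimal sketch of that convention; the Upper and Note classes and the _text attribute are hypothetical names introduced for this example.

```python
class Upper:
    "descriptor that upper-cases a stored string"
    def __get__(self, instance, owner):
        if instance is None:                # Fetched as Note.text
            return self                     # Expose the descriptor object
        return instance._text.upper()       # Fetched as n.text

class Note:
    text = Upper()
    def __init__(self, text):
        self._text = text

n = Note('spam')
print(n.text)                               # Instance access: computed value
print(Note.text is Note.__dict__['text'])   # Class access: descriptor itself
print(Note.text.__doc__)                    # Docstring reachable via class
```

Returning self for class access keeps the descriptor’s documentation and any helper methods reachable through the owner class.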
As mentioned earlier, unlike with properties, with descriptors simply omitting the __set__ method isn’t enough to make an attribute read-only, because the descriptor name can be assigned to an instance. In the following, the attribute assignment to X.a stores a in the instance object X, thereby hiding the descriptor stored in class C:
>>> class D:
...     def __get__(*args): print('get')
...
>>> class C:
...     a = D()
...
>>> X = C()
>>> X.a                              # Runs inherited descriptor __get__
get
>>> C.a
get
>>> X.a = 99                         # Stored on X, hiding C.a
>>> X.a
99
>>> list(X.__dict__.keys())
['a']
>>> Y = C()
>>> Y.a                              # Y still inherits descriptor
get
>>> C.a
get
This is the way all instance attribute assignments work in Python, and it allows classes to selectively override class-level defaults in their instances. To make a descriptor-based attribute read-only, catch the assignment in the descriptor class and raise an exception to prevent attribute assignment—when assigning an attribute that is a descriptor, Python effectively bypasses the normal instance-level assignment behavior and routes the operation to the descriptor object:
>>> class D:
...     def __get__(*args): print('get')
...     def __set__(*args): raise AttributeError('cannot set')
...
>>> class C:
...     a = D()
...
>>> X = C()
>>> X.a                              # Routed to C.a.__get__
get
>>> X.a = 99                         # Routed to C.a.__set__
AttributeError: cannot set
Also be careful not to confuse the descriptor __delete__ method with the general __del__ method. The former is called on attempts to delete the managed attribute name on an instance of the owner class; the latter is the general instance destructor method, run when an instance of any kind of class is about to be garbage collected. __delete__ is more closely related to the __delattr__ generic attribute deletion method we’ll meet later in this chapter. See Chapter 29 for more on operator overloading methods.
To see how this all comes together in more realistic code, let’s get started with the same first example we wrote for properties. The following defines a descriptor that intercepts access to an attribute named name in its clients. Its methods use their instance argument to access state information in the subject instance, where the name string is actually stored. Like properties, descriptors work properly only for new-style classes, so be sure to derive both classes in the following from object if you’re using 2.6:
class Name:                         # Use (object) in 2.6
    "name descriptor docs"
    def __get__(self, instance, owner):
        print('fetch...')
        return instance._name
    def __set__(self, instance, value):
        print('change...')
        instance._name = value
    def __delete__(self, instance):
        print('remove...')
        del instance._name

class Person:                       # Use (object) in 2.6
    def __init__(self, name):
        self._name = name
    name = Name()                   # Assign descriptor to attr

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs Name.__get__
bob.name = 'Robert Smith'           # Runs Name.__set__
print(bob.name)
del bob.name                        # Runs Name.__delete__

print('-'*20)
sue = Person('Sue Jones')           # sue inherits descriptor too
print(sue.name)
print(Name.__doc__)                 # Or help(Name)
Notice in this code how we assign an instance of our descriptor class to a class attribute in the client class; because of this, it is inherited by all instances of the class, just like a class’s methods. Really, we must assign the descriptor to a class attribute like this; it won’t work if assigned to a self instance attribute instead.
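The class-versus-instance requirement is easy to verify. In this minimal sketch (the ClassLevel and InstanceLevel class names are hypothetical), the same descriptor class is attached once at the class level and once on self; only the first arrangement triggers __get__.

```python
class Name:
    def __get__(self, instance, owner):
        return 'intercepted'

class ClassLevel:
    name = Name()                           # Class attribute: protocol applies

class InstanceLevel:
    def __init__(self):
        self.name = Name()                  # Instance attribute: plain object

a = ClassLevel()
b = InstanceLevel()
print(a.name)                               # Runs Name.__get__
print(type(b.name).__name__)                # Name object returned as is
```

In the second case the fetch simply returns the stored Name object, because Python looks for descriptor methods only on objects found in classes.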
When the descriptor’s __get__ method is run, it is passed three objects to define its context:
self is the Name class instance.
instance is the Person class instance.
owner is the Person class.
When this code is run, the descriptor’s methods intercept accesses to the attribute, much like the property version. In fact, the output is the same again, apart from the docstring text on the last line:
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
name descriptor docs
Also like in the property example, our descriptor class instance is a class attribute and thus is inherited by all instances of the client class and any subclasses. If we change the Person class in our example to the following, for instance, the output of our script is the same:
...
class Super:
    def __init__(self, name):
        self._name = name
    name = Name()

class Person(Super):                # Descriptors are inherited
    pass
...
Also note that when a descriptor class is not useful outside the client class, it’s perfectly reasonable to embed the descriptor’s definition inside its client syntactically. Here’s what our example looks like if we use a nested class:
class Person:
    def __init__(self, name):
        self._name = name

    class Name:                     # Using a nested class
        "name descriptor docs"
        def __get__(self, instance, owner):
            print('fetch...')
            return instance._name
        def __set__(self, instance, value):
            print('change...')
            instance._name = value
        def __delete__(self, instance):
            print('remove...')
            del instance._name
    name = Name()
When coded this way, Name becomes a local variable in the scope of the Person class statement, such that it won’t clash with any names outside the class. This version works the same as the original (we’ve simply moved the descriptor class definition into the client class’s scope), but the last line of the testing code must change to fetch the docstring from its new location:
...
print(Person.Name.__doc__) # Differs: not Name.__doc__ outside class
As was the case when using properties, our first descriptor example of the prior section didn’t do much—it simply printed trace messages for attribute accesses. In practice, descriptors can also be used to compute attribute values each time they are fetched. The following illustrates—it’s a rehash of the same example we coded for properties, which uses a descriptor to automatically square an attribute’s value each time it is fetched:
class DescSquare:
    def __init__(self, start):                  # Each desc has own state
        self.value = start
    def __get__(self, instance, owner):         # On attr fetch
        return self.value ** 2
    def __set__(self, instance, value):         # On attr assign
        self.value = value                      # No delete or docs

class Client1:
    X = DescSquare(3)               # Assign descriptor instance to class attr

class Client2:
    X = DescSquare(32)              # Another instance in another client class
                                    # Could also code 2 instances in same class
c1 = Client1()
c2 = Client2()

print(c1.X)                         # 3 ** 2
c1.X = 4
print(c1.X)                         # 4 ** 2
print(c2.X)                         # 32 ** 2
When run, the output of this example is the same as that of the original property-based version, but here a descriptor class object is intercepting the attribute accesses:
9
16
1024
If you study the two descriptor examples we’ve written so far, you might notice that they get their information from different places: the first (the name attribute example) uses data stored on the client instance, and the second (the attribute squaring example) uses data attached to the descriptor object itself. In fact, descriptors can use both instance state and descriptor state, or any combination thereof:
Descriptor state is used to manage data internal to the workings of the descriptor.
Instance state records information related to and possibly created by the client class.
Descriptor methods may use either, but descriptor state often makes it unnecessary to use special naming conventions to avoid name collisions for descriptor data stored on an instance. For example, the following descriptor attaches information to its own instance, so it doesn’t clash with that on the client class’s instance:
class DescState:                           # Use descriptor state
    def __init__(self, value):
        self.value = value
    def __get__(self, instance, owner):    # On attr fetch
        print('DescState get')
        return self.value * 10
    def __set__(self, instance, value):    # On attr assign
        print('DescState set')
        self.value = value

# Client class
class CalcAttrs:
    X = DescState(2)                       # Descriptor class attr
    Y = 3                                  # Class attr
    def __init__(self):
        self.Z = 4                         # Instance attr

obj = CalcAttrs()
print(obj.X, obj.Y, obj.Z)                 # X is computed, others are not
obj.X = 5                                  # X assignment is intercepted
obj.Y = 6
obj.Z = 7
print(obj.X, obj.Y, obj.Z)
This code’s value information lives only in the descriptor, so there won’t be a collision if the same name is used in the client’s instance. Notice that only the descriptor attribute is managed here: get and set accesses to X are intercepted, but accesses to Y and Z are not (Y is attached to the client class and Z to the instance). When this code is run, X is computed when fetched:
DescState get
20 3 4
DescState set
DescState get
50 6 7
It’s also feasible for a descriptor to store or use an attribute attached to the client class’s instance, instead of itself. Unlike data stored in the descriptor itself, this allows for data that can vary per client class instance. The descriptor in the following example assumes the instance has an attribute _Y attached by the client class, and uses it to compute the value of the attribute it represents:
class InstState:                           # Using instance state
    def __get__(self, instance, owner):
        print('InstState get')             # Assume set by client class
        return instance._Y * 100
    def __set__(self, instance, value):
        print('InstState set')
        instance._Y = value

# Client class
class CalcAttrs:
    X = DescState(2)                       # Descriptor class attr
    Y = InstState()                        # Descriptor class attr
    def __init__(self):
        self._Y = 3                        # Instance attr
        self.Z = 4                         # Instance attr

obj = CalcAttrs()
print(obj.X, obj.Y, obj.Z)                 # X and Y are computed, Z is not
obj.X = 5                                  # X and Y assignments intercepted
obj.Y = 6
obj.Z = 7
print(obj.X, obj.Y, obj.Z)
This time, X and Y are both assigned to descriptors and computed when fetched (X is assigned the descriptor of the prior example). The new descriptor here has no information itself, but it uses an attribute assumed to exist in the instance; that attribute is named _Y, to avoid collisions with the name of the descriptor itself. When this version is run the results are similar, but a second attribute is managed, using state that lives in the instance instead of the descriptor:
DescState get
InstState get
20 300 4
DescState set
InstState set
DescState get
InstState get
50 600 7
Both descriptor and instance state have roles. In fact, this is a general advantage that descriptors have over properties—because they have state of their own, they can easily retain data internally, without adding it to the namespace of the client instance object.
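One practical consequence of the two storage choices deserves a quick check: data kept on the descriptor is shared by every instance of the client class, whereas data kept on the instance varies per instance. The sketch below illustrates this with hypothetical Shared, PerInstance, and Client classes and an assumed _y instance attribute.

```python
class Shared:                               # State lives on the descriptor
    def __init__(self, value):
        self.value = value
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = value

class PerInstance:                          # State lives on client instances
    def __get__(self, instance, owner):
        return instance._y
    def __set__(self, instance, value):
        instance._y = value

class Client:
    X = Shared(1)                           # One descriptor object per class
    Y = PerInstance()
    def __init__(self):
        self._y = 1

a, b = Client(), Client()
a.X = 99                                    # Changes the shared descriptor
a.Y = 99                                    # Changes only a's own data
print(a.X, b.X)                             # Both instances see the change
print(a.Y, b.Y)                             # Only a sees the change
```

Descriptor state is thus best suited to data that is internal to the descriptor or common to all clients, not to per-object values.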
As mentioned earlier, properties and descriptors are strongly related: the property built-in is just a convenient way to create a descriptor. Now that you know how both work, you should also be able to see that it’s possible to simulate the property built-in with a descriptor class like the following:
class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget = fget
        self.fset = fset
        self.fdel = fdel                                  # Save unbound methods
        self.__doc__ = doc                                # or other callables

    def __get__(self, instance, instancetype=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("can't get attribute")
        return self.fget(instance)                        # Pass instance to self
                                                          # in property accessors
    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

class Person:
    def getName(self): ...
    def setName(self, value): ...
    name = Property(getName, setName)                     # Use like property()
This Property class catches attribute accesses with the descriptor protocol and routes requests to functions or methods passed in and saved in descriptor state when the class is created. Attribute fetches, for example, are routed from the Person class, to the Property class’s __get__ method, and back to the Person class’s getName. With descriptors, this “just works.”
Note that this descriptor class equivalent only handles basic property usage, though; to use @ decorator syntax to also specify set and delete operations, our Property class would also have to be extended with setter and deleter methods, which would save the decorated accessor function and return the property object (self should suffice). Since the property built-in already does this, we’ll omit a formal coding of this extension here.
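For readers who want to see the rough shape of such an extension anyway, here is one possible sketch, not a formal or complete coding. It follows the suggestion above by returning self from setter and deleter, which differs from the built-in (the built-in returns a copy of the property); the docstring fallback to fget.__doc__ is likewise an assumption added to mirror the built-in’s documented behavior.

```python
class Property:
    def __init__(self, fget=None, fset=None, fdel=None, doc=None):
        self.fget, self.fset, self.fdel = fget, fset, fdel
        if doc is None and fget is not None:
            doc = fget.__doc__              # Assumed fallback, as in built-in
        self.__doc__ = doc

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        if self.fget is None:
            raise AttributeError("can't get attribute")
        return self.fget(instance)

    def __set__(self, instance, value):
        if self.fset is None:
            raise AttributeError("can't set attribute")
        self.fset(instance, value)

    def __delete__(self, instance):
        if self.fdel is None:
            raise AttributeError("can't delete attribute")
        self.fdel(instance)

    def setter(self, fset):                 # Used as @name.setter
        self.fset = fset                    # Save decorated accessor
        return self                         # Rebind name to same descriptor

    def deleter(self, fdel):                # Used as @name.deleter
        self.fdel = fdel
        return self

class Person:
    def __init__(self, name):
        self._name = name

    @Property
    def name(self):                         # name = Property(name)
        "name property docs"
        return self._name

    @name.setter
    def name(self, value):                  # name = name.setter(name)
        self._name = value

    @name.deleter
    def name(self):                         # name = name.deleter(name)
        del self._name

bob = Person('Bob Smith')
bob.name = 'Robert Smith'
print(bob.name)
del bob.name
print(hasattr(bob, '_name'))
print(Person.name.__doc__)
```

Because setter and deleter mutate and return the same object here, a subclass that redecorates an inherited Property would also alter the superclass’s version; the built-in avoids this by returning a copy.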
Also note that descriptors are used to implement Python’s __slots__; instance attribute dictionaries are avoided by intercepting slot names with descriptors stored at the class level. See Chapter 31 for more on slots.
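The slot descriptors can be inspected directly. This short sketch assumes Python 3 and a hypothetical Point class; the exact type name of the slot object is an implementation detail (CPython reports member_descriptor), so the code prints it but does not rely on it.

```python
class Point:
    __slots__ = ['x', 'y']                  # No per-instance __dict__
    def __init__(self, x, y):
        self.x = x                          # Routed to slot descriptor __set__
        self.y = y

p = Point(1, 2)
slot = Point.__dict__['x']                  # Class-level object for slot x

print(type(slot).__name__)                  # Implementation-specific name
print(hasattr(slot, '__get__'), hasattr(slot, '__set__'))
print(slot.__get__(p, Point))               # Same value as p.x
print(hasattr(p, '__dict__'))               # Instances have no namespace dict
```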
In Chapter 38, we’ll also make use of descriptors to implement function decorators that apply to both functions and methods. As you’ll see there, because descriptors receive both descriptor and subject class instances they work well in this role, though nested functions are usually a simpler solution.
So far, we’ve studied properties and descriptors, tools for managing specific attributes. The __getattr__ and __getattribute__ operator overloading methods provide still other ways to intercept attribute fetches for class instances. Like properties and descriptors, they allow us to insert code to be run automatically when attributes are accessed; as we’ll see, though, these two methods can be used in more general ways.
Attribute fetch interception comes in two flavors, coded with two different methods:
__getattr__ is run for undefined attributes; that is, attributes not stored on an instance or inherited from one of its classes.
__getattribute__ is run for every attribute, so when using it you must be cautious to avoid recursive loops by passing attribute accesses to a superclass.
We met the former of these in Chapter 29; it’s available for all Python versions. The latter of these is available for new-style classes in 2.6, and for all (implicitly new-style) classes in 3.0. These two methods are representatives of a set of attribute interception methods that also includes __setattr__ and __delattr__. Because these methods have similar roles, we will generally treat them as a single topic here.
Unlike properties and descriptors, these methods are part of Python’s operator overloading protocol: specially named methods of a class, inherited by subclasses, and run automatically when instances are used in the implied built-in operation. Like all methods of a class, they each receive a first self argument when called, giving access to any required instance state information or other methods of the class.
The __getattr__ and __getattribute__ methods are also more generic than properties and descriptors: they can be used to intercept access to any (or even all) instance attribute fetches, not just the specific name to which they are assigned. Because of this, these two methods are well suited to general delegation-based coding patterns, where they can be used to implement wrapper objects that manage all attribute accesses for an embedded object. By contrast, we must define one property or descriptor for every attribute we wish to intercept.
Finally, these two methods are more narrowly focused than the alternatives we considered earlier: they intercept attribute fetches only, not assignments. To also catch attribute changes by assignment, we must code a __setattr__ method, an operator overloading method run for every attribute assignment, which must take care to avoid recursive loops by routing attribute assignments through the instance namespace dictionary.
Although much less common, we can also code a __delattr__
overloading method (which must
avoid looping in the same way) to intercept attribute deletions. By
contrast, properties and descriptors catch get, set,
and delete operations by design.
Most of these operator overloading methods were introduced earlier in the book; here, we’ll expand on their usage and study their roles in larger contexts.
__getattr__
and __setattr__
were introduced in Chapters
29 and 31, and __getattribute__
was mentioned
briefly in Chapter 31. In short, if a
class defines or inherits the following methods, they will be run
automatically when an instance is used in the context described by
the comments to the right:
def __getattr__(self, name):        # On undefined attribute fetch [obj.name]
def __getattribute__(self, name):   # On all attribute fetch [obj.name]
def __setattr__(self, name, value): # On all attribute assignment [obj.name=value]
def __delattr__(self, name):        # On all attribute deletion [del obj.name]
In all of these, self
is
the subject instance object as usual, name
is the string name of the attribute being accessed, and
value
is the object being
assigned to the attribute. The two get methods normally return an
attribute’s value, and the other two return nothing (None
). For example, to catch every
attribute fetch, we can use either of the first two methods above,
and to catch every attribute assignment we can use the third:
class Catcher:
    def __getattr__(self, name):
        print('Get:', name)
    def __setattr__(self, name, value):
        print('Set:', name, value)

X = Catcher()
X.job                               # Prints "Get: job"
X.pay                               # Prints "Get: pay"
X.pay = 99                          # Prints "Set: pay 99"
Such a coding structure can be used to implement the delegation design pattern we met earlier, in Chapter 30. Because all attributes are routed to our interception methods generically, we can validate and pass them along to embedded, managed objects. The following class (borrowed from Chapter 30), for example, traces every attribute fetch made to another object passed to the wrapper class:
class Wrapper:
    def __init__(self, object):
        self.wrapped = object                    # Save object
    def __getattr__(self, attrname):
        print('Trace:', attrname)                # Trace fetch
        return getattr(self.wrapped, attrname)   # Delegate fetch
There is no such analog for properties and descriptors, short of coding accessors for every possible attribute in every possibly wrapped object.
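To see the wrapper in action, the short sketch below repeats the Wrapper class so that it runs on its own, and then wraps a built-in list; the test values are arbitrary. Only named fetches of attributes that the wrapper itself lacks, such as append, are traced:

```python
class Wrapper:
    def __init__(self, object):
        self.wrapped = object                    # Save object
    def __getattr__(self, attrname):
        print('Trace:', attrname)                # Trace fetch
        return getattr(self.wrapped, attrname)   # Delegate fetch

x = Wrapper([1, 2, 3])          # Wrap a list
x.append(4)                     # Prints "Trace: append", then delegates the call
print(x.wrapped)                # 'wrapped' is a real attribute, so it is not traced
```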
These methods are generally straightforward to use; their only complex
part is the potential for looping (a.k.a.
recursing). Because __getattr__
is called for undefined attributes only, it can freely fetch other
attributes within its own code. However, because __getattribute__
and __setattr__
are run for all attributes,
their code needs to be careful when accessing other attributes to
avoid calling themselves again and triggering a recursive
loop.
For example, another attribute fetch run inside a __getattribute__ method’s code will trigger __getattribute__ again, and the code will recurse until Python’s recursion limit is reached and an exception is raised:
def __getattribute__(self, name):
    x = self.other                                # LOOPS!
To work around this, route the fetch through a higher
superclass instead to skip this level’s version—the object
class is always a superclass, and
it serves well in this role:
def __getattribute__(self, name):
    x = object.__getattribute__(self, 'other')    # Force higher to avoid me
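The loop and its fix can be observed directly. The following is a minimal sketch, assuming Python 3.5 or later, where the exception raised at the recursion limit is named RecursionError; the class names Loops and Safe are hypothetical:

```python
class Loops:
    def __getattribute__(self, name):
        return self.other                # Fetch inside the method: runs it again

class Safe:
    def __init__(self):
        self.other = 'spam'              # No __setattr__ here: normal assignment
    def __getattribute__(self, name):
        # Skip this level's version by calling the superclass method directly
        return object.__getattribute__(self, 'other')

try:
    Loops().anything
except RecursionError as exc:            # Raised when the recursion limit is hit
    print('looped:', type(exc).__name__)

print(Safe().anything)                   # Every name maps to 'other' in this sketch
```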
For __setattr__
, the
situation is similar; assigning any attribute inside this method
triggers __setattr__
again and
creates a similar loop:
def __setattr__(self, name, value):
    self.other = value                            # LOOPS!
To work around this problem, assign the attribute as a key
in the instance’s __dict__
namespace dictionary instead. This avoids direct attribute
assignment:
def __setattr__(self, name, value):
    self.__dict__['other'] = value                # Use attr dict to avoid me
Although it’s a less common approach, __setattr__
can also pass its own
attribute assignments to a higher superclass to avoid looping,
just like __getattribute__
:
def __setattr__(self, name, value):
    object.__setattr__(self, 'other', value)      # Force higher to avoid me
By contrast, though, we cannot use the
__dict__
trick to avoid loops
in __getattribute__
:
def __getattribute__(self, name):
    x = self.__dict__['other']                    # LOOPS!
Fetching the __dict__
attribute itself triggers __getattribute__
again, causing a
recursive loop. Strange but true!
The __delattr__
method is
rarely used in practice, but when it is, it is called for every
attribute deletion (just as __setattr__
is called for every
attribute assignment). Therefore, you must take care to avoid
loops when deleting attributes, by using the same techniques:
namespace dictionaries or superclass method calls.
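As an illustration of the superclass technique applied to both assignment and deletion, here is a small sketch; the class name Guarded and its attribute names are invented for the example:

```python
class Guarded:
    def __init__(self):
        self.__dict__['data'] = 1                # Dictionary key: skips __setattr__
    def __setattr__(self, name, value):
        print('set:', name)
        object.__setattr__(self, name, value)    # Superclass call avoids looping
    def __delattr__(self, name):
        print('del:', name)
        object.__delattr__(self, name)           # Same technique for deletions

g = Guarded()
g.extra = 2                              # Runs __setattr__
del g.extra                              # Runs __delattr__
print(hasattr(g, 'extra'), g.data)       # The deletion really happened
```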
All this is not nearly as complicated as the prior section may have implied. To see how to put these ideas to work, here is the same first example we used for properties and descriptors in action again, this time implemented with attribute operator overloading methods. Because these methods are so generic, we test attribute names here to know when a managed attribute is being accessed; others are allowed to pass normally:
class Person:
    def __init__(self, name):               # On [Person()]
        self._name = name                   # Triggers __setattr__!

    def __getattr__(self, attr):            # On [obj.undefined]
        if attr == 'name':                  # Intercept name: not stored
            print('fetch...')
            return self._name               # Does not loop: real attr
        else:                               # Others are errors
            raise AttributeError(attr)

    def __setattr__(self, attr, value):     # On [obj.any = value]
        if attr == 'name':
            print('change...')
            attr = '_name'                  # Set internal name
        self.__dict__[attr] = value         # Avoid looping here

    def __delattr__(self, attr):            # On [del obj.any]
        if attr == 'name':
            print('remove...')
            attr = '_name'                  # Avoid looping here too
        del self.__dict__[attr]             # but much less common

bob = Person('Bob Smith')           # bob has a managed attribute
print(bob.name)                     # Runs __getattr__
bob.name = 'Robert Smith'           # Runs __setattr__
print(bob.name)
del bob.name                        # Runs __delattr__

print('-'*20)
sue = Person('Sue Jones')           # sue has a managed attribute too
print(sue.name)
#print(Person.name.__doc__)         # No equivalent here
Notice that the attribute assignment in the __init__
constructor triggers __setattr__
too—this method catches every
attribute assignment, even those within the class itself. When this
code is run, the same output is produced, but this time it’s the
result of Python’s normal operator overloading mechanism and our
attribute interception methods:
fetch...
Bob Smith
change...
fetch...
Robert Smith
remove...
--------------------
fetch...
Sue Jones
Also note that, unlike with properties and descriptors, there’s no direct notion of specifying documentation for our attribute here; managed attributes exist within the code of our interception methods, not as distinct objects.
To achieve exactly the same results with __getattribute__
, replace __getattr__
in the example with the
following; because it catches all attribute fetches, this version
must be careful to avoid looping by passing new fetches to a
superclass, and it can’t generally assume unknown names are
errors:
# Replace __getattr__ with this

    def __getattribute__(self, attr):               # On [obj.any]
        if attr == 'name':                          # Intercept all names
            print('fetch...')
            attr = '_name'                          # Map to internal name
        return object.__getattribute__(self, attr)  # Avoid looping here
This example is equivalent to that coded for properties and descriptors, but it’s a bit artificial, and it doesn’t really highlight these tools in practice. Because they are generic, __getattr__ and __getattribute__ are probably more commonly used in delegation-based code (as sketched earlier), where attribute access is validated and routed to an embedded object. Where just a single attribute must be managed, properties and descriptors might do as well or better.
As before, our prior example doesn’t really do anything but trace
attribute fetches; it’s not much more work to compute an attribute’s
value when fetched. As for properties and descriptors, the following
creates a virtual attribute X
that runs a calculation when fetched:
class AttrSquare:
    def __init__(self, start):
        self.value = start                  # Triggers __setattr__!

    def __getattr__(self, attr):            # On undefined attr fetch
        if attr == 'X':
            return self.value ** 2          # value is not undefined
        else:
            raise AttributeError(attr)

    def __setattr__(self, attr, value):     # On all attr assignments
        if attr == 'X':
            attr = 'value'
        self.__dict__[attr] = value

A = AttrSquare(3)       # 2 instances of class with overloading
B = AttrSquare(32)      # Each has different state information

print(A.X)              # 3 ** 2
A.X = 4
print(A.X)              # 4 ** 2
print(B.X)              # 32 ** 2
Running this code results in the same output that we got earlier when using properties and descriptors, but this script’s mechanics are based on generic attribute interception methods:
9
16
1024
As before, we can achieve the same effect with __getattribute__
instead of __getattr__
; the following
replaces the fetch method with a __getattribute__
and changes the __setattr__
assignment method to avoid
looping by using direct superclass method calls instead of __dict__
keys:
class AttrSquare:
    def __init__(self, start):
        self.value = start                  # Triggers __setattr__!

    def __getattribute__(self, attr):       # On all attr fetches
        if attr == 'X':
            return self.value ** 2          # Triggers __getattribute__ again!
        else:
            return object.__getattribute__(self, attr)

    def __setattr__(self, attr, value):     # On all attr assignments
        if attr == 'X':
            attr = 'value'
        object.__setattr__(self, attr, value)
When this version is run, the results are the same again. Notice the implicit routing going on inside this class’s methods:
self.value=start inside the constructor triggers __setattr__
self.value inside __getattribute__ triggers __getattribute__ again
In fact, __getattribute__
is run twice each time we fetch attribute
X
. This doesn’t happen in the
__getattr__
version, because the
value
attribute is not undefined.
If you care about speed and want to avoid this, change __getattribute__
to use the superclass to
fetch value
as well:
def __getattribute__(self, attr):
    if attr == 'X':
        return object.__getattribute__(self, 'value') ** 2
    else:
        return object.__getattribute__(self, attr)    # Other names: normal fetch
Of course, this still incurs a call to the superclass method,
but not an additional recursive call before we get there. Add
print
calls to these methods to
trace how and when they run.
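One way to run that trace without reading printed output is to record each name the method receives. The sketch below is an assumption-laden variant of the example: the calls list and the AttrSquareDirect subclass are added purely for illustration, and __setattr__ is omitted to keep the focus on fetches:

```python
calls = []                                   # Records each name __getattribute__ sees

class AttrSquare:
    def __init__(self, start):
        self.value = start
    def __getattribute__(self, attr):
        calls.append(attr)
        if attr == 'X':
            return self.value ** 2           # Triggers __getattribute__ again
        return object.__getattribute__(self, attr)

class AttrSquareDirect(AttrSquare):
    def __getattribute__(self, attr):
        calls.append(attr)
        if attr == 'X':
            return object.__getattribute__(self, 'value') ** 2   # No extra call
        return object.__getattribute__(self, attr)

AttrSquare(3).X
print(calls)                                 # Two entries: 'X', then 'value'
del calls[:]
AttrSquareDirect(3).X
print(calls)                                 # One entry: 'X'
```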
To summarize the coding differences between __getattr__
and __getattribute__
, the following example
uses both to implement three attributes—attr1
is a class attribute, attr2
is an instance attribute, and
attr3
is a virtual managed
attribute computed when fetched:
class GetAttr:
    attr1 = 1
    def __init__(self):
        self.attr2 = 2
    def __getattr__(self, attr):            # On undefined attrs only
        print('get: ' + attr)               # Not attr1: inherited from class
        return 3                            # Not attr2: stored on instance

X = GetAttr()
print(X.attr1)
print(X.attr2)
print(X.attr3)
print('-'*40)

class GetAttribute(object):                 # (object) needed in 2.6 only
    attr1 = 1
    def __init__(self):
        self.attr2 = 2
    def __getattribute__(self, attr):       # On all attr fetches
        print('get: ' + attr)               # Use superclass to avoid looping here
        if attr == 'attr3':
            return 3
        else:
            return object.__getattribute__(self, attr)

X = GetAttribute()
print(X.attr1)
print(X.attr2)
print(X.attr3)
When run, the __getattr__
version intercepts only attr3
accesses, because it is undefined. The __getattribute__
version, on the other
hand, intercepts all attribute fetches and must route those it does
not manage to the superclass fetcher to avoid loops:
1
2
get: attr3
3
----------------------------------------
get: attr1
1
get: attr2
2
get: attr3
3
Although __getattribute__
can catch more attribute fetches than __getattr__
, in practice they are often
just variations on a theme—if attributes are not physically stored,
the two have the same effect.
To summarize the coding differences in all four attribute
management schemes we’ve seen in this chapter, let’s quickly step
through a more comprehensive computed-attribute example using each
technique. The following version uses properties to intercept and
calculate attributes named square
and cube
. Notice how their base
values are stored in names that begin with an underscore, so they
don’t clash with the names of the properties themselves:
# 2 dynamically computed attributes with properties

class Powers:
    def __init__(self, square, cube):
        self._square = square               # _square is the base value
        self._cube   = cube                 # square is the property name

    def getSquare(self):
        return self._square ** 2
    def setSquare(self, value):
        self._square = value
    square = property(getSquare, setSquare)

    def getCube(self):
        return self._cube ** 3
    cube = property(getCube)

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25
To do the same with descriptors, we define the attributes with complete classes. Note that these descriptors store base values as instance state, so they must use leading underscores again so as not to clash with the names of descriptors (as we’ll see in the final example of this chapter, we could avoid this renaming requirement by storing base values as descriptor state instead):
# Same, but with descriptors

class DescSquare:
    def __get__(self, instance, owner):
        return instance._square ** 2
    def __set__(self, instance, value):
        instance._square = value

class DescCube:
    def __get__(self, instance, owner):
        return instance._cube ** 3

class Powers:                               # Use (object) in 2.6
    square = DescSquare()
    cube   = DescCube()
    def __init__(self, square, cube):
        self._square = square               # "self.square = square" works too,
        self._cube   = cube                 # because it triggers desc __set__!

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25
To achieve the same result with __getattr__ fetch interception, we again store base values with underscore-prefixed names so that accesses to managed names are undefined and thus invoke our method; we also need to code a __setattr__ to intercept assignments, and take care to avoid its potential for looping:
# Same, but with generic __getattr__ undefined attribute interception

class Powers:
    def __init__(self, square, cube):
        self._square = square
        self._cube   = cube

    def __getattr__(self, name):
        if name == 'square':
            return self._square ** 2
        elif name == 'cube':
            return self._cube ** 3
        else:
            raise AttributeError('unknown attr:' + name)    # What hasattr/getattr expect

    def __setattr__(self, name, value):
        if name == 'square':
            self.__dict__['_square'] = value
        else:
            self.__dict__[name] = value

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25
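One detail worth keeping in mind for __getattr__-based code is the exception used for unknown names. Built-ins such as hasattr and three-argument getattr absorb AttributeError only, so a __getattr__ that raises it cooperates with them. The following trimmed-down sketch (a one-attribute variant of Powers, shortened for illustration) shows the effect:

```python
class Powers:
    def __init__(self, square):
        self._square = square
    def __getattr__(self, name):
        if name == 'square':
            return self._square ** 2
        raise AttributeError(name)               # Signals "no such attribute"

X = Powers(3)
print(hasattr(X, 'square'))                      # Managed name is reported as present
print(hasattr(X, 'cube'))                        # AttributeError absorbed by hasattr
print(getattr(X, 'cube', None))                  # Default is used instead of failing
```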
The final option, coding this with __getattribute__
, is similar to the prior
version. Because we catch every attribute now, though, we must route
base value fetches to a superclass to avoid looping:
# Same, but with generic __getattribute__ all attribute interception

class Powers:
    def __init__(self, square, cube):
        self._square = square
        self._cube   = cube

    def __getattribute__(self, name):
        if name == 'square':
            return object.__getattribute__(self, '_square') ** 2
        elif name == 'cube':
            return object.__getattribute__(self, '_cube') ** 3
        else:
            return object.__getattribute__(self, name)

    def __setattr__(self, name, value):
        if name == 'square':
            self.__dict__['_square'] = value
        else:
            self.__dict__[name] = value

X = Powers(3, 4)
print(X.square)      # 3 ** 2 = 9
print(X.cube)        # 4 ** 3 = 64
X.square = 5
print(X.square)      # 5 ** 2 = 25
As you can see, each technique takes a different form in code, but all four produce the same result when run:
9
64
25
For more on how these alternatives compare, and other coding options, stay tuned for a more realistic application of them in the attribute validation example in the section Example: Attribute Validations. First, though, we need to study a pitfall associated with two of these tools.
When I introduced __getattr__
and __getattribute__
, I stated that they
intercept undefined and all attribute fetches, respectively, which
makes them ideal for delegation-based coding patterns. While this is
true for normally named attributes, their behavior needs some
additional clarification: for method-name attributes implicitly
fetched by built-in operations, these methods may not be
run at all. This means that operator overloading method
calls cannot be delegated to wrapped objects unless wrapper classes
somehow redefine these methods themselves.
For example, attribute fetches for the __str__
, __add__
, and __getitem__
methods run implicitly by
printing, +
expressions, and
indexing, respectively, are not routed to the generic attribute
interception methods in 3.0. Specifically:
In Python 3.0, neither __getattr__
nor __getattribute__
is run for such
attributes.
In Python 2.6, __getattr__
is
run for such attributes if they are undefined in the
class.
In Python 2.6, __getattribute__
is available for
new-style classes only and works as it does in 3.0.
In other words, in Python 3.0 classes (and 2.6 new-style classes), there is no direct way to generically intercept built-in operations like printing and addition. In Python 2.X, the methods such operations invoke are looked up at runtime in instances, like all other attributes; in Python 3.0 such methods are looked up in classes instead.
This change makes delegation-based coding patterns more complex in 3.0, since they cannot generically intercept operator overloading method calls and route them to an embedded object. This is not a showstopper—wrapper classes can work around this constraint by redefining all relevant operator overloading methods in the wrapper itself, in order to delegate calls. These extra methods can be added either manually, with tools, or by definition in and inheritance from common superclasses. This does, however, make wrappers more work than they used to be when operator overloading methods are a part of a wrapped object’s interface.
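As a rough sketch of the manual workaround, the wrapper below keeps __getattr__ for named attributes and redefines a few operator overloading methods so that built-in operations reach the wrapped object in 3.X. The class name OpWrapper and the particular operators chosen are illustrative assumptions, not a complete list:

```python
class OpWrapper:
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __getattr__(self, name):                 # Named attributes: generic
        return getattr(self.wrapped, name)
    def __str__(self):                           # print() and str()
        return str(self.wrapped)
    def __len__(self):                           # len()
        return len(self.wrapped)
    def __getitem__(self, index):                # X[i]
        return self.wrapped[index]
    def __add__(self, other):                    # X + other
        return self.wrapped + other

x = OpWrapper([1, 2, 3])
x.append(4)                                      # Routed by __getattr__
print(x, len(x), x[0], x + [5])                  # Routed by the explicit methods
```

Each operator needs its own method here, which is the extra work the text describes.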
Keep in mind that this issue applies only to __getattr__
and __getattribute__
. Because properties and
descriptors are defined for specific attributes only, they don’t
really apply to delegation-based classes at all—a single property or
descriptor cannot be used to intercept arbitrary attributes.
Moreover, a class that defines both operator
overloading methods and attribute interception will work correctly,
regardless of the type of attribute interception defined. Our
concern here is only with classes that do not have operator
overloading methods defined, but try to intercept them
generically.
Consider the following example, the file getattr.py, which tests various attribute
types and built-in operations
on instances of classes containing __getattr__
and __getattribute__
methods:
class GetAttr:
    eggs = 88                    # eggs stored on class, spam on instance
    def __init__(self):
        self.spam = 77
    def __len__(self):           # len here, else __getattr__ called with __len__
        print('__len__: 42')
        return 42
    def __getattr__(self, attr): # Provide __str__ if asked, else dummy func
        print('getattr: ' + attr)
        if attr == '__str__':
            return lambda *args: '[Getattr str]'
        else:
            return lambda *args: None

class GetAttribute(object):      # object required in 2.6, implied in 3.0
    eggs = 88                    # In 2.6 all are isinstance(object) auto
    def __init__(self):          # But must derive to get new-style tools,
        self.spam = 77           # incl __getattribute__, some __X__ defaults
    def __len__(self):
        print('__len__: 42')
        return 42
    def __getattribute__(self, attr):
        print('getattribute: ' + attr)
        if attr == '__str__':
            return lambda *args: '[GetAttribute str]'
        else:
            return lambda *args: None

for Class in GetAttr, GetAttribute:
    print('\n' + Class.__name__.ljust(50, '='))

    X = Class()
    X.eggs                       # Class attr
    X.spam                       # Instance attr
    X.other                      # Missing attr
    len(X)                       # __len__ defined explicitly

    try:                         # New-styles must support [], +, call directly: redefine
        X[0]                     # __getitem__?
    except:
        print('fail []')

    try:
        X + 99                   # __add__?
    except:
        print('fail +')

    try:
        X()                      # __call__? (implicit via built-in)
    except:
        print('fail ()')
    X.__call__()                 # __call__? (explicit, not inherited)

    print(X.__str__())           # __str__? (explicit, inherited from type)
    print(X)                     # __str__? (implicit via built-in)
When run under Python 2.6, __getattr__
does
receive a variety of implicit attribute fetches for built-in
operations, because Python looks up such attributes in instances
normally. Conversely, __getattribute__
is
not run for any of the operator overloading
names, because such names are looked up in classes only:
C:\misc> c:\python26\python getattr.py
GetAttr===========================================
getattr: other
__len__: 42
getattr: __getitem__
getattr: __coerce__
getattr: __add__
getattr: __call__
getattr: __call__
getattr: __str__
[Getattr str]
getattr: __str__
[Getattr str]
GetAttribute======================================
getattribute: eggs
getattribute: spam
getattribute: other
__len__: 42
fail []
fail +
fail ()
getattribute: __call__
getattribute: __str__
[GetAttribute str]
<__main__.GetAttribute object at 0x025EA1D0>
Note how __getattr__
intercepts both implicit and explicit fetches of __call__
and __str__
in 2.6 here. By contrast, __getattribute__
fails to catch implicit
fetches of either attribute name for built-in operations.
Really, the __getattribute__
case is the same in 2.6
as it is in 3.0, because in 2.6 classes must be made new-style by
deriving from object
to use this
method. This code’s object
derivation is optional in 3.0 because all classes are
new-style.
When run under Python 3.0, though, results for __getattr__
differ—none of the implicitly run operator
overloading methods trigger either attribute
interception method when their attributes are fetched by built-in
operations. Python 3.0 skips the normal instance lookup mechanism
when resolving such names:
C:\misc> c:\python30\python getattr.py
GetAttr===========================================
getattr: other
__len__: 42
fail []
fail +
fail ()
getattr: __call__
<__main__.GetAttr object at 0x025D17F0>
<__main__.GetAttr object at 0x025D17F0>
GetAttribute======================================
getattribute: eggs
getattribute: spam
getattribute: other
__len__: 42
fail []
fail +
fail ()
getattribute: __call__
getattribute: __str__
[GetAttribute str]
<__main__.GetAttribute object at 0x025D1870>
We can trace these outputs back to prints in the script to see how this works:
__str__
access fails to
be caught twice by __getattr__
in 3.0: once for the
built-in print, and once for explicit fetches because a default
is inherited from the class (really, from the built-in object
, which is a superclass to every
class).
__str__
fails to be
caught only once by the __getattribute__
catchall, during the
built-in print operation; explicit fetches bypass the inherited
version.
__call__
fails to be
caught in both schemes in 3.0 for built-in call expressions, but
it is intercepted by both when fetched explicitly; unlike with
__str__
, there is no
inherited __call__
default to
defeat __getattr__
.
__len__ is caught by both classes, simply because it is an explicitly defined method in the classes themselves; its name is not routed to either __getattr__ or __getattribute__ in 3.0 if we delete the class’s __len__ methods.
All other built-in operations fail to be intercepted by both schemes in 3.0.
Again, the net effect is that operator overloading methods implicitly run by built-in operations are never routed through either attribute interception method in 3.0: Python 3.0 searches for such attributes in classes and skips instance lookup entirely.
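The class-only lookup can be seen without any interception methods at all. In the following minimal sketch for Python 3.X (the class name Plain is arbitrary), a __len__ attached to an instance is found by an explicit fetch but ignored by the len built-in:

```python
class Plain:
    pass

x = Plain()
x.__len__ = lambda: 42           # Attach the method to the instance only

print(x.__len__())               # Explicit fetch finds the instance attribute
try:
    len(x)                       # Built-in looks in the class, not the instance
except TypeError as exc:
    print('fail len:', exc)
```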
This makes delegation-based wrapper classes more difficult to code in 3.0—if wrapped classes may contain operator overloading methods, those methods must be redefined redundantly in the wrapper class in order to delegate to the wrapped object. In general delegation tools, this can add many extra methods.
Of course, the addition of such methods can be partly automated by tools that augment classes with new methods (the class decorators and metaclasses of the next two chapters might help here). Moreover, a superclass might be able to define all these extra methods once, for inheritance in delegation-based classes. Still, delegation coding patterns require extra work in 3.0.
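One possible shape for such automation is sketched below. The delegate_operators function is a hypothetical helper written for this illustration (it is not a standard library tool), and it assumes the wrapper stores its embedded object in an attribute named wrapped:

```python
def delegate_operators(*names):
    """Hypothetical class decorator: add methods that forward to self.wrapped."""
    def make(name):
        def method(self, *args):
            # Fetch the bound special method from the wrapped object and call it
            return getattr(self.wrapped, name)(*args)
        method.__name__ = name
        return method
    def decorate(cls):
        for name in names:
            setattr(cls, name, make(name))       # Store on the class, where built-ins look
        return cls
    return decorate

@delegate_operators('__str__', '__len__', '__getitem__', '__add__')
class Wrapper:
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __getattr__(self, name):                 # Named attributes still route here
        return getattr(self.wrapped, name)

x = Wrapper('spam')
print(x, len(x), x[0], x + '!', x.upper())
```

The list of operator names still has to be chosen by the programmer; the helper only removes the repetitive method definitions.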
For a more realistic illustration of this phenomenon as well
as its workaround, see the Private
decorator example in the following
chapter. There, we’ll see that it’s also possible to insert a __getattribute__
in the client class to
retain its original type, although this method still won’t be called
for operator overloading methods; printing still runs a __str__
defined in such a class directly,
for example, instead of routing the request through __getattribute__
.
As another example, the next section resurrects our class tutorial example. Now that you understand how attribute interception works, I’ll be able to explain one of its stranger bits.
For an example of this 3.0 change at work in Python itself,
see the discussion of the 3.0 os.popen
object in Chapter 14. Because it
is implemented with a wrapper that uses __getattr__
to delegate attribute
fetches to an embedded object, it does not intercept the next(X)
built-in iterator function in
Python 3.0, which is defined to run __next__
. It does, however, intercept
and delegate explicit X.__next__()
calls, because these are
not routed through the built-in and are not inherited from a
superclass like __str__
is.
This is equivalent to __call__
in our example—implicit calls
for built-ins do not trigger __getattr__
, but explicit calls to names
not inherited from the class type do. In other words, this change
impacts not only our delegators, but also those in the Python
standard library! Given the scope of this change, it’s possible
that this behavior may evolve in the future, so be sure to verify
this issue in later releases.
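The same effect can be reproduced with a small stand-in. The sketch below does not use os.popen itself; IterWrapper is a made-up class that delegates with __getattr__ to an ordinary list iterator, which is enough to show the difference between the built-in and the explicit call in Python 3.X:

```python
class IterWrapper:
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __getattr__(self, name):                 # Undefined names only
        return getattr(self.wrapped, name)

x = IterWrapper(iter([1, 2, 3]))
print(x.__next__())                              # Explicit fetch: delegated
try:
    next(x)                                      # Built-in skips __getattr__
except TypeError as exc:
    print('fail next:', exc)
```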
The object-oriented tutorial of Chapter 27 presented a Manager
class that used object embedding
and method delegation to customize its superclass, rather than
inheritance. Here is the code again for reference, with some
irrelevant testing removed:
class Person:
    def __init__(self, name, job=None, pay=0):
        self.name = name
        self.job  = job
        self.pay  = pay
    def lastName(self):
        return self.name.split()[-1]
    def giveRaise(self, percent):
        self.pay = int(self.pay * (1 + percent))
    def __str__(self):
        return '[Person: %s, %s]' % (self.name, self.pay)

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)      # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)      # Intercept and delegate
    def __getattr__(self, attr):
        return getattr(self.person, attr)           # Delegate all other attrs
    def __str__(self):
        return str(self.person)                     # Must overload again (in 3.0)

if __name__ == '__main__':
    sue = Person('Sue Jones', job='dev', pay=100000)
    print(sue.lastName())
    sue.giveRaise(.10)
    print(sue)
    tom = Manager('Tom Jones', 50000)    # Manager.__init__
    print(tom.lastName())                # Manager.__getattr__ -> Person.lastName
    tom.giveRaise(.10)                   # Manager.giveRaise -> Person.giveRaise
    print(tom)                           # Manager.__str__ -> Person.__str__
Comments at the end of this file show which methods are
invoked for a line’s operation. In particular, notice how lastName
calls are undefined in Manager
, and thus are routed into the
generic __getattr__
and from
there on to the embedded Person
object. Here is the script’s output—Sue receives a 10% raise from
Person
, but Tom gets 20% because
giveRaise
is customized in
Manager
:
C:\misc> c:\python30\python person.py
Jones
[Person: Sue Jones, 110000]
Jones
[Person: Tom Jones, 60000]
By contrast, though, notice what occurs when we
print a Manager
at the end of the script: the
wrapper class’s __str__
is
invoked, and it delegates to the embedded Person
object’s __str__
. With that in mind, watch what
happens if we delete the Manager.__str__
method in this
code:
# Delete the Manager __str__ method

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)      # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)      # Intercept and delegate
    def __getattr__(self, attr):
        return getattr(self.person, attr)           # Delegate all other attrs
Now printing does not route its attribute
fetch through the generic __getattr__
interceptor under Python 3.0
for Manager
objects. Instead, a
default __str__
display method
inherited from the class’s implicit object
superclass is looked up and run
(sue
still prints correctly,
because Person
has an explicit
__str__
):
C:\misc> c:\python30\python person.py
Jones
[Person: Sue Jones, 110000]
Jones
<__main__.Manager object at 0x02A5AE30>
Curiously, running without a __str__
like this
does trigger __getattr__
in Python 2.6, because
operator overloading attributes are routed through this method, and
classes do not inherit a default for __str__
:
C:\misc> c:\python26\python person.py
Jones
[Person: Sue Jones, 110000]
Jones
[Person: Tom Jones, 60000]
Switching to __getattribute__
won’t help 3.0 here
either—like __getattr__
, it is
not run for operator overloading attributes
implied by built-in operations in either Python 2.6 or 3.0:
# Replace __getattr__ with __getattribute__

class Manager:                                           # Use (object) in 2.6
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)           # Embed a Person object
    def giveRaise(self, percent, bonus=.10):
        self.person.giveRaise(percent + bonus)           # Intercept and delegate
    def __getattribute__(self, attr):
        print('**', attr)
        if attr in ['person', 'giveRaise']:
            return object.__getattribute__(self, attr)   # Fetch my attrs
        else:
            return getattr(self.person, attr)            # Delegate all others
Regardless of which attribute interception method is used in
3.0, we still must include a redefined __str__
in Manager
(as shown above) in order to
intercept printing operations and route them to the embedded
Person
object:
C:\misc> c:\python30\python person.py
Jones
[Person: Sue Jones, 110000]
** lastName
** person
Jones
** giveRaise
** person
<__main__.Manager object at 0x028E0590>
Notice that __getattribute__
gets called
twice here for methods—once for the method
name, and again for the self.person
embedded object fetch. We
could avoid that with a different coding, but we would still have to
redefine __str__
to catch
printing, albeit differently here (self.person
would cause this __getattribute__
to fail):
# Code __getattribute__ differently to minimize extra calls

class Manager:
    def __init__(self, name, pay):
        self.person = Person(name, 'mgr', pay)
    def __getattribute__(self, attr):
        print('**', attr)
        person = object.__getattribute__(self, 'person')
        if attr == 'giveRaise':
            return lambda percent: person.giveRaise(percent+.10)
        else:
            return getattr(person, attr)
    def __str__(self):
        person = object.__getattribute__(self, 'person')
        return str(person)
When this alternative runs, our object prints properly, but
only because we’ve added an explicit __str__
in the wrapper—this attribute is
still not routed to our generic attribute interception
method:
Jones
[Person: Sue Jones, 110000]
** lastName
Jones
** giveRaise
[Person: Tom Jones, 60000]
The short story here is that delegation-based classes like Manager must redefine some operator overloading methods (like __str__) to route them to embedded objects in Python 3.0, but not in Python 2.6 unless new-style classes are used. Our only direct options seem to be using __getattr__ and Python 2.6, or redefining operator overloading methods in wrapper classes redundantly in 3.0.
Again, this isn’t an impossible task; many wrappers can predict the set of operator overloading methods required, and tools and superclasses can automate part of this task. Moreover, not all classes use operator overloading methods (indeed, most application classes usually should not). It is, however, something to keep in mind for delegation coding models used in Python 3.0; when operator overloading methods are part of an object’s interface, wrappers must accommodate them portably by redefining them locally.
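As a minimal sketch of this point, assuming Python 3.X, the following contrasts a wrapper that delegates by name only with one that also redefines __str__ locally. The Person, BareWrapper, and PrintableWrapper classes here are hypothetical stand-ins, not the chapter's Manager code:

```python
class Person:
    def __init__(self, name, pay):
        self.name = name
        self.pay = pay
    def __str__(self):
        return '[Person: %s, %s]' % (self.name, self.pay)

class BareWrapper:
    """Delegates explicitly named attributes only."""
    def __init__(self, wrapped):
        self.wrapped = wrapped
    def __getattr__(self, attr):              # Undefined, explicitly named attrs
        return getattr(self.wrapped, attr)

class PrintableWrapper(BareWrapper):
    """Adds the operator overloading method that printing needs."""
    def __str__(self):                        # Redefined locally, routed to embedded object
        return str(self.wrapped)

sue = Person('Sue Jones', 110000)
bare = BareWrapper(sue)
full = PrintableWrapper(sue)

print(bare.name)                              # Named fetch: delegated by __getattr__
print(str(bare).startswith('<'))              # True: default display, not Person's
print(full)                                   # [Person: Sue Jones, 110000]
```

The same pattern would have to be repeated for any other operator overloading methods the wrapped object's interface relies on, which is why a shared superclass or tool can help.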
To close out this chapter, let’s turn to a more realistic example, coded in all four of our attribute management schemes. The example we will use defines a CardHolder object with four attributes, three of which are managed. The managed attributes validate or transform values when fetched or stored. All four versions produce the same results for the same test code, but they implement their attributes in very different ways. The examples are included largely for self-study; although I won’t go through their code in detail, they all use concepts we’ve already explored in this chapter.
Our first coding uses properties to manage three attributes. As usual, we could use simple methods instead of managed attributes, but properties help if we have been using attributes in existing code already. Properties run code automatically on attribute access, but are focused on a specific set of attributes; they cannot be used to intercept all attributes generically.
To understand this code, it’s crucial to notice that the attribute assignments inside the __init__ constructor method trigger property setter methods too. When this method assigns to self.name, for example, it automatically invokes the setName method, which transforms the value and assigns it to an instance attribute called __name so it won’t clash with the property’s name.
This renaming (sometimes called name mangling) is necessary because properties use common instance state and have none of their own. Data is stored in an attribute called __name, and the attribute called name is always a property, not data.
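To make the renaming concrete, here is a small sketch that inspects where the data actually lands. It uses a cut-down, single-attribute CardHolder written only for illustration, not the full class of this section:

```python
class CardHolder:
    def __init__(self, name):
        self.name = name                      # Triggers setName below
    def getName(self):
        return self.__name                    # Really self._CardHolder__name
    def setName(self, value):
        self.__name = value.lower().replace(' ', '_')
    name = property(getName, setName)

bob = CardHolder('Bob Smith')
print(bob.name)                               # bob_smith
print(sorted(vars(bob)))                      # ['_CardHolder__name']
print(isinstance(CardHolder.name, property))  # True: name lives in the class
```

The instance's namespace holds only the mangled key, while name itself remains a property object in the class.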
In the end, this class manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and computed on demand. For comparison purposes, this property-based coding weighs in at 39 lines of code:
class CardHolder:
    acctlen = 8                                # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                       # Instance data
        self.name = name                       # These trigger prop setters too
        self.age  = age                        # __X mangled to have class name
        self.addr = addr                       # addr is not managed
                                               # remain has no data
    def getName(self):
        return self.__name
    def setName(self, value):
        value = value.lower().replace(' ', '_')
        self.__name = value
    name = property(getName, setName)

    def getAge(self):
        return self.__age
    def setAge(self, value):
        if value < 0 or value > 150:
            raise ValueError('invalid age')
        else:
            self.__age = value
    age = property(getAge, setAge)

    def getAcct(self):
        return self.__acct[:-3] + '***'
    def setAcct(self, value):
        value = value.replace('-', '')
        if len(value) != self.acctlen:
            raise TypeError('invalid acct number')
        else:
            self.__acct = value
    acct = property(getAcct, setAcct)

    def remainGet(self):                       # Could be a method, not attr
        return self.retireage - self.age       # Unless already using as attr
    remain = property(remainGet)
The following code tests our class; add this to the bottom of your file, or place the class in a module and import it first. We’ll use this same testing code for all four versions of this example. When it runs, we make two instances of our managed-attribute class and fetch and change their various attributes. Operations expected to fail are wrapped in try statements:
bob = CardHolder('1234-5678', 'Bob Smith', 40, '123 main st')
print(bob.acct, bob.name, bob.age, bob.remain, bob.addr, sep=' / ')
bob.name = 'Bob Q. Smith'
bob.age  = 50
bob.acct = '23-45-67-89'
print(bob.acct, bob.name, bob.age, bob.remain, bob.addr, sep=' / ')

sue = CardHolder('5678-12-34', 'Sue Jones', 35, '124 main st')
print(sue.acct, sue.name, sue.age, sue.remain, sue.addr, sep=' / ')
try:
    sue.age = 200
except:
    print('Bad age for Sue')

try:
    sue.remain = 5
except:
    print("Can't set sue.remain")

try:
    sue.acct = '1234567'
except:
    print('Bad acct for Sue')
Here is the output of our self-test code; again, this is the same for all versions of this example. Trace through this code to see how the class’s methods are invoked; accounts are displayed with some digits hidden, names are converted to a standard format, and time remaining until retirement is computed when fetched using a class attribute cutoff:
12345*** / bob_smith / 40 / 19.5 / 123 main st
23456*** / bob_q._smith / 50 / 9.5 / 123 main st
56781*** / sue_jones / 35 / 24.5 / 124 main st
Bad age for Sue
Can't set sue.remain
Bad acct for Sue
Now, let’s recode our example using descriptors instead of properties. As we’ve seen, descriptors are very similar to properties in terms of functionality and roles; in fact, properties are basically a restricted form of descriptor. Like properties, descriptors are designed to handle specific attributes, not generic attribute access. Unlike properties, descriptors have their own state, and they’re a more general scheme.
To understand this code, it’s again important to notice that the attribute assignments inside the __init__ constructor method trigger descriptor __set__ methods. When the constructor method assigns to self.name, for example, it automatically invokes the Name.__set__() method, which transforms the value and assigns it to a descriptor attribute called name.
Unlike in the prior property-based variant, though, in this case the actual name value is attached to the descriptor object, not the client class instance. Although we could store this value in either instance or descriptor state, the latter avoids the need to mangle names with underscores to avoid collisions. In the CardHolder client class, the attribute called name is always a descriptor object, not data. The downside of this scheme is that state stored inside a descriptor itself is class-level data that is effectively shared by all client class instances, and so cannot vary between them.
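The sharing downside is easy to see in isolation. The sketch below is illustrative only: SharedName, InstanceName, Client, and the _name storage attribute are hypothetical names, and the second descriptor shows one possible way to keep a separate value per instance:

```python
class SharedName:
    """State kept on the descriptor itself: one value for the whole class."""
    def __get__(self, instance, owner):
        return self.value
    def __set__(self, instance, value):
        self.value = value

class InstanceName:
    """State kept on each client instance under a distinct (assumed) name."""
    def __get__(self, instance, owner):
        return instance._name
    def __set__(self, instance, value):
        instance._name = value

class Client:
    shared = SharedName()
    own = InstanceName()
    def __init__(self, name):
        self.shared = name                    # Runs SharedName.__set__
        self.own = name                       # Runs InstanceName.__set__

bob = Client('bob')
sue = Client('sue')
print(bob.shared, sue.shared)                 # sue sue: the last assignment wins
print(bob.own, sue.own)                       # bob sue: one value per instance
```

Storing on the instance brings back the need for a non-clashing attribute name, which is the trade-off the text describes.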
In the end, this class implements the same attributes as the prior version: it manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and computed on demand. Notice how we must catch assignments to the remain name in its descriptor and raise an exception; as we learned earlier, if we did not do this, assigning to this attribute of an instance would silently create an instance attribute that hides the class attribute descriptor. For comparison purposes, this descriptor-based coding takes 45 lines of code:
class CardHolder:
    acctlen = 8                                # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                       # Instance data
        self.name = name                       # These trigger __set__ calls too
        self.age  = age                        # __X not needed: in descriptor
        self.addr = addr                       # addr is not managed
                                               # remain has no data
    class Name:
        def __get__(self, instance, owner):    # Class names: CardHolder locals
            return self.name
        def __set__(self, instance, value):
            value = value.lower().replace(' ', '_')
            self.name = value
    name = Name()

    class Age:
        def __get__(self, instance, owner):
            return self.age                    # Use descriptor data
        def __set__(self, instance, value):
            if value < 0 or value > 150:
                raise ValueError('invalid age')
            else:
                self.age = value
    age = Age()

    class Acct:
        def __get__(self, instance, owner):
            return self.acct[:-3] + '***'
        def __set__(self, instance, value):
            value = value.replace('-', '')
            if len(value) != instance.acctlen:          # Use instance class data
                raise TypeError('invalid acct number')
            else:
                self.acct = value
    acct = Acct()

    class Remain:
        def __get__(self, instance, owner):
            return instance.retireage - instance.age    # Triggers Age.__get__
        def __set__(self, instance, value):
            raise TypeError('cannot set remain')        # Else set allowed here
    remain = Remain()
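The reason Remain needs a __set__ at all can be shown with a short sketch. The NoSet, ReadOnly, and Client classes are hypothetical, written only to contrast a descriptor that defines __get__ alone with one that also rejects assignment:

```python
class NoSet:
    """Descriptor with __get__ only: instance assignment is not blocked."""
    def __get__(self, instance, owner):
        return 'computed'

class ReadOnly:
    """Descriptor that also catches assignment and rejects it."""
    def __get__(self, instance, owner):
        return 'computed'
    def __set__(self, instance, value):
        raise TypeError('cannot set')

class Client:
    loose = NoSet()
    strict = ReadOnly()

c = Client()
c.loose = 'stored'                            # Silently creates an instance attribute
print(c.loose)                                # stored: the descriptor is now hidden
print(vars(c))                                # {'loose': 'stored'}

try:
    c.strict = 'stored'                       # Routed to ReadOnly.__set__
except TypeError as exc:
    print('rejected:', exc)                   # rejected: cannot set
print(c.strict)                               # computed
```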
As we’ve seen, the __getattr__ method intercepts all undefined attributes, so it can be more generic than using properties or descriptors. For our example, we simply test the attribute name to know when a managed attribute is being fetched; others are stored physically on the instance and so never reach __getattr__. Although this approach is more general than using properties or descriptors, extra work may be required to imitate the specific-attribute focus of other tools. We need to check names at runtime, and we must code a __setattr__ in order to intercept and validate attribute assignments.
As with the property and descriptor versions of this example, it’s critical to notice that the attribute assignments inside the __init__ constructor method trigger the class’s __setattr__ method too. When this method assigns to self.name, for example, it automatically invokes the __setattr__ method, which transforms the value and assigns it to an instance attribute called name. By storing name on the instance, it ensures that future accesses will not trigger __getattr__. In contrast, acct is stored as _acct, so that later accesses to acct do invoke __getattr__.
In the end, this class, like the prior two, manages attributes called name, age, and acct; allows the attribute addr to be accessed directly; and provides a read-only attribute called remain that is entirely virtual and is computed on demand.
For comparison purposes, this alternative comes in at 32 lines of code: 7 fewer than the property-based version, and 13 fewer than the version using descriptors. Clarity matters more than code size, of course, but extra code can sometimes imply extra development and maintenance work. Probably more important here are roles: generic tools like __getattr__ may be better suited to generic delegation, while properties and descriptors are more directly designed to manage specific attributes.
Also note that the code here incurs extra calls when setting unmanaged attributes (e.g., addr), although no extra calls are incurred for fetching unmanaged attributes, since they are defined. Though this will likely result in negligible overhead for most programs, properties and descriptors incur an extra call only when managed attributes are accessed.
Here’s the __getattr__ version of our code:
class CardHolder:
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __setattr__ too
        self.age  = age                          # _acct not mangled: name tested
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    def __getattr__(self, name):
        if name == 'acct':                       # On undefined attr fetches
            return self._acct[:-3] + '***'       # name, age, addr are defined
        elif name == 'remain':
            return self.retireage - self.age     # Doesn't trigger __getattr__
        else:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        if name == 'name':                       # On all attr assignments
            value = value.lower().replace(' ', '_')   # addr stored directly
        elif name == 'age':                      # acct mangled to _acct
            if value < 0 or value > 150:
                raise ValueError('invalid age')
        elif name == 'acct':
            name  = '_acct'
            value = value.replace('-', '')
            if len(value) != self.acctlen:
                raise TypeError('invalid acct number')
        elif name == 'remain':
            raise TypeError('cannot set remain')
        self.__dict__[name] = value              # Avoid looping
Our final variant uses the __getattribute__ catchall to intercept attribute fetches and manage them as needed. Every attribute fetch is caught here, so we test the attribute names to detect managed attributes and route all others to the superclass for normal fetch processing. This version uses the same __setattr__ to catch assignments as the prior version.
The code works very much like the __getattr__ version, so I won’t repeat the full description here. Note, though, that because every attribute fetch is routed to __getattribute__, we don’t need to mangle names to intercept them here (acct is stored as acct). On the other hand, this code must take care to route nonmanaged attribute fetches to a superclass to avoid looping.
Also notice that this version incurs extra calls for both setting and fetching unmanaged attributes (e.g., addr); if speed is paramount, this alternative may be the slowest of the bunch. For comparison purposes, this version amounts to 32 lines of code, just like the prior version:
class CardHolder:
    acctlen = 8                                  # Class data
    retireage = 59.5

    def __init__(self, acct, name, age, addr):
        self.acct = acct                         # Instance data
        self.name = name                         # These trigger __setattr__ too
        self.age  = age                          # acct not mangled: name tested
        self.addr = addr                         # addr is not managed
                                                 # remain has no data
    def __getattribute__(self, name):
        superget = object.__getattribute__       # Don't loop: one level up
        if name == 'acct':                       # On all attr fetches
            return superget(self, 'acct')[:-3] + '***'
        elif name == 'remain':
            return superget(self, 'retireage') - superget(self, 'age')
        else:
            return superget(self, name)          # name, age, addr: stored

    def __setattr__(self, name, value):
        if name == 'name':                       # On all attr assignments
            value = value.lower().replace(' ', '_')   # addr stored directly
        elif name == 'age':
            if value < 0 or value > 150:
                raise ValueError('invalid age')
        elif name == 'acct':
            value = value.replace('-', '')
            if len(value) != self.acctlen:
                raise TypeError('invalid acct number')
        elif name == 'remain':
            raise TypeError('cannot set remain')
        self.__dict__[name] = value              # Avoid loops, orig names
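To see why the superclass routing matters, the following sketch compares a __getattribute__ that fetches through self with one that goes through object. Loops and Safe are hypothetical classes, and catching RecursionError assumes Python 3.5 or later:

```python
class Loops:
    def __init__(self):
        self.value = 1
    def __getattribute__(self, name):
        return self.__dict__[name]            # self.__dict__ re-enters this method

class Safe:
    def __init__(self):
        self.value = 1
    def __getattribute__(self, name):
        return object.__getattribute__(self, name)   # Skip this class's version

try:
    Loops().value
except RecursionError:
    print('looped')                           # The fetch never completes

print(Safe().value)                           # 1
```

Even the innocent-looking self.__dict__ fetch is an attribute fetch, so it triggers the method again.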
Be sure to study and run this section’s code on your own for more pointers on managed attribute coding techniques.
This chapter covered the various techniques for managing access to attributes in Python, including the __getattr__ and __getattribute__ operator overloading methods, class properties, and attribute descriptors. Along the way, it compared and contrasted these tools and presented a handful of use cases to demonstrate their behavior.
Chapter 38 continues our tool-building survey with a look at decorators—code run automatically at function and class creation time, rather than on attribute access. Before we continue, though, let’s work through a set of questions to review what we’ve covered here.
How do __getattr__ and __getattribute__ differ?
How do properties and descriptors differ?
How are properties and decorators related?
What are the main functional differences between __getattr__ and __getattribute__ and properties and descriptors?
Isn’t all this feature comparison just a kind of argument?
The __getattr__ method is run for fetches of undefined attributes only (that is, those not present on an instance and not inherited from any of its classes). By contrast, the __getattribute__ method is called for every attribute fetch, whether the attribute is defined or not. Because of this, code inside a __getattr__ can freely fetch other attributes if they are defined, whereas __getattribute__ must use special code for all such attribute fetches to avoid looping (it must route fetches to a superclass to skip itself).
Properties serve a specific role, while descriptors are more general. Properties define get, set, and delete functions for a specific attribute; descriptors provide a class with methods for these actions, too, but they provide extra flexibility to support more arbitrary actions. In fact, properties are really a simple way to create a specific kind of descriptor—one that runs functions on attribute accesses. Coding differs too: a property is created with a built-in function, and a descriptor is coded with a class; as such, descriptors can leverage all the usual OOP features of classes, such as inheritance. Moreover, in addition to the instance’s state information, descriptors have local state of their own, so they can avoid name collisions in the instance.
Properties can be coded with decorator syntax. Because the property built-in accepts a single function argument, it can be used directly as a function decorator to define a fetch access property. Due to the name rebinding behavior of decorators, the name of the decorated function is assigned to a property whose get accessor is set to the original function decorated (name = property(name)). Property setter and deleter attributes allow us to further add set and delete accessors with decoration syntax: they set the accessor to the decorated function and return the augmented property.
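As a brief illustration of this answer, the sketch below codes one property with decorator syntax. The Person class, its _name storage attribute, and the title-casing in the setter are assumptions made up for the example:

```python
class Person:
    def __init__(self, name):
        self._name = name                     # Stored directly, bypassing the setter

    @property
    def name(self):                           # name = property(name)
        return self._name

    @name.setter
    def name(self, value):                    # name = name.setter(name)
        self._name = value.title()

    @name.deleter
    def name(self):                           # name = name.deleter(name)
        del self._name

bob = Person('bob smith')
print(bob.name)                               # bob smith
bob.name = 'robert smith'
print(bob.name)                               # Robert Smith
del bob.name
print(hasattr(bob, 'name'))                   # False: the getter now raises AttributeError
```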
The __getattr__ and __getattribute__ methods are more generic: they can be used to catch arbitrarily many attributes. In contrast, each property or descriptor provides access interception for only one specific attribute; we can’t catch every attribute fetch with a single property or descriptor. On the other hand, properties and descriptors handle both attribute fetch and assignment by design: __getattr__ and __getattribute__ handle fetches only; to intercept assignments as well, __setattr__ must also be coded. The implementation is also different: __getattr__ and __getattribute__ are operator overloading methods, whereas properties and descriptors are objects manually assigned to class attributes.
No it isn’t. To quote from Python namesake Monty Python’s Flying Circus:
An argument is a connected series of statements intended to establish a
proposition.
No it isn't.
Yes it is! It's not just contradiction.
Look, if I argue with you, I must take up a contrary position.
Yes, but that's not just saying "No it isn't."
Yes it is!
No it isn't!
Yes it is!
No it isn't. Argument is an intellectual process. Contradiction is just
the automatic gainsaying of any statement the other person makes.
(short pause)
No it isn't.
It is.
Not at all.
Now look...