Descriptors are classes which provide access control for the attributes of other classes. Any class that implements one or more of the descriptor special methods, __get__(), __set__(), and __delete__(), is called (and can be used as) a descriptor.
The built-in property() and classmethod() functions are implemented using descriptors. The key to understanding descriptors is that although we create an instance of a descriptor in a class as a class attribute, Python accesses the descriptor through the class’s instances.
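As a minimal sketch of that last point (the Loud and Greeter names are hypothetical and are not part of the book's examples), the descriptor below is created once, as a class attribute, yet its __get__() method receives whichever instance the attribute is read through:

```python
class Loud:
    """Hypothetical descriptor: returns an upper-cased copy of another attribute."""

    def __init__(self, attribute_name):
        self.attribute_name = attribute_name

    def __get__(self, instance, owner=None):
        if instance is None:          # accessed on the class, e.g. Greeter.shout
            return self
        return getattr(instance, self.attribute_name).upper()


class Greeter:
    shout = Loud("greeting")          # one descriptor object shared by all instances

    def __init__(self, greeting):
        self.greeting = greeting


a = Greeter("hello")
b = Greeter("goodbye")
print(a.shout, b.shout)               # HELLO GOODBYE
```

Neither instance stores a shout attribute; each lookup is routed to the single Loud object held by the class.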
To make things clear, let’s imagine that we have a class whose instances hold some strings. We want to access the strings in the normal way, for example, as a property, but we also want to get an XML-escaped version of the strings whenever we want. One simple solution would be that whenever a string is set we immediately create an XML-escaped copy. But if we had thousands of strings and only ever read the XML version of a few of them, we would be wasting a lot of processing and memory for nothing. So we will create a descriptor that will provide XML-escaped strings on demand without storing them. We will start with the beginning of the client (owner) class, that is, the class that uses the descriptor:
class Product:
    __slots__ = ("__name", "__description", "__price")
    name_as_xml = XmlShadow("name")
    description_as_xml = XmlShadow("description")
    def __init__(self, name, description, price):
        self.__name = name
        self.description = description
        self.price = price
The only code we have not shown is that of the properties; the name is a read-only property and the description and price are readable/writable properties, all set up in the usual way. (All the code is in the XmlShadow.py file.) We have used the __slots__ variable to ensure that the class has no __dict__ and can store only the three specified private attributes; this is not related to or necessary for our use of descriptors. The name_as_xml and description_as_xml class attributes are set to be instances of the XmlShadow descriptor. Although no Product object has a name_as_xml attribute or a description_as_xml attribute, thanks to the descriptor we can write code like this (here quoting from the module’s doctests):
>>> product = Product("Chisel <3cm>", "Chisel & cap", 45.25)
>>> product.name, product.name_as_xml, product.description_as_xml
('Chisel <3cm>', 'Chisel &lt;3cm&gt;', 'Chisel &amp; cap')
This works because when we try to access, for example, the name_as_xml attribute, Python finds that the Product class has a descriptor with that name, and so uses the descriptor to get the attribute’s value. Here’s the complete code for the XmlShadow descriptor class:
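import xml.sax.saxutils  # provides escape() for &, <, and >

class XmlShadow:
    def __init__(self, attribute_name):
        self.attribute_name = attribute_name
    def __get__(self, instance, owner=None):
        # Escape the current value on every access; nothing is stored.
        return xml.sax.saxutils.escape(
                getattr(instance, self.attribute_name))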
When the name_as_xml and description_as_xml objects are created we pass the name of the Product class’s corresponding attribute to the XmlShadow initializer so that the descriptor knows which attribute to work on. Then, when the name_as_xml or description_as_xml attribute is looked up, Python calls the descriptor’s __get__() method. The self argument is the instance of the descriptor, the instance argument is the Product instance (i.e., the product’s self), and the owner argument is the owning class (Product in this case). We use the getattr() function to retrieve the relevant attribute from the product (in this case the relevant property), and return an XML-escaped version of it.
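The pieces can be put together in a self-contained sketch. The property definitions below are assumptions, written "in the usual way" because the text does not show them; the rest follows the code above.

```python
import xml.sax.saxutils


class XmlShadow:
    def __init__(self, attribute_name):
        self.attribute_name = attribute_name

    def __get__(self, instance, owner=None):
        return xml.sax.saxutils.escape(
            getattr(instance, self.attribute_name))


class Product:
    __slots__ = ("__name", "__description", "__price")

    name_as_xml = XmlShadow("name")
    description_as_xml = XmlShadow("description")

    def __init__(self, name, description, price):
        self.__name = name
        self.description = description
        self.price = price

    # Assumed property definitions (not shown in the text).
    @property
    def name(self):
        return self.__name

    @property
    def description(self):
        return self.__description

    @description.setter
    def description(self, description):
        self.__description = description

    @property
    def price(self):
        return self.__price

    @price.setter
    def price(self, price):
        self.__price = price


product = Product("Chisel <3cm>", "Chisel & cap", 45.25)
print(product.name_as_xml)            # Chisel &lt;3cm&gt;
product.description = "Mallet > hammer"
print(product.description_as_xml)     # Mallet &gt; hammer
```

Because the escaped text is computed on each access, a change to the description is reflected immediately in description_as_xml.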
If the use case was that only a small proportion of the products were accessed for their XML strings, but the strings were often long and the same ones were frequently accessed, we could use a cache. For example:
class CachedXmlShadow:
    def __init__(self, attribute_name):
        self.attribute_name = attribute_name
        self.cache = {}
    def __get__(self, instance, owner=None):
        xml_text = self.cache.get(id(instance))
        if xml_text is not None:
            return xml_text
        return self.cache.setdefault(id(instance),
                xml.sax.saxutils.escape(
                        getattr(instance, self.attribute_name)))
We store the unique identity of the instance as the key rather than the instance itself because dictionary keys must be hashable (which IDs are), but we don’t want to impose that as a requirement on classes that use the CachedXmlShadow descriptor. The key is necessary because descriptors are created per class rather than per instance. (The dict.setdefault() method conveniently returns the value for the given key, or if no item with that key is present, creates a new item with the given key and value and returns the value.)
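A short sketch may help show why the cache needs a per-instance key. The Note class is hypothetical, and the descriptor is repeated so that the example is self-contained. The sketch also illustrates a consequence of caching as written: an entry is not refreshed if the underlying attribute changes later.

```python
import xml.sax.saxutils


class CachedXmlShadow:
    def __init__(self, attribute_name):
        self.attribute_name = attribute_name
        self.cache = {}

    def __get__(self, instance, owner=None):
        xml_text = self.cache.get(id(instance))
        if xml_text is not None:
            return xml_text
        return self.cache.setdefault(
            id(instance),
            xml.sax.saxutils.escape(getattr(instance, self.attribute_name)))


class Note:                                   # hypothetical owner class
    text_as_xml = CachedXmlShadow("text")

    def __init__(self, text):
        self.text = text


first = Note("a < b")
second = Note("b & c")
print(first.text_as_xml, second.text_as_xml)  # a &lt; b b &amp; c

# Fetch the descriptor from the class dictionary; Note.text_as_xml would
# call __get__() with instance set to None.
shadow = vars(Note)["text_as_xml"]
print(len(shadow.cache))                      # 2: one entry per instance

first.text = "changed"
print(first.text_as_xml)                      # a &lt; b (the cached value)
```

Two further points are worth keeping in mind with id()-based keys: entries remain in the dictionary after an instance is garbage-collected, and Python may give a later object the same id() as one whose lifetime has ended. A production version would probably need some way to invalidate or discard entries.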
Having seen descriptors used to generate data without necessarily storing it, we will now look at a descriptor that can be used to store all of an object’s attribute data, with the object not needing to store anything itself. In the example, we will just use a dictionary, but in a more realistic context, the data might be stored in a file or a database. Here’s the start of a modified version of the Point class that makes use of the descriptor (from the ExternalStorage.py file):
class Point:
    __slots__ = ()
    x = ExternalStorage("x")
    y = ExternalStorage("y")
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y
By setting __slots__ to an empty tuple we ensure that the class cannot store any data attributes at all. When self.x is assigned to, Python finds that there is a descriptor with the name “x”, and so uses the descriptor’s __set__() method. Here is the complete ExternalStorage descriptor class:
class ExternalStorage:
    __slots__ = ("attribute_name",)
    __storage = {}
    def __init__(self, attribute_name):
        self.attribute_name = attribute_name
    def __set__(self, instance, value):
        self.__storage[id(instance), self.attribute_name] = value
    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self.__storage[id(instance), self.attribute_name]
Each ExternalStorage object has a single data attribute, attribute_name, which holds the name of the owner class’s data attribute. Whenever an attribute is set we store its value in the private class dictionary, __storage. Similarly, whenever an attribute is retrieved we get it from the __storage dictionary.
As with all descriptor methods, self is the instance of the descriptor object and instance is the self of the object that contains the descriptor, so here self is an ExternalStorage object and instance is a Point object.
Although __storage is a class attribute, we can access it as self.__storage (just as we can call methods using self.method()), because Python will look for it as an instance attribute, and not finding it will then look for it as a class attribute. The one (theoretical) disadvantage of this approach is that if we have a class attribute and an instance attribute with the same name, one would hide the other. (If this were really a problem we could always refer to the class attribute using the class, that is, ExternalStorage.__storage. Although hard-coding the class does not play well with subclassing in general, it doesn’t really matter for private attributes since Python name-mangles the class name into them anyway.)
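The lookup and name-mangling behavior described above can be checked with a small sketch. The Registry class and its remember() method are hypothetical and exist only for illustration.

```python
class Registry:
    """Hypothetical class used only to illustrate the lookup rules."""

    __storage = {}                    # class attribute, mangled to _Registry__storage

    def remember(self, key, value):
        # self.__storage is not found on the instance, so Python falls back
        # to the class attribute; both spellings name the same dictionary.
        self.__storage[key] = value
        return self.__storage is Registry.__storage


registry = Registry()
print(registry.remember("x", 3))      # True
print(Registry._Registry__storage)    # {'x': 3}
print(vars(registry))                 # {} (nothing is stored on the instance)
```

Inside the class body, both self.__storage and Registry.__storage are rewritten to use the mangled name _Registry__storage, which is why hard-coding the class name costs nothing extra for private attributes.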
The implementation of the __get__() special method is slightly more sophisticated than before because we provide a means by which the ExternalStorage instance itself can be accessed. For example, if we have p = Point(3, 4), we can access the x-coordinate with p.x, and we can access the ExternalStorage object that holds all the xs with Point.x.
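Putting the two classes together gives a runnable sketch of the behavior just described. The code follows the listings above; only the usage lines at the end are new.

```python
class ExternalStorage:
    __slots__ = ("attribute_name",)
    __storage = {}                    # shared by every ExternalStorage object

    def __init__(self, attribute_name):
        self.attribute_name = attribute_name

    def __set__(self, instance, value):
        self.__storage[id(instance), self.attribute_name] = value

    def __get__(self, instance, owner=None):
        if instance is None:          # accessed through the class: Point.x
            return self
        return self.__storage[id(instance), self.attribute_name]


class Point:
    __slots__ = ()                    # instances hold no data of their own
    x = ExternalStorage("x")
    y = ExternalStorage("y")

    def __init__(self, x=0, y=0):
        self.x = x                    # routed to ExternalStorage.__set__()
        self.y = y


p = Point(3, 4)
q = Point(5, 6)
print(p.x, p.y, q.x)                  # 3 4 5
p.x = 10
print(p.x, q.x)                       # 10 5
print(type(Point.x).__name__)         # ExternalStorage
print(hasattr(p, "__dict__"))         # False
```

Each value is keyed by the pair (id(instance), attribute_name), so the single shared dictionary keeps the coordinates of different points, and the x and y of the same point, apart.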
To complete our coverage of descriptors we will create the Property descriptor that mimics the behavior of the built-in property() function, at least for setters and getters. The code is in Property.py. Here is the complete NameAndExtension class that makes use of it:
class NameAndExtension:
    def __init__(self, name, extension):
        self.__name = name
        self.extension = extension
    @Property               # Uses the custom Property descriptor
    def name(self):
        return self.__name
    @Property               # Uses the custom Property descriptor
    def extension(self):
        return self.__extension
    @extension.setter       # Uses the custom Property descriptor
    def extension(self, extension):
        self.__extension = extension
The usage is just the same as for the built-in @property decorator and for the @propertyName.setter decorator. Here is the start of the Property descriptor’s implementation:
class Property:
    def __init__(self, getter, setter=None):
        self.__getter = getter
        self.__setter = setter
        self.__name__ = getter.__name__
The class’s initializer takes one or two functions as arguments. If it is used as a decorator, it will get just the decorated function and this becomes the getter, while the setter is set to None. We use the getter’s name as the property’s name. So for each property, we have a getter, possibly a setter, and a name.
    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self.__getter(instance)
When a property is accessed we return the result of calling the getter function where we have passed the instance as its first parameter. At first sight, self.__getter() looks like a method call, but it is not. In fact, self.__getter is an attribute, one that happens to hold an object reference to a method that was passed in. So what happens is that first we retrieve the attribute (self.__getter), and then we call it as a function (). And because it is called as a function rather than as a method we must pass in the relevant self object explicitly ourselves. And in the case of a descriptor the self object (from the class that is using the descriptor) is called instance (since self is the descriptor object). The same applies to the __set__() method.
    def __set__(self, instance, value):
        if self.__setter is None:
            raise AttributeError("'{0}' is read-only".format(
                                 self.__name__))
        return self.__setter(instance, value)
If no setter has been specified, we raise an AttributeError; otherwise, we call the setter with the instance and the new value.
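    def setter(self, setter):
        self.__setter = setter
        return self
This method is called when the interpreter reaches, for example, @extension.setter, with the function it decorates as its setter argument. It stores the setter function it has been given (which can now be used in the __set__() method), and returns the descriptor itself rather than the bare function. This matters because whatever a decorator returns is bound to the decorated name: returning the setter function would rebind extension in the class to a plain function and discard the Property, whereas returning self keeps extension bound to the descriptor, just as the built-in property's setter() method returns a property object.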
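The fragments can be assembled into one runnable sketch. It assumes that setter() returns the descriptor itself, so that the decorated name stays bound to a Property, as happens with the built-in property. The usage lines at the end are illustrative and are not taken from Property.py.

```python
class Property:
    def __init__(self, getter, setter=None):
        self.__getter = getter
        self.__setter = setter
        self.__name__ = getter.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        return self.__getter(instance)

    def __set__(self, instance, value):
        if self.__setter is None:
            raise AttributeError("'{0}' is read-only".format(self.__name__))
        return self.__setter(instance, value)

    def setter(self, setter):
        self.__setter = setter
        return self       # assumption: keep the class attribute bound to the descriptor


class NameAndExtension:
    def __init__(self, name, extension):
        self.__name = name
        self.extension = extension    # goes through Property.__set__()

    @Property
    def name(self):
        return self.__name

    @Property
    def extension(self):
        return self.__extension

    @extension.setter
    def extension(self, extension):
        self.__extension = extension


item = NameAndExtension("report", "txt")
item.extension = "md"
print(item.name, item.extension)                  # report md
print(type(NameAndExtension.extension).__name__)  # Property

try:
    item.name = "summary"                         # no setter was supplied
except AttributeError as err:
    print(err)                                    # 'name' is read-only
```

Since Property defines __set__(), it is a data descriptor, so assignments such as item.extension = "md" always go through the descriptor and never create an instance attribute called extension.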
We have now looked at three quite different uses of descriptors. Descriptors are a very powerful and flexible feature that can be used to do lots of under-the-hood work while appearing to be simple attributes in their client (owner) class.