ABCs of collections

The collections.abc module provides a wealth of ABCs that decompose collections into a number of discrete feature sets. A related set of features of a class is called a protocol. For example, getting, setting, and deleting items make up the protocol for list-like behavior. Similarly, the __iter__() method is part of the protocol for defining an iterable collection. A list implements both protocols, but some data structures support fewer protocols. Type checkers such as mypy often rely on support for a given protocol to determine whether an object is being used properly.

We can successfully use the list class without thinking too deeply about the various features and how they relate to the set class or the dict class. Once we start looking at the ABCs, however, we can see that there's a bit of subtlety to these classes. By decomposing the aspects of each collection, we can see areas of overlap that manifest themselves as an elegant polymorphism, even among different data structures.
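As a small illustration of this overlap, the following sketch checks three built-in collections against a few of the ABCs. The sample values and variable names are made up for the example.

```python
from collections.abc import Container, Iterable, Mapping, Sequence, Set, Sized

# Three different built-in data structures (sample values are arbitrary).
samples = [[1, 2, 3], {1, 2, 3}, {1: "a", 2: "b"}]

# All three support the same core protocols...
shared = all(
    isinstance(obj, abc)
    for obj in samples
    for abc in (Container, Iterable, Sized)
)

# ...so len() and the in operator work uniformly on each of them.
lengths = [len(obj) for obj in samples]
membership = [2 in obj for obj in samples]

# Each one matches a different composite ABC.
kinds = [
    next(abc.__name__ for abc in (Sequence, Set, Mapping) if isinstance(obj, abc))
    for obj in samples
]
```

The same len() and in expressions apply to the list, the set, and the dict, even though each one belongs to a different family of collections.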

At the base of the class hierarchy are some definitions of the core protocols for collections.

Each of these base classes typically requires a single special method:

  • The Container base class requires the concrete class to implement the __contains__() method. This special method implements the in operator.
  • The Iterable base class requires __iter__(). This special method is used by the for statement and the generator expressions as well as the iter() function.
  • The Sized base class requires __len__(). This method is used by the len() function. It's also prudent to implement __bool__(), but it's not required by this ABC.
  • The Hashable base class requires __hash__(). This is used by the hash() function. If this is implemented, it generally signals that the object is immutable, or at least that the values used for hashing and equality comparison never change.
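The following is a minimal sketch of how these four base classes recognize a class by its special methods. The Deck class and its contents are hypothetical. It does not inherit from any ABC, yet the isinstance() checks succeed because these ABCs look only for the required method.

```python
from collections.abc import Container, Hashable, Iterable, Sized


class Deck:
    """Hypothetical collection that defines three special methods directly."""

    def __init__(self, cards):
        self._cards = list(cards)

    def __contains__(self, card):
        # Supports the in operator.
        return card in self._cards

    def __iter__(self):
        # Supports the for statement and the iter() function.
        return iter(self._cards)

    def __len__(self):
        # Supports the len() function.
        return len(self._cards)


deck = Deck(["AS", "KH", "2C"])

checks = {
    "Container": isinstance(deck, Container),
    "Iterable": isinstance(deck, Iterable),
    "Sized": isinstance(deck, Sized),
    # Deck inherits the default __hash__() from object, so this is also True.
    "Hashable": isinstance(deck, Hashable),
}

# A list is mutable and sets __hash__ to None, so it is not Hashable.
list_is_hashable = isinstance([], Hashable)
```

Every entry in checks is True, while list_is_hashable is False.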

Each of these abstract class definitions is used to build the higher-level, composite definitions of structures we can use in our applications. These composite constructs include the lower-level base classes of Sized, Iterable, and Container. Here are some composite base classes that we might use in an application:

  • The Sequence and MutableSequence classes build on the basics. Sequence includes methods such as index() and count(), and MutableSequence adds methods such as reverse(), extend(), and remove().
  • The Mapping and MutableMapping classes include methods such as keys(), items(), values(), and get(), among others.
  • The Set and MutableSet classes include comparison and arithmetic operators to perform set operations.
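To see how a composite base class supplies methods on top of the core protocols, here is a sketch of a read-only sequence. The Countdown class is a hypothetical example. It defines only __len__() and __getitem__(), and the Sequence ABC provides index(), count(), __contains__(), __iter__(), and __reversed__().

```python
from collections.abc import MutableSequence, Sequence


class Countdown(Sequence):
    """Hypothetical read-only sequence of the values start, start-1, ..., 1."""

    def __init__(self, start):
        self._start = start

    def __len__(self):
        return self._start

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Resolve the slice against our length and return a plain list.
            return [self[i] for i in range(*index.indices(len(self)))]
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            # The inherited iteration and index() rely on IndexError to stop.
            raise IndexError("Countdown index out of range")
        return self._start - index


c = Countdown(5)

as_list = list(c)                 # __iter__() is inherited from Sequence
position = c.index(2)             # index() is inherited
occurrences = c.count(4)          # count() is inherited
has_three = 3 in c                # __contains__() is inherited
backwards = list(reversed(c))     # __reversed__() is inherited

# The mutating methods belong to MutableSequence, which we did not extend.
is_mutable = isinstance(c, MutableSequence)
```

Extending MutableSequence instead would require __setitem__(), __delitem__(), and insert() as well, and would supply methods such as append(), extend(), and remove() in return.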

If we look more deeply into the built-in collections, we can see how the ABC definitions serve to organize the special methods that we need to write or modify.

The collections module also contains three concrete implementations: UserDict, UserList and UserString. UserDict is a version of the built-in dictionary, with the details exposed. Similarly, UserList and UserString provide implementations that can be extended through subclasses. These can be helpful to see how a collection is built. In older versions of Python, these were used as superclasses and were extended because the built-in types could not easily be extended. In Python 3, the built-in types are trivially extended: these are rarely used except as example code.
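As a brief illustration of what "details exposed" means, here is a sketch that extends UserDict. The LowerKeyDict class is hypothetical. The wrapped dictionary is available as the data attribute, and the inherited update() method and the constructor both route through our __setitem__().

```python
from collections import UserDict


class LowerKeyDict(UserDict):
    """Hypothetical mapping that stores every string key in lowercase."""

    def __setitem__(self, key, value):
        # UserDict keeps its contents in the self.data dictionary.
        super().__setitem__(key.lower(), value)


d = LowerKeyDict({"Alpha": 1})   # the constructor goes through __setitem__()
d.update(Beta=2)                 # so does the inherited update()
d["GAMMA"] = 3

stored = d.data                  # the underlying plain dict
```

After these statements, stored holds the keys "alpha", "beta", and "gamma".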

Let's take a look at some examples of special methods in the next section.
