This chapter presents the list and dictionary object types, both of which are collections of other objects. These two types are the main workhorses in almost all Python scripts. As you’ll see, both types are remarkably flexible: they can be changed in-place, can grow and shrink on demand, and may contain and be nested in any other kind of object. By leveraging these types, you can build up and process arbitrarily rich information structures in your scripts.
The next stop on our built-in object tour is the Python list. Lists are Python’s most flexible ordered collection object type. Unlike strings, lists can contain any sort of object: numbers, strings, and even other lists. Also, unlike strings, lists may be changed in-place by assignment to offsets and slices, list method calls, deletion statements, and more—they are mutable objects.
Python lists do the work of most of the collection data structures you might have to implement manually in lower-level languages such as C. Here is a quick look at their main properties. Python lists are:
From a functional view, lists are just places to collect other objects so you can treat them as groups. Lists also maintain a left-to-right positional ordering among the items they contain (i.e., they are sequences).
Just as with strings, you can fetch a component object out of a list by indexing the list on the object’s offset. Because items in lists are ordered by their positions, you can also do tasks such as slicing and concatenation.
Unlike strings, lists can grow and shrink in-place (their lengths can vary), and they can contain any sort of object, not just one-character strings (they’re heterogeneous). Because lists can contain other complex objects, they also support arbitrary nesting; you can create lists of lists of lists, and so on.
In terms of our type category qualifiers, lists are mutable (i.e., can be changed in-place) and can respond to all the sequence operations used with strings, such as indexing, slicing, and concatenation. In fact, sequence operations work the same on lists as they do on strings; the only difference is that sequence operations such as concatenation and slicing return new lists instead of new strings when applied to lists. Because lists are mutable, however, they also support other operations that strings don’t (such as deletion and index assignment operations, which change the lists in-place).
Technically, Python lists contain zero or more references to other objects. Lists might remind you of arrays of pointers (addresses) if you have a background in some other languages. Fetching an item from a Python list is about as fast as indexing a C array; in fact, lists really are arrays inside the standard Python interpreter, not linked structures. As we learned in Chapter 6, though, Python always follows a reference to an object whenever the reference is used, so your program deals only with objects. Whenever you assign an object to a data structure component or variable name, Python always stores a reference to that same object, not a copy of it (unless you request a copy explicitly).
Table 8-1
summarizes common and representative list object operations. As usual,
for the full story see the Python standard library manual, or run a
help(list)
or dir(list)
call interactively for a complete
list of list methods—you can pass in a real list, or the word list
, which is the name of the list data
type.
Operation | Interpretation |
| An empty list |
| Four items: indexes 0..3 |
| Nested sublists |
| Lists of an iterable’s items, list of successive integers |
| Index, index of index, slice, length |
| Concatenate, repeat |
| Iteration, membership |
| Methods: growing |
| Methods: searching |
| Methods: sorting, reversing, etc. |
| Methods, statement: shrinking |
| Index assignment, slice assignment |
|
When written down as a literal expression, a list is coded as a series of objects (really,
expressions that return objects) in square brackets, separated by commas. For instance, the
second row in Table 8-1
assigns the variable L
to a
four-item list. A nested list is coded as a nested square-bracketed
series (row 3), and the empty list is just a square-bracket pair with
nothing inside (row 1).[21]
Many of the operations in Table 8-1 should look familiar, as they are the same sequence operations we put to work on strings—indexing, concatenation, iteration, and so on. Lists also respond to list-specific method calls (which provide utilities such as sorting, reversing, adding items to the end, etc.), as well as in-place change operations (deleting items, assignment to indexes and slices, and so forth). Lists have these tools for change operations because they are a mutable object type.
Perhaps the best way to understand lists is to see them at work. Let’s once again turn to some simple interpreter interactions to illustrate the operations in Table 8-1.
Because they are sequences, lists support many of the same
operations as strings. For example, lists respond to the +
and *
operators much like strings—they mean
concatenation and repetition here too, except that the result is a
new list, not a string:
%python
>>>len([1, 2, 3])
# Length 3 >>>[1, 2, 3] + [4, 5, 6]
# Concatenation [1, 2, 3, 4, 5, 6] >>>['Ni!'] * 4
# Repetition ['Ni!', 'Ni!', 'Ni!', 'Ni!']
Although the +
operator
works the same for lists and strings, it’s important to know that it
expects the same sort of sequence on both sides—otherwise, you get a
type error when the code runs. For instance, you cannot concatenate
a list and a string unless you first convert the list to a string
(using tools such as str
or
%
formatting) or convert the
string to a list (the list
built-in function does the trick):
>>>str([1, 2]) + "34"
# Same as "[1, 2]" + "34" '[1, 2]34' >>>[1, 2] + list("34")
# Same as [1, 2] + ["3", "4"] [1, 2, '3', '4']
More generally, lists respond to all the sequence operations we used on strings in the prior chapter, including iteration tools:
>>>3 in [1, 2, 3]
# Membership True >>>for x in [1, 2, 3]:
...print(x, end=' ')
# Iteration ... 1 2 3
We will talk more formally about for
iteration and the range
built-ins in Chapter 13, because they are related to
statement syntax. In short, for
loops step through items in any sequence from left to right,
executing one or more statements for each item.
The last items in Table 8-1, list
comprehensions and map
calls, are
covered in more detail in Chapter 14 and expanded on
in Chapter 20. Their
basic operation is straightforward, though—as introduced in Chapter 4, list comprehensions
are a way to build a new list by applying an expression to each item
in a sequence, and are close relatives to for
loops:
>>>res =
[c * 4 for c in 'SPAM']
# List comprehensions >>>res
['SSSS', 'PPPP', 'AAAA', 'MMMM']
This expression is functionally equivalent to a for
loop that builds up a list of results
manually, but as we’ll learn in later chapters, list comprehensions
are simpler to code and faster to run today:
>>>res = []
>>>for c in 'SPAM':
# List comprehension equivalent ...res.append(c * 4)
... >>>res
['SSSS', 'PPPP', 'AAAA', 'MMMM']
As also introduced in Chapter 4, the map
built-in function does similar work, but
applies a function to items in a sequence and collects all the
results in a new list:
>>> list(map(abs, [−1, −2, 0, 1, 2]))
# map function across sequence
[1, 2, 0, 1, 2]
Because we’re not quite ready for the full iteration story, we’ll postpone further details for now, but watch for a similar comprehension expression for dictionaries later in this chapter.
Because lists are sequences, indexing and slicing work the same way for lists as they do for strings. However, the result of indexing a list is whatever type of object lives at the offset you specify, while slicing a list always returns a new list:
>>>L = ['spam', 'Spam', 'SPAM!']
>>>L[2]
# Offsets start at zero 'SPAM!' >>>L[−2]
# Negative: count from the right 'Spam' >>>L[1:]
# Slicing fetches sections ['Spam', 'SPAM!']
One note here: because you can nest lists and other object types within lists, you will sometimes need to string together index operations to go deeper into a data structure. For example, one of the simplest ways to represent matrixes (multidimensional arrays) in Python is as lists with nested sublists. Here’s a basic 3 × 3 two-dimensional list-based array:
>>> matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
With one index, you get an entire row (really, a nested sublist), and with two, you get an item within the row:
>>>matrix[1]
[4, 5, 6] >>>matrix[1][1]
5 >>>matrix[2][0]
7 >>>matrix = [[1, 2, 3],
...[4, 5, 6],
...[7, 8, 9]]
>>>matrix[1][1]
5
Notice in the preceding interaction that lists can naturally span multiple lines if you want them to because they are contained by a pair of brackets (more on syntax in the next part of the book). Later in this chapter, you’ll also see a dictionary-based matrix representation. For high-powered numeric work, the NumPy extension mentioned in Chapter 5 provides other ways to handle matrixes.
Because lists are mutable, they support operations that change a list object in-place. That is, the operations in this section all modify the list object directly, without requiring that you make a new copy, as you had to for strings. Because Python deals only in object references, this distinction between changing an object in-place and creating a new object matters—as discussed in Chapter 6, if you change an object in-place, you might impact more than one reference to it at the same time.
When using a list, you can change its contents by assigning to either a particular item (offset) or an entire section (slice):
>>>L = ['spam', 'Spam', 'SPAM!']
>>>L[1] = 'eggs'
# Index assignment >>>L
['spam', 'eggs', 'SPAM!'] >>>L[0:2] = ['eat', 'more']
# Slice assignment: delete+insert >>>L
# Replaces items 0,1 ['eat', 'more', 'SPAM!']
Both index and slice assignments are in-place changes—they modify the subject list directly, rather than generating a new list object for the result. Index assignment in Python works much as it does in C and most other languages: Python replaces the object reference at the designated offset with a new one.
Slice assignment, the last operation in the preceding example, replaces an entire section of a list in a single step. Because it can be a bit complex, it is perhaps best thought of as a combination of two steps:
Deletion. The slice you specify to
the left of the =
is
deleted.
Insertion. The new items contained
in the object to the right of the =
are inserted into the list on the
left, at the place where the old slice was deleted.[22]
This isn’t what really happens, but it tends to help clarify
why the number of items inserted doesn’t have to match the number
of items deleted. For instance, given a list L
that has the value [1,2,3]
, the assignment L[1:2]=[4,5]
sets L
to the list [1,4,5,3]
. Python first deletes the
2
(a one-item slice), then
inserts the 4
and 5
where the deleted 2
used to be. This also explains why
L[1:2]=[]
is really a deletion
operation—Python deletes the slice (the item at offset 1), and
then inserts nothing.
In effect, slice assignment replaces an entire section, or
“column,” all at once. Because the length of the sequence being
assigned does not have to match the length of the slice being
assigned to, slice assignment can be used to replace (by
overwriting), expand (by inserting), or shrink (by deleting) the
subject list. It’s a powerful operation, but frankly, one that you
may not see very often in practice. There are usually more
straightforward ways to replace, insert, and delete (concatenation
and the insert
, pop
, and remove
list methods, for example), which
Python programmers tend to prefer in practice.
Like strings, Python list objects also support type-specific method calls, many of which change the subject list in-place:
>>>L.append('please')
# Append method call: add item at end >>>L
['eat', 'more', 'SPAM!', 'please'] >>>L.sort()
# Sort list items ('S' < 'e') >>>L
['SPAM!', 'eat', 'more', 'please']
Methods were introduced in Chapter 7. In brief, they are functions (really, attributes that reference functions) that are associated with particular objects. Methods provide type-specific tools; the list methods presented here, for instance, are generally available only for lists.
Perhaps the most commonly used list method is append
, which simply tacks a single item
(object reference) onto the end of the list. Unlike concatenation,
append
expects you to pass in a
single object, not a list. The effect of L.append(X)
is similar to L+[X]
, but while the former changes
L
in-place, the latter makes a
new list.[23]
Another commonly seen method, sort
, orders a list in-place; it uses
Python standard comparison tests (here, string comparisons), and
by default sorts in ascending order. You can modify sort behavior
by passing in keyword arguments—a special “name=value”
syntax in function calls that specifies passing by name and is
often used for giving configuration options. In sorts, the
key
argument gives a
one-argument function that returns the value to be used in
sorting, and the reverse
argument allows sorts to be made in descending instead of
ascending order:
>>>L = ['abc', 'ABD', 'aBe']
>>>L.sort()
# Sort with mixed case >>>L
['ABD', 'aBe', 'abc'] >>>L = ['abc', 'ABD', 'aBe']
>>>L.sort(key=str.lower)
# Normalize to lowercase >>>L
['abc', 'ABD', 'aBe'] >>> >>>L = ['abc', 'ABD', 'aBe']
>>>L.sort(key=str.lower, reverse=True)
# Change sort order >>>L
['aBe', 'ABD', 'abc']
The sort key
argument
might also be useful when sorting lists of dictionaries, to pick
out a sort key by indexing each dictionary. We’ll study
dictionaries later in this chapter, and you’ll learn more about
keyword function arguments in Part IV.
Comparison and sorts in 3.0:
In Python 2.6 and earlier, comparisons of differently typed objects (e.g., a
string and a list) work—the language defines a fixed ordering
among different types, which is deterministic, if not
aesthetically pleasing. That is, the ordering is based on the
names of the types involved: all integers are less than all
strings, for example, because "int"
is less than "str"
. Comparisons never automatically
convert types, except when comparing numeric type
objects.
In Python 3.0, this has changed: comparison of mixed types
raises an exception instead of falling back on the fixed
cross-type ordering. Because sorting uses comparisons
internally, this means that [1, 2,
'spam'].sort()
succeeds in Python 2.X but will raise
an exception in Python 3.0 and later.
Python 3.0 also no longer supports passing in an arbitrary
comparison function to sorts, to implement
different orderings. The suggested workaround is to use the
key=
func
keyword argument to code value transformations during the sort,
and use the reverse=True
keyword argument to change the sort order to descending. These
were the typical uses of comparison functions in the
past.
One warning here: beware that append
and sort
change the associated list object
in-place, but don’t return the list as a result (technically, they
both return a value called None
). If you say something like
L=L.append(X)
, you won’t get
the modified value of L
(in
fact, you’ll lose the reference to the list altogether!). When you
use attributes such as append
and sort
, objects are changed
as a side effect, so there’s no reason to reassign.
Partly because of such constraints, sorting is also available in recent Pythons as a built-in function, which sorts any collection (not just lists) and returns a new list for the result (instead of in-place changes):
>>>L = ['abc', 'ABD', 'aBe']
>>>sorted(L, key=str.lower, reverse=True)
# Sorting built-in ['aBe', 'ABD', 'abc'] >>>L = ['abc', 'ABD', 'aBe']
>>>sorted([x.lower() for x in L], reverse=True)
# Pretransform items: differs! ['abe', 'abd', 'abc']
Notice the last example here—we can convert to lowercase
prior to the sort with a list comprehension, but the result does
not contain the original list’s values as it does with the
key
argument. The latter is
applied temporarily during the sort, instead of changing the
values to be sorted. As we move along, we’ll see contexts in which
the sorted
built-in can
sometimes be more useful than the sort
method.
Like strings, lists have other methods that perform other
specialized operations. For instance, reverse
reverses the list in-place, and
the extend
and pop
methods insert multiple items at the
end of and delete an item from the end of the list, respectively.
There is also a reversed
built-in function that works much like sorted
, but it must be wrapped in a
list
call because it’s an
iterator (more on iterators later):
>>>L = [1, 2]
>>>L.extend([3,4,5])
# Add many items at end >>>L
[1, 2, 3, 4, 5] >>>L.pop()
# Delete and return last item 5 >>>L
[1, 2, 3, 4] >>>L.reverse()
# In-place reversal method >>>L
[4, 3, 2, 1] >>>list(reversed(L))
# Reversal built-in with a result [1, 2, 3, 4]
In some types of programs, the list pop
method used here is often used in
conjunction with append
to
implement a quick last-in-first-out (LIFO)
stack structure. The end of the list serves
as the top of the stack:
>>>L = []
>>>L.append(1)
# Push onto stack >>>L.append(2)
>>>L
[1, 2] >>>L.pop()
# Pop off stack 2 >>>L
[1]
The pop
method also
accepts an optional offset of the item to be deleted and returned
(the default is the last item). Other list methods remove an item
by value (remove
), insert an item at an offset
(insert
), search for an item’s offset
(index
), and more:
>>>L = ['spam', 'eggs', 'ham']
>>>L.index('eggs')
# Index of an object 1 >>>L.insert(1, 'toast')
# Insert at position >>>L
['spam', 'toast', 'eggs', 'ham'] >>>L.remove('eggs')
# Delete by value >>>L
['spam', 'toast', 'ham'] >>>L.pop(1)
# Delete by position 'toast' >>>L
['spam', 'ham']
See other documentation sources or experiment with these calls interactively on your own to learn more about list methods.
Because lists are mutable, you can use the del
statement to delete an item or
section in-place:
>>>L
['SPAM!', 'eat', 'more', 'please'] >>>del L[0]
# Delete one item >>>L
['eat', 'more', 'please'] >>>del L[1:]
# Delete an entire section >>>L
# Same as L[1:] = [] ['eat']
Because slice assignment is a deletion plus an insertion,
you can also delete a section of a list by assigning an empty list
to a slice (L[i:j]=[]
); Python
deletes the slice named on the left, and then inserts nothing.
Assigning an empty list to an index, on the other hand, just
stores a reference to the empty list in the specified slot, rather
than deleting it:
>>>L = ['Already', 'got', 'one']
>>>L[1:] = []
>>>L
['Already'] >>>L[0] = []
>>>L
[[]]
Although all the operations just discussed are typical,
there are additional list methods and operations not illustrated
here (including methods for inserting and searching). For a
comprehensive and up-to-date list of type tools, you should always
consult Python’s manuals,
Python’s dir
and help
functions (which we first met in
Chapter 4), or one of the
reference texts mentioned in the Preface.
I’d also like to remind you one more time that all the in-place change operations discussed here work only for mutable objects: they won’t work on strings (or tuples, discussed in Chapter 9), no matter how hard you try. Mutability is an inherent property of each object type.
Apart from lists, dictionaries are perhaps the most flexible built-in data type in Python. If you think of lists as ordered collections of objects, you can think of dictionaries as unordered collections; the chief distinction is that in dictionaries, items are stored and fetched by key, instead of by positional offset.
Being a built-in type, dictionaries can replace many of the searching algorithms and data structures you might have to implement manually in lower-level languages—indexing a dictionary is a very fast search operation. Dictionaries also sometimes do the work of records and symbol tables used in other languages, can represent sparse (mostly empty) data structures, and much more. Here’s a rundown of their main properties. Python dictionaries are:
Dictionaries are sometimes called associative arrays or hashes. They associate a set of values with keys, so you can fetch an item out of a dictionary using the key under which you originally stored it. You use the same indexing operation to get components in a dictionary as you do in a list, but the index takes the form of a key, not a relative offset.
Unlike in a list, items stored in a dictionary aren’t kept in any particular order; in fact, Python randomizes their left-to-right order to provide quick lookup. Keys provide the symbolic (not physical) locations of items in a dictionary.
Like lists, dictionaries can grow and shrink in-place (without new copies being made), they can contain objects of any type, and they support nesting to any depth (they can contain lists, other dictionaries, and so on).
Dictionaries can be changed in-place by assigning to indexes (they are mutable), but they don’t support the sequence operations that work on strings and lists. Because dictionaries are unordered collections, operations that depend on a fixed positional order (e.g., concatenation, slicing) don’t make sense. Instead, dictionaries are the only built-in representatives of the mapping type category (objects that map keys to values).
If lists are arrays of object references that support access by position, dictionaries are unordered tables of object references that support access by key. Internally, dictionaries are implemented as hash tables (data structures that support very fast retrieval), which start small and grow on demand. Moreover, Python employs optimized hashing algorithms to find keys, so retrieval is quick. Like lists, dictionaries store object references (not copies).
Table 8-2
summarizes some of the most common and representative dictionary
operations (again, see the library manual or run a dir(dict)
or help(dict)
call for a complete list—dict
is the name of the type). When
coded as a literal expression, a dictionary is written
as a series of key
:
value
pairs,
separated by commas, enclosed in curly braces.[24] An empty dictionary is an empty set of braces, and
dictionaries can be nested by writing one as a value inside another
dictionary, or within a list or tuple.
Operation | Interpretation |
| Empty dictionary |
| Two-item dictionary |
| Nesting |
| Alternative construction techniques: keywords, zipped pairs, key lists |
| Indexing by key |
| Membership: key present test |
| Methods: keys, values, keys+values, copies, defaults, merge, delete, etc. |
| Length: number of stored entries |
| Adding/changing keys |
| Deleting entries by key |
| Dictionary views (Python 3.0) |
| Dictionary comprehensions (Python 3.0) |
As Table 8-2 suggests, dictionaries are indexed by key, and nested dictionary entries are referenced by a series of indexes (keys in square brackets). When Python creates a dictionary, it stores its items in any left-to-right order it chooses; to fetch a value back, you supply the key with which it is associated, not its relative position. Let’s go back to the interpreter to get a feel for some of the dictionary operations in Table 8-2.
In normal operation, you create dictionaries with literals and store and access items by key with indexing:
%python
>>>D = {'spam': 2, 'ham': 1, 'eggs': 3}
# Make a dictionary >>>D['spam']
# Fetch a value by key 2 >>>D
# Order is scrambled {'eggs': 3, 'ham': 1, 'spam': 2}
Here, the dictionary is assigned to the variable D
; the value of the key 'spam'
is the integer 2
, and so on. We use the same square
bracket syntax to index dictionaries by key as we did to index lists
by offset, but here it means access by key, not by position.
Notice the end of this example: the left-to-right order of keys in a dictionary will almost always be different from what you originally typed. This is on purpose: to implement fast key lookup (a.k.a. hashing), keys need to be reordered in memory. That’s why operations that assume a fixed left-to-right order (e.g., slicing, concatenation) do not apply to dictionaries; you can fetch values only by key, not by position.
The built-in len
function
works on dictionaries, too; it returns the number of items stored in
the dictionary or, equivalently, the length of its keys list. The
dictionary in
membership operator
allows you to test for key existence, and the keys
method returns all the keys in the
dictionary. The latter of these can be useful for processing
dictionaries sequentially, but you shouldn’t depend on the order of
the keys list. Because the keys
result can be used as a normal list, however, it can always be
sorted if order matters (more on sorting and dictionaries
later):
>>>len(D)
# Number of entries in dictionary 3 >>>'ham' in D
# Key membership test alternative True >>>list(D.keys())
# Create a new list of my keys ['eggs', 'ham', 'spam']
Notice the second expression in this listing. As mentioned
earlier, the in
membership test
used for strings and lists also works on dictionaries—it checks
whether a key is stored in the dictionary. Technically, this works
because dictionaries define iterators that step
through their keys lists. Other types provide iterators that reflect
their common uses; files, for
example, have iterators that read line by line. We’ll discuss
iterators in Chapters 14 and 20.
Also note the syntax of the last example in this listing. We
have to enclose it in a list
call
in Python 3.0 for similar reasons—keys
in 3.0 returns an iterator, instead
of a physical list. The list
call
forces it to produce all its values at once so we can print them. In
2.6, keys
builds and returns an
actual list, so the list
call
isn’t needed to display results. More on this later in this
chapter.
The order of keys in a dictionary is arbitrary and can change from release to release, so don’t be alarmed if your dictionaries print in a different order than shown here. In fact, the order has changed for me too—I’m running all these examples with Python 3.0, but their keys had a different order in an earlier edition when displayed. You shouldn’t depend on dictionary key ordering, in either programs or books!
Let’s continue with our interactive session. Dictionaries,
like lists, are mutable, so you can change, expand, and shrink them
in-place without making new dictionaries: simply assign a value to a
key to change or create an entry. The del
statement works here, too; it deletes
the entry associated with the key specified as an index. Notice also
the nesting of a list inside a dictionary in this example (the value
of the key 'ham'
). All collection
data types in Python can nest inside each other arbitrarily:
>>>D
{'eggs': 3, 'ham': 1, 'spam': 2} >>>D['ham'] = ['grill', 'bake', 'fry']
# Change entry >>>D
{'eggs': 3, 'ham': ['grill', 'bake', 'fry'], 'spam': 2} >>>del D['eggs']
# Delete entry >>>D
{'ham': ['grill', 'bake', 'fry'], 'spam': 2} >>>D['brunch'] = 'Bacon'
# Add new entry >>>D
{'brunch': 'Bacon', 'ham': ['grill', 'bake', 'fry'], 'spam': 2}
As with lists, assigning to an existing index in a dictionary
changes its associated value. Unlike with lists, however, whenever
you assign a new dictionary key (one that
hasn’t been assigned before) you create a new entry in the
dictionary, as was done in the previous example for the key 'brunch'
. This doesn’t work for lists
because Python considers an offset beyond the end of a list out of
bounds and throws an error. To expand a list, you need to use tools
such as the append
method or
slice assignment instead.
Dictionary methods provide a variety of tools. For instance, the
dictionary values
and items
methods return the dictionary’s values and (
key
,
value
)
pair tuples, respectively (as with
keys
, wrap them in a list
call in Python 3.0 to collect their
values for display):
>>>D = {'spam': 2, 'ham': 1, 'eggs': 3}
>>>list(D.values())
[3, 1, 2] >>>list(D.items())
[('eggs', 3), ('ham', 1), ('spam', 2)]
Such lists are useful in loops that need to step through
dictionary entries one by one. Fetching a nonexistent key is
normally an error, but the get
method returns a default value (None
, or a passed-in default) if the key
doesn’t exist. It’s an easy way to fill in a default for a key that
isn’t present and avoid a missing-key error:
>>>D.get('spam')
# A key that is there 2 >>>print(D.get('toast'))
# A key that is missing None >>>D.get('toast', 88)
88
The update
method provides something similar to concatenation for
dictionaries, though it has nothing to do with left-to-right
ordering (again, there is no such thing in dictionaries). It merges
the keys and values of one dictionary into another, blindly
overwriting values of the same key:
>>>D
{'eggs': 3, 'ham': 1, 'spam': 2} >>>D2 = {'toast':4, 'muffin':5}
>>>D.update(D2)
>>>D
{'toast': 4, 'muffin': 5, 'eggs': 3, 'ham': 1, 'spam': 2}
Finally, the dictionary pop
method deletes a key from a dictionary and returns the value it had.
It’s similar to the list pop
method, but it takes a key instead of an optional position:
# pop a dictionary by key >>>D
{'toast': 4, 'muffin': 5, 'eggs': 3, 'ham': 1, 'spam': 2} >>>D.pop('muffin')
5 >>>D.pop('toast')
# Delete and return from a key 4 >>>D
{'eggs': 3, 'ham': 1, 'spam': 2} # pop a list by position >>>L = ['aa', 'bb', 'cc', 'dd']
>>>L.pop()
# Delete and return from the end 'dd' >>>L
['aa', 'bb', 'cc'] >>>L.pop(1)
# Delete from a specific position 'bb' >>>L
['aa', 'cc']
Dictionaries also provide a copy
method; we’ll discuss this in Chapter 9, as it’s a way
to avoid the potential side effects of shared references to the same
dictionary. In fact, dictionaries come with many more methods than
those listed in Table 8-2; see the Python
library manual or other documentation sources for a comprehensive
list.
Let’s look at a more realistic dictionary example. The following example creates a table that maps programming language names (the keys) to their creators (the values). You fetch creator names by indexing on language names:
>>>table = {'Python': 'Guido van Rossum',
...'Perl': 'Larry Wall',
...'Tcl': 'John Ousterhout' }
>>> >>>language = 'Python'
>>>creator = table[language]
>>>creator
'Guido van Rossum' >>>for lang in table:
# Same as: for lang in table.keys() ...print(lang, ' ', table[lang])
... Tcl John Ousterhout Python Guido van Rossum Perl Larry Wall
The last command uses a for
loop, which we haven’t covered in detail yet. If you aren’t familiar
with for
loops, this command
simply iterates through each key in the table and prints a
tab-separated list of keys and their values. We’ll learn more about
for
loops in Chapter 13.
Dictionaries aren’t sequences like lists and strings, but if
you need to step through the items in a dictionary, it’s
easy—calling the dictionary keys
method returns all stored keys, which you can iterate through with a
for
. If needed, you can index
from key to value inside the for
loop, as was done in this code.
In fact, Python also lets you step through a dictionary’s keys
list without actually calling the keys
method in most for
loops. For any dictionary D
, saying for key
in D:
works the same as saying the complete for key in D.keys():
. This is really just
another instance of the iterators mentioned earlier, which allow the
in
membership operator to work on
dictionaries as well (more on iterators later in this book).
Dictionaries are fairly straightforward tools once you get the hang of them, but here are a few additional pointers and reminders you should be aware of when using them:
Sequence operations don’t work. Dictionaries are mappings, not sequences; because there’s no notion of ordering among their items, things like concatenation (an ordered joining) and slicing (extracting a contiguous section) simply don’t apply. In fact, Python raises an error when your code runs if you try to do such things.
Assigning to new indexes adds entries. Keys can be created when you write a dictionary literal (in which case they are embedded in the literal itself), or when you assign values to new keys of an existing dictionary object. The end result is the same.
Keys need not always be strings. Our examples so far have used strings as keys, but any other immutable objects (i.e., not lists) work just as well. For instance, you can use integers as keys, which makes the dictionary look much like a list (when indexing, at least). Tuples are sometimes used as dictionary keys too, allowing for compound key values. Class instance objects (discussed in Part VI) can also be used as keys, as long as they have the proper protocol methods; roughly, they need to tell Python that their values are hashable and won’t change, as otherwise they would be useless as fixed keys.
The last point in the prior list is important enough to demonstrate with a few examples. When you use lists, it is illegal to assign to an offset that is off the end of the list:
>>>L = []
>>>L[99] = 'spam'
Traceback (most recent call last): File "<stdin>", line 1, in ? IndexError: list assignment index out of range
Although you can use repetition to preallocate as big a list
as you’ll need (e.g., [0]*100
),
you can also do something that looks similar with dictionaries
that does not require such space allocations. By using integer
keys, dictionaries can emulate lists that seem to grow on offset
assignment:
>>>D = {}
>>>D[99] = 'spam'
>>>D[99]
'spam' >>>D
{99: 'spam'}
Here, it looks as if D
is
a 100-item list, but it’s really a dictionary with a single entry;
the value of the key 99
is the
string 'spam'
. You can access
this structure with offsets much like a list, but you don’t have
to allocate space for all the positions you might ever need to
assign values to in the future. When used like this, dictionaries
are like more flexible equivalents of lists.
In a similar way, dictionary keys are also commonly leveraged to implement sparse data structures—for example, multidimensional arrays where only a few positions have values stored in them:
>>>Matrix = {}
>>>Matrix[(2, 3, 4)] = 88
>>>Matrix[(7, 8, 9)] = 99
>>> >>>X = 2; Y = 3; Z = 4
# ; separates statements >>>Matrix[(X, Y, Z)]
88 >>>Matrix
{(2, 3, 4): 88, (7, 8, 9): 99}
Here, we’ve used a dictionary to represent a
three-dimensional array that is empty except for the two positions
(2,3,4)
and (7,8,9)
. The keys are
tuples that record the coordinates of
nonempty slots. Rather than allocating a large and mostly empty
three-dimensional matrix to hold these values, we can use a simple
two-item dictionary. In this scheme, accessing an empty slot
triggers a nonexistent key exception, as these slots are not
physically stored:
>>> Matrix[(2,3,6)]
Traceback (most recent call last):
File "<stdin>", line 1, in ?
KeyError: (2, 3, 6)
Errors for nonexistent key fetches are common in sparse
matrixes, but you probably won’t want them to shut down your
program. There are at least three ways to fill in a default value
instead of getting such an error message—you can test for keys
ahead of time in if
statements,
use a try
statement to catch
and recover from the exception explicitly, or simply use the
dictionary get
method shown
earlier to provide a default for keys that do not exist:
>>>if (2,3,6) in Matrix:
# Check for key before fetch ...print(Matrix[(2,3,6)])
# See Chapter 12 for if/else ...else:
...print(0)
... 0 >>>try:
...print(Matrix[(2,3,6)])
# Try to index ...except KeyError:
# Catch and recover ...print(0)
# See Chapter 33 for try/except ... 0 >>>Matrix.get((2,3,4), 0)
# Exists; fetch and return 88 >>>Matrix.get((2,3,6), 0)
# Doesn't exist; use default arg 0
Of these, the get
method
is the most concise in terms of coding requirements; we’ll study
the if
and try
statements in more detail later in
this book.
As you can see, dictionaries can play many roles in Python. In general, they can replace search data structures (because indexing by key is a search operation) and can represent many types of structured information. For example, dictionaries are one of many ways to describe the properties of an item in your program’s domain; that is, they can serve the same role as “records” or “structs” in other languages.
The following, for example, fills out a dictionary by assigning to new keys over time:
>>>rec = {}
>>>rec['name'] = 'mel'
>>>rec['age'] = 45
>>>rec['job'] = 'trainer/writer'
>>> >>>print(rec['name'])
mel
Especially when nested, Python’s built-in data types allow us to easily represent structured information. This example again uses a dictionary to capture object properties, but it codes it all at once (rather than assigning to each key separately) and nests a list and a dictionary to represent structured property values:
>>>mel = {'name': 'Mark',
...'jobs': ['trainer', 'writer'],
...'web': 'www.rmi.net/˜lutz',
...'home': {'state': 'CO', 'zip':80513}}
To fetch components of nested objects, simply string together indexing operations:
>>>mel['name']
'Mark' >>>mel['jobs']
['trainer', 'writer'] >>>mel['jobs'][1]
'writer' >>>mel['home']['zip']
80513
Although we’ll learn in Part VI that classes (which group both data and logic) can be better in this record role, dictionaries are an easy-to-use tool for simpler requirements.
Finally, note that because dictionaries are so useful, more
ways to build them have emerged over time. In Python 2.3 and later,
for example, the last two calls to the dict
constructor (really, type name) shown
here have the same effect as the literal and key-assignment forms
above them:
{'name': 'mel', 'age': 45} # Traditional literal expression D = {} # Assign by keys dynamically D['name'] = 'mel' D['age'] = 45 dict(name='mel', age=45) # dict keyword argument form dict([('name', 'mel'), ('age', 45)]) # dict key/value tuples form
All four of these forms create the same two-key dictionary, but they are useful in differing circumstances:
The first is handy if you can spell out the entire dictionary ahead of time.
The second is of use if you need to create the dictionary one field at a time on the fly.
The third involves less typing than the first, but it requires all keys to be strings.
The last is useful if you need to build up keys and values as sequences at runtime.
We met keyword arguments earlier when sorting; the third form
illustrated in this code listing has become especially popular in
Python code today, since it has less syntax (and hence there is less
opportunity for mistakes). As suggested previously in Table 8-2, the last form
in the listing is also commonly used in conjunction with the
zip
function, to combine separate
lists of keys and values obtained dynamically at runtime (parsed out
of a data file’s columns, for instance). More on this option in the
next section.
Provided all the key’s values are the same initially, you can
also create a dictionary with this special form—simply pass in a
list of keys and an initial value for all of the values (the default
is None
):
>>> dict.fromkeys(['a', 'b'], 0)
{'a': 0, 'b': 0}
Although you could get by with just literals and key assignments at this point in your Python career, you’ll probably find uses for all of these dictionary-creation forms as you start applying them in realistic, flexible, and dynamic Python programs.
The listings in this section document the various ways to create dictionaries in both Python 2.6 and 3.0. However, there is yet another way to create dictionaries, available only in Python 3.0 (and later): the dictionary comprehension expression. To see how this last form looks, we need to move on to the next section.
This chapter has so far focused on dictionary basics that span releases, but the dictionary’s functionality has mutated in Python 3.0. If you are using Python 2.X code, you may come across some dictionary tools that either behave differently or are missing altogether in 3.0. Moreover, 3.0 coders have access to additional dictionary tools not available in 2.X. Specifically, dictionaries in 3.0:
Support a new dictionary comprehension expression, a close cousin to list and set comprehensions
Return iterable views instead of lists for the methods
D.keys
, D.values
, and D.items
Require new coding styles for scanning by sorted keys, because of the prior point
No longer support relative magnitude comparisons directly—compare manually instead
No longer have the D.has_key
method—the in
membership test is used
instead
Let’s take a look at what’s new in 3.0 dictionaries.
As mentioned at the end of the prior section, dictionaries in 3.0 can also be created with dictionary comprehensions. Like the set comprehensions we met in Chapter 5, dictionary comprehensions are available only in 3.0 (not in 2.6). Like the longstanding list comprehensions we met briefly in Chapter 4 and earlier in this chapter, they run an implied loop, collecting the key/value results of expressions on each iteration and using them to fill out a new dictionary. A loop variable allows the comprehension to use loop iteration values along the way.
For example, a standard way to initialize a
dictionary dynamically in both 2.6 and 3.0 is to zip together its
keys and values and pass the result to the dict
call. As we’ll learn in more detail
in Chapter 13, the zip
function is a way to construct a
dictionary from key and value lists in a single call. If you
cannot predict the set of keys and values in your code, you can
always build them up as lists and zip them together:
>>>list(zip(['a', 'b', 'c'], [1, 2, 3]))
# Zip together keys and values [('a', 1), ('b', 2), ('c', 3)] >>>D = dict(zip(['a', 'b', 'c'], [1, 2, 3]))
# Make a dict from zip result >>>D
{'a': 1, 'c': 3, 'b': 2}
In Python 3.0, you can achieve the same effect with a
dictionary comprehension expression. The following builds a new
dictionary with a key/value pair for every such pair in the
zip
result (it reads almost the
same in Python, but with a bit more formality):
C:misc>c:python30python
# Use a dict comprehension >>>D = {k: v for (k, v) in zip(['a', 'b', 'c'], [1, 2, 3])}
>>>D
{'a': 1, 'c': 3, 'b': 2}
Comprehensions actually require more code in this case, but they are also more general than this example implies—we can use them to map a single stream of values to dictionaries as well, and keys can be computed with expressions just like values:
>>>D = {x: x ** 2 for x in [1, 2, 3, 4]}
# Or: range(1, 5) >>>D
{1: 1, 2: 4, 3: 9, 4: 16} >>>D = {c: c * 4 for c in 'SPAM'}
# Loop over any iterable >>>D
{'A': 'AAAA', 'P': 'PPPP', 'S': 'SSSS', 'M': 'MMMM'} >>>D = {c.lower(): c + '!' for c in ['SPAM', 'EGGS', 'HAM']}
>>>D
{'eggs': 'EGGS!', 'ham': 'HAM!', 'spam': 'SPAM!'}
Dictionary comprehensions are also useful for initializing
dictionaries from keys lists, in much the same way as the fromkeys
method we met at the end of the
preceding section:
>>>D = dict.fromkeys(['a', 'b', 'c'], 0)
# Initialize dict from keys >>>D
{'a': 0, 'c': 0, 'b': 0} >>>D = {k:0 for k in ['a', 'b', 'c']}
# Same, but with a comprehension >>>D
{'a': 0, 'c': 0, 'b': 0} >>>D = dict.fromkeys('spam')
# Other iterators, default value >>>D
{'a': None, 'p': None, 's': None, 'm': None} >>>D = {k: None for k in 'spam'}
>>>D
{'a': None, 'p': None, 's': None, 'm': None}
Like related tools, dictionary comprehensions support
additional syntax not shown here, including nested loops and
if
clauses. Unfortunately, to
truly understand dictionary comprehensions, we need to also know
more about iteration statements and concepts in Python, and we
don’t yet have enough information to address that story well.
We’ll learn much more about all flavors of comprehensions (list,
set, and dictionary) in Chapters 14 and 20, so we’ll defer further details
until later. We’ll also study the zip
built-in we used in this section in
more detail in Chapter 13, when we
explore for
loops.
In 3.0 the dictionary keys
,
values
, and items
methods all return view
objects, whereas in 2.6 they return actual result
lists. View objects are iterables, which
simply means objects that generate result items one at a time,
instead of producing the result list all at once in memory.
Besides being iterable, dictionary views also retain the original
order of dictionary components, reflect future changes to the
dictionary, and may support set operations. On the other hand,
they are not lists, and they do not support operations like
indexing or the list sort
method; nor do they display their items when printed.
We’ll discuss the notion of iterables more formally in Chapter 14, but for our
purposes here it’s enough to know that we have to run the results
of these three methods through the list
built-in if we want to apply list
operations or display their values:
>>>D = dict(a=1, b=2, c=3)
>>>D
{'a': 1, 'c': 3, 'b': 2} >>>K = D.keys()
# Makes a view object in 3.0, not a list >>>K
<dict_keys object at 0x026D83C0> >>>list(K)
# Force a real list in 3.0 if needed ['a', 'c', 'b'] >>>V = D.values()
# Ditto for values and items views >>>V
<dict_values object at 0x026D8260> >>>list(V)
[1, 3, 2] >>>list(D.items())
[('a', 1), ('c', 3), ('b', 2)] >>>K[0]
# List operations fail unless converted TypeError: 'dict_keys' object does not support indexing >>>list(K)[0]
'a'
Apart from when displaying results at the interactive prompt, you will probably rarely even notice this change, because looping constructs in Python automatically force iterable objects to produce one result on each iteration:
>>> for k in D.keys(): print(k)
# Iterators used automatically in loops
...
a
c
b
In addition, 3.0 dictionaries still have iterators themselves, which return successive keys—as in 2.6, it’s still often not necessary to call keys directly:
>>> for key in D: print(key)
# Still no need to call keys() to iterate
...
a
c
b
Unlike 2.X’s list results, though, dictionary views in 3.0 are not carved in stone when created—they dynamically reflect future changes made to the dictionary after the view object has been created:
>>>D = {'a':1, 'b':2, 'c':3}
>>>D
{'a': 1, 'c': 3, 'b': 2} >>>K = D.keys()
>>>V = D.values()
>>>list(K)
# Views maintain same order as dictionary ['a', 'c', 'b'] >>>list(V)
[1, 3, 2] >>>del D['b']
# Change the dictionary in-place >>>D
{'a': 1, 'c': 3} >>>list(K)
# Reflected in any current view objects ['a', 'c'] >>>list(V)
# Not true in 2.X! [1, 3]
Also unlike 2.X’s list results, 3.0’s view objects
returned by the keys
method are
set-like and support common set operations
such as intersection and union; values
views are not, since they aren’t
unique, but items
results are
if their (
key
,
value
)
pairs are unique and hashable. Given
that sets behave much like valueless dictionaries (and are even
coded in curly braces like dictionaries in 3.0), this is a
logical symmetry. Like dictionary keys, set items are unordered,
unique, and immutable.
Here is what keys lists look like when used in set
operations. In set operations, views may be mixed with other
views, sets, and dictionaries (dictionaries are treated the same
as their keys
views in this
context):
>>>K | {'x': 4}
# Keys (and some items) views are set-like {'a', 'x', 'c'} >>>V & {'x': 4}
TypeError: unsupported operand type(s) for &: 'dict_values' and 'dict' >>>V & {'x': 4}.values()
TypeError: unsupported operand type(s) for &: 'dict_values' and 'dict_values' >>>D = {'a':1, 'b':2, 'c':3}
>>>D.keys() & D.keys()
# Intersect keys views {'a', 'c', 'b'} >>>D.keys() & {'b'}
# Intersect keys and set {'b'} >>>D.keys() & {'b': 1}
# Intersect keys and dict {'b'} >>>D.keys() | {'b', 'c', 'd'}
# Union keys and set {'a', 'c', 'b', 'd'}
Dictionary items views are set-like too if they are hashable—that is, if they contain only immutable objects:
>>>D = {'a': 1}
>>>list(D.items())
# Items set-like if hashable [('a', 1)] >>>D.items() | D.keys()
# Union view and view {('a', 1), 'a'} >>>D.items() | D
# dict treated same as its keys {('a', 1), 'a'} >>>D.items() | {('c', 3), ('d', 4)}
# Set of key/value pairs {('a', 1), ('d', 4), ('c', 3)} >>>dict(D.items() | {('c', 3), ('d', 4)})
# dict accepts iterable sets too {'a': 1, 'c': 3, 'd': 4}
For more details on set operations in general, see Chapter 5. Now, let’s look at three other quick coding notes for 3.0 dictionaries.
First of all, because keys
does not return a list, the
traditional coding pattern for scanning a dictionary by sorted
keys in 2.X won’t work in 3.0. You must either convert to a list
manually or use the sorted
call
introduced in Chapter 4
and earlier in this chapter on either a keys
view or the dictionary
itself:
>>>D = {'a':1, 'b':2, 'c':3}
>>>D
{'a': 1, 'c': 3, 'b': 2} >>>Ks = D.keys()
# Sorting a view object doesn't work! >>>Ks.sort()
AttributeError: 'dict_keys' object has no attribute 'sort' >>>Ks = list(Ks)
# Force it to be a list and then sort >>>Ks.sort()
>>>for k in Ks: print(k, D[k])
... a 1 b 2 c 3 >>>D
{'a': 1, 'c': 3, 'b': 2} >>>Ks = D.keys()
# Or you can use sorted() on the keys >>>for k in sorted(Ks): print(k, D[k])
# sorted() accepts any iterable ... # sorted() returns its result a 1 b 2 c 3 >>>D
{'a': 1, 'c': 3, 'b': 2} # Better yet, sort the dict directly >>>for k in sorted(D): print(k, D[k])
# dict iterators return keys ... a 1 b 2 c 3
Secondly, while in Python 2.6 dictionaries may be compared for
relative magnitude directly with <
, >
, and so on, in Python 3.0 this no
longer works. However, it can be simulated by comparing sorted
keys lists manually:
sorted(D1.items()) < sorted(D2.items()) # Like 2.6 D1 < D2
Dictionary equality tests still work in 3.0, though. Since we’ll revisit this in the next chapter in the context of comparisons at large, we’ll defer further details here.
Finally, the widely used dictionary has_key
key presence test method is gone
in 3.0. Instead, use the in
membership expression, or a get
with a default test (of these, in
is generally preferred):
>>>D
{'a': 1, 'c': 3, 'b': 2} >>>D.has_key('c')
# 2.X only: True/False AttributeError: 'dict' object has no attribute 'has_key' >>>'c' in D
True >>>'x' in D
False >>>if 'c' in D: print('present', D['c'])
# Preferred in 3.0 ... present 3 >>>print(D.get('c'))
3 >>>print(D.get('x'))
None >>>if D.get('c') != None: print('present', D['c'])
# Another option ... present 3
If you work in 2.6 and care about 3.0 compatibility, note
that the first two changes (comprehensions and views) can only be
coded in 3.0, but the last three (sorted
, manual comparisons, and in
) can be coded in 2.6 today to ease
3.0 migration in the future.
In this chapter, we explored the list and dictionary
types—probably the two most common, flexible, and powerful collection
types you will see and use in Python code. We learned that the list
type supports positionally ordered collections of arbitrary objects,
and that it may be freely nested and grown and shrunk on demand. The
dictionary type is similar, but it stores items by key instead of by
position and does not maintain any reliable left-to-right order among
its items. Both lists and dictionaries are mutable, and so support a
variety of in-place change operations not available for strings: for
example, lists can be grown by append
calls, and dictionaries by assignment
to new keys.
In the next chapter, we will wrap up our in-depth core object type tour by looking at tuples and files. After that, we’ll move on to statements that code the logic that processes our objects, taking us another step toward writing complete programs. Before we tackle those topics, though, here are some chapter quiz questions to review.
Name two ways to build a list containing five integer zeros.
Name two ways to build a dictionary with two keys, 'a'
and 'b'
, each having an associated value of
0
.
Name four operations that change a list object in-place.
Name four operations that change a dictionary object in-place.
A literal expression like [0, 0, 0,
0, 0]
and a repetition expression like [0] * 5
will each create a list of five
zeros. In practice, you might also build one up with a loop that
starts with an empty list and appends 0
to it in each iteration:
L.append(0)
. A list
comprehension ([0 for i in
range(5)]
) could work here, too, but this is more work
than you need to do.
A literal expression such as {'a':
0, 'b': 0}
or a series of assignments like D =
{}
, D['a'] =
0
, and D['b'] = 0
would create the desired dictionary. You can also use the newer
and simpler-to-code dict(a=0,
b=0)
keyword form, or the more flexible dict([('a', 0), ('b', 0)])
key/value
sequences form. Or, because all the values are the same, you can
use the special form dict.fromkeys('ab',
0)
. In 3.0, you can also use a dictionary comprehension:
{k:0 for k in 'ab'}
.
The append
and extend
methods grow a list in-place, the
sort
and reverse
methods order and reverse lists,
the insert
method inserts an
item at an offset, the remove
and pop
methods delete from a
list by value and by position, the del
statement deletes an item or slice,
and index and slice assignment statements replace an item or
entire section. Pick any four of these for the quiz.
Dictionaries are primarily changed by assignment to a new or
existing key, which creates or changes the key’s entry in the
table. Also, the del
statement
deletes a key’s entry, the dictionary update
method merges one dictionary into
another in-place, and D.pop(key)
removes a key and returns the
value it had. Dictionaries also have other, more exotic in-place
change methods not listed in this chapter, such as setdefault
; see reference sources for
more details.
[21] In practice, you won’t see many lists written out like this in list-processing programs. It’s more common to see code that processes lists constructed dynamically (at runtime). In fact, although it’s important to master literal syntax, most data structures in Python are built by running program code at runtime.
[22] This description needs elaboration when the value
and the slice being assigned overlap: L[2:5]=L[3:6]
, for instance,
works fine because the value to be inserted is fetched
before the deletion happens on the left.
[23] Unlike +
concatenation, append
doesn’t have to generate new objects, so it’s usually faster.
You can also mimic append
with clever slice assignments: L[len(L):]=[X]
is like L.append(X)
, and L[:0]=[X]
is like appending at the
front of a list. Both delete an empty slice and insert
X
, changing L
in-place quickly, like append
.
[24] As with lists, you won’t often see dictionaries constructed
using literals. Lists and dictionaries are grown in different
ways, though. As you’ll see in the next section, dictionaries are
typically built up by assigning to new keys at runtime; this
approach fails for lists (lists are commonly grown with append
instead).