In Chapter 2 we started at the bottom with Python’s basic data types: booleans, integers, floats, and strings. If you think of those as atoms, the data structures in this chapter are like molecules. That is, we combine those basic types in more complex ways. You will use these every day. Much of programming consists of chopping and glueing data into specific forms, and these are your hacksaws and glue guns.
Most computer languages can represent a sequence of items indexed by their integer position: first, second, and so on down to the last. You’ve already seen Python strings, which are sequences of characters. You’ve also had a little preview of lists, which you’ll now see are sequences of anything.
Python has two other sequence structures: tuples and lists. These contain zero or more elements. Unlike strings, the elements can be of different types. In fact, each element can be any Python object. This lets you create structures as deep and complex as you like.
Why does Python contain both lists and tuples? Tuples are immutable; when you assign elements to a tuple, they’re baked in the cake and can’t be changed. Lists are mutable, meaning you can insert and delete elements with great enthusiasm. I’ll show many examples of each, with an emphasis on lists.
By the way, you might hear two different pronunciations for tuple. Which is right? If you guess wrong, do you risk being considered a Python poseur? No worries. Guido van Rossum, the creator of Python, tweeted “I pronounce tuple too-pull on Mon/Wed/Fri and tub-pull on Tue/Thu/Sat. On Sunday I don’t talk about them. :)”
Lists are good for keeping track of things by their order, especially when the order and contents might change. Unlike strings, lists are mutable. You can change a list in-place, add new elements, and delete or overwrite existing elements. The same value can occur more than once in a list.
A list is made from zero or more elements, separated by commas, and surrounded by square brackets:
>>> empty_list = []
>>> weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
>>> big_birds = ['emu', 'ostrich', 'cassowary']
>>> first_names = ['Graham', 'John', 'Terry', 'Terry', 'Michael']
You can also make an empty list with the list() function:
>>> another_empty_list = list()
>>> another_empty_list
[]
“Comprehensions” shows one more way to create a list, called a list comprehension.
The weekdays list is the only one that actually takes advantage of list order. The first_names list shows that values do not need to be unique.
If you only want to keep track of unique values and don’t care about order, a Python set might be a better choice than a list. In the previous example, big_birds could have been a set. You’ll read about sets a little later in this chapter.
Python’s list() function converts other data types to lists. The following example converts a string to a list of one-character strings:
>>> list('cat')
['c', 'a', 't']
This example converts a tuple (coming up after lists in this chapter) to a list:
>>> a_tuple = ('ready', 'fire', 'aim')
>>> list(a_tuple)
['ready', 'fire', 'aim']
As I mentioned earlier in “Split with split()”, use split() to chop a string into a list by some separator string:
>>> birthday = '1/6/1952'
>>> birthday.split('/')
['1', '6', '1952']
What if you have more than one separator string in a row in your original string? Well, you get an empty string as a list item:
>>> splitme = 'a/b//c/d///e'
>>> splitme.split('/')
['a', 'b', '', 'c', 'd', '', '', 'e']
If you had used the two-character separator string // instead, you would get this:
>>> splitme = 'a/b//c/d///e'
>>> splitme.split('//')
['a/b', 'c/d', '/e']
As with strings, you can extract a single value from a list by specifying its offset:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> marxes[0]
'Groucho'
>>> marxes[1]
'Chico'
>>> marxes[2]
'Harpo'
Again, as with strings, negative indexes count backward from the end:
>>> marxes[-1]
'Harpo'
>>> marxes[-2]
'Chico'
>>> marxes[-3]
'Groucho'
The offset has to be a valid one for this list: a position to which you have previously assigned a value. If you specify an offset before the beginning or after the end, you’ll get an exception (error). Here’s what happens if we try to get the sixth Marx brother (offset 5 counting from 0), or the fifth before the end:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> marxes[5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
>>> marxes[-5]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: list index out of range
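If you would rather not see that exception, one option is to catch it and fall back to a default value. The following is only a sketch: the helper name item_or_default is made up for this example and is not part of Python, and it previews try and except, which are covered later in the book.

```python
def item_or_default(items, offset, default=None):
    """Return items[offset], or default if the offset is out of range."""
    try:
        return items[offset]
    except IndexError:
        return default


marxes = ['Groucho', 'Chico', 'Harpo']

# For a list of length 3, the valid offsets run from -3 through 2.
valid_offsets = range(-len(marxes), len(marxes))

print(item_or_default(marxes, 2))             # Harpo
print(item_or_default(marxes, 5, 'nobody'))   # nobody
print(5 in valid_offsets)                     # False
```

Whether to check the offset first or to catch the exception is a matter of taste; catching it keeps the common case short.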
Lists can contain elements of different types, including other lists, as illustrated here:
>>> small_birds = ['hummingbird', 'finch']
>>> extinct_birds = ['dodo', 'passenger pigeon', 'Norwegian Blue']
>>> carol_birds = [3, 'French hens', 2, 'turtledoves']
>>> all_birds = [small_birds, extinct_birds, 'macaw', carol_birds]
So what does all_birds, a list of lists, look like?
>>> all_birds
[['hummingbird', 'finch'], ['dodo', 'passenger pigeon', 'Norwegian Blue'], 'macaw',
[3, 'French hens', 2, 'turtledoves']]
Let’s look at the first item in it:
>>> all_birds[0]
['hummingbird', 'finch']
The first item is a list: in fact, it’s small_birds, the first item we specified when creating all_birds. You should be able to guess what the second item is:
>>> all_birds[1]
['dodo', 'passenger pigeon', 'Norwegian Blue']
It’s the second item we specified, extinct_birds. If we want the first item of extinct_birds, we can extract it from all_birds by specifying two indexes:
>>> all_birds[1][0]
'dodo'
The [1] refers to the list that’s the second item in all_birds, whereas the [0] refers to the first item in that inner list.
Just as you can get the value of a list item by its offset, you can change it:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> marxes[2] = 'Wanda'
>>> marxes
['Groucho', 'Chico', 'Wanda']
Again, the list offset needs to be a valid one for this list.
You can’t change a character in a string in this way, because strings are immutable. Lists are mutable. You can change how many items a list contains, and the items themselves.
You can extract a subsequence of a list by using a slice:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> marxes[0:2]
['Groucho', 'Chico']
A slice of a list is also a list.
As with strings, slices can step by values other than one. The next example starts at the beginning and goes right by 2:
>>> marxes[::2]
['Groucho', 'Harpo']
Here, we start at the end and go left by 2:
>>> marxes[::-2]
['Harpo', 'Groucho']
And finally, the trick to reverse a list:
>>> marxes[::-1]
['Harpo', 'Chico', 'Groucho']
The traditional way of adding items to a list is to append() them one by one to the end. In the previous examples, we forgot Zeppo, but that’s all right because the list is mutable, so we can add him now:
>>> marxes.append('Zeppo')
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Zeppo']
You can merge one list into another by using extend(). Suppose that a well-meaning person gave us a new list of Marxes called others, and we’d like to merge them into the main marxes list:
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
>>> others = ['Gummo', 'Karl']
>>> marxes.extend(others)
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Zeppo', 'Gummo', 'Karl']
Alternatively, you can use +=:
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
>>> others = ['Gummo', 'Karl']
>>> marxes += others
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Zeppo', 'Gummo', 'Karl']
If we had used append(), others would have been added as a single list item rather than merging its items:
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
>>> others = ['Gummo', 'Karl']
>>> marxes.append(others)
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Zeppo', ['Gummo', 'Karl']]
This again demonstrates that a list can contain elements of different types. In this case, four strings, and a list of two strings.
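One related surprise is worth a quick sketch. Because extend() walks through whatever you hand it, giving it a single string adds that string’s characters one at a time. The variable names appended, extended, and oops are only for illustration, and copy() (described later in this chapter) is used to start each case from the same list.

```python
marxes = ['Groucho', 'Chico', 'Harpo']

# append() adds its argument as one item, whatever it is.
appended = marxes.copy()
appended.append(['Gummo', 'Zeppo'])

# extend() walks through its argument and adds each item.
extended = marxes.copy()
extended.extend(['Gummo', 'Zeppo'])

# A string is a sequence of characters, so extend() adds them one by one.
oops = marxes.copy()
oops.extend('Zeppo')

print(len(appended))   # 4
print(len(extended))   # 5
print(oops[3:])        # ['Z', 'e', 'p', 'p', 'o']
```

If you want to add one string, append() is the safer choice.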
The append() function adds items only to the end of the list. When you want to add an item before any offset in the list, use insert(). Offset 0 inserts at the beginning. An offset beyond the end of the list inserts at the end, like append(), so you don’t need to worry about Python throwing an exception.
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
>>> marxes.insert(3, 'Gummo')
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']
>>> marxes.insert(10, 'Karl')
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo', 'Karl']
Our fact checkers have just informed us that Gummo was indeed one of the Marx Brothers, but Karl wasn’t. Let’s undo that last insertion:
>>> del marxes[-1]
>>> marxes
['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']
When you delete an item by its position in the list, the items that follow it move back to take the deleted item’s space, and the list’s length decreases by one. If we delete 'Harpo' from the last version of the marxes list, we get this as a result:
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Gummo', 'Zeppo']
>>> marxes[2]
'Harpo'
>>> del marxes[2]
>>> marxes
['Groucho', 'Chico', 'Gummo', 'Zeppo']
>>> marxes[2]
'Gummo'
del is a Python statement, not a list method; you don’t say marxes[-2].del(). It’s sort of the reverse of assignment (=): it detaches a name from a Python object and can free up the object’s memory if that name was the last reference to it.
You can get an item from a list and delete it from the list at the same time by using pop(). If you call pop() with an offset, it will return the item at that offset; with no argument, it uses -1. So, pop(0) returns the head (start) of the list, and pop() or pop(-1) returns the tail (end), as shown here:
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
>>> marxes.pop()
'Zeppo'
>>> marxes
['Groucho', 'Chico', 'Harpo']
>>> marxes.pop(1)
'Chico'
>>> marxes
['Groucho', 'Harpo']
It’s computing jargon time! Don’t worry, these won’t be on the final exam. If you use append() to add new items to the end and pop() to remove them from the same end, you’ve implemented a data structure known as a LIFO (last in, first out) queue. This is more commonly known as a stack. pop(0) would create a FIFO (first in, first out) queue. These are useful when you want to collect data as they arrive and work with either the oldest first (FIFO) or the newest first (LIFO).
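Here is a small sketch of both ideas. The last part uses deque from the standard collections module, which this section has not covered; it is shown only as one common alternative, because pop(0) on a list has to shift every remaining item, which can be slow for long lists. The variable names are illustrative.

```python
from collections import deque

# Stack (LIFO): add and remove at the same end of a list.
stack = []
stack.append('first')
stack.append('second')
newest = stack.pop()          # 'second'

# Queue (FIFO) with a plain list: add at the end, remove from the start.
queue = ['first', 'second']
oldest = queue.pop(0)         # 'first'

# Queue (FIFO) with a deque: popleft() removes from the start.
line = deque(['first', 'second'])
line.append('third')
served = line.popleft()       # 'first'

print(newest, oldest, served)
```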
The Pythonic way to check for the existence of a value in a list is using in:
>>> marxes = ['Groucho', 'Chico', 'Harpo', 'Zeppo']
>>> 'Groucho' in marxes
True
>>> 'Bob' in marxes
False
The same value may be in more than one position in the list. As long as it’s in there at least once, in will return True:
>>> words = ['a', 'deer', 'a', 'female', 'deer']
>>> 'deer' in words
True
If you check for the existence of some value in a list often and don’t care about the order of items, a Python set is a more appropriate way to store and look up unique values. We’ll talk about sets a little later in this chapter.
To count how many times a particular value occurs in a list, use count():
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> marxes.count('Harpo')
1
>>> marxes.count('Bob')
0
>>> snl_skit = ['cheeseburger', 'cheeseburger', 'cheeseburger']
>>> snl_skit.count('cheeseburger')
3
“Combine with join()” discusses join() in greater detail, but here’s another example of what you can do with it:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> ', '.join(marxes)
'Groucho, Chico, Harpo'
But wait: you might be thinking that this seems a little backward. join() is a string method, not a list method. You can’t say marxes.join(', '), even though it seems more intuitive. The argument to join() is a string or any iterable sequence of strings (including a list), and its output is a string. If join() were just a list method, you couldn’t use it with other iterable objects such as tuples or strings. If you did want it to work with any iterable type, you’d need special code for each type to handle the actual joining. It might help to remember: join() is the opposite of split(), as demonstrated here:
>>> friends = ['Harry', 'Hermione', 'Ron']
>>> separator = ' * '
>>> joined = separator.join(friends)
>>> joined
'Harry * Hermione * Ron'
>>> separated = joined.split(separator)
>>> separated
['Harry', 'Hermione', 'Ron']
>>> separated == friends
True
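One thing to watch for: every item handed to join() must already be a string. The sketch below shows the TypeError you get otherwise and one way around it, converting each item with str() first. It previews try and except, and the variable names are only for illustration.

```python
numbers = [1, 6, 1952]

# join() refuses items that are not strings.
try:
    '/'.join(numbers)
except TypeError as err:
    print('join() failed:', err)

# Convert each item to a string first, then join.
birthday = '/'.join(str(n) for n in numbers)
print(birthday)                                   # 1/6/1952

# join() also accepts other iterables of strings, such as a tuple.
friends = ' * '.join(('Harry', 'Hermione', 'Ron'))
print(friends)                                    # Harry * Hermione * Ron
```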
You’ll often need to sort the items in a list by their values rather than their offsets. Python provides two functions:
The list function sort() sorts the list itself, in place.
The general function sorted() returns a sorted copy of the list.
If the items in the list are numeric, they’re sorted by default in ascending numeric order. If they’re strings, they’re sorted in alphabetical order:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> sorted_marxes = sorted(marxes)
>>> sorted_marxes
['Chico', 'Groucho', 'Harpo']
sorted_marxes is a copy, and creating it did not change the original list:
>>> marxes
['Groucho', 'Chico', 'Harpo']
But, calling the list function sort() on the marxes list does change marxes:
>>> marxes.sort()
>>> marxes
['Chico', 'Groucho', 'Harpo']
If the elements of your list are all of the same type (such as strings in marxes), sort() will work correctly. You can sometimes even mix types (for example, integers and floats) because they are automatically converted to one another by Python in expressions:
>>> numbers = [2, 1, 4.0, 3]
>>> numbers.sort()
>>> numbers
[1, 2, 3, 4.0]
The default sort order is ascending, but you can add the argument reverse=True to set it to descending:
>>> numbers = [2, 1, 4.0, 3]
>>> numbers.sort(reverse=True)
>>> numbers
[4.0, 3, 2, 1]
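A common slip is to write something like marxes = marxes.sort(). Because sort() works in place, it returns None, not the sorted list. The sketch below contrasts the two functions and also shows what happens in Python 3 when the items can’t be compared with one another; the variable names result and descending are only for illustration.

```python
marxes = ['Groucho', 'Chico', 'Harpo']

result = marxes.sort()         # sorts in place and returns None
print(result)                  # None
print(marxes)                  # ['Chico', 'Groucho', 'Harpo']

numbers = [2, 1, 4.0, 3]
descending = sorted(numbers, reverse=True)   # a new list; numbers is untouched
print(descending)              # [4.0, 3, 2, 1]
print(numbers)                 # [2, 1, 4.0, 3]

# Mixing types that cannot be compared raises TypeError in Python 3.
try:
    sorted([2, 'one', 3])
except TypeError as err:
    print('cannot sort:', err)
```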
When you assign one list to more than one variable, changing the list in one place also changes it in the other, as illustrated here:
>>> a = [1, 2, 3]
>>> a
[1, 2, 3]
>>> b = a
>>> b
[1, 2, 3]
>>> a[0] = 'surprise'
>>> a
['surprise', 2, 3]
So what’s in b now? Is it still [1, 2, 3], or ['surprise', 2, 3]? Let’s see:
>>> b
['surprise', 2, 3]
Remember the sticky note analogy in Chapter 2? b just refers to the same list object as a; therefore, whether we change the list contents by using the name a or b, it’s reflected in both:
>>> b
['surprise', 2, 3]
>>> b[0] = 'I hate surprises'
>>> b
['I hate surprises', 2, 3]
>>> a
['I hate surprises', 2, 3]
You can copy the values of a list to an independent, fresh list by using any of these methods:
The list copy() function
The list() conversion function
The list slice [:]
Our original list will be a again. We’ll make b with the list copy() function, c with the list() conversion function, and d with a list slice:
>>> a = [1, 2, 3]
>>> b = a.copy()
>>> c = list(a)
>>> d = a[:]
Again, b, c, and d are copies of a: they are new objects with their own values and no connection to the original list object [1, 2, 3] to which a refers. Changing a does not affect the copies b, c, and d:
>>> a[0] = 'integer lists are boring'
>>> a
['integer lists are boring', 2, 3]
>>> b
[1, 2, 3]
>>> c
[1, 2, 3]
>>> d
[1, 2, 3]
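One caveat: all three of these make a shallow copy. The new list is independent, but if an item is itself a mutable object (such as an inner list), the original and the copy still share that inner object. The sketch below shows the difference and uses deepcopy() from the standard copy module, which this section does not otherwise cover, to copy the nested items too.

```python
import copy

a = [1, [8, 9]]
b = a.copy()            # shallow: new outer list, same inner list
c = copy.deepcopy(a)    # deep: the inner list is copied too

a[0] = 'changed'        # replaces an item in a only
a[1][0] = 'shared'      # changes the inner list that a and b both hold

print(a)   # ['changed', ['shared', 9]]
print(b)   # [1, ['shared', 9]]
print(c)   # [1, [8, 9]]
```

For flat lists of numbers or strings, as in the example above, a shallow copy is all you need.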
Similar to lists, tuples are sequences of arbitrary items. Unlike lists, tuples are immutable, meaning you can’t add, delete, or change items after the tuple is defined. So, a tuple is similar to a constant list.
The syntax to make tuples is a little inconsistent, as we’ll demonstrate in the examples that follow.
Let’s begin by making an empty tuple using ():
>>> empty_tuple = ()
>>> empty_tuple
()
To make a tuple with one or more elements, follow each element with a comma. This works for one-element tuples:
>>> one_marx = 'Groucho',
>>> one_marx
('Groucho',)
If you have more than one element, follow all but the last one with a comma:
>>> marx_tuple = 'Groucho', 'Chico', 'Harpo'
>>> marx_tuple
('Groucho', 'Chico', 'Harpo')
Python includes parentheses when echoing a tuple. You don’t need them (it’s the trailing commas that really define a tuple), but using parentheses doesn’t hurt. You can use them to enclose the values, which helps to make the tuple more visible:
>>> marx_tuple = ('Groucho', 'Chico', 'Harpo')
>>> marx_tuple
('Groucho', 'Chico', 'Harpo')
Tuples let you assign multiple variables at once:
>>> marx_tuple = ('Groucho', 'Chico', 'Harpo')
>>> a, b, c = marx_tuple
>>> a
'Groucho'
>>> b
'Chico'
>>> c
'Harpo'
This is sometimes called tuple unpacking.
You can use tuples to exchange values in one statement without using a temporary variable:
>>> password = 'swordfish'
>>> icecream = 'tuttifrutti'
>>> password, icecream = icecream, password
>>> password
'tuttifrutti'
>>> icecream
'swordfish'
The tuple() conversion function makes tuples from other things:
>>> marx_list = ['Groucho', 'Chico', 'Harpo']
>>> tuple(marx_list)
('Groucho', 'Chico', 'Harpo')
You can often use tuples in place of lists, but they have many fewer functions (there is no append(), insert(), and so on) because they can’t be modified after creation.
Why not just use lists instead of tuples everywhere?
Tuples use less space.
You can’t clobber tuple items by mistake.
You can use tuples as dictionary keys (see the next section).
Named tuples (see “Named Tuples”) can be a simple alternative to objects.
Function arguments are passed as tuples (see “Functions”).
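Two of these points can be sketched in a few lines: a tuple refuses item assignment, and a tuple can serve as a dictionary key where a list cannot. The points dictionary and its contents are made up for illustration, and the example previews try and except.

```python
marx_tuple = ('Groucho', 'Chico', 'Harpo')

# Tuples do not support item assignment, so items can't be clobbered.
try:
    marx_tuple[2] = 'Wanda'
except TypeError as err:
    print('cannot change a tuple:', err)

# A tuple can be a dictionary key...
points = {(0, 0): 'origin', (1, 0): 'one step right'}
print(points[(0, 0)])          # origin

# ...but a list cannot, because lists are mutable.
try:
    points[[0, 1]] = 'one step up'
except TypeError as err:
    print('cannot use a list as a key:', err)
```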
I won’t go into much more detail about tuples here. In everyday programming, you’ll use lists and dictionaries more. Which is a perfect segue to…
A dictionary is similar to a list, but the order of items doesn’t matter, and they aren’t selected by an offset such as 0 or 1. Instead, you specify a unique key to associate with each value. This key is often a string, but it can actually be any of Python’s immutable types: boolean, integer, float, tuple, string, and others that you’ll see in later chapters. Dictionaries are mutable, so you can add, delete, and change their key-value elements.
If you’ve worked with languages that support only arrays or lists, you’ll love dictionaries.
In other languages, dictionaries might be called associative arrays, hashes, or hashmaps. In Python, a dictionary is also called a dict to save syllables.
To create a dictionary, you place curly brackets ({}) around comma-separated key : value pairs. The simplest dictionary is an empty one, containing no keys or values at all:
>>> empty_dict = {}
>>> empty_dict
{}
Let’s make a small dictionary with quotes from Ambrose Bierce’s The Devil’s Dictionary:
>>> bierce = {
...     "day": "A period of twenty-four hours, mostly misspent",
...     "positive": "Mistaken at the top of one's voice",
...     "misfortune": "The kind of fortune that never misses",
...     }
>>>
Typing the dictionary’s name in the interactive interpreter will print its keys and values:
>>> bierce
{'misfortune': 'The kind of fortune that never misses',
'positive': "Mistaken at the top of one's voice",
'day': 'A period of twenty-four hours, mostly misspent'}
In Python, it’s okay to leave a comma after the last item of a list, tuple, or dictionary. Also, you don’t need to indent, as I did in the preceding example, when you’re typing keys and values within the curly braces. It just helps readability.
You can use the dict() function to convert two-value sequences into a dictionary. (You might run into such key-value sequences at times, such as “Strontium, 90, Carbon, 14”, or “Vikings, 20, Packers, 7”.) The first item in each sequence is used as the key and the second as the value.
First, here’s a small example using lol (a list of two-item lists):
>>> lol = [['a', 'b'], ['c', 'd'], ['e', 'f']]
>>> dict(lol)
{'c': 'd', 'a': 'b', 'e': 'f'}
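Remember that the order of keys in a dictionary is arbitrary in older versions of Python, and might differ depending on how you add items. Since Python 3.7, a dictionary keeps its keys in the order in which they were added, so your output might be ordered differently from what is shown here.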
We could have used any sequence containing two-item sequences. Here are other examples.
A list of two-item tuples:
>>> lot = [('a', 'b'), ('c', 'd'), ('e', 'f')]
>>> dict(lot)
{'c': 'd', 'a': 'b', 'e': 'f'}
A tuple of two-item lists:
>>> tol = (['a', 'b'], ['c', 'd'], ['e', 'f'])
>>> dict(tol)
{'c': 'd', 'a': 'b', 'e': 'f'}
A list of two-character strings:
>>> los = ['ab', 'cd', 'ef']
>>> dict(los)
{'c': 'd', 'a': 'b', 'e': 'f'}
A tuple of two-character strings:
>>> tos = ('ab', 'cd', 'ef')
>>> dict(tos)
{'c': 'd', 'a': 'b', 'e': 'f'}
The section “Iterate Multiple Sequences with zip()” introduces you to a function called zip() that makes it easy to create these two-item sequences.
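As a brief preview, and only as a sketch with made-up sample data, zip() pairs up the items of two sequences, and dict() turns those pairs into keys and values:

```python
english = ['Monday', 'Tuesday', 'Wednesday']
french = ['Lundi', 'Mardi', 'Mercredi']

# zip() yields two-item tuples: one item from each list.
pairs = list(zip(english, french))
print(pairs[0])            # ('Monday', 'Lundi')

# dict() uses the first item of each pair as the key, the second as the value.
days = dict(zip(english, french))
print(days['Tuesday'])     # Mardi
```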
Adding an item to a dictionary is easy. Just refer to the item by its key and assign a value. If the key was already present in the dictionary, the existing value is replaced by the new one. If the key is new, it’s added to the dictionary with its value. Unlike lists, you don’t need to worry about Python throwing an exception during assignment by specifying an index that’s out of range.
Let’s make a dictionary of most of the members of Monty Python, using their last names as keys, and first names as values:
>>> pythons = {
...     'Chapman': 'Graham',
...     'Cleese': 'John',
...     'Idle': 'Eric',
...     'Jones': 'Terry',
...     'Palin': 'Michael',
...     }
>>> pythons
{'Cleese': 'John', 'Jones': 'Terry', 'Palin': 'Michael',
'Chapman': 'Graham', 'Idle': 'Eric'}
We’re missing one member: the one born in America, Terry Gilliam. Here’s an attempt by an anonymous programmer to add him, but he’s botched the first name:
>>> pythons['Gilliam'] = 'Gerry'
>>> pythons
{'Cleese': 'John', 'Gilliam': 'Gerry', 'Palin': 'Michael',
'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}
And here’s some repair code by another programmer who is Pythonic in more than one way:
>>> pythons['Gilliam'] = 'Terry'
>>> pythons
{'Cleese': 'John', 'Gilliam': 'Terry', 'Palin': 'Michael',
'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}
By using the same key ('Gilliam'), we replaced the original value 'Gerry' with 'Terry'.
Remember that dictionary keys must be unique. That’s why we used last names for keys instead of first names here: two members of Monty Python have the first name Terry! If you use a key more than once, the last value wins:
>>> some_pythons = {
...     'Graham': 'Chapman',
...     'John': 'Cleese',
...     'Eric': 'Idle',
...     'Terry': 'Gilliam',
...     'Michael': 'Palin',
...     'Terry': 'Jones',
...     }
>>> some_pythons
{'Terry': 'Jones', 'Eric': 'Idle', 'Graham': 'Chapman',
'John': 'Cleese', 'Michael': 'Palin'}
We first assigned the value 'Gilliam' to the key 'Terry' and then replaced it with the value 'Jones'.
You can use the update() function to copy the keys and values of one dictionary into another. Let’s define the pythons dictionary, with all members:
>>> pythons = {
...     'Chapman': 'Graham',
...     'Cleese': 'John',
...     'Gilliam': 'Terry',
...     'Idle': 'Eric',
...     'Jones': 'Terry',
...     'Palin': 'Michael',
...     }
>>> pythons
{'Cleese': 'John', 'Gilliam': 'Terry', 'Palin': 'Michael',
'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}
We also have a dictionary of other humorous persons called others:
>>> others = {'Marx': 'Groucho', 'Howard': 'Moe'}
Now, along comes another anonymous programmer who thinks the members of others should be members of Monty Python:
>>> pythons.update(others)
>>> pythons
{'Cleese': 'John', 'Howard': 'Moe', 'Gilliam': 'Terry',
'Palin': 'Michael', 'Marx': 'Groucho', 'Chapman': 'Graham',
'Idle': 'Eric', 'Jones': 'Terry'}
What happens if the second dictionary has the same key as the dictionary into which it’s being merged? The value from the second dictionary wins:
>>> first = {'a': 1, 'b': 2}
>>> second = {'b': 'platypus'}
>>> first.update(second)
>>> first
{'b': 'platypus', 'a': 1}
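Because update() changes the first dictionary, you might want to merge into a fresh dictionary and leave both originals alone. The sketch below shows two ways: copy() followed by update(), and the ** unpacking syntax, which is available in Python 3.5 and later and is not otherwise covered in this section. The names merged and also_merged are only for illustration.

```python
first = {'a': 1, 'b': 2}
second = {'b': 'platypus'}

# Copy first, then update the copy; first itself is left alone.
merged = first.copy()
merged.update(second)

# Unpacking both dictionaries into a new one gives the same result.
also_merged = {**first, **second}

print(merged == also_merged)   # True
print(first)                   # {'a': 1, 'b': 2}
```

In both cases, the value from the second dictionary still wins when a key appears in both.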
Our anonymous programmer’s code was technically correct. But, he shouldn’t have done it! The members of others, although funny and famous, were not in Monty Python. Let’s undo those last two additions:
>>> del pythons['Marx']
>>> pythons
{'Cleese': 'John', 'Howard': 'Moe', 'Gilliam': 'Terry',
'Palin': 'Michael', 'Chapman': 'Graham', 'Idle': 'Eric',
'Jones': 'Terry'}
>>> del pythons['Howard']
>>> pythons
{'Cleese': 'John', 'Gilliam': 'Terry', 'Palin': 'Michael',
'Chapman': 'Graham', 'Idle': 'Eric', 'Jones': 'Terry'}
If you want to know whether a key exists in a dictionary, use in. Let’s redefine the pythons dictionary again, this time omitting a name or two:
>>> pythons = {'Chapman': 'Graham', 'Cleese': 'John',
...     'Jones': 'Terry', 'Palin': 'Michael'}
Now let’s see who’s in there:
>>> 'Chapman' in pythons
True
>>> 'Palin' in pythons
True
Did we remember to add Terry Gilliam this time?
>>> 'Gilliam' in pythons
False
Drat.
This is the most common use of a dictionary. You specify the dictionary and key to get the corresponding value:
>>> pythons['Cleese']
'John'
If the key is not present in the dictionary, you’ll get an exception:
>>> pythons['Marx']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'Marx'
There are two good ways to avoid this. The first is to test for the key at the outset by using in, as you saw in the previous section:
>>> 'Marx' in pythons
False
The second is to use the special dictionary get() function. You provide the dictionary, key, and an optional value. If the key exists, you get its value:
>>> pythons.get('Cleese')
'John'
If not, you get the optional value, if you specified one:
>>> pythons.get('Marx', 'Not a Python')
'Not a Python'
Otherwise, you get None (which displays nothing in the interactive interpreter):
>>> pythons.get('Marx')
>>>
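The optional value makes get() handy for building up a dictionary as you go. As a sketch (the counts dictionary and the word list are only sample data, and the for loop is a preview of the next chapter), here is one way to count how often each word occurs without testing for each key first:

```python
words = ['a', 'deer', 'a', 'female', 'deer']

counts = {}
for word in words:
    # If word is not a key yet, get() supplies 0 instead of raising KeyError.
    counts[word] = counts.get(word, 0) + 1

print(counts['deer'])          # 2
print(counts.get('doe', 0))    # 0
```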
You can use keys() to get all the keys in a dictionary. We’ll use a different sample dictionary for the next few examples:
>>> signals = {'green': 'go', 'yellow': 'go faster', 'red': 'smile for the camera'}
>>> signals.keys()
dict_keys(['green', 'red', 'yellow'])
In Python 2, keys() just returns a list. Python 3 returns dict_keys(), which is an iterable view of the keys. This is handy with large dictionaries because it doesn’t use the time and memory to create and store a list that you might not use. But often you actually do want a list. In Python 3, you need to call list() to convert a dict_keys object to a list.
>>> list(signals.keys())
['green', 'red', 'yellow']
In Python 3, you also need to use the list() function to turn the results of values() and items() into normal Python lists. I’m using that in these examples.
When you want to get all the key-value pairs from a dictionary, use the items() function:
>>> list(signals.items())
[('green', 'go'), ('red', 'smile for the camera'), ('yellow', 'go faster')]
Each key and value is returned as a tuple, such as ('green', 'go').
As with lists, if you make a change to a dictionary, it will be reflected in all the names that refer to it.
>>> signals = {'green': 'go', 'yellow': 'go faster', 'red': 'smile for the camera'}
>>> save_signals = signals
>>> signals['blue'] = 'confuse everyone'
>>> save_signals
{'blue': 'confuse everyone', 'green': 'go',
'red': 'smile for the camera', 'yellow': 'go faster'}
To actually copy keys and values from a dictionary to another dictionary and avoid this, you can use copy():
>>> signals = {'green': 'go', 'yellow': 'go faster', 'red': 'smile for the camera'}
>>> original_signals = signals.copy()
>>> signals['blue'] = 'confuse everyone'
>>> signals
{'blue': 'confuse everyone', 'green': 'go',
'red': 'smile for the camera', 'yellow': 'go faster'}
>>> original_signals
{'green': 'go', 'red': 'smile for the camera', 'yellow': 'go faster'}
A set is like a dictionary with its values thrown away, leaving only the keys. As with a dictionary, each key must be unique. You use a set when you only want to know that something exists, and nothing else about it. Use a dictionary if you want to attach some information to the key as a value.
At some bygone time, in some places, set theory was taught in elementary school along with basic mathematics. If your school skipped it (or covered it and you were staring out the window as I often did), Figure 3-1 shows the ideas of union and intersection.
Suppose that you take the union of two sets that have some keys in common. Because a set must contain only one of each item, the union of two sets will contain only one of each key. The null or empty set is a set with zero elements. In Figure 3-1, an example of a null set would be female names beginning with X.
To create a set, you use the set() function or enclose one or more comma-separated values in curly brackets, as shown here:
>>> empty_set = set()
>>> empty_set
set()
>>> even_numbers = {0, 2, 4, 6, 8}
>>> even_numbers
{0, 8, 2, 4, 6}
>>> odd_numbers = {1, 3, 5, 7, 9}
>>> odd_numbers
{9, 3, 1, 5, 7}
As with dictionary keys, sets are unordered.
You can create a set from a list, string, tuple, or dictionary, discarding any duplicate values.
First, let’s take a look at a string with more than one occurrence of some letters:
>>> set('letters')
{'l', 'e', 't', 'r', 's'}
Notice that the set contains only one 'e' or 't', even though 'letters' contained two of each.
Now, let’s make a set from a list:
>>> set(['Dasher', 'Dancer', 'Prancer', 'Mason-Dixon'])
{'Dancer', 'Dasher', 'Prancer', 'Mason-Dixon'}
This time, a set from a tuple:
>>> set(('Ummagumma', 'Echoes', 'Atom Heart Mother'))
{'Ummagumma', 'Atom Heart Mother', 'Echoes'}
When you give set() a dictionary, it uses only the keys:
>>> set({'apple': 'red', 'orange': 'orange', 'cherry': 'red'})
{'apple', 'cherry', 'orange'}
This is the most common use of a set.
We’ll make a dictionary called drinks. Each key is the name of a mixed drink, and the corresponding value is a set of its ingredients:
>>> drinks = {
...     'martini': {'vodka', 'vermouth'},
...     'black russian': {'vodka', 'kahlua'},
...     'white russian': {'cream', 'kahlua', 'vodka'},
...     'manhattan': {'rye', 'vermouth', 'bitters'},
...     'screwdriver': {'orange juice', 'vodka'}
...     }
Even though both are enclosed by curly braces ({ and }), a set is just a collection of values, and a dictionary is one or more key : value pairs.
Which drinks contain vodka? (Note that I’m previewing the use of for, if, and, and or from the next chapter for these tests.)
>>> for name, contents in drinks.items():
...     if 'vodka' in contents:
...         print(name)
...
screwdriver
martini
black russian
white russian
We want something with vodka but are lactose intolerant, and think vermouth tastes like kerosene:
>>> for name, contents in drinks.items():
...     if 'vodka' in contents and not ('vermouth' in contents or
...         'cream' in contents):
...         print(name)
...
screwdriver
black russian
We’ll rewrite this a bit more succinctly in the next section.
What if you want to check for combinations of set values? Suppose that you want to find any drink that has orange juice or vermouth. We’ll use the set intersection operator, which is an ampersand (&):
>>> for name, contents in drinks.items():
...     if contents & {'vermouth', 'orange juice'}:
...         print(name)
...
screwdriver
martini
manhattan
The result of the & operator is a set, which contains all the items that appear in both sets that you compare. If neither of those ingredients were in contents, the & returns an empty set, which is considered False.
Now, let’s rewrite the example from the previous section, in which we wanted vodka but neither cream nor vermouth:
>>> for name, contents in drinks.items():
...     if 'vodka' in contents and not contents & {'vermouth', 'cream'}:
...         print(name)
...
screwdriver
black russian
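The same test can be wrapped up for reuse. The sketch below is one possible way to do it, not the only one: the function name drinks_with and its parameters are made up for this example, it previews functions and generator expressions from later chapters, and it uses the set isdisjoint() function, which returns True when two sets have no members in common.

```python
drinks = {
    'martini': {'vodka', 'vermouth'},
    'black russian': {'vodka', 'kahlua'},
    'white russian': {'cream', 'kahlua', 'vodka'},
    'manhattan': {'rye', 'vermouth', 'bitters'},
    'screwdriver': {'orange juice', 'vodka'},
}


def drinks_with(drinks, wanted, unwanted=()):
    """Return sorted names of drinks with all wanted and no unwanted items."""
    wanted = set(wanted)
    unwanted = set(unwanted)
    return sorted(
        name
        for name, contents in drinks.items()
        # wanted <= contents: every wanted ingredient is in the drink.
        if wanted <= contents and contents.isdisjoint(unwanted)
    )


print(drinks_with(drinks, {'vodka'}, {'vermouth', 'cream'}))
# ['black russian', 'screwdriver']
```

The names are sorted here only so that the result doesn’t depend on the order in which the dictionary returns its items.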
Let’s save the ingredient sets for these two drinks in variables, just to save typing in the coming examples:
>>> bruss = drinks['black russian']
>>> wruss = drinks['white russian']
The following are examples of all the set operators. Some have special punctuation, some have special functions, and some have both. We’ll use test sets a (contains 1 and 2) and b (contains 2 and 3):
>>> a = {1, 2}
>>> b = {2, 3}
You get the intersection (members common to both sets) with the special punctuation symbol & or the set intersection() function, as demonstrated here:
>>> a & b
{2}
>>> a.intersection(b)
{2}
This snippet uses our saved drink variables:
>>> bruss & wruss
{'kahlua', 'vodka'}
In this example, you get the union (members of either set) by using | or the set union() function:
>>> a | b
{1, 2, 3}
>>> a.union(b)
{1, 2, 3}
And here’s the alcoholic version:
>>> bruss | wruss
{'cream', 'kahlua', 'vodka'}
The difference (members of the first set but not the second) is obtained by using the character - or difference():
>>> a - b
{1}
>>> a.difference(b)
{1}
>>> bruss - wruss
set()
>>> wruss - bruss
{'cream'}
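The punctuation and function forms are not quite interchangeable. As a rough sketch: the functions accept any iterable (a list or tuple, for example), whereas the punctuation forms expect a set on both sides. The example uses try and except, which are covered later, only to catch the error. The name operator_failed is made up for this illustration.

```python
a = {1, 2}

# The function forms accept a list or tuple as the argument.
print(a.union([2, 3]))         # {1, 2, 3}
print(a.intersection((2, 3)))  # {2}
print(a.difference([2, 3]))    # {1}

# The punctuation forms need sets on both sides.
operator_failed = False
try:
    a | [2, 3]
except TypeError:
    operator_failed = True
print(operator_failed)         # True

# None of these operations changed a itself.
print(a)                       # {1, 2}
```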
By far, the most common set operations are union, intersection, and difference. I’ve included the others for completeness in the examples that follow, but you might never use them.
The exclusive or (items in one set or the other, but not both) uses ^ or symmetric_difference():
>>> a ^ b
{1, 3}
>>> a.symmetric_difference(b)
{1, 3}
This finds the exclusive ingredient in our two russian drinks:
>>> bruss ^ wruss
{'cream'}
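One way to think about the exclusive or is as a combination of the three common operations. The following sketch, using the same test sets a and b, builds it in two different ways and compares each with ^.

```python
a = {1, 2}
b = {2, 3}

# Everything in either set, minus what they share
print((a | b) - (a & b))            # {1, 3}

# What only a has, plus what only b has
print((a - b) | (b - a))            # {1, 3}

# Both agree with the ^ operator
print(a ^ b == (a | b) - (a & b))   # True
print(a ^ b == (a - b) | (b - a))   # True
```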
You can check whether one set is a subset of another (all members of the first set are also in the second set) by using <= or issubset():
>>> a <= b
False
>>> a.issubset(b)
False
Adding cream to a black russian makes a white russian, so wruss is a superset of bruss:
>>> bruss <= wruss
True
Is any set a subset of itself? Yup.
>>> a <= a
True
>>> a.issubset(a)
True
For the first set to be a proper subset, the second set needs to have all the members of the first and more. Calculate it by using <, as in this example:
>>> a < b
False
>>> a < a
False
>>> bruss < wruss
True
A superset is the opposite of a subset (all members of the second set are also members of the first). This uses >= or issuperset():
>>> a >= b
False
>>> a.issuperset(b)
False
>>> wruss >= bruss
True
Any set is a superset of itself:
>>> a >= a
True
>>> a.issuperset(a)
True
And finally, you can find a proper superset (the first set has all members of the second, and more) by using >, as shown here:
>>> a > b
False
>>> wruss > bruss
True
You can’t be a proper superset of yourself:
>>> a > a
False
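To tie the four comparisons together, here is a small sketch. It reuses a and b and adds a third test set, c, which is an assumption made for this example. It shows that a proper subset is just a subset that is not equal, and that two overlapping sets can fail every one of these tests.

```python
a = {1, 2}
b = {2, 3}
c = {1, 2, 3}   # extra test set, assumed for this example

# a is a subset of c, and a proper one because c has more members
print(a <= c, a < c)    # True True

# c is a subset of itself, but not a proper subset
print(c <= c, c < c)    # True False

# a and b overlap, yet neither contains the other
print(a <= b, a >= b)   # False False

# Proper subset means "subset, and not equal"
print((a < c) == (a <= c and a != c))   # True

# Superset tests are the subset tests read in the other direction
print((c >= a) == (a <= c))             # True
```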
To review: you make a list by using square brackets ([]), a tuple by using commas, and a dictionary by using curly brackets ({}). In each case, you access a single element with square brackets:
>>> marx_list = ['Groucho', 'Chico', 'Harpo']
>>> marx_tuple = 'Groucho', 'Chico', 'Harpo'
>>> marx_dict = {'Groucho': 'banjo', 'Chico': 'piano', 'Harpo': 'harp'}
>>> marx_list[2]
'Harpo'
>>> marx_tuple[2]
'Harpo'
>>> marx_dict['Harpo']
'harp'
For the list and tuple, the value between the square brackets is an integer offset. For the dictionary, it’s a key. For all three, the result is a value.
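The difference between an offset and a key shows up most clearly when the lookup fails. The following is a minimal sketch with the same three Marx structures. It uses try and except (covered later) only to catch the errors, and the name 'Zeppo' is just an example of a missing key.

```python
marx_list = ['Groucho', 'Chico', 'Harpo']
marx_tuple = 'Groucho', 'Chico', 'Harpo'
marx_dict = {'Groucho': 'banjo', 'Chico': 'piano', 'Harpo': 'harp'}

# An offset past the end of a list or tuple raises IndexError.
try:
    marx_list[3]
except IndexError:
    print('marx_list has no offset 3')

# A missing dictionary key raises KeyError.
try:
    marx_dict['Zeppo']
except KeyError:
    print('marx_dict has no key Zeppo')

# A dictionary treats an integer in brackets as a key, not a position.
try:
    marx_dict[2]
except KeyError:
    print('marx_dict has no key 2')

# get() returns a default value instead of raising an exception.
print(marx_dict.get('Zeppo', 'unknown'))   # unknown
```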
We worked up from simple booleans, numbers, and strings to lists, tuples, sets, and dictionaries. You can combine these built-in data structures into bigger, more complex structures of your own. Let’s start with three different lists:
>>> marxes = ['Groucho', 'Chico', 'Harpo']
>>> pythons = ['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin']
>>> stooges = ['Moe', 'Curly', 'Larry']
We can make a tuple that contains each list as an element:
>>> tuple_of_lists = marxes, pythons, stooges
>>> tuple_of_lists
(['Groucho', 'Chico', 'Harpo'],
['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin'],
['Moe', 'Curly', 'Larry'])
And, we can make a list that contains the three lists:
>>> list_of_lists = [marxes, pythons, stooges]
>>> list_of_lists
[['Groucho', 'Chico', 'Harpo'],
['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin'],
['Moe', 'Curly', 'Larry']]
Finally, let’s create a dictionary of lists. In this example, let’s use the name of the comedy group as the key and the list of members as the value:
>>> dict_of_lists = {'Marxes': marxes, 'Pythons': pythons, 'Stooges': stooges}
>>> dict_of_lists
{'Stooges': ['Moe', 'Curly', 'Larry'],
'Marxes': ['Groucho', 'Chico', 'Harpo'],
'Pythons': ['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin']}
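Two things are worth trying with these nested structures. First, you reach an inner element by chaining square brackets. Second, the outer structure holds references to the original lists rather than copies, so a change to marxes shows up everywhere it is used. A minimal sketch follows. Appending 'Zeppo' is just an example change.

```python
marxes = ['Groucho', 'Chico', 'Harpo']
pythons = ['Chapman', 'Cleese', 'Gilliam', 'Jones', 'Palin']
stooges = ['Moe', 'Curly', 'Larry']

tuple_of_lists = marxes, pythons, stooges
dict_of_lists = {'Marxes': marxes, 'Pythons': pythons, 'Stooges': stooges}

# Chain brackets: the first picks the inner list, the second the element.
print(tuple_of_lists[1][0])         # Chapman
print(dict_of_lists['Stooges'][2])  # Larry

# The containers refer to the same list objects, not copies.
marxes.append('Zeppo')
print(tuple_of_lists[0])            # ['Groucho', 'Chico', 'Harpo', 'Zeppo']
print(dict_of_lists['Marxes'])      # ['Groucho', 'Chico', 'Harpo', 'Zeppo']
print(dict_of_lists['Marxes'] is marxes)   # True
```

The tuple itself is still immutable: you can’t replace tuple_of_lists[0] with a different object, even though the list it refers to can change.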
Your only limitations are those in the data types themselves. For example, dictionary keys need to be immutable, so a list, dictionary, or set can’t be a key for another dictionary. But a tuple can be. For example, you could index sites of interest by GPS coordinates (latitude, longitude, and altitude; see “Maps” for more mapping examples):
>>> houses = {
...     (44.79, -93.14, 285): 'My House',
...     (38.89, -77.03, 13): 'The White House'
...     }
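Here is a short sketch of what using those tuple keys might look like, along with what happens if you try a list as a key instead. The name list_key_failed is made up for this illustration, and try and except are covered later.

```python
houses = {
    (44.79, -93.14, 285): 'My House',
    (38.89, -77.03, 13): 'The White House',
}

# Look up a value by giving the whole tuple as the key.
print(houses[(38.89, -77.03, 13)])   # The White House

# The parentheses are optional inside the square brackets.
print(houses[44.79, -93.14, 285])    # My House

# A list is mutable, so it can't be used as a dictionary key.
list_key_failed = False
try:
    bad = {[44.79, -93.14, 285]: 'My House'}
except TypeError:
    list_key_failed = True
print(list_key_failed)               # True
```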
In this chapter, you saw more complex data structures: lists, tuples, dictionaries, and sets. Using these and those from Chapter 2 (numbers and strings), you can represent elements in the real world with great variety.
3.1. Create a list called years_list, starting with the year of your birth, and each year thereafter until the year of your fifth birthday. For example, if you were born in 1980, the list would be years_list = [1980, 1981, 1982, 1983, 1984, 1985].
If you’re less than five years old and reading this book, I don’t know what to tell you.
3.2. In which year in years_list was your third birthday?
Remember, you were 0 years of age for your first year.
3.3. In which year in years_list were you the oldest?
3.4. Make a list called things with these three strings as elements: "mozzarella", "cinderella", "salmonella".
3.5. Capitalize the element in things that refers to a person and then print the list. Did it change the element in the list?
3.6. Make the cheesy element of things all uppercase and then print the list.
3.7. Delete the disease element from things, collect your Nobel Prize, and print the list.
3.8. Create a list called surprise with the elements "Groucho", "Chico", and "Harpo".
3.9. Lowercase the last element of the surprise list, reverse it, and then capitalize it.
3.10. Make an English-to-French dictionary called e2f and print it. Here are your starter words: dog is chien, cat is chat, and walrus is morse.
3.11. Using your three-word dictionary e2f, print the French word for walrus.
3.12. Make a French-to-English dictionary called f2e from e2f. Use the items method.
3.13. Using f2e, print the English equivalent of the French word chien.
3.14. Make and print a set of English words from the keys in e2f.
3.15. Make a multilevel dictionary called life. Use these strings for the topmost keys: 'animals', 'plants', and 'other'. Make the 'animals' key refer to another dictionary with the keys 'cats', 'octopi', and 'emus'. Make the 'cats' key refer to a list of strings with the values 'Henri', 'Grumpy', and 'Lucy'. Make all the other keys refer to empty dictionaries.
3.16. Print the top-level keys of life.
3.17. Print the keys for life['animals'].
3.18. Print the values for life['animals']['cats'].