During your bottom-up climb, you’ve progressed from built-in data types to constructing ever-larger data and code structures. In this chapter, you’ll finally get down to brass tacks and learn how to write realistic, large programs in Python.
Thus far, you’ve been writing and running code fragments such as the following within Python’s interactive interpreter:
>>> print("This interactive snippet works.")
This interactive snippet works.
Now let’s make your first standalone program. On your computer, create a file called test1.py containing this single line of Python code:
print("This standalone program works!")
Notice that there’s no >>> prompt, just a single line of Python code. Ensure that there is no indentation in the line before print.
If you’re running Python in a text terminal or terminal window, type the name of your Python interpreter (python in these examples) followed by the program filename:
$ python test1.py
This standalone program works!
You can save all of the interactive snippets that you’ve seen in this book so far to files and run them directly. If you’re cutting and pasting, ensure that you delete the initial >>> and ... prompts (including the final space).
On your computer, create a file called test2.py that contains these two lines:
import sys
print('Program arguments:', sys.argv)
Now, use your version of Python to run this program. Here’s how it might look in a Linux or Mac OS X terminal window using a standard shell program:
$ python test2.py
Program arguments: ['test2.py']
$ python test2.py tra la la
Program arguments: ['test2.py', 'tra', 'la', 'la']
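As a small sketch of how a program might use these arguments, the following separates the program name from the rest of the list. The function name describe_arguments and the sample list are illustrative assumptions, not part of test2.py; in a real program you would pass sys.argv instead of the sample list.

```python
def describe_arguments(argv):
    """Split an argv-style list into (program name, remaining arguments)."""
    # By convention, the first item is the name of the program itself.
    return argv[0], argv[1:]


# A stand-in for sys.argv after running: python test2.py tra la la
sample_argv = ['test2.py', 'tra', 'la', 'la']

program, arguments = describe_arguments(sample_argv)
print('Program:', program)
print('Argument count:', len(arguments))
print('Arguments:', arguments)
```

Every item in the list is a string, so a program that expects numbers would need to convert them itself, for example with int().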
We’re going to step up another level, creating and using Python code in more than one file. A module is just a file of Python code.
The text of this book is organized in a hierarchy: words, sentences, paragraphs, and chapters. Otherwise, it would be unreadable after a page or two. Code has a roughly similar bottom-up organization: data types are like words, statements are like sentences, functions are like paragraphs, and modules are like chapters. To continue the analogy, in this book, when I say that something will be explained in Chapter 8, in programming, that’s like referring to code in another module.
We refer to code of other modules by using the import statement. This makes the code and variables in the imported module available to your program. The simplest use of the import statement is import module, where module is the name of another Python file, without the .py extension.
Let’s simulate a weather station and print a weather report. One main program prints the report, and a separate module with a single function returns the weather description used by the report.
Here’s the main program (call it weatherman.py):
import report
description = report.get_description()
print("Today's weather:", description)
And here is the module (report.py):
def get_description():  # see the docstring below?
    """Return random weather, just like the pros"""
    from random import choice
    possibilities = ['rain', 'snow', 'sleet', 'fog', 'sun', 'who knows']
    return choice(possibilities)
If you have these two files in the same directory and instruct Python to run weatherman.py as the main program, it will access the report module and run its get_description() function. We wrote this version of get_description() to return a random result from a list of strings, so that’s what the main program will get back and print:
$ python weatherman.py
Today's weather: who knows
$ python weatherman.py
Today's weather: sun
$ python weatherman.py
Today's weather: sleet
We used imports in two different places:
The main program weatherman.py imported the module report.
In the module file report.py, the get_description() function imported the choice function from Python’s standard random module.
We also used imports in two different ways:
The main program called import report and then ran report.get_description().
The get_description() function in report.py called from random import choice and then ran choice(possibilities).
In the first case, we imported the entire report module but needed to use report. as a prefix to get_description(). After this import statement, everything in report.py is available to the main program, as long as we tack report. before its name. By qualifying the contents of a module with the module’s name, we avoid any nasty naming conflicts. There could be a get_description() function in some other module, and we would not call it by mistake.
In the second case, we’re within a function and know that nothing else named choice is here, so we imported the choice() function from the random module directly.
We could have written the function like the following snippet, which returns random results:
def get_description():
    import random
    possibilities = ['rain', 'snow', 'sleet', 'fog', 'sun', 'who knows']
    return random.choice(possibilities)
As with many aspects of programming, pick the style that seems clearest to you. The module-qualified name (random.choice) is safer but requires a little more typing.
These get_description() examples showed variations of what to import, but not where to do the importing: they all called import from inside the function. We could have imported random from outside the function:
>>> import random
>>> def get_description():
...     possibilities = ['rain', 'snow', 'sleet', 'fog', 'sun', 'who knows']
...     return random.choice(possibilities)
...
>>> get_description()
'who knows'
>>> get_description()
'rain'
You should consider importing from outside the function if the imported code might be used in more than one place, and from inside if you know its use will be limited. Some people prefer to put all their imports at the top of the file, just to make all the dependencies of their code explicit. Either way works.
In our main weatherman.py program, we called import report. But what if you have another module with the same name or want to use a name that is more mnemonic or shorter? In such a situation, you can import using an alias. Let’s use the alias wr:
import report as wr
description = wr.get_description()
print("Today's weather:", description)
With Python, you can import one or more parts of a module. Each part can keep its original name or you can give it an alias. First, let’s import get_description() from the report module with its original name:
from report import get_description
description = get_description()
print("Today's weather:", description)
Now, import it as do_it:
from report import get_description as do_it
description = do_it()
print("Today's weather:", description)
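The same two alias forms work with standard library modules. The sketch below is illustrative only: the alias names rnd and pick are arbitrary choices, and it uses the standard random module rather than report so that it runs without any extra files.

```python
import random as rnd               # the whole module, under an alias
from random import choice as pick  # one function, under an alias

possibilities = ['rain', 'snow', 'sleet', 'fog', 'sun', 'who knows']

# Both names lead to the same underlying function.
first = rnd.choice(possibilities)
second = pick(possibilities)

print("Via module alias:", first)
print("Via function alias:", second)
print("Same function object?", pick is rnd.choice)
```

An alias changes only the name you type in your own file; the imported code itself is untouched.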
Where does Python look for files to import? It uses a list of directory names and ZIP archive files stored in the standard sys module as the variable path. You can access and modify this list. Here’s the value of sys.path for Python 3.3 on my Mac:
>>> import sys
>>> for place in sys.path:
...     print(place)
...

/Library/Frameworks/Python.framework/Versions/3.3/lib/python33.zip
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/plat-darwin
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/lib-dynload
/Library/Frameworks/Python.framework/Versions/3.3/lib/python3.3/site-packages
That initial blank output line is the empty string '', which stands for the current directory. If '' is first in sys.path, Python looks in the current directory first when you try to import something: import report looks for report.py.
The first match will be used. This means that if you define a module named random and it’s in the search path before the standard library, you won’t be able to access the standard library’s random module.
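If you suspect that a module of your own is hiding a standard one, one way to check is to ask the imported module where it came from. This is a minimal sketch; the helper name module_location is an assumption made for illustration, and it relies on the __file__ attribute that modules loaded from files normally carry.

```python
import random
import sys


def module_location(module):
    """Return the file a module was loaded from, or None if it has none."""
    return getattr(module, '__file__', None)


# If this path points into your own project directory rather than the
# Python installation, a local random.py is shadowing the standard one.
print('random was loaded from:', module_location(random))

# The first entry of the search path is consulted first.
print('First search path entry:', sys.path[:1])
```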
We went from single lines of code, to multiline functions, to standalone programs, to multiple modules in the same directory. To allow Python applications to scale even more, you can organize modules into file hierarchies called packages.
Maybe we want different types of text forecasts: one for the next day and one for the next week. One way to structure this is to make a directory named sources, and create two modules within it: daily.py and weekly.py. Each has a function called forecast. The daily version returns a string, and the weekly version returns a list of seven strings.
Here’s the main program and the two modules. (The enumerate() function takes apart a list and feeds each item of the list to the for loop, adding a number to each item as a little bonus.)
Main program (weather.py):
from sources import daily, weekly
print("Daily forecast:", daily.forecast())
print("Weekly forecast:")
for number, outlook in enumerate(weekly.forecast(), 1):
    print(number, outlook)
Module 1 (sources/daily.py):
def forecast():
    """Fake daily forecast"""
    return 'like yesterday'
Module 2 (sources/weekly.py):
def forecast():
    """Fake weekly forecast"""
    return ['snow', 'more snow', 'sleet', 'freezing rain', 'rain', 'fog', 'hail']
You’ll need one more thing in the sources directory: a file named __init__.py. This can be empty, but Python needs it to treat the directory containing it as a package.
Run the main weather.py program to see what happens:
$ python weather.py
Daily forecast: like yesterday
Weekly forecast:
1 snow
2 more snow
3 sleet
4 freezing rain
5 rain
6 fog
7 hail
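To see the directory layout in one place, the following sketch builds a throwaway copy of the sources package in a temporary directory and imports from it. Creating the files from code, the temporary directory, and the use of importlib are assumptions made so the example is self-contained; in practice you would simply create the files with an editor.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as top:
    # Layout being created:
    #   top/sources/__init__.py   (empty)
    #   top/sources/daily.py
    package_dir = Path(top) / "sources"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    (package_dir / "daily.py").write_text(
        "def forecast():\n"
        "    return 'like yesterday'\n"
    )

    # Put the parent directory on the search path so that
    # 'sources' can be found, then import the module inside it.
    sys.path.insert(0, top)
    importlib.invalidate_caches()
    try:
        daily = importlib.import_module("sources.daily")
        result = daily.forecast()
    finally:
        sys.path.remove(top)

print("Daily forecast:", result)
```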
One of Python’s prominent claims is that it has “batteries included”—a large standard library of modules that perform many useful tasks, and are kept separate to avoid bloating the core language. When you’re about to write some Python code, it’s often worthwhile to first check whether there’s a standard module that already does what you want. It’s surprising how often you encounter little gems in the standard library. Python also provides authoritative documentation for the modules, along with a tutorial. Doug Hellmann’s website Python Module of the Week and his book The Python Standard Library by Example (Addison-Wesley Professional) are also very useful guides.
Upcoming chapters in this book feature many of the standard modules that are specific to the Web, systems, databases, and so on. In this section, I’ll talk about some standard modules that have generic uses.
You’ve seen that trying to access a dictionary with a nonexistent key raises an exception. Using the dictionary get() function to return a default value avoids an exception. The setdefault() function is like get(), but also assigns an item to the dictionary if the key is missing:
>>> periodic_table = {'Hydrogen': 1, 'Helium': 2}
>>> print(periodic_table)
{'Helium': 2, 'Hydrogen': 1}
If the key was not already in the dictionary, the new value is used:
>>> carbon = periodic_table.setdefault('Carbon', 12)
>>> carbon
12
>>> periodic_table
{'Helium': 2, 'Carbon': 12, 'Hydrogen': 1}
If we try to assign a different default value to an existing key, the original value is returned and nothing is changed:
>>> helium = periodic_table.setdefault('Helium', 947)
>>> helium
2
>>> periodic_table
{'Helium': 2, 'Carbon': 12, 'Hydrogen': 1}
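Because setdefault() returns the value that ends up in the dictionary, a common use is to build up a list per key. The following is a small sketch; the word list and the variable names are made up for illustration.

```python
words = ['apple', 'avocado', 'banana', 'blueberry', 'cherry']

by_first_letter = {}
for word in words:
    # The first time a letter is seen, setdefault() stores a new empty
    # list under it; every time, it returns the list stored for that key.
    by_first_letter.setdefault(word[0], []).append(word)

print(by_first_letter)
```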
defaultdict() is similar, but specifies the default value for any new key up front, when the dictionary is created. Its argument is a function. In this example, we pass the function int, which will be called as int() and return the integer 0:
>>> from collections import defaultdict
>>> periodic_table = defaultdict(int)
Now, any missing value will be an integer (int), with the value 0:
>>> periodic_table['Hydrogen'] = 1
>>> periodic_table['Lead']
0
>>> periodic_table
defaultdict(<class 'int'>, {'Lead': 0, 'Hydrogen': 1})
The argument to defaultdict() is a function that returns the value to be assigned to a missing key. In the following example, no_idea() is executed to return a value when needed:
>>> from collections import defaultdict
>>>
>>> def no_idea():
...     return 'Huh?'
...
>>> bestiary = defaultdict(no_idea)
>>> bestiary['A'] = 'Abominable Snowman'
>>> bestiary['B'] = 'Basilisk'
>>> bestiary['A']
'Abominable Snowman'
>>> bestiary['B']
'Basilisk'
>>> bestiary['C']
'Huh?'
You can use the functions int(), list(), or dict() to return default empty values for those types: int() returns 0, list() returns an empty list ([]), and dict() returns an empty dictionary ({}). If you omit the argument, no default is created: looking up a missing key raises a KeyError, just as it would with a normal dictionary.
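The difference between passing a factory and passing nothing can be checked directly. This is a brief sketch with made-up names (no_factory, animals_by_class); it only uses defaultdict from the standard collections module.

```python
from collections import defaultdict

# With no argument there is no default factory, so a missing key
# raises KeyError exactly as a plain dict would.
no_factory = defaultdict()
try:
    no_factory['missing']
    outcome = 'returned a value'
except KeyError:
    outcome = 'raised KeyError'
print('No factory:', outcome)

# With list as the factory, a missing key starts out as an empty list,
# which can be appended to right away.
animals_by_class = defaultdict(list)
animals_by_class['mammal'].append('walrus')
animals_by_class['mammal'].append('otter')
animals_by_class['bird'].append('puffin')
print(dict(animals_by_class))
```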
By the way, you can use lambda to define your default-making function right inside the call:
>>> bestiary = defaultdict(lambda: 'Huh?')
>>> bestiary['E']
'Huh?'
Using int is one way to make your own counter:
>>> from collections import defaultdict
>>> food_counter = defaultdict(int)
>>> for food in ['spam', 'spam', 'eggs', 'spam']:
...     food_counter[food] += 1
...
>>> for food, count in food_counter.items():
...     print(food, count)
...
eggs 1
spam 3
In the preceding example, if food_counter had been a normal dictionary instead of a defaultdict, Python would have raised an exception the first time we tried to increment each dictionary element food_counter[food], because it would not have been initialized. We would have needed to do some extra work, as shown here:
>>> dict_counter = {}
>>> for food in ['spam', 'spam', 'eggs', 'spam']:
...     if food not in dict_counter:
...         dict_counter[food] = 0
...     dict_counter[food] += 1
...
>>> for food, count in dict_counter.items():
...     print(food, count)
...
spam 3
eggs 1
Speaking of counters, the standard library has one that does the work of the previous example and more:
>>> from collections import Counter
>>> breakfast = ['spam', 'spam', 'eggs', 'spam']
>>> breakfast_counter = Counter(breakfast)
>>> breakfast_counter
Counter({'spam': 3, 'eggs': 1})
The most_common() function returns all elements in descending order of count, or just the top count elements if given a count:
>>> breakfast_counter.most_common()
[('spam', 3), ('eggs', 1)]
>>> breakfast_counter.most_common(1)
[('spam', 3)]
You can combine counters. First, let’s see again what’s in breakfast_counter:
>>> breakfast_counter
Counter({'spam': 3, 'eggs': 1})
This time, we’ll make a new list called lunch, and a counter called lunch_counter:
>>> lunch = ['eggs', 'eggs', 'bacon']
>>> lunch_counter = Counter(lunch)
>>> lunch_counter
Counter({'eggs': 2, 'bacon': 1})
The first way we combine the two counters is by addition, using +:
>>> breakfast_counter + lunch_counter
Counter({'spam': 3, 'eggs': 3, 'bacon': 1})
As you might expect, you subtract one counter from another by using -. What’s for breakfast but not for lunch?
>>> breakfast_counter - lunch_counter
Counter({'spam': 3})
Okay, now what can we have for lunch that we can’t have for breakfast?
>>> lunch_counter - breakfast_counter
Counter({'bacon': 1, 'eggs': 1})
Similar to sets in Chapter 4, you can get common items by using the intersection operator &:
>>> breakfast_counter & lunch_counter
Counter({'eggs': 1})
The intersection picked the common element ('eggs') with the lower count. This makes sense: breakfast only offered one egg, so that’s the common count.
Finally, you can get all items by using the union operator |:
>>> breakfast_counter | lunch_counter
Counter({'spam': 3, 'eggs': 2, 'bacon': 1})
The item 'eggs' was again common to both. Unlike addition, union didn’t add their counts, but picked the one with the larger count.
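One detail worth knowing about the - operator is that it keeps only positive counts, which is why 'eggs' vanished from the breakfast-minus-lunch result. The Counter method subtract() works in place and keeps zero and negative counts instead. The sketch below contrasts the two; the variable names difference and tally are illustrative.

```python
from collections import Counter

breakfast_counter = Counter(['spam', 'spam', 'eggs', 'spam'])
lunch_counter = Counter(['eggs', 'eggs', 'bacon'])

# The - operator builds a new counter and drops anything that
# would end up at zero or below.
difference = breakfast_counter - lunch_counter

# subtract() changes the counter in place and keeps negative counts.
tally = Counter(breakfast_counter)   # work on a copy
tally.subtract(lunch_counter)

print(dict(difference))
print(dict(tally))
```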
Many of the code examples in the early chapters of this book demonstrate that the order of keys in a dictionary is not predictable: you might add keys a, b, and c in that order, but keys() might return c, a, b. (This describes the Python 3.3 used in this book; since Python 3.7, a standard dictionary preserves insertion order.) Here’s a repurposed example from Chapter 1:
>>> quotes = {
...     'Moe': 'A wise guy, huh?',
...     'Larry': 'Ow!',
...     'Curly': 'Nyuk nyuk!',
...     }
>>> for stooge in quotes:
...     print(stooge)
...
Larry
Curly
Moe
An OrderedDict() remembers the order of key addition and returns them in the same order from an iterator. Try creating an OrderedDict from a sequence of (key, value) tuples:
>>> from collections import OrderedDict
>>> quotes = OrderedDict([
...     ('Moe', 'A wise guy, huh?'),
...     ('Larry', 'Ow!'),
...     ('Curly', 'Nyuk nyuk!'),
...     ])
>>>
>>> for stooge in quotes:
...     print(stooge)
...
Moe
Larry
Curly
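Because an OrderedDict treats order as part of its identity, it offers a couple of behaviors a plain dictionary lacks. This is a short sketch of two of them, the move_to_end() method and order-sensitive equality; the variable names are chosen for illustration.

```python
from collections import OrderedDict

quotes = OrderedDict([
    ('Moe', 'A wise guy, huh?'),
    ('Larry', 'Ow!'),
    ('Curly', 'Nyuk nyuk!'),
])

# move_to_end() repositions an existing key at the end of the order.
quotes.move_to_end('Moe')
order_after_move = list(quotes)
print(order_after_move)

# Two OrderedDicts with the same items in a different order are unequal,
# whereas two plain dictionaries with the same items compare equal.
ordered_equal = OrderedDict([('a', 1), ('b', 2)]) == OrderedDict([('b', 2), ('a', 1)])
plain_equal = {'a': 1, 'b': 2} == {'b': 2, 'a': 1}
print(ordered_equal, plain_equal)
```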
A deque (pronounced deck) is a double-ended queue, which has features of both a stack and a queue. It’s useful when you want to add and delete items from either end of a sequence. Here, we’ll work from both ends of a word to the middle to see if it’s a palindrome. The function popleft() removes the leftmost item from the deque and returns it; pop() removes the rightmost item and returns it. Together, they work from the ends toward the middle. As long as the end characters match, it keeps popping until it reaches the middle:
>>> def palindrome(word):
...     from collections import deque
...     dq = deque(word)
...     while len(dq) > 1:
...         if dq.popleft() != dq.pop():
...             return False
...     return True
...
>>> palindrome('a')
True
>>> palindrome('racecar')
True
>>> palindrome('')
True
>>> palindrome('radar')
True
>>> palindrome('halibut')
False
I used this as a simple illustration of deques. If you really wanted a quick palindrome checker, it would be a lot simpler to just compare a string with its reverse. Python doesn’t have a reverse() function for strings, but it does have a way to reverse a string with a slice, as illustrated here:
>>> def another_palindrome(word):
...     return word == word[::-1]
...
>>> another_palindrome('radar')
True
>>> another_palindrome('halibut')
False
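A deque can also add items at either end, not just remove them. The following sketch shows appendleft(), append(), and rotate(), along with the optional maxlen argument, which makes the deque discard items from the opposite end once it is full. The variable names and sample values are made up for illustration.

```python
from collections import deque

# Add items at both ends.
line = deque(['b', 'c'])
line.appendleft('a')
line.append('d')
after_appends = list(line)

# rotate(1) moves the rightmost item around to the left end.
line.rotate(1)
after_rotate = list(line)

# With maxlen set, appending to a full deque drops the oldest item.
recent = deque(maxlen=3)
for reading in ['rain', 'snow', 'sleet', 'fog']:
    recent.append(reading)
last_three = list(recent)

print(after_appends)
print(after_rotate)
print(last_three)
```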
itertools contains special-purpose iterator functions. Each returns one item at a time when called within a for … in loop, and remembers its state between calls.
chain() runs through its arguments as though they were a single iterable:
>>> import itertools
>>> for item in itertools.chain([1, 2], ['a', 'b']):
...     print(item)
...
1
2
a
b
cycle() is an infinite iterator, cycling through the items of its argument:
>>> import itertools
>>> for item in itertools.cycle([1, 2]):
...     print(item)
...
1
2
1
2
.
.
.
…and so on.
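Since a loop over cycle() never ends on its own, you usually pair it with something that stops it. One option from the same module is islice(), which takes only a fixed number of items from any iterator. A minimal sketch, with the name first_five chosen for illustration:

```python
import itertools

# Take just the first five items from the endless cycle.
first_five = list(itertools.islice(itertools.cycle([1, 2]), 5))
print(first_five)
```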
accumulate() calculates accumulated values. By default, it calculates the sum:
>>> import itertools
>>> for item in itertools.accumulate([1, 2, 3, 4]):
...     print(item)
...
1
3
6
10
You can provide a function as the second argument to accumulate(), and it will be used instead of addition. The function should take two arguments and return a single result. This example calculates an accumulated product:
>>> import itertools
>>> def multiply(a, b):
...     return a * b
...
>>> for item in itertools.accumulate([1, 2, 3, 4], multiply):
...     print(item)
...
1
2
6
24
The itertools module has many more functions, notably some for combinations and permutations that can be time savers when the need arises.
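As a brief sketch of those two, combinations() yields every selection of a given size in which order does not matter, and permutations() yields every ordered arrangement. The sample letters and variable names here are illustrative.

```python
import itertools

letters = ['a', 'b', 'c']

# Pairs where order does not matter: ('a', 'b') appears, ('b', 'a') does not.
pairs = list(itertools.combinations(letters, 2))

# Pairs where order matters: both ('a', 'b') and ('b', 'a') appear.
arrangements = list(itertools.permutations(letters, 2))

print(pairs)
print(arrangements)
```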
All of our examples have used print() (or just the variable name, in the interactive interpreter) to print things. Sometimes, the results are hard to read. We need a pretty printer such as pprint():
>>> from pprint import pprint
>>> quotes = OrderedDict([
...     ('Moe', 'A wise guy, huh?'),
...     ('Larry', 'Ow!'),
...     ('Curly', 'Nyuk nyuk!'),
...     ])
>>>
Plain old print() just dumps things out there:
>>> print(quotes)
OrderedDict([('Moe', 'A wise guy, huh?'), ('Larry', 'Ow!'), ('Curly', 'Nyuk nyuk!')])
However, pprint() tries to align elements for better readability:
>>> pprint(quotes)
{'Moe': 'A wise guy, huh?',
 'Larry': 'Ow!',
 'Curly': 'Nyuk nyuk!'}
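The same module also has pformat(), which returns the formatted text as a string instead of printing it, and both functions accept a width argument that controls when lines are wrapped. The sketch below uses a plain dictionary and two arbitrary widths; note that for plain dictionaries the pretty printer sorts the keys by default.

```python
from pprint import pformat

quotes = {
    'Moe': 'A wise guy, huh?',
    'Larry': 'Ow!',
    'Curly': 'Nyuk nyuk!',
}

# A narrow width forces one item per line.
narrow = pformat(quotes, width=40)

# A generous width lets everything fit on a single line.
wide = pformat(quotes, width=120)

print(narrow)
print(wide)
```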
Sometimes, the standard library doesn’t have what you need, or doesn’t do it in quite the right way. There’s an entire world of open-source, third-party Python software. Good resources include:
PyPI (also known as the Cheese Shop, after an old Monty Python skit)
You can find many smaller code examples at ActiveState.
Almost all of the Python code in this book uses the standard Python installation on your computer, which includes all the built-ins and the standard library. External packages are featured in some places: I mentioned requests in Chapter 1, and have more details in “Beyond the Standard Library: Requests”. Appendix D shows how to install third-party Python software, along with many other nuts-and-bolts development details.
5.1. Create a file called zoo.py. In it, define a function called hours() that prints the string 'Open 9-5 daily'. Then, use the interactive interpreter to import the zoo module and call its hours() function.
5.2. In the interactive interpreter, import the zoo module as menagerie and call its hours() function.
5.3. Staying in the interpreter, import the hours() function from zoo directly and call it.
5.4. Import the hours() function as info and call it.
5.5. Make a dictionary called plain with the key-value pairs 'a': 1, 'b': 2, and 'c': 3, and then print it.
5.6. Make an OrderedDict called fancy from the same pairs listed in 5.5 and print it. Did it print in the same order as plain?
5.7. Make a defaultdict called dict_of_lists and pass it the argument list. Make the list dict_of_lists['a'] and append the value 'something for a' to it in one assignment. Print dict_of_lists['a'].