This chapter concludes our tour of Python procedural statements by
presenting the language’s two main looping constructs—statements that repeat an
action over and over. The first of these, the while
statement, provides a way to code general
loops. The second, the for
statement, is designed for stepping through
the items in a sequence object and running a block of code for
each.
We’ve seen both of these informally already, but we’ll fill in
additional usage details here. While we’re at it, we’ll also study a few
less prominent statements used within loops, such as break and continue,
and cover some built-ins commonly used with loops, such as range, zip,
and map.
Although the while and for statements covered here are the primary
syntax provided for coding repeated actions, there are additional
looping operations and concepts in Python. Because of that, the
iteration story is continued in the next chapter, where we’ll explore
the related ideas of Python’s iteration protocol (used by the for loop)
and list comprehensions (a close cousin to the for loop). Later
chapters explore even more exotic iteration tools such as generators,
filter, and reduce. For now, though, let’s keep things simple.
Python’s while
statement is the most general iteration
construct in the language. In simple terms, it repeatedly executes a
block of (normally indented) statements as long as a test at the top
keeps evaluating to a true value. It is called a “loop” because
control keeps looping back to the start of the statement until the
test becomes false. When the test becomes false, control passes to the
statement that follows the while
block. The net effect is that the loop’s body is executed repeatedly
while the test at the top is true; if the test is false to begin with,
the body never runs.
In its most complex form, the while
statement consists of a header line
with a test expression, a body of one or more indented statements,
and an optional else
part that is
executed if control exits the loop without a break
statement being encountered. Python
keeps evaluating the test at the top and executing the statements
nested in the loop body until the test returns a false value:
while <test>:                  # Loop test
    <statements1>              # Loop body
else:                          # Optional else
    <statements2>              # Run if didn't exit loop with break
To illustrate, let’s look at a few simple while
loops in action. The first, which
consists of a print
statement
nested in a while
loop, just
prints a message forever. Recall that True
is just a custom version of the
integer 1 and always stands for a Boolean true value; because the
test is always true, Python keeps executing the body forever, or
until you stop its execution. This sort of behavior is usually
called an infinite loop:
>>> while True:
...     print('Type Ctrl-C to stop me!')
The next example keeps slicing off the first character of a
string until the string is empty and hence false. It’s typical to
test an object directly like this instead of using the more verbose
equivalent (while x != '':). Later in this chapter, we’ll see other
ways to step more directly through the items in a string with a for
loop.
>>> x = 'spam'
>>> while x:                   # While x is not empty
...     print(x, end=' ')
...     x = x[1:]              # Strip first character off x
...
spam pam am m
Note the end=' '
keyword
argument used here to place all outputs on the same line separated
by a space; see Chapter 11 if you’ve
forgotten why this works as it does. The following code counts from
the value of a
up to, but not
including, b
. We’ll see an easier
way to do this with a Python for
loop and the built-in range
function later:
>>> a = 0; b = 10
>>> while a < b:               # One way to code counter loops
...     print(a, end=' ')
...     a += 1                 # Or, a = a + 1
...
0 1 2 3 4 5 6 7 8 9
Finally, notice that Python doesn’t have what some languages
call a “do until” loop statement. However, we can simulate one with
a test and break
at the bottom of
the loop body:
while True:
    ...loop body...
    if exitTest(): break
To fully understand how this structure works, we need to move
on to the next section and learn more about the break
statement.
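As a concrete illustration of the “do until” pattern, here is a minimal runnable sketch. The function name countdown and its exit test are hypothetical, chosen only for this example; the point is that the body runs at least once because the test sits at the bottom of the loop.

```python
def countdown(start):
    """Collect start, start-1, ... running the body at least once."""
    seen = []
    n = start
    while True:
        seen.append(n)        # Loop body: always runs at least once
        n -= 1
        if n <= 0:            # Exit test at the bottom, as in a "do until"
            break
    return seen

print(countdown(3))           # [3, 2, 1]
print(countdown(0))           # [0]: body ran once even though 0 is not > 0
```

A plain while n > 0 loop would return an empty list for countdown(0); the test-at-bottom form differs only in that first guaranteed pass.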
Now that we’ve seen a few Python loops in action, it’s time
to take a look at two simple statements that have a purpose only when
nested inside loops—the break
and continue statements. While we’re looking at oddballs, we will
also study the loop else
clause
here, because it is intertwined with break
, and Python’s empty placeholder
statement, the pass
(which is not tied to loops per se, but
falls into the general category of simple one-word statements). In
Python:
break
    Jumps out of the closest enclosing loop (past the entire loop statement)
continue
    Jumps to the top of the closest enclosing loop (to the loop’s header line)
pass
    Does nothing at all: it’s an empty statement placeholder
Loop else block
    Runs if and only if the loop is exited normally (i.e., without hitting a break)
Factoring in break
and
continue
statements, the general
format of the while
loop looks
like this:
while <test1>:
    <statements1>
    if <test2>: break          # Exit loop now, skip else
    if <test3>: continue       # Go to top of loop now, to test1
else:
    <statements2>              # Run if we didn't hit a 'break'
break
and continue
statements can appear anywhere
inside the while
(or for
) loop’s body, but they are usually
coded further nested in an if
test to take action in response to some condition.
Let’s turn to a few simple examples to see how these statements come together in practice.
Simple things first: the pass
statement is a no-operation placeholder
that is used when the syntax requires a statement, but you have
nothing useful to say. It is often used to code an empty body for a
compound statement. For instance, if you want to code an infinite
loop that does nothing each time through, do it with a pass
:
while True: pass # Type Ctrl-C to stop me!
Because the body is just an empty statement, Python gets stuck
in this loop. pass
is roughly to
statements as None
is to
objects—an explicit nothing. Notice that here the while
loop’s body is on the same line as
the header, after the colon; as with if
statements, this only works if the body
isn’t a compound statement.
This example does nothing forever. It probably isn’t the most
useful Python program ever written (unless you want to warm up your
laptop computer on a cold winter’s day!); frankly, though, I
couldn’t think of a better pass
example at this point in the book.
We’ll see other places where pass
makes more sense later—for instance,
to ignore exceptions caught by try
statements, and to define empty
class
objects with attributes
that behave like “structs” and “records” in other languages. A
pass
is also sometimes coded to
mean “to be filled in later,” to stub out the bodies of functions
temporarily:
def func1():
    pass                       # Add real code here later

def func2():
    pass
We can’t leave the body empty without getting a syntax error,
so we say pass
instead.
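One of the later uses mentioned above, ignoring an exception caught by a try statement, can be sketched briefly here. The helper name to_int_or_none is hypothetical, and try statements themselves are covered properly later in the book; this is only meant to show where pass fits.

```python
def to_int_or_none(text):
    """Convert text to an integer, or return None if it is not a number."""
    result = None
    try:
        result = int(text)
    except ValueError:
        pass                  # Deliberately ignore text that is not a number
    return result

print(to_int_or_none('42'))   # 42
print(to_int_or_none('spam')) # None
```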
Version skew note: Python 3.0 (but not
2.6) allows ellipses coded as ...
(literally, three consecutive dots) to appear any place an
expression can. Because ellipses do nothing by themselves, this
can serve as an alternative to the pass
statement, especially for code to
be filled in later—a sort of Python “TBD”:
def func1():
    ...                        # Alternative to pass

def func2():
    ...

func1()                        # Does nothing if called
Ellipses can also appear on the same line as a statement header and may be used to initialize variable names if no specific type is required:
def func1(): ...               # Works on same line too
def func2(): ...

>>> X = ...                    # Alternative to None
>>> X
Ellipsis
This notation is new in Python 3.0 (and goes well beyond the
original intent of ...
in
slicing extensions), so time will tell if it becomes widespread
enough to challenge pass
and
None
in these roles.
The continue
statement causes an immediate jump
to the top of a loop. It also sometimes lets you avoid statement
nesting. The next example uses continue
to skip odd numbers. This code
prints all even numbers less than 10 and greater than or equal to 0.
Remember, 0 means false and %
is
the remainder of division operator, so this loop counts down to 0,
skipping numbers that aren’t multiples of 2 (it prints 8 6 4 2 0
):
x = 10
while x:
    x = x - 1                  # Or, x -= 1
    if x % 2 != 0: continue    # Odd? -- skip print
    print(x, end=' ')
Because continue
jumps to
the top of the loop, you don’t need to nest the print
statement inside an if
test; the print
is only reached if the continue
is not run. If this sounds
similar to a “goto” in other languages, it should. Python has no
“goto” statement, but because continue
lets you jump about in a program,
many of the warnings about readability and maintainability you may
have heard about goto apply. continue
should probably be used
sparingly, especially when you’re first getting started with Python.
For instance, the last example might be clearer if the print
were nested under the if
:
x = 10
while x:
    x = x - 1
    if x % 2 == 0:             # Even? -- print
        print(x, end=' ')
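To check that the continue form and the nested-if form really are interchangeable, the two can be wrapped in small functions and compared. The function names here are hypothetical, and the results are collected in lists rather than printed so they can be compared directly.

```python
def evens_with_continue(limit):
    found = []
    x = limit
    while x:
        x -= 1
        if x % 2 != 0:
            continue          # Odd: skip the append below
        found.append(x)
    return found

def evens_with_nested_if(limit):
    found = []
    x = limit
    while x:
        x -= 1
        if x % 2 == 0:        # Even: collect it
            found.append(x)
    return found

print(evens_with_continue(10))                               # [8, 6, 4, 2, 0]
print(evens_with_continue(10) == evens_with_nested_if(10))   # True
```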
The break
statement causes an immediate exit from a loop.
Because the code that follows it in the loop is not executed if the
break
is reached, you can also
sometimes avoid nesting by including a break
. For example, here is a simple
interactive loop (a variant of a larger example we studied in Chapter 10) that inputs data with
input
(known as raw_input
in Python 2.6) and exits when
the user enters “stop” for the name request:
>>> while True:
...     name = input('Enter name:')
...     if name == 'stop': break
...     age = input('Enter age: ')
...     print('Hello', name, '=>', int(age) ** 2)
...
Enter name:mel
Enter age: 40
Hello mel => 1600
Enter name:bob
Enter age: 30
Hello bob => 900
Enter name:stop
Notice how this code converts the age
input to an integer with int
before raising it to the second power;
as you’ll recall, this is necessary because input
returns user input as a string. In
Chapter 35, you’ll see that
input
also raises an exception at
end-of-file (e.g., if the user types Ctrl-Z or Ctrl-D); if this
matters, wrap input
in try
statements.
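A minimal sketch of that last suggestion follows. It assumes a simplified, name-only version of the loop; read_names and fake_input are hypothetical names, and the reader function is passed in as a parameter so the end-of-file case can be simulated without a real terminal.

```python
def read_names(read=input):
    """Collect names until the user types 'stop' or input ends."""
    names = []
    while True:
        try:
            name = read('Enter name:')
        except EOFError:          # Raised by input() at end-of-file
            break
        if name == 'stop':
            break
        names.append(name)
    return names


# Simulate a user who types two names and then signals end-of-file
replies = iter(['mel', 'bob'])

def fake_input(prompt):
    try:
        return next(replies)
    except StopIteration:
        raise EOFError

print(read_names(fake_input))     # ['mel', 'bob']
```

Called with no argument, read_names() uses the real input built-in, so Ctrl-D or Ctrl-Z ends the loop cleanly instead of showing a traceback.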
When combined with the loop else
clause,
the break
statement can often
eliminate the need for the search status flags used in other
languages. For instance, the following piece of code determines
whether a positive integer y
is
prime by searching for factors greater than 1:
x = y // 2                                # For some y > 1
while x > 1:
    if y % x == 0:                        # Remainder
        print(y, 'has factor', x)
        break                             # Skip else
    x -= 1
else:                                     # Normal exit
    print(y, 'is prime')
Rather than setting a flag to be tested when the loop is
exited, it inserts a break
where
a factor is found. This way, the loop else
clause can assume that it will be
executed only if no factor is found; if you don’t hit the break
, the number is prime.
The loop else
clause is
also run if the body of the loop is never executed, as you don’t run
a break
in that event either; in
a while
loop, this happens if the
test in the header is false to begin with. Thus, in the preceding
example you still get the “is prime” message if x
is initially less than or equal to 1
(for instance, if y
is 2).
This example determines primes, but only informally so.
Numbers less than 2 are not considered prime by the strict
mathematical definition. To be really picky, this code also fails
for negative numbers and succeeds for floating-point numbers with
no decimal digits. Also note that its code must use //
instead of /
in Python 3.0 because of the migration
of /
to “true division,” as
described in Chapter 5 (we need the initial
division to truncate remainders, not retain them!). If you want to
experiment with this code, be sure to see the exercise at the end
of Part IV, which wraps it in a function for
reuse.
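For experimenting before reaching that exercise, one possible way to wrap the loop in a function is sketched below. The name is_prime and the up-front rejection of values below 2 are assumptions made for this sketch, not the book’s exercise solution.

```python
def is_prime(y):
    """Return True if integer y is prime, using the while/else search."""
    if y < 2:                 # Assumption: reject values below 2 up front
        return False
    x = y // 2
    while x > 1:
        if y % x == 0:        # Found a factor: not prime
            break
        x -= 1
    else:                     # Loop ended without break: no factor found
        return True
    return False

print(is_prime(13), is_prime(15), is_prime(2), is_prime(1))
# True False True False
```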
Because the loop else
clause is unique to Python, it tends to perplex some newcomers. In
general terms, the loop else
provides explicit syntax for a common coding scenario—it is a coding structure
that lets us catch the “other” way out of a loop, without setting
and checking flags or conditions.
Suppose, for instance, that we are writing a loop to search a list for a value, and we need to know whether the value was found after we exit the loop. We might code such a task this way (this code is intentionally abstract and incomplete; x is a sequence and match() is a tester function to be defined):
found = False
while x and not found:
    if match(x[0]):                       # Value at front?
        print('Ni')
        found = True
    else:
        x = x[1:]                         # Slice off front and repeat
if not found:
    print('not found')
Here, we initialize, set, and later test a flag to determine
whether the search succeeded or not. This is valid Python code,
and it does work; however, this is exactly the sort of structure
that the loop else
clause is
there to handle. Here’s an else
equivalent:
while x:                                  # Exit when x empty
    if match(x[0]):
        print('Ni')
        break                             # Exit, go around else
    x = x[1:]
else:
    print('Not found')                    # Only here if exhausted x
This version is more concise. The flag is gone, and we’ve
replaced the if
test at the
loop end with an else
(lined up
vertically with the word while
). Because the break
inside the main part of the
while
exits the loop and goes
around the else
, this serves as
a more structured way to catch the search-failure case.
Some readers might have noticed that the prior example’s
else
clause could be replaced
with a test for an empty x
after the loop (e.g., if not
x:
). Although that’s true in this example, the else
provides explicit syntax for this
coding pattern (it’s more obviously a search-failure clause here),
and such an explicit empty test may not apply in some cases. The
loop else
becomes even more
useful when used in conjunction with the for
loop—the topic of the next
section—because sequence iteration is not under your
control.
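Because the search code above is deliberately abstract, here is one runnable rendering of the while/else version. The names find_first and is_even are hypothetical, and returning None for a failed search is a choice made for this sketch.

```python
def find_first(x, match):
    """Return the first item of sequence x accepted by match, else None."""
    while x:                      # Exit when x is empty
        if match(x[0]):
            result = x[0]
            break                 # Exit, go around else
        x = x[1:]                 # Slice off front and repeat
    else:
        result = None             # Only here if x was exhausted
    return result

def is_even(n):                   # Hypothetical tester function
    return n % 2 == 0

print(find_first([1, 3, 4, 5], is_even))   # 4
print(find_first([1, 3, 5], is_even))      # None
```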
The for
loop is a generic sequence iterator in Python: it can step through
the items in any ordered sequence object. The for
statement works on strings, lists,
tuples, other built-in iterables, and new objects that we’ll see how
to create later with classes. We met it in brief when studying
sequence object types; let’s expand on its usage more formally
here.
The Python for
loop begins with a header line that
specifies an assignment target (or targets), along with the object
you want to step through. The header is followed by a block of
(normally indented) statements that you want to repeat:
for <target> in <object>:      # Assign object items to target
    <statements>               # Repeated loop body: use target
else:
    <statements>               # If we didn't hit a 'break'
When Python runs a for
loop, it assigns the items in the sequence object to the target one
by one and executes the loop body for each. The loop body typically
uses the assignment target to refer to the current item in the
sequence as though it were a cursor stepping through the
sequence.
The name used as the assignment target in a for
header line is usually a (possibly
new) variable in the scope where the for
statement is coded. There’s not much
special about it; it can even be changed inside the loop’s body, but
it will automatically be set to the next item in the sequence when
control returns to the top of the loop again. After the loop this
variable normally still refers to the last item visited, which is
the last item in the sequence unless the loop exits with a break
statement.
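The behavior of the loop target described above can be observed directly. This is a small sketch with arbitrary values: the first loop rebinds the target inside the body, and the second exits early with a break.

```python
for x in [1, 2, 3]:
    x = x * 10            # Rebinding the target inside the body is allowed...
    print(x, end=' ')     # ...but the next iteration resets it from the list
print()
print(x)                  # 30: the last value assigned in the body

for y in [1, 2, 3]:
    if y == 2:
        break
print(y)                  # 2: the loop stopped early, so y is not the last item
```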
The for
statement also
supports an optional else
block,
which works exactly as it does in a while
loop—it’s executed if the loop exits
without running into a break
statement (i.e., if all items in the sequence have been visited).
The break
and continue
statements introduced earlier
also work the same in a for
loop
as they do in a while
. The
for
loop’s complete format can be
described this way:
for <target> in <object>:      # Assign object items to target
    <statements>
    if <test>: break           # Exit loop now, skip else
    if <test>: continue        # Go to top of loop now
else:
    <statements>               # If we didn't hit a 'break'
Let’s type a few for
loops interactively now, so you can see how they are used in
practice.
As mentioned earlier, a for
loop can step across any kind of
sequence object. In our first example, for instance, we’ll assign
the name x
to each of the three
items in a list in turn, from left to right, and the print
statement will be executed for
each. Inside the print
statement (the loop body), the name x
refers to the current item in the
list:
>>> for x in ["spam", "eggs", "ham"]:
...     print(x, end=' ')
...
spam eggs ham
The next two examples compute the sum and product of all the
items in a list. Later in this chapter and later in the book we’ll
meet tools that apply operations such as +
and *
to items in a list automatically, but
it’s usually just as easy to use a for
:
>>> sum = 0
>>> for x in [1, 2, 3, 4]:
...     sum = sum + x
...
>>> sum
10
>>> prod = 1
>>> for item in [1, 2, 3, 4]: prod *= item
...
>>> prod
24
Any sequence works in a for
, as it’s a generic tool. For
example, for
loops work on
strings and tuples:
>>> S = "lumberjack"
>>> T = ("and", "I'm", "okay")
>>> for x in S: print(x, end=' ')         # Iterate over a string
...
l u m b e r j a c k
>>> for x in T: print(x, end=' ')         # Iterate over a tuple
...
and I'm okay
In fact, as we’ll see in the next chapter when we explore the
notion of “iterables,” for
loops can even work on some objects that are not sequences—files
and dictionaries work, too!
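As a quick preview of that point, the sketch below loops over a dictionary and over a file-like object. It uses io.StringIO from the standard library as a stand-in for an open text file, which is an assumption made so the example runs without a file on disk.

```python
import io

D = {'a': 1, 'b': 2}
for key in D:                       # Dictionaries yield their keys
    print(key, D[key])

text = io.StringIO('spam\neggs\n')  # Stands in for an open text file
for line in text:                   # Files yield one line at a time
    print(line.rstrip())
```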
If you’re iterating through a sequence of tuples, the loop target itself can
actually be a tuple of targets. This is just
another case of the tuple-unpacking assignment we studied in Chapter 11 at work.
Remember, the for
loop assigns
items in the sequence object to the target, and assignment works
the same everywhere:
>>> T = [(1, 2), (3, 4), (5, 6)]
>>> for (a, b) in T:                      # Tuple assignment at work
...     print(a, b)
...
1 2
3 4
5 6
Here, the first time through the loop is like writing
(a,b) = (1,2)
, the second time
is like writing (a,b) = (3,4)
,
and so on. The net effect is to automatically unpack the current
tuple on each iteration.
This form is commonly used in conjunction with the zip
call we’ll meet later in this
chapter to implement parallel traversals. It also makes regular
appearances in conjunction with SQL databases in Python, where
query result tables are returned as sequences of sequences like
the list used here—the outer list is the database table, the
nested tuples are the rows within the table, and tuple assignment
extracts columns.
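The database case can be sketched with the standard library’s sqlite3 module and an in-memory database. The table name, column names, and data here are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')            # Throwaway in-memory database
conn.execute('CREATE TABLE people (name TEXT, age INTEGER)')
conn.executemany('INSERT INTO people VALUES (?, ?)',
                 [('mel', 40), ('bob', 30)])

rows = conn.execute('SELECT name, age FROM people ORDER BY name').fetchall()
for (name, age) in rows:                      # Each row is a tuple of columns
    print(name, '=>', age)
conn.close()
```

Here rows is a list of tuples, so the loop header unpacks each row into name and age just as it unpacked the pairs in T earlier.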
Tuples in for
loops also
come in handy to iterate through both keys
and values in dictionaries using the items
method, rather than looping
through the keys and indexing to fetch the values manually:
>>> D = {'a': 1, 'b': 2, 'c': 3}
>>> for key in D:
...     print(key, '=>', D[key])          # Use dict keys iterator and index
...
a => 1
c => 3
b => 2
>>> list(D.items())
[('a', 1), ('c', 3), ('b', 2)]
>>> for (key, value) in D.items():
...     print(key, '=>', value)           # Iterate over both keys and values
...
a => 1
c => 3
b => 2
It’s important to note that tuple assignment in for
loops isn’t a special case; any
assignment target works syntactically after the word for
. Although we can always assign
manually within the loop to unpack:
>>> T
[(1, 2), (3, 4), (5, 6)]
>>> for both in T:
...     a, b = both                       # Manual assignment equivalent
...     print(a, b)
...
1 2
3 4
5 6
Tuples in the loop header save us an extra step when
iterating through sequences of sequences. As suggested in Chapter 11, even
nested structures may be automatically
unpacked this way in a for
:
>>> ((a, b), c) = ((1, 2), 3)             # Nested sequences work too
>>> a, b, c
(1, 2, 3)
>>> for ((a, b), c) in [((1, 2), 3), ((4, 5), 6)]: print(a, b, c)
...
1 2 3
4 5 6
But this is no special case—the for
loop simply runs the sort of
assignment we ran just before it, on each iteration. Any nested
sequence structure may be unpacked this way, just because
sequence assignment is so generic:
>>> for ((a, b), c) in [([1, 2], 3), ['XY', 6]]: print(a, b, c)
...
1 2 3
X Y 6
In fact, because the loop variable in a for
loop can really be any assignment
target, we can also use Python 3.0’s extended sequence-unpacking assignment syntax here
to extract items and sections of sequences within sequences.
Really, this isn’t a special case either, but simply a new
assignment form in 3.0 (as discussed in Chapter 11); because it
works in assignment statements, it automatically works in for
loops.
Consider the tuple assignment form introduced in the prior section. A tuple of values is assigned to a tuple of names on each iteration, exactly like a simple assignment statement:
>>> a, b, c = (1, 2, 3)                   # Tuple assignment
>>> a, b, c
(1, 2, 3)
>>> for (a, b, c) in [(1, 2, 3), (4, 5, 6)]:    # Used in for loop
...     print(a, b, c)
...
1 2 3
4 5 6
In Python 3.0, because a sequence can be assigned to a more
general set of names with a starred name to collect multiple
items, we can use the same syntax to extract parts of nested
sequences in the for
loop:
>>> a, *b, c = (1, 2, 3, 4)               # Extended seq assignment
>>> a, b, c
(1, [2, 3], 4)
>>> for (a, *b, c) in [(1, 2, 3, 4), (5, 6, 7, 8)]:
...     print(a, b, c)
...
1 [2, 3] 4
5 [6, 7] 8
In practice, this approach might be used to pick out multiple columns from rows of data represented as nested sequences. In Python 2.X starred names aren’t allowed, but you can achieve similar effects by slicing. The only difference is that slicing returns a type-specific result, whereas starred names always are assigned lists:
>>> for all in [(1, 2, 3, 4), (5, 6, 7, 8)]:    # Manual slicing in 2.6
...     a, b, c = all[0], all[1:3], all[3]
...     print(a, b, c)
...
1 (2, 3) 4
5 (6, 7) 8
See Chapter 11 for more on this assignment form.
Now let’s look at a for
loop that’s a bit more sophisticated
than those we’ve seen so far. The next example illustrates
statement nesting and the loop else
clause in a for
. Given a list of objects (items
) and a list of keys (tests
), this code searches for each key
in the objects list and reports on the search’s outcome:
>>> items = ["aaa", 111, (4, 5), 2.01]    # A set of objects
>>> tests = [(4, 5), 3.14]                # Keys to search for
>>>
>>> for key in tests:                     # For all keys
...     for item in items:                # For all items
...         if item == key:               # Check for match
...             print(key, "was found")
...             break
...     else:
...         print(key, "not found!")
...
(4, 5) was found
3.14 not found!
Because the nested if
runs a break
when a match is
found, the loop else
clause can
assume that if it is reached, the search has failed. Notice the
nesting here. When this code runs, there are two loops going at
the same time: the outer loop scans the keys list, and the inner
loop scans the items list for each key. The nesting of the loop
else
clause is critical; it’s
indented to the same level as the header line of the inner
for
loop, so it’s associated
with the inner loop, not the if
or the outer for
.
Note that this example is easier to code if we employ the
in
operator to test membership.
Because in
implicitly scans an
object looking for a match (at least logically), it replaces the
inner loop:
>>> for key in tests:                     # For all keys
...     if key in items:                  # Let Python check for a match
...         print(key, "was found")
...     else:
...         print(key, "not found!")
...
(4, 5) was found
3.14 not found!
In general, it’s a good idea to let Python do as much of the work as possible (as in this solution) for the sake of brevity and performance.
The next example performs a typical data-structure task with
a for
—collecting common items
in two sequences (strings). It’s roughly a simple set intersection
routine; after the loop runs, res
refers to a list that contains all
the items found in seq1
and
seq2
:
>>> seq1 = "spam"
>>> seq2 = "scam"
>>>
>>> res = []                              # Start empty
>>> for x in seq1:                        # Scan first sequence
...     if x in seq2:                     # Common item?
...         res.append(x)                 # Add to result end
...
>>> res
['s', 'a', 'm']
Unfortunately, this code is equipped to work only on two
specific variables: seq1
and
seq2
. It would be nice if this
loop could somehow be generalized into a tool you could use more
than once. As you’ll see, that simple idea leads us to
functions, the topic of the next part of the
book.
The for
loop subsumes most counter-style loops. It’s generally
simpler to code and quicker to run than a while
, so it’s the first tool you should
reach for whenever you need to step through a sequence. But there are
also situations where you will need to iterate in more specialized
ways. For example, what if you need to visit every second or third
item in a list, or change the list along the way? How about traversing
more than one sequence in parallel, in the same for
loop?
You can always code such unique iterations with a while
loop and manual indexing, but Python
provides two built-ins that allow you to specialize the iteration in a
for
:
The built-in range function produces a series of successively higher
integers, which can be used as indexes in a for.
The built-in zip function returns a series of parallel-item tuples,
which can be used to traverse multiple sequences in a for.
Because for
loops typically
run quicker than while
-based
counter loops, it’s to your advantage to use tools like these that
allow you to use for
when possible.
Let’s look at each of these built-ins in turn.
The range
function is really a general tool that can be used in a variety of
contexts. Although it’s used most often to generate indexes in a
for
, you can use it anywhere you
need a series of integers. In Python 3.0, range
is an iterable that generates items on demand, so
we need to wrap it in a list
call
to display its results all at once (more on iterators in Chapter 14):
>>> list(range(5)), list(range(2, 5)), list(range(0, 10, 2))
([0, 1, 2, 3, 4], [2, 3, 4], [0, 2, 4, 6, 8])
With one argument, range
generates a list of integers from zero up to but not including the
argument’s value. If you pass in two arguments, the first is taken
as the lower bound. An optional third argument can give a
step; if it is used, Python adds the step to
each successive integer in the result (the step defaults to 1).
Ranges can also be nonpositive and nonascending, if you want them to
be:
>>> list(range(-5, 5))
[-5, -4, -3, -2, -1, 0, 1, 2, 3, 4]
>>> list(range(5, -5, -1))
[5, 4, 3, 2, 1, 0, -1, -2, -3, -4]
Although such range
results
may be useful all by themselves, they tend to come in most handy
within for
loops. For one thing,
they provide a simple way to repeat an action a specific number of
times. To print three lines, for example, use a range
to generate the appropriate number
of integers; for
loops force
results from range
automatically
in 3.0, so we don’t need list
here:
>>> for i in range(3):
...     print(i, 'Pythons')
...
0 Pythons
1 Pythons
2 Pythons
range
is also commonly used
to iterate over a sequence indirectly. The easiest and fastest way
to step through a sequence exhaustively is always with a simple
for
, as Python handles most of
the details for you:
>>> X = 'spam'
>>> for item in X: print(item, end=' ')   # Simple iteration
...
s p a m
Internally, the for
loop
handles the details of the iteration automatically when used this
way. If you really need to take over the indexing logic explicitly,
you can do it with a while
loop:
>>> i = 0
>>> while i < len(X):                     # while loop iteration
...     print(X[i], end=' ')
...     i += 1
...
s p a m
You can also do manual indexing with a for
, though, if you use range
to generate a list of indexes to
iterate through. It’s a multistep process, but it’s sufficient to
generate offsets, rather than the items at those offsets:
>>> X
'spam'
>>> len(X)                                # Length of string
4
>>> list(range(len(X)))                   # All legal offsets into X
[0, 1, 2, 3]
>>>
>>> for i in range(len(X)): print(X[i], end=' ')    # Manual for indexing
...
s p a m
Note that because this example is stepping over a list of
offsets into X
, not the actual
items of X
,
we need to index back into X
within the loop to fetch each item.
The last example in the prior section works, but it’s not the
fastest option. It’s also more work than we need to do. Unless you
have a special indexing requirement, you’re always better off using
the simple for
loop form in
Python—as a general rule, use for
instead of while
whenever
possible, and don’t use range
calls in for
loops except as a
last resort. This simpler solution is better:
>>> for item in X: print(item)            # Simple iteration
...
However, the coding pattern used in the prior example does allow us to do more specialized sorts of traversals. For instance, we can skip items as we go:
>>> S = 'abcdefghijk'
>>> list(range(0, len(S), 2))
[0, 2, 4, 6, 8, 10]
>>> for i in range(0, len(S), 2): print(S[i], end=' ')
...
a c e g i k
Here, we visit every second item in the
string S
by stepping over the
generated range
list. To visit
every third item, change the third range
argument to be 3
, and so on. In effect, using range
this way lets you skip items in
loops while still retaining the simplicity of the for
loop construct.
Still, this is probably not the best-practice technique
in Python today. If you really want to skip items in a sequence, the
extended three-limit form of the slice
expression, presented in Chapter 7,
provides a simpler route to the same goal. To visit every second
character in S
, for example,
slice with a stride of 2:
>>> S = 'abcdefghijk'
>>> for c in S[::2]: print(c, end=' ')
...
a c e g i k
The result is the same, but substantially easier for you to
write and for others to read. The only real advantage to using
range
here instead is that it
does not copy the string and does not create a list in 3.0; for very
large strings, it may save memory.
Another common place where you may use the range
and for
combination is in loops that change a
list as it is being traversed. Suppose, for example, that you need
to add 1 to every item in a list. You can try this with a simple
for
loop, but the result probably
won’t be exactly what you want:
>>> L = [1, 2, 3, 4, 5]
>>> for x in L:
...     x += 1
...
>>> L
[1, 2, 3, 4, 5]
>>> x
6
This doesn’t quite work: it changes the loop variable x, not the list L.
The reason is somewhat subtle. Each time through the loop, x refers to
the next integer already pulled out of the list. In the first iteration,
for example, x is integer 1. The loop body then sets x to a different
object, integer 2, but it does not update the list where 1 originally
came from.
To really change the list as we march across it, we need to
use indexes so we can assign an updated value to each position as we
go. The range
/len
combination can produce the required
indexes for us:
>>> L = [1, 2, 3, 4, 5]
>>> for i in range(len(L)):               # Add one to each item in L
...     L[i] += 1                         # Or L[i] = L[i] + 1
...
>>> L
[2, 3, 4, 5, 6]
When coded this way, the list is changed as we proceed through
the loop. There is no way to do the same with a simple for x in L:
-style loop, because such a
loop iterates through actual items, not list positions. But what
about the equivalent while
loop?
Such a loop requires a bit more work on our part, and likely runs
more slowly:
>>> i = 0
>>> while i < len(L):
...     L[i] += 1
...     i += 1
...
>>> L
[3, 4, 5, 6, 7]
Here again, though, the range
solution may not be ideal either. A
list comprehension expression of the form:
[x+1 for x in L]
would do similar work, albeit without changing the original
list in-place (we could assign the expression’s new list object
result back to L
, but this would
not update any other references to the original list). Because this
is such a central looping concept, we’ll save a complete exploration
of list comprehensions for the next chapter.
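The caveat about other references can be seen in a short sketch. The variable names are arbitrary, and the slice-assignment form in the second half is just one possible way to combine a comprehension with an in-place change, not something this chapter prescribes.

```python
L = [1, 2, 3]
alias = L                     # Second reference to the same list object

L = [x + 1 for x in L]        # Builds a new list; alias still sees the old one
print(L, alias)               # [2, 3, 4] [1, 2, 3]

M = [1, 2, 3]
other = M
M[:] = [x + 1 for x in M]     # Slice assignment changes the list in place
print(M, other)               # [2, 3, 4] [2, 3, 4]
```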
As we’ve seen, the range
built-in allows us to traverse sequences with for
in a nonexhaustive fashion. In the
same spirit, the built-in zip
function allows us to use for
loops to visit multiple sequences
in parallel. In basic operation, zip
takes one or more sequences as
arguments and returns a series of tuples that pair up parallel items
taken from those sequences. For example, suppose we’re working with
two lists:
>>> L1 = [1,2,3,4]
>>> L2 = [5,6,7,8]
To combine the items in these lists, we can use zip
to create a list of tuple pairs (like
range
, zip
is an iterable object in 3.0, so we
must wrap it in a list
call to
display all its results at once—more on iterators in the next
chapter):
>>> zip(L1, L2)
<zip object at 0x026523C8>
>>> list(zip(L1, L2))                     # list() required in 3.0, not 2.6
[(1, 5), (2, 6), (3, 7), (4, 8)]
Such a result may be useful in other contexts as well, but when wedded with the for loop, it supports parallel iterations:
>>> for (x, y) in zip(L1, L2):
...     print(x, y, '--', x+y)
...
1 5 -- 6
2 6 -- 8
3 7 -- 10
4 8 -- 12
Here, we step over the result of the zip call, that is, the pairs of items pulled from the two lists. Notice that this for loop again uses the tuple assignment form we met earlier to unpack each tuple in the zip result. The first time through, it’s as though we ran the assignment statement (x, y) = (1, 5).
The net effect is that we scan both L1 and L2 in our loop. We could achieve a similar effect with a while loop that handles indexing manually, but it would require more typing and would likely run more slowly than the for/zip approach.
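For comparison, here is a rough sketch of the manual while alternative next to the for/zip form. The result list names sums_zip and sums_while are hypothetical, and the while test checks both lengths so that it stops at the shorter list, as zip does:

```python
L1 = [1, 2, 3, 4]
L2 = [5, 6, 7, 8]

# for/zip: items are paired and unpacked for us
sums_zip = []
for (x, y) in zip(L1, L2):
    sums_zip.append(x + y)

# while: we manage the index and the stopping test ourselves
sums_while = []
i = 0
while i < len(L1) and i < len(L2):
    sums_while.append(L1[i] + L2[i])
    i += 1

print(sums_zip, sums_while)
```

Both loops collect the same sums, but the while version needs an explicit counter, test, and increment.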
Strictly speaking, the zip function is more general than this example suggests. For instance, it accepts any type of sequence (really, any iterable object, including files), and it accepts more than two arguments. With three arguments, as in the following example, it builds a list of three-item tuples with items from each sequence, essentially projecting by columns (technically, we get an N-ary tuple for N arguments):
>>> T1, T2, T3 = (1,2,3), (4,5,6), (7,8,9)
>>> T3
(7, 8, 9)
>>> list(zip(T1, T2, T3))
[(1, 4, 7), (2, 5, 8), (3, 6, 9)]
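The "projecting by columns" idea can also be sketched for rows that are already collected in a single list. This example relies on the *rows call syntax, which unpacks a sequence into separate arguments and has not been covered at this point; the names rows, columns, and restored are hypothetical:

```python
rows = [(1, 2, 3), (4, 5, 6), (7, 8, 9)]

# zip(*rows) is the same call as zip(rows[0], rows[1], rows[2])
columns = list(zip(*rows))
print(columns)

# Zipping the columns again gives back the original rows
restored = list(zip(*columns))
print(restored)
```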
Moreover, zip truncates result tuples at the length of the shortest sequence when the argument lengths differ. In the following, we zip together two strings to pick out characters in parallel, but the result has only as many tuples as the length of the shortest sequence:
>>> S1 = 'abc'
>>> S2 = 'xyz123'
>>>
>>> list(zip(S1, S2))
[('a', 'x'), ('b', 'y'), ('c', 'z')]
In Python 2.X, the related built-in map function pairs items from sequences in a similar fashion, but it pads shorter sequences with None if the argument lengths differ instead of truncating to the shortest length:
>>> S1 = 'abc'
>>> S2 = 'xyz123'
>>> map(None, S1, S2)             # 2.X only
[('a', 'x'), ('b', 'y'), ('c', 'z'), (None, '1'), (None, '2'), (None,'3')]
This example is using a degenerate form of the map built-in, which is no longer supported in 3.0. Normally, map takes a function and one or more sequence arguments and collects the results of calling the function with parallel items taken from the sequence(s). We’ll study map in detail in Chapters 19 and 20, but as a brief example, the following maps the built-in ord function across each item in a string and collects the results (like zip, map is a value generator in 3.0 and so must be passed to list to collect all its results at once):
>>> list(map(ord, 'spam'))
[115, 112, 97, 109]
This works the same as the following loop statement, but is often quicker:
>>> res = []
>>> for c in 'spam': res.append(ord(c))
...
>>> res
[115, 112, 97, 109]
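The description of map mentions one or more sequence arguments, so a short sketch with two sequences may help. It uses the built-in pow as the function; the names bases, exponents, powers, and powers_loop are hypothetical:

```python
bases = [1, 2, 3]
exponents = [2, 3, 4]

# map calls pow(1, 2), pow(2, 3), pow(3, 4) with parallel items
powers = list(map(pow, bases, exponents))

# Equivalent loop over zipped pairs
powers_loop = []
for (b, e) in zip(bases, exponents):
    powers_loop.append(pow(b, e))

print(powers, powers_loop)
```

With more than one sequence, the function must accept as many arguments as there are sequences.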
Version skew note: The degenerate form of map using a function argument of None is no longer supported in Python 3.0, because it largely overlaps with zip (and was, frankly, a bit at odds with map’s function-application purpose). In 3.0, either use zip or write loop code to pad results yourself. We’ll see how to do this in Chapter 20, after we’ve had a chance to study some additional iteration concepts.
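As a preview, one possible way to pad results in 3.X is sketched below. The helper pad_pairs is hypothetical and is not the version developed in Chapter 20. The standard library's itertools.zip_longest, which the text above does not mention, is included for comparison because it pads with None by default:

```python
from itertools import zip_longest

def pad_pairs(a, b, pad=None):
    """Pair items by position, padding the shorter sequence (hypothetical helper)."""
    result = []
    for i in range(max(len(a), len(b))):
        x = a[i] if i < len(a) else pad
        y = b[i] if i < len(b) else pad
        result.append((x, y))
    return result

S1 = 'abc'
S2 = 'xyz123'

manual = pad_pairs(S1, S2)
library = list(zip_longest(S1, S2))     # pads with None by default
print(manual)
print(library)
```

Both forms produce the same padded pairs as the 2.X map(None, S1, S2) call shown earlier.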
In Chapter 8, I suggested that the zip call used here can also be handy for generating dictionaries when the sets of keys and values must be computed at runtime. Now that we’re becoming proficient with zip, I’ll explain how it relates to dictionary construction. As you’ve learned, you can always create a dictionary by coding a dictionary literal, or by assigning to keys over time:
>>> D1 = {'spam':1, 'eggs':3, 'toast':5}
>>> D1
{'toast': 5, 'eggs': 3, 'spam': 1}
>>> D1 = {}
>>> D1['spam'] = 1
>>> D1['eggs'] = 3
>>> D1['toast'] = 5
What to do, though, if your program obtains dictionary keys and values in lists at runtime, after you’ve coded your script? For example, say you had the following keys and values lists:
>>> keys = ['spam', 'eggs', 'toast']
>>> vals = [1, 3, 5]
One solution for turning those lists into a dictionary would be to zip the lists and step through them in parallel with a for loop:
>>> list(zip(keys, vals))
[('spam', 1), ('eggs', 3), ('toast', 5)]
>>> D2 = {}
>>> for (k, v) in zip(keys, vals): D2[k] = v
...
>>> D2
{'toast': 5, 'eggs': 3, 'spam': 1}
It turns out, though, that in Python 2.2 and later you can skip the for loop altogether and simply pass the zipped keys/values lists to the built-in dict constructor call:
>>> keys = ['spam', 'eggs', 'toast']
>>> vals = [1, 3, 5]
>>> D3 = dict(zip(keys, vals))
>>> D3
{'toast': 5, 'eggs': 3, 'spam': 1}
The built-in name dict is really a type name in Python (you’ll learn more about type names, and subclassing them, in Chapter 31). Calling it achieves something like a list-to-dictionary conversion, but it’s really an object construction request. In the next chapter we’ll explore a related but richer concept, the list comprehension, which builds lists in a single expression; we’ll also revisit 3.0 dictionary comprehensions, an alternative to the dict call for zipped key/value pairs.
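As a brief preview of that alternative, the following sketch builds the same dictionary with a 3.X dictionary comprehension. The names D4 and D5 are hypothetical, and D5 only illustrates that a comprehension can transform values as it goes:

```python
keys = ['spam', 'eggs', 'toast']
vals = [1, 3, 5]

D3 = dict(zip(keys, vals))                      # constructor call
D4 = {k: v for (k, v) in zip(keys, vals)}       # dictionary comprehension
D5 = {k: v * 2 for (k, v) in zip(keys, vals)}   # same, doubling each value

print(D3 == D4)
print(D5)
```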
Earlier, we discussed using range to generate the offsets of items in a string, rather than the items at those offsets. In some programs, though, we need both: the item to use, plus an offset as we go. Traditionally, this was coded with a simple for loop that also kept a counter of the current offset:
>>> S = 'spam'
>>> offset = 0
>>> for item in S:
...     print(item, 'appears at offset', offset)
...     offset += 1
...
s appears at offset 0
p appears at offset 1
a appears at offset 2
m appears at offset 3
This works, but in recent Python releases a new built-in named enumerate does the job for us:
>>> S = 'spam'
>>> for (offset, item) in enumerate(S):
...     print(item, 'appears at offset', offset)
...
s appears at offset 0
p appears at offset 1
a appears at offset 2
m appears at offset 3
The enumerate function returns a generator object, a kind of object that supports the iteration protocol that we will study in the next chapter and will discuss in more detail in the next part of the book. In short, it has a __next__ method called by the next built-in function, which returns an (index, value) tuple each time through the loop. We can unpack these tuples with tuple assignment in the for loop (much like using zip):
>>> E = enumerate(S)
>>> E
<enumerate object at 0x02765AA8>
>>> next(E)
(0, 's')
>>> next(E)
(1, 'p')
>>> next(E)
(2, 'a')
As usual, we don’t normally see this machinery because iteration contexts—including list comprehensions, the subject of Chapter 14—run the iteration protocol automatically:
>>> [c * i for (i, c) in enumerate(S)]
['', 'p', 'aa', 'mmm']
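Because enumerate supplies both the offset and the item, it can also stand in for the range/len pattern used earlier to change a list in place. The sketch below shows that, along with enumerate's optional second argument, which sets the starting count and is not used elsewhere in this chapter; the name numbered is hypothetical:

```python
L = [1, 2, 3, 4, 5]

# The offset gives us somewhere to assign; the item gives us the old value
for (i, x) in enumerate(L):
    L[i] = x + 1
print(L)

# An optional second argument makes the count start at 1 instead of 0
numbered = list(enumerate('spam', 1))
print(numbered)
```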
To fully understand iteration concepts like enumerate, zip, and list comprehensions, we need to move on to the next chapter for a more formal dissection.
In this chapter, we explored Python’s looping statements as well as some concepts related to looping in Python. We looked at the while and for loop statements in depth, and we learned about their associated else clauses. We also studied the break and continue statements, which have meaning only inside loops, and met several built-in tools commonly used in for loops, including range, zip, map, and enumerate (although their roles as iterators in Python 3.0 won’t be fully uncovered until the next chapter).
In the next chapter, we continue the iteration story by discussing list comprehensions and the iteration protocol in Python, concepts strongly related to for loops. There, we’ll also explain some of the subtleties of iterable tools we met here, such as range and zip. As always, though, before moving on let’s exercise what you’ve picked up here with a quiz.
What are the main functional differences between a while and a for?
What’s the difference between break and continue?
When is a loop’s else clause executed?
How can you code a counter-based loop in Python?
What can a range be used for in a for loop?
The while loop is a general looping statement, but the for is designed to iterate across items in a sequence (really, iterable). Although the while can imitate the for with counter loops, it takes more code and might run slower.
The break statement exits a loop immediately (you wind up below the entire while or for loop statement), and continue jumps back to the top of the loop (you wind up positioned just before the test in while or the next item fetch in for).
The else clause in a while or for loop will be run once as the loop is exiting, if the loop exits normally (without running into a break statement). A break exits the loop immediately, skipping the else part on the way out (if there is one).
Counter loops can be coded with a while statement that keeps track of the index manually, or with a for loop that uses the range built-in function to generate successive integer offsets. Neither is the preferred way to work in Python if you simply need to step across all the items in a sequence. Instead, use a simple for loop, without range or counters, whenever possible; it will be easier to code and usually quicker to run.
The range built-in can be used in a for to implement a fixed number of repetitions, to scan by offsets instead of items at offsets, to skip successive items as you go, and to change a list while stepping across it. None of these roles requires range, and most have alternatives; scanning actual items, three-limit slices, and list comprehensions are often better solutions today (despite the natural inclinations of ex-C programmers to want to count things!).
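As a small illustration of that last answer, the following sketch skips every second item two ways: by offsets generated with range, and with a three-limit slice. The names by_range and by_slice are hypothetical:

```python
S = 'abcdefghijk'

# range with a step generates every second offset
by_range = []
for i in range(0, len(S), 2):
    by_range.append(S[i])

# A three-limit slice picks out the same items without counting
by_slice = []
for c in S[::2]:
    by_slice.append(c)

print(by_range)
print(by_slice)
```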