Chapter 16 introduced basic function definitions and calls. As we saw, Python’s basic function model is simple to use, but even simple function examples quickly led us to questions about the meaning of variables in our code. This chapter moves on to present the details behind Python’s scopes—the places where variables are defined and looked up. As we’ll see, the place where a name is assigned in our code is crucial to determining what the name means. We’ll also find that scope usage can have a major impact on program maintenance effort; overuse of globals, for example, is a generally bad thing.
Now that you’re ready to start writing your own functions, we need to get more formal about what names mean in Python. When you use a name in a program, Python creates, changes, or looks up the name in what is known as a namespace—a place where names live. When we talk about the search for a name’s value in relation to code, the term scope refers to a namespace: that is, the location of a name’s assignment in your code determines the scope of the name’s visibility to your code.
Just about everything related to names, including scope classification, happens at assignment time in Python. As we’ve seen, names in Python spring into existence when they are first assigned values, and they must be assigned before they are used. Because names are not declared ahead of time, Python uses the location of the assignment of a name to associate it with (i.e., bind it to) a particular namespace. In other words, the place where you assign a name in your source code determines the namespace it will live in, and hence its scope of visibility.
Besides packaging code, functions add an extra namespace layer to your programs—by default, all names assigned inside a function are associated with that function’s namespace, and no other. This means that:
Names defined inside a def can only be seen by the code within that def. You cannot even refer to such names from outside the function.
Names defined inside a def do not clash with variables outside the def, even if the same names are used elsewhere. A name X assigned outside a given def (i.e., in a different def or at the top level of a module file) is a completely different variable from a name X assigned inside that def.
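As a minimal sketch of these two points, the following script uses hypothetical names (compute, secret) that are not part of this chapter's examples; it assumes the code runs at the top level of a module file.

```python
X = 'module-level'            # Assigned outside any def

def compute():                # Hypothetical function, for illustration only
    X = 'function-level'      # A different variable: local to compute
    secret = 42               # Exists only while compute runs
    return X, secret

result = compute()
print(result)                 # ('function-level', 42)
print(X)                      # module-level: unchanged by the call

try:
    secret                    # Not visible outside the def
except NameError as exc:
    print('NameError:', exc)
```

The two X names never collide, and the reference to secret outside the function fails because that name lives only in the function's namespace.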
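In all cases, the scope of a variable (where it can be used) is always determined by where it is assigned in your source code and has nothing to do with which functions call which. In fact, as we’ll learn in this chapter, variables may be assigned in three different places, corresponding to three different scopes:
If a variable is assigned inside a def, it is local to that function.
If a variable is assigned in an enclosing def, it is nonlocal to nested functions.
If a variable is assigned outside all defs, it is global to the entire file.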
We call this lexical scoping because variable scopes are determined entirely by the locations of the variables in the source code of your program files, not by function calls.
For example, in the following module file, the X = 99 assignment creates a global variable named X (visible everywhere in this file), but the X = 88 assignment creates a local variable X (visible only within the def statement):
X = 99

def func():
    X = 88
Even though both variables are named X, their scopes make them different. The net effect is that function scopes help to avoid name clashes in your programs and help to make functions more self-contained program units.
Before we started writing functions, all the code we wrote was at the top level of a module (i.e., not nested in a def), so the names we used either lived in the module itself or were built-ins predefined by Python (e.g., open). Functions provide nested namespaces (scopes) that localize the names they use, such that names inside a function won’t clash with those outside it (in a module or another function). Again, functions define a local scope, and modules define a global scope.
The two scopes are related as follows:
The enclosing module is a global scope. Each module is a global scope—that is, a namespace in which variables created (assigned) at the top level of the module file live. Global variables become attributes of a module object to the outside world but can be used as simple variables within a module file.
The global scope spans a single file only. Don’t be fooled by the word “global” here—names at the top level of a file are only global to code within that single file. There is really no notion of a single, all-encompassing global file-based scope in Python. Instead, names are partitioned into modules, and you must always import a module explicitly if you want to be able to use the names its file defines. When you hear “global” in Python, think “module.”
Each call to a function creates a new local scope. Every time you call a function, you create a new local scope—that is, a namespace in which the names created inside that function will usually live. You can think of each def statement (and lambda expression) as defining a new local scope, but because Python allows functions to call themselves to loop (an advanced technique known as recursion), the local scope in fact technically corresponds to a function call—in other words, each call creates a new local namespace. Recursion is useful when processing structures whose shapes can’t be predicted ahead of time.
Assigned names are local unless declared global or nonlocal. By default, all the names assigned inside a function definition are put in the local scope (the namespace associated with the function call). If you need to assign a name that lives at the top level of the module enclosing the function, you can do so by declaring it in a global statement inside the function. If you need to assign a name that lives in an enclosing def, as of Python 3.0 you can do so by declaring it in a nonlocal statement.
All other names are enclosing function locals, globals, or built-ins. Names not assigned a value in the function definition are assumed to be enclosing scope locals (in an enclosing def), globals (in the enclosing module’s namespace), or built-ins (in the predefined builtins module Python provides, named __builtin__ in Python 2.6).
There are a few subtleties to note here. First, keep in mind that code typed at the interactive command prompt follows these same rules. You may not know it yet, but code run interactively is really entered into a built-in module called __main__; this module works just like a module file, but results are echoed as you go. Because of this, interactively created names live in a module, too, and thus follow the normal scope rules: they are global to the interactive session. You’ll learn more about modules in the next part of this book.
Also note that any type of assignment within a function classifies a name as local. This includes = statements, module names in import, function names in def, function argument names, and so on. If you assign a name in any way within a def, it will become a local to that function.
Conversely, in-place changes to objects do not classify names as locals; only actual name assignments do. For instance, if the name L is assigned to a list at the top level of a module, a statement L = X within a function will classify L as a local, but L.append(X) will not. In the latter case, we are changing the list object that L references, not L itself—L is found in the global scope as usual, and Python happily modifies it without requiring a global (or nonlocal) declaration. As usual, it helps to keep the distinction between names and objects clear: changing an object is not an assignment to a name.
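The difference between assigning a name and changing an object in place can be seen in a short sketch. The function names rebind, grow, and augment are hypothetical; the third case is an addition not discussed above, included because an augmented assignment such as += still counts as an assignment to the name.

```python
L = [1, 2]                        # Global name bound to a list

def rebind(X):
    L = X                         # Assignment: creates a local L, global untouched

def grow(X):
    L.append(X)                   # In-place change: L is found in the global scope

def augment(X):
    L += [X]                      # Augmented assignment: L is classified as local

rebind([9, 9])
print(L)                          # [1, 2]
grow(3)
print(L)                          # [1, 2, 3]

try:
    augment(4)                    # Local L has no value yet when += reads it
except UnboundLocalError as exc:
    print('UnboundLocalError:', exc)
print(L)                          # [1, 2, 3]
```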
If the prior section sounds confusing, it really boils down to three simple rules. With a def statement:
Name references search at most four scopes: local, then enclosing functions (if any), then global, then built-in.
Name assignments create or change local names by default.
global and nonlocal declarations map assigned names to enclosing module and function scopes.
In other words, all names assigned inside a function def statement (or a lambda, an expression we’ll meet later) are locals by default. Functions can freely use names assigned in syntactically enclosing functions and the global scope, but they must declare such nonlocals and globals in order to change them.
Python’s name-resolution scheme is sometimes called the LEGB rule, after the scope names:
When you use an unqualified name inside a function, Python searches up to four scopes—the local (L) scope, then the local scopes of any enclosing (E) defs and lambdas, then the global (G) scope, and then the built-in (B) scope—and stops at the first place the name is found. If the name is not found during this search, Python reports an error. As we learned in Chapter 6, names must be assigned before they can be used.
When you assign a name in a function (instead of just referring to it in an expression), Python always creates or changes the name in the local scope, unless it’s declared to be global or nonlocal in that function.
When you assign a name outside any function (i.e., at the top level of a module file, or at the interactive prompt), the local scope is the same as the global scope—the module’s namespace.
Figure 17-1 illustrates Python’s four scopes. Note that the second scope lookup layer, E—the scopes of enclosing defs or lambdas—can technically correspond to more than one lookup layer. This case only comes into play when you nest functions within functions, and it is addressed by the nonlocal statement.[36]
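The LEGB search order can be traced with a small script. This is only a sketch: the function names are hypothetical, and the same variable name is reused at each level on purpose so that the first match found is easy to see.

```python
import builtins

name = 'global'                       # G: assigned at the top level of the module

def outer():
    name = 'enclosing'                # E: local to outer, enclosing scope for the nested defs
    def inner():
        name = 'local'                # L: local to inner, found first
        return name
    def inner_no_local():
        return name                   # Not assigned here: found in outer's scope
    return inner(), inner_no_local()

def no_enclosing():
    return name                       # No local, no enclosing def: found in the module

def builtin_only():
    return len                        # Not local, enclosing, or global: found in builtins

print(outer())                        # ('local', 'enclosing')
print(no_enclosing())                 # global
print(builtin_only() is builtins.len) # True
```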
Also keep in mind that these rules apply only to simple variable names (e.g., spam). In Parts V and VI, we’ll see that qualified attribute names (e.g., object.spam) live in particular objects and follow a completely different set of lookup rules than those covered here. References to attribute names following periods (.) search one or more objects, not scopes, and may invoke something called “inheritance”; more on this in Part VI of this book.
Let’s look at a larger example that demonstrates scope ideas. Suppose we wrote the following code in a module file:
# Global scope
X = 99                    # X and func assigned in module: global

def func(Y):              # Y and Z assigned in function: locals
    # Local scope
    Z = X + Y             # X is a global
    return Z

func(1)                   # func in module: result=100
This module and the function it contains use a number of names to do their business. Using Python’s scope rules, we can classify the names as follows:
X, func
X is global because it’s assigned at the top level of the module file; it can be referenced inside the function without being declared global. func is global for the same reason; the def statement assigns a function object to the name func at the top level of the module.
Y, Z
Y and Z are local to the function (and exist only while the function runs) because they are both assigned values in the function definition: Z by virtue of the = statement, and Y because arguments are always passed by assignment.
The whole point behind this name-segregation scheme is that local variables serve as temporary names that you need only while a function is running. For instance, in the preceding example, the argument Y and the addition result Z exist only inside the function; these names don’t interfere with the enclosing module’s namespace (or any other function, for that matter).
The local/global distinction also makes functions easier to understand, as most of the names a function uses appear in the function itself, not at some arbitrary place in a module. Also, because you can be sure that local names will not be changed by some remote function in your program, they tend to make programs easier to debug and modify.
We’ve been talking about the built-in scope in the abstract, but it’s a bit simpler than you may think. Really, the built-in scope is just a built-in module called builtins, but you have to import builtins to query built-ins because the name builtins is not itself built-in....
No, I’m serious! The built-in scope is implemented as a standard library module named builtins, but that name itself is not placed in the built-in scope, so you have to import it in order to inspect it. Once you do, you can run a dir call to see which names are predefined. In Python 3.0:
>>> import builtins
>>> dir(builtins)
['ArithmeticError', 'AssertionError', 'AttributeError', 'BaseException', 'BufferError', 'BytesWarning', 'DeprecationWarning', 'EOFError', 'Ellipsis',
 ...many more names omitted...
 'print', 'property', 'quit', 'range', 'repr', 'reversed', 'round', 'set', 'setattr', 'slice', 'sorted', 'staticmethod', 'str', 'sum', 'super', 'tuple', 'type', 'vars', 'zip']
The names in this list constitute the built-in scope in Python; roughly the first half are built-in exceptions, and the second half are built-in functions. Also in this list are the special names None, True, and False, though they are treated as reserved words. Because Python automatically searches this module last in its LEGB lookup, you get all the names in this list “for free”; that is, you can use them without importing any modules. Thus, there are really two ways to refer to a built-in function—by taking advantage of the LEGB rule, or by manually importing the builtins module:
>>> zip                           # The normal way
<class 'zip'>
>>> import builtins               # The hard way
>>> builtins.zip
<class 'zip'>
The second of these approaches is sometimes useful in advanced work. The careful reader might also notice that because the LEGB lookup procedure takes the first occurrence of a name that it finds, names in the local scope may override variables of the same name in both the global and built-in scopes, and global names may override built-ins. A function can, for instance, create a local variable called open by assigning to it:
def hider():
    open = 'spam'              # Local variable, hides built-in
    ...
    open('data.txt')           # This won't open a file now in this scope!
However, this will hide the built-in function called open that lives in the built-in (outer) scope. It’s also usually a bug, and a nasty one at that, because Python will not issue a warning message about it. (There are times in advanced programming when you may really want to replace a built-in name by redefining it in your code.) Functions can similarly hide global variables of the same name with locals:
X = 88                         # Global X

def func():
    X = 99                     # Local X: hides global

func()
print(X)                       # Prints 88: unchanged
Here, the assignment within the function creates a local X that is a completely different variable from the global X in the module outside the function. Because of this, there is no way to change a name outside a function without adding a global (or nonlocal) declaration to the def, as described in the next section.[37]
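Both kinds of hiding described here are limited to the scope where the assignment happens. The following sketch (with the hypothetical function careful added for illustration) shows that the built-in stays intact outside the function, and that the original can still be reached through the builtins module while a local name hides it.

```python
import builtins

def hider():
    open = 'spam'                          # Local name hides the built-in in this scope only
    return open

def careful():
    len = lambda obj: -1                   # Hypothetical local override of len
    return len('abc'), builtins.len('abc') # The original is still reachable by attribute

print(hider())                             # spam
print(callable(open))                      # True: the built-in is untouched outside hider
print(careful())                           # (-1, 3)
```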
Version skew note: Actually, the tongue twisting gets a bit worse. The Python 3.0 builtins module used here is named __builtin__ in Python 2.6. And just for fun, the name __builtins__ (with the “s”) is preset in most global scopes, including the interactive session, to reference the module known as builtins (a.k.a. __builtin__ in 2.6). That is, after importing builtins, __builtins__ is builtins is True in 3.0, and __builtins__ is __builtin__ is True in 2.6. The net effect is that we can inspect the built-in scope by simply running dir(__builtins__) with no import in both 3.0 and 2.6, but we are advised to use builtins for real work in 3.0. Who said documenting this stuff was easy?
The global statement and its nonlocal cousin are the only things that are remotely like declaration statements in Python. They are not type or size declarations, though; they are namespace declarations. The global statement tells Python that a function plans to change one or more global names—i.e., names that live in the enclosing module’s scope (namespace).
We’ve talked about global in passing already. Here’s a summary:
Global names are variables assigned at the top level of the enclosing module file.
Global names must be declared only if they are assigned within a function.
Global names may be referenced within a function without being declared.
In other words, global allows us to change names that live outside a def at the top level of a module file. As we’ll see later, the nonlocal statement is almost identical but applies to names in the enclosing def’s local scope, rather than names in the enclosing module.
The global statement consists of the keyword global, followed by one or more names separated by commas. All the listed names will be mapped to the enclosing module’s scope when assigned or referenced within the function body. For instance:
X = 88                    # Global X

def func():
    global X
    X = 99                # Global X: outside def

func()
print(X)                  # Prints 99
We’ve added a global declaration to the example here, such that the X inside the def now refers to the X outside the def; they are the same variable this time. Here is a slightly more involved example of global at work:
y, z = 1, 2               # Global variables in module

def all_global():
    global x              # Declare globals assigned
    x = y + z             # No need to declare y, z: LEGB rule
Here, x, y, and z are all globals inside the function all_global. y and z are global because they aren’t assigned in the function; x is global because it was listed in a global statement to map it to the module’s scope explicitly. Without the global here, x would be considered local by virtue of the assignment.
Notice that y and z are not declared global; Python’s LEGB lookup rule finds them in the module automatically. Also, notice that x might not exist in the enclosing module before the function runs; in this case, the assignment in the function creates x in the module.
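The last point, that a global assignment can create a module-level name that did not exist before, can be checked directly. This is a minimal sketch assuming the code runs at the top level of a module; init_counter and counter are hypothetical names.

```python
def init_counter():
    global counter                    # Hypothetical name, not assigned anywhere yet
    counter = 0                       # Creates counter in the module's scope

print('counter' in globals())         # False: the def alone does not create it
init_counter()
print('counter' in globals())         # True: created by the call
print(counter)                        # 0
```

The name comes into existence only when the function actually runs, which is one more reason such code can be hard to follow from the module file alone.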
By default, names assigned in functions are locals, so if you want to change names outside functions you have to write extra code (e.g., global statements). This is by design—as is common in Python, you have to say more to do the potentially “wrong” thing. Although there are times when globals are useful, variables assigned in a def are local by default because that is normally the best policy. Changing globals can lead to well-known software engineering problems: because the variables’ values are dependent on the order of calls to arbitrarily distant functions, programs can become difficult to debug.
Consider this module file, for example:
X = 99

def func1():
    global X
    X = 88

def func2():
    global X
    X = 77
Now, imagine that it is your job to modify or reuse this module file. What will the value of X be here? Really, that question has no meaning unless it’s qualified with a point of reference in time—the value of X is timing-dependent, as it depends on which function was called last (something we can’t tell from this file alone).
The net effect is that to understand this code, you have to trace the flow of control through the entire program. And, if you need to reuse or modify the code, you have to keep the entire program in your head all at once. In this case, you can’t really use one of these functions without bringing along the other. They are dependent on (that is, coupled with) the global variable. This is the problem with globals—they generally make code more difficult to understand and use than code consisting of self-contained functions that rely on locals.
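To make the timing dependence concrete, the following sketch repeats the two functions and calls them in both orders. The variables after_12 and after_21 are hypothetical names added here only to record the results.

```python
X = 99

def func1():
    global X
    X = 88

def func2():
    global X
    X = 77

func1(); func2()
after_12 = X                      # Value left by the last call: func2
func2(); func1()
after_21 = X                      # Same functions, other order: func1 wins

print(after_12, after_21)         # 77 88
```

Nothing in either function tells you which value X holds at a given moment; only the calling order, which may be spread across the whole program, decides it.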
On the other hand, short of using object-oriented programming and classes, global variables are probably the most straightforward way to retain shared state information (information that a function needs to remember for use the next time it is called) in Python—local variables disappear when the function returns, but globals do not. Other techniques, such as default mutable arguments and enclosing function scopes, can achieve this, too, but they are more complex than pushing values out to the global scope for retention.
Some programs designate a single module to collect globals; as long as this is expected, it is not as harmful. In addition, programs that use multithreading to do parallel processing in Python commonly depend on global variables—they become shared memory between functions running in parallel threads, and so act as a communication device.[38]
For now, though, especially if you are relatively new to programming, avoid the temptation to use globals whenever you can—try to communicate with passed-in arguments and return values instead. Six months from now, both you and your coworkers will be happy you did.
Here’s another scope-related issue: although we can change variables in another file directly, we usually shouldn’t. Module files were introduced in Chapter 3 and are covered in more depth in the next part of this book. To illustrate their relationship to scopes, consider these two module files:
# first.py
X = 99                    # This code doesn't know about second.py

# second.py
import first
print(first.X)            # Okay: references a name in another file
first.X = 88              # But changing it can be too subtle and implicit
The first defines a variable X, which the second prints and then changes by assignment. Notice that we must import the first module into the second file to get to its variable at all—as we’ve learned, each module is a self-contained namespace (package of variables), and we must import one module to see inside it from another. That’s the main point about modules: by segregating variables on a per-file basis, they avoid name collisions across files.
Really, though, in terms of this chapter’s topic, the global scope of a module file becomes the attribute namespace of the module object once it is imported—importers automatically have access to all of the file’s global variables, because a file’s global scope morphs into an object’s attribute namespace when it is imported.
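The link between a module's global scope and its attribute namespace can be observed without creating files on disk. The sketch below is an assumption-laden stand-in for the two-file example: it builds a throwaway module object in memory with types.ModuleType and runs a small source string in its namespace; the function show is hypothetical.

```python
import types

# Build an in-memory module object instead of a real first.py file
first = types.ModuleType('first')
source = "X = 99\ndef show():\n    return X\n"
exec(source, first.__dict__)      # Top-level assignments land in the module's namespace

print(first.X)                    # 99: the global X is an attribute of the module
first.X = 88                      # Attribute assignment from outside the module...
print(first.show())               # 88: ...changes the global that show sees
print(vars(first)['X'])           # 88: same namespace, viewed as a dictionary
```

Code inside the module reads X as a simple variable, while code outside reaches the same name through attribute qualification.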
After importing the first module, the second module prints its variable and then assigns it a new value. Referencing the module’s variable to print it is fine—this is how modules are linked together into a larger system normally. The problem with the assignment, however, is that it is far too implicit: whoever’s charged with maintaining or reusing the first module probably has no clue that some arbitrarily far-removed module on the import chain can change X out from under them at runtime. In fact, the second module may be in a completely different directory, and so difficult to notice at all.
Although such cross-file variable changes are always possible in Python, they are usually much more subtle than you will want. Again, this sets up too strong a coupling between the two files—because they are both dependent on the value of the variable X, it’s difficult to understand or reuse one file without the other. Such implicit cross-file dependencies can lead to inflexible code at best, and outright bugs at worst.
Here again, the best prescription is generally to not do this—the best way to communicate across file boundaries is to call functions, passing in arguments and getting back return values. In this specific case, we would probably be better off coding an accessor function to manage the change:
# first.py
X = 99

def setX(new):
    global X
    X = new

# second.py
import first
first.setX(88)
This requires more code and may seem like a trivial change, but it makes a huge difference in terms of readability and maintainability—when a person reading the first module by itself sees a function, that person will know that it is a point of interface and will expect the change to X. In other words, it removes the element of surprise, which is rarely a good thing in software projects. Although we cannot prevent cross-file changes from happening, common sense dictates that they should be minimized unless widely accepted across the program.
Interestingly, because global-scope variables morph into the attributes of a loaded module object, we can emulate the global statement by importing the enclosing module and assigning to its attributes, as in the following example module file. Code in this file imports the enclosing module, first by name, and then by indexing the sys.modules loaded modules table (more on this table in Chapter 21):
# thismod.py
var = 99                              # Global variable == module attribute

def local():
    var = 0                           # Change local var

def glob1():
    global var                        # Declare global (normal)
    var += 1                          # Change global var

def glob2():
    var = 0                           # Change local var
    import thismod                    # Import myself
    thismod.var += 1                  # Change global var

def glob3():
    var = 0                           # Change local var
    import sys                        # Import system table
    glob = sys.modules['thismod']     # Get module object (or use __name__)
    glob.var += 1                     # Change global var

def test():
    print(var)
    local(); glob1(); glob2(); glob3()
    print(var)
When run, this adds 3 to the global variable (only the first function does not impact it):
>>> import thismod
>>> thismod.test()
99
102
>>> thismod.var
102
This works, and it illustrates the equivalence of globals to module attributes, but it’s much more work than using the global statement to make your intentions explicit.
As we’ve seen, global allows us to change names in a module outside a function. It has a cousin named nonlocal that can be used to change names in enclosing functions, too, but to understand how that can be useful, we first need to explore enclosing functions in general.
So far, I’ve omitted one part of Python’s scope rules on purpose, because it’s relatively rare to encounter it in practice. However, it’s time to take a deeper look at the letter E in the LEGB lookup rule. The E layer is fairly new (it was added in Python 2.2); it takes the form of the local scopes of any and all enclosing function defs. Enclosing scopes are sometimes also called statically nested scopes. Really, the nesting is a lexical one—nested scopes correspond to physically and syntactically nested code structures in your program’s source code.
With the addition of nested function scopes, variable lookup rules become slightly more complex. Within a function:
A reference (X) looks for the name X first in the current local scope (function); then in the local scopes of any lexically enclosing functions in your source code, from inner to outer; then in the current global scope (the module file); and finally in the built-in scope (the module builtins). global declarations make the search begin in the global (module file) scope instead.
An assignment (X = value) creates or changes the name X in the current local scope, by default. If X is declared global within the function, the assignment creates or changes the name X in the enclosing module’s scope instead. If, on the other hand, X is declared nonlocal within the function, the assignment changes the name X in the closest enclosing function’s local scope.
Notice that the global declaration still maps variables to the enclosing module. When nested functions are present, variables in enclosing functions may be referenced, but they require nonlocal declarations to be changed.
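As a brief preview of the assignment rule for enclosing scopes, here is a minimal sketch for Python 3.X. The names make_counter, step, and count are hypothetical; the example only illustrates that a nested function needs a nonlocal declaration to change, rather than merely read, a name in its enclosing def.

```python
def make_counter():
    count = 0                     # Local to make_counter, enclosing scope for step
    def step():
        nonlocal count            # Without this, count += 1 would make count local to step
        count += 1                # Changes the name in make_counter's scope
        return count
    return step

tick = make_counter()
print(tick(), tick(), tick())     # 1 2 3
other = make_counter()            # A new call creates a new enclosing scope
print(other())                    # 1
```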
To clarify the prior section’s points, let’s illustrate with some real code. Here is what an enclosing function scope looks like:
X = 99                    # Global scope name: not used

def f1():
    X = 88                # Enclosing def local
    def f2():
        print(X)          # Reference made in nested def
    f2()

f1()                      # Prints 88: enclosing def local
First off, this is legal Python code: the def is simply an executable statement, which can appear anywhere any other statement can—including nested in another def. Here, the nested def runs while a call to the function f1 is running; it generates a function and assigns it to the name f2, a local variable within f1’s local scope. In a sense, f2 is a temporary function that lives only during the execution of (and is visible only to code in) the enclosing f1.
But notice what happens inside f2: when it prints the variable X, it refers to the X that lives in the enclosing f1 function’s local scope. Because functions can access names in all physically enclosing def statements, the X in f2 is automatically mapped to the X in f1, by the LEGB lookup rule.
This enclosing scope lookup works even if the enclosing function has already returned. For example, the following code defines a function that makes and returns another function:
def f1():
    X = 88
    def f2():
        print(X)          # Remembers X in enclosing def scope
    return f2             # Return f2 but don't call it

action = f1()             # Make, return function
action()                  # Call it now: prints 88
In this code, the call to action is really running the function we named f2 when f1 ran. f2 remembers the enclosing scope’s X in f1, even though f1 is no longer active.
Depending on whom you ask, this sort of behavior is also sometimes called a closure or factory function. These terms refer to a function object that remembers values in enclosing scopes regardless of whether those scopes are still present in memory. Although classes (described in Part VI of this book) are usually best at remembering state because they make it explicit with attribute assignments, such functions provide an alternative.
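In Python 3.X, the retained state can be inspected on the function object itself. The sketch below is an aside rather than something the technique requires: it uses the standard __code__.co_freevars and __closure__ attributes of function objects, and changes f2 to return X instead of printing it so the result is easier to check.

```python
def f1():
    X = 88
    def f2():
        return X                              # X comes from the enclosing scope
    return f2

action = f1()                                 # f1 has returned by the time action runs
print(action())                               # 88
print(action.__code__.co_freevars)            # ('X',): names taken from enclosing scopes
print(action.__closure__[0].cell_contents)    # 88: the value retained for X
```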
For instance, factory functions are sometimes used by programs that need to generate event handlers on the fly in response to conditions at runtime (e.g., user inputs that cannot be anticipated). Look at the following function, for example:
>>> def maker(N):
...     def action(X):            # Make and return action
...         return X ** N         # action retains N from enclosing scope
...     return action
...
This defines an outer function that simply generates and returns a nested function, without calling it. If we call the outer function:
>>> f = maker(2)                  # Pass 2 to N
>>> f
<function action at 0x014720B0>
what we get back is a reference to the generated nested function—the one created by running the nested def. If we now call what we got back from the outer function:
>>> f(3)                          # Pass 3 to X, N remembers 2: 3 ** 2
9
>>> f(4)                          # 4 ** 2
16
it invokes the nested function—the one called action within maker. The most unusual part of this is that the nested function remembers integer 2, the value of the variable N in maker, even though maker has returned and exited by the time we call action. In effect, N from the enclosing local scope is retained as state information attached to action, and we get back its argument squared.
If we now call the outer function again, we get back a new nested function with different state information attached. That is, we get the argument cubed instead of squared, but the original still squares as before:
>>> g = maker(3)                  # g remembers 3, f remembers 2
>>> g(3)                          # 3 ** 3
27
>>> f(3)                          # 3 ** 2
9
This works because each call to a factory function like this gets its own set of state information. In our case, the function we assign to name g remembers 3, and f remembers 2, because each has its own state information retained by the variable N in maker.
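The event-handler use mentioned earlier follows the same pattern. The sketch below is hypothetical throughout: make_handler, the labels, and the string standing in for an event are invented for illustration and do not correspond to any particular GUI toolkit.

```python
def make_handler(label):
    def handler(event):
        # label is retained from the enclosing scope, one value per make_handler call
        return '%s got %s' % (label, event)
    return handler

# Build one handler per label at runtime
handlers = {name: make_handler(name) for name in ('save', 'quit')}

print(handlers['save']('click'))      # save got click
print(handlers['quit']('click'))      # quit got click
```

Each handler carries its own label, just as f and g each carry their own N.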
This is an advanced technique that you’re unlikely to see very often in most code, except among programmers with backgrounds in functional programming languages. On the other hand, enclosing scopes are often employed by lambda function-creation expressions (discussed later in this chapter)—because they are expressions, they are almost always nested within a def. Moreover, function nesting is commonly used for decorators (explored in Chapter 38)—in some cases, it’s the most reasonable coding pattern.
As a general rule, classes are better at “memory” like this because they make the state retention explicit in attributes. Short of using classes, though, globals, enclosing scope references like these, and default arguments are the main ways that Python functions can retain state information. To see how they compete, Chapter 18 provides complete coverage of defaults, but the next section gives enough of an introduction to get us started.
In earlier versions of Python, the sort of code in the prior section failed because nested defs did not do anything about scopes—a reference to a variable within f2 would search only the local (f2), then global (the code outside f1), and then built-in scopes. Because it skipped the scopes of enclosing functions, an error would result. To work around this, programmers typically used default argument values to pass in and remember the objects in an enclosing scope:
def f1():
    x = 88
    def f2(x=x):          # Remember enclosing scope X with defaults
        print(x)
    f2()

f1()                      # Prints 88
This code works in all Python releases, and you’ll still see this pattern in some existing Python code. In short, the syntax arg = val in a def header means that the argument arg will default to the value val if no real value is passed to arg in a call.
In the modified f2 here, the x=x means that the argument x will default to the value of x in the enclosing scope. Because the second x is evaluated before Python steps into the nested def, it still refers to the x in f1. In effect, the default remembers what x was in f1 (i.e., the object 88).
That’s fairly complex, and it depends entirely on the timing of default value evaluations. In fact, the nested scope lookup rule was added to Python to make defaults unnecessary for this role; today, Python automatically remembers any values required in the enclosing scope for use in nested defs.
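The timing difference between the two techniques can be seen in a small sketch. The helper names by_default and by_scope are hypothetical, chosen only for this illustration: the default is evaluated once, when the nested def runs, while the enclosing scope reference is resolved when the nested function is called.

```python
def f1():
    x = 88
    def by_default(x=x):       # Default evaluated now, while x is 88
        return x
    def by_scope():            # x looked up in f1 when by_scope is called
        return x
    x = 99                     # Rebind x after both defs have run
    return by_default(), by_scope()

print(f1())                    # (88, 99)
```

The default keeps the object x referenced at definition time, whereas the scope reference sees whatever x refers to at the time of the call.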
Of course, the best prescription for most code is simply to avoid nesting defs within defs, as it will make your programs much simpler. The following is an equivalent of the prior example that banishes the notion of nesting. Notice the forward reference in this code: it’s OK to call a function defined after the function that calls it, as long as the second def runs before the first function is actually called. Code inside a def is never evaluated until the function is actually called:
>>> def f1():
...     x = 88                  # Pass x along instead of nesting
...     f2(x)                   # Forward reference okay
...
>>> def f2(x):
...     print(x)
...
>>> f1()
88
If you avoid nesting this way, you can almost forget about the nested scopes concept in Python, unless you need to code in the factory function style discussed earlier (at least, for def statements). lambdas, which almost naturally appear nested in defs, often rely on nested scopes, as the next section explains.
While nested scopes are rarely used in practice for defs themselves, you are more likely to care about them when you start coding lambda expressions. We won’t cover lambda in depth until Chapter 19, but in short, it’s an expression that generates a new function to be called later, much like a def statement. Because it’s an expression, though, it can be used in places that def cannot, such as within list and dictionary literals.
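As a brief sketch of that last point, the following places lambda expressions directly inside a list literal and a dictionary literal. The names powers and ops are hypothetical and exist only for this illustration.

```python
# A list literal holding functions
powers = [lambda x: x ** 2, lambda x: x ** 3]

# A dictionary literal used as a simple dispatch table
ops = {
    'double': lambda x: x * 2,
    'negate': lambda x: -x,
}

print([f(3) for f in powers])   # [9, 27]
print(ops['double'](21))        # 42
print(ops['negate'](5))         # -5
```

A def statement could not appear in either position, because both literals accept only expressions.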
Like a def, a lambda expression introduces a new local scope for the function it creates. Thanks to the enclosing scopes lookup layer, lambdas can see all the variables that live in the functions in which they are coded. Thus, the following code works, but only because the nested scope rules are applied:
def func():
    x = 4
    action = (lambda n: x ** n)          # x remembered from enclosing def
    return action

x = func()
print(x(2))                              # Prints 16, 4 ** 2
Prior to the introduction of nested function scopes, programmers used defaults to pass values from an enclosing scope into lambdas, just as for defs. For instance, the following works on all Python releases:
def func():
    x = 4
    action = (lambda n, x=x: x ** n)     # Pass x in manually
    return action
Because lambdas are expressions, they naturally (and even normally) nest inside enclosing defs. Hence, they are perhaps the biggest beneficiaries of the addition of enclosing function scopes in the lookup rules; in most cases, it is no longer necessary to pass values into lambdas with defaults.
There is one notable exception to the rule I just gave: if a lambda or def defined within a function is nested inside a loop, and the nested function references an enclosing scope variable that is changed by that loop, all functions generated within the loop will see the same value, namely the value the referenced variable had in the last loop iteration.
For instance, the following attempts to build up a list of functions that each remember the current variable i from the enclosing scope:
>>> def makeActions():
...     acts = []
...     for i in range(5):                       # Tries to remember each i
...         acts.append(lambda x: i ** x)        # All remember same last i!
...     return acts
...
>>> acts = makeActions()
>>> acts[0]
<function <lambda> at 0x012B16B0>
This doesn’t quite work, though. Because the enclosing scope variable is looked up when the nested functions are later called, they all effectively remember the same value (the value the loop variable had on the last loop iteration). That is, we get back 4 to the power of 2 for each function in the list, because i is the same in all of them:
>>> acts[0](2)                           # All are 4 ** 2, value of last i
16
>>> acts[2](2)                           # This should be 2 ** 2
16
>>> acts[4](2)                           # This should be 4 ** 2
16
This is the one case where we still have to explicitly retain enclosing scope values with default arguments, rather than enclosing scope references. That is, to make this sort of code work, we must pass in the current value of the enclosing scope’s variable with a default. Because defaults are evaluated when the nested function is created (not when it’s later called), each remembers its own value for i:
>>> def makeActions():
...     acts = []
...     for i in range(5):                       # Use defaults instead
...         acts.append(lambda x, i=i: i ** x)   # Remember current i
...     return acts
...
>>> acts = makeActions()
>>> acts[0](2)                                   # 0 ** 2
0
>>> acts[2](2)                                   # 2 ** 2
4
>>> acts[4](2)                                   # 4 ** 2
16
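Defaults are the technique this section relies on. For comparison only, here is a sketch of a different arrangement that reaches the same result by routing each loop value through a small helper factory. The helper name make_power is hypothetical and is not part of the original example; each call to it creates a fresh enclosing scope, so each lambda sees its own base.

```python
def make_power(base):
    # Each call creates a new enclosing scope with its own base
    return lambda x: base ** x

def makeActions():
    acts = []
    for i in range(5):
        acts.append(make_power(i))   # Pass the current i into a new scope
    return acts

acts = makeActions()
print(acts[0](2), acts[2](2), acts[4](2))   # 0 4 16
```

The loop variable i is still shared, but no nested function references it directly, so its final value no longer matters.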
This is a fairly obscure case, but it can come up in practice, especially in code that generates callback handler functions for a number of widgets in a GUI (e.g., button-press handlers). We’ll talk more about defaults in Chapter 18 and lambdas in Chapter 19, so you may want to return and review this section later.[39]
Before ending this discussion, I should note that scopes may nest arbitrarily, but only enclosing function def statements (not classes, described in Part VI) are searched:
>>> def f1():
...     x = 99
...     def f2():
...         def f3():
...             print(x)        # Found in f1's local scope!
...         f3()
...     f2()
...
>>> f1()
99
Python will search the local scopes of all enclosing defs, from inner to outer, after the referencing function’s local scope and before the module’s global scope or built-ins. However, this sort of code is even less likely to pop up in practice. In Python, we say flat is better than nested; except in very limited contexts, your life (and the lives of your coworkers) will generally be better if you minimize nested function definitions.
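Two details of this search order can be checked with a short sketch: the nearest enclosing def wins, and a class nested in a function is skipped when its methods look up names. The class C and the strings used here are hypothetical, added purely for illustration.

```python
def f1():
    x = 'f1'
    class C:
        x = 'class'             # Class-level name: not an enclosing def scope
        def method(self):
            return x            # Skips C, finds x in f1
    def f2():
        x = 'f2'
        def f3():
            return x            # Nearest enclosing def wins: f2's x
        return f3()
    return C().method(), f2()

print(f1())                     # ('f1', 'f2')
```

To reach the class-level x from inside method, the code would have to name it explicitly, for example as self.x or C.x.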
In the prior section we explored the way that nested functions can reference variables in an enclosing function’s scope, even if that function has already returned. It turns out that, as of Python 3.0, we can also change such enclosing scope variables, as long as we declare them in nonlocal statements. With this statement, nested defs can have both read and write access to names in enclosing functions.
The nonlocal statement is a close cousin to global, covered earlier. Like global, nonlocal declares that a name will be changed in an enclosing scope. Unlike global, though, nonlocal applies to a name in an enclosing function’s scope, not the global module scope outside all defs. Also unlike global, nonlocal names must already exist in the enclosing function’s scope when declared: they can exist only in enclosing functions and cannot be created by a first assignment in a nested def.
In other words, nonlocal both allows assignment to names in enclosing function scopes and limits scope lookups for such names to enclosing defs. The net effect is a more direct and reliable implementation of changeable state information, for programs that do not desire or need classes with attributes.
Python 3.0 introduces a new nonlocal statement, which has meaning only inside a function:
def func():
    nonlocal name1, name2, ...
This statement allows a nested function to change one or more names defined in a syntactically enclosing function’s scope. In Python 2.X (including 2.6), when one function def is nested in another, the nested function can reference any of the names defined by assignment in the enclosing def’s scope, but it cannot change them. In 3.0, declaring the enclosing scopes’ names in a nonlocal statement enables nested functions to assign and thus change such names as well.
This gives enclosing functions a way to provide writeable state information, remembered when the nested function is later called. Allowing the state to change makes it more useful to the nested function (imagine a counter in the enclosing scope, for instance). In 2.X, programmers usually achieve similar goals by using classes or other schemes. Because nested functions have become a more common coding pattern for state retention, though, nonlocal makes that pattern more generally applicable.
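The counter mentioned in passing above might be sketched as follows. The names make_counter, count, and increment are hypothetical, and the code assumes Python 3.X, where nonlocal is available.

```python
def make_counter(start=0):
    count = start
    def increment():
        nonlocal count          # Assign to count in make_counter's scope
        count += 1
        return count
    return increment

a = make_counter()
b = make_counter(10)
print(a(), a(), a())            # 1 2 3
print(b())                      # 11
```

Each call to make_counter produces an independent counter, so advancing a has no effect on b.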
Besides allowing names in enclosing defs to be changed, the nonlocal statement also forces the issue for references: just like the global statement, nonlocal causes searches for the names listed in the statement to begin in the enclosing defs’ scopes, not in the local scope of the declaring function. That is, nonlocal also means “skip my local scope entirely.”
In fact, the names listed in a nonlocal must have been previously defined in an enclosing def when the nonlocal is reached, or an error is raised. The net effect is much like global: global means the names reside in the enclosing module, and nonlocal means they reside in an enclosing def. nonlocal is even more strict, though; scope search is restricted to only enclosing defs. That is, nonlocal names can appear only in enclosing defs, not in the module’s global scope or built-in scopes outside the defs.
The addition of nonlocal does not alter name reference scope rules in general; they still work as before, per the “LEGB” rule described earlier. The nonlocal statement mostly serves to allow names in enclosing scopes to be changed rather than just referenced. However, global and nonlocal statements do both restrict the lookup rules somewhat, when coded in a function:
global makes scope lookup begin in the enclosing module’s scope and allows names there to be assigned. Scope lookup continues on to the built-in scope if the name does not exist in the module, but assignments to global names always create or change them in the module’s scope.
nonlocal restricts scope lookup to just enclosing defs, requires that the names already exist there, and allows them to be assigned. Scope lookup does not continue on to the global or built-in scopes.
In Python 2.6, references to enclosing def scope names are allowed, but not assignment. However, you can still use classes with explicit attributes to achieve the same changeable state information effect as nonlocals (and you may be better off doing so in some contexts); globals and function attributes can sometimes accomplish similar goals as well. More on this in a moment; first, let’s turn to some working code to make this more concrete.
On to some examples, all run in 3.0. References to enclosing def scopes work as they do in 2.6. In the following, tester builds and returns the function nested, to be called later, and the state reference in nested maps the local scope of tester using the normal scope lookup rules:
C:\misc> c:\python30\python
>>> def tester(start):
...     state = start                  # Referencing nonlocals works normally
...     def nested(label):
...         print(label, state)        # Remembers state in enclosing scope
...     return nested
...
>>> F = tester(0)
>>> F('spam')
spam 0
>>> F('ham')
ham 0
Changing a name in an enclosing def’s scope is not allowed by default, though; this is the normal case in 2.6 as well:
>>> def tester(start):
...     state = start
...     def nested(label):
...         print(label, state)
...         state += 1                 # Cannot change by default (or in 2.6)
...     return nested
...
>>> F = tester(0)
>>> F('spam')
UnboundLocalError: local variable 'state' referenced before assignment
Now, under 3.0, if we declare state in the tester scope as nonlocal within nested, we get to change it inside the nested function, too. This works even though tester has returned and exited by the time we call the returned nested function through the name F:
>>> def tester(start):
...     state = start                  # Each call gets its own state
...     def nested(label):
...         nonlocal state             # Remembers state in enclosing scope
...         print(label, state)
...         state += 1                 # Allowed to change it if nonlocal
...     return nested
...
>>> F = tester(0)
>>> F('spam')                          # Increments state on each call
spam 0
>>> F('ham')
ham 1
>>> F('eggs')
eggs 2
As usual with enclosing scope references, we can call the tester factory function multiple times to get multiple copies of its state in memory. The state object in the enclosing scope is essentially attached to the nested function object returned; each call makes a new, distinct state object, such that updating one function’s state won’t impact the other. The following continues the prior listing’s interaction:
>>> G = tester(42)                     # Make a new tester that starts at 42
>>> G('spam')
spam 42
>>> G('eggs')                          # My state information updated to 43
eggs 43
>>> F('bacon')                         # But F's is where it left off: at 3
bacon 3                                # Each call has different state information
There are a few things to watch out for. First, unlike the global statement, nonlocal names really must have previously been assigned in an enclosing def’s scope when a nonlocal is evaluated, or else you’ll get an error; you cannot create them dynamically by assigning them anew in the enclosing scope:
>>> def tester(start):
...     def nested(label):
...         nonlocal state             # Nonlocals must already exist in enclosing def!
...         state = 0
...         print(label, state)
...     return nested
...
SyntaxError: no binding for nonlocal 'state' found

>>> def tester(start):
...     def nested(label):
...         global state               # Globals don't have to exist yet when declared
...         state = 0                  # This creates the name in the module now
...         print(label, state)
...     return nested
...
>>> F = tester(0)
>>> F('abc')
abc 0
>>> state
0
Second, nonlocal restricts the scope lookup to just enclosing defs; nonlocals are not looked up in the enclosing module’s global scope or the built-in scope outside all defs, even if they are already there:
>>> spam = 99
>>> def tester():
...     def nested():
...         nonlocal spam              # Must be in a def, not the module!
...         print('Current=', spam)
...         spam += 1
...     return nested
...
SyntaxError: no binding for nonlocal 'spam' found
These restrictions make sense once you realize that Python would not otherwise generally know which enclosing scope to create a brand new name in. In the prior listing, should spam be assigned in tester, or the module outside? Because this is ambiguous, Python must resolve nonlocals at function creation time, not function call time.
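One way to observe this early resolution is to hand the source text to the built-in compile function, which translates code without running it. This is a sketch for illustration only; the source string and the '<demo>' filename are hypothetical. The error surfaces during compilation, before tester or nested could ever be called.

```python
source = '''
spam = 99
def tester():
    def nested():
        nonlocal spam           # No enclosing def binds spam
        spam += 1
    return nested
'''

try:
    compile(source, '<demo>', 'exec')   # Compile only; nothing is run
    error = None
except SyntaxError as exc:
    error = exc.msg

print(error)
```

In CPython the printed message reports that no binding for the nonlocal name was found, matching the interactive listing above.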
Given the extra complexity of nested functions, you might wonder what the fuss is about. Although it’s difficult to see in our small examples, state information becomes crucial in many programs. There are a variety of ways to “remember” information across function and method calls in Python. While there are tradeoffs for all, nonlocal does improve this story for enclosing scope references: the nonlocal statement allows multiple copies of changeable state to be retained in memory and addresses simple state-retention needs where classes may not be warranted.
As we saw in the prior section, the following code allows state to be retained and modified in an enclosing scope. Each call to tester creates a little self-contained package of changeable information, whose names do not clash with any other part of the program:
def tester(start):
    state = start                      # Each call gets its own state
    def nested(label):
        nonlocal state                 # Remembers state in enclosing scope
        print(label, state)
        state += 1                     # Allowed to change it if nonlocal
    return nested

F = tester(0)
F('spam')
Unfortunately, this code only works in Python 3.0. If you are using Python 2.6, other options are available, depending on your goals. The next three sections present some alternatives.
One usual prescription for achieving the nonlocal effect in 2.6 and earlier is to simply move the state out to the global scope (the enclosing module):
>>> def tester(start):
...     global state                   # Move it out to the module to change it
...     state = start                  # global allows changes in module scope
...     def nested(label):
...         global state
...         print(label, state)
...         state += 1
...     return nested
...
>>> F = tester(0)
>>> F('spam')                          # Each call increments shared global state
spam 0
>>> F('eggs')
eggs 1
This works in this case, but it requires global declarations in both functions and is prone to name collisions in the global scope (what if “state” is already being used?). A worse, and more subtle, problem is that it only allows for a single shared copy of the state information in the module scope: if we call tester again, we’ll wind up resetting the module’s state variable, such that prior calls will see their state overwritten:
>>> G = tester(42)                     # Resets state's single copy in global scope
>>> G('toast')
toast 42
>>> G('bacon')
bacon 43
>>> F('ham')                           # Oops -- my counter has been overwritten!
ham 44
As shown earlier, when using nonlocal instead of global, each call to tester remembers its own unique copy of the state object.
The other prescription for changeable state information in 2.6 and earlier is to use classes with attributes to make state information access more explicit than the implicit magic of scope lookup rules. As an added benefit, each instance of a class gets a fresh copy of the state information, as a natural byproduct of Python’s object model.
We haven’t explored classes in detail yet, but as a brief preview, here is a reformulation of the tester/nested functions used earlier as a class; state is recorded in objects explicitly as they are created. To make sense of this code, you need to know that a def within a class like this works exactly like a def outside of a class, except that the function’s self argument automatically receives the implied subject of the call (an instance object created by calling the class itself):
>>> class tester:                          # Class-based alternative (see Part VI)
...     def __init__(self, start):         # On object construction,
...         self.state = start             # save state explicitly in new object
...     def nested(self, label):
...         print(label, self.state)       # Reference state explicitly
...         self.state += 1                # Changes are always allowed
...
>>> F = tester(0)                          # Create instance, invoke __init__
>>> F.nested('spam')                       # F is passed to self
spam 0
>>> F.nested('ham')
ham 1
>>> G = tester(42)                         # Each instance gets new copy of state
>>> G.nested('toast')                      # Changing one does not impact others
toast 42
>>> G.nested('bacon')
bacon 43
>>> F.nested('eggs')                       # F's state is where it left off
eggs 2
>>> F.state                                # State may be accessed outside class
3
With just slightly more magic, which we’ll delve into later in this book, we could also make our class look like a callable function using operator overloading. __call__ intercepts direct calls on an instance, so we don’t need to call a named method:
>>> class tester:
...     def __init__(self, start):
...         self.state = start
...     def __call__(self, label):         # Intercept direct instance calls
...         print(label, self.state)       # So .nested() not required
...         self.state += 1
...
>>> H = tester(99)
>>> H('juice')                             # Invokes __call__
juice 99
>>> H('pancakes')
pancakes 100
Don’t sweat the details in this code too much at this point in the book; we’ll explore classes in depth in Part VI and will look at specific operator overloading tools like __call__ in Chapter 29, so you may wish to file this code away for future reference. The point here is that classes can make state information more obvious, by leveraging explicit attribute assignment instead of scope lookups.
While using classes for state information is generally a good rule of thumb to follow, they might be overkill in cases like this, where state is a single counter. Such trivial state cases are more common than you might think; in such contexts, nested defs are sometimes more lightweight than coding classes, especially if you’re not familiar with OOP yet. Moreover, there are some scenarios in which nested defs may actually work better than classes (see the description of method decorators in Chapter 38 for an example that is far beyond this chapter’s scope).
As a final state-retention option, we can also sometimes achieve the same effect as nonlocals with function attributes: user-defined names attached to functions directly. Here’s a final version of our example based on this technique; it replaces a nonlocal with an attribute attached to the nested function. Although this scheme may not be as intuitive to some, it also allows the state variable to be accessed outside the nested function (with nonlocals, we can only see state variables within the nested def):
>>> def tester(start):
...     def nested(label):
...         print(label, nested.state)     # nested is in enclosing scope
...         nested.state += 1              # Change attr, not nested itself
...     nested.state = start               # Initial state after func defined
...     return nested
...
>>> F = tester(0)
>>> F('spam')                              # F is a 'nested' with state attached
spam 0
>>> F('ham')
ham 1
>>> F.state                                # Can access state outside functions too
2
>>>
>>> G = tester(42)                         # G has own state, doesn't overwrite F's
>>> G('eggs')
eggs 42
>>> F('ham')
ham 2
This code relies on the fact that the function name nested is a local variable in the tester scope enclosing nested; as such, it can be referenced freely inside nested. This code also relies on the fact that changing an object in-place is not an assignment to a name; when it increments nested.state, it is changing part of the object nested references, not the name nested itself. Because we’re not really assigning a name in the enclosing scope, no nonlocal is needed.
As you can see, globals, nonlocals, classes, and function attributes all offer state-retention options. Globals only support shared data, classes require a basic knowledge of OOP, and both classes and function attributes allow state to be accessed outside the nested function itself. As usual, the best tool for your program depends upon your program’s goals.[40]
In this chapter, we studied one of two key concepts related to functions: scopes (how variables are looked up when they are used). As we learned, variables are considered local to the function definitions in which they are assigned, unless they are specifically declared to be global or nonlocal. We also studied some more advanced scope concepts here, including nested function scopes and function attributes. Finally, we looked at some general design ideas, such as the need to avoid globals and cross-file changes.
In the next chapter, we’re going to continue our function tour with the second key function-related concept: argument passing. As we’ll find, arguments are passed into a function by assignment, but Python also provides tools that allow functions to be flexible in how items are passed. Before we move on, let’s take this chapter’s quiz to review the scope concepts we’ve covered here.
What is the output of the following code, and why?
>>> X = 'Spam'
>>> def func():
...     print(X)
...
>>> func()
What is the output of this code, and why?
>>> X = 'Spam'
>>> def func():
...     X = 'NI!'
...
>>> func()
>>> print(X)
What does this code print, and why?
>>> X = 'Spam'
>>> def func():
...     X = 'NI'
...     print(X)
...
>>> func()
>>> print(X)
What output does this code produce? Why?
>>> X = 'Spam'
>>> def func():
...     global X
...     X = 'NI'
...
>>> func()
>>> print(X)
What about this code—what’s the output, and why?
>>> X = 'Spam'
>>> def func():
...     X = 'NI'
...     def nested():
...         print(X)
...     nested()
...
>>> func()
>>> X
How about this example: what is its output in Python 3.0, and why?
>>> def func():
...     X = 'NI'
...     def nested():
...         nonlocal X
...         X = 'Spam'
...     nested()
...     print(X)
...
>>> func()
Name three or more ways to retain state information in a Python function.
The output here is 'Spam', because the function references a global variable in the enclosing module (because it is not assigned in the function, it is considered global).
The output here is 'Spam' again because assigning the variable inside the function makes it a local and effectively hides the global of the same name. The print statement finds the variable unchanged in the global (module) scope.
It prints 'NI' on one line and 'Spam' on another, because the reference to the variable within the function finds the assigned local and the reference in the print statement finds the global.
This time it just prints 'NI' because the global declaration forces the variable assigned inside the function to refer to the variable in the enclosing global scope.
The output in this case is again 'NI' on one line and 'Spam' on another, because the print statement in the nested function finds the name in the enclosing function’s local scope, and the reference to X at the end finds the variable in the global scope.
This example prints 'Spam', because the nonlocal statement (available in Python 3.0 but not 2.6) means that the assignment to X inside the nested function changes X in the enclosing function’s local scope. Without this statement, this assignment would classify X as local to the nested function, making it a different variable; the code would then print 'NI' instead.
Although the values of local variables go away when a function returns, you can make a Python function retain state information by using shared global variables, enclosing function scope references within nested functions, or using default argument values. Function attributes can sometimes allow state to be attached to the function itself, instead of looked up in scopes. Another alternative, using OOP with classes, sometimes supports state retention better than any of the scope-based techniques because it makes it explicit with attribute assignments; we’ll explore this option in Part VI.
[36] The scope lookup rule was called the “LGB rule” in the first edition of this book. The enclosing def “E” layer was added later in Python to obviate the task of passing in enclosing scope names explicitly with default arguments, a topic usually of marginal interest to Python beginners that we’ll defer until later in this chapter. Since this scope is addressed by the nonlocal statement in Python 3.0, I suppose the lookup rule might now be better named “LNGB,” but backward compatibility matters in books, too!
[37] There is technically one more scope in Python: loop variables in comprehension and generator expressions are local to the expression itself in 3.X (in 2.X, they are local in generators but not in list comprehensions). This is a special and obscure case that rarely impacts real code, and differs from for-loop statements which never localize their variables.
[38] Multithreading runs function calls in parallel with the rest of the program and is supported by Python’s standard library modules _thread, threading, and queue (thread, threading, and Queue in Python 2.6). Because all threaded functions run in the same process, global scopes often serve as shared memory between them. Threading is commonly used for long-running tasks in GUIs, to implement nonblocking operations in general and to leverage CPU capacity. It is also beyond this book’s scope; see the Python library manual, as well as the follow-up texts listed in the Preface (such as O’Reilly’s Programming Python), for more details.
[39] In the section Function Gotchas at the end of this part of the book, we’ll also see that there is an issue with using mutable objects like lists and dictionaries for default arguments (e.g., def f(a=[])): because defaults are implemented as single objects attached to functions, mutable defaults retain state from call to call, rather than being initialized anew on each call. Depending on whom you ask, this is either considered a feature that supports state retention, or a strange wart on the language. More on this at the end of Chapter 20.
[40] Function attributes are supported in both Python 2.6 and 3.X. We’ll explore them further in Chapter 19, and revisit all the state options introduced here in Chapter 38 in a more realistic context. Also note that it’s possible to change a mutable object in the enclosing scope in 2.X and 3.X without declaring its name nonlocal (e.g., state=[start] in tester, state[0]+=1 in nested), though it’s perhaps more obscure than either function attributes or 3.X’s nonlocal.