This chapter begins our in-depth look at ways to apply Python to real programming tasks. In this and the following chapters, you’ll see how to use Python to write system tools, GUIs, database applications, Internet scripts, websites, and more. Along the way, we’ll also study larger Python programming concepts in action: code reuse, maintainability, object-oriented programming (OOP), and so on.
In this first part of the book, we begin our Python programming tour by exploring the systems application domain—scripts that deal with files, programs, and the general environment surrounding a program. Although the examples in this domain focus on particular kinds of tasks, the techniques they employ will prove to be useful in later parts of the book as well. In other words, you should begin your journey here, unless you are already a Python systems programming wizard.
Python’s system interfaces span application domains, but for the next five chapters, most of our examples fall into the category of system tools—programs sometimes called command-line utilities, shell scripts, system administration, systems programming, and other permutations of such words. Regardless of their title, you are probably already familiar with this sort of script; these scripts accomplish such tasks as processing files in a directory, launching test programs, and so on. Such programs historically have been written in nonportable and syntactically obscure shell languages such as DOS batch files, csh, and awk.
Even in this relatively simple domain, though, some of Python’s better attributes shine brightly. For instance, Python’s ease of use and extensive built-in library make it simple (and even fun) to use advanced system tools such as threads, signals, forks, sockets, and their kin; such tools are much less accessible under the obscure syntax of shell languages and the slow development cycles of compiled languages. Python’s support for concepts like code clarity and OOP also helps us write shell tools that can be read, maintained, and reused. When using Python, there is no need to start every new script from scratch.
Moreover, we’ll find that Python not only includes all the interfaces we need in order to write system tools, but it also fosters script portability. By employing Python’s standard library, most system scripts written in Python are automatically portable to all major platforms. For instance, you can usually run in Linux a Python directory-processing script written in Windows without changing its source code at all—simply copy over the source code. Though writing scripts that achieve such portability utopia requires some extra effort and practice, if used well, Python could be the only system scripting tool you need to use.
To make this part of the book easier to study, I have broken it down into five chapters:
In this chapter, I’ll introduce the main system-related modules in overview fashion. We’ll meet some of the most commonly used system tools here for the first time.
In Chapter 3, we continue exploring the basic system interfaces by studying their role in core system programming concepts: streams, command-line arguments, environment variables, and so on.
Chapter 4 focuses on the tools Python provides for processing files, directories, and directory trees.
In Chapter 5, we’ll move on to cover Python’s standard tools for parallel processing—processes, threads, queues, pipes, signals, and more.
Chapter 6 wraps up by presenting a collection of complete system-oriented programs. The examples here are larger and more realistic, and they use the tools introduced in the prior four chapters to perform real, practical tasks. This collection includes both general system scripts and scripts for processing directories of files.
Especially in the examples chapter at the end of this part, we will be concerned as much with system interfaces as with general Python development concepts. We’ll see non-object-oriented and object-oriented versions of some examples along the way, for instance, to help illustrate the benefits of thinking in more strategic ways.
To begin our exploration of the systems domain, we will take a quick tour through the standard library sys and os modules in this chapter, before moving on to larger system programming concepts. As you can tell from the length of their attribute lists, both of these are large modules; the following reflects Python 3.1 running on Windows 7 outside IDLE:
C:\...\PP4E\System> python
Python 3.1.1 (r311:74483, Aug 17 2009, 17:02:12) [MSC v.1500 32 bit (...)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, os
>>> len(dir(sys))                           # 65 attributes
65
>>> len(dir(os))                            # 122 on Windows, more on Unix
122
>>> len(dir(os.path))                       # a nested module within os
52
The content of these two modules may vary per Python version and platform. For example, os is much larger under Cygwin after building Python 3.1 from its source code there (Cygwin is a system that provides Unix-like functionality on Windows; it is discussed further in More on Cygwin Python for Windows):
$ ./python.exe
Python 3.1.1 (r311:74480, Feb 20 2010, 10:16:52)
[GCC 3.4.4 (cygming special, gdc 0.12, using dmd 0.125)] on cygwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, os
>>> len(dir(sys))
64
>>> len(dir(os))
217
>>> len(dir(os.path))
51
As I’m not going to demonstrate every item in every built-in module, the first thing I want to do is show you how to get more details on your own. Officially, this task also serves as an excuse for introducing a few core system scripting concepts; along the way, we’ll code a first script to format documentation.
Most system-level interfaces in Python are shipped in just two modules: sys and os. That’s somewhat oversimplified; other standard modules belong to this domain too. Among them are the following:
Third-party extensions such as pySerial (a serial port interface), Pexpect (an Expect work-alike for controlling cross-program dialogs), and even Twisted (a networking framework) can arguably be lumped into the systems domain as well. In addition, some built-in functions are actually system interfaces too: the open function, for example, interfaces with the file system. But by and large, sys and os together form the core of Python’s built-in system tools arsenal.
In principle at least, sys exports components related to the Python interpreter itself (e.g., the module search path), and os contains variables and functions that map to the operating system on which Python is run. In practice, this distinction may not always seem clear-cut (e.g., the standard input and output streams show up in sys, but they are arguably tied to operating system paradigms). The good news is that you’ll soon use the tools in these modules so often that their locations will be permanently stamped on your memory.[3]
The os module also attempts to provide a portable programming interface to the underlying operating system; its functions may be implemented differently on different platforms, but to Python scripts, they look the same everywhere. And if that’s still not enough, the os module also exports a nested submodule, os.path, which provides a portable interface to file and directory processing tools.
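As a small illustration of that portability, the following sketch builds a path and takes it apart again with os.path calls. The directory and file names are hypothetical, chosen only for the demonstration; the separator in the joined path depends on the platform running the code.

```python
import os

# 'examples' and 'more.py' are made-up names used only for illustration
path = os.path.join('examples', 'more.py')   # uses the platform's separator
dirname, filename = os.path.split(path)      # undo the join
base, ext = os.path.splitext(filename)       # peel off the extension

print(path)                                  # separator differs on Windows and Unix
print(dirname, filename, base, ext)
```

The same source code runs unchanged on Windows and Unix because the script never spells out a path separator itself.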
As you can probably deduce from the preceding paragraphs, learning to write system scripts in Python is mostly a matter of learning about Python’s system modules. Luckily, there are a variety of information sources to make this task easier—from module attributes to published references and books.
For instance, if you want to know everything that a built-in module exports, you can read its library manual entry; study its source code (Python is open source software, after all); or fetch its attribute list and documentation string interactively. Let’s import sys in Python 3.1 and see what it has to offer:
C:\...\PP4E\System> python
>>> import sys
>>> dir(sys)
['__displayhook__', '__doc__', '__excepthook__', '__name__', '__package__', '__stderr__', '__stdin__', '__stdout__', '_clear_type_cache', '_current_frames', '_getframe', 'api_version', 'argv', 'builtin_module_names', 'byteorder', 'call_tracing', 'callstats', 'copyright', 'displayhook', 'dllhandle', 'dont_write_bytecode', 'exc_info', 'excepthook', 'exec_prefix', 'executable', 'exit', 'flags', 'float_info', 'float_repr_style', 'getcheckinterval', 'getdefaultencoding', 'getfilesystemencoding', 'getprofile', 'getrecursionlimit', 'getrefcount', 'getsizeof', 'gettrace', 'getwindowsversion', 'hexversion', 'int_info', 'intern', 'maxsize', 'maxunicode', 'meta_path', 'modules', 'path', 'path_hooks', 'path_importer_cache', 'platform', 'prefix', 'ps1', 'ps2', 'setcheckinterval', 'setfilesystemencoding', 'setprofile', 'setrecursionlimit', 'settrace', 'stderr', 'stdin', 'stdout', 'subversion', 'version', 'version_info', 'warnoptions', 'winver']
The dir function simply returns a list containing the string names of all the attributes in any object with attributes; it’s a handy memory jogger for modules at the interactive prompt. For example, we know there is something called sys.version, because the name version came back in the dir result. If that’s not enough, we can always consult the __doc__ string of built-in modules:
>>> sys.__doc__
"This module provides access to some objects used or maintained by the\ninterpre
ter and to functions that interact strongly with the interpreter.\n\nDynamic obj
ects:\n\nargv -- command line arguments; argv[0] is the script pathname if known
\npath -- module search path; path[0] is the script directory, else ''\nmodules
-- dictionary of loaded modules\n\ndisplayhook -- called to show results in an i
...lots of text deleted here..."
The __doc__ built-in attribute just shown usually contains a string of documentation, but it may look a bit weird when displayed this way: it’s one long string with embedded end-line characters that print as \n, not as a nice list of lines. To format these strings for a more humane display, you can simply use a print function-call statement:
>>> print(sys.__doc__)
This module provides access to some objects used or maintained by the
interpreter and to functions that interact strongly with the interpreter.
Dynamic objects:
argv -- command line arguments; argv[0] is the script pathname if known
path -- module search path; path[0] is the script directory, else ''
modules -- dictionary of loaded modules
...lots of lines deleted here...
The print built-in function, unlike interactive displays, interprets end-line characters correctly. Unfortunately, print doesn’t, by itself, do anything about scrolling or paging and so can still be unwieldy on some platforms. Tools such as the built-in help function can do better:
>>> help(sys)
Help on built-in module sys:
NAME
sys
FILE
(built-in)
MODULE DOCS
http://docs.python.org/library/sys
DESCRIPTION
This module provides access to some objects used or maintained by the
interpreter and to functions that interact strongly with the interpreter.
Dynamic objects:
argv -- command line arguments; argv[0] is the script pathname if known
path -- module search path; path[0] is the script directory, else ''
modules -- dictionary of loaded modules
...lots of lines deleted here...
The help function is one interface provided by the PyDoc system: standard library code that ships with Python and renders documentation (documentation strings, as well as structural details) related to an object in a formatted way. The format is either like a Unix manpage, which we get for help, or an HTML page, which is more grandiose. It’s a handy way to get basic information when working interactively, and it’s a last resort before falling back on manuals and books.
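The dir and __doc__ tools shown so far can also be combined in a script of your own. The following is only a sketch, not part of PyDoc: the summarize function is a hypothetical helper that lists each public callable in a module along with the first line of its documentation string.

```python
import sys

def summarize(module):
    """Return (name, first docstring line) pairs for a module's public callables."""
    summary = []
    for name in dir(module):
        attr = getattr(module, name)
        if name.startswith('_') or not callable(attr):
            continue                                  # skip private names and plain data
        doc = (getattr(attr, '__doc__', None) or '').strip()
        first = doc.splitlines()[0] if doc else '(no docstring)'
        summary.append((name, first))
    return summary

for name, first in summarize(sys)[:5]:                # show just the first few entries
    print('%-20s %s' % (name, first))
```

The exact names and text printed depend on your Python version, since the content of sys varies per release and platform.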
The help function we just met is also fairly fixed in the way it displays information; although it attempts to page the display in some contexts, its page size isn’t quite right on some of the machines I use. Moreover, it doesn’t page at all in the IDLE GUI, instead relying on manual use of the scrollbar, which is potentially painful for large displays. When I want more control over the way help text is printed, I usually use a utility script of my own, like the one in Example 2-1.
"""
split and interactively page a string or file of text
"""

def more(text, numlines=15):
    lines = text.splitlines()                # like split('\n') but no '' at end
    while lines:
        chunk = lines[:numlines]
        lines = lines[numlines:]
        for line in chunk: print(line)
        if lines and input('More?') not in ['y', 'Y']: break

if __name__ == '__main__':
    import sys                               # when run, not imported
    more(open(sys.argv[1]).read(), 10)       # page contents of file on cmdline
The meat of this file is its more function, and if you know enough Python to be qualified to read this book, it should be fairly straightforward. It simply splits up a string around end-line characters, and then slices off and displays a few lines at a time (15 by default) to avoid scrolling off the screen. A slice expression, lines[:15], gets the first 15 items in a list, and lines[15:] gets the rest; to show a different number of lines each time, pass a number to the numlines argument (e.g., the last line in Example 2-1 passes 10 to the numlines argument of the more function).
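To isolate the slicing technique from the interactive prompt, here is a minimal sketch of the same idea as a generator. The chunks name is hypothetical and does not appear in more.py; it simply repeats the two slice expressions until the list is empty.

```python
def chunks(items, size=15):
    """Yield successive slices of at most size items."""
    while items:
        yield items[:size]        # the first size items
        items = items[size:]      # everything after them

for chunk in chunks(list('abcdefg'), 3):
    print(chunk)
```

Because slices never fail on short lists, the final chunk is simply shorter than the rest when the items don’t divide evenly.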
The splitlines string object method call that this script employs returns a list of substrings split at line ends (e.g., ["line", "line",...]). An alternative split method does similar work, but retains an empty line at the end of the result if the last line is \n terminated:
>>> line = 'aaa\nbbb\nccc\n'
>>> line.split('\n')
['aaa', 'bbb', 'ccc', '']
>>> line.splitlines()
['aaa', 'bbb', 'ccc']
As we’ll see more formally in Chapter 4, the end-of-line character is normally always \n (which stands for a byte usually having a binary value of 10) within a Python script, no matter what platform it is run upon. (If you don’t already know why this matters, DOS \r characters in text are dropped by default when read.)
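The difference between the two calls is easier to see on a string that still contains DOS line ends. This is a small sketch with a made-up string; it shows that splitlines recognizes both \r\n and \n, while split('\n') leaves the \r characters in place.

```python
text = 'aaa\r\nbbb\nccc\n'              # mixed DOS and Unix line ends

print(text.splitlines())                # ['aaa', 'bbb', 'ccc']
print(text.splitlines(True))            # ['aaa\r\n', 'bbb\n', 'ccc\n']
print(text.split('\n'))                 # ['aaa\r', 'bbb', 'ccc', '']
```

Passing a true value to splitlines keeps the line ends on each substring, which is useful when the lines will be written back out unchanged.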
Now, Example 2-1 is a simple Python program, but it already brings up three important topics that merit quick detours here: it uses string methods, reads from a file, and is set up to be run or imported. Python string methods are not a system-related tool per se, but they see action in most Python programs. In fact, they are going to show up throughout this chapter as well as those that follow, so here is a quick review of some of the more useful tools in this set. String methods include calls for searching and replacing:
>>> mystr = 'xxxSPAMxxx'
>>> mystr.find('SPAM')                      # return first offset
3
>>> mystr = 'xxaaxxaa'
>>> mystr.replace('aa', 'SPAM')             # global replacement
'xxSPAMxxSPAM'
The find call returns the offset of the first occurrence of a substring, and replace does global search and replacement. Like all string operations, replace returns a new string instead of changing its subject in-place (recall that strings are immutable). With these methods, substrings are just strings; in Chapter 19, we’ll also meet a module called re that allows regular expression patterns to show up in searches and replacements.
In more recent Pythons, the in membership operator can often be used as an alternative to find if all we need is a yes/no answer (it tests for a substring’s presence). There are also a handful of methods for removing whitespace on the ends of strings, which are especially useful for lines of text read from a file:
>>> mystr = 'xxxSPAMxxx'
>>> 'SPAM' in mystr                         # substring search/test
True
>>> 'Ni' in mystr                           # when not found
False
>>> mystr.find('Ni')
-1
>>> mystr = ' Ni '
>>> mystr.strip()                           # remove whitespace
'Ni'
>>> mystr.rstrip()                          # same, but just on right side
' Ni'
String methods also provide functions that are useful for things such as case conversions, and a standard library module named string defines some useful preset variables, among other things:
>>> mystr = 'SHRUBBERY'
>>> mystr.lower()                           # case converters
'shrubbery'
>>> mystr.isalpha()                         # content tests
True
>>> mystr.isdigit()
False
>>> import string                           # case presets: for 'in', etc.
>>> string.ascii_lowercase
'abcdefghijklmnopqrstuvwxyz'
>>> string.whitespace                       # whitespace characters
' \t\n\r\x0b\x0c'
There are also methods for splitting up strings around a substring delimiter and putting them back together with a substring in between. We’ll explore these tools later in this book, but as an introduction, here they are at work:
>>> mystr = 'aaa,bbb,ccc'
>>> mystr.split(',')                        # split into substrings list
['aaa', 'bbb', 'ccc']
>>> mystr = 'a b c d'
>>> mystr.split()                           # default delimiter: whitespace
['a', 'b', 'c', 'd']
>>> delim = 'NI'
>>> delim.join(['aaa', 'bbb', 'ccc'])       # join substrings list
'aaaNIbbbNIccc'
>>> ' '.join(['A', 'dead', 'parrot'])       # add a space between
'A dead parrot'
>>> chars = list('Lorreta')                 # convert to characters list
>>> chars
['L', 'o', 'r', 'r', 'e', 't', 'a']
>>> chars.append('!')
>>> ''.join(chars)                          # to string: empty delimiter
'Lorreta!'
These calls turn out to be surprisingly powerful. For example, a line of data columns separated by tabs can be parsed into its columns with a single split call; the more.py script uses the splitlines variant shown earlier to split a string into a list of line strings. In fact, we can emulate the replace call we saw earlier in this section with a split/join combination:
>>> mystr = 'xxaaxxaa'
>>> 'SPAM'.join(mystr.split('aa'))          # str.replace, the hard way!
'xxSPAMxxSPAM'
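The tab-separated case mentioned above can be sketched in a few lines. The record here is made up for illustration; real data would normally come from a file, one line at a time.

```python
line = 'spam\t42\t3.14\n'                   # made-up record for illustration

columns = line.rstrip('\n').split('\t')     # drop the line end, split on tabs
print(columns)                              # ['spam', '42', '3.14']

name, count, price = columns                # columns are still strings here
print(name, int(count) + 1, float(price))
```

Stripping the line end first keeps the \n from sticking to the last column.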
For future reference, also keep in mind that Python doesn’t automatically convert strings to numbers, or vice versa; if you want to use one as you would use the other, you must say so with manual conversions:
>>> int("42"), eval("42")                   # string to int conversions
(42, 42)
>>> str(42), repr(42)                       # int to string conversions
('42', '42')
>>> ("%d" % 42), '{:d}'.format(42)          # via formatting expression, method
('42', '42')
>>> "42" + str(1), int("42") + 1            # concatenation, addition
('421', 43)
In the last command here, the first expression triggers string concatenation (since both sides are strings), and the second invokes integer addition (because both objects are numbers). Python doesn’t assume you meant one or the other and convert automatically; as a rule of thumb, Python tries to avoid magic—and the temptation to guess—whenever possible. String tools will be covered in more detail later in this book (in fact, they get a full chapter in Part V), but be sure to also see the library manual for additional string method tools.
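To see what happens when the conversion is left out, the sketch below mixes a string and a number directly and catches the resulting error. The result variable is just a placeholder for this demonstration.

```python
try:
    result = "42" + 1                   # str + int: Python refuses to guess
except TypeError as exc:
    result = None
    print('TypeError:', exc)

print("42" + str(1))                    # '421': explicit conversion to string
print(int("42") + 1)                    # 43: explicit conversion to integer
```

The error is raised when the expression runs, not when the script is loaded, so code like this fails only if that line is actually reached.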
Technically speaking, the Python 3.X string story is a bit richer than I’ve implied here. What I’ve shown so far is the str object type: a sequence of characters (technically, Unicode “code points” represented as Unicode “code units”) which represents both ASCII and wider Unicode text, and handles encoding and decoding both manually on request and automatically on file transfers. Strings are coded in quotes (e.g., 'abc'), along with various syntax for coding non-ASCII text (e.g., '\xc4\xe8', '\u00c4\u00e8').
Really, though, 3.X has two additional string types that support most str string operations: bytes, a sequence of short integers for representing 8-bit binary data, and bytearray, a mutable variant of bytes. You generally know you are dealing with bytes if strings display or are coded with a leading “b” character before the opening quote (e.g., b'abc', b'\xc4\xe8'). As we’ll see in Chapter 4, files in 3.X follow a similar dichotomy, using str in text mode (which also handles Unicode encodings and line-end conversions) and bytes in binary mode (which transfers bytes to and from files unchanged). And in Chapter 5, we’ll see the same distinction for tools like sockets, which deal in byte strings today.
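The relationship among the three types can be sketched with manual encoding and decoding calls. The two-character string and the choice of Latin-1 and UTF-8 are just examples; any encoding name Python supports would work the same way.

```python
text = '\u00c4\u00e8'                   # a str of two non-ASCII characters

latin = text.encode('latin-1')          # bytes: one byte per character here
utf8 = text.encode('utf-8')             # bytes: two bytes per character here
print(latin, utf8)

print(latin.decode('latin-1') == text)  # decoding reverses encoding

ba = bytearray(latin)
ba[0] = 0x41                            # bytearray can be changed in place
print(bytes(ba))
```

The same text yields different bytes under different encodings, which is why text-mode files and other tools that translate between str and bytes need to know which encoding is in use.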
Unicode text is used in internationalized applications, and many of Python’s binary-oriented tools deal in byte strings today. This includes some file tools we’ll meet along the way, such as the open call, and the os.listdir and os.walk tools we’ll study in upcoming chapters. As we’ll see, even simple directory tools sometimes have to be aware of Unicode in file content and names. Moreover, tools such as object pickling and binary data parsing are byte-oriented today.
Later in the book, we’ll also find that Unicode pops up today in the text displayed in GUIs; the bytes shipped over networks; Internet standards such as email; and even some persistence topics such as DBM files and shelves. Any interface that deals in text necessarily deals in Unicode today, because str is Unicode, whether ASCII or wider. Once we reach the realm of the applications programming presented in this book, Unicode is no longer an optional topic for most Python 3.X programmers.
In this book, we’ll defer further coverage of Unicode until we can see it in the context of application topics and practical programs. For more fundamental details on how 3.X’s Unicode text and binary data support impact both string and file usage in some roles, please see Learning Python, Fourth Edition; since this is officially a core language topic, it enjoys in-depth coverage and a full 45-page dedicated chapter in that book.
Besides processing strings, the more.py script also uses files: it opens the external file whose name is listed on the command line using the built-in open function, and it reads that file’s text into memory all at once with the file object read method. Since file objects returned by open are part of the core Python language itself, I assume that you have at least a passing familiarity with them at this point in the text. But just in case you’ve flipped to this chapter early on in your Pythonhood, the following calls load a file’s contents into a string, load a fixed-size set of bytes into a string, load a file’s contents into a list of line strings, and load the next line in the file into a string, respectively:
open('file').read()          # read entire file into string
open('file').read(N)         # read next N bytes into string
open('file').readlines()     # read entire file into line strings list
open('file').readline()      # read next line, through '\n'
As we’ll see in a moment, these calls can also be applied to shell commands in Python to read their output. File objects also have write methods for sending strings to the associated file. File-related topics are covered in depth in Chapter 4, but making an output file and reading it back is easy in Python:
>>> file = open('spam.txt', 'w')            # create file spam.txt
>>> file.write(('spam' * 5) + '\n')         # write text: returns #characters written
21
>>> file.close()

>>> file = open('spam.txt')                 # or open('spam.txt').read()
>>> text = file.read()                      # read into a string
>>> text
'spamspamspamspamspam\n'
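The same round trip can also be coded with the with statement, which closes the file automatically. This is only a sketch of an alternative style, not the form used by more.py; it writes to a temporary directory (an assumption made here so that no existing spam.txt is overwritten).

```python
import os, tempfile

# write to a scratch directory so no existing file is touched
path = os.path.join(tempfile.mkdtemp(), 'spam.txt')

with open(path, 'w') as file:               # closed automatically on block exit
    count = file.write(('spam' * 5) + '\n')

with open(path) as file:
    text = file.read()

print(count, repr(text))
```

Either style works; the with form simply guarantees the close call even if an exception occurs while the file is in use.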
Also by way of review, the last few lines in the more.py file in Example 2-1 introduce one of the first big concepts in shell tool programming. They instrument the file to be used in either of two ways—as a script or as a library.
Recall that every Python module has a built-in __name__ variable that Python sets to the __main__ string only when the file is run as a program, not when it’s imported as a library. Because of that, the more function in this file is executed automatically by the last line in the file when this script is run as a top-level program, but not when it is imported elsewhere. This simple trick turns out to be one key to writing reusable script code: by coding program logic as functions rather than as top-level code, you can also import and reuse it in other scripts.
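Stripped of the paging logic, the dual-mode pattern looks like the following sketch. The shout function is a hypothetical stand-in for whatever logic a real script would provide.

```python
def shout(text):
    """Reusable logic lives in a function, not in top-level code."""
    return text.upper() + '!'

if __name__ == '__main__':
    # self-test code: runs only when this file is launched directly
    print(shout('spam'))
```

When this file is imported, only the def statement runs, so the importer gets the function without triggering the test code at the bottom.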
The upshot is that we can run more.py by itself or import and call its more function elsewhere. When running the file as a top-level program, we list on the command line the name of a file to be read and paged: as I’ll describe in more depth in the next chapter, words typed in the command that is used to start a program show up in the built-in sys.argv list in Python. For example, here is the script file in action, paging itself (be sure to type this command line in your PP4E\System directory, or it won’t find the input file; more on command lines later):
C:\...\PP4E\System> python more.py more.py
"""
split and interactively page a string or file of text
"""

def more(text, numlines=15):
    lines = text.splitlines()                # like split('\n') but no '' at end
    while lines:
        chunk = lines[:numlines]
        lines = lines[numlines:]
        for line in chunk: print(line)
More?y
        if lines and input('More?') not in ['y', 'Y']: break

if __name__ == '__main__':
    import sys                               # when run, not imported
    more(open(sys.argv[1]).read(), 10)       # page contents of file on cmdline
When the more.py file is imported, we pass an explicit string to its more function, and this is exactly the sort of utility we need for documentation text. Running this utility on the sys module’s documentation string gives us a bit more information in human-readable form about what’s available to scripts:
C:\...\PP4E\System> python
>>> from more import more
>>> import sys
>>> more(sys.__doc__)
This module provides access to some objects used or maintained by the
interpreter and to functions that interact strongly with the interpreter.

Dynamic objects:

argv -- command line arguments; argv[0] is the script pathname if known
path -- module search path; path[0] is the script directory, else ''
modules -- dictionary of loaded modules

displayhook -- called to show results in an interactive session
excepthook -- called to handle any uncaught exception other than SystemExit
  To customize printing in an interactive session or to install a custom
  top-level exception handler, assign other functions to replace these.

stdin -- standard input file object; used by input()
More?
Pressing “y” or “Y” here makes the function display the next few lines of documentation, and then prompt again, unless you’ve run past the end of the lines list. Try this on your own machine to see what the rest of the module’s documentation string looks like. Also try experimenting by passing a different window size in the second argument: more(sys.__doc__, 5) shows just 5 lines at a time.
If that still isn’t enough detail, your next step is to read the Python library manual’s entry for sys to get the full story. All of Python’s standard manuals are available online, and they often install alongside Python itself. On Windows, the standard manuals are installed automatically, but here are a few simple pointers:
On Windows, click the Start button, pick All Programs, select the Python entry there, and then choose the Python Manuals item. The manuals should magically appear on your display; as of Python 2.4, the manuals are provided as a Windows help file and so support searching and navigation.
On Linux or Mac OS X, you may be able to click on the manuals’ entries in a file explorer or start your browser from a shell command line and navigate to the library manual’s HTML files on your machine.
If you can’t find the manuals on your computer, you can always read them online. Go to Python’s website at http://www.python.org and follow the documentation links there. This website also has a simple searching utility for the manuals.
However you get started, be sure to pick the Library manual for things such as sys; this manual documents all of the standard library, built-in types and functions, and more. Python’s standard manual set also includes a short tutorial, a language reference, extending references, and more.
At the risk of sounding like a marketing droid, I should mention that you can also purchase the Python manual set, printed and bound; see the book information page at http://www.python.org for details and links. Commercially published Python reference books are also available today, including Python Essential Reference, Python in a Nutshell, Python Standard Library, and Python Pocket Reference. Some of these books are more complete and come with examples, but the last one serves as a convenient memory jogger once you’ve taken a library tour or two.[4]
But enough about documentation sources (and scripting basics); let’s move on to system module details. As mentioned earlier, the sys and os modules form the core of much of Python’s system-related tool set. To see how, we’ll turn to a quick, interactive tour through some of the tools in these two modules before applying them in bigger examples. We’ll start with sys, the smaller of the two; remember that to see a full list of all the attributes in sys, you need to pass it to the dir function (or see where we did so earlier in this chapter).
Like most modules, sys includes both informational names and functions that take action. For instance, its attributes give us the name of the underlying operating system on which the platform code is running, the largest possible “natively sized” integer on this machine (though integers can be arbitrarily long in Python 3.X), and the version number of the Python interpreter running our code:
C:\...\PP4E\System> python
>>> import sys
>>> sys.platform, sys.maxsize, sys.version
('win32', 2147483647, '3.1.1 (r311:74483, Aug 17 2009, 17:02:12) ...more deleted...')
>>> if sys.platform[:3] == 'win': print('hello windows')
...
hello windows
If you have code that must act differently on different machines, simply test the sys.platform string as done here; although most of Python is cross-platform, nonportable tools are usually wrapped in if tests like the one here. For instance, we’ll see later that some program launch and low-level console interaction tools may vary per platform; simply test sys.platform to pick the right tool for the machine on which your script is running.
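As one hedged illustration of such a test, the sketch below picks a screen-clearing shell command by platform. The command names are assumptions about typical Windows and Unix-like shells, and the script only selects the command; it does not run it.

```python
import sys

if sys.platform.startswith('win'):
    clear_command = 'cls'              # assumed: Windows console command
else:
    clear_command = 'clear'            # assumed: typical on Linux and Mac OS X

print(sys.platform, '=>', clear_command)
```

The startswith call is equivalent to the sys.platform[:3] == 'win' slice test shown earlier; which one to use is a matter of taste.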
The sys module also lets us inspect the module search path both interactively and within a Python program. sys.path is a list of directory name strings representing the true search path in a running Python interpreter. When a module is imported, Python scans this list from left to right, searching for the module’s file in each directory named in the list. Because of that, this is the place to look to verify that your search path is really set as intended.[5]
The sys.path list is simply initialized from your PYTHONPATH setting, plus the content of any .pth path files located in Python’s directories on your machine and system defaults, when the interpreter is first started up. In fact, if you inspect sys.path interactively, you’ll notice quite a few directories that are not on your PYTHONPATH: sys.path also includes an indicator for the script’s home directory (an empty string, which I’ll explain in more detail after we meet os.getcwd) and a set of standard library directories that may vary per installation:
>>> sys.path
['', 'C:\\PP4thEd\\Examples', ...plus standard library paths deleted... ]
Surprisingly, sys.path can actually be changed by a program, too. A script can use list operations such as append, extend, insert, pop, and remove, as well as the del statement to configure the search path at runtime to include all the source directories to which it needs access. Python always uses the current sys.path setting to import, no matter what you’ve changed it to:
>>> sys.path.append(r'C:\mydir')
>>> sys.path
['', 'C:\\PP4thEd\\Examples', ...more deleted..., 'C:\\mydir']
Changing sys.path directly like this is an alternative to setting your PYTHONPATH shell variable, but not a very permanent one. Changes to sys.path are retained only until the Python process ends, and they must be remade every time you start a new Python program or session. However, some types of programs (e.g., scripts that run on a web server) may not be able to depend on PYTHONPATH settings; such scripts can instead configure sys.path on startup to include all the directories from which they will need to import modules. For a more concrete use case, see Example 1-34 in the prior chapter; there we had to tweak the search path dynamically this way, because the web server violated our import path assumptions.
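A startup-time configuration along these lines might look like the following sketch. The mylibs directory name is made up for illustration and need not exist for the code to run; imports from it would work only if it did.

```python
import os, sys

# 'mylibs' is a made-up directory name used only for illustration
libdir = os.path.join(os.getcwd(), 'mylibs')

if libdir not in sys.path:              # avoid duplicates on repeated runs
    sys.path.insert(0, libdir)          # front of the list: searched first

print(sys.path[0])
```

Inserting at the front gives the directory priority over later entries, while append would make it a last resort; which is appropriate depends on whether the script’s modules should override or defer to others of the same name.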
The sys module also contains hooks into the interpreter; sys.modules, for example, is a dictionary containing one name:module entry for every module imported in your Python session or program (really, in the calling Python process):
>>> sys.modules
{'reprlib': <module 'reprlib' from 'c:\\python31\\lib\\reprlib.py'>, ...more deleted...
>>> list(sys.modules.keys())
['reprlib', 'heapq', '__future__', 'sre_compile', '_collections', 'locale', '_sre', 'functools', 'encodings', 'site', 'operator', 'io', '__main__', ...more deleted... ]
>>> sys
<module 'sys' (built-in)>
>>> sys.modules['sys']
<module 'sys' (built-in)>
We might use such a hook to write programs that display or otherwise process all the modules loaded by a program (just iterate over the keys of sys.modules).
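For instance, a small module lister along these lines might look like the following. This is only a sketch: the ten-name limit and the "(no file)" label are arbitrary choices made for the illustration.

```python
import sys

# Sort the names so the listing is stable within a run
loaded = sorted(sys.modules)

for name in loaded[:10]:                       # show just the first few
    module = sys.modules[name]
    # Some modules (built-ins, for example) have no __file__ to report
    origin = getattr(module, '__file__', None) or '(no file)'
    print(name, '=>', origin)

print(len(loaded), 'modules loaded')
```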
Also in the interpreter hooks category, an object’s reference count is available via sys.getrefcount, and the names of modules built into the Python executable are listed in sys.builtin_module_names. See Python’s library manual for details; these are mostly Python internals information, but such hooks can sometimes become important to programmers writing tools for other programmers to use.
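To get a feel for these two hooks, here is a brief sketch. It assumes the standard CPython interpreter, where reference counts are an implementation detail and the count reported includes a temporary reference held by the getrefcount call itself:

```python
import sys

data = []                          # one reference: the name 'data'
before = sys.getrefcount(data)     # includes the call's own temporary reference
alias = data                       # a second name for the same list object
after = sys.getrefcount(data)

print(before, after)               # the second count is one larger (CPython)
print('sys' in sys.builtin_module_names)
```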
Other attributes in the sys module allow us to fetch all the information related to the most recently raised Python exception. This is handy if we want to process exceptions in a more generic fashion. For instance, the sys.exc_info function returns a tuple with the latest exception’s type, value, and traceback object. In the entirely class-based exception model that Python 3 uses, the first two of these correspond to the most recently raised exception’s class and the instance of it that was raised:
>>> try:
...     raise IndexError
... except:
...     print(sys.exc_info())
...
(<class 'IndexError'>, IndexError(), <traceback object at 0x019B8288>)
We might use such information to format our own error message to display in a GUI pop-up window or HTML web page (recall that by default, uncaught exceptions terminate programs with a Python error display). The first two items returned by this call have reasonable string displays when printed directly, and the third is a traceback object that can be processed with the standard traceback module:
>>> import traceback, sys
>>> def grail(x):
...     raise TypeError('already got one')
...
>>> try:
...     grail('arthur')
... except:
...     exc_info = sys.exc_info()
...     print(exc_info[0])
...     print(exc_info[1])
...     traceback.print_tb(exc_info[2])
...
<class 'TypeError'>
already got one
  File "<stdin>", line 2, in <module>
  File "<stdin>", line 2, in grail
The traceback module can also format messages as strings and route them to specific file objects; see the Python library manual for more details.
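As a quick sketch of both options, the traceback module’s format_exc call returns the report as a string, and print_exc accepts a file argument. The in-memory io.StringIO object here is just a stand-in for any file-like destination, such as a log file:

```python
import io
import traceback

def grail(x):
    raise TypeError('already got one')

try:
    grail('arthur')
except TypeError:
    # Format the active exception's full traceback as a single string
    message = traceback.format_exc()
    # Route the same report to a file-like object instead of sys.stderr
    buffer = io.StringIO()
    traceback.print_exc(file=buffer)

print(message)
```

A string like message could be shown in a GUI pop-up or embedded in a reply page, rather than being printed to the console.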
The sys module exports additional commonly used tools that we will meet in the context of larger topics and examples introduced later in this part of the book. For instance:
Command-line arguments show up as a list of strings called sys.argv.
Standard streams are available as sys.stdin, sys.stdout, and sys.stderr.
Program exit can be forced with sys.exit calls.
Since these lead us to bigger topics, though, we will cover them in sections of their own.
As mentioned, os is the larger of the two core system modules. It contains all of the usual operating-system calls you use in C programs and shell scripts. Its calls deal with directories, processes, shell variables, and the like. Technically, this module provides POSIX tools (a portable standard for operating-system calls) along with platform-independent directory processing tools as the nested module os.path. Operationally, os serves as a largely portable interface to your computer’s system calls: scripts written with os and os.path can usually be run unchanged on any platform. On some platforms, os includes extra tools available just for that platform (e.g., low-level process calls on Unix); by and large, though, it is as cross-platform as is technically feasible.
Let’s take a quick look at the basic interfaces in os. As a preview, Table 2-1 summarizes some of the most commonly used tools in the os module, organized by functional area.
Tasks | Tools
Shell variables |
Running programs |
Spawning processes |
Descriptor files, locks |
File processing |
Administrative tools |
Portability tools |
Pathname tools |
If you inspect this module’s attributes interactively, you get a huge list of names that will vary per Python release, will likely vary per platform, and isn’t incredibly useful until you’ve learned what each name means (I’ve let this line-wrap and removed most of this list to save space—run the command on your own):
>>> import os
>>> dir(os)
['F_OK', 'MutableMapping', 'O_APPEND', 'O_BINARY', 'O_CREAT', 'O_EXCL', 'O_NOINHERIT',
'O_RANDOM', 'O_RDONLY', 'O_RDWR', 'O_SEQUENTIAL', 'O_SHORT_LIVED', 'O_TEMPORARY',
'O_TEXT', 'O_TRUNC', 'O_WRONLY', 'P_DETACH', 'P_NOWAIT', 'P_NOWAITO', 'P_OVERLAY',
'P_WAIT', 'R_OK', 'SEEK_CUR', 'SEEK_END', 'SEEK_SET', 'TMP_MAX',
...9 lines removed here...
'pardir', 'path', 'pathsep', 'pipe', 'popen', 'putenv', 'read', 'remove', 'removedirs',
'rename', 'renames', 'rmdir', 'sep', 'spawnl', 'spawnle', 'spawnv', 'spawnve',
'startfile', 'stat', 'stat_float_times', 'stat_result', 'statvfs_result', 'strerror',
'sys', 'system', 'times', 'umask', 'unlink', 'urandom', 'utime', 'waitpid', 'walk',
'write']
Besides all of these, the nested os.path module exports even more tools, most of which are related to processing file and directory names portably:
>>> dir(os.path)
['__all__', '__builtins__', '__doc__', '__file__', '__name__', '__package__',
'_get_altsep', '_get_bothseps', '_get_colon', '_get_dot', '_get_empty',
'_get_sep', '_getfullpathname', 'abspath', 'altsep', 'basename', 'commonprefix',
'curdir', 'defpath', 'devnull', 'dirname', 'exists', 'expanduser', 'expandvars',
'extsep', 'genericpath', 'getatime', 'getctime', 'getmtime', 'getsize', 'isabs',
'isdir', 'isfile', 'islink', 'ismount', 'join', 'lexists', 'normcase', 'normpath',
'os', 'pardir', 'pathsep', 'realpath', 'relpath', 'sep', 'split', 'splitdrive',
'splitext', 'splitunc', 'stat', 'supports_unicode_filenames', 'sys']
Just in case those massive listings aren’t quite enough to go on, let’s experiment interactively with some of the more commonly used os tools. Like sys, the os module comes with a collection of informational and administrative tools:
>>> os.getpid()
7980
>>> os.getcwd()
'C:\\PP4thEd\\Examples\\PP4E\\System'
>>> os.chdir(r'C:\Users')
>>> os.getcwd()
'C:\\Users'
As shown here, the os.getpid function gives the calling process’s process ID (a unique system-defined identifier for a running program, useful for process control and unique name creation), and os.getcwd returns the current working directory. The current working directory is where files opened by your script are assumed to live, unless their names include explicit directory paths. That’s why earlier I told you to run the following command in the directory where more.py lives:
C:\...\PP4E\System> python more.py more.py
The input filename argument here is given without an explicit directory path (though you could add one to page files in another directory). If you need to run in a different working directory, call the os.chdir function to change to a new directory; your code will run relative to the new directory for the rest of the program (or until the next os.chdir call). The next chapter will have more to say about the notion of a current working directory, and its relation to module imports when it explores script execution context.
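The effect of the working directory on relative filenames is easy to demonstrate. The following sketch uses a throwaway directory from the standard tempfile module so that it can run anywhere; the filename spam.txt is arbitrary:

```python
import os
import tempfile

start_dir = os.getcwd()                       # remember where we began

with tempfile.TemporaryDirectory() as workdir:
    os.chdir(workdir)                         # relative names now resolve here
    with open('spam.txt', 'w') as f:          # no directory path: lands in the cwd
        f.write('hello')
    created_in_workdir = os.path.exists(os.path.join(workdir, 'spam.txt'))
    os.chdir(start_dir)                       # go back before the directory is removed

print(created_in_workdir)
```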
The os module also exports a set of names designed to make cross-platform programming simpler. The set includes platform-specific settings for path and directory separator characters, parent and current directory indicators, and the characters used to terminate lines on the underlying computer.
>>> os.pathsep, os.sep, os.pardir, os.curdir, os.linesep
(';', '\\', '..', '.', '\r\n')
os.sep is whatever character is used to separate directory components on the platform on which Python is running; it is automatically preset to \ on Windows, / for POSIX machines, and : on some Macs. Similarly, os.pathsep provides the character that separates directories on directory lists, : for POSIX and ; for DOS and Windows.
By using such attributes when composing and decomposing system-related strings in our scripts, we make the scripts fully portable. For instance, a call of the form dirpath.split(os.sep) will correctly split platform-specific directory names into components, though dirpath may look like dir\dir on Windows, dir/dir on Linux, and dir:dir on some Macs. As mentioned, on Windows you can usually use forward slashes rather than backward slashes when giving filenames to be opened, but these portability constants allow scripts to be platform neutral in directory processing code.
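Here is a short sketch of the idea, with hypothetical directory names. Because the strings are both built and taken apart with os.sep and os.pathsep, the same code should behave the same way on any platform:

```python
import os

# Build a path with this platform's separator rather than hardcoding one
dirpath = os.sep.join(['PP4E', 'System', 'Streams'])   # hypothetical names

parts = dirpath.split(os.sep)          # decompose on the same separator
print(dirpath, parts)

# Search-path style strings are split on os.pathsep instead
dirlist = os.pathsep.join(['dir1', 'dir2'])
print(dirlist, dirlist.split(os.pathsep))
```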
Notice also how os.linesep comes back as \r\n here: the symbolic escape code that reflects the carriage-return + line-feed line terminator convention on Windows, which you don’t normally notice when processing text files in Python. We’ll learn more about end-of-line translations in Chapter 4.
The nested module os.path provides a large set of directory-related tools of its own. For example, it includes portable functions for tasks such as checking a file’s type (isdir, isfile, and others); testing file existence (exists); and fetching the size of a file by name (getsize):
>>> os.path.isdir(r'C:\Users'), os.path.isfile(r'C:\Users')
(True, False)
>>> os.path.isdir(r'C:\config.sys'), os.path.isfile(r'C:\config.sys')
(False, True)
>>> os.path.isdir('nonesuch'), os.path.isfile('nonesuch')
(False, False)
>>> os.path.exists(r'c:\Users\Brian')
False
>>> os.path.exists(r'c:\Users\Default')
True
>>> os.path.getsize(r'C:\autoexec.bat')
24
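The session above depends on files that exist on one particular Windows machine. As a platform-neutral sketch of the same tests, the following creates its own temporary directory and a 24-byte file (both hypothetical stand-ins) before probing them:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    filename = os.path.join(folder, 'data.txt')
    with open(filename, 'wb') as f:
        f.write(b'x' * 24)                       # 24 bytes, written in binary mode

    missing = os.path.join(folder, 'nonesuch')   # never created
    dir_tests = (os.path.isdir(folder), os.path.isfile(folder))
    file_tests = (os.path.isdir(filename), os.path.isfile(filename))
    missing_tests = (os.path.isdir(missing), os.path.isfile(missing),
                     os.path.exists(missing))
    size = os.path.getsize(filename)

print(dir_tests, file_tests, missing_tests, size)
```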
The os.path.isdir and os.path.isfile calls tell us whether a filename is a directory or a simple file; both return False if the named file does not exist (that is, nonexistence implies negation). We also get calls for splitting and joining directory path strings, which automatically use the directory name conventions on the platform on which Python is running:
>>> os.path.split(r'C:\temp\data.txt')
('C:\\temp', 'data.txt')
>>> os.path.join(r'C:\temp', 'output.txt')
'C:\\temp\\output.txt'
>>> name = r'C:\temp\data.txt'                            # Windows paths
>>> os.path.dirname(name), os.path.basename(name)
('C:\\temp', 'data.txt')
>>> name = '/home/lutz/temp/data.txt'                     # Unix-style paths
>>> os.path.dirname(name), os.path.basename(name)
('/home/lutz/temp', 'data.txt')
>>> os.path.splitext(r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw')
('C:\\PP4thEd\\Examples\\PP4E\\PyDemos', '.pyw')
os.path.split separates a filename from its directory path, and os.path.join puts them back together, all in entirely portable fashion using the path conventions of the machine on which they are called. The dirname and basename calls here return the first and second items returned by a split simply as a convenience, and splitext strips the file extension (after the last .). Subtle point: it’s almost equivalent to use string split and join method calls with the portable os.sep string, but not exactly:
>>> os.sep
'\\'
>>> pathname = r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw'
>>> os.path.split(pathname)                               # split file from dir
('C:\\PP4thEd\\Examples\\PP4E', 'PyDemos.pyw')
>>> pathname.split(os.sep)                                # split on every slash
['C:', 'PP4thEd', 'Examples', 'PP4E', 'PyDemos.pyw']
>>> os.sep.join(pathname.split(os.sep))
'C:\\PP4thEd\\Examples\\PP4E\\PyDemos.pyw'
>>> os.path.join(*pathname.split(os.sep))
'C:PP4thEd\\Examples\\PP4E\\PyDemos.pyw'
The last join call requires individual arguments (hence the *) but doesn’t insert a first slash because of the Windows drive syntax; use the preceding str.join method instead if the difference matters. The normpath call comes in handy if your paths become a jumble of Unix and Windows separators:
>>> mixed
'C:\\temp\\public/files/index.html'
>>> os.path.normpath(mixed)
'C:\\temp\\public\\files\\index.html'
>>> print(os.path.normpath(r'C:\temp\\sub\.\file.ext'))
C:\temp\sub\file.ext
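If you are not on Windows, you can still reproduce the Windows results in this section: the standard library’s ntpath and posixpath modules implement the two sets of path rules and can be imported on any platform (os.path is simply whichever one matches the host). The following sketch uses them directly so that its results do not depend on where it runs:

```python
import ntpath       # Windows path rules, importable on any platform
import posixpath    # Unix path rules, importable on any platform

win_name = r'C:\temp\data.txt'
unix_name = '/home/lutz/temp/data.txt'

win_parts = ntpath.split(win_name)               # ('C:\\temp', 'data.txt')
unix_parts = posixpath.split(unix_name)          # ('/home/lutz/temp', 'data.txt')
root, ext = ntpath.splitext(r'C:\PP4thEd\Examples\PP4E\PyDemos.pyw')
cleaned = ntpath.normpath(r'C:\temp\public/files/index.html')
rejoined = ntpath.join(*win_name.split('\\'))    # no slash after the drive

print(win_parts, unix_parts, ext)
print(cleaned)
print(rejoined)
```

The last result comes back as C:temp\data.txt, which illustrates the drive-syntax subtlety noted above.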
This module also has an abspath call that portably returns the full directory pathname of a file; it accounts for adding the current directory as a path prefix, .. parent syntax, and more:
>>> os.chdir(r'C:\Users')
>>> os.getcwd()
'C:\\Users'
>>> os.path.abspath('')                       # empty string means the cwd
'C:\\Users'
>>> os.path.abspath('temp')                   # expand to full pathname in cwd
'C:\\Users\\temp'
>>> os.path.abspath(r'PP4E\dev')              # partial paths relative to cwd
'C:\\Users\\PP4E\\dev'
>>> os.path.abspath('.')                      # relative path syntax expanded
'C:\\Users'
>>> os.path.abspath('..')
'C:\\'
>>> os.path.abspath(r'..\examples')
'C:\\examples'
>>> os.path.abspath(r'C:\PP4thEd\chapters')   # absolute paths unchanged
'C:\\PP4thEd\\chapters'
>>> os.path.abspath(r'C:\temp\spam.txt')
'C:\\temp\\spam.txt'
Because filenames are relative to the current working directory when they aren’t fully specified paths, the os.path.abspath function helps if you want to show users what directory is truly being used to store a file. On Windows, for example, when GUI-based programs are launched by clicking on file explorer icons and desktop shortcuts, the execution directory of the program is the clicked file’s home directory, but that is not always obvious to the person doing the clicking; printing a file’s abspath can help.
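A program might report a save location along these lines. This is a minimal sketch; the relative filename is hypothetical, and the file itself need not exist for abspath to expand its name:

```python
import os

relative = os.path.join('temp', 'spam.txt')    # hypothetical relative filename
full = os.path.abspath(relative)               # prefixed with the current directory

print('Saving to:', full)
```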
The os module is also the place where we run shell commands from within Python scripts. This concept is intertwined with others, such as streams, which we won’t cover fully until the next chapter, but since this is a key concept employed throughout this part of the book, let’s take a quick first look at the basics here. Two os functions allow scripts to run any command line that you can type in a console window:
os.system runs a shell command from a Python script.
os.popen runs a shell command and connects to its input or output streams.
In addition, the relatively new subprocess module provides finer-grained control over streams of spawned shell commands and can be used as an alternative to, and even for the implementation of, the two calls above (albeit with some cost in extra code complexity).
To understand the scope of the calls listed above, we first need to define a few terms. In this text, the term shell means the system that reads and runs command-line strings on your computer, and shell command means a command-line string that you would normally enter at your computer’s shell prompt.
For example, on Windows, you can start an MS-DOS console window (a.k.a. “Command Prompt”) and type DOS commands there: commands such as dir to get a directory listing, type to view a file, names of programs you wish to start, and so on. DOS is the system shell, and commands such as dir and type are shell commands. On Linux and Mac OS X, you can start a new shell session by opening an xterm or terminal window and typing shell commands there too: ls to list directories, cat to view files, and so on. A variety of shells are available on Unix (e.g., csh, ksh), but they all read and run command lines. Here are two shell commands typed and run in an MS-DOS console box on Windows:
C:\...\PP4E\System> dir /B                      ...type a shell command line
helloshell.py                                   ...its output shows up here
more.py                                         ...DOS is the shell on Windows
more.pyc
spam.txt
__init__.py

C:\...\PP4E\System> type helloshell.py
# a Python program
print('The Meaning of Life')
None of this is directly related to Python, of course (despite the fact that Python command-line scripts are sometimes confusingly called “shell tools”). But because the os module’s system and popen calls let Python scripts run any sort of command that the underlying system shell understands, our scripts can make use of every command-line tool available on the computer, whether it’s coded in Python or not. For example, here is some Python code that runs the two DOS shell commands typed at the shell prompt shown previously:
C:\...\PP4E\System> python
>>> import os
>>> os.system('dir /B')
helloshell.py
more.py
more.pyc
spam.txt
__init__.py
0
>>> os.system('type helloshell.py')
# a Python program
print('The Meaning of Life')
0
>>> os.system('type hellshell.py')
The system cannot find the file specified.
1
The 0s at the end of the first two commands here are just the return values of the system call itself (its exit status; zero generally means success). The system call can be used to run any command line that we could type at the shell’s prompt (here, C:\...\PP4E\System>). The command’s output normally shows up in the Python session’s or program’s standard output stream.
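To try the exit-status idea without depending on DOS commands, the following sketch runs the Python interpreter itself as the command, located via sys.executable. It assumes that the interpreter’s path contains no spaces, since the path is pasted into the command line unquoted:

```python
import os
import sys

python = sys.executable       # assumption: this path contains no spaces

ok_status = os.system(python + ' -c "pass"')                       # succeeds
bad_status = os.system(python + ' -c "import sys; sys.exit(1)"')   # fails

print(ok_status, bad_status)
```

The first result is zero and the second is not. On Unix-like platforms the nonzero value is an encoded wait status rather than the raw exit code, so it is safest to test only for zero versus nonzero.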
But what if we want to grab a command’s output within a script? The os.system call simply runs a shell command line, but os.popen also connects to the standard input or output streams of the command; we get back a file-like object connected to the command’s output by default (if we pass a w mode flag to popen, we connect to the command’s input stream instead). By using this object to read the output of a command spawned with popen, we can intercept the text that would normally appear in the console window where a command line is typed:
>>> open('helloshell.py').read()
"# a Python program\nprint('The Meaning of Life')\n"
>>> text = os.popen('type helloshell.py').read()
>>> text
"# a Python program\nprint('The Meaning of Life')\n"
>>> listing = os.popen('dir /B').readlines()
>>> listing
['helloshell.py\n', 'more.py\n', 'more.pyc\n', 'spam.txt\n', '__init__.py\n']
Here, we first fetch a file’s content the usual way (using Python files), then as the output of a shell type command. Reading the output of a dir command lets us get a listing of files in a directory that we can then process in a loop. We’ll learn other ways to obtain such a list in Chapter 4; there we’ll also learn how file iterators make the readlines call in the os.popen example above unnecessary in most programs, except to display the list interactively as we did here (see also subprocess, os.popen, and Iterators for more on the subject).
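As a preview of that iterator-based style, the object returned by os.popen can be stepped through directly in a for loop. This sketch again uses the Python interpreter as a portable stand-in for a shell command, and assumes its path contains no spaces:

```python
import os
import sys

# Child command prints three lines (assumes sys.executable has no spaces)
command = sys.executable + ' -c "print(1); print(2); print(3)"'

lines = []
pipe = os.popen(command)
for line in pipe:                     # read one line at a time, no readlines
    lines.append(line.rstrip())
status = pipe.close()                 # None when the command exited normally

print(lines, status)
```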
So far, we’ve run basic DOS commands; because these calls can run any command line that we can type at a shell prompt, they can also be used to launch other Python scripts. Assuming your system search path is set to locate your Python (so that you can use the shorter “python” in the following instead of the longer “C:\Python31\python”):
>>> os.system('python helloshell.py')               # run a Python program
The Meaning of Life
0
>>> output = os.popen('python helloshell.py').read()
>>> output
'The Meaning of Life\n'
In all of these examples, the command-line strings sent to system and popen are hardcoded, but there’s no reason Python programs could not construct such strings at runtime using normal string operations (+, %, etc.). Given that commands can be dynamically built and run this way, system and popen turn Python scripts into flexible and portable tools for launching and orchestrating other programs. For example, a Python test “driver” script can be used to run programs coded in any language (e.g., C++, Java, Python) and analyze their output. We’ll explore such a script in Chapter 6. We’ll also revisit os.popen in the next chapter in conjunction with stream redirection; as we’ll find, this call can also send input to programs.
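Here is a tiny sketch of a command string built at runtime. The run_and_capture helper is hypothetical (it is not the test driver we’ll code later), and it assumes that the interpreter path has no spaces and that the code string contains no double quotes:

```python
import os
import sys

def run_and_capture(code):
    """Build a command line at runtime, run it, and return its output text."""
    command = '%s -c "%s"' % (sys.executable, code)
    with os.popen(command) as pipe:
        return pipe.read()

results = {}
for number in (2, 3, 4):
    code = 'print(%d ** 2)' % number          # a different command each time
    results[number] = run_and_capture(code).strip()

print(results)
```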
As mentioned, in recent releases of Python the subprocess module can achieve the same effect as os.system and os.popen; it generally requires extra code but gives more control over how streams are connected and used. This becomes especially useful when streams are tied in more complex ways.
For example, to run a simple shell command like we did with os.system earlier, this new module’s call function works roughly the same (running commands like “type” that are built into the shell on Windows requires extra protocol, though normal executables like “python” do not):
>>> import subprocess
>>> subprocess.call('python helloshell.py')              # roughly like os.system()
The Meaning of Life
0
>>> subprocess.call('cmd /C "type helloshell.py"')       # built-in shell cmd
# a Python program
print('The Meaning of Life')
0
>>> subprocess.call('type helloshell.py', shell=True)    # alternative for built-ins
# a Python program
print('The Meaning of Life')
0
Notice the shell=True in the last command here. This is a subtle and platform-dependent requirement:
On Windows, we need to pass a shell=True argument to subprocess tools like call and Popen (shown ahead) in order to run commands built into the shell. Windows commands like “type” require this extra protocol, but normal executables like “python” do not.
On Unix-like platforms, when shell is False (its default), the program command line is run directly by os.execvp, a call we’ll meet in Chapter 5. If this argument is True, the command-line string is run through a shell instead, and you can specify the shell to use with additional arguments.
More on some of this later; for now, it’s enough to note that you may need to pass shell=True to run some of the examples in this section and book in Unix-like environments, if they rely on shell features like program path lookup. Since I’m running code on Windows, this argument will often be omitted here.
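One way to sidestep much of this platform dependence is to pass the command as a list of strings, which subprocess runs directly without a shell on all platforms. The sketch below contrasts that form with the shell=True string form; it assumes the interpreter path has no spaces for the string version:

```python
import subprocess
import sys

code = 'print(6 * 7)'

# List form: program and arguments as separate items, run without a shell
status_list = subprocess.call([sys.executable, '-c', code])

# String form: one command line, interpreted by the platform's shell
# (assumes sys.executable contains no spaces)
status_shell = subprocess.call('%s -c "%s"' % (sys.executable, code), shell=True)

print(status_list, status_shell)
```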
Besides imitating os.system, we can similarly use this module to emulate the os.popen call used earlier, to run a shell command and obtain its standard output text in our script:
>>> pipe = subprocess.Popen('python helloshell.py', stdout=subprocess.PIPE)
>>> pipe.communicate()
(b'The Meaning of Life\r\n', None)
>>> pipe.returncode
0
Here, we connect the stdout stream to a pipe, and communicate to run the command to completion and receive its standard output and error streams’ text; the command’s exit status is available in an attribute after it completes. Alternatively, we can use other interfaces to read the command’s standard output directly and wait for it to exit (which returns the exit status):
>>> pipe = subprocess.Popen('python helloshell.py', stdout=subprocess.PIPE)
>>> pipe.stdout.read()
b'The Meaning of Life\r\n'
>>> pipe.wait()
0
In fact, there are direct mappings from os.popen calls to subprocess.Popen objects:
>>> from subprocess import Popen, PIPE
>>> Popen('python helloshell.py', stdout=PIPE).communicate()[0]
b'The Meaning of Life\r\n'
>>>
>>> import os
>>> os.popen('python helloshell.py').read()
'The Meaning of Life\n'
As you can probably tell, subprocess is extra work in these relatively simple cases. It starts to look better, though, when we need to control additional streams in flexible ways. In fact, because it also allows us to process a command’s error and input streams in similar ways, in Python 3.X subprocess replaces the original os.popen2, os.popen3, and os.popen4 calls which were available in Python 2.X; these are now just use cases for subprocess object interfaces. Because more advanced use cases for this module deal with standard streams, we’ll postpone additional details about this module until we study stream redirection in the next chapter.
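As a brief taste of that extra control, the following sketch connects all three standard streams of a child program at once. The child here is a one-line Python program invented for the demonstration; it uppercases its input and writes a note to its error stream:

```python
import subprocess
import sys

# Child program: echo standard input in uppercase, then write to stderr
code = ('import sys; '
        'sys.stdout.write(sys.stdin.read().upper()); '
        'sys.stderr.write("done")')

pipe = subprocess.Popen([sys.executable, '-c', code],
                        stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE,
                        stderr=subprocess.PIPE)

# communicate sends the input, reads both output streams, and waits for exit
out, err = pipe.communicate(b'spam')
print(out, err, pipe.returncode)
```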
Before we move on, you should keep in mind two limitations of system and popen. First, although these two functions themselves are fairly portable, their use is really only as portable as the commands that they run. The preceding examples that run DOS dir and type shell commands, for instance, work only on Windows, and would have to be changed in order to run ls and cat commands on Unix-like platforms.
Second, it is important to remember that running Python files as programs this way is very different and generally much slower than importing program files and calling functions they define. When os.system and os.popen are called, they must start a brand-new, independent program running on your operating system (they generally run the command in a new process). When importing a program file as a module, the Python interpreter simply loads and runs the file’s code in the same process in order to generate a module object. No other program is spawned along the way.[6]
There are good reasons to build systems as separate programs, too, and in the next chapter we’ll explore things such as command-line arguments and streams that allow programs to pass information back and forth. But in many cases, imported modules are a faster and more direct way to compose systems.
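To make the contrast concrete, this sketch writes a small throwaway module (the name lifemod and its contents are hypothetical) and then uses it both ways: once as a spawned program whose result comes back as text, and once as an imported module whose result comes back as a Python object. It uses subprocess for the spawn, though os.popen would serve as well:

```python
import importlib
import os
import subprocess
import sys
import tempfile

source = ("def meaning():\n"
          "    return 42\n"
          "\n"
          "if __name__ == '__main__':\n"
          "    print(meaning())\n")

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'lifemod.py')       # hypothetical module file
    with open(path, 'w') as f:
        f.write(source)

    # 1) Run as a separate program: new process, result arrives as text
    spawned = subprocess.check_output([sys.executable, path]).decode().strip()

    # 2) Import as a module: same process, result arrives as an object
    sys.path.insert(0, folder)
    try:
        importlib.invalidate_caches()
        lifemod = importlib.import_module('lifemod')
        imported = lifemod.meaning()
    finally:
        sys.path.remove(folder)

print(repr(spawned), repr(imported))
```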
If you plan to use these calls in earnest, you should also know that the os.system call normally blocks (that is, pauses) its caller until the spawned command line exits. On Linux and Unix-like platforms, the spawned command can generally be made to run independently and in parallel with the caller by adding an & shell background operator at the end of the command line:
os.system("python program.py arg arg &")
On Windows, spawning with a DOS start command will usually launch the command in parallel too:
os.system("start program.py arg arg")
In fact, this is so useful that an os.startfile call was added in recent Python releases. This call opens a file with whatever program is listed in the Windows registry for the file’s type, as though its icon has been clicked with the mouse cursor:
os.startfile("webpage.html")      # open file in your web browser
os.startfile("document.doc")      # open file in Microsoft Word
os.startfile("myscript.py")       # run file with Python
The os.popen call does not generally block its caller (by definition, the caller must be able to read or write the file object returned), but callers may still occasionally become blocked under both Windows and Linux if the pipe object is closed (e.g., when garbage is collected) before the spawned program exits or the pipe is read exhaustively (e.g., with its read() method). As we will see later in this part of the book, the Unix os.fork/exec and Windows os.spawnv calls can also be used to run parallel programs without blocking.
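For a portable way to experiment with nonblocking launches before we reach those calls, the subprocess module’s Popen can be used instead. This is a substitute technique, not the os calls just named. The child below is an invented one-liner that pauses briefly so the parent has time to check on it:

```python
import subprocess
import sys

# Child program: sleep briefly, then print a message
code = 'import time; time.sleep(0.5); print("child done")'

child = subprocess.Popen([sys.executable, '-c', code], stdout=subprocess.PIPE)

# Popen returns right away, so the caller is free to do other work here
still_running = child.poll() is None     # poll gives None until the child exits

output, _ = child.communicate()          # now wait for the child to finish
print(still_running, output, child.returncode)
```

The still_running flag will normally be true here, because the parent reaches the poll call long before the child’s half-second pause ends, but that is a matter of timing rather than a guarantee.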
Because the os module’s system and popen calls, as well as the subprocess module, also fall under the category of program launchers, stream redirectors, and cross-process communication devices, they will show up again in the following chapters, so we’ll defer further details for the time being. If you’re looking for more details right away, be sure to see the stream redirection section in the next chapter and the directory listings section in Chapter 4.
That’s as much of a tour around os as we have space for here. Since most other os module tools are even more difficult to appreciate outside the context of larger application topics, we’ll postpone a deeper look at them until later chapters. But to let you sample the flavor of this module, here is a quick preview for reference. Among the os module’s other weapons are these:
And so on. One caution up front: the os module provides a set of file open, read, and write calls, but all of these deal with low-level file access and are entirely distinct from Python’s built-in stdio file objects that we create with the built-in open function. You should normally use the built-in open function, not the os module, for all but very special file-processing needs (e.g., opening with exclusive access file locking).
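To see how different the two interfaces are, here is a small sketch. It uses the os.O_EXCL exclusive-creation flag as one example of a special need (an assumption about what such a need might look like, not the only possibility), and a hypothetical filename in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'lowlevel.txt')     # hypothetical filename

    # Low-level: integer descriptors and bytes; O_EXCL fails if the file exists
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    os.write(fd, b'spam\n')
    os.close(fd)

    try:
        os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

    # High-level: the built-in open returns a file object with text handling
    with open(path) as f:
        text = f.read()

print(second_create_failed, repr(text))
```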
In the next chapter we will apply sys and os tools such as those we’ve introduced here to implement common system-level tasks, but this book doesn’t have space to provide an exhaustive list of the contents of modules we will meet along the way. Again, if you have not already done so, you should become acquainted with the contents of modules such as os and sys using the resources described earlier. For now, let’s move on to explore additional system tools in the context of broader system programming concepts: the context surrounding a running script.
[3] They may also work their way into your subconscious. Python newcomers sometimes describe a phenomenon in which they “dream in Python” (insert overly simplistic Freudian analysis here…).
[4] Full disclosure: I also wrote the last of the books listed as a replacement for the reference appendix that appeared in the first edition of this book; it’s meant to be a supplement to the text you’re reading, and its latest edition also serves as a translation resource for Python 2.X readers. As explained in the Preface, the book you’re holding is meant as tutorial, not reference, so you’ll probably want to find some sort of reference resource eventually (though I’m nearly narcissistic enough to require that it be mine).
[5] It’s not impossible that Python sees PYTHONPATH differently than you do. A syntax error in your system shell configuration files may botch the setting of PYTHONPATH, even if it looks fine to you. On Windows, for example, if a space appears around the = of a DOS set command in your configuration file (e.g., set NAME = VALUE), you may actually set NAME to an empty string, not to VALUE!
[6] The Python code exec(open(file).read()) also runs a program file’s code, but within the same process that called it. It’s similar to an import in that regard, but it works more as if the file’s text had been pasted into the calling program at the place where the exec call appears (unless explicit global or local namespace dictionaries are passed). Unlike imports, such an exec unconditionally reads and executes a file’s code (it may be run more than once per process), no module object is generated by the file’s execution, and unless optional namespace dictionaries are passed in, assignments in the file’s code may overwrite variables in the scope where the exec appears; see other resources or the Python library manual for more details.
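As a minimal sketch of the behavior described in footnote [6], the following runs a one-line script file (its name and contents are hypothetical) three times with exec, passing an explicit namespace dictionary so that its assignments stay out of the caller’s own scope:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'script.py')         # hypothetical script file
    with open(path, 'w') as f:
        f.write("runs = runs + 1 if 'runs' in globals() else 1\n")

    namespace = {}                                    # explicit scope for the code
    for _ in range(3):
        with open(path) as f:
            exec(f.read(), namespace)                 # reruns every time, unlike import

print(namespace['runs'])                              # 3: one increment per exec
```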