Python 3.0 represents an evolution of the language that will not execute most older code that was written against the 2.x intepreters. This doesn’t mean that you won’t recognize the old code any more, or that “major” porting is required to make old code work under 3.x. In fact, the new syntax is quite similar to that of the past. However, when the print
statement no longer exists, it makes it easy to “break” the old code. In this appendix, we discuss print
and other 3.x changes, and we shed some light on the “required evolution” that Python must undergo to be better than it was before. Finally, we present a few migration tools that may help you make this transition.
Python is currently undergoing its most significant transformation since it was released in the early 1990s. Even the revision change from 1.x to 2.x in 2000 was relatively mild—Python 2.0 ran 1.5.2 software just fine back then. One of the main reasons for Python’s stability over the years has been the steadfast determination of the core development team to preserve backward compatibility. Over the years, however, certain “sticky” flaws (issues that stick around from release to release) were identified by creator Guido van Rossum [Regrets], Andrew Kuchling [Warts], and other users. Their persistence made it clear that a release with hard changes was needed to ensure that the language evolved. Python 3.0 marks the first time that a Python interpreter has been released that (deliberately) breaks the backward-compatibility trend.
The changes in Python 3.0 are not mind-boggling—it’s not like you won’t recognize Python any more. The remainder of this appendix gives an overview of some of the major changes:
• print
becomes print()
.
• Strings are cast into Unicode by default.
• There is a single class type.
• The syntax for exceptions has been updated.
• Integers have been updated.
• Iterables are used everywhere.
The switch to print()
is easily the change that breaks the most existing Python code. Why is Python changing from a statement to a BIF? Having print
as a statement is limiting in many regards, as detailed by Guido in his “Python Regrets” talk, which outlined what he feels are shortcomings of the language. In addition, having print
as a statement limits improvements to it. However, when print()
is available as a function, new keyword parameters can be added, certain standard behaviors can be overridden with keyword parameters, and print()
can be replaced if desired, just like any other BIF. Here are “before” and “after” examples:
The next gotcha that current Python users face is that strings are now Unicode by default. This change couldn’t come soon enough. There is not one day that numerous Python developers don’t run into a problem when dealing with Unicode and regular ASCII strings that looks something like this:
These types of errors will no longer be an everyday occurrence in 3.x. (For more information on using Unicode in Python, see the Unicode HOWTO document.) With the model adopted by the new version of Python, users shouldn’t even use those terms (Unicode and ASCII/non-Unicode strings) anymore. The “What’s New in Python 3.0” document sums up this new model pretty explicitly.
Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however, encoded Unicode is represented as binary data. The type used to hold text is str
, and the type used to hold data is bytes
.
As far as syntax goes, because Unicode is now the default, the leading u
or U
is deprecated. Similarly, the new bytes
objects require a leading b
or B
for its literals (more information can be found in PEP 3112).
Table C.1 compares the various string types, showing how they will change from Python 2.x to 3.x. The table also includes a mention of the new mutable bytearray
type.
Prior to Python 2.2, Python’s objects didn’t behave like classes in other languages: Classes were “class” objects and instances were “instance” objects. This is in stark contrast to what people perceive as “normal”: Classes are types and instances are objects of such types. Because of this “flaw,” you could not subclass data types and modify them. In Python 2.2, the core development team came up with “new-style classes,” which act more like what people expect. Furthermore, this change meant that regular Python types could be subclassed—a change described in Guido’s “Unifying Types and Classes in Python 2.2” essay. Python 3.0 supports only new-style classes.
In the past, the syntax to catch an exception and the exception argument/instance had the following form:
except ValueError, e:
To catch multiple exceptions with the same handler, the following syntax was used:
except (ValueError, TypeError), e:
The required parentheses confused some users, who often attempted to write invalid code:
except ValueError, TypeError, e:
The (new) as
keyword is intended to ensure that you do not become confused by the comma in the original syntax; however, the parentheses are still required when you’re trying to catch more than one type of exception using the same handler. Here are two equivalent examples of the new syntax that demonstrate this change:
except ValueError as e:
except (ValueError, TypeError) as e:
Python 2.6 accepts both forms when creating exception handlers, thereby facilitating the porting process. More information about this change can be found in PEP 3110.
Python’s two different integer types, int
and long
, began their unification in Python 2.2. That change is now almost complete, with the “new” int
behaving like a long
. As a consequence, OverflowError
exceptions no longer occur when you exceed the native integer size, and the trailing L
has been dropped. This change is outlined in PEP 237. long
still exists in Python 2.x but has disappeared in Python 3.0.
The current division operator (/
) doesn’t give the expected answer for those users who are new to programming, so it has been changed to do so. If this change has brought any controversy, it is simply that programmers are used to the floor division functionality. To see how the confusion arises, try to convince a newbie to programming that 1 divided by 2 is 0 (1 / 2 == 0
). The simplest way to describe this change is with examples. Following are some excerpted from “Keeping Up with Python: The 2.2 Release,” found in the July 2002 issue of Linux Journal. You can also find out more about this update in PEP 238.
The default Python 2.x division operation works this way: Given two integer operands, /
performs integer floor division (truncates the fraction as in the earlier example). If there is at least
one float involved, true division occurs:
In Python 3.x, given any two numeric operands, /
will always return a float:
To try true division starting in Python 2.2, you can either import division
from __future__
module or use the -Qnew
switch.
The double-slash division operator (//)
was added in Python 2.2 to always perform floor division regardless of operand type and to begin the transition process:
The minor integer literal changes were added in Python 2.6 to make literal nondecimal (hexadecimal, octal, and new binary) formats consistent. Hex representation stayed the same, with its leading 0x
or 0X
(where the octal had formerly led with a single 0
). This format proved confusing to some users, so it has been changed to 0o
for consistency. Instead of 0177
, you must use 0o177
now. Finally, the new binary literal lets you provide the bits of an integer value, prefixed with a leading 0b
, as in 0b0110
. Python 3.0 does not accept 0177
. More information on integer literals updates can be found in PEP 3127.
Another theme inherent to Python 3.x is memory conservation. Using iterators is much more efficient than maintaining entire lists in memory, especially when the target action on the objects in question is iteration. There’s no need to waste memory when it’s not necessary. Thus, in Python 3.x, code that returned lists in earlier versions of the language no longer does so. For example, the functions map(), filter(), range()
, and zip()
, plus the dictionary methods keys(), items()
, and values()
, all return some sort of iterator. Yes, this syntax may be more inconvenient if you want to glance at your data, but it’s better in terms of resource consumption. The changes are mostly under the covers—if you only use the functions’ return values to iterate over, you won’t notice a thing!
As you have seen, most of the Python 3.x changes do not represent some wild mutation of the familiar Python syntax. Instead, the changes are just enough to break the old code base. Of course, the changes affect users, so a good transition plan is clearly needed—and most good ones come with good tools or aids to help you out. For example, both the 2to3
code converter and the latest Python 2.x release (2.6 at the time of this writing) may facilitate the transition.
The 2to3
tool will take Python 2.x code and attempt to generate a working equivalent in Python 3.x. Here are some of the actions it performs:
• Converts a print
statement to a print()
function
• Removes the L
long suffix
• Replaces <> with !=
• Changes backtick-quoted strings (`...`
) to repr(...)
This tool does a lot of the manual labor—but not everything; the rest is up to you. You can read more about porting suggestions and the 2to3
tool in the “What’s New in Python 3.0” document as well as at the tool’s Web page (http://docs.python.org/3.0/library/2to3.html).
Because of the compatibility issue, the releases of Python that lead up to 3.0 play a much more significant role in the transition. Of particular note is Python 2.6, the first and most pivotal of such releases. For users, it represents the first time that they can start coding against the 3.x family of releases, as many 3.x features have been backported to 2.x. Whenever possible, Python 2.6 incorporates new features and syntax from version 3.0 while remaining compatible with existing code by not removing older features or syntax. Such features are described in the “What’s New in Python 2.6” document. We detail some of these 2.6 migration features in Appendix D.
Overall, the changes outlined in this appendix do have a high impact in terms of updates required to the interpreter but should not radically change the way programmers write their Python code. It’s simply a matter of changing old habits, such as using parentheses with print—
or rather, print()
. Once you’ve gotten these changes under your belt, you’re well on your way to being able to effectively jump to the new platform. It may be a bit startling at first, but these changes have been coming for some time now. Don’t panic: Python 2.x will live on for a long time to come. The transition will be slow, deliberate, pain resistant, and even keeled. Welcome to the dawn of the next generation!
Andrew Kuchling, “Python Warts,” July 2003, http://web.archive.org/web/20070607112039, http://www.amk.ca/python/writing/warts.html
A. M. Kuchling, “What’s New in Python 2.6,” December 2008, http://docs.python.org/whatsnew/2.6.html
Wesley J. Chun, “Keeping Up with Python: The 2.2 Release,” July 2002, http://www.linuxjournal.com/article/5597
PEP Index, http://www.python.org/dev/peps
“Unicode HOWTO,” December 2008, http://docs.python.org/3.0/howto/unicode.html
Guido van Rossum, “Python Regrets,” July 2002, http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.pdf
Guido van Rossum, “Unifying Types and Classes in Python 2.2,” April 2002, http://www.python.org/2.2.3/descrintro.html
Guido van Rossum, “What’s New in Python 3.0,” December 2008, http://docs.python.org/3.0/whatsnew/3.0.html