Appendix C. Python 3: The Evolution of a Programming Language

Python 3.0 represents an evolution of the language that will not execute most older code that was written against the 2.x interpreters. This doesn’t mean that you won’t recognize the old code any more, or that “major” porting is required to make old code work under 3.x. In fact, the new syntax is quite similar to that of the past. However, changes such as the removal of the print statement make it easy to “break” old code. In this appendix, we discuss print and other 3.x changes, and we shed some light on the “required evolution” that Python must undergo to be better than it was before. Finally, we present a few migration tools that may help you make this transition.

Why Is Python Changing?

Python is currently undergoing its most significant transformation since it was released in the early 1990s. Even the revision change from 1.x to 2.x in 2000 was relatively mild—Python 2.0 ran 1.5.2 software just fine back then. One of the main reasons for Python’s stability over the years has been the steadfast determination of the core development team to preserve backward compatibility. Over the years, however, certain “sticky” flaws (issues that stick around from release to release) were identified by creator Guido van Rossum [Regrets], Andrew Kuchling [Warts], and other users. Their persistence made it clear that a release with hard changes was needed to ensure that the language evolved. Python 3.0 marks the first time that a Python interpreter has been released that (deliberately) breaks the backward-compatibility trend.

What Is Changing?

The changes in Python 3.0 are not mind-boggling—it’s not like you won’t recognize Python any more. The remainder of this appendix gives an overview of some of the major changes:

• print becomes print().

• Strings are cast into Unicode by default.

• There is a single class type.

• The syntax for exceptions has been updated.

• Integers have been updated.

• Iterables are used everywhere.

print Becomes print()

The switch to print() is easily the change that breaks the most existing Python code. Why is Python changing print from a statement to a built-in function (BIF)? Having print as a statement is limiting in many regards, as detailed by Guido in his “Python Regrets” talk, which outlined what he feels are shortcomings of the language. In particular, a statement is hard to extend or improve. When print() is available as a function, however, new keyword parameters can be added, certain standard behaviors can be overridden with keyword parameters, and print() can be replaced if desired, just like any other BIF. Here are “before” and “after” examples:

Python 2.x

image

Python 3.x

image

The omission of a comma between 'Python' and 'is' is deliberate, meant to show you that direct string literal concatenation has not changed. More examples can be found in the “What’s New in Python 3.0” document; in addition, more information about this change is available in PEP 3105.
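As a minimal sketch of the keyword parameters mentioned above, the following Python 3 snippet uses the sep, end, and file arguments of print(). The in-memory buffer and the variable names are illustrative choices made here, not part of the original example.

```python
import io

# print() is an ordinary built-in function in Python 3, so its behavior
# is adjusted with keyword arguments instead of special statement syntax.
buffer = io.StringIO()  # in-memory text stream standing in for a file

# sep replaces the default single space between arguments,
# end replaces the default trailing newline,
# file redirects the output away from sys.stdout.
print('Python', 'is', 'great', sep='-', end='!\n', file=buffer)

captured = buffer.getvalue()
print(captured, end='')  # writes: Python-is-great!
```

In Python 2.x, the same redirection required the special form print >>stream, ..., which is one of the irregularities the function form removes.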

Strings: Unicode by Default

The next gotcha that current Python users face is that strings are now Unicode by default. This change couldn’t come soon enough. Hardly a day goes by without Python developers running into a problem when mixing Unicode and regular ASCII strings that looks something like this:

image

These types of errors will no longer be an everyday occurrence in 3.x. (For more information on using Unicode in Python, see the Unicode HOWTO document.) With the model adopted by the new version of Python, users shouldn’t even use those terms (Unicode and ASCII/non-Unicode strings) anymore. The “What’s New in Python 3.0” document sums up this new model pretty explicitly:

Python 3.0 uses the concepts of text and (binary) data instead of Unicode strings and 8-bit strings. All text is Unicode; however, encoded Unicode is represented as binary data. The type used to hold text is str, and the type used to hold data is bytes.

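As far as syntax goes, because Unicode is now the default, the leading u or U is no longer needed. (Python 3.0 through 3.2 reject the prefix outright; Python 3.3 and later accept it again for compatibility with 2.x code.) Literals for the new bytes objects, on the other hand, require a leading b or B (more information can be found in PEP 3112).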
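The text/data split can be sketched in a few lines of Python 3. The sample word and the variable names are chosen here purely for illustration.

```python
# str holds Unicode text; bytes holds encoded binary data.
text = 'caf\u00e9'              # four characters, the last one non-ASCII
data = text.encode('utf-8')     # encoding text produces bytes

text_length = len(text)         # counts characters: 4
data_length = len(data)         # counts bytes: 5 (the accented e takes two)

round_trip = data.decode('utf-8')   # decoding bytes produces text again

literal = b'abc'                # bytes literals need the b (or B) prefix

# The two types no longer mix implicitly; combining them is an error.
try:
    text + data
except TypeError as exc:
    mixing_error = exc

# bytearray is the mutable counterpart of bytes.
scratch = bytearray(b'abc')
scratch[0] = ord('x')           # in-place change, impossible with bytes
```

Because the conversion between the two types is always explicit, the encoding in use is visible at the point where text becomes data, rather than being guessed by the interpreter.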

Table C.1 compares the various string types, showing how they will change from Python 2.x to 3.x. The table also includes a mention of the new mutable bytearray type.

Table C.1. Strings in Python 2 and 3

image

Single Class Type

Prior to Python 2.2, Python’s objects didn’t behave like classes in other languages: Classes were “class” objects and instances were “instance” objects. This is in stark contrast to what people perceive as “normal”: Classes are types and instances are objects of such types. Because of this “flaw,” you could not subclass data types and modify them. In Python 2.2, the core development team came up with “new-style classes,” which act more like what people expect. Furthermore, this change meant that regular Python types could be subclassed—a change described in Guido’s “Unifying Types and Classes in Python 2.2” essay. Python 3.0 supports only new-style classes.
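A brief sketch of what the unified model allows: in Python 3, every class is a type and derives from object, and built-in types can be subclassed directly. The Celsius and Plain classes below are hypothetical names invented for this illustration.

```python
class Celsius(float):
    """A float subclass that adds a unit to its string form."""

    def __str__(self):
        return '%.1f C' % float(self)


class Plain:
    """No explicit base class; Python 3 still makes it new-style."""
    pass


temp = Celsius(21.5)
label = str(temp)                 # uses the overridden __str__
is_float = isinstance(temp, float)
total = temp + 1                  # arithmetic inherited from float

plain_is_type = type(Plain) is type
plain_bases = Plain.__mro__       # (Plain, object)
```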

Updated Syntax for Exceptions

Exception Handling

In the past, the syntax to catch an exception and the exception argument/instance had the following form:

except ValueError, e:

To catch multiple exceptions with the same handler, the following syntax was used:

except (ValueError, TypeError), e:

The required parentheses confused some users, who often attempted to write invalid code:

except ValueError, TypeError, e:

The (new) as keyword is intended to ensure that you do not become confused by the comma in the original syntax; however, the parentheses are still required when you’re trying to catch more than one type of exception using the same handler. Here are two equivalent examples of the new syntax that demonstrate this change:

except ValueError as e:

except (ValueError, TypeError) as e:

Python 2.6 accepts both forms when creating exception handlers, thereby facilitating the porting process. More information about this change can be found in PEP 3110.
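Here is a small runnable sketch of the as form with a parenthesized tuple of exception types. The parse_number() helper is a hypothetical function written only for this illustration.

```python
def parse_number(value):
    """Convert value to an int, or describe why the conversion failed."""
    try:
        return int(value)
    except (ValueError, TypeError) as e:
        # One handler covers both exception types; e is the instance.
        return 'cannot convert: %s' % type(e).__name__


good = parse_number('42')        # 42
bad_text = parse_number('abc')   # int('abc') raises ValueError
bad_type = parse_number(None)    # int(None) raises TypeError
```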

Exception Raising

The most popular syntax for raising exceptions in Python 2.x looks like this:

raise ValueError, e

To truly emphasize that you are creating an instance of an exception, the only syntax supported in Python 3.x is this:

raise ValueError(e)
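The instance-creating form looks like this in a complete snippet; check_age() is a hypothetical helper used only to give the raise statement some context.

```python
def check_age(age):
    """Return age unchanged, rejecting negative values."""
    if age < 0:
        # The exception is constructed like any other object.
        raise ValueError('age must be non-negative, got %d' % age)
    return age


try:
    check_age(-1)
except ValueError as e:
    message = str(e)
```

This form also runs unchanged under Python 2.x, so adopting it early costs nothing.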

Updates to Integers

Single Integer Type

Python’s two different integer types, int and long, began their unification in Python 2.2. That change is now almost complete, with the “new” int behaving like a long. As a consequence, OverflowError exceptions no longer occur when you exceed the native integer size, and the trailing L has been dropped. This change is outlined in PEP 237. long still exists in Python 2.x but has disappeared in Python 3.0.
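To illustrate the single integer type, this sketch steps past the native word size in Python 3. The variable names are illustrative.

```python
import sys

# sys.maxsize is the largest native index value on this platform;
# going beyond it neither overflows nor switches to a different type.
big = sys.maxsize + 1
power = 2 ** 100

big_type = type(big)             # int
power_text = repr(power)         # no trailing L in the representation
```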

Changes to Division

In Python 2.x, the division operator (/) doesn’t give the answer that users new to programming expect, so it has been changed in 3.x. If this change has brought any controversy, it is simply that programmers are used to the floor division functionality. To see how the confusion arises, try to convince a newbie to programming that 1 divided by 2 is 0 (1 / 2 == 0). The simplest way to describe this change is with examples. The ones that follow are excerpted from “Keeping Up with Python: The 2.2 Release,” found in the July 2002 issue of Linux Journal. You can also find out more about this update in PEP 238.

Classic Division

The default Python 2.x division operation works this way: Given two integer operands, / performs integer floor division (the result is rounded down, so the fraction is discarded as in the earlier example). If there is at least one float involved, true division occurs:

image

True Division

In Python 3.x, given any two numeric operands, / will always return a float:

image

To try true division starting in Python 2.2, you can either import division from the __future__ module or use the -Qnew switch.

Floor Division

The double-slash division operator (//) was added in Python 2.2 to always perform floor division regardless of operand type and to begin the transition process:

image
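As a quick sketch of the two operators under Python 3 (the operands and variable names are chosen here for illustration):

```python
# / is true division in Python 3: the result is always a float.
true_quotient = 1 / 2        # 0.5
exact_quotient = 4 / 2       # 2.0, a float even when the division is exact

# // is floor division regardless of operand type.
floor_quotient = 1 // 2      # 0
float_floor = 1.0 // 2       # 0.0, floored but still a float

# Floor division rounds toward negative infinity, not toward zero.
negative_floor = -1 // 2     # -1
```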

Binary and Octal Literals

Minor changes to integer literals were added in Python 2.6 to make the nondecimal formats (hexadecimal, octal, and the new binary) consistent. Hexadecimal representation stayed the same, with its leading 0x or 0X. Octal literals had formerly led with a single 0; this format proved confusing to some users, so the prefix has been changed to 0o for consistency. Instead of 0177, you now write 0o177. Finally, the new binary literal lets you provide the bits of an integer value, prefixed with a leading 0b, as in 0b0110. Python 2.6 still accepts the old 0177 form, but Python 3.0 does not. More information on integer literals updates can be found in PEP 3127.
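The three prefixes can be seen side by side in this short sketch, which also uses the matching built-in conversion functions. The variable names are illustrative.

```python
octal_value = 0o177      # octal literal, decimal 127
binary_value = 0b0110    # binary literal, decimal 6
hex_value = 0x7F         # hexadecimal literal, decimal 127

# oct(), bin(), and hex() produce strings that use the same prefixes.
octal_text = oct(127)    # '0o177'
binary_text = bin(6)     # '0b110'
hex_text = hex(127)      # '0x7f'
```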

Iterables Everywhere

Another theme inherent to Python 3.x is memory conservation. Using iterators is much more efficient than maintaining entire lists in memory, especially when the target action on the objects in question is iteration. There’s no need to waste memory when it’s not necessary. Thus, in Python 3.x, code that returned lists in earlier versions of the language no longer does so. For example, the functions map(), filter(), range(), and zip(), plus the dictionary methods keys(), items(), and values(), all return an iterator or another lazy iterable (range() returns a range object, and the dictionary methods return view objects). Yes, this behavior may be more inconvenient if you want to glance at your data, but it’s better in terms of resource consumption. The changes are mostly under the covers: if you only use the functions’ return values to iterate over, you won’t notice a thing!
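The following Python 3 sketch shows what this means in practice: the results are consumed lazily, and list() is needed only when an actual list is wanted. The sample data and variable names are illustrative.

```python
# map() returns a lazy iterator rather than a list.
squares = map(lambda x: x * x, range(4))
squares_is_list = isinstance(squares, list)   # False
squares_list = list(squares)                  # materialize on demand

# A map iterator is exhausted after one pass.
second_pass = list(squares)                   # []

# zip() is lazy as well.
pairs = list(zip('ab', [1, 2]))

# Dictionary methods return views that track the dictionary.
ages = {'ann': 31, 'bob': 27}
names = ages.keys()
ages['cy'] = 40                               # the view sees the new key
sorted_names = sorted(names)
```

Note the exhausted second pass: code that iterated over a map() or filter() result twice in Python 2.x needs a list() call when ported.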

Migration Tools

As you have seen, most of the Python 3.x changes do not represent some wild mutation of the familiar Python syntax. Instead, the changes are just enough to break the old code base. Of course, the changes affect users, so a good transition plan is clearly needed—and most good ones come with good tools or aids to help you out. For example, both the 2to3 code converter and the latest Python 2.x release (2.6 at the time of this writing) may facilitate the transition.

2to3 Tool

The 2to3 tool will take Python 2.x code and attempt to generate a working equivalent in Python 3.x. Here are some of the actions it performs:

• Converts a print statement to a print() function

• Removes the L long suffix

• Replaces <> with !=

• Changes backtick-quoted strings (`...`) to repr(...)

This tool does a lot of the manual labor—but not everything; the rest is up to you. You can read more about porting suggestions and the 2to3 tool in the “What’s New in Python 3.0” document as well as at the tool’s Web page (http://docs.python.org/3.0/library/2to3.html).

Python 2.6

Because of the compatibility issue, the releases of Python that lead up to 3.0 play a much more significant role in the transition. Of particular note is Python 2.6, the first and most pivotal of such releases. For users, it represents the first time that they can start coding against the 3.x family of releases, as many 3.x features have been backported to 2.x. Whenever possible, Python 2.6 incorporates new features and syntax from version 3.0 while remaining compatible with existing code by not removing older features or syntax. Such features are described in the “What’s New in Python 2.6” document. We detail some of these 2.6 migration features in Appendix D.

Conclusion

Overall, the changes outlined in this appendix do have a high impact in terms of updates required to the interpreter but should not radically change the way programmers write their Python code. It’s simply a matter of changing old habits, such as using parentheses with print—or rather, print(). Once you’ve gotten these changes under your belt, you’re well on your way to being able to effectively jump to the new platform. It may be a bit startling at first, but these changes have been coming for some time now. Don’t panic: Python 2.x will live on for a long time to come. The transition will be slow, deliberate, pain resistant, and even keeled. Welcome to the dawn of the next generation!

References

Andrew Kuchling, “Python Warts,” July 2003, http://web.archive.org/web/20070607112039/http://www.amk.ca/python/writing/warts.html

A. M. Kuchling, “What’s New in Python 2.6,” December 2008, http://docs.python.org/whatsnew/2.6.html

Wesley J. Chun, “Keeping Up with Python: The 2.2 Release,” July 2002, http://www.linuxjournal.com/article/5597

PEP Index, http://www.python.org/dev/peps

“Unicode HOWTO,” December 2008, http://docs.python.org/3.0/howto/unicode.html

Guido van Rossum, “Python Regrets,” July 2002, http://www.python.org/doc/essays/ppt/regrets/PythonRegrets.pdf

Guido van Rossum, “Unifying Types and Classes in Python 2.2,” April 2002, http://www.python.org/2.2.3/descrintro.html

Guido van Rossum, “What’s New in Python 3.0,” December 2008, http://docs.python.org/3.0/whatsnew/3.0.html
