Appendix D. Migrating to Python 3 Starts with 2.6

Python 3: The Next Generation

Python is currently undergoing its most significant transformation since it was first released back in the winter of 1991. Python 3 is backward incompatible with all older versions, so porting will be a more significant issue than in the past.

Unlike other end-of-life efforts, however, Python 2.x will not disappear anytime soon. In fact, the remainder of the 2.x series will be developed in parallel with 3.x, thereby ensuring a smooth transition from the current to next generation. Python 2.6 is the first of these final 2.x releases.

Hybrid 2.6 as Transition Tool

Python 2.6 is a hybrid interpreter. That means it can run both 1.x and 2.x software as well as some 3.x code. Some will argue that Python releases dating back to 2.2 have already been mixed interpreters because they support creation of both classic classes as well as new-style classes, but that is as far as they go.

Version 2.6 is many steps ahead, and if they are eventually released, versions 2.7 and beyond will be even more so. The 2.6 release is the first version with specific 3.0 features backported to it. The most significant of these features are summarized here:

• Integers
    • Single integer type
    • New binary and modified octal literals
    • Classic or true division
    • The -Q division switch
• Built-in functions
    • print or print()
    • reduce()
    • Other updates
• Object-oriented programming
    • Two different class objects
• Strings
    • bytes literals
    • bytes type
• Exceptions
    • Handling exceptions
    • Raising exceptions
• Other transition tools and tips
    • Warnings: the -3 switch
    • 2to3 tool

This appendix does not discuss other new features of 2.6 that are standalone features, meaning they do not have any consequences for porting applications to 3.x. So without further ado . . .

Integers

Python integers face several changes in version 3.x and beyond, relating to their types, literals, and the integer division operation. We describe each of these changes next, highlighting the role that 2.6 plays in terms of migration.

Single Integer Type

Previous versions of Python featured two integer types, int and long. The original ints were limited in size to the architecture of the platform on which the code ran (i.e., 32-bit, 64-bit), while longs were unlimited in size except in terms of how much virtual memory the operating system provided. The process of unifying these two types into a single int type began in Python 2.2 and will be complete in version 3.0.[a] The new single int type will be unlimited in size, and the previous L or l designation for longs is removed. You can read more about this change in PEP 237.

[a] The bool type also might be considered part of this equation, because bools behave like 0 and 1 in numerical situations rather than having their natural values of False and True, respectively.

As of 2.6, there is little trace of long integers, save for the support of the trailing L. It is included for backward-compatibility purposes, to support all code that uses longs. Nevertheless, users should be actively purging long integers from their existing code and should no longer use longs in any new code written against Python 2.6+.
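As a rough illustration of the unified type, the following sketch assumes a Python 3 interpreter (under 2.6 the large values would still report as long). The variable names are only for illustration:

```python
import sys

# Assumes Python 3: there is one integer type, and no trailing L is needed.
big = sys.maxsize + 1          # one past the largest native machine word
huge = 2 ** 100                # far beyond any 64-bit value

print(type(big).__name__)      # int
print(type(huge).__name__)     # int
print(huge)
```

Code written this way never mentions long at all, which is the habit the unification is meant to encourage.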

New Binary and Modified Octal Literals

Python 3 features a minor revision to the alternative base format for integers. It has basically streamlined the syntax to make it consistent with the existing hexadecimal format, prefixed with a leading 0x (or 0X for capital letters)—for example, 0x80, 0xffff, 0xDEADBEEF.

A new binary literal lets you provide the bits to an integer number, prefixed with a leading 0b (e.g., 0b0110). The original octal representation began with a single 0, but this format proved confusing to some users, so it has been changed to 0o to bring it in line with hexadecimal and binary literals as just described. In other words, 0177 is no longer allowed; you must use 0o177 instead. Here are some examples:

Python 2

>>> 0177
127

Python 3 (including versions 2.6+)

>>> 0o177
127
>>> 0b0110
6

Both the new binary and modified octal literal formats have been backported to 2.6 to help with migration. In fact, 2.6, in its role as a transition tool, accepts both octal formats, whereas 3.0 no longer accepts the old 0177 format. More information on the updates to integer literals can be found in PEP 3127.
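The literal forms can be tried side by side. This is a minimal sketch, assuming Python 3 (the literals themselves are also accepted by 2.6+); legacy_text is a made-up sample showing one way to handle old-style octal text that arrives as data rather than as source code:

```python
# The new-style literals are accepted by Python 2.6+ and Python 3.
octal_value = 0o177
binary_value = 0b0110
hex_value = 0x80

print(octal_value, binary_value, hex_value)

# int() with an explicit base parses digit strings without any prefix,
# which is handy when old-style octal text such as "0177" arrives as data.
legacy_text = "0177"
parsed = int(legacy_text, 8)
print(parsed == octal_value)
```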

Classic or True Division

A change that has been a long time coming, yet remains controversial to many, is the change to the division operator (/). The traditional division operation works in the following way: Given two integer operands, / performs integer floor division. If there is at least one float involved, true division occurs:

Python 2.x: Classic Division

[Listing not reproduced]

In Python 3, the / operator will always return a float regardless of operand type:

Python 3.x: True Division

[Listing not reproduced]

The double-slash division operator (//) was added as a proxy in Python 2.2 to always perform floor division regardless of the operand type and to begin the transition process:

Python 2.2+ and 3.x: Floor Division

[Listing not reproduced]

Using // will be the only way to obtain floor division functionality in 3.x. To try true division in Python 2.2+, you can add the line from __future__ import division to your code, or use the -Q command-line option (discussed next). (There is no additional porting assistance available in Python 2.6.)
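The three behaviors can be compared in a short sketch. It assumes Python 3 semantics; on Python 2.2+ the same results require from __future__ import division as the first statement of the module:

```python
# Assumes Python 3 semantics. On Python 2.2+, the same results require
# "from __future__ import division" as the first statement of the module.

true_quotient = 7 / 2        # true division: always a float
floor_quotient = 7 // 2      # floor division: rounds toward negative infinity
negative_floor = -7 // 2     # floors to -4, not -3
float_floor = 7.0 // 2       # floor division of a float stays a float

print(true_quotient, floor_quotient, negative_floor, float_floor)
```

Code that relied on integer operands truncating under / is the most likely to change behavior silently, so replacing those uses with // before porting is a reasonable precaution.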

Python 2.2+: Division Command-Line Option

If you do not wish to import division from the __future__ module in your code but want true division to always prevail, you can use the -Qnew switch. There are also other options for using -Q, as summarized in Table D.1.

Table D.1. Division Operation -Q Command-Line Options

[Table not reproduced]

For example, the -Qwarnall option is used in the Tools/scripts/fixdiv.py script found in the Python source distribution.

As you may have guessed by now, all of the transition efforts have already been implemented in Python 2.2, and no new 2.6-specific functionality has been added with respect to Python 3 migration. Table D.2 summarizes the division operators and their functionality in the various Python releases.

Table D.2. Python Release Default Division Operator Functionality

[Table not reproduced]

You can read more about the change to the division operator in PEP 238 as well as in an article called “Keeping Up with Python: The 2.2 Release” I wrote for Linux Journal back in July 2002.

Built-in Functions

print Statement or print() Function

It’s no secret that one of the most common causes of breakage between Python 2.x and 3.x applications is the change in the print statement, which becomes a built-in function in version 3.x. This change allows print() to be more flexible, upgradeable, and swappable if desired.

Python 2.6+ supports either the print statement or the print() built-in function. The default is the former usage, as it should be in a 2.x language. To discard the print statement and go with only the function in a “Python 3 mode” application, you would simply import print_function from __future__:

[Listing not reproduced]

The preceding example demonstrates the power of print() being a function. Using the print statement, we display the strings "foo" and "bar" to the user, but we cannot change the default delimiter or separator between strings, which is a space. In contrast, print() makes this functionality available in its call as the argument sep, which replaces the default—and allows print to evolve and progress.

Note that this is a “one-way” import, meaning that there is no way to revert print() back to a statement. Even issuing a "del print_function" will not have any effect. This major change is detailed in PEP 3105.
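The flexibility of the function form can be sketched as follows. The example assumes Python 3; on Python 2.6+ the same code needs from __future__ import print_function at the top of the module. The buffer and captured names are illustrative only:

```python
import io

# Assumes Python 3; on Python 2.6+, first add
# "from __future__ import print_function" at the top of the module.
print("foo", "bar")              # default separator is a single space
print("foo", "bar", sep="-")     # sep replaces the default separator

# Because print() is an ordinary function, its output can be redirected
# with the file argument instead of special statement syntax.
buffer = io.StringIO()
print("foo", "bar", sep="", end="!\n", file=buffer)
captured = buffer.getvalue()
```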

reduce() Moved to functools Module

In Python 3.x, the reduce() function, which is neither readily understood nor commonly used by many programmers today, has been “demoted” (much to the chagrin of many Python functional programmers) from being a built-in function to become a functools module attribute. It is available in functools beginning in 2.6.

[Listing not reproduced]
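Importing reduce() from functools works unchanged on 2.6+ and 3.x, so it is a safe edit to make early. A minimal sketch follows; the product() helper is hypothetical and exists only to give reduce() something to do:

```python
from functools import reduce   # available in functools from Python 2.6 on
import operator

def product(numbers):
    """Multiply all numbers together, returning 1 for an empty sequence."""
    return reduce(operator.mul, numbers, 1)

print(product([1, 2, 3, 4, 5]))
```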

Other Updates

One key theme in Python 3.x is the migration to greater use of iterators, especially for built-in functions and methods that have historically returned lists. Still other built-ins are changing because of the updates to integers. The following are the most high-profile built-in functions changed in Python 3.x:

range()

zip()

map()

filter()

hex()

oct()

Starting in Python 2.6, programmers can access the new and updated functions by importing the future_builtins module. Here is an example demonstrating both the old and new oct() and zip() functions:

[Listing not reproduced]

If you want to use only the Python 3.x versions of these functions in your current Python 2.x environment, you can override the old ones by importing all the new functions into your namespace. The following example demonstrates this process with oct():

[Listing not reproduced]
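One way to write the override so that the same file also runs under Python 3 (where future_builtins does not exist) is to guard the import. This is a sketch of that idiom rather than the original listing:

```python
try:
    # Python 2.6/2.7: replace the old built-ins with the 3.x-style versions.
    from future_builtins import filter, hex, map, oct, zip
except ImportError:
    # Python 3: the module does not exist because the built-ins already
    # behave the new way.
    pass

pairs = zip("abc", [1, 2, 3])
print(isinstance(pairs, list))   # the new zip() returns an iterator, not a list
print(list(pairs))
print(oct(8))                    # new-style octal string with the 0o prefix
```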

Object-Oriented Programming: Two Different Class Objects

Python’s original classes are now called “classic classes.” They had many flaws and were eventually replaced by “new-style” classes. The transition began in Python 2.2 and continues today.

Classic classes have the following syntax:

class ClassicClass:
      pass

New-style classes have this syntax:

class NewStyleClass(object):
      pass

New-style classes feature so many more advantages than classic classes that the latter have been preserved only for backward-compatibility purposes and are eliminated entirely in Python 3. With new-style classes, types and classes are finally unified (see Guido’s “Unifying Types and Classes in Python 2.2” essay as well as PEP 252 and PEP 253).

There are no new changes added in Python 2.6 for migration purposes. Just be aware that all 2.2+ versions serve as hybrid interpreters, supporting both kinds of class objects and instances of those classes. In Python 3, both syntaxes shown in the preceding examples result only in new-style classes being created. This behavior does not pose a serious porting issue, but you do need to be aware that classic classes don’t exist in Python 3.
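The effect can be checked directly. The sketch below assumes Python 3; under Python 2.x the first check would print False, because a class defined without a base is a classic class with its own separate type:

```python
class ClassicSyntax:
    pass

class NewStyleSyntax(object):
    pass

# Assumes Python 3, where both definitions produce new-style classes whose
# metaclass is type. Under Python 2.x, type(ClassicSyntax) would instead be
# the separate classobj type.
print(type(ClassicSyntax) is type)
print(type(NewStyleSyntax) is type)
print(issubclass(ClassicSyntax, object))
```

Writing the explicit (object) base in 2.x code gives the same behavior on both sides of the port.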

Strings

One especially notable change in Python 3.x is that the default string type is changing. Python 2.x supports both ASCII and Unicode strings, with ASCII being the default. This support is swapped in Python 3: Unicode becomes the default, and ASCII strings are now called bytes. The bytes data structure contains byte values and really shouldn’t be considered a string (anymore) as much as it is an immutable byte array that contains data.

To remain byte strings in Python 3.x, current string literals will require a leading b or B, and current Unicode string literals will drop their leading u or U. The type and built-in function names will change from str to bytes and from unicode to str. In addition, there is a new mutable “string” type called bytearray that, like bytes, is also a byte array, only mutable.
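The difference between the immutable and mutable byte types can be sketched in a few lines that behave the same way on 2.6+ and 3.x (the variable names are illustrative):

```python
immutable = b"spam"                 # bytes literal (a str synonym in 2.6)
mutable = bytearray(b"spam")        # mutable counterpart, also in 2.6+

mutable[0] = ord("S")               # item assignment works on bytearray
print(mutable)

try:
    immutable[0] = ord("S")         # bytes (like str) rejects item assignment
except TypeError as exc:
    print("bytes is immutable:", exc)
```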

You can find out more about using Unicode strings in the HOWTO and learn about the changes coming to string types in PEP 3137. Refer to Table C.1 for a chart on the various string types in both Python 2 and Python 3.

bytes Literals

To smooth the way for using bytes objects in Python 3.x, you can optionally prepend a regular ASCII/binary string in Python 2.6 with a leading b or B, thereby creating bytes literals (b'' or B'') as synonyms for str literals (''). The leading indicator has no bearing on any str object itself or any of the object’s operations (it is purely decorative), but it does prepare you for situations in Python 3 where you need to create such a literal. You can find out more about bytes literals in PEP 3112.

bytes is str

It should not require much of a stretch of the imagination to recognize that if bytes literals are supported, then bytes objects themselves need to exist in Python 2.6. Indeed, the bytes type is synonymous with str, so much so that

>>> bytes is str
True

Thus you can use bytes or bytes() in Python 2.6 wherever you use str or str(). Further information on bytes objects can be found in PEP 358.
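Keep in mind that this identity holds only in the 2.x line. The following sketch, which uses a made-up RUNNING_PYTHON_2 flag, shows the same two checks giving opposite answers depending on the interpreter:

```python
import sys

RUNNING_PYTHON_2 = sys.version_info[0] == 2

# In 2.6+, bytes is just another name for str; in 3.x they are distinct types.
print(bytes is str)

# The literal prefix follows the same rule: b'' is a str under 2.6 but a
# separate bytes object under 3.x.
print(type(b"data") is str)
```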

Exceptions

Python 2.6 has several features that allow for porting of exception handling and raising exceptions in Python 3.x.

Handling Exceptions (Using as)

Python 3’s syntax for catching and handling a single exception looks like this:

except ValueError as e:

The e variable contains the instance of the exception that provides the reason why the error was thrown. It is optional, as is the entire as e phrase. Thus this change really applies only to those users who save this value.

The equivalent Python 2 syntax uses a comma instead of the as keyword:

except ValueError, e:

This change was made in Python 3.x because of the confusion that occurs when programmers attempt to handle more than one exception with the same handler.

To catch multiple exceptions with the same handler, beginners often write this (invalid) code:

except ValueError, TypeError, e:

In fact, if you are trying to catch more than one exception, you need to use a tuple containing the exceptions:

except (ValueError, TypeError), e:

The as keyword in Python 3.0 (and 2.6+) is intended to ensure that the comma in the original syntax is no longer a source of confusion. However, the parentheses are still required when you are trying to catch more than one type of exception using the same handler:

except (ValueError, TypeError) as e:

For porting efforts, Python 2.6+ accepts either the comma or as when defining exception handlers that save the instance. In contrast, only the idiom with as is permitted in Python 3. More information about this change can be found in PEP 3110.
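A handler written with as and a parenthesized tuple runs unchanged on 2.6+ and 3.x. Here is a minimal sketch; parse_count() is a hypothetical helper used only to trigger both exception types:

```python
def parse_count(raw):
    """Return raw converted to int, or None if it cannot be converted."""
    try:
        return int(raw)
    except (ValueError, TypeError) as e:   # tuple plus "as": valid in 2.6+ and 3.x
        print("could not convert %r: %s" % (raw, e))
        return None

print(parse_count("42"))
print(parse_count("forty-two"))   # int() raises ValueError
print(parse_count(None))          # int() raises TypeError
```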

Raising Exceptions

The change in raising exceptions found in Python 3.x really isn’t a change at all; in fact, it doesn’t even have anything to do with the transition efforts associated with Python 2.6. Python 3’s syntax for raising exceptions (providing the optional reason for the exception) looks like this:

raise ValueError('Invalid value')

Long-time Python users have probably been using the following idiom (although both approaches are supported in all 2.x releases):

raise ValueError, 'Invalid value'

To emphasize that raising exceptions is equivalent to instantiating an exception class and to provide some additional flexibility, Python 3 supports only the first idiom. The good news is that you don’t have to wait until you adopt 2.6 to start using this technique—the syntax with parentheses has actually been valid since the Python 1 days.
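Since the call syntax makes the exception an ordinary instance, the reason is available on the caught object. A short sketch, with check_positive() as a hypothetical example function:

```python
def check_positive(value):
    """Raise ValueError with a reason unless value is greater than zero."""
    if value <= 0:
        raise ValueError('Invalid value')   # instantiate, then raise
    return value

try:
    check_positive(-1)
except ValueError as e:
    reason = e.args[0]      # the reason passed to the constructor
    print(reason)
```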

Other Transition Tools and Tips

In addition to Python 2.6, developers have access to an array of tools that can make the transition to Python 3.x go more smoothly—in particular, the -3 switch (which provides obsolescence warnings) and the 2to3 tool (read more about it at http://docs.python.org/3.0/library/2to3.html). However, the most important tool that you can “write” is a good transition plan. In fact, there’s no substitute for planning.

Clearly, the Python 3.x changes do not represent some wild mutation of the familiar Python syntax. Instead, the variations are just enough to break the old code base. Of course, the changes will affect users, so a good transition plan is essential. Most good plans come with tools or aids to help you out in this regard. The porting recommendations in the “What’s New in Python 3.0” document specifically state that good testing of code is critical, in addition to the use of key tools (i.e., the 2to3 code conversion tool and Python 2.6). Without mincing words, here is exactly what is suggested at http://docs.python.org/3.0/whatsnew/3.0.html#porting-to-python-3-0:

  1. (Prerequisite) Start with excellent test coverage.
  2. Port to Python 2.6. This should involve no more work than the average port from Python 2.x to Python 2.(x+1). Make sure that all your tests pass.
  3. (Still using 2.6) Turn on the -3 command-line switch. It enables warnings about features that will be removed (or will change) in Python 3.0. Run your test suite again, and fix any code that generates warnings. Make sure that all your tests still pass.
  4. Run the 2to3 source-to-source translator over your source code tree. Run the result of the translation under Python 3.0. Manually fix any remaining issues, and continue fixing problems until all tests pass again.
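Steps 2 through 4 can be written down as command lines and kept alongside the project. The sketch below only builds the argument lists; the interpreter names python2.6 and python3.0, the run_tests.py script, and the src directory are placeholders assumed for illustration, not names taken from the recommendations:

```python
def porting_commands(source_dir, test_script="run_tests.py"):
    """Return the command lines for steps 2-4 as lists of arguments.

    run_tests.py and the interpreter names are placeholders for
    illustration; substitute whatever your project actually uses.
    """
    return [
        ["python2.6", test_script],          # step 2: tests pass on 2.6
        ["python2.6", "-3", test_script],    # step 3: same tests with -3 warnings
        ["2to3", "-w", source_dir],          # step 4: translate the tree in place
        ["python3.0", test_script],          # step 4: rerun tests under 3.0
    ]

for command in porting_commands("src"):
    print(" ".join(command))
```

Because 2to3 -w rewrites files in place, it is safest to run it on a copy or a branch of the source tree.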

Conclusion

We know big changes are coming in the next generation of Python, simply because 3.x code is backward incompatible with any older releases. The changes, although significant, won’t require entirely new ways of thinking for programmers—though there is obvious code breakage. To ease the transition period, current and future releases of the remainder of the 2.x interpreters will contain 3.x-backported features.

Python 2.6 is the first of the “dual-mode” interpreters that will allow users to start programming against the 3.0 code base. You can read more about all of the new 2.6 features (not just the ones applicable to the 3.x transition) in the “What’s New in Python 2.6” document. Python 2.6 runs all 2.x software as well as understands some 3.x code. In this way, the 2.6 release helps simplify the porting and migration process and eases users gently into the next generation of Python programming.

Online References

Wesley J. Chun, “Keeping Up with Python: The 2.2 Release,” July 2002, http://www.linuxjournal.com/article/5597

A. M. Kuchling (amk at amk.ca), “What’s New in Python 2.6,” December 2008, http://docs.python.org/whatsnew/2.6.html

PEP Index, http://www.python.org/dev/peps

Unicode HOWTO, http://docs.python.org/3.0/howto/unicode.html

Guido van Rossum (guido at python.org), “Unifying Types and Classes in Python 2.2” (second draft, published for 2.2.3), April 2002, http://www.python.org/2.2.3/descrintro.html

Guido van Rossum (guido at python.org), “What’s New in Python 3.0,” December 2008, http://docs.python.org/3.0/whatsnew/3.0.html
