Chapter 5. Numeric Types

This chapter begins our in-depth tour of the Python language. In Python, data takes the form of objects—either built-in objects that Python provides, or objects we create using Python tools and other languages such as C. In fact, objects are the basis of every Python program you will ever write. Because they are the most fundamental notion in Python programming, objects are also our first focus in this book.

In the preceding chapter, we took a quick pass over Python’s core object types. Although essential terms were introduced in that chapter, we avoided covering too many specifics in the interest of space. Here, we’ll begin a more careful second look at data type concepts, to fill in details we glossed over earlier. Let’s get started by exploring our first data type category: Python’s numeric types.

Numeric Type Basics

Most of Python’s number types are fairly typical and will probably seem familiar if you’ve used almost any other programming language in the past. They can be used to keep track of your bank balance, the distance to Mars, the number of visitors to your website, and just about any other numeric quantity.

In Python, numbers are not really a single object type, but a category of similar types. Python supports the usual numeric types (integers and floating points), as well as literals for creating numbers and expressions for processing them. In addition, Python provides more advanced numeric programming support and objects for more advanced work. A complete inventory of Python’s numeric toolbox includes:

  • Integers and floating-point numbers

  • Complex numbers

  • Fixed-precision decimal numbers

  • Rational fraction numbers

  • Sets

  • Booleans

  • Unlimited integer precision

  • A variety of numeric built-ins and modules

This chapter starts with basic numbers and fundamentals, then moves on to explore the other tools in this list. Before we jump into code, though, the next few sections get us started with a brief overview of how we write and process numbers in our scripts.

Numeric Literals

Among its basic types, Python provides integers (positive and negative whole numbers) and floating-point numbers (numbers with a fractional part, sometimes called “floats” for economy). Python also allows us to write integers using hexadecimal, octal, and binary literals; offers a complex number type; and allows integers to have unlimited precision (they can grow to have as many digits as your memory space allows). Table 5-1 shows what Python’s numeric types look like when written out in a program, as literals.

Table 5-1. Basic numeric literals

Literal

Interpretation

1234, -24, 0, 99999999999999

Integers (unlimited size)

1.23, 1., 3.14e-10, 4E210, 4.0e+210

Floating-point numbers

0177, 0x9ff, 0b101010

Octal, hex, and binary literals in 2.6

0o177, 0x9ff, 0b101010

Octal, hex, and binary literals in 3.0

3+4j, 3.0+4.0j, 3J

Complex number literals

In general, Python’s numeric type literals are straightforward to write, but a few coding concepts are worth highlighting here:

Integer and floating-point literals

Integers are written as strings of decimal digits. Floating-point numbers have a decimal point and/or an exponent, which is introduced by an e or E and may carry an optional sign. If you write a number with a decimal point or exponent, Python makes it a floating-point object and uses floating-point (not integer) math when the object is used in an expression. Floating-point numbers are implemented as C “doubles,” and therefore get as much precision as the C compiler used to build the Python interpreter gives to doubles.

Integers in Python 2.6: normal and long

In Python 2.6 there are two integer types, normal (32 bits) and long (unlimited precision), and an integer may end in an l or L to force it to become a long integer. Because integers are automatically converted to long integers when their values overflow 32 bits, you never need to type the letter L yourself—Python automatically converts up to long integer when extra precision is needed.

Integers in Python 3.0: a single type

In Python 3.0, the normal and long integer types have been merged—there is only integer, which automatically supports the unlimited precision of Python 2.6’s separate long integer type. Because of this, integers can no longer be coded with a trailing l or L, and integers never print with this character either. Apart from this, most programs are unaffected by this change, unless they do type testing that checks for 2.6 long integers.

Hexadecimal, octal, and binary literals

Integers may be coded in decimal (base 10), hexadecimal (base 16), octal (base 8), or binary (base 2). Hexadecimals start with a leading 0x or 0X, followed by a string of hexadecimal digits (0-9 and A-F). Hex digits may be coded in lower- or uppercase. Octal literals start with a leading 0o or 0O (zero and lower- or uppercase letter “o”), followed by a string of digits (0-7). In 2.6 and earlier, octal literals can also be coded with just a leading 0, but not in 3.0 (this original octal form is too easily confused with decimal, and is replaced by the new 0o format). Binary literals, new in 2.6 and 3.0, begin with a leading 0b or 0B, followed by binary digits (0-1).

Note that all of these literals produce integer objects in program code; they are just alternative syntaxes for specifying values. The built-in calls hex(I), oct(I), and bin(I) convert an integer to its representation string in these three bases, and int(str, base) converts a runtime string to an integer per a given base.

Complex numbers

Python complex literals are written as realpart+imaginarypart, where the imaginarypart is terminated with a j or J. The realpart is technically optional, so the imaginarypart may appear on its own. Internally, complex numbers are implemented as pairs of floating-point numbers, but all numeric operations perform complex math when applied to complex numbers. Complex numbers may also be created with the complex(real, imag) built-in call.

Coding other numeric types

As we’ll see later in this chapter, there are additional, more advanced number types not included in Table 5-1. Some of these are created by calling functions in imported modules (e.g., decimals and fractions), and others have literal syntax all their own (e.g., sets).
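As a brief illustration of the literal forms described above, the following sketch assumes Python 3.X syntax (the 0o octal form); its variable names are purely illustrative. It suggests that literals in other bases produce ordinary integer objects, and that a complex literal and the complex built-in build the same value:

```python
# Assumes Python 3.X literal syntax; all names here are illustrative only.

# Four spellings of the same integer value
dec_val = 255
hex_val = 0xFF
oct_val = 0o377
bin_val = 0b11111111
same_int = dec_val == hex_val == oct_val == bin_val

# Integer to string, in each base
as_hex = hex(255)
as_oct = oct(255)
as_bin = bin(255)

# String to integer, per a given base
from_hex = int('ff', 16)
from_bin = int('11111111', 2)

# A decimal point or exponent makes a float; a trailing j makes a complex
float_val = 1e3
complex_lit = 3 + 4j
complex_call = complex(3, 4)

print(same_int, as_hex, as_oct, as_bin)
print(from_hex, from_bin, float_val, complex_lit == complex_call)
```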

Built-in Numeric Tools

Besides the built-in number literals shown in Table 5-1, Python provides a set of tools for processing number objects:

Expression operators

+, -, *, /, >>, **, &, etc.

Built-in mathematical functions

pow, abs, round, int, hex, bin, etc.

Utility modules

random, math, etc.

We’ll meet all of these as we go along.

Although numbers are primarily processed with expressions, built-ins, and modules, they also have a handful of type-specific methods today, which we’ll meet in this chapter as well. Floating-point numbers, for example, have an as_integer_ratio method that is useful for the fraction number type, and an is_integer method to test if the number is an integer. Integers have various attributes, including a new bit_length method in the upcoming Python 3.1 release that gives the number of bits necessary to represent the object’s value. Moreover, as part collection and part number, sets also support both methods and expressions.
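The type-specific methods just mentioned can be tried directly on literals. This is a minimal sketch that assumes Python 3.1 or later, since bit_length is not available in 3.0; the variable names are illustrative only. Note that an integer literal must be wrapped in parentheses before a method call, because a period directly after the digits would be read as a decimal point:

```python
# Assumes Python 3.1 or later; names are illustrative only.

ratio = (0.5).as_integer_ratio()     # Float as a (numerator, denominator) pair
whole = (4.0).is_integer()           # Does the float have an integral value?
fractional = (4.5).is_integer()

bits_255 = (255).bit_length()        # 255 is 0b11111111
bits_256 = (256).bit_length()        # 256 is 0b100000000

print(ratio, whole, fractional, bits_255, bits_256)
```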

Since expressions are the most essential tool for most number types, though, let’s turn to them next.

Python Expression Operators

Perhaps the most fundamental tool that processes numbers is the expression: a combination of numbers (or other objects) and operators that computes a value when executed by Python. In Python, expressions are written using the usual mathematical notation and operator symbols. For instance, to add two numbers X and Y you would say X + Y, which tells Python to apply the + operator to the values named by X and Y. The result of the expression is the sum of X and Y, another number object.

Table 5-2 lists all the operator expressions available in Python. Many are self-explanatory; for instance, the usual mathematical operators (+, -, *, /, and so on) are supported. A few will be familiar if you’ve used other languages in the past: % computes a division remainder, << performs a bitwise left-shift, & computes a bitwise AND result, and so on. Others are more Python-specific, and not all are numeric in nature: for example, the is operator tests object identity (i.e., address in memory, a strict form of equality), and lambda creates unnamed functions.

Table 5-2. Python expression operators and precedence

Operators

Description

yield x

Generator function send protocol

lambda args: expression

Anonymous function generation

x if y else z

Ternary selection (x is evaluated only if y is true)

x or y

Logical OR (y is evaluated only if x is false)

x and y

Logical AND (y is evaluated only if x is true)

not x

Logical negation

x in y, x not in y

Membership (iterables, sets)

x is y, x is not y

Object identity tests

x < y, x <= y, x > y, x >= y, x == y, x != y

Magnitude comparison, set subset and superset; value equality operators

x | y

Bitwise OR, set union

x ^ y

Bitwise XOR, set symmetric difference

x & y

Bitwise AND, set intersection

x << y, x >> y

Shift x left or right by y bits

x + y, x - y

Addition, concatenation; subtraction, set difference

x * y, x % y, x / y, x // y

Multiplication, repetition; remainder, format; division: true and floor

-x, +x

Negation, identity

~x

Bitwise NOT (inversion)

x ** y

Power (exponentiation)

x[i]

Indexing (sequence, mapping, others)

x[i:j:k]

Slicing

x(...)

Call (function, method, class, other callable)

x.attr

Attribute reference

(...)

Tuple, expression, generator expression

[...]

List, list comprehension

{...}

Dictionary, set, set and dictionary comprehensions

Since this book addresses both Python 2.6 and 3.0, here are some notes about version differences and recent additions related to the operators in Table 5-2:

  • In Python 2.6, value inequality can be written as either X != Y or X <> Y. In Python 3.0, the latter of these options is removed because it is redundant. In either version, best practice is to use X != Y for all value inequality tests.

  • In Python 2.6, a backquotes expression `X` works the same as repr(X) and converts objects to display strings. Due to its obscurity, this expression is removed in Python 3.0; use the more readable str and repr built-in functions, described in Numeric Display Formats.

  • The X // Y floor division expression always truncates fractional remainders in both Python 2.6 and 3.0. The X / Y expression performs true division in 3.0 (retaining remainders) and classic division in 2.6 (truncating for integers). See Division: Classic, Floor, and True.

  • The syntax [...] is used for both list literals and list comprehension expressions. The latter of these performs an implied loop and collects expression results in a new list. See Chapters 4, 14, and 20 for examples.

  • The syntax (...) is used for tuples and expressions, as well as generator expressions—a form of list comprehension that produces results on demand, instead of building a result list. See Chapters 4 and 20 for examples. The parentheses may sometimes be omitted in all three constructs.

  • The syntax {...} is used for dictionary literals, and in Python 3.0 for set literals and both dictionary and set comprehensions. See the set coverage in this chapter and Chapters 4, 8, 14, and 20 for examples.

  • The yield and ternary if/else selection expressions are available in Python 2.5 and later. The former returns send(...) arguments in generators; the latter is shorthand for a multiline if statement. yield requires parentheses if not alone on the right side of an assignment statement.

  • Comparison operators may be chained: X < Y < Z produces the same result as X < Y and Y < Z. See Comparisons: Normal and Chained for details.

  • In recent Pythons, the slice expression X[I:J:K] is equivalent to indexing with a slice object: X[slice(I, J, K)].

  • In Python 2.X, magnitude comparisons of mixed types—converting numbers to a common type, and ordering other mixed types according to the type name—are allowed. In Python 3.0, nonnumeric mixed-type magnitude comparisons are not allowed and raise exceptions; this includes sorts by proxy.

  • Magnitude comparisons for dictionaries are also no longer supported in Python 3.0 (though equality tests are); comparing sorted(dict.items()) is one possible replacement.
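A few of the notes above can be checked with a short script. This sketch assumes Python 3.X behavior, and its data and names are made up for illustration. It compares slicing syntax with an explicit slice object, shows that a nonnumeric mixed-type magnitude comparison raises an exception, and uses sorted item lists as one possible stand-in for dictionary magnitude comparisons:

```python
# Assumes Python 3.X; sample data and names are illustrative only.

letters = ['a', 'b', 'c', 'd', 'e', 'f']
by_syntax = letters[1:5:2]            # Slice expression
by_object = letters[slice(1, 5, 2)]   # Same request, as a slice object

# Nonnumeric mixed-type magnitude comparison: an error in 3.X
try:
    1 < 'spam'
    mixed_allowed = True
except TypeError:
    mixed_allowed = False

# Dictionaries: equality still works; compare sorted items for magnitude
d1 = {'a': 1, 'b': 2}
d2 = {'a': 1, 'b': 3}
dicts_equal = d1 == d2
d1_smaller = sorted(d1.items()) < sorted(d2.items())

print(by_syntax, by_object, mixed_allowed, dicts_equal, d1_smaller)
```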

We’ll see most of the operators in Table 5-2 in action later; first, though, we need to take a quick look at the ways these operators may be combined in expressions.

Mixed operators follow operator precedence

As in most languages, in Python, more complex expressions are coded by stringing together the operator expressions in Table 5-2. For instance, the sum of two multiplications might be written as a mix of variables and operators:

A * B + C * D

So, how does Python know which operation to perform first? The answer to this question lies in operator precedence. When you write an expression with more than one operator, Python groups its parts according to what are called precedence rules, and this grouping determines the order in which the expression’s parts are computed. Table 5-2 is ordered by operator precedence:

  • Operators lower in the table have higher precedence, and so bind more tightly in mixed expressions.

  • Operators in the same row in Table 5-2 generally group from left to right when combined (except for exponentiation, which groups right to left, and comparisons, which chain left to right).

For example, if you write X + Y * Z, Python evaluates the multiplication first (Y * Z), then adds that result to X because * has higher precedence (is lower in the table) than +. Similarly, in this section’s original example, both multiplications (A * B and C * D) will happen before their results are added.

Parentheses group subexpressions

You can forget about precedence completely if you’re careful to group parts of expressions with parentheses. When you enclose subexpressions in parentheses, you override Python’s precedence rules; Python always evaluates expressions in parentheses first before using their results in the enclosing expressions.

For instance, instead of coding X + Y * Z, you could write one of the following to force Python to evaluate the expression in the desired order:

(X + Y) * Z
X + (Y * Z)

In the first case, + is applied to X and Y first, because this subexpression is wrapped in parentheses. In the second case, the * is performed first (just as if there were no parentheses at all). Generally speaking, adding parentheses in large expressions is a good idea—it not only forces the evaluation order you want, but also aids readability.
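To make the grouping rules concrete, here is a small sketch with arbitrary sample values (the names are illustrative only). It contrasts the default precedence with explicit parentheses, and shows the right-to-left grouping of exponentiation mentioned earlier:

```python
# Sample values are arbitrary; names are illustrative only.
X, Y, Z = 2, 3, 4

default_order = X + Y * Z       # * binds tighter: 2 + (3 * 4)
mult_first = X + (Y * Z)        # Same grouping, made explicit
add_first = (X + Y) * Z         # Parentheses override: (2 + 3) * 4

power_chain = 2 ** 3 ** 2       # Groups right to left: 2 ** (3 ** 2)
minus_chain = 10 - 4 - 3        # Groups left to right: (10 - 4) - 3

print(default_order, mult_first, add_first, power_chain, minus_chain)
```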

Mixed types are converted up

Besides mixing operators in expressions, you can also mix numeric types. For instance, you can add an integer to a floating-point number:

40 + 3.14

But this leads to another question: what type is the result—integer or floating-point? The answer is simple, especially if you’ve used almost any other language before: in mixed-type numeric expressions, Python first converts operands up to the type of the most complicated operand, and then performs the math on same-type operands. This behavior is similar to type conversions in the C language.

Python ranks the complexity of numeric types like so: integers are simpler than floating-point numbers, which are simpler than complex numbers. So, when an integer is mixed with a floating point, as in the preceding example, the integer is converted up to a floating-point value first, and floating-point math yields the floating-point result. Similarly, any mixed-type expression where one operand is a complex number results in the other operand being converted up to a complex number, and the expression yields a complex result. (In Python 2.6, normal integers are also converted to long integers whenever their values are too large to fit in a normal integer; in 3.0, integers subsume longs entirely.)

You can force the issue by calling built-in functions to convert types manually:

>>> int(3.1415)     # Truncates float to integer
3
>>> float(3)        # Converts integer to float
3.0

However, you won’t usually need to do this: because Python automatically converts up to the more complex type within an expression, the results are normally what you want.

Also, keep in mind that all these mixed-type conversions apply only when mixing numeric types (e.g., an integer and a floating-point) in an expression, including those using numeric and comparison operators. In general, Python does not convert across any other type boundaries automatically. Adding a string to an integer, for example, results in an error, unless you manually convert one or the other; watch for an example when we meet strings in Chapter 7.
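The conversion rules described above can be observed by inspecting result types. The following is a minimal sketch with illustrative names; it also shows that mixing a string with an integer is an error unless one operand is converted manually:

```python
# Names are illustrative only.

int_float = 40 + 3.14          # Integer converted up to float
float_complex = 1.5 + 2j       # Float converted up to complex
int_complex = 1 + 2j           # Integer converted up to complex
result_types = [type(v).__name__
                for v in (int_float, float_complex, int_complex)]

# No automatic conversion across other type boundaries
try:
    bad = 'spam' + 1
    error_name = None
except TypeError as exc:
    error_name = type(exc).__name__

joined = 'spam' + str(1)       # Manual conversion: concatenation
summed = int('41') + 1         # Manual conversion: addition

print(result_types, error_name, joined, summed)
```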

Note

In Python 2.6, nonnumeric mixed types can be compared, but no conversions are performed (mixed types compare according to a fixed but arbitrary rule). In 3.0, nonnumeric mixed-type comparisons are not allowed and raise exceptions.

Preview: Operator overloading and polymorphism

Although we’re focusing on built-in numbers right now, all Python operators may be overloaded (i.e., implemented) by Python classes and C extension types to work on objects you create. For instance, you’ll see later that objects coded with classes may be added or concatenated with + expressions, indexed with [i] expressions, and so on.

Furthermore, Python itself automatically overloads some operators, such that they perform different actions depending on the type of built-in objects being processed. For example, the + operator performs addition when applied to numbers but performs concatenation when applied to sequence objects such as strings and lists. In fact, + can mean anything at all when applied to objects you define with classes.

As we saw in the prior chapter, this property is usually called polymorphism—a term indicating that the meaning of an operation depends on the type of the objects being operated on. We’ll revisit this concept when we explore functions in Chapter 16, because it becomes a much more obvious feature in that context.
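As a preview only, the sketch below shows the same + operator applied to numbers, strings, lists, and instances of a class. The Meters class is hypothetical and exists just for this illustration; classes and the __add__ method are covered much later in the book:

```python
class Meters:
    """Hypothetical class, used only to illustrate operator overloading."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Python runs this method for: self + other
        return Meters(self.value + other.value)


number_sum = 3 + 4                  # Numeric addition
string_sum = 'spam' + 'eggs'        # String concatenation
list_sum = [1, 2] + [3]             # List concatenation
total = Meters(3) + Meters(4)       # Whatever the class defines

print(number_sum, string_sum, list_sum, total.value)
```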

Numbers in Action

On to the code! Probably the best way to understand numeric objects and expressions is to see them in action, so let’s start up the interactive command line and try some basic but illustrative operations (see Chapter 3 for pointers if you need help starting an interactive session).

Variables and Basic Expressions

First of all, let’s exercise some basic math. In the following interaction, we first assign two variables (a and b) to integers so we can use them later in a larger expression. Variables are simply names—created by you or Python—that are used to keep track of information in your program. We’ll say more about this in the next chapter, but in Python:

  • Variables are created when they are first assigned values.

  • Variables are replaced with their values when used in expressions.

  • Variables must be assigned before they can be used in expressions.

  • Variables refer to objects and are never declared ahead of time.

In other words, these assignments cause the variables a and b to spring into existence automatically:

% python
>>> a = 3                  # Name created
>>> b = 4

I’ve also used a comment here. Recall that in Python code, text after a # mark and continuing to the end of the line is considered to be a comment and is ignored. Comments are a way to write human-readable documentation for your code. Because code you type interactively is temporary, you won’t normally write comments in this context, but I’ve added them to some of this book’s examples to help explain the code. In the next part of the book, we’ll meet a related feature, documentation strings, that attaches the text of your comments to objects.

Now, let’s use our new integer objects in some expressions. At this point, the values of a and b are still 3 and 4, respectively. Variables like these are replaced with their values whenever they’re used inside an expression, and the expression results are echoed back immediately when working interactively:

>>> a + 1, a - 1           # Addition (3 + 1), subtraction (3 - 1)
(4, 2)
>>> b * 3, b / 2           # Multiplication (4 * 3), division (4 / 2)
(12, 2.0)
>>> a % 2, b ** 2          # Modulus (remainder), power (4 ** 2)
(1, 16)
>>> 2 + 4.0, 2.0 ** b      # Mixed-type conversions
(6.0, 16.0)

Technically, the results being echoed back here are tuples of two values because the lines typed at the prompt contain two expressions separated by commas; that’s why the results are displayed in parentheses (more on tuples later). Note that the expressions work because the variables a and b within them have been assigned values. If you use a different variable that has never been assigned, Python reports an error rather than filling in some default value:

>>> c * 2
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
NameError: name 'c' is not defined

You don’t need to predeclare variables in Python, but they must have been assigned at least once before you can use them. In practice, this means you have to initialize counters to zero before you can add to them, initialize lists to an empty list before you can append to them, and so on.
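The initialization rule might look like this in a script. This is a sketch with illustrative names only; never_assigned is deliberately left undefined to trigger the error:

```python
# Names are illustrative; never_assigned is intentionally not defined.

count = 0                 # Initialize the counter before adding to it
squares = []              # Initialize the list before appending to it

for n in [1, 2, 3]:
    count = count + n
    squares.append(n ** 2)

try:
    never_assigned + 1    # Used before any assignment
    outcome = 'no error'
except NameError:
    outcome = 'NameError'

print(count, squares, outcome)
```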

Here are two slightly larger expressions to illustrate operator grouping and more about conversions:

>>> b / 2 + a               # Same as ((4 / 2) + 3)
5.0
>>> print(b / (2.0 + a))    # Same as (4 / (2.0 + 3))
0.8

In the first expression, there are no parentheses, so Python automatically groups the components according to its precedence rules—because / is lower in Table 5-2 than +, it binds more tightly and so is evaluated first. The result is as if the expression had been organized with parentheses as shown in the comment to the right of the code.

Also, notice that all the numbers are integers in the first expression. Because of that, Python 2.6 performs integer division and addition and will give a result of 5, whereas Python 3.0 performs true division with remainders and gives the result shown. If you want integer division in 3.0, code this as b // 2 + a (more on division in a moment).

In the second expression, parentheses are added around the + part to force Python to evaluate it first (i.e., before the /). We also made one of the operands floating-point by adding a decimal point: 2.0. Because of the mixed types, Python converts the integer referenced by a to a floating-point value (3.0) before performing the +. If all the numbers in this expression were integers, integer division (4 / 5) would yield the truncated integer 0 in Python 2.6 but the floating-point 0.8 in Python 3.0 (again, stay tuned for division details).

Numeric Display Formats

Notice that we used a print operation in the last of the preceding examples. Without the print, you’ll see something that may look a bit odd at first glance:

>>> b / (2.0 + a)           # Auto echo output: more digits
0.80000000000000004

>>> print(b / (2.0 + a))    # print rounds off digits
0.8

The full story behind this odd result has to do with the limitations of floating-point hardware and its inability to exactly represent some values in a limited number of bits. Because computer architecture is well beyond this book’s scope, though, we’ll finesse this by saying that all of the digits in the first output are really there in your computer’s floating-point hardware—it’s just that you’re not accustomed to seeing them. In fact, this is really just a display issue—the interactive prompt’s automatic result echo shows more digits than the print statement. If you don’t want to see all the digits, use print; as the sidebar str and repr Display Formats will explain, you’ll get a user-friendly display.

Note, however, that not all values have so many digits to display:

>>> 1 / 2.0
0.5

and that there are more ways to display the bits of a number inside your computer than using print and automatic echoes:

>>> num = 1 / 3.0
>>> num                      # Echoes
0.33333333333333331
>>> print(num)               # print rounds
0.333333333333

>>> '%e' % num               # String formatting expression
'3.333333e-001'
>>> '%4.2f' % num            # Alternative floating-point format
'0.33'
>>> '{0:4.2f}'.format(num)   # String formatting method (Python 2.6 and 3.0)
'0.33'

The last three of these expressions employ string formatting, a tool that allows for format flexibility, which we will explore in the upcoming chapter on strings (Chapter 7). Its results are strings that are typically printed to displays or reports.
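The display options above can be gathered in one script. The exact digits shown by repr and str vary across Python releases and platforms, so this sketch (with illustrative names) makes no claim about them; it relies only on the formatting results and on the fact that the repr string converts back to the same float:

```python
# Names are illustrative; repr/str digits differ by Python version.

num = 1 / 3.0

as_repr = repr(num)                   # What the interactive echo displays
as_str = str(num)                     # What print displays
fixed = '%4.2f' % num                 # Formatting expression
method = '{0:4.2f}'.format(num)       # Formatting method
scientific = '%e' % num               # Exponent digits vary by platform

round_trip = float(as_repr) == num    # The full value is still there

print(as_repr, as_str, fixed, method, scientific, round_trip)
```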

Comparisons: Normal and Chained

So far, we’ve been dealing with standard numeric operations (addition and multiplication), but numbers can also be compared. Normal comparisons work for numbers exactly as you’d expect—they compare the relative magnitudes of their operands and return a Boolean result (which we would normally test in a larger statement):

>>> 1 < 2                  # Less than
True
>>> 2.0 >= 1               # Greater than or equal: mixed-type 1 converted to 1.0
True
>>> 2.0 == 2.0             # Equal value
True
>>> 2.0 != 2.0             # Not equal value
False

Notice again how mixed types are allowed in numeric expressions (only); in the second test here, Python compares values in terms of the more complex type, float.

Interestingly, Python also allows us to chain multiple comparisons together to perform range tests; chained comparisons are a sort of shorthand for larger Boolean expressions. The expression (A < B < C), for instance, tests whether B is between A and C; it is equivalent to the Boolean test (A < B and B < C) but is easier on the eyes (and the keyboard). For example, assume the following assignments:

>>> X = 2
>>> Y = 4
>>> Z = 6

The following two expressions have identical effects, but the first is shorter to type, and it may run slightly faster since Python needs to evaluate Y only once:

>>> X < Y < Z              # Chained comparisons: range tests
True
>>> X < Y and Y < Z
True

The same equivalence holds for false results, and arbitrary chain lengths are allowed:

>>> X < Y > Z
False
>>> X < Y and Y > Z
False

>>> 1 < 2 < 3.0 < 4
True
>>> 1 > 2 > 3.0 > 4
False

You can use other comparisons in chained tests, but the resulting expressions can become nonintuitive unless you evaluate them the way Python does. The following, for instance, is false just because 1 is not equal to 2:

>>> 1 == 2 < 3        # Same as: 1 == 2 and 2 < 3
False                 # Not same as: False < 3 (which means 0 < 3, which is true)

Python does not compare the 1 == 2 False result to 3—this would technically mean the same as 0 < 3, which would be True (as we’ll see later in this chapter, True and False are just customized 1 and 0).
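The following sketch, using illustrative names, contrasts the chained form with the two ways one might try to read it, and uses a hypothetical middle function to suggest that the middle operand of a chain is evaluated only once:

```python
# Names are illustrative; middle() is a hypothetical helper.

chained = 1 == 2 < 3               # What Python runs
expanded = (1 == 2) and (2 < 3)    # Equivalent Boolean test
grouped = (1 == 2) < 3             # Different: False < 3, i.e., 0 < 3

calls = []

def middle():
    calls.append('called')         # Record each evaluation
    return 4

in_range = 2 < middle() < 6        # middle() runs once, used in both tests

print(chained, expanded, grouped, in_range, len(calls))
```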

Division: Classic, Floor, and True

You’ve seen how division works in the previous sections, so you should know that it behaves slightly differently in Python 3.0 and 2.6. In fact, there are actually three flavors of division, and two different division operators, one of which changes in 3.0:

X / Y

Classic and true division. In Python 2.6 and earlier, this operator performs classic division, truncating results for integers and keeping remainders for floating-point numbers. In Python 3.0, it performs true division, always keeping remainders regardless of types.

X // Y

Floor division. Added in Python 2.2 and available in both Python 2.6 and 3.0, this operator always truncates fractional remainders down to their floor, regardless of types.

True division was added to address the fact that the results of the original classic division model are dependent on operand types, and so can be difficult to anticipate in a dynamically typed language like Python. Classic division was removed in 3.0 because of this constraint—the / and // operators implement true and floor division in 3.0.

In sum:

  • In 3.0, the / now always performs true division, returning a float result that includes any remainder, regardless of operand types. The // performs floor division, which truncates the remainder and returns an integer for integer operands or a float if any operand is a float.

  • In 2.6, the / does classic division, performing truncating integer division if both operands are integers and float division (keeping remainders) otherwise. The // does floor division and works as it does in 3.0, performing truncating division for integers and floor division for floats.

Here are the two operators at work in 3.0 and 2.6:

C:\misc> C:\Python30\python
>>>
>>> 10 / 4            # Differs in 3.0: keeps remainder
2.5
>>> 10 // 4           # Same in 3.0: truncates remainder
2
>>> 10 / 4.0          # Same in 3.0: keeps remainder
2.5
>>> 10 // 4.0         # Same in 3.0: truncates to floor
2.0

C:\misc> C:\Python26\python
>>>
>>> 10 / 4
2
>>> 10 // 4
2
>>> 10 / 4.0
2.5
>>> 10 // 4.0
2.0

Notice that the data type of the result for // is still dependent on the operand types in 3.0: if either is a float, the result is a float; otherwise, it is an integer. Although this may seem similar to the type-dependent behavior of / in 2.X that motivated its change in 3.0, the type of the return value is much less critical than differences in the return value itself. Moreover, because // was provided in part as a backward-compatibility tool for programs that rely on truncating integer division (and this is more common than you might expect), it must return integers for integers.

Supporting either Python

Although / behavior differs in 2.6 and 3.0, you can still support both versions in your code. If your programs depend on truncating integer division, use // in both 2.6 and 3.0. If your programs require floating-point results with remainders for integers, use float to guarantee that one operand is a float around a / when run in 2.6:

X = Y // Z        # Always truncates, always an int result for ints in 2.6 and 3.0

X = Y / float(Z)  # Guarantees float division with remainder in either 2.6 or 3.0

Alternatively, you can enable 3.0 / division in 2.6 with a __future__ import, rather than forcing it with float conversions:

C:\misc> C:\Python26\python
>>> from __future__ import division         # Enable 3.0 "/" behavior
>>> 10 / 4
2.5
>>> 10 // 4
2

Floor versus truncation

One subtlety: the // operator is generally referred to as truncating division, but it’s more accurate to refer to it as floor division—it truncates the result down to its floor, which means the closest whole number below the true result. The net effect is to round down, not strictly truncate, and this matters for negatives. You can see the difference for yourself with the Python math module (modules must be imported before you can use their contents; more on this later):

>>> import math
>>> math.floor(2.5)
2
>>> math.floor(-2.5)
-3
>>> math.trunc(2.5)
2
>>> math.trunc(-2.5)
-2

When running division operators, the distinction shows up only for negative results: for positive results, flooring and truncating produce the same value, but for negative results, floor division rounds down to the next lower integer instead of simply dropping the fraction. Here’s the case for 3.0:

C:\misc> c:\python30\python
>>> 5 / 2, 5 / -2
(2.5, -2.5)

>>> 5 // 2, 5 // -2           # Truncates to floor: rounds to first lower integer
(2, -3)                       # 2.5 becomes 2, -2.5 becomes -3

>>> 5 / 2.0, 5 / -2.0
(2.5, -2.5)

>>> 5 // 2.0, 5 // -2.0       # Ditto for floats, though result is float too
(2.0, -3.0)

The 2.6 case is similar, but / results differ again:

C:\misc> c:\python26\python
>>> 5 / 2, 5 / -2             # Differs in 3.0
(2, -3)

>>> 5 // 2, 5 // -2           # This and the rest are the same in 2.6 and 3.0
(2, -3)

>>> 5 / 2.0, 5 / -2.0
(2.5, -2.5)

>>> 5 // 2.0, 5 // -2.0
(2.0, -3.0)

If you really want truncation regardless of sign, you can always run a float division result through math.trunc, regardless of Python version (also see the round built-in for related functionality):

C:\misc> c:\python30\python
>>> import math
>>> 5 / -2                      # Keep remainder
-2.5
>>> 5 // -2                     # Floor below result
-3
>>> math.trunc(5 / -2)          # Truncate instead of floor
-2

C:\misc> c:\python26\python
>>> import math
>>> 5 / float(-2)               # Remainder in 2.6
-2.5
>>> 5 / -2, 5 // -2             # Floor in 2.6
(-3, -3)
>>> math.trunc(5 / float(-2))   # Truncate in 2.6
-2
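
If truncating division comes up often, it can be packaged once. The following is a sketch only: trunc_div is a hypothetical helper name, not something Python provides. Like the interaction above, it goes through a float result, so it inherits float precision limits for very large integers. The print calls assume Python 3.X:

```python
import math

def trunc_div(a, b):
    """Divide, then truncate toward zero regardless of sign."""
    return math.trunc(a / float(b))   # float() keeps the remainder in 2.6 too

for a, b in [(5, 2), (5, -2), (-5, 2), (-5, -2)]:
    # Floor and truncation agree for positive results, differ for negatives
    print(a, b, a // b, trunc_div(a, b))
```

For the two mixed-sign pairs, the floor column shows -3 while the truncation column shows -2; for the same-sign pairs, both columns show 2.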

Why does truncation matter?

If you are using 3.0, here is the short story on division operators for reference:

>>> (5 / 2), (5 / 2.0), (5 / -2.0), (5 / -2)        # 3.0 true division
(2.5, 2.5, -2.5, -2.5)

>>> (5 // 2), (5 // 2.0), (5 // -2.0), (5 // -2)    # 3.0 floor division
(2, 2.0, -3.0, -3)

>>> (9 / 3), (9.0 / 3), (9 // 3), (9 // 3.0)        # Both
(3.0, 3.0, 3, 3.0)

For 2.6 readers, division works as follows:

>>> (5 / 2), (5 / 2.0), (5 / -2.0), (5 / -2)        # 2.6 classic division
(2, 2.5, -2.5, -3)

>>> (5 // 2), (5 // 2.0), (5 // -2.0), (5 // -2)    # 2.6 floor division (same)
(2, 2.0, -3.0, -3)

>>> (9 / 3), (9.0 / 3), (9 // 3), (9 // 3.0)        # Both
(3, 3.0, 3, 3.0)

Although results have yet to come in, it’s possible that the nontruncating behavior of / in 3.0 may break a significant number of programs. Perhaps because of a C language legacy, many programmers rely on division truncation for integers and will have to learn to use // in such contexts instead. Watch for a simple prime number while loop example in Chapter 13, and a corresponding exercise at the end of Part IV that illustrates the sort of code that may be impacted by this / change. Also stay tuned for more on the special from command used in this section; it’s discussed further in Chapter 24.

Integer Precision

Division may differ slightly across Python releases, but it’s still fairly standard. Here’s something a bit more exotic. As mentioned earlier, Python 3.0 integers support unlimited size:

>>> 999999999999999999999999999999 + 1
1000000000000000000000000000000

Python 2.6 has a separate type for long integers, but it automatically converts any number too large to store in a normal integer to this type. Hence, you don’t need to code any special syntax to use longs, and the only way you can tell that you’re using 2.6 longs is that they print with a trailing “L”:

>>> 999999999999999999999999999999 + 1
1000000000000000000000000000000L

Unlimited-precision integers are a convenient built-in tool. For instance, you can use them to count the U.S. national debt in pennies in Python directly (if you are so inclined, and have enough memory on your computer for this year’s budget!). They are also why we were able to raise 2 to such large powers in the examples in Chapter 3. Here are the 3.0 and 2.6 cases:

>>> 2 ** 200
1606938044258990275541962092341162602522202993782792835301376

>>> 2 ** 200
1606938044258990275541962092341162602522202993782792835301376L

Because Python must do extra work to support their extended precision, integer math is usually substantially slower than normal when numbers grow large. However, if you need the precision, the fact that it’s built in for you to use will likely outweigh its performance penalty.
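
To make the precision point concrete, this small sketch contrasts integer and floating-point arithmetic on the same large value. The variable names are arbitrary, and the results noted in comments assume standard double-precision floats and Python 3.X printing:

```python
big = 2 ** 200

print(len(str(big)))                   # 61 decimal digits, all of them retained

exact = (big + 1) - big                # Integer math keeps the +1
lossy = (float(big) + 1) - float(big)  # Float math cannot represent it

print(exact, lossy)                    # 1 0.0
```

The float version silently drops the added 1 because a float holds only about 16 or 17 significant decimal digits, whereas the integer keeps every digit.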

Complex Numbers

Although less widely used than the types we’ve been exploring thus far, complex numbers are a distinct core object type in Python. If you know what they are, you know why they are useful; if not, consider this section optional reading.

Complex numbers are represented as two floating-point numbers (the real and imaginary parts) and are coded by adding a j or J suffix to the imaginary part. We can also write complex numbers with a nonzero real part by adding the two parts with a +. For example, the complex number with a real part of 2 and an imaginary part of -3 is written 2 + -3j. Here are some examples of complex math at work:

>>> 1j * 1J
(-1+0j)
>>> 2 + 1j * 3
(2+3j)
>>> (2 + 1j) * 3
(6+3j)

Complex numbers also allow us to extract their parts as attributes, support all the usual mathematical expressions, and may be processed with tools in the standard cmath module (the complex version of the standard math module). Complex numbers typically find roles in engineering-oriented programs. Because they are advanced tools, check Python’s language reference manual for additional details.
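
As a brief illustration of the attributes and the cmath module just mentioned, here is a sketch using the 2 + -3j value from earlier in this section. The output comments assume Python 3.X print calls:

```python
import cmath

z = 2 + -3j
print(z.real, z.imag)        # Parts are floats: 2.0 -3.0
print(z.conjugate())         # (2+3j)
print(abs(3 + 4j))           # Magnitude: 5.0
print(cmath.sqrt(-1))        # 1j; math.sqrt(-1) raises an error instead
```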

Hexadecimal, Octal, and Binary Notation

As described earlier in this chapter, Python integers can be coded in hexadecimal, octal, and binary notation, in addition to the normal base 10 decimal coding. The coding rules were laid out at the start of this chapter; let’s look at some live examples here.

Keep in mind that these literals are simply an alternative syntax for specifying the value of an integer object. For example, the following literals coded in Python 3.0 or 2.6 produce normal integers with the specified values in all three bases:

>>> 0o1, 0o20, 0o377           # Octal literals
(1, 16, 255)
>>> 0x01, 0x10, 0xFF           # Hex literals
(1, 16, 255)
>>> 0b1, 0b10000, 0b11111111   # Binary literals
(1, 16, 255)

Here, the octal value 0o377, the hex value 0xFF, and the binary value 0b11111111 are all decimal 255. Python prints in decimal (base 10) by default but provides built-in functions that allow you to convert integers to other bases’ digit strings:

>>> oct(64), hex(64), bin(64)
('0o100', '0x40', '0b1000000')

The oct function converts decimal to octal, hex to hexadecimal, and bin to binary. To go the other way, the built-in int function converts a string of digits to an integer, and an optional second argument lets you specify the numeric base:

>>> int('64'), int('100', 8), int('40', 16), int('1000000', 2)
(64, 64, 64, 64)

>>> int('0x40', 16), int('0b1000000', 2)    # Literals okay too
(64, 64)
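
Because int accepts the prefixed strings that bin, oct, and hex produce, the two directions can be combined into a round trip. The function below is a sketch for experimentation; round_trips is a name invented here:

```python
def round_trips(n):
    """Check that n survives conversion to each base's digit string and back."""
    for to_text, base in ((bin, 2), (oct, 8), (hex, 16)):
        text = to_text(n)               # For example, '0x40' for hex(64)
        if int(text, base) != n:        # Prefixed strings are accepted here
            return False
    return True

print(round_trips(64), round_trips(255))   # True True
```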

The eval function, which you’ll meet later in this book, treats strings as though they were Python code. Therefore, it has a similar effect (but usually runs more slowly—it actually compiles and runs the string as a piece of a program, and it assumes you can trust the source of the string being run; a clever user might be able to submit a string that deletes files on your machine!):

>>> eval('64'), eval('0o100'), eval('0x40'), eval('0b1000000')
(64, 64, 64, 64)

Finally, you can also convert integers to octal, hexadecimal, and binary digit strings with string formatting method calls and expressions:

>>> '{0:o}, {1:x}, {2:b}'.format(64, 64, 64)
'100, 40, 1000000'

>>> '%o, %x, %X' % (64, 255, 255)
'100, ff, FF'

String formatting is covered in more detail in Chapter 7.

Two notes before moving on. First, Python 2.6 users should remember that you can code octals with simply a leading zero, the original octal format in Python:

>>> 0o1, 0o20, 0o377     # New octal format in 2.6 (same as 3.0)
(1, 16, 255)
>>> 01, 020, 0377        # Old octal literals in 2.6 (and earlier)
(1, 16, 255)

In 3.0, the syntax in the second of these examples generates an error. Even though it’s not an error in 2.6, be careful not to begin a string of digits with a leading zero unless you really mean to code an octal value. Python 2.6 will treat it as base 8, which may not work as you’d expect—010 is always decimal 8 in 2.6, not decimal 10 (despite what you may or may not think!). This, along with symmetry with the hex and binary forms, is why the octal format was changed in 3.0—you must use 0o010 in 3.0, and probably should in 2.6.

Secondly, note that these literals can produce arbitrarily long integers. The following, for instance, creates an integer with hex notation and then displays it first in decimal and then in octal and binary with converters (run in 2.6 here to reveal the long precision):

>>> X = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFF
>>> X
5192296858534827628530496329220095L
>>> oct(X)
'017777777777777777777777777777777777777L'
>>> bin(X)
'0b1111111111111111111111111111111111111111111111111111111111 ...and so on...

Speaking of binary digits, the next section shows tools for processing individual bits.

Bitwise Operations

Besides the normal numeric operations (addition, subtraction, and so on), Python supports most of the numeric expressions available in the C language. This includes operators that treat integers as strings of binary bits. For instance, here they are at work performing bitwise shift and Boolean operations:

>>> x = 1               # 0001
>>> x << 2              # Shift left 2 bits: 0100
4
>>> x | 2               # Bitwise OR: 0011
3
>>> x & 1               # Bitwise AND: 0001
1

In the first expression, a binary 1 (in base 2, 0001) is shifted left two slots to create a binary 4 (0100). The last two operations perform a binary OR (0001|0010 = 0011) and a binary AND (0001&0001 = 0001). Such bit-masking operations allow us to encode multiple flags and other values within a single integer.
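
To show what encoding multiple flags in a single integer might look like, here is a minimal sketch. The flag names are hypothetical, and the ~ operator used to clear a bit is bitwise inversion, which this section has not otherwise covered:

```python
# Hypothetical permission flags, one bit each
READ, WRITE, EXEC = 0b001, 0b010, 0b100

perms = READ | WRITE             # Set two flags: 0b011
print(bin(perms))                # 0b11

print(bool(perms & WRITE))       # True: WRITE bit is on
print(bool(perms & EXEC))        # False: EXEC bit is off

perms = perms & ~WRITE           # Clear the WRITE bit
perms = perms ^ EXEC             # Toggle the EXEC bit
print(bin(perms))                # 0b101
```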

This is one area where the binary and hexadecimal number support in Python 2.6 and 3.0 becomes especially useful, because it allows us to code and inspect numbers by bit-strings:

>>> X = 0b0001          # Binary literals
>>> X << 2              # Shift left
4
>>> bin(X << 2)         # Binary digits string
'0b100'

>>> bin(X | 0b010)      # Bitwise OR
'0b11'
>>> bin(X & 0b1)        # Bitwise AND
'0b1'

>>> X = 0xFF            # Hex literals
>>> bin(X)
'0b11111111'
>>> X ^ 0b10101010      # Bitwise XOR
85
>>> bin(X ^ 0b10101010)
'0b1010101'

>>> int('1010101', 2)   # String to int per base
85
>>> hex(85)             # Hex digit string
'0x55'

We won’t go into much more detail on “bit-twiddling” here. It’s supported if you need it, and it comes in handy if your Python code must deal with things like network packets or packed binary data produced by a C program. Be aware, though, that bitwise operations are often not as important in a high-level language such as Python as they are in a low-level language such as C. As a rule of thumb, if you find yourself wanting to flip bits in Python, you should think about which language you’re really coding. In general, there are often better ways to encode information in Python than bit strings.

Note

In the upcoming Python 3.1 release, the integer bit_length method also allows you to query the number of bits required to represent a number’s value in binary. The same effect can often be achieved by subtracting 2 from the length of the bin string using the len built-in function we met in Chapter 4, though it may be less efficient:

>>> X = 99
>>> bin(X), X.bit_length()
('0b1100011', 7)
>>> bin(256), (256).bit_length()
('0b100000000', 9)
>>> len(bin(256)) - 2
9
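
The word “often” matters here. As a quick check (assuming a Python that has bit_length, such as 3.1 or later), the two techniques disagree for zero and for negative numbers, because the bin string includes a digit for zero and a sign character for negatives:

```python
# Each row: value, bit_length result, and the len-based estimate
rows = [(n, n.bit_length(), len(bin(n)) - 2) for n in (99, 256, 0, -5)]

for row in rows:
    print(row)       # The last two columns differ for 0 and for -5
```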

Other Built-in Numeric Tools

In addition to its core object types, Python also provides both built-in functions and standard library modules for numeric processing. The pow and abs built-in functions, for instance, compute powers and absolute values, respectively. Here are some examples of the standard library math module (which contains most of the tools in the C language’s math library) and a few built-in functions at work:

>>> import math
>>> math.pi, math.e                               # Common constants
(3.1415926535897931, 2.7182818284590451)

>>> math.sin(2 * math.pi / 180)                   # Sine, tangent, cosine
0.034899496702500969

>>> math.sqrt(144), math.sqrt(2)                  # Square root
(12.0, 1.4142135623730951)

>>> pow(2, 4), 2 ** 4                             # Exponentiation (power)
(16, 16)

>>> abs(-42.0), sum((1, 2, 3, 4))                 # Absolute value, summation
(42.0, 10)

>>> min(3, 1, 2, 4), max(3, 1, 2, 4)              # Minimum, maximum
(1, 4)

The sum function shown here works on a sequence of numbers, and min and max accept either a sequence or individual arguments. There are a variety of ways to drop the decimal digits of floating-point numbers. We met truncation and floor earlier; we can also round, both numerically and for display purposes:

>>> math.floor(2.567), math.floor(-2.567)         # Floor (next-lower integer)
(2, -3)

>>> math.trunc(2.567), math.trunc(-2.567)         # Truncate (drop decimal digits)
(2, -2)

>>> int(2.567), int(-2.567)                       # Truncate (integer conversion)
(2, -2)

>>> round(2.567), round(2.467), round(2.567, 2)   # Round (Python 3.0 version)
(3, 2, 2.5699999999999998)

>>> '%.1f' % 2.567, '{0:.2f}'.format(2.567)       # Round for display (Chapter 7)
('2.6', '2.57')

As we saw earlier, the last of these produces strings that we would usually print and supports a variety of formatting options. As also described earlier, the second to last test here will output (3, 2, 2.57) if we wrap it in a print call to request a more user-friendly display. The last two lines still differ, though—round rounds a floating-point number but still yields a floating-point number in memory, whereas string formatting produces a string and doesn’t yield a modified number:

>>> (1 / 3), round(1 / 3, 2), ('%.2f' % (1 / 3))
(0.33333333333333331, 0.33000000000000002, '0.33')
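
The distinction between a rounded number and a formatted string is easy to verify by checking result types. This sketch assumes Python 3.X (where math.floor returns an integer); the variable names are arbitrary:

```python
import math

value = -2.567

# Three ways to drop digits: floor goes down, the other two go toward zero
print(math.floor(value), math.trunc(value), int(value))   # -3 -2 -2

rounded = round(value, 2)        # Still a float: usable in further math
shown = '%.2f' % value           # A string: meant for display only

print(type(rounded).__name__, type(shown).__name__)       # float str
print(shown)                                              # -2.57
```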

Interestingly, there are three ways to compute square roots in Python: using a module function, an expression, or a built-in function (if you’re interested in performance, we will revisit these in an exercise and its solution at the end of Part IV, to see which runs quicker):

>>> import math
>>> math.sqrt(144)              # Module
12.0
>>> 144 ** .5                   # Expression
12.0
>>> pow(144, .5)                # Built-in
12.0

>>> math.sqrt(1234567890)       # Larger numbers
35136.418286444619
>>> 1234567890 ** .5
35136.418286444619
>>> pow(1234567890, .5)
35136.418286444619

Notice that standard library modules such as math must be imported, but built-in functions such as abs and round are always available without imports. In other words, modules are external components, but built-in functions live in an implied namespace that Python automatically searches to find names used in your program. This namespace corresponds to the module called builtins in Python 3.0 (__builtin__ in 2.6). There is much more about name resolution in the function and module parts of this book; for now, when you hear “module,” think “import.”

The standard library random module must be imported as well. This module provides tools for picking a random floating-point number between 0 and 1, selecting a random integer between two numbers, choosing an item at random from a sequence, and more:

>>> import random
>>> random.random()
0.44694718823781876
>>> random.random()
0.28970426439292829

>>> random.randint(1, 10)
5
>>> random.randint(1, 10)
4

>>> random.choice(['Life of Brian', 'Holy Grail', 'Meaning of Life'])
'Life of Brian'
>>> random.choice(['Life of Brian', 'Holy Grail', 'Meaning of Life'])
'Holy Grail'

The random module can be useful for shuffling cards in games, picking images at random in a slideshow GUI, performing statistical simulations, and much more. For more details, see Python’s library manual.
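
For instance, a card shuffle might be sketched as follows. The deck here is just a list of numbers standing in for cards, and the seed call is optional; it is included only so that repeated runs make the same choices:

```python
import random

random.seed(42)                          # Optional: makes runs repeatable

deck = list(range(1, 11))                # Ten hypothetical card numbers
random.shuffle(deck)                     # Reorders the list in place

hand = random.sample(deck, 3)            # Three distinct cards, deck unchanged
print(deck, hand)
```

Note that shuffle changes the list itself and returns nothing, while sample returns a new list and leaves its argument alone.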

Other Numeric Types

So far in this chapter, we’ve been using Python’s core numeric types—integer, floating point, and complex. These will suffice for most of the number crunching that most programmers will ever need to do. Python comes with a handful of more exotic numeric types, though, that merit a quick look here.

Decimal Type

Python 2.4 introduced a new core numeric type: the decimal object, formally known as Decimal. Syntactically, decimals are created by calling a function within an imported module, rather than running a literal expression. Functionally, decimals are like floating-point numbers, but they have a fixed number of decimal digits. Hence, decimals are fixed-precision floating-point values.

For example, with decimals, we can have a floating-point value that always retains just two decimal digits. Furthermore, we can specify how to round or truncate the extra decimal digits beyond the object’s cutoff. Although it generally incurs a small performance penalty compared to the normal floating-point type, the decimal type is well suited to representing fixed-precision quantities like sums of money and can help you achieve better numeric accuracy.

The basics

The last point merits elaboration. As you may or may not already know, floating-point math is less than exact, because of the limited space used to store values. For example, the following should yield zero, but it does not. The result is close to zero, but there are not enough bits to be precise here:

>>> 0.1 + 0.1 + 0.1 - 0.3
5.5511151231257827e-17

Printing the result to produce the user-friendly display format doesn’t completely help either, because the hardware related to floating-point math is inherently limited in terms of accuracy:

>>> print(0.1 + 0.1 + 0.1 - 0.3)
5.55111512313e-17

However, with decimals, the result can be dead-on:

>>> from decimal import Decimal
>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
Decimal('0.0')

As shown here, we can make decimal objects by calling the Decimal constructor function in the decimal module and passing in strings that have the desired number of decimal digits for the resulting object (we can use the str function to convert floating-point values to strings if needed). When decimals of different precision are mixed in expressions, Python converts up to the largest number of decimal digits automatically:

>>> Decimal('0.1') + Decimal('0.10') + Decimal('0.10') - Decimal('0.30')
Decimal('0.00')
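
The same effect shows up when values are accumulated, which is how it tends to bite in practice. The following sketch adds one-tenth ten times with each type; the variable names are arbitrary:

```python
from decimal import Decimal

float_total = sum([0.1] * 10)                  # Binary floating point
decimal_total = sum([Decimal('0.1')] * 10)     # Decimal arithmetic

print(float_total == 1.0)                      # False: error has accumulated
print(decimal_total == Decimal('1.0'))         # True: exact
```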

Note

In Python 3.1 (to be released after this book’s publication), it’s also possible to create a decimal object from a floating-point object, with a call of the form decimal.Decimal.from_float(1.25). The conversion is exact but can sometimes yield a large number of digits.

Setting precision globally

Other tools in the decimal module can be used to set the precision of all decimal numbers, set up error handling, and more. For instance, a context object in this module allows for specifying precision (number of significant decimal digits) and rounding modes (down, ceiling, etc.). The precision is applied globally to the results of decimal arithmetic run in the calling thread:

>>> import decimal
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal('0.1428571428571428571428571429')

>>> decimal.getcontext().prec = 4
>>> decimal.Decimal(1) / decimal.Decimal(7)
Decimal('0.1429')

This is especially useful for monetary applications, where cents are represented as two decimal digits. Decimals are essentially an alternative to manual rounding and string formatting in this context:

>>> 1999 + 1.33
2000.3299999999999
>>>
>>> decimal.getcontext().prec = 2
>>> pay = decimal.Decimal(str(1999 + 1.33))
>>> pay
Decimal('2000.33')
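
One caution when adapting the preceding interaction: prec counts total significant digits and governs the results of arithmetic, not values passed to the Decimal constructor. For a fixed two digits after the decimal point, the decimal module’s quantize method is one alternative. The following is a sketch under that assumption; to_money and CENTS are names invented here, and the choice of ROUND_HALF_UP is illustrative rather than required:

```python
from decimal import Decimal, ROUND_HALF_UP

CENTS = Decimal('0.01')             # Two digits after the decimal point

def to_money(value):
    """Convert a number or numeric string to a Decimal rounded to cents."""
    return Decimal(str(value)).quantize(CENTS, rounding=ROUND_HALF_UP)

print(to_money(1999 + 1.33))        # 2000.33
print(to_money('2.675'))            # 2.68
print(to_money(10) / 3)             # Arithmetic results may need rounding again
```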

Decimal context manager

In Python 2.6 and 3.0 (and later), it’s also possible to reset precision temporarily by using the with context manager statement. The precision is reset to its original value on statement exit:

C:\misc> C:\Python30\python
>>> import decimal
>>> decimal.Decimal('1.00') / decimal.Decimal('3.00')
Decimal('0.3333333333333333333333333333')
>>>
>>> with decimal.localcontext() as ctx:
...     ctx.prec = 2
...     decimal.Decimal('1.00') / decimal.Decimal('3.00')
...
Decimal('0.33')
>>>
>>> decimal.Decimal('1.00') / decimal.Decimal('3.00')
Decimal('0.3333333333333333333333333333')

Though useful, this statement requires much more background knowledge than you’ve obtained at this point; watch for coverage of the with statement in Chapter 33.

Because use of the decimal type is still relatively rare in practice, I’ll defer to Python’s standard library manuals and interactive help for more details. And because decimals address some of the same floating-point accuracy issues as the fraction type, let’s move on to the next section to see how the two compare.

Fraction Type

Python 2.6 and 3.0 debut a new numeric type, Fraction, which implements a rational number object. It essentially keeps both a numerator and a denominator explicitly, so as to avoid some of the inaccuracies and limitations of floating-point math.

The basics

Fraction is a sort of cousin to the existing Decimal fixed-precision type described in the prior section, as both can be used to control numerical accuracy by fixing decimal digits and specifying rounding or truncation policies. It’s also used in similar ways—like Decimal, Fraction resides in a module; import its constructor and pass in a numerator and a denominator to make one. The following interaction shows how:

>>> from fractions import Fraction
>>> x = Fraction(1, 3)                    # Numerator, denominator
>>> y = Fraction(4, 6)                    # Simplified to 2, 3 by gcd

>>> x
Fraction(1, 3)
>>> y
Fraction(2, 3)
>>> print(y)
2/3

Once created, Fractions can be used in mathematical expressions as usual:

>>> x + y
Fraction(1, 1)
>>> x - y                           # Results are exact: numerator, denominator
Fraction(-1, 3)
>>> x * y
Fraction(2, 9)

Fraction objects can also be created from floating-point number strings, much like decimals:

>>> Fraction('.25')
Fraction(1, 4)
>>> Fraction('1.25')
Fraction(5, 4)
>>>
>>> Fraction('.25') + Fraction('1.25')
Fraction(3, 2)

Numeric accuracy

Notice that this is different from floating-point-type math, which is constrained by the underlying limitations of floating-point hardware. To compare, here are the same operations run with floating-point objects, and notes on their limited accuracy:

>>> a = 1 / 3.0                     # Only as accurate as floating-point hardware
>>> b = 4 / 6.0                     # Can lose precision over calculations
>>> a
0.33333333333333331
>>> b
0.66666666666666663

>>> a + b
1.0
>>> a - b
-0.33333333333333331
>>> a * b
0.22222222222222221

This floating-point limitation is especially apparent for values that cannot be represented accurately given their limited number of bits in memory. Both Fraction and Decimal provide ways to get exact results, albeit at the cost of some speed. For instance, in the following example (repeated from the prior section), floating-point numbers do not accurately give the zero answer expected, but both of the other types do:

>>> 0.1 + 0.1 + 0.1 - 0.3           # This should be zero (close, but not exact)
5.5511151231257827e-17

>>> from fractions import Fraction
>>> Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10) - Fraction(3, 10)
Fraction(0, 1)

>>> from decimal import Decimal
>>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
Decimal('0.0')

Moreover, fractions and decimals both allow more intuitive and accurate results than floating points sometimes can, in different ways (by using rational representation and by limiting precision):

>>> 1 / 3                              # Use "3.0" in Python 2.6 for true "/"
0.33333333333333331

>>> Fraction(1, 3)                     # Numeric accuracy
Fraction(1, 3)

>>> import decimal
>>> decimal.getcontext().prec = 2
>>> decimal.Decimal(1) / decimal.Decimal(3)
Decimal('0.33')

In fact, fractions both retain accuracy and automatically simplify results. Continuing the preceding interaction:

>>> (1 / 3) + (6 / 12)                 # Use ".0" in Python 2.6 for true "/"
0.83333333333333326

>>> Fraction(6, 12)                    # Automatically simplified
Fraction(1, 2)

>>> Fraction(1, 3) + Fraction(6, 12)
Fraction(5, 6)

>>> decimal.Decimal(str(1/3)) + decimal.Decimal(str(6/12))
Decimal('0.83')

>>> 1000.0 / 1234567890
8.1000000737100011e-07
>>> Fraction(1000, 1234567890)
Fraction(100, 123456789)
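
Exactness and automatic simplification carry through longer calculations too. As a small sketch, the following sums 1 + 1/2 + 1/3 + 1/4 with fractions; the variable name is arbitrary:

```python
from fractions import Fraction

exact = sum(Fraction(1, n) for n in range(1, 5))   # 1 + 1/2 + 1/3 + 1/4
print(exact)                                       # 25/12
print(exact.numerator, exact.denominator)          # 25 12
print(float(exact))                                # Float form, for comparison
```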

Conversions and mixed types

To support fraction conversions, floating-point objects now have a method that yields their numerator and denominator ratio, fractions have a from_float method, and float accepts a Fraction as an argument. Trace through the following interaction to see how this pans out (the * in the second test is special syntax that expands a tuple into individual arguments; more on this when we study function argument passing in Chapter 18):

>>> (2.5).as_integer_ratio()               # float object method
(5, 2)

>>> f = 2.5
>>> z = Fraction(*f.as_integer_ratio())    # Convert float -> fraction: two args
>>> z                                      # Same as Fraction(5, 2)
Fraction(5, 2)

>>> x                                      # x from prior interaction
Fraction(1, 3)
>>> x + z
Fraction(17, 6)                            # 5/2 + 1/3 = 15/6 + 2/6

>>> float(x)                               # Convert fraction -> float
0.33333333333333331
>>> float(z)
2.5
>>> float(x + z)
2.8333333333333335
>>> 17 / 6
2.8333333333333335

>>> Fraction.from_float(1.75)              # Convert float -> fraction: other way
Fraction(7, 4)
>>> Fraction(*(1.75).as_integer_ratio())
Fraction(7, 4)

Finally, some type mixing is allowed in expressions, though Fraction must sometimes be manually propagated to retain accuracy. Study the following interaction to see how this works:

>>> x
Fraction(1, 3)
>>> x + 2                                  # Fraction + int -> Fraction
Fraction(7, 3)
>>> x + 2.0                                # Fraction + float -> float
2.3333333333333335
>>> x + (1./3)                             # Fraction + float -> float
0.66666666666666663

>>> x + (4./3)
1.6666666666666665
>>> x + Fraction(4, 3)                     # Fraction + Fraction -> Fraction
Fraction(5, 3)

Caveat: although you can convert from floating-point to fraction, in some cases there is an unavoidable precision loss when you do so, because the number is inaccurate in its original floating-point form. When needed, you can simplify such results by limiting the maximum denominator value:

>>> 4.0 / 3
1.3333333333333333
>>> (4.0 / 3).as_integer_ratio()                # Precision loss from float
(6004799503160661, 4503599627370496)

>>> x
Fraction(1, 3)
>>> a = x + Fraction(*(4.0 / 3).as_integer_ratio())
>>> a
Fraction(22517998136852479, 13510798882111488)

>>> 22517998136852479 / 13510798882111488.      # 5 / 3 (or close to it!)
1.6666666666666667

>>> a.limit_denominator(10)                     # Simplify to closest fraction
Fraction(5, 3)
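
The same technique recovers the intended value from other inexact floats. In this sketch, the float 0.1 converts to a fraction with a very large denominator, and limiting the denominator yields the one-tenth that was presumably meant:

```python
from fractions import Fraction

raw = Fraction.from_float(0.1)          # Exact value of the stored float
print(raw)                              # A very large numerator/denominator
print(raw.limit_denominator(10))        # 1/10
print(raw == Fraction(1, 10))           # False: the float was never exactly 0.1
```

The limit passed in is a judgment call: it should be large enough to express the value you expect, but small enough to exclude the float’s representation error.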

For more details on the Fraction type, experiment further on your own and consult the Python 2.6 and 3.0 library manuals and other documentation.

Sets

Python 2.4 also introduced a new collection type, the set—an unordered collection of unique and immutable objects that supports operations corresponding to mathematical set theory. By definition, an item appears only once in a set, no matter how many times it is added. As such, sets have a variety of applications, especially in numeric and database-focused work.

Because sets are collections of other objects, they share some behavior with objects such as lists and dictionaries that are outside the scope of this chapter. For example, sets are iterable, can grow and shrink on demand, and may contain a variety of object types. As we’ll see, a set acts much like the keys of a valueless dictionary, but it supports extra operations.

However, because sets are unordered and do not map keys to values, they are neither sequence nor mapping types; they are a type category unto themselves. Moreover, because sets are fundamentally mathematical in nature (and for many readers, may seem more academic and be used much less often than more pervasive objects like dictionaries), we’ll explore the basic utility of Python’s set objects here.

Set basics in Python 2.6

There are a few ways to make sets today, depending on whether you are using Python 2.6 or 3.0. Since this book covers both, let’s begin with the 2.6 case, which also is available (and sometimes still required) in 3.0; we’ll refine this for 3.0 extensions in a moment. To make a set object, pass in a sequence or other iterable object to the built-in set function:

>>> x = set('abcde')
>>> y = set('bdxyz')

You get back a set object, which contains all the items in the object passed in (notice that sets do not have a positional ordering, and so are not sequences):

>>> x
set(['a', 'c', 'b', 'e', 'd'])                    # 2.6 display format

Sets made this way support the common mathematical set operations with expression operators. Note that we can’t perform these expressions on plain sequences—we must create sets from them in order to apply these tools:

>>> 'e' in x                                      # Membership
True

>>> x - y                                         # Difference
set(['a', 'c', 'e'])

>>> x | y                                         # Union
set(['a', 'c', 'b', 'e', 'd', 'y', 'x', 'z'])

>>> x & y                                         # Intersection
set(['b', 'd'])

>>> x ^ y                                         # Symmetric difference (XOR)
set(['a', 'c', 'e', 'y', 'x', 'z'])

>>> x > y, x < y                                  # Superset, subset
(False, False)

In addition to expressions, the set object provides methods that correspond to these operations and more, and that support set changes—the set add method inserts one item, update is an in-place union, and remove deletes an item by value (run a dir call on any set instance or the set type name to see all the available methods). Assuming x and y are still as they were in the prior interaction:

>>> z = x.intersection(y)                         # Same as x & y
>>> z
set(['b', 'd'])
>>> z.add('SPAM')                                 # Insert one item
>>> z
set(['b', 'd', 'SPAM'])
>>> z.update(set(['X', 'Y']))                     # Merge: in-place union
>>> z
set(['Y', 'X', 'b', 'd', 'SPAM'])
>>> z.remove('b')                                 # Delete one item
>>> z
set(['Y', 'X', 'd', 'SPAM'])

As iterable containers, sets can also be used in operations such as len, for loops, and list comprehensions. Because they are unordered, though, they don’t support sequence operations like indexing and slicing:

>>> for item in set('abc'): print(item * 3)
...
aaa
ccc
bbb

Finally, although the set expressions shown earlier generally require two sets, their method-based counterparts can often work with any iterable type as well:

>>> S = set([1, 2, 3])

>>> S | set([3, 4])          # Expressions require both to be sets
set([1, 2, 3, 4])
>>> S | [3, 4]
TypeError: unsupported operand type(s) for |: 'set' and 'list'

>>> S.union([3, 4])          # But their methods allow any iterable
set([1, 2, 3, 4])
>>> S.intersection((1, 3, 5))
set([1, 3])
>>> S.issubset(range(-5, 5))
True

For more details on set operations, see Python’s library reference manual or a reference book. Although set operations can be coded manually in Python with other types, like lists and dictionaries (and often were in the past), Python’s built-in sets use efficient algorithms and implementation techniques to provide quick and standard operation.
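
As one illustration of where these operations pay off, the following sketch removes duplicates from a list and compares two collections. The data is made up, and sorted is applied only to give the printed results a predictable order:

```python
first = ['spam', 'eggs', 'ham', 'spam', 'eggs']     # Hypothetical data
second = ['ham', 'toast', 'spam']

unique = set(first)                        # Duplicates collapse on conversion
print(len(first), len(unique))             # 5 3

print(sorted(unique & set(second)))        # In both: ['ham', 'spam']
print(sorted(unique - set(second)))        # Only in first: ['eggs']
print(sorted(set(second) - unique))        # Only in second: ['toast']
```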

Set literals in Python 3.0

If you think sets are “cool,” they recently became noticeably cooler. In Python 3.0 we can still use the set built-in to make set objects, but 3.0 also adds a new set literal form, using the curly braces formerly reserved for dictionaries. In 3.0, the following are equivalent:

set([1, 2, 3, 4])                # Built-in call
{1, 2, 3, 4}                     # 3.0 set literals

This syntax makes sense, given that sets are essentially like valueless dictionaries—because a set’s items are unordered, unique, and immutable, the items behave much like a dictionary’s keys. This operational similarity is even more striking given that dictionary key lists in 3.0 are view objects, which support set-like behavior such as intersections and unions (see Chapter 8 for more on dictionary view objects).

In fact, regardless of how a set is made, 3.0 displays it using the new literal format. The set built-in is still required in 3.0 to create empty sets and to build sets from existing iterable objects (short of using set comprehensions, discussed later in this chapter), but the new literal is convenient for initializing sets of known structure:

C:\Misc> c:\python30\python
>>> set([1, 2, 3, 4])            # Built-in: same as in 2.6
{1, 2, 3, 4}
>>> set('spam')                  # Add all items in an iterable
{'a', 'p', 's', 'm'}

>>> {1, 2, 3, 4}                 # Set literals: new in 3.0
{1, 2, 3, 4}
>>> S = {'s', 'p', 'a', 'm'}
>>> S.add('alot')
>>> S
{'a', 'p', 's', 'm', 'alot'}

All the set processing operations discussed in the prior section work the same in 3.0, but the result sets print differently:

>>> S1 = {1, 2, 3, 4}
>>> S1 & {1, 3}                  # Intersection
{1, 3}
>>> {1, 5, 3, 6} | S1            # Union
{1, 2, 3, 4, 5, 6}
>>> S1 - {1, 3, 4}               # Difference
{2}
>>> S1 > {1, 3}                  # Superset
True

Note that {} is still a dictionary in Python. Empty sets must be created with the set built-in, and print the same way:

>>> S1 - {1, 2, 3, 4}            # Empty sets print differently
set()
>>> type({})                     # Because {} is an empty dictionary
<class 'dict'>

>>> S = set()                    # Initialize an empty set
>>> S.add(1.23)
>>> S
{1.23}

As in Python 2.6, sets created with 3.0 literals support the same methods, some of which allow general iterable operands that expressions do not:

>>> {1, 2, 3} | {3, 4}
{1, 2, 3, 4}
>>> {1, 2, 3} | [3, 4]
TypeError: unsupported operand type(s) for |: 'set' and 'list'

>>> {1, 2, 3}.union([3, 4])
{1, 2, 3, 4}
>>> {1, 2, 3}.union({3, 4})
{1, 2, 3, 4}
>>> {1, 2, 3}.union(set([3, 4]))
{1, 2, 3, 4}

>>> {1, 2, 3}.intersection((1, 3, 5))
{1, 3}
>>> {1, 2, 3}.issubset(range(-5, 5))
True

Immutable constraints and frozen sets

Sets are powerful and flexible objects, but they do have one constraint in both 3.0 and 2.6 that you should keep in mind: largely because of their implementation, sets can only contain immutable (a.k.a. “hashable”) object types. Hence, lists and dictionaries cannot be embedded in sets, but tuples can if you need to store compound values. Tuples compare by their full values when used in set operations:

>>> S
{1.23}
>>> S.add([1, 2, 3])                   # Only immutable objects work in a set
TypeError: unhashable type: 'list'
>>> S.add({'a':1})
TypeError: unhashable type: 'dict'
>>> S.add((1, 2, 3))
>>> S                                  # No list or dict, but tuple okay
{1.23, (1, 2, 3)}

>>> S | {(4, 5, 6), (1, 2, 3)}         # Union: same as S.union(...)
{1.23, (4, 5, 6), (1, 2, 3)}
>>> (1, 2, 3) in S                     # Membership: by complete values
True
>>> (1, 4, 3) in S
False

Tuples in a set, for instance, might be used to represent dates, records, IP addresses, and so on (more on tuples later in this part of the book). Sets themselves are mutable too, and so cannot be nested in other sets directly; if you need to store a set inside another set, the frozenset built-in call works just like set but creates an immutable set that cannot change and thus can be embedded in other sets.
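
The frozenset behavior just described might look like the following in practice. This is a minimal sketch rather than a complete tour; the names inner and outer are illustrative only:

```python
inner = frozenset([1, 2])            # Immutable set, so it is hashable
outer = {inner, frozenset([3, 4])}   # And can be nested in another set

print(frozenset([2, 1]) in outer)    # True: membership is by value, as for tuples

merged = inner | {3}                 # Set expressions still work
print(type(merged))                  # <class 'frozenset'>

try:
    outer.add({5, 6})                # A plain, mutable set is not hashable
except TypeError:
    print('cannot nest a mutable set')
```

Because a frozenset cannot be changed, it has no add or remove methods; operations such as union return new objects instead.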

Set comprehensions in Python 3.0

In addition to literals, 3.0 introduces a set comprehension construct; it is similar in form to the list comprehension we previewed in Chapter 4, but is coded in curly braces instead of square brackets and run to make a set instead of a list. Set comprehensions run a loop and collect the result of an expression on each iteration; a loop variable gives access to the current iteration value for use in the collection expression. The result is a new set created by running the code, with all the normal set behavior:

>>> {x ** 2 for x in [1, 2, 3, 4]}         # 3.0 set comprehension
{16, 1, 4, 9}

In this expression, the loop is coded on the right, and the collection expression is coded on the left (x ** 2). As for list comprehensions, we get back pretty much what this expression says: “Give me a new set containing X squared, for every X in a list.” Comprehensions can also iterate across other kinds of objects, such as strings (the first of the following examples illustrates the comprehension-based way to make a set from an existing iterable):

>>> {x for x in 'spam'}                    # Same as: set('spam')
{'a', 'p', 's', 'm'}

>>> {c * 4 for c in 'spam'}                # Set of collected expression results
{'ssss', 'aaaa', 'pppp', 'mmmm'}
>>> {c * 4 for c in 'spamham'}
{'ssss', 'aaaa', 'hhhh', 'pppp', 'mmmm'}

>>> S = {c * 4 for c in 'spam'}
>>> S | {'mmmm', 'xxxx'}
{'ssss', 'aaaa', 'pppp', 'mmmm', 'xxxx'}
>>> S & {'mmmm', 'xxxx'}
{'mmmm'}

Because the rest of the comprehensions story relies upon underlying concepts we’re not yet prepared to address, we’ll postpone further details until later in this book. In Chapter 8, we’ll meet a first cousin in 3.0, the dictionary comprehension, and I’ll have much more to say about all comprehensions (list, set, dictionary, and generator) later, especially in Chapters 14 and 20. As we’ll learn later, all comprehensions, including set comprehensions, support additional syntax not shown here, including nested loops and if tests, which can be difficult to understand until you’ve had a chance to study larger statements.

Why sets?

Set operations have a variety of common uses, some more practical than mathematical. For example, because items are stored only once in a set, sets can be used to filter duplicates out of other collections. Simply convert the collection to a set, and then convert it back again (because sets are iterable, they work in the list call here):

>>> L = [1, 2, 1, 3, 2, 4, 5]
>>> set(L)
{1, 2, 3, 4, 5}
>>> L = list(set(L))                     # Remove duplicates
>>> L
[1, 2, 3, 4, 5]
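
Because sets are unordered, the list(set(L)) round trip is not guaranteed to keep items in their original positions. When order matters, one common alternative (sketched here with loop statements we'll study later, and with illustrative names seen and unique) uses a set only to detect duplicates while a list records first appearances:

```python
L = [3, 1, 3, 2, 1, 5]

seen = set()                 # Items already encountered
unique = []                  # Result list, in first-seen order
for item in L:
    if item not in seen:     # Set membership tests are fast
        seen.add(item)
        unique.append(item)

print(unique)                # [3, 1, 2, 5]
```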

Sets can also be used to keep track of where you’ve already been when traversing a graph or other cyclic structure. For example, the transitive module reloader and inheritance tree lister examples we’ll study in Chapters 24 and 30, respectively, must keep track of items visited to avoid loops. Although recording states visited as keys in a dictionary is efficient, sets offer an alternative that’s essentially equivalent (and may be more or less intuitive, depending on who you ask).
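
To make the idea concrete, here is a minimal sketch of a traversal that records visited nodes in a set. It is not the code of the later chapters' examples; the graph dictionary and the reachable function are hypothetical and exist only for this illustration:

```python
# A small hypothetical graph with a cycle: A -> B -> C -> A
graph = {'A': ['B', 'C'], 'B': ['C'], 'C': ['A'], 'D': []}

def reachable(graph, start):
    """Return the set of nodes reachable from start, including start."""
    visited = set()                  # Where we've already been
    todo = [start]                   # Nodes still to be explored
    while todo:
        node = todo.pop()
        if node in visited:          # Skip repeats: avoids looping forever
            continue
        visited.add(node)
        todo.extend(graph[node])
    return visited

print(sorted(reachable(graph, 'A')))   # ['A', 'B', 'C']
```

Without the visited test, the cycle from C back to A would keep this loop running indefinitely.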

Finally, sets are also convenient when dealing with large data sets (database query results, for example)—the intersection of two sets contains objects in common to both categories, and the union contains all items in either set. To illustrate, here’s a somewhat more realistic example of set operations at work, applied to lists of people in a hypothetical company, using 3.0 set literals (use set in 2.6):

>>> engineers = {'bob', 'sue', 'ann', 'vic'}
>>> managers  = {'tom', 'sue'}

>>> 'bob' in engineers                   # Is bob an engineer?
True

>>> engineers & managers                 # Who is both engineer and manager?
{'sue'}

>>> engineers | managers                 # All people in either category
{'vic', 'sue', 'tom', 'bob', 'ann'}

>>> engineers - managers                 # Engineers who are not managers
{'vic', 'bob', 'ann'}

>>> managers - engineers                 # Managers who are not engineers
{'tom'}
{'tom'}

>>> engineers > managers                 # Are all managers engineers? (superset)
False

>>> {'bob', 'sue'} < engineers           # Are both engineers? (subset)
True

>>> (managers | engineers) > managers    # All people is a superset of managers
True

>>> managers ^ engineers                 # Who is in one but not both?
{'vic', 'bob', 'ann', 'tom'}

>>> (managers | engineers) - (managers ^ engineers)     # Intersection!
{'sue'}

You can find more details on set operations in the Python library manual and some mathematical and relational database theory texts. Also stay tuned for Chapter 8’s revival of some of the set operations we’ve seen here, in the context of dictionary view objects in Python 3.0.

Booleans

Some argue that the Python Boolean type, bool, is numeric in nature because its two values, True and False, are just customized versions of the integers 1 and 0 that print themselves differently. Although that’s all most programmers need to know, let’s explore this type in a bit more detail.

More formally, Python today has an explicit Boolean data type called bool, with the values True and False available as new preassigned built-in names. Internally, the names True and False are instances of bool, which is in turn just a subclass (in the object-oriented sense) of the built-in integer type int. True and False behave exactly like the integers 1 and 0, except that they have customized printing logic—they print themselves as the words True and False, instead of the digits 1 and 0. bool accomplishes this by redefining str and repr string formats for its two objects.

Because of this customization, the output of Boolean expressions typed at the interactive prompt prints as the words True and False instead of the older and less obvious 1 and 0. In addition, Booleans make truth values more explicit. For instance, an infinite loop can now be coded as while True: instead of the less intuitive while 1:. Similarly, flags can be initialized more clearly with flag = False. We’ll discuss these statements further in Part III.
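
As a small preview of those statements, the following sketch combines a while True: loop with a Boolean flag. The list of values and the threshold of 10 are arbitrary choices made for this example:

```python
values = [3, 8, 12, 7]       # Hypothetical data to search
found = False                # Flag: more explicit than found = 0
i = 0
while True:                  # Infinite loop: more explicit than while 1:
    if i == len(values):     # Ran off the end: stop without a match
        break
    if values[i] > 10:       # First value above the threshold
        found = True
        break
    i += 1

print(found, i)              # True 2
```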

Again, though, for all other practical purposes, you can treat True and False as though they are predefined variables set to integer 1 and 0. Most programmers used to preassign True and False to 1 and 0 anyway; the bool type simply makes this standard. Its implementation can lead to curious results, though. Because True is just the integer 1 with a custom display format, True + 4 yields 5 in Python:

>>> type(True)
<class 'bool'>
>>> isinstance(True, int)
True
>>> True == 1                # Same value
True
>>> True is 1                # But different object: see the next chapter
False
>>> True or False            # Same as: 1 or 0
True
>>> True + 4                 # (Hmmm)
5

Since you probably won’t come across an expression like the last of these in real Python code, you can safely ignore its deeper metaphysical implications....

We’ll revisit Booleans in Chapter 9 (to define Python’s notion of truth) and again in Chapter 12 (to see how Boolean operators like and and or work).

Numeric Extensions

Finally, although Python core numeric types offer plenty of power for most applications, there is a large library of third-party open source extensions available to address more focused needs. Because numeric programming is a popular domain for Python, you’ll find a wealth of advanced tools.

For example, if you need to do serious number crunching, an optional extension for Python called NumPy (Numeric Python) provides advanced numeric programming tools, such as a matrix data type, vector processing, and sophisticated computation libraries. Hardcore scientific programming groups at places like Los Alamos and NASA use Python with NumPy to implement the sorts of tasks they previously coded in C++, FORTRAN, or Matlab. The combination of Python and NumPy is often compared to a free, more flexible version of Matlab—you get NumPy’s performance, plus the Python language and its libraries.

Because it’s so advanced, we won’t talk further about NumPy in this book. You can find additional support for advanced numeric programming in Python, including graphics and plotting tools, statistics libraries, and the popular SciPy package at Python’s PyPI site, or by searching the Web. Also note that NumPy is currently an optional extension; it doesn’t come with Python and must be installed separately.

Chapter Summary

This chapter has taken a tour of Python’s numeric object types and the operations we can apply to them. Along the way, we met the standard integer and floating-point types, as well as some more exotic and less commonly used types such as complex numbers, fractions, and sets. We also explored Python’s expression syntax, type conversions, bitwise operations, and various literal forms for coding numbers in scripts.

Later in this part of the book, I’ll fill in some details about the next object type, the string. In the next chapter, however, we’ll take some time to explore the mechanics of variable assignment in more detail than we have here. This turns out to be perhaps the most fundamental idea in Python, so make sure you check out the next chapter before moving on. First, though, it’s time to take the usual chapter quiz.

Test Your Knowledge: Quiz

  1. What is the value of the expression 2 * (3 + 4) in Python?

  2. What is the value of the expression 2 * 3 + 4 in Python?

  3. What is the value of the expression 2 + 3 * 4 in Python?

  4. What tools can you use to find a number’s square root, as well as its square?

  5. What is the type of the result of the expression 1 + 2.0 + 3?

  6. How can you truncate and round a floating-point number?

  7. How can you convert an integer to a floating-point number?

  8. How would you display an integer in octal, hexadecimal, or binary notation?

  9. How might you convert an octal, hexadecimal, or binary string to a plain integer?

Test Your Knowledge: Answers

  1. The value will be 14, the result of 2 * 7, because the parentheses force the addition to happen before the multiplication.

  2. The value will be 10, the result of 6 + 4. Python’s operator precedence rules are applied in the absence of parentheses, and multiplication has higher precedence than (i.e., happens before) addition, per Table 5-2.

  3. This expression yields 14, the result of 2 + 12, for the same precedence reasons as in the prior question.

  4. Functions for obtaining the square root, as well as pi, tangents, and more, are available in the imported math module. To find a number’s square root, import math and call math.sqrt(N). To get a number’s square, use either the exponent expression X ** 2 or the built-in function pow(X, 2). Either of these last two can also compute the square root when given a power of 0.5 (e.g., X ** .5).

  5. The result will be a floating-point number: the integers are converted up to floating point, the most complex type in the expression, and floating-point math is used to evaluate it.

  6. The int(N) and math.trunc(N) functions truncate, and the round(N, digits) function rounds. We can also compute the floor with math.floor(N) and round for display with string formatting operations.

  7. The float(I) function converts an integer to a floating point; mixing an integer with a floating point within an expression will result in a conversion as well. In some sense, Python 3.0 / division converts too—it always returns a floating-point result that includes the remainder, even if both operands are integers.

  8. The oct(I) and hex(I) built-in functions return the octal and hexadecimal string forms for an integer. The bin(I) call also returns a number’s binary digits string in Python 2.6 and 3.0. The % string formatting expression and format string method also provide targets for some such conversions.

  9. The int(S, base) function can be used to convert octal, hexadecimal, and binary digit strings to normal integers (pass in 8, 16, or 2 for the base, respectively). The eval(S) function can be used for this purpose too, but it’s more expensive to run and can have security issues. Note that integers are always stored in binary in computer memory; these are just display string format conversions.
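
If you'd like to verify these answers yourself, the following script collects them in one place. It is a sketch that assumes Python 3.X; in 2.6, oct displays octal as 0100 and / truncates integer operands. The sample values are arbitrary:

```python
import math

print(2 * (3 + 4), 2 * 3 + 4, 2 + 3 * 4)      # Answers 1-3: 14 10 14
print(math.sqrt(144), 144 ** .5, pow(12, 2))  # Answer 4: 12.0 12.0 144
print(type(1 + 2.0 + 3))                      # Answer 5: <class 'float'>

# Answer 6: truncate, round, and floor
print(int(2.567), math.trunc(2.567), round(2.567, 2), math.floor(-2.567))

print(float(3), 3 / 2)                        # Answer 7: 3.0 1.5
print(oct(64), hex(64), bin(64))              # Answer 8: 0o100 0x40 0b1000000
print('%o %x' % (64, 64), '{0:b}'.format(64)) # Answer 8, via string formatting
print(int('100', 8), int('40', 16), int('1000000', 2))   # Answer 9: 64 64 64
```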



[15] If you’re working along, you don’t need to type any of the comment text from the # through to the end of the line; comments are simply ignored by Python and not required parts of the statements we’re running.
