Chapter 5. Expressions

C++ provides a rich set of operators and defines what these operators do when applied to operands of built-in type. It also allows us to define meanings for the operators when applied to class types. This facility, known as operator overloading, is used by the library to define the operators that apply to the library types.

In this chapter our focus is on the operators as defined in the language and applied to operands of built-in type. We will also look at some of the operators defined by the library. Chapter 14 shows how we can define our own overloaded operators.

An expression is composed of one or more operands that are combined by operators. The simplest form of an expression consists of a single literal constant or variable. More complicated expressions are formed from an operator and one or more operands.

Every expression yields a result. In the case of an expression with no operator, the result is the operand itself, e.g., a literal constant or a variable. When an object is used in a context that requires a value, then the object is evaluated by fetching the object’s value. For example, assuming ival is an int object,

if (ival) // evaluate ival as a condition
// ....

we could use ival as an expression in the condition of an if. The condition succeeds if the value of ival is not zero and fails otherwise.

The result of expressions that involve operators is determined by applying each operator to its operand(s). Except when noted otherwise, the result of an expression is an rvalue (Section 2.3.1, p. 45). We can read the result but cannot assign to it.

The meaning of an operator—what operation is performed and the type of the result—depends on the types of its operands.

Until one knows the type of the operand(s), it is not possible to know what a particular expression means. The expression

i + j

might mean integer addition, concatenation of strings, floating-point addition, or something else entirely. How the expression is evaluated depends on the types of i and j.

There are both unary operators and binary operators. Unary operators, such as address-of (&) and dereference (*), act on one operand. Binary operators, such as addition (+) and subtraction (-), act on two operands. There is also one ternary operator that takes three operands. We’ll look at this operator in Section 5.7 (p. 165).

Some symbols, such as *, are used to represent both a unary and a binary operator. The * symbol is used as the (unary) dereference operator and as the (binary) multiplication operator. The uses of the symbol are independent; it can be helpful to think of them as two different symbols. The context in which an operator symbol is used always determines whether the symbol represents a unary or binary operator.

Operators impose requirements on the type(s) of their operand(s). The language defines the type requirements for the operators when applied to built-in or compound types. For example, the dereference operator, when applied to an object of built-in type, requires that its operand be a pointer type. Attempting to dereference an object of any other built-in or compound type is an error.

The binary operators, when applied to operands of built-in or compound type, usually require that the operands be the same type, or types that can be converted to a common type. We’ll look at conversions in Section 5.12 (p. 178). Although the rules can be complex, for the most part conversions happen in expected ways. For example, we can convert an integer to floating-point, and vice versa, but we cannot convert a pointer type to floating-point.

Understanding expressions with multiple operators requires understanding operator precedence, associativity, and the order of evaluation of the operands. For example, the expression

5 + 10 * 20/2;

uses addition, multiplication, and division. The result of this expression depends on how the operands are grouped to the operators. For example, the operands to the * operator could be 10 and 20, or 10 and 20/2, or 15 and 20 or 15 and 20/2. Associativity and precedence rules specify the grouping of operators and their operands. In C++ this expression evaluates to 105, which is the result of multiplying 10 and 20, dividing that result by 2, and then adding 5.

Knowing how operands and operators are grouped is not always sufficient to determine the result. It may also be necessary to know in what order the operands to each operator are evaluated. Each operator controls what assumptions, if any, can be made as to the order in which the operands will be evaluated—that is, whether we can assume that the left-hand operand is always evaluated before the right or not. Most operators do not guarantee a particular order of evaluation. We will cover these topics in Section 5.10 (p. 168).

5.1 Arithmetic Operators

Table 5.1. Arithmetic Operators

Unless noted otherwise, these operators may be applied to any of the arithmetic types (Section 2.1, p. 34), or any type that can be converted to an arithmetic type.

The table groups the operators by their precedence—the unary operators have the highest precedence, then the multiplication and division operators, and then the binary addition and subtraction operators. Operators of higher precedence group more tightly than do operators with lower precedence. These operators are all left associative, meaning that they group left to right when the precedence levels are the same.

Applying precedence and associativity to the previous expression:

5 + 10 * 20/2;

we can see that the operands to the multiplication operator (*) are 10 and 20. The result of that expression and 2 are the operands to the division operator (/). The result of that division and 5 are the operands to the addition operator (+).

The unary minus operator has the obvious meaning. It negates its operand:

int i = 1024;
int k = -i; // negates the value of its operand

Unary plus returns the operand itself. It makes no change to its operand.

Caution: Overflow and Other Arithmetic Exceptions

The result of evaluating some arithmetic expressions is undefined. Some expressions are undefined due to the nature of mathematics—for example, division by zero. Others are undefined due to the nature of computers—such as overflow, in which a value is computed that is too large for its type.

Consider a machine on which shorts are 16 bits. In that case, the maximum short is 32767. Given only 16 bits, the following compound assignment overflows:

     // max value if shorts are 16 bits
     short short_value = 32767;
     short ival = 1;
     // this calculation overflows
     short_value += ival;
     cout << "short_value: " << short_value << endl;

Representing a signed value of 32768 requires 17 bits, but only 16 are available. On many systems, there is no compile-time or run-time warning when an overflow might occur. The actual value put into short_value varies across different machines. On our system the program completes and writes

short_value: -32768

The value “wrapped around”: The sign bit, which had been 0, was set to 1, resulting in a negative value. Because the arithmetic types have limited size, it is always possible for some calculations to overflow. Adhering to the recommendations from the “Advice” box on page 38 can help avoid such problems.

The binary + and - operators may also be applied to pointer values. The use of these operators with pointers was described in Section 4.2.4 (p. 123).

The arithmetic operators, +, -, *, and / have their obvious meanings: addition, subtraction, multiplication, and division. Division between integers results in an integer. If the quotient contains a fractional part, it is truncated:

int ival1 = 21/6; // integral result obtained by truncating the remainder
int ival2 = 21/7; // no remainder, result is an integral value

Both ival1 and ival2 are initialized with a value of 3.

The % operator is known as the “remainder” or the “modulus” operator. It computes the remainder of dividing the left-hand operand by the right-hand operand. This operator can be applied only to operands of the integral types: bool, char, short, int, long, and their associated unsigned types:

     int ival = 42;
     double dval = 3.14;
     ival % 12;   //  ok: returns 6
     ival % dval; //  error: floating point operand

For both division (/) and modulus (%), when both operands are positive, the result is positive (or zero). If both operands are negative, the result of division is positive (or zero) and the result of modulus is negative (or zero). If only one operand is negative, then the value of the result is machine-dependent for both operators. For modulus the sign is machine-dependent as well; for division the sign is always negative (or zero):

     21 % 6;   //  ok: result is 3
     21 % 7;   //  ok: result is 0
     -21 % -8; //  ok: result is -5
     21 % -5;  //  machine-dependent: result is 1 or -4
     21 / 6;   //  ok: result is 3
     21 / 7;   //  ok: result is 3
     -21 / -8; //  ok: result is 2
     21 / -5;  //  machine-dependent: result -4 or -5

When only one operand is negative, the sign and value of the result for the modulus operator can follow either the sign of the numerator or of the denominator. On a machine where modulus follows the sign of the numerator, division truncates toward zero. If modulus matches the sign of the denominator, then division truncates toward minus infinity.

Exercises Section 5.1

Exercise 5.1: Parenthesize the following expression to indicate how it is evaluated. Test your answer by compiling the expression and printing its result.

12 / 3 * 4 + 5 * 15 + 24 % 4 / 2

Exercise 5.2: Determine the result of the following expressions and indicate which results, if any, are machine-dependent.

     -30 * 3 + 21 / 5
     -30 + 3 * 21 / 5
     30 / 3 * 21 % 5
     -30 / 3 * 21 % 4

Exercise 5.3: Write an expression to determine whether an int value is even or odd.

Exercise 5.4: Define the term overflow. Show three expressions that will overflow.

5.2 Relational and Logical Operators

Table 5.2. Relational and Logical Operators

The relational and logical operators take operands of arithmetic or pointer type and return values of type bool.

Logical AND and OR Operators

The logical operators treat their operands as conditions (Section 1.4.1, p. 12). The operand is evaluated; if the result is zero the condition is false, otherwise it is true. The overall result of the AND operator is true if and only if both its operands evaluate to true. The logical OR (||) operator evaluates to true if either of its operands evaluates to true. Given the forms

expr1 && expr2 // logical AND
expr1 || expr2 // logical OR

expr2 is evaluated if and only if expr1 does not by itself determine the result. In other words, we’re guaranteed that expr2 will be evaluated if and only if:

• In a logical AND expression, expr1 evaluates to true. If expr1 is false, then the expression will be false regardless of the value of expr2. When expr1 is true, it is possible for the expression to be true if expr2 is also true.

• In a logical OR expression, expr1 evaluates to false; if expr1 is false, then the expression depends on whether expr2 is true.

The logical AND and OR operators always evaluate their left operand before the right. The right operand is evaluated only if the left operand does not determine the result. This evaluation strategy is often referred to as “short-circuit evaluation.”

A valuable use of the logical AND operator is to have expr1 evaluate to false in the presence of a boundary condition that would make the evaluation of expr2 dangerous. As an example, we might have a string that contains the characters in a sentence and we might want to make the first word in the sentence all uppercase. We could do so as follows:

In this case, we combine our two tests in the condition in the while. First we test whether the iterator it has reached the end of the string. If not, it refers to a character in s. Only if that test succeeds is the right-hand operand evaluated. We’re guaranteed that it refers to an actual character before we test to see whether the character is a space or not. The loop ends either when a space is encountered or, if there are no spaces in s, when we reach the end of s.

Logical NOT Operator

The logical NOT operator (!) treats its operand as a condition. It yields a result that has the opposite truth value from its operand. If the operand evaluates as nonzero, then ! returns false. For example, we might determine that a vector has elements by applying the logical NOT operator to the value returned by empty:

     // assign value of first element in vec to x if there is one
     int x = 0;
     if (!vec.empty())
         x = *vec.begin();

The subexpression

!vec.empty()

evaluates to true if the call to empty returns false.

The Relational Operators Do Not Chain Together

The relational operators (<, <=, >, >=) are left associative. The fact that they are left associative is rarely of any use because the relational operators return bool results. If we do chain these operators together, the result is likely to be surprising:

// oops! this condition does not test whether i is less than j and j is less than k
if (i < j < k) { /* ... */ }

As written, this expression will evaluate as true if k is greater than one! The reason is that the left operand of the second less-than operator is the true/false result of the first; that is, the condition compares k to the integer value 0 or 1. To accomplish the test we intended, we must rewrite the expression as follows:

if (i < j && j < k) { /* ... */ }

Equality Tests and the `bool` Literals

As we’ll see in Section 5.12.2 (p. 180) a bool can be converted to any arithmetic type—the bool value false converts to zero and true converts to one.

Because true converts to one, it is almost never right to write an equality test that tests against the bool literal true:

if (val == true) { /* ... */ }

Either val is itself a bool or it is a type to which a bool can be converted. If val is a bool, then this test is equivalent to writing

if (val) { /* ... */ }

which is shorter and more direct (although admittedly when first learning the language this kind of abbreviation can be perplexing).

More importantly, if val is not a bool, then comparing val with true is equivalent to writing

if (val == 1) { /* ... */ }

which is very different from

// condition succeeds if val is any nonzero value
if (val) { /* ... */ }

in which any nonzero value in val is true. If we write the comparison explicitly, then we are saying that the condition will succeed only for the specific value 1.

Exercises Section 5.2

Exercise 5.5: Explain when operands are evaluated in the logical AND operator, logical OR operator, and equality operator.

Exercise 5.6: Explain the behavior of the following while condition:

const char *cp = "Hello World";
while (cp && *cp)

Exercise 5.7: Write the condition for a while loop that would read ints from the standard input and stop when the value read is equal to 42.

Exercise 5.8: Write an expression that tests four values, a, b, c, and d, and ensures that a is greater than b, which is greater than c, which is greater than d.

5.3 The Bitwise Operators

The bitwise operators take operands of integral type. These operators treat their integral operands as a collection of bits, providing operations to test and set individual bits. In addition, these operators may be applied to bitset (Section 3.5, p. 101) operands with the behavior as described here for integral operands.

Table 5.3. Bitwise Operators

The type of an integer manipulated by the bitwise operators can be either signed or unsigned. If the value is negative, then the way that the “sign bit” is handled in a number of the bitwise operations is machine-dependent. It is, therefore, likely to differ across implementations; programs that work under one implementation may fail under another.

Because there are no guarantees for how the sign bit is handled, we strongly recommend using an unsigned type when using an integral value with the bitwise operators.

In the following examples we assume that an unsigned char has 8 bits. The bitwise NOT operator (~) is similar in behavior to the bitset flip (Section 3.5.2, p. 105) operation: It generates a new value with the bits of its operand inverted. Each 1 bit is set to 0; each 0 bit is set to 1:

The <<, >> operators are the bitwise shift operators. These operators use their right-hand operand to indicate by how many bits to shift. They yield a value that is a copy of the left-hand operand with the bits shifted as directed by the right-hand operand. The bits are shifted left (<<) or right (>>), discarding the bits that are shifted off the end.

The left shift operator (<<) inserts 0-valued bits in from the right. The right shift operator (>>) inserts 0-valued bits in from the left if the operand is unsigned. If the operand is signed, it can either insert copies of the sign bit or insert 0-valued bits; which one it uses is implementation defined. The right-hand operand must not be negative and must be a value that is strictly less than the number of bits in the left-hand operand. Otherwise, the effect of the operation is undefined.

The bitwise AND operator (&) takes two integral operands. For each bit position, the result is 1 if both operands contain 1; otherwise, the result is 0.

It is a common error to confuse the bitwise AND operator (&) with the logical AND operator (&&) (Section 5.2, p. 152). Similarly, it is common to confuse the bitwise OR operator (|) and the logical OR operator (||).

Here we illustrate the result of bitwise AND of two unsigned char values, each of which is initialized by an octal literal:

The bitwise XOR (exclusive or) operator (^) also takes two integral operands. For each bit position, the result is 1 if either but not both operands contain 1; otherwise, the result is 0.

The bitwise OR (inclusive or) operator (|) takes two integral operands. For each bit position, the result is 1 if either or both operands contain 1; otherwise, the result is 0.

5.3.1 Using `bitset` Objects or Integral Values

We said that the bitset class was easier to use than the lower-level bitwise operations on integral values. Let’s look at a simple example and show how we might solve a problem using either a bitset or the bitwise operators. Assume that a teacher has 30 students in a class. Each week the class is given a pass/fail quiz. We’ll track the results of each quiz using one bit per student to represent the pass or fail grade on a given test. We might represent each quiz in either a bitset or as an integral value:

bitset<30> bitset_quiz1; // bitset solution
unsigned long int_quiz1 = 0; // simulated collection of bits

In the bitset case we can define bitset_quiz1 to be exactly the size we need. By default each of the bits is set to zero. In the case where we use a built-in type to hold our quiz results, we define int_quiz1 as an unsigned long, meaning that it will have at least 32 bits on any machine. Finally, we explicitly initialize int_quiz1 to ensure that the bits start out with well-defined values.

The teacher must be able to set and test individual bits. For example, assuming that the student represented by position 27 passed, we’d like to be able to set that bit appropriately:

bitset_quiz1.set(27); // indicate student number 27 passed
int_quiz1 |= 1UL<<27; // indicate student number 27 passed

In the bitset case we do so directly by passing the bit we want turned on to set. The unsigned long case will take a bit more explanation. The way we’ll set a specific bit is to OR our quiz data with another integer that has only one bit—the one we want—turned on. That is, we need an unsigned long where bit 27 is a one and all the other bits are zero. We can obtain such a value by using the left shift operator and the integer constant 1:

1UL << 27; // generate a value with only bit number 27 set

Now when we bitwise OR this value with int_quiz1, all the bits except bit 27 will remain unchanged. That bit will be turned on. We use a compound assignment (Section 1.4.1, p. 13) to OR this value into int_quiz1. This operator, |=, executes in the same way that += does. It is equivalent to the more verbose:

// following assignment is equivalent to int_quiz1 |= 1UL << 27;
int_quiz1 = int_quiz1 | 1UL << 27;

Imagine that the teacher reexamined the quiz and discovered that student 27 actually had failed the test. The teacher must now turn off bit 27:

bitset_quiz1.reset(27); // student number 27 failed
int_quiz1 &= ~(1UL<<27); // student number 27 failed

Again, the bitset version is direct. We reset the indicated bit. For the simulated case, we need to do the inverse of what we did to set the bit: This time we’ll need an integer that has bit 27 turned off and all the other bits turned on. We’ll bitwise AND this value with our quiz data to turn off just that bit. We can obtain a value with all but bit 27 turned on by inverting our previous value. Applying the bitwise NOT to the previous integer will turn on every bit except the 27th. When we bitwise AND this value with int_quiz1, all except bit 27 will remain unchanged.

Finally, we might want to know how the student at position 27 fared. To do so, we could write

     bool status;
     status = bitset_quiz1[27];       // how did student number 27 do?
     status = int_quiz1 & (1UL<<27);  // how did student number 27 do?

In the bitset case we can fetch the value directly to determine how that student did. In the unsigned long case, the first step is to set the 27th bit of an integer to 1. The bitwise AND of this value with int_quiz1 evaluates to nonzero if bit 27 of int_quiz1 is also on; otherwise, it evaluates to zero.

In general, the library bitset operations are more direct, easier to read, easier to write, and more likely to be used correctly. Moreover, the size of a bitset is not limited by the number of bits in an unsigned. Ordinarily bitset should be used in preference to lower-level direct bit manipulation of integral values.

Exercises Section 5.3.1

Exercise 5.9: Assume the following two definitions:

unsigned long ul1 = 3, ul2 = 7;

What is the result of each of the following expressions?

(a) ul1 & ul2 (c) ul1 | ul2
(b) ul1 && ul2 (d) ul1 || ul2

Exercise 5.10: Rewrite the bitset expressions that set and reset the quiz results using a subscript operator.

5.3.2 Using the Shift Operators for IO

The IO library redefines the bitwise >> and << operators to do input and output. Even though many programmers never need to use the bitwise operators directly, most programs do make extensive use of the overloaded versions of these operators for IO. When we use an overloaded operator, it has the same precedence and associativity as is defined for the built-in version of the operator. Therefore, programmers need to understand the precedence and associativity of these operators even if they never use them with their built-in meaning as the shift operators.

The IO Operators Are Left Associative

Like the other binary operators, the shift operators are left associative. These operators group from left to right, which accounts for the fact that we can concatenate input and output operations into a single statement:

cout << "hi" << " there" << endl;

executes as:

( (cout << "hi") << " there" ) << endl;

In this statement, the operand "hi" is grouped with the first << symbol. Its result is grouped with the second, and then that result is grouped to the third.

The shift operators have midlevel precedence: lower precedence than the arithmetic operators but higher than the relational, assignment, or conditional operators. These relative precedence levels affect how we write IO expressions involving operands that use operators with lower precedence. We often need to use parentheses to force the right grouping:

     cout << 42 + 10;   // ok, + has higher precedence, so the sum is printed
     cout << (10 < 42); // ok: parentheses force intended grouping; prints 1
     cout << 10 < 42;   // error: attempt to compare cout to 42!

The last cout is interpreted as

(cout << 10) < 42;

This expression says to “write 10 onto cout and then compare the result of that operation (i.e., cout) to 42.”

5.4 Assignment Operators

The left-hand operand of an assignment operator must be a nonconst lvalue. Each of these assignments is illegal:

     int i = 0, j = 0, ival = 0;
     const int ci = i;  // ok: initialization not assignment
     1024 = ival;       // error: literals are rvalues
     i + j = ival;      // error: arithmetic expressions are rvalues
     ci = ival;         // error: can't write to ci

Array names are nonmodifiable lvalues: An array cannot be the target of an assignment. Both the subscript and dereference operators return lvalues. The result of dereference or subscript, when applied to a nonconst array, can be the left-hand operand of an assignment:

     int ia[10];
     ia[0] = 0;    // ok: subscript is an lvalue
     *ia = 0;      // ok: dereference also is an lvalue

The result of an assignment is the left-hand operand; the type of the result is the type of the left-hand operand.

The value assigned to the left-hand operand ordinarily is the value that is in the right-hand operand. However, assignments where the types of the left and right operands differ may require conversions that might change the value being assigned. In such cases, the value stored in the left-hand operand might differ from the value of the right-hand operand:

ival = 0; // result: type int value 0
ival = 3.14159; // result: type int value 3

Both these assignments yield values of type int. In the first case the value stored in ival is the same value as in its right-hand operand. In the second case the value stored in ival is different from the right-hand operand.

5.4.1 Assignment Is Right Associative

Like the subscript and dereference operators, assignment returns an lvalue. As such, we can perform multiple assignments in a single expression, provided that each of the operands being assigned is of the same general type:

int ival, jval;
ival = jval = 0; // ok: each assigned 0

Unlike the other binary operators, the assignment operators are right associative. We group an expression with multiple assignment operators from right to left. In this expression, the result of the rightmost assignment (i.e., jval) is assigned to ival. The types of the objects in a multiple assignment either must be the same type or of types that can be converted (Section 5.12, p. 178) to one another:

The first assignment is illegal because ival and pval are objects of different types. It is illegal even though zero happens to be a value that could be assigned to either object. The problem is that the result of the assignment to pval is a value of type int*, which cannot be assigned to an object of type int. On the other hand, the second assignment is fine. The string literal is converted to string, and that string is assigned to s2. The result of that assignment is s2, which is then assigned to s1.

5.4.2 Assignment Has Low Precedence

Inside a condition is another common place where assignment is used as a part of a larger expression. Writing an assignment in a condition can shorten programs and clarify the programmer’s intent. For example, the following loop uses a function named get_value, which we assume returns int values. We can test those values until we obtain some desired value—say, 42:

The program begins by getting the first value and storing it in i. Then it establishes the loop, which tests whether i is 42, and if not, does some processing. The last statement in the loop gets a value from get_value(), and the loop repeats. We can write this loop more succinctly as

The condition now more clearly expresses our intent: We want to continue until get_value returns 42. The condition executes by assigning the result returned by get_value to i and then comparing the result of that assignment with 42.

The additional parentheses around the assignment are necessary because assignment has lower precedence than inequality.

Without the parentheses, the operands to != would be the value returned from calling get_value and 42. The true or false result of that test would be assigned to i—clearly not what we intended!

Beware of Confusing Equality and Assignment Operators

The fact that we can use assignment in a condition can have surprising effects:

if (i = 42)

This code is legal: What happens is that 42 is assigned to i and then the result of the assignment is tested. In this case, 42 is nonzero, which is interpreted as a true value. The author of this code almost surely intended to test whether i was 42:

if (i == 42)

Bugs of this sort are notoriously difficult to find. Some, but not all, compilers are kind enough to warn about code such as this example.

Exercises Section 5.4.2

Exercise 5.11: What are the values of i and d after each assignment:

     int i;   double d;
     d = i = 3.5;
     i = d = 3.5;

Exercise 5.12: Explain what happens in each of the if tests:

if (42 = i) // . . .
if (i = 42) // . . .

5.4.3 Compound Assignment Operators

We often apply an operator to an object and then reassign the result to that same object. As an example, consider the sum program from page 14:

This kind of operation is common not just for addition but for the other arithmetic operators and the bitwise operators. There are compound assignments for each of these operators. The general syntactic form of a compound assignment operator is

a op= b;

where op= may be one of the following ten operators:

+= -= *= /= %= // arithmetic operators
<<= >>= &= ^= |= // bitwise operators

Each compound operator is essentially equivalent to

a = a op b;

There is one important difference: When we use the compound assignment, the left-hand operand is evaluated only once. If we write the similar longer version, that operand is evaluated twice: once as the right-hand operand and again as the left. In many, perhaps most, contexts this difference is immaterial aside from possible performance consequences.

Exercises Section 5.4.3

Exercise 5.13: The following assignment is illegal. Why? How would you correct it?

double dval; int ival; int *pi;
dval = ival = pi = 0;

Exercise 5.14: Although the following are legal, they probably do not behave as the programmer expects. Why? Rewrite the expressions as you think they should be.

     (a) if (ptr = retrieve_pointer() != 0)
     (b) if (ival = 1024)
     (c) ival += ival + 1;

5.5 Increment and Decrement Operators

The increment (++) and decrement (--) operators provide a convenient notational shorthand for adding or subtracting 1 from an object. There are two forms of these operators: prefix and postfix. So far, we have used only the prefix increment, which increments its operand and yields the changed value as its result. The prefix decrement operates similarly, except that it decrements its operand. The postfix versions of these operators increment (or decrement) the operand but yield a copy of the original, unchanged value as their result:

     int i = 0, j;
     j = ++i; // j = 1, i = 1: prefix yields incremented value
     j = i++; // j = 1, i = 2: postfix yields unincremented value

Because the prefix version returns the incremented value, it returns the object itself as an lvalue. The postfix versions return an rvalue.

Advice: Use Postfix Operators Only When Necessary

Readers from a C background might be surprised that we use the prefix increment in the programs we’ve written. The reason is simple: The prefix version does less work. It increments the value and returns the incremented version. The postfix operator must store the original value so that it can return the unincremented value as its result. For ints and pointers, the compiler can optimize away this extra work. For more complex iterator types, this extra work potentially could be more costly. By habitually favoring the use of the prefix versions, we do not have to worry if the performance difference matters.

Postfix Operators Return the Unincremented Value

The postfix version of ++ and -- is used most often when we want to use the current value of a variable and increment it in a single compound expression:
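
A minimal sketch of such a loop, assuming an initially empty vector<int> named ivec and an int counter cnt that starts at 10. The wrapping function countdown is a name chosen for this illustration only.

```cpp
#include <vector>

// Builds a vector holding 10, 9, ..., 1.
std::vector<int> countdown() {
    std::vector<int> ivec;          // empty vector
    int cnt = 10;
    while (cnt > 0)
        ivec.push_back(cnt--);      // use the current value of cnt, then decrement it
    return ivec;
}
```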

This program uses the postfix version of -- to decrement cnt. We want to assign the value of cnt to the next element in the vector and then decrement cnt before the next iteration. Had the loop used the prefix version, then the decremented value of cnt would be used when creating the elements in ivec and the effect would be to add elements from 9 down to 0.

Combining Dereference and Increment in a Single Expression

The following program, which prints the contents of ivec, represents a very common C++ programming pattern:
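
A sketch of that pattern, assuming ivec holds ints. The function print_all, its ostream parameter, and the wrapper as_text are assumptions made here so that the loop can be run against any output stream.

```cpp
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Writes each element of ivec on its own line.
void print_all(const std::vector<int> &ivec, std::ostream &os) {
    std::vector<int>::const_iterator iter = ivec.begin();
    while (iter != ivec.end())
        os << *iter++ << '\n';   // yields the element at the old position and advances iter
}

// Convenience wrapper: collects the same output in a string.
std::string as_text(const std::vector<int> &ivec) {
    std::ostringstream out;
    print_all(ivec, out);
    return out.str();
}
```

Calling print_all(ivec, std::cout) (with <iostream> included) prints the elements to the standard output.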

The expression *iter++ is usually very confusing to programmers new to both C++ and C.

The precedence of postfix increment is higher than that of the dereference operator, so *iter++ is equivalent to *(iter++). The subexpression iter++ increments iter and yields a copy of the previous value of iter as its result. Accordingly, the operand of * is a copy of the unincremented value of iter.

This usage relies on the fact that postfix increment returns a copy of its original, unincremented operand. If it returned the incremented value, we’d dereference the incremented value, with disastrous results: The first element of ivec would not get printed. Worse, we’d attempt to dereference one too many elements!

Advice: Brevity Can Be a Virtue

Programmers new to C++ who have not previously programmed in a C-based language often have trouble with the terseness of some expressions. In particular, expressions such as *iter++ can be bewildering—at first. Experienced C++ programmers value being concise. They are much more likely to write

cout << *iter++ << endl;

than the more verbose equivalent

cout << *iter << endl;
++iter;

For programmers new to C++, the second form is clearer because the action of incrementing the iterator and fetching the value to print are kept separate. However, the first version is much more natural to most C++ programmers.

It is worthwhile to study examples of such code until their meanings are immediately clear. Most C++ programs use succinct expressions rather than more verbose equivalents. Therefore, C++ programmers must be comfortable with such usages. Moreover, once these expressions are familiar, you will find them less error-prone.

Exercises Section 5.5

Exercise 5.15: Explain the difference between prefix and postfix increment.

Exercise 5.16: Why do you think C++ wasn’t named ++C?

Exercise 5.17: What would happen if the while loop that prints the contents of a vector used the prefix increment operator?

5.6 The Arrow Operator

The arrow operator (->) provides a synonym for expressions involving the dot and dereference operators. The dot operator (Section 1.5.2, p. 25) fetches a member from an object of class type:

item1.same_isbn(item2); // run the same_isbn member of item1

If we had a pointer (or iterator) to a Sales_item, we would have to dereference the pointer (or iterator) before applying the dot operator:

Sales_item *sp = &item1;
(*sp).same_isbn(item2); // run same_isbn on object to which sp points

Here we dereference sp to get the underlying Sales_item. Then we use the dot operator to run same_isbn on that object. We must parenthesize the dereference because dereference has a lower precedence than dot. If we omit the parentheses, this code means something quite different:

*sp.same_isbn(item2); // error: sp is a pointer and has no member named same_isbn

This expression attempts to fetch the same_isbn member of the object sp. It is equivalent to

*(sp.same_isbn(item2)); // equivalent to *sp.same_isbn(item2);

However, sp is a pointer, which has no members; this code will not compile.

Because it is easy to forget the parentheses and because this kind of code is a common usage, the language defines the arrow operator as a synonym for a dereference followed by the dot operator. Given a pointer (or iterator) to an object of class type, the following expressions are equivalent:
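
A small sketch of the two equivalent forms. The struct Book is a hypothetical stand-in for a class type (it is not the Sales_item class used in the text), and via_dot and via_arrow are names invented for this example.

```cpp
#include <string>

// Hypothetical class type used only for illustration.
struct Book {
    std::string isbn;
    bool same_isbn(const Book &rhs) const { return isbn == rhs.isbn; }
};

// Dereference p to get the object, then fetch its member with dot.
bool via_dot(const Book *p, const Book &other) {
    return (*p).same_isbn(other);
}

// Equivalent: the arrow operator combines the dereference and the dot.
bool via_arrow(const Book *p, const Book &other) {
    return p->same_isbn(other);
}
```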

More concretely, we can rewrite the call to same_isbn as

sp->same_isbn(item2); // equivalent to (*sp).same_isbn(item2)

Exercises Section 5.6

Exercise 5.18: Write a program that defines a vector of pointers to strings. Read the vector, printing each string and its corresponding size.

Exercise 5.19: Assuming that iter is a vector<string>::iterator, indicate which, if any, of the following expressions is legal. Explain the behavior of the legal expressions.

     (a) *iter++;         (b) (*iter)++;
     (c) *iter.empty()    (d) iter->empty();
     (e) ++*iter;         (f) iter++->empty();

5.7 The Conditional Operator

The conditional operator is the only ternary operator in C++. It allows us to embed simple if-else tests inside an expression. The conditional operator has the following syntactic form

cond ? expr1 : expr2;

where cond is an expression that is used as a condition (Section 1.4.1, p. 12). The operator executes by evaluating cond. If cond evaluates to 0, then the condition is false; any other value is true. cond is always evaluated. If it is true, then expr1 is evaluated; otherwise, expr2 is evaluated. Like the logical AND and OR (&& and ||) operators, the conditional operator guarantees this order of evaluation for its operands. Only one of expr1 or expr2 is evaluated. The following program illustrates use of the conditional operator:

     int i = 10, j = 20, k = 30;
     // if i > j then maxVal = i else maxVal = j
     int maxVal = i > j ? i : j;

Avoid Deep Nesting of the Conditional Operator

We could use a set of nested conditional expressions to set max to the largest of three variables:

     int max = i > j
                   ? i > k ? i : k
                   : j > k ? j : k;

We could do the equivalent comparison in the following longer but simpler way:

     int max = i;
     if (j > max)
         max = j;
     if (k > max)
         max = k;

Using a Conditional Operator in an Output Expression

The conditional operator has fairly low precedence. When we embed a conditional expression in a larger expression, we usually must parenthesize the conditional subexpression. For example, the conditional operator is often used to print one or another value, depending on the result of a condition. Incompletely parenthesized uses of the conditional operator in an output expression can have surprising results:

     cout << (i < j ? i : j);  // ok: prints smaller of i and j
     cout << (i < j) ? i : j;  // prints 1 or 0!
     cout << i < j ? i : j;    // error: compares cout to int

The second expression is the most interesting: It treats the comparison between i and j as the operand to the << operator. The value 1 or 0 is printed, depending on whether i < j is true or false. The << operator returns cout, which is tested as the condition for the conditional operator. That is, the second expression is equivalent to

     cout << (i < j); // prints 1 or 0
     cout ? i : j;    // test cout and then evaluate i or j
                      // depending on whether cout evaluates to true or false

Exercises Section 5.7

Exercise 5.20: Write a program to prompt the user for a pair of numbers and report which is smaller.

Exercise 5.21: Write a program to process the elements of a vector<int>. Replace each element with an odd value by twice that value.

5.8 The `sizeof` Operator

The sizeof operator returns a value of type size_t (Section 3.5.2, p. 104) that is the size, in bytes (Section 2.1, p. 35), of an object or type name. The result of a sizeof expression is a compile-time constant. The sizeof operator takes one of the following forms:

     sizeof (type name);
     sizeof (expr);
     sizeof expr;

Applying sizeof to an expr returns the size of the result type of that expression:
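
A sketch of the three forms applied to one class type. Record is a hypothetical struct introduced only to have something to measure, and the function names are likewise assumptions.

```cpp
#include <cstddef>

// Hypothetical class type, used only for illustration.
struct Record {
    int id;
    double price;
};

// sizeof (type name)
std::size_t size_from_type() { return sizeof(Record); }

// sizeof expr, where expr names an object
std::size_t size_from_object() {
    Record item = {0, 0.0};
    return sizeof item;          // size of item's type
}

// sizeof expr, where expr dereferences a pointer
std::size_t size_from_pointee() {
    Record *p = 0;               // p addresses no object
    return sizeof *p;            // size of the type to which p points
}
```

All three functions return the same value, because each expression has type Record.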

Evaluating sizeof expr does not evaluate the expression. In particular, in sizeof *p, the pointer p may hold an invalid address, because p is not dereferenced.

The result of applying sizeof depends in part on the type involved:

• sizeof char or an expression of type char is guaranteed to be 1

• sizeof a reference type returns the size of the memory necessary to contain an object of the referenced type

• sizeof a pointer returns the size needed to hold a pointer; to obtain the size of the object to which the pointer points, the pointer must be dereferenced

• sizeof an array is equivalent to taking the sizeof the element type times the number of elements in the array

Because sizeof returns the size of the entire array, we can determine the number of elements by dividing the sizeof the array by the sizeof an element:

// sizeof(ia)/sizeof(*ia) returns the number of elements in ia
int sz = sizeof(ia)/sizeof(*ia);

Exercises Section 5.8

Exercise 5.22: Write a program to print the size of each of the built-in types.

Exercise 5.23: Predict the output of the following program and explain your reasoning. Now run the program. Is the output what you expected? If not, figure out why.

     int x[10];   int *p = x;
     cout << sizeof(x)/sizeof(*x) << endl;
     cout << sizeof(p)/sizeof(*p) << endl;

5.9 Comma Operator

A comma expression is a series of expressions separated by commas. The expressions are evaluated from left to right. The result of a comma expression is the value of the rightmost expression. The result is an lvalue if the rightmost operand is an lvalue. One common use for the comma operator is in a for loop.
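
A sketch of such a loop, assuming ivec already holds the elements that are to be overwritten. The function name fill_descending is an assumption made for this example.

```cpp
#include <vector>

// Overwrites the elements of ivec with size, size - 1, ..., 1.
void fill_descending(std::vector<int> &ivec) {
    int cnt = static_cast<int>(ivec.size());
    for (std::vector<int>::size_type ix = 0;
         ix != ivec.size(); ++ix, --cnt)   // comma operator: both updates run on each trip
        ivec[ix] = cnt;
}
```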

This loop increments ix and decrements cnt in the expression in the for header. Both ix and cnt are changed on each trip through the loop. As long as the test of ix succeeds, we reset the next element to the current value of cnt.

Exercises Section 5.9

Exercise 5.24: The program in this section is similar to the program on page 163 that added elements to a vector. Both programs decremented a counter to generate the element values. In this program we used the prefix decrement and the earlier one used postfix. Explain why we used prefix in one and postfix in the other.

5.10 Evaluating Compound Expressions

An expression with two or more operators is a compound expression. In a compound expression, the way in which the operands are grouped to the operators may determine the result of the overall expression. If the operands group in one way, the result differs from what it would be if they grouped another way.

Precedence and associativity determine how the operands are grouped. That is, precedence and associativity determine which part of the expression is the operand for each of the operators in the expression. Programmers can override these rules by parenthesizing compound expressions to force a particular grouping.

Precedence specifies how the operands are grouped. It says nothing about the order in which the operands are evaluated. In most cases, operands may be evaluated in whatever order is convenient.

5.10.1 Precedence

The value of an expression depends on how the subexpressions are grouped. For example, in the following expression, a purely left-to-right evaluation yields 20:

6 + 3 * 4 / 2 + 2;

Other imaginable results include 9, 14, and 36. In C++, the result is 14.

Multiplication and division have higher precedence than addition. Their operands are bound to the operator in preference to the operands to addition. Multiplication and division have the same precedence as each other. Operators also have associativity, which determines how operators at the same precedence level are grouped. The arithmetic operators are left associative, which means they group left to right. We now can see that our expression is equivalent to

     int temp = 3 * 4;           // 12
     int temp2 = temp / 2;       // 6
     int temp3 = temp2 + 6;      // 12
     int result = temp3 + 2;     // 14

Parentheses Override Precedence

We can override precedence with parentheses. Parenthesized expressions are evaluated by treating each parenthesized subexpression as a unit and otherwise applying the normal precedence rules. For example, we can use parentheses on our initial expression to force the evaluation to result in any of the four possible values:
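
One possible set of groupings, sketched as four small functions. The function names are invented for this illustration; the arithmetic in each comment can be checked by hand.

```cpp
// Parentheses that match the default precedence and associativity.
int default_grouping() { return ((6 + ((3 * 4) / 2)) + 2); }   // 6 + 6 + 2 = 14

// Alternative groupings forced by parentheses.
int grouped_to_36() { return (6 + 3) * (4 / 2 + 2); }          // 9 * 4 = 36
int grouped_to_20() { return ((6 + 3) * 4) / 2 + 2; }          // 36 / 2 + 2 = 20
int grouped_to_9()  { return 6 + 3 * 4 / (2 + 2); }            // 6 + 12 / 4 = 9
```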

We have already seen examples where precedence rules affect the correctness of our programs. For example, consider the expression described in the “Advice” box on page 164:

*iter++;

Precedence says that ++ has higher precedence than *. That means that iter++ is grouped first. The operand of *, therefore, is the result of applying the increment operator to iter. If we wanted to increment the value that iter denotes, we’d have to use parentheses to force our intention:

(*iter)++; // increment value to which iter refers and yield unincremented value

The parentheses specify that the operand of * is iter. The expression now uses *iter as the operand to ++.

As another example, recall the condition in the while on page 161:

while ((i = get_value()) != 42) {

The parentheses around the assignment were necessary to implement the desired operation, which was to assign to i the value returned from get_value and then test that value to see whether it was 42. Had we failed to parenthesize the assignment, the effect would be to test the return value to see whether it was 42. The true or false value of that test would then be assigned to i, meaning that i would either be 1 or 0.

5.10.2 Associativity

Associativity specifies how to group operators at the same precedence level. We have also seen cases where associativity matters. As one example, the assignment operator is right associative. This fact allows concatenated assignments:

ival = jval = kval = lval; // right associative: ival = (jval = (kval = lval))

This expression first assigns lval to kval, then the result of that to jval, and finally the result of that to ival.

The arithmetic operators, on the other hand, are left associative. The expression

ival * jval / kval * lval // left associative: ((ival * jval) / kval) * lval

multiplies ival and jval, then divides that result by kval, and finally multiplies the result of the division by lval.
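
Both groupings can be checked directly. The sketch below is illustrative: the function names are invented here, and right_grouped exists only to show that a different grouping of the same operands can give a different answer.

```cpp
// Right associativity: every variable ends up holding lval's value.
int chained_assign(int lval) {
    int ival, jval, kval;
    ival = jval = kval = lval;            // ival = (jval = (kval = lval))
    return ival + jval + kval;
}

// Left associativity: groups as ((ival * jval) / kval) * lval.
int left_grouped(int ival, int jval, int kval, int lval) {
    return ival * jval / kval * lval;
}

// A right-to-left grouping, for contrast only.
int right_grouped(int ival, int jval, int kval, int lval) {
    return ival * (jval / (kval * lval));
}
```

With the operands 2, 3, 4, and 5, left_grouped yields 5 (2 * 3 is 6, 6 / 4 is 1 in integer arithmetic, and 1 * 5 is 5), whereas right_grouped yields 0 because 3 / 20 is 0.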

Table 5.4 presents the full set of operators ordered by precedence. The table is organized into segments separated by double lines. Operators in each segment have the same precedence, and have higher precedence than operators in subsequent segments. For example, the prefix increment and dereference operators share the same precedence and have higher precedence than the arithmetic or relational operators. We have seen most of these operators, although a few will not be defined until later chapters.

Table 5.4. Operator Precedence

Exercises Section 5.10.2

Exercise 5.25: Using Table 5.4 (p. 170), parenthesize the following expressions to indicate the order in which the operands are grouped:

(a) ! ptr == ptr->next
(b) ch = buf[ bp++ ] != ' '

Exercise 5.26: The expressions in the previous exercise evaluate in an order that is likely to be surprising. Parenthesize these expressions to evaluate in an order you imagine is intended.

Exercise 5.27: The following expression fails to compile due to operator precedence. Using Table 5.4 (p. 170), explain why it fails. How would you fix it?

     string s = "word";
     // add an 's' to the end, if the word doesn't already end in 's'
     string pl = s + s[s.size() - 1] == 's' ? "" : "s" ;

5.10.3 Order of Evaluation

In Section 5.2 (p. 152) we saw that the && and || operators specify the order in which their operands are evaluated: In both cases the right-hand operand is evaluated if and only if doing so might affect the truth value of the overall expression. Because we can rely on this property, we can write code such as

// iter only dereferenced if it isn't at end
while (iter != vec.end() && *iter != some_val)

The only other operators that guarantee the order in which operands are evaluated are the conditional (?:) and comma operators. In all other cases, the order is unspecified.

For example, in the expression

f1() * f2();

we know that both f1 and f2 must be called before the multiplication can be done. After all, their results are what is multiplied. However, we have no way to know whether f1 will be called before f2 or vice versa.

The order of operand evaluation often, perhaps even usually, doesn’t matter. It can matter greatly, though, if the operands refer to and change the same objects.

The order of operand evaluation matters if one subexpression changes the value of an operand used in another subexpression:

// oops! language does not define order of evaluation
if (ia[index++] < ia[index])

The behavior of this expression is undefined. The problem is that the left- and right-hand operands to the < both use the variable index. However, the left-hand operand involves changing the value of that variable. Assuming index is zero, the compiler might evaluate this expression in one of the following two ways:

if (ia[0] < ia[0]) // execution if rhs is evaluated first
if (ia[0] < ia[1]) // execution if lhs is evaluated first

We can guess that the programmer intended that the left operand be evaluated, thereby incrementing index. If so, the comparison would be between ia[0] and ia[1]. The language, however, does not guarantee a left-to-right evaluation order. In fact, an expression like this is undefined. An implementation might evaluate the right-hand operand first, in which case ia[0] is compared to itself. Or the implementation might do something else entirely.

Advice: Managing Compound Expressions

Beginning C and C++ programmers often have difficulties understanding order of evaluation and the rules of precedence and associativity. Misunderstanding how expressions and operands are evaluated is a rich source of bugs. Moreover, the resulting bugs are difficult to find because reading the program does not reveal the error unless the programmer already understands the rules.

Two rules of thumb can be helpful:

1. When in doubt, parenthesize expressions to force the grouping that the logic of your program requires.
2. If you change the value of an operand, don’t use that operand elsewhere in the same statement. If you need to use the changed value, then break the expression up into separate statements in which the operand is changed in one statement and then used in a subsequent statement.

An important exception to the second rule is that subexpressions that use the result of the subexpression that changes the operand are safe. For example, in *++iter the increment changes the value of iter, and the (changed) value of iter is then used as the operand to *. In this, and similar, expressions, order of evaluation of the operand isn’t an issue. To evaluate the larger expression, the subexpression that changes the operand must first be evaluated. Such usage poses no problems and is quite common.

Do not apply an increment or decrement operator to an object that is also used in a separate subexpression of the same expression.

One safe and machine-independent way to rewrite the previous comparison of two array elements is
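
Such a rewrite only reads index inside the comparison and increments it in a statement of its own. A minimal sketch, assuming ia is an array of int with at least index + 2 elements; the function less_than_next is a name chosen for this example.

```cpp
// Compares ia[index] with its successor, then advances index.
bool less_than_next(const int ia[], int &index) {
    bool result = ia[index] < ia[index + 1];   // index is only read here
    ++index;                                   // index is changed in a separate statement
    return result;
}
```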

Now neither operand can affect the value of the other.

Exercises Section 5.10.3

Exercise 5.28: With the exception of the logical AND and OR, the order of evaluation of the binary operators is left undefined to permit the compiler freedom to provide an optimal implementation. The trade-off is between an efficient implementation and a potential pitfall in the use of the language by the programmer. Do you consider that an acceptable trade-off? Why or why not?

Exercise 5.29: Given that ptr points to a class with an int member named ival, vec is a vector holding ints, and that ival, jval, and kval are also ints, explain the behavior of each of these expressions. Which, if any, are likely to be incorrect? Why? How might each be corrected?

     (a) ptr->ival != 0            (b) ival != jval < kval
     (c) ptr != 0 && *ptr++        (d) ival++ && ival
     (e) vec[ival++] <= vec[ival]

5.11 The `new` and `delete` Expressions

In Section 4.3.1 (p. 134) we saw how to use new and delete expressions to dynamically allocate and free arrays. We can also use new and delete to dynamically allocate and free single objects.

When we define a variable, we specify a type and a name. When we dynamically allocate an object, we specify a type but do not name the object. Instead, the new expression returns a pointer to the newly allocated object; we use that pointer to access the object:
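
A minimal sketch contrasting a named variable with a dynamically allocated object. The function use_dynamic_int and the value 42 are assumptions made for illustration.

```cpp
int use_dynamic_int() {
    int i;                // named, uninitialized int variable
    int *pi = new int;    // pi points to an unnamed, uninitialized int
    *pi = 42;             // give the unnamed object a value through the pointer
    i = *pi;              // read the value back through the pointer
    delete pi;            // return the object's memory to the free store
    return i;
}
```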

This new expression allocates one object of type int from the free store and returns the address of that object. We use that address to initialize the pointer pi.

Initializing Dynamically Allocated Objects

Dynamically allocated objects may be initialized, in much the same way as we initialize variables:
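
A sketch that pairs each ordinary variable with a dynamically allocated counterpart, assuming the names i, pi, s, and ps. The wrapping function initializers_match is hypothetical and exists only so the values can be compared.

```cpp
#include <string>

bool initializers_match() {
    int i(1024);                                  // value of i is 1024
    int *pi = new int(1024);                      // object to which pi points is 1024
    std::string s(10, '9');                       // value of s is "9999999999"
    std::string *ps = new std::string(10, '9');   // *ps is "9999999999"

    bool same = (*pi == i) && (*ps == s);
    delete pi;
    delete ps;
    return same;
}
```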

We must use the direct-initialization syntax (Section 2.3.3, p. 48) to initialize dynamically allocated objects. When an initializer is present, the new expression allocates the required memory and initializes that memory using the given initializer(s). In the case of pi, the newly allocated object is initialized to 1024. The object pointed to by ps is initialized to a string of 10 nines.

Default Initialization of Dynamically Allocated Objects

If we do not explicitly state an initializer, then a dynamically allocated object is initialized in the same way as is a variable that is defined inside a function (Section 2.3.4, p. 50). If the object is of class type, it is initialized using the default constructor for the type; if it is of built-in type, it is uninitialized.

As usual, it is undefined to use the value associated with an uninitialized object in any way other than to assign a good value to it.

Just as we (almost) always initialize the objects we define as variables, it is (almost) always a good idea to initialize dynamically allocated objects.

We can also value-initialize (Section 3.3.1, p. 92) a dynamically allocated object:
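
A sketch of value-initialization for a library class, a built-in type, and a class that defines no constructors. Point is a hypothetical struct introduced for this example, as is the function name.

```cpp
#include <string>

// Hypothetical class type with no constructors of its own.
struct Point {
    int x;
    int y;
};

bool value_initialized_as_expected() {
    std::string *ps = new std::string();   // empty string, via the default constructor
    int *pi = new int();                   // int value-initialized to 0
    Point *pp = new Point();               // members value-initialized to 0

    bool ok = ps->empty() && *pi == 0 && pp->x == 0 && pp->y == 0;
    delete ps;
    delete pi;
    delete pp;
    return ok;
}
```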

We indicate that we want to value-initialize the newly allocated object by following the type name by a pair of empty parentheses. The empty parentheses signal that we want initialization but are not supplying a specific initial value. In the case of class types (such as string) that define their own constructors, requesting value-initialization is of no consequence: The object is initialized by running the default constructor whether we leave it apparently uninitialized or ask for value-initialization. In the case of built-in types or types that do not define any constructors, the difference is significant:
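
A sketch of the two cases for int. The names p1 and p2 and the function built_in_difference are assumptions; note that the uninitialized object is assigned before it is read.

```cpp
int built_in_difference() {
    int *p1 = new int;     // first case: uninitialized, so its value must not be read yet
    int *p2 = new int();   // second case: value-initialized to 0

    *p1 = 7;               // assigning a value is the only safe first use of *p1
    int sum = *p1 + *p2;   // 7 + 0
    delete p1;
    delete p2;
    return sum;
}
```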

In the first case, the int is uninitialized; in the second case, the int is initialized to zero.

The () syntax for value initialization must follow a type name, not a variable. As we’ll see in Section 7.4 (p. 251),

int x(); // does not value initialize x

declares a function named x with no arguments that returns an int.

Memory Exhaustion

Although modern machines tend to have huge memory capacity, it is always possible that the free store will be exhausted. If the program uses all of the available memory, then it is possible for a new expression to fail. If the new expression cannot acquire the requested memory, it throws an exception named bad_alloc. We’ll look at how exceptions are thrown in Section 6.13 (p. 215).

Destroying Dynamically Allocated Objects

When our use of the object is complete, we must explicitly return the object’s memory to the free store. We do so by applying the delete expression to a pointer that addresses the object we want to release.

delete pi;

frees the memory associated with the int object addressed by pi.

It is illegal to apply delete to a pointer that addresses memory that was not allocated by new.

The effect of deleting a pointer that addresses memory that was not allocated by new is undefined. The following are examples of safe and unsafe delete expressions:
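
A sketch of one safe and two unsafe uses, with the unsafe lines commented out so that the code compiles and runs. The names i, pi, str, and pd, the string literal, and the function delete_examples are assumptions made for this illustration.

```cpp
#include <string>

double delete_examples() {
    int i = 0;
    int *pi = &i;                  // addresses a local variable, not the free store
    std::string str = "dwarves";   // an ordinary object, not a pointer
    double *pd = new double(33);   // the only object here that came from new
    double value = *pd + *pi;      // read what we need before freeing

    // delete str;   // error: str is not a pointer
    // delete pi;    // error: pi refers to a local, so the effect is undefined
    delete pd;       // ok: pd addresses memory allocated by new
    return value;
}
```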

It is worth noting that the compiler might refuse to compile the delete of str. The compiler knows that str is not a pointer and so can detect this error at compile-time. The second error is more insidious: In general, compilers cannot tell what kind of object a pointer addresses. Most compilers will accept this code, even though it is in error.

`delete` of a Zero-Valued Pointer

It is legal to delete a pointer whose value is zero; doing so has no effect:

int *ip = 0;
delete ip; // ok: always ok to delete a pointer that is equal to 0

The language guarantees that deleting a pointer that is equal to zero is safe.

Resetting the Value of a Pointer after a `delete`

When we write

delete p;

p becomes undefined. Although p is undefined, on many machines, p still contains the address of the object to which it pointed. However, the memory to which p points was freed, so p is no longer valid.

After deleting a pointer, the pointer becomes what is referred to as a dangling pointer. A dangling pointer is one that refers to memory that once held an object but does so no longer. A dangling pointer can be the source of program errors that are difficult to detect.

Setting the pointer to 0 after the object it refers to has been deleted makes it clear that the pointer points to no object.

Dynamic Allocation and Deallocation of `const` Objects

It is legal to dynamically create const objects:

// allocate and initialize a const object
const int *pci = new const int(1024);

Like any const, a dynamically created const must be initialized when it is created and once initialized cannot be changed. The value returned from this new expression is a pointer to const int. Like the address of any other const object, the return from a new that allocates a const object may only be assigned to a pointer to const.

A const dynamic object of a class type that defines a default constructor may be initialized implicitly:

// allocate default initialized const empty string
const string *pcs = new const string;

This new expression does not explicitly initialize the object pointed to by pcs. Instead, the object to which pcs points is implicitly initialized to the empty string. Objects of built-in type or of a class type that does not provide a default constructor must be explicitly initialized.

Caution: Managing Dynamic Memory Is Error-Prone

The following three common program errors are associated with dynamic memory allocation:

1. Failing to delete a pointer to dynamically allocated memory, thus preventing the memory from being returned to the free store. Failure to delete dynamically allocated memory is spoken of as a “memory leak.” Testing for memory leaks is difficult because they often do not appear until the application is run for a test period long enough to actually exhaust memory.
2. Reading or writing to the object after it has been deleted. This error can sometimes be detected by setting the pointer to 0 after deleting the object to which the pointer had pointed.
3. Applying a delete expression to the same memory location twice. This error can happen when two pointers address the same dynamically allocated object. If delete is applied to one of the pointers, then the object’s memory is returned to the free store. If we subsequently delete the second pointer, then the free store may be corrupted.

These kinds of errors in manipulating dynamically allocated memory are considerably easier to make than they are to track down and fix.

Deleting a `const` Object

Although the value of a const object cannot be modified, the object itself can be destroyed. As with any other dynamic object, a const dynamic object is freed by deleting a pointer that points to it:

delete pci; // ok: deletes a const object

Even though the operand of the delete expression is a pointer to const int, the delete expression is valid and causes the memory to which pci refers to be deallocated.

Exercises Section 5.11

Exercise 5.30: Which of the following, if any, are illegal or in error?

     (a) vector<string> svec(10);
     (b) vector<string> *pvec1 = new vector<string>(10);
     (c) vector<string> **pvec2 = new vector<string>[10];
     (d) vector<string> *pv1 = &svec;
     (e) vector<string> *pv2 = pvec1;

     (f) delete svec;
     (g) delete pvec1;
     (h) delete [] pvec2;
     (i) delete pv1;
     (j) delete pv2;

5.12 Type Conversions

The types of the operands determine whether an expression is legal and, if the expression is legal, determine the meaning of the expression. However, in C++ some types are related to one another. When two types are related, we can use an object or value of one type where an operand of the related type is expected. Two types are related if there is a conversion between them.

As an example, consider

int ival = 0;
ival = 3.541 + 3; // typically compiles with a warning

which assigns 6 to ival.

The operands to the addition operator are values of two different types: 3.541 is a literal of type double, and 3 is a literal of type int. Rather than attempt to add values of the two different types, C++ defines a set of conversions to transform the operands to a common type before performing the arithmetic. These conversions are carried out automatically by the compiler without programmer intervention—and sometimes without programmer knowledge. For that reason, they are referred to as implicit type conversions.

The built-in conversions among the arithmetic types are defined to preserve precision, if possible. Most often, if an expression has both integral and floating-point values, the integer is converted to floating-point. In this addition, the integer value 3 is converted to double. Floating-point addition is performed and the result, 6.541, is of type double.

The next step is to assign that double value to ival, which is an int. In the case of assignment, the type of the left-hand operand dominates, because it is not possible to change the type of the object on the left-hand side. When the left- and right-hand types of an assignment differ, the right-hand side is converted to the type of the left-hand side. Here the double is converted to int. Converting a double to an int truncates the value; the decimal portion is discarded. 6.541 becomes 6, which is the value assigned to ival. Because the conversion of a double to int may result in a loss of precision, most compilers issue a warning. For example, the compiler we used to check the examples in this book warns us:

warning: assignment to 'int' from 'double'

To understand implicit conversions, we need to know when they occur and what conversions are possible.

5.12.1 When Implicit Type Conversions Occur

The compiler applies conversions for both built-in and class type objects as necessary. Implicit type conversions take place in the following situations:

• In expressions with operands of mixed types, the types are converted to a common type:

• An expression used as a condition is converted to bool:

Conditions occur as the first operand of the conditional (?:) operator and as the operand(s) to the logical NOT (!), logical AND (&&), and logical OR (||) operators. Conditions also appear in the if, while, for, and do while statements. (We cover the do while in Chapter 6.)

• An expression used to initialize or assign to a variable is converted to the type of the variable:
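
The three situations listed above can be sketched as follows. The function names are invented for this illustration, and the initialization from 3.14 typically draws a compiler warning.

```cpp
// Mixed-type operands: ival is converted to double before the comparison.
bool at_least(int ival, double dval) { return ival >= dval; }

// Conditions: ival is converted to bool (zero is false, anything else is true).
int as_condition(int ival) {
    if (ival)
        return 1;
    return 0;
}

// Initialization: 3.14 is converted to the type of the variable, int.
int initialized_from_double() {
    int ival = 3.14;
    return ival;
}

// Assignment: the int 0 is converted to a null pointer of type int*.
bool assigned_null() {
    int *ip;
    ip = 0;
    return ip == 0;
}
```

Note that at_least(3, 3.5) is false: the comparison is made between 3.0 and 3.5, not between 3 and a truncated 3.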

In addition, as we’ll see in Chapter 7, implicit conversions also occur during function calls.

5.12.2 The Arithmetic Conversions

The language defines a set of conversions among the built-in types. Among these, the most common are the arithmetic conversions, which ensure that the two operands of a binary operator, such as an arithmetic or logical operator, are converted to a common type before the operator is evaluated. That common type is also the result type of the expression.

The rules define a hierarchy of type conversions in which operands are converted to the widest type in the expression. The conversion rules are defined so as to preserve the precision of the values involved in a multi-type expression. For example, if one operand is of type long double, then the other is converted to type long double regardless of what the second type is.

The simplest kinds of conversion are integral promotions. Each of the integral types that are smaller than int (char, signed char, unsigned char, short, and unsigned short) is promoted to int if all possible values of that type fit in an int. Otherwise, the value is promoted to unsigned int. When bool values are promoted to int, a false value promotes to zero and true to one.
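
A small sketch of promotion at work, assuming an implementation on which every char value fits in an int (the usual case). The function names are hypothetical.

```cpp
#include <cstddef>

// Size of the type of a + b when both operands are char.
std::size_t size_of_char_sum() {
    char a = 'a', b = 'b';
    return sizeof(a + b);    // both operands are promoted, so this is sizeof(int)
}

// bool operands promote to int: true becomes 1 and false becomes 0.
int sum_of_bools() {
    bool t = true, f = false;
    return t + t + f;        // 1 + 1 + 0
}
```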

Conversions between Signed and Unsigned Types

When an unsigned value is involved in an expression, the conversion rules are defined to preserve the value of the operands. Conversions involving unsigned operands depend on the relative sizes of the integral types on the machine. Hence, such conversions are inherently machine dependent.

In expressions involving shorts and ints, values of type short are converted to int. Expressions involving unsigned short are converted to int if the int type is large enough to represent all the values of an unsigned short. Otherwise, both operands are converted to unsigned int. For example, if shorts are a half word and ints a word, then any unsigned short value will fit inside an int. On such a machine, unsigned shorts are converted to int.

The same conversion happens among operands of type long and unsigned int. The unsigned int operand is converted to long if type long on the machine is large enough to represent all the values of the unsigned int. Otherwise, both operands are converted to unsigned long.

On a 32-bit machine, long and int are typically represented in a word. On such machines, expressions involving unsigned ints and longs are converted to unsigned long.
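Because the outcome depends on the machine, one way to see which rule applies is to let the compiler compute the expected type from the sizes involved. This is a sketch assuming a C++11 or later compiler; the names long_holds_uint, int_holds_ushort, and the two typedefs are hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// True when long has enough value bits to represent every unsigned int value.
constexpr bool long_holds_uint =
    std::numeric_limits<long>::digits >= std::numeric_limits<unsigned int>::digits;

// True when int has enough value bits to represent every unsigned short value.
constexpr bool int_holds_ushort =
    std::numeric_limits<int>::digits >= std::numeric_limits<unsigned short>::digits;

// The result types predicted by the rules described in the text.
typedef std::conditional<int_holds_ushort, int, unsigned int>::type ushort_int_result;
typedef std::conditional<long_holds_uint, long, unsigned long>::type uint_long_result;

static_assert(std::is_same<decltype(static_cast<unsigned short>(1) + 1),
                           ushort_int_result>::value,
              "unsigned short with int");
static_assert(std::is_same<decltype(1u + 1L), uint_long_result>::value,
              "unsigned int with long");

// Reports whether unsigned int + long is a signed type on this machine.
bool uint_plus_long_is_signed() {
    return std::is_signed<decltype(1u + 1L)>::value;
}

void check_machine_dependence() {
    assert(uint_plus_long_is_signed() == long_holds_uint);
}
```

On a machine where long is wider than int, uint_plus_long_is_signed() returns true; where the two types are the same size, it returns false.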

Conversions for expressions involving signed and unsigned int can be surprising. In these expressions the signed value is converted to unsigned. For example, if we compare a plain int and an unsigned int, the int is first converted to unsigned. If the int happens to hold a negative value, the result will be converted as described in Section 2.1.1 (p. 36), with all the attendant problems discussed there.
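A small sketch makes the surprise concrete. The function names are hypothetical, and many compilers warn about the mixed comparison in less_than, which is exactly the situation described above.

```cpp
#include <cassert>
#include <limits>

// i is converted to unsigned int before the comparison is made.
bool less_than(int i, unsigned int u) { return i < u; }

// The value an int takes on once it is converted to unsigned int.
unsigned int as_unsigned(int i) { return i; }

void check_mixed_comparison() {
    // -1 wraps around to the largest unsigned int value.
    assert(as_unsigned(-1) == std::numeric_limits<unsigned int>::max());
    // So -1 does not compare as less than 1u.
    assert(!less_than(-1, 1u));
    // Nonnegative values behave as expected.
    assert(less_than(0, 1u));
}
```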

Understanding the Arithmetic Conversions

The best way to understand the arithmetic conversions is to study lots of examples. In most of the following examples, either the operands are converted to the largest type involved in the expression or, in the case of assignment expressions, the right-hand operand is converted to the type of the left-hand operand:
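A sketch of such examples follows. The variable names are hypothetical, one for each built-in type involved, and the static_assert checks assume a C++11 or later compiler on a typical platform where the values of char fit in an int.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical variables, one for each built-in type used below.
bool flag = false;
char cval = 'a';
short sval = 1;
unsigned short usval = 1;
int ival = 0;
unsigned int uival = 1;
long lval = 1;
unsigned long ulval = 1;
float fval = 1.0f;
double dval = 3.7;

static_assert(std::is_same<decltype(3.14159L + 'a'), long double>::value,
              "'a' promoted to int, then converted to long double");
static_assert(std::is_same<decltype(dval + ival), double>::value,
              "ival converted to double");
static_assert(std::is_same<decltype(dval + fval), double>::value,
              "fval converted to double");
static_assert(std::is_same<decltype(cval + fval), float>::value,
              "cval promoted to int, then converted to float");
static_assert(std::is_same<decltype(sval + cval), int>::value,
              "sval and cval promoted to int");
static_assert(std::is_same<decltype(cval + lval), long>::value,
              "cval converted to long");
static_assert(std::is_same<decltype(ival + ulval), unsigned long>::value,
              "ival converted to unsigned long");

void assign_examples() {
    ival = dval;   // dval converted (by truncation) to int
    flag = dval;   // false if dval is 0, otherwise true
}

// The result types of the last two depend on the sizes of the types involved.
static_assert(std::is_same<decltype(usval + ival), int>::value ||
              std::is_same<decltype(usval + ival), unsigned int>::value,
              "depends on the sizes of unsigned short and int");
static_assert(std::is_same<decltype(uival + lval), long>::value ||
              std::is_same<decltype(uival + lval), unsigned long>::value,
              "depends on the sizes of unsigned int and long");
```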

In the first addition, the character constant lowercase 'a' has type char, which as we know from Section 2.1.1 (p. 34) is a numeric value. The numeric value that 'a' represents depends on the machine’s character set. On our ASCII machine, 'a' represents the number 97. When we add 'a' to a long double, the char value is promoted to int and then that int value is converted to a long double. That converted value is added to the long double literal. The other interesting cases are the last two expressions involving unsigned values.

5.12.3 Other Implicit Conversions

Pointer Conversions

In most cases when we use an array, the array is automatically converted to a pointer to the first element:
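A minimal sketch of the conversion, together with sizeof as a case in which it does not happen. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Using the array name as an initializer converts it to a pointer to ia[0].
bool array_converts_to_pointer() {
    int ia[10] = {42};
    int *ip = ia;
    return ip == &ia[0] && *ip == 42;
}

// As the operand of sizeof, the array is not converted: sizeof sees all 10 elements.
std::size_t element_count() {
    int ia[10] = {0};
    return sizeof(ia) / sizeof(ia[0]);
}

void check_array_conversion() {
    assert(array_converts_to_pointer());
    assert(element_count() == 10);
}
```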

The exceptions when an array is not converted to a pointer are: as the operand of the address-of (&) operator or of sizeof, or when using the array to initialize a reference to the array. We’ll see how to define a reference (or pointer) to an array in Section 7.2.4 (p. 240).

There are two other pointer conversions: A pointer to any data type can be converted to a void*, and a constant integral value of 0 can be converted to any pointer type.

Conversions to `bool`

Arithmetic and pointer values can be converted to bool. If the pointer or arithmetic value is zero, then the bool is false; any other value converts to true:
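As a sketch, a hypothetical function count_chars uses both kinds of condition on a pointer named cp:

```cpp
#include <cassert>
#include <cstddef>

// Counts the characters in a C-style string; a null pointer counts as empty.
std::size_t count_chars(const char *cp) {
    std::size_t n = 0;
    if (cp) {              // true if cp is not a null pointer
        while (*cp) {      // dereference cp; the resulting char is converted to bool
            ++n;
            ++cp;
        }
    }
    return n;
}

void check_count_chars() {
    assert(count_chars("abc") == 3);
    assert(count_chars("") == 0);
    assert(count_chars(0) == 0);
}
```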

Here, the if converts any nonzero value of cp to true. The while dereferences cp, which yields a char. The null character has value zero and converts to false. All other char values convert to true.

Arithmetic Type and `bool` Conversions

Arithmetic objects can be converted to bool and bool objects can be converted to int. When an arithmetic type is converted to bool, zero converts as false and any other value converts as true. When a bool is converted to an arithmetic type, true becomes one and false becomes zero:
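A short sketch of these conversions in both directions, using hypothetical helper functions to_int and to_bool:

```cpp
#include <cassert>

// bool to arithmetic: true becomes 1, false becomes 0.
int to_int(bool b) { return b; }

// arithmetic to bool: zero becomes false, any other value becomes true.
bool to_bool(double d) { return d; }

void check_bool_conversions() {
    bool b = true;
    int ival = b;        // ival is 1
    double pi = 3.14;
    bool b2 = pi;        // pi is nonzero, so b2 is true
    pi = false;          // pi is now 0.0
    assert(ival == 1);
    assert(b2);
    assert(pi == 0.0);
}
```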

Conversions and Enumeration Types

Objects of an enumeration type (Section 2.7, p. 62) or an enumerator can be automatically converted to an integral type. As a result, they can be used where an integral value is required—for example, in an arithmetic expression:
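A sketch with a hypothetical enumeration Points and a hypothetical constant array_size; the static_assert assumes a C++11 or later compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// point2d is 2, point2w is 3, point3d is 3, point3w is 4
enum Points { point2d = 2, point2w, point3d = 3, point3w };

const std::size_t array_size = 1024;

// All the enumerators fit in an int, so the enumerator is promoted to int.
static_assert(std::is_same<decltype(point2d + 1), int>::value,
              "enumerator promoted to int");

// point2w is promoted and then converted to size_t for the multiplication.
std::size_t chunk_size() { return array_size * point2w; }

void check_enum_conversion() {
    assert(point3w == 4);
    assert(chunk_size() == 3072);
}
```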

The type to which an enum object or enumerator is promoted is machine-defined and depends on the value of the largest enumerator. Regardless of that value, an enum or enumerator is always promoted at least to int. If the largest enumerator does not fit in an int, then the promotion is to the smallest type larger than int (unsigned int, long or unsigned long) that can hold the enumerator value.

Conversion to `const`

A nonconst object can be converted to a const object, which happens when we use a nonconst object to initialize a reference to const object. We can also convert the address of a nonconst object (or convert a nonconst pointer) to a pointer to the related const type:
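A minimal sketch of both conversions. The function name is hypothetical. Note that the conversion restricts only what can be done through the reference or pointer; the object itself remains nonconst.

```cpp
#include <cassert>

bool const_views_track_object() {
    int i = 1;
    const int &r = i;    // nonconst int used to initialize a reference to const
    const int *p = &i;   // address of a nonconst int converted to pointer to const
    i = 2;               // i itself can still be changed directly
    return r == 2 && *p == 2;
}

void check_const_conversion() {
    assert(const_views_track_object());
}
```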

Conversions Defined by the Library Types

Class types can define conversions that the compiler will apply automatically. Of the library types we’ve used so far, there is one important conversion that we have used. When we read from an istream as a condition

string s;
while (cin >> s)

we are implicitly using a conversion defined by the IO library. In a condition such as this one, the expression cin >> s is evaluated, meaning cin is read. Whether the read succeeds or fails, the result of the expression is cin.

The condition in the while expects a value of type bool, but it is given a value of type istream. That istream value is converted to bool. The effect of converting an istream to bool is to test the state of the stream. If the last attempt to read from cin succeeded, then the state of the stream will cause the conversion to bool to be true—the while test will succeed. If the last attempt failed—say because we hit end-of-file—then the conversion to bool will yield false and the while condition will fail.
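The same conversion applies to any istream, which makes it easy to try out. The sketch below uses a hypothetical function count_words and, in its check, a std::istringstream as a stand-in for cin.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts whitespace-separated words; the loop ends when a read fails.
std::size_t count_words(std::istream &in) {
    std::string s;
    std::size_t n = 0;
    while (in >> s)   // in >> s yields in, which is then tested as a condition
        ++n;
    return n;
}

void check_count_words() {
    std::istringstream in("one two three");
    assert(count_words(in) == 3);
    assert(!in);      // the final read failed, so the stream now tests as false
}
```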

Exercises Section 5.12.3

Exercise 5.31: Given the variable definitions on page 180, explain what conversions take place when evaluating the following expressions:

     (a) if (fval)
     (b) dval = fval + ival;
     (c) dval + ival + cval;

Remember that you may need to consider associativity of the operators in order to determine the answer in the case of expressions involving more than one operator.

5.12.4 Explicit Conversions

An explicit conversion is spoken of as a cast and is supported by the following set of named cast operators: static_cast, dynamic_cast, const_cast, and reinterpret_cast.

Although necessary at times, casts are inherently dangerous constructs.

5.12.5 When Casts Might Be Useful

One reason to perform an explicit cast is to override the usual standard conversions. Given an int named ival and a double named dval, the compound assignment ival *= dval converts ival to double in order to multiply it by dval. That double result is then truncated to int in order to assign it to ival. We can eliminate the unnecessary conversion of ival to double by explicitly casting dval to int:

ival *= static_cast<int>(dval); // converts dval to int
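The two forms do not always compute the same value: when dval has a fractional part, truncating it before the multiplication can change the result. A sketch with hypothetical function names:

```cpp
#include <cassert>

// ival is converted to double; the double product is truncated to int.
int scale_via_double(int ival, double dval) {
    ival *= dval;
    return ival;
}

// dval is truncated to int first; the multiplication is done on ints.
int scale_via_int(int ival, double dval) {
    ival *= static_cast<int>(dval);
    return ival;
}

void check_scaling() {
    assert(scale_via_double(3, 2.0) == 6);
    assert(scale_via_int(3, 2.0) == 6);
    assert(scale_via_double(3, 3.5) == 10);   // 10.5 truncated to 10
    assert(scale_via_int(3, 3.5) == 9);       // 3 * 3
}
```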

Another reason for an explicit cast is to select a specific conversion when more than one conversion is possible. We will look at this case more closely in Chapter 14.

5.12.6 Named Casts

The general form for the named cast notation is the following:

cast-name<type>(expression);

cast-name may be one of static_cast, const_cast, dynamic_cast, or reinterpret_cast. type is the target type of the conversion, and expression is the value to be cast. The type of cast determines the specific kind of conversion that is performed on the expression.

`dynamic_cast`

A dynamic_cast supports the run-time identification of objects addressed either by a pointer or reference. We cover dynamic_cast in Section 18.2 (p. 772).

`const_cast`

A const_cast, as its name implies, casts away the constness of its expression. For example, we might have a function named string_copy that we are certain reads, but does not write, its single parameter of type char*. If we have access to the code, the best alternative would be to correct it to take a const char*. If that is not possible, we could call string_copy on a const value using a const_cast:

const char *pc_str;
char *pc = string_copy(const_cast<char*>(pc_str));

Only a const_cast can be used to cast away constness. Using any of the other three forms of cast in this case would result in a compile-time error. Similarly, it is a compile-time error to use the const_cast notation to perform any type conversion other than adding or removing const.
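To make the example concrete, the sketch below supplies a hypothetical definition of string_copy that reads but never writes through its parameter. The definition and the wrapper copy_const_string are assumptions made for illustration. The cast is safe here only because string_copy does not modify the characters.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical legacy function: it only reads from src, yet takes a plain char*.
// Returns a newly allocated copy that the caller must free with delete [].
char *string_copy(char *src) {
    char *copy = new char[std::strlen(src) + 1];
    std::strcpy(copy, src);
    return copy;
}

// Calls string_copy on a const string by casting away const.
char *copy_const_string(const char *pc_str) {
    return string_copy(const_cast<char*>(pc_str));
}

void check_string_copy() {
    char *pc = copy_const_string("hello");
    assert(std::strcmp(pc, "hello") == 0);
    delete [] pc;
}
```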

`static_cast`

Any type conversion that the compiler performs implicitly can be explicitly requested by using a static_cast:

     double d = 97.0;
     // cast specified to indicate that the conversion is intentional
     char ch = static_cast<char>(d);

Such casts are useful when assigning a larger arithmetic type to a smaller type. The cast informs both the reader of the program and the compiler that we are aware of and are not concerned about the potential loss of precision. Compilers often generate a warning for assignments of a larger arithmetic type to a smaller type. When we provide the explicit cast, the warning message is turned off.

A static_cast is also useful to perform a conversion that the compiler will not generate automatically. For example, we can use a static_cast to retrieve a pointer value that was stored in a void* pointer (Section 4.2.2, p. 119):
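A minimal sketch, using a hypothetical function round_trip that stores a double* in a void* and then recovers it:

```cpp
#include <cassert>

double *round_trip(double *dp) {
    void *p = dp;                      // a pointer to any data type converts to void*
    return static_cast<double*>(p);    // an explicit cast is needed to convert back
}

void check_round_trip() {
    double d = 97.0;
    assert(round_trip(&d) == &d);      // same address as the original pointer
    assert(*round_trip(&d) == 97.0);
}
```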

When we store a pointer in a void* and then use a static_cast to cast the pointer back to its original type, we are guaranteed that the pointer value is preserved. That is, the result of the cast will be equal to the original address value.

`reinterpret_cast`

A reinterpret_cast generally performs a low-level reinterpretation of the bit pattern of its operands.

A reinterpret_cast is inherently machine-dependent. Safely using reinterpret_cast requires completely understanding the types involved as well as the details of how the compiler implements the cast.

As an example, in the following cast

int *ip;
char *pc = reinterpret_cast<char*>(ip);

the programmer must never forget that the actual object addressed by pc is an int, not a character array. Any use of pc that assumes it’s an ordinary character pointer is likely to fail at run time in interesting ways. For example, using it to initialize a string object such as

string str(pc);

is likely to result in bizarre run-time behavior.

The use of pc to initialize str is a good example of why explicit casts are dangerous. The problem is that types are changed, yet there are no warnings or errors from the compiler. When we initialized pc with the address of an int, there is no error or warning from the compiler because we explicitly said the conversion was okay. Any subsequent use of pc will assume that the value it holds is a char*. The compiler has no way of knowing that it actually holds a pointer to an int. Thus, the initialization of str with pc is absolutely correct—albeit in this case meaningless or worse! Tracking down the cause of this sort of problem can prove extremely difficult, especially if the cast of ip to pc occurs in a file separate from the one in which pc is used to initialize a string.
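By way of contrast, here is a sketch of a use that respects what the pointer really addresses. The hypothetical function byte_sum examines the bytes of an int through an unsigned char* and relies on sizeof(int), not on a terminating null character, to know where to stop. It assumes, as is true on common machines, that int has no padding bits.

```cpp
#include <cassert>
#include <cstddef>

// Sums the values of the individual bytes that make up the int addressed by ip.
unsigned int byte_sum(const int *ip) {
    const unsigned char *bytes = reinterpret_cast<const unsigned char*>(ip);
    unsigned int sum = 0;
    for (std::size_t i = 0; i != sizeof(int); ++i)
        sum += bytes[i];               // never reads past the end of the int
    return sum;
}

void check_byte_sum() {
    int zero = 0;
    int one = 1;
    assert(byte_sum(&zero) == 0);
    assert(byte_sum(&one) == 1);       // one byte holds 1, the rest hold 0
}
```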

Advice: Avoid Casts

By using a cast, the programmer turns off or dampens normal type-checking (Section 2.3, p. 44). We strongly recommend that programmers avoid casts and believe that most well-formed C++ programs can be written without relying on casts.

This advice is particularly important regarding use of reinterpret_casts. Such casts are always hazardous. Similarly, use of const_cast almost always indicates a design flaw. Properly designed systems should not need to cast away const. The other casts, static_cast and dynamic_cast, have their uses but should be needed infrequently. Every time you write a cast, you should think hard about whether you can achieve the same result in a different way. If the cast is unavoidable, errors can be mitigated by limiting the scope in which the cast value is used and by documenting all assumptions about the types involved.

5.12.7 Old-Style Casts

Prior to the introduction of named cast operators, an explicit cast was performed by enclosing a type in parentheses:

char *pc = (char*) ip;

The effect of this cast is the same as using the reinterpret_cast notation. However, the visibility of this cast is considerably less, making it even more difficult to track down the rogue cast.

Standard C++ introduced the named cast operators to make casts more visible and to give the programmer a more finely tuned tool to use when casts are necessary. For example, nonpointer static_casts and const_casts tend to be safer than reinterpret_casts. As a result, the programmer (as well as readers and tools operating on the program) can clearly identify the potential risk level of each explicit cast in code.

Although the old-style cast notation is supported by Standard C++, we recommend it be used only when writing code to be compiled either under the C language or pre-Standard C++.

The old-style cast notation takes one of the following two forms:

type (expr); // Function-style cast notation
(type) expr; // C-language-style cast notation

Depending on the types involved, an old-style cast has the same behavior as a const_cast, a static_cast, or a reinterpret_cast. When used where a static_cast or a const_cast would be legal, an old-style cast does the same conversion as the respective named cast. If neither is legal, then an old-style cast performs a reinterpret_cast. For example, we might rewrite the casts from the previous section less clearly using old-style notation:
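A sketch of such rewrites, with each old-style cast wrapped in a hypothetical function and a comment naming the cast it behaves like:

```cpp
#include <cassert>

// Function-style notation; behaves like static_cast<int>(dval).
int add_truncated(int ival, double dval) {
    ival += int(dval);
    return ival;
}

// C-language-style notation; behaves like const_cast<char*>(pc_str).
char *drop_const(const char *pc_str) { return (char*) pc_str; }

// C-language-style notation; behaves like reinterpret_cast<char*>(ip).
char *as_char_pointer(int *ip) { return (char*) ip; }

void check_old_style_casts() {
    assert(add_truncated(1, 2.9) == 3);
    const char *s = "abc";
    assert(drop_const(s) == s);
    int i = 0;
    assert(as_char_pointer(&i) == reinterpret_cast<char*>(&i));
}
```

Nothing in the notation itself distinguishes the harmless truncation in add_truncated from the reinterpretation in as_char_pointer, which is the visibility problem described above.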

The old-style cast notation remains supported for backward compatibility with programs written under pre-Standard C++ and to maintain compatibility with the C language.

Exercises Section 5.12.7

Exercise 5.32: Given the following set of definitions,

char cval; int ival; unsigned int ui;
float fval; double dval;

identify the implicit type conversions, if any, taking place:

(a) cval = 'a' + 3; (b) fval = ui - ival * 1.0;
(c) dval = ui * fval; (d) cval = ival + fval + dval;

Exercise 5.33: Given the following set of definitions,

int ival; double dval;
const string *ps; char *pc; void *pv;

rewrite each of the following using a named cast notation:

(a) pv = (void*)ps; (b) ival = int(*pc);
(c) pv = &dval; (d) pc = (char*) pv;

Chapter Summary

C++ provides a rich set of operators and defines their meaning when applied to values of the built-in types. Additionally, the language supports operator overloading, which allows us to define the meaning of the operators for class types. We’ll see in Chapter 14 how to define operators for our own types.

To understand compound expressions—expressions involving more than one operator—it is necessary to understand precedence, associativity, and order of operand evaluation. Each operator has a precedence level and associativity. Precedence determines how operators are grouped in a compound expression. Associativity determines how operators at the same precedence level are grouped.

Most operators do not specify the order in which operands are evaluated: The compiler is free to evaluate either the left- or right-hand operand first. Often, the order of operand evaluation has no impact on the result of the expression. However, if both operands refer to the same object and one of the operands changes that object, then the program has a serious bug—and a bug that may be hard to find.

Finally, it is possible to write an expression that is given one type but where a value of another type is required. In such cases, the compiler will automatically apply a conversion (either built-in or defined for a class type) to transform the given type into the type that is required. Conversions can also be requested explicitly by using a cast.

Defined Terms

arithmetic conversion

A conversion from one arithmetic type to another. In the context of the binary arithmetic operators, arithmetic conversions usually attempt to preserve precision by converting a smaller type to a larger type (e.g., small integral types, such as char and short, are converted to int).

associativity

Determines how operators of the same precedence are grouped. Operators can be either right associative (operators are grouped from right to left) or left associative (operators are grouped from left to right).

binary operators

Operators that take two operands.

cast

An explicit conversion.

compound expression

An expression involving more than one operator.

const_cast

A cast that converts a const object to the corresponding nonconst type.

conversion

Process whereby a value of one type is transformed into a value of another type. The language defines conversions among the built-in types. Conversions to and from class types are also possible.

dangling pointer

A pointer that refers to memory that once had an object but no longer does. Dangling pointers are the source of program errors that are quite difficult to detect.

delete expression

A delete expression frees memory that was allocated by new. There are two forms of delete:

delete p; // delete object
delete [] p; // delete array

In the first case, p must be a pointer to a dynamically allocated object; in the second, p must point to the first element in a dynamically allocated array. In C++ programs, delete replaces the use of the C library free function.

dynamic_cast

Used in combination with inheritance and run-time type identification. See Section 18.2 (p. 772).

expression

The lowest level of computation in a C++ program. Expressions generally apply an operator to one or more operands. Each expression yields a result. Expressions can be used as operands, so we can write compound expressions requiring the evaluation of multiple operators.

implicit conversion

A conversion that is automatically generated by the compiler. Given an expression that needs a particular type but has an operand of a differing type, the compiler will automatically convert the operand to the desired type if an appropriate conversion exists.

integral promotions

Subset of the standard conversions that take a smaller integral type to its most closely related larger type. Small integral types (e.g., short and char) are promoted to int or unsigned int.

new expression

A new expression allocates memory at run time from the free store. This chapter looked at the form that allocates a single object:

new type;
new type(inits);

Both forms allocate an object of the indicated type; the second also initializes that object using the initializers in inits. A new expression returns a pointer to the object. In C++ programs, new replaces use of the C library malloc function.

operands

Values on which an expression operates.

operator

Symbol that determines what action an expression performs. The language defines a set of operators and what those operators mean when applied to values of built-in type. The language also defines the precedence and associativity of each operator and specifies how many operands each operator takes. Operators may be overloaded and applied to values of class type.

operator overloading

The ability to redefine an operator to apply to class types. We’ll see in Chapter 14 how to define overloaded versions of operators.

order of evaluation

Order, if any, in which the operands to an operator are evaluated. In most cases in C++ the compiler is free to evaluate operands in any order.

precedence

Defines the order in which different operators in a compound expression are grouped. Operators with higher precedence are grouped more tightly than operators with lower precedence.

reinterpret_cast

Interprets the contents of the operand as a different type. Inherently machine-dependent and dangerous.

result

The value or object obtained by evaluating an expression.

static_cast

An explicit request for a type conversion that the compiler would do implicitly. Often used to override an implicit conversion that the compiler would otherwise perform.

unary operators

Operators that take a single operand.

~ operator

The bitwise NOT operator. Inverts the bits of its operand.

, operator

The comma operator. Expressions separated by a comma are evaluated left to right. Result of a comma expression is the value of the right-most expression.

?: operator

The conditional operator. If-then-else expression of the form:

cond ? expr1 : expr2;

If the condition cond is true then expr1 is evaluated. Otherwise, expr2 is evaluated.

& operator

Bitwise AND operator. Generates a new integral value in which each bit position is 1 if both operands have a 1 in that position; otherwise the bit is 0.

^ operator

The bitwise exclusive or operator. Generates a new integral value in which each bit position is 1 if either but not both operands contain a 1 in that bit position; otherwise, the bit is 0.

| operator

The bitwise OR operator. Generates a new integral value in which each bit position is 1 if either operand has a 1 in that position; otherwise the bit is 0.

++ operator

The increment operator. The increment operator has two forms, prefix and postfix. Prefix increment yields an lvalue. It adds one to the operand and returns the changed value of the operand. Postfix increment yields an rvalue. It adds one to the operand and returns the original, unchanged value of the operand.

-- operator

The decrement operator. The decrement operator has two forms, prefix and postfix. Prefix decrement yields an lvalue. It subtracts one from the operand and returns the changed value of the operand. Postfix decrement yields an rvalue. It subtracts one from the operand and returns the original, unchanged value of the operand.

<< operator

The left-shift operator. Shifts bits in the left-hand operand to the left. Shifts as many bits as indicated by the right-hand operand. The right-hand operand must be zero or positive and strictly less than the number of bits in the left-hand operand.

>> operator

The right-shift operator. Like the left-shift operator except that bits are shifted to the right. The right-hand operand must be zero or positive and strictly less than the number of bits in the left-hand operand.
