Chapter 6. Statements

Statements are analogous to sentences in a natural language. In C++ there are simple statements that execute a single task and compound statements that consist of a block of statements that execute as a unit. Like most languages, C++ provides statements for conditional execution and loops that repeatedly execute the same body of code. This chapter looks in detail at the statements supported by C++.

By default, statements are executed sequentially. Except for the simplest programs, sequential execution is inadequate. Therefore, C++ also defines a set of flow-of-control statements that allow statements to be executed conditionally or repeatedly. The if and switch statements support conditional execution. The for, while, and do while statements support repetitive execution. These latter statements are often referred to as loops or iteration statements.

6.1 Simple Statements

Most statements in C++ end with a semicolon. An expression, such as ival + 5, becomes an expression statement by following it with a semicolon. Expression statements cause the expression to be evaluated. In the case of

ival + 5; // expression statement

evaluating this expression is useless: The result is calculated but not assigned or otherwise used. More commonly, expression statements contain expressions that, when evaluated, affect the program’s state. Examples of such expressions are those that use assignment, increment, input, or output operators.

Null Statements

The simplest form of statement is the empty, or null statement. It takes the following form (a single semicolon):

; // null statement

A null statement is useful where the language requires a statement but the program’s logic does not. Such usage is most common when a loop’s work can be done within the condition. For example, we might want to read an input stream, ignoring everything we read until we encounter a particular value:

The condition reads a value from the standard input and implicitly tests cin to see whether the read was successful. Assuming the read succeeded, the second part of the condition tests whether the value we read is equal to the value in sought. If we found the value we want, then the while loop is exited; otherwise, the condition is tested again, which involves reading another value from cin.

A null statement should be commented, so that anyone reading the code can see that the statement was omitted intentionally.

Because a null statement is a statement, it is legal anywhere a statement is expected. For this reason, semicolons that might appear illegal are often nothing more than null statements:

// ok: second semicolon is superfluous null statement
ival = v1 + v2;;

This fragment is composed of two statements: the expression statement and the null statement.

Extraneous null statements are not always harmless.

An extra semicolon following the condition in a while or if can drastically alter the programmer’s intent:

This program will loop indefinitely. Contrary to the indentation, the increment is not part of the loop. The loop body is a null statement caused by the extra semicolon following the condition.
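The pitfall can be sketched as follows. The faulty loop appears only in a comment so that the sketch itself terminates; the function name count_steps and the step counter are assumptions added for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Walks an iterator to the end of svec and reports how many steps it took.
// (count_steps is a hypothetical helper.)
std::size_t count_steps(const std::vector<std::string> &svec)
{
    std::vector<std::string>::const_iterator iter = svec.begin();
    std::size_t steps = 0;

    // Faulty form, shown as a comment because it never terminates
    // for a nonempty vector:
    //     while (iter != svec.end()) ;   // the stray semicolon is the body
    //         ++iter;                    // not part of the loop

    while (iter != svec.end()) {  // corrected: braces make the body explicit
        ++iter;
        ++steps;
    }
    return steps;
}
```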

6.2 Declaration Statements

Defining or declaring an object or a class is a statement. Definition statements are usually referred to as declaration statements, although definition statement might be more accurate. We covered definitions and declarations of variables in Section 2.3 (p. 43). Class definitions were introduced in Section 2.8 (p. 63) and will be covered in more detail in Chapter 12.

6.3 Compound Statements (Blocks)

A compound statement, usually referred to as a block, is a (possibly empty) sequence of statements surrounded by a pair of curly braces. A block is a scope. Names introduced within a block are accessible only from within that block or from blocks nested inside the block. As usual, a name is visible only from its point of definition until the end of the enclosing block.

Compound statements can be used where the rules of the language require a single statement, but the logic of our program needs to execute more than one. For example, the body of a while or for loop must be a single statement. Yet, we often need to execute more than one statement in the body of a loop. We can do so by enclosing the statements in a pair of braces, thus turning the sequence of statements into a block.

As an example, recall the while loop from our solution to the bookstore problem on page 26:

In the else branch, the logic of our program requires that we print total and then reset it from trans. An else may be followed by only a single statement. By enclosing both statements in curly braces, we transform them into a single (compound) statement. This statement satisfies the rules of the language and the needs of our program.

Unlike most other statements, a block is not terminated by a semicolon.

Just as there is a null statement, we can also define an empty block: a pair of curly braces with no statements between them, written { }.

Exercises Section 6.3

Exercise 6.1: What is a null statement? Give an example of when you might use a null statement.

Exercise 6.2: What is a block? Give an example of when you might use a block.

Exercise 6.3: Use the comma operator (Section 5.9, p. 168) to rewrite the else branch in the while loop from the bookstore problem so that it no longer requires a block. Explain whether this rewrite improves or diminishes the readability of this code.

Exercise 6.4: In the while loop that solved the bookstore problem, what effect, if any, would removing the curly brace following the while and its corresponding close curly have on the program?

6.4 Statement Scope

Some statements permit variable definitions within their control structure:

Variables defined in a condition must be initialized. The value tested by the condition is the value of the initialized object.

Variables defined as part of the control structure of a statement are visible only until the end of the statement in which they are defined. The scope of such variables is limited to the statement body. Often the statement body itself is a block, which in turn may contain other blocks. A name introduced in a control structure is local to the statement and the scopes nested inside the statement:

If the program needs to access the value of a variable used in the control statement, then that variable must be defined outside the control structure:
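As a hedged illustration, the sketch below defines the control variable before the loop because its value is needed afterward. The function name first_negative and the search for a negative element are assumptions chosen only to give the loop something to do.

```cpp
#include <vector>

// Returns the position of the first negative element, or vec.size() if
// there is none. (first_negative is a hypothetical helper.)
std::vector<int>::size_type first_negative(const std::vector<int> &vec)
{
    // index is defined outside the for because it is used after the loop
    std::vector<int>::size_type index = 0;
    for (/* null init-statement */; index != vec.size(); ++index) {
        int val = vec[index];   // val is local to the loop body
        if (val < 0)
            break;
    }
    // val is out of scope here; index is still accessible
    return index;
}
```

Had index been defined in the for header instead, the return statement would not compile, because the name would no longer be in scope.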

Earlier versions of C++ treated the scope of variables defined inside a for differently: Variables defined in the for header were treated as if they were defined just before the for. Older C++ programs may have code that relies on being able to access these control variables outside the scope of the for.

One advantage of limiting the scope of variables defined within a control statement to that statement is that the names of such variables can be reused without worrying about whether their current value is correct at each use. If the name is not in scope, then it is impossible to use that name with an incorrect, leftover value.

6.5 The `if` Statement

An if statement conditionally executes another statement based on whether a specified expression is true. There are two forms of the if: one with an else branch and one without. The syntactic form of the plain if is the following:

if (condition)
statement

The condition must be enclosed in parentheses. It can be an expression, such as

if (a + b > c) {/* ... */}

or an initialized declaration, such as

As usual, statement could be a compound statement—that is, a block of statements enclosed in curly braces.

When a condition defines a variable, the variable must be initialized. The value of the initialized variable is converted to bool (Section 5.12.3, p. 181) and the resulting bool determines the value of the condition. The variable can be of any type that can be converted to bool, which means it can be an arithmetic or pointer type. As we’ll see in Chapter 14, whether a class type can be used in a condition depends on the class. Of the types we’ve used so far, the IO types can be used in a condition, but the vector and string types may not be used as a condition.
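A small sketch of a condition that defines a variable, here of pointer type. The function name value_length is hypothetical; std::strchr is the standard library function that returns a null pointer when the character is not found.

```cpp
#include <cstddef>
#include <cstring>

// Returns the length of the text that follows the first '=' in s,
// or 0 if s contains no '='. (value_length is a hypothetical helper.)
std::size_t value_length(const char *s)
{
    // eq is initialized in the condition; the pointer is converted to bool,
    // so the body runs only when strchr found a match
    if (const char *eq = std::strchr(s, '=')) {
        return std::strlen(eq + 1);   // eq is visible inside the if
    }
    // eq is not visible here
    return 0;
}
```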

To illustrate the use of the if statement, we’ll find the smallest value in a vector<int>, keeping a count of how many times that minimum value occurs. To solve this problem, we’ll need two if statements: one to determine whether we have a new minimum and the other to increment a count of the number of occurrences of the current minimum value:

if (minVal > ivec[i]) { /* process new minVal */ }
if (minVal == ivec[i]) { /* increment occurrence count */ }

Statement Block as Target of an `if`

We’ll start by considering each if in isolation. One of these if statements will determine whether there is a new minimum and, if so, reset the counter and update minVal:

The other conditionally updates the counter. This if needs only one statement, so it need not be enclosed in curlies:

It is a somewhat common error to forget the curly braces when multiple statements must be executed as a single statement.

In the following program, contrary to the indentation and intention of the programmer, the assignment to occurs is not part of the if statement:

Written this way, the assignment to occurs will be executed unconditionally. Uncovering this kind of error can be very difficult because the text of the program looks correct.

Many editors and development environments have tools to automatically indent source code to match its structure. It is a good idea to use such tools if they are available.

6.5.1 The `if` Statement `else` Branch

Our next task is to put these if statements together into an execution sequence. The order of the if statements is significant. If we use the following order

our count will always be off by 1. This code double-counts the first occurrence of the minimum.

Not only is the execution of both if statements on the same value potentially dangerous, it is also unnecessary. The same element cannot be both less than minVal and equal to it. If one condition is true, the other condition can be safely ignored. The if statement allows for this kind of either-or condition by providing an else clause.

The syntactic form of the if else statement is

if (condition)
statement1
else
statement2

If condition is true, then statement1 is executed; otherwise, statement2 is executed:

It is worth noting that statement2 can be any statement or a block of statements enclosed in curly braces. In this example, statement2 is itself an if statement.
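One way the two tests might be combined with an else is sketched below. The function name min_and_count, the returned pair, and the requirement that the vector be nonempty are assumptions made for this illustration.

```cpp
#include <utility>
#include <vector>

// Returns (minimum value, number of occurrences) for a nonempty vector.
// (min_and_count is a hypothetical helper; ivec must not be empty.)
std::pair<int, int> min_and_count(const std::vector<int> &ivec)
{
    int minVal = ivec[0];
    int occurs = 0;
    for (std::vector<int>::size_type i = 0; i != ivec.size(); ++i) {
        if (minVal == ivec[i]) {
            ++occurs;               // another occurrence of the minimum
        } else if (minVal > ivec[i]) {
            minVal = ivec[i];       // new minimum: two statements are needed,
            occurs = 1;             // so they are grouped in a block
        }
    }
    return std::make_pair(minVal, occurs);
}
```

Because the second test sits in the else branch, at most one of the two branches runs for any element, so no occurrence is counted twice.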

Dangling `else`

There is one important complexity in using if statements that we have not yet covered. Notice that neither if directly handles the case where the current element is greater than minVal. Logically, ignoring these elements is fine—there is nothing to do if the element is greater than the minimum we’ve found so far. However, it is often the case that an if needs to do something on all three cases: Unique steps may be required if one value is greater than, less than, or equal to some other value. We’ve rewritten our loop to explicitly handle all three cases:

This three-way test handles each case correctly. However, a simple rewrite that collapses the first two tests into a single, nested if runs into problems:

This version illustrates a source of potential ambiguity common to if statements in all languages. The problem, usually referred to as the dangling-else problem, occurs when a statement contains more if clauses than else clauses. The question then arises: To which if does each else clause belong?

The indentation in our code indicates the expectation that the else should match up with the outer if clause. In C++, however, the dangling-else ambiguity is resolved by matching the else with the last occurring unmatched if. In this case, the actual evaluation of the if else statement is as follows:

We can force an else to match an outer if by enclosing the inner if in a compound statement:
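A sketch of the braced form, using a hypothetical function classify that reports which branch was taken as a string:

```cpp
#include <string>

// Reports how val compares with minVal. (classify is a hypothetical helper.)
std::string classify(int minVal, int val)
{
    std::string result = "none";
    if (minVal <= val) {            // braces close off the inner if
        if (minVal == val)
            result = "equal";
    } else {                        // so this else pairs with the outer if
        result = "new minimum";
    }
    return result;
}
```

Without the braces the else would pair with the inner if: classify(3, 5) would then yield "new minimum" and classify(3, 1) would yield "none", the opposite of what the indentation suggests.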

Some coding styles recommend always using braces after any if. Doing so avoids any possible confusion and error in later modifications of the code. At a minimum, it is nearly always a good idea to use braces after an if (or while) when the statement in the body is anything other than a simple expression statement, such as an assignment or output expression.

Exercises Section 6.5.1

Exercise 6.5: Correct each of the following:

Exercise 6.6: What is a “dangling else”? How are else clauses resolved in C++?

6.6 The `switch` Statement

Deeply nested if else statements can often be correct syntactically and yet not correctly reflect the programmer’s logic. For example, mistaken else if matchings are more likely to pass unnoticed. Adding a new condition and associated logic or making other changes to the statements is also hard to get right. A switch statement provides a more convenient way to write deeply nested if/else logic.

Suppose that we have been asked to count how often each of the five vowels appears in some segment of text. Our program logic is as follows:

• Read each character until there are no more characters to read

• Compare each character to the set of vowels

• If the character matches one of the vowels, add 1 to that vowel’s count

• Display the results

The program was used to analyze this chapter. Here is the output:

     Number of vowel a: 3499
     Number of vowel e: 7132
     Number of vowel i: 3577
     Number of vowel o: 3530
     Number of vowel u: 1185

6.6.1 Using a `switch`

We can solve our problem most directly using a switch statement:

A switch statement executes by evaluating the parenthesized expression that follows the keyword switch. That expression must yield an integral result. The result of the expression is compared with the value associated with each case. The case keyword and its associated value together are known as the case label. Each case label’s value must be a constant expression (Section 2.7, p. 62). There is also a special case label, the default label, which we cover on page 203.

If the expression matches the value of a case label, execution begins with the first statement following that label. Execution continues normally from that statement through the end of the switch or until a break statement. If no match is found (and there is no default label), execution falls through to the first statement following the switch. In this program, the switch is the only statement in the body of a while. Here, falling through the switch returns control to the while condition.

We’ll look at break statements in Section 6.10 (p. 212). Briefly, a break statement interrupts the current control flow. In this case, the break transfers control out of the switch. Execution continues at the first statement following the switch. In this example, as we already know, transferring control to the statement following the switch returns control to the while.

6.6.2 Control Flow within a `switch`

It is essential to understand that execution flows across case labels.

It is a common misunderstanding to expect that only the statements associated with the matched case label are executed. However, execution continues across case boundaries until the end of the switch statement or a break is encountered.

Sometimes this behavior is indeed correct. We want to execute the code for a particular label as well as the code for following labels. More often, we want to execute only the code particular to a given label. To avoid executing code for subsequent cases, the programmer must explicitly tell the compiler to stop execution by specifying a break statement. Under most conditions, the last statement before the next case label is break. For example, here is an incorrect implementation of our vowel-counting switch statement:
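A sketch of what such an incorrect version might look like. It is deliberately wrong: every break is missing. The Tally struct and the function name count_without_breaks are hypothetical.

```cpp
// Hypothetical aggregate holding the five counters.
struct Tally {
    int aCnt, eCnt, iCnt, oCnt, uCnt;
};

// WARNING: deliberately incorrect. With no break statements, execution
// flows from the matched label through every label that follows it.
void count_without_breaks(char ch, Tally &t)
{
    switch (ch) {
        case 'a':
            ++t.aCnt;   // oops: should be followed by a break
        case 'e':
            ++t.eCnt;   // oops: should be followed by a break
        case 'i':
            ++t.iCnt;   // oops: should be followed by a break
        case 'o':
            ++t.oCnt;   // oops: should be followed by a break
        case 'u':
            ++t.uCnt;
    }
}
```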

To understand what happens, we’ll trace through this version assuming that the value of ch is 'i'. Execution begins following case 'i', thus incrementing iCnt. Execution does not stop there but continues across the case labels, incrementing oCnt and uCnt as well. If ch had been 'e', then eCnt, iCnt, oCnt, and uCnt would all be incremented.

Forgetting to provide a break is a common source of bugs in switch statements.

Although it is not strictly necessary to specify a break statement after the last label of a switch, the safest course is to provide a break after every label, even the last. If an additional case label is added later, then the break is already in place.

`break` Statements Aren’t Always Appropriate

There is one common situation where the programmer might wish to omit a break statement from a case label, allowing the program to fall through multiple case labels. That happens when two or more values are to be handled by the same sequence of actions. Only a single value can be associated with a case label. To indicate a range, therefore, we typically stack case labels following one another. For example, if we wished only to count vowels seen rather than count the individual vowels, we might write the following:

Case labels need not appear on a new line. We could emphasize that the cases represent a range of values by listing them all on a single line:
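A sketch of the single-line form, assuming the text is supplied as a std::string; the function name count_all_vowels is hypothetical. Writing each label on its own line would behave identically.

```cpp
#include <string>

// Counts how many characters of text are lowercase vowels.
// (count_all_vowels is a hypothetical helper.)
int count_all_vowels(const std::string &text)
{
    int vowelCnt = 0;
    for (std::string::size_type i = 0; i != text.size(); ++i) {
        switch (text[i]) {
            // any occurrence of a, e, i, o, or u reaches the same increment
            case 'a': case 'e': case 'i': case 'o': case 'u':
                ++vowelCnt;
                break;
        }
    }
    return vowelCnt;
}
```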

Less frequently, we deliberately omit a break because we want to execute code for one case and then continue into the next case, executing that code as well.

Deliberately omitting a break at the end of a case happens rarely enough that a comment explaining the logic should be provided.

6.6.3 The `default` Label

The default label provides the equivalent of an else clause. If no case label matches the value of the switch expression and there is a default label, then the statements following the default are executed. For example, we might add a counter to track how many nonvowels we read. We’ll increment this counter, which we’ll name otherCnt, in the default case:
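A sketch of a switch with a default label, assuming for brevity that all vowels share a single counter; the function name count_vowels_and_others and the returned pair are hypothetical.

```cpp
#include <string>
#include <utility>

// Returns (number of lowercase vowels, number of all other characters).
// (count_vowels_and_others is a hypothetical helper.)
std::pair<int, int> count_vowels_and_others(const std::string &text)
{
    int vowelCnt = 0, otherCnt = 0;
    for (std::string::size_type i = 0; i != text.size(); ++i) {
        switch (text[i]) {
            case 'a': case 'e': case 'i': case 'o': case 'u':
                ++vowelCnt;
                break;
            default:        // reached when no case label matches
                ++otherCnt;
                break;
        }
    }
    return std::make_pair(vowelCnt, otherCnt);
}
```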

In this version, if ch is not a vowel, execution will fall through to the default label, and we’ll increment otherCnt.

It can be useful always to define a default label even if there is no processing to be done in the default case. Defining the label indicates to subsequent readers that the case was considered but that there is no work to be done.

A label may not stand alone; it must precede a statement. If a switch ends with the default case in which there is no work to be done, then the default label must be followed by a null statement.

6.6.4 `switch` Expression and Case Labels

The expression evaluated by a switch can be arbitrarily complex. In particular, the expression can define and initialize a variable:

switch(int ival = get_response())

In this case, ival is initialized, and the value of ival is compared with each case label. The variable ival exists throughout the entire switch statement but not outside it.

Case labels must be constant integral expressions (Section 2.7, p. 62). For example, the following labels result in compile-time errors:

It is also an error for any two case labels to have the same value.

6.6.5 Variable Definitions inside a `switch`

Because execution can flow across case labels, we are not permitted to define a variable in a place that might allow execution to bypass its definition:

Recall that a variable can be used from its point of definition until the end of the block in which it is defined. Now, consider what would happen if we could define a variable between two case labels. That variable would continue to exist until the end of the enclosing block. It could be used by code in case labels following the one in which it was defined. If the switch begins executing in one of these subsequent case labels, then the variable might be used even though it had not been defined.

If we need to define a variable for a particular case, we can do so by defining the variable inside a block, thereby ensuring that the variable can be used only where it is guaranteed to have been defined and initialized:
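A sketch of this technique; the function describe, its integer codes, and the variable file_name are hypothetical names used only for illustration.

```cpp
#include <string>

// Maps a code to a short description. (describe is a hypothetical helper.)
std::string describe(int code)
{
    std::string result;
    switch (code) {
        case 1: {
            // file_name exists only inside this block, so no other case
            // can reach it without passing through its definition
            std::string file_name = "first";
            result = file_name;
            break;
        }
        case 2:
            result = "second";
            break;
        default:
            result = "unknown";
            break;
    }
    return result;
}
```

Removing the braces around the first case would make the jump to case 2 bypass the initialization of file_name, which the compiler rejects.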

Exercises Section 6.6.5

Exercise 6.7: There is one problem with our vowel-counting program as we’ve implemented it: It doesn’t count capital letters as vowels. Write a program that counts both lower- and uppercase letters as the appropriate vowel; that is, your program should count both 'a' and 'A' as part of aCnt, and so forth.

Exercise 6.8: Modify our vowel-count program so that it also counts the number of blank spaces, tabs, and newlines read.

Exercise 6.9: Modify our vowel-count program so that it counts the number of occurrences of the following two-character sequences: ff, fl, and fi.

Exercise 6.10: Each of the programs in the highlighted text on page 206 contains a common programming error. Identify and correct each error.

6.7 The `while` Statement

A while statement repeatedly executes a target statement as long as a condition is true. Its syntactic form is

while (condition)
statement

The statement (which is often a block) is executed as long as the condition evaluates as true. The condition may not be empty. If the first evaluation of condition yields false, statement is not executed.

The condition can be an expression or an initialized variable definition:

Any variable defined in the condition is visible only within the block associated with the while. On each trip through the loop, the initialized value is converted to bool (Section 5.12.3, p. 182). If the value evaluates as true, the while body is executed. Ordinarily, the condition itself or the loop body must do something to change the value of the expression. Otherwise, the loop might never terminate.

Variables defined in the condition are created and destroyed on each trip through the loop.

Using a `while` Loop

We have already seen a number of while loops, but for completeness, here is an example that copies the contents of one array into another:

We start by initializing source and dest to point to the first element of their respective arrays. The condition in the while tests whether we’ve reached the end of the array from which we are copying. If not, we execute the body of the loop. The body contains only a single statement, which copies the element and increments both pointers so that they point to the next element in their corresponding arrays.

As we saw in the “Advice” box on page 164, C++ programmers tend to write terse expressions. The statement in the body of the while

*dest++ = *source++;

is a classic example. This expression is equivalent to

The assignment in the while loop represents a very common usage. Because such code is widespread, it is important to study this expression until its meaning is immediately clear.
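The terse statement and its step-by-step equivalent can be compared side by side. This is a sketch only: the function names copy_terse and copy_verbose and the element-count parameter n are assumptions made for illustration.

```cpp
#include <cstddef>

// Copies n ints from source to dest using the terse idiom.
void copy_terse(const int *source, std::size_t n, int *dest)
{
    const int *end = source + n;    // one past the last element to copy
    while (source != end)
        *dest++ = *source++;        // copy, then advance both pointers
}

// Same behavior with each step written out separately.
void copy_verbose(const int *source, std::size_t n, int *dest)
{
    const int *end = source + n;
    while (source != end) {
        *dest = *source;            // copy the element
        ++dest;                     // advance the destination pointer
        ++source;                   // advance the source pointer
    }
}
```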

Exercises Section 6.7

Exercise 6.11: Explain each of the following loops. Correct any problems you detect.

Exercise 6.12: Write a small program to read a sequence of strings from standard input looking for duplicated words. The program should find places in the input where one word is followed immediately by itself. Keep track of the largest number of times a single repetition occurs and which word is repeated. Print the maximum number of duplicates, or else print a message saying that no word was repeated. For example, if the input is

how, now now now brown cow cow

the output should indicate that the word "now" occurred three times.

Exercise 6.13: Explain in detail how the statement in the while loop is executed:

*dest++ = *source++;

6.8 The `for` Loop Statement

The syntactic form of a for statement is

for (init-statement condition; expression)
statement

The init-statement must be a declaration statement, an expression statement, or a null statement. Each of these statements is terminated by a semicolon, so the syntactic form can also be thought of as

for (initializer; condition; expression)
statement

although technically speaking, the semicolon after the initializer is part of the statement that begins the for header.

In general, the init-statement is used to initialize or assign a starting value that is modified over the course of the loop. The condition serves as the loop control. As long as condition evaluates as true, statement is executed. If the first evaluation of condition yields false, statement is not executed. The expression usually is used to modify the variable(s) initialized in init-statement and tested in condition. It is evaluated after each iteration of the loop. If condition evaluates to false on the first iteration, expression is never executed. As usual, statement can be either a single or a compound statement.

Using a `for` Loop

Given the following for loop, which prints the contents of a vector,

the order of evaluation is as follows:

1. The init-statement is executed once at the start of the loop. In this example, ind is defined and initialized to zero.
2. Next, condition is evaluated. If ind is not equal to svec.size(), then the for body is executed. Otherwise, the loop terminates. If the condition is false on the first trip, then the for body is not executed.
3. If the condition is true, the for body executes. In this case, the for body prints the current element and then tests whether this element is the last one. If not, it prints a space to separate it from the next element.
4. Finally, expression is evaluated. In this example, ind is incremented by 1.

These four steps represent the first iteration of the for loop. Step 2 is now repeated, followed by steps 3 and 4, until the condition evaluates to false—that is, when ind is equal to svec.size().

It is worth remembering that the visibility of any object defined within the for header is limited to the body of the for loop. Thus, in this example, ind is inaccessible after the for completes.

6.8.1 Omitting Parts of the `for` Header

A for header can omit any (or all) of init-statement, condition, or expression.

The init-statement is omitted if an initialization is unnecessary or occurs elsewhere. For example, if we rewrote the program to print the contents of a vector using iterators instead of subscripts, we might, for readability reasons, move the initialization outside the loop:

Note that the semicolon is necessary to indicate the absence of the init-statement; more precisely, the semicolon represents a null init-statement.

If the condition is omitted, then it is equivalent to having written true as the condition:

for (int i = 0; /* no condition */ ; ++i)

It is as if the program were written as

for (int i = 0; true ; ++i)

It is essential that the body of the loop contain a break or return statement. Otherwise the loop will execute until it exhausts the system resources. Similarly, if the expression is omitted, then the loop must exit through a break or return or the loop body must arrange to change the value tested in the condition:

If the body doesn’t change the value of i, then i remains 0 and the test will always succeed.
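As a hedged sketch, the loop below omits all three parts of the header, so the body itself must both advance the iterator and leave the loop. The function name sum_until_negative and the summing task are assumptions chosen for illustration.

```cpp
#include <vector>

// Sums the elements of vec up to, but not including, the first negative
// value. (sum_until_negative is a hypothetical helper.)
int sum_until_negative(const std::vector<int> &vec)
{
    int sum = 0;
    std::vector<int>::const_iterator iter = vec.begin(); // initialized outside
    for (/* null init-statement */; /* no condition */; /* no expression */) {
        if (iter == vec.end() || *iter < 0)
            break;      // the only way out of this loop
        sum += *iter;
        ++iter;         // the body advances the iterator itself
    }
    return sum;
}
```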

6.8.2 Multiple Definitions in the `for` Header

Multiple objects may be defined in the init-statement; however, only one statement may appear, so all the objects must be of the same general type:
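A sketch of one declaration that defines an int, a pointer to int, and a reference to int in the same init-statement. The function name fill_sequence and its parameters are hypothetical.

```cpp
// Fills ia[0..n) with 0, 1, 2, ... and returns the number of elements
// written. (fill_sequence is a hypothetical helper; ia must have room
// for n elements.)
int fill_sequence(int *ia, int n)
{
    int written = 0;
    // ival is an int, pi is a pointer to int, ri is a reference to int;
    // all three share the base type int, so one declaration suffices
    for (int ival = 0, *pi = ia, &ri = written; ival != n;
         ++ival, ++pi, ++ri)
        *pi = ival;
    return written;
}
```

Because ri refers to written, the ++ri in the header increments written once per iteration.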

Exercises Section 6.8.2

Exercise 6.14: Explain each of the following loops. Correct any problems you detect.

Exercise 6.15: The while loop is particularly good at executing while some condition holds; for example, while the end-of-file is not reached, read a next value. The for loop is generally thought of as a step loop: An index steps through a range of values in a collection. Write an idiomatic use of each loop and then rewrite each using the other loop construct. If you were able to program with only one loop, which construct would you choose? Why?

Exercise 6.16: Given two vectors of ints, write a program to determine whether one vector is a prefix of the other. For vectors of unequal length, compare only as many elements as the smaller vector holds. For example, given the vectors (0,1,1,2) and (0,1,1,2,3,5,8), your program should return true.

6.9 The `do while` Statement

We might want to write a program that interactively performs some calculation for its user. As a simple example, we might want to do sums for the user: Our program prompts the user for a pair of numbers and produces their sum. Having generated one sum, we’d like the program to give the user the option to repeat the process and generate another.

The body of this program is pretty easy. We’ll need to write a prompt, then read a pair of values and print the sum of the values we read. After we print the sum, we’ll ask the user whether to continue.

The hard part is deciding on a control structure. The problem is that we want to execute the loop until the user asks to exit. In particular, we want to do a sum even on the first iteration. The do while loop does exactly what we need. It guarantees that its body is always executed at least once. The syntactic form is as follows:

do
statement
while (condition);

Unlike a while statement, a do while statement always ends with a semicolon.

The statement in a do is executed before condition is evaluated. The condition cannot be empty. If condition evaluates as false, then the loop terminates; otherwise, the loop is repeated. Using a do while, we can write our program:

The body of this loop is similar to others we’ve written and so should be easy to follow. What might be a bit surprising is that we defined rsp before the do rather than defining it inside the loop. Had we defined rsp inside the do, then rsp would go out of scope at the close curly brace before the while. Any variable referenced inside the condition must exist before the do statement itself.

Because the condition is not evaluated until after the statement or block is executed, the do while loop does not allow variable definitions in its condition:

If we could define variables in the condition, then any use of the variable would happen before the variable was defined!

Exercises Section 6.9

Exercise 6.17: Explain each of the following loops. Correct any problems you detect.

Exercise 6.18: Write a small program that requests two strings from the user and reports which string is lexicographically less than the other (that is, comes before the other alphabetically). Continue to solicit the user until the user requests to quit. Use the string type, the string less-than operator, and a do while loop.

6.10 The `break` Statement

A break statement terminates the nearest enclosing while, do while, for, or switch statement. Execution resumes at the statement immediately following the terminated statement. For example, the following loop searches a vector for the first occurrence of a particular value. Once it’s found, the loop is exited:
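A minimal sketch of such a search; the function name find_value and the choice to return a position (or vec.size() when the value is absent) are assumptions.

```cpp
#include <vector>

// Returns the position of the first occurrence of value in vec,
// or vec.size() if value does not occur. (find_value is hypothetical.)
std::vector<int>::size_type find_value(const std::vector<int> &vec, int value)
{
    std::vector<int>::const_iterator iter = vec.begin();
    while (iter != vec.end()) {
        if (value == *iter)
            break;      // found it: leave the while
        else
            ++iter;     // not found: keep looking
    }
    // the break transfers control to this if
    if (iter != vec.end())
        return static_cast<std::vector<int>::size_type>(iter - vec.begin());
    return vec.size();
}
```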

In this example, the break terminates the while loop. Execution resumes at the if statement immediately following the while.

A break can appear only within a loop or switch statement or in a statement nested inside a loop or switch. A break may appear within an if only when the if is inside a switch or a loop. A break occurring outside a loop or switch is a compile-time error. When a break occurs inside a nested switch or loop statement, the enclosing loop or switch statement is unaffected by the termination of the inner switch or loop:

The break labeled #1 terminates the for loop within the hyphen case label. It does not terminate the enclosing switch statement and in fact does not even terminate the processing for the current case. Processing continues with the first statement following the for, which might be additional code to handle the hyphen case or the break that completes that section.

The break labeled #2 terminates the switch statement after handling the hyphen case but does not terminate the enclosing while loop. Processing continues after that break by executing the condition in the while, which reads the next string from the standard input.

Exercises Section 6.10

Exercise 6.19: The first program in this section could be written more succinctly. In fact, most of its action could be contained in the condition in the while. Rewrite the loop so that the loop body increments the iterator but the work of finding the element is done inside the condition.

Exercise 6.20: Write a program to read a sequence of strings from standard input until either the same word occurs twice in succession or all the words have been read. Use a while loop to read the text one word at a time. Use the break statement to terminate the loop if a word occurs twice in succession. Print the word if it occurs twice in succession, or else print a message saying that no word was repeated.

6.11 The `continue` Statement

A continue statement causes the current iteration of the nearest enclosing loop to terminate. Execution resumes with the evaluation of the condition in the case of a while or do while statement. In a for loop, execution continues by evaluating the expression in the for header (the part that follows the condition), after which the condition is tested again.

For example, the following loop reads the standard input one word at a time. Only words that begin with an underscore will be processed. For any other value, we terminate the current iteration and get the next input:

A continue can appear only inside a for, while, or do while loop, including inside blocks nested inside such loops.

Exercises Section 6.11

Exercise 6.21: Revise the program from the last exercise in Section 6.10 (p. 213) so that it looks only for duplicated words that start with an uppercase letter.

6.12 The `goto` Statement

A goto statement provides an unconditional jump from the goto to a labeled statement in the same function.

Use of gotos has been discouraged since the late 1960s. gotos make it difficult to trace the control flow of a program, making the program hard to understand and hard to modify. Any program that uses a goto can be rewritten so that it doesn’t need the goto.

The syntactic form of a goto statement is

goto label;

where label is an identifier that identifies a labeled statement. A labeled statement is any statement that is preceded by an identifier followed by a colon:

end: return; // labeled statement, may be target of a goto

The identifier that forms the label may be used only as the target of a goto. For this reason, label identifiers may be the same as variable names or other identifiers in the program without interfering with other uses of those identifiers. The goto and the labeled statement to which it transfers control must be in the same function.

A goto may not jump forward over a variable definition:
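
A sketch of the kind of jump that is rejected. The function forward_jump is hypothetical, and the offending definition is left commented out so that the function still compiles:

```cpp
// Hypothetical helper: returns 0 if the assignment was skipped, 1 otherwise.
int forward_jump(bool skip)
{
    int result = 0;
    if (skip)
        goto end;
    // int ix = 10;     // error if enabled: the goto above would bypass
    //                  // the initialization of ix, which is in scope at end:
    result = 1;
end:
    return result;      // with ix enabled, code here could use an uninitialized ix
}
```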

If definitions are needed between a goto and its corresponding label, the definitions must be enclosed in a block:
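
A sketch of the block-enclosed form, using a hypothetical function scaled_or_zero:

```cpp
// Hypothetical helper: returns 0 if the block was skipped, 20 otherwise.
int scaled_or_zero(bool skip)
{
    int result = 0;
    if (skip)
        goto end;       // ok: ix is not in scope at the label
    {
        int ix = 10;    // definition confined to this block
        result = ix * 2;
    }
end:
    return result;      // ix is no longer visible here
}
```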

A jump backward over an already executed definition is okay. Why? Jumping over an unexecuted definition would mean that a variable could be used even though it was never defined. Jumping back to a point before a variable is defined destroys the variable and constructs it again:

Note that sz is destroyed when the goto executes and is defined and initialized anew when control jumps back to begin:.

Exercises Section 6.12

Exercise 6.22: The last example in this section that jumped back to begin could be better written using a loop. Rewrite the code to eliminate the goto.

6.13 `try` Blocks and Exception Handling

Handling errors and other anomalous behavior in programs can be one of the most difficult parts of designing any system. Long-lived, interactive systems such as communication switches and routers can devote as much as 90 percent of their code to error detection and error handling. With the proliferation of Web-based applications that run indefinitely, attention to error handling is becoming more important to more and more programmers.

Exceptions are run-time anomalies, such as running out of memory or encountering unexpected input. Exceptions exist outside the normal functioning of the program and require immediate handling by the program.

In well-designed systems, exceptions represent a subset of the program’s error handling. Exceptions are most useful when the code that detects a problem cannot handle it. In such cases, the part of the program that detects the problem needs a way to transfer control to the part of the program that can handle the problem. The error-detecting part also needs to be able to indicate what kind of problem occurred and may want to provide additional information.

Exceptions support this kind of communication between the error-detecting and error-handling parts of a program. In C++ exception handling involves:

throw expressions, which the error-detecting part uses to indicate that it encountered an error that it cannot handle. We say that a throw raises an exceptional condition.
try blocks, which the error-handling part uses to deal with an exception. A try block starts with keyword try and ends with one or more catch clauses. Exceptions thrown from code executed inside a try block are usually handled by one of the catch clauses. Because they “handle” the exception, catch clauses are known as handlers.
A set of exception classes defined by the library, which are used to pass the information about an error between a throw and an associated catch.

In the remainder of this section we’ll introduce these three components of exception handling. We’ll have more to say about exceptions in Section 17.1 (p. 688).

6.13.1 A `throw` Expression

An exception is thrown using a throw expression, which consists of the keyword throw followed by an expression. A throw expression is usually followed by a semicolon, making it into an expression statement. The type of the expression determines what kind of exception is thrown.

As a simple example, recall the program on page 24 that added two objects of type Sales_item. That program checked whether the records it read referred to the same book. If not, it printed a message and exited.

In a less simple program that used Sales_items, the part that adds the objects might be separated from the part that manages the interaction with a user. In this case, we might rewrite the test to throw an exception instead:
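
A sketch of such a test. Sales_item is not reproduced here, so a hypothetical Item struct with an isbn member stands in for it, and the function name add_items is likewise an assumption:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the book's Sales_item class.
struct Item {
    std::string isbn;
    int units_sold;
};

// Returns the combined record, or throws if the records are for different books.
Item add_items(const Item& item1, const Item& item2)
{
    // first check that the data is for the same item
    if (item1.isbn != item2.isbn)
        throw std::runtime_error("Data must refer to same ISBN");
    // still here? the ISBNs are the same
    Item sum = { item1.isbn, item1.units_sold + item2.units_sold };
    return sum;
}
```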

In this code we check whether the ISBNs differ. If so, we discontinue execution and transfer control to a handler that will know how to handle this error.

A throw takes an expression. In this case, that expression is an object of type runtime_error. The runtime_error type is one of the standard library exception types and is defined in the stdexcept header. We’ll have more to say about these types shortly. We create a runtime_error by giving it a string, which provides additional information about the kind of problem that occurred.

6.13.2 The `try` Block

The general form of a try block is
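
The form can be sketched as follows. This is a syntax outline rather than compilable code; program-statements and exception-specifier are the names the text uses, while handler-statements is a placeholder assumed here for the body of each catch clause:

```
try {
    program-statements
} catch (exception-specifier) {
    handler-statements
} catch (exception-specifier) {
    handler-statements
} // ... possibly more catch clauses
```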

A try block begins with the keyword try followed by a block, which as usual is a sequence of statements enclosed in braces. Following this block is a list of one or more catch clauses. A catch clause consists of three parts: the keyword catch, the declaration of a single type or single object within parentheses (referred to as an exception specifier), and a block. If the catch clause is selected to handle an exception, the associated block is executed. Once the catch clause finishes, execution continues with the statement immediately following the last catch clause of the try block.

The program-statements inside the try constitute the normal logic of the program. They can contain any C++ statement, including declarations. Like any block, the program-statements block introduces a local scope. Variables declared in this block are inaccessible outside the block—in particular, they are not accessible to the catch clauses.

Writing a Handler

In the preceding example we used a throw to avoid adding two Sales_items that represented different books. We imagined that the part of the program that added two Sales_items was separate from the part that communicated with the user. The part that interacts with the user might contain code something like the following to handle the exception that was thrown:

Following the try keyword is the program-statements block. That block would invoke the part of the program that processes Sales_item objects. That part might throw an exception of type runtime_error.

This try block has a single catch clause, which handles exceptions of type runtime_error. The statements in the block following the catch define the actions that will be executed if code inside the program-statements block throws a runtime_error. Our catch handles the error by printing a message and asking the user to indicate whether to continue. If the user enters an ’n’, then we break out of the while. Otherwise the loop continues by reading two new Sales_items.

The prompt to the user prints the return from err.what(). We know that err has type runtime_error, so we can infer that what is a member function (Section 1.5.2, p. 24) of the runtime_error class. Each of the library exception classes defines a member function named what. This function takes no arguments and returns a C-style character string. In the case of runtime_error, the C-style string that what returns is a copy of the string that was used to initialize the runtime_error. If the code described in the previous section threw an exception, then the output printed by this catch would be

Data must refer to same ISBN
Try Again? Enter y or n

Functions Are Exited during the Search for a Handler

In complex systems the execution path of a program may pass through multiple try blocks before encountering code that actually throws an exception. For example, a try block might call a function that contains a try, that calls another function with its own try, and so on.

The search for a handler reverses the call chain. When an exception is thrown, the function that threw the exception is searched first. If no matching catch is found, the function terminates, and the function that called the one that threw is searched for a matching catch. If no handler is found, then that function also exits and the function that called it is searched; and so on back up the execution path until a catch of an appropriate type is found.
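
The search can be sketched with three hypothetical functions. The names detect, middle, and outer, and the trace vector that records which code ran, are assumptions for illustration:

```cpp
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Throws without handling anything itself.
void detect(std::vector<std::string>& trace)
{
    trace.push_back("detect: throwing");
    throw std::runtime_error("bad input");
}

// Has a try block, but its only handler does not match runtime_error.
void middle(std::vector<std::string>& trace)
{
    try {
        detect(trace);
        trace.push_back("middle: after detect");        // skipped
    } catch (const std::bad_alloc&) {
        trace.push_back("middle: caught bad_alloc");    // not selected
    }
    trace.push_back("middle: after try");               // skipped: middle is exited
}

// The first function up the call chain with a matching handler.
std::vector<std::string> outer()
{
    std::vector<std::string> trace;
    try {
        middle(trace);
    } catch (const std::runtime_error& err) {
        trace.push_back(std::string("outer: caught ") + err.what());
    }
    return trace;
}
```

After a call to outer, the trace holds only the entry from detect and the entry from the handler in outer; none of the statements in middle that follow the call to detect are executed.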

If no catch clause capable of handling the exception exists, program execution is transferred to a library function named terminate, which is defined in the exception header. The behavior of that function is system dependent, but it usually aborts the program.

Exceptions that occur in programs that define no try blocks are handled in the same manner: After all, if there are no try blocks, there can be no handlers for any exception that might be thrown. If an exception occurs, then terminate is called and the program (ordinarily) is aborted.

Exercises Section 6.13.2

Exercise 6.23: The bitset operation to_ulong throws an overflow_error exception if the value held in the bitset cannot be represented in an unsigned long, which can happen only when the bitset has more bits than an unsigned long. Write a program that generates this exception.

Exercise 6.24: Revise your program to catch this exception and print a message.

6.13.3 Standard Exceptions

The C++ library defines a set of classes that it uses to report problems encountered in the functions in the standard library. These standard exception classes are also intended to be used in the programs we write. Library exception classes are defined in four headers:

The exception header defines the most general kind of exception class named exception. It communicates only that an exception occurs but provides no additional information.
The stdexcept header defines several general purpose exception classes. These types are listed in Table 6.1 on the following page.

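The new header defines the bad_alloc exception type, which is the exception thrown by new (Section 5.11, p. 174) if it cannot allocate memory.
The typeinfo header defines the bad_cast exception type, which we will discuss in Section 18.2 (p. 772).

Table 6.1. Standard Exception Classes Defined in <stdexcept>

runtime_error: Problem that can be detected only at run time.
range_error: Run-time error: a result was generated outside the range of meaningful values.
overflow_error: Run-time error: a computation overflowed.
underflow_error: Run-time error: a computation underflowed.
logic_error: Problem in the logic of the program that could, in principle, be detected before run time.
domain_error: Logic error: an argument for which no result exists.
invalid_argument: Logic error: an inappropriate argument.
length_error: Logic error: an attempt to create an object larger than the maximum size for its type.
out_of_range: Logic error: a value outside the valid range was used.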

Standard Library Exception Classes

The library exception classes have only a few operations. We can create, copy, and assign objects of any of the exception types. The exception, bad_alloc, and bad_cast types define only a default constructor (Section 2.3.4, p. 50); it is not possible to provide an initializer for objects of these types. The other exception types define only a single constructor that takes a string initializer. When we create objects of any of these other exception types, we must supply a string argument. That string initializer is used to provide additional information about the error that occurred.

The exception types define only a single operation named what. That function takes no arguments and returns a const char*. The pointer it returns points to a C-style character string (Section 4.3, p. 130). The purpose of this C-style character string is to provide some sort of textual description of the exception thrown.

The contents of the C-style character array to which what returns a pointer depends on the type of the exception object. For the types that take a string initializer, the what function returns that string as a C-style character array. For the other types, the value returned varies by compiler.
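
A small sketch of what for a type that takes a string initializer. The function describe_failure and the message text are assumptions for illustration:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: throws an out_of_range and returns the text
// that its what member reports.
std::string describe_failure()
{
    std::string message;
    try {
        throw std::out_of_range("index 7 is out of range");
    } catch (const std::out_of_range& err) {
        message = err.what();   // the string used to initialize the exception
    }
    return message;
}
```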

6.14 Using the Preprocessor for Debugging

In Section 2.9.2 (p. 71) we learned how to use preprocessor variables to prevent header files being included more than once. C++ programmers sometimes use a technique similar to header guards to conditionally execute debugging code. The idea is that the program will contain debugging code that is executed only while the program is being developed. When the application is completed and ready to ship, the debugging code is turned off. We can write conditional debugging code using the NDEBUG preprocessor variable:

     #include <iostream>

     int main()
     {
     #ifndef NDEBUG
         std::cerr << "starting main" << std::endl;
     #endif
         // ...
     }

If NDEBUG is not defined, then the program writes the message to cerr. If NDEBUG is defined, then the program executes without ever passing through the code between the #ifndef and the #endif.

By default, NDEBUG is not defined, meaning that by default, the code inside the #ifndef and #endif is processed. When the program is being developed, we leave NDEBUG undefined so that the debugging statements are executed. When the program is built for delivery to customers, these debugging statements can be (effectively) removed by defining the NDEBUG preprocessor variable. Most compilers provide a command line option that defines NDEBUG:

$ CC -D NDEBUG main.C

has the same effect as writing #define NDEBUG at the beginning of main.C.

The preprocessor defines four other constants that can be useful in debugging:

__FILE__ name of the file.

__LINE__ current line number.

__TIME__ time the file was compiled.

__DATE__ date the file was compiled.

We might use these constants to report additional information in error messages:

If we give this program a string that is shorter than the threshold, then the following error message will be generated:

The `assert` Preprocessor Macro

Another common debugging approach uses the NDEBUG preprocessor variable and the assert preprocessor macro. The assert macro is defined in the cassert header, which we must include in any file that uses assert.

Like preprocessor directives (Section 2.9.2, p. 69), macros are handled by the preprocessor. Typically a macro contains a body of C++ code. The preprocessor inserts that code wherever we use the name of the macro.

As with preprocessor variables (Section 2.9.2, p. 71), macro names must be unique within the program. Programs that include the cassert header may not define a variable, function, or other entity named assert. In addition, because the preprocessor, not the compiler, handles preprocessor names, those names are not defined within normal C++ scopes. As a result, we use preprocessor names directly and do not provide a using declaration for them. That is, we refer to assert, not std::assert, and provide no using declaration for assert.

A preprocessor macro acts something like a function call. The assert macro takes a single expression, which it uses as a condition:

assert(expr)

As long as NDEBUG is not defined, the assert macro evaluates the condition. If the result is false (zero), assert writes a message and terminates the program. If the expression has a nonzero (i.e., true) value, assert does nothing.

Unlike exceptions, which deal with errors that a program expects might happen in production, programmers use assert to test conditions that “cannot happen.” For example, a program that does some manipulation of input text might know that all words it is given are always longer than a threshold. That program might contain a statement such as:

assert(word.size() > threshold);

During testing the assert has the effect of verifying that the data are always of the expected size. Once development and test are complete, the program is built and NDEBUG is defined. In production code, assert does nothing, so there is no run-time cost. Of course, there is also no run-time check. assert should be used only to verify things that truly should not be possible. It can be useful as an aid in getting a program debugged but should not be used to substitute for run-time logic checks or error checking that the program should be doing.
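
A sketch of an assert guarding a condition that the caller is supposed to guarantee. The function excess_length is hypothetical:

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: returns how many characters word has beyond threshold.
// The caller is assumed to guarantee word.size() > threshold.
std::string::size_type excess_length(const std::string& word,
                                     std::string::size_type threshold)
{
    assert(word.size() > threshold);   // checked only when NDEBUG is not defined
    return word.size() - threshold;
}
```

If NDEBUG is defined when this is compiled, the assert expands to nothing and a call that violates the assumption goes undetected.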

Exercises Section 6.14

Exercise 6.25: Revise the program you wrote for the exercise in Section 6.11 (p. 214) to conditionally print information about its execution. For example, you might print each word as it is read to let you determine whether the loop correctly finds the first duplicated word that begins with an uppercase letter. Compile and run the program with debugging turned on and again with it turned off.

Exercise 6.26: What happens in the following loop:

Explain whether this usage seems like a good application of the assert macro.

Exercise 6.27: Explain this loop:

Chapter Summary

C++ provides a fairly limited number of statements. Most of these affect the flow of control within a program:

while, for, and do while statements, which implement iterative loops

if and switch, which provide conditional execution

continue, which stops the current iteration of a loop

break, which exits a loop or switch statement

goto, which transfers control to a labeled statement

try, catch, which define a try block enclosing a sequence of statements that might throw an exception. The catch clause(s) are intended to handle the exception(s) that the enclosed code might throw.

throw expressions, which exit a block of code, transferring control to an associated catch clause

There is also a return statement, which will be covered in Chapter 7.

In addition, there are expression statements and declaration statements. An expression statement causes the subject expression to be evaluated. Declarations and definitions of variables were described in Chapter 2.

Defined Terms

assert

Preprocessor macro that takes a single expression, which it uses as a condition. If the preprocessor variable NDEBUG is not defined, then assert evaluates the condition. If the condition is false, assert writes a message and terminates the program.

block

A sequence of statements enclosed in curly braces. A block is a statement, so it can appear anywhere a statement is expected.

break statement

Terminates the nearest enclosing loop or switch statement. Execution transfers to the first statement following the terminated loop or switch.

case label

Integral constant value that follows the keyword case in a switch statement. No two case labels in the same switch statement may have the same value. If the value in the switch condition is equal to that in one of the case labels, control transfers to the first statement following the matched label. Execution continues from that point until a break is encountered or it flows off the end of the switch statement.

catch clause

The catch keyword, an exception specifier in parentheses, and a block of statements. The code inside a catch clause does whatever is necessary to handle an exception of the type defined in its exception specifier.

compound statement

Synonym for block.

continue statement

Terminates the current iteration of the nearest enclosing loop. Execution transfers to the loop condition in a while or do while or to the expression in the for header.

dangling else

Colloquial term used to refer to the problem of how to process nested if statements in which there are more ifs than elses. In C++, an else is always paired with the closest preceding unmatched if. Note that curly braces can be used to effectively hide an inner if so that the programmer can control with which if a given else should be matched.

declaration statement

A statement that defines or declares a variable. Declarations were covered in Chapter 2.

default label

The switch case label that matches any otherwise unmatched value computed in the switch condition.

exception classes

Set of classes defined by the standard library to be used to represent errors. Table 6.1 (p. 220) lists the general purpose exceptions.

exception handler

Code that deals with an exception raised in another part of the program. Synonym for catch clause.

exception specifier

The declaration of an object or a type that indicates the kind of exceptions a catch clause can handle.

expression statement

An expression followed by a semicolon. An expression statement causes the expression to be evaluated.

flow of control

Execution path through a program.

goto statement

Statement that causes an unconditional transfer of control to a specified labeled statement elsewhere in the program. gotos obfuscate the flow of control within a program and should be avoided.

if else statement

Conditional execution of code following the if or the else, depending on the truth value of the condition.

if statement

Conditional execution based on the value of the specified condition. If the condition is true, then the if body is executed. If not, control flows to the statement following the if.

labeled statement

A statement preceded by a label. A label is an identifier followed by a colon.

null statement

An empty statement. Indicated by a single semicolon.

preprocessor macro

Function-like facility defined by the preprocessor. assert is a macro. Modern C++ programs make very little use of preprocessor macros.

raise

Often used as a synonym for throw. C++ programmers speak of “throwing” or “raising” an exception interchangeably.

switch statement

A conditional execution statement that starts by evaluating the expression that follows the switch keyword. Control passes to the labeled statement with a case label that matches the value of the expression. If there is no matching label, execution either branches to the default label, if there is one, or falls out of the switch if there is no default label.

terminate

Library function that is called if an exception is not caught. Usually aborts the program.

throw expression

Expression that interrupts the current execution path. Each throw throws an object and transfers control to the nearest enclosing catch clause that can handle the type of exception that is thrown.

try block

A block enclosed by the keyword try and one or more catch clauses. If the code inside the try block raises an exception and one of the catch clauses matches the type of the exception, then the exception is handled by that catch. Otherwise, the exception is handled by an enclosing try block or the program terminates.

while loop

Control statement that executes its target statement as long as a specified condition is true. The statement is executed zero or more times, depending on the truth value of the condition.
