8. Pointers and Pointer-Based Strings

Objectives

In this chapter you’ll learn:

Image  What pointers are.

Image  The similarities and differences between pointers and references, and when to use each.

Image  To use pointers to pass arguments to functions by reference.

Image  To use pointer-based C-style strings.

Image  The close relationships among pointers, arrays and C-style strings.

Image  To use pointers to functions.

Image  To declare and use arrays of C-style strings.

Addresses are given to us to conceal our whereabouts.

Saki (H. H. Munro)

By indirection find direction out.

William Shakespeare

Many things, having full reference To one consent, may work contrariously.

William Shakespeare

You will find it a very good practice always to verify your references, sir!

Dr. Routh

Outline

8.1 Introduction

This chapter discusses one of the most powerful features of the C++ programming language, the pointer. In Chapter 6, we saw that references can be used to perform pass-by-reference. Pointers also enable pass-by-reference and can be used to create and manipulate dynamic data structures (i.e., data structures that can grow and shrink), such as linked lists, queues, stacks and trees. This chapter explains basic pointer concepts and reinforces the intimate relationship among arrays and pointers. The view of arrays as pointers derives from the C programming language. As we saw in Chapter 7, C++ Standard Library class vector provides an implementation of arrays as full-fledged objects.

Similarly, C++ actually offers two types of strings—string class objects (which we have been using since Chapter 3) and C-style, char * pointer-based strings. This chapter on pointers discusses char * strings to deepen your knowledge of pointers. In fact, the null-terminated strings that we introduced in Section 7.4 and used in Fig. 7.12 are char * pointer-based strings. C-style, char * pointer-based strings are widely used in legacy C and C++ systems. So, if you work with legacy C or C++ systems, you may be required to manipulate these char * pointer-based strings.

We’ll examine the use of pointers with classes in Chapter 13, Object-Oriented Programming: Polymorphism, where we’ll see that the so-called “polymorphic processing” of object-oriented programming is performed with pointers and references.

8.2 Pointer Variable Declarations and Initialization

Pointer variables contain memory addresses as their values. Normally, a variable directly contains a specific value. However, a pointer contains the memory address of a variable that, in turn, contains a specific value. In this sense, a variable name directly references a value, and a pointer indirectly references a value (Fig. 8.1). Referencing a value through a pointer is often called indirection. Note that diagrams typically represent a pointer as an arrow from the variable that contains an address to the variable located at that address in memory.

Fig. 8.1 Directly and indirectly referencing a variable.

Directly and indirectly referencing a variable.

Pointers, like any other variables, must be declared before they can be used. For example, for the pointer in Fig. 8.1, the declaration

int *countPtr, count;

declares the variable countPtr to be of type int * (i.e., a pointer to an int value) and is read, “countPtr is a pointer to int” or “countPtr points to an object of type int.” Also, variable count in the preceding declaration is declared to be an int, not a pointer to an int. The * in the declaration applies only to countPtr. Each variable being declared as a pointer must be preceded by an asterisk (*). For example, the declaration

double *xPtr, *yPtr;

indicates that both xPtr and yPtr are pointers to double values. When * appears in a declaration, it is not an operator; rather, it indicates that the variable being declared is a pointer. Pointers can be declared to point to objects of any data type.

Common Programming Error 8.1

Common Programming Error 8.1

Assuming that the * used to declare a pointer distributes to all variable names in a declaration’s comma-separated list of variables can lead to errors. Each pointer must be declared with the * prefixed to the name (either with or without a space in between—the compiler ignores the space). Declaring only one variable per declaration helps avoid these types of errors and improves program readability.

Good Programming Practice 8.1

Good Programming Practice 8.1

Although it is not a requirement, including the letters Ptr in pointer variable names makes it clear that these variables are pointers and that they must be handled accordingly.

Pointers should be initialized either when they are declared or in an assignment. A pointer may be initialized to 0, NULL or an address of the corresponding type. A pointer with the value 0 or NULL points to nothing and is known as a null pointer. Symbolic constant NULL is defined in header file <iostream> (and in several other standard library header files) to represent the value 0. Initializing a pointer to NULL is equivalent to initializing a pointer to 0, but in C++, 0 is used by convention. When 0 is assigned, it is converted to a pointer of the appropriate type. The value 0 is the only integer value that can be assigned directly to a pointer variable without first casting the integer to a pointer type. Assigning a variable’s numeric address to a pointer is discussed in Section 8.3.

Error-Prevention Tip 8.1

Error-Prevention Tip 8.1

Initialize pointers to prevent pointing to unknown or uninitialized areas of memory.

8.3 Pointer Operators

The address operator (&) is a unary operator that obtains the memory address of its operand. For example, assuming the declarations

int y = 5; // declare variable y
int *yPtr; // declare pointer variable yPtr

the statement

yPtr = &y; // assign address of y to yPtr

assigns the address of the variable y to pointer variable yPtr. Then variable yPtr is said to “point to” y. Now, yPtr indirectly references variable y’s value. Note that the use of the & in the preceding statement is not the same as the use of the & in a reference variable declaration, which is always preceded by a data-type name. When declaring a reference, the & is part of the type. In an expression like &y, the & is an operator.

Figure 8.2 shows a schematic representation of memory after the preceding assignment. The “pointing relationship” is indicated by drawing an arrow from the box that represents the pointer yPtr in memory to the box that represents the variable y in memory.

Fig. 8.2 Graphical representation of a pointer pointing to a variable in memory.

Graphical representation of a pointer pointing to a variable in memory.

Figure 8.3 shows another representation of the pointer in memory, assuming that integer variable y is stored at memory location 600000 and that pointer variable yPtr is stored at memory location 500000. The operand of the address operator must be an lvalue (i.e., something to which a value can be assigned, such as a variable name or a reference); the address operator cannot be applied to constants or to expressions that do not result in references.

Fig. 8.3 Representation of y and yPtr in memory.

Representation of y and yPtr in memory.

The * operator, commonly referred to as the indirection operator or dereferencing operator, returns a synonym (i.e., an alias or a nickname) for the object to which its pointer operand points. For example (referring again to Fig. 8.2), the statement

cout << *yPtr << endl;

prints the value of variable y, namely, 5, just as the statement

cout << y << endl;

would. Using * in this manner is called dereferencing a pointer. Note that a dereferenced pointer may also be used on the left side of an assignment statement, as in

*yPtr = 9;

which would assign 9 to y in Fig. 8.3. The dereferenced pointer may also be used to receive an input value as in

cin >> *yPtr;

which places the input value in y. The dereferenced pointer is an lvalue.

Common Programming Error 8.2

Common Programming Error 8.2

Dereferencing a pointer that has not been properly initialized or that has not been assigned to point to a specific location in memory could cause a fatal execution-time error, or it could accidentally modify important data and allow the program to run to completion, possibly with incorrect results.

Common Programming Error 8.3

Common Programming Error 8.3

An attempt to dereference a variable that is not a pointer is a compilation error.

Common Programming Error 8.4

Common Programming Error 8.4

Dereferencing a null pointer is often a fatal execution-time error.

The program in Fig. 8.4 demonstrates the & and * pointer operators. Memory locations are output by << in this example as hexadecimal (i.e., base-16) integers. Note that the hexadecimal memory addresses output by this program are compiler and operating-system dependent, so you may get different results when you run the program.

Fig. 8.4 Pointer operators & and *.

 1   // Fig. 8.4: fig08_04.cpp
 2   // Pointer operators & and *.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   int main( )
 8   {
 9      int a; // a is an integer
10      int *aPtr; // aPtr is an int * -- pointer to an integer
11
12      a = 7; // assigned 7 to a
13      aPtr = &a; // assign the address of a to aPtr
14
15      cout << "The address of a is " << &a
16         << " The value of aPtr is " << aPtr;
17      cout << " The value of a is " << a
18         << " The value of *aPtr is " << *aPtr;
19      cout << " Showing that * and & are inverses of "
20         << "each other. &*aPtr = " << &*aPtr
21         << " *&aPtr = " << *&aPtr << endl;
22      return 0; // indicates successful termination
23   } // end main

The address of a is 0012F580
The value of aPtr is 0012F580

The value of a is 7
The value of *aPtr is 7

Showing that * and & are inverses of each other.
&*aPtr = 0012F580
*&aPtr = 0012F580

Portability Tip 8.1

Portability Tip 8.1

The format in which a pointer is output is compiler dependent. Some output pointer values as hexadecimal integers, some use decimal integers and some use other formats.

Notice that the address of a (line 15) and the value of aPtr (line 16) are identical in the output, confirming that the address of a is indeed assigned to the pointer variable aPtr. The & and * operators are inverses of one another—when they are both applied consecutively to aPtr in either order, they “cancel one another out” and the same result (the value in aPtr) is printed.

Figure 8.5 lists the precedence and associativity of the operators introduced to this point. Note that the address operator (&) and the dereferencing operator (*) are unary operators on the third level of precedence in the chart.

Fig. 8.5 Operator precedence and associativity.

Image
Image

8.4 Passing Arguments to Functions by Reference with Pointers

There are three ways in C++ to pass arguments to a function—pass-by-value, pass-by-reference with reference arguments and pass-by-reference with pointer arguments. Chapter 6 compared and contrasted pass-by-value and pass-by-reference with reference arguments. In this section, we explain pass-by-reference with pointer arguments.

As we saw in Chapter 6, return can be used to return one value from a called function to a caller (or to return control from a called function without passing back a value). We also saw that arguments can be passed to a function using reference arguments. Such arguments enable the called function to modify the original values of the arguments in the caller. Reference arguments also enable programs to pass large data objects to a function and avoid the overhead of passing the objects by value (which, of course, requires making a copy of the object). Pointers, like references, also can be used to modify one or more variables in the caller or to pass pointers to large data objects to avoid the overhead of passing the objects by value.

In C++, programmers can use pointers and the indirection operator (*) to accomplish pass-by-reference (exactly as pass-by-reference is done in C programs—C does not have references). When calling a function with an argument that should be modified, the address of the argument is passed. This is normally accomplished by applying the address operator (&) to the name of the variable whose value will be modified.

As we saw in Chapter 7, arrays are not passed using operator &, because the name of the array is the starting location in memory of the array (i.e., an array name is already a pointer). The name of an array, arrayName, is equivalent to &arrayName[ 0 ]. When the address of a variable is passed to a function, the indirection operator (*) can be used in the function to form a synonym for the name of the variable—this in turn can be used to modify the value of the variable at that location in the caller’s memory.

Figure 8.6 and Fig. 8.7 present two versions of a function that cubes an integer—cubeByValue and cubeByReference. Figure 8.6 passes variable number by value to function cubeByValue (line 15). Function cubeByValue (lines 21–24) cubes its argument and passes the new value back to main using a return statement (line 23). The new value is assigned to number (line 15) in main. Note that the calling function has the opportunity to examine the result of the function call before modifying variable number’s value. For example, in this program, we could have stored the result of cubeByValue in another variable, examined its value and assigned the result to number only after determining that the returned value was reasonable.

Fig. 8.6 Pass-by-value used to cube a variable’s value.

 1   // Fig. 8.6: fig08_06.cpp
 2   // Pass-by-value used to cube a variable's value.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   int cubeByValue( int ); // prototype
 8
 9   int main( )
10   {
11      int number = 5;
12
13      cout << "The original value of number is " << number;
14
15      number = cubeByValue( number ); // pass number by value to cubeByValue
16      cout << " The new value of number is " << number << endl;
17      return 0; // indicates successful termination
18   } // end main
19
20   // calculate and return cube of integer argument               
21   int cubeByValue( int n )                                       
22   {                                                              
23      return n * n * n; // cube local variable n and return result
24   } // end function cubeByValue                                  

The original value of number is 5
The new value of number is 125

Fig. 8.7 Pass-by-reference with a pointer argument used to cube a variable’s value.

 1   // Fig. 8.7: fig08_07.cpp
 2   // Pass-by-reference with a pointer argument used to cube a
 3   // variable's value.
 4   #include <iostream>
 5   using std::cout;
 6   using std::endl;
 7
 8   void cubeByReference( int * ); // prototype
 9
10   int main( )
11   {
12      int number = 5;
13
14      cout << "The original value of number is " << number;
15
16      cubeByReference( &number ); // pass number address to cubeByReference
17
18      cout << " The new value of number is " << number << endl;
19      return 0; // indicates successful termination
20   } // end main
21
22   // calculate cube of *nPtr; modifies variable number in main
23   void cubeByReference( int *nPtr )                           
24   {                                                           
25      *nPtr = *nPtr * *nPtr * *nPtr; // cube *nPtr             
26   } // end function cubeByReference                           

The original value of number is 5
The new value of number is 125

Figure 8.7 passes the variable number to function cubeByReference using pass-by-reference with a pointer argument (line 16)—the address of number is passed to the function. Function cubeByReference (lines 23–26) specifies parameter nPtr (a pointer to int) to receive its argument. The function dereferences the pointer and cubes the value to which nPtr points (line 25). This directly changes the value of number in main.

Common Programming Error 8.5

Common Programming Error 8.5

Not dereferencing a pointer when it is necessary to do so to obtain the value to which the pointer points is an error.

A function receiving an address as an argument must define a pointer parameter to receive the address. For example, the header for function cubeByReference (line 23) specifies that cubeByReference receives the address of an int variable (i.e., a pointer to an int) as an argument, stores the address locally in nPtr and does not return a value.

The function prototype for cubeByReference (line 8) contains int * in parentheses. As with other variable types, it is not necessary to include names of pointer parameters in function prototypes. Parameter names included for documentation purposes are ignored by the compiler.

Figures 8.88.9 analyze graphically the execution of the programs in Fig. 8.6 and Fig. 8.7, respectively.

Fig. 8.8 Pass-by-value analysis of the program of Fig. 8.6.

Pass-by-value analysis of the program of Fig. .

Fig. 8.9 Pass-by-reference analysis (with a pointer argument) of the program of Fig. 8.7.

Pass-by-reference analysis (with a pointer argument) of the program of Fig. .

Software Engineering Observation 8.1

Software Engineering Observation 8.1

Use pass-by-value to pass arguments to a function unless the caller explicitly requires that the called function directly modify the value of the argument variable in the caller. This is another example of the principle of least privilege.

In the function header and in the prototype for a function that expects a one-dimensional array as an argument, the pointer notation in the parameter list of cubeByReference may be used. The compiler does not differentiate between a function that receives a pointer and a function that receives a one-dimensional array. This, of course, means that the function must “know” when it is receiving an array or simply a single variable which is being passed by reference. When the compiler encounters a function parameter for a one-dimensional array of the form int b[ ], the compiler converts the parameter to the pointer notation int *b (pronounced “b is a pointer to an integer”). Both forms of declaring a function parameter as a one-dimensional array are interchangeable.

8.5 Using const with Pointers

Recall that the const qualifier enables you to inform the compiler that the value of a particular variable should not be modified.

Over the years, a large base of legacy code was written in early versions of C that did not use const, because it was not available. For this reason, there are great opportunities for improvement in the software engineering of old (also called “legacy”) C code. Also, many programmers currently using ANSI C and C++ do not use const in their programs, because they began programming in early versions of C. These programmers are missing many opportunities for good software engineering.

Many possibilities exist for using (or not using) const with function parameters. How do you choose the most appropriate of these possibilities? Let the principle of least privilege be your guide. Always award a function enough access to the data in its parameters to accomplish its specified task, but no more. This section discusses how to combine const with pointer declarations to enforce the principle of least privilege.

Chapter 6 explained that when a function is called using pass-by-value, a copy of the argument (or arguments) in the function call is made and passed to the function. If the copy is modified in the function, the original value is maintained in the caller without change. In many cases, a value passed to a function is modified so that the function can accomplish its task. However, in some instances, the value should not be altered in the called function, even though the called function manipulates only a copy of the original value.

For example, consider a function that takes a one-dimensional array and its size as arguments and subsequently prints the array. Such a function should loop through the array and output each array element individually. The size of the array is used in the function body to determine the highest subscript of the array so the loop can terminate when the printing completes. The size of the array does not change in the function body, so it should be declared const. Of course, because the array is only being printed, it, too, should be declared const. This is especially important because an entire array is always passed by reference and could easily be changed in the called function.

Software Engineering Observation 8.2

Software Engineering Observation 8.2

If a value does not (or should not) change in the body of a function to which it is passed, the parameter should be declared const to ensure that it is not accidentally modified.

If an attempt is made to modify a const value, a warning or an error is issued, depending on the particular compiler.

Error-Prevention Tip 8.2

Error-Prevention Tip 8.2

Before using a function, check its function prototype to determine the parameters that it can modify.

There are four ways to pass a pointer to a function: a nonconstant pointer to nonconstant data (Fig. 8.10), a nonconstant pointer to constant data (Fig. 8.11 and Fig. 8.12), a constant pointer to nonconstant data (Fig. 8.13) and a constant pointer to constant data (Fig. 8.14). Each combination provides a different level of access privileges.

Nonconstant Pointer to Nonconstant Data

The highest access is granted by a nonconstant pointer to nonconstant data—the data can be modified through the dereferenced pointer, and the pointer can be modified to point to other data. The declaration for such a pointer does not include const. Such a pointer can be used to receive a null-terminated string in a function that changes the pointer value to process (and possibly modify) each character in the string. Recall from Section 7.4 that a null-terminated string can be placed in a character array that contains the characters of the string and a null character indicating where the string ends.

In Fig. 8.10, function convertToUppercase (lines 25–34) declares parameter sPtr (line 25) to be a nonconstant pointer to nonconstant data (again, const is not used). The function processes one character at a time from the null-terminated string stored in character array phrase (lines 27–33). Keep in mind that a character array’s name is really equivalent to a const pointer to the first character of the array, so passing phrase as an argument to convertToUppercase is possible. Function islower (line 29) takes a character argument and returns true if the character is a lowercase letter and false otherwise. Characters in the range 'a' through 'z' are converted to their corresponding uppercase letters by function toupper (line 30); others remain unchanged—function toupper takes one character as an argument. If the character is a lowercase letter, the corresponding uppercase letter is returned; otherwise, the original character is returned. Function toupper and function islower are part of the character-handling library <cctype> (see Chapter 19, Bits, Characters, C-Strings and structs). After processing one character, line 32 increments sPtr by 1 (this would not be possible if sPtr were declared const). When operator ++ is applied to a pointer that points to an array, the memory address stored in the pointer is modified to point to the next element of the array (in this case, the next character in the string). Adding one to a pointer is one valid operation in pointer arithmetic, which is covered in detail in Sections 8.88.9.

Fig. 8.10 Converting a string to uppercase letters using a nonconstant pointer to nonconstant data.

Image

Nonconstant Pointer to Constant Data

A nonconstant pointer to constant data is a pointer that can be modified to point to any data item of the appropriate type, but the data to which it points cannot be modified through that pointer. Such a pointer might be used to receive an array argument to a function that will process each element of the array, but should not be allowed to modify the data. For example, function printCharacters (lines 22–26 of Fig. 8.11) declares parameter sPtr (line 22) to be of type const char *, so that it can receive a null-terminated pointer-based string. The declaration is read from right to left as “sPtr is a pointer to a character constant.” The body of the function uses a for statement (lines 24–25) to output each character in the string until the null character is encountered. After each character is printed, pointer sPtr is incremented to point to the next character in the string (this works because the pointer is not const). Function main creates char array phrase to be passed to printCharacters. Again, we can pass the array phrase to printCharacters because the name of the array is really a pointer to the first character in the array.

Fig. 8.11 Printing a string one character at a time using a nonconstant pointer to constant data.

Image

Figure 8.12 demonstrates the compilation error messages produced when attempting to compile a function that receives a nonconstant pointer to constant data, then tries to use that pointer to modify the data. [Note: Recall that compiler error messages may vary among compilers.]

Fig. 8.12 Attempting to modify data through a nonconstant pointer to constant data.

Image

As we know, arrays are aggregate data types that store related data items of the same type under one name. When a function is called with an array as an argument, the array is passed to the function by reference. However, objects are always passed by value—a copy of the entire object is passed. This requires the execution-time overhead of making a copy of each data item in the object and storing it on the function call stack. When an object must be passed to a function, we can use a pointer to constant data (or a reference to constant data) to get the performance of pass-by-reference and the protection of pass-by-value. When a pointer to an object is passed, only a copy of the address of the object must be made—the object itself is not copied. On a machine with four-byte addresses, a copy of four bytes of memory is made rather than a copy of a possibly large object.

Performance Tip 8.1

Performance Tip 8.1

If they do not need to be modified by the called function, pass large objects using pointers to constant data or references to constant data, to obtain the performance benefits of pass-by-reference.

Siftware Ebgubeerubg Observation 8.3

Software Engineering Observation 8.3

Pass large objects using pointers to constant data, or references to constant data, to obtain the security of pass-by-value.

Constant Pointer to Nonconstant Data

A constant pointer to nonconstant data is a pointer that always points to the same memory location; the data at that location can be modified through the pointer. An example of such a pointer is an array name, which is a constant pointer to the beginning of the array. All data in the array can be accessed and changed by using the array name and array subscripting. A constant pointer to nonconstant data can be used to receive an array as an argument to a function that accesses array elements using array subscript notation. Pointers that are declared const must be initialized when they are declared. (If the pointer is a function parameter, it is initialized with a pointer that is passed to the function.) The program of Fig. 8.13 attempts to modify a constant pointer. Line 11 declares pointer ptr to be of type int * const. The declaration in the figure is read from right to left as “ptr is a constant pointer to a nonconstant integer.” The pointer is initialized with the address of integer variable x. Line 14 attempts to assign the address of y to ptr, but the compiler generates an error message. Note that no error occurs when line 13 assigns the value 7 to *ptr—the nonconstant value to which ptr points can be modified using the dereferenced ptr, even though ptr itself has been declared const.

Fig. 8.13 Attempting to modify a constant pointer to nonconstant data.

Image

Common Programming Error 8.6

Common Programming Error 8.6

Not initializing a pointer that is declared const is a compilation error.

Constant Pointer to Constant Data

The least amount of access privilege is granted by a constant pointer to constant data. Such a pointer always points to the same memory location, and the data at that memory location cannot be modified using the pointer. This is how an array should be passed to a function that only reads the array, using array subscript notation, and does not modify the array. The program of Fig. 8.14 declares pointer variable ptr to be of type const int * const (line 14). This declaration is read from right to left as “ptr is a constant pointer to an integer constant.” The figure shows the error messages generated when an attempt is made to modify the data to which ptr points (line 18) and when an attempt is made to modify the address stored in the pointer variable (line 19). Note that no errors occur when the program attempts to dereference ptr, or when the program attempts to output the value to which ptr points (line 16), because neither the pointer nor the data it points to is being modified in this statement.

Fig. 8.14 Attempting to modify a constant pointer to constant data.

 1   // Fig. 8.14: fig08_14.cpp
 2   // Attempting to modify a constant pointer to constant data.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   int main( )
 8   {
 9      int x = 5, y;
10
11      // ptr is a constant pointer to a constant integer.
12      // ptr always points to the same location; the integer
13      // at that location cannot be modified.
14      const int *const ptr = &x;
15
16      cout << *ptr << endl;
17
18      *ptr = 7; // error: *ptr is const; cannot assign new value
19      ptr = &y; // error: ptr is const; cannot assign new address
20      return 0// indicates successful termination
21   } // end main

Borland C++ command-line compiler error message:

Error E2024 fig08_14.cpp 18: Cannot modify a const object in function main( )
Error E2024 fig08_14.cpp 19: Cannot modify a const object in function main( )

Microsoft Visual C++ compiler error message:

c:cppfp_examplesch08Fig08_14fig08_14.cpp(18) : error C3892: 'ptr' :
   you cannot assign to a variable that is const
c:cppfp_examplesch08Fig08_14fig08_14.cpp(19) : error C3892: 'ptr' :
   you cannot assign to a variable that is const

GNU C++ compiler error message:

fig08_14.cpp: In function `int main( )':
fig08_14.cpp:18: error: assignment of read-only location
fig08_14.cpp:19: error: assignment of read-only variable `ptr'

8.6 Selection Sort Using Pass-by-Reference

In this section, we define a sorting program to demonstrate passing arrays and individual array elements by reference. We use the selection sort algorithm, which is an easy-to-program, but unfortunately inefficient, sorting algorithm. The first iteration of the algorithm selects the smallest element in the array and swaps it with the first element. The second iteration selects the second-smallest element (which is the smallest element of the remaining elements) and swaps it with the second element. The algorithm continues until the last iteration selects the second-largest element and swaps it with the second-to-last index, leaving the largest element in the last index. After the ith iteration, the smallest i items of the array will be sorted into increasing order in the first i elements of the array.

As an example, consider the array

34   56   4   10   77   51   93   30   5   52

A program that implements the selection sort first determines the smallest value (4) in the array, which is contained in element 2. The program swaps the 4 with the value in element 0 (34), resulting in

4   56   34   10   77   51   93   30   5   52

[Note: We use bold to highlight the values that were swapped.] The program then determines the smallest value of the remaining elements (all elements except 4), which is 5, contained in element 8. The program swaps the 5 with the 56 in element 1, resulting in

4   5   34   10   77   51   93   30   56   52

On the third iteration, the program determines the next smallest value, 10, and swaps it with the value in element 2 (34).

4   5   10   34   77   51   93   30   56   52

The process continues until the array is fully sorted.

4   5   10   30   34   51   52   56   77   93

Note that after the first iteration, the smallest element is in the first position. After the second iteration, the two smallest elements are in order in the first two positions. After the third iteration, the three smallest elements are in order in the first three positions.

Figure 8.15 implements selection sort using two functions—selectionSort and swap. Function selectionSort (lines 36–53) sorts the array. Line 38 declares the variable smallest, which will store the index of the smallest element in the remaining array. Lines 41–52 loop size - 1 times. Line 43 sets the index of the smallest element to the current index. Lines 46–49 loop over the remaining elements in the array. For each of these elements, line 48 compares its value to the value of the smallest element. If the current element is smaller than the smallest element, line 49 assigns the current element’s index to smallest. When this loop finishes, smallest will contain the index of the smallest element in the remaining array. Line 51 calls function swap (lines 57–62) to place the smallest remaining element in the next spot in the array (i.e., exchange the array elements array[i] and array[smallest]).

Fig. 8.15 Selection sort with pass-by-reference.

 1   // Fig. 8.15: fig08_15.cpp
 2   // Selection sort with pass-by-reference. This program puts values into an
 3   // array, sorts them into ascending order and prints the resulting array.
 4   #include <iostream>
 5   using std::cout;
 6   using std::endl;
 7
 8   #include <iomanip>
 9   using std::setw;
10
11   void selectionSort( int * constconst int ); // prototype
12   void swap( int * constint * const ); // prototype
13
14   int main( )
15   {
16      const int arraySize = 10;
17      int a[ arraySize ] = { 2648101289684537 };
18
19      cout << "Data items in original order ";
20
21      for ( int i = 0; i < arraySize; i++ )
22         cout << setw( 4 ) << a[ i ];
23
24      selectionSort( a, arraySize ); // sort the array
25
26      cout << " Data items in ascending order ";
27
28      for ( int j = 0; j < arraySize; j++ )
29         cout << setw( 4 ) << a[ j ];
30
31      cout << endl;
32      return 0// indicates successful termination
33   } // end main

34
35   // function to sort an array
36   void selectionSort( int * const array, const int size )
37   {
38      int smallest; // index of smallest element
39
40      // loop over size - 1 elements
41      for ( int i = 0; i < size - 1; i++ )
42      {
43         smallest = i; // first index of remaining array
44
45        // loop to find index of smallest element
46        for ( int index = i + 1; index < size; index++ )
47
48           if ( array[ index ] < array[ smallest ] )
49              smallest = index;
50
51        swap( &array[ i ], &array[ smallest ] );
52      } // end if
53   } // end function selectionSort
54
55   // swap values at memory locations to which                  
56   // element1Ptr and element2Ptr point                         
57   void swap( int * const element1Ptr, int * const element2Ptr )
58   {                                                            
59      int hold = *element1Ptr;                                  
60      *element1Ptr = *element2Ptr;                              
61      *element2Ptr = hold;                                      
62   } // end function swap                                       

Data items in original order
   2   6   4   8  10  12  89   68   45   37
Data items in ascending order
   2   4   6   8  10  12  37   45   68   89

Let us now look more closely at function swap. Remember that C++ enforces information hiding between functions, so swap does not have access to individual array elements in selectionSort. Because selectionSort wants swap to have access to the array elements to be swapped, selectionSort passes each of these elements to swap by reference—the address of each array element is passed explicitly. Although entire arrays are passed by reference, individual array elements are scalars and are ordinarily passed by value. Therefore, selectionSort uses the address operator (&) on each array element in the swap call (line 51) to effect pass-by-reference. Function swap (lines 57–62) receives &array[ i ] in pointer variable element1Ptr. Information hiding prevents swap from “knowing” the name array[ i ], but swap can use *element1Ptr as a synonym for array[ i ]. Thus, when swap references *element1Ptr, it is actually referencing array[ i ] in selectionSort. Similarly, when swap references *element2Ptr, it is actually referencing array[ smallest ] in selectionSort.

Even though swap is not allowed to use the statements

hold = array[ i ];
array[ i ] = array[ smallest ];
array[ smallest ] = hold;

precisely the same effect is achieved by

int hold = *element1Ptr;
*element1Ptr = *element2Ptr;
*element2Ptr = hold;

in the swap function of Fig. 8.15.

Several features of function selectionSort should be noted. The function header (line 36) declares array as int * const array, rather than int array[ ], to indicate that the function receives a one-dimensional array as an argument. Both parameter array’s pointer and parameter size are declared const to enforce the principle of least privilege. Although parameter size receives a copy of a value in main and modifying the copy cannot change the value in main, selectionSort does not need to alter size to accomplish its task—the array size remains fixed during the execution of selectionSort. Therefore, size is declared const to ensure that it is not modified. If the size of the array were to be modified during the sorting process, the sorting algorithm would not run correctly.

Note that function selectionSort receives the size of the array as a parameter, because the function must have that information to sort the array. When an array is passed to a function, only the memory address of the first element of the array is received by the function; the array size must be passed separately to the function.

By defining function selectionSort to receive the array size as a parameter, we enable the function to be used by any program that sorts one-dimensional int arrays of arbitrary size. The size of the array could have been programmed directly into the function, but this would restrict the function to processing an array of a specific size and reduce the function’s reusability—only programs processing one-dimensional int arrays of the specific size “hard coded” into the function could use the function.

Software Engineering Observation 8.4

Software Engineering Observation 8.4

When passing an array to a function, also pass the size of the array (rather than building into the function knowledge of the array size)—this makes the function more reusable.

8.7 sizeof Operator

C++ provides the unary operator sizeof to determine the size of an array (or of any other data type, variable or constant) in bytes during program compilation. When applied to the name of an array, as in Fig. 8.16 (line 14), the sizeof operator returns the total number of bytes in the array as a value of type size_t (an unsigned integer type that is at least as big as unsigned int). Note that this is different from the size of a vector< int >, for example, which is the number of integer elements in the vector. The computer we used to compile this program stores variables of type double in 8 bytes of memory, and array is declared to have 20 elements (line 12), so array uses 160 bytes in memory. When applied to a pointer parameter (line 24) in a function that receives an array as an argument, the sizeof operator returns the size of the pointer in bytes (4 on the system we used)—not the size of the array.

Fig. 8.16 sizeof operator when applied to an array name returns the number of bytes in the array.

 1   // Fig. 8.16: fig08_16.cpp
 2   // Sizeof operator when used on an array name
 3   // returns the number of bytes in the array.
 4   #include <iostream>
 5   using std::cout;
 6   using std::endl;
 7
 8   size_t getSize( double * ); // prototype
 9
10   int main( )
11   {
12      double array[ 20 ]; // 20 doubles; occupies 160 bytes on our system
13
14      cout << "The number of bytes in the array is " << sizeof( array );
15
16      cout << " The number of bytes returned by getSize is "
17         << getSize( array ) << endl;
18      return 0; // indicates successful termination
19   } // end main
20
21   // return size of ptr        
22   size_t getSize( double *ptr )
23   {                            
24      return sizeof( ptr );     
25   } // end function getSize    

The number of bytes in the array is 160
The number of bytes returned by getSize is 4

Common Programming Error 8.7

Common Programming Error 8.7

Using the sizeof operator in a function to find the size in bytes of an array parameter results in the size in bytes of a pointer, not the size in bytes of the array.

[Note: When the Borland C++ compiler is used to compile Fig. 8.16, the compiler generates the warning message “Parameter 'ptr' is never used in function getSize(double *).” This warning occurs because sizeof is actually a compile-time operator; thus, variable ptr is not used in the function’s body at execution time. Many compilers issue warnings like this to let you know that a variable is not being used so that you can either remove it from your code or modify your code to use the variable properly. Similar messages occur in Fig. 8.17 with various compilers.]

The number of elements in an array also can be determined using the results of two sizeof operations. For example, consider the following array declaration:

double realArray[ 22 ];

If variables of data type double are stored in eight bytes of memory, array realArray contains a total of 176 bytes. To determine the number of elements in the array, the following expression (which is evaluated at compile time) can be used:

sizeof realArray / sizeofdouble ) // calculate number of elements

The expression determines the number of bytes in array realArray (176) and divides that value by the number of bytes used in memory to store a double value (8)—the result is the number of elements in realArray (22).

Determining the Sizes of the Fundamental Types, an Array and a Pointer

Figure 8.17 uses sizeof to calculate the number of bytes used to store most of the standard data types. Notice that, in the output, the types double and long double have the same size. Types may have different sizes based on the platform running the program. On another system, for example, double and long double may be of different sizes.

Fig. 8.17 sizeof operator used to determine standard data type sizes.

Image

Portability Tip 8.2

Portability Tip 8.2

The number of bytes used to store a particular data type may vary among systems. When writing programs that depend on data type sizes, and that will run on several computer systems, use sizeof to determine the number of bytes used to store the data types.

Operator sizeof can be applied to any expression or type name. When sizeof is applied to a variable name (which is not an array name) or other expression, the number of bytes used to store the specific type of the expression’s value is returned. Note that the parentheses used with sizeof are required only if a type name (e.g., int) is supplied as its operand. The parentheses used with sizeof are not required when sizeof’s operand is an expression. Remember that sizeof is an operator, not a function, and that it has its effect at compile time, not execution time.

Common Programming Error 8.8

Common Programming Error 8.8

Omitting the parentheses in a sizeof operation when the operand is a type name is a compilation error.

Performance Tip 8.2

Performance Tip 8.2

Because sizeof is a compile-time unary operator, not an execution-time operator, using sizeof does not negatively impact execution performance.

Error-Prevention Tip 8.3

Error-Prevention Tip 8.3

To avoid errors associated with omitting the parentheses around the operand of operator sizeof, many programmers include parentheses around every sizeof operand.

8.8 Pointer Expressions and Pointer Arithmetic

Pointers are valid operands in arithmetic expressions, assignment expressions and comparison expressions. However, not all the operators normally used in these expressions are valid with pointer variables. This section describes the operators that can have pointers as operands and how these operators are used with pointers.

Several arithmetic operations may be performed on pointers. A pointer may be incremented (++) or decremented (--), an integer may be added to a pointer (+ or +=), an integer may be subtracted from a pointer (- or -=) or one pointer may be subtracted from another of the same type.

Assume that array int v[5] has been declared and that its first element is at memory location 3000. Assume that pointer vPtr has been initialized to point to v[0] (i.e., the value of vPtr is 3000). Figure 8.18 diagrams this situation for a machine with four-byte integers. Note that vPtr can be initialized to point to array v with either of the following statements (because the name of an array is equivalent to the address of its first element):

Fig. 8.18 Array v and a pointer variable int *vPtr that points to v.

Array v and a pointer variable int *vPtr that points to v.

int *vPtr = v;
int *vPtr = &v[ 0 ];

Portability Tip 8.3

Portability Tip 8.3

Most computers today have two-byte or four-byte integers. Some of the newer machines use eight-byte integers. Because the results of pointer arithmetic depend on the size of the objects a pointer points to, pointer arithmetic is machine dependent.

In conventional arithmetic, the addition 3000 + 2 yields the value 3002. This is normally not the case with pointer arithmetic. When an integer is added to, or subtracted from, a pointer, the pointer is not simply incremented or decremented by that integer, but by that integer times the size of the object to which the pointer refers. The number of bytes depends on the object’s data type. For example, the statement

vPtr += 2;

would produce 3008 (3000 + 2 * 4), assuming that an int is stored in four bytes of memory. In the array v, vPtr would now point to v[ 2 ] (Fig. 8.19). If an integer is stored in two bytes of memory, then the preceding calculation would result in memory location 3004 (3000 + 2 * 2). If the array elements were of a different data type, the preceding statement would increment the pointer by twice the number of bytes it takes to store an object of that data type. When performing pointer arithmetic on a character array, the results will be consistent with regular arithmetic, because each character is one byte long.

Fig. 8.19 Pointer vPtr after pointer arithmetic.

Pointer vPtr after pointer arithmetic.

If vPtr had been incremented to 3016, which points to v[4], the statement

vPtr -= 4;

would set vPtr back to 3000—the beginning of the array. If a pointer is being incremented or decremented by one, the increment (++) and decrement (--) operators can be used. Each of the statements

++vPtr;
vPtr++;

increments the pointer to point to the next element of the array. Each of the statements

--vPtr;
vPtr--;

decrements the pointer to point to the previous element of the array.

Pointer variables pointing to the same array may be subtracted from one another. For example, if vPtr contains the address 3000 and v2Ptr contains the address 3008, the statement

x = v2Ptr - vPtr;

would assign to x the number of array elements from vPtr to v2Ptr—in this case, 2. Pointer arithmetic is meaningless unless performed on a pointer that points to an array. We cannot assume that two variables of the same type are stored contiguously in memory unless they are adjacent elements of an array.

Common Programming Error 8.9

Common Programming Error 8.9

Using pointer arithmetic on a pointer that does not refer to an array of values is a logic error.

Common Programming Error 8.10

Common Programming Error 8.10

Subtracting or comparing two pointers that do not refer to elements of the same array is a logic error.

Common Programming Error 8.11

Common Programming Error 8.11

Using pointer arithmetic to increment or decrement a pointer such that the pointer refers to an element outside the bounds of the array is normally a logic error.

A pointer can be assigned to another pointer if both pointers are of the same type. Otherwise, a cast operator must be used to convert the value of the pointer on the right of the assignment to the pointer type on the left of the assignment. The exception to this rule is the pointer to void (i.e., void *), which is a generic pointer capable of representing any pointer type. All pointer types can be assigned to a pointer of type void * without casting. However, a pointer of type void * cannot be assigned directly to a pointer of another type—the pointer of type void * must first be cast to the proper pointer type.

Software Engineering Observation 8.5

Software Engineering Observation 8.5

Nonconstant pointer arguments can be passed to constant pointer parameters. This is helpful when the body of a program uses a nonconstant pointer to access data, but does not want that data to be modified by a function called in the body of the program.

A void * pointer cannot be dereferenced. For example, the compiler “knows” that a pointer to int refers to four bytes of memory on a machine with four-byte integers, but a pointer to void simply contains a memory address for an unknown data type—the precise number of bytes to which the pointer refers and the type of the data are not known by the compiler. The compiler must know the data type to determine the number of bytes to be dereferenced for a particular pointer—for a pointer to void, this number of bytes cannot be determined from the type.

Common Programming Error 8.12

Common Programming Error 8.12

Assigning a pointer of one type to a pointer of another (other than void *) without casting the first pointer to the type of the second pointer is a compilation error.

Common Programming Error 8.13

Common Programming Error 8.13

All operations on a void * pointer are compilation errors, except comparing void * pointers with other pointers, casting void * pointers to valid pointer types and assigning addresses to void * pointers.

Pointers can be compared using equality and relational operators. Comparisons using relational operators are meaningless unless the pointers point to members of the same array. Pointer comparisons compare the addresses stored in the pointers. A comparison of two pointers pointing to the same array could show, for example, that one pointer points to a higher numbered element of the array than the other pointer does. A common use of pointer comparison is determining whether a pointer is 0 (i.e., the pointer is a null pointer—it does not point to anything).

8.9 Relationship Between Pointers and Arrays

Arrays and pointers are intimately related in C++ and may be used almost interchangeably. An array name can be thought of as a constant pointer. Pointers can be used to do any operation involving array subscripting.

Assume the following declarations:

int b[ 5 ]; // create 5-element int array b
int *bPtr; // create int pointer bPtr

Because the array name (without a subscript) is a (constant) pointer to the first element of the array, we can set bPtr to the address of the first element in array b with the statement

bPtr = b; // assign address of array b to bPtr

This is equivalent to assigning the address of the first element of the array as follows:

bPtr = &b[ 0 ]; // also assigns address of array b to bPtr

Array element b[3] can alternatively be referenced with the pointer expression

*( bPtr + 3 )

The 3 in the preceding expression is the offset to the pointer. When the pointer points to the beginning of an array, the offset indicates which element of the array should be referenced, and the offset value is identical to the array subscript. The preceding notation is referred to as pointer/offset notation. The parentheses are necessary, because the precedence of * is higher than the precedence of +. Without the parentheses, the above expression would add 3 to the value of *bPtr (i.e., 3 would be added to b[0], assuming that bPtr points to the beginning of the array). Just as the array element can be referenced with a pointer expression, the address

&b[ 3 ]

can be written with the pointer expression

bPtr + 3

The array name (which is implicitly const) can be treated as a pointer and used in pointer arithmetic. For example, the expression

*( b + 3 )

also refers to the array element b[3]. In general, all subscripted array expressions can be written with a pointer and an offset. In this case, pointer/offset notation was used with the name of the array as a pointer. Note that the preceding expression does not modify the array name in any way; b still points to the first element in the array.

Pointers can be subscripted exactly as arrays can. For example, the expression

bPtr[ 1 ]

refers to the array element b[1]; this expression uses pointer/subscript notation.

Remember that an array name is a constant pointer; it always points to the beginning of the array. Thus, the expression

b += 3

causes a compilation error, because it attempts to modify the value of the array name (a constant) with pointer arithmetic.

Common Programming Error 8.14

Common Programming Error 8.14

Although array names are pointers to the beginning of the array and pointers can be modified in arithmetic expressions, array names cannot be modified in arithmetic expressions, because array names are constant pointers.

Good Programming Practice 8.2

Good Programming Practice 8.2

For clarity, use array notation instead of pointer notation when manipulating arrays.

Figure 8.20 uses the four notations discussed in this section for referring to array elements—array subscript notation, pointer/offset notation with the array name as a pointer, pointer subscript notation and pointer/offset notation with a pointer—to accomplish the same task, namely printing the four elements of the integer array b.

Fig. 8.20 Referencing array elements with the array name and with pointers.

 1   // Fig. 8.20: fig08_20.cpp
 2   // Using subscripting and pointer notations with arrays.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   int main( )
 8   {
 9      int b[ ] = { 10, 20, 30, 40 }; // create 4-element array b
10      int *bPtr = b; // set bPtr to point to array b
11
12      // output array b using array subscript notation
13      cout << "Array b printed with: Array subscript notation ";

14
15      for ( int i = 0; i < 4; i++ )
16         cout << "b[" << i << "] = " << b[ i ] << ' ';
17
18      // output array b using the array name and pointer/offset notation
19      cout << " Pointer/offset notation where "
20         << "the pointer is the array name ";
21
22      for ( int offset1 = 0; offset1 < 4; offset1++ )
23         cout << "*(b + " << offset1 << ") = " << *( b + offset1 ) << ' ';
24
25      // output array b using bPtr and array subscript notation
26      cout << " Pointer subscript notation ";
27
28      for ( int j = 0; j < 4; j++ )
29         cout << "bPtr[" << j << "] = " << bPtr[ j ] << ' ';
30
31      cout << " Pointer/offset notation ";
32
33      // output array b using bPtr and pointer/offset notation
34      for ( int offset2 = 0; offset2 < 4; offset2++ )
35         cout << "*(bPtr + " << offset2 << ") = "
36            << *( bPtr + offset2 ) << ' ';
37
38      return 0; // indicates successful termination
39   } // end main

Array b printed with:

Array subscript notation
b[0] = 10
b[1] = 20
b[2] = 30
b[3] = 40

Pointer/offset notation where the pointer is the array name
*(b + 0) = 10
*(b + 1) = 20
*(b + 2) = 30
*(b + 3) = 40

Pointer subscript notation
bPtr[0] = 10
bPtr[1] = 20
bPtr[2] = 30
bPtr[3] = 40

Pointer/offset notation
*(bPtr + 0) = 10
*(bPtr + 1) = 20
*(bPtr + 2) = 30
*(bPtr + 3) = 40

To further illustrate the interchangeability of arrays and pointers, let us look at the two string-copying functions—copy1 and copy2—in the program of Fig. 8.21. Both functions copy a string into a character array. After a comparison of the function prototypes for copy1 and copy2, the functions appear identical (because of the interchangeability of arrays and pointers). These functions accomplish the same task, but they are implemented differently.

Fig. 8.21 String copying using array notation and pointer notation.

 1   // Fig. 8.21: fig08_21.cpp
 2   // Copying a string using array notation and pointer notation.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   void copy1( char *, const char * ); // prototype
 8   void copy2( char *, const char * ); // prototype
 9
10   int main( )
11   {
12      char string1[ 10 ];
13      char *string2 = "Hello";
14      char string3[ 10 ];
15      char string4[ ] = "Good Bye";
16
17      copy1( string1, string2 ); // copy string2 into string1
18      cout << "string1 = " << string1 << endl;
19
20      copy2( string3, string4 ); // copy string4 into string3
21      cout << "string3 = " << string3 << endl;
22      return 0; // indicates successful termination
23   } // end main
24
25   // copy s2 to s1 using array notation   
26   void copy1( char * s1, const char * s2 )
27   {
28      // copying occurs in the for header
29      for ( int i = 0; ( s1[ i ] = s2[ i ] ) != ''; i++ )
30         ; // do nothing in body                           
31   } // end function copy1
32
33   // copy s2 to s1 using pointer notation
34   void copy2( char *s1, const char *s2 ) 
35   {
36      // copying occurs in the for header
37      for ( ; ( *s1 = *s2 ) != ''; s1++, s2++ )
38         ; // do nothing in body                 
39   } // end function copy2

string1 = Hello
string3 = Good Bye

Function copy1 (lines 26–31) uses array subscript notation to copy the string in s2 to the character array s1. The function declares an integer counter variable i to use as the array subscript. The for statement header (line 29) performs the entire copy operation—its body is the empty statement. The header specifies that i is initialized to zero and incremented by one on each iteration of the loop. The condition in the for, (s1[ i] = s2[i]) != '', performs the copy operation character by character from s2 to s1. When the null character is encountered in s2, it is assigned to s1, and the loop terminates, because the null character is equal to ''. Remember that the value of an assignment statement is the value assigned to its left operand.

Function copy2 (lines 34–39) uses pointers and pointer arithmetic to copy the string in s2 to the character array s1. Again, the for statement header (line 37) performs the entire copy operation. The header does not include any variable initialization. As in function copy1, the condition ( *s1 = *s2 ) != '' performs the copy operation. Pointer s2 is dereferenced, and the resulting character is assigned to the dereferenced pointer s1. After the assignment in the condition, the loop increments both pointers, so they point to the next element of array s1 and the next character of string s2, respectively. When the loop encounters the null character in s2, the null character is assigned to the dereferenced pointer s1 and the loop terminates. Note that the “increment portion” of this for statement has two increment expressions separated by a comma operator.

The first argument to both copy1 and copy2 must be an array large enough to hold the string in the second argument. Otherwise, an error may occur when an attempt is made to write into a memory location beyond the bounds of the array (recall that when using pointer-based arrays, there is no “built-in” bounds checking). Also, note that the second parameter of each function is declared as const char * (a pointer to a character constant—i.e., a constant string). In both functions, the second argument is copied into the first argument—characters are copied from the second argument one at a time, but the characters are never modified. Therefore, the second parameter is declared to point to a constant value to enforce the principle of least privilege—neither function needs to modify the second argument, so neither function is allowed to modify the second argument.

8.10 Arrays of Pointers

Arrays may contain pointers. A common use of such a data structure is to form an array of pointer-based strings, referred to simply as a string array. Each entry in the array is a string, but in C++ a string is essentially a pointer to its first character, so each entry in an array of strings is simply a pointer to the first character of a string. Consider the declaration of string array suit that might be useful in representing a deck of cards:

const char *suit[ 4 ] =
   { "Hearts""Diamonds""Clubs""Spades" };

The suit[4] portion of the declaration indicates an array of four elements. The const char * portion of the declaration indicates that each element of array suit is of type “pointer to char constant data.” The four values to be placed in the array are "Hearts", "Diamonds", "Clubs" and "Spades". Each is stored in memory as a null-terminated character string that is one character longer than the number of characters between quotes. The four strings are seven, nine, six and seven characters long (including their terminating null characters), respectively. Although it appears as though these strings are being placed in the suit array, only pointers are actually stored in the array, as shown in Fig. 8.22. Each pointer points to the first character of its corresponding string. Thus, even though the suit array is fixed in size, it provides access to character strings of any length. This flexibility is one example of C++’s powerful data-structuring capabilities.

Fig. 8.22 Graphical representation of the suit array.

Graphical representation of the suit array.

The suit strings could be placed into a two-dimensional array, in which each row represents one suit and each column represents one of the letters of a suit name. Such a data structure must have a fixed number of columns per row, and that number must be as large as the largest string. Therefore, considerable memory is wasted when we store a large number of strings, of which most are shorter than the longest string. We use arrays of strings to help represent a deck of cards in the next section.

String arrays are commonly used with command-line arguments that are passed to function main when a program begins execution. Such arguments follow the program name when a program is executed from the command line. A typical use of command-line arguments is to pass options to a program. For example, from the command line on a Windows computer, the user can type

dir /p

to list the contents of the current directory and pause after each screen of information. When the dir command executes, the option /p is passed to dir as a command-line argument. Such arguments are placed in a string array that main receives as an argument.

8.11 Case Study: Card Shuffling and Dealing Simulation

This section uses random-number generation to develop a card shuffling and dealing simulation program. This program can then be used as a basis for implementing programs that play specific card games. To reveal some subtle performance problems, we have intentionally used suboptimal shuffling and dealing algorithms.

Using the top-down, stepwise-refinement approach, we develop a program that will shuffle a deck of 52 playing cards and then deal each of the 52 cards. The top-down approach is particularly useful in attacking larger, more complex problems than we have seen in the early chapters.

We use a 4-by-13 two-dimensional array deck to represent the deck of playing cards (Fig. 8.23). The rows correspond to the suits—row 0 corresponds to hearts, row 1 to diamonds, row 2 to clubs and row 3 to spades. The columns correspond to the face values of the cards—columns 0 through 9 correspond to the faces ace through 10, respectively, and columns 10 through 12 correspond to the jack, queen and king, respectively. We shall load the string array suit with character strings representing the four suits (as in Fig. 8.22) and the string array face with character strings representing the 13 face values.

Fig. 8.23 Two-dimensional array representation of a deck of cards.

Two-dimensional array representation of a deck of cards.

This simulated deck of cards may be shuffled as follows. First the 52-element array deck is initialized to zeros. Then, a row (0–3) and a column (0–12) are each chosen at random. The number 1 is inserted in array element deck[ row ][ column ] to indicate that this card is going to be the first one dealt from the shuffled deck. This process continues with the numbers 2, 3, ..., 52 being randomly inserted in the deck array to indicate which cards are to be placed second, third, ..., and 52nd in the shuffled deck. As the deck array begins to fill with card numbers, it is possible that a card will be selected twice (i.e., deck[ row ][ column ] will be nonzero when it is selected). This selection is simply ignored, and other row and column combinations are repeatedly chosen at random until an unselected card is found. Eventually, the numbers 1 through 52 will occupy the 52 slots of the deck array. At this point, the deck of cards is fully shuffled.

This shuffling algorithm could execute for an indefinitely long period if cards that have already been shuffled are repeatedly selected at random. This phenomenon is known as indefinite postponement (also called starvation).

Performance Tip 8.3

Performance Tip 8.3

Sometimes algorithms that emerge in a “natural” way can contain subtle performance problems such as indefinite postponement. Seek algorithms that avoid indefinite postponement.

To deal the first card, we search the array for the element deck[ row ][ column ] that matches 1. This is accomplished with nested for statements that vary row from 0 to 3 and column from 0 to 12. What card does that slot of the array correspond to? The suit array has been preloaded with the four suits, so to get the suit, we print the character string suit[ row ]. Similarly, to get the face value of the card, we print the character string face[ column ]. We also print the character string "of". Printing this information in the proper order enables us to print each card in the form "King of Clubs", "Ace of Diamonds" and so on.

Figures 8.248.26 contain the card shuffling and dealing program and a sample execution. Note the output formatting used in function deal (lines 81–83 of Fig. 8.25). The output statement outputs the face right justified in a field of five characters and outputs the suit left justified in a field of eight characters (Fig. 8.26). The output is printed in two-column format—if the card being output is in the first column, a tab is output after the card to move to the second column (line 83); otherwise, a newline is output.

Fig. 8.24 DeckOfCards header file.

 1   // Fig. 8.24: DeckOfCards.h
 2   // Definition of class DeckOfCards that
 3   // represents a deck of playing cards.
 4
 5   // DeckOfCards class definition
 6   class DeckOfCards
 7   {
 8   public:
 9      DeckOfCards( ); // constructor initializes deck
10      void shuffle( ); // shuffles cards in deck
11      void deal( ); // deals cards in deck
12   private:
13      int deck[ 4 ][ 13 ]; // represents deck of cards
14   }; // end class DeckOfCards

Fig. 8.25 Definitions of member functions for shuffling and dealing.

 1   // Fig. 8.25: DeckOfCards.cpp
 2   // Member-function definitions for class DeckOfCards that simulates
 3   // the shuffling and dealing of a deck of playing cards.
 4   #include <iostream>
 5   using std::cout;
 6   using std::left;
 7   using std::right;
 8
 9   #include <iomanip>
10   using std::setw;
11
12   #include <cstdlib> // prototypes for rand and srand
13   using std::rand;
14   using std::srand;
15
16   #include <ctime> // prototype for time
17   using std::time;
18
19   #include "DeckOfCards.h" // DeckOfCards class definition
20
21   // DeckOfCards default constructor initializes deck
22   DeckOfCards::DeckOfCards( )
23   {
24      // loop through rows of deck
25      for ( int row = 0; row <= 3; row++ )
26      {
27         // loop through columns of deck for current row
28         for ( int column = 0; column <= 12; column++ )
29         {
30            deck[ row ][ column ] = 0; // initialize slot of deck to 0
31         } // end inner for
32      } // end outer for
33
34      srand( time( 0 ) ); // seed random number generator
35   } // end DeckOfCards default constructor

36
37   // shuffle cards in deck
38   void DeckOfCards::shuffle( )
39   {
40      int row; // represents suit value of card
41      int column; // represents face value of card
42
43      // for each of the 52 cards, choose a slot of the deck randomly
44      for ( int card = 1; card <= 52; card++ )
45      {
46         do // choose a new random location until unoccupied slot is found
47         {
48            row = rand( ) % 4; // randomly select the row (0 to 3)
49            column = rand( ) % 13; // randomly select the column (0 to 12)
50         } while( deck[ row ][ column ] != 0 ); // end do...while
51
52         // place card number in chosen slot of deck
53         deck[ row ][ column ] = card;
54      } // end for
55   } // end function shuffle
56
57   // deal cards in deck
58   void DeckOfCards::deal( )
59   {
60      // initialize suit array                       
61      static const char *suit[ 4 ] =                 
62         { "Hearts""Diamonds""Clubs""Spades" };
63
64      // initialize face array                                     
65      static const char *face[ 13 ] =                              
66         { "Ace""Deuce""Three""Four""Five""Six""Seven",
67         "Eight", "Nine", "Ten", "Jack", "Queen", "King" };        
68
69      // for each of the 52 cards
70      for ( int card = 1; card <= 52; card++ )
71      {
72         // loop through rows of deck
73         for ( int row = 0; row <= 3; row++ )
74         {
75            // loop through columns of deck for current row
76            for ( int column = 0; column <= 12; column++ )
77            {
78               // if slot contains current card, display card
79               if ( deck[ row ][ column ] == card )
80               {
81                  cout << setw( 5 ) << right << face[ column ]    
82                     << " of " << setw( 8 ) << left << suit[ row ]
83                     << ( card % 2 == 0 ? ' ' : ' ' );          
84               } // end if
85            } // end innermost for
86         } // end inner for
87      } // end outer for
88   } // end function deal

Fig. 8.26 Card shuffling and dealing program.

Image

There is a weakness in the dealing algorithm. Once a match is found, even if it is found on the first try, the two inner for statements continue searching the remaining elements of deck for a match.

8.12 Function Pointers

A pointer to a function contains the address of the function in memory. In Chapter 7, we saw that the name of an array is actually the address in memory of the first element of the array. Similarly, the name of a function is actually the starting address in memory of the code that performs the function’s task. Pointers to functions can be passed to functions, returned from functions, stored in arrays, assigned to other function pointers and used to call the underlying function.

Multipurpose Selection Sort Using Function Pointers

To illustrate the use of pointers to functions, Fig. 8.27 modifies the selection sort program of Fig. 8.15. Figure 8.27 consists of main (lines 17–55) and the functions selectionSort (lines 59–76), swap (lines 80–85), ascending (lines 89–92) and descending (lines 96–99). Function selectionSort receives a pointer to a function—either function ascending or function descending—as an argument in addition to the integer array to sort and the size of the array. Functions ascending and descending determine the sorting order. The program prompts the user to choose whether the array should be sorted in ascending order or in descending order (lines 24–26). If the user enters 1, a pointer to function ascending is passed to function selectionSort (line 37), causing the array to be sorted into increasing order. If the user enters 2, a pointer to function descending is passed to function selectionSort (line 45), causing the array to be sorted into decreasing order.

Fig. 8.27 Multipurpose sorting program using function pointers.

 1   // Fig. 8.27: fig08_27.cpp
 2   // Multipurpose sorting program using function pointers.
 3   #include <iostream>
 4   using std::cout;
 5   using std::cin;
 6   using std::endl;
 7
 8   #include <iomanip>
 9   using std::setw;
10
11   // prototypes
12   void selectionSort( int [ ], const intbool (*)( int, int) );
13   void swap( int * constint * const );
14   bool ascending( int, int ); // implements ascending order
15   bool descending( intint ); // implements descending order
16
17   int main( )
18   {
19      const int arraySize = 10;
20      int order; // 1 = ascending, 2 = descending
21      int counter; // array index
22      int a[ arraySize ] = { 2, 6, 4, 8, 10, 12, 89, 68, 45, 37 };
23
24      cout << "Enter 1 to sort in ascending order, "
25         << "Enter 2 to sort in descending order: ";
26      cin >> order;
27      cout << " Data items in original order ";
28
29      // output original array
30      for ( counter = 0; counter < arraySize; counter++ )
31         cout << setw( 4 ) << a[ counter ];
32
33      // sort array in ascending order; pass function ascending
34      // as an argument to specify ascending sorting order
35      if ( order == 1 )
36      {
37         selectionSort( a, arraySize, ascending );

38         cout << " Data items in ascending order ";
39      } // end if
40
41      // sort array in descending order; pass function descending
42      // as an argument to specify descending sorting order
43      else
44      {
45         selectionSort( a, arraySize, descending );
46         cout << " Data items in descending order ";
47      } // end else part of if...else
48
49      // output sorted array
50      for ( counter = 0; counter < arraySize; counter++ )
51         cout << setw( 4 ) << a[ counter ];
52
53      cout << endl;
54      return 0; // indicates successful termination
55   } // end main
56
57   // multipurpose selection sort; the parameter compare is a pointer to
58   // the comparison function that determines the sorting order
59   void selectionSort( int work[ ], const int size,
60                       bool (*compare)( int, int ) )
61   {
62      int smallestOrLargest; // index of smallest (or largest) element
63
64      // loop over size - 1 elements
65      for ( int i = 0; i < size - 1; i++ )
66      {
67         smallestOrLargest = i; // first index of remaining vector
68
69         // loop to find index of smallest (or largest) element
70         for ( int index = i + 1; index < size; index++ )
71            if ( !(*compare)( work[ smallestOrLargest ], work[ index ] ) )
72               smallestOrLargest = index;
73
74         swap( &work[ smallestOrLargest ], &work[ i ] );
75      } // end if
76   } // end function selectionSort
77
78   // swap values at memory locations to which
79   // element1Ptr and element2Ptr point
80   void swap( int * const element1Ptr, int * const element2Ptr )
81   {
82      int hold = *element1Ptr;
83      *element1Ptr = *element2Ptr;
84      *element2Ptr = hold;
85   } // end function swap
86
87   // determine whether element a is less than         
88   // element b for an ascending order sort            
89   bool ascending( int a, int b )                      
90   {                                                   

91      return a < b; // returns true if a is less than b
92   } // end function ascending                         
93
94   // determine whether element a is greater than         
95   // element b for a descending order sort               
96   bool descending( int a, int b )                        
97   {                                                      
98      return a > b; // returns true if a is greater than b
99   } // end function descending                           

Enter 1 to sort in ascending order,
Enter 2 to sort in descending order: 1

Data items in original order
   2   6   4   8  10  12  89  68  45  37
Data items in ascending order
   2   4   6   8  10  12  37  45  68  89

Enter 1 to sort in ascending order,
Enter 2 to sort in descending order: 2

Data items in original order
   2   6   4   8  10  12  89  68  45  37
Data items in descending order
  89  68  45  37  12  10   8   6   4   2

The following parameter appears in line 60 of selectionSort’s function header:

bool ( *compare )( intint )

This parameter specifies a pointer to a function. The keyword bool indicates that the function being pointed to returns a bool value. The text ( *compare ) indicates the name of the pointer to the function (the * indicates that parameter compare is a pointer). The text ( int, int ) indicates that the function pointed to by compare takes two integer arguments. Parentheses are needed around *compare to indicate that compare is a pointer to a function. If we had not included the parentheses, the declaration would have been

bool *compare( intint )

which declares a function that receives two integers as parameters and returns a pointer to a bool value.

The corresponding parameter in the function prototype of selectionSort is

bool (*)( intint )

Note that only types have been included. As always, for documentation purposes, you can include names that the compiler will ignore.

The function passed to selectionSort is called in line 71 as follows:

( *compare )( work[ smallestOrLargest ], work[ index ] )

Just as a pointer to a variable is dereferenced to access the value of the variable, a pointer to a function is dereferenced to execute the function. The parentheses around *compare are necessary—if they were left out, the * operator would attempt to dereference the value returned from the function call. The call to the function could have been made without dereferencing the pointer, as in

compare( work[ smallestOrLargest ], work[ index ] )

which uses the pointer directly as the function name. We prefer the first method of calling a function through a pointer, because it explicitly illustrates that compare is a pointer to a function that is dereferenced to call the function. The second method of calling a function through a pointer makes it appear as though compare is the name of an actual function in the program. This may be confusing to a user of the program who would like to see the definition of function compare and finds that it is not defined in the file.

Chapter 20, Standard Template Library (STL), presents many common uses of function pointers.

Arrays of Pointers to Functions

One use of function pointers is in menu-driven systems. For example, a program might prompt a user to select an option from a menu by entering an integer value. The user’s choice can be used as a subscript into an array of function pointers, and the pointer in the array can be used to call the function.

Figure 8.28 provides a mechanical example that demonstrates declaring and using an array of pointers to functions. The program defines three functions—function0, function1 and function2—that each take an integer argument and do not return a value. Line 17 stores pointers to these three functions in array f. In this case, all the functions to which the array points must have the same return type and same parameter types. The declaration in line 17 is read beginning in the leftmost set of parentheses as, “f is an array of three pointers to functions that each take an int as an argument and return void.” The array is initialized with the names of the three functions (which, again, are pointers). The program prompts the user to enter a number between 0 and 2, or 3 to terminate. When the user enters a value between 0 and 2, the value is used as the subscript into the array of pointers to functions. Line 29 invokes one of the functions in array f. In the call, f[ choice ] selects the pointer at location choice in the array. The pointer is dereferenced to call the function, and choice is passed as the argument to the function. Each function prints its argument’s value and its function name to indicate that the function is called correctly. We’ll see in Chapter 13, Object-Oriented Programming: Polymorphism, that arrays of pointers to functions are used by compiler developers to implement the mechanisms that support virtual functions—the key technology behind polymorphism.

Fig. 8.28 Array of pointers to functions.

 1   // Fig. 8.28: fig08_28.cpp
 2   // Demonstrating an array of pointers to functions.
 3   #include <iostream>
 4   using std::cout;
 5   using std::cin;
 6   using std::endl;

 7
 8   // function prototypes -- each function performs similar actions
 9   void function0( int );
10   void function1( int );
11   void function2( int );
12
13   int main( )
14   {
15      // initialize array of 3 pointers to functions that each    
16      // take an int argument and return void                     
17      void (*f[ 3 ])( int ) = { function0, function1, function2 };
18
19      int choice;
20
21      cout << "Enter a number between 0 and 2, 3 to end: ";
22      cin >> choice;
23
24      // process user's choice
25      while ( ( choice >= 0 ) && ( choice < 3 ) )
26      {
27         // invoke the function at location choice in 
28         // the array f and pass choice as an argument
29         (*f[ choice ])( choice );                    
30
31         cout << "Enter a number between 0 and 2, 3 to end: ";
32         cin >> choice;
33      } // end while
34
35      cout << "Program execution completed." << endl;
36      return 0; // indicates successful termination
37   } // end main
38
39   void function0( int a )
40   {
41      cout << "You entered " << a << " so function0 was called ";
42   } // end function function0
43
44   void function1( int b )
45   {
46      cout << "You entered " << b << " so function1 was called ";
47   } // end function function1
48
49   void function2( int c )
50   {
51      cout << "You entered " << c << " so function2 was called ";
52   } // end function function2

Enter a number between 0 and 2, 3 to end: 0
You entered 0 so function0 was called

Enter a number between 0 and 2, 3 to end: 1
You entered 1 so function1 was called

Enter a number between 0 and 2, 3 to end: 2
You entered 2 so function2 was called

Enter a number between 0 and 2, 3 to end: 3
Program execution completed.

8.13 Introduction to Pointer-Based String Processing

In this section, we introduce some common C++ Standard Library functions that facilitate string processing. The techniques discussed here are appropriate for developing text editors, word processors, page layout software, computerized typesetting systems and other kinds of text-processing software. We have already used the C++ Standard Library string class in several examples to represent strings as full-fledged objects. For example, the GradeBook class case study in Chapters 37 represents a course name using a string object. In Chapter 18 we present class string in detail. Although using string objects is usually straightforward, we use null-terminated, pointer-based strings in this section. Many C++ Standard Library functions operate only on null-terminated, pointer-based strings, which are more complicated to use than string objects. Also, if you work with legacy C++ programs, you may be required to manipulate these pointer-based strings.

8.13.1 Fundamentals of Characters and Pointer-Based Strings

Characters are the fundamental building blocks of C++ source programs. Every program is composed of a sequence of characters that—when grouped together meaningfully—is interpreted by the compiler as a series of instructions used to accomplish a task. A program may contain character constants. A character constant is an integer value represented as a character in single quotes. The value of a character constant is the integer value of the character in the machine’s character set. For example, 'z' represents the integer value of z (122 in the ASCII character set; see Appendix B), and ' ' represents the integer value of newline (10 in the ASCII character set).

A string is a series of characters treated as a single unit. A string may include letters, digits and various special characters such as +, -, *, / and $. String literals, or string constants, in C++ are written in double quotation marks as follows:

"John Q. Doe"                                  (a name)

"9999 Main Street"                     (a street address)               

"Maynard, Massachusetts"     (a city and state)

"(201) 555-1212"                           (a telephone number)

A pointer-based string in C++ is an array of characters ending in the null character (''), which marks where the string terminates in memory. A string is accessed via a pointer to its first character. The value of a string is the address of its first character, but the sizeof a string literal is the length of the string including the terminating null character. In this sense, strings are like arrays, because an array name is also a pointer to its first element.

A string literal may be used as an initializer in the declaration of either a character array or a variable of type char *. The declarations

char color[ ] = "blue";
const char *colorPtr = "blue";

each initialize a variable to the string "blue". The first declaration creates a five-element array color containing the characters 'b', 'l', 'u', 'e' and ''. The second declaration creates pointer variable colorPtr that points to the letter b in the string "blue" (which ends in '') somewhere in memory. String literals have static storage class (they exist for the duration of the program) and may or may not be shared if the same string literal is referenced from multiple locations in a program. According to the C++ standard (Section 2.13.4), the effect of attempting to modify a string literal is undefined; thus, you should always declare a pointer to a string literal as const char *.

The declaration char color[ ] = "blue"; could also be written

char color[ ] = { 'b''l''u''e''' };

When declaring a character array to contain a string, the array must be large enough to store the string and its terminating null character. The preceding declaration determines the size of the array, based on the number of initializers provided in the initializer list.

Common Programming Error 8.15

Common Programming Error 8.15

Not allocating sufficient space in a character array to store the null character that terminates a string is an error.

Common Programming Error 8.16

Common Programming Error 8.16

Creating or using a C-style string that does not contain a terminating null character can lead to logic errors.

Error-Prevention Tip 8.4

Error-Prevention Tip 8.4

When storing a string of characters in a character array, be sure that the array is large enough to hold the largest string that will be stored. C++ allows strings of any length to be stored. If a string is longer than the character array in which it is to be stored, characters beyond the end of the array will overwrite data in memory following the array, leading to logic errors.

A string can be read into a character array using stream extraction with cin. For example, the following statement can be used to read a string into character array word[ 20 ]:

cin >> word;

The string entered by the user is stored in word. The preceding statement reads characters until a white-space character or end-of-file indicator is encountered. Note that the string should be no longer than 19 characters to leave room for the terminating null character. The setw stream manipulator can be used to ensure that the string read into word does not exceed the size of the array. For example, the statement

cin >> setw( 20 ) >> word;

specifies that cin should read a maximum of 19 characters into array word and save the 20th location in the array to store the terminating null character for the string. The setw stream manipulator applies only to the next value being input. If more than 19 characters are entered, the remaining characters are not saved in word, but will be read in and can be stored in another variable.

In some cases, it is desirable to input an entire line of text into an array. For this purpose, C++ provides the function cin.getline in header file <iostream>. In Chapter 3 you were introduced to the similar function getline from header file <string>, which read input until a newline character was entered, and stored the input (without the newline character) into a string specified as an argument. The cin.getline function takes three arguments—a character array in which the line of text will be stored, a length and a delimiter character. For example, the program segment

char sentence[ 80 ];
cin.getline( sentence, 80' ' );

declares array sentence of 80 characters and reads a line of text from the keyboard into the array. The function stops reading characters when the delimiter character ' ' is encountered, when the end-of-file indicator is entered or when the number of characters read so far is one less than the length specified in the second argument. (The last character in the array is reserved for the terminating null character.) If the delimiter character is encountered, it is read and discarded. The third argument to cin.getline has ' ' as a default value, so the preceding function call could have been written as follows:

cin.getline( sentence, 80 );

Chapter 15, Stream Input/Output, provides a detailed discussion of cin.getline and other input/output functions.

Common Programming Error 8.17

Common Programming Error 8.17

Processing a single character as a char * string can lead to a fatal runtime error. A char * string is a pointer—probably a respectably large integer. However, a character is a small integer (ASCII values range 0–255). On many systems, dereferencing a char value causes an error, because low memory addresses are reserved for special purposes such as operating system interrupt handlers—so “memory access violations” occur.

Common Programming Error 8.18

Common Programming Error 8.18

Passing a string as an argument to a function when a character is expected is a compilation error.

8.13.2 String-Manipulation Functions of the String-Handling Library

The string-handling library provides many useful functions for manipulating string data, comparing strings, searching strings for characters and other strings, tokenizing strings (separating strings into logical pieces such as the separate words in a sentence) and determining the length of strings. This section presents some common string-manipulation functions of the string-handling library (from the C++ standard library). The functions are summarized in Fig. 8.29; then each is used in a live-code example. The prototypes for these functions are located in header file <cstring>.

Fig. 8.29 String-manipulation functions of the string-handling library.

Image

Note that several functions in Fig. 8.29 contain parameters with data type size_t. This type is defined in the header file <cstring> to be an unsigned integral type such as unsigned int or unsigned long.

Common Programming Error 8.19

Common Programming Error 8.19

Forgetting to include the <cstring> header file when using functions from the string-handling library causes compilation errors.

Copying Strings with strcpy and strncpy

Function strcpy copies its second argument—a string—into its first argument—a character array that must be large enough to store the string and its terminating null character, (which is also copied). Function strncpy is much like strcpy, except that strncpy specifies the number of characters to be copied from the string into the array. Note that function strncpy does not necessarily copy the terminating null character of its second argument—a terminating null character is written only if the number of characters to be copied is at least one more than the length of the string. For example, if "test" is the second argument, a terminating null character is written only if the third argument to strncpy is at least 5 (four characters in "test" plus one terminating null character). If the third argument is larger than 5, null characters are appended to the array until the total number of characters specified by the third argument is written.

Common Programming Error 8.20

Common Programming Error 8.20

When using strncpy, the terminating null character of the second argument (a char * string) will not be copied if the number of characters specified by strncpy’s third argument is not greater than the second argument’s length. In that case, a fatal error may occur if you do not manually terminate the resulting char * string with a null character.

Figure 8.30 uses strcpy (line 17) to copy the entire string in array x into array y and uses strncpy (line 23) to copy the first 14 characters of array x into array z. Line 24 appends a null character ('') to array z, because the call to strncpy in the program does not write a terminating null character. (The third argument is less than the string length of the second argument plus one.)

Fig. 8.30 strcpy and strncpy.

 1   // Fig. 8.30: fig08_30.cpp
 2   // Using strcpy and strncpy.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   #include <cstring> // prototypes for strcpy and strncpy
 8   using std::strcpy;                                     
 9   using std::strncpy;                                    
10
11   int main( )
12   {
13      char x[ ] = "Happy Birthday to You"; // string length 21
14      char y[ 25 ];
15      char z[ 15 ];
16
17      strcpy( y, x ); // copy contents of x into y
18
19      cout << "The string in array x is: " << x
20         << " The string in array y is: " << y << ' ';
21
22      // copy first 14 characters of x into z             
23      strncpy( z, x, 14 ); // does not copy null character
24      z[ 14 ] = ''; // append '' to z's contents      
25
26      cout << "The string in array z is: " << z << endl;
27      return 0; // indicates successful termination
28   } // end main

The string in array x is: Happy Birthday to You
The string in array y is: Happy Birthday to You
The string in array z is: Happy Birthday

Concatenating Strings with strcat and strncat

Function strcat appends its second argument (a string) to its first argument (a character array containing a string). The first character of the second argument replaces the null character ('') that terminates the string in the first argument. You must ensure that the array used to store the first string is large enough to store the combination of the first string, the second string and the terminating null character (copied from the second string). Function strcat appends a specified number of characters from the second string to the first string and appends a terminating null character to the result. The program of Fig. 8.31 demonstrates function strcat (lines 19 and 29) and function strncat (line 24).

Fig. 8.31 strcat and strncat.

 1   // Fig. 8.31: fig08_31.cpp
 2   // Using strcat and strncat.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   #include <cstring> // prototypes for strcat and strncat
 8   using std::strcat;                                     
 9   using std::strncat;                                    
10
11   int main( )
12   {
13      char s1[ 20 ] = "Happy "; // length 6
14      char s2[ ] = "New Year "; // length 9
15      char s3[ 40 ] = "";
16
17      cout << "s1 = " << s1 << " s2 = " << s2;
18
19      strcat( s1, s2 ); // concatenate s2 to s1 (length 15)
20
21      cout << " After strcat(s1, s2): s1 = " << s1 << " s2 = " << s2;
22
23      // concatenate first 6 characters of s1 to s3            
24      strncat( s3, s1, 6 ); // places '' after last character
25
26      cout << " After strncat(s3, s1, 6): s1 = " << s1
27         << " s3 = " << s3;
28
29      strcat( s3, s1 ); // concatenate s1 to s3
30      cout << " After strcat(s3, s1): s1 = " << s1
31         << " s3 = " << s3 << endl;
32      return 0; // indicates successful termination
33   } // end main

s1 = Happy
s2 = New Year

After strcat(s1, s2):
s1 = Happy New Year
s2 = New Year

After strncat(s3, s1, 6):
s1 = Happy New Year
s3 = Happy

After strcat(s3, s1):
s1 = Happy New Year
s3 = Happy Happy New Year

Comparing Strings with strcmp and strncmp

Figure 8.32 compares three strings using strcmp (lines 21, 22 and 23) and strncmp (lines 26, 27 and 28). Function strcmp compares its first string argument with its second string argument character by character. The function returns zero if the strings are equal, a negative value if the first string is less than the second string and a positive value if the first string is greater than the second string. Function strncmp is equivalent to strcmp, except that strncmp compares up to a specified number of characters. Function strncmp stops comparing characters if it reaches the null character in one of its string arguments. The program prints the integer value returned by each function call.

Fig. 8.32 strcmp and strncmp.

 1   // Fig. 8.32: fig08_32.cpp
 2   // Using strcmp and strncmp.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   #include <iomanip>
 8   using std::setw;
 9
10   #include <cstring> // prototypes for strcmp and strncmp
11   using std::strcmp;                                     
12   using std::strncmp;                                    
13
14   int main( )
15   {
16      char *s1 = "Happy New Year";
17      char *s2 = "Happy New Year";
18      char *s3 = "Happy Holidays";
19
20      cout << "s1 = " << s1 << " s2 = " << s2 << " s3 = " << s3
21         << " strcmp(s1, s2) = " << setw( 2 ) << strcmp( s1, s2 )
22         << " strcmp(s1, s3) = " << setw( 2 ) << strcmp( s1, s3 )
23         << " strcmp(s3, s1) = " << setw( 2 ) << strcmp( s3, s1 );
24

25      cout << " strncmp(s1, s3, 6) = " << setw( 2 )
26         << strncmp( s1, s3, 6 ) << " strncmp(s1, s3, 7) = " << setw( 2 )
27         << strncmp( s1, s3, 7 ) << " strncmp(s3, s1, 7) = " << setw( 2 )
28         << strncmp( s3, s1, 7 ) << endl;
29      return 0; // indicates successful termination
30   } // end main

s1 = Happy New Year
s2 = Happy New Year
s3 = Happy Holidays

strcmp(s1, s2) =  0
strcmp(s1, s3) =  1
strcmp(s3, s1) = -1

strncmp(s1, s3, 6) =  0
strncmp(s1, s3, 7) =  1
strncmp(s3, s1, 7) = -1

Common Programming Error 8.21

Common Programming Error 8.21

Assuming that strcmp and strncmp return one (a true value) when their arguments are equal is a logic error. Both functions return zero (C++’s false value) for equality. Therefore, when testing two strings for equality, the result of the strcmp or strncmp function should be compared with zero to determine whether the strings are equal.

To understand just what it means for one string to be “greater than” or “less than” another string, consider the process of alphabetizing a series of last names. You would, no doubt, place “Jones” before “Smith,” because the first letter of “Jones” comes before the first letter of “Smith” in the alphabet. But the alphabet is more than just a list of 26 letters—it is an ordered list of characters. Each letter occurs in a specific position within the list. “Z” is more than just a letter of the alphabet; “Z” is specifically the 26th letter of the alphabet.

How does the computer know that one letter comes before another? All characters are represented inside the computer as numeric codes; when the computer compares two strings, it actually compares the numeric codes of the characters in the strings.

In an effort to standardize character representations, most computer manufacturers have designed their machines to utilize one of two popular coding schemes—ASCII or EBCDIC. Recall that ASCII stands for “American Standard Code for Information Interchange.” EBCDIC stands for “Extended Binary Coded Decimal Interchange Code.” There are other coding schemes as well.

ASCII and EBCDIC are called character codes, or character sets. Most readers of this book will be using desktop or notebook computers that use the ASCII character set. IBM mainframe computers use the EBCDIC character set. As Internet and World Wide Web usage becomes pervasive, the newer Unicode® character set is growing in popularity (www.unicode.org). String and character manipulations actually involve the manipulation of the appropriate numeric codes and not the characters themselves. This explains the interchangeability of characters and small integers in C++. Since it is meaningful to say that one numeric code is greater than, less than or equal to another numeric code, it becomes possible to relate various characters or strings to one another by referring to the character codes. Appendix B contains the ASCII character codes.

Portability Tip 8.4

Portability Tip 8.4

The internal numeric codes used to represent characters may be different on different computers that use different character sets.

Portability Tip 8.5

Portability Tip 8.5

Do not explicitly test for ASCII codes, as in if ( rating == 65 ); rather, use the corresponding character constant, as in if ( rating == 'A' ).

[Note: With some compilers, functions strcmp and strncmp always return -1, 0 or 1, as in the sample output of Fig. 8.32. With other compilers, these functions return 0 or the difference between the numeric codes of the first characters that differ in the strings being compared. For example, when s1 and s3 are compared, the first characters that differ between them are the first character of the second word in each string—N (numeric code 78) in s1 and H (numeric code 72) in s3, respectively. In this case, the return value will be 6 (or -6 if s3 is compared to s1).]

Tokenizing a String with strtok

Function strtok breaks a string into a series of tokens. A token is a sequence of characters separated by delimiting characters (usually spaces or punctuation marks). For example, in a line of text, each word can be considered a token, and the spaces separating the words can be considered delimiters.

Multiple calls to strtok are required to break a string into tokens (assuming that the string contains more than one token). The first call to strtok contains two arguments, a string to be tokenized and a string containing characters that separate the tokens (i.e., delimiters). Line 19 in Fig. 8.33 assigns to tokenPtr a pointer to the first token in sentence. The second argument, " ", indicates that tokens in sentence are separated by spaces. Function strtok searches for the first character in sentence that is not a delimiting character (space). This begins the first token. The function then finds the next delimiting character in the string and replaces it with a null ('') character. This terminates the current token. Function strtok saves (in a static variable) a pointer to the next character following the token in sentence and returns a pointer to the current token.

Fig. 8.33 Using strtok to tokenize a string.

 1   // Fig. 8.33: fig08_33.cpp
 2   // Using strtok to tokenize a string.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   #include <cstring> // prototype for strtok
 8   using std::strtok;                        
 9
10   int main( )
11   {

12     char sentence[ ] = "This is a sentence with 7 tokens";
13     char *tokenPtr;
14
15     cout << "The string to be tokenized is: " << sentence
16        << " The tokens are: ";
17
18     // begin tokenization of sentence
19     tokenPtr = strtok( sentence, " " );
20
21     // continue tokenizing sentence until tokenPtr becomes NULL
22     while ( tokenPtr != NULL )
23     {
24        cout << tokenPtr << ' ';
25        tokenPtr = strtok( NULL" " ); // get next token
26     } // end while
27
28     cout << " After strtok, sentence = " << sentence << endl;
29     return 0; // indicates successful termination
30   } // end main

The string to be tokenized is:
This is a sentence with 7 tokens

The tokens are:

This
is
a
sentence
with
7
tokens

After strtok, sentence = This

Subsequent calls to strtok to continue tokenizing sentence contain NULL as the first argument (line 25). The NULL argument indicates that the call to strtok should continue tokenizing from the location in sentence saved by the last call to strtok. Note that strtok maintains this saved information in a manner that is not visible to you. If no tokens remain when strtok is called, strtok returns NULL. The program of Fig. 8.33 uses strtok to tokenize the string "This is a sentence with 7 tokens". The program prints each token on a separate line. Line 28 outputs sentence after tokenization. Note that strtok modifies the input string; therefore, a copy of the string should be made if the program requires the original after the calls to strtok. When sentence is output after tokenization, note that only the word "This" prints, because strtok replaced each blank in sentence with a null character ('') during the tokenization process.

Common Programming Error 8.22

Common Programming Error 8.22

Not realizing that strtok modifies the string being tokenized, then attempting to use that string as if it were the original unmodified string is a logic error.

Determining String Lengths

Function strlen takes a string as an argument and returns the number of characters in the string—the terminating null character is not included in the length. The length is also the index of the null character. The program of Fig. 8.34 demonstrates function strlen.

Fig. 8.34 strlen returns the length of a char * string.

 1   // Fig. 8.34: fig08_34.cpp
 2   // Using strlen.
 3   #include <iostream>
 4   using std::cout;
 5   using std::endl;
 6
 7   #include <cstring> // prototype for strlen
 8   using std::strlen;                        
 9
10   int main( )
11   {
12      char *string1 = "abcdefghijklmnopqrstuvwxyz";
13      char *string2 = "four";
14      char *string3 = "Boston";
15
16      cout << "The length of "" << string1 << "" is " << strlen( string1 )
17         << " The length of "" << string2 << "" is " << strlen( string2 )
18         << " The length of "" << string3 << "" is " << strlen( string3 )
19         << endl;
20      return 0; // indicates successful termination
21   } // end main

The length of "abcdefghijklmnopqrstuvwxyz" is 26
The length of "four" is 4
The length of "Boston" is 6

8.14 Wrap-Up

In this chapter we provided a detailed introduction to pointers, or variables that contain memory addresses as their values. We began by demonstrating how to declare and initialize pointers. You saw how to use the address operator (&) to assign the address of a variable to a pointer and the indirection operator (*) to access the data stored in the variable indirectly referenced by a pointer. We discussed passing arguments by reference using both pointer arguments and reference arguments.

You learned how to use const with pointers to enforce the principle of least privilege. We demonstrated using nonconstant pointers to nonconstant data, nonconstant pointers to constant data, constant pointers to nonconstant data, and constant pointers to constant data. We then used selection sort to demonstrate passing arrays and individual array elements by reference. We discussed the sizeof operator, which can be used to determine the sizes of data types and variables in bytes during program compilation.

We demonstrated how to use pointers in arithmetic and comparison expressions. You saw that pointer arithmetic can be used to jump from one element of an array to another. You learned how to use arrays of pointers, and more specifically string arrays (arrays of strings). We discussed function pointers, which enable programmers to pass functions as parameters. We introduced several C++ functions that manipulate pointer-based strings. You learned string-processing capabilities such as copying strings, tokenizing strings and determining the length of strings.

In the next chapter, we begin our deeper treatment of classes. You’ll learn about the scope of a class’s members, and how to keep objects in a consistent state. You’ll also learn about using special member functions called constructors and destructors, which execute when an object is created and destroyed, respectively, and we’ll discuss when constructors and destructors are called. In addition, we’ll demonstrate using default arguments with constructors and using default memberwise assignment to assign one object of a class to another object of the same class. We’ll also discuss the danger of returning a reference to a private data member of a class.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset