Chapter 4. Arrays and Pointers

CONTENTS

Section 4.1 Arrays 110

Section 4.2 Introducing Pointers 114

Section 4.3 C-Style Character Strings 130

Section 4.4 Multidimensioned Arrays 141

Chapter Summary 145

Defined Terms 145

The language defines two lower-level compound types—arrays and pointers—that are similar to vectors and iterators. Like a vector, an array holds a collection of objects of some type. Unlike vectors, arrays are fixed size; once an array is created, new elements cannot be added. Like iterators, pointers can be used to navigate among and examine the elements in an array.

Modern C++ programs should almost always use vectors and iterators in preference to the lower-level arrays and pointers. Well-designed programs use arrays and pointers only in the internals of class implementations where speed is essential.

Arrays are data structures that are similar to library vectors but are built into the language. Like a vector, an array is a container of objects of a single data type. The individual objects are not named; rather, each one is accessed by its position in the array.

Arrays have significant drawbacks compared to vectors: They are fixed size, and they offer no help to the programmer in keeping track of how big a given array is. There is no size operation on arrays. Similarly, there is no push_back to automatically add elements. If the array size needs to change, then the programmer must allocate a new, larger array and copy the elements into that new space.

Programs that rely on built-in arrays rather than using the standard vector are more error-prone and harder to debug.

Prior to the advent of the standard library, C++ programs made heavy use of arrays to hold collections of objects. Modern C++ programs should almost always use vectors instead of arrays. Arrays should be restricted to the internals of programs and used only where performance testing indicates that vectors cannot provide the necessary speed. However, there will be a large body of existing C++ code that relies on arrays for some time to come. Hence, all C++ programmers must know a bit about how arrays work.

4.1 Arrays

An array is a compound type (Section 2.5, p. 58) that consists of a type specifier, an identifier, and a dimension. The type specifier indicates what type the elements stored in the array will have. The dimension specifies how many elements the array will contain.

The type specifier can denote a built-in data or class type. With the exception of references, the element type can also be any compound type. There are no arrays of references.

4.1.1 Defining and Initializing Arrays

The dimension must be a constant expression (Section 2.7, p. 62) whose value is greater than or equal to one. A constant expression is any expression that involves only integral literal constants, enumerators (Section 2.7, p. 62), or const objects of integral type that are themselves initialized from constant expressions. A nonconst variable, or a const variable whose value is not known until run time, cannot be used to specify the dimension of an array.

The dimension is specified inside a [] bracket pair:

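A minimal sketch of definitions consistent with the discussion that follows. The names buf_size, input_buffer, file_table, salaries, test_scores, and vals, the value 27, and the body of get_size are assumptions made for illustration; only staff_size, size, get_size, and max_files (with value 20) come from the text. The illegal definitions are commented out so that the fragment compiles, and sizeof is used only to confirm the resulting dimensions.

```cpp
#include <cassert>
#include <string>
using std::string;

// Hypothetical stand-in: the text says only that get_size is called at run time.
unsigned get_size() { return 16; }

// both buf_size and max_files are const objects initialized from literals
const unsigned buf_size = 512, max_files = 20;
int staff_size = 27;               // nonconst object
const unsigned size = get_size();  // const, but value not known until run time

char input_buffer[buf_size];       // ok: const variable
string file_table[max_files + 1];  // ok: constant expression, 21 elements
// double salaries[staff_size];    // error: nonconst variable
// int test_scores[get_size()];    // error: not a constant expression
// int vals[size];                 // error: value not known until run time

// Confirms the dimensions the compiler computed for the legal definitions.
void check_dimensions() {
    assert(sizeof(input_buffer) == 512);
    assert(sizeof(file_table) / sizeof(file_table[0]) == 21);
}
```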

Although staff_size is initialized with a literal constant, staff_size itself is a nonconst object. Its value can be known only at run time, so it is illegal as an array dimension. Even though size is a const object, its value is not known until get_size is called at run time. Therefore, it may not be used as a dimension. On the other hand, the expression

          max_files + 1

is a constant expression because max_files is a const variable. The expression can be and is evaluated at compile time to a value of 21.

Explicitly Initializing Array Elements

When we define an array, we can provide a comma-separated list of initializers for its elements. The initializer list must be enclosed in braces:

          const unsigned array_size = 3;
          int ia[array_size] = {0, 1, 2};

If we do not supply element initializers, then the elements are initialized in the same way that variables are initialized (Section 2.3.4, p. 50).

• Elements of an array of built-in type defined outside the body of a function are initialized to zero.

• Elements of an array of built-in type defined inside the body of a function are uninitialized.

• Regardless of where the array is defined, if it holds elements of a class type, then the elements are initialized by the default constructor for that class if it has one. If the class does not have a default constructor, then the elements must be explicitly initialized.
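The three rules above can be sketched as follows. All names here (global_ints, global_strs, local_strings_default_constructed) are hypothetical and chosen only for illustration; the local array of ints is assigned before it is read, because reading it first would be undefined.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
using std::string;

int global_ints[3];     // built-in type, defined outside any function: all zero
string global_strs[2];  // class type: each element default constructed (empty)

// Returns true if every element of a local array of strings is empty.
bool local_strings_default_constructed() {
    string local_strs[2];  // class type: default constructed even when local
    int local_ints[3];     // built-in type, local: uninitialized
    // Assign before reading; reading local_ints[0] first would be undefined.
    for (std::size_t ix = 0; ix != 3; ++ix)
        local_ints[ix] = 0;
    assert(local_ints[2] == 0);
    return local_strs[0].empty() && local_strs[1].empty();
}
```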

Unless we explicitly supply element initializers, the elements of a local array of built-in type are uninitialized. Using these elements for any purpose other than to assign a new value is undefined.

An explicitly initialized array need not specify a dimension value. The compiler will infer the array size from the number of elements listed:

          int ia[] = {0, 1, 2}; // an array of dimension 3

If the dimension size is specified, the number of elements provided must not exceed that size. If the dimension size is greater than the number of listed elements, the initializers are used for the first elements. The remaining elements are initialized to zero if the elements are of built-in type or by running the default constructor if they are of class type:

          const unsigned array_size = 5;
          // Equivalent to ia = {0, 1, 2, 0, 0}
          // ia[3] and ia[4] default initialized to 0
          int ia[array_size] = {0, 1, 2};
          // Equivalent to str_arr = {"hi", "bye", "", "", ""}
          // str_arr[2] through str_arr[4] default initialized to the empty string
          string str_arr[array_size] = {"hi", "bye"};

Character Arrays Are Special

A character array can be initialized with either a list of comma-separated character literals enclosed in braces or a string literal. Note, however, that the two forms are not equivalent. Recall that a string literal (Section 2.2, p. 40) contains an additional terminating null character. When we create a character array from a string literal, the null is also inserted into the array:

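A sketch of three definitions that match the dimensions discussed next. The names ca1, ca2, and ca3 come from the text; the particular characters are an assumption, and sizeof is used here only to confirm how many elements each array holds.

```cpp
#include <cassert>

char ca1[] = {'C', '+', '+'};        // no null terminator: dimension 3
char ca2[] = {'C', '+', '+', '\0'};  // explicit null: dimension 4
char ca3[] = "C++";                  // null terminator added automatically

// Confirms the dimensions; each char occupies exactly one byte.
void check_char_array_sizes() {
    assert(sizeof(ca1) == 3);
    assert(sizeof(ca2) == 4);
    assert(sizeof(ca3) == 4);
    assert(ca3[3] == '\0');  // the null that the string literal supplied
}
```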

The dimension of ca1 is 3; the dimension of ca2 and ca3 is 4. It is important to remember the null-terminator when initializing an array of characters to a literal. For example, the following is a compile-time error:

          const char ch3[6] = "Daniel"; // error: Daniel is 7 elements

While the literal contains only six explicit characters, the required array size is seven—six to hold the literal and one for the null.

No Array Copy or Assignment

Unlike a vector, an array cannot be initialized as a copy of another array. Nor is it legal to assign one array to another:

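A sketch of the two restrictions, assuming hypothetical arrays named ia, ia2, and ia3. The illegal statements are commented out so that the fragment compiles; assigning an individual element, by contrast, is fine.

```cpp
#include <cassert>
#include <cstddef>

int ia[] = {0, 1, 2};  // ok: array of three ints
// int ia2[] = ia;     // error: cannot initialize one array from another

// Copies a single element, which is legal, and returns it.
int last_element_copy() {
    const std::size_t array_size = 3;
    int ia3[array_size];  // ok, but the elements are uninitialized
    // ia3 = ia;          // error: cannot assign one array to another
    ia3[2] = ia[2];       // ok: individual elements can be assigned
    assert(ia3[2] == 2);
    return ia3[2];
}
```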

Some compilers allow array assignment as a compiler extension. If you intend to run a given program on more than one compiler, it is usually a good idea to avoid using nonstandard compiler-specific features such as array assignment.

Exercises Section 4.1.1

Exercise 4.1: Assuming get_size is a function that takes no arguments and returns an int value, which of the following definitions are illegal? Explain why.

          unsigned buf_size = 1024;

          (a) int ia[buf_size];
          (b) int ia[get_size()];
          (c) int ia[4 * 7 - 14];
          (d) char st[11] = "fundamental";

Exercise 4.2: What are the values in the following arrays?

          string sa[10];
          int ia[10];
          int main() {
              string sa2[10];
              int    ia2[10];
          }

Exercise 4.3: Which, if any, of the following definitions are in error?

          (a) int ia[7] = { 0, 1, 1, 2, 3, 5, 8 };
          (b) vector<int> ivec = { 0, 1, 1, 2, 3, 5, 8 };
          (c) int ia2[ ] = ia1;
          (d) int ia3[ ] = ivec;

Exercise 4.4: How can you initialize some or all the elements of an array?

Exercise 4.5: List some of the drawbacks of using an array instead of a vector.

4.1.2 Operations on Arrays

Array elements, like vector elements, may be accessed using the subscript operator (Section 3.3.2, p. 94). Like the elements of a vector, the elements of an array are numbered beginning with 0. For an array of ten elements, the correct index values are 0 through 9, not 1 through 10.

When we subscript a vector, we use vector::size_type as the type for the index. When we subscript an array, the right type to use for the index is size_t (Section 3.5.2, p. 104).

In the following example, a for loop steps through the 10 elements of an array, assigning to each the value of its index:

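A sketch of such a loop, assuming an array named ia of dimension array_size; both names and the wrapping function fill_with_indices are choices made for illustration.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t array_size = 10;
int ia[array_size];

// Assigns to each element of ia the value of its index.
void fill_with_indices() {
    for (std::size_t ix = 0; ix != array_size; ++ix)
        ia[ix] = ix;  // valid indices run from 0 through array_size - 1
    assert(ia[array_size - 1] == 9);
}
```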

Using a similar loop, we can copy one array into another:

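A sketch of an element-by-element copy, assuming two hypothetical arrays ia1 and ia2 of the same dimension. Because arrays cannot be assigned as a whole, each element is copied individually.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t array_size = 7;
int ia1[] = {0, 1, 2, 3, 4, 5, 6};
int ia2[array_size];  // same dimension as ia1

// Copies ia1 into ia2 one element at a time.
void copy_ia1_to_ia2() {
    for (std::size_t ix = 0; ix != array_size; ++ix)
        ia2[ix] = ia1[ix];
    assert(ia2[array_size - 1] == ia1[array_size - 1]);
}
```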

Checking Subscript Values

As with both strings and vectors, the programmer must guarantee that the subscript value is in range—that the array has an element at the index value.

Nothing stops a programmer from stepping across an array boundary except attention to detail and thorough testing of the code. It is not inconceivable for a program to compile and execute and still be fatally wrong.

So-called “buffer overflow” bugs are among the most common causes of security problems. These bugs occur when a subscript is not checked and a reference is made to an element outside the bounds of an array or similar data structure.

4.2 Introducing Pointers

Just as we can traverse a vector by using either a subscript or an iterator, we can traverse an array by using either a subscript or a pointer. A pointer is a compound type; a pointer points to an object of some other type. Pointers are iterators for arrays: A pointer can point to an element in an array. The dereference and increment operators, when applied to a pointer that points to an array element, behave much as they do when applied to an iterator. When we dereference a pointer, we obtain the object to which the pointer points. When we increment a pointer, we advance the pointer to denote the next element in the array. Before we write programs using pointers, we need to know a bit more about them.

Exercises Section 4.1.2

Exercise 4.6: This code fragment intends to assign the value of its index to each array element. It contains a number of indexing errors. Identify them.

image

Exercise 4.7: Write the code necessary to assign one array to another. Now, change the code to use vectors. How might you assign one vector to another?

Exercise 4.8: Write a program to compare two arrays for equality. Write a similar program to compare two vectors.

Exercise 4.9: Write a program to define an array of 10 ints. Give each element the same value as its position in the array.

4.2.1 What Is a Pointer?

For newcomers, pointers are often hard to understand. Debugging problems due to pointer errors bedevil even experienced programmers. However, pointers are an important part of most C programs and to a much lesser extent remain important in many C++ programs.

Conceptually, pointers are simple: A pointer points at an object. Like an iterator, a pointer offers indirect access to the object to which it points. However, pointers are a much more general construct. Unlike iterators, pointers can be used to point at single objects. Iterators are used only to access elements in a container.

Specifically, a pointer holds the address of another object:

          string s("hello world");
          string *sp = &s; // sp holds the address of s

The second statement defines sp as a pointer to string and initializes sp to point to the string object named s. The * in *sp indicates that sp is a pointer. The & operator in &s is the address-of operator. It returns a value that when dereferenced yields the original object. The address-of operator may be applied only to an lvalue (Section 2.3.1, p. 45). Because a variable is an lvalue, we may take its address. Similarly, the subscript and dereference operators, when applied to a vector, string, or built-in array, yield lvalues. Because these operators yield lvalues, we may apply the address-of to the result of the subscript or dereference operator. Doing so gives us the address of a particular element.
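As a sketch of the last point, the fragment below takes the address of elements obtained by subscripting a built-in array, a vector, and a string. The function name addresses_of_elements and the local names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>
using std::string;
using std::vector;

// Returns true if each pointer, when dereferenced, yields the original element.
bool addresses_of_elements() {
    int ia[] = {0, 2, 4};
    vector<int> ivec(3, 42);  // three elements, each with value 42
    string s("hello");

    int *pi = &ia[1];    // address of the second array element
    int *pv = &ivec[2];  // address of the last vector element
    char *pc = &s[0];    // address of the first character in s

    *pv = 7;             // writes through the pointer into ivec[2]
    assert(ivec[2] == 7);
    return *pi == 2 && *pc == 'h';
}
```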

4.2.2 Defining and Initializing Pointers

Every pointer has an associated type. The type of a pointer determines the type of the objects to which the pointer may point. A pointer to int, for example, may only point to an object of type int.

Defining Pointer Variables

We use the * symbol in a declaration to indicate that an identifier is a pointer:

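A sketch of several pointer definitions. The names ip1, ip2, and pstring come from the discussion that follows; pvec, pd, ival, s, and the function point_at_objects are assumptions added for illustration. The pointers are defined outside any function, so each starts out zero valued.

```cpp
#include <cassert>
#include <string>
#include <vector>
using std::string;
using std::vector;

vector<int> *pvec;  // pvec can point to a vector<int>
int *ip1, *ip2;     // ip1 and ip2 can point to an int
string *pstring;    // pstring can point to a string
double *pd;         // pd can point to a double

int ival = 42;
string s("hello");

// Makes some of the pointers address the objects defined above.
void point_at_objects() {
    ip1 = &ival;   // ip1 holds the address of ival
    ip2 = ip1;     // ip2 now addresses the same int
    pstring = &s;  // pstring holds the address of s
    assert(*ip2 == 42 && *pstring == "hello");
}
```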

When attempting to understand pointer declarations, read them from right to left.

Reading the definition of pstring from right to left, we see that

          string *pstring;

defines pstring as a pointer that can point to string objects. Similarly,

          int *ip1, *ip2; // ip1 and ip2 can point to an int

defines both ip1 and ip2 as pointers to int.

The * can come anywhere in a list of objects of a given type:

          double dp, *dp2; // dp is a double; dp2 is a pointer to double

defines dp as an object of type double and dp2 as a pointer to double.

A Different Pointer Declaration Style

The * symbol may be separated from its identifier by a space. It is legal to write:

          string* ps; // legal but can be misleading

which says that ps is a pointer to string.

We say that this definition can be misleading because it encourages the belief that string* is the type and any variable defined in the same definition is a pointer to string. However,

          string* ps1, ps2; // ps1 is a pointer to string,  ps2 is a string

defines ps1 as a pointer, but ps2 is a plain string. If we want to define two pointers in a single definition, we must repeat the * on each identifier:

          string* ps1, *ps2; // both ps1 and ps2 are pointers to string

Multiple Pointer Declarations Can Be Confusing

There are two common styles for declaring multiple pointers of the same type. One style requires that a declaration introduce only a single name. In this style, the * is placed with the type to emphasize that the declaration is declaring a pointer:

          string* ps1;
          string* ps2;

The other style permits multiple declarations in a single statement but places the * adjacent to the identifier. This style emphasizes that the object is a pointer:

          string *ps1, *ps2;

As with all questions of style, there is no single right way to declare pointers. The important thing is to choose a style and stick with it.

In this book we use the second style and place the * with the pointer variable name.

Possible Pointer Values

A valid pointer has one of three states: It can hold the address of a specific object, it can point one past the end of an object, or it can be zero. A zero-valued pointer points to no object. An uninitialized pointer is invalid until it is assigned a value. The following definitions and assignments are all legal:

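A sketch of definitions and assignments that leave pointers in the states just described. The names ival, pi, pi2, and pi3 and the function pointer_states are assumptions; the uninitialized definition is commented out because, although legal, it is dangerous.

```cpp
#include <cassert>

// Returns true if the pointers end up in the states the comments describe.
bool pointer_states() {
    int ival = 1024;
    int *pi = 0;       // pi initialized to address no object
    int *pi2 = &ival;  // pi2 initialized to the address of ival
    // int *pi3;       // legal but dangerous: pi3 would be uninitialized
    pi = pi2;          // pi and pi2 now address the same object, ival
    pi2 = 0;           // pi2 now addresses no object
    assert(*pi == 1024);
    return pi == &ival && pi2 == 0;
}
```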

Avoid Uninitialized Pointers

Uninitialized pointers are a common source of run-time errors.

As with any other uninitialized variable, what happens when we use an uninitialized pointer is undefined. Using an uninitialized pointer almost always results in a run-time crash. However, the fact that the crash results from using an uninitialized pointer can be quite hard to track down.

Under most compilers, if we use an uninitialized pointer the effect will be to use whatever bits are in the memory in which the pointer resides as if it were an address. Using an uninitialized pointer uses this supposed address to manipulate the underlying data at that supposed location. Doing so usually leads to a crash as soon as we attempt to dereference the uninitialized pointer.

It is not possible to detect whether a pointer is uninitialized. There is no way to distinguish a valid address from an address formed from the bits that are in the memory in which the pointer was allocated. Our recommendation to initialize all variables is particularly important for pointers.

If possible, do not define a pointer until the object to which it should point has been defined. That way, there is no need to define an uninitialized pointer.

If you must define a pointer separately from pointing it at an object, then initialize the pointer to zero. The reason is that a zero-valued pointer can be tested and the program can detect that the pointer does not point to an object.
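A sketch of why a zero-valued pointer is useful: it can be tested before it is dereferenced. The helper value_or and its fallback parameter are hypothetical, introduced only to make the test visible.

```cpp
#include <cassert>

// Returns the int that p addresses, or fallback when p is zero valued
// and therefore points to no object.
int value_or(int *p, int fallback) {
    if (p != 0)       // a zero-valued pointer can be detected
        return *p;    // safe: p addresses an object
    return fallback;  // p points to no object, so it is not dereferenced
}

// Exercises both cases: first no object, then a real one.
void check_value_or() {
    int ival = 42;
    int *p = 0;  // defined before we know what it should point to
    assert(value_or(p, -1) == -1);
    p = &ival;   // now p points to an object
    assert(value_or(p, -1) == 42);
}
```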

Constraints on Initialization of and Assignment to Pointers

There are only four kinds of values that may be used to initialize or assign to a pointer:

  1. A constant expression (Section 2.7, p. 62) with value 0 (e.g., a const integral object whose value is zero at compile time or a literal constant 0)
  2. An address of an object of an appropriate type
  3. The address one past the end of another object
  4. Another valid pointer of the same type

It is illegal to assign an int to a pointer, even if the value of the int happens to be 0. It is okay to assign the literal 0 or a const whose value is known to be 0 at compile time:

          int ival;
          int zero = 0;
          const int c_ival = 0;
          int *pi = ival; // error: pi initialized from int value of ival
          pi = zero;      // error: pi assigned int value of zero
          pi = c_ival;    // ok: c_ival is a const with compile-time value of 0
          pi = 0;         // ok: directly assign the literal constant 0

In addition to using a literal 0 or a const with a compile-time value of 0, we can also use a facility that C++ inherits from C. The cstdlib header defines a preprocessor variable (Section 2.9.2, p. 69) named NULL, which is defined as 0. When we use a preprocessor variable in our code, it is automatically replaced by its value. Hence, initializing a pointer to NULL is equivalent to initializing it to 0:

          // cstdlib #defines NULL to 0
          int *pi = NULL; // ok: equivalent to int *pi = 0;

As with any preprocessor variable (Section 2.9.2, p. 71) we should not use the name NULL for our own variables.

Preprocessor variables are not defined in the std namespace and hence the name is NULL, not std::NULL.

With two exceptions, which we cover in Sections 4.2.5 and 15.3, we may only initialize or assign a pointer from an address or another pointer that has the same type as the target pointer:

          double dval;
          double *pd = &dval;   // ok: initializer is address of a double
          double *pd2 = pd;     // ok: initializer is a pointer to double

          int *pi = pd;   // error: types of pi and pd differ
          pi = &dval;     // error: attempt to assign address of a double to int *

The reason the types must match is that the type of the pointer is used to determine the type of the object that it addresses. Pointers are used to indirectly access an object. The operations that the pointer can perform are based on the type of the pointer: A pointer to int treats the underlying object as if it were an int. If that pointer actually addressed an object of some other type, such as double, then any operations performed by the pointer would be in error.

void* Pointers

The type void* is a special pointer type that can hold an address of any object:

          double obj = 3.14;
          double *pd = &obj;
          // ok: void* can hold the address value of any data pointer type
          void *pv = &obj;       // obj can be an object of any type
          pv = pd;               // pd can be a pointer to any type

A void* indicates that the associated value is an address but that the type of the object at that address is unknown.

There are only a limited number of actions we can perform on a void* pointer: We can compare it to another pointer, we can pass it to or return it from a function, and we can assign it to another void* pointer. We cannot use the pointer to operate on the object it addresses. We’ll see in Section 5.12.4 (p. 183) how we can retrieve the address stored in a void* pointer.

4.2.3 Operations on Pointers

Pointers allow indirect manipulation of the object to which the pointer points. We can access the object by dereferencing the pointer. Dereferencing a pointer is similar to dereferencing an iterator (Section 3.4, p. 98). The * operator (the dereference operator) returns the object to which the pointer points:

          string s("hello world");
          string *sp = &s; // sp holds the address of s
          cout << *sp;     // prints hello world

When we dereference sp, we fetch the value of s. We hand that value to the output operator. The last statement, therefore, prints the contents of s, that is, hello world.

Exercises Section 4.2.2

Exercise 4.10: Explain the rationale for preferring the first form of pointer declaration:

          int *ip; // good practice
          int* ip; // legal but misleading

Exercise 4.11: Explain each of the following definitions. Indicate whether any are illegal and if so why.

          (a) int* ip;
          (b) string s, *sp = 0;
          (c) int i; double* dp = &i;
          (d) int* ip, ip2;
          (e) const int i = 0, *p = i;
          (f) string *p = NULL;

Exercise 4.12: Given a pointer, p, can you determine whether p points to a valid object? If so, how? If not, why not?

Exercise 4.13: Why is the first pointer initialization legal and the second illegal?

          int i = 42;
          void *p = &i;
          long *lp = &i;

Dereference Yields an Lvalue

The dereference operator returns the lvalue of the underlying object, so we can use it to change the value of the object to which the pointer points:

          *sp = "goodbye"; // contents of s now changed

Because we assign to *sp, this statement leaves sp pointing to s and changes the value of s.

We can also assign a new value to sp itself. Assigning to sp causes sp to point to a different object:

          string s2 = "some value";
          sp = &s2;  // sp now points to s2

We change the value of a pointer by assigning to it directly—without dereferencing the pointer.

Comparing Pointers and References

While both references and pointers are used to indirectly access another value, there are two important differences between references and pointers. The first is that a reference always refers to an object: It is an error to define a reference without initializing it. The behavior of assignment is the second important difference: Assigning to a reference changes the object to which the reference is bound; it does not rebind the reference to another object. Once initialized, a reference always refers to the same underlying object.

Consider these two program fragments. In the first, we assign one pointer to another:

          int ival = 1024, ival2 = 2048;
          int *pi = &ival, *pi2 = &ival2;
          pi = pi2;    // pi now points to ival2

After the assignment, ival, the object that pi originally addressed, remains unchanged. The assignment changes the value of pi, making it point to a different object. Now consider a similar program that assigns two references:

          int &ri = ival, &ri2 = ival2;
          ri = ri2;    // assigns ival2 to ival

This assignment changes ival, the value referenced by ri, and not the reference itself. After the assignment, the two references still refer to their original objects, and the value of those objects is now the same as well.

Pointers to Pointers

Pointers are themselves objects in memory. They, therefore, have addresses that we can store in a pointer:

          int ival = 1024;
          int *pi = &ival; // pi points to an int
          int **ppi = &pi; // ppi points to a pointer to int

which yields a pointer to a pointer. We designate a pointer to a pointer by using **. We might represent these objects as

          ppi ---> pi ---> ival (1024)

As usual, dereferencing ppi yields the object to which ppi points. In this case, that object is a pointer to an int:

          int *pi2 = *ppi; // ppi points to a pointer

To actually access ival, we need to dereference ppi twice:

          cout << "The value of ival "
               << "direct value: " << ival << " "
               << "indirect value: " << *pi << " "
               << "doubly indirect value: " << **ppi
               << endl;

This program prints the value of ival three different ways: first by direct reference to the variable, then through the pointer to int in pi, and finally by dereferencing ppi twice to get to the underlying value in ival.

Exercises Section 4.2.3

Exercise 4.14: Write code to change the value of a pointer. Write code to change the value to which the pointer points.

Exercise 4.15: Explain the key differences between pointers and references.

Exercise 4.16: What does the following program do?

          int i = 42, j = 1024;
          int *p1 = &i, *p2 = &j;
          *p2 = *p1 * *p2;
          *p1 *= *p1;

4.2.4 Using Pointers to Access Array Elements

Pointers and arrays are closely intertwined in C++. In particular, when we use the name of an array in an expression, that name is automatically converted into a pointer to the first element of the array:

          int ia[] = {0,2,4,6,8};
          int *ip = ia; // ip points to ia[0]

If we want to point to another element in the array, we could do so by using the subscript operator to locate the element and then applying the address-of operator to find its location:

          ip = &ia[4];    // ip points to last element in ia

Pointer Arithmetic

Rather than taking the address of the value returned by subscripting, we could use pointer arithmetic. Pointer arithmetic works the same way (and has the same constraints) as iterator arithmetic (Section 3.4.1, p. 100). Using pointer arithmetic, we can compute a pointer to an element by adding (or subtracting) an integral value to (or from) a pointer to another element in the array:

          ip = ia;            // ok: ip points to ia[0]
          int *ip2 = ip + 4;  // ok: ip2 points to ia[4], the last element in ia

When we add 4 to the pointer ip, we are computing a new pointer. That new pointer points to the element four elements further on in the array from the one to which ip currently points.

More generally, when we add (or subtract) an integral value to a pointer, the effect is to compute a new pointer. The new pointer points to the element as many elements as that integral value ahead of (or behind) the original pointer.

Pointer arithmetic is legal only if the original pointer and the newly calculated pointer address elements of the same array or an element one past the end of that array. If we have a pointer to an object, we can also compute a pointer that points just after that object by adding one to the pointer.

Given that ia has 5 elements, adding 10 to ia would be an error:

          // error: ia has only 5 elements, ia + 10 is an invalid address
          int *ip3 = ia + 10;

We can also subtract two pointers as long as they point into the same array or to an element one past the end of the array:

          ptrdiff_t n = ip2 - ip; // ok: distance between the pointers

The result is four, the distance between the two pointers, measured in objects. The result of subtracting two pointers has the library type ptrdiff_t. Like size_t, ptrdiff_t is a machine-specific type and is defined in the cstddef header. The size_t type is an unsigned type, whereas ptrdiff_t is a signed integral type.

The difference in type reflects how these two types are used: size_t is used to hold the size of an array, which cannot be negative. The ptrdiff_t type is guaranteed to be large enough to hold the difference between any two pointers into the same array, which might be a negative value. For example, had we subtracted ip2 from ip, the result would be -4.
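A sketch of pointer subtraction in both directions, using the same ia, ip, and ip2 as above. The function distance_demo and the variables forward and backward are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Returns the distance, in elements, from ia[0] to ia[4].
std::ptrdiff_t distance_demo() {
    int ia[] = {0, 2, 4, 6, 8};
    int *ip = ia;       // points to ia[0]
    int *ip2 = ip + 4;  // points to ia[4]
    std::ptrdiff_t forward = ip2 - ip;   // 4 elements ahead
    std::ptrdiff_t backward = ip - ip2;  // -4: ptrdiff_t is a signed type
    assert(backward == -forward);
    return forward;
}
```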

It is always possible to add or subtract zero to a pointer, which leaves the pointer unchanged. More interestingly, given a pointer that has a value of zero, it is also legal to add zero to that pointer. The result is another zero-valued pointer. We can also subtract two pointers that have a value of zero. The result of subtracting two zero-valued pointers is zero.

Interaction between Dereference and Pointer Arithmetic

The result of adding an integral value to a pointer is itself a pointer. We can dereference the resulting pointer directly without first assigning it to another pointer:

          int last = *(ia + 4); // ok: initializes last to 8, the value of ia[4]

This expression calculates the address four elements past ia and dereferences that pointer. It is equivalent to writing ia[4].

The parentheses around the addition are essential. Writing

          last = *ia + 4;     // ok: last = 4, equivalent to ia[0]+4

means dereference ia and add four to the dereferenced value.

The parentheses are required due to the precedence of the addition and dereference operators. We’ll learn more about precedence in Section 5.10.1 (p. 168). Simply put, precedence stipulates how operands are grouped in expressions with multiple operators. The dereference operator has a higher precedence than the addition operator.

The operands to operators with higher precedence are grouped more tightly than those of lower precedence. Without the parentheses, the dereference operator would use ia as its operand. The expression would be evaluated by dereferencing ia and adding four to the value of the element at the beginning of ia.

By parenthesizing the expression, we override the normal precedence rules and effectively treat (ia + 4) as a single operand. That operand is an address of an element four past the one to which ia points. That new address is dereferenced.

Subscripts and Pointers

We have already seen that when we use an array name in an expression, we are actually using a pointer to the first element in the array. This fact has a number of implications, which we shall point out as they arise.

One important implication is that when we subscript an array, we are really subscripting a pointer:

          int ia[] = {0,2,4,6,8};
          int i = ia[0]; // ia points to the first element in ia

When we write ia[0], that is an expression that uses the name of an array. When we subscript an array, we are really subscripting a pointer to an element in that array. We can use the subscript operator on any pointer, as long as that pointer points to an element in an array:

          int *p = &ia[2];     // ok: p points to the element indexed by 2
          int j = p[1];        // ok: p[1] equivalent to *(p + 1),
                               //    p[1] is the same element as ia[3]
          int k = p[-2];       // ok: p[-2] is the same element as ia[0]

Computing an Off-the-End Pointer

When we use a vector, the end operation returns an iterator that refers just past the end of the vector. We often use this iterator as a sentinel to control loops that process the elements in the vector. Similarly, we can compute an off-the-end pointer value:

          const size_t arr_size = 5;
          int arr[arr_size] = {1,2,3,4,5};
          int *p = arr;           // ok: p points to arr[0]
          int *p2 = p + arr_size; // ok: p2 points one past the end of arr
                                  //    use caution -- do not dereference!

In this case, we set p to point to the first element in arr. We then calculate a pointer one past the end of arr by adding the size of arr to the pointer value in p. When we add 5 to p, the effect is to calculate the address that is five ints away from p; in other words, p + 5 points just past the end of arr.

It is legal to compute an address one past the end of an array or object. It is not legal to dereference a pointer that holds such an address. Nor is it legal to compute an address more than one past the end of an array or an address before the beginning of an array.

The address we calculated and stored in p2 acts much like the iterator returned from the end operation on vectors. The iterator we obtain from end denotes “one past the end” of the vector. We may not dereference that iterator, but we may compare it to another iterator value to see whether we have processed all the elements in the vector. Similarly, the value we calculated for p2 can be used only to compare to another pointer value or as an operand in a pointer arithmetic expression. If we attempt to dereference p2, the most likely result is that it would yield some garbage value. Most compilers would treat the result of dereferencing p2 as an int, using whatever bits happened to be in memory at the location just after the last element in arr.

Printing the Elements of an Array

Now we are ready to write a program that uses pointers:

image

This program uses a feature of the for loop that we have not yet used: We may define multiple variables inside the init-statement (Section 1.4.2, p. 14) of a for as long as the variables are defined using the same type. In this case, we’re defining two int pointers named pbegin and pend.

We use these pointers to traverse the array. Like other built-in types, arrays have no member functions. Hence, there are no begin and end operations on arrays. Instead, we must position pointers to denote the first and one past the last elements ourselves. We do so in the initialization of our two pointers. We initialize pbegin to address the first element of int_arr and pend to one past the last element in the array:

image

The pointer pend serves as a sentinel, allowing the for loop to know when to stop. Each iteration of the for loop increments pbegin to address the next element. On the first trip through the loop, pbegin denotes the first element, on the second iteration, the second element, and so on. After processing the last element in the array, pbegin will be incremented once more and will then equal pend. At that point we know that we have iterated across the entire array.
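
As a minimal sketch of the loop just described: the names int_arr, pbegin, and pend come from the text, while arr_sz, the element values, and the wrapper function format_int_arr are assumptions made for illustration. The output goes to a string stream rather than cout only so that the result is easy to inspect.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical wrapper: returns the text that the loop would print.
std::string format_int_arr()
{
    const std::size_t arr_sz = 5;            // assumed size
    int int_arr[arr_sz] = {0, 1, 2, 3, 4};   // assumed values
    std::ostringstream out;

    // pbegin points to the first element, pend one past the last;
    // both can be defined in the init-statement because they share a type
    for (int *pbegin = int_arr, *pend = int_arr + arr_sz;
         pbegin != pend; ++pbegin)
        out << *pbegin << ' ';               // write the current element

    return out.str();
}
```

The loop never dereferences pend; it only compares pbegin against it.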

Pointers Are Iterators for Arrays

Astute readers will note that this program is remarkably similar to the program on page 99, which traversed and printed the contents of a vector of strings. The loop in that program

image

used iterators in much the same way that pointers are used in the program to print the contents of the array. This similarity is not a coincidence. In fact, the built-in array type has many of the properties of a library container, and pointers, when we use them in conjunction with arrays, are themselves iterators. We’ll have much more to say about containers and iterators in Part II.
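
One way to see the similarity is to write a single loop that accepts either kind of range. The sketch below uses a function template, which this chapter has not introduced, so treat it as a preview; sum_range, sum_of_array, and sum_of_vector are hypothetical names.

```cpp
#include <vector>

// Hypothetical helper: adds up the ints in the range [first, last).
// Iter may be an int* or a vector<int>::iterator.
template <typename Iter>
int sum_range(Iter first, Iter last)
{
    int total = 0;
    for (; first != last; ++first)   // same traversal for both kinds of range
        total += *first;
    return total;
}

int sum_of_array()
{
    int int_arr[] = {1, 2, 3, 4};
    return sum_range(int_arr, int_arr + 4);        // pointers as iterators
}

int sum_of_vector()
{
    std::vector<int> ivec(4, 2);                   // four elements, each 2
    return sum_range(ivec.begin(), ivec.end());    // library iterators
}
```

The body of sum_range uses only !=, ++, and *, which pointers and vector iterators both support.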

4.2.5 Pointers and the const Qualifier

There are two kinds of interactions between pointers and the const qualifier discussed in Section 2.4 (p. 56): We can have pointers to const objects and pointers that are themselves const. This section discusses both kinds of pointers.

Pointers to const Objects

The pointers we’ve seen so far can be used to change the value of the objects to which they point. But if we have a pointer to a const object, we do not want to allow that pointer to change the underlying, const value. The language enforces this property by requiring that pointers to const objects must take the constness of their target into account:

          const double *cptr;  // cptr may point to a double that is const

Exercises Section 4.2.4

Exercise 4.17: Given that p1 and p2 point to elements in the same array, what does the following statement do?

          p1 += p2 - p1;

Are there any values of p1 or p2 that could make this code illegal?

Exercise 4.18: Write a program that uses pointers to set the elements in an array of ints to zero.

Here cptr is a pointer to an object of type const double. The const qualifies the type of the object to which cptr points, not cptr itself. That is, cptr itself is not const. We need not initialize it and can assign a new value to it if we so desire. What we cannot do is use cptr to change the value to which it points:

          *cptr = 42;   // error: *cptr might be const

It is also a compile-time error to assign the address of a const object to a plain, nonconst pointer:

          const double pi = 3.14;
          double *ptr = &pi;        // error: ptr is a plain pointer
          const double *cptr = &pi; // ok: cptr is a pointer to const

We cannot use a void* pointer (Section 4.2.2, p. 119) to hold the address of a const object. Instead, we must use the type const void* to hold the address of a const object:

          const int universe = 42;
          const void *cpv = &universe; // ok: cpv points to const
          void *pv = &universe;        // error: universe is const

A pointer to a const object can be assigned the address of a nonconst object, such as

          double dval = 3.14; // dval is a double; its value can be changed
          cptr = &dval;       // ok: but can't change dval through cptr

Although dval is not a const, any attempt to modify its value through cptr results in a compile-time error. When we declared cptr, we said that it would not change the value to which it points. The fact that it happens to point to a nonconst object is irrelevant.

We cannot use a pointer to const to change the underlying object. However, if the pointer addresses a nonconst object, it is possible that some other action will change the object to which the pointer points.

The fact that the value to which a pointer to const points can be changed is subtle and can be confusing. Consider:

image

In this case, cptr is defined as a pointer to const but it actually points at a nonconst object. Even though the object to which it points is nonconst, we cannot use cptr to change the object’s value. Essentially, there is no way for cptr to know whether the object it points to is const, and so it treats all objects to which it might point as const.

When a pointer to const does point to a nonconst, it is possible that the value of the object might change: After all, that value is not const. We could either assign to it directly or, as here, indirectly through another, plain nonconst pointer. It is important to remember that there is no guarantee that an object pointed to by a pointer to const won’t change.
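
The situation can be sketched as follows. The names dval, cptr, and ptr follow the text; the function name and the particular values are assumptions.

```cpp
// Hypothetical function name; dval, cptr, and ptr follow the text.
double read_after_indirect_change()
{
    double dval = 3.14;          // dval is not const
    const double *cptr = &dval;  // pointer to const addressing a nonconst object
    // *cptr = 3.14159;          // error: cannot write through a pointer to const

    double *ptr = &dval;         // ok: plain pointer to the same object
    *ptr = 2.72;                 // ok: changes dval

    return *cptr;                // reads the value written through ptr
}
```

The value read through cptr changes even though cptr itself was never used to write.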

It may be helpful to think of pointers to const as “pointers that think they point to const.”

In real-world programs, pointers to const occur most often as formal parameters of functions. Defining a parameter as a pointer to const serves as a contract guaranteeing that the actual object being passed into the function will not be modified through that parameter.
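
A small sketch of such a parameter, assuming a hypothetical function count_zeros that only needs to read the elements it is given:

```cpp
#include <cstddef>

// Hypothetical function: counts the zero elements in [first, last).
// The pointer-to-const parameters promise not to modify the caller's array.
std::size_t count_zeros(const int *first, const int *last)
{
    std::size_t n = 0;
    for (; first != last; ++first) {
        if (*first == 0)
            ++n;
        // *first = 1;   // error: would write through a pointer to const
    }
    return n;
}
```

Because a pointer to const can address a nonconst object, the function accepts both const and nonconst arrays.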

const Pointers

In addition to pointers to const, we can also have const pointers—that is, pointers whose own value we may not change:

          int errNumb = 0;
          int *const curErr = &errNumb; // curErr is a constant pointer

Reading this definition from right to left, we see that “curErr is a constant pointer to an object of type int.” As with any const, we may not change the value of the pointer—that is, we may not make it point to any other object. Any attempt to assign to a constant pointer—even assigning the same value back to curErr—is flagged as an error during compilation:

          curErr = curErr; // error: curErr is const

As with any const, we must initialize a const pointer when we create it.

The fact that a pointer is itself const says nothing about whether we can use the pointer to change the value to which it points. Whether we can change the value pointed to depends entirely on the type to which the pointer points. For example, curErr addresses a plain, nonconst int. We can use curErr to change the value of errNumb:

image
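
As a sketch of this point: curErr and errNumb are the names used in the text, while the function clear_error and its reference parameter are assumptions made so that the effect can be observed by the caller.

```cpp
// Hypothetical function: resets a nonzero error number through a const pointer.
// Returns true if there was an error to clear.
bool clear_error(int &errNumb)
{
    int *const curErr = &errNumb;  // const pointer to a nonconst int
    if (*curErr) {
        *curErr = 0;               // ok: the object pointed to is not const
        return true;
    }
    // curErr = 0;                 // error: curErr itself is const
    return false;
}
```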

const Pointer to a const Object

We can also define a constant pointer to a constant object as follows:

          const double pi = 3.14159;
          // pi_ptr is const and points to a const object
          const double *const pi_ptr = &pi;

In this case, neither the value of the object addressed by pi_ptr nor the address itself can be changed. We can read its definition from right to left as “pi_ptr is a constant pointer to an object of type double defined as const.”

Pointers and Typedefs

The use of pointers in typedefs (Section 2.6, p. 61) often leads to surprising results. Here is a question almost everyone answers incorrectly at least once. Given the following,

          typedef string *pstring;
          const pstring cstr;

what is the type of cstr? The simple answer is that it is a const pstring. The deeper question is: what underlying type does a const pstring represent? Many think that the actual type is

          const string *cstr; // wrong interpretation of const pstring cstr

That is, that a const pstring would be a pointer to a constant string. But that is incorrect.

The mistake is in thinking of a typedef as a textual expansion. When we declare a const pstring, the const modifies the type of pstring, which is a pointer. Therefore, this definition declares cstr to be a const pointer to string. The definition is equivalent to

          // cstr is a const pointer to string
          string *const cstr; // equivalent to const pstring cstr
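
The equivalence can be checked by the compiler itself. The sketch below assumes a compiler that supports static_assert and the type_traits header, which are newer than the rest of this chapter; modify_through_cstr is a hypothetical name.

```cpp
#include <string>
#include <type_traits>

typedef std::string *pstring;

// const pstring is a const pointer to string ...
static_assert(std::is_same<const pstring, std::string *const>::value,
              "const applies to the pointer");
// ... and not a pointer to const string
static_assert(!std::is_same<const pstring, const std::string *>::value,
              "const pstring is not a pointer to const string");

std::string modify_through_cstr()
{
    std::string s("typedef");
    const pstring cstr = &s;   // a const pointer must be initialized
    *cstr += "!";              // ok: the string itself is not const
    // cstr = 0;               // error: cstr is const
    return s;
}
```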

4.3 C-Style Character Strings

Although C++ supports C-style strings, they should not be used by C++ programs. C-style strings are a surprisingly rich source of bugs and are the root cause of many, many security problems.

In Section 2.2 (p. 40) we first used string literals and learned that the type of a string literal is array of constant characters. We can now be more explicit and note that the type of a string literal is an array of const char. A string literal is an instance of a more general construct that C++ inherits from C: C-style character strings. C-style strings are not actually a type in either C or C++. Instead, C-style strings are null-terminated arrays of characters:

image

Neither ca1 nor cp1 is a C-style string: ca1 is a character array, but the array is not null-terminated. Because cp1 points to ca1, it does not point to a null-terminated array. The other declarations are all C-style strings, remembering that the name of an array is treated as a pointer to the first element of the array. Thus, ca2 and ca3 are pointers to the first elements of their respective arrays.
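
A set of declarations consistent with this description might look like the following sketch. The names ca1, ca2, ca3, and cp1 appear in the text; cp and cp2 and the particular characters are assumptions.

```cpp
#include <cstring>

char ca1[] = {'C', '+', '+'};        // no null: not a C-style string
char ca2[] = {'C', '+', '+', '\0'};  // explicit null
char ca3[] = "C++";                  // null terminator added automatically
const char *cp = "C++";              // null terminator added automatically
char *cp1 = ca1;   // points to the first element of ca1: not a C-style string
char *cp2 = ca2;   // points to the first element of a null-terminated array
```

Here sizeof(ca1) is 3 while sizeof(ca2) and sizeof(ca3) are both 4; the extra element is the null.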

Exercises Section 4.2.5

Exercise 4.19: Explain the meaning of the following five definitions. Identify any illegal definitions.

          (a) int i;
          (b) const int ic;
          (c) const int *pic;
          (d) int *const cpi;
          (e) const int *const cpic;

Exercise 4.20: Which of the following initializations are legal? Explain why.

          (a) int i = -1;
          (b) const int ic = i;
          (c) const int *pic = &ic;
          (d) int *const cpi = &ic;
          (e) const int *const cpic = &ic;

Exercise 4.21: Based on the definitions in the previous exercise, which of the following assignments are legal? Explain why.

          (a) i = ic;
          (b) pic = &ic;
          (c) cpi = pic;
          (d) pic = cpic;
          (e) cpic = &ic;
          (f) ic = *cpic;

Using C-style Strings

C-style strings are manipulated through (const) char* pointers. One frequent usage pattern uses pointer arithmetic to traverse the C-style string. The traversal tests and increments the pointer until we reach the terminating null character:

image

The condition in the while dereferences the const char* pointer cp and the resulting character is tested for its true or false value. A true value is any character other than the null. So, the loop continues until it encounters the null character that terminates the array to which cp points. The body of the while does whatever processing is needed and concludes by incrementing cp to advance the pointer to address the next character in the array.
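
A minimal sketch of this traversal, in which the processing done by the loop body is simply counting the characters; count_chars is a hypothetical name, and cp must point to a null-terminated array:

```cpp
#include <cstddef>

// Hypothetical function: counts the characters before the terminating null.
std::size_t count_chars(const char *cp)
{
    std::size_t cnt = 0;
    while (*cp) {   // true for any character other than the null
        ++cnt;      // process the current character
        ++cp;       // advance to the next character
    }
    return cnt;
}
```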

This loop will fail if the array that cp addresses is not null-terminated. In this case, the loop is apt to read characters starting at cp until it encounters a null character somewhere in memory.

C Library String Functions

The Standard C library provides a set of functions, listed in Table 4.1, that operate on C-style strings. To use these functions, we must include the associated C header file

          #include <cstring>

which is the C++ version of the string.h header from the C library.

Table 4.1. C-Style Character String Functions

These functions do no checking on their string parameters.

The pointer(s) passed to these routines must be nonzero and each pointer must point to the initial character in a null-terminated array. Some of these functions write to a string they are passed. These functions assume that the array to which they write is large enough to hold whatever characters the function generates. It is up to the programmer to ensure that the target string is big enough.

When we compare library strings, we do so using the normal relational operators. We can use these operators to compare pointers to C-style strings, but the effect is quite different; what we’re actually comparing is the pointer values, not the strings to which they point:

          if (cp1 < cp2) // compares addresses, not the values pointed to

Assuming cp1 and cp2 point to elements in the same array (or one past that array), then the effect of this comparison is to compare the address in cp1 with the address in cp2. If the pointers do not address the same array, then the comparison is undefined.

To compare the strings, we must use strcmp and interpret the result:

          const char *cp1 = "A string example";
          const char *cp2 = "A different string";
          int i = strcmp(cp1, cp2);    // i is positive
          i = strcmp(cp2, cp1);        // i is negative
          i = strcmp(cp1, cp1);        // i is zero

The strcmp function returns one of three kinds of value: 0 if the strings are equal, or a positive or negative value, depending on whether the first string is larger or smaller than the second.
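
Only the sign of a nonzero result is meaningful; the magnitude can differ from one library to another. A sketch of a helper that reduces the result to -1, 0, or 1 (compare_sign is a hypothetical name):

```cpp
#include <cstring>

// Hypothetical helper: reduces strcmp's result to -1, 0, or 1.
int compare_sign(const char *s1, const char *s2)
{
    int r = std::strcmp(s1, s2);
    return (r > 0) - (r < 0);   // 1 if positive, -1 if negative, else 0
}
```

Code that tests the result with < 0, == 0, or > 0 works on any implementation, whereas code that tests for a specific value such as 1 may not.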

Never Forget About the Null-Terminator

When using the C library string functions it is essential to remember the strings must be null-terminated:

image

In this case, ca is an array of characters but is not null-terminated. What happens is undefined. The strlen function assumes that it can rely on finding a null character at the end of its argument. The most likely effect of this call is that strlen will keep looking through the memory that follows wherever ca happens to reside until it encounters a null character. In any event, the return from strlen will not be the correct value.
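
When the size of the array is known, one defensive option is to check for a null within that size before handing the array to strlen. This is a sketch of that idea rather than anything the library requires; has_terminator is a hypothetical name, and std::memchr is the standard function that searches a block of memory for a byte.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: true if a null occurs within the first n chars of buf.
bool has_terminator(const char *buf, std::size_t n)
{
    return std::memchr(buf, '\0', n) != 0;
}
```

For an array variable, sizeof gives the number of chars to search, as in has_terminator(ca, sizeof(ca)).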

Caller Is Responsible for Size of a Destination String

The array that we pass as the first argument to strcat and strcpy must be large enough to hold the generated string. The code we show here, although a common usage pattern, is fraught with the potential for serious error:

image

The problem is that we could easily miscalculate the size needed in largeStr. Similarly, if we later change the sizes of the strings to which either cp1 or cp2 point, then the calculated size of largeStr will be wrong. Unfortunately, programs similar to this code are widely distributed. Programs with such code are error-prone and often lead to serious security leaks.
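
The pattern under discussion can be sketched as follows. The names largeStr, cp1, and cp2 come from the text, with cp1 and cp2 given the values used earlier for strcmp; the wrapper function build_with_strcpy is an assumption. The code is correct only because the sizes were counted correctly by hand.

```cpp
#include <cstring>
#include <string>

// Hypothetical wrapper around the risky pattern; returns the result as a string.
std::string build_with_strcpy()
{
    const char *cp1 = "A string example";    // 16 characters
    const char *cp2 = "A different string";  // 18 characters

    // Dangerous: correct only while the hand-counted sizes stay accurate
    char largeStr[16 + 18 + 2];   // cp1, a space, cp2, and the null
    std::strcpy(largeStr, cp1);   // copies cp1 into largeStr
    std::strcat(largeStr, " ");   // adds a space at the end of largeStr
    std::strcat(largeStr, cp2);   // concatenates cp2 onto largeStr

    return std::string(largeStr);
}
```

Lengthening either literal by one character without updating the array dimension would make the last strcat write past the end of largeStr.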

When Using C-Style Strings, Use the strn Functions

If you must use C-style strings, it is usually safer to use the strncat and strncpy functions instead of strcat and strcpy:

          char largeStr[16 + 18 + 2]; // to hold cp1 a space and cp2
          strncpy(largeStr, cp1, 17); // size to copy includes the null
          strncat(largeStr, " ", 1);  // pedantic, but a good habit
          strncat(largeStr, cp2, 18); // adds at most 18 characters, plus a null

The trick to using these versions is to properly calculate the value to control how many characters get copied. In particular, we must always remember to account for the null when copying or concatenating characters. The count passed to strncpy must include the null if we want it copied; the count passed to strncat is the maximum number of characters to append, and strncat always writes a null after them. We must allocate space for the null because that is the character that terminates largeStr after each call. Let’s walk through these calls in detail:

• On the call to strncpy, we ask to copy 17 characters: all the characters in cp1 plus the null. Leaving room for the null is necessary so that largeStr is properly terminated. After the strncpy call, largeStr has a strlen value of 16. Remember, strlen counts the characters in a C-style string, not including the null.

• On the first call to strncat, we ask to append at most one character: the space. After this call, largeStr has a strlen of 17. The null that had ended largeStr is overwritten by the space that we appended. A new null is written after that space.

• When we append cp2 in the second call, we ask to append at most 18 characters, which is every character in cp2; strncat then writes the terminating null. After this call, the strlen of largeStr would be 35: 16 characters from cp1, 18 from cp2, and 1 for the space that separates the two strings.

The array size of largeStr remains 36 throughout.

These operations are safer than the simpler versions that do not take a size argument as long as we calculate the size argument correctly. If we ask to copy or concatenate more characters than the size of the target array, we will still overrun that array. If the string we’re copying from or concatenating is bigger than the requested size, then we’ll inadvertently truncate the new version. Truncating is safer than overrunning the array, but it is still an error.
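
One way to reduce the risk of a miscalculated count is to derive every count from the size of the destination instead of from hand-counted literals. The following is a sketch of that approach, not the only way to do it; join_with_space and its parameters are hypothetical names.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: writes "first second" into dest without overrunning it.
// Returns true if the whole result fit, false if it had to be truncated.
bool join_with_space(char *dest, std::size_t dest_size,
                     const char *first, const char *second)
{
    if (dest_size == 0)
        return false;

    // strncpy writes no null when first fills the buffer, so add one by hand
    std::strncpy(dest, first, dest_size - 1);
    dest[dest_size - 1] = '\0';

    // strncat's count excludes the null that it always appends,
    // so the room left is the array size minus the current length minus one
    std::strncat(dest, " ", dest_size - std::strlen(dest) - 1);
    std::strncat(dest, second, dest_size - std::strlen(dest) - 1);

    return std::strlen(first) + 1 + std::strlen(second) < dest_size;
}
```

A caller would pass the array and its size, for example join_with_space(largeStr, sizeof(largeStr), cp1, cp2), and check the return value to detect truncation.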

Whenever Possible, Use Library strings

None of these issues matter if we use C++ library strings:

          string largeStr = cp1; // initialize largeStr as a copy of cp1
          largeStr += " ";       // add space at end of largeStr
          largeStr += cp2;       // concatenate cp2 onto end of largeStr

Now the library handles all memory management, and we need no longer worry if the size of either string changes.

For most applications, in addition to being safer, it is also more efficient to use library strings rather than C-style strings.

4.3.1 Dynamically Allocating Arrays

A variable of array type has three important limitations: Its size is fixed, the size must be known at compile time, and the array exists only until the end of the block in which it was defined. Real-world programs usually cannot live with these restrictions—they need a way to allocate an array dynamically at run time. Although all arrays have fixed size, the size of a dynamically allocated array need not be fixed at compile time. It can be (and usually is) determined at run time. Unlike an array variable, a dynamically allocated array continues to exist until it is explicitly freed by the program.

Exercises Section 4.3

Exercise 4.22: Explain the difference between the following two while loops:

          const char *cp = "hello";
          int cnt;
          while (cp) { ++cnt; ++cp; }
          while (*cp) { ++cnt; ++cp; }

Exercise 4.23: What does the following program do?

          const char ca[] = {'h', 'e', 'l', 'l', 'o'};
          const char *cp = ca;
          while (*cp) {
              cout << *cp << endl;
              ++cp;
          }

Exercise 4.24: Explain the differences between strcpy and strncpy. What are the advantages of each? The disadvantages?

Exercise 4.25: Write a program to compare two strings. Now write a program to compare the value of two C-style character strings.

Exercise 4.26: Write a program to read a string from the standard input. How might you write a program to read from the standard input into a C-style character string?

Every program has a pool of available memory it can use during program execution to hold dynamically allocated objects. This pool of available memory is referred to as the program’s free store or heap. C programs use a pair of functions named malloc and free to allocate space from the free store. In C++ we use new and delete expressions.

Defining a Dynamic Array

When we define an array variable, we specify a type, a name, and a dimension. When we dynamically allocate an array, we specify the type and size but do not name the object. Instead, the new expression returns a pointer to the first element in the newly allocated array:

          int *pia = new int[10]; // array of 10 uninitialized ints

This new expression allocates an array of ten ints and returns a pointer to the first element in that array, which we use to initialize pia.

A new expression takes a type and optionally an array dimension specified inside a bracket-pair. The dimension can be an arbitrarily complex expression. When we allocate an array, new returns a pointer to the first element in the array. Objects allocated on the free store are unnamed. We use objects on the heap only indirectly through their address.

Initializing a Dynamically Allocated Array

When we allocate an array of objects of a class type, then that type’s default constructor (Section 2.3.4, p. 50) is used to initialize each element. If the array holds elements of built-in type, then the elements are uninitialized:

          string *psa = new string[10]; // array of 10 empty strings
          int *pia = new int[10];       // array of 10 uninitialized ints

Each of these new expressions allocates an array of ten objects. In the first case, those objects are strings. After allocating memory to hold the objects, the default string constructor is run on each element of the array in turn. In the second case, the objects are a built-in type; memory to hold ten ints is allocated, but the elements are uninitialized.

Alternatively, we can value-initialize (Section 3.3.1, p. 92) the elements by following the array size by an empty pair of parentheses:

          int *pia2 = new int[10] (); // array of 10 ints, each value-initialized to 0

The parentheses are effectively a request to the compiler to value-initialize the array, which in this case sets its elements to 0.
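
A short sketch that allocates a value-initialized array, confirms that every element is 0, and then frees the array; the function name is hypothetical and the size is supplied at run time.

```cpp
#include <cstddef>

// Hypothetical function: true if every element of a value-initialized
// dynamic array of n ints is 0.
bool all_zero_after_value_init(std::size_t n)
{
    int *pia2 = new int[n]();   // value-initialized: each element is 0
    bool ok = true;
    for (int *q = pia2; q != pia2 + n; ++q)
        if (*q != 0)
            ok = false;
    delete [] pia2;             // return the array to the free store
    return ok;
}
```

Without the parentheses the elements would be uninitialized, and reading them as this loop does would not be valid.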

The elements of a dynamically allocated array can be initialized only to the default value of the element type. The elements cannot be initialized to separate values as can be done for elements of an array variable.

Dynamic Arrays of const Objects

If we create an array of const objects of built-in type on the free store, we must initialize that array: Because the elements are const, there is no way to assign values to the elements. The only way to initialize the elements is to value-initialize the array:

          // error: uninitialized const array
          const int *pci_bad = new const int[100];
          // ok: value-initialized const array
          const int *pci_ok = new const int[100]();

It is possible to have a const array of elements of a class type that provides a default constructor:

          // ok: array of 100 empty strings
          const string *pcs = new const string[100];

In this case, the default constructor is used to initialize the elements of the array.

Of course, once the elements are created, they may not be changed—which means that such arrays usually are not very useful.

It Is Legal to Dynamically Allocate an Empty Array

When we dynamically allocate an array, we often do so because we don’t know the size of the array at compile time. We might write code such as

          size_t n = get_size(); // get_size returns number of elements needed
          int* p = new int[n];
          for (int* q = p; q != p + n; ++q)
               /* process the array */ ;

to figure out the size of the array and then allocate and process the array.

An interesting question is: What happens if get_size returns 0? The answer is that our code works fine. The language specifies that a call to new to create an array of size zero is legal. It is legal even though we could not create an array variable of size 0:

          char arr[0];            // error: cannot define zero-length array
          char *cp = new char[0]; // ok: but cp can't be dereferenced

When we use new to allocate an array of zero size, new returns a valid, nonzero pointer. This pointer will be distinct from any other pointer returned by new. The pointer cannot be dereferenced; after all, it points to no element. The pointer can be compared and so can be used in a loop such as the preceding one. It is also legal to add (or subtract) zero to such a pointer and to subtract the pointer from itself, yielding zero.

In our hypothetical loop, if the call to get_size returned 0, then the call to new would still succeed. However, p would not address any element; the array is empty. Because n is zero, the for loop effectively compares q to p. These pointers are equal; q was initialized to p, so the condition in the for fails and the loop body is not executed.
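
The behavior can be sketched with a function that takes the size as a parameter in place of the call to get_size and reports how many elements the loop visited; count_visits is a hypothetical name.

```cpp
#include <cstddef>

// Hypothetical function: allocates n ints, visits each one, and
// returns the number of elements visited.
std::size_t count_visits(std::size_t n)
{
    int *p = new int[n];        // legal even when n is 0
    std::size_t visits = 0;
    for (int *q = p; q != p + n; ++q) {
        *q = 0;                 // never executed when n is 0
        ++visits;
    }
    delete [] p;                // a zero-length array must still be freed
    return visits;
}
```

With n equal to 0 the condition q != p + n is false on the first test, so the body never dereferences the pointer.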

Freeing Dynamic Memory

When we allocate memory, we must eventually free it. Otherwise, memory is gradually used up and may be exhausted. When we no longer need the array, we must explicitly return its memory to the free store. We do so by applying the delete [] expression to a pointer that addresses the array we want to release:

          delete [] pia;

deallocates the array pointed to by pia, returning the associated memory to the free store. The empty bracket pair between the delete keyword and the pointer is necessary: It indicates to the compiler that the pointer addresses an array of elements on the free store and not simply a single object.

If the empty bracket pair is omitted, it is an error, but an error that the compiler is unlikely to catch; the program may fail at run time.

The least serious run-time consequence of omitting brackets when freeing an array is that too little memory will be freed, leading to a memory leak. On some systems and/or for some element types, more serious run-time problems are possible. It is essential to remember the bracket-pair when deleting pointers to arrays.

Using Dynamically Allocated Arrays

A common reason to allocate an array dynamically is if its dimension cannot be known at compile time. For example, char* pointers are often used to refer to multiple C-style strings during the execution of a program. The memory used to hold the various strings typically is allocated dynamically during program execution based on the length of the string to be stored. This technique is considerably safer than allocating a fixed-size array. Assuming we correctly calculate the size needed at run time, we no longer need to worry that a given string will overflow the fixed size of an array variable.

Suppose we have the following C-style strings:

          const char *noerr = "success";
          // ...
          const char *err189 = "Error: a function declaration must "
                               "specify a function return type!";

We might want to copy one or the other of these strings at run time to a new character array. We could calculate the dimension at run time, as follows:

image

Recall that strlen returns the length of the string not including the null. It is essential to remember to add 1 to the length returned from strlen to accommodate the trailing null.
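
A sketch of the calculation, using the strings noerr and err189 from the text. The function copy_message, its errorFound parameter, and the local names errorTxt, dimension, and errMsg are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstring>

const char *noerr = "success";
const char *err189 = "Error: a function declaration must "
                     "specify a function return type!";

// Hypothetical function: returns a dynamically allocated copy of one of the
// two messages. The caller is responsible for applying delete [] to the result.
char *copy_message(bool errorFound)
{
    const char *errorTxt;
    if (errorFound)
        errorTxt = err189;
    else
        errorTxt = noerr;

    // remember the 1 for the terminating null
    std::size_t dimension = std::strlen(errorTxt) + 1;
    char *errMsg = new char[dimension];

    // copy the text, including the null, into the new array
    std::strncpy(errMsg, errorTxt, dimension);
    return errMsg;
}
```

Because the dimension is computed from the string actually chosen, the array is exactly large enough whichever message is copied.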

Exercises Section 4.3.1

Exercise 4.27: Given the following new expression, how would you delete pa?

     int *pa = new int[10];

Exercise 4.28: Write a program to read the standard input and build a vector of ints from values that are read. Allocate an array of the same size as the vector and copy the elements from the vector into the array.

Exercise 4.29: Given the two program fragments in the highlighted box on page 138,

(a) Explain what the programs do.

(b) As it happens, on average, the string class implementation executes considerably faster than the C-style string functions. The relative average execution times on our more than five-year-old PC are as follows:

          user       0.47    # string class
          user       2.55    # C-style character string

Did you expect that? How would you account for it?

Exercise 4.30: Write a program to concatenate two C-style string literals, putting the result in a C-style string. Write a program to concatenate two library strings that have the same value as the literals used in the first program.

4.3.2 Interfacing to Older Code

Many C++ programs exist that predate the standard library and so do not yet use the string and vector types. Moreover, many C++ programs interface to existing C programs that cannot use the C++ library. Hence, it is not infrequent to encounter situations where a program written in modern C++ must interface to code that uses arrays and/or C-style character strings. The library offers facilities to make the interface easier to manage.

Mixing Library strings and C-Style Strings

As we saw on page 80 we can initialize a string from a string literal:

          string st3("Hello World");  // st3 holds Hello World

More generally, because a C-style string has the same type as a string literal and is null-terminated in the same way, we can use a C-style string anywhere that a string literal can be used:

• We can initialize or assign to a string from a C-style string.

• We can use a C-style string as one of the two operands to the string addition or as the right-hand operand to the compound assignment operators.

The reverse functionality is not provided: there is no direct way to use a library string when a C-style string is required. For example, there is no way to initialize a character pointer from a string:

          char *str = st2; // compile-time type error

There is, however, a string member function named c_str that we can often use to accomplish what we want:

          char *str = st2.c_str(); // almost ok, but not quite

The name c_str indicates that the function returns a C-style character string. Literally, it says, “Get me the C-style string representation”—that is, a pointer to the beginning of a null-terminated character array that holds the same data as the characters in the string.

This initialization fails because c_str returns a pointer to an array of const char. It does so to prevent changes to the array. The correct initialization is:

          const char *str = st2.c_str(); // ok

The array returned by c_str is not guaranteed to be valid indefinitely. Any subsequent use of st2 that might change the value of st2 can invalidate the array. If a program needs continuing access to the data, then the program must copy the array returned by c_str.
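
One way to keep a copy is to store the characters, including the null, in a vector<char>. This is a sketch of one option; snapshot is a hypothetical name.

```cpp
#include <string>
#include <vector>

// Hypothetical function: copies the characters of st, plus the terminating
// null, into a vector that is independent of later changes to st.
std::vector<char> snapshot(const std::string &st)
{
    const char *str = st.c_str();   // valid only until st is next changed
    return std::vector<char>(str, str + st.size() + 1);
}
```

For a vector kept returned by snapshot, &kept[0] can be passed wherever a const char* is expected, and it stays valid after the original string is modified.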

Using an Array to Initialize a vector

On page 112 we noted that it is not possible to initialize an array from another array. Instead, we have to create the array and then explicitly copy the elements from one array into the other. It turns out that we can use an array to initialize a vector, although the form of the initialization may seem strange at first. To initialize a vector from an array, we specify the address of the first element and one past the last element that we wish to use as initializers:

          const size_t arr_size = 6;
          int int_arr[arr_size] = {0, 1, 2, 3, 4, 5};
          // ivec has 6 elements: each a copy of the corresponding element in int_arr
          vector<int> ivec(int_arr, int_arr + arr_size);

The two pointers passed to ivec mark the range of values with which to initialize the vector. The second pointer points one past the last element to be copied. The range of elements marked can also represent a subset of the array:

          // copies 3 elements: int_arr[1], int_arr[2], int_arr[3]
          vector<int> ivec(int_arr + 1, int_arr + 4);

This initialization creates ivec with three elements. The values of these elements are copies of the values in int_arr[1] through int_arr[3].

Exercises Section 4.3.2

Exercise 4.31: Write a program that reads a string into a character array from the standard input. Describe how your program handles varying size inputs. Test your program by giving it a string of data that is longer than the array size you’ve allocated.

Exercise 4.32: Write a program to initialize a vector from an array of ints.

Exercise 4.33: Write a program to copy a vector of ints into an array of ints.

Exercise 4.34: Write a program to read strings into a vector. Now, copy that vector into an array of character pointers. For each element in the vector, allocate a new character array and copy the data from the vector element into that character array. Then insert a pointer to the character array into the array of character pointers.

Exercise 4.35: Print the contents of the vector and the array created in the previous exercise. After printing the array, remember to delete the character arrays.

4.4 Multidimensioned Arrays

Strictly speaking, there are no multidimensioned arrays in C++. What is commonly referred to as a multidimensioned array is actually an array of arrays:

          // array of size 3, each element is an array of ints of size 4
          int ia[3][4];

It can be helpful to keep this fact in mind when using what appears to be a multidimensioned array.

An array whose elements are an array is said to have two dimensions. Each dimension is referred to by its own subscript:

     ia[2][3] // fetches last element from the array in the last row

The first dimension is often referred to as the row and the second as the column. In C++ there is no limit on how many subscripts are used. That is, we could have an array whose elements are arrays of elements that are arrays, and so on.

Initializing the Elements of a Multidimensioned Array

As with any array, we can initialize the elements by providing a bracketed list of initializers. Multidimensioned arrays may be initialized by specifying bracketed values for each row:
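
A row-by-row initialization might look like the sketch below. The values 0 through 11 are an assumption, chosen to agree with the brace-free form discussed next, and element_at is a hypothetical helper used only to read a value back.

```cpp
#include <cassert>

// Returns ia[row][col] from a 3-by-4 array whose rows are
// initialized one nested brace list at a time.
int element_at(int row, int col)
{
    assert(row >= 0 && row < 3 && col >= 0 && col < 4);
    int ia[3][4] = {
        {0, 1, 2, 3},     // initializers for row 0
        {4, 5, 6, 7},     // initializers for row 1
        {8, 9, 10, 11}    // initializers for row 2
    };
    return ia[row][col];
}
```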

The nested braces, which indicate the intended row, are optional. The following initialization is equivalent, although considerably less clear.

     // equivalent initialization without the optional nested braces for each row
     int ia[3][4] = {0,1,2,3,4,5,6,7,8,9,10,11};

As is the case for single-dimension arrays, elements may be left out of the initializer list. We could initialize only the first element of each row as follows:

     // explicitly initialize only element 0 in each row
     int ia[3][4] = {{ 0 } , { 4 } , { 8 } };

The values of the remaining elements depend on the element type and follow the rules described on page 112.

If the nested braces were omitted, the results would be very different:

     // explicitly initialize row 0
     int ia[3][4] = {0, 3, 6, 9};

initializes the elements of the first row. The remaining elements are initialized to 0.
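
To make the difference concrete, the sketch below reads values back from both partial initializations. The function names per_row and flat are ours, used only for illustration. Some compilers may warn about the missing braces in the second definition, but the code is legal.

```cpp
#include <cassert>

// Nested braces: only element 0 of each row is given explicitly;
// the remaining elements of each row are initialized to 0.
int per_row(int row, int col)
{
    assert(row >= 0 && row < 3 && col >= 0 && col < 4);
    int ia[3][4] = {{ 0 }, { 4 }, { 8 }};
    return ia[row][col];
}

// No nested braces: the four values fill row 0 in order;
// rows 1 and 2 are initialized to 0.
int flat(int row, int col)
{
    assert(row >= 0 && row < 3 && col >= 0 && col < 4);
    int ia[3][4] = {0, 3, 6, 9};
    return ia[row][col];
}
```

With nested braces the value 4 lands in ia[1][0]; without them, the second initializer lands in ia[0][1].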

Subscripting a Multidimensioned Array

Indexing a multidimensioned array requires a subscript for each dimension. As an example, the following pair of nested for loops initializes a two-dimensioned array:
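
One way to write such a pair of loops is sketched here. The names rowSize, colSize, and filled_value, and the choice to store each element's row-by-row position as its value, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t rowSize = 3;
const std::size_t colSize = 4;

// Fills a rowSize-by-colSize array so that each element holds its
// position counted row by row, then returns the requested element.
int filled_value(std::size_t row, std::size_t col)
{
    assert(row < rowSize && col < colSize);
    int ia[rowSize][colSize];
    for (std::size_t i = 0; i != rowSize; ++i)        // for each row
        for (std::size_t j = 0; j != colSize; ++j)    // for each column
            ia[i][j] = static_cast<int>(i * colSize + j);
    return ia[row][col];
}
```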

When we want to access a particular element of the array, we must supply both a row and column index. The row index specifies which of the inner arrays we intend to access. The column index selects an element from that inner array. Remembering this fact can help in calculating proper subscript values and in understanding how multidimensioned arrays are initialized.

If an expression provides only a single index, then the result is the inner-array element at that row index. Thus, ia[2] fetches the array that is the last row in ia. It does not fetch any element from that array; it fetches the array itself.
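
The following sketch illustrates that ia[2] is itself an array. The function names and the sample values are ours. Applying sizeof to ia[2] measures the whole row, and using ia[2] where a pointer is expected yields a pointer to its first element.

```cpp
#include <cassert>
#include <cstddef>

// Number of ints in the inner array ia[2].
std::size_t last_row_length()
{
    int ia[3][4] = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};
    // ia[2] is an array of 4 ints, so its size is that of 4 ints
    assert(sizeof(ia[2]) == 4 * sizeof(int));
    return sizeof(ia[2]) / sizeof(ia[2][0]);
}

// Sum of the last row, reached through a pointer to its first element.
int last_row_sum()
{
    int ia[3][4] = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};
    int *row = ia[2];    // the array ia[2] converts to a pointer to ia[2][0]
    int sum = 0;
    for (int *p = row; p != row + 4; ++p)
        sum += *p;
    return sum;
}
```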

4.4.1 Pointers and Multidimensioned Arrays

As with any array, when we use the name of a multidimensioned array, it is automatically converted to a pointer to the first element in the array.

When defining a pointer to a multidimensioned array, it is essential to remember that what we refer to as a multidimensioned array is really an array of arrays.

Because a multidimensioned array is really an array of arrays, the pointer type to which the array converts is a pointer to the first inner array. Although conceptually straightforward, the syntax for declaring such a pointer can be confusing:

     int ia[3][4];      // array of size 3, each element is an array of ints of size 4
     int (*ip)[4] = ia; // ip points to an array of 4 ints
     ip = &ia[2];       // ia[2] is an array of 4 ints

We define a pointer to an array similarly to how we would define the array itself: We start by declaring the element type followed by a name and a dimension. The trick is that the name is a pointer, so we must prepend * to the name. We can read the definition of ip from the inside out as saying that *ip has type int[4]— that is, ip is a pointer to an int array of four elements.
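
As a small sketch of how such a pointer behaves, the hypothetical function via_row_pointer below moves ip to a chosen row and then subscripts the array it points to. The sample values are assumptions.

```cpp
#include <cassert>

// Returns the element at column col of the row that ip is made to point to.
int via_row_pointer(int row, int col)
{
    assert(row >= 0 && row < 3 && col >= 0 && col < 4);
    int ia[3][4] = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};
    int (*ip)[4] = ia;    // ip points to ia[0], an array of 4 ints
    ip = ip + row;        // arithmetic on ip moves a whole row at a time
    return (*ip)[col];    // dereference to get the row, then subscript it
}
```

Because the pointed-to type is an array of four ints, adding 1 to ip advances past four ints, not one.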

The parentheses in this declaration are essential:

     int *ip[4];   // array of pointers to int
     int (*ip)[4]; // pointer to an array of 4 ints

Typedefs Simplify Pointers to Multidimensioned Arrays

Typedefs (Section 2.6, p. 61) can help make pointers to elements in multidimensioned arrays easier to write, read, and understand. We might write a typedef for the element type of ia as

     typedef int int_array[4];
     int_array *ip = ia;

We might use this typedef to print the elements of ia:
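
A sketch of such a printing loop follows. To keep the example easy to check, it writes to a string stream and returns the text rather than printing directly. The function name print_ia and the sample values are assumptions.

```cpp
#include <cassert>
#include <sstream>
#include <string>

typedef int int_array[4];

// Writes each element of ia on its own line and returns the result.
std::string print_ia()
{
    int ia[3][4] = {{0, 1, 2, 3}, {4, 5, 6, 7}, {8, 9, 10, 11}};
    std::ostringstream out;
    for (int_array *p = ia; p != ia + 3; ++p)    // p points to a row of ia
        for (int *q = *p; q != *p + 4; ++q)      // q points to an int in that row
            out << *q << '\n';
    assert(!out.fail());
    return out.str();
}
```

Writing to std::cout instead of out would print the same twelve lines to the standard output.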

The outer for loop starts by initializing p to point to the first array in ia. That loop continues until we’ve processed all three rows in ia. The increment, ++p, has the effect of moving p to point to the next row (i.e., the next element) in ia.

The inner for loop actually fetches the int values stored in the inner arrays. It starts by making q point to the first element in the array to which p points. When we dereference p, we get an array of four ints. As usual, when we use an array, it is converted automatically to a pointer to its first element. In this case, that first element is an int, and we point q at that int. The inner for loop runs until we’ve processed every element in the inner array. To obtain a pointer just off the end of the inner array, we again dereference p to get a pointer to the first element in that array. We then add 4 to that pointer to process the four elements in each inner array.

Exercises Section 4.4.1

Exercise 4.36: Rewrite the program to print the contents of the array ia without using a typedef for the type of the pointer in the outer loop.

Chapter Summary

This chapter covered arrays and pointers. These facilities provide functionality similar to that provided by the library vector and string types and their companion iterators. The vector type can be thought of as a more flexible, easier to manage array. Similarly, strings are a great improvement on C-style strings that are implemented as null-terminated character arrays.

Iterators and pointers allow indirect access to objects. Iterators are used to examine elements and navigate between the elements in vectors. Pointers provide similar access to array elements. Although conceptually simple, pointers are notoriously hard to use in practice.

Pointers and arrays can be necessary for certain low-level tasks, but they should be avoided because they are error-prone and hard to debug. In general, the library abstractions should be used in preference to low-level array and pointer alternatives built into the language. This advice is especially applicable to using strings instead of C-style null-terminated character arrays. Modern C++ programs should not use C-style strings.

Defined Terms

C-style strings

C programs treat pointers to null-terminated character arrays as strings. In C++, string literals are C-style strings. The C library defines a set of functions that operate on such strings, which C++ makes available in the cstring header. C++ programs should use C++ library strings in preference to C-style strings, which are inherently error-prone. A sizeable majority of security holes in networked programs are due to bugs related to using C-style strings and arrays.

compiler extension

Feature that is added to the language by a particular compiler. Programs that rely on compiler extensions cannot be moved easily to other compilers.

compound type

Type that is defined in terms of another type. Arrays, pointers, and references are compound types.

const void*

A pointer type that can point to any const type. See void*.

delete expression

A delete expression frees memory that was allocated by new:

     delete [] p;

where p must be a pointer to the first element in a dynamically allocated array. The bracket pair is essential: It indicates to the compiler that the pointer points at an array, not at a single object. In C++ programs, delete replaces the use of the C library free function.

dimension

The size of an array.

dynamically allocated

An object that is allocated on the program’s free store. Objects allocated on the free store exist until they are explicitly deleted.

free store

Memory pool available to a program to hold dynamically allocated objects.

heap

Synonym for free store.

new expression

Allocates dynamic memory. We allocate an array of n elements as follows:

     new type[n];

The array holds elements of the indicated type. new returns a pointer to the first element in the array. C++ programs use new in place of the C library malloc function.

pointer

An object that holds the address of an object.

pointer arithmetic

The arithmetic operations that can be applied to pointers. An integral value can be added to or subtracted from a pointer, resulting in a pointer positioned that many elements ahead of or behind the original pointer. Two pointers can be subtracted, yielding the number of elements between them. Pointer arithmetic is valid only on pointers that denote elements in the same array or an element one past the end of that array.

precedence

Defines the order in which operands are grouped with operators in a compound expression.

ptrdiff_t

Machine-dependent signed integral type defined in the cstddef header that is large enough to hold the difference between two pointers into the largest possible array.

size_t

Machine-dependent unsigned integral type defined in the cstddef header that is large enough to hold the size of the largest possible array.

* operator

Dereferencing a pointer yields the object to which the pointer points. The dereference operator returns an lvalue; we may assign to the value returned from the dereference operator, which has the effect of assigning a new value to the underlying element.

++ operator

When used with a pointer, the increment operator “adds one” by moving the pointer to refer to the next element in an array.

[] operator

The subscript operator takes two operands: a pointer to an element of an array and an index. Its result is the element that is offset from the pointer by the index. Indices count from zero: the first element in an array is element 0, and the last is the element whose index is the size of the array minus 1. The subscript operator returns an lvalue; we may use a subscript as the left-hand operand of an assignment, which has the effect of assigning a new value to the indexed element.

& operator

The address-of operator. Takes a single argument that must be an lvalue. Yields the address in memory of that object.

void*

A pointer type that can point to any nonconst type. Only limited operations are permitted on void* pointers. They can be passed to or returned from functions, and they can be compared with other pointers. They may not be dereferenced.
