In C++, pointers and arrays are closely intertwined. In particular, as we’ll see, when we use an array, the compiler ordinarily converts the array to a pointer.
Normally, we obtain a pointer to an object by using the address-of operator (§ 2.3.2, p. 52). Generally speaking, the address-of operator may be applied to any object. The elements in an array are objects. When we subscript an array, the result is the object at that location in the array. As with any other object, we can obtain a pointer to an array element by taking the address of that element:
string nums[] = {"one", "two", "three"}; // array of strings
string *p = &nums[0]; // p points to the first element in nums
However, arrays have a special property—in most places when we use an array, the compiler automatically substitutes a pointer to the first element:
string *p2 = nums; // equivalent to p2 = &nums[0]
In most expressions, when we use an object of array type, we are really using a pointer to the first element in that array.
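As a quick check of this equivalence, the following sketch compares the pointer obtained with the address-of operator against the pointer produced by using the array name directly. The function name array_name_points_to_first_element is hypothetical and exists only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper for illustration: returns true when the array name and
// the address of element 0 yield the same pointer.
bool array_name_points_to_first_element() {
    std::string nums[] = {"one", "two", "three"};
    std::string *p = &nums[0];  // explicit address-of on the first element
    std::string *p2 = nums;     // implicit array-to-pointer conversion
    assert(*p == "one");        // p addresses the first element
    return p == p2 && *p2 == "one";
}
```

Both pointers compare equal because each one addresses nums[0].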
There are various implications of the fact that operations on arrays are often really operations on pointers. One such implication is that when we use an array as an initializer for a variable defined using auto (§ 2.5.2, p. 68), the deduced type is a pointer, not an array:
int ia[] = {0,1,2,3,4,5,6,7,8,9}; // ia is an array of ten ints
auto ia2(ia); // ia2 is an int* that points to the first element in ia
ia2 = 42; // error: ia2 is a pointer, and we can't assign an int to a pointer
Although ia is an array of ten ints, when we use ia as an initializer, the compiler treats that initialization as if we had written
auto ia2(&ia[0]); // now it's clear that ia2 has type int*
It is worth noting that this conversion does not happen when we use decltype (§ 2.5.3, p. 70). The type returned by decltype(ia) is array of ten ints:
// ia3 is an array of ten ints
decltype(ia) ia3 = {0,1,2,3,4,5,6,7,8,9};
ia3 = p; // error: can't assign an int* to an array
ia3[4] = i; // ok: assigns the value of i to an element in ia3
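One way to confirm the two deduction rules is to ask the compiler for the deduced types. The sketch below assumes the standard type_traits header; the helper names auto_deduces_pointer and decltype_keeps_array are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical helper: true when auto deduces int* from an array initializer.
bool auto_deduces_pointer() {
    int ia[] = {0,1,2,3,4,5,6,7,8,9};
    auto ia2(ia);                   // treated like auto ia2(&ia[0]);
    assert(ia2 == &ia[0]);          // ia2 addresses the first element of ia
    return std::is_same<decltype(ia2), int*>::value;
}

// Hypothetical helper: true when decltype preserves the array type.
bool decltype_keeps_array() {
    int ia[] = {0,1,2,3,4,5,6,7,8,9};
    decltype(ia) ia3 = {0,1,2,3,4,5,6,7,8,9};   // ia3 is int[10]
    ia3[4] = ia[9];                 // assigning to an element is fine
    return std::is_same<decltype(ia3), int[10]>::value && ia3[4] == 9;
}
```

Both functions should return true: auto applies the array-to-pointer conversion, whereas decltype reports the declared array type.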
Pointers that address elements in an array have additional operations beyond those we described in § 2.3.2 (p. 52). In particular, pointers to array elements support the same operations as iterators on vectors or strings (§ 3.4, p. 106). For example, we can use the increment operator to move from one element in an array to the next:
int arr[] = {0,1,2,3,4,5,6,7,8,9};
int *p = arr; // p points to the first element in arr
++p; // p points to arr[1]
Just as we can use iterators to traverse the elements in a vector, we can use pointers to traverse the elements in an array. Of course, to do so, we need to obtain pointers to the first and one past the last element. As we’ve just seen, we can obtain a pointer to the first element by using the array itself or by taking the address of the first element. We can obtain an off-the-end pointer by using another special property of arrays. We can take the address of the nonexistent element one past the last element of an array:
int *e = &arr[10]; // pointer just past the last element in arr
Here we used the subscript operator to index a nonexistent element; arr has ten elements, so the last element in arr is at index position 9. The only thing we can do with this element is take its address, which we do to initialize e. Like an off-the-end iterator (§ 3.4.1, p. 106), an off-the-end pointer does not point to an element. As a result, we may not dereference or increment an off-the-end pointer.
Using these pointers we can write a loop to print the elements in arr as follows:
for (int *b = arr; b != e; ++b)
    cout << *b << endl; // print the elements in arr
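The loop can be packaged so that its output is easy to inspect. In the sketch below, format_range and format_arr are hypothetical names, and the output goes to a string stream rather than cout. The off-the-end pointer is written as arr + 10, which yields the same pointer value as &arr[10].

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: formats the elements in [first, last), one per line.
std::string format_range(const int *first, const int *last) {
    std::ostringstream out;
    for (const int *b = first; b != last; ++b)
        out << *b << '\n';          // b is dereferenced only while b != last
    return out.str();
}

// Usage: traverse a ten-element array with a first/off-the-end pointer pair.
std::string format_arr() {
    int arr[] = {0,1,2,3,4,5,6,7,8,9};
    int *e = arr + 10;              // one past the last element; never dereferenced
    assert(e - arr == 10);          // the pair spans all ten elements
    return format_range(arr, e);
}
```

The off-the-end pointer serves only as a sentinel for the comparison; the loop never reads through it.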
begin and end Functions
Although we can compute an off-the-end pointer, doing so is error-prone. To make it easier and safer to use pointers, the C++11 library includes two functions, named begin and end. These functions act like the similarly named container members (§ 3.4.1, p. 106). However, arrays are not class types, so these functions are not member functions. Instead, they take an argument that is an array:
int ia[] = {0,1,2,3,4,5,6,7,8,9}; // ia is an array of ten ints
int *beg = begin(ia); // pointer to the first element in ia
int *last = end(ia); // pointer one past the last element in ia
begin returns a pointer to the first, and end returns a pointer one past the last element in the given array. These functions are defined in the iterator header.
Using begin and end, it is easy to write a loop to process the elements in an array. For example, assuming arr is an array that holds int values, we might find the first negative value in arr as follows:
// pbeg points to the first and pend points just past the last element in arr
int *pbeg = begin(arr), *pend = end(arr);
// find the first negative element, stopping if we've seen all the elements
while (pbeg != pend && *pbeg >= 0)
    ++pbeg;
We start by defining two int pointers named pbeg and pend. We position pbeg to denote the first element and pend to point one past the last element in arr. The while condition uses pend to know whether it is safe to dereference pbeg. If pbeg does point at an element, we dereference and check whether the underlying element is negative. If so, the condition fails and we exit the loop. If not, we increment the pointer to look at the next element.
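The search described above can be wrapped in a function, as in the following sketch. The template first_negative_index and the two demo functions are hypothetical names introduced here; returning an index (with the array size meaning "not found") is a choice made for this example, not something the text prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: index of the first negative element, or N if none.
template <std::size_t N>
std::size_t first_negative_index(const int (&arr)[N]) {
    const int *pbeg = std::begin(arr), *pend = std::end(arr);
    // pend guards the dereference: *pbeg is read only when pbeg != pend
    while (pbeg != pend && *pbeg >= 0)
        ++pbeg;
    assert(pbeg == pend || *pbeg < 0);  // either exhausted or found a negative
    return static_cast<std::size_t>(pbeg - std::begin(arr));
}

// Usage: an array whose first negative value sits at index 2.
std::size_t demo_with_negative() {
    int arr[] = {3, 7, -2, 5, -9};
    return first_negative_index(arr);
}

// Usage: no negative values, so the result equals the array size.
std::size_t demo_without_negative() {
    int arr[] = {1, 2, 3};
    return first_negative_index(arr);
}
```

Because the left operand of && is evaluated first, the loop never dereferences the off-the-end pointer.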
A pointer “one past” the end of a built-in array behaves the same way as the iterator returned by the end operation of a vector. In particular, we may not dereference or increment an off-the-end pointer.
Pointers that address array elements can use all the iterator operations listed in Table 3.6 (p. 107) and Table 3.7 (p. 111). These operations—dereference, increment, comparisons, addition of an integral value, subtraction of two pointers—have the same meaning when applied to pointers that point at elements in a built-in array as they do when applied to iterators.
When we add (or subtract) an integral value to (or from) a pointer, the result is a new pointer. That new pointer points to the element the given number ahead of (or behind) the original pointer:
constexpr size_t sz = 5;
int arr[sz] = {1,2,3,4,5};
int *ip = arr; // equivalent to int *ip = &arr[0]
int *ip2 = ip + 4; // ip2 points to arr[4], the last element in arr
The result of adding 4 to ip is a pointer that points to the element four elements further on in the array from the one to which ip currently points.
The result of adding an integral value to a pointer must be a pointer to an element in the same array, or a pointer just past the end of the array:
// ok: arr is converted to a pointer to its first element; p points one past the end of arr
int *p = arr + sz; // use caution -- do not dereference!
int *p2 = arr + 10; // error: arr has only 5 elements; p2 has undefined value
When we add sz to arr, the compiler converts arr to a pointer to the first element in arr. When we add sz to that pointer, we get a pointer that points sz positions (i.e., 5 positions) past the first one. That is, it points one past the last element in arr. Computing a pointer more than one past the last element is an error, although the compiler is unlikely to detect such errors.
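The following sketch exercises both legal results of pointer addition: a pointer to an element, and the one-past-the-end pointer used only for comparison. The function names last_via_arithmetic and sum_up_to_end are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t sz = 5;

// Hypothetical helper: reads the last element through ip + 4.
int last_via_arithmetic() {
    int arr[sz] = {1,2,3,4,5};
    int *ip = arr;              // points to arr[0]
    int *ip2 = ip + 4;          // points to arr[4], the last element
    assert(ip2 == &arr[4]);
    return *ip2;
}

// Hypothetical helper: walks up to arr + sz without ever dereferencing it.
int sum_up_to_end() {
    int arr[sz] = {1,2,3,4,5};
    int *end = arr + sz;        // one past the last element
    int total = 0;
    for (int *p = arr; p != end; ++p)
        total += *p;            // p always points at a real element here
    return total;
}
```

Here last_via_arithmetic returns 5 and sum_up_to_end returns 15; neither function forms a pointer beyond arr + sz.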
As with iterators, subtracting two pointers gives us the distance between those pointers. The pointers must point to elements in the same array:
auto n = end(arr) - begin(arr); // n is 5, the number of elements in arr
The result of subtracting two pointers is a library type named ptrdiff_t. Like size_t, the ptrdiff_t type is a machine-specific type and is defined in the cstddef header. Because subtraction might yield a negative distance, ptrdiff_t is a signed integral type.
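To see a negative distance in practice, we can subtract a later pointer from an earlier one. In this sketch, forward_distance and backward_distance are hypothetical names, and the static_asserts restate the type properties described above.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>

// Subtracting two int* values yields ptrdiff_t, which is a signed type.
static_assert(std::is_same<decltype(std::declval<int*>() - std::declval<int*>()),
                           std::ptrdiff_t>::value,
              "pointer subtraction yields ptrdiff_t");
static_assert(std::is_signed<std::ptrdiff_t>::value, "ptrdiff_t is signed");

// Hypothetical helper: number of elements between begin and end.
std::ptrdiff_t forward_distance() {
    int arr[] = {1,2,3,4,5};
    return std::end(arr) - std::begin(arr);
}

// Hypothetical helper: subtracting in the other direction gives a negative value.
std::ptrdiff_t backward_distance() {
    int arr[] = {1,2,3,4,5};
    int *first = arr, *fourth = arr + 3;
    assert(fourth - first == 3);
    return first - fourth;
}
```

With these arrays, forward_distance returns 5 and backward_distance returns -3.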
We can use the relational operators to compare pointers that point to elements of an array, or one past the last element in that array. For example, we can traverse the elements in arr as follows:
int *b = arr, *e = arr + sz;
while (b < e) {
    // use *b
    ++b;
}
We cannot use the relational operators on pointers to two unrelated objects:
int i = 0, sz = 42;
int *p = &i, *e = &sz;
// unspecified result: p and e are unrelated; comparison is meaningless!
while (p < e)
Although the utility may be obscure at this point, it is worth noting that pointer arithmetic is also valid for null pointers (§ 2.3.2, p. 53) and for pointers that point to an object that is not an array. In the latter case, the pointers must point to the same object, or one past that object. If p is a null pointer, we can add or subtract an integral constant expression (§ 2.4.4, p. 65) whose value is 0 to p. We can also subtract two null pointers from one another, in which case the result is 0.
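A minimal sketch of the two null-pointer cases follows. The function names null_plus_zero_is_null and null_minus_null are hypothetical and serve only this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: adding the constant 0 to a null pointer leaves it null.
bool null_plus_zero_is_null() {
    int *p = nullptr;
    assert(p == nullptr);
    int *q = p + 0;             // allowed: the added value is the constant 0
    return q == nullptr;
}

// Hypothetical helper: the difference between two null pointers is 0.
std::ptrdiff_t null_minus_null() {
    int *p = nullptr, *q = nullptr;
    return p - q;
}
```

Adding any nonzero value to a null pointer is not covered by this rule and should be avoided.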
The result of adding an integral value to a pointer is itself a pointer. Assuming the resulting pointer points to an element, we can dereference the resulting pointer:
int ia[] = {0,2,4,6,8}; // array with 5 elements of type int
int last = *(ia + 4); // ok: initializes last to 8, the value of ia[4]
The expression *(ia + 4) calculates the address four elements past ia and dereferences the resulting pointer. This expression is equivalent to writing ia[4].
Recall that in § 3.4.1 (p. 109) we noted that parentheses are required in expressions that contain dereference and dot operators. Similarly, the parentheses around this pointer addition are essential. Writing
last = *ia + 4; // ok: last = 4, equivalent to ia[0] + 4
means dereference ia and add 4 to the dereferenced value. We’ll cover the reasons for this behavior in § 4.1.2 (p. 136).
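The difference the parentheses make is easy to observe side by side. In this sketch the function names add_then_dereference and dereference_then_add are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: parentheses force the addition to happen first.
int add_then_dereference() {
    int ia[] = {0,2,4,6,8};
    assert(*(ia + 4) == ia[4]);   // the two forms name the same element
    return *(ia + 4);             // value of ia[4]
}

// Hypothetical helper: without parentheses, the dereference happens first.
int dereference_then_add() {
    int ia[] = {0,2,4,6,8};
    return *ia + 4;               // ia[0] + 4
}
```

The first function returns 8 and the second returns 4, even though the two expressions differ only by a pair of parentheses.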
As we’ve seen, in most places when we use the name of an array, we are really using a pointer to the first element in that array. One place where the compiler does this transformation is when we subscript an array. Given
int ia[] = {0,2,4,6,8}; // array with 5 elements of type int
if we write ia[0], that is an expression that uses the name of an array. When we subscript an array, we are really subscripting a pointer to an element in that array:
int i = ia[2]; // ia is converted to a pointer to the first element in ia
// ia[2] fetches the element to which (ia + 2) points
int *p = ia; // p points to the first element in ia
i = *(p + 2); // equivalent to i = ia[2]
We can use the subscript operator on any pointer, as long as that pointer points to an element (or one past the last element) in an array:
int *p = &ia[2]; // p points to the element indexed by 2
int j = p[1]; // p[1] is equivalent to *(p + 1),
// p[1] is the same element as ia[3]
int k = p[-2]; // p[-2] is the same element as ia[0]
This last example points out an important difference between arrays and library types such as vector and string that have subscript operators. The library types force the index used with a subscript to be an unsigned value. The built-in subscript operator does not. The index used with the built-in subscript operator can be a negative value. Of course, the resulting address must point to an element in (or one past the end of) the array to which the original pointer points.
Unlike subscripts for vector and string, the index of the built-in subscript operator is not an unsigned type.
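The sketch below subscripts a pointer into the middle of an array with both a positive and a negative index. The function names subscript_forward and subscript_backward are hypothetical.

```cpp
#include <cassert>

// Hypothetical helper: p[1] steps forward from the element p addresses.
int subscript_forward() {
    int ia[] = {0,2,4,6,8};
    int *p = &ia[2];              // p points to the element indexed by 2
    assert(&p[1] == &ia[3]);      // p[1] is *(p + 1), the same element as ia[3]
    return p[1];
}

// Hypothetical helper: a negative index steps backward, staying inside ia.
int subscript_backward() {
    int ia[] = {0,2,4,6,8};
    int *p = &ia[2];
    assert(&p[-2] == &ia[0]);     // p[-2] is *(p - 2), the same element as ia[0]
    return p[-2];
}
```

Here subscript_forward returns 6 and subscript_backward returns 0. An index such as p[-3] would address memory before the array and is therefore not allowed.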
Exercise 3.34: Given that p1 and p2 point to elements in the same array, what does the following code do? Are there values of p1 or p2 that make this code illegal?
p1 += p2 - p1;
Exercise 3.35: Using pointers, write a program to set the elements in an array to zero.
Exercise 3.36: Write a program to compare two arrays for equality. Write a similar program to compare two vectors.