Objectives
In this chapter, you’ll:
■ Learn what pointers are, and declare and initialize them.
■ Use the address (&) and indirection (*) pointer operators.
■ Compare the capabilities of pointers and references.
■ Use pointers to pass arguments to functions by reference.
■ Use pointer-based arrays and strings mostly in legacy code.
■ Use const with pointers and the data they point to.
■ Use operator sizeof to determine the number of bytes that store a value of a particular type.
■ Understand pointer expressions and pointer arithmetic that you’ll see in legacy code.
■ Use C++11’s nullptr to represent pointers to nothing.
■ Use C++11’s begin and end library functions with pointer-based arrays.
■ Learn various C++ Core Guidelines for avoiding pointers and pointer-based arrays to create safer, more robust programs.
■ Use C++20’s to_array function to convert built-in arrays and initializer lists to std::arrays.
■ Continue our objects-natural approach by using C++20’s class template span to create objects that are views into built-in arrays, std::arrays and std::vectors.
Outline
7.2 Pointer Variable Declarations and Initialization
7.2.3 Null Pointers Prior to C++11
7.3.2 Indirection (*) Operator
7.3.3 Using the Address (&) and Indirection (*) Operators
7.4 Pass-by-Reference with Pointers
7.5.1 Declaring and Accessing a Built-In Array
7.5.2 Initializing Built-In Arrays
7.5.3 Passing Built-In Arrays to Functions
7.5.4 Declaring Built-In Array Parameters
7.5.5 C++11: Standard Library Functions begin and end
7.5.6 Built-In Array Limitations
7.6 C++20: Using to_array to Convert a Built-in Array to a std::array
7.7 Using const with Pointers and the Data They Point To
7.7.1 Using a Nonconstant Pointer to Nonconstant Data
7.7.2 Using a Nonconstant Pointer to Constant Data
7.7.3 Using a Constant Pointer to Nonconstant Data
7.7.4 Using a Constant Pointer to Constant Data
7.9 Pointer Expressions and Pointer Arithmetic
7.9.1 Adding Integers to and Subtracting Integers from Pointers
7.9.2 Subtracting One Pointer from Another
7.9.4 Cannot Dereference a void*
7.10 Objects Natural Case Study: C++20 spans—Views of Contiguous Container Elements
7.11 A Brief Intro to Pointer-Based Strings
7.11.2 Revisiting C++20’s to_array Function
This chapter discusses pointers, built-in pointer-based arrays and pointer-based strings (also called C-strings), each of which C++ inherited from the C programming language.
Pointers are powerful but challenging to work with and error-prone. So, Modern C++ (C++20, C++17, C++14 and C++11) has added features that eliminate the need for most pointers. New software-development projects generally should prefer:
• using references to using pointers,
• using std::array and std::vector objects (Chapter 6) to using built-in pointer-based arrays,1 and
• using std::string objects (Chapters 2 and 8) to pointer-based C-strings.
1. We pronounce “std::” as “standard,” so throughout this chapter we say “a std::array” rather than “an std::array,” which assumes “std::” is pronounced as its individual letters s, t and d.
You’ll encounter pointers, pointer-based arrays and pointer-based strings frequently in the massive installed base of legacy C++ code. Pointers are required to:
• create and manipulate dynamic data structures, like linked lists, queues, stacks and trees that can grow and shrink at execution time—though most programmers will use the C++ standard library’s existing dynamic containers like vector and the containers we discuss in Chapter 17,
• process command-line arguments, which a program receives as a pointer-based array of pointer-based C-strings, and
• pass arguments by reference if there’s a possibility of a nullptr (i.e., a pointer to nothing; Section 7.2.2)—a reference must refer to an actual object.2
2. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-ptr-ref.
We mention C++ Core Guidelines that encourage you to make your code safer and more robust by recommending you use techniques that avoid pointers, pointer-based arrays and pointer-based strings. For example, several guidelines recommend implementing pass-by-reference using references, rather than pointers.3
3. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#S-functions.
For programs that still require pointer-based arrays (e.g., command-line arguments), C++20 adds two new features that help make your programs safer and more robust:
• Function to_array converts a pointer-based array to a std::array, so you can take advantage of the features we demonstrated in Chapter 6.
• spans offer a safer way to pass built-in arrays to functions. They’re iterable, so you can use them with range-based for statements to conveniently process elements without risking out-of-bounds array accesses. Also, because spans are iterable, you can use them with standard library container-processing algorithms, such as accumulate and sort. We’ll cover spans in this chapter’s objects natural case study where you’ll see that they also work with std::array and std::vector.
The key takeaway from reading this chapter is to avoid using pointers, pointer-based arrays and pointer-based strings whenever possible. If you must use them, take advantage of to_array and spans.
We declare and initialize pointers and demonstrate the pointer operators & and *. In Chapter 5, we performed pass-by-reference with references. Here, we show that pointers also enable pass-by-reference. We demonstrate built-in, pointer-based arrays and their intimate relationship with pointers.
We show how to use const with pointers and the data they point to, and we introduce the sizeof operator to determine the number of bytes that store values of particular fundamental types and pointers. We demonstrate pointer expressions and pointer arithmetic.
C-strings were used widely in older C++ software. This chapter briefly introduces C-strings. You’ll see how to process command-line arguments—a simple task for which C++ still requires you to use both pointer-based C-strings and pointer-based arrays.
Pointer variables contain memory addresses as their values. Usually, a variable directly contains a specific value. A pointer contains the memory address of a variable that, in turn, contains a specific value. In this sense, a variable name directly references a value, and a pointer indirectly references a value as shown in the following diagram:
Referencing a value through a pointer as in this diagram is called indirection.
The following declaration declares the variable countPtr to be of type int* (i.e., a pointer to an int value) and is read (right-to-left), “countPtr is a pointer to an int”:
int* countPtr;
This * is not an operator; rather, it indicates that the variable to its right is a pointer. We like to include the letters Ptr in each pointer variable name to make it clear that the variable is a pointer and must be handled accordingly.
Initialize each pointer to nullptr (from C++11) or a memory address. A pointer with the value nullptr “points to nothing” and is known as a null pointer. From this point forward, when we refer to a “null pointer,” we mean a pointer with the value nullptr. Initialize all pointers to prevent pointing to unknown or uninitialized areas of memory.
In earlier C++ versions, the value specified for a null pointer was 0 or NULL. NULL is defined in several standard library headers to represent the value 0. Initializing a pointer to NULL is equivalent to initializing it to 0, but prior to C++11, 0 was used by convention. The value 0 is the only integer value that can be assigned directly to a pointer variable without first casting the integer to a pointer type (generally via a reinterpret_cast; Section 9.8).
The unary operators & and * create pointer values and “dereference” pointers, respectively. We show how to use these operators in the following sections.
7.3.1 Address (&) Operator
The address operator (&) is a unary operator that obtains the memory address of its operand. For example, assuming the declarations
int y{5}; // declare variable y
int* yPtr{nullptr}; // declare pointer variable yPtr
the following statement assigns the address of the variable y to pointer variable yPtr:
yPtr = &y; // assign address of y to yPtr
Variable yPtr is said to “point to” y—yPtr indirectly references the variable y’s value (5).
The & in the preceding statement is not a reference variable declaration, where & is always preceded by a type name. When declaring a reference, the & is part of the type. In an expression like &y, the & is the address operator.
The following diagram shows a memory representation after the previous assignment:
The “pointing relationship” is indicated by drawing an arrow from the box that represents the pointer yPtr in memory to the box that represents the variable y in memory.
The following diagram shows another pointer memory representation with int variable y stored at memory location 600000 and pointer variable yPtr at location 500000:
The address operator’s operand must be an lvalue—the address operator cannot be applied to literals or to expressions that result in temporary values (like the results of calculations).
7.3.2 Indirection (*) Operator
Applying the unary * operator to a pointer results in an lvalue representing the object to which its pointer operand points. This operator is commonly referred to as the indirection operator or dereferencing operator. If yPtr points to y and y contains 5 (as in the preceding diagrams), the statement
cout << *yPtr << endl;
displays y’s value (5), as would the statement
cout << y << endl;
Using * in this manner is called dereferencing a pointer. A dereferenced pointer also can be used as an lvalue in an assignment. The following assigns 9 to y:
*yPtr = 9;
In this statement, *yPtr is an alias for y. The dereferenced pointer may also be used to receive an input value as in
cin >> *yPtr;
which places the input value in y.
Dereferencing an uninitialized pointer results in undefined behavior that could cause a fatal execution-time error. This also could lead to accidentally modifying important data, allowing the program to run to completion, possibly with incorrect results. This is a potential security flaw that an attacker might be able to exploit to access data, overwrite data or even execute malicious code.4,5,6 Dereferencing a null pointer results in undefined behavior and typically causes a fatal execution-time error. In industrial-strength code, ensure that a pointer is not nullptr before dereferencing it.7
4. “Undefined Behavior.” Wikipedia. Wikimedia Foundation, May 30, 2020. https://en.wikipedia.org/wiki/Undefined_behavior.
5. “Common Weakness Enumeration.” CWE. Accessed June 14, 2020. https://cwe.mitre.org/data/definitions/824.html.
6. “Dangling Pointer.” Wikipedia. Wikimedia Foundation, June 8, 2020. https://en.wikipedia.org/wiki/Dangling_pointer.
7. The C++ Core Guidelines recommend using the gsl::not_null class template from the Guidelines Support Library (GSL) to declare pointers that should not have the value nullptr. Throughout this book, we adhere to the C++ Core Guidelines as appropriate. At the time of this writing, the Guidelines Support Library’s gsl::not_null implementation did not produce helpful error messages in our compilers, so we chose not to use gsl::not_null in our code.
7.3.3 Using the Address (&) and Indirection (*) Operators
Figure 7.1 demonstrates the & and * pointer operators, which have the third-highest level of precedence (see Appendix A for the complete operator-precedence chart). Memory locations are output by << in this example as hexadecimal (i.e., base-16) integers. (See Appendix D, Number Systems, for more information on hexadecimal integers.) The output shows that variable a’s address (line 10) and aPtr’s value (line 11) are identical, confirming that a’s address was indeed assigned to aPtr (line 8). The outputs from lines 12–13 confirm that *aPtr has the same value as a. The memory addresses output by this program with cout and << are compiler- and platform-dependent, and typically change with each program execution, so you’ll likely see different addresses.
 1 // fig07_01.cpp
 2 // Pointer operators & and *.
 3 #include <iostream>
 4 using namespace std;
 5
 6 int main() {
 7    constexpr int a{7}; // initialize a with 7
 8    const int* aPtr = &a; // initialize aPtr with address of int variable a
 9
10    cout << "The address of a is " << &a
11       << "\nThe value of aPtr is " << aPtr;
12    cout << "\nThe value of a is " << a
13       << "\nThe value of *aPtr is " << *aPtr << endl;
14 }
The address of a is 002DFD80
The value of aPtr is 002DFD80
The value of a is 7
The value of *aPtr is 7
There are three ways in C++ to pass arguments to a function:
• pass-by-value
• pass-by-reference with a reference argument
• pass-by-reference with a pointer argument.
Chapter 5 showed the first two. Here, we explain pass-by-reference with a pointer. Pointers, like references, can be used to modify variables in the caller or to pass large data objects by reference to avoid the overhead of copying objects. You accomplish pass-by-reference via pointers and the indirection operator (*). When calling a function that receives a pointer, pass a variable’s address by applying the address operator (&) to the variable’s name.
Figures 7.2 and 7.3 present two functions that each cube an integer. Figure 7.2 passes variable number by value (line 12) to function cubeByValue (lines 17–19), which cubes its argument and passes the result back to main using a return statement (line 18). We stored the new value in number (line 12), though that is not required. For instance, the calling function might want to examine the function call’s result before modifying variable number.
 1 // fig07_02.cpp
 2 // Pass-by-value used to cube a variable's value.
 3 #include <iostream>
 4 using namespace std;
 5
 6 int cubeByValue(int n); // prototype
 7
 8 int main() {
 9    int number{5};
10
11    cout << "The original value of number is " << number;
12    number = cubeByValue(number); // pass number by value to cubeByValue
13    cout << "\nThe new value of number is " << number << endl;
14 }
15
16 // calculate and return cube of integer argument
17 int cubeByValue(int n) {
18    return n * n * n; // cube local variable n and return result
19 }
The original value of number is 5
The new value of number is 125
 1 // fig07_03.cpp
 2 // Pass-by-reference with a pointer argument used to cube a
 3 // variable's value.
 4 #include <iostream>
 5 using namespace std;
 6
 7 void cubeByReference(int* nPtr); // prototype
 8
 9 int main() {
10    int number{5};
11
12    cout << "The original value of number is " << number;
13    cubeByReference(&number); // pass number address to cubeByReference
14    cout << "\nThe new value of number is " << number << endl;
15 }
16
17 // calculate cube of *nPtr; modifies variable number in main
18 void cubeByReference(int* nPtr) {
19    *nPtr = *nPtr * *nPtr * *nPtr; // cube *nPtr
20 }
The original value of number is 5
The new value of number is 125
Figure 7.3 passes the variable number to function cubeByReference using pass-by-reference with a pointer argument (line 13)—the address of number is passed to the function. Function cubeByReference (lines 18–20) specifies parameter nPtr (a pointer to int) to receive its argument. The function uses the dereferenced pointer—*nPtr, an alias for number in main—to cube the value to which nPtr points (line 19). This directly changes the value of number in main (line 10). Line 19 can be made clearer with redundant parentheses:
*nPtr = (*nPtr) * (*nPtr) * (*nPtr); // cube *nPtr
A function receiving an address as an argument must define a pointer parameter to receive the address. For example, function cubeByReference’s header (line 18) specifies that the function receives a pointer to an int as an argument, stores the address in nPtr and does not return a value.
Passing a variable by reference with a pointer does not actually pass anything by reference. Rather, a pointer to that variable is passed by value. That pointer value is copied into the function’s corresponding pointer parameter. The called function can then access the caller’s variable by dereferencing the pointer, thus accomplishing pass-by-reference.
Figures 7.4–7.5 graphically analyze the execution of Fig. 7.2 and Fig. 7.3, respectively. The rectangle above a given expression or variable contains the value being produced by a step in the diagram. Each diagram’s right column shows functions cubeByValue (Fig. 7.2) and cubeByReference (Fig. 7.3) only when they’re executing.
Here we present built-in arrays, which like std::arrays are also fixed-size data structures. We include this presentation mostly because you’ll see built-in arrays in legacy C++ code. New applications should use std::array and std::vector to create safer, more robust applications.
In particular, std::array and std::vector objects always know their own size—even when passed to other functions, which is not the case for built-in arrays. If you work on applications containing built-in arrays, you can use C++20’s to_array function to convert them to std::arrays (Section 7.6), or you can process them more safely using C++20’s spans (Section 7.10). There are some cases in which built-in arrays are required, such as receiving command-line arguments, which we demonstrate in Section 7.11.
As with std::array, you must specify a built-in array’s element type and number of elements, but the syntax is different. For example, to reserve five elements for a built-in array of ints named c, use
int c[5]; // c is a built-in array of 5 integers
You use the subscript ([]) operator to access a built-in array’s elements. Recall from Chapter 6 that the subscript ([]) operator does not provide bounds checking for std::arrays—this is also true for built-in arrays. Of course, you can use std::array’s at member function to do bounds checking.
You can initialize the elements of a built-in array using an initializer list. For example,
int n[5]{50, 20, 30, 10, 40};
creates and initializes a built-in array of five ints. If you provide fewer initializers than the number of elements, the remaining elements are value initialized—fundamental numeric types are set to 0, bools are set to false, pointers are set to nullptr and, as we’ll see in Chapter 10, objects receive the default initialization specified by their class definitions. If you provide too many initializers, a compilation error occurs.
The compiler can size a built-in array by counting an initializer list’s elements. For example, the following creates a five-element array:
int n[]{50, 20, 30, 10, 40};
The value of a built-in array’s name is implicitly convertible to a const or non-const pointer to the built-in array’s first element—this is known as decaying to a pointer. So the array name n above is equivalent to &n[0], which is a pointer to the element containing 50. You don’t need to take the address (&) of a built-in array to pass it to a function—you simply pass its name. As you saw in Section 7.4, a function that receives a pointer to a variable in the caller can modify that variable in the caller. For built-in arrays, the called function can modify all the elements in the caller—unless the parameter is declared const. Applying const to a built-in array parameter to prevent the argument array in the caller from being modified in the called function is another example of the principle of least privilege.
You can declare a built-in array parameter in a function header, as follows:
int sumElements(const int values[], size_t numberOfElements)
which indicates that the function’s first argument should be a one-dimensional built-in array of ints that should not be modified by the function. Unlike std::arrays and std::vectors, built-in arrays don’t know their own size, so a function that processes a built-in array should also receive the built-in array’s size.
The preceding header also can be written as
int sumElements(const int* values, size_t numberOfElements)
The compiler does not differentiate between a function that receives a pointer and a function that receives a built-in array. In fact, the compiler converts const int values[] to const int* values under the hood. This means the function must “know” when it’s receiving a built-in array vs. a single variable that’s being passed by reference.
The C++ Core Guidelines specifically say not to pass built-in arrays to functions;8 rather, you should pass C++20 spans because they maintain a pointer to the array’s first element and the array’s size. In Section 7.10, we’ll demonstrate spans and you’ll see that passing a span is superior to passing a built-in array and its size to a function.
8. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ri-array.
7.5.5 C++11: Standard Library Functions begin and end
In Section 6.12, we sorted a std::array of strings called colors as follows:
sort(begin(colors), end(colors)); // sort contents of colors
Functions begin and end specified that the entire std::array should be sorted. Function sort (and many other C++ Standard Library functions) also can be applied to built-in arrays. For example, to sort the built-in array n (Section 7.5.2), you can write
sort(begin(n), end(n)); // sort contents of built-in array n
For a built-in array, begin and end work only in the scope that originally defines the array, which is where the compiler knows the array’s size. Again, you should pass built-in arrays to other functions using C++20 spans, which we demonstrate in Section 7.10.
Built-in arrays have several limitations:
• They cannot be compared using the relational and equality operators—you must use a loop to compare two built-in arrays element by element. If you had two int arrays named array1 and array2, the condition array1 == array2 would always be false, even if the arrays’ contents are identical. Remember, array names decay to const pointers to the arrays’ first elements. And, of course, for separate arrays, those elements reside at different memory locations.
• They cannot be assigned to one another—an array name is effectively a const pointer, so it can’t be changed by assignment.
• They don’t know their own size—a function that processes a built-in array typically receives both the built-in array’s name and its size as arguments.
• They don’t provide automatic bounds checking—you must ensure that array-access expressions use subscripts within the built-in array’s bounds.
7.6 C++20: Using to_array to Convert a Built-in Array to a std::array
In industry, you’ll encounter C++ legacy code that uses built-in arrays. The C++ Core Guidelines say you should prefer std::arrays and std::vectors to built-in arrays because they’re safer, and they do not become pointers when you pass them to functions.9 C++20’s new to_array function10 (header <array>) makes it convenient to create a std::array from a built-in array or an initializer list. Figure 7.6 demonstrates to_array. We use a generic lambda expression (lines 9–13) to display each std::array’s contents. Again, specifying a lambda parameter’s type as auto enables the compiler to infer the parameter’s type, based on the context in which the lambda appears. In this program, the generic lambda automatically determines the type of the std::array over which it iterates.
9. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rsl-arrays.
10. “to_array From LFTS with Updates.” to_array from LFTS with updates—HackMD. Accessed June 14, 2020. http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0325r4.html.
 1 // fig07_06.cpp
 2 // C++20: Creating std::arrays with to_array.
 3 #include <iostream>
 4 #include <array>
 5 using namespace std;
 6
 7 int main() {
 8    // lambda to display a collection of items
 9    const auto display = [](const auto& items) {
10       for (const auto& item : items) {
11          cout << item << " ";
12       }
13    };
14
Using to_array to create a std::array from a Built-In Array
Line 18 creates a three-element std::array of ints by copying the contents of built-in array values1. We use auto to infer the std::array variable’s type and size. If we declare the array’s type and size explicitly and it does not match to_array’s return value, a compilation error occurs. We assign the result to the variable array1. Lines 20 and 21 display the std::array’s size and contents to confirm that it was created correctly.
15    const int values1[3]{10, 20, 30};
16
17    // creating a std::array from a built-in array
18    const auto array1 = to_array(values1);
19
20    cout << "array1.size() = " << array1.size() << " array1: ";
21    display(array1); // use lambda to display contents
22
array1.size() = 3 array1: 10 20 30
Using to_array to create a std::array from an Initializer List
Line 24 shows that to_array can create a std::array from an initializer list. Lines 25 and 26 display the array’s size and contents to confirm that it was created correctly.
23    // creating a std::array from an initializer list
24    const auto array2 = to_array({1, 2, 3, 4});
25    cout << " array2.size() = " << array2.size() << " array2: ";
26    display(array2); // use lambda to display contents
27
28    cout << endl;
29
30 }
array2.size() = 4 array2: 1 2 3 4
7.7 Using const with Pointers and the Data Pointed To
This section discusses how to combine const with pointer declarations to enforce the principle of least privilege. Chapter 5 explained that pass-by-value copies an argument’s value into a function’s parameter. If the copy is modified in the called function, the original value in the caller does not change. In some instances, even the copy of the argument’s value should not be altered in the called function.
If a value does not (or should not) change in the body of a function to which it’s passed, declare the parameter const. Before using a function, check its function prototype to determine the parameters that it can and cannot modify.
There are four ways to pass a pointer to a function:
• a nonconstant pointer to nonconstant data,
• a nonconstant pointer to constant data (Fig. 7.7),
• a constant pointer to nonconstant data (Fig. 7.8) and
• a constant pointer to constant data (Fig. 7.9).
Each combination provides a different level of access privilege.
The highest privileges are granted by a nonconstant pointer to nonconstant data:
• the data can be modified through the dereferenced pointer, and
• the pointer can be modified to point to other data.
Such a pointer’s declaration (e.g., int* countPtr) does not include const.
A nonconstant pointer to constant data is
• a pointer that can be modified to point to any data of the appropriate type, but
• the data to which it points cannot be modified through that pointer.
The declaration for such a pointer places const to the left of the pointer’s type, as in11
const int* countPtr;
The declaration is read from right to left as “countPtr is a pointer to an integer constant” or, more precisely, “countPtr is a nonconstant pointer to an integer constant.”
11. Some programmers prefer to write this as int const* countPtr;. They’d read this declaration from right to left as “countPtr is a pointer to a constant integer.”
Figure 7.7 demonstrates the GNU C++ compilation error produced when you try to modify data via a nonconstant pointer to constant data.
1 // fig07_07.cpp
2 // Attempting to modify data through a
3 // nonconstant pointer to constant data.
4
5 int main() {
6    int y{0};
7    const int* yPtr{&y};
8    *yPtr = 100; // error: cannot modify a const object
9 }
GNU C++ compiler error message:
fig07_07.cpp: In function 'int main()':
fig07_07.cpp:8:10: error: assignment of read-only location '* yPtr'
8 | *yPtr = 100; // error: cannot modify a const object
| ~~~~~~^~~~~
Use pass-by-value to pass fundamental-type arguments (e.g., ints, doubles, etc.) unless the called function must directly modify the value in the caller. This is another example of the principle of least privilege. If large objects do not need to be modified by a called function, pass them using references to constant data or using pointers to constant data—though references are preferred. This gives the performance benefits of pass-by-reference and avoids the copy overhead of pass-by-value. Passing large objects using references to constant data or pointers to constant data also offers the security of pass-by-value.
A constant pointer to nonconstant data is a pointer that
• always points to the same memory location, and
• the data at that location can be modified through the pointer.
Pointers that are declared const must be initialized when they’re declared, but if the pointer is a function parameter, it’s initialized with the pointer that’s passed to the function. Each successive call to the function reinitializes that function parameter.
Figure 7.8 attempts to modify a constant pointer. Line 9 declares pointer ptr to be of type int* const. The declaration is read from right to left as “ptr is a constant pointer to a nonconstant integer.” The pointer is initialized with the address of integer variable x. Line 12 attempts to assign the address of y to ptr, but the compiler generates an error message. No error occurs when line 11 assigns the value 7 to *ptr. The nonconstant value to which ptr points can be modified using the dereferenced ptr, even though ptr itself has been declared const.
 1 // fig07_08.cpp
 2 // Attempting to modify a constant pointer to nonconstant data.
 3
 4 int main() {
 5    int x, y;
 6
 7    // ptr is a constant pointer to an integer that can be modified
 8    // through ptr, but ptr always points to the same memory location.
 9    int* const ptr{&x}; // const pointer must be initialized
10
11    *ptr = 7; // allowed: *ptr is not const
12    ptr = &y; // error: ptr is const; cannot assign to it a new address
13 }
Microsoft Visual C++ compiler error message:
error C3892: 'ptr': you cannot assign to a variable that is const
The minimum access privileges are granted by a constant pointer to constant data:
• such a pointer always points to the same memory location, and
• the data at that location cannot be modified via the pointer.
Figure 7.9 declares pointer variable ptr to be of type const int* const (line 12). This declaration is read from right to left as “ptr is a constant pointer to an integer constant.” The figure shows the Apple Clang compiler’s error messages for attempting to modify the data to which ptr points (line 16) and attempting to modify the address stored in the pointer variable (line 17). In line 14, no errors occur, because neither the pointer nor the data it points to is being modified.
 1 // fig07_09.cpp
 2 // Attempting to modify a constant pointer to constant data.
 3 #include <iostream>
 4 using namespace std;
 5
 6 int main() {
 7    int x{5}, y;
 8
 9    // ptr is a constant pointer to a constant integer.
10    // ptr always points to the same location; the integer
11    // at that location cannot be modified.
12    const int* const ptr{&x};
13
14    cout << *ptr << endl;
15
16    *ptr = 7; // error: *ptr is const; cannot assign new value
17    ptr = &y; // error: ptr is const; cannot assign new address
18 }
Apple Clang compiler error messages:
fig07_09.cpp:16:9: error: read-only variable is not assignable
*ptr = 7; // error: *ptr is const; cannot assign new value
~~~~ ^
fig07_09.cpp:17:8: error: cannot assign to variable 'ptr' with const-qualified type 'const int *const'
ptr = &y; // error: ptr is const; cannot assign new address
~~~ ^
7.8 sizeof Operator
The compile-time unary operator sizeof determines the size in bytes of a built-in array or of any other data type, variable or constant during program compilation. When applied to a built-in array’s name, as in Fig. 7.10 (line 12),12 sizeof returns the total number of bytes in the built-in array as a value of type size_t. The computer we used to compile this program stores double variables in 8 bytes of memory. numbers is declared to have 20 elements (line 10), so it uses 160 bytes in memory. Applying sizeof to a pointer parameter (line 20) in a function that receives a built-in array returns the size of the pointer in bytes (4 on the system we used), not the built-in array’s size.
12. This is a mechanical example to demonstrate how sizeof works. If you use static code-analysis tools, such as the C++ Core Guidelines checker in Microsoft Visual Studio, you’ll receive warnings because you should not pass built-in arrays to functions.
 1  // fig07_10.cpp
 2  // Sizeof operator when used on a built-in array's name
 3  // returns the number of bytes in the built-in array.
 4  #include <iostream>
 5  using namespace std;
 6
 7  size_t getSize(double* ptr); // prototype
 8
 9  int main() {
10     double numbers[20]; // 20 doubles; occupies 160 bytes on our system
11
12     cout << "The number of bytes in the array is " << sizeof(numbers);
13
14     cout << "\nThe number of bytes returned by getSize is "
15        << getSize(numbers) << endl;
16  }
17
18  // return size of ptr
19  size_t getSize(double* ptr) {
20     return sizeof(ptr);
21  }
The number of bytes in the array is 160
The number of bytes returned by getSize is 4
Figure 7.11 uses sizeof to calculate the number of bytes used to store various standard data types. The output was produced using the Apple Clang compiler in Xcode. Type sizes are platform dependent. When we run this program on our Windows system, for example, long is 4 bytes and long long is 8 bytes, whereas on our Mac, they’re both 8 bytes. In this example,13 lines 7–15 implicitly initialize each variable to 0 using a C++11 empty initializer list, {}.
13. Line 16 uses const rather than constexpr to prevent a type mismatch compilation error. The name of the built-in array of ints (line 15) decays to a const int*, so we must declare ptr with that type.
 1  // fig07_11.cpp
 2  // sizeof operator used to determine standard data type sizes.
 3  #include <iostream>
 4  using namespace std;
 5
 6  int main() {
 7     constexpr char c{}; // variable of type char
 8     constexpr short s{}; // variable of type short
 9     constexpr int i{}; // variable of type int
10     constexpr long l{}; // variable of type long
11     constexpr long long ll{}; // variable of type long long
12     constexpr float f{}; // variable of type float
13     constexpr double d{}; // variable of type double
14     constexpr long double ld{}; // variable of type long double
15     constexpr int array[20]{}; // built-in array of int
16     const int* const ptr{array}; // variable of type const int* const
17
18     cout << "sizeof c = " << sizeof c
19        << " sizeof(char) = " << sizeof(char)
20        << "\nsizeof s = " << sizeof s
21        << " sizeof(short) = " << sizeof(short)
22        << "\nsizeof i = " << sizeof i
23        << " sizeof(int) = " << sizeof(int)
24        << "\nsizeof l = " << sizeof l
25        << " sizeof(long) = " << sizeof(long)
26        << "\nsizeof ll = " << sizeof ll
27        << " sizeof(long long) = " << sizeof(long long)
28        << "\nsizeof f = " << sizeof f
29        << " sizeof(float) = " << sizeof(float)
30        << "\nsizeof d = " << sizeof d
31        << " sizeof(double) = " << sizeof(double)
32        << "\nsizeof ld = " << sizeof ld
33        << " sizeof(long double) = " << sizeof(long double)
34        << "\nsizeof array = " << sizeof array
35        << "\nsizeof ptr = " << sizeof ptr << endl;
36  }
sizeof c = 1 sizeof(char) = 1
sizeof s = 2 sizeof(short) = 2
sizeof i = 4 sizeof(int) = 4
sizeof l = 8 sizeof(long) = 8
sizeof ll = 8 sizeof(long long) = 8
sizeof f = 4 sizeof(float) = 4
sizeof d = 8 sizeof(double) = 8
sizeof ld = 16 sizeof(long double) = 16
sizeof array = 80
sizeof ptr = 8
The number of bytes used to store a particular data type may vary among systems and compilers. When writing programs that depend on data type sizes, always use sizeof to determine the number of bytes used to store the data types.
Operator sizeof can be applied to any expression or type name. When applied to a variable name (which is not a built-in array’s name) or other expression, the number of bytes used to store the corresponding type is returned. The parentheses used with sizeof are required only if a type name (e.g., int) is supplied as its operand; they are optional when the operand is an expression. Remember that sizeof is a compile-time operator, so its operand is not evaluated at runtime.
C++ enables pointer arithmetic—arithmetic operations that may be performed on pointers. This section describes the operators that have pointer operands and how these operators are used with pointers.
CG Pointer arithmetic is appropriate only for pointers that point to built-in array elements. You’re likely to encounter pointer arithmetic in legacy code. However, the C++ Core Guidelines indicate that a pointer should refer only to a single object (not an array),14 and that you should not use pointer arithmetic because it’s highly error prone.15 If you need to process built-in arrays, use C++20 spans instead (Section 7.10).
14. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Res-ptr.
15. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#SS-bounds.
Valid pointer arithmetic operations are:
• incrementing (++) or decrementing (--),
• adding an integer to a pointer (+ or +=) or subtracting an integer from a pointer (- or -=), and
• subtracting one pointer from another of the same type.
Subtracting pointers is appropriate only for two pointers that point to elements of the same built-in array.
Most computers today have four-byte (32-bit) or eight-byte (64-bit) integers, though some of the billions of resource-constrained Internet of Things (IoT) devices are built using 8-bit or 16-bit hardware. Integer sizes typically are based on the hardware architecture, so such hardware might use one- or two-byte integers, respectively. The results of pointer arithmetic depend on the size of the memory objects a pointer points to, so pointer arithmetic is machine-dependent.
Assume that int v[5] has been declared and that its first element is at memory location 3000. Assume that pointer vPtr has been initialized to point to v[0] (i.e., the value of vPtr is 3000). The following diagram illustrates this situation for a machine with four-byte integers:
Variable vPtr can be initialized to point to v with either of the following statements (because a built-in array’s name implicitly converts to the address of its zeroth element):
int* vPtr{v};
int* vPtr{&v[0]};
In conventional arithmetic, the addition 3000 + 2 yields the value 3002. This is normally not the case with pointer arithmetic. Adding an integer to or subtracting an integer from a pointer increments or decrements the pointer by that integer times the size of the type to which the pointer refers. The number of bytes depends on the memory object’s data type. For example, the statement
vPtr += 2;
would produce 3008 (from the calculation 3000 + 2 * 4), assuming that an int is stored in four bytes of memory. In the built-in array v, vPtr would now point to v[2] as in the diagram below:
If vPtr had been incremented to 3016, which points to v[4], the statement
vPtr -= 4;
would set vPtr back to 3000, the beginning of the built-in array. If a pointer is being incremented or decremented by one, the increment (++) and decrement (--) operators can be used. Each of the statements
++vPtr;
vPtr++;
increments the pointer to point to the built-in array’s next element. Each of the statements
--vPtr;
vPtr--;
decrements the pointer to point to the built-in array’s previous element.
CG There’s no bounds checking on pointer arithmetic, so the C++ Core Guidelines recommend using std::spans instead, which we demonstrate in Section 7.10. You must ensure that every pointer arithmetic operation that adds an integer to or subtracts an integer from a pointer results in a pointer that references an element within the built-in array’s bounds. As you’ll see, std::spans keep track of their size, which helps you avoid these errors.
Pointer variables pointing to the same built-in array may be subtracted from one another. For example, if vPtr contains the address 3000 and v2Ptr contains the address 3008, the statement
x = v2Ptr - vPtr;
would assign to x the number of built-in array elements from vPtr to v2Ptr (in this case, 2). Pointer arithmetic is meaningful only on a pointer that points to a built-in array. We cannot assume that two variables of the same type are stored contiguously in memory unless they’re adjacent elements of a built-in array. Subtracting two pointers, or comparing them with relational operators, is a logic error if they do not refer to elements of the same built-in array.
A pointer can be assigned to another pointer if both pointers are of the same type.16 The exception to this rule is the pointer to void (i.e., void*), which is a pointer capable of representing any pointer type. Any pointer to a fundamental type or class type can be assigned to a pointer of type void* without casting. However, a pointer of type void* cannot be assigned directly to a pointer of another type. The pointer of type void* must first be cast to the proper pointer type (generally via a reinterpret_cast; discussed in Section 9.8).
16. Of course, const pointers cannot be modified.
void*
A void* pointer cannot be dereferenced. For example, the compiler “knows” that an int* points to four bytes of memory on a machine with four-byte integers. Dereferencing an int* creates an lvalue that is an alias for the int’s four bytes in memory. A void*, however, simply contains a memory address for an unknown data type. You cannot dereference a void* because the compiler does not know the type of the data to which the pointer refers and therefore cannot determine how many bytes to access.
The allowed operations on void* pointers are:
• comparing void* pointers with other pointers,
• casting void* pointers to other pointer types and
• assigning addresses to void* pointers.
All other operations on void* pointers are compilation errors.
Pointers can be compared using equality and relational operators. Relational comparisons (using <, <=, > and >=) are meaningless unless the pointers point to elements of the same built-in array. Pointer comparisons compare the addresses stored in the pointers. Comparing two pointers pointing to the same built-in array could show, for example, that one pointer points to a higher-numbered element than the other. A common use of pointer comparison is determining whether a pointer has the value nullptr (i.e., a pointer to nothing).
spans: Views of Contiguous Container Elements
We now continue our objects-natural approach by taking C++20 span objects for a spin. A span (header <span>) enables programs to view contiguous elements of a container, such as a built-in array, a std::array or a std::vector. A span is a “view” into a container: it “sees” the container’s contents, but does not have its own copy of the container’s data.
CG Earlier, we discussed how C++ built-in arrays decay to pointers when passed to functions. In particular, the function’s parameter loses the size information that was provided when you declared the array. You saw this in our sizeof demonstration in Fig. 7.10. The C++ Core Guidelines recommend passing built-in arrays to functions as spans,17 which represent both a pointer to the array’s first element and the array’s size. Figure 7.12 demonstrates some key span capabilities.
17. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rr-ap.
 1  // fig07_12.cpp
 2  // C++20 spans: Creating views into containers.
 3  #include <array>
 4  #include <iostream>
 5  #include <numeric>
 6  #include <span>
 7  #include <vector>
 8  using namespace std;
 9
displayArray
Security CG Passing a built-in array to a function typically requires both the array’s name and the array’s size. The parameter items (line 12), though declared with [], is simply a pointer to a const int. The pointer does not “know” how many elements the function’s argument contains. There are various problems with this approach. For instance, the code that calls displayArray could pass the wrong value for size. In this case, the function might not process all of items’ elements, or the function might access an element outside items’ bounds, which is a logic error and a potential security issue. In addition, we previously discussed the disadvantages of external iteration, as used in lines 13–15. The C++ Core Guidelines checker in Visual Studio issues several warnings about displayArray and passing built-in arrays to functions. We include function displayArray in this example only for comparison with passing spans in function displaySpan, which is the recommended approach.
10  // items parameter is treated as a const int* so we also need the size to
11  // know how to iterate over items with counter-controlled iteration
12  void displayArray(const int items[], size_t size) {
13     for (size_t i{0}; i < size; ++i) {
14        cout << items[i] << " ";
15     }
16  }
17
displaySpan
CG The C++ Core Guidelines indicate that a pointer should point only to one object, not an array,18 and that functions like displayArray, which receive a pointer and a size, are error-prone.19 To fix these issues, you should pass arrays to functions using spans, as in displaySpan (lines 20–24), which receives a span containing const ints because the function does not need to modify the data to display it.
18. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Res-ptr.
19. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Ri-array.
18  // span parameter contains both the location of the first item
19  // and the number of elements, so we can iterate using range-based for
20  void displaySpan(span<const int> items) {
21     for (const auto& item : items) { // spans are iterable
22        cout << item << " ";
23     }
24  }
25
CG PERF A span encapsulates both a pointer and a count of the number of contiguous elements. When you pass a built-in array to displaySpan, C++ implicitly creates a span containing a pointer to the array’s first element and the array’s size, which the compiler can determine from the array’s declaration. This span is a view of the data in the original array that you pass as an argument. The C++ Core Guidelines indicate that you can pass a span by value because it’s just as efficient as passing the pointer and size separately,20 as we did in displayArray.
20. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-range.
Security A span has many capabilities similar to arrays and vectors, such as iteration via the range-based for statement. Because a span is created based on the array’s original size as determined by the compiler, the range-based for guarantees that we cannot access an element outside the bounds of the array that the span views, thus fixing the various problems associated with displayArray and helping prevent security issues like buffer overflows.
times2
A span is a view into an existing container, so changing the span’s elements changes the container’s original data. Function times2 multiplies every item in its span<int> by 2. Note that we use a non-const reference to modify each element that the span views.
26  // spans can be used to modify elements in the original data structure
27  void times2(span<int> items) {
28     for (int& item : items) {
29        item *= 2;
30     }
31  }
32
Lines 34–36 create the int built-in array values1, the std::array values2 and the std::vector values3. Each has five elements and stores its elements contiguously in memory. Line 41 calls displayArray to display values1’s contents. The displayArray function’s first parameter is a pointer to an int, so we cannot use a std::array’s or std::vector’s name to pass these objects to displayArray.
33  int main() {
34     int values1[5]{1, 2, 3, 4, 5};
35     array<int, 5> values2{6, 7, 8, 9, 10};
36     vector<int> values3{11, 12, 13, 14, 15};
37
38     // must specify size because the compiler treats displayArray's items
39     // parameter as a pointer to the first element of the argument
40     cout << "values1 via displayArray: ";
41     displayArray(values1, 5);
42

values1 via displayArray: 1 2 3 4 5
spans and Passing Them to Functions
Line 46 calls displaySpan with values1 as an argument. The function’s parameter was declared as
span<const int>
so C++ creates a span containing a const int* that points to the array’s first element and the array’s size, which the compiler gets from the declaration of values1 (line 34). Because spans can view any contiguous sequence of elements, you may also pass a std::array or std::vector of int to displaySpan, and C++ will create an appropriate span representing a pointer to the container’s first element and the container’s size. This makes function displaySpan more flexible than displayArray, which could receive only the built-in array in this example.
43     // compiler knows values1's size and automatically creates a span
44     // representing &values1[0] and the array's length
45     cout << "\nvalues1 via displaySpan: ";
46     displaySpan(values1);
47
48     // compiler also can create spans from std::arrays and std::vectors
49     cout << "\nvalues2 via displaySpan: ";
50     displaySpan(values2);
51     cout << "\nvalues3 via displaySpan: ";
52     displaySpan(values3);
53

values1 via displaySpan: 1 2 3 4 5
values2 via displaySpan: 6 7 8 9 10
values3 via displaySpan: 11 12 13 14 15
span’s Elements Modifies the Original Data
As we mentioned, function times2 multiplies its span’s elements by 2. Line 55 calls times2 with values1 as an argument. The function’s parameter was declared as
span<int>
so C++ creates a span containing an int* that points to the array’s first element and the array’s size, which the compiler gets from the declaration of values1 (line 34). To prove that times2 modified the original array’s data, line 57 displays values1’s updated values. Like displaySpan, times2 can be called with this program’s std::array or std::vector as well.
54     // changing a span's contents modifies the original data
55     times2(values1);
56     cout << "\nvalues1 after times2 modifies its span argument: ";
57     displaySpan(values1);
58

values1 after times2 modifies its span argument: 2 4 6 8 10
You can explicitly create spans and interact with them. Line 60 creates a span<int> that views the data in values1. Lines 61–62 demonstrate the span’s front and back member functions, which return the first and last element of the view, and thus, the first and last element of the built-in array values1, respectively.
59     // spans have various array-and-vector-like capabilities
60     span<int> mySpan{values1};
61     cout << "\nmySpan's first element: " << mySpan.front()
62        << "\nmySpan's last element: " << mySpan.back();
63

mySpan's first element: 2
mySpan's last element: 10
CG An essential philosophy of the C++ Core Guidelines is to “prefer compile-time checking to runtime checking.”21 This enables the compiler to find and report errors at compile-time, rather than you having to write code to help prevent runtime errors. In line 60, the compiler determines the span’s size (5) from the values1 declaration in line 34. You can state the span’s size, as in
span<int, 5> mySpan{values1};
In this case, the compiler ensures that the span’s declared size matches values1’s size; otherwise, a compilation error occurs.
21. C++ Core Guidelines. Accessed June 14, 2020. https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rp-compile-time.
span with the Standard Library’s accumulate Algorithm
As you’ve seen in this example, spans are iterable. This means you also can use the begin and end functions with spans to pass them to C++ standard library algorithms, such as accumulate (line 66) or sort. We cover standard library algorithms in depth in Chapter 18.
64     // spans can be used with standard library algorithms
65     cout << "\nSum of mySpan's elements: "
66        << accumulate(begin(mySpan), end(mySpan), 0);
67

Sum of mySpan's elements: 30
Sometimes, you might want to process subsets of the data a span views. A span’s first, last and subspan member functions create subviews, which are themselves views. Lines 70 and 72 use first and last to get spans representing the first three and last three elements of values1, respectively. Line 74 uses subspan to get a span that views the 3 elements starting from index 1. In each case, we pass the subview’s span to displaySpan to confirm what the span represents.
68     // spans can be used to create subviews of a container
69     cout << "\nFirst three elements of mySpan: ";
70     displaySpan(mySpan.first(3));
71     cout << "\nLast three elements of mySpan: ";
72     displaySpan(mySpan.last(3));
73     cout << "\nMiddle three elements of mySpan: ";
74     displaySpan(mySpan.subspan(1, 3));
75

First three elements of mySpan: 2 4 6
Last three elements of mySpan: 6 8 10
Middle three elements of mySpan: 4 6 8
A subview of non-const data can modify that data. Line 77 passes to function times2 a span that views the 3 elements starting from index 1 of values1. Line 79 displays the updated values1 elements to confirm the results.
76     // changing a subview's contents modifies the original data
77     times2(mySpan.subspan(1, 3));
78     cout << "\nvalues1 after modifying middle three elements via span: ";
79     displaySpan(values1);
80

values1 after modifying middle three elements via span: 2 8 12 16 10
[] Operator
As with built-in arrays, std::arrays and std::vectors, you can access and modify span elements via the [] operator. Line 82 displays the element at index 2. Line 85 attempts to access an element that does not exist. On the Microsoft Visual C++ compiler, this results in an exception that displays the message,22 "Expression: span index out of range".
22. At the time of this writing, the draft C++20 standard document makes no mention of the [] operator throwing an exception. Neither GNU C++ nor Apple Clang C++ throws an exception on line 85. They simply display whatever is in that memory location.
81     // access a span element via []
82     cout << "\nThe element at index 2 is: " << mySpan[2];
83
84     // attempt to access an element outside the bounds
85     cout << "\nThe element at index 10 is: " << mySpan[10] << endl;
86  }

The element at index 2 is: 12
We’ve already used the C++ Standard Library string class to represent strings as full-fledged objects. Chapter 8 presents class std::string in detail. This section introduces C-style, pointer-based strings (as defined by the C programming language). Here, we’ll refer to these as C-strings or strings and use std::string when referring to the C++ standard library’s string class.
Security std::string is preferred because it eliminates many of the security problems and bugs that can be caused by manipulating C-strings. However, there are some cases in which C-strings are required, such as reading in command-line arguments. Also, if you work with legacy C and C++ programs, you’re likely to encounter pointer-based strings. We cover C-strings in detail in Appendix F.
Characters are the fundamental building blocks of C++ source programs. Every program is composed of characters that, when grouped meaningfully, are interpreted by the compiler as instructions and data used to accomplish a task. A program may contain character constants, each of which is an integer value represented as a character in single quotes. The value of a character constant is the integer value of the character in the machine’s character set. For example, 'z' represents the integer value of z (122 in the ASCII character set; see Appendix B), and '\n' represents the integer value of newline (10 in the ASCII character set).
A C-string (also called a pointer-based string) is a built-in array of characters ending with a null character ('\0'), which marks where the string terminates in memory. A C-string is accessed via a pointer to its first character (no matter how long the string is). The result of sizeof for a string literal (which is a C-string) is the length of the string, including the terminating null character.
A string literal may be used as an initializer in the declaration of either a built-in array of chars or a variable of type const char*. The declarations
char color[]{"blue"};
const char* colorPtr{"blue"};
each initialize a variable to the string "blue". The first declaration creates a five-element built-in array color containing the characters 'b', 'l', 'u', 'e' and '\0'. The second declaration creates pointer variable colorPtr that points to the letter b in the string "blue" (which ends in '\0') somewhere in memory. The first declaration above also may be implemented using an initializer list of individual characters, as in:
char color[]{'b', 'l', 'u', 'e', '\0'};
String literals exist for the duration of the program. They may be shared if the same string literal is referenced from multiple locations in a program. String literals are immutable—they cannot be modified.
Not allocating sufficient space in a built-in array of chars to store the null character that terminates a string is a logic error. Creating or using a C-string that does not contain a terminating null character can lead to logic errors.
Security When storing a string of characters in a built-in array of chars, be sure that the built-in array is large enough to hold the largest string that will be stored. C++ allows strings of any length. If a string is longer than the built-in array of chars in which it’s to be stored, characters beyond the end of the built-in array will overwrite subsequent memory locations. This could lead to logic errors, program crashes or security breaches.
A built-in array of chars representing a null-terminated string can be output with cout and <<. The statement
cout << sentence;
displays the built-in array sentence. cout does not care how large the built-in array of chars is. The characters are output until a terminating null character is encountered; the null character is not displayed. cin and cout assume that built-in arrays of chars should be processed as strings terminated by null characters. cin and cout do not provide similar input and output processing capabilities for other built-in array types.
There are cases in which built-in arrays and C-strings must be used, such as processing a program’s command-line arguments, which are often passed to applications to specify configuration options, file names to process and more.
You supply command-line arguments to a program by placing them after the program’s name when executing it from the command line. Such arguments typically pass options to a program. For example, on a Windows system, the command
dir /p
uses the /p argument to list the contents of the current directory, pausing after each screen of information. Similarly, on Linux or macOS, the following command uses the -la argument to list the contents of the current directory with details about each file and directory:
ls -la
Command-line arguments are passed into a C++ program as C-strings, and the application name is treated as the first command-line argument. To use the arguments as std::strings or other data types (int, double, etc.), you must convert the arguments to those types. Figure 7.13 displays the number of command-line arguments passed to the program, then displays each argument on a separate line of output.
 1  // fig07_13.cpp
 2  // Reading in command-line arguments.
 3  #include <iostream>
 4  using namespace std;
 5
 6  int main(int argc, char* argv[]) {
 7     cout << "There were " << argc << " command-line arguments:\n";
 8     for (int i{0}; i < argc; ++i) {
 9        cout << argv[i] << endl;
10     }
11  }

fig07_13 Amanda Green 97
There were 4 command-line arguments:
fig07_13
Amanda
Green
97
To receive command-line arguments, declare main with two parameters (line 6), which by convention are named argc and argv, respectively. The first is an int representing the number of arguments. The second is a char* built-in array. The first element of the array is a C-string for the application name. The remaining elements are C-strings for the other command-line arguments.
The command
fig07_13 Amanda Green 97
passes "Amanda", "Green" and "97" to the application fig07_13 (on macOS and Linux you’d run this program with "./fig07_13"). Command-line arguments are separated by white space, not commas. When this command executes, fig07_13’s main function receives the argument count 4 and a four-element array of C-strings:
• argv[0] contains the application’s name "fig07_13" (or "./fig07_13" on macOS or Linux), and
• argv[1] through argv[3] contain "Amanda", "Green" and "97", respectively.
You determine how to use these arguments in your program.
to_array Function
Section 7.6 demonstrated converting built-in arrays to std::arrays with to_array. Figure 7.14 shows another purpose of to_array. We use the same lambda expression (lines 9–13) as in Fig. 7.6 to display the std::array contents after each to_array call.
 1  // fig07_14.cpp
 2  // C++20: Creating std::arrays from string literals with to_array.
 3  #include <iostream>
 4  #include <array>
 5  using namespace std;
 6
 7  int main() {
 8     // lambda to display a collection of items
 9     const auto display = [](const auto& items) {
10        for (const auto& item : items) {
11           cout << item << " ";
12        }
13     };
14
std::array from a String Literal Creates a One-Element array
Function to_array fixes an issue with initializing a std::array from a string literal. Rather than creating a std::array of the individual characters in the string literal, line 17 creates a one-element array containing a const char* pointing to the C-string "abc".
15     // initializing an array with a string literal
16     // creates a one-element array<const char*>
17     const auto array1 = array{"abc"};
18     cout << "\narray1.size() = " << array1.size() << "\narray1: ";
19     display(array1); // use lambda to display contents
20

array1.size() = 1
array1: abc
to_array Creates a std::array of char
On the other hand, passing a string literal to to_array (line 22) creates a std::array of chars containing elements for each character and the terminating null character. Line 23 confirms that the array’s size is 6. Line 24 confirms the array’s contents. The null character does not have a visual representation, so it does not appear in the output.
21     // creating std::array of characters from a string literal
22     const auto array2 = to_array("C++20");
23     cout << "\narray2.size() = " << array2.size() << "\narray2: ";
24     display(array2); // use lambda to display contents
25
26     cout << endl;
27  }

array2.size() = 6
array2: C + + 2 0
In later chapters, we’ll introduce additional pointer topics:
• In Chapter 13, Object-Oriented Programming: Polymorphism, we’ll use pointers with class objects to show that the “runtime polymorphic processing” associated with object-oriented programming can be performed with references or pointers—you should favor references.
• In Chapter 14, Operator Overloading, we introduce dynamic memory management with pointers, which allows you at execution time to create and destroy objects as needed. Improperly managing this process is a source of subtle errors, such as “memory leaks.” We’ll show how “smart pointers” can automatically manage memory and other resources that should be returned to the operating system when they’re no longer needed.
• In Chapter 18, Standard Library Algorithms, we show that a function’s name is also a pointer to its implementation in memory, and that functions can be passed into other functions via function pointers—exactly as lambda expressions are.
This chapter discussed pointers, built-in pointer-based arrays and pointer-based strings (C-strings). We pointed out Modern C++ guidelines that recommend avoiding most pointers: preferring references to pointers, std::array and std::vector objects to built-in arrays,23 and std::string objects to C-strings.
23. We pronounce “std::” as “standard,” so throughout this chapter we say “a std::array” rather than “an std::array,” which assumes “std::” is pronounced as its individual letters s, t and d.
We declared and initialized pointers and demonstrated the pointer operators & and *. We showed that pointers enable pass-by-reference, but you should generally prefer references for that purpose. We used built-in, pointer-based arrays and showed their intimate relationship with pointers.
We discussed various combinations of const with pointers and the data they point to and used the sizeof operator to determine the number of bytes that store values of particular fundamental types and pointers. We demonstrated pointer expressions and pointer arithmetic.
We briefly discussed C-strings, then showed how to process command-line arguments, a simple task for which C++ still requires you to use both pointer-based C-strings and pointer-based arrays.
As a reminder, the key takeaway from reading this chapter is that you should avoid using pointers, pointer-based arrays and pointer-based strings whenever possible. For programs that still use pointer-based arrays, you can use C++20’s to_array function to convert built-in arrays to std::arrays and C++20’s spans as a safer way to process built-in pointer-based arrays. In the next chapter, we discuss typical string-manipulation operations provided by std::string and introduce file-processing capabilities.