8.10 Pointer-Based Strings (Optional)

We’ve already used the C++ Standard Library string class to represent strings as full-fledged objects. Chapter 21 presents class string in detail. This section introduces C-style, pointer-based strings (as defined by the C programming language), which we’ll simply call C strings. C++’s string class is preferred for use in new programs, because it eliminates many of the security problems and bugs that can be caused by manipulating C strings. We cover C strings here for a deeper understanding of pointers and built-in arrays, and because there are some cases (such as command-line arguments) in which C string processing is required. Also, if you work with legacy C and C++ programs, you’re likely to encounter pointer-based strings. We cover C strings in detail in Appendix F.

Characters and Character Constants

Characters are the fundamental building blocks of C++ source programs. Every program is composed of a sequence of characters that, when grouped together meaningfully, is interpreted by the compiler as instructions and data used to accomplish a task. A program may contain character constants. A character constant is an integer value represented as a character in single quotes. The value of a character constant is the integer value of the character in the machine’s character set. For example, 'z' represents the integer value of z (122 in the ASCII character set; see Appendix B), and '\n' represents the integer value of newline (10 in the ASCII character set).

Strings

A string is a series of characters treated as a single unit. A string may include letters, digits and various special characters such as +, -, *, / and $. String literals, or string constants, in C++ are written in double quotation marks as follows:


"John Q. Doe"            (a name)
"9999 Main Street"       (a street address)
"Maynard, Massachusetts" (a city and state)
"(201) 555-1212"         (a telephone number)

Pointer-Based Strings

A pointer-based string is a built-in array of characters ending with a null character ('\0'), which marks where the string terminates in memory. A string is accessed via a pointer to its first character. The result of sizeof for a string literal is the length of the string including the terminating null character.

String Literals as Initializers

A string literal may be used as an initializer in the declaration of either a built-in array of chars or a variable of type const char*. The declarations


char color[]{"blue"};
const char* colorPtr{"blue"};

each initialize a variable to the string "blue". The first declaration creates a five-element built-in array color containing the characters 'b', 'l', 'u', 'e' and '\0'. The second declaration creates pointer variable colorPtr that points to the letter b in the string "blue" (which ends in '\0') somewhere in memory. String literals exist for the duration of the program and may be shared if the same string literal is referenced from multiple locations in a program. String literals cannot be modified.

Character Constants as Initializers

The declaration char color[]{"blue"}; could also be written


char color[]{'b', 'l', 'u', 'e', '\0'};

which uses character constants in single quotes (') as initializers for each element of the built-in array. When declaring a built-in array of chars to contain a string, the built-in array must be large enough to store the string and its terminating null character. The compiler determines the size of the built-in array in the preceding declaration, based on the number of initializers in the initializer list.

Common Programming Error 8.7

Not allocating sufficient space in a built-in array of chars to store the null character that terminates a string is a logic error.

 

Common Programming Error 8.8

Creating or using a C string that does not contain a terminating null character can lead to logic errors.

 

Error-Prevention Tip 8.6

When storing a string of characters in a built-in array of chars, be sure that the built-in array is large enough to hold the largest string that will be stored. C++ allows strings of any length. If a string is longer than the built-in array of chars in which it’s to be stored, characters beyond the end of the built-in array will overwrite data in memory following the built-in array, leading to logic errors and potential security breaches.

Accessing Characters in a C String

Because a C string is a built-in array of characters, we can access individual characters in a string directly with array subscript notation. For example, in the preceding declaration, color[0] is the character 'b', color[2] is 'u' and color[4] is the null character.

Reading Strings into Built-In Arrays of char with cin

A string can be read into a built-in array of chars using cin. For example, the following statement reads a string into the built-in 20-element array of chars named word:


cin >> word;

The string entered by the user is stored in word. The preceding statement reads characters until a white-space character or end-of-file indicator is encountered. The string should be no longer than 19 characters to leave room for the terminating null character. The setw stream manipulator can be used to ensure that the string read into word does not exceed the size of the built-in array. For example, the statement


cin >> setw(20) >> word;

specifies that cin should read a maximum of 19 characters into word and save the 20th location to store the terminating null character for the string. The setw stream manipulator is not a sticky setting—it applies only to the next value being input. If more than 19 characters are entered, the remaining characters are not saved in word, but they will be in the input stream and can be read by the next input operation. Of course, any input operation can also fail. We show how to detect input failures in Section 13.8.

Reading Lines of Text into Built-In Arrays of char with cin.getline

In some cases, it’s desirable to input an entire line of text into a built-in array of chars. For this purpose, the cin object provides the member function getline, which takes three arguments—a built-in array of chars in which the line of text will be stored, a length and a delimiter character. For example, the statements


char sentence[80];
cin.getline(sentence, 80, '\n');

declare sentence as a built-in array of 80 characters and read a line of text from the keyboard into the built-in array. The function stops reading characters when the delimiter character '\n' is encountered, when the end-of-file indicator is entered or when the number of characters read so far is one less than the length specified in the second argument. The last character in the built-in array is reserved for the terminating null character. If the delimiter character is encountered, it’s read and discarded. The third argument to cin.getline has '\n' as a default value, so the preceding function call could have been written as


cin.getline(sentence, 80);

Chapter 13, Stream Input/Output: A Deeper Look, provides a detailed discussion of cin.getline and other input/output functions.

Displaying C Strings

A built-in array of chars representing a null-terminated string can be output with cout and <<. The statement


cout << sentence;

displays the built-in array sentence. Like cin, cout does not care how large the built-in array of chars is. The characters are output until a terminating null character is encountered; the null character is not displayed. [Note: cin and cout assume that built-in arrays of chars should be processed as strings terminated by null characters; cin and cout do not provide similar input and output processing capabilities for other built-in array types.]
