CONTENTS
Section 2.1 Primitive Built-in Types 34
Section 2.2 Literal Constants 37
Section 2.4 const Qualifier 56
Section 2.9 Writing Our Own Header Files 67
Types are fundamental to any program. They tell us what our data mean and what operations we can perform on our data.
C++ defines several primitive types: characters, integers, floating-point numbers, and so on. The language also provides mechanisms that let us define our own data types. The library uses these mechanisms to define more complex types such as variable-length character strings, vectors, and so on. Finally, we can modify existing types to form compound types. This chapter covers the built-in types and begins our coverage of how C++ supports more complicated types.
Types determine what the data and operations in our programs mean. As we saw in Chapter 1, the same statement
i = i + j;
can mean different things depending on the types of i and j. If i and j are integers, then this statement has the ordinary, arithmetic meaning of +. However, if i and j are Sales_item objects, then this statement adds the components of these two objects.
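As a minimal sketch of this point, the code below uses a hypothetical Tally struct as a stand-in for a class type such as Sales_item, whose definition is not shown in this section. The names Tally, add_ints, and add_tallies are illustrative assumptions.

```cpp
// Hypothetical stand-in for a class type such as Sales_item.
struct Tally {
    int units;
    double revenue;
};

// For Tally objects, + is defined to add the corresponding components.
Tally operator+(const Tally &lhs, const Tally &rhs)
{
    Tally sum = { lhs.units + rhs.units, lhs.revenue + rhs.revenue };
    return sum;
}

// With int operands the statement has its ordinary arithmetic meaning.
int add_ints(int i, int j)
{
    i = i + j;
    return i;
}

// The same statement, but + now refers to the operator defined above.
Tally add_tallies(Tally i, Tally j)
{
    i = i + j;
    return i;
}
```

The two functions contain the identical statement; only the operand types differ, and that alone decides which operation the compiler selects.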
In C++ the support for types is extensive: the language itself defines a set of primitive types and ways in which we can modify existing types. It also provides a set of features that allow us to define our own types. This chapter begins our exploration of types in C++ by covering the built-in types and showing how we associate a type with an object. It also introduces ways we can both modify types and build our own types.
C++ defines a set of arithmetic types, which represent integers, floating-point numbers, individual characters, and boolean values. In addition, there is a special type named void. The void type has no associated values and can be used in only a limited set of circumstances. It is most often used as the return type for a function that has no return value.
The sizes of the arithmetic types vary across machines. By size, we mean the number of bits used to represent the type. The standard guarantees a minimum size for each of the arithmetic types, but it does not prevent compilers from using larger sizes. Indeed, almost all compilers use a larger size for int than is strictly required. Table 2.1 (p. 36) lists the built-in arithmetic types and the associated minimum sizes.
Table 2.1. C++: Arithmetic Types
Because the number of bits varies, the maximum (or minimum) values that these types can represent also vary by machine.
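One way to see the sizes a particular implementation has chosen is to multiply sizeof by CHAR_BIT. The helper name bits_in below is an assumption made for illustration, and the result counts storage bits, which may include padding on unusual machines.

```cpp
#include <climits>   // CHAR_BIT: number of bits in a char

// Number of bits this implementation uses to store an object of type T.
template <typename T>
int bits_in()
{
    return static_cast<int>(sizeof(T) * CHAR_BIT);
}
```

Calling bits_in<int>() on different machines may give different answers; the standard only promises the minimums (at least 8 bits for char, 16 for short and int, 32 for long).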
The arithmetic types that represent integers, characters, and boolean values are collectively referred to as the integral types.
There are two character types: char and wchar_t. The char type is guaranteed to be big enough to hold numeric values that correspond to any character in the machine’s basic character set. As a result, chars are usually a single machine byte. The wchar_t type is used for extended character sets, such as those used for Chinese and Japanese, in which some characters cannot be represented within a single char.
The types short, int, and long represent integer values of potentially different sizes. Typically, shorts are represented in half a machine word, ints in a machine word, and longs in either one or two machine words (on 32-bit machines, ints and longs are usually the same size).
The type bool represents the truth values, true and false. We can assign any of the arithmetic types to a bool. An arithmetic type with value 0 yields a bool that holds false. Any nonzero value is treated as true.
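The conversion rule can be sketched in a few lines. The function name to_bool is a hypothetical helper introduced only for this illustration.

```cpp
// Converts an int to bool using the rule described in the text:
// 0 becomes false; every other value, including negatives, becomes true.
bool to_bool(int value)
{
    bool b = value;
    return b;
}
```

Under this rule to_bool(0) is false, while to_bool(42) and to_bool(-1) are both true.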
The integral types, except the boolean type, may be either signed or unsigned. As its name suggests, a signed type can represent both negative and positive numbers (including zero), whereas an unsigned type represents only values greater than or equal to zero.
The integer types int, short, and long are all signed by default. To get an unsigned type, the type must be specified as unsigned, such as unsigned long. The unsigned int type may be abbreviated as unsigned. That is, unsigned with no other type implies unsigned int.
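These naming rules can be checked at compile time. The sketch below assumes a compiler that supports C++11, since static_assert and std::is_same are C++11 features.

```cpp
#include <type_traits>   // std::is_same

// unsigned with no other type names the same type as unsigned int.
static_assert(std::is_same<unsigned, unsigned int>::value,
              "unsigned alone means unsigned int");

// short, int, and long are signed unless we say otherwise.
static_assert(std::is_same<short, signed short>::value, "short is signed");
static_assert(std::is_same<int, signed int>::value, "int is signed");
static_assert(std::is_same<long, signed long>::value, "long is signed");

// The unsigned variant is a different type.
static_assert(!std::is_same<long, unsigned long>::value,
              "long and unsigned long are distinct types");
```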
Unlike the other integral types, there are three distinct types for char: plain char, signed char, and unsigned char. Although there are three distinct types, there are only two ways a char can be represented. The char type is represented using either the signed char or unsigned char version. Which representation is used for char varies by compiler.
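To find out which choice a given compiler made, we can ask std::numeric_limits. This is a sketch assuming a C++11 compiler (for static_assert), and plain_char_is_signed is a hypothetical helper name.

```cpp
#include <limits>        // std::numeric_limits
#include <type_traits>   // std::is_same

// Plain char is a type of its own, whichever representation it borrows.
static_assert(!std::is_same<char, signed char>::value,
              "char and signed char are distinct types");
static_assert(!std::is_same<char, unsigned char>::value,
              "char and unsigned char are distinct types");

// Reports whether this compiler represents plain char as a signed type.
bool plain_char_is_signed()
{
    return std::numeric_limits<char>::is_signed;
}
```

Because the answer varies by compiler, portable code should not rely on plain char being able to hold negative values.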
In an unsigned type, all the bits represent the value. If a type is defined for a particular machine to use 8 bits, then the unsigned version of this type could hold the values 0 through 255.
The C++ standard does not define how signed types are represented at the bit level. Instead, each compiler is free to decide how it will represent signed types. These representations can affect the range of values that a signed type can hold. We are guaranteed that an 8-bit signed type will hold at least the values from –127 through 127; many implementations allow values from –128 through 127.
Under the most common strategy for representing signed integral types, we can view one of the bits as a sign bit. Whenever the sign bit is 1, the value is negative; when it is 0, the value is either 0 or a positive number. An 8-bit integral signed type represented using a sign bit can hold values from –128 through 127.
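The actual limits on a given machine can be read from std::numeric_limits. The function names below are hypothetical helpers; the results are returned as int so they are treated as numbers rather than characters.

```cpp
#include <limits>   // std::numeric_limits

// Smallest value a signed char can hold on this implementation.
int signed_char_min()
{
    return std::numeric_limits<signed char>::min();
}

// Largest value a signed char can hold on this implementation.
int signed_char_max()
{
    return std::numeric_limits<signed char>::max();
}

// Largest value an unsigned char can hold on this implementation.
int unsigned_char_max()
{
    return std::numeric_limits<unsigned char>::max();
}
```

On the common 8-bit, sign-bit implementations described above these typically report –128, 127, and 255, but only the ranges –127 through 127 and 0 through 255 are guaranteed.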
The type of an object determines the values that the object can hold. This fact raises the question of what happens when one tries to assign a value outside the allowable range to an object of a given type. The answer depends on whether the type is signed or unsigned.
For unsigned types, the compiler must adjust the out-of-range value so that it will fit. The compiler does so by taking the remainder of the value modulo the number of distinct values the unsigned target type can hold. An object that is an 8-bit unsigned char, for example, can hold values from 0 through 255 inclusive. If we assign a value outside this range, the compiler actually assigns the remainder of the value modulo 256. For example, if we try to store 336 in an 8-bit unsigned char, the actual value assigned will be 80, because 80 is equal to 336 modulo 256.
For the unsigned types, a negative value is always out of range. An object of unsigned type may never hold a negative value. Some languages make it illegal to assign a negative value to an unsigned type, but C++ does not.
In C++ it is perfectly legal to assign a negative number to an object with unsigned type. The result is the negative value modulo the number of distinct values the type can hold. So, if we assign –1 to an 8-bit unsigned char, the resulting value will be 255, which is –1 modulo 256.
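The modulo rule for unsigned types can be tried directly. This sketch assumes an 8-bit char, which the static_assert (a C++11 feature) makes explicit; store_in_uchar is a hypothetical helper name.

```cpp
#include <climits>   // CHAR_BIT

// Assumption for this example: unsigned char has exactly 8 bits,
// so it can hold 256 distinct values (0 through 255).
static_assert(CHAR_BIT == 8, "this example assumes 8-bit chars");

// Stores value in an unsigned char; out-of-range values are
// reduced modulo 256 as required for unsigned types.
unsigned char store_in_uchar(int value)
{
    unsigned char c = value;
    return c;
}
```

With these assumptions, store_in_uchar(336) yields 80 and store_in_uchar(-1) yields 255, matching the arithmetic in the text.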
When assigning an out-of-range value to a signed type, it is up to the compiler to decide what value to assign. In practice, many compilers treat signed types similarly to how they are required to treat unsigned types. That is, they do the assignment as the remainder modulo the size of the type. However, we are not guaranteed that the compiler will do so for the signed types.
The types float, double, and long double represent floating-point single-, double-, and extended-precision values. Typically, floats are represented in one word (32 bits), doubles in two words (64 bits), and long doubles in either three or four words (96 or 128 bits). The size of the type determines the number of significant digits a floating-point value might contain.
The float type is usually not precise enough for real programs: float is guaranteed to offer only 6 significant digits. The double type guarantees at least 10 significant digits, which is sufficient for most calculations.
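The number of decimal digits each type can reliably represent on a given implementation is available as std::numeric_limits<T>::digits10. The function names here are hypothetical helpers.

```cpp
#include <limits>   // std::numeric_limits

// Decimal digits that survive a round trip through each type
// on this implementation.
int float_digits()       { return std::numeric_limits<float>::digits10; }
int double_digits()      { return std::numeric_limits<double>::digits10; }
int long_double_digits() { return std::numeric_limits<long double>::digits10; }
```

Many implementations report more digits for double than the guaranteed minimum of 10, which is one reason double is the usual default choice.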
A value, such as 42, in a program is known as a literal constant: literal because we can speak of it only in terms of its value; constant because its value cannot be changed. Every literal has an associated type. For example, 0 is an int and 3.14159 is a double. Literals exist only for the built-in types. There are no literals of class types. Hence, there are no literals of any of the library types.
We can write a literal integer constant using one of three notations: decimal, octal, or hexadecimal. These notations, of course, do not change the bit representation of the value, which is always binary. For example, we can write the value 20 in any of the following three ways:
20 // decimal
024 // octal
0x14 // hexadecimal
Literal integer constants that begin with a leading 0 (zero) are interpreted as octal; those that begin with either 0x or 0X are interpreted as hexadecimal.
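A short sketch confirms that the three spellings denote one value, and shows a common pitfall with leading zeros. The function names are hypothetical.

```cpp
// Three spellings of the same value.
int twenty_decimal() { return 20; }     // decimal
int twenty_octal()   { return 024; }    // octal: 2*8 + 4
int twenty_hex()     { return 0x14; }   // hexadecimal: 1*16 + 4

// Pitfall: a leading zero makes the literal octal, so 010 is eight,
// not ten.
int leading_zero_pitfall() { return 010; }
```

Padding a decimal number with leading zeros for alignment therefore silently changes its value.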
By default, the type of a literal integer constant is either int or long. The precise type depends on the value of the literal: values that fit in an int are type int and larger values are type long. By adding a suffix, we can force the type of a literal integer constant to be type long or unsigned or unsigned long. We specify that a constant is a long by immediately following the value with either L or l (the letter “ell” in either uppercase or lowercase).
When specifying a long, use the uppercase L: the lowercase letter l is too easily mistaken for the digit 1.
In a similar manner, we can specify unsigned by following the literal with either U or u. We can obtain an unsigned long literal constant by following the value by both L and U. The suffix must appear with no intervening space:
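As an illustration, the checks below confirm the type each suffix produces. The literal values are arbitrary examples, and the sketch assumes a C++11 compiler, since decltype, static_assert, and std::is_same are C++11 features.

```cpp
#include <type_traits>   // std::is_same

// No suffix: a value that fits in an int has type int.
static_assert(std::is_same<decltype(0), int>::value, "0 is an int");

// L makes the literal a long.
static_assert(std::is_same<decltype(1L), long>::value, "1L is a long");

// U or u makes the literal unsigned.
static_assert(std::is_same<decltype(128u), unsigned int>::value,
              "128u is an unsigned int");

// L and U together, in either order, give unsigned long.
static_assert(std::is_same<decltype(1024UL), unsigned long>::value,
              "1024UL is an unsigned long");
static_assert(std::is_same<decltype(8Lu), unsigned long>::value,
              "8Lu is an unsigned long");
```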
There are no literals of type short.
We can use either common decimal notation or scientific notation to write floating-point literal constants. Using scientific notation, the exponent is indicated either by E or e. By default, floating-point literals are type double. We indicate single precision by following the value with either F or f. Similarly, we specify extended precision by following the value with either L or l (again, use of the lowercase l is discouraged). Each pair of literals below denotes the same underlying value:
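As a sketch, the pairs below write one value in decimal and in scientific notation. The specific values are illustrative choices, the type checks assume a C++11 compiler, and pairs_match is a hypothetical helper that is expected to return true on typical compilers.

```cpp
#include <type_traits>   // std::is_same

// The suffix, not the notation, decides the type.
static_assert(std::is_same<decltype(3.14159), double>::value,
              "no suffix: double");
static_assert(std::is_same<decltype(3.14159E0f), float>::value,
              "F or f: float");
static_assert(std::is_same<decltype(1.2345E1L), long double>::value,
              "L or l: long double");

// Each comparison pairs decimal notation with scientific notation.
bool pairs_match()
{
    return 3.14159F == 3.14159E0f    // float
        && .001f    == 1E-3F         // float
        && 12.345L  == 1.2345E1L     // long double
        && 0.       == 0e0;          // double
}
```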
The words true and false are literals of type bool:
bool test = false;
Printable character literals are written by enclosing the character within single quotation marks:
'a' '2' ',' ' ' // blank
Such literals are of type char. We can obtain a wide-character literal of type wchar_t by immediately preceding the character literal with an L, as in
L'a'
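The effect of the L prefix on the literal's type can be verified at compile time. This is a sketch that assumes a C++11 compiler for decltype and static_assert.

```cpp
#include <type_traits>   // std::is_same

// An ordinary character literal has type char.
static_assert(std::is_same<decltype('a'), char>::value,
              "'a' is a char");

// The L prefix produces a wide-character literal of type wchar_t.
static_assert(std::is_same<decltype(L'a'), wchar_t>::value,
              "L'a' is a wchar_t");
```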
Some characters are nonprintable. A nonprintable character is a character for which there is no visible image, such as backspace or a control character. Other characters have special meaning in the language, such as the single and double quotation marks, and the backslash. Nonprintable characters and special characters are written using an escape sequence. An escape sequence begins with a backslash. The language defines the following escape sequences:
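For reference, the simple escape sequences defined by the C++ standard can be collected in a small table. The Escape struct and the escapes array are hypothetical names used only to organize this sketch.

```cpp
// One row per simple escape sequence: how it is spelled in source
// code, the character it denotes, and what that character means.
struct Escape {
    const char *spelling;
    char value;
    const char *meaning;
};

const Escape escapes[] = {
    { "\\n",  '\n', "newline" },
    { "\\t",  '\t', "horizontal tab" },
    { "\\v",  '\v', "vertical tab" },
    { "\\b",  '\b', "backspace" },
    { "\\r",  '\r', "carriage return" },
    { "\\f",  '\f', "formfeed" },
    { "\\a",  '\a', "alert (bell)" },
    { "\\\\", '\\', "backslash" },
    { "\\?",  '\?', "question mark" },
    { "\\'",  '\'', "single quote" },
    { "\\\"", '\"', "double quote" }
};

const int escape_count = sizeof(escapes) / sizeof(escapes[0]);
```

Note that the spelling column itself needs escapes: a backslash inside a string literal must be written as two backslashes.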
We can write any character as a generalized escape sequence of the form
\ooo
where ooo represents a sequence of as many as three octal digits. The value of the octal digits represents the numerical value of the character. The following examples are representations of literal constants using the ASCII character set:
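As a sketch, the checks below pair octal escapes with the characters they denote. They hold only where the execution character set is ASCII, which is an assumption of this example, and they use static_assert, a C++11 feature.

```cpp
// Assumption: the execution character set is ASCII.
static_assert('\7'   == '\a', "octal 7 is the bell (alert) character");
static_assert('\12'  == '\n', "octal 12 (decimal 10) is newline");
static_assert('\40'  == ' ',  "octal 40 (decimal 32) is blank");
static_assert('\0'   == 0,    "octal 0 is the null character");
static_assert('\062' == '2',  "octal 62 (decimal 50) is the digit 2");
static_assert('\115' == 'M',  "octal 115 (decimal 77) is the letter M");
```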
The character represented by ’