Appendix D. Introduction to C++ for Java and C# Programmers

This appendix provides a short introduction to C++ for developers who already know Java or C#. It assumes that you are familiar with object-oriented concepts such as inheritance and polymorphism and that you want to learn C++. To avoid making this book an unwieldy 1500 page doorstop by including a complete C++ primer, this appendix confines itself to essentials. It presents the basic knowledge and techniques necessary to understand the programs presented in the rest of the book, with enough information to start developing cross-platform C++ GUI applications using Qt.

At the time of this writing, C++ is the only realistic option for developing cross-platform, high-performance, object-oriented GUI applications. Its detractors usually point out that Java and C#, which dropped C compatibility, are nicer to use; in fact, Bjarne Stroustrup, the inventor of C++, noted in The Design and Evolution of C++ (Addison-Wesley, 1994) that “within C++, there is a much smaller and cleaner language struggling to get out”.

Fortunately, when we program with Qt, we usually stick to a subset of C++ that is very close to the utopian language envisioned by Stroustrup, leaving us free to concentrate on the problem at hand. Furthermore, Qt extends C++ in several respects, through its innovative “signals and slots” mechanism, its Unicode support, and its foreach keyword.

In the first section of this appendix, we will see how to combine C++ source files to obtain an executable program. This will lead us to explore core C++ concepts such as compilation units, header files, object files, and libraries—and to get familiar with the C++ preprocessor, compiler, and linker.

Then we will turn to the most important language differences between C++, Java, and C#: how to define classes, how to use pointers and references, how to overload operators, how to use the preprocessor, and so on. Although the C++ syntax is superficially similar to that of Java and C#, the underlying concepts differ in subtle ways. At the same time, as an inspirational source for Java and C#, the C++ language has a lot in common with these two languages, including similar data types, the same arithmetic operators, and the same basic control flow statements.

The last section is devoted to the Standard C++ library, which provides ready-made functionality that can be used in any C++ program. The library is the result of more than thirty years of evolution, and as such it provides a wide range of approaches including procedural, object-oriented, and functional programming styles, and both macros and templates. Compared with the libraries provided with Java and C#, the Standard C++ library is quite narrow in scope; for example, it has no support for GUI programming, multithreading, databases, internationalization, networking, XML, or Unicode. To develop in these areas, C++ programmers are expected to use various (often platform-specific) third-party libraries.

This is where Qt saves the day. Qt began as a cross-platform GUI toolkit (a set of classes that makes it possible to write portable graphical user interface applications) but rapidly evolved into a full-blown application development framework that partly extends and partly replaces the Standard C++ library. Although this book uses Qt, it is useful to know what the Standard C++ library has to offer, since you may have to work with code that uses it.

Getting Started with C++

A C++ program consists of one or more compilation units. Each compilation unit is a separate source code file, typically with a .cpp extension (other common extensions are .cc and .cxx) that the compiler processes in one run. For each compilation unit, the compiler generates an object file, with the extension .obj (on Windows) or .o (on Unix and Mac OS X). The object file is a binary file that contains machine code for the architecture on which the program will run.

Once all the .cpp files have been compiled, we can combine the object files together to create an executable using a special program called the linker. The linker concatenates the object files and resolves the memory addresses of functions and other symbols referenced in the compilation units.

When building a program, exactly one compilation unit must contain a main() function that serves as the program’s entry point. This function doesn’t belong to any class; it is a global function. The process is shown schematically in Figure D.1.

Figure D.1. The C++ compilation process (on Windows)

Unlike Java, where each source file may contain at most one public top-level class and must be named after it, C++ lets us organize the compilation units as we want. We can implement several classes in the same .cpp file, or spread the implementation of a class across several .cpp files, and we can give the source files any names we like. When we make a change in one particular .cpp file, we need to recompile only that file and then relink the application to create a new executable.

Before we go further, let’s quickly review the source code of a trivial C++ program that computes the square of a number. The program consists of two compilation units: main.cpp and square.cpp.

Here’s square.cpp:

double square(double n)
{
    return n * n;
}

This file simply contains a global function called square() that returns the square of its parameter.

Here’s main.cpp:

 1 #include <cstdlib>
 2 #include <iostream>

 3 double square(double);

 4 int main(int argc, char *argv[])
 5 {
 6     if (argc != 2) {
 7         std::cerr << "Usage: square <number>" << std::endl;
 8         return 1;
 9     }

10     double n = std::strtod(argv[1], 0);
11     std::cout << "The square of " << argv[1] << " is "
12               << square(n) << std::endl;
13     return 0;
14 }

The main.cpp source file contains the main() function’s definition. In C++, this function takes an int and a char * array (an array of character strings) as parameters. The program’s name is available as argv[0] and the command-line arguments as argv[1], argv[2], ..., argv[argc - 1]. The parameter names argc (“argument count”) and argv (“argument values”) are conventional. If the program doesn’t access the command-line arguments, we can define main() with no parameters.

The main() function uses strtod() (“string to double”), cout (C++’s standard output stream), and cerr (C++’s standard error stream) from the Standard C++ library to convert the command-line argument to a double and to print text to the console. Strings, numbers, and end-of-line markers (endl) are output using the << operator, which is also used for bit-shifting. To access this standard functionality, we need the #include directives on lines 1 and 2.

All the functions and most other items in the Standard C++ library are in the std namespace. One way to access an item in a namespace is to prefix its name with the namespace’s name using the :: operator. In C++, the :: operator separates the components of a complex name. Namespaces make large multi-person projects easier because they help avoid name conflicts. We cover them later in this appendix.
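As a small sketch of how namespaces avoid name conflicts, the following code defines two functions with the same name and signature in two different namespaces. The namespaces rectangle and triangle and the function combinedArea() are hypothetical names invented for this illustration.

```cpp
// Two hypothetical namespaces that each define a function called area().
// Without namespaces, the two definitions would clash.
namespace rectangle
{
    double area(double width, double height)
    {
        return width * height;
    }
}

namespace triangle
{
    double area(double base, double height)
    {
        return base * height / 2.0;
    }
}

// The :: operator selects which area() is meant, in the same way that
// std::cout selects cout from the std namespace.
double combinedArea(double width, double height)
{
    return rectangle::area(width, height) + triangle::area(width, height);
}
```

Calling combinedArea(4.0, 3.0) adds the rectangle's area (12.0) to the triangle's area (6.0).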

The declaration on line 3 is a function prototype. It tells the compiler that a function exists with the given parameters and return value. The actual function can be located in the same compilation unit or in another compilation unit. Without the function prototype, the compiler wouldn’t let us call the function on line 12. Parameter names in function prototypes are optional.
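The role of a prototype can also be sketched in a single file. In the code below, the hypothetical functions cube() and volumeOfCube() are not part of the square program; they only show that a call compiles as soon as a prototype has been seen, even if the definition comes later.

```cpp
// Prototype: tells the compiler that cube() exists, what it takes, and
// what it returns. The parameter name is omitted, which is allowed.
double cube(double);

// Hypothetical caller. It compiles because the prototype above has
// already been seen, although cube() is defined further down.
double volumeOfCube(double side)
{
    return cube(side);
}

// Definition. It could just as well live in another compilation unit,
// in which case the linker would connect the call to it.
double cube(double n)
{
    return n * n * n;
}
```

Removing the prototype would make the compiler reject the call inside volumeOfCube(), because cube() would not yet be known at that point.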

The procedure to compile the program varies from platform to platform. For example, to compile on Solaris with the Sun C++ compiler, we would type the following commands:

CC -c main.cpp
CC -c square.cpp
CC main.o square.o -o square

The first two lines invoke the compiler to generate .o files for the .cpp files. The third line invokes the linker and generates an executable called square, which we can run as follows:

./square 64

This run of the program outputs the following message to the console:

The square of 64 is 4096

To compile the program, you probably want to get help from your local C++ guru. Failing this, you can still read the rest of this appendix without compiling anything and follow the instructions in Chapter 1 to compile your first C++/Qt application. Qt provides tools that make it easy to build applications on all platforms.

Back to our program: In a real-world application, we would normally put the square() function prototype in a separate file and include that file in all the compilation units where we need to call the function. Such a file is called a header file and usually has a .h extension (.hh, .hpp, and .hxx are also common). If we redo our example using the header file approach, we would create a file called square.h with the following contents:

#ifndef SQUARE_H
#define SQUARE_H

double square(double);

#endif

The header file is bracketed by three preprocessor directives (#ifndef, #define, and #endif). These directives ensure that the header file is processed only once, even if the header file is included several times in the same compilation unit (a situation that can arise when header files include other header files). By convention, the preprocessor symbol used to accomplish this is derived from the file name (in our example, SQUARE_H). We will come back to the preprocessor later in this appendix.
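To see the guard at work without creating several files, the sketch below pastes the contents of a hypothetical header, temperature.h, twice into one compilation unit, which is roughly what the preprocessor does when a header is included twice. The Temperature type, the toFahrenheit() function, and the TEMPERATURE_H symbol are invented for this illustration.

```cpp
// First "inclusion": TEMPERATURE_H is not defined yet, so the
// preprocessor keeps everything up to #endif and defines the symbol.
#ifndef TEMPERATURE_H
#define TEMPERATURE_H

struct Temperature
{
    double celsius;
};

double toFahrenheit(Temperature t);

#endif

// Second "inclusion": TEMPERATURE_H is now defined, so the preprocessor
// skips the whole block and Temperature is not defined a second time.
#ifndef TEMPERATURE_H
#define TEMPERATURE_H

struct Temperature
{
    double celsius;
};

double toFahrenheit(Temperature t);

#endif

// Implementation, as it would appear in a .cpp file.
double toFahrenheit(Temperature t)
{
    return t.celsius * 9.0 / 5.0 + 32.0;
}
```

Without the guard, the second copy would be rejected by the compiler as a redefinition of Temperature.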

The new main.cpp file looks like this:

 1 #include <cstdlib>
 2 #include <iostream>

 3 #include "square.h"

 4 int main(int argc, char *argv[])
 5 {
 6     if (argc != 2) {
 7         std::cerr << "Usage: square <number>" << std::endl;
 8         return 1;
 9     }

10     double n = std::strtod(argv[1], 0);
11     std::cout << "The square of " << argv[1] << " is "
12               << square(n) << std::endl;
13     return 0;
14 }

The #include directive on line 3 expands to the contents of the file square.h. Directives that start with a # are picked up by the C++ preprocessor before the compilation proper takes place. In the old days, the preprocessor was a separate program that the programmer invoked manually before running the compiler. Modern compilers handle the preprocessor step implicitly.

The #include directives on lines 1 and 2 expand to the contents of the cstdlib and iostream header files, which are part of the Standard C++ library. Standard header files have no .h suffix. The angle brackets around the file names indicate that the header files are located in a standard location on the system, and double quotes tell the compiler to look in the current directory. Includes are normally gathered at the top of a .cpp file.

Unlike .cpp files, header files are not compilation units in their own right and do not produce any object files. Header files should contain only declarations that enable different compilation units to communicate with each other. Consequently, it would be inappropriate to put the square() function’s implementation in a header file. If we did so in our example, nothing bad would happen, because we include square.h only once, but if we included square.h from several .cpp files, we would get multiple implementations of the square() function (one per .cpp file that includes it). The linker would then complain about multiple (identical) definitions of square() and refuse to generate an executable. Conversely, if we declare a function but never implement it, the linker complains about an “unresolved symbol”.

So far, we have assumed that an executable consists exclusively of object files. In practice, it often also links against libraries that implement ready-made functionality. There are two main types of libraries:

  • Static libraries are put directly into the executable, as though they were object files. This ensures that the library cannot get lost but increases the size of the executable.

  • Dynamic libraries (also called shared libraries or DLLs) are located at a standard location on the user’s machine and are automatically loaded at application startup.

For the square program, we link against the Standard C++ library, which is implemented as a dynamic library on most platforms. Qt itself is a collection of libraries that can be built either as static or as dynamic libraries (the default is dynamic).

Main Language Differences

We will now take a more structured look at the areas where C++ differs from Java and C#. Many of the language differences are due to C++’s compiled nature and commitment to performance. Thus, C++ does not check array bounds at run-time, and there is no garbage collector to reclaim unused dynamically allocated memory.

For the sake of brevity, C++ constructs that are nearly identical to their Java and C# counterparts are not reviewed. In addition, some C++ topics are not covered here because they are not necessary when programming using Qt. Among these are defining template classes and functions, defining union types, and using exceptions. For the whole story, refer to a book such as The C++ Programming Language by Bjarne Stroustrup (Addison-Wesley, 2000) or C++ for Java Programmers by Mark Allen Weiss (Prentice Hall, 2003).

Primitive Data Types

The primitive data types offered by the C++ language are similar to those found in Java or C#. Table D.2 lists C++’s primitive types and their definitions on the platforms supported by Qt 4.

Table D.2. Primitive C++ types

C++ Type        Description
--------------  --------------------------------------
bool            Boolean value
char            8-bit integer
short           16-bit integer
int             32-bit integer
long            32-bit or 64-bit integer
long long[*]    64-bit integer
float           32-bit floating-point value (IEEE 754)
double          64-bit floating-point value (IEEE 754)

[*] Microsoft calls the non-standard (but due to be standardized) long long type __int64. In Qt programs, qlonglong is available as an alternative that works on all Qt platforms.

By default, the short, int, long, and long long data types are signed, meaning that they can hold negative values as well as positive values. If we only need to store nonnegative integers, we can put the unsigned keyword in front of the type. Whereas a short can hold any value between −32768 and +32767, an unsigned short goes from 0 to 65535. The right-shift operator >> has unsigned (“fill with 0s”) semantics if its left operand is unsigned.
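The following sketch makes the unsigned ranges and the shift behavior concrete. It assumes the type sizes listed for the Qt platforms (a 16-bit short in particular); the function names are hypothetical.

```cpp
#include <limits>

// Largest value an unsigned short can hold (65535 where short is 16 bits).
unsigned int largestUnsignedShort()
{
    return std::numeric_limits<unsigned short>::max();
}

// Unsigned values wrap around: one past the maximum is 0.
unsigned short wrapAround()
{
    unsigned short value = std::numeric_limits<unsigned short>::max();
    ++value;
    return value;
}

// Right-shifting an unsigned value fills the vacated bits with 0s, so
// moving the top bit all the way down yields either 0 or 1.
unsigned int topBit(unsigned int value)
{
    const int bits = std::numeric_limits<unsigned int>::digits;
    return value >> (bits - 1);
}
```

The numeric_limits template from the Standard C++ library is used here so that the code does not hard-code the sizes it is meant to illustrate.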

The bool type can take the values true and false. In addition, numeric types can be used where a bool is expected, with the rule that 0 means false and any non-zero value means true.

The char type is used for storing ASCII characters and 8-bit integers (bytes). When used as an integer, it can be signed or unsigned, depending on the platform. The types signed char and unsigned char are available as unambiguous alternatives to char. Qt provides a QChar type that stores 16-bit Unicode characters.

Instances of built-in types are not initialized by default. When we create an int variable, its value could conceivably be 0, but could just as likely be -209486515. Fortunately, most compilers warn us when we attempt to read the contents of an uninitialized variable, and we can use tools such as Rational PurifyPlus and Valgrind to detect uninitialized memory accesses and other memory-related problems at run-time.

In memory, the numeric types (except long) have identical sizes on the different platforms supported by Qt, but their representation varies depending on the system’s byte order. On big-endian architectures (such as PowerPC and SPARC), the 32-bit value 0x12345678 is stored as the four bytes 0x12 0x34 0x56 0x78, whereas on little-endian architectures (such as Intel x86), the byte sequence is reversed. This makes a difference in programs that copy memory areas onto disk or that send binary data over the network. Qt’s QDataStream class, presented in Chapter 12, can be used to store binary data in a platform-independent way.
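One common way to avoid depending on the machine's byte order is to build the byte sequence with shifts instead of copying memory. The sketch below is a hand-written illustration of that idea, not a description of how QDataStream works internally; storeBigEndian() and loadBigEndian() are hypothetical helpers, and unsigned int is assumed to be 32 bits wide.

```cpp
// Writes a 32-bit value into out[0..3], most significant byte first,
// whatever the byte order of the machine running the code.
void storeBigEndian(unsigned int value, unsigned char out[4])
{
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFu);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFu);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[3] = static_cast<unsigned char>(value & 0xFFu);
}

// Rebuilds the 32-bit value from four bytes stored most significant
// byte first.
unsigned int loadBigEndian(const unsigned char in[4])
{
    return (static_cast<unsigned int>(in[0]) << 24)
           | (static_cast<unsigned int>(in[1]) << 16)
           | (static_cast<unsigned int>(in[2]) << 8)
           | static_cast<unsigned int>(in[3]);
}
```

Because the shifts operate on the value rather than on its memory layout, storing 0x12345678 yields the bytes 0x12 0x34 0x56 0x78 on both big-endian and little-endian machines.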

Class Definitions

Class definitions in C++ are similar to those in Java and C#, but there are several differences to be aware of. We will study these differences using a series of examples. Let’s start with a class that represents an (x, y) coordinate pair:

#ifndef POINT2D_H
#define POINT2D_H

class Point2D
{
public:
    Point2D() {
        xVal = 0;
        yVal = 0;
    }
    Point2D(double x, double y) {
        xVal = x;
        yVal = y;
    }

    void setX(double x) { xVal = x; }
    void setY(double y) { yVal = y; }
    double x() const { return xVal; }
    double y() const { return yVal; }

private:
    double xVal;
    double yVal;
};

#endif

The preceding class definition would appear in a header file, typically called point2d.h. The example exhibits the following C++ idiosyncrasies:

  • A class definition is divided into public, protected, and private sections, and ends with a semicolon. If no section is specified, the default is private. (For compatibility with C, C++ provides a struct keyword that is identical to class except that the default is public if no section is specified.)

  • The class has two constructors (one that has no parameters and one that has two). If we declared no constructor, C++ would automatically supply one with no parameters and an empty body.

  • The getter functions x() and y() are declared to be const. This means that they don’t (and can’t) modify the member variables or call non-const member functions (such as setX() and setY()).

The preceding functions were implemented inline, as part of the class definition. An alternative is to provide only function prototypes in the header file and to implement the functions in a .cpp file. Using this approach, the header file would look like this:

#ifndef POINT2D_H
#define POINT2D_H

class Point2D
{
public:
    Point2D();
    Point2D(double x, double y);

    void setX(double x);
    void setY(double y);
    double x() const;
    double y() const;

private:
    double xVal;
    double yVal;
};

#endif

The functions would then be implemented in point2d.cpp:

#include "point2d.h"

Point2D::Point2D()
{
    xVal = 0.0;
    yVal = 0.0;
}

Point2D::Point2D(double x, double y)
{
    xVal = x;
    yVal = y;
}

void Point2D::setX(double x)
{
    xVal = x;
}

void Point2D::setY(double y)
{
    yVal = y;
}

double Point2D::x() const
{
    return xVal;
}

double Point2D::y() const
{
    return yVal;
}

We start by including point2d.h because the compiler needs the class definition before it can parse member function implementations. Then we implement the functions, prefixing the function name with the class name using the :: operator.

We have seen how to implement a function inline and now how to implement it in a .cpp file. The two approaches are semantically equivalent, but when we call a function that is declared inline, most compilers simply expand the function’s body instead of generating an actual function call. This normally leads to faster code, but might increase the size of your application. For this reason, only very short functions should be implemented inline; longer functions should always be implemented in a .cpp file. In addition, if we forget to implement a function and try to call it, the linker will complain about an unresolved symbol.

Now, let’s try to use the class.

#include "point2d.h"

int main()
{
    Point2D alpha;
    Point2D beta(0.666, 0.875);

    alpha.setX(beta.y());
    beta.setY(alpha.x());

    return 0;
}

In C++, variables of any type can be declared directly without using new. The first variable is initialized using the default Point2D constructor (the constructor that has no parameters). The second variable is initialized using the second constructor. Access to an object’s member is performed using the . (dot) operator.

Variables declared this way behave like Java/C# primitive types such as int and double. For example, when we use the assignment operator, the contents of the variable are copied—not just a reference to an object. And if we modify a variable later on, any other variables that were assigned from it are left unchanged.
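The copying behavior can be checked with a short sketch. The Point2D class below is a condensed copy of the one defined earlier so that the code stands on its own; the function xAfterModifyingCopy() is a hypothetical helper.

```cpp
// Condensed version of the Point2D class from this section.
class Point2D
{
public:
    Point2D() { xVal = 0; yVal = 0; }
    Point2D(double x, double y) { xVal = x; yVal = y; }

    void setX(double x) { xVal = x; }
    void setY(double y) { yVal = y; }
    double x() const { return xVal; }
    double y() const { return yVal; }

private:
    double xVal;
    double yVal;
};

// Copies a point, changes the copy, and reports the original's x value.
double xAfterModifyingCopy()
{
    Point2D original(1.0, 2.0);
    Point2D copy = original;    // both coordinates are copied
    copy.setX(10.0);            // only the copy changes
    return original.x();        // the original still holds 1.0
}
```

In Java or C#, the equivalent code with a class type would make both variables refer to the same object, and the function would return 10.0 instead.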

As an object-oriented language, C++ supports inheritance and polymorphism. To illustrate how it works, we will review the example of a Shape abstract base class and a subclass called Circle. Let’s start with the base class:

#ifndef SHAPE_H
#define SHAPE_H

#include "point2d.h"

class Shape
{
public:
    Shape(Point2D center) { myCenter = center; }

    virtual void draw() = 0;

protected:
    Point2D myCenter;
};

#endif

The definition appears in a header file called shape.h. Since the class definition refers to the Point2D class, we include point2d.h.

The Shape class has no base class. Unlike Java and C#, C++ doesn’t provide an Object class from which all classes are implicitly derived. Qt provides QObject as a natural base class for all kinds of objects.

The draw() function declaration has two interesting features: It contains the virtual keyword, and it ends with = 0. The virtual keyword indicates that the function may be reimplemented in subclasses. As in C#, C++ member functions aren’t reimplementable by default. The bizarre = 0 syntax indicates that the function is a pure virtual function: a function that has no default implementation and that must be implemented in subclasses. The concept of an “interface” in Java and C# maps to a class with only pure virtual functions in C++.
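As a rough sketch of the interface idiom, the code below declares a class whose only ordinary member function is pure virtual and then implements it in a subclass. Describable, Square, and describeTwice() are hypothetical names; the virtual destructor is a customary addition so that objects can be deleted safely through a base class pointer.

```cpp
#include <string>

// Plays the role of a Java or C# interface: no data, and describe()
// has no implementation here.
class Describable
{
public:
    virtual ~Describable() { }
    virtual std::string describe() const = 0;
};

// A concrete class must implement every pure virtual function before
// it can be instantiated.
class Square : public Describable
{
public:
    std::string describe() const { return "square"; }
};

// Accepts any object whose class implements Describable.
std::string describeTwice(const Describable &item)
{
    return item.describe() + " " + item.describe();
}
```

Trying to declare a variable of type Describable would fail to compile, just as instantiating an interface would in Java or C#.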

Here’s the definition of the Circle subclass:

#ifndef CIRCLE_H
#define CIRCLE_H

#include "shape.h"

class Circle : public Shape
{
public:
    Circle(Point2D center, double radius = 0.5)
        : Shape(center) {
        myRadius = radius;
    }

    void draw() {
        // do something here
    }

private:
    double myRadius;
};

#endif

The Circle class is publicly derived from Shape, meaning that all public members of Shape remain public in Circle. C++ also supports protected and private inheritance, which restrict the access of the base class’s public and protected members.

The constructor takes two parameters. The second parameter is optional and takes the value 0.5 if not specified. The constructor passes the center parameter to the base class’s constructor using a special syntax between the function signature and the function body. In the body, we initialize the myRadius member variable. We could also have initialized the variable on the same line as the base class constructor initialization:

    Circle(Point2D center, double radius = 0.5)
        : Shape(center), myRadius(radius) { }

On the other hand, C++ doesn’t allow us to initialize a member variable in the class definition, so the following code is wrong:

// WON'T COMPILE
private:
    double myRadius = 0.5;
};

The draw() function has the same signature as the virtual draw() function declared in Shape. It is a reimplementation and it will be invoked polymorphically when draw() is called on a Circle instance through a Shape reference or pointer. C++ has no override keyword like in C#. Nor does C++ have a super or base keyword that refers to the base class. If we need to call the base implementation of a function, we can prefix the function name with the base class name and the :: operator. For example:

class LabeledCircle : public Circle
{
public:
    void draw() {
        Circle::draw();
        drawLabel();
    }
    ...
};
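The following self-contained sketch shows the same pattern in a form that can be compiled and observed. It is a simplification under stated assumptions: the classes are stripped down, and draw() returns a string instead of void, purely so that the effect of the call is visible. The render() function is a hypothetical helper.

```cpp
#include <string>

// Simplified stand-in for the Circle class.
class Circle
{
public:
    virtual ~Circle() { }
    virtual std::string draw() { return "circle"; }
};

class LabeledCircle : public Circle
{
public:
    std::string draw() {
        // Circle::draw() names the base implementation explicitly,
        // where Java would use super.draw() and C# base.draw().
        return Circle::draw() + " with label";
    }
};

// Calls draw() through a base class reference. Because draw() is
// virtual, the implementation is chosen by the object's actual class.
std::string render(Circle &shape)
{
    return shape.draw();
}
```

Passing a LabeledCircle to render() produces "circle with label", whereas passing a plain Circle produces "circle".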

C++ supports multiple inheritance, meaning that a class can be derived from several classes at the same time. The syntax is as follows:

class DerivedClass : public BaseClass1, public BaseClass2, ...,
                     public BaseClassN
{
    ...
};

By default, functions and variables declared in a class are associated with instances of that class. We can also declare static member functions and static member variables, which can be used without an instance. For example:

#ifndef TRUCK_H
#define TRUCK_H

class Truck
{
public:
    Truck() { ++counter; }
    ~Truck() { --counter; }
    static int instanceCount() { return counter; }

private:
    static int counter;
};

#endif

The static member variable counter keeps track of how many Truck instances exist at any time. The Truck constructor increments it. The destructor, recognizable by the tilde (~) prefix, decrements it. In C++, the destructor is automatically invoked when a local (stack-allocated) variable goes out of scope or when a variable allocated using new is deleted. This is similar to the finalize() method in Java, except that we can rely on it being called at a specific point in time.

A static member variable has a single existence in a class: Such variables are “class variables” rather than “instance variables”. Each static member variable must be defined in a .cpp file (but without repeating the static keyword). For example:

#include "truck.h"

int Truck::counter = 0;

Failing to do this would result in an “unresolved symbol” error at link time. The instanceCount() static function can be accessed from outside the class, prefixed by the class name. For example:

#include <iostream>

#include "truck.h"

int main()
{
    Truck truck1;
    Truck truck2;

    std::cout << Truck::instanceCount() << " equals 2" << std::endl;

    return 0;
}
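To illustrate when the destructor runs, the sketch below repeats the Truck class and counts instances inside and after a nested block. The two helper functions are hypothetical, and the counts in the comments assume that no other Truck objects exist when the functions are called.

```cpp
class Truck
{
public:
    Truck() { ++counter; }
    ~Truck() { --counter; }
    static int instanceCount() { return counter; }

private:
    static int counter;
};

// Definition of the static member, as it would appear in truck.cpp.
int Truck::counter = 0;

// Reports the count while two trucks are alive.
int countInsideInnerBlock()
{
    Truck outer;
    int seen = 0;
    {
        Truck inner;                        // constructor increments
        seen = Truck::instanceCount();      // outer and inner exist
    }                                       // inner's destructor runs here
    return seen;
}

// Reports the count once the inner block has been left.
int countAfterInnerBlock()
{
    Truck outer;
    {
        Truck inner;
    }                                       // inner is destroyed here
    return Truck::instanceCount();          // only outer is left
}
```

Once either function has returned, outer has been destroyed as well and Truck::instanceCount() is back to its previous value.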

Pointers

A pointer in C++ is a variable that stores the memory address of an object (instead of storing the object directly). Java and C# have a similar concept, that of a “reference”, but the syntax is different. We will start by studying a contrived example that illustrates pointers in action:

 1 #include "point2d.h"

 2 int main()
 3 {
 4     Point2D alpha;
 5     Point2D beta;
 6     Point2D *ptr;

 7     ptr = &alpha;
 8     ptr->setX(1.0);
 9     ptr->setY(2.5);

10     ptr = &beta;
11     ptr->setX(4.0);
12     ptr->setY(4.5);

13     ptr = 0;

14     return 0;
15 }

The example relies on the Point2D class from the previous subsection. Lines 4 and 5 define two objects of type Point2D. These objects are initialized to (0, 0) by the default Point2D constructor.

Line 6 defines a pointer to a Point2D object. The syntax for pointers uses an asterisk in front of the variable name. Since we did not initialize the pointer, it contains a random memory address. This is solved on line 7 by assigning alpha’s address to the pointer. The unary & operator returns the memory address of an object. An address is typically a 32-bit or a 64-bit integer value specifying the offset of an object in memory.

On lines 8 and 9, we access the alpha object through the ptr pointer. Because ptr is a pointer and not an object, we must use the -> (arrow) operator instead of the . (dot) operator.

On line 10, we assign beta’s address to the pointer. From then on, any operation we perform through the pointer will affect the beta object.

Line 13 sets the pointer to be a null pointer. C++ has no keyword for representing a pointer that does not point to an object; instead, we use the value 0 (or the symbolic constant NULL, which expands to 0). Trying to use a null pointer typically results in a crash with an error message such as “Segmentation fault”, “General protection fault”, or “Bus error”. Using a debugger, we can find out which line of code caused the crash.

At the end of the function, the alpha object holds the coordinate pair (1.0, 2.5), whereas beta holds (4.0, 4.5).

Pointers are often used to store objects allocated dynamically using new. In C++ jargon, we say that these objects are allocated on the “heap”, whereas local variables (variables defined inside a function) are stored on the “stack”.

Here’s a code snippet that illustrates dynamic memory allocation using new:

#include "point2d.h"

int main()
{
    Point2D *point = new Point2D;
    point->setX(1.0);
    point->setY(2.5);
    delete point;

    return 0;
}

The new operator returns the memory address of a newly allocated object. We store the address in a pointer variable and access the object through that pointer. When we are done with the object, we release its memory using the delete operator. Unlike Java and C#, C++ has no garbage collector; dynamically allocated objects must be explicitly released using delete when we don’t need them anymore. Chapter 2 describes Qt’s parent–child mechanism, which greatly simplifies memory management in C++ programs.

If we forget to call delete, the memory is kept around until the program finishes. This would not be an issue in the preceding example, because we allocate only one object, but in a program that allocates new objects all the time, this could cause the program to keep allocating memory until the machine’s memory is exhausted. Once an object is deleted, the pointer variable still holds the address of the object. Such a pointer is a “dangling pointer” and should not be used to access the object. Qt provides a “smart” pointer, QPointer<T>, that automatically sets itself to 0 if the QObject it points to is deleted.
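Without Qt, a simple manual defense against dangling pointers is to reset the pointer to 0 immediately after deleting the object. The sketch below shows this habit with a dynamically allocated double; the function name is hypothetical, and the approach only protects the one pointer variable that is reset, not other copies of the address.

```cpp
// Allocates a value, releases it, and reports whether the pointer can
// afterward be recognized as "not pointing to anything".
bool pointerIsNullAfterCleanup()
{
    double *value = new double(2.5);

    delete value;   // the object is gone; value still holds its old address
    value = 0;      // reset, so that later code can test the pointer

    delete value;   // deleting a null pointer is allowed and does nothing

    return value == 0;
}
```

Code that receives such a pointer can then check it against 0 before using it, instead of accessing memory that has already been released.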

In the preceding example, we invoked the default constructor and called setX() and setY() to initialize the object. We could have used the two-parameter constructor instead:

Point2D *point = new Point2D(1.0, 2.5);

The example didn’t require the use of new and delete. We could just as well have allocated the object on the stack as follows:

Point2D point;
point.setX(1.0);
point.setY(2.5);

Objects allocated like this are automatically freed at the end of the block in which they appear.

If we don’t intend to modify the object through the pointer, we can declare the pointer const. For example:

const Point2D *ptr = new Point2D(1.0, 2.5);
double x = ptr->x();
double y = ptr->y();

// WON'T COMPILE
ptr->setX(4.0);
*ptr = Point2D(4.0, 4.5);

The ptr const pointer can be used only to call const member functions such as x() and y(). It is good style to declare pointers const when we don’t intend to modify the object using them. Furthermore, if the object itself is const, we have no choice but to use a const pointer to store its address. The use of const provides information to the compiler that can lead to early bug detection and performance gains. C# also has a const keyword, but it is restricted to compile-time constants and cannot make an object read-only through a particular reference. The closest Java equivalent is final, but it only protects a variable from reassignment, not from calls to “non-const” member functions on the object it refers to.
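The position of const in a pointer declaration matters. In the declaration const Point2D *ptr, it is the pointed-to object that is read-only, while the pointer variable itself can still be made to point elsewhere. The sketch below contrasts this with a pointer that is itself constant, using int for brevity; constPointerDemo() is a hypothetical function.

```cpp
int constPointerDemo()
{
    int a = 1;
    int b = 2;

    const int *readOnly = &a;   // a cannot be modified through readOnly...
    readOnly = &b;              // ...but readOnly can be aimed at b

    int *const fixed = &a;      // fixed will always point to a...
    *fixed = 5;                 // ...but a can be modified through it

    // Uncommenting either line below produces a compile error:
    // *readOnly = 10;
    // fixed = &b;

    return *readOnly + a;       // 2 + 5
}
```

The first form is by far the more common of the two and is the one meant when this appendix speaks of a const pointer.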

Pointers can be used with built-in types as well as with classes. In an expression, the unary * operator returns the value of the object associated with the pointer. For example:

int i = 10;
int j = 20;

int *p = &i;
int *q = &j;

std::cout << *p << " equals 10" << std::endl;
std::cout << *q << " equals 20" << std::endl;

*p = 40;

std::cout << i << " equals 40" << std::endl;

p = q;
*p = 100;

std::cout << i << " equals 40" << std::endl;
std::cout << j << " equals 100" << std::endl;

The -> operator, which can be used to access an object’s members through a pointer, is pure syntactic sugar. Instead of ptr->member, we can also write (*ptr).member. The parentheses are necessary because the . (dot) operator has precedence over the unary * operator.

Pointers have a poor reputation in C and C++, to the extent that Java is often advertised as having no pointers. In reality, C++ pointers are conceptually similar to Java and C# references except that we can use pointers to iterate through memory, as we will see later in this section. Furthermore, the inclusion of “copy on write” container classes in Qt, along with C++’s ability to instantiate any class on the stack, means that we can often avoid pointers.

References

In addition to pointers, C++ also supports the concept of a “reference”. Like a pointer, a C++ reference stores the address of an object. Here are the main differences:

  • References are declared using & instead of *.

  • The reference must be initialized and can’t be reassigned later.

  • The object associated with a reference is directly accessible; there is no special syntax such as * or ->.

  • A reference cannot be null.
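The differences listed above can be seen in a few lines. In this sketch, referenceDemo() is a hypothetical function; the point to notice is that assigning to a reference changes the object it refers to and never makes the reference refer to something else.

```cpp
int referenceDemo()
{
    int first = 1;
    int second = 2;

    int &ref = first;   // must be initialized; ref is another name for first
    ref = 10;           // no * needed: first is now 10
    ref = second;       // does not rebind ref; copies 2 into first

    second = 3;         // first is unaffected, since ref still means first

    return first * 10 + second;     // 2 * 10 + 3
}
```

With a pointer, the statement corresponding to ref = second would have been written ptr = &second and would have redirected the pointer instead of modifying first.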

References are generally used when declaring parameters. For most types, C++ uses call-by-value as its default parameter-passing mechanism, meaning that when an argument is passed to a function, the function receives a brand new copy of the object. Here’s the definition of a function that receives its parameters through call-by-value:

#include <cmath>

double manhattanDistance(Point2D a, Point2D b)
{
    return std::abs(b.x() - a.x()) + std::abs(b.y() - a.y());
}

We would then invoke the function as follows:

Point2D broadway(12.5, 40.0);
Point2D harlem(77.5, 50.0);
double distance = manhattanDistance(broadway, harlem);

C programmers avoid needless copy operations by declaring their parameters as pointers instead of as values:

double manhattanDistance(const Point2D *ap, const Point2D *bp)
{
    return std::abs(bp->x() - ap->x()) + std::abs(bp->y() - ap->y());
}

They must then pass addresses instead of values when calling the function:

double distance = manhattanDistance(&broadway, &harlem);

C++ introduced references to make the syntax less cumbersome and to prevent the caller from passing a null pointer. If we use references instead of pointers, the function looks like this:

double manhattanDistance(const Point2D &a, const Point2D &b)
{
    return std::abs(b.x() - a.x()) + std::abs(b.y() - a.y());
}

The declaration of a reference is similar to that of a pointer, with & instead of *. But when we actually use the reference, we can forget that it is a memory address and treat it like an ordinary variable. In addition, calling a function that takes references as arguments doesn’t require any special care (no & operator).

All in all, by replacing Point2D with const Point2D & in the parameter list, we reduced the overhead of the function call: Instead of copying 256 bits (the size of four doubles), we copy only 64 or 128 bits, depending on the target platform’s pointer size.

The previous example used const references, preventing the function from modifying the objects associated with the references. When this kind of side effect is desired, we can pass a non-const reference or pointer. For example:

void transpose(Point2D &point)
{
    double oldX = point.x();
    point.setX(point.y());
    point.setY(oldX);
}

In some cases, we have a reference and we need to call a function that takes a pointer, or vice versa. To convert a reference to a pointer, we can simply use the unary & operator:

Point2D point;
Point2D &ref = point;
Point2D *ptr = &ref;

To convert a pointer to a reference, there is the unary * operator:

Point2D point;
Point2D *ptr = &point;
Point2D &ref = *ptr;

References and pointers are represented the same way in memory, and they can often be used interchangeably, which raises the question of when to use which. On the one hand, references have a more convenient syntax; on the other hand, pointers can be reassigned at any time to point to another object, they can hold a null value, and their more explicit syntax is often a blessing in disguise. For these reasons, pointers tend to prevail, with references used almost exclusively for declaring function parameters, in conjunction with const.

Arrays

Arrays in C++ are declared by specifying the number of items in the array within brackets in the variable declaration after the variable name. Two-dimensional arrays are possible using an array of arrays. Here’s the definition of a one-dimensional array containing ten items of type int:

int fibonacci[10];

The items are accessible as fibonacci[0], fibonacci[1], ..., fibonacci[9]. Often we want to initialize the array as we define it:

int fibonacci[10] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };

In such cases, we can then omit the array size, since the compiler can deduce it from the number of initializers:

int fibonacci[] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };

Static initialization also works for complex types, such as Point2D:

Point2D triangle[] = {
    Point2D(0.0, 0.0), Point2D(1.0, 0.0), Point2D(0.5, 0.866)
};

If we have no intention of altering the array later on, we can make it const:

const int fibonacci[] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };

To find out how many items an array contains, we can use the sizeof() operator as follows:

int n = sizeof(fibonacci) / sizeof(fibonacci[0]);

The sizeof() operator returns the size of its argument in bytes. The number of items in an array is its size in bytes divided by the size of one of its items. Because this is cumbersome to type, a common alternative is to declare a constant and to use it for defining the array:

enum { NFibonacci = 10 };

const int fibonacci[NFibonacci] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };

It would have been tempting to declare the constant as a const int variable. Unfortunately, some compilers have issues with const variables as array size specifiers. We will explain the enum keyword later in this appendix.

Iterating through an array is normally done using an integer. For example:

for (int i = 0; i < NFibonacci; ++i)
    std::cout << fibonacci[i] << std::endl;

It is also possible to traverse the array using a pointer:

const int *ptr = &fibonacci[0];
while (ptr != &fibonacci[10]) {
    std::cout << *ptr << std::endl;
    ++ptr;
}

We initialize the pointer with the address of the first item and loop until we reach the “one past the last” item (the “eleventh” item, fibonacci[10]). At each iteration, the ++ operator advances the pointer to the next item.

Instead of &fibonacci[0], we could also have written fibonacci. This is because the name of an array used alone is automatically converted into a pointer to the first item in the array. Similarly, we could substitute fibonacci + 10 for &fibonacci[10]. This works the other way around as well: We can retrieve the contents of the current item using either *ptr or ptr[0] and could access the next item using *(ptr + 1) or ptr[1]. This principle is sometimes called “equivalence of pointers and arrays”.
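As a rough illustration of this principle, the following sketch sums an array by walking a pointer from the first item to the "one past the last" address. The helpers sumRange(), sumOfFibonacci(), and itemViaPointer() are hypothetical names chosen for this example.

```cpp
// Sums the items in the half-open range [begin, end) by advancing a pointer.
int sumRange(const int *begin, const int *end)
{
    int total = 0;
    for (const int *ptr = begin; ptr != end; ++ptr)
        total += *ptr;
    return total;
}

// Sums the ten Fibonacci numbers used in this section.
int sumOfFibonacci()
{
    const int fibonacci[] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };
    const int n = sizeof(fibonacci) / sizeof(fibonacci[0]);

    // The array name converts to &fibonacci[0], and fibonacci + n is the
    // "one past the last" address, which is compared but never dereferenced.
    return sumRange(fibonacci, fibonacci + n);
}

// Reads item i (0 <= i <= 9) using pointer arithmetic instead of [].
int itemViaPointer(int i)
{
    const int fibonacci[] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };
    return *(fibonacci + i);    // same item as fibonacci[i]
}
```

Here sumOfFibonacci() returns 88, and itemViaPointer(9) returns 34, the same value as fibonacci[9].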

To prevent what it considers to be a gratuitous inefficiency, C++ does not let us pass arrays to functions by value. Instead, they must be passed by address. For example:

#include <iostream>

void printIntegerTable(const int *table, int size)
{
    for (int i = 0; i < size; ++i)
        std::cout << table[i] << std::endl;
}

int main()
{
    const int fibonacci[10] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };
    printIntegerTable(fibonacci, 10);
    return 0;
}

Ironically, although C++ doesn’t give us any choice about whether we want to pass an array by address or by value, it gives us some freedom in the syntax used to declare the parameter type. Instead of const int *table, we could also have written const int table[] to declare a pointer-to-constant-int parameter. Similarly, the argv parameter to main() can be declared as either char *argv[] or char **argv.

To copy an array into another array, one approach is to loop through the array:

const int fibonacci[NFibonacci] = { 0, 1, 1, 2, 3, 5, 8, 13, 21, 34 };
int temp[NFibonacci];

for (int i = 0; i < NFibonacci; ++i)
    temp[i] = fibonacci[i];

For basic data types such as int, we can also use memcpy(), declared in <cstring>, which copies a block of memory. For example:

std::memcpy(temp, fibonacci, sizeof(fibonacci));

When we declare a C++ array, the size must be a constant.[*] If we want to create an array of a variable size, we have several options.

  • We can dynamically allocate the array:

    int *fibonacci = new int[n];

    The new [] operator allocates a certain number of items at consecutive memory locations and returns a pointer to the first item. Thanks to the “equivalence of pointers and arrays” principle, the items can be accessed through the pointer as fibonacci[0], fibonacci[1], ..., fibonacci[n - 1]. When we have finished using the array, we should release the memory it consumes using the delete [] operator:

    delete [] fibonacci;

  • We can use the standard vector<T> class:

    #include <vector>
    
    std::vector<int> fibonacci(n);

    Items are accessible using the [] operator, just like with a plain C++ array. With vector<T> (where T is the type of the items stored in the vector), we can resize the array at any time using resize() and we can copy it using the assignment operator. Classes that contain angle brackets (<>) in their name are called template classes.

  • We can use Qt’s QVector<T> class:

    #include <QVector>
    
    QVector<int> fibonacci(n);

    QVector<T>’s API is very similar to that of vector<T>, but it also supports iteration using Qt’s foreach keyword and uses implicit data sharing (“copy on write”) as a memory and speed optimization. Chapter 11 presents Qt’s container classes and explains how they relate to the Standard C++ containers.
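To make the second option more concrete, here is a minimal sketch that uses std::vector<int> with a size chosen at run-time, then copies and resizes it. The functions makeFibonacci() and originalSizeAfterCopy() are hypothetical, and the sketch assumes n is not negative.

```cpp
#include <vector>

// Builds the first n Fibonacci numbers (n >= 0) in a vector whose size
// is only known at run-time.
std::vector<int> makeFibonacci(int n)
{
    std::vector<int> fibonacci(n);      // n items, all initialized to 0
    for (int i = 0; i < n; ++i)
        fibonacci[i] = (i < 2) ? i : fibonacci[i - 1] + fibonacci[i - 2];
    return fibonacci;
}

// Copies a vector, modifies the copy, and returns the size of the
// original to show that the two are independent.
int originalSizeAfterCopy()
{
    std::vector<int> original = makeFibonacci(10);
    std::vector<int> copy = original;   // copies all ten items
    copy.resize(12);                    // grows the copy only
    copy[0] = 100;                      // does not touch original[0]
    return int(original.size()) + original[0];   // 10 + 0
}
```

Unlike a built-in array, the vector knows its own size and releases its memory automatically when it goes out of scope, so no delete [] is needed.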

You might be tempted to avoid built-in arrays whenever possible and use vector<T> or QVector<T> instead. It is nonetheless worthwhile understanding how the built-in arrays work because sooner or later you might want to use them in highly optimized code, or need them to interface with existing C libraries.

Character Strings

The most basic way to represent character strings in C++ is to use an array of chars terminated by a null byte ('\0'). The following four functions demonstrate how these kinds of strings work:

void hello1()
{
    const char str[] = {
        'H', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\0'
    };
    std::cout << str << std::endl;
}

void hello2()
{
    const char str[] = "Hello world!";
    std::cout << str << std::endl;
}

void hello3()
{
    std::cout << "Hello world!" << std::endl;
}

void hello4()
{
    const char *str = "Hello world!";
    std::cout << str << std::endl;
}

In the first function, we declare the string as an array and initialize it the hard way. Notice the '\0' terminator at the end, which indicates the end of the string. The second function has a similar array definition, but this time we use a string literal to initialize the array. In C++, string literals are simply const char arrays with an implicit '\0' terminator. The third function uses a string literal directly, without giving it a name. Once translated into machine language instructions, it is identical to the previous two functions.

The fourth function is a bit different in that it creates not only an (anonymous) array, but also a pointer variable called str that stores the address of the array’s first item. In spite of this, the semantics of the function are identical to the previous three functions, and an optimizing compiler would eliminate the superfluous str variable.
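Because the terminator is the only thing that marks the end of such a string, code that needs the length has to search for it. The following sketch shows the idea; stringLength() is a hypothetical helper that behaves much like the standard std::strlen() function, and literalArraySize() is likewise invented for illustration.

```cpp
// Counts the characters that precede the '\0' terminator.
int stringLength(const char *str)
{
    int length = 0;
    while (str[length] != '\0')
        ++length;
    return length;
}

// Returns the size in bytes of an array initialized from a string
// literal, which includes the implicit terminator.
int literalArraySize()
{
    const char str[] = "Hello world!";
    return sizeof(str);
}
```

For "Hello world!", stringLength() returns 12, whereas literalArraySize() returns 13 because the array also stores the terminator.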

Functions that take C++ strings as arguments usually take either a char * or a const char *. Here’s a short program that illustrates the use of both:

#include <cctype>
#include <iostream>

void makeUppercase(char *str)
{
    for (int i = 0; str[i] != '\0'; ++i)
        str[i] = std::toupper(str[i]);
}

void writeLine(const char *str)
{
    std::cout << str << std::endl;
}

int main(int argc, char *argv[])
{
    for (int i = 1; i < argc; ++i) {
        makeUppercase(argv[i]);
        writeLine(argv[i]);
    }
    return 0;
}

In C++, the char type normally holds an 8-bit value. This means that we can easily store ASCII, ISO 8859-1 (Latin-1), and other 8-bit-encoded strings in a char array, but that we can’t store arbitrary Unicode characters without resorting to multibyte sequences. Qt provides the powerful QString class, which stores Unicode strings as sequences of 16-bit QChars and internally uses the implicit data sharing (“copy on write”) optimization. Chapter 11 and Chapter 18 explain QString in more detail.

Enumerations

C++ has an enumeration feature for declaring a set of named constants similar to that provided by C# and recent versions of Java. Let’s suppose that we want to store days of the week in a program:

enum DayOfWeek {
    Sunday, Monday, Tuesday, Wednesday, Thursday, Friday, Saturday
};

Normally, we would put this declaration in a header file, or even inside a class. The preceding declaration is superficially equivalent to the following constant definitions:

const int Sunday    = 0;
const int Monday    = 1;
const int Tuesday   = 2;
const int Wednesday = 3;
const int Thursday  = 4;
const int Friday    = 5;
const int Saturday  = 6;

By using the enumeration construct, we can later declare variables or parameters of type DayOfWeek and the compiler will ensure that only values from the DayOfWeek enumeration are assigned to it. For example:

DayOfWeek day = Sunday;

If we don’t care about type safety, we can also write

int day = Sunday;

Notice that to refer to the Sunday constant from the DayOfWeek enum, we simply write Sunday, not DayOfWeek::Sunday.

By default, the compiler assigns consecutive integer values to the constants of an enum, starting at 0. We can specify other values if we want:

enum DayOfWeek {
    Sunday    = 628,
    Monday    = 616,
    Tuesday   = 735,
    Wednesday = 932,
    Thursday  = 852,
    Friday    = 607,
    Saturday  = 845
};

If we don’t specify the value of an enum item, the item takes the value of the preceding item, plus 1. Enums are sometimes used to declare integer constants, in which case we normally omit the name of the enum:

enum {
    FirstPort = 1024,
    MaxPorts  = 32767
};
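The "preceding item plus 1" rule also applies when only some of the items have explicit values. The Priority enum below is a made-up example, not something defined by Qt or by this book.

```cpp
// Hypothetical enum mixing explicit and implicit values.
enum Priority {
    Low = 10,
    Medium,         // 11 (Low + 1)
    High,           // 12 (Medium + 1)
    Critical = 50,
    Fatal           // 51 (Critical + 1)
};

// An enum value converts implicitly to int.
int priorityValue(Priority priority)
{
    return priority;
}
```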

Another frequent use of enums is to represent sets of options. Let’s consider the example of a Find dialog, with four checkboxes controlling the search algorithm (Wildcard syntax, Case sensitive, Search backward, and Wrap around). We can represent this by an enum where the constants are powers of 2:

enum FindOption {
    NoOptions      = 0x00000000,
    WildcardSyntax = 0x00000001,
    CaseSensitive  = 0x00000002,
    SearchBackward = 0x00000004,
    WrapAround     = 0x00000008
};

Each option is often called a “flag”. We can combine flags using the bitwise | or |= operator:

int options = NoOptions;
if (wildcardSyntaxCheckBox->isChecked())
    options |= WildcardSyntax;
if (caseSensitiveCheckBox->isChecked())
    options |= CaseSensitive;
if (searchBackwardCheckBox->isChecked())
    options |= SearchBackward;
if (wrapAroundCheckBox->isChecked())
    options |= WrapAround;

We can test whether a flag is set using the bitwise & operator:

if (options & CaseSensitive) {
    // case-sensitive search
}

A variable of type FindOption can contain only one flag at a time. The result of combining several flags using | is a plain integer. Unfortunately, this is not type-safe: The compiler won’t complain if a function expecting a combination of FindOptions through an int parameter receives Saturday instead. Qt uses QFlags<T> to provide type safety for its own flag types. The class is also available when we define custom flag types. See the QFlags<T> online documentation for details.
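Setting, clearing, and testing flags stored in a plain int can be wrapped in small helper functions, as in the sketch below. The helpers withFlag(), withoutFlag(), and hasFlag() are hypothetical and are not part of Qt; they only illustrate the bitwise operations involved.

```cpp
enum FindOption {
    NoOptions      = 0x00000000,
    WildcardSyntax = 0x00000001,
    CaseSensitive  = 0x00000002,
    SearchBackward = 0x00000004,
    WrapAround     = 0x00000008
};

// Returns options with the given flag switched on.
int withFlag(int options, FindOption flag)
{
    return options | flag;
}

// Returns options with the given flag switched off.
int withoutFlag(int options, FindOption flag)
{
    return options & ~int(flag);
}

// Returns true if the given flag is set in options.
bool hasFlag(int options, FindOption flag)
{
    return (options & flag) != 0;
}
```

For example, withoutFlag(CaseSensitive | WrapAround, WrapAround) leaves only the CaseSensitive bit set.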

Typedefs

C++ lets us give an alias to a data type using the typedef keyword. For example, if we use QVector<Point2D> a lot and want to save a few keystrokes (or are unfortunate enough to be stuck with a Norwegian keyboard and have trouble locating the angle brackets), we can put this typedef declaration in one of our header files:

typedef QVector<Point2D> PointVector;

From then on, we can use PointVector as a shorthand for QVector<Point2D>. Notice that the new name for the type appears after the old name. The typedef syntax deliberately mimics that of variable declarations.

In Qt, typedefs are used mainly for three reasons:

  • Convenience: Qt declares uint and QWidgetList as typedefs for unsigned int and QList<QWidget *> to save a few keystrokes.

  • Platform differences: Certain types need different definitions on different platforms. For example, qlonglong is defined as __int64 on Windows and as long long on other platforms.

  • Compatibility: The QIconSet class from Qt 3 was renamed QIcon in Qt 4. To help Qt 3 users port their applications to Qt 4, QIconSet is provided as a typedef for QIcon when Qt 3 compatibility is enabled.

Type Conversions

C++ provides several syntaxes for casting values from one type to another. The traditional syntax, inherited from C, involves putting the resulting type in parentheses before the value to convert:

const double Pi = 3.14159265359;
int x = (int)(Pi * 100);
std::cout << x << " equals 314" << std::endl;

This syntax is very powerful. It can be used to change the types of pointers, to remove const, and much more. For example:

short j = 0x1234;
if (*(char *)&j == 0x12)
    std::cout << "The byte order is big-endian" << std::endl;

In the preceding example, we cast a short * to a char * and we use the unary * operator to access the byte at the given memory location. On big-endian systems, that byte is 0x12; on little-endian systems, it is 0x34. Since pointers and references are represented the same way, it should come as no surprise that the preceding code can be rewritten using a reference cast:

short j = 0x1234;
if ((char &)j == 0x12)
    std::cout << "The byte order is big-endian" << std::endl;

If the data type is a class name, a typedef, or a primitive type that can be expressed as a single alphanumeric token, we can use the constructor syntax as a cast:

int x = int(Pi * 100);

Casting pointers and references using the traditional C-style casts is a kind of extreme sport, on par with paragliding and elevator surfing, because the compiler lets us cast any pointer (or reference) type into any other pointer (or reference) type. For that reason, C++ introduced four new-style casts with more precise semantics. For pointers and references, the new-style casts are preferable to the risky C-style casts and are used in this book.

  • static_cast<T>() can be used to cast a pointer-to-A to a pointer-to-B, with the constraint that class B must be a subclass of class A. For example:

    A *obj = new B;
    B *b = static_cast<B *>(obj);
    b->someFunctionDeclaredInB();

    If the object isn’t an instance of B, using the resulting pointer can lead to obscure crashes.

  • dynamic_cast<T>() is similar to static_cast<T>(), except that it uses run-time type information (RTTI) to check that the object associated with the pointer is an instance of class B. If this is not the case, the cast returns a null pointer. For example:

    A *obj = new B;
    B *b = dynamic_cast<B *>(obj);
    if (b)
        b->someFunctionDeclaredInB();

    On some compilers, dynamic_cast<T>() doesn’t work across dynamic library boundaries. It also relies on the compiler supporting RTTI, a feature that programmers can turn off to reduce the size of their executables. Qt solves these problems by providing qobject_cast<T>() for QObject subclasses.

  • const_cast<T>() adds or removes a const qualifier to a pointer or reference. For example:

    int MyClass::someConstFunction() const
    {
        if (isDirty()) {
            MyClass *that = const_cast<MyClass *>(this);
            that->recomputeInternalData();
        }
        ...
    }

    In the previous example, we cast away the const qualifier of the this pointer to call the non-const member function recomputeInternalData(). Doing so is not recommended and can normally be avoided by using the mutable keyword, as explained in Chapter 4.

  • reinterpret_cast<T>() converts any pointer or reference type to any other such type. For example:

    short j = 0x1234;
    if (reinterpret_cast<char &>(j) == 0x12)
        std::cout << "The byte order is big-endian" << std::endl;
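The difference between a checked and an unchecked downcast is easiest to see with two unrelated subclasses. In the following sketch, Shape, Circle, and Square are hypothetical classes invented for the example; the base class needs at least one virtual function for dynamic_cast<T>() to work.

```cpp
class Shape
{
public:
    virtual ~Shape() {}     // makes the class polymorphic, enabling RTTI
};

class Circle : public Shape
{
};

class Square : public Shape
{
};

// Returns true if shape actually points to a Circle (or a subclass of it).
bool isCircle(Shape *shape)
{
    return dynamic_cast<Circle *>(shape) != 0;
}

bool circleIsCircle()
{
    Circle circle;
    return isCircle(&circle);
}

bool squareIsCircle()
{
    Square square;
    return isCircle(&square);   // the cast yields a null pointer here
}
```

Had isCircle() used static_cast<T>() instead, the cast would have produced a non-null Circle * in both cases, and any use of it for the Square object would be undefined.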

In Java and C#, any reference can be stored as an Object reference if needed. C++ doesn’t have any universal base class, but it provides a special data type, void *, that stores the address of an instance of any type. A void * must be cast back to another type (using static_cast<T>()) before it can be used.
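A void * round trip can be sketched as follows. The function names are hypothetical, and the sketch assumes that the caller and the callee agree, by convention only, that the address really designates an int; the compiler cannot check this.

```cpp
// Receives an untyped address and interprets it as the address of an int.
int readIntThroughVoid(void *data)
{
    int *value = static_cast<int *>(data);  // explicit cast back is required
    return *value;
}

// Stores the address of n in a void * and reads the value back.
int roundTrip(int n)
{
    void *anything = &n;    // any object pointer converts to void * implicitly
    return readIntThroughVoid(anything);
}
```

Casting the void * back to a type other than the original one compiles without complaint, which is why this idiom is usually confined to low-level code and C interfaces.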

C++ provides many ways to cast types, but most of the time we don’t even need a cast. When using container classes such as vector<T> or QVector<T>, we can specify the T type and extract items without casts. In addition, for primitive types, certain conversions occur implicitly (e.g., from char to int), and for custom types we can define implicit conversions by providing a one-parameter constructor. For example:

class MyInteger
{
public:
    MyInteger();
    MyInteger(int i);
    ...
};

int main()
{
    MyInteger n;
    n = 5;
    ...
}

For some one-parameter constructors, the automatic conversion makes little sense. We can disable it by declaring the constructor with the explicit keyword:

class MyVector
{
public:
    explicit MyVector(int size);
    ...
};
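The effect of explicit can be sketched by comparing two small classes. Meters and Buffer are hypothetical classes made up for this illustration.

```cpp
// A one-parameter constructor without explicit enables implicit conversion.
class Meters
{
public:
    Meters(double value) : val(value) {}
    double value() const { return val; }

private:
    double val;
};

// With explicit, the conversion must be spelled out by the caller.
class Buffer
{
public:
    explicit Buffer(int size) : sz(size) {}
    int size() const { return sz; }

private:
    int sz;
};

double doubled(Meters m) { return 2.0 * m.value(); }
int capacity(const Buffer &buffer) { return buffer.size(); }

double implicitExample()
{
    return doubled(1.5);            // 1.5 is converted to Meters(1.5)
}

int explicitExample()
{
    return capacity(Buffer(16));    // capacity(16) would not compile
}
```

Passing a bare 16 where a Buffer is expected would read as if the number itself were a buffer, which is the kind of accident explicit is meant to prevent.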

Operator Overloading

C++ allows us to overload functions, meaning that we can declare several functions with the same name in the same scope, as long as they have different parameter lists. In addition, C++ supports operator overloading—the possibility of assigning special semantics to built-in operators (such as +, <<, and []) when they are used with custom types.

We have already seen a few examples of overloaded operators. When we used << to output text to cout or cerr, we didn’t trigger C++’s left-shift operator, but rather a special version of the operator that takes an ostream object (such as cout and cerr) on the left side and a string (alternatively, a number or a stream manipulator such as endl) on the right side and that returns the ostream object, allowing multiple calls in a row.

The beauty of operator overloading is that we can make custom types behave just like built-in types. To show how operator overloading works, we will overload +=, -=, +, and - to work on Point2D objects:

#ifndef POINT2D_H
#define POINT2D_H

class Point2D
{
public:
    Point2D();
    Point2D(double x, double y);

    void setX(double x);
    void setY(double y);
    double x() const;
    double y() const;

    Point2D &operator+=(const Point2D &other) {
        xVal += other.xVal;
        yVal += other.yVal;
        return *this;
    }
    Point2D &operator-=(const Point2D &other) {
        xVal -= other.xVal;
        yVal -= other.yVal;
        return *this;
    }

private:
    double xVal;
    double yVal;
};

inline Point2D operator+(const Point2D &a, const Point2D &b)
{
    return Point2D(a.x() + b.x(), a.y() + b.y());
}

inline Point2D operator-(const Point2D &a, const Point2D &b)
{
    return Point2D(a.x() - b.x(), a.y() - b.y());
}

#endif

Operators can be implemented either as member functions or as global functions. In our example, we implemented += and -= as member functions, and + and - as global functions.

The += and -= operators take a reference to another Point2D object and increment or decrement the x- and y-coordinates of the current object based on the other object. They return *this, which denotes a reference to the current object (this is of type Point2D *). Returning a reference allows us to write exotic code such as

a += b += c;

The + and - operators take two parameters and return a Point2D object by value (not a reference to an existing object). The inline keyword allows us to put these function definitions in the header file. If the function’s body had been longer, we would put a function prototype in the header file and the function definition (without the inline keyword) in a .cpp file.

The following code snippet shows all four overloaded operators in action:

Point2D alpha(12.5, 40.0);
Point2D beta(77.5, 50.0);

alpha += beta;
beta -= alpha;

Point2D gamma = alpha + beta;
Point2D delta = beta - alpha;

We can also invoke the operator functions just like any other functions:

Point2D alpha(12.5, 40.0);
Point2D beta(77.5, 50.0);

alpha.operator+=(beta);
beta.operator-=(alpha);

Point2D gamma = operator+(alpha, beta);
Point2D delta = operator-(beta, alpha);

Operator overloading in C++ is a complex topic, but we can go a long way without knowing all the details. It is still important to understand the fundamentals of operator overloading because several Qt classes (including QString and QVector<T>) use this feature to provide a simple and more natural syntax for such operations as concatenation and append.
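Two other operators that are frequently overloaded are == and <<. The following sketch defines them as global functions for a small stand-in class called Point, which is hypothetical and deliberately separate from the Point2D class above; toString() is a helper added only to make the output easy to inspect.

```cpp
#include <ostream>
#include <sstream>
#include <string>

class Point
{
public:
    Point(double x, double y) : xVal(x), yVal(y) {}
    double x() const { return xVal; }
    double y() const { return yVal; }

private:
    double xVal;
    double yVal;
};

inline bool operator==(const Point &a, const Point &b)
{
    return a.x() == b.x() && a.y() == b.y();
}

inline bool operator!=(const Point &a, const Point &b)
{
    return !(a == b);
}

// Returning the stream allows several << calls to be chained.
inline std::ostream &operator<<(std::ostream &out, const Point &point)
{
    out << "(" << point.x() << ", " << point.y() << ")";
    return out;
}

// Formats a point by writing it to an in-memory stream.
std::string toString(const Point &point)
{
    std::ostringstream stream;
    stream << point;
    return stream.str();
}
```

With these definitions, std::cout << Point(12.5, 40.0) << std::endl prints (12.5, 40) using the same syntax as for built-in types.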

Value Types

Java and C# distinguish between value types and reference types.

  • Value types: These are primitive types such as char, int, and float, as well as C# structs. What characterizes them is that they aren’t created using new and the assignment operator performs a copy of the value held by the variable. For example:

    int i = 5;
    int j = 10;
    i = j;

  • Reference types: These are classes such as Integer (in Java), String, and MyVeryOwnClass. Instances are created using new. The assignment operator copies only a reference to the object; to obtain a deep copy, we must call clone() (in Java) or Clone() (in C#). For example:

    Integer i = new Integer(5);
    Integer j = new Integer(10);
    i = j.clone();

In C++, all types can be used as “reference types”, and those that are copyable can be used as “value types” as well. For example, C++ doesn’t need any Integer class, because we can use pointers and new as follows:

int *i = new int(5);
int *j = new int(10);
*i = *j;

Unlike Java and C#, C++ treats user-defined classes in the same way as built-in types:

Point2D *i = new Point2D(5, 5);
Point2D *j = new Point2D(10, 10);
*i = *j;

If we want to make a C++ class copyable, we must ensure that our class has a copy constructor and an assignment operator. The copy constructor is invoked when we initialize an object with another object of the same type. C++ provides two equivalent syntaxes for this:

Point2D i(20, 20);

Point2D j(i);      // first syntax
Point2D k = i;     // second syntax

The assignment operator is invoked when we use = to assign a new value to an existing variable:

Point2D i(5, 5);
Point2D j(10, 10);
j = i;

When we define a class, the C++ compiler automatically provides a copy constructor and an assignment operator that perform member-by-member copying. For the Point2D class, this is as though we had written the following code in the class definition:

class Point2D
{
public:
    ...
    Point2D(const Point2D &other)
        : xVal(other.xVal), yVal(other.yVal) { }

    Point2D &operator=(const Point2D &other) {
        xVal = other.xVal;
        yVal = other.yVal;
        return *this;
    }
    ...

private:
    double xVal;
    double yVal;
};

For some classes, the default copy constructor and assignment operator are unsuitable. This typically occurs if the class uses dynamic memory. To make the class copyable, we must then implement the copy constructor and the assignment operator ourselves.
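As a rough sketch of what such an implementation can look like, consider a hypothetical IntArray class that owns a dynamically allocated array. With the compiler-generated versions, two objects would end up sharing (and later both deleting) the same memory; the hand-written versions below give each object its own copy. The sketch assumes sizes are not negative.

```cpp
class IntArray
{
public:
    explicit IntArray(int size)
        : sz(size), data(new int[size])
    {
        for (int i = 0; i < sz; ++i)
            data[i] = 0;
    }

    // Copy constructor: allocates a separate array and copies the items.
    IntArray(const IntArray &other)
        : sz(other.sz), data(new int[other.sz])
    {
        for (int i = 0; i < sz; ++i)
            data[i] = other.data[i];
    }

    // Assignment operator: guards against self-assignment, then replaces
    // the old array with a fresh copy of the other object's items.
    IntArray &operator=(const IntArray &other)
    {
        if (this != &other) {
            int *fresh = new int[other.sz];
            for (int i = 0; i < other.sz; ++i)
                fresh[i] = other.data[i];
            delete [] data;
            data = fresh;
            sz = other.sz;
        }
        return *this;
    }

    ~IntArray() { delete [] data; }

    int size() const { return sz; }
    int &operator[](int i) { return data[i]; }
    int operator[](int i) const { return data[i]; }

private:
    int sz;
    int *data;
};

// Copies an array in both ways, then modifies the original.
int copiesAreIndependent()
{
    IntArray a(3);
    a[0] = 7;

    IntArray b(a);      // copy constructor
    IntArray c(1);
    c = a;              // assignment operator

    a[0] = 99;          // must not affect b or c
    return b[0] + c[0]; // 7 + 7
}
```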

For classes that don’t need to be copyable, we can disable the copy constructor and assignment operator by making them private. If we accidentally attempt to copy instances of such a class, the compiler reports an error. For example:

class BankAccount
{
public:
    ...

private:
    BankAccount(const BankAccount &other);
    BankAccount &operator=(const BankAccount &other);
};

In Qt, many classes are designed to be used as value classes. These have a copy constructor and an assignment operator, and are normally instantiated on the stack without new. This is the case for QDateTime, QImage, QString, and container classes such as QList<T>, QVector<T>, and QMap<K, T>.

Other classes fall in the “reference type” category, notably QObject and its subclasses (QWidget, QTimer, QTcpSocket, etc.). These have virtual functions and cannot be copied. For example, a QWidget represents a specific window or control on-screen. If there are 75 QWidget instances in memory, there are also 75 windows or controls on-screen. These classes are typically instantiated using the new operator.

Global Variables and Functions

C++ lets us declare functions and variables that don’t belong to any classes and that are accessible from any other function. We have seen several examples of global functions, including main(), the program’s entry point. Global variables are rarer, because they compromise modularity and thread reentrancy. It is still important to understand them because you might encounter them in code written by reformed C programmers and other C++ users.

To illustrate how global functions and variables work, we will study a small program that prints a list of 128 pseudo-random numbers using a quick-and-dirty algorithm. The program’s source code is spread over two .cpp files.

The first source file is random.cpp:

int randomNumbers[128];

static int seed = 42;

static int nextRandomNumber()
{
    seed = 1009 + (seed * 2011);
    return seed;
}

void populateRandomArray()
{
    for (int i = 0; i < 128; ++i)
        randomNumbers[i] = nextRandomNumber();
}

The file declares two global variables (randomNumbers and seed) and two global functions (nextRandomNumber() and populateRandomArray()). Two of the declarations contain the static keyword; these are visible only within the current compilation unit (random.cpp) and are said to have internal linkage (often informally called static linkage). The two others can be accessed from any compilation unit in the program; these have external linkage.

Static linkage is ideal for helper functions and internal variables that should not be used in other compilation units. It reduces the risks of having colliding identifiers (global variables with the same name or global functions with the same signature in different compilation units) and prevents malicious or otherwise ill-advised users from accessing the internals of a compilation unit.

Let’s now look at the second file, main.cpp, which uses the global variable and the global function declared with external linkage in random.cpp:

#include <iostream>

extern int randomNumbers[128];

void populateRandomArray();

int main()
{
    populateRandomArray();
    for (int i = 0; i < 128; ++i)
        std::cout << randomNumbers[i] << std::endl;
    return 0;
}

We declare the external variables and functions before we call them. The external variable declaration (which makes an external variable visible in the current compilation unit) for randomNumbers starts with the extern keyword. Without extern, the compiler would think it has to deal with a variable definition, and the linker would complain because the same variable is defined in two compilation units (random.cpp and main.cpp). Variables can be declared as many times as we want, but they may be defined only once. The definition is what causes the compiler to reserve space for the variable.

The populateRandomArray() function is declared using a function prototype. The extern keyword is optional for functions.

Typically, we would put the external variable and function declarations in a header file and include it in all the files that need them:

#ifndef RANDOM_H
#define RANDOM_H

extern int randomNumbers[128];

void populateRandomArray();

#endif

We have already seen how static can be used to declare member variables and functions that are not attached to a specific instance of the class, and now we have seen how to use it to declare functions and variables with static linkage. There is one more use of the static keyword that should be noted in passing. In C++, we can declare a local variable static. Such variables are initialized the first time the function is called and hold their value between function invocations. For example:

int nextPrime()
{
    static int n = 1;

    do {
        ++n;
    } while (!isPrime(n));

    return n;
}

Static local variables are similar to global variables, except that they are only visible inside the function where they are defined.
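A simpler illustration of the same mechanism is a counter that remembers how far it has got. Both nextTicket() and ticketSpan() are hypothetical functions written for this sketch.

```cpp
// Returns 1 on the first call, 2 on the second call, and so on.
int nextTicket()
{
    static int last = 0;    // initialized once, the first time we get here
    ++last;
    return last;
}

// Calls nextTicket() n times (n >= 1) and reports how far the counter
// moved between the first and the last of those calls.
int ticketSpan(int n)
{
    int first = nextTicket();
    int last = first;
    for (int i = 1; i < n; ++i)
        last = nextTicket();
    return last - first;
}
```

Because last keeps its value between invocations, ticketSpan(5) returns 4 no matter how many tickets were handed out earlier, yet no code outside nextTicket() can read or modify the variable.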

Namespaces

Namespaces are a mechanism for reducing the risks of name clashes in C++ programs. Name clashes are often an issue in large programs that use several third-party libraries. In your own programs, you can choose whether you want to use namespaces.

Typically, we put a namespace around all the declarations in a header file to ensure that the identifiers declared in that header file don’t leak into the global namespace. For example:

#ifndef SOFTWAREINC_RANDOM_H
#define SOFTWAREINC_RANDOM_H

namespace SoftwareInc
{
    extern int randomNumbers[128];

    void populateRandomArray();
}

#endif

(Notice that we have also renamed the preprocessor macro used to avoid multiple inclusions, reducing the risk of a name clash with a header file of the same name but located in a different directory.)

The namespace syntax is similar to that of a class, but it doesn’t end with a semicolon. Here’s the new random.cpp file:

#include "random.h"

int SoftwareInc::randomNumbers[128];

static int seed = 42;

static int nextRandomNumber()
{
    seed = 1009 + (seed * 2011);
    return seed;
}

void SoftwareInc::populateRandomArray()
{
    for (int i = 0; i < 128; ++i)
        randomNumbers[i] = nextRandomNumber();
}

Unlike classes, namespaces can be “reopened” at any time. For example:

namespace Alpha
{
    void alpha1();
    void alpha2();
}

namespace Beta
{
    void beta1();
}

namespace Alpha
{
    void alpha3();
}

This makes it possible to define hundreds of classes, located in as many header files, as part of a single namespace. Using this trick, the Standard C++ library puts all its identifiers in the std namespace. In Qt, namespaces are used for global-like identifiers such as Qt::AlignBottom and Qt::yellow. For historical reasons, Qt classes do not belong to any namespace but are prefixed with the letter ‘Q’.

To refer to an identifier declared in a namespace from outside the namespace, we prefix it with the name of the namespace (and ::). Alternatively, we can use one of the following three mechanisms, which are aimed at reducing the number of keystrokes we must type.

  • We can define a namespace alias:

    namespace ElPuebloDeLaReinaDeLosAngeles
    {
        void beverlyHills();
        void culverCity();
        void malibu();
        void santaMonica();
    }
    
    namespace LA = ElPuebloDeLaReinaDeLosAngeles;

    After the alias definition, the alias can be used instead of the original name.

  • We can import a single identifier from a namespace:

    int main()
    {
        using ElPuebloDeLaReinaDeLosAngeles::beverlyHills;
    
        beverlyHills();
        ...
    }

    The using declaration allows us to access a given identifier from a namespace without having to prefix it with the name of the namespace.

  • We can import an entire namespace with a single directive:

    int main()
    {
        using namespace ElPuebloDeLaReinaDeLosAngeles;
    
        santaMonica();
        malibu();
        ...
    }

    With this approach, name clashes are more likely to occur. If the compiler complains about an ambiguous name (e.g., two classes with the same name defined in two different namespaces), we can always qualify the identifier with the name of the namespace when referring to it.

The Preprocessor

The C++ preprocessor is a program that converts a .cpp source file containing # directives (such as #include, #ifndef, and #endif) into a source file that contains no such directives. These directives perform simple textual operations on the source file, such as conditional compilation, file inclusion, and macro expansion. Normally, the preprocessor is invoked automatically by the compiler, but most systems still offer a way to invoke it alone (often through a -E or /E compiler option).

  • The #include directive expands to the contents of the file specified within angle brackets (<>) or double quotes (""), depending on whether the header file is installed at a standard location or is part of the current project. The file name may contain .. and / (which Windows compilers correctly interpret as a directory separator). For example:

    #include "../shared/globaldefs.h"

  • The #define directive defines a macro. Occurrences of the macro appearing after the #define directive are replaced with the macro’s definition. For example, the directive

    #define PI 3.14159265359

    tells the preprocessor to replace all future occurrences of the token PI in the current compilation unit with the token 3.14159265359. To avoid clashes with variable and class names, it is common practice to give macros all-uppercase names. It is possible to define macros that take arguments:

    #define SQUARE(x) ((x) * (x))

    In the macro body, it is good style to surround all occurrences of the parameters with parentheses, as well as the entire body, to avoid problems with operator precedence. After all, we want 7 * SQUARE(2 + 3) to expand to 7 * ((2 + 3) * (2 + 3)), not to 7 * 2 + 3 * 2 + 3.

    C++ compilers normally allow us to define macros on the command line, using the -D or /D option. For example:

    CC -DPI=3.14159265359 -c main.cpp

    Macros were very popular in the old days, before typedefs, enums, constants, inline functions, and templates were introduced. Nowadays, their most important role is to protect header files against multiple inclusions.

  • Macros can be undefined at any point using #undef:

    #undef PI

    This is useful if we want to redefine a macro, since the preprocessor doesn’t let us give an already defined macro a different definition. It is also useful for controlling conditional compilation.

  • Portions of code can be processed or skipped using #if, #elif, #else, and #endif, based on the numeric value of macros. For example:

    #define NO_OPTIM         0
    #define OPTIM_FOR_SPEED  1
    #define OPTIM_FOR_MEMORY 2
    
    #define OPTIMIZATION     OPTIM_FOR_MEMORY
    
    ...
    
    #if OPTIMIZATION == OPTIM_FOR_SPEED
    typedef int MyInt;
    #elif OPTIMIZATION == OPTIM_FOR_MEMORY
    typedef short MyInt;
    #else
    typedef long long MyInt;
    #endif

    In the preceding example, only the second typedef declaration would be processed by the compiler, resulting in MyInt being defined as a synonym for short. By changing the definition of the OPTIMIZATION macro, we get different programs. If a macro isn’t defined, its value is taken to be 0.

    Another approach to conditional compilation is to test whether a macro is defined. This can be done using the defined() operator as follows:

    #define OPTIM_FOR_MEMORY
    
    ...
    
    #if defined(OPTIM_FOR_SPEED)
    typedef int MyInt;
    #elif defined(OPTIM_FOR_MEMORY)
    typedef short MyInt;
    #else
    typedef long long MyInt;
    #endif

  • For convenience, the preprocessor recognizes #ifdef X and #ifndef X as synonyms for #if defined(X) and #if !defined(X). To protect a header file against multiple inclusions, we wrap its contents with the following idiom:

    #ifndef MYHEADERFILE_H
    #define MYHEADERFILE_H
    
    ...
    
    #endif

    The first time the header file is included, the symbol MYHEADERFILE_H is not defined, so the compiler processes the code between #ifndef and #endif. The second and any subsequent times the header file is included, MYHEADERFILE_H is defined, so the entire #ifndef ... #endif block is skipped.

  • The #error directive emits a user-defined error message at compile time. This is often used in conjunction with conditional compilation to report an impossible case. For example:

    class UniChar
    {
    public:
    #if BYTE_ORDER == BIG_ENDIAN
        uchar row;
        uchar cell;
    #elif BYTE_ORDER == LITTLE_ENDIAN
        uchar cell;
        uchar row;
    #else
    #error "BYTE_ORDER must be BIG_ENDIAN or LITTLE_ENDIAN"
    #endif
    };

Unlike most other C++ constructs, where whitespace is irrelevant, preprocessor directives stand alone on a line and require no semicolon. Very long directives can be split across multiple lines by ending every line except the last with a backslash.

The Standard C++ Library

In this section, we will briefly review the Standard C++ library. Table D.3 lists the core C++ header files. The <exception>, <limits>, <new>, and <typeinfo> headers support the C++ language; for example, <limits> allows us to test properties of the compiler’s integer and floating-point arithmetic support, and <typeinfo> offers basic introspection. The other headers provide generally useful classes, including a string class and a complex numeric type. The functionality offered by <bitset>, <locale>, <string>, and <typeinfo> loosely overlaps with the QBitArray, QLocale, QString, and QMetaObject classes in Qt.

Table D.3. Core C++ library header files

Header File     Description
<bitset>        Template class for representing fixed-length bit sequences
<complex>       Template class for representing complex numbers
<exception>     Types and functions related to exception handling
<limits>        Template class that specifies properties of numeric types
<locale>        Classes and functions related to localization
<new>           Functions that manage dynamic memory allocation
<stdexcept>     Predefined types of exceptions for reporting errors
<string>        Template string container and character traits
<typeinfo>      Class that provides basic meta-information about a type
<valarray>      Template classes for representing value arrays

Standard C++ also includes a set of header files that deal with I/O, listed in Table D.4. The design of the standard I/O classes harks back to the 1980s and is needlessly complex, making the classes very hard to extend; so hard, in fact, that entire books have been written on the subject. It also leaves the programmer with a Pandora’s box of unresolved issues related to character encodings and platform-dependent binary representations of primitive data types.

Table D.4. C++ I/O library header files

Header File     Description
<fstream>       Template classes that manipulate external files
<iomanip>       I/O stream manipulators that take an argument
<ios>           Template base class for I/O streams
<iosfwd>        Forward declarations for several I/O stream template classes
<iostream>      Standard I/O streams (cin, cout, cerr, clog)
<istream>       Template class that controls input from a stream buffer
<ostream>       Template class that controls output to a stream buffer
<sstream>       Template classes that associate stream buffers with strings
<streambuf>     Template classes that buffer I/O operations
<strstream>     Classes for performing I/O stream operations on character arrays

Chapter 12 presents the corresponding Qt classes, which feature Unicode I/O as well as a large set of national character encodings and a platform-independent abstraction for storing binary data. Qt’s I/O classes form the basis of Qt’s inter-process communication, networking, and XML support. Qt’s binary and text stream classes are very easy to extend to handle custom data types.

The early 1990s saw the introduction of the Standard Template Library (STL), a set of template-based container classes, iterators, and algorithms that slipped into the ISO C++ standard at the eleventh hour. Table D.5 lists the header files that form the STL. The STL has a very clean, almost mathematical design that provides generic type-safe functionality. Qt provides its own container classes, whose design is partly inspired by the STL. We describe them in Chapter 11.

Table D.5. STL header files

Header File     Description
<algorithm>     General-purpose template functions
<deque>         Double-ended queue template container
<functional>    Templates that help construct and manipulate functors
<iterator>      Templates that help construct and manipulate iterators
<list>          Doubly linked list template container
<map>           Single-valued and multi-valued map template containers
<memory>        Utilities for simplifying memory management
<numeric>       Template numeric operations
<queue>         Queue template container
<set>           Single-valued and multi-valued set template containers
<stack>         Stack template container
<utility>       Basic template functions
<vector>        Vector template container

Since C++ is essentially a superset of the C programming language, C++ programmers also have the entire C library at their disposal. The C header files are available either with their traditional names (e.g., <stdio.h>) or with new-style names with a c- prefix and no .h (e.g., <cstdio>). When we use the new-style version, the functions and data types are declared in the std namespace. (This doesn’t apply to macros such as assert(), because the preprocessor is unaware of namespaces.) The new-style syntax is recommended if your compiler supports it.

Table D.6 lists the C library header files. Most of these offer functionality that overlaps with more recent C++ headers or with Qt. One notable exception is <cmath>, which declares mathematical functions such as sin(), sqrt(), and pow().

Table D.6. C++ header files for C library facilities

Header File     Description
<cassert>       The assert() macro
<cctype>        Functions for classifying and mapping characters
<cerrno>        Macros related to error condition reporting
<cfloat>        Macros that specify properties of primitive floating-point types
<ciso646>       Alternative spellings for ISO 646 charset users
<climits>       Macros that specify properties of primitive integer types
<clocale>       Functions and types related to localization
<cmath>         Mathematical functions and constants
<csetjmp>       Functions for performing non-local jumps
<csignal>       Functions for handling system signals
<cstdarg>       Macros for implementing variable argument list functions
<cstddef>       Common definitions for several standard headers
<cstdio>        Functions for performing I/O
<cstdlib>       General utility functions
<cstring>       Functions for manipulating char arrays
<ctime>         Types and functions for manipulating time
<cwchar>        Extended multibyte and wide character utilities
<cwctype>       Functions for classifying and mapping wide characters

This completes our quick overview of the Standard C++ library. On the Internet, Dinkumware offers complete reference documentation for the Standard C++ library at http://www.dinkumware.com/refxcpp.html, and SGI has a comprehensive STL programmer’s guide at http://www.sgi.com/tech/stl/. The official definition of the Standard C++ library is in the C and C++ standards, available as PDF files or paper copies from the International Organization for Standardization (ISO).

In this appendix, we covered a lot of ground at a fast pace. When you start learning Qt from Chapter 1, you should find that the syntax is a lot simpler and clearer than this appendix might have suggested. Good Qt programming only requires the use of a subset of C++ and usually avoids the need for the more complex and obscure syntax that C++ makes possible. Once you start typing in code and building and running executables, the clarity and simplicity of the Qt approach will become apparent. And as soon as you start writing more ambitious programs, especially those that need fast and fancy graphics, the C++/Qt combination will continue to keep pace with your needs.



[*] Some compilers allow variables in that context, but this feature should not be relied upon in portable programs.
