C++ makes polymorphism easy to program. It’s certainly possible to program for polymorphism in non-object-oriented languages such as C, but doing so requires complex and potentially dangerous pointer manipulations. This section discusses how C++ can implement polymorphism, virtual functions and dynamic binding internally. This will give you a solid understanding of how these capabilities really work. More importantly, it will help you appreciate the overhead of polymorphism in terms of additional memory consumption and processor time. This can help you determine when to use polymorphism and when to avoid it. C++ Standard Library classes like array and vector are implemented without polymorphism and virtual functions to avoid the associated execution-time overhead and achieve optimal performance.
First, we’ll explain the data structures that the compiler builds at compile time to support polymorphism at execution time. You’ll see that polymorphism is accomplished through three levels of pointers, i.e., triple indirection. Then we’ll show how an executing program uses these data structures to execute virtual functions and achieve the dynamic binding associated with polymorphism. Our discussion explains a possible implementation; this is not a language requirement.
When C++ compiles a class that has one or more virtual functions, it builds a virtual function table (vtable) for that class. The vtable contains pointers to the class’s virtual functions. A pointer to a function contains the starting address in memory of the code that performs the function’s task. Just as an array name is implicitly convertible to the address of the array’s first element, a function name is implicitly convertible to the starting address of its code. An executing program uses the vtable to select the proper function implementation each time a virtual function of that class is called on any object of that class. The leftmost column of Fig. 12.18 illustrates the vtables for the classes Employee, SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee.
Employee Class vtable
In the Employee class vtable, the first function pointer is set to 0 (i.e., nullptr), because function earnings is a pure virtual function and therefore lacks an implementation. The second function pointer points to function toString, which returns a string containing the employee’s full name and social security number. [Note: We’ve abbreviated the output of each toString function in this figure to conserve space.] Any class that has one or more nullptrs (represented with the value 0) in its vtable is an abstract class. Classes without any nullptrs in their vtables (such as SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee) are concrete classes.
SalariedEmployee Class vtable
Class SalariedEmployee overrides function earnings to return the employee’s weekly salary, so the function pointer points to the earnings function of class SalariedEmployee. SalariedEmployee also overrides toString, so the corresponding function pointer points to the SalariedEmployee member function that returns "salaried employee: " followed by the employee’s name, social security number and weekly salary.
CommissionEmployee Class vtable
The earnings function pointer in the vtable for class CommissionEmployee points to the CommissionEmployee’s earnings function that returns the employee’s gross sales multiplied by the commission rate. The toString function pointer points to the CommissionEmployee version of the function, which returns the employee’s type, name, social security number, commission rate and gross sales. As in class SalariedEmployee, both functions override the functions in class Employee.
BasePlusCommissionEmployee Class vtable
The earnings function pointer in the vtable for class BasePlusCommissionEmployee points to the BasePlusCommissionEmployee’s earnings function, which returns the employee’s base salary plus gross sales multiplied by commission rate. The toString function pointer points to the BasePlusCommissionEmployee version of the function, which returns the employee’s base salary plus the type, name, social security number, commission rate and gross sales. Both functions override the functions in class CommissionEmployee.
virtual Functions
In our Employee case study, each concrete class provides its own implementation for virtual functions earnings and toString. You’ve learned that each class that inherits directly from abstract base class Employee must implement earnings in order to be a concrete class, because earnings is a pure virtual function. These classes do not need to implement function toString, however, to be considered concrete: toString is not a pure virtual function, and derived classes can inherit class Employee’s implementation of toString.
Furthermore, class BasePlusCommissionEmployee does not have to implement either function toString or earnings, because both function implementations can be inherited from concrete class CommissionEmployee. If a class in our hierarchy were to inherit function implementations in this manner, the vtable pointers for these functions would simply point to the function implementation that was being inherited. For example, if BasePlusCommissionEmployee did not override earnings, the earnings function pointer in the vtable for class BasePlusCommissionEmployee would point to the same earnings function as the vtable for class CommissionEmployee.
Polymorphism is accomplished through an elegant data structure involving three levels of pointers. We’ve discussed one level: the function pointers in the vtable. These point to the actual functions that execute when a virtual function is invoked.
Now we consider the second level of pointers. Whenever an object of a class with one or more virtual functions is instantiated, the compiler attaches to the object a pointer to the vtable for that class. This pointer is normally at the front of the object, but it isn’t required to be implemented that way. In Fig. 12.18, these pointers are associated with the objects created in Fig. 12.17 (one object for each of the types SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee). The diagram shows each of the object’s data member values. For example, the salariedEmployee object contains a pointer to the SalariedEmployee vtable; the object also contains the values John Smith, 111-11-1111 and $800.00.
The third level of pointers simply contains the handles to the objects that receive the virtual function calls. The handles in this level may also be references. Figure 12.18 depicts the vector employees that contains Employee pointers.
Now let’s see how a typical virtual function call executes. Consider in the function virtualViaPointer the call baseClassPtr->toString() (line 62 of Fig. 12.17). Assume that baseClassPtr contains employees[1] (i.e., the address of object commissionEmployee in employees). When the compiler compiles this statement, it determines that the call is indeed being made via a base-class pointer and that toString is a virtual function.
The compiler determines that toString is the second entry in each of the vtables. To locate this entry, the compiler notes that it will need to skip the first entry. Thus, the compiler compiles an offset or displacement into the table of machine-language object-code pointers to find the code that will execute the virtual function call. The size in bytes of the offset depends on the number of bytes used to represent a function pointer on an individual platform. For example, on a 32-bit platform, a pointer is typically stored in four bytes, whereas on a 64-bit platform, a pointer is typically stored in eight bytes. We assume four bytes for this discussion.
The compiler generates code that performs the following operations [Note: The numbers in the list correspond to the circled numbers in Fig. 12.18]:
1. Select the ith entry of employees (in this case, the address of object commissionEmployee) and pass it as an argument to function virtualViaPointer. This sets parameter baseClassPtr to point to commissionEmployee.
2. Dereference that pointer to get to the commissionEmployee object, which, as you recall, begins with a pointer to the CommissionEmployee vtable.
3. Dereference commissionEmployee’s vtable pointer to get to the CommissionEmployee vtable.
4. Skip the offset of four bytes to select the toString function pointer.
5. Dereference the toString function pointer to form the “name” of the actual function to execute, and use the function-call operator () to execute the appropriate toString function, which in this case returns the employee’s type, name, social security number, gross sales and commission rate.
Figure 12.18’s data structures may appear complex, but this complexity is managed by the compiler and hidden from you, making polymorphic programming straightforward. The pointer dereferencing operations and memory accesses that occur on every virtual function call require some additional execution time. The vtables and the vtable pointers added to the objects require some additional memory.
Polymorphism, as typically implemented with virtual functions and dynamic binding in C++, is efficient. For most applications, you can use these capabilities with nominal impact on performance.
Virtual functions and dynamic binding enable polymorphic programming as an alternative to switch logic programming. Optimizing compilers normally generate polymorphic code that’s nearly as efficient as hand-coded switch-based logic. Polymorphism’s overhead is acceptable for most applications. In some situations, such as real-time applications with stringent performance requirements, polymorphism’s overhead may be too high.