12.8 (Optional) Polymorphism, Virtual Functions and Dynamic Binding “Under the Hood”

C++ makes polymorphism easy to program. It’s certainly possible to program for polymorphism in non-object-oriented languages such as C, but doing so requires complex and potentially dangerous pointer manipulations. This section discusses how C++ can implement polymorphism, virtual functions and dynamic binding internally. This will give you a solid understanding of how these capabilities really work. More importantly, it will help you appreciate the overhead of polymorphism—in terms of additional memory consumption and processor time. This can help you determine when to use polymorphism and when to avoid it. C++ Standard Library classes like array and vector are implemented without polymorphism and virtual functions to avoid the associated execution-time overhead and achieve optimal performance.

First, we’ll explain the data structures that the compiler builds at compile time to support polymorphism at execution time. You’ll see that polymorphism is accomplished through three levels of pointers, i.e., triple indirection. Then we’ll show how an executing program uses these data structures to execute virtual functions and achieve the dynamic binding associated with polymorphism. Our discussion explains a possible implementation; this is not a language requirement.

When C++ compiles a class that has one or more virtual functions, it builds a virtual function table (vtable) for that class. The vtable contains pointers to the class’s virtual functions—a pointer to a function contains the starting address in memory of the code that performs the function’s task. Just as an array name is implicitly convertible to the address of the array’s first element, a function name is implicitly convertible to the starting address of its code. An executing program uses the vtable to select the proper function implementation each time a virtual function of that class is called on any object of that class. The leftmost column of Fig. 12.18 illustrates the vtables for the classes Employee, SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee.
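The idea of a table of function pointers can be tried by hand before looking at what the compiler builds. The following is a minimal sketch, not the compiler's actual vtable: the names describeSalaried, describeCommission, DescribeFn, describeTable and describeByIndex are hypothetical and do not appear in the case study. It shows only that a function name converts implicitly to the address of its code and that a program can select a function through a table entry at execution time.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical free functions standing in for two implementations of one task.
std::string describeSalaried() { return "salaried employee"; }
std::string describeCommission() { return "commission employee"; }

// A pointer-to-function type; it holds the starting address of a function's code.
using DescribeFn = std::string (*)();

// A hand-built table of function pointers, loosely analogous to a vtable.
// Each function name converts implicitly to the address of its code.
const std::array<DescribeFn, 2> describeTable{describeSalaried, describeCommission};

// Selects a table entry at execution time and calls through the pointer.
std::string describeByIndex(std::size_t index) {
   assert(index < describeTable.size());
   return describeTable[index]();
}
```

Calling describeByIndex(1) from main, for example, runs describeCommission without naming that function at the call site. A compiler-generated vtable works on the same principle, but the compiler builds and indexes the table for you.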

Employee Class vtable

In the Employee class vtable, the first function pointer is set to 0 (i.e., nullptr), because function earnings is a pure virtual function and therefore lacks an implementation. The second function pointer points to function toString, which returns a string containing the employee’s full name and social security number. [Note: We’ve abbreviated the output of each toString function in this figure to conserve space.] Any class that has one or more nullptrs (represented with the value 0) in its vtable is an abstract class. Classes without any nullptrs in their vtables (such as SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee) are concrete classes.

SalariedEmployee Class vtable

Class SalariedEmployee overrides function earnings to return the employee’s weekly salary, so the function pointer points to the earnings function of class SalariedEmployee. SalariedEmployee also overrides toString, so the corresponding function pointer points to the SalariedEmployee member function that returns "salaried employee: " followed by the employee’s name, social security number and weekly salary.

CommissionEmployee Class vtable

The earnings function pointer in the vtable for class CommissionEmployee points to the CommissionEmployee’s earnings function that returns the employee’s gross sales multiplied by the commission rate. The toString function pointer points to the CommissionEmployee version of the function, which returns the employee’s type, name, social security number, commission rate and gross sales. As in class SalariedEmployee, both functions override the functions in class Employee.

BasePlusCommissionEmployee Class vtable

The earnings function pointer in the vtable for class BasePlusCommissionEmployee points to the BasePlusCommissionEmployee’s earnings function, which returns the employee’s base salary plus gross sales multiplied by commission rate. The toString function pointer points to the BasePlusCommissionEmployee version of the function, which returns the employee’s base salary plus the type, name, social security number, commission rate and gross sales. Both functions override the functions in class CommissionEmployee.

Inheriting Concrete virtual Functions

In our Employee case study, each concrete class provides its own implementation for virtual functions earnings and toString. You’ve learned that each class that inherits directly from abstract base class Employee must implement earnings in order to be a concrete class, because earnings is a pure virtual function. These classes do not need to implement function toString, however, to be considered concrete—toString is not a pure virtual function and derived classes can inherit class Employee’s implementation of toString.

Fig. 12.18 How virtual function calls work.

Furthermore, class BasePlusCommissionEmployee does not have to implement either function toString or earnings—both function implementations can be inherited from concrete class CommissionEmployee. If a class in our hierarchy were to inherit function implementations in this manner, the vtable pointers for these functions would simply point to the function implementation that was being inherited. For example, if BasePlusCommissionEmployee did not override earnings, the earnings function pointer in the vtable for class BasePlusCommissionEmployee would point to the same earnings function as the vtable for class CommissionEmployee.
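The effect of inheriting a virtual function implementation can be observed in a small program. The sketch below uses simplified stand-ins for the case-study classes (the data members, constructors and returned strings are assumptions, not the code of the Employee hierarchy), and the helper earningsViaBasePointer is hypothetical. Here BasePlusCommissionEmployee overrides toString but deliberately does not override earnings, so a call through a base-class pointer runs the CommissionEmployee version.

```cpp
#include <cassert>
#include <string>

// Simplified abstract base class; earnings is pure virtual.
class Employee {
public:
   virtual ~Employee() = default;
   virtual double earnings() const = 0;
   virtual std::string toString() const { return "employee"; }
};

// Concrete class that overrides both virtual functions.
class CommissionEmployee : public Employee {
public:
   CommissionEmployee(double sales, double rate)
      : grossSales{sales}, commissionRate{rate} {}
   double earnings() const override { return grossSales * commissionRate; }
   std::string toString() const override { return "commission employee"; }
private:
   double grossSales;
   double commissionRate;
};

// Overrides toString only; earnings is inherited from CommissionEmployee,
// so both classes share the same earnings implementation.
class BasePlusCommissionEmployee : public CommissionEmployee {
public:
   using CommissionEmployee::CommissionEmployee; // inherit the constructor
   std::string toString() const override {
      return "base-salaried commission employee";
   }
};

// Hypothetical helper: a virtual call made through a base-class pointer.
double earningsViaBasePointer(const Employee* employeePtr) {
   assert(employeePtr != nullptr);
   return employeePtr->earnings();
}
```

Passing the address of a BasePlusCommissionEmployee to earningsViaBasePointer yields the CommissionEmployee calculation, while toString on the same object yields the derived-class string. In a vtable-based implementation, this corresponds to the two classes' earnings entries holding the same address.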

Three Levels of Pointers to Implement Polymorphism

Polymorphism is accomplished through an elegant data structure involving three levels of pointers. We’ve discussed one level—the function pointers in the vtable. These point to the actual functions that execute when a virtual function is invoked.

Now we consider the second level of pointers. Whenever an object of a class with one or more virtual functions is instantiated, the compiler attaches to the object a pointer to the vtable for that class. This pointer is normally at the front of the object, but it isn’t required to be implemented that way. In Fig. 12.18, these pointers are associated with the objects created in Fig. 12.17 (one object for each of the types SalariedEmployee, CommissionEmployee and BasePlusCommissionEmployee). The diagram shows each object’s data member values. For example, the salariedEmployee object contains a pointer to the SalariedEmployee vtable; the object also contains the values John Smith, 111-11-1111 and $800.00.
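The per-object cost of the vtable pointer can be made visible with sizeof. This is a sketch under an assumption: the C++ standard does not require a vtable pointer, so the result is implementation-specific, although common compilers add roughly one pointer (possibly plus padding) to each object. The types PlainSalary and VirtualSalary and the function extraBytesPerObject are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// No virtual functions: the object holds only its data member.
struct PlainSalary {
   double weeklySalary;
};

// Same data member, plus virtual functions. On typical implementations
// each object also stores a hidden pointer to the class's vtable.
struct VirtualSalary {
   virtual ~VirtualSalary() = default;
   virtual double earnings() const { return weeklySalary; }
   double weeklySalary{0.0};
};

// Number of additional bytes per object attributable to the virtual
// machinery on this platform (implementation-specific).
std::size_t extraBytesPerObject() {
   assert(sizeof(VirtualSalary) >= sizeof(PlainSalary)); // guard the subtraction
   return sizeof(VirtualSalary) - sizeof(PlainSalary);
}
```

Printing extraBytesPerObject() on your own platform shows the memory overhead that each polymorphic object carries, independent of how many virtual functions the class declares.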

The third level of pointers simply contains the handles to the objects that receive the virtual function calls. The handles in this level may also be references. Figure 12.18 depicts the vector employees that contains Employee pointers.

Now let’s see how a typical virtual function call executes. Consider in the function virtualViaPointer the call baseClassPtr->toString() (line 62 of Fig. 12.17). Assume that baseClassPtr contains employees[1] (i.e., the address of object commissionEmployee in employees). When the compiler compiles this statement, it determines that the call is indeed being made via a base-class pointer and that toString is a virtual function.

The compiler determines that toString is the second entry in each of the vtables. To locate this entry, the compiler notes that it will need to skip the first entry. Thus, the compiler compiles an offset or displacement into the table of machine-language object-code pointers to find the code that will execute the virtual function call. The size in bytes of the offset depends on the number of bytes used to represent a function pointer on an individual platform. For example, on a 32-bit platform, a pointer is typically stored in four bytes, whereas on a 64-bit platform, a pointer is typically stored in eight bytes. We assume four bytes for this discussion.

The compiler generates code that performs the following operations [Note: The numbers in the list correspond to the circled numbers in Fig. 12.18]:

  1. Select the ith entry of employees (in this case, the address of object commissionEmployee) and pass it as an argument to function virtualViaPointer. This sets parameter baseClassPtr to point to commissionEmployee.

  2. Dereference that pointer to get to the commissionEmployee object—which, as you recall, begins with a pointer to the CommissionEmployee vtable.

  3. Dereference commissionEmployee’s vtable pointer to get to the CommissionEmployee vtable.

  4. Skip the offset of four bytes to select the toString function pointer.

  5. Dereference the toString function pointer to form the “name” of the actual function to execute, and use the function-call operator () to execute the appropriate toString function, which in this case returns the employee’s type, name, social security number, gross sales and commission rate.
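The five operations above can be imitated by hand, which also illustrates the kind of pointer manipulation a non-object-oriented language would require. The following is a simulation under stated assumptions, not what any particular compiler emits: the names VTable, EmployeeObject, SalariedObject, CommissionObject, makeSalaried, makeCommission and callToString are hypothetical, and the objects carry fewer data members than the case-study classes. The comments in callToString map each line to the corresponding numbered step.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct EmployeeObject; // defined below

using EarningsFn = double (*)(const EmployeeObject&);
using ToStringFn = std::string (*)(const EmployeeObject&);

// Level 1: one table of function pointers per class.
struct VTable {
   EarningsFn earnings; // first entry
   ToStringFn toString; // second entry
};

// Level 2: every object begins with a pointer to its class's table.
struct EmployeeObject {
   const VTable* vptr;
   std::string name;
};

struct SalariedObject : EmployeeObject { double weeklySalary; };
struct CommissionObject : EmployeeObject { double grossSales; double commissionRate; };

// The per-class implementations that the table entries point to.
double salariedEarnings(const EmployeeObject& e) {
   return static_cast<const SalariedObject&>(e).weeklySalary;
}
std::string salariedToString(const EmployeeObject& e) {
   return "salaried employee: " + e.name;
}
double commissionEarnings(const EmployeeObject& e) {
   const auto& c = static_cast<const CommissionObject&>(e);
   return c.grossSales * c.commissionRate;
}
std::string commissionToString(const EmployeeObject& e) {
   return "commission employee: " + e.name;
}

const VTable salariedVTable{salariedEarnings, salariedToString};
const VTable commissionVTable{commissionEarnings, commissionToString};

// Hypothetical factories that play the constructor's role of setting vptr.
SalariedObject makeSalaried(const std::string& name, double weeklySalary) {
   SalariedObject object;
   object.vptr = &salariedVTable;
   object.name = name;
   object.weeklySalary = weeklySalary;
   return object;
}
CommissionObject makeCommission(const std::string& name, double sales, double rate) {
   CommissionObject object;
   object.vptr = &commissionVTable;
   object.name = name;
   object.grossSales = sales;
   object.commissionRate = rate;
   return object;
}

// Level 3: a vector of base pointers. The body mirrors steps 1 through 5.
std::string callToString(const std::vector<const EmployeeObject*>& employees,
                         std::size_t i) {
   assert(i < employees.size());
   const EmployeeObject* baseClassPtr = employees[i]; // 1. select the ith entry
   const EmployeeObject& object = *baseClassPtr;      // 2. get to the object
   const VTable& table = *object.vptr;                // 3. get to its vtable
   ToStringFn function = table.toString;              // 4. skip to the second entry
   return (*function)(object);                        // 5. call through the pointer
}
```

With a vector holding the addresses of one SalariedObject and one CommissionObject, callToString(employees, 1) runs commissionToString even though the call site names neither class. Note that the static_cast in each implementation is safe only because every object's vptr matches its actual type; the compiler guarantees this for real virtual functions, whereas here the programmer must.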

Figure 12.18’s data structures may appear complex, but this complexity is managed by the compiler and hidden from you, making polymorphic programming straightforward. The pointer dereferencing operations and memory accesses that occur on every virtual function call require some additional execution time. The vtables and the vtable pointers added to the objects require some additional memory.

Performance Tip 12.1

Polymorphism, as typically implemented with virtual functions and dynamic binding in C++, is efficient. For most applications, you can use these capabilities with nominal impact on performance.

 

Performance Tip 12.2

Virtual functions and dynamic binding enable polymorphic programming as an alternative to switch logic programming. Optimizing compilers normally generate polymorphic code that’s nearly as efficient as hand-coded switch-based logic. Polymorphism’s overhead is acceptable for most applications. In some situations—such as real-time applications with stringent performance requirements—polymorphism’s overhead may be too high.
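To make the comparison with switch logic concrete, the sketch below computes earnings both ways. It is illustrative only: EmployeeKind, EmployeeRecord, earningsViaSwitch and earningsViaVirtual are hypothetical names, and the classes are simplified stand-ins for the case-study hierarchy. It makes no claim about the relative speed of the two approaches, which depends on the compiler and platform.

```cpp
#include <cassert>

// Switch-based approach: a type tag stored in each record selects the logic.
enum class EmployeeKind { salaried, commission };

struct EmployeeRecord {
   EmployeeKind kind;
   double weeklySalary;   // used when kind is salaried
   double grossSales;     // used when kind is commission
   double commissionRate; // used when kind is commission
};

double earningsViaSwitch(const EmployeeRecord& record) {
   switch (record.kind) {
      case EmployeeKind::salaried:
         return record.weeklySalary;
      case EmployeeKind::commission:
         return record.grossSales * record.commissionRate;
   }
   assert(false && "unknown EmployeeKind");
   return 0.0;
}

// Polymorphic approach: the virtual call selects the logic.
class Employee {
public:
   virtual ~Employee() = default;
   virtual double earnings() const = 0;
};

class SalariedEmployee : public Employee {
public:
   explicit SalariedEmployee(double salary) : weeklySalary{salary} {}
   double earnings() const override { return weeklySalary; }
private:
   double weeklySalary;
};

class CommissionEmployee : public Employee {
public:
   CommissionEmployee(double sales, double rate)
      : grossSales{sales}, commissionRate{rate} {}
   double earnings() const override { return grossSales * commissionRate; }
private:
   double grossSales;
   double commissionRate;
};

double earningsViaVirtual(const Employee& employee) {
   return employee.earnings();
}
```

Both functions return the same values for equivalent data. The difference lies in maintenance: adding a new employee type means editing every switch in the first approach, but only adding a new derived class in the second.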
