Chapter 27. Types and RTTI

FAQ 27.01 What is the purpose of this chapter?

This chapter explores static and dynamic type checking, both of which are allowed in C++. Its main theme is that static type checking is always a good idea, while the impulse to use dynamic type checking should be carefully controlled. In those cases, such as persistence, where some form of dynamic type checking might be required, runtime type identification (RTTI) should be used.

FAQ 27.02 What is static type checking?

Static type checking, sometimes known as static typing, is when the compiler checks the type correctness of operations at compile time. For example, the compiler checks that the arguments passed to a function match its parameter types and that a member function invocation is guaranteed to work at runtime; it flags improper matches as errors at compile time.

In object-oriented programs, the most common symptom of a type mismatch is the attempt to invoke a member function via a reference to an object, where the reference's type and/or the object's type does not support the member function. For example, if class X has member function f() but not member function g() and x is an instance of class X, then x.f() is legal and x.g() is illegal.
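
A minimal sketch of this situation might look as follows. The names X, f(), g(), and x come from the text; the function useX() and the return values are assumptions made only so the fragment is self-contained. The illegal call is left commented out because the compiler would reject it.

```cpp
#include <string>

class X {
public:
  // f() is declared, so calls to f() type-check at compile time.
  std::string f() const { return "X::f()"; }
  // Deliberately, there is no member function g().
};

// Hypothetical helper that exercises an X object.
std::string useX()
{
  X x;
  std::string result = x.f();   // legal: X has a member function f()
  // x.g();                     // illegal: would not compile, X has no member g()
  return result;
}
```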

Fortunately, C++ catches errors like this at compile time.

FAQ 27.03 What is dynamic type checking?

Dynamic type checking, sometimes known as dynamic typing, is the determination of type correctness at runtime.

With dynamic type checking, user code determines whether an object supports a particular member function at runtime rather than at compile time. Dynamic type checking is often accompanied by downcasts (see FAQ 27.11) and can unnecessarily increase the cost of C++ software. Runtime type identification (RTTI) is one kind of dynamic type checking that is supported directly by C++.

The following example demonstrates the wrong way to do things. Pretend that the various escape sequences toggle italics on the various kinds of printers.
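
As a rough sketch of this anti-pattern, the code below uses the names type(), printUsingItalics(), and the italicsXXX() naming scheme mentioned in the text. The derived class names, the PrinterType enumeration, and the escape sequences are assumptions made for illustration. To keep the fragment easy to test, printUsingItalics() returns the bytes that would be sent to the printer instead of printing them.

```cpp
#include <string>

// Tag values, playing the same role as the tag field of a tagged union.
enum class PrinterType { Epson, Proprinter };

class Printer {
public:
  virtual ~Printer() = default;
  virtual PrinterType type() const = 0;   // the type query that user code depends on
};

class EpsonPrinter : public Printer {
public:
  PrinterType type() const override { return PrinterType::Epson; }
  // "\033i+" and "\033i-" are made-up escape sequences, not real printer codes.
  std::string italicsEpson(const std::string& s) const
  { return "\033i+" + s + "\033i-"; }
};

class ProprinterPrinter : public Printer {
public:
  PrinterType type() const override { return PrinterType::Proprinter; }
  // Again, made-up escape sequences.
  std::string italicsProprinter(const std::string& s) const
  { return "\033[i" + s + "\033[n"; }
};

// User code: uses code (the switch) to find code (the right italicsXXX()).
std::string printUsingItalics(const Printer& p, const std::string& s)
{
  switch (p.type()) {
    case PrinterType::Epson:
      // Unchecked downcast: correct only as long as type() tells the truth.
      return static_cast<const EpsonPrinter&>(p).italicsEpson(s);
    case PrinterType::Proprinter:
      return static_cast<const ProprinterPrinter&>(p).italicsProprinter(s);
    // Every new kind of printer needs another case here.
  }
  return s;   // unknown tag: fall back to plain text
}
```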

Although the example uses classes and virtual functions, it is not the best use of OO technology. The type() member function is used in basically the same way that procedural code uses tagged unions (that is, tag fields that indicate which piece of the union is currently being used). This approach is subject to error and is nonextensible compared to the proper use of classes and virtual functions, shown later in this chapter. For example, adding a new kind of printer requires changes to the printUsingItalics() function and probably to other functions as well. This is a ripple effect that is typical with non-OO (and bad OO!) software.

FAQ 27.04 What is the basic problem with dynamic type checking?

The basic problem with dynamic type checking is that it uses code to find code, creating extensibility problems later.

With dynamic type checking, code has to be written to check the type of an object to see if it supports a particular set of member functions (this is the code that is doing the finding). Accessing the member functions may require a downcast to the appropriate pointer type (this is the code that is being searched for).

When the user code uses code to find server code, the user code is more complex and fragile. OO programming is supposed to encapsulate complexity, and the inappropriate use of dynamic type checking can undo this benefit. Often dynamic type-checking tests require the user code to know the server's inheritance hierarchy, in which case changing the server's inheritance hierarchy breaks the user code. This is unfortunate, considering that one of the main goals of object-oriented technology is to reduce maintenance costs.

Dynamic type checking also requires a runtime check to ensure that the object supports the requested member function. This is usually implemented using control flow, such as an if or switch statement. These runtime tests are frequently avoidable if the design exploits the static type-checking capabilities of the C++ compiler.

Finally, it is much more expensive to catch an error at runtime than it is to find the same error at compile time. Don't use dynamic type checking without a good reason.

FAQ 27.05 How can dynamic type checking be avoided?

Design. Design. Design.

Circumstances sometimes require the use of dynamic type checking, but unfortunately, dynamic type checking is often used when it is not required. Often dynamic type checking is used because the programmer does not have enough expertise or does not take the time to produce a good object-oriented design. When dynamic type checking seems attractive, try revising the design instead. If dynamic type checking still seems desirable after the design has been revisited, use it, but be aware of the additional coding, testing, and maintenance costs.

FAQ 27.06 Are there better alternatives to dynamic type checking?

One alternative to dynamic type checking and downcasts is dynamic binding and virtual functions. To use this alternative technique, member functions that show up only in the derived classes are generalized and moved up to the base class. Effectively this means that the class selection and downcast are performed automatically and safely by C++. Furthermore, this approach produces extensible software because it automatically extends itself whenever a new derived class is created, as if an extra case or else if magically appeared in the dynamic type-checking technique.

The following example is a rework of the code from FAQ 27.03. Compared to the old class hierarchy, the italicsXXX() member functions from the derived classes are generalized and moved into the base class as virtual member function italics(). This results in a substantial simplification of the user code printUsingItalics(). Instead of selecting the printer type based on a type() member function and using control flow logic to figure out what to do, the user code simply invokes the new italics() member function.
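
A sketch of such a rework could look like this. The names Printer, italics(), and printUsingItalics() come from the text; the derived class names and the escape sequences are assumptions, and printUsingItalics() returns the bytes that would be sent to the printer so the fragment is easy to test.

```cpp
#include <string>

class Printer {
public:
  virtual ~Printer() = default;
  // Generalized from the per-class italicsXXX() member functions.
  virtual std::string italics(const std::string& s) const = 0;
};

class EpsonPrinter : public Printer {
public:
  // Made-up escape sequences, not real printer codes.
  std::string italics(const std::string& s) const override
  { return "\033i+" + s + "\033i-"; }
};

class ProprinterPrinter : public Printer {
public:
  // Made-up escape sequences, not real printer codes.
  std::string italics(const std::string& s) const override
  { return "\033[i" + s + "\033[n"; }
};

// User code: no type() query, no switch, no downcast.
std::string printUsingItalics(const Printer& p, const std::string& s)
{
  return p.italics(s);   // dynamic binding selects the right override
}
```

With this shape, a new kind of printer is just another derived class that overrides italics(); printUsingItalics() stays untouched.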

From a broader standpoint, complexity is moved from the user code to the server code, from the many to the few. This is normally the right trade-off. Furthermore, adding a new kind of Printer doesn't require existing code to be modified—reducing the ripple effect when compared to FAQ 27.03.

FAQ 27.07 What is a capability query?

A capability query is an inspector member function (see FAQ 14.07) that allows users to determine whether an object supports some other member function. Capability queries invite inflexibility.

The benefit of capability queries is that they allow a class designer to avoid thinking about how users will use the objects, instead forcing the user code to explicitly test the classes and objects to see what capabilities they support.

The problem with capability queries is that they allow a class designer to avoid thinking about how users will use the objects, instead forcing the user code to explicitly test the classes and objects to see what capabilities they support.

Capability queries export complexity from the server to the users, from the few to the many. User code often needs explicit control flow to select operations based on the results of a capability query—user code uses code to find code (see FAQ 27.04). This impacts existing user code when new derived classes are added.

Capability queries are not normally recommended.

FAQ 27.08 What is an alternative to dynamic type checking with containers?

Templates offer a viable alternative when working with containers.

In the past, some container classes were designed assuming the existence of some kind of master base class. This has been called the based object approach. In particular, it was common to encounter container classes that inserted or extracted elements that were pointers to a single base class, typically called Object.

Applying the based object approach to containers makes it hard to mix two or more class libraries. For example, it may not be possible to put an object from one library into a container from another library, since the master base classes from the two libraries normally won't match exactly.

In general, this approach can and should be avoided through the use of templates or design patterns. The particular problem of extensible container classes has been elegantly solved in the standard C++ container classes by using templates and iterators.
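
As a small illustration of the template-based approach, the sketch below walks a standard container with an iterator. The function name totalLength() is hypothetical. The point is that the element type is part of the container's type, so no cast from a master base class is needed to use an element.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// The container's element type is fixed at compile time by the template
// argument, so *it is statically known to be a std::string.
std::size_t totalLength(const std::vector<std::string>& words)
{
  std::size_t total = 0;
  for (std::vector<std::string>::const_iterator it = words.begin();
       it != words.end(); ++it) {
    total += it->size();   // no downcast from an Object* is required
  }
  return total;
}
```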

Note that in Java every class inherits from class Object. But since there is exactly one Object class in Java, as opposed to one per library vendor in C++, it isn't as big a problem in Java. The important point is that Java and C++ are very different in some fundamental ways. Syntactically they appear to be quite similar, but semantically there are some fundamental differences. Therefore just because a technique works in one language (say the based object approach works in Java) does not mean the same approach works or should be made to work in a different language (see FAQ 28.08).

FAQ 27.09 Are there cases where dynamic type checking is necessary?

Yes, particularly with persistent heterogeneous objects.

A program can't have static knowledge about things that existed before the execution of the program. If objects from several classes were previously stored in a database, the program that peels the objects off the disk drive's platter (or, equivalently, slurps them from a coaxial cable) cannot know the types of the objects because it didn't create them.

In these cases, the objects may need to be queried about their types, especially if the persistent objects are highly heterogeneous. To whatever extent possible, use the maxim “Ask once, then remember.” In other words, try to avoid asking an object its type (or its capabilities) every time it is used. This is especially true if the queries require reasoning about the objects in a nonextensible manner (that is, control flow logic that uses code to find code; see FAQ 27.04).
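
One possible reading of "Ask once, then remember" is sketched below: a stored type tag is inspected in exactly one place, when the object is reconstructed, and from then on the object is used only through virtual functions. The class names, the string tags, and the function loadShape() are all assumptions made for illustration; a real persistence layer would look different.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

class Shape {
public:
  virtual ~Shape() = default;
  virtual std::string name() const = 0;
};

class Circle : public Shape {
public:
  std::string name() const override { return "Circle"; }
};

class Square : public Shape {
public:
  std::string name() const override { return "Square"; }
};

// "Ask once": the only place that inspects the stored type tag.
std::unique_ptr<Shape> loadShape(const std::string& tag)
{
  if (tag == "circle") return std::unique_ptr<Shape>(new Circle);
  if (tag == "square") return std::unique_ptr<Shape>(new Square);
  throw std::runtime_error("unknown type tag: " + tag);
}

// "Then remember": later code relies on dynamic binding, not on the tag.
std::string describe(const Shape& s)
{
  return "loaded a " + s.name();
}
```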

Note that it is normally possible to avoid the type queries if the objects are known to be of the same class (homogeneous) or at least known to be derived from some common ABC that has a fairly rich set of member functions.

FAQ 27.10 Given a pointer to an ABC, how can the class of the referent be found?

This is an idea that should be avoided.

The typical reason for trying to find an object's class is to use an algorithm that depends on the object's class. If the algorithm varies depending on the derived class, then the algorithm should be a virtual member function in the class hierarchy. If the algorithm is structurally the same for all derived classes but has little pieces that differ depending on the derived class, then the little pieces should be virtual member functions in the class hierarchy. This technique lets derived classes select the ideal algorithm or algorithm fragments without any additional branch points in the software (see FAQ 27.04).

For example, finding the minimal distance to a mouse click requires different algorithms for circles, squares, lines, and so forth. One might be tempted to write non-OO code such as the following (pretend Position, Shape, Circle, and so forth, are classes).
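
A sketch of such non-OO code is shown here. Position, Shape, Circle, and dist() are names from the text; Square being axis-aligned, the public data members, the distance formulas (distance to a filled shape), and the -1.0 return value for unknown shapes are simplifying assumptions.

```cpp
#include <algorithm>
#include <cmath>
#include <typeinfo>

struct Position { double x, y; };

class Shape {
public:
  virtual ~Shape() = default;
};

class Circle : public Shape {
public:
  Circle(Position c, double r) : center(c), radius(r) {}
  Position center;
  double radius;
};

class Square : public Shape {   // axis-aligned, for simplicity
public:
  Square(Position c, double h) : center(c), half(h) {}
  Position center;
  double half;                  // half of the side length
};

// User code that asks for the object's class and then branches on it.
double dist(const Position& mouse, const Shape& s)
{
  if (typeid(s) == typeid(Circle)) {
    const Circle& c = static_cast<const Circle&>(s);
    double dx = mouse.x - c.center.x;
    double dy = mouse.y - c.center.y;
    return std::max(0.0, std::sqrt(dx * dx + dy * dy) - c.radius);
  } else if (typeid(s) == typeid(Square)) {
    const Square& q = static_cast<const Square&>(s);
    double dx = std::max(0.0, std::fabs(mouse.x - q.center.x) - q.half);
    double dy = std::max(0.0, std::fabs(mouse.y - q.center.y) - q.half);
    return std::sqrt(dx * dx + dy * dy);
  }
  // Each new kind of Shape needs another else-if section here.
  return -1.0;   // unknown shape: nothing sensible to return
}
```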

One problem with this non-OO technique is that adding a new derived class requires working user code to be modified by adding a new else if section. Besides the obvious concern that changing working user code may break it, in large systems it is difficult to find all the places that need to be changed, and in very large systems there is typically a scheduling problem coordinating the changes in diverse teams of developers. In one organization the ripple effect was so bad that it took nine months to add a new gizmo to the system (this was mainly due to a scheduling concern, since the entire system was huge: in excess of 10 million lines of non-OO code). After a proper OO design of selected subsystems, the same sorts of additions are now routinely done by a single person in a single day.[1]

[1] “Lessons Learned from the OS/400 OO Project,” Communications of the ACM, 1995; 38(10): 54–64.

A proper OO design would move the function dist() into the Shape rather than moving the Shape into the function dist().
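
A sketch of that design follows, with dist() as a pure virtual member function of Shape. As before, the axis-aligned Square and the distance formulas (distance to a filled shape) are simplifying assumptions, not details taken from the text.

```cpp
#include <algorithm>
#include <cmath>

struct Position { double x, y; };

class Shape {
public:
  virtual ~Shape() = default;
  // Each derived class supplies its own distance algorithm.
  virtual double dist(const Position& mouse) const = 0;
};

class Circle : public Shape {
public:
  Circle(Position c, double r) : center_(c), radius_(r) {}
  double dist(const Position& mouse) const override
  {
    double dx = mouse.x - center_.x;
    double dy = mouse.y - center_.y;
    return std::max(0.0, std::sqrt(dx * dx + dy * dy) - radius_);
  }
private:
  Position center_;
  double radius_;
};

class Square : public Shape {   // axis-aligned, for simplicity
public:
  Square(Position c, double h) : center_(c), half_(h) {}
  double dist(const Position& mouse) const override
  {
    double dx = std::max(0.0, std::fabs(mouse.x - center_.x) - half_);
    double dy = std::max(0.0, std::fabs(mouse.y - center_.y) - half_);
    return std::sqrt(dx * dx + dy * dy);
  }
private:
  Position center_;
  double half_;                 // half of the side length
};

// User code: no type test and no branch per derived class.
double distanceToClick(const Shape& s, const Position& mouse)
{
  return s.dist(mouse);
}
```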

The OO solution greatly reduces the amount of code that needs to be modified when a new class is added. A little extra design work pays large dividends.

FAQ 27.11 What is a downcast?

A downcast is the conversion of a Base* to a Derived*, where class Derived is publicly derived from class Base. A downcast is used when the client code thinks (or hopes!) that a Base* points to an object of class Derived or a class derived from Derived and it needs to access a member function that is provided by Derived but not by Base.

For example, suppose class LiquidAsset is derived from class Asset, and LiquidAsset is a derived class that is liquidatable but Asset itself is not liquidatable. A downcast from an Asset* to a LiquidAsset* allows the liquidation.
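
The if-downcast pattern described here might be sketched as follows. Asset, LiquidAsset, and tryToLiquidate come from the text; the capability query isLiquidatable(), the accessors, and the default value of 100 are assumptions. To keep the fragment testable, tryToLiquidate() returns the message a program would print.

```cpp
#include <string>

class Asset {
public:
  virtual ~Asset() = default;
  // Hypothetical capability query.
  virtual bool isLiquidatable() const { return false; }
};

class LiquidAsset : public Asset {
public:
  explicit LiquidAsset(int value = 100) : value_(value) {}
  bool isLiquidatable() const override { return true; }
  int getValue() const { return value_; }
  void setValue(int value) { value_ = value; }
private:
  int value_;
};

// User code: an if-downcast pair.
std::string tryToLiquidate(Asset& asset)
{
  if (asset.isLiquidatable()) {
    // Unchecked downcast: valid only if isLiquidatable() returns true
    // exactly for LiquidAsset objects (or classes derived from it).
    LiquidAsset& liquid = static_cast<LiquidAsset&>(asset);
    int value = liquid.getValue();
    liquid.setValue(0);
    return "Liquidated $" + std::to_string(value);
  }
  return "Sorry, couldn't liquidate this asset";
}
```

Calling tryToLiquidate() first on a plain Asset and then on a default-constructed LiquidAsset yields the two messages quoted in this FAQ.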

The output of this program follows.

Sorry, couldn't liquidate this asset
Liquidated $100

Although dynamic_cast (see FAQ 27.17) can eliminate the unsafe casts, it cannot eliminate the nonextensible control flow logic. See FAQ 27.12 for a better alternative.

FAQ 27.12 What is an alternative to using downcasts?

Move the user code into the object in the form of virtual functions.

An if-downcast pair can often be replaced by a virtual function call. The key insight is to replace the capability query with a service request. A service request is a virtual function that the client can use to politely ask an object to perform some action (such as “Try to liquidate yourself”).

To help find the segments of code that will need to be moved into the service requests, look for those segments of code that use capability queries and depend on the type of the class. Segments of code that depend on the type of the class should be moved into the hierarchy as virtual functions; segments of code that don't depend on the type of the class can remain user code or can be nonvirtual member functions in the base class.

In the previous FAQ, the service request in the user code included the entire tryToLiquidate operation (this entire operation depended on the derived class). To apply this guideline, move the code for this operation into the class hierarchy as a virtual function.
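
A sketch of the service-request version could look like this. The class names and tryToLiquidate come from the text; the member getValue(), the default value of 100, and the choice to return the message as a string are assumptions.

```cpp
#include <string>

class Asset {
public:
  virtual ~Asset() = default;
  // Service request: "Try to liquidate yourself."
  virtual std::string tryToLiquidate()
  { return "Sorry, couldn't liquidate this asset"; }
};

class LiquidAsset : public Asset {
public:
  explicit LiquidAsset(int value = 100) : value_(value) {}
  std::string tryToLiquidate() override
  {
    int value = value_;
    value_ = 0;   // the asset is now liquidated
    return "Liquidated $" + std::to_string(value);
  }
  int getValue() const { return value_; }
private:
  int value_;
};

// User code: no capability query and no explicit downcast.
std::string sample(Asset& asset)
{
  return asset.tryToLiquidate();
}
```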

The output of this program follows.

Sorry, couldn't liquidate this asset
Liquidated $100

In the previous FAQ, the downcast was explicit and was therefore subject to human error. In the revised solution, the conversion from Asset* to LiquidAsset* is implicitly part of the virtual function call mechanism. LiquidAsset::tryToLiquidate() does not need to downcast the this pointer into a LiquidAsset*.

Think of a virtual function call as an extensible if-downcast pair that always downcasts to the right type.

FAQ 27.13 Why are downcasts dangerous?

Downcasts override the help a compiler can give and rely solely on the knowledge of the programmer.

A downcast from a base class pointer to a derived class pointer instructs the compiler to convert the pointer without checking what it actually points to. But if you've guessed wrong about the object's class, you're in big trouble: the coerced pointer can create havoc. Learn about type-safe downcasting with RTTI (see FAQ 27.16) instead, but more important, avoid downcasts entirely.

FAQ 27.14 Should the inheritance graph of C++ hierarchies be tall or short?

The inheritance graph should be a forest of short trees.

When the inheritance graph is too tall, downcasts are common. This is because the type of the pointer is often sufficiently different from the type of the object that the desired member function is available only by downcasting the pointer. Also, the deeper the graph, the less likely that the inheritance relationships are proper. A tall graph is frequently a sign of an uninformed attempt at code reuse. Remember: inheritance is not for code reuse (see FAQ 8.12).

The type-safe philosophy espoused in this book discourages the unnecessary use of downcasting, even if downcasts are checked first.

FAQ 27.15 Should the inheritance graph of C++ hierarchies be monolithic or a forest?

The inheritance graph should be a forest.

The inheritance hierarchy of well-designed C++ software is normally a forest of little trees rather than a large, monolithic tree. Monolithic trees usually result in excessive use of downcasting. The type-safe philosophy espoused in this book discourages the use of downcasting.

FAQ 27.16 What is Runtime Type Identification (RTTI)?

RTTI is the official way in standard C++ to discover the type of an object and to convert the type of a pointer or reference (that is, dynamic typing). The need came from practical experience with C++. RTTI replaces many homegrown versions with a solid, consistent approach. It has many features and capabilities; this chapter discusses dynamic_cast<T>(), static_cast<T>(), and typeid(). Other features, such as const_cast() and reinterpret_cast(), and issues related to multiple/private/protected/virtual inheritance are not discussed.

FAQ 27.17 What is the purpose of dynamic_cast<T>()?

It's a way to see if an object supports a given interface at runtime. It can be a bit complicated, so this simplified FAQ covers only the normal situations that occur repeatedly.

Very loosely speaking, dynamic_cast<T>(x) is like the old-style cast (T)x, meaning that it casts the value of x to the type T (T is normally either a pointer or a reference to some class). dynamic_cast<T>(x) has several important advantages over the old-style cast. It never performs an invalid conversion since it checks that the cast is legal at runtime, and the syntax is more obvious and explicit than the old-style cast, thus appropriately calling attention to the conversion.

If p is a pointer, dynamic_cast<Fred*>(p) converts p to a Fred* like (Fred*)p, but if the conversion is not valid, it returns NULL. If r is a reference, dynamic_cast<Fred&>(r) converts r to a Fred& just like (Fred&)r, but if the conversion is not valid, an exception of type bad_cast is thrown. A conversion is valid if the object pointed to by p (or referred to by r) is either a Fred or a publicly derived class of Fred. Here is some sample syntax.
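
The two forms might be used as in the sketch below. Fred is the class name used in the text; Base, Wilma, hello(), and the two helper functions are hypothetical, and the pointer result is compared against nullptr, the modern spelling of NULL.

```cpp
#include <string>
#include <typeinfo>   // std::bad_cast

class Base {
public:
  virtual ~Base() = default;   // at least one virtual function is required
};

class Fred : public Base {
public:
  std::string hello() const { return "Fred"; }
};

class Wilma : public Base {};

// Pointer form: an invalid conversion yields a null pointer.
std::string viaPointer(Base* p)
{
  Fred* cp = dynamic_cast<Fred*>(p);
  return cp != nullptr ? cp->hello() : "not a Fred";
}

// Reference form: an invalid conversion throws std::bad_cast.
std::string viaReference(Base& r)
{
  try {
    Fred& cr = dynamic_cast<Fred&>(r);
    return cr.hello();
  } catch (const std::bad_cast&) {
    return "not a Fred";
  }
}
```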

When dynamic_cast<T>(p) is being used to perform a downcast, p's type must designate a class with at least one virtual function (or be NULL). However, this restriction does not apply to potential recipients of the cast, such as cp in the example.

FAQ 27.18 Is dynamic_cast<T>() a panacea?

No, like everything else, dynamic_cast<T>() can be misused.

It's a horrible design error, but some programmers (mis)use dynamic_cast<T>() in huge if / then / else blocks to determine an object's type and then take the appropriate action. This situation screams out for virtual functions and dynamic binding, not the extensibility-killing misuse of RTTI (see FAQ 27.03).

Also, watch out for performance hits due to this implementation technique. It is all too easy to think of dynamic_cast<T>() as a constant-time operation, when in fact it may take linear time and chew up CPU cycles if the inheritance hierarchies are deep or if the advice about huge if blocks has been ignored.

FAQ 27.19 What does static_cast<T>() do?

It tells the compiler, “Trust me.”

Sometimes the programmer knows the type of an object and has to or wants to let the compiler in on the secret. static_cast<T>() is the standard C++ way to do this at compile time. There are situations where either the knowledge to make the cast exists only in the programmer's mind or the runtime system cannot do the job because of technical reasons. Here is some sample syntax.

Target* tg = static_cast<Target*>(src);  // just do it

The C++ static_cast<T>() is better than C-style casting because it stands out in the code and explicitly states the programmer's understanding and intentions. It also understands and respects const and access controls.

FAQ 27.20 What does typeid() do?

It determines the precise type of an object at runtime.

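Given a reference or a dereferenced pointer (such as *p) as input, typeid() returns a reference to an object of the standard library class type_info. (Applied to the pointer p itself, typeid() reports the pointer's declared type rather than the type of the object it points to.) type_info has a name() member function that returns the name of the operand's type in an implementation-specific format. When the operand's class has at least one virtual function, this name represents the precise, lowest-level type of the object. If p is NULL, typeid(*p) throws a bad_typeid exception.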

Note that dynamic_cast<T>(p) and static_cast<T>(p) use template-like syntax, with T playing the role of a template parameter and p that of a function parameter, although both are built-in operators rather than library function templates. typeid() does not use this syntax.

typeid() and dynamic_cast<T>() are two sides of the same coin. They both take a base class pointer or reference that may refer to a derived class object. But typeid() returns a class name whereas dynamic_cast<T>() is passed a class name. typeid() is used to discover the object's exact class, but it doesn't convert the pointer; dynamic_cast<T>() converts the pointer but doesn't determine the object's exact class—the pointer may be converted to some intermediate base class rather than to the object's exact class.
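
The difference between the two can be sketched with a three-level hierarchy. All class and function names here are hypothetical. The comparison uses type_info equality rather than name(), since the format of name() is implementation-specific.

```cpp
#include <typeinfo>

class Shape {
public:
  virtual ~Shape() = default;
};

class Circle : public Shape {};

class FilledCircle : public Circle {};   // further-derived class

// typeid: true only when the object's exact class is Circle.
bool isExactlyCircle(const Shape& s)
{
  return typeid(s) == typeid(Circle);
}

// dynamic_cast: true when the object is a Circle or is publicly
// derived from Circle, so Circle may be an intermediate base class.
bool isKindOfCircle(const Shape& s)
{
  return dynamic_cast<const Circle*>(&s) != nullptr;
}
```

For a FilledCircle object, isExactlyCircle() is false while isKindOfCircle() is true, which is the distinction the paragraph above draws.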

The character representation of the class name from name() is stored in system memory and must not be deleted by the programmer.

FAQ 27.21 Are there any hidden costs for type-safe downcasts?

Yes, type-safe downcasts have five hidden costs.

Although type-safe downcasts never cast a pointer to an incorrect type, they have five hidden costs. They increase coding cost, maintenance cost, testing cost, runtime CPU cost, and extensibility cost.

  1. Coding cost: Type-safe downcasts move complexity from the server code into the user code, from the few to the many.
  2. Maintenance cost: Moving code from the server code to the user code increases the overall software bulk.
  3. Testing cost: A test harness must be devised to exercise every if, including the ifs used to test the type safety of the downcasts.
  4. Runtime CPU cost: Additional code must be executed to test the type safety of the downcasts. This is not a constant time cost, by the way, since it may be necessary to search an entire inheritance hierarchy.
  5. Extensibility cost: The additional control flow code needs to be modified when new derived classes are added.

The underlying cause for these costs lies with the style of programming implied by type-safe downcasts rather than with the downcasts themselves. Embracing the more extensible style of programming that does not use unnecessary downcasts is part of using C++ properly.
