CHAPTER 2
Modern C++

C++ was thoroughly modernized in 2011 with the addition of a plethora of features and constructs borrowed from more recent programming languages. As a result, C++ kept its identity as the language of choice for close-to-the-metal, high-performance software in demanding contexts. At the same time, it adapted to the demands of modern hardware, adopted modern programming idioms borrowed from the field of functional programming, and incorporated into its syntax and standard library useful constructs that were previously available only from third-party libraries.

The major innovation is the new Standard Threading Library, which is explored in the next chapter. Since we use new C++11 constructs in the rest of the text, this chapter selectively introduces some particularly useful innovations. A more complete picture can be found online or in up-to-date C++ textbooks. Readers familiar with C++11 may safely skip this chapter.

2.1 LAMBDA EXPRESSIONS

One of the most useful features in C++11, borrowed from the field of functional programming, is the ability to define anonymous function objects on the fly with the lambda syntax, as in:

images

The new auto keyword provides automatic type deduction, which is particularly useful with lambdas that produce a different compiler-generated type for every lambda declaration. A lambda declaration starts with a capture clause [], followed by the list of its arguments (a lambda is after all a function, or more exactly a callable object), and its body, the sequence of instructions that are executed when the lambda is called, just like the body of functions and methods. The difference is that lambdas are declared within functions. Once declared, they may be called like any other function or function object, for example:

images

One powerful feature of lambdas is their ability to capture variables from their environment on declaration.

[] means no capture.
[=] means capture by value, that is, by copy, of all variables in scope used in the lambda's body.
[&] means capture all variables by reference.

We may also capture variables selectively with the syntax:

[x] means capture only x, by value with this syntax or by reference with [&x].
[=, &x, &y] means capture x and y by reference, and all others by value. Obviously [&, x, y] means capture x and y by value and all others by reference.
[x, &y] means capture x by value, y by reference and nothing else.

For instance,

images
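As an illustration of the capture modes listed above, here is a minimal sketch. The names captureDemo and CaptureResults are hypothetical and introduced only for this example. The variables are modified after the lambdas are declared, so that copies and references can be told apart.

```cpp
#include <cassert>

// Hypothetical helper type: holds what each lambda returned
struct CaptureResults { int byValue; int byReference; int mixed; };

inline CaptureResults captureDemo()
{
    int x = 1, y = 10;

    // [=] copies x and y at the point of declaration
    auto byValue = [=]() { return x + y; };

    // [&] refers to the live variables x and y
    auto byReference = [&]() { return x + y; };

    // [x, &y] copies x, refers to y, and captures nothing else
    auto mixed = [x, &y]() { return x + y; };

    // Modify the originals after the lambdas are declared
    x = 2;
    y = 20;

    // byValue still sees (1, 10), byReference sees (2, 20), mixed sees (1, 20)
    return { byValue(), byReference(), mixed() };
}
```

Captures by reference are only safe as long as the captured variables outlive the lambda, which is the case here since everything lives in the same function.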

Behind the scenes, the compiler creates a function object when we declare a lambda, that is, an object that defines the operator () (with the arguments of the lambda) and therefore is callable (like a function). The captured variables are implicitly declared as data members with a value type when captured by value and a reference type when captured by reference, initialized with the captured data on declaration. Hence, the syntax:

images

is equivalent to the (much heavier):

images
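To make the equivalence concrete, the following sketch places a lambda next to a hand-written function object that behaves the same way. The class name ScaleAndShift and its members are hypothetical; the type actually generated by a compiler is unnamed and its layout is implementation-specific.

```cpp
#include <cassert>

// Hand-written counterpart of the lambda
//   [a, &b](const double x) { return a * x + b; }
class ScaleAndShift
{
    const double myA;   // captured by value: a copy taken on construction
    double&      myB;   // captured by reference: refers to the original

public:
    ScaleAndShift(const double a, double& b) : myA(a), myB(b) {}

    // operator() carries the arguments and the body of the lambda
    double operator()(const double x) const { return myA * x + myB; }
};

inline bool lambdaMatchesFunctor()
{
    double a = 2.0, b = 1.0;

    auto lam = [a, &b](const double x) { return a * x + b; };
    ScaleAndShift fun(a, b);

    b = 5.0;   // seen by both callables, since b is held by reference

    // Both evaluate 2 * 3 + 5
    return lam(3.0) == fun(3.0) && lam(3.0) == 11.0;
}
```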

As a function object, a lambda can be passed as an argument or returned from functions. Functions that manipulate functions are called higher-order functions, and the standard <algorithm> library provides a vast number of these.

Lambdas are also incredibly useful as adapters and resolve a constant annoyance C++ developers face when calling functions with signatures inconsistent with their data. The Standard Template Library (STL), for instance, includes a wealth of useful generic algorithms. But to use these algorithms we must respect their functions' signatures. Say we hold a vector of times from today:

images

and we want to compute an annuity given a constant rate images. We could write a hand-crafted loop, of course:

images

but it is considered best professional practice to apply generic algorithms instead.1 The computation we just conducted is a reduction, where a collection is traversed sequentially and an accumulator is updated for each element. The STL algorithm for reductions is accumulate, located in the <numeric> header. The version of interest to us has the following signature:

images

The type images of the accumulator in our case is double, as is *InputIt, so the function images that updates the accumulator images for each element images must be consistent with the form:

images

but our instruction for the update of the accumulator is:

images

and prior to C++11, it would have been such an annoyance to squeeze that line of code into the required signature that we would probably have ended up with the hand-crafted loop. With the lambda syntax, it takes a line to do that right:

images
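A reduction of this kind could be sketched as follows. This is an illustration under an assumption: we take the annuity to be the sum of the discount factors exp(-r t) over the times in the vector, which may differ from the exact update used in the original listing. The function name annuity is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Assumption: annuity = sum over all times t of exp(-r * t)
inline double annuity(const std::vector<double>& times, const double r)
{
    // The lambda adapts our update to the signature expected by accumulate:
    // it takes the accumulator and the current element, and returns the
    // updated accumulator. The rate r is captured by value.
    return std::accumulate(times.begin(), times.end(), 0.0,
        [r](const double acc, const double t) { return acc + std::exp(-r * t); });
}
```

Note the initial value 0.0: accumulate deduces the type of the accumulator from it, so an integer literal 0 would silently truncate the sum.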

There are of course many other uses of lambdas, and we will discuss a few later, but their ability to seamlessly adapt data to signatures is the reason why we use them every day.

C++11 also provides dedicated adapter functions bind and mem_fn in the <functional> header (the latter turning member functions into free functions that take the object as their first argument), although lambdas can also do this in a more convenient manner. The syntax for bind in particular is rather peculiar and it is easier to achieve the exact same behavior with lambdas.

We will be working with lambdas throughout the book.

2.2 FUNCTIONAL PROGRAMMING IN C++

The introduction of lambdas is part of an effort to modernize C++ with idioms borrowed from the growing and fashionable field of functional programming. Although C++ does not, and never will, support functional programming idioms the way a language like Haskell does, C++ does support some key elements of functional programming, in particular value semantics for functions and higher-order functions.

Value semantics means that functions may be manipulated just like other types and in particular they can be assigned to variables and passed as arguments or returned as results by higher-order functions. Note that lambdas are literals for functions, which means that the instruction:

images

assigns a function literal to images in the same way we assign number or string literals in:

images

C++11 defines the function template class in the <functional> header as a unique class for holding functions and anything callable. That means that a concrete type like

function<double(const double)>

can hold anything that may be called with a double to return a double: a C style function pointer, a function object, including a lambda, or a member function bound to an object. An object of that type is itself callable of course, and it has value semantics, in the sense that it can be assigned or passed as an argument, or returned as a result from a higher-order function.
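The sketch below assigns four different kinds of callables to the same function object in turn. All the names (twice, AddOne, Scaler, applyAll) are hypothetical and serve only this illustration.

```cpp
#include <cassert>
#include <functional>

// A plain, C style function
double twice(const double x) { return 2.0 * x; }

// A function object
struct AddOne
{
    double operator()(const double x) const { return x + 1.0; }
};

// A class with a member function
struct Scaler
{
    double factor;
    double scale(const double x) const { return factor * x; }
};

inline double applyAll(const double x)
{
    using namespace std::placeholders;

    std::function<double(const double)> f;

    f = twice;                                     // function pointer
    const double a = f(x);

    f = AddOne();                                  // function object
    const double b = f(x);

    f = [](const double y) { return y * y; };      // lambda
    const double c = f(x);

    Scaler s{10.0};
    f = std::bind(&Scaler::scale, &s, _1);         // member function bound to s
    const double d = f(x);                         // s is still alive here

    return a + b + c + d;
}
```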

It looks peculiar and at first sight impossible in C++ to define a type based on the behavior rather than the nature of the objects it holds.2 The function class is implemented with an interesting, advanced design pattern called type erasure. Unfortunately, this versatility comes at a cost. Type erasure necessarily involves the storage of the underlying objects on the heap. Hence, initializing, assigning, or copying a function object involves an allocation.3 For this reason, we refrain from using this class despite its convenience, and manipulate functions as template types instead.4

Composition

As a first example, we consider the composition of functions, and write a (higher-order) function that takes two functions as arguments and returns the function resulting from their composition.

images

We use the auto keyword so that types are deduced at compile time. Note that it is a function, not a number, that is returned. For instance, the following code creates a function by composing an exponential with a square root:

images

Lambdas are obviously unnecessary here; they wrap the functions exp and sqrt without adapting anything. However, the following does not compile on Visual Studio:

images

Standard mathematical functions are overloaded so they work with many different types, and the compiler doesn't know which overload to pick to instantiate the templates. For this reason, we must explicitly state the function types when we compose standard functions, as follows:

images
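For reference, a composition function along these lines could be sketched as follows. This is a minimal version that assumes C++14 (deduced return types and generic lambdas), and the names compose, sqrtOfExp and sqrtOfExpCast are hypothetical. The order of composition chosen here is compose(f, g)(x) = f(g(x)). The second helper selects the double overloads of the standard functions with a cast to a function pointer type, which is one possible way of stating the function types explicitly.

```cpp
#include <cassert>
#include <cmath>

// Returns the function x -> f(g(x))
template <class F, class G>
auto compose(F f, G g)
{
    return [f, g](auto x) { return f(g(x)); };
}

// Lambdas wrap the standard functions and select the double overloads
inline double sqrtOfExp(const double x)
{
    auto h = compose([](const double y) { return std::sqrt(y); },
                     [](const double y) { return std::exp(y); });
    return h(x);
}

// The overloads are selected explicitly through a function pointer type
inline double sqrtOfExpCast(const double x)
{
    using Fn = double (*)(double);
    auto h = compose(static_cast<Fn>(std::sqrt), static_cast<Fn>(std::exp));
    return h(x);
}
```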

We are not limited to numerical functions. Any function that takes an argument of type images and returns a result of type images (which we denote images) may be composed with any function images to create a function images. We can imagine a function that creates a vector images out of an unsigned integer:

images

where images is an STL algorithm from the header <algorithm> that fills a sequence by repeated calls to a function, and the lambda is marked mutable because its execution modifies its internal data images. We can code a function that sums up the values in a vector:

images

where the STL accumulate algorithm was discussed earlier. We could define a (particularly inefficient) way to compute the sum of the first n numbers by composition:

images
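The three steps could be sketched as follows. The function names firstNumbers, sumOf and sumFirst are hypothetical, and to keep the sketch self-contained the composition is spelled out with a lambda rather than with a generic composition function.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// unsigned -> vector: builds {1, 2, ..., n}
inline std::vector<unsigned> firstNumbers(const unsigned n)
{
    std::vector<unsigned> v(n);
    unsigned counter = 0;

    // The lambda is mutable because every call increments
    // its own copy of counter, captured by value
    std::generate(v.begin(), v.end(), [counter]() mutable { return ++counter; });

    return v;
}

// vector -> unsigned: sums the entries
inline unsigned sumOf(const std::vector<unsigned>& v)
{
    return std::accumulate(v.begin(), v.end(), 0u);
}

// unsigned -> unsigned: composition of the two functions above
inline unsigned sumFirst(const unsigned n)
{
    auto composed = [](const unsigned m) { return sumOf(firstNumbers(m)); };
    return composed(n);
}
```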

We could even design ways to compose functions of multiple arguments, either by binding or currying. We have to stop here and refer interested readers to a specialized publication like [21].

Lifting

Another useful idiom borrowed from functional programming is lifting. To lift a function means to turn it into one that operates on compound types. For instance, we may implement a lift that turns a scalar function into a vector function that applies the original function to all the elements of a vector:

images

images is a generic STL algorithm from header <algorithm> that applies a unary function to all the elements in a collection. What is returned from images is not a vector but a function of a vector that returns a vector. It can be used as follows (we lift the images function into a images that computes a vector of exponentials from a vector of numbers):

images
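A lift specialized for vectors of doubles could look like the sketch below. It assumes C++14 for the deduced return type, and the names lift, squares and vSquare are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Turns a function double -> double
// into a function vector<double> -> vector<double>
template <class F>
auto lift(F f)
{
    // What is returned is a function, not a vector
    return [f](const std::vector<double>& v)
    {
        std::vector<double> result(v.size());
        // Apply f to every element of v, writing into result
        std::transform(v.begin(), v.end(), result.begin(), f);
        return result;
    };
}

// Usage: lift a scalar square into a vector square
inline std::vector<double> squares(const std::vector<double>& v)
{
    auto vSquare = lift([](const double x) { return x * x; });
    return vSquare(v);
}
```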

images is another generic algorithm from the <algorithm> header that sequentially applies an action to all the elements in a collection. We use it to display the entries in the result vector images.

As a (slightly) more advanced example, suppose we have a function that implements the Black and Scholes formula from [22]:

images

We can lift it into a function that computes a vector of option prices from a vector of spots, but we must first turn it into a function of the spot alone by binding the other arguments. That could be done with a lambda, or with the bind function from the header <functional>:

images

More information about images can be found online. It takes a function, followed by its arguments in order, and returns a new function. When we pass a value for an argument, the argument is bound to this value. When we pass a placeholder images, the argument is bound to the imagesth argument of the resulting function. In our example, we created a new function out of images, by binding its first argument (the spot) to the first argument of the new function, and all other arguments to fixed values.

Alternatively, we could bind the spot and create a function that values a call out of volatility alone, and then lift it so it returns a vector of calls out of a vector of volatilities:

images
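The binding step can be illustrated with the following sketch. The pricing function here is an assumption made for the example: a call price with zero rates and dividends and a hypothetical signature (spot, strike, volatility, maturity), which may differ from the version in the original listing. The fixed values (strike 100, volatility 20%, maturity 1 year) are also arbitrary.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Standard normal cumulative distribution
inline double normalCdf(const double x)
{
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Assumed signature; zero rates and dividends
inline double blackScholes(const double spot, const double strike,
                           const double vol, const double mat)
{
    const double stdev = vol * std::sqrt(mat);
    const double d1 = std::log(spot / strike) / stdev + 0.5 * stdev;
    const double d2 = d1 - stdev;
    return spot * normalCdf(d1) - strike * normalCdf(d2);
}

// Function of the spot alone: _1 maps the first argument of the new
// function onto the spot, all other arguments are bound to fixed values
inline double callFromSpot(const double spot)
{
    using namespace std::placeholders;
    auto bsSpot = std::bind(blackScholes, _1, 100.0, 0.2, 1.0);
    return bsSpot(spot);
}

// Function of the volatility alone: the spot is now bound to a fixed value
inline double callFromVol(const double vol)
{
    using namespace std::placeholders;
    auto bsVol = std::bind(blackScholes, 100.0, 100.0, _1, 1.0);
    return bsVol(vol);
}
```

The same functions could be written with lambdas, for instance [](const double s) { return blackScholes(s, 100.0, 0.2, 1.0); }, which many readers find easier to follow.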

Note that our lifting function is specialized for functions images, lifting into a function images. It is possible, with template magic, to produce a generic lifting function for functions images, lifting into images where images is an arbitrary collection, not necessarily a vector. This exercise is out of scope here, and we refer to specialized publications.

Functional programming idioms are exciting and fashionable. For an excellent introduction to functional programming in its natural habitat Haskell, we refer to [23].

We barely scratched the surface of functional programming in C++11, but hopefully gave a sense of how functions may be created and manipulated like any other type of data. It would take a dedicated publication to cover that subject in full, and, indeed, one such publication exists, [21], where interested readers will find a much more complete discussion of the implementation of functional programming idioms in C++.

2.3 MOVE SEMANTICS

Moving on to a different topic, the following pattern is valid but inefficient in traditional C++:

images

The images vector is destroyed when images returns, but before that, the images vector is allocated and the contents of images are copied. This is of course very inefficient: memory is allocated twice, and an unnecessary duplication of data is conducted from a container that is destroyed immediately afterwards. This inefficiency might be caught by the compiler's RVO (Return Value Optimization), whereby the compiler would directly instantiate images inside images. RVO is not guaranteed5 so programmers settled for a less natural syntax where result vectors are passed by reference as arguments rather than returned from functions.

C++11 move semantics permanently resolved this situation.

Conventional C++ allows class developers to implement their own copy constructors and copy assignment operators:

images

The code in the body of the copy constructor and assignment is automatically executed whenever an object of that type is initialized or assigned from another object of the same type. The code doesn't have to conduct a copy (our example does not) but it is expected that this is the case, and that such code should result in the duplication of the right-hand side (images) into the left-hand side (images).

When the developer does not supply a copy constructor or a copy assignment, the compiler provides default ones that perform copies of all data members (by calling their own copy constructors or assignment operators when these members are themselves classes).

C++11 introduced additional move constructors and assignments, with the perhaps unusual “&&” syntax:

images

These are automatically invoked whenever the images is a temporary object, like a result returned from a function, as opposed to a named object. They can also be explicitly invoked with the images keyword (which is actually a function):

images
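To see which operation is selected in each situation, consider the following sketch. Tracer is a hypothetical class that only records how it was last constructed or assigned. The temporary case is shown with an assignment rather than an initialization, because compilers may (and since C++17, must in some cases) elide the constructor call altogether when initializing from a temporary.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class that records how it was constructed or assigned
struct Tracer
{
    std::string how;

    Tracer() : how("default") {}
    Tracer(const Tracer&) : how("copy") {}
    Tracer(Tracer&&) : how("move") {}

    Tracer& operator=(const Tracer&) { how = "copy-assign"; return *this; }
    Tracer& operator=(Tracer&&) { how = "move-assign"; return *this; }
};

inline std::string fromNamed()
{
    Tracer a;
    Tracer b(a);               // a is a named object: copy constructor
    return b.how;
}

inline std::string fromStdMove()
{
    Tracer a;
    Tracer b(std::move(a));    // explicit request: move constructor
    return b.how;
}

inline std::string fromTemporary()
{
    Tracer b;
    b = Tracer();              // right-hand side is a temporary: move assignment
    return b.how;
}
```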

We can (and, in the example, did) code whatever we want in the move constructor and assignment. What is expected is a quick transfer of the ownership of the images object's resources to the images object, without modification of the managed resources themselves, leaving the images empty.

Let us discuss a relevant example. If we wanted to code our own vector class wrapping a dumb pointer, we could proceed as follows (we simplify to the extreme and only show code for the core functionality). We start with the skeleton of a custom Vector class, including the copy constructor and copy assignment:

images

The copy constructor clones the images Vector into images, implementing a memory allocation followed by a copy of the data.6 The copy assignment operator does the same thing, but it must also release the data previously managed by images (for the copy constructor, images is not yet constructed so it doesn't manage any data).

To avoid duplicating the copy constructor code into the copy assignment operator, we applied the well known “copy and swap” idiom. The method images swaps the pointers (and sizes) of images and images Vectors, effectively swapping the ownership of data, without modifying the data itself in any way: after the swap, images owns images's previous data, and images owns the data previously managed by images. In the assignment operator, we construct a temporary vector images by copy of images, and swap images with images. As a result, images holds a copy of the previous contents of images and images manages the previous data of images. When images goes out of scope, its destructor is invoked and the data previously managed by images is destroyed and its memory is released. A similar process is implemented in the resizer.

Copy semantics make a copy of images into images, without modification to images, at the cost of an expensive allocation and an expensive copy of the data. We now implement move semantics. A move is not supposed to copy any data, but transfer the ownership of images's resources to images and leave images in an empty state. It follows that images is not a const argument to the move constructor:

images

After the swap, images points on images's former data, images points on images's former, uninitialized memory, since images is not yet constructed. No data was copied, no memory allocated, and the ownership of images's data was swiftly and efficiently transferred to images, in exchange for some uninitialized memory. Therefore, images is empty after execution and images effectively owns its former contents and resources. For avoidance of doubt, we set images to images so its destructor would not attempt to deallocate it.

The move assignment is implemented in a similar way, the difference being, images may own resources prior to the assignment, in which case they must be released. We move images into a temporary Vector, and swap images with images, so that images ends up with the ownership of images's previous resources, images ends up empty after it was moved, and the previous resources of images, transferred to images in the swap, are released when images exits scope.

images
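Putting the pieces together, a simplified Vector with both copy and move semantics could be sketched as follows. The member names myPtr and mySize are hypothetical. One deliberate difference from the description above: this sketch initializes the new object to an empty state before swapping in the move constructor, so the moved-from object ends up empty without a separate reset of its pointer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Vector
{
    double*     myPtr;
    std::size_t mySize;

public:
    explicit Vector(const std::size_t size = 0)
        : myPtr(size ? new double[size]() : nullptr), mySize(size) {}

    ~Vector() { delete[] myPtr; }

    // Copy constructor: allocate memory, then copy the data
    Vector(const Vector& rhs)
        : myPtr(rhs.mySize ? new double[rhs.mySize] : nullptr), mySize(rhs.mySize)
    {
        std::copy(rhs.myPtr, rhs.myPtr + rhs.mySize, myPtr);
    }

    // Swaps ownership of the data without touching the data itself
    void swap(Vector& rhs) noexcept
    {
        std::swap(myPtr, rhs.myPtr);
        std::swap(mySize, rhs.mySize);
    }

    // Copy assignment: copy and swap
    Vector& operator=(const Vector& rhs)
    {
        if (this != &rhs)
        {
            Vector temp(rhs);
            swap(temp);
        }   // temp exits scope and releases our previous data
        return *this;
    }

    // Move constructor: start empty, then take over the resources of rhs
    Vector(Vector&& rhs) noexcept : myPtr(nullptr), mySize(0) { swap(rhs); }

    // Move assignment: move rhs into a temporary, then swap
    Vector& operator=(Vector&& rhs) noexcept
    {
        if (this != &rhs)
        {
            Vector temp(std::move(rhs));
            swap(temp);
        }   // temp exits scope and releases our previous data
        return *this;
    }

    std::size_t size() const { return mySize; }
    double& operator[](const std::size_t i) { return myPtr[i]; }
    const double& operator[](const std::size_t i) const { return myPtr[i]; }
    const double* data() const { return myPtr; }
};
```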

A move transfer is an order of magnitude faster than a copy. All it does is swap pointers, something called a shallow copy, without allocation of memory or copy of data. But it leaves the moved object empty, so its former contents must not be used after the transfer.

When the images object is unnamed, for example, returned from a function, then it couldn't possibly be reused in any way. The compiler knows this, so it always invokes move semantics in place of slower copy semantics in these situations. When the images is a named object, the compiler cannot safely move it, and it would normally make a copy. When we know that we don't reuse a named images object after the transfer, we can explicitly invoke its move semantics with images.

In those situations where move semantics are invoked, either automatically or explicitly, on an object that doesn't implement them, the compiler falls back on more expensive copy semantics. Copy semantics are always implemented: when a copy constructor and assignment operator are not explicitly declared on a class, the compiler generates default ones, by copy of all data members. In contrast, the compiler doesn't generally produce a default move constructor or move assignment operator. This only happens in restrictive cases. It follows that we must always declare move semantics explicitly in classes that manage memory or other resources, or expensive copies will be executed in situations where a faster move transfer could have been performed safely instead.

Move semantics are implemented in all STL containers and standard library classes out of the box. It is our responsibility to implement them in our own container classes and other classes that manage resources. For example, we update our simple matrix code with move semantics. We will use it in Parts II and III. The code below is in the file matrix.h on our repository.

images
images

2.4 SMART POINTERS

Smart pointers wrap standard (or dumb) pointers and implement the RAII (Resource Acquisition Is Initialization) idiom. RAII is a rather verbose name for an idiom that implements the release of resources in destructors, so that when an object exits scope, for whatever reason (the function returned or an exception was thrown), resources are always automatically released. Smart pointers relieve developers from the concern of explicitly releasing allocated memory, and protect against memory leaks.

Smart pointers are otherwise manipulated just like dumb pointers; in particular they can be dereferenced with operators * and -> to read and write the managed object.

C++ developers have been using smart pointers for decades, either hand-crafted or from third-party libraries like Boost. They have been part of the standard C++ library since C++11.

A simplistic smart pointer could be coded as follows. This smart pointer cannot be copied, since the memory is owned and released on destruction by a single object. But it can be moved, in which case the images pointer loses ownership of memory when the images pointer acquires it.

images
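A smart pointer of this kind could be sketched as follows. The class name SmartPointer and its members are hypothetical, and the sketch leaves out everything a production class would need beyond the core functionality (arrays, custom deleters, comparison operators and so on).

```cpp
#include <cassert>
#include <utility>

template <class T>
class SmartPointer
{
    T* myPtr;

public:
    explicit SmartPointer(T* ptr = nullptr) : myPtr(ptr) {}

    // RAII: memory is released when the owner exits scope
    ~SmartPointer() { delete myPtr; }

    // Copies are disabled: memory is owned by a single object
    SmartPointer(const SmartPointer&) = delete;
    SmartPointer& operator=(const SmartPointer&) = delete;

    // Move constructor: take ownership, leave rhs empty
    SmartPointer(SmartPointer&& rhs) noexcept : myPtr(rhs.myPtr)
    {
        rhs.myPtr = nullptr;
    }

    // Move assignment: release our own memory first, then take ownership
    SmartPointer& operator=(SmartPointer&& rhs) noexcept
    {
        if (this != &rhs)
        {
            delete myPtr;
            myPtr = rhs.myPtr;
            rhs.myPtr = nullptr;
        }
        return *this;
    }

    // Access to the managed object, like a dumb pointer
    T& operator*() const { return *myPtr; }
    T* operator->() const { return myPtr; }
    T* get() const { return myPtr; }
};
```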

The standard library smart pointer unique_ptr located in the <memory> header is implemented along these lines.

Importantly, the standard guarantees that unique_ptrs are manipulated without overhead compared to dumb pointers. Dynamic memory management with dumb pointers is inconvenient, prone to memory leaks, and without benefit compared to unique_ptrs. RAII management is a free benefit, and it is considered best practice to always manage heap memory with smart pointers.

We can create an object on the heap with the traditional operator new and assign it to a smart pointer for RAII management:

images

or more simply:

images

Since C++14, the free function make_unique offers a terser, potentially more efficient syntax for the creation of objects in dynamic memory managed with a unique_ptr:

images

or more simply:

images

make_unique also forwards its parameters to the managed object's constructor:

images
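The two ways of creating a managed object can be compared in the following sketch, which assumes C++14 for make_unique. The Trade class and the makeTrade function are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical class with a two-argument constructor
struct Trade
{
    std::string name;
    double      notional;

    Trade(const std::string& n, const double x) : name(n), notional(x) {}
};

inline std::unique_ptr<Trade> makeTrade()
{
    // Traditional new, adopted by a smart pointer for RAII management
    std::unique_ptr<Trade> fromNew(new Trade("swap", 1.0e6));

    // make_unique forwards its arguments to the constructor of Trade
    auto fromFactory = std::make_unique<Trade>("cap", 2.0e6);

    // Smart pointers are dereferenced like dumb pointers
    fromFactory->notional += fromNew->notional;

    // Ownership is transferred to the caller;
    // the object held by fromNew is released automatically on return
    return fromFactory;
}
```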

The second breed of standard smart pointers, shared_ptrs, also located in the <memory> header, offers further benefits, but not for free. Shared pointers are reference counted. They can be copied, in which case all copies share the ownership of the managed memory, and it is only when the last owner exits scope that the managed memory is released and its contents destroyed.

Shared pointers are very powerful because we never worry about memory being released too early or too late. A resource remains alive as long as there are pointers referencing it, and it is released automatically when this is no longer the case. But they are slower than dumb and unique pointers. Reference counting is not free, especially in a concurrent environment. For this reason, we must use shared_ptrs parsimoniously, always pass them by reference (to avoid unnecessary reference counting), and do nothing other than dereference them7 at low level and especially in repeated code. Obviously, we only use shared_ptrs when we effectively need reference counting, and unique_ptrs otherwise.

Like unique_ptrs, shared_ptrs can adopt dumb pointers for RAII management, although this is implemented in a more convenient and more efficient manner in the factory function make_shared, whose syntax is identical to make_unique.

The following example demonstrates how the two types of standard library smart pointers work.

images
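Reference counting can be observed directly with the use_count method, as in the sketch below, which focuses on shared pointers only. The names owners, countDemo, Flagger and releasedWithLastOwner are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Passed by const reference: the count is not modified by the call
inline long owners(const std::shared_ptr<int>& p) { return p.use_count(); }

// Records the number of owners at three points in time
inline std::vector<long> countDemo()
{
    std::vector<long> counts;

    auto p = std::make_shared<int>(42);
    counts.push_back(owners(p));         // a single owner

    {
        std::shared_ptr<int> q = p;      // a copy shares ownership
        counts.push_back(owners(p));     // two owners
    }                                    // q exits scope

    counts.push_back(owners(p));         // back to a single owner
    return counts;
}

// Hypothetical class that raises a flag on destruction
struct Flagger
{
    bool* flag;
    explicit Flagger(bool* f) : flag(f) {}
    ~Flagger() { *flag = true; }
};

// The managed object is destroyed when the last owner lets go, not before
inline bool releasedWithLastOwner()
{
    bool destroyed = false;

    auto p = std::make_shared<Flagger>(&destroyed);
    auto q = p;

    p.reset();                           // q still owns the object
    const bool destroyedEarly = destroyed;

    q.reset();                           // last owner gone: object destroyed
    return !destroyedEarly && destroyed;
}
```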

C++11 comes with many, many more new features, the most useful probably being hash tables and variadic templates. They are covered in many textbooks and online resources. Readers wishing to investigate these matters in further detail are referred to Scott Meyers' [24]. An outstanding innovation is undoubtedly the Standard Threading Library investigated in the next chapter.

NOTES
