Chapter 2

Functions

Lesson Objectives

By the end of this chapter, you will be able to:

  • Explain what functions are and how to declare them
  • Utilize local and global variables
  • Pass arguments to functions and return values from functions
  • Create overloaded functions and call them appropriately
  • Apply the concept of namespaces in organizing functions

In this chapter, we are going to look at functions in C++, how to use them, and why we would want to use them.

Introduction

Functions are a core tool in a programmer's toolkit for writing maintainable code. The concept of a function is common in almost every programming language. Functions have different names in various languages: procedures, routines, and many more, but they all have two main characteristics in common:

  • They represent a sequence of instructions grouped together.
  • The sequence of instructions is identified by a name, which can be used to refer to the function.

The programmer can call, or invoke, a function when the functionality provided by the function is needed.

When the function is called, the sequence of instructions is executed. The caller can also provide some data to the function to be used in the operations within the function. The following are the main advantages of using functions:

  • Reduces repetition: It often occurs that a program needs to repeat the same operations in different parts of the codebase. Functions allow us to write a single implementation that is carefully tested, documented, and of high quality. This code can be called from different places in the codebase, which enables code reusability. This, in turn, increases the productivity of the programmer and the quality of the software.
  • Boosts code readability and modification: Often, we need several operations to implement a functionality in our program. In these cases, grouping the operations together in a function and giving a descriptive name to the function helps to express what we want to do instead of how we do it.

    Using functions greatly increases the readability of our code because it's now composed of descriptive names of what we are trying to achieve, without the noise of how the result is achieved.

    The code is also easier to test and debug, as you may only need to modify a single function without having to revisit the structure of the program.

  • Higher level of abstraction: We can give a meaningful name to the function to represent what it should achieve. This way, the calling code can be concerned with what the function is supposed to do, and it does not need to know how the operations are performed.

    Note

    Abstraction is the process of extracting all relevant properties from a class and exposing them, while hiding details that are not important for a specific usage.

    Let's take a tree as an example. If we were to use it in the context of an orchard, we could abstract the tree to be a "machine" that takes a determined amount of space and, given sunlight, water, and fertilizers, produces a certain number of fruits per year. The property we are interested in is the tree's fruit production ability, so we want to expose it and hide all the other details that are not relevant to our case.

    In computer science, we want to apply the same concept: capture the key fundamental properties of a class without showing the algorithm that implements it.

A prime example of this is the sort function, which is present in many languages. We know what the function expects and what it is going to do, but rarely are we aware of the algorithm that is used to do it, and it might also change between different implementations of the language.
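As a minimal sketch of this idea, the following hypothetical helper, sorted_copy(), relies on std::sort from the standard library without knowing which sorting algorithm the implementation uses. The helper name is an assumption made for illustration only:

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: takes the values by value (a copy) and returns them sorted.
// The caller only needs to know *what* happens, not *how* std::sort does it.
std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Calling sorted_copy({3, 1, 2}) gives back the values 1, 2, 3, regardless of the algorithm chosen by the standard library implementation.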

In the following sections, we will demystify how function declaration and definition work.

Function Declaration and Definition

A function declaration has the role of telling the compiler the name, the parameters, and the return type of a function. After a function has been declared, it can be used in the rest of the program.

The definition of the function specifies what operations a function performs.

A declaration is composed of the type of the returned value, followed by the name of the function and by a list of parameters inside a pair of parentheses. These last two components form the signature of the function. The syntax of a function declaration is as follows:

// Declaration: function without body

return_type function_name( parameter list );

If a function returns nothing, then the type void can be used, and if a function is not expecting any parameters the list can be empty.

Let's look at an example of a function declaration:

void doNothingForNow();

Here, we declared a function named doNothingForNow(), which takes no arguments and returns nothing. After this declaration, we can call the doNothingForNow() function in our program.

To call a function that does not have any parameters, write the name of the function followed by a pair of parentheses.

When a function is called, the execution flow goes from the body of the function currently being executed to the body of the called function.

In the following example, the execution flow starts at the beginning of the body of the main() function and executes its operations in order. The first operation it encounters is the call to doNothingForNow(). At that point, the execution flow goes into the body of doNothingForNow().

When all the operations inside a function have been executed, or the function instructs the execution to go back to the caller, the execution flow resumes from the operation after the function call.

In our example, the operation after the function call prints Done on the console:

#include <iostream>

void doNothingForNow();

int main() {

doNothingForNow();

std::cout << "Done";

}

If we were to compile this program, the compilation would succeed, but linking would fail.

In this program, we instructed the compiler that a function called doNothingForNow() exists and then we invoked it. The compiler generates an output that calls doNothingForNow().

The linker then tries to create an executable from the compiler output, but since we did not define doNothingForNow(), it cannot find the function's definition, so it fails.

To successfully build the program, we need to define doNothingForNow(). In the next section, we will explore how to define a function using the same example.

Defining a Function

To define a function, we need to write the same information that we used for the declaration: the return type, the name of the function, and the parameter list, followed by the function body. The function body delimits a new scope and is composed of a sequence of statements delimited by curly braces.

When the function is executed, the statements are executed in order:

// Definition: function with body

return_type function_name( parameter_list ) {

statement1;

statement2;

...

last statement;

}

Let's fix the program by adding the body for doNothingForNow():

void doNothingForNow() {

// Do nothing

}

Here, we defined doNothingForNow() with an empty body. This means that as soon as the function execution starts, the control flow returns to the function that called it.

Note

When we define a function, we need to make sure that the signature (the return type, the name, and the parameters) is the same as in the declaration.

The definition counts as a declaration as well. We can skip the declaration if we define the function before calling it.

Let's revisit our program now since we have added the definition for our function:

#include <iostream>

void doNothingForNow() {

// do nothing

}

int main() {

doNothingForNow();

std::cout << "Done";

}

If we compile and run the program, it will succeed and show Done on the output console.

In a program, there can be multiple declarations of the same function, as long as the declarations are the same. On the other hand, only a single definition of the function can exist, as mandated by the One Definition Rule (ODR).
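The following minimal sketch illustrates this rule with two hypothetical functions, answer() and twice_answer(): the declaration is repeated, while the definition appears exactly once:

```cpp
int answer();   // Declaration
int answer();   // Repeated, identical declaration: allowed

// This function can call answer() because it has already been declared.
int twice_answer() {
    return 2 * answer();
}

// The single definition of answer(). A second definition would violate the ODR.
int answer() {
    return 21;
}
```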

Note

Several definitions of the same function may exist if they are compiled in different files (this is the case, for example, for inline functions defined in a header), but they need to be identical. If they are not, then the program might do unpredictable things.

The compiler is not going to warn you!

The solution is to have the declaration in a header file, and the definition in an implementation file.

A header file is included in many different implementation files, and the code in these files can call the function.

An implementation file is compiled only once, so we can guarantee that the definition is seen only once by the compiler.

Then, the linker puts all of the outputs of the compiler together, finds a definition of the function, and produces a valid executable.

Exercise 3: Calling a Function from main()

In our application, we want to log errors. To do so, we have to define a function called log(), which prints Error! to the standard output when called.

Let's create a function that can be called from several files, and put its declaration in a separate header file that can be included:

  1. Create a file named log.h and declare a function called log() with no parameters and that returns nothing:

    void log();

  2. Now, let's create a new file, log.cpp, where we define the log() function to print to the standard output:

    #include <iostream>

    // This is where std::cout and std::endl are defined

    void log() {

    std::cout << "Error!" << std::endl;

    }

  3. Change the main.cpp file to include log.h and call log() in the main() function:

    #include "log.h"

    int main() {

    log();

    }

  4. Compile the two files and run the executable. You will see that the message Error! is printed when we execute it.

Local and Global Variables

The body of a function is a code block that can contain valid statements, one of which is a variable definition. As we learned in Lesson 1, Getting Started, when such a statement appears, the function declares a local variable.

This is in contrast to global variables, which are the variables that are declared outside of functions (and classes, which we will look at in Lesson 3, Classes).

The difference between a local and a global variable is in the scope in which it is declared, and thus, in who can access it.

Note

Local variables are in the function scope and can only be accessed by the function. On the contrary, global variables can be accessed by any function that can see them.

It is desirable to use local variables over global variables because they enable encapsulation: only the code inside the function body can access and modify the variable, making the variable invisible to the rest of the program. This makes it easy to understand how a variable is used by a function since its usage is restricted to the function body and we are guaranteed that no other code is accessing it.

Encapsulation is usually used for three separate reasons, which we will explore in more detail in Lesson 3, Classes:

  • To restrict the access to data used by a functionality
  • To bundle together the data and the functionality that operates on it
  • To create abstractions, for which encapsulation is a key concept

On the other hand, global variables can be accessed by any function.

This makes it hard to be sure of the variable's value when interacting with it, unless we know not only what our function does, but also what all the other code in the program that interacts with the global variable does.

Additionally, code that we add later to the program might start modifying the global variable in a way that we did not expect in our function, breaking the functionality of our function without ever modifying the function itself. This makes it extremely difficult to modify, maintain, and evolve programs.
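A minimal sketch of this problem follows. All the names (discount_percent, discounted_price(), and start_sale()) are hypothetical and only serve as an illustration:

```cpp
int discount_percent = 10;  // Mutable global variable: any function can change it

// The result depends on the global variable, not only on the argument.
int discounted_price(int price) {
    return price - price * discount_percent / 100;
}

// Code added later changes the global variable...
void start_sale() {
    discount_percent = 50;
}
```

After start_sale() runs, discounted_price(200) no longer evaluates to 180 but to 100, even though discounted_price() itself was never modified.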

The solution to this problem is to use the const qualifier so that no code can change the variable, and we can treat it as a value that never changes.

Note

Always use the const qualifier with global variables whenever possible.

Try to avoid using mutable global variables.

It is a good practice to use global const variables instead of using values directly in the code. They allow you to give a name and a meaning to the value, without any of the risks that come with mutable global variables.
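As a small sketch of this practice, the hypothetical constant SECONDS_PER_MINUTE below gives a name to the value 60 instead of repeating the number in the code:

```cpp
const int SECONDS_PER_MINUTE = 60;  // Named, immutable global value

// Hypothetical conversion function that uses the named constant.
int to_seconds(int minutes) {
    return minutes * SECONDS_PER_MINUTE;
}
```

Because the variable is const, no other code in the program can change it after initialization.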

Working with Variable Objects

It is important to understand the relationship between variables, objects, and the lifetime of objects in C++ to write programs correctly.

Note

An object is a piece of data in the program's memory.

A variable is a name we give to an object.

There is a distinction in C++ between the scope of a variable and the lifetime of the object it refers to. The scope of a variable is the part of the program where the variable can be used.

The lifetime of an object, on the contrary, is the time during execution wherein the object is valid to access.

Let's examine the following program to understand the lifetime of an object:

#include <iostream>

/* 1 */ const int globalVar = 10;

int* foo(const int* other) {

/* 5 */ int fooLocal = 0;

std::cout << "foo's local: " << fooLocal << std::endl;

std::cout << "main's local: " << *other << std::endl;

/* 6 */ return &fooLocal;

}

int main()

{

/* 2 */ int mainLocal = 15;

/* 3 */ int* fooPointer = foo(&mainLocal);

std::cout << "main's local: " << mainLocal << std::endl;

std::cout << "We should not access the content of fooPointer! It's not valid." << std::endl;

/* 4 */ return 0;

}

Figure 2.1: Lifetime of an object

The lifetime of a variable starts when it is initialized and ends when the containing block ends. Even if we have a pointer or reference to a variable, we should access it only if it's still valid. fooPointer is pointing to a variable which is no longer valid, so it should not be used!

When we declare a local variable in the scope of a function, the compiler automatically creates an object when the function execution reaches the variable declaration; the variable refers to that object.

When we declare a global variable instead, we are declaring it in a scope that does not have a clear duration – it is valid for the complete duration of the program. Because of this, the compiler creates the object when the program starts before any function is executed – even the main() function.

The compiler also takes care of terminating the object's lifetime when the execution exits from the scope in which the variable has been declared, or when the program terminates in the case of a global variable. The termination of the lifetime of an object is usually called destruction.

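Local variables declared in a block are called automatic variables, because the compiler takes care of initializing and terminating the lifetime of the objects associated with them. The compiler manages the lifetime of global variables for us in a similar way, although their objects live for the whole duration of the program.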

Let's look at an example of a local variable:

void foo() {

int a;

}

In this case, the variable a is a local variable of type int. The compiler automatically initializes the object it refers to with what is called its default initialization when the execution reaches that statement, and the object will be destroyed at the end of the function, again, automatically.

Note

The default initialization of basic types, such as integers, does nothing. This means that the variable a will have an unspecified value, so it should be assigned a value before it is read.
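One way to avoid depending on such a value is to initialize the variable explicitly. The following is a minimal sketch; the function names are hypothetical:

```cpp
// Explicit initialization with a chosen value.
int explicit_zero() {
    int a = 0;
    return a;
}

// Empty braces request value initialization, which sets an int to zero.
int value_initialized() {
    int b{};
    return b;
}
```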

If multiple local variables are defined, the initialization of the objects happens in the order of declaration:

void foo() {

int a;

int b;

}

Variable a is initialized before b. Since variable b was initialized after a, its object is destroyed before the one a refers to.
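The order can be observed with a small sketch. It assumes a hypothetical Tracer type that records when it is initialized and destroyed (classes are covered in Lesson 3, Classes), and a mutable global vector named events that is used only for demonstration purposes:

```cpp
#include <string>
#include <vector>

std::vector<std::string> events;  // Demonstration-only log of what happened

// Hypothetical type that records its own initialization and destruction.
struct Tracer {
    std::string name;
    explicit Tracer(const std::string& n) : name(n) {
        events.push_back("init " + name);
    }
    ~Tracer() {
        events.push_back("destroy " + name);
    }
};

void foo() {
    Tracer a("a");  // Initialized first
    Tracer b("b");  // Initialized second, destroyed first
}
```

After a call to foo(), events contains "init a", "init b", "destroy b", and "destroy a", in that order.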

If the execution never reaches the declaration, the variable is not initialized. If the variable is not initialized, it is also not destroyed:

void foo() {

if (false) {

int a;

}

int b;

}

Here, the variable a is never default initialized, and thus never destroyed. This is similar for global variables:

#include <iostream>

const int a = 1;

int main() {

std::cout << "a=" << a << std::endl;

}

Variable a is initialized before the main() function is called and is destroyed after the main() function returns.

Exercise 4: Using Local and Global Variables in a Fibonacci Sequence

We want to write a function that returns the 10th number in a Fibonacci sequence.

Note

The nth Fibonacci number is defined as the sum of the n-1th and the n-2th, with the first number in the sequence being 0 and the second being 1.

Example:

10th Fibonacci number = 8th Fibonacci number + 9th Fibonacci number

We want to use the best practice of giving a name and a meaning to values, so instead of using 10 in the code, we are going to define a const global variable, named POSITION.

We will also use two local variables in the function to remember the n-1th and the n-2th number:

  1. Write the program and include the following constant global variables after the header file:

    #include <iostream>

    const int POSITION = 10;

    const int ALREADY_COMPUTED = 3;

  2. Now, create a function named print_tenth_fibonacci() with the return type as void:

    void print_tenth_fibonacci()

  3. Within the function, include three local variables, named n_1, n_2, and current of type int, as shown here:

    int n_1 = 1;

    int n_2 = 0;

    int current = n_1 + n_2;

  4. Let's create a for loop to generate the remaining Fibonacci numbers until we reach the 10th, using the global variables we defined previously as starting and ending indices:

    for(int i = ALREADY_COMPUTED; i < POSITION; ++i){

    n_2 = n_1;

    n_1 = current;

    current = n_1 + n_2;

    }

  5. Now, after the previous for loop, add the following print statement to print the last value stored in the current variable:

    std::cout << current << std::endl;

  6. In the main() function, call print_tenth_fibonacci() and print the value of the 10th element of the Fibonacci sequence:

    int main() {

    std::cout << "Computing the 10th Fibonacci number" << std::endl;

    print_tenth_fibonacci();

    }

Let's understand the variable data flow of this exercise. First, the n_1 variable is initialized, then n_2 is initialized, and right after that, current is initialized. When the function ends, the variables are destroyed in reverse order: current is destroyed, then n_2, and finally, n_1.

i is also an automatic variable in the scope that's created by the for loop, so it is destroyed at the end of the for loop scope.
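As a variation on the exercise, the same loop can be sketched as a function that returns the number instead of printing it. The function name fibonacci_at() and its handling of positions 1 and 2 are assumptions made for this sketch, not part of the exercise:

```cpp
const int ALREADY_COMPUTED = 3;  // Same meaning as in the exercise

// Hypothetical variation: returns the Fibonacci number at the given 1-based position,
// with the first number being 0 and the second being 1.
int fibonacci_at(int position) {
    if (position <= 1) {
        return 0;
    }
    if (position == 2) {
        return 1;
    }
    int n_1 = 1;
    int n_2 = 0;
    int current = n_1 + n_2;  // Third number in the sequence
    for (int i = ALREADY_COMPUTED; i < position; ++i) {
        n_2 = n_1;
        n_1 = current;
        current = n_1 + n_2;
    }
    return current;
}
```

With this definition, fibonacci_at(10) evaluates to 34, which is the value the exercise prints.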

For each combination of cond1 and cond2, identify when initialization and destruction occurs in the following program:

void foo() {

if(cond1) {

int a;

}

if (cond2) {

int b;

}

}

Passing Arguments and Returning Values

In the Introduction section, we mentioned that the caller can provide some data to the function. This is done by passing arguments to the parameters of the function.

The parameters that a function accepts are part of its signature, so we need to specify them in every declaration.

The list of parameters a function can accept is contained in the parentheses after the function name. The parameters are comma-separated, and each is composed of a type and, optionally, an identifier.

For example, a function taking two integer numbers would be declared as follows:

void two_ints(int, int);

If we wanted to give a name to these parameters, a and b respectively, we would write the following:

void two_ints(int a, int b);

Inside its body, the function can access the identifiers defined in the function signature as if they were declared variables. The values of the function parameters are decided when the function is called.

To call a function that takes a parameter, you need to write the name of the function, followed by a list of expressions inside a pair of parentheses:

two_ints(1,2);

Here, we called the two_ints function with two arguments: 1 and 2.

The arguments used to call the function initialize the parameters that the function is expecting. Inside the two_ints function, variable a will be equal to 1, and b will be equal to 2.

Each time the function is called, a new set of parameters is initialized from the arguments that were used to call the function.

Note

Parameter: A variable declared in the function's parameter list, through which the function's code receives the data provided by the caller.

Argument: The value the caller wants to bind to the parameters of the function.

In the previous example, we used two literal values, but we can also use arbitrary expressions as arguments:

two_ints(1+2, 2+3);

Note

The order in which the argument expressions are evaluated is not specified!

This means that when calling two_ints(1+2, 2+3);, the compiler might first execute 1+2 and then 2+3, or 2+3 and then 1+2. This is usually not a problem if the expression does not change any state in the program, but it can create bugs that are hard to detect when it does. For example, given int i = 0;, if we call two_ints(i++, i++), we don't know whether the function is going to be called with two_ints(0, 1) or two_ints(1, 0).

In general, it's better to write expressions that change the state of the program in their own statements, and call functions with expressions that do not modify the program's state.

The function parameters can be of any type. As we already saw, a type in C++ could be a value, a reference, or a pointer. This gives the programmer a few options on how to accept parameters from the callers, based on the behavior they want.
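A minimal sketch of this advice follows. The functions difference() and call_with_fixed_order() are hypothetical; the point is that each increment gets its own statement, so the order no longer depends on the compiler:

```cpp
// Hypothetical function whose result depends on the order of its arguments.
int difference(int a, int b) {
    return a - b;
}

int call_with_fixed_order() {
    int i = 0;
    const int first = i++;   // first is 0, i becomes 1
    const int second = i++;  // second is 1, i becomes 2
    // The arguments no longer change any state, so this is always difference(0, 1).
    return difference(first, second);
}
```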

In the following subsections, we will explore the working mechanism of Pass by value and Pass by reference in more detail.

Pass by Value

When the parameter type of a function is a value type, we say that the function is taking an argument by value or the argument is passed by value.

When a parameter is a value type, a new local object is created each time the function is called.

As we saw with automatic variables, the lifetime of the object lasts until the execution reaches the end of the function's scope.

When the parameter is initialized, a new copy is made from the argument provided when invoking the function.

Note

If you want to modify a parameter, and the calling code does not need to see the modification, use pass by value.

Exercise 5: Calculating Age using Pass by Value Arguments

James wants to write a C++ program to calculate what the age of a person will be after five years by providing their current age as an input.

To implement such a program, he is going to write a function that takes a person's age by value and computes how old they will be in 5 years, and then prints it on the screen:

  1. Create a function named byvalue_age_in_5_years, as illustrated here. Make sure that the value in the calling code does not change:

    void byvalue_age_in_5_years(int age) {

    age += 5;

    std::cout << "Age in 5 years: " << age << std::endl;

    // Prints 100

    }

  2. Now, in main(), call the function we created in the previous step by passing the variable age as a value:

    int main() {

    int age = 95;

    byvalue_age_in_5_years(age);

    std::cout << "Current age: " << age;

    // Prints 95

    }

    Note

    Pass by value should be the default way of accepting arguments: always use it unless you have a specific reason not to.

    The reason for this is that it makes the separation between the calling code and the called function stricter: the calling code cannot see the changes that the called function makes on the parameters.

Passing parameters by value creates a clear boundary between the calling function and the called function, because the parameters are copied:

  1. As the calling function, we know that the variables we passed to the functions will not be modified by it.
  2. As the called function, we know that even if we modify the provided parameters, there will be no impact on the calling function.

This makes it easy to understand the code, because the changes we make to the parameters have no impact outside of the function.

Pass by value can be the faster option when taking an argument, especially if the memory size of the argument is small (for example, integers, characters, float, or small structures).

We need to remember though that passing by value performs a copy of the argument. Sometimes, this can be an expensive operation both in terms of memory and processing time, like when copying a container with many elements.

There are some cases where this limitation can be overcome with the move semantics that were added in C++11. We will see more of this in Lesson 3, Classes.

Let's look at an alternative to pass by value that has a different set of properties.

Pass by Reference

When the parameter type of the function is a reference type, we say that the function is taking an argument by reference or the argument is passed by reference.

We saw earlier that a reference type does not create a new object – it is simply a new variable, or name that refers to an object that already exists.

When the function that accepts the argument by reference is called, the reference is bound to the object used in the argument: the parameter will refer to the given object. This means that the function has access to the object the calling code provided and can modify it.

This is convenient if the goal of the function is to modify an object, but it can be more difficult to understand the interaction between the caller and the called function in such situations.

Note

Unless the function must modify the variable, always use const references, as we will see later.

Exercise 6: Calculating Incrementation of Age using Pass by Reference

James would like to write a C++ program which, given anyone's age as input, prints Congratulations! if their age will be 18 or older in the next 5 years.

Let's write a function that accepts its parameters by reference:

  1. Create a function named byreference_age_in_5_years() of type void, as illustrated here:

    void byreference_age_in_5_years(int& age) {

    age += 5;

    }

  2. Now, in main(), call the function we created in the previous step by passing the variable age as a reference:

    int main() {

    int age = 13;

    byreference_age_in_5_years(age);

    if (age >= 18) {

    std::cout << "Congratulations! " << std::endl;

    }

    }

Contrary to passing by value, the cost of passing by reference does not change with the memory size of the object passed.

This makes pass by reference the preferred method when copying the object, as pass by value requires, would be expensive, especially if we cannot use the move semantics that were added in C++11.

Note

If you want to use pass by reference, but you are not modifying the provided object, make sure to use const.
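As a sketch of this advice, the hypothetical function total() below accepts a container by const reference: no copy of the elements is made, and the function cannot modify the caller's object:

```cpp
#include <vector>

// Hypothetical function: reads the elements through a const reference.
int total(const std::vector<int>& values) {
    int sum = 0;
    for (int value : values) {
        sum += value;
    }
    return sum;
}
```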

With C++, we can use std::cin to read input from the console executing the program.

When writing std::cin >> variable;, the program will block waiting for some user input, and then it will populate variable with the value read from the input, as long as it is a valid value and the program knows how to read it. By default, we can read values of all the built-in data types and of some types defined in the standard library, such as string.
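The following sketch shows one way to check whether the read succeeded. The function names read_age() and parses_as_age() are hypothetical. read_age() accepts any input stream by reference, so it could be called with std::cin, while the helper uses a std::istringstream to try it out without a console:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical function: tries to read an int from the stream into age.
// Returns true if a valid value was read, false otherwise.
bool read_age(std::istream& input, int& age) {
    return static_cast<bool>(input >> age);
}

// Hypothetical helper: checks whether a piece of text can be read as an int.
bool parses_as_age(const std::string& text) {
    std::istringstream input(text);
    int age = 0;
    return read_age(input, age);
}
```

With std::cin, the call would look like read_age(std::cin, age), and the returned value tells us whether age can be trusted.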

Activity 3: Checking Voting Eligibility

James is creating a program to print a message on the console screen: "Congratulations! You are eligible to vote in your country" or "No worries, just <value> more years to go." after the user provides their current age as input.

  1. Create a function named byreference_age_in_5_years(int& age) and add the following code:

    #include <iostream>

    void byreference_age_in_5_years(int& age) {

    if (age >= 18) {

    std::cout << "Congratulations! You are eligible to vote for your nation." << std::endl;

    return;

  2. In the else block, add the code to calculate the years remaining until they can vote:

    } else{

    int reqAge = 18;

    }

    }

  3. In main(), add the input stream, as illustrated, to accept the input from the user. Pass the value by reference to the function we created previously:

    int main() {

    int age;

    std::cout << "Please enter your age:";

    std::cin >> age;

    The solution for this activity can be found on page 284.

Working with const References or r-value References

A temporary object cannot be passed as an argument for a non-const reference parameter. To accept temporary arguments, we need to use const references or r-value references. R-value references are identified by two ampersands, &&, and can only refer to temporary values. We will look at them in more detail in Lesson 4, Generic Programming and Templates.
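A minimal sketch of the two options follows. Both function names are hypothetical:

```cpp
#include <cstddef>
#include <string>

// Accepts both named variables and temporaries.
std::size_t length_by_const_ref(const std::string& text) {
    return text.size();
}

// Accepts only temporaries: passing a named variable directly would not compile.
std::size_t length_by_rvalue_ref(std::string&& text) {
    return text.size();
}
```

For example, length_by_const_ref(std::string("abc")) and length_by_rvalue_ref(std::string("hello")) both receive a temporary object, while a parameter of type std::string& would reject those calls.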

We need to remember that a pointer is a value that represents the location of an object.

Being a value, it means that when we are accepting a parameter as a pointer, the pointer itself is passed as a value.

This means that the modification of the pointer inside the function is not going to be visible to the caller.

But if we are modifying the object the pointer points to, then the original object is going to be modified:

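#include <iostream>

void modify_pointer(int* pointer) {

*pointer = 1; // Modifies the object pointed to: visible to the caller

pointer = nullptr; // Modifies only the local copy of the pointer

}

int main() {

int a = 0;

int* ptr = &a;

modify_pointer(ptr);

std::cout << "Value: " << *ptr << std::endl; // Prints Value: 1

std::cout << "Did the pointer change? " << std::boolalpha << (ptr != &a); // Prints false

}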

Most of the time, we can think of passing a pointer as passing a reference, with the caveat that you need to be aware that the pointer might be null.

Accepting a parameter as a pointer is mainly used for three reasons:

  • Traversing the elements of an array, by providing the start pointer and either the end pointer or the size of the array.
  • Optionally modifying a value. This means that the function modifies a value if it is provided.
  • Returning more than a single value. This is often done by setting the value of the object pointed to by a pointer passed as an argument, and then returning an error code to signal whether the operation was performed.
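The three use cases above can be sketched as follows. All the function names are hypothetical, and the error-signaling convention (returning a bool) is just one possible choice:

```cpp
// Traversing an array: begin points to the first element, end to one past the last.
int sum_range(const int* begin, const int* end) {
    int total = 0;
    for (const int* it = begin; it != end; ++it) {
        total += *it;
    }
    return total;
}

// Optionally modifying a value: nothing happens if no object is provided.
void reset_if_provided(int* value) {
    if (value != nullptr) {
        *value = 0;
    }
}

// Returning more than one value: the quotient goes through the pointer,
// and the returned bool signals whether the operation was performed.
bool divide(int numerator, int denominator, int* result) {
    if (denominator == 0 || result == nullptr) {
        return false;
    }
    *result = numerator / denominator;
    return true;
}
```

Note that each function checks for a null pointer before dereferencing it, except sum_range(), which assumes that the caller provides a valid range.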

We will see in Lesson 4, Generic Programming and Templates, how features introduced in C++11 and C++17 allow us to avoid using pointers for some of these use cases, eliminating the possibility of some common classes of errors, such as dereferencing invalid pointers or accessing unallocated memory.

The options of passing by value or passing by reference are applicable to every single parameter the function expects, independently.

This means that a function can take some arguments by value and some by reference.
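For instance, in the following sketch, the hypothetical function deposit() takes balance by reference, because it is meant to modify it, and amount by value, because changes to it should stay inside the function:

```cpp
// Hypothetical function mixing the two ways of accepting arguments.
void deposit(int& balance, int amount) {
    if (amount < 0) {
        amount = 0;  // Changes only the local copy
    }
    balance += amount;  // Changes the caller's object
}
```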

Returning Values from Functions

Up until now, we have seen how to provide values to a function. In this section, we will see how a function can provide a value back to the caller.

We said earlier that the first part of a function declaration is the type returned by the function: this is often referred to as the function's return type.

All the previous examples used void to signal that they were returning nothing. Now, it is time to look at an example of a function returning a value:

int sum(int, int);

The previous function accepts two integers by value as parameters and returns an integer.

The invocation of the function in the caller code is an expression evaluating to an integer. This means that we can use it anywhere that an expression is allowed:

int a = sum(1, 2);

A function can return a value by using the return keyword, followed by the value it wants to return.

The function can use the return keyword several times inside its body, and each time the execution reaches the return keyword, the program will stop executing the function and go back to the caller, with the value returned by the function, if any. Let's look at the following code:

void rideRollercoasterWithChecks(int heightInCm) {

if (heightInCm < 100) {

std::cout << "Too short";

return;

}

if (heightInCm > 210) {

std::cout << "Too tall";

return;

}

rideRollercoaster();

// implicit return at the end of the function

}

A function also returns to the caller if it reaches the end of its body.

This is what we did in the earlier examples since we did not use the return keyword.

Not explicitly returning is fine if a function has a void return type. However, if the function is expected to return a value, reaching the end of its body without a return statement is undefined behavior: the caller may receive a meaningless value and the program is not correct.

Be sure to enable your compiler's warning for this case (for example, -Wreturn-type in GCC and Clang), as it will save you a lot of debugging time.

Note

It is surprising, but every major compiler will compile a function that declares a return type other than void but does not return a value.

This is easy to spot in simple functions, but it is much harder in complex ones with lots of branches.

Every compiler supports options to warn you if a function returns without providing a value.

Let's look at an example of a function returning an integer:

int sum(int a, int b) {

return a + b;

}

As we said earlier, a function can use the return statement several times inside its body, as shown in the following example:

int max(int a, int b) {

if(a > b) {

return a;

} else {

return b;

}

}

Regardless of the values of the arguments, every path through the function returns a value.

Note

It is a good practice to return as early as possible in an algorithm.

The reason for this is that as you follow the logic of the code, especially when there are many conditionals, a return statement tells you when that execution path is finished, allowing you to ignore what happens in the remaining part of the function.

If you only return at the end of the function, you always have to look at the full code of the function.
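As a small illustration of the early-return style, consider the following sketch. The function ticketPrice is hypothetical and not part of this chapter's examples; the age thresholds and prices are made up. Each condition is handled and finished as soon as it is recognized, so the reader never has to carry it further down the function:

```cpp
// Hypothetical example: the function name, thresholds, and prices are
// assumptions made up for illustration.
int ticketPrice(int age) {
    if (age < 0) {
        return -1;  // invalid input: nothing else to consider
    }
    if (age < 12) {
        return 5;   // children's price
    }
    if (age >= 65) {
        return 7;   // seniors' price
    }
    return 10;      // everyone else
}
```

Once a reader has passed the first check, they know that age is not negative in the rest of the function, and so on for each following check.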

Since a function can be declared to return any type, we have to decide whether to return a value or a reference.

Returning by Value

A function whose return type is a value type is said to return by value.

When a function that returns by value reaches a return statement, the program creates a new object, which is initialized from the value of the expression in the return statement.

In the previous function, sum, when the code reaches the stage of returning a + b, a new integer is created, with the value equal to the sum of a and b, and is returned.

On the side of the caller, int a = sum(1,2);, a new temporary automatic object is created and is initialized from the value returned by the function (the integer that was created from the sum of a and b).

This object is called temporary because its lifetime is valid only while the full-expression in which it is created is executed. We will see in the Returning by Reference section, what this means and why this is important.

The calling code can then use the returned temporary value in another expression or assign it to a value.

At the end of the full expression, the lifetime of the temporary object is over, so it is destroyed.

In this explanation, we mentioned that objects are initialized several times while returning a value. This is not a performance concern as C++ allows compilers to optimize all these initializations, and often initialization happens only once.

Note

It is preferable to return by value as it's often easier to understand, easier to use, and as fast as returning by reference.

How can returning by value be so fast? C++11 introduced move semantics, which allow moving instead of copying the returned object when its type supports the move operation. We'll see how in Lesson 3, Classes. Even before C++11, all mainstream compilers implemented return value optimization (RVO) and named return value optimization (NRVO), where the return value of a function is constructed directly in the variable into which it would otherwise have been copied. In C++17, this optimization, also called copy elision, became mandatory when the function returns a temporary (unnamed) object; NRVO remains an optional optimization.
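To make this concrete, here is a minimal sketch of a function returning a larger object by value. The function makeSquares is a hypothetical helper introduced only for this illustration:

```cpp
#include <vector>

// Hypothetical helper: builds a vector and returns it by value.
std::vector<int> makeSquares(int count) {
    std::vector<int> squares;
    for (int i = 0; i < count; ++i) {
        squares.push_back(i * i);
    }
    // Returned by value: the compiler may elide the copy (NRVO) and
    // otherwise moves the vector; the elements are not copied one by one.
    return squares;
}
```

A caller can simply write std::vector<int> values = makeSquares(4); and use values like any other local variable, without worrying about the lifetime of anything inside the function.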

Returning by Reference

A function whose return type is a reference is said to return by reference.

When a function returning a reference reaches a return statement, a new reference is initialized from the expression that's used in the return statement.

In the caller, the function call expression is substituted by the returned reference.

However, in this situation, we need to also be aware of the lifetime of the object the reference is referring to. Let's look at an example:

const int& max(const int& a, const int& b) {

if (a > b) {

return a;

} else {

return b;

}

}

First, we need to note that this function already has a caveat. The previous max function returned by value, so it made no difference whether we returned a or b when they were equal.

In this function, instead, when a == b we return a reference to b, so the code calling this function needs to be aware of the distinction. If a function returns a non-const reference, the caller might modify the object referred to by the returned reference, and whether a or b is returned then makes a difference.

We are already seeing how references can make our code harder to understand.

Let's look at the function we used:

int main() {

const int& a = max(1,2);

std::cout << a;

}

This program has an error! The reason is that 1 and 2 are temporary values, and as we explained before, a temporary value is alive until the end of the full expression containing it.

To better understand what is meant by "the end of the full expression containing it", let's look at the code we have in the preceding code block: const int& a = max(1,2);. There are four expressions in this piece of code:

  • 1 is an integer literal, which still counts as an expression
  • 2 is an integer literal, similar to 1
  • max(expression1, expression2) is a function call expression
  • a = expression3 is an assignment expression

All of this happens in the declaration statement of variable a.

The third point is the function call expression, and the full expression that contains it is the one in the fourth point.

This means that the lifetimes of 1 and 2 end at the end of the assignment. But we got a reference to one of them! And we are using it!

Accessing an object whose lifetime has ended is prohibited by C++, and doing so results in an invalid program.

In a more complex example, such as int a = max(1,2) + max(3,4);, the temporary objects referred to by the references that the max calls return remain valid until the end of the assignment, but no longer.

Here, we use the two references to sum the values, and then we store the result as a value, so the program is correct. What we must not do is keep one of the returned references, as in const int& a = max(1,2);, and use it after the full expression has ended.

This sounds confusing, but it is important to understand as it can be a source of hard-to-debug problems if we use a temporary object after the full expression in which it's created has finished executing.
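By contrast, the following sketch shows a call that is safe because the arguments are named variables that outlive the returned reference. The function has the same logic as the reference-returning max; it is renamed maxByRef here (an assumption made for this sketch) to avoid any clash with std::max, and largerOfTwoNamedValues is a hypothetical caller:

```cpp
// Same logic as the reference-returning max in the text, renamed here
// (an assumption) to avoid any clash with std::max.
const int& maxByRef(const int& a, const int& b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}

// Hypothetical caller showing a safe use of the returned reference.
int largerOfTwoNamedValues() {
    int x = 1;
    int y = 2;
    // x and y are named variables that live until the end of this function,
    // so the returned reference stays valid for as long as it is used here.
    const int& larger = maxByRef(x, y);
    return larger;  // copied out by value before x and y are destroyed
}
```

The reference larger refers to y, which is still alive on the next line, so reading it is fine.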

Let's look at another common mistake in functions returning references:

int& sum(int a, int b) {

int c = a + b;

return c;

}

We created a local, automatic object in the function body and then we returned a reference to it.

In the previous section, we saw that local objects' lifetimes end at the end of the function. This means that we are returning a reference to an object whose lifetime has always ended by the time the caller receives the reference.

Earlier, we mentioned the similarities between passing arguments by reference and passing arguments by pointers.

This similarity persists when returning pointers: the object pointed to by a pointer needs to be alive when the pointer is later dereferenced.

So far, we have covered examples of mistakes while returning by reference. How can references be used correctly as return types to functions?

The important part of using references correctly as return values is to make sure that the object outlives the reference: the object must stay alive at least as long as there is a reference to it.

A common example is accessing a part of an object, for example, using an std::array, which is a safe option compared to the built-in array:

int& getMaxIndex(std::array<int, 3>& array, int index1, int index2) {

/* This function requires that index1 and index2 must be smaller than 3! */

int maxIndex = max(index1, index2);

return array[maxIndex];

}

The calling code would look as follows:

int main() {

std::array<int, 3> array = {1,2,3};

int& elem = getMaxIndex(array, 0, 2);

elem = 0;

std::cout << array[2];

// Prints 0

}

In this example, we are returning a reference to an element inside an array, and the array remains alive longer than the reference.

The following are guidelines for using return by reference correctly:

  • Never return a reference to a local variable (or a part of it)
  • Never return a reference to a parameter accepted by value (or a part of it)

When returning a reference that was received as a parameter, the argument passed to the function must live longer than the returned reference.

Apply the previous rule, even when you are returning a reference to a part of the object (for example, an element of an array).

Activity 4: Using Pass by Reference and Pass by Value

In this activity, we are going to see the different trade-offs that can be made when writing a function, depending on the parameters the function accepts:

  1. Write a function that takes two numbers and returns the sum. Should it take the arguments by value or reference? Should it return by value or by reference?
  2. After that, write a function that takes two std::arrays of ten integers and an index (guaranteed to be less than 10) and returns the greater of the two elements at the given index in the two arrays.
  3. The calling function should then modify the element. Should it take the arguments by value or reference? Should it return by value or by reference? What happens if the values are the same?

Take the arrays by reference and return by reference because we are saying that the calling function is supposed to modify the element. Take the index by value since there is no reason to use references.

If the values are the same, the element from the first array is returned.

Note

The solution to this activity can be found on page 285.

Const Parameters and Default Arguments

In the previous sections, we saw how and when to use references in function parameters and return types. C++ has an additional qualifier, the const qualifier, which can be used independently from the ref-ness (whether the type is a reference or not) of the type.

Let's see how const is used in the various scenarios we investigated when looking at how functions can accept parameters.

Passing by const Value

In pass by value, the function parameter is a value type: when invoked, the argument is copied into the parameter.

This means that regardless of whether const is used in the parameter or not, the calling code cannot see the difference.

The only reason to use const in the function signature is to document to the implementation that it cannot modify such a value.

This is not commonly done, as the biggest value of a function signature is for the caller to understand the contract of calling the function. Because of this, it is rare to see int max(const int, const int), even if the function does not modify the parameters.

There is an exception, though: when the function accepts a pointer.

In such cases, the function wants to make sure that it is not assigning a new value to the pointer. The pointer acts similarly to a reference here, since it cannot be bound to a new object, but provides nullability.

An example could be setValue(int * const), a function that takes a const pointer to an int.

The integer is not const, so it can be changed, but the pointer is const, so the implementation cannot make it point to a different object.
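A possible body for such a function is sketched below. The text only names setValue(int * const); the second parameter and the null check are assumptions added so that the sketch has a value to store:

```cpp
// Hypothetical implementation of setValue: the second parameter is an
// assumption added so that the function has something to store.
void setValue(int* const target, int newValue) {
    if (target == nullptr) {
        return;  // unlike a reference, a pointer can be null
    }
    *target = newValue;   // allowed: the int itself is not const
    // target = nullptr;  // would not compile: the pointer is const
}
```

From the caller's point of view, setValue(&number, 42) behaves exactly as it would without the const; the qualifier only restricts what the body of the function may do with its own copy of the pointer.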

Passing by const Reference

Const is extremely important in pass by reference, and any time you use a reference in the parameters of a function, you should also add const to it (if the function is not designed to modify it).

The reason for this is that a reference allows you to modify the provided object freely.

This is error-prone, since the function might, by mistake, modify an object that the caller does not expect to be modified. It is also hard to understand, because there is no clear boundary between the caller and the function: the function can modify the state of the caller.

const instead fixes the problem, since a function cannot modify an object through a const reference.

This allows the function to use reference parameters without some of the drawbacks of using references.

A function should take a non-const reference only if it is intended to modify the provided object; otherwise, every reference parameter should be const.

Another advantage of const reference parameters is that temporary objects can be used as arguments for them.
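The following sketch illustrates both points. The function countSpaces is hypothetical; it only reads the string it receives, so it takes it by const reference:

```cpp
#include <cstddef>
#include <string>

// Hypothetical function: reads the string through a const reference
// and cannot modify the caller's object.
std::size_t countSpaces(const std::string& text) {
    std::size_t spaces = 0;
    for (char c : text) {
        if (c == ' ') {
            ++spaces;
        }
    }
    return spaces;
}
```

Both countSpaces(std::string("a b c")) and countSpaces("a b c") compile, because a temporary std::string can bind to a const reference. If the parameter were declared as std::string& instead, neither call would compile.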

Returning by const Value

There is no widespread reason to return by const value. The calling code often assigns the value to a variable, in which case the const-ness of the variable is the deciding factor. Otherwise, it passes the value to the next expression, and it is rare for an expression to expect a const value.

Returning by const value also inhibits the move semantics of C++11, thus reducing performance.

Returning by const Reference

A function should always return by const reference when the returned reference is meant to only be read and not be modified by the calling code.

The same concept we applied to object lifetimes when returning references to them also applies to const:

  • When returning a reference accepted as a parameter, if the parameter is a const reference, the returned reference must be const as well
  • When returning a reference to a part of an object accepted as the const reference parameter, the returned reference must be const as well

A parameter accepted as a reference should be returned as a const reference if the caller is not expected to modify it.

Sometimes, the compilation fails, stating that the code is trying to modify an object that is a const reference. Unless the function is meant to modify the object, the solution is not to remove const from the reference in the parameter. Instead, look for why the operation that you are trying to perform does not work with const, and what the possible alternatives are.

const is not about the implementation, it is about the meaning of the function.

When you write the function signature, you should decide whether to use const, as the implementation will have to find a way to respect that.

For example:

void setTheThirdItem(std::array<int, 10>& array, int item)

This should clearly take a reference to the array since its purpose is to modify the array.

On the other hand, we can use the following:

int findFirstGreaterThan(const std::array<int, 10>& array, int threshold)

This tells us that we are only looking into the array – we are not changing it, so we should use const.
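To see how the two signatures constrain their implementations, here is a minimal sketch. The bodies are assumptions, since the text only gives the signatures; in particular, we assume that findFirstGreaterThan returns the index of the first matching element, or -1 if there is none:

```cpp
#include <array>
#include <cstddef>

// The bodies below are assumptions; the text only gives the signatures.
void setTheThirdItem(std::array<int, 10>& array, int item) {
    array[2] = item;  // the third item has index 2
}

// Assumption: returns the index of the first element greater than
// threshold, or -1 if there is no such element.
int findFirstGreaterThan(const std::array<int, 10>& array, int threshold) {
    for (std::size_t i = 0; i < array.size(); ++i) {
        if (array[i] > threshold) {
            return static_cast<int>(i);
        }
    }
    // array[0] = 0;  // would not compile: the array is const here
    return -1;
}
```

The compiler enforces the contract expressed in the signature: any attempt to write to the array inside findFirstGreaterThan is rejected at compile time.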

Note

It is a best practice to use const as much as possible, as it allows the compiler to make sure that we are not modifying objects that we do not want to modify.

This can help to prevent bugs.

It also helps to keep another best practice in mind: never use the same variable to represent different concepts. Since the variable cannot be changed, it is less natural to reuse it instead of creating a new one.

Default Arguments

Another feature C++ provides to make life easier for the caller when it comes to calling functions is default arguments.

Default arguments are added to a function declaration. The syntax is to add an = sign and supply the value of the default argument after the identifier of the parameter of the function. An example of this would be:

int multiply(int multiplied, int multiplier = 1);

The caller of the function can call multiply either with 1 or 2 arguments:

multiply(10); // Returns 10

multiply(10, 2); // Returns 20

When an argument with a default value is omitted, the function uses the default value instead. This is extremely convenient if there are functions with sensible defaults that callers mostly do not want to modify, except in specific cases.

Imagine a function that returns the first word of a string:

char const * firstWord(char const * string, char separator = ' ');

Most of the time, words are separated by a whitespace character, but the caller can decide to use a different separator. The fact that the function offers the possibility to provide a separator does not force most callers, which simply want to use the space, to specify it.

It is a best practice to set the default arguments in the function signature declaration, and not declare them in the definition.
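As a sketch of this practice, the multiply function from before could be declared and defined as follows. The body is an assumption, since the text only shows the declaration:

```cpp
// Declaration (typically in a header): the default is stated here.
int multiply(int multiplied, int multiplier = 1);

// Definition: the default is not repeated. The body is an assumption
// based on the results described in the text.
int multiply(int multiplied, int multiplier) {
    return multiplied * multiplier;
}
```

Callers that only see the declaration can still write multiply(10), and the compiler fills in the default value for multiplier.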

Namespaces

One of the goals of functions is to better organize our code. To do so, it is important to give meaningful names to them.

For example, in package management software, there might be a function called sort for sorting packages. As you can see, the name is the same as the function that would sort a list of numbers.

C++ has a feature that allows you to avoid these kinds of problems and groups names together: namespaces.

A namespace starts a scope in which all the names declared inside are part of the namespace.

To create a namespace, we use the namespace keyword, followed by the identifier and then the code block:

namespace example_namespace {

// code goes here

}

To access an identifier inside a namespace, we prepend the name of the namespace to the name of the function.

Namespaces can be nested as well. Simply use the same declaration as before inside the namespace:

namespace parent {

namespace child {

// code goes here

}

}

To access an identifier inside a namespace, you prepend the name of the identifier with the name of the namespace in which it is declared, followed by ::.

You might have noticed that we have been using std::cout. This is because the C++ standard library defines the std namespace and we were accessing the variable named cout.

To access an identifier inside multiple namespaces, you can prepend the list of all the namespaces separated by ::, as in parent::child::some_identifier. We can access names in the global scope by prepending :: to the name, as in ::name_in_global_scope.
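The following sketch puts the three forms of access side by side. All the names in it are hypothetical and only illustrate qualified lookup:

```cpp
// All names below are hypothetical and only illustrate qualified lookup.
int answer() { return 0; }  // declared in the global scope

namespace parent {
    int answer() { return 1; }

    namespace child {
        int answer() { return 2; }

        int callGlobal() {
            return ::answer();  // leading :: selects the global scope
        }
    }
}
```

From outside the namespaces, parent::answer() and parent::child::answer() select the functions declared in each namespace. Inside child, a plain answer() would find child's own function, which is why callGlobal needs the leading :: to reach the global one.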

If we were to only use cout, the compiler would have told us that the name does not exist in the current scope.

This is because the compiler searches only in the current namespace and the parent namespaces to find an identifier by default, so unless we specify the std namespace, the compiler will not search in it.

C++ helps make this more ergonomic with the help of the using declaration.

The using declaration is defined by the using keyword, followed by an identifier specified with its namespaces.

For example, using std::cout; is a using declaration that declares that we want to use cout. When we want to use all the declarations from a namespace, we can write using namespace namespace_name; (this form is called a using directive). For example, if we want to use every name defined in the std namespace, we would write: using namespace std;.

When a name is declared inside the using declaration, the compiler also looks for that name when looking for an identifier.

This means that, in our code, we can use cout and the compiler will find std::cout.

A using declaration is valid as long as we are in the scope in which it is declared.
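The scope rule can be sketched as follows. The namespace packages and the functions in this example are hypothetical names chosen for illustration:

```cpp
// Hypothetical namespace and function names, used only for illustration.
namespace packages {
    int countInstalled() { return 3; }
}

int viaUsingDeclaration() {
    using packages::countInstalled;  // valid until the end of this function
    return countInstalled();         // found without the packages:: prefix
}

int viaQualifiedName() {
    // The using declaration above is not visible here,
    // so the qualified name is required.
    return packages::countInstalled();
}
```

Placing the using declaration inside a function keeps its effect local, which limits the risk of accidental name clashes elsewhere in the file.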

Note

To better organize your code and avoid naming conflicts, you should always put your code inside a namespace that's specific to either your application or library.

Namespaces can also be used to specify that some code is used only by the current code.

Let's imagine you have a file called a.cpp that contains int default_name = 0; and another file called b.cpp with int default_name = 1;. When you compile the two files and link them together, you get an invalid program: the same variable has been defined twice, with two different values, and this violates the One Definition Rule (ODR).

But you never meant for those to be the same variable. To you, they were some variables that you just wanted to use inside your .cpp file.

To tell that to the compiler, you can use anonymous namespaces: a namespace with no identifier.

All the identifiers created inside it will be private to the current translation unit (normally the .cpp file).

How can you access an identifier inside an anonymous namespace? You can access the identifier directly, without the need to use the namespace name, which does not exist, or the using declaration.
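A sketch of what a.cpp could look like with an anonymous namespace follows. The accessor function readDefaultName is a hypothetical addition for this illustration:

```cpp
// Sketch of what a.cpp could look like; the accessor function is a
// hypothetical addition.
namespace {
    int default_name = 0;  // private to this translation unit
}

int readDefaultName() {
    return default_name;  // accessed directly, with no namespace prefix
}
```

If b.cpp wraps its own default_name in an anonymous namespace in the same way, the two variables no longer conflict when the files are linked together.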

Note

You should only use anonymous namespaces in .cpp files.

Activity 5: Organizing Functions in Namespaces

Write a function to read the name of a car for a lottery in a namespace based on numerical input. If the user inputs 1, they win a Lamborghini, and if the user inputs 2, they win a Porsche:

  1. Define the first namespace as LamborghiniCar with an output() function that will print "Congratulations! You deserve the Lamborghini." when called.
  2. Define the second namespace as PorscheCar with an output() function that will print "Congratulations! You deserve the Porsche." when called.
  3. Write a main function to read the input of numbers 1 and 2 into a variable called magicNumber.
  4. Create an if-else statement in which the if condition calls the first namespace with LamborghiniCar::output() if the input is 1. Otherwise, the second namespace is called similarly when the input is 2.
  5. If neither of these conditions is met, we print a message asking the user to enter either 1 or 2.

    Note

    The solution for this activity can be found on page 285.

Function Overloading

We saw how C++ allows us to write functions that take parameters either by value or by reference, use const, and are organized in namespaces.

There is an additional powerful feature of C++ that allows us to give the same name to functions that perform the same conceptual operation on different types: function overloading.

Function overloading is the ability to declare several functions with the same name, provided that the set of parameters they accept is different.

An example of this is the multiply function. We can imagine this function being defined for integers and floats, or even for vectors and matrices.

If the concept represented by the function is the same, we can provide several functions that accept different kinds of parameters.

When a function is invoked, the compiler looks at all the functions with that name, called the overload set, and picks the function that is the best match for the arguments provided.

The precise rule on how the function is selected is complex, but the behavior is often intuitive: the compiler looks for the better match between the arguments and the expected parameters of the function. If we have two functions, int increment(int) and float increment(float), and we call them with increment(1), the integer overload is selected because an integer is a better match to an integer than a float, even if an integer can be converted into a float. An example of this would be:

bool isSafeHeightForRollercoaster(int heightInCm) {

return heightInCm > 100 && heightInCm < 210;

}

bool isSafeHeightForRollercoaster(float heightInM) {

return heightInM > 1.0f && heightInM < 2.1f;

}

// Calls the int overload

isSafeHeightForRollercoaster(187);

// Calls the float overload

isSafeHeightForRollercoaster(1.67f);

Thanks to this feature, the calling code does not need to worry about which overload of the function the compiler is going to select, and the code can be more expressive thanks to using the same function to express the same meaning.
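The selection rule can also be checked at compile time. The following sketch uses the two increment overloads mentioned earlier; their bodies are assumptions, since the text only gives the signatures:

```cpp
#include <type_traits>

// The two overloads mentioned in the text; the bodies are assumptions.
int increment(int value) { return value + 1; }
float increment(float value) { return value + 1.0f; }

// The return type reveals which overload the compiler selected.
static_assert(std::is_same<decltype(increment(1)), int>::value,
              "an int argument selects the int overload");
static_assert(std::is_same<decltype(increment(1.0f)), float>::value,
              "a float argument selects the float overload");

// increment(1.0);  // would not compile: a double converts equally well
//                  // to int and to float, so the call is ambiguous
```

The commented-out call shows a case where the behavior is less intuitive: when no overload is a better match than the others, the compiler reports an error instead of picking one.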

Activity 6: Writing a Math Library for a 3D Game

Johnny wants to implement a math library for the video game he is making. It will be a 3D game, so he will need to operate on points representing the three coordinates: x, y, and z.

The points are represented as std::array<float, 3>. A library will be used throughout the game, so Johnny needs to be sure it can work when included multiple times (by creating a header file and declaring the functions there).

The library needs to support the following steps:

  1. Finding the distance between 2 floats, 2 integers, or 2 points.
  2. If only one of the 2 points is provided, the other one is assumed to be the origin (the point at location (0,0,0)).
  3. Additionally, Johnny often needs to compute the circumference of a circle from its radius (defined as 2*pi*r) to understand how far enemies can see. pi is constant for the duration of the program (which can be declared globally in the .cpp file).
  4. When an enemy moves, it visits several points. Johnny needs to compute the total distance that it would take to walk along those points.
  5. For simplicity, we will limit the number of points to 10, but Johnny might need up to 100. The function would take std::array<std::array<float, 3>, 10> and compute the distance between consecutive points.

    For example (with a list of 5 points): for the array {{0,0,0}, {1,0,0}, {1,1,0}, {0,1,0}, {0,1,1}}, the total distance is 5, because going from {0,0,0} to {1,0,0} is a distance of 1, then going from {1,0,0} to {1,1,0} is a distance of 1 again, and so on for the remaining 3 points.

    Note

    The solution for this activity can be found on page 286.

Make sure that the functions are well-organized by grouping them together.

Remember that the distance between two points is computed as the square root of (x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2.

C++ offers the std::pow function for the power function, which takes the base and the exponent, and the std::sqrt function, which takes the number whose square root is computed. Both are in the cmath header.
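As a starting point, the distance formula for two points could be sketched as follows. This is not the book's solution: the namespace mathlib and the function name distance are assumptions chosen for this illustration:

```cpp
#include <array>
#include <cmath>

// A minimal sketch, not the book's solution: the namespace and function
// names are assumptions.
namespace mathlib {
    float distance(const std::array<float, 3>& p1,
                   const std::array<float, 3>& p2) {
        // Square root of the sum of the squared coordinate differences.
        return std::sqrt(std::pow(p2[0] - p1[0], 2.0f) +
                         std::pow(p2[1] - p1[1], 2.0f) +
                         std::pow(p2[2] - p1[2], 2.0f));
    }
}
```

The points are taken by const reference because the function only reads them, and the result is returned by value. The remaining overloads and the default argument for the origin are left to the activity.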

Summary

In this chapter, we saw the powerful features C++ offers to implement functions.

We started by discussing why functions are useful and what they can be used for, and then we dove into how to declare and define them.

We analyzed different ways of accepting parameters and returning values, how to make use of local variables, and then explored how to improve the safety and convenience of calling them with const and default arguments.

Finally, we saw how functions can be organized in namespaces and the ability to give the same name to different functions that implement the same concept, making the calling code not have to think about which version to call.

In the next chapter, we will look at how to create classes and how they are used in C++ to make building complex programs easy and safe.
