By the end of this chapter, you will be able to:
In this chapter, we are going to look at functions in C++, how to use them, and why we would want to use them.
Functions are a core tool in a programmer's toolkit for writing maintainable code. The concept of a function is common in almost every programming language. Functions have different names in various languages: procedures, routines, and many more, but they all have two main characteristics in common:
The programmer can call, or invoke, a function when the functionality it provides is needed.
When the function is called, the sequence of instructions is executed. The caller can also provide some data to the function to be used in operations within the program. The following are the main advantages of using functions:
Using functions greatly increases the readability of our code because it's now composed of descriptive names of what we are trying to achieve, without the noise of how the result is achieved.
In fact, it is easier to test and debug as you may only need to modify a function without having to revisit the structure of the program.
Abstraction is the process of extracting all relevant properties from a class and exposing them, while hiding details that are not important for a specific usage.
Let's take a tree as an example. If we were to use it in the context of an orchard, we could abstract the tree to be a "machine" that takes a determined amount of space and, given sunlight, water, and fertilizers, produces a certain number of fruits per year. The property we are interested in is the tree's fruit production ability, so we want to expose it and hide all the other details that are not relevant to our case.
In computer science, we want to apply the same concept: capture the key fundamental properties of a class without showing the algorithm that implements it.
A prime example of this is the sort function, which is present in many languages. We know what the function expects and what it is going to do, but rarely are we aware of the algorithm that is used to do it, and it might also change between different implementations of the language.
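As a small illustration of this kind of abstraction, the sketch below calls std::sort from the standard <algorithm> header without knowing which sorting algorithm the library uses. The helper name sorted_copy is an assumption made for this example only.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: returns a sorted copy of the input.
// We rely only on what std::sort promises (ascending order by default),
// not on which algorithm the standard library uses internally.
std::vector<int> sorted_copy(std::vector<int> values) {
    std::sort(values.begin(), values.end());
    return values;
}
```

Calling sorted_copy({3, 1, 2}) would give back the elements in ascending order, whichever algorithm the library picked.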
In the following sections, we will demystify how function declaration and definition work.
A function declaration has the role of telling the compiler the name, the parameters, and the return type of a function. After a function has been declared, it can be used in the rest of the program.
The definition of the function specifies what operations a function performs.
A declaration is composed of the type of the returned value, followed by the name of the function and by a list of parameters inside a pair of parentheses. These last two components form the signature of the function. The syntax of a function declaration is as follows:
// Declaration: function without body
return_type function_name( parameter_list );
If a function returns nothing, then the type void can be used, and if a function is not expecting any parameters the list can be empty.
Let's look at an example of a function declaration:
void doNothingForNow();
Here, we declared a function named doNothingForNow(), which takes no arguments and returns nothing. After this declaration, we can call the doNothingForNow() function in our program.
To call a function that does not have any parameters, write the name of the function followed by a pair of parentheses.
When a function is called, the execution flow goes from the body of the function currently being executed to the body of the called function.
In the following example, the execution flow starts at the beginning of the body of the main function and executes its operations in order. The first operation it encounters is the call to doNothingForNow(). At that point, the execution flow goes into the body of doNothingForNow().
When all the operations inside a function have been executed, or the function instructs the execution to go back to the caller, the execution flow resumes from the operation after the function call.
In our example, the operation after the function call prints Done on the console:
#include <iostream>
void doNothingForNow();
int main() {
    doNothingForNow();
    std::cout << "Done";
}
If we were to compile this program, the compilation would succeed, but linking would fail.
In this program, we instructed the compiler that a function called doNothingForNow() exists and then we invoked it. The compiler generates an output that calls doNothingForNow().
The linker then tries to create an executable from the compiler output, but since we did not define doNothingForNow(), it cannot find the function's definition, so it fails.
To successfully build the program, we need to define doNothingForNow(). In the next section, we will explore how to define a function using the same example.
To define a function, we need to write the same information that we used for the declaration: the return type, the name of the function, and the parameter list, followed by the function body. The function body delimits a new scope and is composed of a sequence of statements delimited by curly braces.
When the function is executed, the statements are executed in order:
// Definition: function with body
return_type function_name( parameter_list ) {
statement1;
statement2;
...
last_statement;
}
Let's fix the program by adding the body for doNothingForNow():
void doNothingForNow() {
// Do nothing
}
Here, we defined doNothingForNow() with an empty body. This means that as soon as the function execution starts, the control flow returns to the function that called it.
When we define a function, we need to make sure that the return type and the signature (the name and the parameters) are the same as in the declaration.
The definition counts as a declaration as well. We can skip the declaration if we define the function before calling it.
Let's revisit our program now since we have added the definition for our function:
#include <iostream>
void doNothingForNow() {
// do nothing
}
int main() {
doNothingForNow();
std::cout << "Done";
}
If we compile and run the program, it will succeed and show Done on the output console.
In a program, there can be multiple declarations of the same function, as long as the declarations are the same. On the other hand, only a single definition of the function can exist, as mandated by the One Definition Rule (ODR).
The exception is inline functions and templates: their definitions may appear in several files that are compiled separately, but they need to be identical. If they are not, then the program might do unpredictable things.
The compiler is not going to warn you!
The solution is to have the declaration in a header file, and the definition in an implementation file.
A header file is included in many different implementation files, and the code in these files can call the function.
An implementation file is compiled only once, so we can guarantee that the definition is seen only once by the compiler.
Then, the linker puts all of the outputs of the compiler together, finds a definition of the function, and produces a valid executable.
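To see the declaration and definition rules in isolation, here is a minimal single-file sketch. The function names answer and twice_answer are assumptions made for illustration.

```cpp
// Two identical declarations of the same function: this is allowed.
int answer();
int answer();

// answer() can be called here because it has been declared,
// even though its definition appears later in the file.
int twice_answer() {
    return 2 * answer();
}

// Exactly one definition of answer() exists in the program.
int answer() {
    return 21;
}
```

Adding a second definition of answer() to this file would be rejected by the compiler.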
In our application, we want to log errors. To do so, we have to specify a function called log(), which prints Error! to the standard output when called.
Let's create a function that can be called from several files, and put it in a different header file that can be included:
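// Header file: log.h
void log();

// Implementation file (for example, log.cpp)
#include "log.h"
#include <iostream>
// This is where std::cout and std::endl are defined
void log() {
    std::cout << "Error!" << std::endl;
}

// Another implementation file that uses log() (for example, main.cpp)
#include "log.h"
int main() {
    log();
}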
The body of a function is a code block that can contain valid statements, one of which is a variable definition. As we learned in Lesson 1, Getting Started, when such a statement appears, the function declares a local variable.
This is in contrast to global variables, which are the variables that are declared outside of functions (and classes, which we will look at in Lesson 3, Classes).
The difference between a local and a global variable is in the scope in which it is declared, and thus, in who can access it.
Local variables are in the function scope and can only be accessed by the function. On the contrary, global variables can be accessed by any function that can see them.
It is desirable to use local variables over global variables because they enable encapsulation: only the code inside the function body can access and modify the variable, making the variable invisible to the rest of the program. This makes it easy to understand how a variable is used by a function since its usage is restricted to the function body and we are guaranteed that no other code is accessing it.
Encapsulation is usually used for three separate reasons, which we will explore in more detail in Lesson 3, Classes:
On the other hand, global variables can be accessed by any function.
This makes it hard to be sure of the variable's value when interacting with it, unless we know not only what our function does, but also what all the other code in the program that interacts with the global variable does.
Additionally, code that we add later to the program might start modifying the global variable in a way that we did not expect in our function, breaking the functionality of our function without ever modifying the function itself. This makes it extremely difficult to modify, maintain, and evolve programs.
The solution to this problem is to use the const qualifier so that no code can change the variable, and we can treat it as a value that never changes.
Always use the const qualifier with global variables whenever possible.
Try to avoid using mutable global variables.
It is a good practice to use global const variables instead of using values directly in the code. They allow you to give a name and a meaning to the value, without any of the risks that come with mutable global variables.
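A minimal sketch of this practice follows. The constant MAX_LOGIN_ATTEMPTS and the function can_retry are hypothetical names chosen for the example.

```cpp
// A named, immutable global: it documents the meaning of the value 3
// and no code in the program can change it.
const int MAX_LOGIN_ATTEMPTS = 3;

// Hypothetical function that uses the named constant
// instead of writing the value 3 directly in the condition.
bool can_retry(int attemptsSoFar) {
    return attemptsSoFar < MAX_LOGIN_ATTEMPTS;
}
```

If the limit ever changes, only the definition of the constant needs to be updated.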
It is important to understand the relationship between variables, objects, and the lifetime of objects in C++ to write programs correctly.
An object is a piece of data in the program's memory.
A variable is a name we give to an object.
There is a distinction in C++ between the scope of a variable and the lifetime of the object it refers to. The scope of a variable is the part of the program where the variable can be used.
The lifetime of an object, on the contrary, is the time during execution wherein the object is valid to access.
Let's examine the following program to understand the lifetime of an object:
#include <iostream>
/* 1 */ const int globalVar = 10;
int* foo(const int* other) {
/* 5 */ int fooLocal = 0;
std::cout << "foo's local: " << fooLocal << std::endl;
std::cout << "main's local: " << *other << std::endl;
/* 6 */ return &fooLocal;
}
int main()
{
/* 2 */ int mainLocal = 15;
/* 3 */ int* fooPointer = foo(&mainLocal);
std::cout << "main's local: " << mainLocal << std::endl;
std::cout << "We should not access the content of fooPointer! It's not valid." << std::endl;
/* 4 */ return 0;
}
The lifetime of a variable starts when it is initialized and ends when the containing block ends. Even if we have a pointer or reference to a variable, we should access it only if it's still valid. fooPointer is pointing to a variable which is no longer valid, so it should not be used!
When we declare a local variable in the scope of a function, the compiler automatically creates an object when the function execution reaches the variable declaration; the variable refers to that object.
When we declare a global variable instead, we are declaring it in a scope that does not have a clear duration – it is valid for the complete duration of the program. Because of this, the compiler creates the object when the program starts before any function is executed – even the main() function.
The compiler also takes care of terminating the object's lifetime when the execution exits from the scope in which the variable has been declared, or when the program terminates in the case of a global variable. The termination of the lifetime of an object is usually called destruction.
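Local variables declared in a block are called automatic variables, because the compiler takes care of initializing and terminating the lifetime of the object associated with them. The compiler manages the lifetime of global variables as well, but their objects live for the whole duration of the program (static storage duration).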
Let's look at an example of a local variable:
void foo() {
int a;
}
In this case, the variable a is a local variable of type int. The compiler automatically initializes the object it refers to with what is called its default initialization when the execution reaches that statement, and the object will be destroyed at the end of the function, again, automatically.
The default initialization of basic types, such as integers, does nothing for us. This means that the variable a will have an indeterminate value, and reading it before assigning to it is an error.
If multiple local variables are defined, the initialization of the objects happens in the order of declaration:
void foo() {
int a;
int b;
}
Variable a is initialized before b. Since variable b was initialized after a, its object is destroyed before the one a refers to.
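The order can be made visible with a small helper type that records when its objects are created and destroyed. This is only a sketch: Tracer and scope_demo are hypothetical names, and the example uses a struct with a constructor and destructor, a feature covered later with classes.

```cpp
#include <string>
#include <vector>

// Hypothetical helper type: records its own initialization and destruction.
struct Tracer {
    std::string name;
    std::vector<std::string>& log;

    Tracer(const std::string& n, std::vector<std::string>& l) : name(n), log(l) {
        log.push_back("init " + name);
    }
    ~Tracer() {
        log.push_back("destroy " + name);
    }
};

void scope_demo(std::vector<std::string>& log) {
    Tracer a("a", log);  // initialized first
    Tracer b("b", log);  // initialized second
}  // b is destroyed first, then a
```

After calling scope_demo, the log contains the entries init a, init b, destroy b, destroy a, in that order.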
If the execution never reaches the declaration, the variable is not initialized. If the variable is not initialized, it is also not destroyed:
void foo() {
if (false) {
int a;
}
int b;
}
Here, the variable a is never default initialized, and thus never destroyed. This is similar for global variables:
#include <iostream>
const int a = 1;
int main() {
    std::cout << "a=" << a << std::endl;
}
Variable a is initialized before the main() function is called and is destroyed after we return the value from the main() function.
We want to write a function that returns the 10th number in a Fibonacci sequence.
The nth Fibonacci number is defined as the sum of the n-1th and the n-2th, with the first number in the sequence being 0 and the second being 1.
Example:
10th Fibonacci number = 8th Fibonacci number + 9th Fibonacci number
We want to use the best practice of giving a name and a meaning to values, so instead of using 10 in the code, we are going to define a const global variable, named POSITION.
We will also use two local variables in the function to remember the n-1th and the n-2th number:
#include <iostream>
const int POSITION = 10;
const int ALREADY_COMPUTED = 3;
void print_tenth_fibonacci() {
    int n_1 = 1;
    int n_2 = 0;
    int current = n_1 + n_2;
    for (int i = ALREADY_COMPUTED; i < POSITION; ++i) {
        n_2 = n_1;
        n_1 = current;
        current = n_1 + n_2;
    }
    std::cout << current << std::endl;
}
int main() {
    std::cout << "Computing the 10th Fibonacci number" << std::endl;
    print_tenth_fibonacci();
}
Let's understand the variable data flow of this exercise. First, the n_1 variable is initialized, then n_2 is initialized, and right after that, current is initialized. And then, current is destroyed, n_2 is destroyed, and finally, n_1 is destroyed.
i is also an automatic variable in the scope that's created by the for loop, so it is destroyed at the end of the for loop scope.
For each combination of cond1 and cond2, identify when initialization and destruction occurs in the following program:
void foo(bool cond1, bool cond2) {
    if (cond1) {
        int a;
    }
    if (cond2) {
        int b;
    }
}
In the Introduction section, we mentioned that the caller can provide some data to the function. This is done by passing arguments to the parameters of the function.
The parameters that a function accepts are part of its signature, so we need to specify them in every declaration.
The list of parameters a function can accept is contained in the parentheses after the function name. The parameters are comma-separated, and each one is composed of a type and, optionally, an identifier.
For example, a function taking two integer numbers would be declared as follows:
void two_ints(int, int);
If we wanted to give a name to these parameters, a and b respectively, we would write the following:
void two_ints(int a, int b);
Inside its body, the function can access the identifiers defined in the function signature as if they were declared variables. The values of the function parameters are decided when the function is called.
To call a function that takes a parameter, you need to write the name of the function, followed by a list of expressions inside a pair of parentheses:
two_ints(1,2);
Here, we called the two_ints function with two arguments: 1 and 2.
The arguments used to call the function initialize the parameters that the function is expecting. Inside the two_ints function, variable a will be equal to 1, and b will be equal to 2.
Each time the function is called, a new set of parameters is initialized from the arguments that were used to call the function.
Parameter: A variable declared by the function in its parameter list, through which the function receives data from the caller.
Argument: The value the caller wants to bind to the parameters of the function.
In the previous example, we used two values, but we can also use arbitrary expressions as arguments:
two_ints(1+2, 2+3);
The order in which the argument expressions are evaluated is not specified!
This means that when calling two_ints(1+2, 2+3);, the compiler might first execute 1+2 and then 2+3, or 2+3 and then 1+2. This is usually not a problem if the expression does not change any state in the program, but it can create bugs that are hard to detect when it does. For example, given int i = 0;, if we call two_ints(i++, i++), we don't know whether the function is going to be called with two_ints(0, 1) or two_ints(1, 0).
In general, it's better to put expressions that change the state of the program in their own statements, and call functions with expressions that do not modify the program's state.
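A minimal sketch of this advice follows. The names difference and ordered_call are assumptions made for the example.

```cpp
// Hypothetical function used only for this illustration.
int difference(int a, int b) {
    return a - b;
}

int ordered_call() {
    int i = 0;
    // Each state change gets its own statement, so the order is explicit.
    int first = i++;   // first == 0, i becomes 1
    int second = i++;  // second == 1, i becomes 2
    // The arguments no longer modify any state, so the evaluation
    // order of the two arguments cannot affect the result.
    return difference(first, second);
}
```

Here the call is always difference(0, 1), regardless of the order in which the compiler evaluates the arguments.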
The function parameters can be of any type. As we already saw, a type in C++ could be a value, a reference, or a pointer. This gives programmers a few options on how to accept parameters from the callers, based on the behavior they want.
In the following subsections, we will explore the working mechanism of Pass by value and Pass by reference in more detail.
When the parameter type of a function is a value type, we say that the function is taking an argument by value or the argument is passed by value.
When a parameter is a value type, a new local object is created each time the function is called.
As we saw with automatic variables, the lifetime of the object lasts until the execution reaches the end of the function's scope.
When the parameter is initialized, a new copy is made from the argument provided when invoking the function.
If you want to modify a parameter but do not want or do not care about the calling code seeing the modification, use pass by value.
James wants to write a C++ program to calculate what the age of a person will be after five years by providing their current age as an input.
To implement such a program, he is going to write a function that takes a person's age by value and computes how old they will be in 5 years, and then prints it on the screen:
void byvalue_age_in_5_years(int age) {
age += 5;
std::cout << "Age in 5 years: " << age << std::endl;
// Prints 100
}
int main() {
int age = 95;
byvalue_age_in_5_years(age);
std::cout << "Current age: " << age;
// Prints 95
}
Pass by value should be the default way of accepting arguments: always use it unless you have a specific reason not to.
The reason for this is that it makes the separation between the calling code and the called function stricter: the calling code cannot see the changes that the called function makes on the parameters.
Passing parameters by value creates a clear boundary between the calling function and the called function, because the parameters are copied:
This makes it easy to understand the code, because the changes we make to the parameters have no impact outside of the function.
Pass by value can be the faster option when taking an argument, especially if the memory size of the argument is small (for example, integers, characters, float, or small structures).
We need to remember though that passing by value performs a copy of the argument. Sometimes, this can be an expensive operation both in terms of memory and processing time, like when copying a container with many elements.
There are some cases where this limitation can be overcome with the move semantics that were added in C++11. We will see more of them in Lesson 3, Classes.
Let's look at an alternative to pass by value that has a different set of properties.
When the parameter type of the function is a reference type, we say that the function is taking an argument by reference or the argument is passed by reference.
We saw earlier that a reference type does not create a new object – it is simply a new variable, or name that refers to an object that already exists.
When the function that accepts the argument by reference is called, the reference is bound to the object used in the argument: the parameter will refer to the given object. This means that the function has access to the object the calling code provided and can modify it.
This is convenient if the goal of the function is to modify an object, but it can be more difficult to understand the interaction between the caller and the called function in such situations.
Unless the function must modify the variable, always use const references, as we will see later.
James would like to write a C++ program which, given anyone's age as input, prints Congratulations! if their age will be 18 or older in the next 5 years.
Let's write a function that accepts its parameters by reference:
void byreference_age_in_5_years(int& age) {
age += 5;
}
int main() {
int age = 13;
byreference_age_in_5_years(age);
if (age >= 18) {
std::cout << "Congratulations! " << std::endl;
}
}
Unlike passing by value, the cost of passing by reference does not change with the memory size of the object passed.
This makes pass by reference the preferred method when copying the object in order to pass it by value would be expensive, especially if we cannot use the move semantics that were added in C++11.
If you want to use pass by reference, but you are not modifying the provided object, make sure to use const.
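As a sketch of this advice, the function below takes a container by const reference. The name sum_of is an assumption made for the example.

```cpp
#include <vector>

// Takes the vector by const reference: no copy of the elements is made,
// and the function cannot modify the caller's object.
long long sum_of(const std::vector<int>& values) {
    long long total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}
```

Trying to call values.push_back(1) inside sum_of would be rejected by the compiler, which documents to the caller that the argument is read only.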
With C++, we can use std::cin to read input from the console executing the program.
When writing std::cin >> variable;, the program will block waiting for some user input, and then it will populate variable with the value read from the input as long as it is a valid value and the program knows how to read it. By default, we can assign all the built-in data types and some types defined in the standard library, such as string.
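Since the value is populated only when the input is valid, it can be useful to check whether the read succeeded. The sketch below assumes a hypothetical helper named read_int; it accepts any input stream so that it works with std::cin as well as with other streams.

```cpp
#include <istream>
#include <sstream>

// Hypothetical helper: tries to read an int from the given stream.
// Returns true and stores the value in result when the input is valid.
bool read_int(std::istream& in, int& result) {
    if (in >> result) {
        return true;
    }
    return false;  // the input could not be read as an integer
}
```

With the console, the call would look like read_int(std::cin, age), and the program can print an error message when it returns false.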
James is creating a program to print a message on the console screen: "Congratulations! You are eligible to vote in your country" or "No worries, just <value> more years to go." after the user provides their current age as input.
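#include <iostream>
void byreference_age_in_5_years(int& age) {
    if (age >= 18) {
        std::cout << "Congratulations! You are eligible to vote for your nation." << std::endl;
        return;
    } else {
        int reqAge = 18;
        // One possible completion: report how many years are left
        std::cout << "No worries, just " << (reqAge - age) << " more years to go." << std::endl;
    }
}
int main() {
    int age;
    std::cout << "Please enter your age:";
    std::cin >> age;
    byreference_age_in_5_years(age);
}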
The solution for this activity can be found on page 284.
A temporary object cannot be passed as an argument for a non-const reference parameter. To accept temporary objects, we need to use const references or rvalue references. Rvalue references are identified by two ampersands, &&, and can only bind to temporary values. We will look at them in more detail in Lesson 4, Generic Programming and Templates.
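The difference can be sketched as follows. The function names length_of, append_mark, and temporaries_demo are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Const reference: accepts both named objects and temporaries.
std::size_t length_of(const std::string& text) {
    return text.size();
}

// Non-const reference: needs a named object that it can modify.
void append_mark(std::string& text) {
    text += "!";
}

std::size_t temporaries_demo() {
    std::string name = "abc";
    append_mark(name);                   // OK: name is a named object
    // append_mark(std::string("abc"));  // would not compile: temporary argument
    // A temporary can be passed to the const reference parameter.
    return length_of(std::string("hello")) + length_of(name);
}
```

In this sketch, name becomes "abc!" after the call to append_mark, so the function returns 5 + 4.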
We need to remember that a pointer is a value that represents the location of an object.
Since a pointer is a value, when we accept a parameter as a pointer, the pointer itself is passed by value.
This means that the modification of the pointer inside the function is not going to be visible to the caller.
But if we are modifying the object the pointer points to, then the original object is going to be modified:
void modify_pointer(int* pointer) {
*pointer = 1;
pointer = 0;
}
int main() {
int a = 0;
int* ptr = &a;
modify_pointer(ptr);
std::cout << "Value: " << *ptr << std::endl;
std::cout << "Did the pointer change? " << std::boolalpha << (ptr == &a);
}
Most of the time, we can think of passing a pointer as passing a reference, with the caveat that you need to be aware that the pointer might be null.
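A minimal sketch of handling the null case is shown here. The helper name increment_if_present is an assumption made for the example.

```cpp
// Hypothetical helper: increments the pointed-to value if there is one.
// Returns false when the pointer is null and nothing was modified.
bool increment_if_present(int* value) {
    if (value == nullptr) {
        return false;
    }
    ++(*value);
    return true;
}
```

A function taking int& instead would not need the check, because a reference always refers to an object.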
Accepting a parameter as a pointer is mainly used for three reasons:
We will see in Lesson 4, Generic Programming and Templates, how features introduced in C++11 and C++17 allow us to avoid using pointers for some of these use cases, eliminating the possibility of some common classes of errors, such as dereferencing invalid pointers or accessing unallocated memory.
The options of passing by value or passing by reference are applicable to every single parameter the function expects, independently.
This means that a function can take some arguments by value and some by reference.
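For instance, a single function might combine all three choices. The function append_repeated below is a hypothetical example, not part of the chapter's exercises.

```cpp
#include <string>

// label:  const reference (read only, avoids copying the string)
// count:  value (small and cheap to copy)
// output: non-const reference (the function modifies the caller's object)
void append_repeated(const std::string& label, int count, std::string& output) {
    for (int i = 0; i < count; ++i) {
        output += label;
    }
}
```

Given a string out containing "x", calling append_repeated("ab", 2, out) leaves out containing "xabab".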
Up until now, we have seen how to provide values to a function. In this section, we will see how a function can provide a value back to the caller.
We said earlier that the first part of a function declaration is the type returned by the function: this is often referred to as the function's return type.
All the previous examples used void to signal that they were returning nothing. Now, it is time to look at an example of a function returning a value:
int sum(int, int);
The previous function accepts two integers by value as parameters and returns an integer.
The invocation of the function in the caller code is an expression evaluating to an integer. This means that we can use it anywhere that an expression is allowed:
int a = sum(1, 2);
A function can return a value by using the return keyword, followed by the value it wants to return.
The function can use the return keyword several times inside its body, and each time the execution reaches the return keyword, the program will stop executing the function and go back to the caller, with the value returned by the function, if any. Let's look at the following code:
void rideRollercoasterWithChecks(int heightInCm) {
if (heightInCm < 100) {
std::cout << "Too short";
return;
}
if (heightInCm > 210) {
std::cout << "Too tall";
return;
}
rideRollercoaster();
// implicit return at the end of the function
}
A function also returns to the caller if it reaches the end of its body.
This is what we did in the earlier examples since we did not use the return keyword.
Not explicitly returning is fine if a function has a void return type. However, if the function is declared to return a value, reaching the end of its body without a return statement is undefined behavior: the caller receives a meaningless value and the program is not correct.
It is surprising, but every major compiler accepts functions that declare a return type other than void but do not return a value on every path.
This is easy to spot in simple functions, but it is much harder in complex ones with lots of branches.
Major compilers provide options to warn you if a function can return without providing a value (for example, -Wreturn-type in GCC and Clang).
Be sure to enable the warning, as it will save you a lot of debugging time.
Let's look at an example of a function returning an integer:
int sum(int a, int b) {
return a + b;
}
As we said earlier, a function can use the return statement several times inside its body, as shown in the following example:
int max(int a, int b) {
if(a > b) {
return a;
} else {
return b;
}
}
We always return a value, regardless of the values of the arguments.
It is a good practice to return as early as possible in an algorithm.
The reason for this is that as you follow the logic of the code, especially when there are many conditionals, a return statement tells you when that execution path is finished, allowing you to ignore what happens in the remaining part of the function.
If you only return at the end of the function, you always have to look at the full code of the function.
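The sketch below applies early returns to a check similar to the rollercoaster example. The function name classify_height and the returned strings are assumptions made for illustration.

```cpp
#include <string>

// Each special case returns immediately, so the reader can stop
// following that path as soon as the return statement is reached.
std::string classify_height(int heightInCm) {
    if (heightInCm < 100) {
        return "Too short";
    }
    if (heightInCm > 210) {
        return "Too tall";
    }
    // Only the common case is left at this point.
    return "OK";
}
```

Every path through the function ends in a return statement, so the function always provides a value.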
Since a function can be declared to return any type, we have to decide whether to return a value or a reference.
A function whose return type is a value type is said to return by value.
When a function that returns by value reaches a return statement, the program creates a new object, which is initialized from the value of the expression in the return statement.
In the previous function, sum, when the code reaches the stage of returning a + b, a new integer is created, with the value equal to the sum of a and b, and is returned.
On the side of the caller, int a = sum(1,2);, a new temporary automatic object is created and is initialized from the value returned by the function (the integer that was created from the sum of a and b).
This object is called temporary because its lifetime is valid only while the full-expression in which it is created is executed. We will see in the Returning by Reference section, what this means and why this is important.
The calling code can then use the returned temporary value in another expression or assign it to a variable.
At the end of the full expression, since the lifetime of the temporary object is over, it is destroyed.
In this explanation, we mentioned that objects are initialized several times while returning a value. This is not a performance concern as C++ allows compilers to optimize all these initializations, and often initialization happens only once.
It is preferable to return by value as it's often easier to understand, easier to use, and as fast as returning by reference.
How can returning by value be so fast? C++11 introduced move semantics, which allow moving instead of copying the returned object when its type supports the move operation. We'll see how in Lesson 3, Classes. Even before C++11, all mainstream compilers implemented return value optimization (RVO) and named return value optimization (NRVO), where the return value of a function is constructed directly in the variable into which it would otherwise have been copied. These optimizations are forms of copy elision; since C++17, the elision is mandatory when a function returns a temporary (the RVO case), while NRVO remains optional.
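As an illustration, the following sketch returns a container by value. The function name first_squares is an assumption made for the example.

```cpp
#include <vector>

// Builds a vector locally and returns it by value.
// The compiler may construct result directly in the caller's variable
// (NRVO); otherwise the vector is moved, which does not copy the elements.
std::vector<int> first_squares(int count) {
    std::vector<int> result;
    for (int i = 1; i <= count; ++i) {
        result.push_back(i * i);
    }
    return result;
}
```

The caller simply writes std::vector<int> squares = first_squares(3); and does not need to worry about the lifetime of any object inside the function.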
A function whose return type is a reference is said to return by reference.
When a function returning a reference reaches a return statement, a new reference is initialized from the expression that's used in the return statement.
In the caller, the function call expression is substituted by the returned reference.
However, in this situation, we need to also be aware of the lifetime of the object the reference is referring to. Let's look at an example:
const int& max(const int& a, const int& b) {
if (a > b) {
return a;
} else {
return b;
}
}
First, we need to note that this function already has a caveat. The earlier max function returned by value, so it made no difference whether we returned a or b when they were equal.
In this function, instead, when a == b we return b, and the code calling this function needs to be aware of this distinction. If the function returned a non-const reference, the caller might modify the object referred to by the returned reference, and whether a or b is returned would make a difference.
We are already seeing how references can make our code harder to understand.
Let's look at how the function is used:
int main() {
const int& a = max(1,2);
std::cout << a;
}
This program has an error! The reason is that 1 and 2 are temporary values, and as we explained before, a temporary value is alive until the end of the full expression containing it.
To better understand what is meant by "the end of the full expression containing it", let's look at the code we have in the preceding code block: const int& a = max(1,2);. There are four expressions in this piece of code:
All of this happens in the declaration statement of variable a.
The third point covers the function call expression, while the full expression containing it is covered in the following point.
This means that the lifetimes of 1 and 2 end at the end of the initialization of a. But we got a reference to one of them! And we are using it!
Accessing an object whose lifetime is terminated is prohibited by C++, and this will result in an invalid program.
In a more complex example, such as int a = max(1,2) + max(3,4);, the temporary objects referred to by the references that the max calls return remain valid until the end of the initialization, but no longer.
Here, we read the two references to compute their sum, and then we store the result as a value. If we tried to bind the result to a reference instead, as in int& a = max(1,2) + max(3,4);, the program would be wrong: the sum is itself a temporary, and a non-const reference cannot be bound to a temporary.
This sounds confusing, but it is important to understand as it can be a source of hard-to-debug problems if we use a temporary object after the full expression in which it's created has finished executing.
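One way to sidestep the problem is to copy the result into a value before the full expression ends, or to pass objects that outlive the call. The following is a minimal sketch of both options; maxOf, largestOfLiterals, and largestOfVariables are hypothetical names used only for illustration.

```cpp
// Hypothetical helper mirroring the max function above, renamed to avoid
// any clash with std::max.
const int& maxOf(const int& a, const int& b) {
    if (a > b) {
        return a;
    } else {
        return b;
    }
}

// The returned reference is read and copied into an int while the
// temporaries 1 and 2 are still alive, so no dangling reference remains.
int largestOfLiterals() {
    int result = maxOf(1, 2);
    return result;
}

// Named variables outlive the call, so holding a reference to one of them
// is fine for as long as they are in scope.
int largestOfVariables() {
    int x = 1;
    int y = 2;
    const int& largest = maxOf(x, y);
    return largest;
}
```

In both functions the value is copied out before the referenced object's lifetime ends.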
Let's look at another common mistake in functions returning references:
int& sum(int a, int b) {
int c = a + b;
return c;
}
We created a local, automatic object in the function body and then we returned a reference to it.
In the previous section, we saw that the lifetime of a local object ends at the end of the function. This means that we are returning a reference to an object whose lifetime has always ended by the time the caller can use it.
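A simple fix, sketched here under the assumption that the caller only needs the result and not the local object itself, is to return by value:

```cpp
// Returning by value copies the result out of the function, so the caller
// never refers to the local variable c after it has been destroyed.
int sum(int a, int b) {
    int c = a + b;
    return c;
}
```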
Earlier, we mentioned the similarities between passing arguments by reference and passing arguments by pointers.
This similarity persists when returning pointers: the object pointed to by a pointer needs to be alive when the pointer is later dereferenced.
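As an illustration of the same rule with pointers, here is a small sketch. The function findFirstNegative is a hypothetical example, not something defined earlier in the chapter.

```cpp
#include <array>
#include <cstddef>

// Hypothetical example: returns a pointer to the first negative element,
// or nullptr when there is none. The pointer is valid only while the
// array passed by the caller is alive.
int* findFirstNegative(std::array<int, 3>& values) {
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < 0) {
            return &values[i];
        }
    }
    return nullptr;
}
```

Unlike a reference, the returned pointer can be null, so the caller should check it before dereferencing.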
So far, we have covered examples of mistakes while returning by reference. How can references be used correctly as return types to functions?
The important part of using references correctly as return values is to make sure that the object outlives the reference: the object must stay alive at least as long as there is a reference to it.
A common example is accessing a part of an object, for example, using an std::array, which is a safe option compared to the built-in array:
int& getMaxIndex(std::array<int, 3>& array, int index1, int index2) {
    /* This function requires that index1 and index2 must be smaller than 3! */
    int maxIndex = max(index1, index2);
    return array[maxIndex];
}
The calling code would look as follows:
int main() {
std::array<int, 3> array = {1,2,3};
int& elem = getMaxIndex(array, 0, 2);
elem = 0;
std::cout << array[2];
// Prints 0
}
In this example, we are returning a reference to an element inside an array, and the array remains alive longer than the reference.
The following are guidelines for using return by reference correctly:
When returning a reference that was received as a parameter, the argument passed to the function must live longer than the returned reference.
Apply the previous rule, even when you are returning a reference to a part of the object (for example, an element of an array).
In this activity, we are going to see the different trade-offs that can be made when writing a function, depending on the parameters the function accepts:
Take the arrays by reference and return by reference because we are saying that the calling function is supposed to modify the element. Take the index by value since there is no reason to use references.
If the values are the same, the element from the first array is returned.
The solution to this activity can be found on page 285.
In the previous sections, we saw how and when to use references in function parameters and return types. C++ has an additional qualifier, the const qualifier, which can be used independently from the ref-ness (whether the type is a reference or not) of the type.
Let's see how const is used in the various scenarios we investigated when looking at how functions can accept parameters.
In pass by value, the function parameter is a value type: when invoked, the argument is copied into the parameter.
This means that regardless of whether const is used in the parameter or not, the calling code cannot see the difference.
The only reason to use const in the function signature is to document, for whoever writes the implementation, that the function must not modify the value.
This is not commonly done, as the biggest value of a function signature is for the caller to understand the contract of calling the function. Because of this, it is rare to see int max(const int, const int), even if the function does not modify the parameters.
There is an exception, though: when the function accepts a pointer.
In such cases, the function wants to make sure that it is not assigning a new value to the pointer. The pointer acts similarly to a reference here, since it cannot be bound to a new object, but it also provides nullability.
An example could be setValue(int * const), a function that takes a const pointer to an int.
The integer is not const, so it can be changed, but the pointer is const, so the implementation cannot make it point to a different object.
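A possible implementation of such a function is sketched below. The second parameter, value, is an assumption added for illustration; the text only specifies the pointer parameter.

```cpp
// Hypothetical implementation of setValue: the pointer itself is const,
// but the int it points to is not.
void setValue(int* const target, int value) {
    if (target != nullptr) {
        *target = value;    // allowed: the pointed-to int is modifiable
    }
    // target = nullptr;    // would not compile: the pointer is const
}
```

The null check shows the nullability mentioned above: a caller may pass nullptr, and the function then does nothing.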
Const is extremely important in pass by reference, and any time you use a reference in the parameters of a function, you should also add const to it (if the function is not designed to modify it).
The reason for this is that a reference allows you to modify the provided object freely.
This is error-prone, since the function might, by mistake, modify an object that the caller does not expect to be modified. It also makes the code harder to understand, because there is no clear boundary between the caller and the function: the function can change the state of the caller.
const instead fixes the problem, since a function cannot modify an object through a const reference.
This allows the function to use reference parameters without some of the drawbacks of using references.
A function should take a non-const reference only if it is intended to modify the provided object; otherwise, every reference parameter should be const.
Another advantage of const reference parameters is that temporary objects can be used as arguments for them.
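To illustrate, here is a minimal sketch with a hypothetical function, countSpaces, that only reads its argument and therefore takes it by const reference.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function: it only reads the string, so the parameter is a
// const reference. This lets callers pass temporaries as well as variables.
std::size_t countSpaces(const std::string& text) {
    std::size_t spaces = 0;
    for (char c : text) {
        if (c == ' ') {
            ++spaces;
        }
    }
    return spaces;
}
```

Both countSpaces(std::string("a b c")) and countSpaces("a b c") compile, because the temporary std::string can bind to the const reference. If the parameter were declared as std::string&, neither call would compile.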
There is rarely a reason to return by const value. The calling code usually either assigns the returned value to a variable, in which case the const-ness of that variable is the deciding factor, or passes it on to another expression, and it is rare for an expression to expect a const value.
Returning by const value also inhibits the move semantics introduced in C++11, thus reducing performance.
A function should always return by const reference when the returned reference is meant to only be read and not be modified by the calling code.
The same concept we applied to object lifetimes when returning references to them also applies to const:
A parameter accepted as a reference should be returned as a const reference if the caller is not expected to modify it.
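The following sketch shows a function that follows this guideline. The name lastElement is hypothetical and chosen only for this example.

```cpp
#include <array>

// Hypothetical accessor: the array is only read, so both the parameter and
// the returned reference are const. The caller must keep the array alive
// for as long as it uses the returned reference.
const int& lastElement(const std::array<int, 3>& values) {
    return values[2];
}
```

With this signature, an attempt such as lastElement(values) = 5; is rejected by the compiler, which is exactly the protection we want when the caller is meant to read only.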
Sometimes, the compilation fails, stating that the code is trying to modify an object that is a const reference. Unless the function is meant to modify the object, the solution is not to remove const from the reference in the parameter. Instead, look for why the operation that you are trying to perform does not work with const, and what the possible alternatives are.
const is not about the implementation, it is about the meaning of the function.
When you write the function signature, you should decide whether to use const, as the implementation will have to find a way to respect that.
For example:
void setTheThirdItem(std::array<int, 10>& array, int item)
This should clearly take a reference to the array since its purpose is to modify the array.
On the other hand, we can use the following:
int findFirstGreaterThan(const std::array<int, 10>& array, int threshold)
This tells us that we are only looking into the array – we are not changing it, so we should use const.
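As a sketch, the two functions could be implemented as follows. The text gives only the signatures, so the bodies are assumptions: in particular, we assume that findFirstGreaterThan returns the index of the matching element, or -1 when there is none.

```cpp
#include <array>
#include <cstddef>

// Modifies the array, so it takes a non-const reference.
void setTheThirdItem(std::array<int, 10>& array, int item) {
    array[2] = item;
}

// Only reads the array, so it takes a const reference.
// Assumption: returns the index of the first element greater than
// threshold, or -1 when no element qualifies.
int findFirstGreaterThan(const std::array<int, 10>& array, int threshold) {
    for (std::size_t i = 0; i < array.size(); ++i) {
        if (array[i] > threshold) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

If the body of findFirstGreaterThan tried to assign to an element of the array, the compiler would report an error, because the parameter is a const reference.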
It is a best practice to use const as much as possible, as it allows the compiler to make sure that we are not modifying objects that we do not want to modify.
This can help to prevent bugs.
It also helps to keep another best practice in mind: never use the same variable to represent different concepts. Since the variable cannot be changed, it is less natural to reuse it instead of creating a new one.
Another feature C++ provides to make life easier for the caller when it comes to calling functions is default arguments.
Default arguments are added to a function declaration. The syntax is to add an = sign and supply the value of the default argument after the identifier of the parameter of the function. An example of this would be:
int multiply(int multiplied, int multiplier = 1);
The caller of the function can call multiply either with 1 or 2 arguments:
multiply(10); // Returns 10
multiply(10, 2); // Returns 20
When an argument with a default value is omitted, the function uses the default value instead. This is extremely convenient if there are functions with sensible defaults that callers mostly do not want to modify, except in specific cases.
Imagine a function that returns the first word of a string:
char const * firstWord(char const * string, char separator = ' ');
Most of the time, words are separated by a whitespace character, but the caller can decide to use a different separator. Offering the possibility to provide a separator does not force most callers, who simply want to split on a space, to specify it.
It is a best practice to set the default arguments in the function signature declaration, and not declare them in the definition.
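The following sketch shows this split between declaration and definition. The firstWord variant here is an assumption made to keep the example short: it takes and returns std::string instead of the char const * used in the text.

```cpp
#include <string>

// Declarations (typically placed in a header): defaults are specified here.
int multiply(int multiplied, int multiplier = 1);
std::string firstWord(const std::string& text, char separator = ' ');

// Definitions (typically placed in a .cpp file): defaults are not repeated.
int multiply(int multiplied, int multiplier) {
    return multiplied * multiplier;
}

// Assumption: std::string-based variant of the firstWord function.
std::string firstWord(const std::string& text, char separator) {
    // find returns std::string::npos when the separator is absent,
    // in which case substr returns the whole string.
    return text.substr(0, text.find(separator));
}
```

A caller that includes only the declarations can write firstWord("hello world") and get "hello", or firstWord("a,b", ',') to split on a comma instead.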
One of the goals of functions is to better organize our code. To do so, it is important to give meaningful names to them.
For example, in package management software, there might be a function called sort for sorting packages. As you can see, the name is the same as the function that would sort a list of numbers.
C++ has a feature that allows you to avoid these kinds of problems and groups names together: namespaces.
A namespace starts a scope in which all the names declared inside are part of the namespace.
To create a namespace, we use the namespace keyword, followed by the identifier and then the code block:
namespace example_namespace {
// code goes here
}
To access an identifier inside a namespace, we prepend the name of the namespace to the name of the function.
Namespaces can be nested as well. Simply use the same declaration as before inside the namespace:
namespace parent {
namespace child {
// code goes here
}
}
To access an identifier inside a namespace, you prepend the name of the identifier with the name of the namespace in which it is declared, followed by ::.
You might have noticed that we have been writing std::cout. This is because the C++ standard library defines the std namespace, and we were accessing the object named cout inside it.
To access an identifier inside multiple namespaces, you can prepend the list of all the namespaces separated by :: – parent::child::some_identifier. We can access names in the global scope by prepending :: to the name—::name_in_global_scope.
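A short sketch of these lookup rules follows. All the names in it (version, globalVersion, and the contents of the namespaces) are hypothetical and used only for illustration.

```cpp
// A function in the global scope.
int version() { return 0; }

namespace parent {
    int version() { return 1; }

    namespace child {
        int version() { return 2; }

        int globalVersion() {
            // An unqualified version() here would find child::version;
            // the leading :: selects the one in the global scope.
            return ::version();
        }
    }
}
```

From outside the namespaces, parent::version() returns 1, parent::child::version() returns 2, and ::version() returns 0.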
If we were to only use cout, the compiler would have told us that the name does not exist in the current scope.
This is because the compiler searches only in the current namespace and the parent namespaces to find an identifier by default, so unless we specify the std namespace, the compiler will not search in it.
C++ helps make this more ergonomic with the help of the using declaration.
The using declaration consists of the using keyword, followed by an identifier qualified with its namespaces.
For example, using std::cout; is a using declaration that declares that we want to use cout. When we want to use all the declarations from a namespace, we can write using namespace namespace_name; (strictly speaking, this form is called a using directive). For example, if we want to use every name defined in the std namespace, we would write: using namespace std;.
When a name is declared inside the using declaration, the compiler also looks for that name when looking for an identifier.
This means that, in our code, we can use cout and the compiler will find std::cout.
A using declaration is valid as long as we are in the scope in which it is declared.
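The scoping rule can be seen in the following sketch, which uses a hypothetical packages namespace rather than std so that the behavior is easy to check.

```cpp
// Hypothetical namespace used only for this example.
namespace packages {
    int installedCount() { return 3; }
    int pendingCount() { return 1; }
}

int totalPackages() {
    // The using declaration is visible only inside this function body.
    using packages::installedCount;
    return installedCount() + packages::pendingCount();
}

int pendingPackages() {
    // The using declaration from totalPackages does not apply here,
    // so the name is qualified explicitly.
    return packages::pendingCount();
}
```

Placing a using declaration inside a function body, as here, keeps its effect local instead of affecting the whole file.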
To better organize your code and avoid naming conflicts, you should always put your code inside a namespace that's specific to either your application or library.
Namespaces can also be used to specify that some code is used only by the current code.
Let's imagine you have a file called a.cpp that contains int default_name = 0; and another file called b.cpp with int default_name = 1;. When you compile the two files and link them together, you get an invalid program: the same variable has been defined twice, with two different values, and this violates the One Definition Rule (ODR).
But you never meant for those to be the same variable. To you, they were some variables that you just wanted to use inside your .cpp file.
To tell that to the compiler, you can use anonymous namespaces: a namespace with no identifier.
All the identifiers created inside it will be private to the current translation unit (normally the .cpp file).
How can you access an identifier inside an anonymous namespace? You can access the identifier directly, without the need to use the namespace name, which does not exist, or the using declaration.
You should only use anonymous namespaces in .cpp files.
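As a sketch, the a.cpp file from the example above could be written as follows. The function readDefaultName is a hypothetical addition that shows how the identifier is accessed.

```cpp
// Contents of a hypothetical a.cpp
namespace {
    // Private to this translation unit: b.cpp can define its own
    // default_name in the same way without violating the ODR.
    int default_name = 0;
}

int readDefaultName() {
    // No namespace prefix and no using declaration are needed.
    return default_name;
}
```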
Write a function to read the name of a car for a lottery in a namespace based on numerical input. If the user inputs 1, they win a Lamborghini, and if the user inputs 2, they win a Porsche:
The solution for this activity can be found on page 285.
We saw how C++ allows us to write functions that take parameters either by value or by reference, use const, and organize them in namespaces.
There is an additional powerful feature of C++ that allows us to give the same name to functions that perform the same conceptual operation on different types: function overloading.
Function overloading is the ability to declare several functions with the same name, as long as the set of parameters they accept is different.
An example of this is the multiply function. We can imagine this function being defined for integers and floats, or even for vectors and matrices.
If the concept represented by the function is the same, we can provide several functions that accept different kinds of parameters.
When a function is invoked, the compiler looks at all the functions with that name, called the overload set, and picks the function that is the best match for the arguments provided.
The precise rule on how the function is selected is complex, but the behavior is often intuitive: the compiler looks for the better match between the arguments and the expected parameters of the function. If we have two functions, int increment(int) and float increment(float), and we call them with increment(1), the integer overload is selected because an integer is a better match to an integer than a float, even if an integer can be converted into a float. An example of this would be:
bool isSafeHeightForRollercoaster(int heightInCm) {
return heightInCm > 100 && heightInCm < 210;
}
bool isSafeHeightForRollercoaster(float heightInM) {
return heightInM > 1.0f && heightInM < 2.1f;
}
// Calls the int overload
isSafeHeightForRollercoaster(187);
// Calls the float overload
isSafeHeightForRollercoaster(1.67f);
Thanks to this feature, the calling code does not need to worry about which overload of the function the compiler is going to select, and the code can be more expressive thanks to using the same function to express the same meaning.
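The multiply function mentioned earlier could be overloaded along the following lines. This is only a sketch: the exact set of overloads, and the choice of scaling a 3-element vector by a factor, are assumptions made for illustration.

```cpp
#include <array>

// Hypothetical overload set: the same concept for different types.
int multiply(int a, int b) {
    return a * b;
}

float multiply(float a, float b) {
    return a * b;
}

// Scales each component of a 3-element vector by a factor.
std::array<float, 3> multiply(const std::array<float, 3>& vector, float factor) {
    return {{vector[0] * factor, vector[1] * factor, vector[2] * factor}};
}
```

A call such as multiply(3, 4) selects the int overload and multiply(1.5f, 2.0f) selects the float overload. A mixed call such as multiply(3, 2.0f) is ambiguous with this set, because neither of the first two overloads is a better match than the other.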
Johnny wants to implement a math library for the video game he is making. It will be a 3D game, so he will need to operate on points representing the three coordinates: x, y, and z.
The points are represented as std::array<float, 3>. A library will be used throughout the game, so Johnny needs to be sure it can work when included multiple times (by creating a header file and declaring the functions there).
The library needs to support the following steps:
For example (with a list of 5 points): for the array {{0,0,0}, {1,0,0}, {1,1,0}, {0,1,0}, {0,1,1}}, the total distance is 5, because going from {0,0,0} to {1,0,0} is a distance of 1, then going from {1,0,0} to {1,1,0} is a distance of 1 again, and so on for the remaining 3 points.
The solution for this activity can be found on page 286.
Make sure that the functions are well-organized by grouping them together.
Remember that the distance between two points is computed as the square root of (x2-x1)^2 + (y2-y1)^2 + (z2-z1)^2.
C++ offers the std::pow function for the power function, which takes the base and the exponent, and the std::sqrt function, which takes the number whose square root is to be computed. Both are in the cmath header.
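The formula and the two functions can be combined as in the following sketch. The name distanceBetween and its signature are assumptions made for illustration; this is not the solution from the book.

```cpp
#include <array>
#include <cmath>

// Hypothetical helper computing the distance between two 3D points.
float distanceBetween(const std::array<float, 3>& from,
                      const std::array<float, 3>& to) {
    const float dx = to[0] - from[0];
    const float dy = to[1] - from[1];
    const float dz = to[2] - from[2];
    // std::pow(base, exponent) squares each difference;
    // std::sqrt takes the square root of their sum.
    return std::sqrt(std::pow(dx, 2.0f) + std::pow(dy, 2.0f) + std::pow(dz, 2.0f));
}
```

For the points {0,0,0} and {1,0,0} from the example above, this computes the square root of 1 + 0 + 0, which matches the distance of 1 stated there.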
In this chapter, we saw the powerful features C++ offers to implement functions.
We started by discussing why functions are useful and what they can be used for, and then we dove into how to declare and define them.
We analyzed different ways of accepting parameters and returning values, how to make use of local variables, and then explored how to improve the safety and convenience of calling them with const and default arguments.
Finally, we saw how functions can be organized in namespaces and the ability to give the same name to different functions that implement the same concept, making the calling code not have to think about which version to call.
In the next chapter, we will look at how to create classes and how they are used in C++ to make building complex programs easy and safe.