Chapter 3
Prevent, Detect, and Remove Bugs
3.1 Developing Software ≠ Coding
    3.1.1 Before coding
    3.1.2 During coding
    3.1.3 After coding
3.2 Common Mistakes
    3.2.1 Uninitialized Variables
    3.2.2 Wrong Array Indexes
    3.2.3 Wrong Types
3.3 Post-Execution and Interactive Debugging
3.4 Separate Testing Code from Production Code
Some books suggest that software should be well-designed, carefully written, and never
debugged. These books do not say anything about debugging. From my experience writing
programs, working with students, and talking to people in the software industry, debugging
is difficult to avoid completely, even when software is planned and written carefully. In some
ways, debugging is like editing an article. It is very difficult to write a good article without
any editing. Even though debugging is difficult to avoid completely, it should not be relied
upon. Experienced programmers carefully prevent bugs from happening and detect them
as early as possible.
Many people learn software development by writing small programs (tens of lines for
each program). This is good because learning should progress in stages. The problem is that
many people hold onto habits acceptable for small programs when they attempt to write
larger programs. Writing a program of 400 lines requires different strategies than writing a
program of 40 lines. This book is written for people learning how to write programs that are
between 100 and 1,000 “lines of code” (LoC). Although LoC is not a particularly good way
of measuring software complexity, it does serve as a very basic yardstick for how complex a
program might be. Finding a good way to measure software complexity is beyond the scope
of this book. Instead, this book gives some suggestions on how to write correct programs.
3.1 Developing Software ≠ Coding
From the experience of writing small programs, some students have the habit of “coding
→ testing → debugging”. Unfortunately, this is the wrong approach to developing software.
Expert programmers use strategies to prevent, detect, and remove software bugs. Coding is
not developing software. Coding means typing statements in a text editor. Coding is only
one small part of developing software.
Before typing a single line of code, you first need to know why you are developing the
software. Perhaps you are working on homework assignments for a programming class. In
34 Intermediate C Programming
this case, you should ask what the purposes of these assignments are. In particular, there should
be some learning objectives. Without knowing the purposes, it is impossible to understand
how to evaluate software. This is increasingly important as software becomes more complex.
Complex software has many parts, and you need to understand why these parts are needed
and how they affect each other. Developing software requires many steps before, during,
and after coding. The following are a few principles.
3.1.1 Before coding
- Read the specification and understand the requirements.
- Consider possible inputs and expected outputs.
- Identify valid but unexpected inputs and the correct outputs. For example, when
  writing a sorting program, what would the program do if the input is already sorted?
  What would the program do if the same number occurs multiple times? Identifying
  unexpected inputs is a key concern for developing reliable software.
- Identify invalid inputs and the ways to detect them. Consider the sorting program as
  an example again. Suppose the specification says valid inputs are positive integers. What
  would the program do if the input contains floating-point numbers? What would the
  program do if the input contains negative numbers? What would the program do if
  the input contains symbols (such as !@#&)? Even when the input is invalid, your
  program should never crash (for example, by producing a segmentation fault). Besides
  being incorrect, software that crashes is a sign of security risks.
- Think about the solution and sketch an approach on paper.
- Draw block diagrams showing how information is exchanged among different parts of
  the program.
- After you have a design, plan the implementation of the program: How many
  functions should be created? What does each function do? How many files are needed?
  When you design a function, try to keep these two suggestions in mind: Each function
  should do only one thing and should consist of no more than 40 lines of code. This
  number is not rigid: 45 lines are all right; 120 lines are too long. A general rule is that
  the entire function should fit on your computer screen using a readable font size.
- If you have a detailed design, you will save time on coding and debugging.
3.1.2 During coding
- This may surprise you: If you want to finish your programs correctly and faster, write
  more code. Write code that is not needed in the final program. Before you put the code
  for one requirement into a larger program, write a small program to test your solution.
  This is called a unit test. If you cannot make one function work, you will not be able to
  make anything work after putting many functions together. After you have done
  some parts of a program, make sure these parts work before continuing. You need
  to write additional code to test these parts, since the other parts are not ready yet. A
  rule of thumb is this: For every line of code you have to write, you should write three
  additional lines of code. This additional work helps you understand what you must
  do and helps you test what you have done.
- Always use a text editor that automatically indents programs. Such an editor can help
  you detect braces at wrong places. Indentation is important because misaligned code
  is easy to detect visually. Using the right tools can save you valuable time.
- Read your code line by line before running any test case. If you have not tried this
  method, you may be surprised at how effective it can be. Reading code can
  help you find problems that are difficult to find by testing. One example is:
    if (a > 0);
    {
        ... // always runs, not controlled by the if condition
    }
  The semicolon ; ends the if statement. As a result, the code inside { and } is not
  controlled by the if condition and always runs.
- Run some simple test cases in your head. If you do not understand what your program
  does, the computer will not be able to do what you want.
- Write code to test whether certain conditions are met before proceeding. Suppose
  sorting is part of a program: Check whether the data is sorted before the program
  does anything else.
- Avoid copying and pasting code; instead, refactor the code by creating a function
  and, thus, avoiding duplication. If you need to make slight changes to the copied
  code, use the function’s argument(s) to handle the differences. This is a
  tried-and-true principle: Similar code invites mistakes. You will soon lose track of the
  number of copies and the differences among similar code. It is difficult to maintain the
  consistency of multiple copies of the code. You will likely find that your program is
  correct in some situations and wrong in others. Finding and removing this type of bug
  can be very time-consuming. Your best strategy is to avoid it in the first place. It is
  better to write a program that is always wrong than a program that is sometimes right.
  If it is always wrong, and the problems come from only a single place, you can focus on
  that place. If the problems do not consistently appear and come from many possible
  places, it is more difficult to identify and remove the mistakes.
- Use version control. Have you ever had an experience like this: “Some parts of the
  program worked yesterday. I made some changes and nothing works now. I changed
  so many places that I don’t remember exactly what I have changed.”? Version control
  allows you to see the changes from the previous commit.
- Resolve all compiler warnings. Many studies have shown that warnings are likely to
  be serious errors, even though they are not syntax errors. Some people ignore warning
  messages, thinking that they can handle the warnings after they get their programs
  to work. However, the warning messages frequently indicate the problems preventing
  their programs from working.
3.1.3 After coding
Read your program after you think you have finished it. Check the common mistakes
described below. Do not rely on testing: Testing can tell you that the program does not
work; it cannot tell you that the program does work. It is possible that the test cases do not
cover all possible scenarios. It is usually difficult to design test cases that cover all possible
scenarios. For a complex program, covering all possible scenarios is usually impossible.
3.2 Common Mistakes
Here is a list of some common mistakes I have seen in the programs written by students
(sometimes even by myself). Many students assure me that they will never make these
mistakes. The reality is that people do make these mistakes, and more often than they
think. This section considers only coding mistakes, not design mistakes. Design mistakes
require a different book on the subject of designing software.
3.2.1 Uninitialized Variables
One common mistake is uninitialized variables. Some students think all variables are
initialized to zero automatically. This is wrong. Uninitialized variables store garbage values.
The values may be zero but there is no guarantee. This type of mistake is difficult to
discover via testing. Sometimes, the values may happen to be zero, leading you to think that
the program is correct. When the values are not zero, the programs have problems. Some
students think that initializing variables slows down a program—however, these nanoseconds
of delay are negligible. It is better to slow down your program by a few nanoseconds than
to spend hours debugging.
3.2.2 Wrong Array Indexes
For an array of n elements, the valid indexes are 0, 1, 2, ..., n − 1; n is an invalid index.
When a program has a wrong index, the program may seem to work on some occasions,
but may crash on others. You do not want to write a program whose behavior depends on
luck.
3.2.3 Wrong Types
You can ride a bicycle. You can write with a pen. You cannot ride a pen. You cannot
write with a bicycle. In a program, types specify what can be done. You need to understand
and use types correctly. The trend in programming languages is to make types more restrictive
and to prevent programmers from making accidental mistakes. Sometimes gcc treats
suspicious type problems as warnings. You should treat these warnings as serious mistakes.
3.3 Post-Execution and Interactive Debugging
To debug a program, you need a strategy. You need to divide the program into stages and
isolate the problems based on the stages. Ensure the program is correct in each stage before
integrating the stages. For example, consider a program with three stages: (i) reads some
integers from a file, (ii) sorts the integers, and (iii) saves the sorted integers to another file.
Testing each stage before integration is called unit testing. For unit tests, you often need to
write additional code as the “drivers” of a stage. For example, to test whether sorting works
without getting the data from a file, you need to write code that generates the data (maybe
using a random number generator). Debugging can be interactive or post-execution. If a
program takes hours, you may not want to debug the program interactively. Instead, you
may want the program to print debugging messages (this is called logging). The messages
help you understand what occurs during the long execution. Another situation is debugging
a program that communicates with another program that has timing requirements. For
example, you debug a program that communicates with another program through networks.
If you debug the program interactively and slow it down too much, the other program may
think the network is disconnected and stop communicating with your program. Yet another
scenario is that your program interfaces with the physical world (e.g., controlling a robot).
The physical world does not wait for your program and it cannot slow down too much.
Logging also slows down a program; thus, do not add excessive amounts of logging.
In many other cases, you can slow down your programs and debug the programs
interactively—run some parts of the programs, see the intermediate results, change the
programs, run them again, continue the process until you are convinced the programs are
correct. For interactive debugging, printing debugging messages is usually ineffective and
time-wasting. There are several problems with printing debugging messages for interactive
debugging:
- Code needs to be inserted for printing debugging messages. This can be a considerable
  amount of effort. In most cases, the debugging messages must be removed later because
  debugging messages should appear in neither the final code nor its output.
- If there are too few messages, there is insufficient information to help you determine
  what is wrong.
- If there are too many messages, some messages may be irrelevant and should be
  ignored. Getting the right amount of messages, not too few and not too many, can be
  difficult.
- Worst of all, problems are likely to occur at unexpected places where no debugging
  messages have been inserted. As a result, more and more debugging messages must
  be added. This can be time-consuming and frustrating.
Instead of using debugging messages in interactive debugging, gdb (or DDD) is a better
tool in most cases. I have shown you some gdb commands. I will describe more commands
later in this book.
3.4 Separate Testing Code from Production Code
You should write programs that can detect their own bugs. If you want to check whether
an array is sorted, do not print the elements on screen and check them by eye. Write a
function that checks whether an array is sorted. Such code usually does not print debugging
messages. Instead, it helps you debug without relying on your own eyes.
You should consider writing testing code before you write a program. This is a common
practice called test-driven development. How do you write testing code? Many books have been
written about software testing. This section gives you one suggestion. Consider the following
two examples of testing your code. Suppose func is the function you want to test and
test_func is the code for testing func.
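/* first approach: the test is inside func */
func(arguments)
{
    /* do work to get result */
    /* test to check result */
}

/* second approach: test_func calls func from outside */
test_func(arguments)
{
    /* create arguments */
    result = func(arguments);
    /* check the result */
}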
What is the difference between these two approaches? The first calls
the testing code inside a function of your program. In the second,
the testing code is outside your program and the testing code calls func. This difference
is important because the first mixes the testing code with the actual code needed for your