Programming Problems and Debugging 107
Symbol    Address  Value
chptr2    114      100
chptr1    113      100
str2[6]   112      '\0'
str2[5]   111      d
str2[4]   110      n
str2[3]   109      o
str2[2]   108      c
str2[1]   107      e
str2[0]   106      S
str1[5]   105      '\0'
str1[4]   104      t
str1[3]   103      s
str1[2]   102      r
str1[1]   101      i
str1[0]   100      F
Lines 10 and 11 make chptr1 and chptr2 store the address of str1[0]. By putting
const in front of char * at line 10, we say that we do not want to change the value at the
memory location pointed to by chptr1. Line 13 is not allowed because it attempts to change
the value at address 100, i.e., & str1[0], through chptr1. Please note the word through. It
is still possible to change str1[0] as long as the change is not through chptr1. Line 10 does
not prevent us from changing chptr1 itself because chptr1 is not a constant. Hence, we
can change the value of chptr1 at line 15.
Symbol    Address  Value
chptr1    113      100, changed to 106 by line 15
In contrast, line 11 says chptr2 is a constant and so it cannot be changed at line 16.
However, it is possible to change the value of str1[0] through chptr2 at line 14.
Symbol    Address  Value
chptr2    114      100 (cannot be changed)
chptr1    113      100, changed to 106 by line 15
str1[0]   100      F, changed to C by line 14
In my_countchar, str itself is not a constant (similar to chptr1) so changing str itself
at line 23 is allowed. In fact, const can be used twice for the same pointer. In this
example, chptr3 stores the address of str1[0]. Changing the value of chptr3 (line 17) and
changing the value at the address (line 18) are both disallowed.
 1  // const2.c
 2  #include <string.h>  // declares strcpy; nothing from stdio.h is used here
 3  #include <stdlib.h>
 4  int main(int argc, char * argv[])
 5  {
 6    char str1[20];
 7    char str2[20];
 8    strcpy(str1, "First");
 9    strcpy(str2, "Second");
10    const char * chptr1 = & str1[0];
11    char * const chptr2 = & str1[0];
12    const char * const chptr3 = & str1[0];
13    // * chptr1 = 'C'; // not allowed
14    * chptr2 = 'C'; // OK
15    chptr1 = & str2[0]; // OK
16    // chptr2 = & str2[0]; // not allowed
17    // chptr3 = & str2[0]; // not allowed
18    // * chptr3 = 'C'; // not allowed
19    return EXIT_SUCCESS;
20  }
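The same rule applies to function parameters. The following is a minimal sketch, not the book's my_countchar: count_char is a hypothetical helper written only to show that a const char * parameter may itself be advanced, while writing through it is rejected by the compiler.

```c
// Hypothetical helper for illustration (not the my_countchar from mystring.c).
// The const promises that the characters are not modified through str.
// str itself is an ordinary local variable, so it may be changed.
int count_char(const char * str, char ch)
{
  int count = 0;
  while (* str != '\0')
    {
      if (* str == ch)
        {
          count ++;
        }
      str ++; // allowed: changes the pointer, not the characters
      // * str = 'x'; // not allowed: writes through a pointer to const
    }
  return count;
}
```

For example, count_char("Second", 'e') returns 1. Removing the comment marker in front of * str = 'x' should make the compiler reject the function.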
7.2 Debugging
To write good programs, we must abandon the habit of “coding-testing-debugging”.
Testing does not magically reveal what is wrong with a program and how to fix it. Instead,
we must have a plan before writing code. This includes having a testing and debugging
strategy. After writing each line of code, read it carefully. This saves time: reading code
helps you find simple mistakes, and it also helps reveal more subtle problems.
A common problem among learners is forgetting to initialize variables. C does not
initialize local variables for you; that is your responsibility. An uninitialized variable
can lead to apparently inexplicable bugs and, worse, code that seems to work but then
mysteriously fails. It is hard to find these types of bugs by “testing” and “debugging”
because the program behavior can be unexpected, or can sometimes appear correct. In this
case, there is no substitute for reading the code and asking yourself whether every
variable has been initialized.
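As a small illustration of the habit described above, consider a function that adds up the elements of an array. The name sum_array and its parameters are assumptions made for this sketch; they do not come from mystring.c.

```c
// Hypothetical helper for illustration: adds up the first n elements of arr.
int sum_array(const int * arr, int n)
{
  int total = 0; // without "= 0", total would start with an unpredictable value
  int i;
  for (i = 0; i < n; i ++)
    {
      total += arr[i];
    }
  return total;
}
```

If the initialization of total is removed, the function may still appear to return the right answer on some runs, which is exactly why this kind of mistake is easier to find by reading than by testing.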
Another common mistake is putting ; in the wrong place. In the code listing for
mystring.c, if a ; is placed at the end of line 7, 17, or 30 then the program is incorrect
because the block of code between { and } is no longer related to the while or if
conditions. Putting ; at the end of line 7, 17, or 30 makes the program enter an infinite
loop. This problem can be difficult to find by testing alone because the program does not
stop and the output will probably be incomplete. Putting ; at the end of line 19 increments
the count regardless of whether * str and ch match. These problems are easy to find by
reading the code carefully, line by line. Unfortunately, gcc cannot offer much help because
the program is syntactically correct.
7.2.1 Find Infinite Loops
Even if you are very careful, it is difficult to avoid all mistakes. Thus it is useful to know
how to use a debugger such as gdb. It allows us to execute code line by line, and inspect
the values of the variables. It is important to appreciate that gdb is not a substitute for
reading your code, and reasoning about it logically. Nonetheless, gdb augments our ability
to diagnose and fix problems in code, for example, in finding infinite loops.
Infinite loops are typically indicated by a program that should stop quickly but does not.
Running the program again and again, of course, will not be helpful. Inserting “debugging
messages” into the code may be more helpful, but it also takes a lot of time. To place the
messages well, we need to know roughly where the problem is, which is possible but
difficult. If we already knew where the problem was, we would fix it without inserting
debugging messages. After finding and fixing the problem, the debugging messages usually
have to be removed. There is a better way: the gdb debugger can make this easy.
Assume that we are running the code for mystring.c and a ; has accidentally been
inserted at the end of line 7. The program enters an infinite loop and does not stop.
$ ./mystring strlen input output strlen
What is the fastest way to diagnose this problem? Start gdb in a Linux Terminal.
$ gdb mystring
Please remember that gdb takes the executable file as the input, and not a .c file. Inside
gdb, type
(gdb) r strlen input output strlen
The first letter r means “run” the program. It replaces ./mystring in the command
line. Add the normal command-line arguments after r. The program now starts and enters
the aforementioned infinite loop. Press Ctrl-c to interrupt the normal execution of the
program and gdb will display something like:
Program received signal SIGINT, Interrupt.
0x0000000000400884 in my_strlen (str=0x603590 "If we consider that
part of the theory of relativity which may ") at mystring.c:7
This means that the program has stopped at line 7 of the file mystring.c. The message
tells us where the infinite loop is. At the (gdb) prompt, type list to show the code:
4  int my_strlen(const char * str)
5  {
6    int len = 0;
7    while (str[len] != '\0');
8      {
9        len ++;
As explained earlier, infinite loops may occur in many places (lines 7, 17, and 30). Using
gdb to identify an infinite loop is easy and takes only a few seconds. Moreover, we do not
need to modify any line before using gdb.
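Once gdb has pointed at line 7, the fix is to remove the stray ; so that the block belongs to the while loop again. The rest of my_strlen is not shown in this section, so the following is only a sketch of how the corrected function might look, assuming that it simply returns len.

```c
// Sketch of the corrected function. Only the first lines appear in the
// gdb listing; the closing brace and the return statement are assumptions.
int my_strlen(const char * str)
{
  int len = 0;
  while (str[len] != '\0') // no ; here, so the block below is the loop body
    {
      len ++;
    }
  return len;
}
```

With the ; in place, the loop body is empty and len never changes, so the condition stays true forever for any non-empty string.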
7.2.2 Find Invalid Memory Accesses
An “invalid memory access”, or “segmentation fault”, is a common error that stops a
program. It occurs when a program attempts to access memory outside the allowed regions.
A program that has invalid memory accesses may create security vulnerabilities. We can
introduce a memory error into line 17 of mystring.c. That line should be:
17  while (* str != '\0')
Suppose we accidentally write it as:
17  while (* str != '0') // without \
What is the difference between '\0' and '0'? The former is the null terminator, a special
invisible character indicating the end of a string. The latter is a normal character like
'a', but happens to be the character zero: '0'. Do not mix them. When running the program
$ ./mystring countchar input output countchar
we get
Segmentation fault (core dumped)
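The difference between '\0' and '0' can be observed safely with a bounded scan. The function scan_until below is a hypothetical helper written for this illustration; unlike line 17 of mystring.c, it takes a limit so that the scan never leaves the array even when the stop character is missing.

```c
// Hypothetical helper for illustration: counts the characters before the
// first occurrence of stop, examining at most limit characters so that the
// scan always stays inside the array.
int scan_until(const char * str, char stop, int limit)
{
  int len = 0;
  while ((len < limit) && (str[len] != stop))
    {
      len ++;
    }
  return len;
}
```

For the array char text[] = "If we", which occupies 6 bytes including the terminator, scan_until(text, '\0', 6) returns 5 because the scan stops at the end of the string. In contrast, scan_until(text, '0', 6) returns 6: the character '0' never appears, so only the limit stops the scan. The loop with the mistake has no such limit and keeps reading beyond the array.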
110 Intermediate C Programming
Segmentation fault means the program tries to access (read or write) memory at an
invalid address. The operating system reacts by stopping the program. It is similar to
attempting to enter someone else’s house. If you do not own the house, entering the house is
illegal. We can determine where segmentation fault occurs by using either gdb or valgrind.
If we run the program using gdb, we will see something like:
Program received signal SIGSEGV, Segmentation fault.
0x00000000004008cd in my_countchar (str=0x624000 <Address 0x624000
out of bounds>, ch=73 'I') at mystring.c:17
Type the bt (backtrace) command at the gdb prompt, to see the call stack.
#0 0x00000000004008cd in my_countchar (str=0x624000
<Address 0x624000 out of bounds>, ch=73 'I') at mystring.c:17
#1 0x0000000000400cc9 in main (argc=4, argv=0x7fffffffe418) at main.c:74
The call stack shows two frames. The top frame is frame 0 and the next is frame 1. In
gdb, you can see a specific frame. Type
(gdb) f 1
after the (gdb) prompt to enter frame 1. Type list to show the code around line 74 in
main.c. Type print i to print the value of i. This is the index of the current line of the
input file. Its value is 0, meaning that my_countchar is processing the first line of input.
The first line of the input is "If we consider that part of the theory of relativity which
may\n". gdb can tell us that if we type the command print lines[i]:
(gdb) print lines[i]
$3 = 0x603590 "If we consider that part of the theory of relativity which may\n"
The starting address of this string is 0x603590. Something is wrong inside my_countchar
but at this stage, it is unclear precisely what is wrong. Let’s go back to the frame of
my_countchar by typing f 0:
(gdb) f 0
#0 0x00000000004008cd in my_countchar (str=0x624000
<Address 0x624000 out of bounds>, ch=73 'I') at mystring.c:17
The segmentation fault occurs at line 17 and this line reads the value at the address stored
in str. Print the value of str in gdb:
(gdb) print str
$4 = 0x624000 <Address 0x624000 out of bounds>
Compare the value of str (0x624000) and the starting address of the string in main
(0x603590). The difference is quite large but the string is not very long, only 63 characters.
This means that str kept increasing far beyond the end of the string.
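The size of the gap can be worked out from the two addresses that gdb printed. The helper bytes_past_start is a hypothetical name used only for this sketch; it performs plain subtraction on the addresses treated as unsigned numbers.

```c
// Hypothetical helper for illustration: the number of bytes between two
// addresses reported by gdb, passed in as unsigned integers.
unsigned long bytes_past_start(unsigned long current, unsigned long start)
{
  return current - start;
}
```

Calling bytes_past_start(0x624000UL, 0x603590UL) gives 0x20A70, which is 133744 in decimal. The string has only 63 characters, so str moved more than two thousand times the length of the string before the program was stopped.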
You may ask, “Why does the segmentation fault occur when str is so large? Doesn’t
the program start accessing invalid addresses after str is larger than 0x603590 + 63?”
It is correct that the program starts accessing invalid addresses after str is larger than
0x603590 + 63. However, Linux stops the program only when it reads memory that it is
not authorized to use. Since the memory is given in chunks, as explained in Section 5.3,
the segmentation fault occurs when the program accesses a memory address beyond the
currently authorized segment. A program may access all the addresses inside the segments
given to the program, and the operating system will not stop it. This does not mean that
the program is correct. We need to correct the program as soon as possible.
7.2.3 Detect Invalid Memory Accesses
Another way to detect invalid memory accesses is using valgrind. The log file from
valgrind can be quite long. You should go to the very last line. In this example, the line
says:
ERROR SUMMARY: 59745 errors from 4 contexts (suppressed: 2 from 2)
Do not be too concerned about the number of errors detected. Fixing one error will often
fix many of the other detected errors. This is because a single error can be hit many times
as a program executes. Go to the very top of the log file and start looking for anything
related to the source files, i.e., mystring.c, mystring.h, and main.c. The first detected
problem related to mystring.c is:
==4238== Conditional jump or move depends on uninitialised value(s)
==4238== at 0x4008D2: my_countchar (mystring2.c:17)
==4238== by 0x400CC8: main (main.c:74)
Another detected problem is:
==4238== Conditional jump or move depends on uninitialised value(s)
==4238== at 0x4008BE: my_countchar (mystring2.c:19)
==4238== by 0x400CC8: main (main.c:74)
This is related to the problem at line 17. If we fix line 17, the problem at line 19
disappears. Some students think that accessing invalid memory is harmless as long as the
programs do not have segmentation faults. This is wrong. Allowing invalid addresses is
one of the most common security problems in software. It can allow a malicious program to
“hijack” another program. If a program accesses invalid addresses, the program’s behavior is
not defined. That means it may work a hundred or a thousand times, and then mysteriously
fail. The same program may fail when compiled with a different compiler, or when run on a
different computer.
Please remember that testing can demonstrate that something is wrong; however, testing
cannot demonstrate that everything is right. As we see, Linux stops the program when str
is already very far away from valid addresses. Thus, we cannot rely on testing exclusively.
Instead, we need to use many methods to prevent and detect mistakes.
“Which one should I use, gdb or valgrind?”, you may ask. The answer is both. These
two tools serve different purposes. Choosing gdb vs. valgrind is like choosing a hammer vs.
a screwdriver. Use the right tool for the job: gdb is interactive, and allows a programmer
to see the program’s execution line by line. In contrast, valgrind runs the program until it
stops (or crashes). In general it is a good idea to use valgrind first to detect whether there
are any memory problems and then use gdb to pinpoint the problem. You should always use
valgrind to check whether your programs have invalid memory accesses. The command is:
$ valgrind --leak-check=full --tool=memcheck --verbose
What is the difference between gcc and valgrind? Doesn’t gcc also check whether a
program has problems? The gcc compiler checks the source code: i.e., it finds syntax errors.
This is a very rudimentary form of error checking. It is like a spell-checker in a document
editor. An article without any spelling error does not mean that the article makes any sense.
In similar fashion, gcc does not check what happens when the program runs. It is impossible
for gcc to check what the program does when it is running.
In contrast, valgrind is a run-time checker. The program must be run for valgrind to
check anything. This implies the following: If the program does not execute the parts of code