7. Programming Problems and Debugging (3/4)

Symbol    Address   Value
chptr2    114       100
chptr1    113       100
str2[6]   112       '\0'
str2[5]   111       d
str2[4]   110       n
str2[3]   109       o
str2[2]   108       c
str2[1]   107       e
str2[0]   106       S
str1[5]   105       '\0'
str1[4]   104       t
str1[3]   103       s
str1[2]   102       r
str1[1]   101       i
str1[0]   100       F

Lines 10 and 11 make chptr1 and chptr2 store the address of str1[0]. By putting const in front of char * at line 10, we promise not to change the value at the memory location pointed to by chptr1. Line 13 is not allowed because it attempts to change the value at address 100, i.e., & str1[0], through chptr1. Please note the word through. It is still possible to change str1[0] as long as the change is not made through chptr1. The const at line 10 does not prevent us from changing chptr1 itself because chptr1 is not a constant. Hence, we can change the value of chptr1 at line 15.

Symbol    Address   Value
chptr1    113       100 → 106 by line 15

In contrast, line 11 says chptr2 is a constant and so it cannot be changed in line 16. However, it is possible to change the value of str1[0] through chptr2 at line 14.

Symbol    Address   Value
chptr2    114       100 (cannot be changed)
chptr1    113       100 → 106 by line 15
str1[0]   100       F → C by line 14

In my_countchar, str itself is not a constant (similar to chptr1), so changing str itself at line 23 is allowed. In fact, const can be used twice for the same pointer. In this example, chptr3 stores the address of str1[0]. Both changing the value of chptr3 (line 17) and changing the value at that address through chptr3 (line 18) are disallowed.

 1  // const2.c
 2  #include <string.h>
 3  #include <stdlib.h>
 4  int main(int argc, char * argv[])
 5  {
 6    char str1[20];
 7    char str2[20];
 8    strcpy(str1, "First");
 9    strcpy(str2, "Second");
10    const char * chptr1 = & str1[0];
11    char * const chptr2 = & str1[0];
12    const char * const chptr3 = & str1[0];
13    // * chptr1 = 'C';     // not allowed
14    * chptr2 = 'C';        // OK
15    chptr1 = & str2[0];    // OK
16    // chptr2 = & str2[0]; // not allowed
17    // chptr3 = & str2[0]; // not allowed
18    // * chptr3 = 'C';     // not allowed
19    return EXIT_SUCCESS;
20  }

7.2 Debugging

To write good programs, we must abandon the habit of “coding-testing-debugging”. Testing does not magically reveal what is wrong with a program and how to fix it. Instead, we must have a plan before writing code. This includes having a testing and debugging strategy. After writing each line of code, read it carefully. This saves time. It will help you find simple mistakes, but reading code will also help reveal more subtle problems.

A common problem among learners is forgetting to initialize variables. C does not initialize local variables for you. It is your responsibility. This can lead to apparently inexplicable bugs and, worse, code that seems to work but then mysteriously fails. It is hard to find these types of bugs by “testing” and “debugging” because the program behavior can be unexpected, or sometimes appear correct. In this case, there is no substitute for reading the code and asking yourself if every variable has been initialized.

Another common mistake is putting ; in the wrong place. In the code listing for mystring.c, if a ; is placed at the end of line 7, 17, or 30, then the program is incorrect because the block of code between { and } is no longer related to the while or if conditions. Putting ; at the end of line 7, 17, or 30 makes the program enter infinite loops. This problem can be difficult to find by testing alone because the program does not stop and the output will probably be incomplete. Putting ; at the end of line 19 increments the count regardless of whether * str and ch match. These problems are easy to find by reading code line by line carefully. Unfortunately, gcc cannot offer much help because the program is syntactically correct.

7.2.1 Find Infinite Loops

Even if you are very careful, it is difficult to avoid all mistakes. Thus it is useful to know how to use a debugger such as gdb. It allows us to execute code line by line, and inspect the values in the variables. It is important to appreciate that gdb is not a substitute for reading your code and reasoning about it logically. Nonetheless, gdb augments our ability to diagnose and fix problems in code, for example, in finding infinite loops.

Infinite loops are typically indicated by a program that should stop quickly but does not. Running the program again and again, of course, will not be helpful. Inserting “debugging messages” into the code may be more helpful, but it also takes a lot of time. To place the messages well, we need to know where the actual problem is. That is possible but difficult: if we already knew where the problem was, we would fix it without inserting debugging messages. After finding and fixing the problem, the debugging messages usually have to be removed. There is a better way: the gdb debugger can make this easy.

Assume that we are running the code for mystring.c and a ; has accidentally been inserted at the end of line 7. The program enters an infinite loop and does not stop.


$ ./mystring strlen input output strlen

What is the fastest way to diagnose this problem? Start gdb in a Linux Terminal.

$ gdb mystring

Please remember that gdb takes the executable file as the input, and not a .c file. Inside gdb, type

(gdb) r strlen input output strlen

The first letter r means “run” the program. It replaces ./mystring in the command line. Add the normal command-line arguments after r. The program now starts and enters the aforementioned infinite loop. Press Ctrl-c to interrupt the normal execution of the program and gdb will display something like:

Program received signal SIGINT, Interrupt.

0x0000000000400884 in my_strlen (str=0x603590 "If we consider that

part of the theory of relativity which may ") at mystring.c:7

This means that the program has stopped at line 7 of the file mystring.c. The message tells us where the infinite loop is. At the (gdb) prompt, type list to show the code:

4   int my_strlen(const char * str)
5   {
6     int len = 0;
7     while (str[len] != '\0');
8     {
9       len ++;

As explained earlier, infinite loops may occur in many places (lines 7, 17, and 30). Using gdb to identify an infinite loop is easy and takes only a few seconds. Moreover, we do not need to modify any line before using gdb.

7.2.2 Find Invalid Memory Accesses

An “invalid memory access”, or “segmentation fault”, is a common error that stops a program. It occurs when a program attempts to access memory outside the allowed regions. A program that has invalid memory accesses may create a security vulnerability. We can introduce a memory error into line 17 of mystring.c. That line should be:

17  while (* str != '\0')

Suppose we accidentally write it as:

17  while (* str != '0') // without \

What is the difference between '\0' and '0'? The former is the null terminator, a special invisible character indicating the end of a string. The latter is a normal character like 'a', but happens to be the character zero. Do not mix them. When running the program

$ ./mystring countchar input output countchar

we get

Segmentation fault (core dumped)


A segmentation fault means the program tries to access (read or write) memory at an invalid address. The operating system reacts by stopping the program. It is similar to attempting to enter someone else’s house. If you do not own the house, entering the house is illegal. We can determine where a segmentation fault occurs by using either gdb or valgrind.

If we run the program using gdb, we will see something like:

Program received signal SIGSEGV, Segmentation fault.

0x00000000004008cd in my_countchar (str=0x624000 <Address 0x624000
out of bounds>, ch=73 'I') at mystring.c:17

Type the bt (backtrace) command at the gdb prompt, to see the call stack.

#0 0x00000000004008cd in my_countchar (str=0x624000
<Address 0x624000 out of bounds>, ch=73 'I') at mystring.c:17

#1 0x0000000000400cc9 in main (argc=4, argv=0x7fffffffe418) at main.c:74

The call stack shows two frames. The top frame is frame 0 and the next is frame 1. In gdb, you can select a specific frame. Type

(gdb) f 1

after the (gdb) prompt to enter frame 1. Type list to show the code around line 74 in main.c. Type print i to print the value of i. This is the line number of the input file. Its value is 0, meaning that my_countchar is processing the first line of the input.

The first line of the input is "If we consider that part of the theory of relativity which may\n". gdb can tell us that if we type the command print lines[i]:

(gdb) print lines[i]

$3 = 0x603590 "If we consider that part of the theory of relativity which may\n"

The starting address of this string is 0x603590. Something is wrong inside my_countchar but at this stage, it is unclear precisely what is wrong. Let’s go back to the frame of my_countchar by typing f 0:

(gdb) f 0

#0 0x00000000004008cd in my_countchar (str=0x624000
<Address 0x624000 out of bounds>, ch=73 'I') at mystring.c:17

The segmentation fault occurs at line 17 and this line reads the value at the address stored

in str. Print the value of str in gdb:

(gdb) print str

$4 = 0x624000 <Address 0x624000 out of bounds>

Compare the value of str (0x624000) and the starting address of the string in main (0x603590). The difference is quite large but the string is not very long, only 63 characters. This means that str kept increasing far beyond the end of the string.

You may ask, “Why does the segmentation fault occur when str is so large? Doesn’t the program start accessing invalid addresses after str is larger than 0x603590 + 63?” It is correct that the program starts accessing invalid addresses after str is larger than 0x603590 + 63. However, Linux stops the program only when it reads memory that it is not authorized to use. Since the memory is given in chunks, as explained in Section 5.3, the segmentation fault occurs when the program accesses a memory address beyond the currently authorized segment. A program may access all the addresses inside the segments given to the program, and the operating system will not stop it. This does not mean that the program is correct. We need to correct the program as soon as possible.


7.2.3 Detect Invalid Memory Accesses

Another way to detect invalid memory accesses is using valgrind. The log file from valgrind can be quite long. You should go to the very last line. In this example, the line says:

ERROR SUMMARY: 59745 errors from 4 contexts (suppressed: 2 from 2)

Do not be too concerned about the number of errors detected. Fixing one error will often fix many of the other detected errors. This is because a single error can be hit many times as a program executes. Go to the very top of the log file and start looking for anything related to the source files, i.e., mystring.c, mystring.h, and main.c. The first detected problem related to mystring.c is:

==4238== Conditional jump or move depends on uninitialised value(s)

==4238== at 0x4008D2: my_countchar (mystring2.c:17)

==4238== by 0x400CC8: main (main.c:74)

Another detected problem is:

==4238== Conditional jump or move depends on uninitialised value(s)

==4238== at 0x4008BE: my_countchar (mystring2.c:19)

==4238== by 0x400CC8: main (main.c:74)

This is related to the problem at line 17. If we fix line 17, the problem at line 19 disappears. Some students think that accessing invalid memory is harmless as long as the programs do not have segmentation faults. This is wrong. Allowing accesses to invalid addresses is one of the most common security problems in software. It can allow a malicious program to “hijack” another program. If a program accesses invalid addresses, the program’s behavior is not defined. That means it may work a hundred or a thousand times, and then mysteriously fail. The same program may fail when built with a different compiler, or when run on a different computer.

Please remember that testing can demonstrate that something is wrong; however, testing

cannot demonstrate that everything is right. As we see, Linux stops the program when str

is already very far away from valid addresses. Thus, we cannot rely on testing exclusively.

Instead, we need to use many methods to prevent and detect mistakes.

“Which one should I use, gdb or valgrind?”, you may ask. The answer is both. These two tools serve different purposes. Choosing gdb vs. valgrind is like choosing a hammer vs. a screwdriver. Use the right tool for the job: gdb is interactive, and allows a programmer to see the program’s execution line by line. In contrast, valgrind runs the program until it stops (or crashes). In general it is a good idea to use valgrind first to detect whether there are any memory problems and then use gdb to pinpoint the problem. You should always use valgrind to check whether your programs have invalid memory accesses. The command is:

$ valgrind --leak-check=full --tool=memcheck --verbose

What is the difference between gcc and valgrind? Doesn’t gcc also check whether a program has problems? The gcc compiler checks the source code: i.e., it finds syntax errors. This is a very rudimentary form of error checking. It is like a spell-checker in a document editor. An article without any spelling error does not mean that the article makes any sense. In similar fashion, gcc does not check what happens when the program runs. It is impossible for gcc to check what the program does when it is running.

In contrast, valgrind is a run-time checker. The program must be run for valgrind to

check anything. This implies the following: If the program does not execute the parts of code
