Now let's talk about what to do when your program is having trouble.
A program can go wrong in any number of ways. Maybe it won't run at all. A look at the error messages, especially the first line or two of the error messages, usually leads you to the problem, which will be somewhere in the syntax, and its solution, which will be to use the correct syntax (e.g., matching braces or ending each statement with a semicolon).
Your program may run but not behave as you planned. Then you have some problem with the logic of the program. Perhaps at some point, you've zigged when you should have zagged, like adding instead of subtracting or using the assignment operator = when you meant to test for equality between two numbers with ==. Or, the problem could be that you just have a poor design to accomplish your task, and it's only when you actually try it out that the flaw becomes evident.
However, sometimes the problem is not obvious, and you have to resort to the heavy artillery.
Fortunately, Perl has several ways to help you find and fix bugs in your programs.
The use of the statements use strict; and use warnings; should become a habit, as you can catch many errors with them. The Perl debugger gives you complete freedom to examine a program in detail as it runs.
In general, it's not too hard to tell when the syntax of a program is wrong because the Perl interpreter will produce error messages that usually lead you right to the problem. It's much harder to tell when the program is doing something you didn't really want. Many such problems can be caught if you turn on the warnings and enforce the strict use of declarations.
You have probably noticed that all the programs in this book up until now start with the command interpreter line:
#!/usr/bin/perl -w
That -w turns on Perl's warnings, which attempt to find potential problems in your code and warn you about them. It finds common problems such as variables that are declared more than once, and so on: things that are not syntax errors but that can lead to bugs.
Another way to turn on warnings is to add the following statement near the top of the program:
use warnings;
The statement use warnings; may not be available on your version of Perl, if it's an old one. So if your Perl complains about it, take it out and use the -w switch instead, either on the command interpreter line or from the command line:
$ perl -w my_program
However, use warnings; is a bit more portable between different operating
systems. So, from now on, that's the way I'll turn on warnings in my code.
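As a small illustration of the kind of problem warnings catch, here is a minimal sketch that is not part of the book's examples. The variable names and the $SIG{__WARN__} handler, used only to collect the warning so it can be printed, are assumptions made for the demonstration.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR,
# so the example can show what Perl reported.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $count;                 # declared, but never given a value
my $total = $count + 1;    # legal Perl, but almost certainly a mistake

print "total is $total\n";
print "caught: $_" for @caught;   # a "Use of uninitialized value" warning
```

Without use warnings; (or -w), the same program runs silently and the mistake goes unnoticed.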
Another important helper you should use is the following statement placed near the top of your program (next to use warnings;):
use strict;
As mentioned previously, this forces you to declare your variables. (It has some options that are beyond the scope of this book.) It finds misspelled variables, undeclared variables that may be interfering with other parts of the program, and so on.
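The following sketch, which is not from the book, shows use strict; rejecting a misspelled variable. The variable names are made up, and the string eval is used only so that the compile-time error can be captured and printed instead of stopping the script.

```perl
use strict;
use warnings;

# A tiny program held in a string. Compiling it with eval lets us
# capture the error that use strict; raises for the misspelled name.
my $snippet = <<'END_SNIPPET';
use strict;
my $sequence = 'ACGT';
$sequense = 'TTTT';    # misspelled, so it was never declared with my
END_SNIPPET

my $result = eval $snippet;
my $error  = $@;

print "strict reported: $error" if $error;
```

The message begins with Global symbol "$sequense" requires explicit package name, the same kind of error shown for Example 6-4 later in this section.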
Sometimes you can identify misbehaving code by selectively commenting out sections of the program until you find the part that seems to cause the problem. You can also add print statements at suspicious parts of a misbehaving program to check what certain variables are doing. Both of these are time-honored programming techniques, and they work well in almost any programming language.
Commenting out sections of code can be particularly helpful when the error messages that you get from Perl don't point you directly at the offending line. This happens occasionally. When it does happen you may, by trial and error, discover that commenting out a small section of code causes the error messages to go away; then you know where the error is occurring.
Adding print statements can also be a quick way to pinpoint a problem, especially if you already have some idea of where the problem is. As a novice programmer, however, you may find that using the Perl debugger is easier than adding print statements. In the debugger, you can easily set print statements at any line. For instance, the following debugger command says to print the values of $i and $k before line 48:
a 48 print "$i $k\n"
Once you learn how to do it, this method is generally faster and easier than editing the Perl program and adding print statements by hand. Which method you use is partly a matter of taste, since some extremely good Perl programmers prefer to do it the old-fashioned way, by adding print statements.
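For comparison, here is a minimal sketch of the old-fashioned approach. The subroutine count_base and the $DEBUG flag are hypothetical names invented for this illustration; they do not appear in the book's programs.

```perl
use strict;
use warnings;

my $DEBUG = 1;    # hypothetical flag: set to 0 to silence the trace

# Count how many times a base occurs in a DNA string.
sub count_base {
    my ($dna, $base) = @_;
    my $count = 0;
    foreach my $position (0 .. length($dna) - 1) {
        my $current = substr($dna, $position, 1);
        $count++ if $current eq $base;
        # Temporary debugging output, sent to STDERR so it does not
        # mix with the program's real output.
        print STDERR "position=$position current=$current count=$count\n"
            if $DEBUG;
    }
    return $count;
}

print count_base('CGACGT', 'C'), "\n";    # prints 2
```

Sending the trace to STDERR and guarding it with a flag makes it easy to switch off once the bug is found.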
My favorite way to deal with nonobvious bugs in my programs is to use the Perl debugger. The problem with bugs in code is that once a program starts running, all you can see is the output; you can't see the steps a program is taking. The Perl debugger lets you examine your program in detail, step by step, and almost always can lead you quickly to the problem. You'll also find that it's easy to use with a little practice.
There are situations the Perl debugger can't handle well: interacting processes that depend on timing considerations, for instance. The debugger can examine only one program at a time, and while examining, it stops the program, so timing considerations with other processes go right out the window.
For most purposes, the Perl debugger is a great, essential programming tool. This section introduces its most important features.
Example 6-4 has some bugs we can examine. It's supposed to take a sequence and two bases, and output everything from those two bases to the end of the sequence (if it can find them in the sequence). The two bases can be given as an argument, or if no argument is given, the program uses the bases TA by default.
There is one new thing in Example 6-4. The next statement affects the control flow in a loop. It immediately returns the control flow to the next iteration of the loop, skipping whatever else would have followed. Also, you may want to recall $_, which we discussed back in Example 5-5 in the context of a foreach loop.
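A minimal sketch of next, separate from Example 6-4 and using made-up variable names, might look like this:

```perl
use strict;
use warnings;

my @bases = split '', 'ACNNGT';
my $known = '';

foreach my $base (@bases) {
    next if $base eq 'N';    # jump straight to the next base
    $known .= $base;         # reached only for A, C, G, or T
}

print "$known\n";            # prints ACGT
```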
Example 6-4. A program with a bug or two
#!/usr/bin/perl
# A program with a bug or two
#
# An optional argument, for where to start printing the sequence,
# is a two-base subsequence.
#
# Print everything from the subsequence ( or TA if no subsequence
# is given as an argument) to the end of the DNA.

# declare and initialize variables
my $dna = 'CGACGTCTTCTAAGGCGA';
my @dna;
my $receivingcommittment;
my $previousbase = '';

my $subsequence = '';

if (@ARGV) {
  my $subsequence = $ARGV[0];
}else{
  $subsequence = 'TA';
}

my $base1 = substr($subsequence, 0, 1);
my $base2 = substr($subsequence, 1, 1);

# explode DNA
@dna = split ( '', $dna );

######### Pseudocode of the following loop:
#
# If you've received a committment, print the base and continue. Otherwise:
#
# If the previous base was $base1, and this base is $base2, print them.
# You have now received a committment to print the rest of the string.
#
# At each loop, save the previous base.

foreach (@dna) {
  if ($receivingcommittment) {
    print;
    next;
  } elsif ($previousbase eq $base1) {
    if ( /$base2/ ) {
      print $base1, $base2;
      $recievingcommitment = 1;
    }
  }
  $previousbase = $_;
}

print "\n";

exit;
Here's the output of two runs of Example 6-4:
$ perl example6-4 AA

$ perl example6-4
TA
Huh? It should have printed out AAGGCGA when called with the argument AA, and TAAGGCGA when called with no arguments. There must be a bug in this program. But, if you look it over, there isn't anything obviously wrong. It's time to fire up the debugger.
What follows is an actual debugging session on Example 6-4, interspersed with
comments to explain what's happening and why.
The debugger runs interactively, and you control it from the keyboard.[6] The most common way to start it is by giving the -d switch to Perl at the command line. Since you're using buggy Example 6-4 to demonstrate the debugger, here's how to start that program:
perl -d example6-4
Alternatively, you could have added a -d flag to the command interpreter line:
#!/usr/bin/perl -d
On systems such as Unix and Linux where command interpretation works, this starts the debugger automatically.
To stop the debugger, simply type q.
First, let's try to find the bug in Example 6-4 when it's called with no arguments:
$ perl -d example6-4
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(example6-4:11):  my $dna = 'CGACGTCTTCTAAGGCGA';
DB<1>
Let's stop right here at the beginning and look at a few things. After some messages, which may not mean a whole lot right now, you get the excellent information that the commands h and h h give more help. Let's try h h:
DB<1> h h
List/search source lines:               Control script execution:
  l [ln|sub]  List source code            T           Stack trace
  - or .      List previous/current line  s [expr]    Single step [in expr]
  w [line]    List around line            n [expr]    Next, steps over subs
  f filename  View source in file         <CR/Enter>  Repeat last n or s
  /pattern/ ?patt?   Search forw/backw    r           Return from subroutine
  v           Show versions of modules    c [ln|sub]  Continue until position
Debugger controls:                        L           List break/watch/actions
  O [...]     Set debugger options        t [expr]    Toggle trace [trace expr]
  <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set breakpoint
  ! [N|pat]   Redo a previous command     d [ln] or D Delete a/all breakpoints
  H [-num]    Display last num commands   a [ln] cmd  Do cmd before line
  = [a val]   Define/list an alias        W expr      Add a watch expression
  h [db_cmd]  Get help on command         A or W      Delete all actions/watch
  |[|]db_cmd  Send output to pager        ![!] syscmd Run cmd in a subprocess
  q or ^D     Quit                        R           Attempt a restart
Data Examination:     expr     Execute perl code, also see: s,n,t expr
  x|m expr       Evals expr in list context, dumps the result or lists methods.
  p expr         Print expression (uses script's current package).
  S [[!]pat]     List subroutine names [not] matching pattern
  V [Pk [Vars]]  List Variables in Package.  Vars can be ~pattern or !pattern.
  X [Vars]       Same as "V current_package [Vars]".
For more help, type h cmd_letter, or run man perldebug for all docs.
DB<2>
It's a bit hard to read, but you have a concise summary of the debugger commands. You can also use the h command, which gives several screens worth of information. The |h command displays those several pages one at a time; the pipe at the beginning of a debugger command pipes the output through a pager, which typically advances a page when you hit the spacebar on your keyboard. You should try those out. Right now, however, let's focus on a few of the most useful commands. But remember that typing h command can give you help about the command.
Back to the immediate problem. When you started up the debugger, you saw that it stopped on the first line of real Perl code:
main::(example6-4:11): my $dna = 'CGACGTCTTCTAAGGCGA';
There's an important point about the debugger you should understand right away. It shows the line it's about to execute, not the line it just executed.
So really, Example 6-4 hasn't done anything yet. You can see from the command summary that p tells the debugger to print out values. If you ask it to print the value of $dna, you'll find:
DB<2> p $dna

DB<3>
It didn't show anything because there's nothing to show; it hasn't even seen the variable $dna yet. So you should execute the statement. There are two commands to use: n and s both execute the statement being displayed. (The difference is that n, or "next", skips the plunge into a subroutine call, treating it like a single statement; s, or "single step", enters a subroutine and single steps you through that code as well.) Once you've given one of these commands, you can just hit Enter to repeat the same command.
Since there aren't any subroutines, you needn't worry about choosing between n and s, so let's use n:
DB<3> n
main::(example6-4:12):  my @dna;
DB<3>
This shows the next line (you can see the line number of the Perl program just before the statement, after the program name). If you wish to see more lines, the w or "window" command will serve:
DB<3> w
9
10      # declare and initialize variables
11:     my $dna = 'CGACGTCTTCTAAGGCGA';
12==>   my @dna;
13:     my $receivingcommittment;
14:     my $previousbase = '';
15
16:     my $subsequence = '';
17
18:     if (@ARGV) {
DB<3>
The current line, that is, the line that will be executed next, is highlighted with an arrow (==>).
The w command seems like a useful thing. Let's get more information about it with the help command h w:
DB<3> h w
w [line]    List window around line.
DB<4>
Actually, there's more: hitting w repeatedly keeps showing more of the program; a minus sign backs up a screen. But enough of that.
Now that $dna has been declared and initialized, let's see if it's what we expect: this may be where the program's going wrong, on the first statement!
DB<4> p $dna
CGACGTCTTCTAAGGCGA
DB<5>
That's exactly what was expected. There's no bug, so let's continue examining the lines, printing out values here and there:
DB<5> n
main::(example6-4:13):  my $receivingcommittment;
DB<5> n
main::(example6-4:14):  my $previousbase = '';
DB<5> n
main::(example6-4:16):  my $subsequence = '';
DB<5> n
main::(example6-4:18):  if (@ARGV) {
DB<5> p @ARGV

DB<6> w
15
16:     my $subsequence = '';
17
18==>   if (@ARGV) {
19:       my $subsequence = $ARGV[0];
20      }else{
21:       $subsequence = 'TA';
22      }
23
24:     my $base1 = substr($subsequence, 0, 1);
DB<6> n
main::(example6-4:21):    $subsequence = 'TA';
DB<6> n
main::(example6-4:24):  my $base1 = substr($subsequence, 0, 1);
DB<6> p $subsequence
TA
DB<7> n
main::(example6-4:25):  my $base2 = substr($subsequence, 1, 1);
DB<7> n
main::(example6-4:28):  @dna = split ( '', $dna );
DB<7> p $base1
T
DB<8> p $base2
A
DB<9>
So far, everything is as expected; the default subsequence TA is being used, and the $base1 and $base2 variables are set to T and A, the first and second bases of the subsequence. Let's continue:
DB<9> n
main::(example6-4:39):  foreach (@dna) {
DB<9> p @dna
CGACGTCTTCTAAGGCGA
DB<10> p "@dna"
C G A C G T C T T C T A A G G C G A
DB<11>
This shows a trick with Perl and printing arrays: normally they are printed without any spacing between the elements, but enclosing an array in double quotes in a print statement causes it to be displayed with spaces between the elements.
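The same trick works outside the debugger. The short sketch below is not from the book; it also uses Perl's special variable $", the separator placed between the elements of an interpolated array, which is a single space by default.

```perl
use strict;
use warnings;

my @dna = split '', 'CGAC';

print @dna, "\n";      # elements run together: CGAC
print "@dna\n";        # interpolated in double quotes: C G A C

my $spaced = "@dna";   # the same interpolation, saved in a string

{
    local $" = '-';    # temporarily change the separator
    print "@dna\n";    # C-G-A-C
}
```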
Again, everything seems okay, and we're about to enter a loop. Let's look at the whole loop first:
DB<11> w
36      #
37      # At each loop, save the previous base.
38
39==>   foreach (@dna) {
40:       if ($receivingcommittment) {
41:         print;
42:         next;
43        } elsif ($previousbase eq $base1) {
44:         if ( /$base2/ ) {
45:           print $base1, $base2;
DB<11> w
43        } elsif ($previousbase eq $base1) {
44:         if ( /$base2/ ) {
45:           print $base1, $base2;
46:           $recievingcommitment = 1;
47          }
48        }
49:       $previousbase = $_;
50      }
51
52:     print "\n";
DB<11>
Despite the few repeated lines resulting from the w command, you can see the whole loop. Now you know something in here is going wrong: when you tested the program without giving it an argument, as it's running now, it took the default argument TA, and so far it seemed okay. However, all it actually did in your test was to print out the TA when it was supposed to print out everything in the string starting with the first occurrence of TA.
What's going wrong?
To figure out what's wrong, you can set a breakpoint in your code. A breakpoint is a spot in your program where you tell the debugger to stop execution so you can poke around in the code. The Perl debugger lets you set breakpoints in various ways. They let you run the program, stopping only to examine it when a statement with a breakpoint is reached. That way, you don't have to step through every line of code. (If you have 5,000 lines of code, and the error happens when you hit a line of code that's first used when you're reading the 12,000th line of input, you'll be happy about this feature.)
Notice that the part of this loop that prints out the rest of the string, once the starting two bases have been found, is the if block starting at line 40:
if ($receivingcommittment) {
  print;
  next;
}
Let's look at that $receivingcommittment variable. Here's one way to do this. Let's set a breakpoint at line 40. Type b 40 and then c to continue, and the program proceeds until it hits line 40:
DB<11> b 40
DB<12> c
main::(example6-4:40):    if ($receivingcommittment) {
DB<12> p
C
DB<12>
The last command, p, prints out the element from the @dna array you reached in the foreach loop. Since you didn't specify a variable for the loop, it used the default $_ variable. Many Perl commands such as print or pattern matching operate on the default $_ variable if no other variable is given. (It's the cousin of @_, the default array that subroutines use to hold their parameters.) So the p debugger command shows that you're operating on C from the @dna array, which is the first character.
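The following sketch, which is separate from Example 6-4 and uses a made-up subroutine first_base, shows $_ and @_ at work outside the debugger:

```perl
use strict;
use warnings;

my @dna     = split '', 'CTA';
my $matches = 0;

foreach (@dna) {          # no loop variable named, so each base lands in $_
    $matches++ if /T/;    # shorthand for: $matches++ if $_ =~ /T/;
    print;                # shorthand for: print $_;
}
print "\n";

# @_ plays the same role for the arguments passed to a subroutine.
sub first_base {
    my ($sequence) = @_;
    return substr($sequence, 0, 1);
}

print first_base('CTA'), "\n";    # prints C
```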
All well and good. But it would be good to have the program break when the variable $receivingcommittment has a change in its value, and then single step from there, to see why the program isn't printing out the rest of the string. Recall that this variable is the flag whose change tells the program to print the rest of the string. First let's delete all other breakpoints:
DB<12> D
Deleting all breakpoints...
You can "watch" the variable with W like so:
DB<12> W $receivingcommittment
DB<13> c
TA
Debugged program terminated.  Use q to quit or R to restart,
  use O inhibit_exit to avoid stopping after program termination,
  h q, h R or h O to get additional info.
DB<13>
Wait a minute! The W command should indicate when $receivingcommittment changes value. But when the program continued running with the c command, it ran to the end, meaning that $receivingcommittment never changed value. So let's start up the program again and break on the line that changes its value:
DB<13> R
Warning: some settings and command-line options may be lost!
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(example6-4:11):  my $dna = 'CGACGTCTTCTAAGGCGA';
DB<13> w 45
42:         next;
43        } elsif ($previousbase eq $base1) {
44:         if ( /$base2/ ) {
45:           print $base1, $base2;
46:           $recievingcommitment = 1;
47          }
48        }
49:       $previousbase = $_;
50      }
51
DB<14> b 46
DB<15> c
TAmain::(example6-4:46):      $recievingcommitment = 1;
DB<15> n
main::(example6-4:49):    $previousbase = $_;
DB<15> p $receivingcommittment

DB<16>
Huh? The code says it's assigning the variable a value of 1, but after you execute the code with the n command and try to print out the value, it doesn't print anything.
If you stare harder at the program, you see that at line 46 you misspelled $receivingcommittment as $recievingcommitment. That explains everything; fix it and run it again:
$ perl example6-4
TAAGGCGA
Now, did that fix the other bug when you ran Example 6-4 with an argument?
$ perl example6-4 AA
GACGTCTTCTAAGGCGA
Again, huh? You expected AAGGCGA. Can there be another bug in the program? Let's try the debugger again:
$ perl -d example6-4 AA
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(example6-4:11):  my $dna = 'CGACGTCTTCTAAGGCGA';
DB<1> n
main::(example6-4:12):  my @dna;
DB<1> n
main::(example6-4:13):  my $receivingcommittment;
DB<1> n
main::(example6-4:14):  my $previousbase = '';
DB<1> n
main::(example6-4:16):  my $subsequence = '';
DB<1> n
main::(example6-4:18):  if (@ARGV) {
DB<1> n
main::(example6-4:19):    my $subsequence = $ARGV[0];
DB<1> n
main::(example6-4:24):  my $base1 = substr($subsequence, 0, 1);
DB<1> n
main::(example6-4:25):  my $base2 = substr($subsequence, 1, 1);
DB<1> n
main::(example6-4:28):  @dna = split ( '', $dna );
DB<1> p $subsequence

DB<2> p $base1

DB<3> p $base2

DB<4>
Okay, for some reason the $subsequence, and therefore the $base1 and $base2 variables, are not getting set right. How come?
Check out line 19, where you declared a new my variable with the same name, $subsequence, in the block of the if statement. That's the variable you're setting, but it disappears as soon as the if block is over, because a my variable is scoped to the block in which it's declared.
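The effect is easy to reproduce in isolation. In the sketch below, which is not part of Example 6-4, the array @args is a hypothetical stand-in for @ARGV, and the two $after_... variables exist only to record what the outer variable holds at each point.

```perl
use strict;
use warnings;

my @args = ('AA');    # hypothetical stand-in for @ARGV

my $subsequence = '';

if (@args) {
    my $subsequence = $args[0];    # a new variable that exists only in this block
}
my $after_shadowing = $subsequence;     # still the empty string

if (@args) {
    $subsequence = $args[0];       # no my: assigns to the outer variable
}
my $after_assignment = $subsequence;    # now AA

print "with the extra my: '$after_shadowing'\n";
print "without it:        '$after_assignment'\n";
```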
So again, you fix that problem by removing the my from line 19, leaving the plain assignment $subsequence = $ARGV[0]; and run the program again:
$ perl example6-4
TAAGGCGA
$ perl example6-4 AA
AAGGCGA
Here, finally, is success.
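For reference, here is one way the corrected logic might look with both fixes applied. This is a sketch rather than the book's listing: wrapping the loop in a subroutine named from_subsequence, shortening the flag's name to $receiving, and collecting the output in a string instead of printing as it goes are all choices made for this illustration.

```perl
use strict;
use warnings;

# Return everything from the first occurrence of the two-base
# subsequence to the end of the DNA, or '' if it never occurs.
sub from_subsequence {
    my ($dna, $subsequence) = @_;

    my $base1        = substr($subsequence, 0, 1);
    my $base2        = substr($subsequence, 1, 1);
    my $receiving    = 0;     # the flag, spelled the same way everywhere
    my $previousbase = '';
    my $output       = '';

    foreach (split '', $dna) {
        if ($receiving) {
            $output .= $_;
            next;
        } elsif ($previousbase eq $base1) {
            if ( /$base2/ ) {
                $output .= $base1 . $base2;
                $receiving = 1;
            }
        }
        $previousbase = $_;
    }
    return $output;
}

my $dna = 'CGACGTCTTCTAAGGCGA';

# No my inside a block here, so the argument really is used.
my $subsequence = @ARGV ? $ARGV[0] : 'TA';

print from_subsequence($dna, $subsequence), "\n";
```

Returning the result from a subroutine, rather than printing inside the loop, also makes the logic easier to check against known inputs.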
Example 6-4 was somewhat artificial. It turns out that these problems would have been reported easily if warnings had been used. So let's see an actual example of the benefits of use strict; and use warnings;, as discussed earlier in this chapter.
If you go back to the original Example 6-4 and add the use warnings; directive near the top of the program, you get the following output:
$ perl example6-4
Name "main::recievingcommitment" used only once: possible typo at example6-4 line 47.
TA
As you see, the warnings found the first bug immediately. They noticed there was a variable that was used only once, usually a sign of a misspelled variable. (I can never spell "receiving" or "commitment" properly.) So fix the misspelling (line 46 of the original listing, line 47 now that use warnings; has been added), and run it again:
$ perl example6-4
TAAGGCGA
$ perl example6-4 AA
substr outside of string at example6-4 line 26.
Use of uninitialized value in regexp compilation at example6-4 line 45.
Use of uninitialized value in print at example6-4 line 46.
GACGTCTTCTAAGGCGA
So, the first bug is fixed. The second bug remains with a few warnings that are, perhaps, hard to understand. But focus on the first error message, and see that it complains about line 26:
my $base2 = substr($subsequence, 1, 1);
So, there's something wrong with $subsequence. Often, error messages will be off by one line, so it may well be that the error starts on the line before, the first time $subsequence is operated on by substr. But that's not the case here.
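To see why the second substr call draws the warning and the first does not, consider this small sketch. It is not from the book; the $SIG{__WARN__} handler is an assumption used only to collect the warning text for display.

```perl
use strict;
use warnings;

# Collect warnings so they can be shown alongside the values.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $subsequence = '';    # what the buggy program ends up with

my $base1 = substr($subsequence, 0, 1);    # '': offset 0 is the end of an empty string
my $base2 = substr($subsequence, 1, 1);    # undef: offset 1 is past the end

print "base1 is ", (defined $base1 ? "'$base1'" : 'undef'), "\n";
print "base2 is ", (defined $base2 ? "'$base2'" : 'undef'), "\n";
print "caught: $_" for @caught;
```

The undefined $base2 is what later produces the "Use of uninitialized value" warnings in the pattern match and the print.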
Nonetheless, the warnings have pointed directly to the problem. In this case, you still have to take a little initiative; look back at the $subsequence variable and notice the extra my declaration within the if block on line 20 that is preventing the variable from being initialized properly. Now, declaring a variable scoped within a block that overrides another variable of the same name outside the block is not necessarily a bug. In fact, it's perfectly legal, so the programmers who wrote the warnings did not flag it as an obvious error. However, it seems to have caused a real problem here!
One final point: if you go back to the original, buggy program, notice there's no use strict; in the program. If you add that and run the program without arguments, you get the following:
$ perl example6-4
Global symbol "$recievingcommitment" requires explicit package name at example6-4 line 47.
Execution of example6-4 aborted due to compilation errors.
Fixing the misspelled variable, and running the program with the argument, you get:
$ perl example6-4 AA
GACGTCTTCTAAGGCGA
You can see that use strict; didn't help for the other bug. Remember, it's best to employ both use strict; and use warnings;.