6.6
Fixing Bugs in Your Code
Now let's talk about what to do when your program is having trouble.
A program can go wrong in any number of ways. Maybe it won't
run at all. A look at the error messages, especially the first line
or two of the error messages, usually leads you to the problem, which
will be somewhere in the syntax, and its solution, which will be
to use the correct syntax (e.g., matching braces or ending each
statement with a semicolon).
Your program may run but not behave as you planned. Then you have
some problem with the logic of the program. Perhaps at some point,
you've zigged when you should have zagged, like adding instead
of subtracting or using the assignment operator
= when you meant to test for equality between
two numbers with ==. Or, the problem could be
that you just have a poor design to accomplish your task, and
it's only when you actually try it out that the flaw becomes
evident.
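As a small illustration of the = versus == slip, here's a sketch of the mistake and its correction. The subroutine names is_twenty_buggy and is_twenty are made up for this example. The buggy version always succeeds, because the assignment stores 20 in the variable and the result, 20, counts as true. With warnings turned on, Perl usually flags the line with "Found = in conditional, should be ==".

```perl
use strict;
use warnings;

# Buggy: = assigns 20 to $length, and the result (20) is true,
# so the test succeeds no matter what value was passed in.
sub is_twenty_buggy {
    my ($length) = @_;
    if ($length = 20) {
        return 1;
    }
    return 0;
}

# Correct: == compares the two numbers.
sub is_twenty {
    my ($length) = @_;
    if ($length == 20) {
        return 1;
    }
    return 0;
}

print is_twenty_buggy(5), "\n";   # prints 1, which is wrong
print is_twenty(5), "\n";         # prints 0
print is_twenty(20), "\n";        # prints 1
```

The program runs without any syntax error either way, which is exactly why this kind of logic bug is easy to miss.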
However, sometimes the problem is not obvious, and you have to resort
to the heavy artillery.
Fortunately, Perl has several ways to help you find and fix bugs in
your programs. The use of the statements use
strict; and use warnings;
should become a habit, as you can catch many errors with them. The
Perl debugger gives you complete freedom to examine a program in
detail as it runs.
6.6.1
use warnings; and use strict;
In general, it's not too hard to tell when the syntax of a
program is wrong because the Perl interpreter will produce error
messages that usually lead you right to the problem. It's much
harder to tell when the program is doing something you didn't
really want. Many such problems can be caught if you turn on the
warnings and enforce the strict use of declarations.
You have probably noticed that all the programs in this book up until
now start with the command interpreter line:
#!/usr/bin/perl -w
That -w switch turns on Perl's warnings, which attempt to find
potential problems in your code and warn you about them. Warnings
catch common problems, such as variables that are declared more than
once, that are not syntax errors but that can lead to bugs.
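Here's a minimal sketch of a warning in action, for the "declared more than once" case. To keep the example self-contained, it compiles a tiny snippet with a string eval and collects the warnings in an array through $SIG{__WARN__}. Both are my own choices for demonstration purposes; in a real program you would simply see the warning on your screen.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to the screen.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

# A small snippet that declares the same variable twice in one scope.
my $snippet = <<'END_SNIPPET';
use warnings;
my $count = 1;
my $count = 2;
$count;
END_SNIPPET

# The snippet is legal Perl, so it compiles and runs...
my $result = eval $snippet;
print "result: $result\n";

# ...but the second declaration produced a warning along the way.
print "warning: $_" for @caught;
```

The snippet still runs and returns a value, so nothing stops you from shipping this code. The warning is the only hint that the second my was probably a mistake.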
Another way to turn on warnings is to add the following statement
near the top of the program:
use warnings;
The statement use warnings; may not be available
on your version of Perl, if it's an old one. So if your Perl
complains about it, take it out and use the -w
switch instead, either on the command interpreter line or from the
command line:
$ perl -w my_program
However, use warnings; is a bit more portable between different
operating systems. So, from now on, that's the way I'll
turn on warnings in my code. Another important helper you should use
is the following statement placed near the top of your program (next
to use warnings;):
use strict;
As mentioned previously, this forces you to declare your variables.
(It has some options that are beyond the scope of this book.) It
finds misspelled variables, undeclared variables that may be
interfering with other parts of the program, and so on.
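To see what use strict; does with a misspelled variable, here's a small sketch. The variable names are invented for the example, and the string eval is there only so the compilation error can be captured in $@ and shown instead of stopping the demonstration itself.

```perl
use strict;
use warnings;

# A snippet with a declared variable and a misspelled copy of it.
my $snippet = <<'END_SNIPPET';
use strict;
my $sequence = 'ACGT';
$sequnce = 'TTTT';      # misspelled, and never declared with my
1;
END_SNIPPET

# Under use strict the snippet does not even compile:
# eval returns undef and the error message lands in $@.
my $ok = eval $snippet;
my $error = $@;

if (!$ok) {
    print "strict rejected the snippet:\n$error";
}
```

Without use strict; in the snippet, Perl would quietly create a second, unrelated variable named $sequnce, and $sequence would keep its old value.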
It's best to always use both use strict; and use warnings; when
writing your Perl code.
6.6.2
Fixing Bugs with Comments and Print Statements
Sometimes you can identify misbehaving code by selectively commenting
out sections of the program until you find the part that seems to
cause the problem. You can also add print
statements at suspicious parts of a misbehaving program to check what
certain variables are doing. Both of these are time-honored
programming techniques, and they work well in almost any programming
language.
Commenting out sections of code can be particularly helpful when the
error messages that you get from Perl don't point you directly
at the offending line. This happens occasionally. When it does happen
you may, by trial and error, discover that commenting out a small
section of code causes the error messages to go away; then you know
where the error is occurring.
Adding print statements can also be a quick way to
pinpoint a problem, especially if you already have some idea of where
the problem is. As a novice programmer, however, you may find that
using the Perl debugger is easier than adding
print statements. In the debugger, you can easily
set print statements at any line. For instance,
the following debugger command says to print the values of
$i and $k before line 48:
a 48 print "$i $k\n"
Once you learn how to do it, this method is generally faster and
easier than editing the Perl program and adding
print statements by hand. Using this method is
partly a matter of taste, since some extremely good Perl programmers
prefer to do it the old-fashioned way, by adding
print statements.
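If you do go the old-fashioned way, a sketch like the following keeps the extra output easy to switch off. The $DEBUG flag and the debug_line subroutine are hypothetical helpers written for this illustration, not part of Perl. The diagnostic lines go to STDERR so they don't mix with the program's real output.

```perl
use strict;
use warnings;

my $DEBUG = 1;    # set to 0 to silence the diagnostic output

# Build a one-line report of the variables we are suspicious about.
sub debug_line {
    my ($label, %vars) = @_;
    my @pairs = map { "$_=$vars{$_}" } sort keys %vars;
    return "DEBUG $label: " . join(' ', @pairs) . "\n";
}

my @dna = split('', 'ACGT');
my $previousbase = '';

foreach my $base (@dna) {
    # Show what the loop is working with at the top of every iteration.
    print STDERR debug_line('loop', base => $base, previous => $previousbase)
        if $DEBUG;
    $previousbase = $base;
}
```

When the bug is found, setting $DEBUG to 0 turns the print statements off without your having to hunt them down and delete them.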
6.6.3
The Perl Debugger
My favorite way to deal with
nonobvious bugs in my programs is to use the Perl debugger. The
problem with bugs in code is that once a program starts running, all
you can see is the output; you can't see the steps a program is
taking. The Perl debugger lets you examine your program in detail,
step by step, and almost always can lead you quickly to the problem.
You'll also find that it's easy to use with a little
practice.
There are situations the Perl debugger can't handle well:
interacting processes that depend on timing considerations, for
instance. The debugger can examine only one program at a time, and
while examining, it stops the program, so timing considerations with
other processes go right out the window.
For most purposes, though, the Perl debugger is a great, essential
programming tool. This section introduces its most important
features.
6.6.3.1
A program with bugs
Example 6-4 has some bugs we can examine. It's
supposed to take a sequence and two bases, and output everything from
those two bases to the end of the sequence (if it can find them in
the sequence). The two bases can be given as an argument, or if no
argument is given, the program uses the bases TA by default.
There is one new thing in Example 6-4. The next statement affects the
control flow in a loop. It immediately returns the control flow to the
next iteration of the loop, skipping whatever else would have followed.
Also, you may want to recall $_, which we discussed back in
Example 5-5 in the context of a foreach loop.
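Before looking at the example, here's a short sketch of next on its own. The subroutine known_bases and the use of N for an unknown base are assumptions made for the illustration. The loop uses $_ as its default variable, just as Example 6-4 does.

```perl
use strict;
use warnings;

# Keep every base except N, using next to skip the rest of the
# loop body for the elements we don't want.
sub known_bases {
    my (@bases) = @_;
    my @kept;
    foreach (@bases) {
        next if $_ eq 'N';      # jump straight to the next element
        push @kept, $_;         # reached only when the element is not N
    }
    return @kept;
}

print join('', known_bases(split('', 'ACNNGT'))), "\n";   # prints ACGT
```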
Example 6-4. A program with a bug or two
#!/usr/bin/perl
# A program with a bug or two
#
# An optional argument, for where to start printing the sequence,
# is a two-base subsequence.
#
# Print everything from the subsequence ( or TA if no subsequence
# is given as an argument) to the end of the DNA.

# declare and initialize variables
my $dna = 'CGACGTCTTCTAAGGCGA';
my @dna;
my $receivingcommittment;
my $previousbase = '';

my $subsequence = '';

if (@ARGV) {
    my $subsequence = $ARGV[0];
}else{
    $subsequence = 'TA';
}

my $base1 = substr($subsequence, 0, 1);
my $base2 = substr($subsequence, 1, 1);

# explode DNA
@dna = split ( '', $dna );

######### Pseudocode of the following loop:
#
# If you've received a committment, print the base and continue. Otherwise:
#
# If the previous base was $base1, and this base is $base2, print them.
# You have now received a committment to print the rest of the string.
#
# At each loop, save the previous base.

foreach (@dna) {
    if ($receivingcommittment) {
        print;
        next;
    } elsif ($previousbase eq $base1) {
        if ( /$base2/ ) {
            print $base1, $base2;
            $recievingcommitment = 1;
        }
    }
    $previousbase = $_;
}

print "\n";

exit;
Here's the output of two runs of Example 6-4:
$ perl example6-4 AA

$ perl example6-4
TA
Huh? It should have printed out AAGGCGA when
called with the argument AA, and
TAAGGCGA when called with no arguments. There must
be a bug in this program. But, if you look it over, there isn't
anything obviously wrong. It's time to fire up the debugger.
What follows is an actual debugging session on Example 6-4, interspersed with comments to explain
what's happening and why.
6.6.3.2
How to start and stop the debugger
The debugger runs interactively, and you
control it from the keyboard.[6]
The most common way to start it is by giving the
-d switch to Perl at the command line. Since
you're using buggy Example 6-4 to demonstrate
the debugger, here's how to start that program:
perl -d example6-4
Alternatively, you could have added a -d flag to the
command interpreter:
#!/usr/bin/perl -d
On systems such as Unix and Linux where command interpretation works,
this starts the debugger automatically.
To stop the debugger, simply type q.
6.6.3.3
Debugger command summary
First, let's try to find the bug in Example 6-4 when it's called with no
arguments:
$ perl -d example6-4
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(example6-4:11): my $dna = 'CGACGTCTTCTAAGGCGA';
DB<1>
Let's stop right here at the beginning and look at a few
things. After some messages, which may not mean a whole lot right
now, you get the excellent information that the commands
h and h h
give more help. Let's try h
h:
DB<1> h h
List/search source lines:               Control script execution:
  l [ln|sub]  List source code            T           Stack trace
  - or .      List previous/current line  s [expr]    Single step [in expr]
  w [line]    List around line            n [expr]    Next, steps over subs
  f filename  View source in file         <CR/Enter>  Repeat last n or s
  /pattern/ ?patt?   Search forw/backw    r           Return from subroutine
  v           Show versions of modules    c [ln|sub]  Continue until position
Debugger controls:                        L           List break/watch/actions
  O [...]     Set debugger options        t [expr]    Toggle trace [trace expr]
  <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set breakpoint
  ! [N|pat]   Redo a previous command     d [ln] or D Delete a/all breakpoints
  H [-num]    Display last num commands   a [ln] cmd  Do cmd before line
  = [a val]   Define/list an alias        W expr      Add a watch expression
  h [db_cmd]  Get help on command         A or W      Delete all actions/watch
  |[|]db_cmd  Send output to pager        ![!] syscmd Run cmd in a subprocess
  q or ^D     Quit                        R           Attempt a restart
Data Examination:     expr     Execute perl code, also see: s,n,t expr
  x|m expr       Evals expr in list context, dumps the result or lists methods.
  p expr         Print expression (uses script's current package).
  S [[!]pat]     List subroutine names [not] matching pattern
  V [Pk [Vars]]  List Variables in Package.  Vars can be ~pattern or !pattern.
  X [Vars]       Same as "V current_package [Vars]".
For more help, type h cmd_letter, or run man perldebug for all docs.
DB<2>
It's a bit hard to read, but you have a concise summary of the
debugger commands. You can also use the h command, which gives
several screens' worth of information. The | h command displays those
several pages one at a time; the pipe at the beginning of a debugger
command pipes the output through a pager, which typically advances a
page when you hit the spacebar on your keyboard. You should try those
out. Right now, however, let's focus on a few of the most useful
commands. But remember that typing h command gives you help about
that command.
6.6.3.4
Stepping through statements with the debugger
Back to the immediate problem. When you
started up the debugger, you saw that it stopped on the first line of
real Perl code:
main::(example6-4:11): my $dna = 'CGACGTCTTCTAAGGCGA';
There's an important point about the debugger you should
understand right away. It shows the line it's about to execute,
not the line it just executed.
So really, Example 6-4 hasn't done anything
yet. You can see from the command summary that
p tells the debugger to print
out values. If you ask it to print the value of
$dna, you'll find:
DB<2> p $dna
DB<3>
It didn't show anything because there's nothing to show;
it hasn't even seen the variable $dna yet.
So you should execute the statement. There are two commands to use:
n or s both execute the statement being displayed. (The difference is
that n or "next" skips the plunge into a subroutine call, treating it
like a single statement; s or "single step" enters a subroutine and
single-steps you through that code as well.) Once you've given one of
these commands, you can just hit Enter to repeat the same command.
Since there aren't any subroutines, you needn't worry
about choosing between n and s,
so let's use n:
DB<3> n
main::(example6-4:12): my @dna;
DB<3>
This shows the next line (you can see the line number of the Perl
program in the parentheses, right after the program name). If you
wish to see more lines, the w or "window" command will serve:
DB<3> w
9
10 # declare and initialize variables
11: my $dna = 'CGACGTCTTCTAAGGCGA';
12==> my @dna;
13: my $receivingcommittment;
14: my $previousbase = '';
15
16: my $subsequence = '';
17
18: if (@ARGV) {
DB<3>
The current line—the line that will be
executed next—is highlighted with an arrow
(==>).
The w seems like a useful thing. Let's
get more information about it with the help command
h w:
DB<3> h w
w [line] List window around line.
DB<4>
Actually, there's more—hitting w
repeatedly keeps showing more of the program; a minus sign backs up a
screen. But enough of that.
Now that $dna has been declared and initialized, let's check whether
anything has gone wrong with the first statement:
DB<4> p $dna
CGACGTCTTCTAAGGCGA
DB<5>
That's exactly what was expected. There's no bug, so
let's continue examining the lines, printing out values here
and there:
DB<5> n
main::(example6-4:13): my $receivingcommittment;
DB<5> n
main::(example6-4:14): my $previousbase = '';
DB<5> n
main::(example6-4:16): my $subsequence = '';
DB<5> n
main::(example6-4:18): if (@ARGV) {
DB<5> p @ARGV
DB<6> w
15
16: my $subsequence = '';
17
18==> if (@ARGV) {
19: my $subsequence = $ARGV[0];
20 }else{
21: $subsequence = 'TA';
22 }
23
24: my $base1 = substr($subsequence, 0, 1);
DB<6> n
main::(example6-4:21): $subsequence = 'TA';
DB<6> n
main::(example6-4:24): my $base1 = substr($subsequence, 0, 1);
DB<6> p $subsequence
TA
DB<7> n
main::(example6-4:25): my $base2 = substr($subsequence, 1, 1);
DB<7> n
main::(example6-4:28): @dna = split ( '', $dna );
DB<7> p $base1
T
DB<8> p $base2
A
DB<9>
So far, everything is as expected; the default subsequence
TA is being used, and the
$base1 and $base2 variables are
set to T and A, the first and
second bases of the subsequence. Let's continue:
DB<9> n
main::(example6-4:39): foreach (@dna) {
DB<9> p @dna
CGACGTCTTCTAAGGCGA
DB<10> p "@dna"
C G A C G T C T T C T A A G G C G A
DB<11>
This shows a trick with Perl and printing
arrays: normally they are
printed without any spacing between the elements, but enclosing an
array in double quotes in a print statement causes
it to be displayed with spaces between the elements.
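The same trick works in ordinary Perl code, not just at the debugger prompt. Here's a minimal sketch using a short made-up sequence. The space between elements comes from Perl's list separator variable $", which is a single space by default.

```perl
use strict;
use warnings;

my @dna = split('', 'CGAC');

# Printing the bare array runs the elements together.
my $joined = join('', @dna);    # the same text that print @dna produces: CGAC

# Interpolating the array in double quotes separates the elements.
my $spaced = "@dna";            # C G A C

print @dna, "\n";
print "@dna\n";
```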
Again, everything seems okay, and we're about to enter a
loop.
Let's look at the whole loop first:
DB<11> w
36 #
37 # At each loop, save the previous base.
38
39==> foreach (@dna) {
40: if ($receivingcommittment) {
41: print;
42: next;
43 } elsif ($previousbase eq $base1) {
44: if ( /$base2/ ) {
45: print $base1, $base2;
DB<11> w
43 } elsif ($previousbase eq $base1) {
44: if ( /$base2/ ) {
45: print $base1, $base2;
46: $recievingcommitment = 1;
47 }
48 }
49: $previousbase = $_;
50 }
51
52: print "\n";
DB<11>
Despite the few repeated lines resulting from the
w command, you can see the whole loop. Now you
know something in here is going wrong: when you tested the program
without giving it an argument, as it's running now, it took the
default argument TA, and so far it seemed okay.
However, all it actually did in your test was to print out the
TA when it was supposed to print out everything in
the string starting with the first occurrence of
TA. What's going wrong?
6.6.3.5
Setting breakpoints
To figure out what's wrong, you can set a breakpoint in your code. A
breakpoint is a spot in your program where you
tell the debugger to stop execution so you can poke around in the
code. The Perl debugger lets you set breakpoints in various ways.
They let you run the program, stopping only to examine it when a
statement with a breakpoint is reached. That way, you don't
have to step through every line of code. (If you have 5,000 lines of
code, and the error happens when you hit a line of code that's
first used when you're reading the 12,000th line of input,
you'll be happy about this feature.)
Notice that the part of this loop that prints out the rest of the
string, once the starting two bases have been found, is the
if block starting at line 40:
if ($receivingcommittment) {
print;
next;
}
Let's look at that $receivingcommittment
variable.
Here's one way to do this. Let's set a breakpoint at line
40. Type b 40 and then
c to continue, and the program proceeds
until it hits line 40:
DB<11> b 40
DB<12> c
main::(example6-4:40): if ($receivingcommittment) {
DB<12> p
C
DB<12>
The last command, p, prints out the element from the @dna array you
reached in the foreach loop. Since you didn't specify a variable for
the loop, it used the default $_ variable. Many Perl commands such as
print or pattern matching operate on the default $_ variable if no
other variable is given. (It's the cousin of the @_ default array
that subroutines use to hold their parameters.) So the p debugger
command shows that you're operating on C from the @dna array, which
is the first character.
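Here's a short sketch of $_ at work outside the debugger. The sequence and the counter variable are invented for the example. Both the foreach loop and the pattern match fall back on $_ because no other variable is named.

```perl
use strict;
use warnings;

my @dna = split('', 'GATTACA');
my $count_of_a = 0;

foreach (@dna) {          # no loop variable named, so each element lands in $_
    if (/A/) {            # same as: if ($_ =~ /A/)
        $count_of_a++;
    }
}

print "Found $count_of_a A's\n";
```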
All well and good. But it would be good to have the program break
when the variable $receivingcommittment has a
change in its value, and then single step from there, to see why the
program isn't printing out the rest of the string. Recall that
this variable is the flag whose change tells the program to print the
rest of the string. First let's delete all other
breakpoints:
DB<12> D
Deleting all breakpoints...
You can "watch" the variable with W
like so:
DB<12> W $receivingcommittment
DB<13> c
TA
Debugged program terminated. Use q to quit or R to restart,
use O inhibit_exit to avoid stopping after program termination,
h q, h R or h O to get additional info.
DB<13>
Wait a minute! The W command should indicate when
$receivingcommittment changes value. But when the
program continued running with the c command, it
ran to the end, meaning that $receivingcommittment
never changed value. So let's start up the program again and
break on the line that changes its value:
DB<13> R
Warning: some settings and command-line options may be lost!
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(example6-4:11): my $dna = 'CGACGTCTTCTAAGGCGA';
DB<13> w 45
42: next;
43 } elsif ($previousbase eq $base1) {
44: if ( /$base2/ ) {
45: print $base1, $base2;
46: $recievingcommitment = 1;
47 }
48 }
49: $previousbase = $_;
50 }
51
DB<14> b 46
DB<15> c
TAmain::(example6-4:46): $recievingcommitment = 1;
DB<15> n
main::(example6-4:49): $previousbase = $_;
DB<15> p $receivingcommittment
DB<16>
Huh? The code says it's assigning the variable a value of 1,
but after you execute the code with n and try
to print out the value, it doesn't print anything.
If you stare harder at the program, you see that at line 46 you
misspelled $receivingcommittment as
$recievingcommitment. That explains everything;
fix it and run it again:
$ perl example6-4
TAAGGCGA
Success!
6.6.3.6
Fixing another bug
Now, did that fix the other bug when you ran Example 6-4 with an argument?
$ perl example6-4 AA
GACGTCTTCTAAGGCGA
Again, huh? You expected AAGGCGA. Can there be another bug in the
program? Let's try the debugger again:
$ perl -d example6-4 AA
Default die handler restored.
Loading DB routines from perl5db.pl version 1.07
Editor support available.
Enter h or 'h h' for help, or 'man perldebug' for more help.
main::(example6-4:11): my $dna = 'CGACGTCTTCTAAGGCGA';
DB<1> n
main::(example6-4:12): my @dna;
DB<1> n
main::(example6-4:13): my $receivingcommittment;
DB<1> n
main::(example6-4:14): my $previousbase = '';
DB<1> n
main::(example6-4:16): my $subsequence = '';
DB<1> n
main::(example6-4:18): if (@ARGV) {
DB<1> n
main::(example6-4:19): my $subsequence = $ARGV[0];
DB<1> n
main::(example6-4:24): my $base1 = substr($subsequence, 0, 1);
DB<1> n
main::(example6-4:25): my $base2 = substr($subsequence, 1, 1);
DB<1> n
main::(example6-4:28): @dna = split ( '', $dna );
DB<1> p $subsequence
DB<2> p $base1
DB<3> p $base2
DB<4>
Okay, for some reason the $subsequence, and
therefore the $base1 and $base2
variables, are not getting set right. How come?
Check out line 19 where you declared a new my
variable in the block of the if statement with the
same name, $subsequence. That's the variable
you're setting, but it's disappearing as soon as the
if statement is over, because it's scoped in
the block since it's a my variable.
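The effect of that inner my can be seen in isolation with a sketch like this one. The $inside variable is added only to record what the inner $subsequence held while the block was running.

```perl
use strict;
use warnings;

my $subsequence = '';
my $inside;

if (1) {
    # This my creates a second variable that merely shares the name.
    my $subsequence = 'AA';
    $inside = $subsequence;     # inside the block, we see the new one
}

# The inner variable vanished when the block ended;
# the outer one was never touched.
print "inside the block:  '$inside'\n";        # 'AA'
print "after the block:   '$subsequence'\n";   # ''
```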
So again, you fix that problem by removing the my
declaration on line 19, leaving the plain assignment
$subsequence = $ARGV[0];, and run the program
again:
$ perl example6-4
TAAGGCGA
$ perl example6-4 AA
AAGGCGA
Here, finally, is success.
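For reference, here's one way the corrected logic might look with both fixes applied. This is a sketch rather than the book's listing: the loop is wrapped in a subroutine, from_subsequence (a name invented here), that builds up a string instead of printing as it goes, which makes it easy to check against the expected results.

```perl
use strict;
use warnings;

# Return everything in $dna from the first occurrence of the two-base
# $subsequence to the end, or '' if the pair is never found.
sub from_subsequence {
    my ($dna, $subsequence) = @_;

    my $base1 = substr($subsequence, 0, 1);
    my $base2 = substr($subsequence, 1, 1);

    my $receivingcommittment = 0;
    my $previousbase = '';
    my $output = '';

    foreach (split('', $dna)) {
        if ($receivingcommittment) {
            $output .= $_;
            next;
        } elsif ($previousbase eq $base1) {
            if (/$base2/) {
                $output .= $base1 . $base2;
                $receivingcommittment = 1;    # fix 1: spelled like the declaration
            }
        }
        $previousbase = $_;
    }
    return $output;
}

my $dna = 'CGACGTCTTCTAAGGCGA';
my $subsequence = @ARGV ? $ARGV[0] : 'TA';    # fix 2: one variable, no inner my
print from_subsequence($dna, $subsequence), "\n";
```

Called with no argument this prints TAAGGCGA, and with the argument AA it prints AAGGCGA, matching the two runs above.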
6.6.3.7
use warnings; and use strict; redux
Example 6-4 was somewhat artificial. It turns out
that these problems would have been reported easily if warnings had
been used. So let's see an actual example of the benefits of
use strict; and use warnings;, as discussed earlier in this chapter.
If you go back to the original Example 6-4 and add
the use warnings; directive
near the top of the program, you get the following output:
$ perl example6-4
Name "main::recievingcommitment" used only once: possible typo at example6-4 line 47.
TA
As you see, the warnings found the first bug immediately. They
noticed there was a variable that was used only once, usually a sign
of a misspelled variable. (I can never spell "receiving"
or "commitment" properly.) So fix the misspelling at line
47, and run it again:
$ perl example6-4
TAAGGCGA
$ perl example6-4 AA
substr outside of string at example6-4 line 26.
Use of uninitialized value in regexp compilation at example6-4 line 45.
Use of uninitialized value in print at example6-4 line 46.
GACGTCTTCTAAGGCGA
So, the first bug is fixed. The second bug remains with a few
warnings that are, perhaps, hard to understand. But focus on the
first error message, and see that it complains about line 26:
my $base2 = substr($subsequence, 1, 1);
So, there's something wrong with
$subsequence. Often, error messages will be off by
one line, so it may well be that the error starts on the line before,
the first time $subsequence is operated on by the
substr. But that's not the case here.
Nonetheless, the warnings have pointed directly to the problem. In
this case, you still have to take a little initiative; look back at
the $subsequence variable and notice the extra
my declaration within the if
block on line 20 that is preventing the variable from being
initialized properly. Now, declaring a variable that is scoped within
a block and that overrides another variable of the same name outside
the block is not necessarily a bug. In fact, it's perfectly legal, so
the programmers who wrote the warnings did not flag it as an obvious
error. However, it seems to have caused a real problem here!
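To see where the first of those warnings comes from, here's a sketch that reproduces it on its own. It assumes $subsequence is still the empty string, as it is in the buggy run, and it collects the warnings through $SIG{__WARN__} purely so they can be printed in an orderly way.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to the screen.
my @caught;
local $SIG{__WARN__} = sub { push @caught, $_[0] };

my $subsequence = '';                      # what the buggy program ends up with

my $base1 = substr($subsequence, 0, 1);    # empty string, and no warning
my $base2 = substr($subsequence, 1, 1);    # undefined value, plus a warning

print "base1 is an empty string\n" if defined $base1 && $base1 eq '';
print "base2 is undefined\n"       unless defined $base2;
print "warning: $_" for @caught;
```

Asking for a character at position 0 of an empty string is still within bounds, so only the second substr call complains. That's why the warning names the line that sets $base2 and not the one before it.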
One final point: if you go back to the original, buggy program,
notice there's no use
strict; in the program. If you add that and run
the program without arguments, you get the following:
$ perl example6-4
Global symbol "$recievingcommitment" requires explicit package name at example6-4 line 47.
Execution of example6-4 aborted due to compilation errors.
Fixing the misspelled variable, and running the program with the
argument, you get:
$ perl example6-4 AA
GACGTCTTCTAAGGCGA
You can see that use
strict; didn't help for the other bug. Remember,
it's best to employ both use
strict; and use warnings;.