2.2
Perl's Benefits
The following sections illustrate some of Perl's strong points.
2.2.1
Ease of Programming
Computer languages differ in which things
they make easy. By "easy" I mean easy for a programmer to
program. Perl has certain features that simplify several common
bioinformatics tasks. It can deal with information in ASCII text
files or flat files, which are exactly the kinds of files in which
much important biological data appears, in the GenBank and PDB
databases, among others. (See the discussion of ASCII in Chapter 4; GenBank and PDB are the subjects of Chapter 10 and Chapter 11.) Perl makes
it easy to process and manipulate long sequences such as DNA and
proteins. Perl makes it convenient to write a program that controls
one or more other programs. As a final example, Perl is used to put
biology research labs, and their results, on their own dynamic web
sites. Perl does all this and more.
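To give a flavor of what "easy" means for sequence data, here is a minimal sketch that computes the reverse complement of a DNA string. The sequence and the subroutine name reverse_complement are made up for illustration; they are not taken from any database or from later chapters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical example sequence, invented for this sketch.
my $dna = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';

# Return the reverse complement of a DNA string.
sub reverse_complement {
    my ($seq) = @_;
    my $revcom = reverse $seq;          # scalar context: reverse the characters
    $revcom =~ tr/ACGTacgt/TGCAtgca/;   # swap each base for its complement
    return $revcom;
}

print "Original:           $dna\n";
print "Reverse complement: ", reverse_complement($dna), "\n";
```

The two working lines, a reverse and a tr, are ordinary built-in string operations, which is the sense in which Perl makes long sequences convenient to manipulate.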
Although Perl is a language that's remarkably suited to
bioinformatics, it isn't the only choice nor is it always the
best choice. Other programming languages such as C and Java are also
used in bioinformatics. The choice of language depends on the problem
to be programmed, the skills of the programmers, and the available
system.
2.2.2
Rapid Prototyping
Another important benefit
of using Perl for biological
research is the speed with which a programmer can write a typical
Perl program (referred to as rapid
prototyping). Many problems can be solved in far fewer
lines of Perl code than in C or Java. This has been important to its
success in research. In a research environment there are frequent
needs for programs that do something new, that are needed only once
or occasionally, or that need to be frequently modified. In Perl, you
can often toss such a program off in a few minutes' or a few hours'
work, and the research can proceed. This rapid prototyping ability is
often a key consideration when choosing Perl for a job. It is common
to find programmers familiar with both Perl and C who claim that Perl
is five to ten times faster to program in than C. The difference can
be critical in the typical understaffed research lab.
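As one illustration of the kind of one-off program a lab might need, the following sketch tallies how often each base occurs in a sequence. It is only an example under stated assumptions: the sequence is invented, and count_bases is a hypothetical helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Count how many times each character occurs in a sequence.
# Returns a reference to a hash mapping base => count.
sub count_bases {
    my ($seq) = @_;
    my %count;
    $count{$_}++ for split //, uc $seq;   # uppercase first, then tally each character
    return \%count;
}

# Hypothetical sequence used only for illustration.
my $dna   = 'ACGGGAGGACGGGAAAATTACTACGGCATTAGC';
my $count = count_bases($dna);

# Report the tallies in alphabetical order of base.
print "$_: $count->{$_}\n" for sort keys %$count;
```

The counting itself is a single statement, because the hash creates entries as new bases are seen.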
2.2.3
Portability, Speed, and Program Maintenance
Portability
means how many types of computer systems
the language can run on. Perl has no problems there, as it's
available for virtually all modern computers found in biology labs.
If you write a DNA analyzer in Perl on your Mac, then move it to a
Windows computer, you'll find it usually runs as is or with
only minor retrofitting.
Speed means the speed with which the program
runs. Here Perl is pretty good but not the best. For speed of
execution, the usual language of choice is C. A program written in C
typically runs two or more times faster than the comparable Perl
program. (There are ways of speeding up Perl with compilers and such,
but C generally still comes out ahead.)
In many organizations, programs are first written in Perl, and then
only the programs that absolutely need to have maximum speed are
rewritten in C. The fact is, maximum speed is only occasionally an
important consideration.
Programming is relatively expensive to do: it takes time and skilled
personnel. It's labor-intensive. On the other hand, computers
and computer time (often called CPU time after the central processing
unit) are relatively inexpensive. Most desktop computers sit idle for
a large part of the day, anyway. So it's usually best to let
the computer do the work, and save the programmer's time.
Unless your program absolutely must run in, say, four seconds instead
of ten seconds, you're okay with Perl.
Program maintenance is the general
activity of keeping everything working:
such activities as adding features to a program, extending it to
handle more types of input, porting it to run on other computer
systems, fixing bugs, and so forth. Programs take a certain amount of
time, effort and cost to write, but successful programs end up
costing more to maintain than they did to write in the first place.
It's important to write in a language, and in a style, that
makes maintenance relatively easy, and Perl allows you to do so. (You
can write obscure, hard-to-maintain code in Perl, as in other
languages, but I'll give you pointers on how to make your code
easy for other programmers to read.)
2.2.4
Versions of Perl
Perl, like almost all popular
software, has gone through much growth and
change over the course of its nearly 15-year life. The
authors—Larry Wall and a large group of cohorts—publish
new versions periodically. These new versions have been carefully
designed to support most programs written under old versions, but
occasionally some major new features are added that don't work
with older versions of Perl.
This book assumes you have Perl Version 5 or higher installed. If you
have Perl installed on your computer, it's likely Perl 5, but
it's best to check. On a Unix or Linux system, or from an
MS-DOS or Mac OS X command window,
the perl -v command displays
the version number; in my case, Version 5.6.1. The number 5.6.1 is
"bigger" than 5; that means it's okay. If you get a
smaller number (very likely 4.036), you have to install a recent
version of Perl to enable the majority of programs in this book to
run as shown.
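The same check can be made from inside a program. The sketch below is one way to do it, not a requirement of this book: the 5.006 threshold on the first line and the helper name meets_minimum are my assumptions for illustration. The built-in variable $] holds the version as a plain number, and $^V holds it in dotted form.

```perl
use 5.006;   # assumed threshold: refuse to compile on anything older than Perl 5.6
use strict;
use warnings;

# $] holds the interpreter version as a number (5.6.1 is stored as 5.006001),
# so an ordinary numeric comparison tells you which version is "bigger".
sub meets_minimum {
    my ($min) = @_;
    return $] >= $min ? 1 : 0;
}

printf "Running Perl %vd (numeric form %s)\n", $^V, $];
print "Meets Version 5: ", meets_minimum(5), "\n";
```

A program that uses newer syntax fails to compile on a much older Perl before any run-time test is reached, which is why the use line comes first.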
What about future versions? Perl is always evolving, and Perl Version
6 is on the horizon. Will the code in this book still work in Perl 6?
The answer should be yes. Although Perl 6 is going to add some new things to
the language, it should have no trouble with the Perl 5 code in this
book.