B.14
Input/Output
This section covers getting information into programs and
receiving data back from them.
B.14.1
Input from Files
Perl has several convenient ways to get information into a program.
In this book, I've emphasized opening files and reading in the
information contained in them, because this approach is used
frequently and behaves in much the same way on all operating systems.
You've seen the open and close functions, and how open associates a
filehandle with a file, which is then used to read in the data. As
an example:
open(FILEHANDLE, "informationfile")
or die "Cannot open informationfile: $!\n";
@data_from_informationfile = <FILEHANDLE>;
close(FILEHANDLE);
This code opens the file informationfile and
associates the filehandle FILEHANDLE with it. The filehandle is then
used within angle brackets < > to actually
read in the contents of the file and store the contents in the array
@data_from_informationfile. Finally, the file is
closed by referring once again to the opened filehandle.
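The same steps can also be written to read one line at a time, which
avoids holding a large file in memory all at once. The following is a
minimal sketch, not the only way to do it: it assumes a reasonably
recent Perl (lexical filehandles and the three-argument form of open),
and it creates its own small sample file with the core File::Temp
module so that it runs as written. The sample contents are made up for
illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small sample file so the example is self-contained.
my ($tmp_fh, $filename) = tempfile(UNLINK => 1);
print $tmp_fh "ACGT\nTTGA\n";
close($tmp_fh) or die "Cannot close $filename: $!\n";

# Three-argument open with a lexical filehandle; die reports a failure.
open(my $in, '<', $filename) or die "Cannot open $filename: $!\n";

my @lines;
while (my $line = <$in>) {
    chomp $line;            # remove the trailing newline
    push @lines, $line;
}
close($in);

print scalar(@lines), " lines read\n";
```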
B.14.2
Input from STDIN
Perl allows you to read in any input that
is automatically sent to your program via standard input (STDIN).
STDIN is a filehandle that by default is always open. Your program
may be expecting some input that way. For instance, on a Mac, you can
drag and drop a file icon onto the Perl applet for your program to
make the file's contents appear in STDIN. On Unix systems, you
can pipe the output of some other program into the STDIN of your
program with shell commands such as:
someprog | my_perl_program
You can also pipe the contents of a file into your program with:
cat file | my_perl_program
or with:
my_perl_program < file
Your program can then read in the data (from program or file) that
comes as STDIN just as if it came from a file that you've
opened:
@data_from_stdin = <STDIN>;
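Input that arrives on STDIN can also be processed a line at a time
rather than slurped into an array. The sketch below assumes a
hypothetical subroutine named summarize_input that accepts any
filehandle; in a real program you would pass it \*STDIN. Here an
in-memory filehandle stands in for piped input so that the example
runs without waiting for anything to be typed.

```perl
use strict;
use warnings;

# Count the lines and characters arriving on a filehandle.
sub summarize_input {
    my ($fh) = @_;
    my ($line_count, $char_count) = (0, 0);
    while (my $line = <$fh>) {
        chomp $line;
        $line_count++;
        $char_count += length $line;
    }
    return ($line_count, $char_count);
}

# In a real program: my ($lines, $chars) = summarize_input(\*STDIN);
# Here a string opened as a file stands in for the piped data.
my $sample = "ACGT\nTTGACA\n";
open(my $fh, '<', \$sample) or die "Cannot open in-memory file: $!\n";
my ($lines, $chars) = summarize_input($fh);
close($fh);

print "$lines lines, $chars characters\n";
```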
B.14.3
Input from Files Named on the Command Line
You can name your input files on the command line. Perl places all
command-line arguments into the array @ARGV. The <> operator is
shorthand for <ARGV>; the ARGV filehandle treats the array @ARGV as a
list of filenames and returns the contents of all those files, one
line at a time. Some of the arguments may be special flags, which
should be read and removed from @ARGV if datafiles are also named,
because Perl assumes that anything left in @ARGV is an input filename
when it reaches <>. The contents of the file or files are then
available to the program using the angle brackets <> without a
filehandle, like so:
@data_from_files = <>;
For example, on Microsoft Windows, Unix, or Mac OS X, you specify
input files at the command line, like so:
% my_program file1 file2 file3
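Removing flags from @ARGV before using <> might look like the
following sketch. The -v flag and the $verbose variable are
hypothetical, chosen only for illustration, and the command line is
simulated by assigning to @ARGV so that the example is self-contained;
the two sample files are created with the core File::Temp module.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Two small sample files stand in for datafiles named on the command line.
my @sample_files;
for my $content ("ACGT\n", "TTGA\nCCGG\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print $fh $content;
    close($fh) or die "Cannot close $name: $!\n";
    push @sample_files, $name;
}

# Simulate the command line: my_program -v file1 file2
@ARGV = ('-v', @sample_files);

# Read and remove leading flags so that <> sees only filenames.
my $verbose = 0;
while (@ARGV && $ARGV[0] =~ /^-\w/) {
    my $flag = shift @ARGV;
    if ($flag eq '-v') { $verbose = 1 }
    else               { die "Unknown flag: $flag\n" }
}

# Everything left in @ARGV is treated as an input filename.
my @data_from_files = <>;
print scalar(@data_from_files), " lines read\n" if $verbose;
```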
B.14.4
Output Commands
The print statement is the most common way to output data from a Perl
program. It takes as arguments a list of scalars separated by commas.
An array can be an argument, in which case the elements of the array
are all printed one after the other:
@array = ('DNA', 'RNA', 'Protein');
print @array;
This prints out:
DNARNAProtein
If you want to put spaces between the elements of an array, place it
between double quotes in the print statement, like
this:
@array = ('DNA', 'RNA', 'Protein');
print "@array";
This prints out:
DNA RNA Protein
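If you want a separator other than a single space, two standard
options are the join function and the special variable $", which
holds the separator Perl uses when an array is interpolated in double
quotes. The following short sketch shows both; the variable names
$comma_separated and $dashed are made up for the example.

```perl
use strict;
use warnings;

my @array = ('DNA', 'RNA', 'Protein');

# join builds the string explicitly with any separator you choose.
my $comma_separated = join(', ', @array);
print "$comma_separated\n";

# $" is the separator used when an array is interpolated in double
# quotes; local limits the change to the enclosing block.
my $dashed;
{
    local $" = '-';
    $dashed = "@array";
}
print "$dashed\n";
```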
The print statement can specify a filehandle as an
optional indirect object between the print
statement and the arguments, like so:
print FH "@array";
The printf function gives more control over the
formatting of the output of numbers.
For instance, you can specify field widths; the precision, or number
of places after the decimal point; and whether the value is right- or
left-justified in the field. I showed the most common options in
Chapter 12 and refer you to the Perl documentation
that comes with your copy of Perl for all the details.
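As a brief reminder of how field width, precision, and justification
fit together, here is a small sketch. The gene name and numbers are
hypothetical values invented for the example.

```perl
use strict;
use warnings;

# Hypothetical values for one row of a report.
my ($gene, $length, $gc_percent) = ('geneA', 7088, 41.237);

# %-10s  string, left-justified in a field 10 characters wide
# %8d    integer, right-justified in a field 8 characters wide
# %6.2f  floating point, field width 6, 2 places after the decimal point
my $format = "%-10s|%8d|%6.2f\n";
printf($format, $gene, $length, $gc_percent);
```

The vertical bars are ordinary characters in the format string,
included only to make the field boundaries visible.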
The sprintf function is related to the printf function; it returns
the formatted string instead of printing it out.
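Because sprintf returns a string, its result can be stored or
combined with other data before anything is printed. A minimal
sketch, with made-up identifiers and values:

```perl
use strict;
use warnings;

# Build zero-padded identifiers; sprintf returns each string.
my @ids = map { sprintf("seq%04d", $_) } (7, 42, 1234);
print join(' ', @ids), "\n";

# Round a ratio to three decimal places and keep the result as a string.
my $ratio = sprintf("%.3f", 2 / 3);
print "$ratio\n";
```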
The format and write commands
are a way to format a multiline output, as when generating reports.
format can be a useful command, but in practice it
isn't used much. The full details are available in your Perl
documentation, and O'Reilly's Programming Perl
contains an entire chapter on format.
You can also see format in Chapter 12 of this book.
B.14.4.1
Output to STDOUT, STDERR, and Files
Standard output, with the filehandle STDOUT, is the default
destination for output from a Perl program, so it doesn't have to be
named. The following two statements are equivalent unless you have
used select to change the default output filehandle:
print "Hello biology world!\n";
print STDOUT "Hello biology world!\n";
Note that STDOUT isn't followed by a comma. STDOUT is usually
directed to the computer screen, but it may be redirected at the
command line to other programs or files. This Unix command pipes the
STDOUT of my_program to the STDIN of your_program:
my_program | your_program
This Unix command directs the output of my_program
to the file outputfile:
my_program > outputfile
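The default output filehandle can also be changed from inside the
program with select, as mentioned above. The following sketch shows
the usual pattern of saving and restoring the previous default. It
assumes an in-memory filehandle (a string opened for writing) in place
of a real output file, so that the effect is easy to see; $report and
$out are names invented for the example.

```perl
use strict;
use warnings;

# A string opened for writing stands in for an output file.
my $report = '';
open(my $out, '>', \$report) or die "Cannot open in-memory file: $!\n";

my $previous = select($out);      # $out becomes the default for print
print "Hello biology world!\n";   # goes to $out, not to STDOUT
select($previous);                # restore the earlier default
close($out);

print "Captured: $report";        # STDOUT is the default again
```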
It's also common to direct certain error messages to the
predefined standard error filehandle STDERR or to a file you've
opened for output and associated with a particular filehandle. Here
are examples of these two tasks:
print STDERR "If you reached this part of the program, something is terribly wrong!\n";
open(OUTPUTFD, ">output_file")
or die "Cannot open output_file: $!\n";
print OUTPUTFD "Here is the first line in the output file output_file\n";
STDERR is also usually directed to the computer screen by default,
but it can be directed into a file from the command line. The syntax
differs from system to system; for example, on Unix with the sh or
bash shells:
myprogram 2>myprogram.error
You can also direct STDERR to a file from within your Perl program by
including code such as the following before the first output to
STDERR. This is the most portable way to redirect STDERR:
open (STDERR, ">myprogram.error")
or die "Cannot open error file myprogram.error:$!\n";
The problem with this is that the original STDERR is lost. This
method, taken from Programming Perl, saves and
restores the original STDERR:
open ERRORFILE, ">myprogram.error"
or die "Can't open myprogram.error";
open SAVEERR, ">&STDERR";
open STDERR, ">&ERRORFILE";
print STDERR "This will appear in error file myprogram.error\n";
# now, restore STDERR
close STDERR;
open STDERR, ">&SAVEERR";
print STDERR "This will appear on the computer screen\n";
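The same save-and-restore idea can be written with lexical
filehandles and the three-argument form of open, checking each step
for failure. This is a sketch under those assumptions rather than the
version from Programming Perl; it writes to a temporary file created
with the core File::Temp module instead of myprogram.error so that it
leaves nothing behind.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A temporary file stands in for myprogram.error.
my ($tmp_fh, $error_file) = tempfile(UNLINK => 1);
close($tmp_fh);

# Duplicate the current STDERR so that it can be restored later.
open(my $saved_stderr, '>&', \*STDERR) or die "Cannot save STDERR: $!\n";

# Point STDERR at the error file.
open(STDERR, '>', $error_file) or die "Cannot redirect STDERR: $!\n";
print STDERR "This will appear in the error file\n";

# Now restore the original STDERR.
close(STDERR);
open(STDERR, '>&', $saved_stderr) or die "Cannot restore STDERR: $!\n";
close($saved_stderr);

print STDERR "This will appear on the original STDERR\n";
```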
There are a lot of details concerning filehandles not covered in this
book, and redirecting one of the predefined filehandles such as
STDERR can cause problems, especially as your programs get bigger and
rely more on modules and libraries of subroutines. A safer approach
is to define a new filehandle associated with an error file and to
send all your error messages to it:
open (ERRORMESSAGES, ">myprogram.error")
or die "Cannot open myprogram.error:$!\n";
print ERRORMESSAGES "This is an error message\n";
Note that the die function, and the closely related
warn function, print their error messages to
STDERR.
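The difference between the two can be sketched as follows: warn
prints its message and lets the program continue, while die prints
its message and ends the program unless the error is caught. The
subroutine parse_length is hypothetical, and eval is used here only
to catch the die so that the sketch runs to completion; when an error
is caught this way, the message is placed in $@ instead of being
printed.

```perl
use strict;
use warnings;

# Hypothetical check that dies on bad input.
sub parse_length {
    my ($value) = @_;
    die "Not a positive integer: $value\n"
        unless $value =~ /^\d+$/ && $value > 0;
    return $value;
}

warn "Starting length check\n";   # printed to STDERR; program continues

# eval catches the die; the message ends up in $@ rather than on STDERR.
my $length = eval { parse_length('12x') };
my $error  = $@;
print "Caught: $error" if $error;
```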