12.5
Presenting Data
Up to now, we've relied on the print
statement to format
output. In this section, I introduce three
additional Perl features for
writing
output: the printf function, here documents, and the
format and write functions.
The entire story about these Perl output features is beyond the scope
of this book, but I'll tell you just enough to give you an idea
of how they can be used.
12.5.1
The printf Function
The printf function is
like the print
function but with extra features that allow you to specify how
certain data is printed out. Perl's printf
function is taken from the C language function of the same name.
Here's an example of a printf statement:
my $first = '3.14159265';
my $second = 76;
my $third = "Hello world!";
printf STDOUT "A float: %6.4f An integer: %-5d and a string: %s\n",
$first, $second, $third;
This code snippet prints the following:
A float: 3.1416 An integer: 76    and a string: Hello world!
The arguments to the printf function consist of
a format string, followed by a list of
values that are printed as specified by the format string. The format
string may also contain any text along with the directives to print
the list of values. (You may also specify an optional filehandle,
just as you would with the print function.)
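The optional filehandle can be illustrated with a minimal sketch. The variable names and the in-memory filehandle ($out, $buffer) are assumptions made for this illustration only; they make the result easy to inspect and are not part of the original example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $count = 76;

# With no filehandle, printf writes to STDOUT
printf "Found %d sequences\n", $count;

# With a filehandle: note there is no comma between the
# filehandle and the format string
printf STDERR "Warning: only %d sequences\n", $count;

# Any open filehandle works. Here an in-memory filehandle
# (an assumption for this sketch) collects the output in $buffer
my $buffer = '';
open(my $out, '>', \$buffer) or die "Cannot open in-memory filehandle: $!";
printf $out "%s has %d bases\n", 'seq1', 42;
close($out);

print $buffer;
```

As with print, leaving out the filehandle sends the output to STDOUT.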
The
directives consist of a percent sign
followed by a required conversion specifier, which in the example
includes f for floating point,
d for integer, and s for
string. The conversion specifier indicates what kind of data is in
the variable to be printed. Between the % and the
conversion specifier, there may be 0 or more flags, an optional
minimum field width, an optional precision, and an optional length
modifier. The list of values following the format string must contain
data that matches the types of directives, in order.
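The parts of a directive can be tried out one at a time. The following is a small sketch, not an exhaustive list; it uses Perl's sprintf function, which accepts the same format string as printf but returns the formatted text instead of printing it. The values and variable names are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Each directive is: % [flags] [minimum width] [.precision] conversion
my $padded    = sprintf('%05d',  42);          # flag 0, width 5:      00042
my $signed    = sprintf('%+d',   42);          # flag +:               +42
my $rounded   = sprintf('%8.2f', 3.14159265);  # width 8, precision 2: "    3.14"
my $truncated = sprintf('%.3s',  'ACGTACGT');  # precision on a string: ACG

# Values are consumed left to right, one per directive;
# %% prints a literal percent sign and consumes no value
my $line = sprintf('%s has %d bases (%.1f%% GC)', 'seq1', 120, 41.666);

print "$padded\n$signed\n$rounded\n$truncated\n$line\n";
```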
There are many possible options for these flags and specifiers (some
are listed in Appendix B). Here's what is in
the printf example above. First, the directive
%6.4f specifies to print a floating-point (that
is, a decimal) number, with a minimum width of six characters overall
(padded with spaces if necessary), and exactly four digits after the
decimal point. You see in the output that, although the
$first floating-point number gives the value of pi to
eight decimal places, the example specifies a precision of four
decimal places, so the value is rounded to 3.1416.
The %-5d directive specifies an integer to be
printed in a field of width 5; the - flag causes
the number to be left-justified in the field. Finally, the
%s directive prints a string.
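Because padding spaces are hard to see in printed output, it can help to put visible delimiters around each field. This sketch reuses the values from the example and wraps each directive in square brackets; the brackets and the extra variations are additions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $first  = '3.14159265';
my $second = 76;

my $float_field = sprintf('[%6.4f]',  $first);   # exactly fills six columns
my $wide_float  = sprintf('[%10.4f]', $first);   # padded on the left to ten columns
my $left_int    = sprintf('[%-5d]',   $second);  # left-justified in five columns
my $right_int   = sprintf('[%5d]',    $second);  # right-justified (the default)
my $too_wide    = sprintf('[%-5d]',   1234567);  # the width is a minimum: nothing is cut off

print "$float_field\n$wide_float\n$left_int\n$right_int\n$too_wide\n";
```

This prints:

```
[3.1416]
[    3.1416]
[76   ]
[   76]
[1234567]
```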
12.5.2
here Documents
Now we'll briefly
examine here
documents. These are convenient ways to specify multiline text for
output with perhaps some variables to be interpolated, in a way that
looks pretty much the same in your code as it will in the
output—that is, without a lot of print
statements or embedded newline \n characters.
We'll follow Example 12-3 and its output with a
discussion.
Example 12-3. Example of here document
#!/usr/bin/perl
# Example of here document
use strict;
use warnings;
my $DNA = 'AAACCCCCCGGGGGGGGTTTTTT';
for( my $i = 0 ; $i < 2 ; ++$i ) {
    print <<HEREDOC;
On iteration $i of the loop!
$DNA
HEREDOC
}
exit;
Here's the output from Example 12-3:
On iteration 0 of the loop!
AAACCCCCCGGGGGGGGTTTTTT
On iteration 1 of the loop!
AAACCCCCCGGGGGGGGTTTTTT
In Example 12-3, a
here
document was put in a for loop, so that you can
see the $i variable changing in the printout. The
variables are interpolated into a here document in
the same way they are interpolated into a double-quoted string. Every
time the program goes through the loop, the contents of the
here document are subject to variable
interpolation and are printed out. The terminating string used in
this example, HEREDOC, can be any string you specify. (There are
several options for dealing with things like indentation and so
forth; I won't discuss them here and refer you to the Perl
documentation.) Here documents are handy for some tasks, such as when
you have a long, multiline document with just a few changes applied
each time you print it. A business form letter, with only the
addressee changed, is a typical example. Using a
here document preserves the look of the final
output in the code, while allowing variable interpolation.
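The form-letter idea might look something like the following sketch. The subroutine name form_letter, the addressees, and the gene name are all hypothetical; the point is only that the letter appears in the code much as it will appear in the output.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a short form letter for one addressee.
# (form_letter and all the names below are made up for this sketch.)
sub form_letter {
    my ($addressee, $gene) = @_;

    # The quotes around END_LETTER request double-quote-style
    # interpolation, which is also the default for a bare terminator
    return <<"END_LETTER";
Dear $addressee,

The sequence for gene $gene is now available.

Sincerely,
The Sequencing Lab
END_LETTER
}

foreach my $addressee ('Dr. Smith', 'Dr. Jones') {
    print form_letter($addressee, 'geneX');
    print "\n";
}
```

Note that the text of the here document and its terminating string start in the first column, even though they sit inside an indented subroutine.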
12.5.3
format and write
Finally, let's
take a look at the
format and write functions.
format is designed to generate reports and can
handle page numbers, headers, and various layout options such as
centering and left and right justification. It's modelled on
the FORTRAN programming-language conventions for formatting and so is
particularly handy for producing reports based on that style, such as
the PDB file format, in which fields are specified as occupying
certain columns on the line.
Example 12-4 is a short example of a format that
creates a FASTA-style output.
Example 12-4. Example of format function to produce FASTA output
#!/usr/bin/perl
# Create fasta format DNA output with "format" function
use strict;
use warnings;
# Declare variables
my $id = 'A0000';
my $description = 'Highly unlikely DNA. This DNA is so unlikely!';
my $DNA = 'AAAAAACCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGTTTTTTTTTTTTTTTTTTTTT';
# Define the format
format STDOUT =
# The header line
>@<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<...
$id, $description
# The DNA lines
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
$DNA
.
# Print the fasta-formatted DNA output
write;
exit;
Here's the output of Example 12-4:
>A0000      Highly unlikely DNA. This DNA is so...
AAAAAACCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGTTTTTTTTT
TTTTTTTTTTTT
After declaring and initializing the variables that fill in the format,
the format is defined with:
format STDOUT =
and the format continues until it reaches the line with a period at
the beginning.
The format is composed of three kinds of lines:
-
A comment beginning with the pound sign #
-
A
picture line that specifies the layout of
text
-
An argument line that names the variables that fill in the preceding
picture line
The picture line and the argument line must be adjacent; they
can't be separated by a comment line, for instance.
The first picture line/argument line combo is for the header
information:
>@<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<...
$id, $description
The picture line has two picture fields in it, associated with the
variables $id and $description,
respectively. The picture line begins with a greater-than sign,
>, which is just text that begins each FASTA
file header line, by definition. Then comes the first picture field,
which is an @ sign followed by nine
< signs. The @ sign declares
a field that has the associated variable interpolated into it. The
use of the nine less-than signs specifies
that the value should be left-justified, for a total of 10 columns.
If the value is bigger than 10 columns, it is truncated. A less-than
sign left-justifies, a greater-than sign
right-justifies, and a
vertical bar | centers the data in the
field.
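The three kinds of justification, and the truncation of long values, can be tried out with the following sketch. It uses Perl's built-in formline function, which fills in a single picture line and appends the result to the special variable $^A; this is a different route from the format and write statements used in Example 12-4, chosen here only because it returns a string that is easy to inspect. The helper name fill_picture is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fill in one picture line and return the result as a string.
# (fill_picture is a made-up helper for this sketch.)
sub fill_picture {
    my ($picture, @values) = @_;
    $^A = '';                      # clear the format accumulator
    formline($picture, @values);   # formline appends to $^A
    my $result = $^A;
    $^A = '';
    return $result;
}

# Three fields, each eight columns wide (the @ sign plus seven
# justification characters). The pictures are single-quoted so
# that Perl doesn't try to interpolate the @ signs as arrays.
my $justified = fill_picture('[@<<<<<<<] [@>>>>>>>] [@|||||||]',
                             'ACGT', 'ACGT', 'ACGT');

# A value longer than its field is truncated
my $cut = fill_picture('[@<<<<]', 'ACGTACGTACGT');

print "$justified\n$cut\n";
```

This prints:

```
[ACGT    ] [    ACGT] [  ACGT  ]
[ACGTA]
```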
The second picture field is almost identical. It is longer and ends
with three dots (an ellipsis), which print if the contents of the
variable $description can't fit into the
length of the picture field (which is the case here).
The next pair of picture/argument lines is:
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
$DNA
The picture field starts with a caret, which declares a picture field
that will handle variable-length records. The line also contains 49
less-than signs, for a total of 50 columns, left-justified. At the
end are two tilde ~ signs, which
indicate there should be additional lines for the data if it
doesn't fit on one line.
The write command simply prints the previously
defined format. By default, the output goes to STDOUT, as is done in
the example, but you can supply a filehandle to the
format and write statements if
you desire.
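Here is a minimal sketch of supplying a filehandle to both statements. The filehandle name OUT is hypothetical, and it is opened on an in-memory string purely so that the result can be examined; an ordinary file opened for writing would be used the same way. The format is given the same name as the filehandle, which is how write finds it by default.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $id  = 'A0000';
my $DNA = 'AAAAACCCCCGGGGGTTTTTAAAAA';    # 25 bases

# The format is named OUT, matching the filehandle used below
format OUT =
>@<<<<<<<<<
$id
^<<<<<<<<<~~
$DNA
.

# Open OUT on an in-memory string (an assumption for this sketch)
my $report = '';
open(OUT, '>', \$report) or die "Cannot open in-memory filehandle: $!";

# Print the format named OUT to the filehandle OUT
write OUT;
close(OUT);

print $report;
```

With a 10-column caret field, the 25 bases come out on three lines of 10, 10, and 5 characters. Be aware that a caret field removes the text it prints from its variable, so $DNA is empty after the write.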
The upcoming release of Perl 6 will move formats out of the core of
the language and make them into a module. Details are not available
as of this writing, but this change will probably entail adding a
statement such as use Formats; near the top of
your code in order to load the module for using formats.