Up to now, we've relied on the print statement to format output. In this section, I introduce three additional Perl features for writing output:
The printf function
Here documents
The format and write functions
The entire story about these Perl output features is beyond the scope of this book, but I'll tell you just enough to give you an idea of how they can be used.
The printf function is like the print function but with extra features that allow you to specify how certain data is printed out. Perl's printf function is taken from the C language function of the same name. Here's an example of a printf statement:
my $first = '3.14159265';
my $second = 76;
my $third = "Hello world!";
printf STDOUT "A float: %6.4f An integer: %-5d and a string: %s\n", $first, $second, $third;
This code snippet prints the following:
A float: 3.1416 An integer: 76    and a string: Hello world!
The arguments to the printf function consist of a format string, followed by a list of values that are printed as specified by the format string. The format string may also contain any text along with the directives to print the list of values. (You may also specify an optional filehandle in the same manner you would in a print function.)
The directives consist of a percent sign followed by a required conversion specifier, which in the example includes f for floating point, d for integer, and s for string. The conversion specifier indicates what kind of data is in the variable to be printed. Between the % and the conversion specifier, there may be zero or more flags, an optional minimum field width, an optional precision, and an optional length modifier. The list of values following the format string must contain data that matches the types of the directives, in order.
There are many possible options for these flags and specifiers (some are listed in Appendix B). Here's what is in the example just given. First, the directive %6.4f specifies a floating-point (that is, decimal) number, printed with a minimum width of six characters overall (padded with spaces if necessary) and exactly four digits after the decimal point. You see in the output that, although the floating-point number in $first gives the value of pi to eight decimal places, the example specifies a precision of four decimal places, so the value is rounded to 3.1416.
The %-5d directive specifies an integer to be printed in a field of width 5; the - flag causes the number to be left-justified in the field. Finally, the %s directive prints a string.
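To make the effect of the width, precision, and - flag easier to see, here is a minimal sketch that formats the same values one directive at a time. It uses sprintf, a Perl built-in that is not covered in this section; it takes the same format string as printf but returns the result instead of printing it. The square brackets and the variable names on the left are only there for illustration.

```perl
use strict;
use warnings;

# Same values as in the printf example above.
my $first  = '3.14159265';
my $second = 76;
my $third  = "Hello world!";

# Brackets mark the edges of each field so the padding is visible.
my $float_field = sprintf "[%6.4f]",  $first;   # width 6, four digits after the point
my $wide_float  = sprintf "[%10.4f]", $first;   # width 10: padded on the left with spaces
my $left_int    = sprintf "[%-5d]",   $second;  # left-justified in five columns
my $right_int   = sprintf "[%5d]",    $second;  # right-justified (the default)
my $string      = sprintf "[%s]",     $third;   # string printed as is

print "$float_field\n$wide_float\n$left_int\n$right_int\n$string\n";
```

Comparing the %-5d and %5d lines shows that the - flag only changes which side of the field receives the padding.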
Now we'll briefly examine here documents. These are convenient ways to specify multiline text for output with perhaps some variables to be interpolated, in a way that looks pretty much the same in your code as it will in the output; that is, without a lot of print statements or embedded newline characters. We'll follow Example 12-3 and its output with a discussion.
Example 12-3. Example of here document
#!/usr/bin/perl
# Example of here document
use strict;
use warnings;
my $DNA = 'AAACCCCCCGGGGGGGGTTTTTT';
for( my $i = 0 ; $i < 2 ; ++$i ) {
    print <<HEREDOC;
On iteration $i of the loop!
$DNA
HEREDOC
}
exit;
Here's the output from Example 12-3:
On iteration 0 of the loop!
AAACCCCCCGGGGGGGGTTTTTT
On iteration 1 of the loop!
AAACCCCCCGGGGGGGGTTTTTT
In Example 12-3, a here document was put in a for loop, so that you can see the $i variable changing in the printout. Variables are interpolated into a here document in the same way they are interpolated into a double-quoted string. On each pass through the loop, the contents of the here document are subject to variable interpolation and are printed out. The terminating string used in this example, HEREDOC, can be any string you specify. (There are several options for dealing with things like indentation and so forth; I won't discuss them here and refer you to the Perl documentation.) Here documents are handy for some tasks, such as when you have a long, multiline document with just a few changes applied each time you print it. A business form letter, with only the addressee changed, is a typical example. Using a here document preserves the look of the final output in the code, while allowing variable interpolation.
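The form-letter idea can be sketched as follows. This is only an illustration: the addressee names, the letter text, and the terminating string LETTER are all invented for the example. The here document is assigned to a variable rather than printed directly, which works because a here document is just another way of writing a string.

```perl
use strict;
use warnings;

# Hypothetical addressees, invented for this illustration.
my @addressees = ('Dr. Watson', 'Dr. Crick');

my @letters;
foreach my $name (@addressees) {
    # $name is interpolated exactly as in a double-quoted string.
    # The terminating string must start in the first column.
    my $letter = <<LETTER;
Dear $name,

Your sequence data is ready.
LETTER
    push @letters, $letter;
}

print @letters;
```

Only the addressee changes from one letter to the next, and the text of the letter looks the same in the code as it does in the output.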
Finally, let's take a look at the format and write functions. format is designed to generate reports and can handle page numbers, headers, and various layout options such as centering and left and right justification. It's modelled on the FORTRAN programming language conventions for formatting and so is particularly handy for producing reports based on that style, such as the PDB file format, in which fields are specified as occupying certain columns on the line.
Example 12-4 is a short example of a format that creates a FASTA-style output.
Example 12-4. Example of format function to produce FASTA output
#!/usr/bin/perl
# Create fasta format DNA output with "format" function
use strict;
use warnings;
# Declare variables
my $id = 'A0000';
my $description = 'Highly unlikely DNA. This DNA is so unlikely!';
my $DNA = 'AAAAAACCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGTTTTTTTTTTTTTTTTTTTTT';
# Define the format
format STDOUT =
# The header line
>@<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<...
$id, $description
# The DNA lines
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~
$DNA
.
# Print the fasta-formatted DNA output
write;
exit;
Here's the output of Example 12-4:
>A0000      Highly unlikely DNA. This DNA is so...
AAAAAACCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGTTTTTTTTT
TTTTTTTTTTTT
After declaring and initializing the variables that fill in the form, the form is defined with:
format STDOUT =
and the format continues until it reaches the line with a period at the beginning.
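The format is composed of three kinds of lines: comment lines, which begin with a pound sign #; picture lines, which specify the layout of the text; and argument lines, which name the variables whose values fill in the preceding picture line.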
The picture line and the argument line must be adjacent; they can't be separated by a comment line, for instance.
The first picture line/argument line combo is for the header information:
>@<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<...
$id, $description
The picture line has two picture fields in it, associated with the variables $id and $description, respectively. The picture line begins with a greater-than sign, >, which is just text that begins each FASTA file header line, by definition. Then comes the first picture field, which is an @ sign followed by nine < signs. The @ sign declares a field that has the associated variable interpolated into it. The nine less-than signs specify that the value should be left-justified, for a total of 10 columns. If the value is bigger than 10 columns, it is truncated. A less-than sign left-justifies, a greater-than sign right-justifies, and a vertical bar | centers the data in the field.
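The three kinds of justification can be compared side by side with a short sketch. It relies on two Perl features this section does not cover, so treat them as assumptions of the example: the built-in formline function, which applies a single picture line to its arguments, and the accumulator variable $^A, where formline leaves its result. The square brackets in the pictures are literal text added to make the field edges visible.

```perl
use strict;
use warnings;

my $id = 'A0000';

# One 10-column field per picture: left-justified, right-justified, centered.
# Single quotes keep Perl from treating @ as the start of an array name.
my @pictures = ('[@<<<<<<<<<]', '[@>>>>>>>>>]', '[@|||||||||]');

my @results;
foreach my $picture (@pictures) {
    $^A = '';                        # clear the accumulator before each use
    formline("$picture\n", $id);     # fill the picture with the value of $id
    push @results, $^A;              # save the formatted line
}
$^A = '';

print @results;
```

Each printed line is the same width; only the position of A0000 inside the brackets changes.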
The second picture field is almost identical. It is longer and ends with three dots (an ellipsis), which print if the contents of the variable $description can't fit into the length of the picture field (in our example it prints "This DNA is so...").
The next pair of picture/argument lines is:
^<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<~~ $DNA
The picture field starts with a caret, which declares a picture field that will handle variable-length records. The line also contains 49 less-than signs, for a total of 50 columns, left-justified. At the end are two tilde ~ signs, which indicate that the line should be repeated for the rest of the data if it doesn't fit on one line.
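The way a caret field and the two tildes split long data across lines can be tried in isolation with the following sketch. It assumes the built-in formline function and the accumulator variable $^A, neither of which is discussed in this section, and it uses a made-up 70-base sequence. Note that a caret field removes from the variable whatever it has printed, so $DNA is empty afterward.

```perl
use strict;
use warnings;

# Hypothetical 70-base sequence, invented for this illustration.
my $DNA = 'A' x 30 . 'C' x 30 . 'G' x 10;

# A caret plus 49 less-than signs gives one 50-column field; the two
# tildes repeat the line until the data is used up.
my $picture = '^' . ('<' x 49) . "~~\n";

$^A = '';                    # clear the accumulator
formline($picture, $DNA);    # each repetition chops up to 50 characters off $DNA
my $formatted = $^A;
$^A = '';

print $formatted;
```

With 70 bases and a 50-column field, the result is a line of 50 characters followed by a line of 20.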
The write command simply prints the previously defined format. By default, the output goes to STDOUT, as is done in the example, but you can supply a filehandle to the format and write statements if you desire.
The upcoming release of Perl 6 will move formats out of the core of the language and make them into a module. Details are not available as of this writing, but this change will probably entail adding a statement such as use Formats; near the top of your code in order to load the module for using formats.