Working with @ARGV and Script Arguments

One aspect of running Perl scripts that I've sidestepped over the last few days is dealing with command-line arguments. Yesterday we talked a bit about Perl's own switches (-e, -w, and so on), but what if you want to pass switches or arguments to your own scripts? How do you handle those? That's what we'll discuss in this section: script arguments in general, and handling script switches.

Anatomy of the @ARGV

When you call a Perl script with arguments beyond the name of the script, those arguments are stored in the special global list @ARGV (on the Mac, for droplets, @ARGV will be the filenames that were dropped onto the droplet). You can process this array the same way you would any other list in your Perl script. For example, here's a snippet that will just print out the arguments the script was called with, one on each line:

foreach my $arg (@ARGV) {
   print "$arg\n";
}

If your script uses a construct such as while (<>), Perl will use the contents of the @ARGV list as the filenames to open and read (if there are no files in @ARGV, Perl will try to read from the standard input). Multiple files are all opened and read sequentially, as if they were all one big file.
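
As a rough sketch of that behavior, the helper below (read_all is a made-up name, not something Perl provides) temporarily sets @ARGV to a list of filenames and lets <> walk through them. The special variable $ARGV holds the name of the file currently being read. The early return for an empty list is an assumption made here so that the sketch never waits on standard input.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from the named files with <>,
# tagging each line with the file it came from.
sub read_all {
    local @ARGV = @_;         # stand-in for real command-line arguments
    return () unless @ARGV;   # sketch only: skip the standard-input fallback
    my @lines;
    while (my $line = <>) {   # <> opens each file in @ARGV in turn
        chomp $line;
        push @lines, "$ARGV: $line";   # $ARGV is the current filename
    }
    return @lines;
}
```

In a real script you would more likely write the while (<>) loop at the top level and let Perl use the actual @ARGV.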

If you want more control over the files you're reading into a script, examine the contents of @ARGV yourself to find the filenames to open and read. Processing @ARGV directly is also useful if you're expecting a specific number of arguments, for example, one configuration file and one data file. In short: to process the contents of any number of files, use the handy <> shortcut; to require a particular set of arguments and control how each one is processed, read the filenames from @ARGV and open them individually.
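
The fixed-argument case might look something like the following sketch. The name check_args and the wording of the usage message are invented for illustration; the only assumption is that the script wants exactly one configuration file followed by one data file.

```perl
use strict;
use warnings;

# Hypothetical helper: insist on exactly two arguments,
# a configuration file followed by a data file.
sub check_args {
    my @args = @_;
    die "Usage: $0 configfile datafile\n" unless @args == 2;
    my ($config_file, $data_file) = @args;
    return ($config_file, $data_file);
}
```

A script could call this as my ($config, $data) = check_args(@ARGV); and then open each file with its own file handle.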

Note

Unlike C and Unix's argv, Perl's @ARGV contains only the arguments, not the name of the script itself ($ARGV[0] will contain the first argument). To get the name of the script, you can use the special variable $0 instead.


Script Switches and Fun with Getopt

One use of the script command line is to pass switches to a script. Switches are arguments that start with a dash (-a, -b, -c), and are usually used to control the behavior of that script. Sometimes switches are single letters (-s), sometimes they're grouped (-abc), and sometimes they have an associated value or argument (-o outfile.txt).

You can call a script with any switches you want; those switches will end up as elements of the @ARGV array just as any other arguments do. If you were using <> to process @ARGV, you'll want to get rid of those switches before reading any data—otherwise, Perl will assume that your -s switch is the name of a file. To process and remove those switches littering @ARGV, you could laboriously go through the array and figure out which elements were options, which ones were options with associated arguments, and finally end up with a list of actual filenames after you were done doing all that. Or you could use the Getopt module to do all that for you.

The Getopt module, part of the standard module library that comes with Perl, manages script switches. There are actually two modules: Getopt::Std, which processes single-character switches (-a, -d, -ofile, and so on); and Getopt::Long, which allows just about any kind of options, including multicharacter options (-sde) and GNU-style double-hyphen options (--help, --size, and so on).

In this section, I describe the Getopt::Std module, for handling simple options. If you want to handle more complex options via the Getopt::Long module, you are welcome to explore that module's documentation for yourself (run perldoc Getopt::Long for details).

To use the Getopt::Std module, you import it in your script as you do all modules:

use Getopt::Std;

Importing Getopt::Std gives you two functions: getopt and getopts. These functions are used to extract the switches from your @ARGV array and set scalar variables in your script for each of those switches.

getopts

Let's start with getopts, which defines and processes single-character switches with or without values. The getopts function takes a single string argument containing the single-character switches that your script will accept. Switches that take values must be followed by a colon (:). Uppercase and lowercase are significant and represent different switches. For example:

getopts('abc');

The argument here, 'abc', processes -a, -b or -c switches, in any order, with no associated values. Those switches can be grouped: -ab or -abc will work just as well as the individual switches will. Here's another:

getopts('ab:c');

In this example, the -b switch takes a value, which must appear on the script's command line immediately after that switch, like this:

% myscript.pl -b 10

The space after the switch itself isn't required: -b10 works as well as -b 10. You can also group these switches, as long as the switch that takes a value comes last in the group with its value right after it, like this:

% myscript.pl -acb10 # OK
% myscript.pl -abc10 # wrong: -b takes c10 as its value

Note

If you want to pass regular arguments to your script that look like switches, you can include the argument -- to indicate that the items that follow are arguments, not switches. For example, if you wanted to pass the argument -whatever to your script, you could call it

% myscript.pl -acb10 -- -whatever
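
To see how grouped switches and the -- marker play out, here is a small sketch. The helper try_switches is hypothetical; it substitutes a sample list for the real @ARGV, runs getopts('ab:c'), and reports what ended up in the $opt_a, $opt_b, and $opt_c variables that getopts sets, along with whatever is left in @ARGV.

```perl
use strict;
use warnings;
use Getopt::Std;

our ($opt_a, $opt_b, $opt_c);   # the variables getopts('ab:c') fills in

# Hypothetical helper: run getopts('ab:c') over a sample argument list and
# report what each switch variable holds afterward.
sub try_switches {
    local @ARGV = @_;                 # stand-in for real command-line arguments
    ($opt_a, $opt_b, $opt_c) = ();    # forget values from any earlier call
    getopts('ab:c');
    my %result = (
        a    => $opt_a,
        b    => $opt_b,
        c    => $opt_c,
        rest => [@ARGV],              # arguments getopts left untouched
    );
    return \%result;
}
```

Calling try_switches(qw(-acb10 -- -whatever)) should report a and c as 1, b as 10, and -whatever as the only leftover argument. Calling try_switches(qw(-abc10)) shows why that ordering is wrong: b receives c10 as its value and c is never set.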


For each switch defined in its argument, getopts uses a scalar variable named $opt_x, where x is the letter of the switch (in the example above, $opt_a, $opt_b, and $opt_c). Each variable starts out undefined. If a switch was included in the arguments to the script (in @ARGV), getopts sets its variable to 1. If the switch requires a value, getopts assigns that value from @ARGV to the variable instead. The switch and its associated value are then removed from @ARGV. After getopts finishes processing, @ARGV will either be empty or contain any remaining filename arguments, which you can then read with file handles or with <>.

After getopts is done, you'll end up with a variable for each switch that is either undefined (that switch wasn't used), 1 (that switch was used), or set to some value (that switch was used with the given value). You can then test for those values and have your script do different things based on the switches it was called with, as shown here:

if ($opt_a) {  # -a was used
   ...
}
if ($opt_b) {  # -b was used
   ...
}

So, for example, if your script was called like this:

% script.pl -a

then getopts('abc') will set $opt_a to 1. If it was called like this:

% script.pl -a -R

then $opt_a will be set to 1, and getopts will print an "Unknown option: R" warning, remove -R from @ARGV without setting any variable, and return false. If you called it like this:

% script.pl -ab10

and called getopts like this:

getopts('ab:c');

then $opt_a will be set to 1, and $opt_b will be set to 10.

Note that if you have use strict in effect, Perl will complain about the $opt_ variables suddenly popping into existence. You can get around this by predeclaring those variables with use vars, like this:

use vars qw($opt_a $opt_b $opt_c);

or, you can use the our keyword, as follows:

our ($opt_a, $opt_b, $opt_c);

Error Processing with getopts

Note that getopts reads your @ARGV in order, and stops processing when it gets to an element that does not start with a dash (-) or that isn't a value for a preceding option. This means that when you call your Perl script, you should put the options first and the bare arguments last, otherwise you may end up with unprocessed options or errors trying to read files that aren't files. You may want to write your script to make sure @ARGV is empty after getopts is done, or that its remaining arguments do not start with a dash.
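
One way to make that check is sketched below. The helper stray_switches is a made-up name; it runs getopts('ab:c') on a sample list and returns any leftover arguments that still start with a dash, which usually means a switch was given after a bare argument.

```perl
use strict;
use warnings;
use Getopt::Std;

our ($opt_a, $opt_b, $opt_c);

# Hypothetical helper: parse switches, then return any leftover arguments
# that still look like switches.
sub stray_switches {
    local @ARGV = @_;                 # stand-in for real command-line arguments
    ($opt_a, $opt_b, $opt_c) = ();    # forget values from any earlier call
    getopts('ab:c');                  # stops at the first bare argument
    my @stray = grep { /^-/ } @ARGV;  # anything dash-led was never processed
    return @stray;
}
```

With the arguments -a data.txt -c, getopts stops at data.txt, so -c is still sitting in @ARGV and is returned as a stray. With -a -c data.txt, nothing is returned.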

The switches defined by getopts are expected to be the only switches that your script will accept. If you call a script with a switch not defined in the argument to getopts, getopts will print an error (“Unknown option”), delete that option from @ARGV, and return false. You can use this behavior to make sure your script is being called correctly, and exit with a message if it's not. Just put the call to getopts inside an if statement, as follows:

if (! getopts('ab:c')) {
   die "Usage: myscript -a -b file -c\n";
}

Note also that when getopts reports an error, any switch variables it set along the way keep their values, and some of them may be wrong. Depending on how robust you'd like your argument checking to be, you might want to check those values if getopts returns false (or exit altogether).
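
The following sketch illustrates that point. The helper parse_or_report is hypothetical; it returns getopts' success or failure together with the value of $opt_a, so you can see that a valid switch stays set even when a later switch is rejected.

```perl
use strict;
use warnings;
use Getopt::Std;

our ($opt_a, $opt_b, $opt_c);

# Hypothetical helper: return whether getopts succeeded, plus $opt_a,
# so the caller can inspect both at once.
sub parse_or_report {
    local @ARGV = @_;                 # stand-in for real command-line arguments
    ($opt_a, $opt_b, $opt_c) = ();    # forget values from any earlier call
    my $ok = getopts('ab:c');         # warns on STDERR about unknown switches
    return ($ok ? 1 : 0, $opt_a);
}
```

Called with -a -R, this should return 0 for the status (because -R is not declared) and 1 for $opt_a, which was set before the error was noticed.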

getopt

The getopt function works much like getopts, in that it takes a string argument with the switches, sets an $opt_ variable for each switch it finds, and removes those switches from @ARGV as it goes. However, getopt differs from getopts in three significant respects:

  • The argument to getopt is a string containing the switches that must have associated values.

  • getopt does not require you to declare switches without values beforehand. It allows any single-letter options, and creates an $opt_ variable for each one.

  • getopt does not return a (useful) value, and does not print errors for unexpected options.

Say, for example, you have a call to getopt like this:

getopt('abc');

This function assumes that your script will be called with any of three switches, -a, -b, or -c, each one with a value. If the script is called with switches that don't have values, getopt will not complain; it happily assigns the next element in @ARGV to the variable for that switch—even if that next element is another switch or a filename that you might have wanted to read as a file. It's up to you to figure out if the values are appropriate, or if the script was called with the wrong set of arguments.
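
Here is a brief sketch of that pitfall. The helper try_getopt is a made-up name; it runs getopt('abc') on a sample list in place of the real @ARGV and reports the resulting values.

```perl
use strict;
use warnings;
use Getopt::Std;

our ($opt_a, $opt_b, $opt_c);

# Hypothetical helper: run getopt('abc') on a sample argument list and
# report what each switch variable holds afterward.
sub try_getopt {
    local @ARGV = @_;                 # stand-in for real command-line arguments
    ($opt_a, $opt_b, $opt_c) = ();    # forget values from any earlier call
    getopt('abc');                    # every listed switch expects a value
    my %result = (
        a    => $opt_a,
        b    => $opt_b,
        c    => $opt_c,
        rest => [@ARGV],
    );
    return \%result;
}
```

Called with -a -b data.txt, getopt takes -b as the value of -a, never treats -b as a switch, and leaves data.txt in @ARGV. Nothing is reported as an error, so the script has to notice the odd value itself.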

Essentially, the core difference between getopt and getopts is that getopt doesn't require you to declare your options, but also makes it more difficult to handle errors. I prefer getopts for most cases, to avoid having to do a lot of value testing.
