Before you get too worked up over all that syntax, just remember that the normal way to define a simple subroutine ends up looking like this:
sub razzle { print "Ok, you've been razzled.\n"; }
and the normal way to call it is simply:
razzle();
In this case, we ignored inputs (arguments) and outputs
(return values). But the Perl model for passing data into and out of a
subroutine is really quite simple: all function parameters are passed
as one single, flat list of scalars, and multiple return values are
likewise returned to the caller as one single, flat list of scalars.
As with any LIST, any arrays or hashes
passed in these lists will interpolate their values into the flattened
list, losing their identities--but there are several ways to get
around this, and the automatic list interpolation is frequently quite
useful. Both parameter lists and return lists may contain as many or
as few scalar elements as you'd like (though you may put constraints
on the parameter list by using prototypes). Indeed, Perl is designed
around this notion of variadic functions (those
taking any number of arguments), unlike C, where they're sort of
grudgingly kludged in so that you can call printf(3).
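As a minimal sketch of that flattening (count_args, @days, and %limits are hypothetical names invented for this illustration), a subroutine can simply report how many scalars arrived:

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many scalars arrived in @_.
sub count_args { return scalar @_ }

my @days   = ('mon', 'tue', 'wed');
my %limits = (low => 1, high => 9);

my $n_array = count_args(@days);             # the array's three elements
my $n_mixed = count_args('x', @days, 'y');   # 1 + 3 + 1 scalars
my $n_hash  = count_args(%limits);           # two keys plus two values

print "$n_array $n_mixed $n_hash\n";         # prints "3 5 4"
```

The subroutine cannot tell where @days began or ended, or that %limits was ever a hash; it sees only the scalars.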
Now, if you're going to design a language around the
notion of passing varying numbers of arbitrary arguments, you'd better
make it easy to process those arbitrary lists of arguments. Any
arguments passed to a Perl routine come in as the array @_. If you
call a function with two arguments, they are accessible inside the
function as the first two elements of that array: $_[0] and $_[1].
Since @_ is just a regular array with an irregular name, you can do
anything to it you'd normally do to an array. The array @_ is a local
array, but its values are aliases to the actual scalar parameters.
(This is known as pass-by-reference semantics.) Thus you can modify
the actual parameters if you modify the corresponding element of @_.
(This is rarely done, however, since it's so easy to return
interesting values in Perl.)
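A small sketch of the aliasing, assuming a hypothetical double_first routine that is not part of the original text:

```perl
use strict;
use warnings;

# Hypothetical routine: changes its first argument through the alias
# in @_ and returns the number of arguments it received.
sub double_first {
    $_[0] *= 2;          # modifies the caller's variable, not a copy
    return scalar @_;    # @_ is an ordinary array, so this is its size
}

my $x     = 21;
my $count = double_first($x, 'ignored');

print "$x $count\n";     # prints "42 2"
```

The caller's $x changes even though the routine never mentions $x by name.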
The return value of the subroutine (or of any other
block, for that matter) is the value of the last expression evaluated.
Or you may use an explicit return statement to
specify the return value and exit the subroutine from any point in the
subroutine. Either way, as the subroutine is called in a scalar or
list context, so also is the final expression of the routine evaluated
in that same scalar or list context.
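The following sketch illustrates both rules; weekdays and sign_of are hypothetical routines made up for the purpose:

```perl
use strict;
use warnings;

# Hypothetical sub with no explicit return: the value of the last
# expression evaluated (the array) becomes the return value.
sub weekdays {
    my @days = qw(mon tue wed thu fri);
    @days;
}

# Hypothetical sub that uses an explicit return to exit early.
sub sign_of {
    my $n = shift;
    return -1 if $n < 0;
    return  1 if $n > 0;
    0;
}

my @all     = weekdays();   # list context: all five elements
my $count   = weekdays();   # scalar context: the array yields its size
my ($first) = weekdays();   # parentheses give list context: first element

print scalar(@all), " $count $first ", sign_of(-7), "\n";   # prints "5 5 mon -1"
```

The same final expression, @days, produces a list or a count depending only on how the caller uses the call.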
Perl does not yet have named formal parameters, but
in practice all you do is copy the values of @_ to a my list, which
serves nicely for a list of formal parameters. (Not coincidentally,
copying the values changes the pass-by-reference semantics into
pass-by-value, which is how people usually expect parameters to work
anyway, even if they don't know the fancy computer science terms for
it.) Here's a typical example:
sub maysetenv {
    my ($key, $value) = @_;
    $ENV{$key} = $value unless $ENV{$key};
}
But you aren't required to name your parameters, which
is the whole point of the @_ array. For example,
to calculate a maximum, you can just iterate over @_ directly:
sub max {
    my $max = shift(@_);
    for my $item (@_) {
        $max = $item if $max < $item;
    }
    return $max;
}

$bestday = max($mon,$tue,$wed,$thu,$fri);
Or you can fill an entire hash at once:
sub configuration {
    my %options = @_;
    print "Maximum verbosity.\n" if $options{VERBOSE} == 9;
}

configuration(PASSWORD => "xyzzy", VERBOSE => 9, SCORE => 0);
Here's an example of not naming your formal arguments so that you can modify your actual arguments:
upcase_in($v1, $v2);  # this changes $v1 and $v2
sub upcase_in {
    for (@_) { tr/a-z/A-Z/ }
}
You aren't allowed to modify constants in this way, of
course. If an argument were actually a scalar literal like "hobbit"
or a read-only scalar variable like $1, and you tried to change it,
Perl would raise an exception (presumably fatal, possibly
career-threatening). For example, this won't work:
upcase_in("frederick");
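To see the failure without killing the program, the call can be wrapped in eval. This is only a sketch: the variables $ok and $err are names chosen here, and the exact wording of the message may differ between perl versions.

```perl
use strict;
use warnings;

sub upcase_in {
    for (@_) { tr/a-z/A-Z/ }
}

# eval traps the exception so the script can inspect it instead of dying.
my $ok  = eval { upcase_in("frederick"); 1 };
my $err = $ok ? '' : $@;

print "caught: $err" unless $ok;
```

On a recent perl the trapped message reports an attempted modification of a read-only value.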
It would be much safer if the upcase_in
function were written to return a copy of its parameters instead of
changing them in place:
($v3, $v4) = upcase($v1, $v2);
sub upcase {
    my @parms = @_;
    for (@parms) { tr/a-z/A-Z/ }
    # Check whether we were called in list context.
    return wantarray ? @parms : $parms[0];
}
Notice how this (unprototyped) function doesn't care whether
it was passed real scalars or arrays. Perl will smash everything
into one big, long, flat @_ parameter list. This
is one of the places where Perl's simple argument-passing style
shines. The upcase function will work perfectly
well without changing the upcase definition even
if we feed it things like this:
@newlist = upcase(@list1, @list2);
@newlist = upcase( split /:/, $var );
Do not, however, be tempted to do this:
(@a, @b) = upcase(@list1, @list2); # WRONG
Why not? Because, like the flat incoming parameter list in @_,
the return list is also flat. So this stores
everything in @a and empties out @b
by storing the null list there. See Section 6.3 for
alternatives.
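The sketch below first shows the problem and then one possible workaround based on array references. The upcase_each routine and the variable names are hypothetical, and this is not necessarily the approach Section 6.3 takes.

```perl
use strict;
use warnings;

sub upcase {
    my @parms = @_;
    for (@parms) { tr/a-z/A-Z/ }
    return wantarray ? @parms : $parms[0];
}

my @list1 = qw(a b);
my @list2 = qw(c d e);

my (@x, @y);
(@x, @y) = upcase(@list1, @list2);         # @x swallows all five values
print scalar(@x), " ", scalar(@y), "\n";   # prints "5 0"

# Hypothetical alternative: accept array references and return one new
# array reference per input, so the boundaries survive the round trip.
sub upcase_each {
    my @results;
    for my $aref (@_) {
        push @results, [ map { uc $_ } @$aref ];
    }
    return @results;
}

my ($p, $q) = upcase_each(\@list1, \@list2);
print "@$p | @$q\n";                       # prints "A B | C D E"
```

Each reference is a single scalar, so the flat return list carries exactly two values and each one still knows which elements it owns.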
If you want your function to return in such a way
that the caller will realize there's been an error, the most natural
way to do this in Perl is to use a bare return
statement without an argument. That way when the function is used in
scalar context, the caller gets undef, and when
used in list context, the caller gets a null list.
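Here is a sketch of why the bare form matters, using a hypothetical port_of lookup and a made-up %port_for table. The last line contrasts it with an explicit return undef, which yields a one-element list in list context.

```perl
use strict;
use warnings;

# Hypothetical lookup table and routine, invented for this example.
my %port_for = (http => 80, ssh => 22);

sub port_of {
    my ($service) = @_;
    return unless exists $port_for{$service};   # failure: bare return
    return $port_for{$service};
}

my $scalar = port_of('gopher');              # undef in scalar context
my @list   = port_of('gopher');              # null list in list context
my @bad    = (sub { return undef })->();     # contrast: a list holding one undef

print defined($scalar) ? "defined" : "undef",
      " ", scalar(@list), " ", scalar(@bad), "\n";   # prints "undef 0 1"
```

Because @list is empty, a test such as "if (@list)" fails as the caller would expect, while @bad would wrongly count as true.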
Under extraordinary circumstances, you might choose
to raise an exception to indicate an error. Use this measure
sparingly, though; otherwise, your whole program will be littered
with exception handlers. For example, failing to open a file in a
generic file-opening function is hardly an exceptional event.
However, ignoring that failure might well be. The wantarray
built-in returns undef if your function was called in void
context, so you can tell if you're being ignored:
if ($something_went_awry) {
    return if defined wantarray;  # good, not void context.
    die "Pay attention to my error, you danglesocket!!!\n";
}
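The three answers wantarray can give are easy to observe with a sketch like this one, where note_context and @seen are hypothetical names:

```perl
use strict;
use warnings;

my @seen;

# Hypothetical routine: records the context it was called in.
sub note_context {
    push @seen, !defined(wantarray) ? 'void'
              : wantarray           ? 'list'
              :                       'scalar';
    return;
}

note_context();             # result discarded: void context
my $s = note_context();     # scalar context
my @l = note_context();     # list context

print "@seen\n";            # prints "void scalar list"
```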
Subroutines may be called recursively because each
call gets its own argument array, even when the routine calls
itself. If a subroutine is called using the & form, the argument
list is optional. If the & is used but the argument list is omitted,
something special happens: the @_ array of the calling routine is
supplied implicitly. This is an efficiency mechanism that new users
may wish to avoid.
&foo(1,2,3);    # pass three arguments
foo(1,2,3);     # the same

foo();          # pass a null list
&foo();         # the same

&foo;           # foo() gets current args, like foo(@_), but faster!
foo;            # like foo() if sub foo predeclared, else bareword "foo"
Not only does the & form make the
argument list optional, but it also disables any prototype checking
on the arguments you do provide. This is partly for historical
reasons and partly to provide a convenient way to cheat if you know
what you're doing. See Section 6.4 later in this chapter.
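Both effects of the & form can be seen in a short sketch. The routines show_args, outer, and one_scalar are hypothetical, and the ($) prototype is used only as an assumption-laden illustration of what & bypasses.

```perl
use strict;
use warnings;

sub show_args { return join ',', @_ }

sub outer {
    my $implicit = &show_args;     # no argument list: sees outer's own @_
    my $empty    = &show_args();   # explicit empty list: sees nothing
    return "[$implicit] [$empty]";
}

print outer(1, 2, 3), "\n";        # prints "[1,2,3] []"

# Hypothetical prototyped routine: ($) imposes scalar context on its argument.
sub one_scalar($) { return scalar @_ }

my @pair       = (10, 20);
my $with_proto = one_scalar(@pair);    # prototype applies: one argument arrives
my $bypassed   = &one_scalar(@pair);   # & skips the prototype: two arguments arrive

print "$with_proto $bypassed\n";       # prints "1 2"
```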
Variables you access from inside a function that
haven't been declared private to that function are not necessarily
global variables; they still follow the normal block-scoping rules
of Perl. As explained in Section 2.5 in Chapter 2, this
means they look first in the surrounding lexical scope (or scopes)
for resolution, then on to the single package scope. From the
viewpoint of a subroutine, then, any my variables
from an enclosing lexical scope are still perfectly visible.
For example, the bumpx function below has
access to the file-scoped $x lexical variable
because the scope where the my was declared--the
file itself--hasn't been closed off before the subroutine is
defined:
# top of file
my $x = 10;         # declare and initialize variable
sub bumpx { $x++ }  # function can see outer lexical variable
C and C++ programmers would probably think of $x
as a "file static" variable. It's private as
far as functions in other files are concerned, but global from the
perspective of functions declared after the my. C
programmers who come to Perl looking for what they would call
"static variables" for files or functions find no such keyword in
Perl. Perl programmers generally avoid the word "static", because
static systems are dead and boring, and because the word is so
muddled in historical usage.
Although Perl doesn't include the word "static" in its lexicon, Perl programmers have no problem creating variables that are private to a function and persist across function calls. There's just no special word for these. Perl's richer scoping primitives combine with automatic memory management in ways that someone looking for a "static" keyword might never think of trying.
Lexical variables don't get automatically garbage
collected just because their scope has exited; they wait to get
recycled until they're no longer used, which is
much more important. To create private variables that aren't
automatically reset across function calls, enclose the whole
function in an extra block and put both the my
declaration and the function definition within that block. You can
even put more than one function there for shared access to an
otherwise private variable:
{
    my $counter = 0;
    sub next_counter { return ++$counter }
    sub prev_counter { return --$counter }
}
As always, access to the lexical variable is limited to code
within the same lexical scope. The names of the two functions, on
the other hand, are globally accessible (within the package), and,
since they were defined inside $counter's scope,
they can still access that variable even though no one else
can.
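A short usage sketch shows the shared, persistent state; the @trace array is a name introduced here only to collect the results:

```perl
use strict;
use warnings;

{
    my $counter = 0;
    sub next_counter { return ++$counter }
    sub prev_counter { return --$counter }
}

# Both functions update the same hidden $counter, and its value
# persists from one call to the next.
my @trace = (next_counter(), next_counter(), prev_counter(), next_counter());

print "@trace\n";   # prints "1 2 1 2"

# Mentioning $counter out here would be a compile-time error under
# "use strict", because the name is not in scope outside the block.
```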
If this function is loaded via require or use, then this is
probably just fine. If it's all in the main program, you'll need to
make sure any run-time assignment to my is
executed early enough, either by putting the whole block before your
main program, or alternatively, by placing a BEGIN or INIT
block around it
to make sure it gets executed before your program starts:
BEGIN {
    my @scale = ('A' .. 'G');
    my $note  = -1;
    sub next_pitch { return $scale[ ($note += 1) %= @scale ] };
}
The BEGIN doesn't affect the subroutine
definition, nor does it affect the persistence of any lexicals used
by the subroutine. It's just there to ensure the variables get
initialized before the subroutine is ever called. For more on
declaring private and global variables, see my
and our respectively in Chapter 29. The BEGIN
and INIT constructs are
explained in Chapter 18.
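To illustrate the ordering, this sketch deliberately calls next_pitch from main-program code that appears above the block in the file. The @first_three array is a hypothetical name used only for the demonstration.

```perl
use strict;
use warnings;

# This main-program code sits above the block in the file, yet the
# lexicals are already initialized because BEGIN runs at compile time.
my @first_three = map { next_pitch() } 1 .. 3;
print "@first_three\n";   # prints "A B C"

BEGIN {
    my @scale = ('A' .. 'G');
    my $note  = -1;
    sub next_pitch { return $scale[ ($note += 1) %= @scale ] }
}
```

With a plain bare block in place of BEGIN, the same early calls would run before the assignments to @scale and $note had executed.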