There are several ways to create references, most of which we will describe before explaining how to use (dereference) the resulting references.
You can create a reference to any named variable or subroutine with a backslash. (You may also use it on an anonymous scalar value like 7 or "camel", although you won't often need to.) This operator works like the & (address-of) operator in C--at least at first glance.
Here are some examples:
    $scalarref = \$foo;
    $constref  = \186_282.42;
    $arrayref  = \@ARGV;
    $hashref   = \%ENV;
    $coderef   = \&handler;
    $globref   = \*STDOUT;
The backslash operator can do more than produce a single reference. It will generate a whole list of references if applied to a list. See Section 8.3.6 for details.
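As a minimal sketch of the backslash forms just described, the following uses the built-in ref function to report what kind of thing each reference points to. The variable names and sample data are assumptions made for illustration, and the dereferencing syntax on the last lines is explained later in the chapter.

```perl
use strict;
use warnings;

# Hypothetical data standing in for $foo, @ARGV, %ENV, and &handler.
my $foo  = 'camel';
my @list = (1, 2, 3);
my %env  = (HOME => '/home/larry');
sub handler { return 'handled' }

my $scalarref = \$foo;
my $arrayref  = \@list;
my $hashref   = \%env;
my $coderef   = \&handler;
my $constref  = \186_282.42;      # reference to an anonymous value

# ref() names the type of referent behind each reference.
my @kinds = map { ref($_) } ($scalarref, $arrayref, $hashref, $coderef, $constref);
print "@kinds\n";                 # SCALAR ARRAY HASH CODE SCALAR

# Applied to a parenthesized list, the backslash yields a list of references.
my ($x, $y) = (10, 20);
my @refs = \($x, $y);             # same as (\$x, \$y)
${ $refs[0] } = 11;               # assigning through the reference changes $x
print "$x\n";                     # 11
```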
In the examples just shown, the backslash operator merely makes a duplicate of a reference that is already held in a variable name--with one exception. The 186_282.42 isn't referenced by a named variable--it's just a value. It's one of those anonymous referents we mentioned earlier. Anonymous referents are accessed only through references. This one happens to be a number, but you can create anonymous arrays, hashes, and subroutines as well.
You can create a reference to an anonymous array with square brackets:
$arrayref = [1, 2, ['a', 'b', 'c', 'd']];
Here we've composed an anonymous array of three elements, whose final element is a reference to an anonymous array of four elements (depicted in Figure 8.2). (The multidimensional syntax described later can be used to access this. For example, $arrayref->[2][1] would have the value "b".)
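A short sketch of the nested structure above, using the arrow syntax that the chapter covers later. The helper variable names are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $arrayref = [1, 2, ['a', 'b', 'c', 'd']];

my $outer_count = scalar @$arrayref;             # elements in the outer array
my $inner_count = scalar @{ $arrayref->[2] };    # elements in the nested array
my $value       = $arrayref->[2][1];             # second element of the nested array

print "$outer_count $inner_count $value\n";      # 3 4 b
```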
We now have one way to represent the table at the beginning of the chapter:
    $table = [ [ "john", 47, "brown", 186 ],
               [ "mary", 23, "hazel", 128 ],
               [ "bill", 35, "blue",  157 ] ];
Square brackets work like this only where the Perl parser is expecting a term in an expression. They should not be confused with the brackets in an expression like $array[6]--although the mnemonic association with arrays is intentional. Inside a quoted string, square brackets don't compose anonymous arrays; instead, they become literal characters in the string. (Square brackets do still work for subscripting in strings, or you wouldn't be able to print string values like "VAL=$array[6]". And to be totally honest, you can in fact sneak anonymous array composers into strings, but only when embedded in a larger expression that is being interpolated. We'll talk about this cool feature later in the chapter because it involves dereferencing as well as referencing.)
You can create a reference to an anonymous hash with braces:
    $hashref = {
        'Adam'   => 'Eve',
        'Clyde'  => $bonnie,
        'Antony' => 'Cleo' . 'patra',
    };
For the values (but not the keys) of the hash, you can freely mix other anonymous array, hash, and subroutine composers to produce as complicated a structure as you like.
We now have another way to represent the table at the beginning of the chapter:
    $table = {
        "john" => [ 47, "brown", 186 ],
        "mary" => [ 23, "hazel", 128 ],
        "bill" => [ 35, "blue",  157 ],
    };
That's a hash of arrays. Choosing the best data structure is a tricky business, and the next chapter is devoted to it. But as a teaser, we could even use a hash of hashes for our table:
    $table = {
        "john" => { age => 47, eyes => "brown", weight => 186, },
        "mary" => { age => 23, eyes => "hazel", weight => 128, },
        "bill" => { age => 35, eyes => "blue",  weight => 157, },
    };
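To make the hash-of-hashes layout concrete, here is a small sketch that looks up one field and then prints the whole table. It relies on dereferencing syntax covered later in the chapter, and the variable names other than $table are assumptions.

```perl
use strict;
use warnings;

my $table = {
    "john" => { age => 47, eyes => "brown", weight => 186, },
    "mary" => { age => 23, eyes => "hazel", weight => 128, },
    "bill" => { age => 35, eyes => "blue",  weight => 157, },
};

# A single lookup: outer key picks the person, inner key picks the field.
my $mary_age = $table->{mary}{age};
print "mary is $mary_age\n";

# Walk the table in name order, pulling three fields with a hash slice.
my @names = sort keys %$table;
for my $name (@names) {
    printf "%-5s %2d %-6s %3d\n",
        $name, @{ $table->{$name} }{qw(age eyes weight)};
}
```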
As with square brackets, braces work like this only where the Perl parser is expecting a term in an expression. They should not be confused with the braces in an expression like $hash{key}--although the mnemonic association with hashes is (again) intentional. The same caveats apply to the use of braces within strings.
There is one additional caveat which didn't apply to square brackets. Since braces are also used for several other things (including blocks), you may occasionally have to disambiguate braces at the beginning of a statement by putting a + or a return in front, so that Perl realizes the opening brace isn't starting a block. For example, if you want a function to make a new hash and return a reference to it, you have these options:
    sub hashem {        { @_ } }   # Silently WRONG -- returns @_.
    sub hashem {       +{ @_ } }   # Ok.
    sub hashem { return { @_ } }   # Ok.
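The two safe forms can be exercised as follows. This is only a sketch: the subroutine names hashem_plus and hashem_return are assumptions introduced so that both variants can coexist in one file.

```perl
use strict;
use warnings;

# Both forms force the brace to be read as an anonymous hash composer.
sub hashem_plus   {       +{ @_ } }
sub hashem_return { return { @_ } }

my $h1 = hashem_plus(Tail => 'short');
my $h2 = hashem_return(Ears => 'long');

print ref($h1), " ", ref($h2), "\n";        # HASH HASH
print "$h1->{Tail} $h2->{Ears}\n";          # short long
```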
You can create a reference to an anonymous subroutine by using sub without a subroutine name:
    $coderef = sub { print "Boink!\n" };  # Now &$coderef prints "Boink!"
Note the presence of the semicolon, required here to terminate the expression. (It isn't required after the more common usage of sub NAME {} that declares and defines a named subroutine.) A nameless sub {} is not so much a declaration as it is an operator--like do {} or eval {}--except that the code inside isn't executed immediately. Instead, it just generates a reference to the code, which in our example is stored in $coderef. However, no matter how many times you execute the line shown above, $coderef will still refer to the same anonymous subroutine.[2]
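A brief sketch of calling through a code reference, plus one common use for anonymous subroutines: a table of actions keyed by name. The %dispatch hash and its entries are assumptions for illustration, and the calling syntax is covered in detail later in the chapter.

```perl
use strict;
use warnings;

my $coderef = sub { return "Boink!" };

# Two equivalent ways to call through the reference.
my $via_ampersand = &$coderef();
my $via_arrow     = $coderef->();
print "$via_ampersand $via_arrow\n";

# Hypothetical dispatch table: hash values are anonymous subroutines.
my %dispatch = (
    double => sub { return $_[0] * 2 },
    square => sub { return $_[0] ** 2 },
);
my $result = $dispatch{square}->(7);
print "$result\n";                          # 49
```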
Subroutines can also return references. That may sound trite, but sometimes you are supposed to use a subroutine to create a reference rather than creating the reference yourself. In particular, special subroutines called constructors create and return references to objects. An object is simply a special kind of reference that happens to know which class it's associated with, and constructors know how to create that association. They do so by taking an ordinary referent and turning it into an object with the bless operator, so we can speak of an object as a blessed reference. There's nothing religious going on here; since a class acts as a user-defined type, blessing a referent simply makes it a user-defined type in addition to a built-in one. Constructors are often named new--especially by C++ programmers--but they can be named anything in Perl.
Constructors can be called in any of these ways:
    $objref = Doggie::->new(Tail => 'short', Ears => 'long');  #1
    $objref = new Doggie:: Tail => 'short', Ears => 'long';    #2
    $objref = Doggie->new(Tail => 'short', Ears => 'long');    #3
    $objref = new Doggie Tail => 'short', Ears => 'long';      #4
The first and second invocations are the same. They both call a function named new that is supplied by the Doggie module. The third and fourth invocations are the same as the first two, but are slightly more ambiguous: the parser will get confused if you define your own subroutine named Doggie. (Which is why people typically stick with lowercase names for subroutines and uppercase for modules.) The fourth invocation can also get confused if you've defined your own new subroutine and don't happen to have done either a require or a use of the Doggie module, either of which has the effect of declaring the module. Always declare your modules if you want to use #4. (And watch out for stray Doggie subroutines.)
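The text does not show what a Doggie constructor looks like, so the following is only a sketch under assumptions: a hash-based object, made-up default values, and a hypothetical tail accessor. It illustrates how bless ties an ordinary anonymous hash to a class.

```perl
use strict;
use warnings;

package Doggie;                   # hypothetical class for illustration

sub new {
    my ($class, %args) = @_;
    # Assumed defaults; caller-supplied pairs override them.
    my $self = { Tail => 'long', Ears => 'short', %args };
    return bless $self, $class;   # the hash now knows it is a Doggie
}

sub tail { return $_[0]{Tail} }   # hypothetical accessor

package main;

my $objref = Doggie->new(Tail => 'short', Ears => 'long');   # form #3
my $again  = Doggie::->new(Tail => 'curly');                 # form #1

print ref($objref), "\n";         # Doggie
print $objref->tail, " ", $again->tail, "\n";
```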
See Chapter 12 for a discussion of Perl objects.
References to filehandles or directory handles can be created by referencing the typeglob of the same name:
    splutter(\*STDOUT);

    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);

    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }
If you're passing around filehandles, you can also use the bare typeglob to do so: in the example above, you could have used *STDOUT or *STDIN instead of \*STDOUT and \*STDIN.
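To check that both spellings behave the same without touching the terminal, this sketch points a filehandle at an in-memory buffer. The handle name OUT and the $buffer variable are assumptions; splutter is the subroutine from the text.

```perl
use strict;
use warnings;

# Open a named filehandle onto an in-memory string so the output can be inspected.
my $buffer = '';
open(OUT, '>', \$buffer) or die "Cannot open in-memory handle: $!";

sub splutter {
    my $fh = shift;
    print $fh "her um well a hmmm\n";
}

splutter(\*OUT);    # pass a reference to the typeglob
splutter(*OUT);     # pass the bare typeglob
close(OUT);

print $buffer;      # the same line, written twice
```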
Although you can usually use typeglobs and references to typeglobs interchangeably, there are a few places where you can't. Simple typeglobs can't be blessed into objectdom, and typeglob references can't be passed back out of the scope of a localized typeglob.
When generating new filehandles, older code would often do something like this to open a list of files:
    for $file (@names) {
        local *FH;
        open(*FH, $file) || next;
        $handle{$file} = *FH;
    }
That still works, but now it's just as easy to let an undefined variable autovivify an anonymous typeglob:
    for $file (@names) {
        my $fh;
        open($fh, $file) || next;
        $handle{$file} = $fh;
    }
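The loop above can be tried without real files by opening each handle on an in-memory string. In this sketch the %content hash and its keys are assumptions that stand in for files on disk, and the three-argument form of open is used.

```perl
use strict;
use warnings;

# Hypothetical "files": name => contents, read through in-memory handles.
my %content = (
    alpha => "first line of alpha\n",
    beta  => "first line of beta\n",
);

my %handle;
for my $name (sort keys %content) {
    my $fh;                                       # undefined, so open autovivifies it
    open($fh, '<', \$content{$name}) or next;
    $handle{$name} = $fh;
}

# Each stored handle is an independent reference to an anonymous typeglob.
my %first_line;
for my $name (sort keys %handle) {
    $first_line{$name} = readline($handle{$name});
    print "$name (", ref($handle{$name}), "): $first_line{$name}";
}
```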
With indirect filehandles, it doesn't matter whether you use typeglobs, references to typeglobs, or one of the more exotic I/O objects. You just use a scalar that--one way or another--gets interpreted as a filehandle. For most purposes, you can use either a typeglob or a typeglob reference almost indiscriminately. As we admitted earlier, there is some implicit dereferencing magic going on here.
In unusual circumstances, you might not know what type of reference you need when your program is written. A reference can be created by using a special syntax, affectionately known as the *foo{THING} syntax. *foo{THING} returns a reference to the THING slot in *foo, which is the symbol table entry holding the values of $foo, @foo, %foo, and friends.
    $scalarref = *foo{SCALAR};       # Same as \$foo
    $arrayref  = *ARGV{ARRAY};       # Same as \@ARGV
    $hashref   = *ENV{HASH};         # Same as \%ENV
    $coderef   = *handler{CODE};     # Same as \&handler
    $globref   = *foo{GLOB};         # Same as \*foo
    $ioref     = *STDIN{IO};         # Er…
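The "same as" comments can be checked by comparing references numerically, since two references to the same referent compare equal. This sketch assumes package variables and a subroutine all named foo; the exact class name reported for *STDIN{IO} varies between Perl versions, so it is printed rather than assumed.

```perl
use strict;
use warnings;

# Package (not lexical) variables, so they live in the *foo typeglob.
our $foo = 'scalar';
our @foo = (1, 2, 3);
our %foo = (k => 'v');
sub foo { return 'code' }

my $same_scalar = (*foo{SCALAR} == \$foo);
my $same_array  = (*foo{ARRAY}  == \@foo);
my $same_hash   = (*foo{HASH}   == \%foo);
my $same_code   = (*foo{CODE}   == \&foo);
my $same_glob   = (*foo{GLOB}   == \*foo);

print "all slots match\n"
    if $same_scalar && $same_array && $same_hash && $same_code && $same_glob;

# The IO slot holds an object; its class name depends on the Perl version.
my $io_class = ref(*STDIN{IO});
print "STDIN's IO slot is a $io_class\n";
```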
All of these are self-explanatory except for *STDIN{IO}. It yields the actual internal IO::Handle object that the typeglob contains, that is, the part of the typeglob that the various I/O functions are actually interested in. For compatibility with previous versions of Perl, *foo{FILEHANDLE} is a synonym for the hipper *foo{IO} notation.
In theory, you can use a *HANDLE{IO} anywhere you'd use a *HANDLE or a \*HANDLE, such as for passing handles into or out of subroutines, or storing them in larger data structures. (In practice, there are still some wrinkles to be ironed out.) Their advantage is that they access only the real I/O object you want, not the whole typeglob, so you run no risk of clobbering more than you want to through a typeglob assignment (although if you always assign to a scalar variable instead of to a typeglob, you'll be okay). One disadvantage is that there's no way to autovivify one as of yet.[3]
    splutter(*STDOUT);
    splutter(*STDOUT{IO});

    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }
Both invocations of splutter() print "her um well a hmmm".
The *foo{THING} thing returns undef if that particular THING hasn't been seen by the compiler yet, except when THING is SCALAR. It so happens that *foo{SCALAR} returns a reference to an anonymous scalar even if $foo hasn't been seen yet. (Perl always adds a scalar to any typeglob as an optimization to save a bit of code elsewhere. But don't depend on it to stay that way in future releases.)
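A small probe of this behavior, assuming a symbol named never_used that appears nowhere else in the program. As the text cautions, the SCALAR case is an implementation detail, so the sketch reports what it finds rather than relying on it.

```perl
use strict;
use warnings;
no warnings 'once';               # the symbol is deliberately mentioned only here

# Slots that nothing has ever populated.
my $array_slot  = *never_used{ARRAY};
my $hash_slot   = *never_used{HASH};
my $code_slot   = *never_used{CODE};
my $scalar_slot = *never_used{SCALAR};

for my $probe ([ARRAY => $array_slot], [HASH => $hash_slot],
               [CODE  => $code_slot],  [SCALAR => $scalar_slot]) {
    my ($thing, $slot) = @$probe;
    printf "%-6s %s\n", $thing, defined $slot ? 'reference' : 'undef';
}
```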
A final method for creating references is not really a method at all. References of an appropriate type simply spring into existence if you dereference them in an lvalue context that assumes they exist. This is extremely useful, and is also What You Expect. This topic is covered later in this chapter, where we'll discuss how to dereference all of the references we've created so far.
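As a preview of that behavior, here is a minimal sketch in which references spring into existence simply because they are used as if they were already there. The names %registry and @matrix and their contents are assumptions for illustration.

```perl
use strict;
use warnings;

my %registry;

# $registry{pets} is undefined, but using it as an array reference creates one.
push @{ $registry{pets} }, 'camel';

# Each missing level of a nested lookup is created on assignment.
$registry{owner}{name}{first} = 'Larry';

# The same happens with arrays of arrays.
my @matrix;
$matrix[1][2] = 'x';

print ref($registry{pets}), " ", ref($registry{owner}{name}), " ", ref($matrix[1]), "\n";
# ARRAY HASH ARRAY
```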
[2] But even though there's only one anonymous subroutine, there may be several copies of the lexical variables in use by the subroutine, depending on when the subroutine reference was generated. These are discussed later in Section 8.3.7.
[3] Currently, open my $fh autovivifies a typeglob instead of an IO::Handle object, but someday we may fix that, so you shouldn't rely on the typeglobbedness of what open currently autovivifies.