Global Variables and Packages

Variable scope, in general, refers to the availability and existence of a variable in your script. A global scope, then, refers to a variable that is available to all parts of your script and exists as long as your script is running. Local scope, in turn, refers to a variable that has some limited scope and might pop in and out of existence depending on what part of your Perl script is currently executing.

We'll start this chapter with a look at global variables and global scope. In the next section we'll turn to local scope.

The Problem with Globals

Throughout this book, we've been using global variables (and global scope) in most of the examples, with the exception of the occasional local variable in a subroutine. There's a good reason for this: Global variables are easy to create and easy to use. Any variable that is not explicitly declared with my (or, as you'll learn soon, local) automatically becomes a global variable, regardless of the context in which you use it, and is available at any point in that script.
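As a minimal sketch of that rule, the short script below sets one variable without my and one with my inside a subroutine. The subroutine name set_counts and the variable names are made up for illustration.

```perl
#!/usr/bin/perl -w

sub set_counts {
    $hits = 10;          # no my: a global, even though it is set inside a sub
    my $misses = 5;      # my: private to set_counts
    return $hits + $misses;
}

set_counts();
print "hits is $hits\n"; # the global is still visible out here
```

The variable $hits survives after set_counts returns, whereas $misses exists only while the subroutine is running.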

This makes simple scripts easy to write. But as your scripts become larger and larger, the use of global variables becomes more and more problematic. There are more variables to keep track of and more variables taking up space as your script runs. Global variables that mysteriously appear deep in the body of a script can be difficult to debug—which part of the script is updating this variable at what time? Do you even remember what that global variable does?

As you develop larger scripts that use global variables, there's also a significant danger that you will accidentally use a name for a variable that already exists somewhere else in your script. Although this problem can make it more difficult to debug your scripts, it's a particularly difficult problem if you have to incorporate your scripts into someone else's code, or if you want to create reusable Perl libraries. The risk of clashing variable names across multiple bodies of code becomes a very real and very painful problem.
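Here's a small sketch of how such a clash might look. The subroutine count_long_words and the sample data are hypothetical; the point is only that the subroutine and the main loop both happen to use a global called $i.

```perl
#!/usr/bin/perl -w

# Hypothetical helper: counts the words longer than three characters.
sub count_long_words {
    my ($text) = @_;
    my @words = split ' ', $text;
    my $long = 0;
    for ($i = 0; $i < @words; $i++) {    # oops: the same global $i
        $long++ if length($words[$i]) > 3;
    }
    return $long;
}

@lines = ('one two three', 'four five', 'six');
$processed = 0;
for ($i = 0; $i < @lines; $i++) {
    count_long_words($lines[$i]);        # quietly changes $i
    $processed++;
}
print "Processed $processed of ", scalar(@lines), " lines\n";
```

Because the subroutine leaves $i set to 3, the main loop stops after the first line. Declaring the counter with my inside the subroutine would keep the two loops apart.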

The best way to control the potential of name clashes with promiscuous global variables is to not use them. Organize all your scripts in subroutines, and declare all your variables local to those subroutines. Data that needs to be shared between subroutines can be passed from subroutine to subroutine via arguments. Many software developers argue that all programs—no matter how small, no matter how specialized the purpose—should be written this way, that the avoidance of global variables is Good Software Design.
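As a rough illustration of that style, the following sketch keeps every variable inside a subroutine and hands data around through arguments and return values. The subroutine names total and average and the sample scores are invented for this example.

```perl
#!/usr/bin/perl -w

# Hypothetical helper: takes a list of numbers, returns their total.
sub total {
    my @numbers = @_;            # private copy of the arguments
    my $sum = 0;                 # exists only while total() runs
    foreach my $n (@numbers) {
        $sum += $n;
    }
    return $sum;
}

# Hypothetical helper: takes a list of numbers, returns their average.
sub average {
    my @numbers = @_;
    return 0 unless @numbers;    # avoid dividing by zero
    return total(@numbers) / scalar(@numbers);
}

# Hypothetical entry point: even the sample data stays local.
sub main {
    my @scores = (90, 80, 70);
    print "Total: ", total(@scores), "\n";
    print "Average: ", average(@scores), "\n";
}

main();
```

Nothing here depends on a variable defined outside the subroutine that uses it, so any of these subroutines could be copied into another script without worrying about name clashes.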

In real life, however, everyone uses the occasional global variable, particularly in situations where every part of the script must access the same stored data in a list or other structure. Which brings us to another method for organizing and managing global variables: packages.

What's a Package?

A package is a way to bundle up your global variables so they aren't really global anymore; they're only global inside a given package. In other words, each package defines its own variable name space. Packages enable you to control which global variables are available to other packages, thereby avoiding the problems of clashing variable names across different bits of code.

Chances are good you'll only need to develop your own packages if you're creating Perl modules or libraries or classes in object-oriented Perl programming—all topics that are too advanced for this book. Even if you don't develop your own, packages are all around you as you write and run your Perl scripts, whether you know it or not. With that in mind, having at least a passing understanding of how packages work will help you not only understand how Perl looks at variables and name spaces in your own scripts, but also how importing code from modules works as well. And it'll help in the event that your code grows to the point where it does become a library or a module later. Learn the rules now, and it'll be that much easier later.

How Packages and Variables Work

The core concept of the package is that every line of Perl is compiled in the current package, which can be the default package or one you define yourself. Each package has a set of variable names (called a symbol table) that determines whether a variable is available for Perl to use and what the value of that variable currently is. The symbol table contains all the names you could possibly use in your script—scalars, arrays, hashes, and subroutines.

If you refer to a variable name (say, $x) in your script, Perl will try to find that variable in the current package's symbol table. If you tell Perl to switch packages, it will look up $x in the new package. You can also refer to a variable by its complete package name, which tells Perl which symbol table to look in for the variable and its value. A complete package name consists of the name of the package, two colons, and the name of the variable. The special character indicating whether the variable is a scalar, an array, a hash, and so on is still included at the start of the complete name.

So, for example, $main::x would be used to refer to the scalar variable $x stored in the main package, whereas $x would refer to the scalar variable $x in the current package, which may or may not be the same variable stored in the package main. They have the same variable name but because they live in different packages, they have different values (or they might not exist at all in other packages).

The package main is the default package that you've been using all along, although you haven't been aware of it. When you create and use global variables in scripts that do not define an explicit package, you're actually creating variables that belong to the main package. You might have seen this come up in the error messages you get when you make a mistake; some of them refer to "Name main::foo used only once...". All this time, whenever I've been referring to global variables, I've actually been slightly dishonest: Global variables are actually package variables belonging to the package main.
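A short sketch can make the lookup rules concrete. It assigns to $x once before any package is declared and once inside a package called mypack (a name chosen only for illustration), and then prints the variable by its plain name and by both complete names.

```perl
#!/usr/bin/perl -w

$x = 'set in main';         # no package declared yet, so this is $main::x

package mypack;             # hypothetical package name
$x = 'set in mypack';       # a different variable: $mypack::x

print "Current package: $x\n";    # looked up in mypack's symbol table
print "main:   $main::x\n";
print "mypack: $mypack::x\n";
```

The plain $x follows whichever package is current, whereas the complete names always point at one particular symbol table.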

Note that the existence of global variables in packages doesn't make them any less difficult to manage if you use lots of them. A hundred global variables defined in main are going to be just as difficult to use as a hundred global variables defined in a new package called mypackage. Using local variables and passing data between subroutines is still good programming practice for your own bit of the Perl world.

To create a new package, or switch between packages, use the package declaration:

package mypack;  # define or switch to a package other than main

Package definitions have a scope similar to that of local variables: A package definition inside a subroutine or block applies to the code from that point to the end of the subroutine or block, after which Perl reverts to the enclosing package. A package definition at the start of a script defines a new package for that entire script.
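To see that scoping in action, here's a brief sketch that switches packages inside a bare block. The package name mypack and the variable $name are only examples.

```perl
#!/usr/bin/perl -w

$name = 'main value';            # $main::name

{
    package mypack;              # in effect only until the closing brace
    $name = 'mypack value';      # $mypack::name
    print "Inside the block:  $name\n";
}

# Back in the enclosing package (main), so $name is $main::name again.
print "Outside the block: $name\n";
print "Still reachable:   $mypack::name\n";
```

After the closing brace, the plain name $name refers to the variable in main once more, but the value stored in mypack has not gone anywhere; it just needs its complete name.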

As I said earlier, you'll define your own packages most often when you're writing code libraries or modules of your own. The most important things to understand about packages are

  • Every variable name and value is stored in a symbol table for the current package.

  • You can refer to a variable name as either a plain variable name for the current package, or with a complete package name. This determines which symbol table Perl checks for the variable's value.

  • The default package is package main.

A Simple Package Example

Here's a simple example of a program that uses packages. Its only purpose is to demonstrate how packages affect variable scope. In this program, I create three packages, and demonstrate how the scope of global variables is affected by those package declarations. The source code is in Listing 13.1.

Listing 13.1. A Program that Demonstrates the Use of Packages
#!/usr/bin/perl -w

package foo;
print "Package foo ...\n";

$bar = 'bar';    # this is $foo::bar

package red;
print "Package red ...\n";

$blue = 'blue';  # this is $red::blue

print "Value of \$bar: $bar\n";            # $red::bar, which was never set
print "Value of \$blue: $blue\n";
print "Value of \$foo::bar: $foo::bar\n";

Let's look at the source code. First, create a package called foo, and initialize a variable in it, $bar. Then, initialize a second package, called red. In it, initialize a new variable called $blue, and print out the values of $bar, $blue, and $foo::bar. When the value of $bar is printed, nothing is displayed because there is no variable called $bar in the current package. On the other hand, $blue's value is printed out as you would expect, because it is native to the current package. Finally, when $bar is qualified with a package name, the value is printed correctly. By using the complete package name, the value can be retrieved from the other package.

Under ordinary circumstances, you probably wouldn't create more than one package in a single program like I did in this case. However, by presenting the code in this way, you can see how packages can be used to encapsulate variables so they don't get in the way of variable names that you can use elsewhere in your code.

Using Non-Package Global Variables

One way of creating well-mannered globals is to create packages. However, there is one other trick that is very common for creating globals: Declare your globals as local to your own script.

If you declare your global variables with the my modifier, as you do local variables inside subroutines, then those global variables won't belong to any package, not even main. They'll still be global to your script, and available to all parts of that script (including subroutines), but they won't conflict with any other variables inside actual packages, including those declared in package main. Global variables declared with my also have a slight performance advantage because Perl doesn't have to access the package's symbol table each time the variable is referenced.

Caution

There's one case where you don't want to use non-package globals, and that's if you're writing scripts to run in Apache's mod_perl environment. Because of the way mod_perl works, if you do so, the main body of your script will run as though it's an anonymous subroutine, and your global variables won't be visible to the other subroutines in your program.


Because of these advantages of using non-package globals, it's recommended Perl practice to do this for all but the simplest of Perl scripts. To make use of it in your own scripts, simply include the my modifier when you declare your global variables, the same way you do for the locals:

my %names = (); # global hash for the names
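The following sketch shows that a my variable at the top of a script and a package variable of the same name are two separate things. The names $title and show_title are hypothetical.

```perl
#!/usr/bin/perl -w

my $title = 'declared with my';             # file-scoped, in no package
$main::title = 'stored in package main';    # a separate package variable

# Hypothetical helper: it can see the my variable because it is
# defined after the declaration, in the same file.
sub show_title {
    return "Plain \$title is: $title";
}

print show_title(), "\n";
print "\$main::title is: $main::title\n";
```

Inside the script, the plain name $title refers to the my variable; the package variable can be reached only by its complete name.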

Perl also includes a special feature to help you make sure you're using all your variables properly, as either local variables inside subroutines, or as non-package variables if they're global. Add the line use strict at the top of your script to turn on this feature, like this:

#!/usr/local/bin/perl -w
use strict;

my $x = '';  # OK
@foo = ();   # will cause script to exit

# remainder of your script

With use strict in place, when you run your script Perl will complain about stray global variables and exit. Technically, use strict will complain about any variables that are not declared with my, referenced using package names, or imported from elsewhere. Consider it an even stricter version of the variable warnings you get with -w.
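As a small sketch of what use strict accepts, the script below declares one variable with my and refers to another by its complete package name. The variable names are invented for this example, and the offending line is left commented out so that the script still runs.

```perl
#!/usr/bin/perl -w
use strict;

my $count = 3;               # OK: declared with my
$main::label = 'items';      # OK: referenced with its package name

print "$count $main::label\n";

# $total = $count * 2;       # not OK: uncommenting this line makes
                             # Perl refuse to run the script
```

If you do uncomment the last assignment, expect a complaint that the global symbol $total requires an explicit package name.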

There's one odd side effect of the use strict command worth mentioning: It will complain about the placeholder variable in a foreach loop, for instance, $key in this example:

foreach $key (keys %names) {
  ...
}

Technically, this is because the foreach variable, which is implicitly local, is actually a global variable pretending to be a local. It's still only available to the foreach, and behaves as if it were local, but internally it's declared differently (you'll learn about the two different kinds of local variables later in this lesson, in "Local Variables with my and local"). To fix the complaint from use strict, simply put a my in front of the variable name:

foreach my $key (keys %names) {
  ...
}
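Here's a complete version of that loop you can run under use strict. The contents of %names are sample data made up for this sketch.

```perl
#!/usr/bin/perl -w
use strict;

# Sample data: last name => first name
my %names = (Lovelace => 'Ada', Hopper => 'Grace');

foreach my $key (sort keys %names) {
    print "$names{$key} $key\n";
}

# $key does not exist out here; it lived only inside the loop.
```

The keys are sorted so that the output order is the same on every run.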

Throughout the rest of this book, all the examples will use use strict and declare global variables as my variables.
