Using Perl Modules

Half of learning how to script Perl successfully is knowing how to write the code. The other half is knowing when NOT to write the code. Or, to be more exact, knowing when to take advantage of the built-in Perl functions or when to use libraries and modules that other people have written to make your programming life easier.

If you've got a given task to do in Perl that sounds kind of complex but that other programmers might well have tackled in the past, chances are really good that someone else has beaten you to it. And, in the Perl tradition, they may very well have packaged their code in a module or a library and made it available for public downloading. If it's a really common task, that module might even be part of the standard Perl distribution. Much of the time, all you have to do to make use of these libraries is import them into your own Perl scripts and add some code to customize them for your particular situation. That's it. You're done.

Throughout many of the lessons in the remainder of this book, we'll be looking at a number of modules that you have available to you as part of the standard Perl distribution, as part of the distribution for your particular platform, or as downloadable files from the Comprehensive Perl Archive Network. In this section, then, you'll learn the basics: what a module is, and how to import and use modules in your own scripts.

Some Terminology

But first, some terminology. I've been bandying about the terms function, library, module, and package, and it's worth noting what all these terms mean.

A built-in function, as I've noted before, is a function that comes with Perl and is available for you to use in your script. You don't need to do anything special to call a built-in function; you can call it anytime you want to.

A Perl library is a collection of Perl code intended for reuse in other scripts. Old-style Perl libraries were nothing more than this, and were used in other Perl scripts by importing them with the require operator. More recently, the term library has come to be equivalent to a Perl module, with old-style libraries and require falling out of favor. More about importing code with require later in this lesson.

A Perl module is a collection of reusable Perl code. Perl modules define their own packages, and have a set of variables defined by that package. To use a module, you import that module into your script with the use operator, and then you can (usually) refer to the subroutines (and, sometimes, variables) in that module as you would any other subroutines (or variables). Some modules are object-oriented, which means using them in slightly different ways, but the basic procedure is the same.

In addition, there are also pragmas, which are special kinds of Perl modules that affect both how Perl compiles and runs a script (whereas most Perl modules only affect its actual execution). Otherwise, they behave the same. use strict is an example of the use of a pragma. We'll look at pragmas later in this lesson in the section entitled “Using Pragmas.”

Getting Modules

Where are these modules found? If you have Perl, you already have a number of modules to play with and you don't need to do anything further. The standard Perl library is the collection of modules, pragmas, and scripts that are distributed with the standard Perl distribution. Different versions of Perl for different platforms might have a different standard library—the Windows version of Perl, for example, has a set of modules for accessing capabilities specific to Windows machines. Although you have to explicitly import these modules to use them, you don't have to download them or install them.

The “official” set of library modules is fully described in the perlmodlib man page (older versions of Perl list them in the perlmod man page), and includes modules for the following:

  • Interfaces to databases

  • Simple networking

  • Language extensions, module and platform-specific development support, dynamic module and function loading

  • Text processing

  • Object-oriented programming

  • Advanced math

  • File, directory, and command-line argument handling

  • Error management

  • Time

  • Locale (for creating international scripts)

For the Windows version of Perl, there are standard Win32 modules for Windows extensions, including the following:

  • Win32::Process: Creation and use of Windows processes

  • Win32::OLE: For OLE automation

  • Win32::Registry: Access to the Windows Registry

  • Win32::Service: Management of Windows services

  • Win32::NetAdmin: Remotely create users and groups

MacPerl includes Mac modules for accessing the Mac Toolbox, including AppleEvents, dialogs, files, fonts, movies, Internet Config, QuickDraw, and speech recognition (whew!). We'll explore many of the Mac and Win32 modules on Day 18, “Perl and the Operating System.”

In addition to the standard Perl library, there is the Comprehensive Perl Archive Network, otherwise known as CPAN. CPAN is a collection of Perl modules, scripts, documentation, and other tools relating to Perl. Perl programmers all over the world write modules and submit them to CPAN. To use modules from CPAN, you'll need to download and install them into your Perl distribution; sometimes you'll also need to compile them with a C compiler. In this section, let's start with the modules you already have installed; we'll look more at CPAN later in this lesson.

Importing Modules

To gain access to the code stored in any module from your script, you import that module using the use operator and the name of that module:

use CGI;

use Math::BigInt;

use strict;

The use operator imports the subroutine and variable names defined and exported by that module into the current package so that you can use them as though you had defined them yourself (in other words, that module has a package, which defines a symbol table; importing that module loads that symbol table into your script's current symbol table).

Module names can take on many forms—a single name refers to a single module, for example, CGI, strict, POSIX, or Env. A name with two or more parts separated by double-colons refers to parts of larger modules (to be exact, they refer to packages defined inside other packages). So, for example, Math::BigInt refers to the BigInt part of the Math module, or Win32::Process refers to the Process part of the Win32 module. Module names conventionally start with an initial capital letter.

When you use the use operator to import a module's code into your script, Perl searches for that module's file in a special set of directories called the @INC array. @INC is a Perl special variable that contains any directories specified on the Perl command line with the -I option, followed by the standard Perl library directories (/usr/lib/perl5 and various subdirectories on Unix, perllib on Windows, MacPerl:lib on Macintosh), followed by . to represent the current directory. The final contents of @INC will vary from system to system, and different versions of Perl might have different values for @INC by default. On my system, a Linux machine running Perl 5.005_02, the contents of @INC are

/usr/lib/perl5/5.00502/i486-linux
/usr/lib/perl5/5.00502
/usr/lib/perl5/site_perl/5.005/i486-linux
/usr/lib/perl5/site_perl/5.005
.
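To check the search path on your own system, a short script along these lines prints each entry of @INC, one per line. This is only a sketch; the output depends entirely on your installation, so none is shown here. Note also that Perl 5.26 and later leave . out of @INC by default, so you may not see the current directory in your list.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Copy the list of directories Perl searches when it loads a module.
my @search_path = @INC;

# Print one directory per line.
print "$_\n" for @search_path;
```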

If you want to import a module that's stored in some other directory, you can use the lib pragma at the start of your script to indicate that directory in your script:

use lib '/home/mystuff/perl/lib/';
use Mymodule;
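As a rough illustration of what the lib pragma does, the following sketch adds a directory and then looks at @INC. The directory name is the hypothetical one from the example above; it does not have to exist for the demonstration to run.

```perl
use strict;
use warnings;

# Hypothetical directory, used for illustration only.
use lib '/home/mystuff/perl/lib';

# use lib adds its directories to the beginning of @INC,
# so they are searched before the standard library directories.
my @matches = grep { $_ eq '/home/mystuff/perl/lib' } @INC;

print "First directory searched: $INC[0]\n";
print "Our directory is in \@INC\n" if @matches;
```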

Perl module files have the same names as the modules themselves, and have the extension .pm. Many modules contain just plain Perl code with some extra framework to make them behave like modules, so if you're curious about how they work you can go ahead and look at the code. Other modules, however, contain or make use of compiled code specific to the platform on which they run, and are not quite as educational to look at.

Note

These latter modules use what are called Perl extensions, sometimes called XSUBS, which enable you to tie compiled C libraries into Perl modules. Working with extensions is way too advanced for this book, but I'll provide some pointers to more information on Day 20, “Odds and Ends.”


Using Modules

Using use enables you to import a module. Now what? Well, now you can use the code that module contains. How you actually do that depends on whether the module you're using is a plain module or an object-oriented one (you can find out from the documentation for that module whether it's object-oriented or not).

For example, take the module Carp, part of the standard Perl library. The Carp module provides the carp, croak, and confess subroutines for generating error messages, similar to how the built-in functions warn and die behave. By importing the Carp module, the three subroutine names are imported into the current package, and you gain access to those subroutines as if they were built-in functions or subroutines you defined yourself:

use Carp;
open(OUT, ">outfile") || croak "Can't open outfile\n";

Note

The carp and croak subroutines, by the way, are analogous to warn and die in that they are used to print an error message (and then exit, in the case of die and croak). The difference is that if they're used inside a module, the carp subroutines are better at reporting where an error occurred. In the case where a script imports a module that has a subroutine that calls carp, carp will report the package and line number of the enclosing script, not the line number inside of the module itself (which would not be very useful for debugging). We'll come back to carp when we look at CGI scripts on Day 16, “Using Perl for CGI Scripting.”
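To see the difference in where the error is reported, here is a minimal sketch. Mymodule and its halve subroutine are hypothetical, and the package is defined in the same file only to keep the example self-contained; normally it would live in its own .pm file.

```perl
use strict;
use warnings;

# Hypothetical module, defined inline for this sketch.
package Mymodule;
use Carp;

sub halve {
    my ($n) = @_;
    # croak blames the line that called halve(), not this line.
    croak "halve() needs a number" unless defined $n;
    return $n / 2;
}

package main;

# eval traps the fatal error so the message can be inspected.
my $result = eval { Mymodule::halve() };
my $error  = $@;

print "Caught: $error";
```

The line number in the message should be that of the Mymodule::halve() call in the main script, which is usually the line you want to look at when debugging.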


Object-Oriented Modules

Some modules are object-oriented. Object-oriented programming involves designing systems so that the components are treated as “objects” that comprise both the data associated with them and the code used to perform actions associated with the object. The description of what an object looks like is referred to as a class, and you deal with objects by creating instances of those classes. For example, in the object-oriented world, you might have a class called Date, which contains the current date and time, and methods that convert the date and time to alternate calendars, or allow you to display the date in various formats. You could create an instance of Date containing the current date and time, and call a method of the Date class called convert_to_julian to get the Julian version of the date. Perl implements the idea of classes and instances of those classes; the means by which it does so are described on Day 20. Modules are often written in an object-oriented manner, so here's what you need to know to use them.

In object-oriented programming parlance, functions and subroutines are called methods, and they're executed in a different way than normal functions. So, for modules you import that are object-oriented, you'd use special syntax to get at that code. Your script itself doesn't have to be object-oriented, so don't worry about that; you can mix and match object-oriented Perl with regular Perl. Here's an example from the CGI module, which we'll look at in more detail on Day 16:

use CGI;
my $x = new CGI;
my $name = $x->param("myname");
print $x->header();
print $x->start_html("Hello!");
print "<H2>Hello $name!</H2>\n";
print $x->end_html();

Weird-looking, isn't it? If you're familiar with object-oriented programming, what's going on here is that you're creating a new CGI object, storing the reference to that object in the $x variable, and then calling methods using the -> notation.

If you're not familiar with object-oriented programming, this is going to seem odd. Here's the short version:

  • The line my $x = new CGI; creates a new CGI object and stores a reference to it in the variable $x. $x doesn't hold an ordinary scalar value like a number or a string; it holds a reference to a special CGI object.

  • To call a subroutine defined by the module, otherwise known as a method, you use the variable holding the object, the -> operator, and the name of the subroutine. So the line $x->header() calls the header() method on the object stored in $x.

Follow this same notation for other subroutines in that same module, and you'll be fine. We'll get back to references and object orientation later, on Day 19, “Working with References.”
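The same notation works with any object-oriented module. As a sketch you can run without a Web server (and without CGI, which may not be installed with newer versions of Perl), here is the Math::BigInt module mentioned earlier. The variable names are arbitrary.

```perl
use strict;
use warnings;
use Math::BigInt;

# Create an object. Math::BigInt->new(...) is another way of
# writing new Math::BigInt(...).
my $big = Math::BigInt->new("123456789012345678901234567890");

# Call methods with the -> operator. copy() returns a new object,
# so badd() (add) changes the copy and leaves $big alone.
my $sum = $big->copy->badd(1);

# bstr() returns the number as a string.
print "Original: ", $big->bstr, "\n";
print "Plus one: ", $sum->bstr, "\n";
```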

Modules from the Inside Out

If the documentation for the module you're using is insufficient, you can discover which variables and subroutines from the module are available to your program by doing a bit of investigative work. When someone writes a module, they have to list the variables and subroutines (both of which are considered symbols in a module) that are exported to Perl programs that use the module.

There are two key variables in a module that indicate which symbols (variables and subroutines) are available to programs that import the module: @EXPORT and @EXPORT_OK. The symbols listed in @EXPORT are available to the importing program automatically. The symbols listed in @EXPORT_OK are available for import, but are not imported automatically when you import the module. I'll explain how to import them in the next section. These two variables are your first clues to how an undocumented module works.

Let's look at the Carp module, which I discussed earlier. You can examine the @EXPORT and @EXPORT_OK variables to see which symbols are exported. In the case of Carp, you can also just read the documentation by typing perldoc Carp, but that's not the point of this exercise.

@EXPORT = qw(confess croak carp);
@EXPORT_OK = qw(cluck verbose);

As you can see from those two variables, there are three symbols that are imported automatically with this module, and two that are optional imports. One thing you'll need to look out for is symbols that are not variables or methods that you can use, but instead are flags that control how the module is used in your program. Flags will only appear in the @EXPORT_OK variable. If a symbol is imported automatically, it doesn't work very well as a flag that only takes effect when it is imported manually. The only way to tell flags from other optionally exported symbols is by examining the source code of the module.

The exported symbols make up the public interface of the module. Using the package qualifier, you can always access global variables and subroutines within the module if you choose to do so, but keep in mind that accessing the internal structure of the module in a way that the author did not intend can cause results that you didn't plan for. Generally speaking, it's better to use a module in the way the author intended.
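If you'd rather not open the module's source file, one possible approach is to load the module and print the two variables using the package qualifier. The exact lists depend on the version of Carp you have installed, so the output may contain more names than the listing shown earlier.

```perl
use strict;
use warnings;
use Carp;

# @EXPORT and @EXPORT_OK are ordinary package variables,
# so the package qualifier reaches them.
my @default_names  = @Carp::EXPORT;
my @optional_names = @Carp::EXPORT_OK;

print "Imported by default:  @default_names\n";
print "Available on request: @optional_names\n";
```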

Importing Symbols by Hand

Importing a module using use brings in the variables and subroutine names defined and exported by the module in question. The words “and exported” in the preceding sentence are important: some modules export all their variables, some export only a subset, and others export none at all. With use, you gain access to all the code in the module, but not necessarily as if you had defined it yourself. Sometimes you'll have to do some extra work to gain access to the parts of the module you want to use.

If a module you're using doesn't export any variables or subroutine names, you'll find out soon enough: when you try to use those names, you'll get undefined errors from Perl. There are two ways to gain access to the features of that module:

  • You can refer to those variables or subroutines using the full-package name.

  • You can import those symbols (variable or subroutine names) by hand, in the use statement.

With the first method, all you need to do to access a module's variables or subroutines is add the package name to those variables or subroutines. This is especially useful if you've got variables or subroutines of the same name in your own code and you don't want those names to clash:

# call additup, defined in Mymodule module
$result = &Mymodule::additup(@vals);

# change value of the $total variable (defined in Mymodule)
$Mymodule::total = $result;

Note that if the package name itself contains two colons, you just add the whole thing before the variable name:

$Text::Wrap::columns = 5;
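Here is a small runnable sketch of the first method using Text::Wrap, which is part of the standard library. The empty parentheses after the module name load the module without importing any names, so everything has to be reached through the package name. The sample text and the column width are arbitrary.

```perl
use strict;
use warnings;
use Text::Wrap ();   # empty list: load the module, import nothing

# Set the module's variable using its full package name.
$Text::Wrap::columns = 20;

my $text = "The quick brown fox jumps over the lazy dog.";

# Call the subroutine by its full name. The first two arguments are
# the indent for the first line and the indent for the lines after it.
my $wrapped = Text::Wrap::wrap("", "", $text);

print $wrapped, "\n";
```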

The second method, importing all the symbols you need, is easier if you intend to call a module's subroutines a lot in your own code; importing them means you don't have to include the package name each time. To import any name from a module into the current package, add those names to the end of the use command. A common way to do this is to use the qw function, which enables you to leave off the quotes and add new symbols easily:

use Mymodule qw(oneSub twoSub threeSub);

Note that the words inside qw are separated by spaces, not commas. A name with no special character in front of it is taken to be a subroutine name (oneSub means the same thing as &oneSub). To import a variable, include its variable character with the name:

use Mymodule qw($count);

Some modules are defined so that they have a set of variables that are imported by default, and a set that are only imported by request (if you look at the code, the array @EXPORT commonly defines the symbols exported by default; @EXPORT_OK defines the optional symbols). The easiest way to import both these things is to issue two calls to use: one for the defaults, and one for any optional symbols:

use Mymodule;    # import all default names
use Mymodule qw(this that);   # import this and that as well
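For a runnable example of importing names by hand, List::Util from the standard library works well, because it exports nothing by default. The sample values are arbitrary.

```perl
use strict;
use warnings;

# List::Util exports nothing by default, so each subroutine
# you want has to be named in the use statement.
use List::Util qw(sum max);

my @vals    = (3, 9, 4);
my $total   = sum(@vals);   # 16
my $largest = max(@vals);   # 9

print "Total: $total, largest: $largest\n";
```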
						

Import Tags

Some of the larger modules enable you to import only a subset of their features, for greater efficiency. These modules use what are called import tags. If you have a module that uses import tags, you can find out which tags a module supports by checking the documentation of that module. Alternately, the %EXPORT_TAGS hash in the source code will show you which tags you can use (tags are exported from the module, and imported into your code).

To import a subset of a module, add the tag to the end of the use statement:

use CGI qw(:standard);

Once again, qw is the quote word function. Although you could just quote the import tag itself, this format is more commonly used and enables you to add more tags easily if you need them.

How a module behaves—if it uses import tags, or if it has variables or subroutines that must be imported by hand—is defined by the module itself, and (hopefully) documented. You might try running perldoc on the module to extract any online documentation the author of that module provided, or check the readme files that came with the module to make sure you're using it right.
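As a sketch of import tags with a standard module other than CGI, the Fcntl module groups its file-locking constants under the :flock tag. The example also prints the keys of %Fcntl::EXPORT_TAGS, which is one way to list the tags a module offers; the set of tags you see may differ between Perl versions.

```perl
use strict;
use warnings;

# The :flock tag imports the file-locking constants as a group.
use Fcntl qw(:flock);

# The tag names are the keys of the module's %EXPORT_TAGS hash.
my @tags = sort keys %Fcntl::EXPORT_TAGS;
print "Tags offered by Fcntl: @tags\n";

# LOCK_SH and LOCK_EX can now be used without the package name.
my @lock_constants = (LOCK_SH, LOCK_EX);
print "LOCK_SH = $lock_constants[0], LOCK_EX = $lock_constants[1]\n";
```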

Using Pragmas

The line use strict for restricting global variables to the current script is an example of a special kind of module called a pragma. A pragma is a module that affects how Perl behaves at both compile time and runtime (as opposed to regular modules, which provide code for Perl just at runtime). The strict pragma, in particular, tells Perl to be strict in its parsing of your code, and to disallow various unsafe constructs.

The notions of compile-time and runtime might initially seem odd if you remember back to Day One, where I noted that Perl isn't a compiled language like C or Java. In those languages, you run a compiler to convert your source code into bytecode or an executable file; then you execute that new file to actually run the program. With Perl, the script is your executable. There's no intermediate compiled step.

In reality, I fibbed a little on Day One. Perl does indeed compile its source code, just as C and Java do. But then it goes ahead and runs the result; there is no intermediate executable file hanging around.

What this means is that there are tasks that Perl does during compile time (as the script is compiled), and tasks that happen during runtime (as the result is executing). At compile-time, Perl checks for syntax and verifies that everything it needs to run that script is available. At runtime, the script actually executes and operates on the data you give it. As you grow more advanced in your knowledge of Perl, you'll learn that when some operations happen is as important as whether they happen at all.
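One way to watch the two phases in action is with a BEGIN block, a Perl feature not otherwise covered in this lesson: code inside BEGIN runs as soon as Perl has compiled it, before any ordinary statement in the script executes. The following is only a sketch, and the variable name is arbitrary.

```perl
use strict;
use warnings;

our @order;   # records which piece of code ran first

# An ordinary statement: it runs only when the script executes.
push @order, "runtime statement";

# A BEGIN block runs during compilation, so it runs first
# even though it appears later in the file.
BEGIN { push @order, "BEGIN block"; }

print "$_\n" for @order;
```

Although the BEGIN block comes second in the file, its entry ends up first in the list.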

But back to pragmas. As I mentioned, a pragma is a bit of imported code that affects how Perl operates both during compile time and runtime. Unlike most imported code in modules and libraries, which only affect a script's runtime behavior, pragmas can change the whole look and feel of your code and how Perl looks at it.

Perl has very few pragmas in its standard library (unlike modules, of which there are dozens). Pragmas are conventionally spelled in all lowercase letters, to differentiate them from modules. You use pragmas just like you do modules, with the use operator:

#!/usr/local/bin/perl -w
use strict;
use diagnostics;

Each of the pragmas can be used at the top of your script to affect the entire script. They can also be used inside a block, in which case they only change the behavior of that enclosing block. At the end of the block the normal script behavior resumes.

Some of the more useful Perl pragmas include strict and diagnostics. You can find a more complete listing of available pragmas in the perlmodlib man page (perlmod in older versions of Perl) under the section “Pragmatic Modules.”

strict

The strict pragma, which you've seen already, restricts various unsafe constructs in your scripts. It watches for misplaced global variables, barewords (unquoted strings with no definitions in the language or defined as subroutines), and symbolic references (which we'll look at in greater detail on Day 19). You can restrict only some of these unsafe constructs by including the strings 'vars', 'subs', or 'refs' after the use strict, like this:

use strict 'vars';

You can turn off strictness for various blocks of code using the no strict command (with optional 'vars', 'subs', and 'refs', if necessary). The no strict applies only until the end of the enclosing block (subroutine, conditional, loop, or bare block), and then Perl reverts to the usual amount of strictness.
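The following sketch shows no strict 'refs' limited to a bare block. It uses a symbolic reference (a variable whose name is held in a string), which is covered properly on Day 19; the variable names here are made up for the example.

```perl
use strict;
use warnings;

our $greeting = "hello";      # a global (package) variable
my  $var_name = "greeting";   # its name, held in a string

my $value;
{
    no strict 'refs';         # relax only 'refs', only in this block
    $value = ${"main::$var_name"};   # symbolic reference: allowed here
}

# Outside the block, strict 'refs' is back in force, so the same
# expression is a fatal runtime error, which eval traps.
my $still_strict = !eval { my $again = ${"main::$var_name"}; 1 };

print "Inside the block: $value\n";
print "Strict again outside the block\n" if $still_strict;
```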

diagnostics

The diagnostics pragma is used to turn on Perl's verbose warnings. It works similarly to the -w switch; however, it can be used to limit diagnostic warnings and messages to specific parts of your script enclosed in blocks. You cannot turn off diagnostics at the compile phase of your script (as you can with strict using no strict), but you can control runtime warnings using the enable and disable subroutines:

use diagnostics;
#
# various bits of code

disable diagnostics;
# code that usually produces run-time warnings
enable diagnostics;

# continue on as usual...

The English Module

The English module is worth mentioning specifically because, like the pragmas, it offers a way to change how Perl interprets your script, but unlike the pragmas, it operates at runtime, and you can't limit its scope to a block. The English module is used to make Perl less terse in its built-in special variable names. Although true Perl wizards can gleefully litter their scripts with variables like $_, $", $\, and so on, most mere mortals have trouble keeping all but the most common special variables straight. That's where use English can help, by aliasing various longer variable names to the shorter versions.

For example, the variable $, is known as the output field separator, and it's used to separate items in print statements. With use English, you can still refer to the variable as $, if you like, or you can also use the names $OUTPUT_FIELD_SEPARATOR or $OFS. All three will work equally well.

You can find a list of Perl's special variables and all their names (both using the English module and not) in the perlvar man page.
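As a quick sketch of the aliasing, the following script sets the output field separator through its long name and then reads it back through the other two names. The separator value and the variable names are arbitrary.

```perl
use strict;
use warnings;
use English;

# Assigning through the long name changes $, itself.
$OUTPUT_FIELD_SEPARATOR = "-";

# All three names refer to the same variable.
my $via_short_name = $,;
my $via_ofs        = $OFS;

print "a", "b", "c";   # the items are separated by "-"
print "\n";
```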
