Going Deeper

As I mentioned at the beginning of this lesson, I could easily go on for pages and pages and chapters and chapters about all the different aspects of CGI. Documenting all of CGI.pm itself would give us a much larger lesson than I've got here. There are a few other features of CGI.pm I'd like to mention, however. All of these features are covered in the documentation for CGI.pm (perldoc CGI will show it to you), as well as on the Web page at http://stein.cshl.org/WWW/software/CGI/.

If you need further details about CGI itself, feel free to visit the Web pages or to read the book I mentioned at the beginning of this lesson. The Usenet newsgroup comp.infosystems.www.authoring.cgi can also provide assistance in getting your CGI scripts to work on different platforms.

Using CGI Variables

When your CGI script gets called, along with the data from the browser, the Web server also provides several values in its environment that relate to the script itself, to the Web server, and to the system running the browser that submitted the form in the first place. On Unix systems, these are environment variables that you can access from inside your Perl script using the %ENV hash. Other Web servers may have different ways of passing in these variables. However, CGI.pm provides subroutines to access these variables in a way that works across platforms and Web servers. You aren't required to use any of these subroutines in your CGI scripts, but you might find the data useful.

Table 10.1 shows the CGI variable subroutines for CGI.pm.

Table 10.1. CGI Variable Subroutines
Subroutine          What It Gives You
accept()            A list of MIME types the browser will accept.
auth_type()         The authentication type (usually 'basic').
path_info()         Path information encoded into the script URL (if used).
path_translated()   Same as path_info() expanded into a full pathname.
query_string()      Arguments to the CGI script tagged on to the URL.
raw_cookie()        Returns raw cookie information. Use the cookie() subroutines to manage this information.
referer()           The URL of the page that called this script (note the incorrect spelling).
remote_addr()       The IP address of the host that called this script (the browser's host).
remote_ident()      The user's ID, but only if the system is running ident (not common).
remote_host()       The name of the host that called this script.
remote_user()       The name of the user that called this script (usually only set if the user has logged in using authentication).
request_method()    The Web server method the script was called with (for example, GET or POST).
script_name()       The name of the script.
server_name()       The host name of the Web server that called this script.
server_software()   The name and version of the Web server software.
virtual_host()      For servers that support virtual hosts, the name of the virtual host that is running this script.
server_port()       The network port the server is using (usually 80).
user_agent()        The browser name and version that called this script, for example Mozilla/4.x (Win95).
user_name()         The name of the user who called this script (almost never set).
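
To see where these values come from, here's a minimal sketch that reads a handful of them straight from %ENV, the way you could on a Unix Web server. The cgi_var() helper and the '(not set)' placeholder are my own inventions for illustration; the pairing of each environment variable with a Table 10.1 subroutine is approximate, because CGI.pm may add defaults or extra processing of its own.

```perl
use strict;
use warnings;

# Hypothetical helper: look up a CGI environment variable, falling back
# to a placeholder when the Web server did not set it.
sub cgi_var {
    my ($name) = @_;
    return defined $ENV{$name} ? $ENV{$name} : '(not set)';
}

# A few standard CGI environment variables, each paired with the
# Table 10.1 subroutine that reports roughly the same information.
my @vars = (
    [ 'REQUEST_METHOD',  'request_method()' ],
    [ 'REMOTE_ADDR',     'remote_addr()'    ],
    [ 'HTTP_USER_AGENT', 'user_agent()'     ],
    [ 'SERVER_NAME',     'server_name()'    ],
    [ 'SCRIPT_NAME',     'script_name()'    ],
);

# Plain-text response so the values are easy to read in a browser.
print "Content-type: text/plain\n\n";
for my $pair (@vars) {
    my ($env_name, $sub_name) = @$pair;
    printf "%-16s %-17s %s\n", $env_name, $sub_name, cgi_var($env_name);
}
```

Run from the command line, most of these will show up as not set, because no Web server has filled in the environment. That is a good reminder that the values are only meaningful when the script is called as a CGI script.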

POST Versus GET

CGI scripts can be called by a browser in one of two forms: POST and GET. GET submissions encode the form elements into the URL itself; POST sends the form elements over the standard input. GET can also be used to submit a form without actually having a form—for example, you could have a link in which the URL contained the hard-coded form elements to submit.

Inside your CGI script, CGI.pm handles both methods and makes the submitted values available through param(), so you don't have to worry about which method was used to submit the form. If a form was submitted with POST and you specifically want the parameters that were encoded in the URL, use url_param() instead of param().
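
If you're curious what CGI.pm is saving you from, here's a rough sketch of handling the two methods by hand, using only core Perl. It is not how CGI.pm is implemented; parse_pairs() and raw_form_data() are hypothetical helpers, and the sketch ignores details such as repeated field names and file uploads.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a URL-encoded string into name => value
# pairs. Repeated names keep only the last value in this sketch.
sub parse_pairs {
    my ($encoded) = @_;
    my %form;
    for my $pair (split /[&;]/, $encoded) {
        my ($name, $value) = split /=/, $pair, 2;
        next unless defined $name && length $name;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                               # '+' stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;   # decode %XX hex escapes
        }
        $form{$name} = $value;
    }
    return %form;
}

# Hypothetical helper: fetch the raw form data. POST data arrives on
# standard input; GET data arrives in the QUERY_STRING variable.
sub raw_form_data {
    my $method = $ENV{REQUEST_METHOD} || 'GET';
    if ($method eq 'POST') {
        my $data = '';
        read(STDIN, $data, $ENV{CONTENT_LENGTH} || 0);
        return $data;
    }
    return defined $ENV{QUERY_STRING} ? $ENV{QUERY_STRING} : '';
}

my %form = parse_pairs(raw_form_data());
print "Content-type: text/plain\n\n";
print "$_ = $form{$_}\n" for sort keys %form;
```

Either way the script ends up with the same hash of form values, which is the point: once the data has been decoded, the rest of your script doesn't need to care how it arrived.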

Redirection

Sometimes the result of a CGI script isn't a raw HTML file, but rather a pointer to an existing HTML file on this server or elsewhere. CGI.pm supports this result with the redirect() subroutine:

print redirect('http://www.anotherserver.com/anotherfile.html');

The redirect tells the user's browser to retrieve a specific page on the Web, rather than to display any HTML. Because a CGI script that uses redirect doesn't create a new Web page, you should not combine a redirect and any HTML code in the output from a CGI script.
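
Here's a small sketch of the "one or the other" rule, written with core Perl only so you can see the kind of response involved. The redirect_response() and html_response() helpers and the $page_moved flag are hypothetical; the header that CGI.pm's redirect() generates is more complete than this hand-built one.

```perl
use strict;
use warnings;

# Hypothetical helper: build a bare-bones redirect response by hand.
# There is no HTML body, only headers pointing at the new location.
sub redirect_response {
    my ($url) = @_;
    return "Status: 302 Found\nLocation: $url\n\n";
}

# Hypothetical helper: build a small HTML response.
sub html_response {
    my ($message) = @_;
    return "Content-type: text/html\n\n"
         . "<html><body><p>$message</p></body></html>\n";
}

# Send one kind of response or the other, never both.
my $page_moved = 1;    # illustrative condition; a real script would compute this
if ($page_moved) {
    print redirect_response('http://www.anotherserver.com/anotherfile.html');
} else {
    print html_response('Nothing has moved.');
}
```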

Cookies and File Upload

CGI.pm also enables you to manage cookie values and to handle files that get uploaded via the file upload feature of HTML forms. For managing cookies, see the cookie() subroutine in CGI.pm. File upload works similarly to ordinary form elements; the param() subroutine returns the filename that was entered in the form. That value also serves as an open filehandle, which you can read from line by line using the standard Perl mechanisms.

See the CGI.pm documentation for information on both these things.

CGI Scripts and Security

Every CGI script is a potential security hole on your Web server. A CGI script runs on your server based on input from any random person out on the Web. Depending on how carefully you write your scripts and how determined someone is, your CGI scripts could offer openings up to and including allowing malicious users to break into your system and irreparably damage it.

The best way to prevent a problem is to understand it and take steps to avoid it. A great starting point is the World Wide Web security FAQ at http://www-genome.wi.mit.edu/WWW/faqs/www-security-faq.html. Perl also includes a feature, called taint mode, which prevents you from using nonsecure data in a way that could do harm to your system. You'll learn more about taint mode on Day 20, “Odds and Ends.”
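
As a small taste of the idea, here's a sketch of checking a value before you trust it. Under taint mode (turned on with Perl's -T switch), data from outside the script is considered tainted until you extract the part you want with a regular expression capture. The clean_filename() helper and its rule for what counts as a safe name are assumptions of mine for illustration, not a complete security policy.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only simple file names made of letters,
# digits, underscores, dots, and hyphens, with no directory parts.
# Returns the captured name, or nothing if the input is not acceptable.
# When the script runs under perl -T, the captured value is untainted.
sub clean_filename {
    my ($input) = @_;
    return unless defined $input;
    return $1 if $input =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]*)\z/;
    return;
}

# Try the check on one harmless value and two hostile ones.
for my $candidate ('report.txt', '../etc/passwd', 'x; rm -rf /') {
    my $safe = clean_filename($candidate);
    print defined $safe ? "ok:       $safe\n" : "rejected: $candidate\n";
}
```

The design choice worth noticing is that the pattern describes what is allowed rather than trying to list everything that is dangerous. Anything that doesn't fit the pattern is simply refused.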

Embedding Perl in Web Servers

Each time a Perl CGI script is called from a Web server, Perl is called to execute that script. For very busy Web servers, running lots and lots of CGI scripts can mean running many copies of Perl at once, and a considerable load for the machine acting as the Web server. To help with performance, many Web servers provide a mechanism for embedding the Perl interpreter inside the Web server itself, so that scripts that run on Web servers no longer run as actual CGI scripts. Instead, they run as if they were Web server libraries, reducing the overhead and startup time for each script. In many cases, you don't even have to modify your CGI scripts to get them to work under this system.

Different Web servers on different platforms provide different mechanisms for doing this. You'll need to check the documentation that comes with your Web server to see if Perl-based CGI scripts can be embedded, and if so, where to get the tools or modules to embed them.

If you're using an ISAPI-based Web server on Windows (such as IIS), you'll need the Perl for ISAPI package (sometimes called PerlIIS). This package is part of ActiveState's version of Perl for Windows, and is installed with that package. You can also get it separately from ActiveState's Web site at http://www.activestate.com.

If you're using the open source Apache Web server, mod_perl is an Apache module that embeds the Perl interpreter in the Apache Web server. While the most obvious feature this gives you is better CGI performance, it also allows you access, with Perl, to all of Apache's internal extension APIs, allowing almost infinite customization of the Web server itself. Find out more about the Apache Web server at http://www.apache.org, and the Apache/Perl Integration Project, the developers of mod_perl, at http://perl.apache.org.

If you want to run CGI scripts under mod_perl, you should use the Apache::Registry module, which is designed for porting scripts that use CGI.pm to the mod_perl environment. You can read the documentation for the module at http://www.perldoc.com/cpan/Apache/Registry.html.
