Let's finish up with a slightly more useful script: webbuild.pl takes a simple text file as an argument, prompts you for some basic values, and then spits out an HTML version of your text file. It's not a very sophisticated HTML generator—it won't handle embedded boldface or other formatting, it doesn't handle links or images; really, it does little other than stick paragraph tags in the right place, let you specify the foreground and background colors, and give you a simple heading and a link to your e-mail address. But it does give you a basic HTML template to work from.
In addition to converting text input to HTML, the webbuild.pl script prompts you for several other values, including:
The title of the page (<title>…</title> in HTML).
Background and text colors (here I've limited it to the built-in colors supported by HTML, and we'll verify the input to make sure that it's one of those colors). This part also includes some rudimentary online help.
An initial heading (<h1>…</h1> in HTML).
An e-mail address, which will be inserted as a link at the bottom of the final HTML page.
Here's what running the webbuild.pl script looks like, with some sample prompts and the resulting output:
% webbuild.pl janeeyre.txt
Enter the title to use for your web page: Charlotte Bronte, Jane Eyre, Chapter One
Enter the background color (? for options): ?
One of: white, black, red, green, blue, orange, purple, yellow, aqua, gray, silver, fuchsia, lime, maroon, navy, olive, or Return for none
Enter the background color (? for options): white
Enter the text color (? for options): black
Enter a heading: Chapter One
Enter your email address: [email protected]
******************************
<html>
<head>
<title>Charlotte Bronte, Jane Eyre, Chapter One</title>
</head>
<body bgcolor="white" text="black">
<h1>Chapter One</h1>
<p>There was no possibility of taking a walk that day. We had been wandering, indeed, in the leafless shrubbery an hour in the morning;
... more text deleted for space ...
fireside, and with her darlings about her (for the time neither
</p>
<hr>
<address><a href="mailto:[email protected]">[email protected]</a></address>
</body>
</html>
The resulting HTML, as the previous output shows, could then be copied and pasted into a text editor, saved, and loaded into a Web browser to see the result (Figure 7.1 shows that result).
Later in this book (on Day 15, “Managing I/O,” specifically), I'll show you a way to output the data to a file, rather than to the screen.
One note about the text file you give to webbuild.pl to convert: The script assumes the data you give it is a file of paragraphs, with each paragraph separated by a blank line. For example, here are the contents of the file janeeyre.txt, which I used for the example output:
There was no possibility of taking a walk that day. We had been wandering, indeed, in the leafless shrubbery an hour in the morning; but since dinner (Mrs. Reed, when there was no company, dined early) the cold winter wind had brought with it clouds so sombre, and a rain so penetrating, that further outdoor exercise was now out of the question.

I was glad of it: I never liked long walks, especially on chilly afternoons: dreadful to me was the coming home in the raw twilight, with nipped fingers and toes, and a heart saddened by the chidings of Bessie, the nurse, and humbled by the consciousness of my physical inferiority to Eliza, John, and Georgiana Reed.

The said Eliza, John, and Georgiana were now clustered round their mama in the drawing-room: she lay reclined on a sofa by the fireside, and with her darlings about her (for the time neither
Listing 7.3 shows the code for our script.
There's little that's overly complex, syntax-wise, in this script; it doesn't even use any arrays or hashes (it doesn't need to; there's nothing that really needs storing or processing here). It's just a lot of loops and tests.
There are at least a few points to be made about why I organized the script the way I did, so we can't end this lesson quite yet. Let's start with the large foreach loop starting in line 18.
This loop handles the prompt for both the background and text colors. Because both of these prompts behave in exactly the same way, I didn't want to have to repeat the same code for each one (particularly given that there's a really huge if test in lines 31 through 47). Later, you'll learn how to put this kind of repetitive code into a subroutine, and then just call the subroutine twice. But for now, because we know a lot about loops at this point, and nothing about subroutines, I opted for a sneaky foreach loop.
The loop will run twice, once for the string 'background' and once for the string 'text'. We'll use these strings for the prompts, and later to make sure the right value gets assigned to the right variable ($bgcolor or $text).
Inside the foreach loop, we have another loop, an infinite while loop, which will repeat each prompt until we get acceptable input (input verification is always a good programming practice). At the prompt, the user has three choices: enter one of the sixteen built-in colors, hit Return (or Enter) to use the default colors, or type ? for a list of the choices.
The tests in lines 25 through 50 process each of these choices. First, ?. In response to a question mark, all we have to do is print a helpful message, and then use next to drop down to the next iteration of the while loop (that is, redisplay the prompt and wait for more data).
The next test (starting in line 30) makes sure we have correct input: either a Return, in which case the input is empty (line 30), or one of the sixteen built-in colors. Note that the tests all compare against lowercase color names, which would seem overly limiting if the user typed BLACK or Black or some other odd combination of uppercase and lowercase. But fear not; in line 23, we used the lc function to lowercase the input, which collapses all those case variations into one (and conveniently doesn't affect input of ?).
If the input matches any of those seventeen cases, we call last in line 47 to drop out of the while loop (keep in mind that next and last, in the absence of labels, refer to the nearest enclosing loop: the while, not the foreach). If the input doesn't match, we drop to the final else case in line 48, print an error message, and restart the while loop.
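Here's a minimal sketch of that prompt loop; it is not the code from Listing 7.3 itself. To keep it self-contained, I've made a few assumptions: a hypothetical @answers array stands in for what the user would type at the keyboard, the color list is cut down to three, and everything the script would print is collected in a $log variable first.

```perl
use strict;
use warnings;

# Hypothetical canned responses; the real script reads these from <STDIN>
# (and would chomp the trailing newline before testing them).
my @answers = ('?', 'Puce', 'WHITE');
my $log = '';      # collects everything the script would print
my $in  = '';

while (1) {        # repeat the prompt until the input is acceptable
    $log .= "Enter the background color (? for options): ";
    $in = lc(shift @answers);     # lowercase first; '?' is unaffected

    if ($in eq '?') {             # online help, then prompt again
        $log .= "One of: white, black, red, or Return for none\n";
        next;
    } elsif ($in eq '' or $in eq 'white' or $in eq 'black' or $in eq 'red') {
        last;                     # acceptable input: leave the while loop
    } else {
        $log .= "$in is not a valid color.\n";
    }
}
print $log, "\nAccepted: $in\n";
```

With these three responses, the loop prints the help text for ?, rejects puce, and finally accepts WHITE (lowercased to white).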
The final test in the foreach loop determines whether we have a value for the background color or for the text color, and assigns that value to the appropriate variable.
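The shape of that outer loop might look something like the following sketch. The variable names $bgcolor and $text come from the script; the loop variable $thing and the @answers array (which stands in for input that has already been prompted for and validated) are my own assumptions for the sake of a runnable example.

```perl
use strict;
use warnings;

my ($bgcolor, $text) = ('', '');
my @answers = ('white', 'black');   # hypothetical, already-validated input

foreach my $thing ('background', 'text') {
    # webbuild.pl would prompt with "Enter the $thing color" and run
    # its validating while loop here.
    my $in = shift @answers;

    if ($thing eq 'background') {   # first pass through the loop
        $bgcolor = $in;
    } else {                        # second pass: the text color
        $text = $in;
    }
}
print "bgcolor=$bgcolor text=$text\n";
```

The same string that builds the prompt also decides which variable gets the value, which is what lets one loop body serve both prompts.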
The final part of the script, starting on line 68 and continuing to the end, prints the top part of our HTML file, reads in and converts the text file indicated on the command line to HTML, and finishes up with the last part of the HTML file. Note the tests in lines 69 and 70: if there are no values for $bgcolor or $text, we leave those attributes off the HTML <body> tag altogether. (A simpler version would just leave them there, as bgcolor="" or text="", but that doesn't look as nice in the output.)
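One way to build the <body> tag so that empty values are left out is sketched below. This is an illustration under my own assumptions, not necessarily how Listing 7.3 does it: the $body variable is hypothetical, and the sample values pretend the user chose white for the background and pressed Return for the text color.

```perl
use strict;
use warnings;

my $bgcolor = 'white';
my $text    = '';          # the user pressed Return: no text color

my $body = '<body';
if ($bgcolor ne '') {      # add the attribute only if there's a value
    $body .= qq( bgcolor="$bgcolor");
}
if ($text ne '') {
    $body .= qq( text="$text");
}
$body .= ">\n";
print $body;
```

Here the result is a <body> tag with a bgcolor attribute and no text attribute at all.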
You'll also note the use of qq. You learned about qq in passing way back in the “Going Deeper” section on Day 2, “Working with Strings and Numbers.” The qq operator is a way of creating a double-quoted string, with variable interpolation and all, without actually using any double quotes as delimiters. I used it here because if I had used double quotes, I would have had to backslash every double quote inside the string itself. I think it looks better this way.
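To see the difference, compare the two forms side by side in this short sketch. The variable names and the e-mail address are hypothetical; note that the address is assigned with single quotes so that Perl doesn't try to interpolate the @ sign as an array.

```perl
use strict;
use warnings;

my $bgcolor = 'white';
my $email   = 'someone@example.com';   # hypothetical address

# Ordinary double quotes: every embedded double quote needs a backslash.
my $escaped = "<body bgcolor=\"$bgcolor\">";

# qq with parentheses as delimiters: same string, no backslashes.
my $with_qq = qq(<body bgcolor="$bgcolor">);

# Other delimiters work too; braces are a common choice.
my $link = qq{<a href="mailto:$email">$email</a>};

print "$escaped\n$with_qq\n$link\n";
```

The first two strings are identical; the only difference is how much punctuation you have to read past in the source.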
Lines 74 through 80 read in the input file (using <>), and then simply print it all back out again, inserting paragraph tags at the appropriate spots (that is, where there are blank lines). I use the $paragraph variable to keep track of whether there's an open <p> tag with no corresponding closing tag. If there is, the script prints out a closing </p> tag before printing another opening <p>. A more robust version of this script would watch for things such as embedded special characters (accents, bullets, and so on) and replace them with the appropriate HTML codes. That task is much easier with pattern matching, though, so we'll leave it for later.
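The bookkeeping with $paragraph can be sketched as follows. Treat this as one possible variation rather than the exact logic of Listing 7.3: a hypothetical @lines array stands in for the lines that <> would read from the file, and the output is gathered in an $html variable (also my own addition) before being printed.

```perl
use strict;
use warnings;

# Hypothetical sample input; webbuild.pl reads its lines with <> instead.
my @lines = ("First paragraph,\n", "second line.\n", "\n", "Second paragraph.\n");

my $html = '';
my $paragraph = 0;    # true while a <p> is open with no closing </p>

foreach my $line (@lines) {
    if ($line eq "\n") {          # blank line: a paragraph boundary
        if ($paragraph) {
            $html .= "</p>\n";    # close the paragraph that was open
            $paragraph = 0;
        }
    } else {
        if (!$paragraph) {        # first line of a new paragraph
            $html .= "<p>";
            $paragraph = 1;
        }
        $html .= $line;
    }
}
if ($paragraph) {                 # close the last paragraph, if any
    $html .= "</p>\n";
}
print $html;
```

Because the flag is checked again after the loop, the final paragraph gets its closing tag even when the file doesn't end with a blank line.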
All that's left is to print the final e-mail link (using an HTML mailto URL and link tags) and finish up the HTML file.