Anyone who uses a computer should be familiar with tools that find and replace text. Such tools are typically rudimentary, but a find and replace tool for a web design application has stiffer requirements. Not only does it need to be able to locate and replace text in the body of pages, but it also must be able to locate and replace text that appears in code. Most importantly, it must be able to distinguish between the two. The word “button,” for example, has an entirely different connotation when used in the body of a page than when used in code. If you wanted to replace all HTML buttons with image buttons, you wouldn’t want to also replace the phrase button-down shirt with the phrase image-down shirt.
Regular expressions comprise a language that can be difficult to master. A full discussion is outside the scope of this book. If you’re interested in learning regular expressions, I highly recommend that you read Sams Teach Yourself Regular Expressions in 10 Minutes from Sams Publishing.
The find and replace tool in Expression Web is well-suited to performing intelligent find and replace for sites. It offers features specifically designed for searching HTML code, and it provides a powerful search capability utilizing regular expressions. Regular expressions (sometimes referred to as regex) are specialized search strings that match patterns in text or code. As you’ll see in this chapter, the use of regular expressions adds powerful search capabilities that would not be possible otherwise.
Finding and replacing a simple word or phrase is a straightforward endeavor. In web design, however, things are rarely simple. Suppose you have designed a site that contains hundreds of pages with Social Security numbers (SSNs) on them. Because of new requirements in your company, you are charged with the task of reformatting these SSNs. You need to keep the last four digits and replace the rest of each SSN with asterisks. A simple find and replace just won’t suffice, but a regular expression is perfect for such a job.
The actual structure of an SSN is more restrictive than the regular expression used here. To keep the regular expression example less complex, I opted for a simpler pattern.
The most efficient way of working with regular expressions is to separate your search into parts. When looking for an SSN, you need to find three digits followed by a dash, two digits followed by another dash, and then four digits. Additionally, the first digit of the series must be between 0 and 7.
The final regular expression looks like this:
[0-7][0-9]^2\-[0-9]^2\-{[0-9]^4}
If you’re unfamiliar with regular expressions, this example might appear to have a complex syntax, but it’s actually simple. Let’s break it apart so you’ll fully understand how the language works.
The first set of numbers you want to match consists of three digits and is known as the area number. The first digit must be between 0 and 7 because the Social Security Administration has never issued an SSN with an area number higher than 728.
The regular expression to match this pattern is as follows:
[0-7][0-9]^2
The first part of this expression, [0-7], indicates that any single digit between 0 and 7 produces a match. The second part of the expression, [0-9]^2, indicates that any two digits between 0 and 9 produce a match. The ^ character is the repeat expression character, and it is followed by the number of times the preceding expression should be repeated.
The middle set of numbers you need to match consists of two digits between 0 and 9 and is known as the group number. The syntax for the regular expression is [0-9]^2. This syntax should now be familiar to you. It means that you want to match a character between 0 and 9 and then repeat that expression two times.
The last set of numbers you want to match consists of four digits between 0 and 9 and is called the serial number. The syntax for the regular expression is [0-9]^4. The curly braces surrounding this portion of the regular expression are explained in the “Replacing Text” section later in this chapter.
Between each set of numbers is a dash character. A dash is a special kind of character because it is not only something you can search for in text but also part of regular expression syntax. (In the regular expression we’re using in this chapter, the dash is used to indicate a range of digits.) To find an explicit dash in text, you need to specify that you are looking for an actual dash and not using it as part of the regular expression.
The \ character, called the escape character, does just that. By preceding a character with the \, you are telling Expression Web that you want to match that character literally. Therefore, the regular expression \- matches a dash character in text.
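For readers who want to experiment outside Expression Web, the same part-by-part approach can be sketched in Python. This is only an approximation under stated assumptions: Python's re module uses its own syntax (it writes "repeat n times" as {n} rather than ^n), and the names AREA, GROUP, SERIAL, DASH, and the sample text are made up for illustration.

```python
import re

# Python writes "repeat n times" as {n}; the Expression Web syntax
# described in this chapter uses ^n for the same idea.
AREA = r"[0-7][0-9]{2}"   # area number: first digit 0-7, then two more digits
GROUP = r"[0-9]{2}"       # group number: two digits
SERIAL = r"[0-9]{4}"      # serial number: four digits
DASH = r"\-"              # escaped dash, matched as a literal character

# Assemble the full pattern from its parts.
SSN_PATTERN = re.compile(AREA + DASH + GROUP + DASH + SERIAL)

# Hypothetical sample text: the middle number starts with 8, so it is skipped.
sample = "Employee 232-00-2323 replaced 812-11-0000 and 045-67-8901."
matches = SSN_PATTERN.findall(sample)
print(matches)
```

Building the pattern from named parts mirrors the breakdown above and makes each piece easy to test on its own.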
The Find and Replace dialog (shown in Figure 10.1) is made up of three tabs:
• Find—Provides tools for locating text within one or more pages
• Replace—Provides tools for locating and replacing text within one or more pages
• HTML Tags—Provides tools for locating and replacing HTML code
Figure 10.1. The Find and Replace dialog provides all the tools you need to locate and replace both text and HTML code.
To find specific text in one or more pages, open the Find and Replace dialog by selecting Edit, Find. Enter the text you want to search for in the Find What text box, select the desired options, and click Find All to display the search results. You can choose to search from the insertion point up, from the insertion point down, or in all directions by selecting the desired radio button in the Direction section.
Some of the options in the Find Where section might be disabled based on what you currently have open in Expression Web.
You can specify where to search for the text entered using the radio buttons in the Find Where section of the Find and Replace dialog. The following options are available:
• All Pages—Searches for the specified text in all pages in the current site.
• Open Page(s)—Searches for the specified text in all open pages.
• Selected Page(s)—Searches for the specified text in all selected pages. Pages can be selected in the Folder List panel or in Folders View.
• Current Page—Searches for the specified text in the current page only.
For more information on using the Folder List panel and Folders View, see Chapter 1, “An Overview of Expression Web.”
You can also specify additional options for searching in the Advanced section. Check the Regular Expression check box if the text you have entered is a regular expression. If you want to search source code for the text you have entered, check the Find in Source Code check box. The other options should be self-explanatory.
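The difference between searching page text and searching source code can be illustrated with a short Python sketch. It is not how Expression Web implements the Find in Source Code option; it only shows why the two kinds of search return different results. The TextCollector class and the sample markup are hypothetical.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects only the text between tags, ignoring tag names and attributes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text content only, never for the tags themselves.
        self.chunks.append(data)


# Hypothetical page fragment: "button" appears once in body text and
# twice in markup (the opening and closing button tags).
source = '<p>A button-down shirt</p><button type="submit">Buy</button>'

collector = TextCollector()
collector.feed(source)
collector.close()
page_text = "".join(collector.chunks)

text_hits = page_text.count("button")   # body text only
source_hits = source.count("button")    # body text plus markup
print(text_hits, source_hits)
```

A search limited to page text finds only the phrase in the paragraph, while a search of the source also finds the tag names.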
When you click Find All, Expression Web displays the results in the Find 1 panel by default. You can display the results in a second panel (the Find 2 panel) by selecting the Find 2 radio button in the Display Results In section. Figure 10.2 shows the results of a search for SSNs using the regular expression shown in Figure 10.1. The results are displayed in the Find 2 panel, but previous search results in the Find 1 panel can be recalled easily by clicking the Find 1 tab to display the Find 1 panel. This is useful when you want to start a new search but are not yet finished working on results from a previous search.
Figure 10.2. The results of a regular expression search for SSNs are displayed in the Find 2 panel. The ability to use two panels for search results allows you to easily work with two different result sets.
If you need assistance entering a regular expression search, click the right-pointing arrow button to the right of the Find What box, as shown in Figure 10.3. You can easily build simple regular expressions using this method.
Figure 10.3. The right-pointing arrow button can make creating regular expression searches fast and easy, but don’t expect to find advanced regular expression syntax here.
Complex regular expressions will likely require manual entry instead of using the Regular Expressions button. Fortunately, Expression Web keeps a list of recently used searches so you can easily recall a complex search later. By clicking the downward-pointing button to the right of the Find What box, you can access a list of previously entered searches, as shown in Figure 10.4.
Figure 10.4. Expression Web maintains a list of recent searches you can recall with the click of a button.
For an even better way of saving complex searches, see “Saving Queries,” p. 182.
To replace text, open the Find and Replace dialog by selecting Edit, Replace. If the Find and Replace dialog is already open, you can simply click the Replace tab. Enter the text you want to search for in the Find What text box and the text that should replace it in the Replace With text box, as shown in Figure 10.5.
Figure 10.5. The Replace tab allows you to locate text and replace it with the text you specify.
You can also use regular expressions when replacing text. Remember that the requirements for our Social Security example are that all SSNs should be reformatted so that only the last four digits of the number are displayed. All other digits should be replaced with asterisks. Without regular expressions, you wouldn’t be able to perform such a complex replace, but with regular expressions, it’s fairly straightforward.
We’ve already covered the regular expression used to locate SSNs. Here it is again:
[0-7][0-9]^2\-[0-9]^2\-{[0-9]^4}
I’ve already explained everything in the regular expression with the exception of the curly braces. Curly braces in a regular expression enable you to store the result of the expression inside the braces so it can be used later. An expression inside curly braces is called a tagged expression. You can have any number of tagged expressions in a regular expression.
Tagged expressions are used when replacing text using regular expressions. Let’s look at a specific example using the SSN replacement we’re performing. Consider the following SSN:
232-00-2323
When our regular expression locates this SSN, it should replace it with the following:
***-**-2323
Replacing the numbers with asterisks is simple, but you also need to leave the last four digits as they are. You could just change the regular expression so that it located all patterns of ###-## and replaced them with asterisks. However, some instances of that pattern might not be SSNs. For example, suppose the pages also contain employee numbers in the format ###-##. In that case, you would be replacing the employee numbers with asterisks, and that’s not what you want.
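The pitfall is easy to reproduce. The following Python sketch uses Python's re syntax rather than Expression Web's, and the sample text and employee number are invented for illustration.

```python
import re

# Hypothetical page text containing one SSN and one employee number.
text = "SSN 232-00-2323, employee number 123-45"

# Naive approach: match only the ###-## prefix and overwrite it.
naive = re.sub(r"[0-7][0-9]{2}\-[0-9]{2}", "***-**", text)
print(naive)  # the employee number is masked as well

# The full SSN pattern ignores the shorter employee number.
full_matches = re.findall(r"[0-7][0-9]{2}\-[0-9]{2}\-[0-9]{4}", text)
print(full_matches)
```

The shorter pattern masks the employee number along with the SSN, while the full pattern matches the SSN alone.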
Tagged expressions are the perfect solution to this problem. By using tagged expressions, you can use your regular expression as-is and easily perform the replace operation that is required.
Figure 10.6 shows the Find and Replace dialog ready to perform the SSN replacement. The regular expression in the Replace With text box shows the asterisks that will be used in place of the first five numbers in the SSN. The expression \1 that appears in place of the last four digits will be replaced with the result of the first tagged expression.
Figure 10.6. By using tagged expressions, portions of matched text can be stored for use when replacing text.
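The same replacement can be sketched in Python as a rough analogue. Python's re module uses parentheses where Expression Web uses curly braces for a tagged expression, and {n} where Expression Web uses ^n. The function name mask_ssns and the sample text are assumptions made for this example.

```python
import re

# The parentheses capture the serial number, playing the role of the
# curly-brace tagged expression in the Expression Web pattern.
SSN_PATTERN = re.compile(r"[0-7][0-9]{2}\-[0-9]{2}\-([0-9]{4})")


def mask_ssns(text: str) -> str:
    """Replace each SSN-shaped match with asterisks plus its last four digits."""
    # \1 in the replacement refers to the first captured group.
    return SSN_PATTERN.sub(r"***-**-\1", text)


masked = mask_ssns("SSN on file: 232-00-2323")
print(masked)  # SSN on file: ***-**-2323
```

Only the captured serial number survives the replacement; everything else in the match is overwritten by the literal asterisks and dashes.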
The right-pointing arrow button next to the Replace With text box (see Figure 10.7) provides easy access to tagged expressions. Simply click the right-pointing arrow button and then select the desired tagged expression to have it inserted into your regular expression.
Figure 10.7. Insert tagged expressions by clicking the Regular Expressions button.