2.20. Insert the Regex Match into the Replacement Text

Problem

Perform a search-and-replace that converts URLs into HTML links that point to the URL, and use the URL as the text for the link. For this exercise, define a URL as “http:” and all nonwhitespace characters that follow it. For instance, Please visit http://www.regexcookbook.com becomes Please visit <a href="http://www.regexcookbook.com">http://www.regexcookbook.com</a>.

Solution

Regular expression

http:\S+
Regex options: None
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Replacement

<a href="$&">$&</a>
Replacement text flavors: .NET, JavaScript, Perl
<a href="$0">$0</a>
Replacement text flavors: .NET, Java, XRegExp, PHP
<a href="\0">\0</a>
Replacement text flavors: PHP, Ruby
<a href="\&">\&</a>
Replacement text flavor: Ruby
<a href="\g<0>">\g<0></a>
Replacement text flavor: Python

Discussion

Inserting the whole regex match back into the replacement text is an easy way to insert new text before, after, or around the matched text, or even between multiple copies of the matched text. Unless you’re using Python, you don’t have to add any capturing groups to your regular expression to be able to reuse the overall match.
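As a minimal sketch of those four placements, the Python snippet below inserts text before, after, around, and between copies of the overall match. The sample text and the digit pattern are hypothetical and chosen only for illustration; Python's «\g<0>» token stands in for the overall match.

```python
import re

# Hypothetical sample text and pattern, used only for illustration.
text = "call 555 now"
pattern = r"\d+"

# \g<0> is Python's replacement token for the overall regex match.
before = re.sub(pattern, r"#\g<0>", text)         # new text before the match
after = re.sub(pattern, r"\g<0>!", text)          # new text after the match
around = re.sub(pattern, r"[\g<0>]", text)        # new text around the match
between = re.sub(pattern, r"\g<0>-\g<0>", text)   # text between two copies

print(before)    # call #555 now
print(after)     # call 555! now
print(around)    # call [555] now
print(between)   # call 555-555 now
```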

In Perl, «$&» is actually a variable. Perl stores the overall regex match in this variable after each successful regex match. Using «$&» adds a performance penalty to all your regexes in Perl, so you may prefer to wrap your whole regex in a capturing group and use a backreference to that group instead.
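The capturing-group alternative works the same way in every flavor. The sketch below shows it in Python rather than Perl, purely to illustrate the technique: the whole regex is wrapped in group 1, and the replacement text refers to that group instead of the overall match.

```python
import re

text = "Please visit http://www.regexcookbook.com"

# The parentheses make the whole regex capturing group 1,
# so \1 in the replacement text holds the same text as the overall match.
linked = re.sub(r"(http:\S+)", r'<a href="\1">\1</a>', text)

print(linked)
```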

.NET and JavaScript have adopted the «$&» syntax to insert the regex match into the replacement text. Ruby uses backslashes instead of dollar signs for replacement text tokens, so use «\&» for the overall match.

Java, PHP, and Python do not have a special token to reinsert the overall regex match, but they do allow text matched by capturing groups to be inserted into the replacement text, as the next section explains. The overall match is implicitly capturing group number 0. For Python, we need to use the syntax for named capture, «\g<0>», to reference group zero, because Python does not support «\0».
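A minimal Python sketch of this recipe follows. The function name linkify is hypothetical; re.sub and the «\g<0>» token are standard. The last lines illustrate why «\0» is not an option in Python: the replacement text treats it as an octal escape for the NUL character, not as a reference to the match.

```python
import re

def linkify(text):
    """Wrap each URL ("http:" plus the nonwhitespace characters after it) in an HTML link."""
    # \g<0> reinserts the overall match; no capturing group is required.
    return re.sub(r"http:\S+", r'<a href="\g<0>">\g<0></a>', text)

result = linkify("Please visit http://www.regexcookbook.com")
print(result)

# \0 in a Python replacement string is an octal escape (the NUL character),
# so the matched text is lost instead of being reinserted.
nul_result = re.sub(r"http:\S+", r"[\0]", "http://x")
print(repr(nul_result))
```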

.NET, XRegExp, and Ruby also support the zeroth capturing group syntax. In these flavors it doesn’t matter which syntax you use; the result is the same.

See Also

Search and Replace with Regular Expressions in Chapter 1 describes the various replacement text flavors.

Recipe 3.15 explains how to use replacement text in source code.
