3.17. Replace All Matches Within the Matches of Another Regex

Problem

You want to replace all the matches of a particular regular expression, but only within certain sections of the subject string. Another regular expression matches each of the sections in the string.

Say you have an HTML file in which various passages are marked as bold with <b> tags. Between each pair of bold tags, you want to replace all matches of the regular expression before with the replacement text after. For example, when processing the string before <b>first before</b> before <b>before before</b>, you want to end up with: before <b>first after</b> before <b>after after</b>.

Solution

C#

Regex outerRegex = new Regex("<b>.*?</b>", RegexOptions.Singleline);
Regex innerRegex = new Regex("before");
string resultString = outerRegex.Replace(subjectString,
                      new MatchEvaluator(ComputeReplacement));

public String ComputeReplacement(Match matchResult) {
    // Run the inner search-and-replace on each match of the outer regex
    return innerRegex.Replace(matchResult.Value, "after");
}

VB.NET

Dim OuterRegex As New Regex("<b>.*?</b>", RegexOptions.Singleline)
Dim InnerRegex As New Regex("before")
Dim MyMatchEvaluator As New MatchEvaluator(AddressOf ComputeReplacement)
Dim ResultString = OuterRegex.Replace(SubjectString, MyMatchEvaluator)

Public Function ComputeReplacement(ByVal MatchResult As Match) As String
    'Run the inner search-and-replace on each match of the outer regex
    Return InnerRegex.Replace(MatchResult.Value, "after")
End Function

Java

StringBuffer resultString = new StringBuffer();
Pattern outerRegex = Pattern.compile("<b>.*?</b>");
Pattern innerRegex = Pattern.compile("before");
Matcher outerMatcher = outerRegex.matcher(subjectString);
while (outerMatcher.find()) {
    // quoteReplacement() keeps any $ or \ in the matched text literal
    outerMatcher.appendReplacement(resultString,
      Matcher.quoteReplacement(
        innerRegex.matcher(outerMatcher.group()).replaceAll("after")));
}
outerMatcher.appendTail(resultString);

JavaScript

var result = subject.replace(/<b>.*?</b>/g, function(match) {
    return match.replace(/before/g, "after");
});

PHP

$result = preg_replace_callback('%<b>.*?</b>%',
                                'replace_within_tag', $subject);

function replace_within_tag($groups) {
    return preg_replace('/before/', 'after', $groups[0]);
}

Perl

$subject =~ s%<b>.*?</b>%($match = $&) =~ s/before/after/g; $match;%eg;

Python

innerre = re.compile("before")
def replacewithin(matchobj):
    return innerre.sub("after", matchobj.group())

result = re.sub("<b>.*?</b>", replacewithin, subject)

Ruby

innerre = /before/
result = subject.gsub(/<b>.*?</b>/) {|match|
    match.gsub(innerre, 'after')
}

Discussion

This solution is again the combination of two previous solutions, using two regular expressions. The “outer” regular expression, <b>.*?</b>, matches the HTML bold tags and the text between them. The “inner” regular expression matches the text “before,” which we replace with “after.”

Recipe 3.16 explains how you can run a search-and-replace and build the replacement text for each regex match in your own code. Here, we do this with the outer regular expression. Each time it finds a pair of opening and closing <b> tags, we run a search-and-replace using the inner regex, just as we do in Recipe 3.14. The subject string for the search-and-replace with the inner regex is the text matched by the outer regex.
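To see the two steps work together, here is a minimal, self-contained Python sketch that runs the sample string from the Problem section through the same outer and inner regexes. The function name replace_in_bold and the variable names are illustrative choices, not part of the recipe.

```python
import re

# Outer regex: finds each <b>...</b> section (lazy, so it stops at the first </b>)
outer_re = re.compile(r"<b>.*?</b>")
# Inner regex: the text to replace, but only inside the outer matches
inner_re = re.compile(r"before")

def replace_within(match):
    # Called once per outer match; the matched text becomes the
    # subject string for the inner search-and-replace
    return inner_re.sub("after", match.group())

def replace_in_bold(subject):
    # Text outside the outer matches is copied to the result unchanged
    return outer_re.sub(replace_within, subject)

subject = "before <b>first before</b> before <b>before before</b>"
print(replace_in_bold(subject))
# prints: before <b>first after</b> before <b>after after</b>
```

The two occurrences of “before” outside the bold tags are left alone because the inner regex never sees them: it is only ever applied to the text matched by the outer regex.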

See Also

This recipe uses techniques introduced by three earlier recipes. Recipe 3.11 shows code to iterate over all the matches a regex can find in a string. Recipe 3.15 shows code to find regex matches within the matches of another regex. Recipe 3.16 shows code to search and replace with replacements generated in code for each regex match instead of using a fixed replacement text for all matches.
