How it works...

The std::regex_replace() algorithm has several overloads with different types of parameters, but the meaning of the parameters is as follows:

  • The input string on which the replacement is performed.
  • A std::basic_regex object that encapsulates the regular expression used to identify the parts of the string to be replaced.
  • The string format used for replacement.
  • Optional matching flags.
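
As a minimal sketch of how these parameters line up in a call, the following shows two of the overloads: one that takes a std::string and returns a new string, and one that writes the result through an output iterator. The helper names replace_to_string and replace_to_iterator, the \d+ pattern, and the # replacement are illustrative assumptions, not part of the recipe.

```cpp
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: string-in, string-out overload.
std::string replace_to_string(std::string const& input)
{
    std::regex const rx{ R"(\d+)" };                 // the regular expression
    return std::regex_replace(
        input,                                       // the input string
        rx,
        "#",                                         // the replacement format
        std::regex_constants::match_default);        // optional matching flags
}

// Hypothetical helper: iterator-based overload.
std::string replace_to_iterator(std::string const& input)
{
    std::regex const rx{ R"(\d+)" };
    std::string result;
    // This overload returns a copy of the output iterator; it is not needed here.
    std::regex_replace(std::back_inserter(result),
                       input.begin(), input.end(),
                       rx, "#");
    return result;
}
```

Both helpers should produce the same text for the same input; the iterator form is mainly useful when the output goes somewhere other than a fresh string.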

The return value is, depending on the overload used, either a string or a copy of the output iterator provided as an argument. The format string used for replacement can be plain text, or it can contain match identifiers introduced by a $ prefix:

  • $& indicates the entire match.
  • $1, $2, $3, and so on, indicate the first, second, third submatch, and so on.
  • $` indicates the part of the string that precedes the match (its prefix).
  • $' indicates the part of the string that follows the match (its suffix).
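
To make the first two identifiers concrete, here is a small sketch. The helper names bracket_numbers and keys_only, and the patterns they use, are hypothetical and chosen only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: surround every run of digits with square brackets.
std::string bracket_numbers(std::string const& input)
{
    std::regex const rx{ R"(\d+)" };
    return std::regex_replace(input, rx, "[$&]");   // $& is the entire match
}

// Hypothetical helper: reduce each key=value pair to its key.
std::string keys_only(std::string const& input)
{
    std::regex const rx{ R"((\w+)=(\w+))" };
    return std::regex_replace(input, rx, "$1");     // $1 is the first submatch
}
```

For instance, bracket_numbers("room 12, floor 3") should give "room [12], floor [3]", and keys_only("a=1 b=2") should give "a b".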

In the first example shown in the How to do it... section, the initial text contains two words made of exactly three a, b, or c characters, abc and bca. The regular expression indicates an expression of exactly three characters between word boundaries. That means a subtext, such as bbbb, will not match the expression. The result of the replacement is that the string text will be --- aa --- ca bbbb.
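
The code for that first example lives in the How to do it... section and is not repeated here. The following is a sketch reconstructed from the description above; the pattern and the helper name mask_three_letter_words are assumptions based on that description.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replace every word made of exactly three
// characters from the class [a|b|c] with three dashes.
std::string mask_three_letter_words(std::string const& text)
{
    // \b anchors the match at word boundaries, so a longer run such as
    // "bbbb" cannot match. Inside a character class, | is a literal
    // character, so [abc] would behave the same for this input.
    std::regex const rx{ R"(\b[a|b|c]{3}\b)" };
    return std::regex_replace(text, rx, "---");
}
```

Calling mask_three_letter_words("abc aa bca ca bbbb") should yield "--- aa --- ca bbbb".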

Additional flags for the match can be passed to the std::regex_replace() algorithm. By default, the matching flag is std::regex_constants::match_default, which, among other things, means that the replacement string is interpreted using ECMAScript formatting rules (the grammar of the regular expression itself is chosen when the std::regex object is constructed). If we want, for instance, to replace only the first occurrence, then we can specify std::regex_constants::format_first_only. In the next example, the result is --- aa bca ca bbbb as the replacement stops after the first match is found:

    auto text{ "abc aa bca ca bbbb"s };
    auto rx = std::regex{ R"(\b[a|b|c]{3}\b)"s };
    auto newtext = std::regex_replace(text, rx, "---"s,
                     std::regex_constants::format_first_only);

The replacement string, however, can contain special indicators for the whole match, a particular submatch, or the parts that were not matched, as explained earlier. In the second example shown in the How to do it... section, the regular expression identifies a word of at least one character, followed by a comma and optional whitespace, and then another word of at least one character. The first word is supposed to be the last name and the second word is supposed to be the first name. The replacement string has the $2 $1 format. This is an instruction to replace the matched expression (in this example, the entire original string) with another string formed of the second submatch followed by a space and then the first submatch.
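
That second example is also defined in the How to do it... section. A sketch of it, reconstructed from the description above, could look as follows; the exact pattern and the helper name first_name_first are assumptions.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: turn "last, first" into "first last".
std::string first_name_first(std::string const& name)
{
    // (\w+)  first submatch: the last name
    // ,\s*   a comma followed by optional whitespace
    // (\w+)  second submatch: the first name
    std::regex const rx{ R"((\w+),\s*(\w+))" };
    return std::regex_replace(name, rx, "$2 $1");
}
```

With this sketch, first_name_first("bancila, marius") should produce "marius bancila".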

In this case, the entire string was a match. In the next example, there will be multiple matches inside the string, and they will all be replaced with the indicated string. In this example, we are replacing the indefinite article a when preceding a word that starts with a vowel (this, of course, does not cover words that start with a vowel sound) with the indefinite article an:

    auto text{"this is a example with a error"s};
    auto rx = std::regex{R"(\ba ((a|e|i|u|o)\w+))"s};
    auto newtext = std::regex_replace(text, rx, "an $1");

The regular expression identifies the letter a as a single word (\b indicates a word boundary, so \ba followed by a space matches a as a standalone word) followed by a space and a word of at least two characters starting with a vowel. When such a match is identified, it is replaced with a string formed of the fixed string an followed by a space and the first subexpression of the match, which is the word itself. In this example, the newtext string will be this is an example with an error.

Apart from the identifiers of the subexpressions ($1, $2, and so on), there are other identifiers for the entire match ($&), the part of the string before the match ($`), and the part of the string after the match ($'). In the last example, we change the format of a date from dd.mm.yyyy to yyyy.mm.dd, but also show the matched parts:

    auto text{"today is 1.06.2016!!"s};
    auto rx =
      std::regex{R"((\d{1,2})(\.|-|/)(\d{1,2})(\.|-|/)(\d{4}))"s};
    // today is 2016.06.1!!
    auto newtext1 = std::regex_replace(text, rx, R"($5$4$3$2$1)");
    // today is [today is ][1.06.2016][!!]!!
    auto newtext2 = std::regex_replace(text, rx, R"([$`][$&][$'])");

The regular expression matches a one- or two-digit number followed by a dot, hyphen, or slash; followed by another one- or two-digit number; then a dot, hyphen, or slash; and last a four-digit number.

For newtext1, the replacement string is $5$4$3$2$1; that means the year, followed by the second separator, then the month, the first separator, and finally the day. Therefore, for the input string "today is 1.06.2016!!", the result is "today is 2016.06.1!!".

For newtext2, the replacement string is [$`][$&][$']; that means the part before the match, followed by the entire match, and finally the part after the match, each in square brackets. However, the result is not "[today is ][1.06.2016][!!]" as you might expect at first glance, but "today is [today is ][1.06.2016][!!]!!". The reason is that what is replaced is the matched expression, and, in this case, that is only the date ("1.06.2016"). This substring is replaced with another string formed of all the parts of the initial string, while the text around the match is copied to the output unchanged.
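
The date example contains a single match. As a hedged sketch of what happens when there are several, the following helper applies the same kind of format string to every digit in a string. The helper name annotate_digits and the input are hypothetical. The expected result below assumes the standard behavior where $` expands to the prefix of each individual match (the text between the previous match and the current one) and $' expands to everything after the current match.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: replace each digit with "[prefix|match|suffix]".
std::string annotate_digits(std::string const& input)
{
    std::regex const rx{ R"(\d)" };
    // $` and $' are evaluated separately for every match.
    return std::regex_replace(input, rx, "[$`|$&|$']");
}
```

For the input "a1b2c", the first match (1) has the prefix a and the suffix b2c, and the second match (2) has the prefix b and the suffix c, so the result should be "a[a|1|b2c]b[b|2|c]c".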
