INI section headers appear at the beginning of a line, and are
designated by placing a name within square brackets (e.g., [Section1]).
Those rules are simple to translate into a regex:
^\[[^\]\r\n]+]
Regex options: ^ and $ match at line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
There aren’t many parts to this regex, so it’s easy to break down:
The leading ‹^› matches the position at the beginning of a line, since
the “^ and $ match at line breaks” option is enabled.
‹\[› matches a literal [ character. It’s escaped with a backslash to
prevent [ from starting a character class.
‹[^\]\r\n]› is a negated character class that matches any character
except ], a carriage return (\r), or a line feed (\n). The immediately
following ‹+› quantifier lets the class match one or more characters,
which brings us to…
The trailing ‹]› matches a literal ] character to end the section
header. There’s no need to escape this character with a backslash
because it does not occur within a character class.
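As a rough illustration of the breakdown above, the following Python sketch applies the regex to a small INI text. The sample text and the helper name find_section_headers are made up for this example; re.MULTILINE is Python's way of turning on the “^ and $ match at line breaks” option.

```python
import re

# The regex from this recipe. re.MULTILINE enables the
# "^ and $ match at line breaks" option in Python.
SECTION_HEADER = re.compile(r"^\[[^\]\r\n]+]", re.MULTILINE)

# Hypothetical sample INI text, invented for this illustration.
SAMPLE_INI = """\
; [Commented] is skipped because the line starts with a semicolon
[Section1]
Item1=[Value1]
Item2=Value2
[Section2]
Item3=Value3
"""


def find_section_headers(text):
    """Return every section header that starts at the beginning of a line."""
    return SECTION_HEADER.findall(text)


print(find_section_headers(SAMPLE_INI))
# prints ['[Section1]', '[Section2]']
```

Only the two real headers are returned. The bracketed text in the comment line and in the value of Item1 is ignored because neither starts at the beginning of a line.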
If you only want to find a specific section header, that’s even
easier. The following regex matches the header for a section called
Section1:
^\[Section1]
Regex options: ^ and $ match at line breaks
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby
In this case, the only difference from a plain-text search for
“[Section1]” is that the match must occur at the beginning of a line.
This prevents matching commented-out section headers (preceded by a
semicolon) or what looks like a header but is actually part of a
parameter’s value (e.g., Item1=[Value1]).
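The effect of the line anchor can be sketched in Python as follows. The helper find_named_header and the sample text are hypothetical, not part of the recipe. The use of re.escape is an extra assumption, so that a section name containing regex metacharacters is still matched literally.

```python
import re


def find_named_header(text, name):
    """Return a match object for the header of section `name`, or None.

    Builds the recipe's pattern, for example ^\\[Section1], with the
    section name escaped so it is always treated as literal text.
    """
    pattern = re.compile(r"^\[" + re.escape(name) + r"]", re.MULTILINE)
    return pattern.search(text)


# Hypothetical sample: a commented-out header, a value that looks like
# a header, and then the real header.
sample = ";[Section1]\nItem1=[Section1]\n[Section1]\nItem2=Value2\n"

match = find_named_header(sample, "Section1")
# The only match is the last occurrence, the one at the start of a line.
print(match.start() == sample.rfind("[Section1]"))
# prints True
```

A plain-text search for “[Section1]” would have stopped at the commented-out header on the first line instead.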
Recipe 9.14 describes how to match INI section blocks. Recipe 9.15 does the same for INI name-value pairs.
Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.1 explains which special characters need to be escaped. Recipe 2.2 explains how to match nonprinting characters. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.12 explains repetition.