9.14. Match INI Section Blocks

Problem

You need to match each complete INI section block (in other words, a section header and all of the section’s parameter-value pairs), in order to split up an INI file or process each block separately.

Solution

Recipe 9.13 showed how to match an INI section header. To match an entire section, we’ll start with the same pattern from that recipe, but continue matching until we reach the end of the string or a [ character that occurs at the beginning of a line (since that indicates the start of a new section):

^\[[^\]\r\n]+](?:\r?\n(?:[^[\r\n].*)?)*
Regex options: ^ and $ match at line breaks (“dot matches line breaks” must not be set)
Regex flavors: .NET, Java, JavaScript, PCRE, Perl, Python, Ruby

Or in free-spacing mode:

^ \[ [^\]\r\n]+ ]  # Match a section header
(?:                # Followed by the rest of the section:
  \r?\n            #   Match a line break character sequence
  (?:              #   After each line starts, match:
    [^[\r\n]       #     Any character except "[" or a line break character
    .*             #     Match the rest of the line
  )?               #   The group is optional to allow matching empty lines
)*                 # Continue until the end of the section
Regex options: ^ and $ match at line breaks, free-spacing (“dot matches line breaks” must not be set)
Regex flavors: .NET, Java, XRegExp, PCRE, Perl, Python, Ruby
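
As a rough illustration of how the free-spacing version might be used from Python, the sketch below compiles it with re.MULTILINE and re.VERBOSE and collects every section block in a string. The helper name split_sections and the short test string are assumptions made for this example, not part of the recipe. The "[" inside the second character class is written as \[, which matches the same character.

```python
import re

# Free-spacing form of the recipe's regex. re.MULTILINE makes ^ match at
# line breaks; re.DOTALL is deliberately NOT set, so "." stops at a newline.
SECTION_BLOCK = re.compile(
    r"""
    ^ \[ [^\]\r\n]+ ]  # Match a section header
    (?:                # Followed by the rest of the section:
      \r?\n            #   Match a line break character sequence
      (?:              #   After each line starts, match:
        [^\[\r\n]      #     Any character except "[" or a line break character
        .*             #     Match the rest of the line
      )?               #   The group is optional to allow matching empty lines
    )*                 # Continue until the end of the section
    """,
    re.MULTILINE | re.VERBOSE,
)


def split_sections(text):
    """Return a list with the text of each INI section block (hypothetical helper)."""
    return [match.group(0) for match in SECTION_BLOCK.finditer(text)]


if __name__ == "__main__":
    for block in split_sections("[a]\nx=1\n[b]\ny=2\n"):
        print(repr(block))
```

Each returned block keeps its trailing line breaks, so joining the blocks reproduces the sectioned part of the input unchanged.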

Discussion

This regular expression starts by matching an INI section header with the pattern ^\[[^\]\r\n]+], and continues matching one line at a time as long as the lines do not start with [. Consider the following subject text:

[Section1]
Item1=Value1
Item2=[Value2]

; [SectionA]
; The SectionA header has been commented out

ItemA=ValueA ; ItemA is not commented out, and is part of Section1

[Section2]
Item3=Value3
Item4 = Value4

Given the string just shown, this regex finds two matches. The first match extends from the beginning of the string up to and including the empty line before [Section2]. The second match extends from the start of the Section2 header until the end of the string.
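
The behavior just described can be checked with a short Python sketch. It applies the single-line form of the regex to the subject text above, assuming the text uses plain \n line breaks and has no trailing line break. The names SECTION_BLOCK, SAMPLE, and blocks are chosen for this example only, and the "[" inside the second character class is escaped as \[, an equivalent spelling.

```python
import re

# Regex from this recipe, compiled so that ^ matches at line breaks.
SECTION_BLOCK = re.compile(
    r"^\[[^\]\r\n]+](?:\r?\n(?:[^\[\r\n].*)?)*",
    re.MULTILINE,
)

# The subject text from the discussion (no trailing line break).
SAMPLE = """\
[Section1]
Item1=Value1
Item2=[Value2]

; [SectionA]
; The SectionA header has been commented out

ItemA=ValueA ; ItemA is not commented out, and is part of Section1

[Section2]
Item3=Value3
Item4 = Value4"""

blocks = [match.group(0) for match in SECTION_BLOCK.finditer(SAMPLE)]

for number, block in enumerate(blocks, start=1):
    print(f"--- match {number} ---")
    print(block)
```

The commented-out [SectionA] line does not start a new match because its first character is a semicolon, and the [ in Item2=[Value2] is ignored because it does not appear at the start of a line.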

See Also

Recipe 9.13 shows how to match INI section headers. Recipe 9.15 does the same for INI name-value pairs.

Techniques used in the regular expressions in this recipe are discussed in Chapter 2. Recipe 2.1 explains which special characters need to be escaped. Recipe 2.2 explains how to match nonprinting characters. Recipe 2.3 explains character classes. Recipe 2.5 explains anchors. Recipe 2.9 explains grouping. Recipe 2.12 explains repetition.
