3.5. Test If a Match Can Be Found Within a Subject String

Problem

You want to check whether a match can be found for a particular regular expression in a particular string. A partial match is sufficient. For instance, the regex “regex pattern” partially matches the string “The regex pattern can be found”. You don’t care about any of the details of the match. You just want to know whether the regex matches the string.

Solution

C#

For quick one-off tests, you can use the static call:

bool foundMatch = Regex.IsMatch(subjectString, "regex pattern");

If the regex is provided by the end user, you should use the static call with full exception handling:

bool foundMatch = false;
try {
    foundMatch = Regex.IsMatch(subjectString, UserInput);
} catch (ArgumentNullException ex) {
    // Cannot pass null as the regular expression or subject string
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}

To use the same regex repeatedly, construct a Regex object:

Regex regexObj = new Regex("regex pattern");
bool foundMatch = regexObj.IsMatch(subjectString);

If the regex is provided by the end user, you should use the Regex object with full exception handling:

bool foundMatch = false;
try {
    Regex regexObj = new Regex(UserInput);
    try {
        foundMatch = regexObj.IsMatch(subjectString);
    } catch (ArgumentNullException ex) {
        // Cannot pass null as the subject string
    }
} catch (ArgumentException ex) {
    // Syntax error in the regular expression, or the regex is null
}

VB.NET

For quick one-off tests, you can use the static call:

Dim FoundMatch = Regex.IsMatch(SubjectString, "regex pattern")

If the regex is provided by the end user, you should use the static call with full exception handling:

Dim FoundMatch As Boolean
Try
    FoundMatch = Regex.IsMatch(SubjectString, UserInput)
Catch ex As ArgumentNullException
    'Cannot pass Nothing as the regular expression or subject string
Catch ex As ArgumentException
    'Syntax error in the regular expression
End Try

To use the same regex repeatedly, construct a Regex object:

Dim RegexObj As New Regex("regex pattern")
Dim FoundMatch = RegexObj.IsMatch(SubjectString)

If the regex is provided by the end user, you should use the Regex object with full exception handling:

Dim FoundMatch As Boolean
Try
    Dim RegexObj As New Regex(UserInput)
    Try
        FoundMatch = RegexObj.IsMatch(SubjectString)
    Catch ex As ArgumentNullException
        'Cannot pass Nothing as the subject string
    End Try
Catch ex As ArgumentException
    'Syntax error in the regular expression, or the regex is Nothing
End Try

Java

The only way to test for a partial match is to create a Matcher:

Pattern regex = Pattern.compile("regex pattern");
Matcher regexMatcher = regex.matcher(subjectString);
boolean foundMatch = regexMatcher.find();

If the regex is provided by the end user, you should use exception handling:

boolean foundMatch = false;
try {
	Pattern regex = Pattern.compile(UserInput);
	Matcher regexMatcher = regex.matcher(subjectString);
	foundMatch = regexMatcher.find();
} catch (PatternSyntaxException ex) {
	// Syntax error in the regular expression
}

JavaScript

if (/regex pattern/.test(subject)) {
    // Successful match
} else {
    // Match attempt failed
}

PHP

if (preg_match('/regex pattern/', $subject)) {
    # Successful match
} else {
    # Match attempt failed
}

Perl

With the subject string held in the special variable $_:

if (m/regex pattern/) {
    # Successful match
} else {
    # Match attempt failed
}

With the subject string held in the variable $subject:

if ($subject =~ m/regex pattern/) {
    # Successful match
} else {
    # Match attempt failed
}

Using a precompiled regular expression:

$regex = qr/regex pattern/;
if ($subject =~ $regex) {
    # Successful match
} else {
    # Match attempt failed
}

Python

For quick one-off tests, you can use the global function:

if re.search("regex pattern", subject):
    # Successful match
    pass
else:
    # Match attempt failed
    pass

To use the same regex repeatedly, use a compiled object:

reobj = re.compile("regex pattern")
if reobj.search(subject):
    # Successful match
    pass
else:
    # Match attempt failed
    pass
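
The C#, VB.NET, and Java solutions above guard against an invalid regex supplied by the end user. A comparable guard in Python might look like the following minimal sketch. The helper name safe_search and its convention of returning None for an invalid pattern are assumptions made for illustration, not part of the re module:

```python
import re

def safe_search(user_pattern, subject):
    """Hypothetical helper: True/False for a partial match, None if the pattern is invalid."""
    try:
        return re.search(user_pattern, subject) is not None
    except re.error:
        # Syntax error in the regular expression
        return None

print(safe_search("regex pattern", "The regex pattern can be found"))
print(safe_search("(unclosed", "The regex pattern can be found"))
```

Returning None for a bad pattern keeps "no match" (False) distinct from "could not even try" (None); callers that prefer exceptions could simply let re.error propagate instead.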

Ruby

if subject =~ /regex pattern/
    # Successful match
else
    # Match attempt failed
end

This code does exactly the same thing:

if /regex pattern/ =~ subject
    # Successful match
else
    # Match attempt failed
end

Discussion

The most basic task for a regular expression is to check whether a string matches the regex. In most programming languages, a partial match is sufficient for the match function to return true. The match function will scan through the entire subject string to see whether the regular expression matches any part of it. The function returns true as soon as a match is found. It returns false only when it reaches the end of the string without finding any matches.

The code examples in this recipe are useful for checking whether a string contains certain data. If you want to check whether a string fits a certain pattern in its entirety (e.g., for input validation), use the next recipe instead.
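
The difference between a partial match and a match of the entire string can be sketched in Python as follows. The call to re.fullmatch() is used here only for contrast and is an assumption of this sketch; it is not necessarily the technique the next recipe uses:

```python
import re

subject = "The regex pattern can be found"

# Partial match: succeeds because the pattern occurs somewhere in the subject.
contains = re.search(r"regex pattern", subject) is not None

# Whole-string match: fails because the subject has other text around the pattern.
whole = re.fullmatch(r"regex pattern", subject) is not None

print(contains, whole)
```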

C# and VB.NET

The Regex class provides four overloaded versions of the IsMatch() method, two of which are static. This makes it possible to call IsMatch() with different parameters. The subject string is always the first parameter. This is the string in which the regular expression will try to find a match. The first parameter must not be null. Otherwise, IsMatch() will throw an ArgumentNullException.

You can perform the test in a single line of code by calling Regex.IsMatch() without constructing a Regex object. Simply pass the regular expression as the second parameter and pass regex options as an optional third parameter. If your regular expression has a syntax error, an ArgumentException will be thrown by IsMatch(). If your regex is valid, the call will return true if a partial match was found, or false if no match could be found at all.

If you want to use the same regular expression on many strings, you can make your code more efficient by constructing a Regex object first, and calling IsMatch() on that object. The first parameter, which holds the subject string, is then the only required parameter. You can specify an optional second parameter to indicate the character index at which the regular expression should begin the check. Essentially, the number you pass as the second parameter is the number of characters at the start of your subject string that the regular expression should ignore. This can be useful when you’ve already processed the string up to a point, and you want to check whether the remainder should be processed further. If you specify a number, it must be greater than or equal to zero and less than or equal to the length of the subject string. Otherwise, IsMatch() throws an ArgumentOutOfRangeException.

Java

To test whether a regex matches a string partially or entirely, instantiate a Matcher object as explained in Recipe 3.3. Then call the find() method on your newly created or newly reset matcher.

Do not call String.matches(), Pattern.matches(), or Matcher.matches(). Those all require the regex to match the whole string.

JavaScript

To test whether a regular expression can match part of a string, call the test() method on your regular expression. Pass the subject string as the only parameter.

regexp.test() returns true if the regular expression matches part or all of the subject string, and false if it does not.

PHP

The preg_match() function can be used for a variety of purposes. The most basic way to call it is with only the two required parameters: the string with your regular expression, and the string with the subject text you want the regex to search through. preg_match() returns 1 if a match can be found and 0 when the regex cannot match the subject at all.

Later recipes in this chapter explain the optional parameters you can pass to preg_match().

Perl

In Perl, m// is in fact a regular expression operator, not a mere regular expression container. If you use m// by itself, it uses the $_ variable as the subject string.

If you want to use the matching operator on the contents of another variable, use the =~ binding operator to associate the regex operator with your variable. Binding the regex to a string immediately executes the regex. The pattern-matching operator returns true if the regex matches part of the subject string, and false if it doesn’t match at all.

If you want to check whether a regular expression does not match a string, you can use !~, which is the negated version of =~.

Python

The search() function in the re module searches through a string to find whether the regular expression matches part of it. Pass your regular expression as the first parameter and the subject string as the second parameter. You can pass the regular expression options in the optional third parameter.

The re.search() function calls re.compile(), and then calls the search() method on the compiled regular expression object. This method takes just one required parameter: the subject string.

If the regular expression finds a match, search() returns a MatchObject instance. If the regex fails to match, search() returns None. When you evaluate the returned value in an if statement, the MatchObject evaluates to True, whereas None evaluates to False. Later recipes in this chapter show how to use the information stored by MatchObject.
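
As a rough illustration of the return values and the optional flags parameter, consider this short sketch. The subject text and the variable names are made up for the example:

```python
import re

subject = "The Regex Pattern can be found"

# With the case-insensitive option, search() returns a match object.
hit = re.search("regex pattern", subject, re.IGNORECASE)

# Without the option, the case differs, so search() returns None.
miss = re.search("regex pattern", subject)

found_hit = bool(hit)      # a match object evaluates to True
found_miss = bool(miss)    # None evaluates to False

# With a compiled object, the options are passed to re.compile() instead.
reobj = re.compile("regex pattern", re.IGNORECASE)
found_compiled = reobj.search(subject) is not None

print(found_hit, found_miss, found_compiled)
```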

Tip

Don’t confuse search() with match(). You cannot use match() to find a match in the middle of a string. The next recipe uses match().
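
A small sketch of the difference, using the subject string from the Problem statement of this recipe:

```python
import re

subject = "The regex pattern can be found"

# match() only attempts the regex at the very start of the subject, so this fails.
at_start = re.match("regex pattern", subject)

# search() scans forward through the subject and finds the pattern after "The ".
anywhere = re.search("regex pattern", subject)

print(at_start)           # None
print(anywhere.start())   # 4
```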

Ruby

The =~ operator is the pattern-matching operator. Place it between a regular expression and a string to find the first regular expression match. The operator returns an integer with the position at which the regex match begins in the string. It returns nil if no match can be found.

This operator is implemented in both the Regexp and String classes. In Ruby 1.8, it doesn’t matter which class you place to the left and which to the right of the operator. In Ruby 1.9, placing a literal regular expression to the left of the operator has a special side effect involving named capturing groups. Recipe 3.9 explains this.

Tip

In all the other Ruby code snippets in this book, we place the subject string to the left of the =~ operator and the regular expression to the right. This maintains consistency with Perl, from which Ruby borrowed the =~ syntax, and avoids the Ruby 1.9 magic with named capturing groups that people might not expect.

See Also

Recipe 3.6 shows code to test whether a regex matches a subject string entirely.

Recipe 3.7 shows code to get the text that was actually matched by the regex.
