Operators for Tests and Comparisons

Perl's comparison operators test the relationship between two numbers or two strings. You can use equality tests to see if two scalars are equal, or relational operators to see if one is “larger” than another. Finally, Perl also includes logical operators for making boolean (true or false) comparisons. You'll commonly use these operators as part of conditional and loop operations (ifs and whiles), which we'll look at briefly tomorrow and in detail on Day 6, “Conditionals and Loops.”

The Meaning of Truth

No, we're not going to digress into a philosophical discussion here, but before actually going through the operators, you do need to understand just what Perl means by the terms true and false.

First off, any scalar data can be tested for its truth value, which means that not only can you test to see if two numbers are equivalent, you can also determine if the number 4 or the string "Thomas Jefferson" is true. The simple rule is this: All forms of scalar data (all numbers, strings, and references) are true except for three things:

  • The empty string ("")

  • Zero, either as the number 0 or as the string "0"

  • The undefined value (which looks like "" or 0 most of the time anyhow).
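
The rules above can be checked directly. The following is a minimal sketch; the truth helper is a made-up name for illustration, not a Perl built-in.

```perl
use strict;
use warnings;

# Hypothetical helper: report whether a scalar counts as true or false.
sub truth {
    my ($value) = @_;
    return $value ? 'true' : 'false';
}

print truth(4), "\n";                   # true
print truth('Thomas Jefferson'), "\n";  # true
print truth(''), "\n";                  # false: the empty string
print truth(0), "\n";                   # false: the number zero
print truth('0'), "\n";                 # false: the string "0"
print truth(undef), "\n";               # false: the undefined value
print truth('0.0'), "\n";               # true: not the same string as "0"
print truth(' '), "\n";                 # true: a space is not the empty string
```

Note the last two lines: only the exact string "0" is false, so strings such as "0.0" or a single space still count as true.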

With these rules in mind, let's move on to the actual operators.

Equality and Relational Operators

Equality operators test whether two bits of data are the same; relational operators test whether one value is greater or less than the other. For numbers, that's easy: comparison is done in numeric order. For strings, a string is considered less than another if the first one appears earlier, alphabetically, than the other (and vice versa for greater than). Character order is determined by the ASCII character set, in which uppercase letters appear earlier than lowercase letters, and spaces count. For a string to be equal to another string, it must have exactly the same characters from start to finish.

Perl has two sets of equality and relational operators, one for numbers and one for strings, as listed in Table 2.3. Although their names are different, they are both used the same way, with one operand on either side. All these operators return 1 for true and "" for false.

Table 2.3. Equality and Relational Operators
Test                       Numeric Operator    String Operator
equals                     ==                  eq
not equals                 !=                  ne
less than                  <                   lt
greater than               >                   gt
less than or equals        <=                  le
greater than or equals     >=                  ge

Here are a bunch of examples of both the number and string comparisons:

4 < 5            # true
4 <= 4           # true
4 < 4            # false
5 < 4 + 5        # true (addition performed first)
6 < 10 > 15      # syntax error before Perl 5.32; later versions chain the tests
'5' < 8          # true; '5' converted to 5
'add' < 'adder'  # false (both become 0); warns under -w; use lt for strings
'add' lt 'adder' # true
'add' lt 'Add'   # false; upper and lower case are different
'add' eq 'add '  # false, spaces count
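
To see what these tests actually return, you can print the results. This is a small sketch; show is a hypothetical helper that displays the empty string as "" so a false result is visible on screen.

```perl
use strict;
use warnings;

# Hypothetical helper: print a label and a comparison result,
# spelling out the empty string so that false is visible.
sub show {
    my ($label, $result) = @_;
    my $shown = $result eq '' ? '""' : $result;
    printf "%-18s %s\n", $label, $shown;
    return $shown;
}

show("4 < 5",            4 < 5);             # 1
show("4 < 4",            4 < 4);             # ""
show("5 < 4 + 5",        5 < 4 + 5);         # 1
show("'5' < 8",          '5' < 8);           # 1
show("'add' lt 'adder'", 'add' lt 'adder');  # 1
show("'add' lt 'Add'",   'add' lt 'Add');    # ""
show("'add' eq 'add '",  'add' eq 'add ');   # ""
```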

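Note that, in versions of Perl before 5.32, none of the equality or relational expressions can be combined with other equality or relational expressions. Although you can have an arithmetic expression 5 + 4 - 3, which evaluates left to right, you cannot have an expression 6 < 10 > 15; older Perls report a syntax error because there's no way to evaluate it. Perl 5.32 and later do accept such chains, reading 6 < 10 > 15 as 6 < 10 and 10 > 15 (which is false), but spelling out both tests with a logical operator (covered later in this lesson) works in every version.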

Be careful also to remember that == is the equals test, and is not to be confused with =, which is the assignment operator. The latter is an expression that can return true or false, so if you accidentally use = where you mean ==, it can be hard to track down the error.
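
Here is a short sketch of how that mistake plays out. The variable names $count and $limit are made up for the illustration.

```perl
use strict;
use warnings;

my $count = 5;
my $limit = 10;

# Intended test: is $count equal to $limit? It is not.
my $equal = ($count == $limit) ? 'yes' : 'no';

# Typo: = copies $limit into $count, and the value 10 is true.
my $oops = ($count = $limit) ? 'yes' : 'no';

print "with ==: $equal\n";       # no
print "with = : $oops\n";        # yes
print "count is now $count\n";   # 10
```

The second test not only gives the wrong answer, it also quietly changes $count, which is part of what makes the bug hard to spot.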

It might seem odd to have to worry about two sets of comparison operators when Perl can convert between numbers and strings automatically. The reason there are two sets is precisely because numbers and strings can be automatically converted; you need a way of saying “no, really, compare these as numbers, I mean it.”

For example, let's say there was only one set of relational operators in Perl, as in other languages. And say you had an expression like '5' < 100. What does that expression evaluate to? In many other languages, you wouldn't even be able to make the comparison; it'd be an invalid expression. In Perl, because numbers and strings can be converted from one to the other, this isn't invalid altogether. But there are two equally correct ways to evaluate it. If you convert the '5' to a number, you get 5 < 100, which is true. If you convert 100 to a string, you get '5' < '100', which is false, because in ASCII order, the character 5 appears after the character 1. To avoid the ambiguity, we need two sets of operators.
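
With both sets of operators available, you can ask for either interpretation explicitly. A minimal sketch:

```perl
use strict;
use warnings;

# The same two operands, compared as numbers and then as strings.
my $as_numbers = ('5' <  100) ? 'true' : 'false';   # 5 < 100
my $as_strings = ('5' lt 100) ? 'true' : 'false';   # '5' lt '100'

print "numeric comparison: $as_numbers\n";   # true
print "string comparison:  $as_strings\n";   # false
```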

Forgetting that there are two sets of comparison operators is one of the more common beginning Perl programmer mistakes (and one which can be infuriatingly difficult to figure out, given that 'this' == 'that' converts both strings to 0 and then returns true).

However, if you turn on warnings in Perl, it will let you know when you've made this mistake (yet another good reason to keep warnings turned on in all your scripts, at least until you're feeling a bit more confident in your programming ability).
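
The following sketch shows the mistake and the warnings it triggers. It assumes you want to count the warnings rather than see them on screen, so it installs a handler in $SIG{__WARN__}; without that handler, Perl would simply print them to standard error.

```perl
use strict;
use warnings;

my ($first, $second) = ('this', 'that');

# Collect warnings in an array instead of printing them.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $numeric = ($first == $second) ? 'equal' : 'different';   # both become 0
my $string  = ($first eq $second) ? 'equal' : 'different';

print "==: $numeric\n";    # equal
print "eq: $string\n";     # different
print "warnings raised: ", scalar(@warnings), "\n";
```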

Logical Operators

Logical or boolean operators are those that test other tests and evaluate their operands based on the rules of boolean algebra. That is, for the values of x and y:

  • x AND y returns true only if both x and y are also true

  • x OR y returns true if either x or y (or both) are true

  • NOT x returns true if x is false and vice versa

Note

I've capitalized AND, OR, and NOT in the preceding list to differentiate the boolean algebra concepts from the actual operator names. Otherwise, things can get confusing when we start talking about how you can use && or and to deal with and, or || or or to deal with or, and/or not for not (I'll try to avoid talking this way for just this reason).


In Perl's usual way of making sure you've got enough syntax choices to hang yourself, Perl has not just one set of logical comparisons, but two: one borrowed from C, and one with Perl keywords. Table 2.4 shows both these sets of operators.

Table 2.4. Logical Comparisons
C-Style    Perl Style    What it means
&&         and           logical AND
||         or            logical OR
!          not           logical NOT

Note

There is also a logical XOR operator, xor, and a bitwise XOR operator, ^, but xor is rarely needed and ^ is used mainly for bit manipulations, so I haven't included them here.


The only difference between the two styles of operators is in precedence; the C-style operators appear higher up on the precedence hierarchy than the Perl-style operators. The Perl-style operators' very low precedence can help you avoid typing some parentheses, if that sort of thing annoys you. You'll probably see the C-style operators used more often in existing Perl code; the Perl-style operators are a newer feature, and programmers who are used to C are more likely to use C-style coding where they can.
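
The precedence difference matters most next to an assignment. In this sketch, fallback is a hypothetical function standing in for any expression that supplies a default value.

```perl
use strict;
use warnings;

# Hypothetical helper that supplies a fallback value.
sub fallback { return 'default' }

my $empty = '';

# || binds tighter than =, so this is: $high = ($empty || fallback())
my $high = $empty || fallback();

# or binds looser than =, so this is: ($low = $empty) or fallback()
my $low = $empty or fallback();

print "with ||: [$high]\n";   # [default]
print "with or: [$low]\n";    # []
```

In the second case the assignment happens first, so $low keeps the empty string and the value returned by fallback is thrown away.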

Both styles of logical AND and OR are short-circuiting; that is, if evaluating the left side of the expression determines the overall result, then the right side of the expression is ignored entirely. For example, let's say you had an expression like this:

($x < $y) && ($y < $z)

If the expression on the left side of the && is false (if $x is greater than or equal to $y), the outcome of the right side of the expression is irrelevant. No matter what the result, the whole expression is going to be false (remember, logical AND states that both sides must be true for the expression to be true). So, to save some time, a short-circuiting operator will avoid even trying to evaluate the right side of the expression if the left side is false.

Similarly, with ||, if the left side of the expression evaluates to true, then the whole expression returns true and the right side of the expression is never evaluated.
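
You can watch short-circuiting happen by putting something with a visible side effect on the right side. In this sketch, noisy is a hypothetical function that counts how many times it is called.

```perl
use strict;
use warnings;

my $calls = 0;

# Hypothetical helper: counts how often it runs, then returns its argument.
sub noisy {
    my ($value) = @_;
    $calls++;
    return $value;
}

my ($x, $y, $z) = (5, 3, 10);

my $and = ($x < $y) && noisy($y < $z);   # left side false: noisy() is skipped
my $or  = ($x > $y) || noisy($y < $z);   # left side true:  noisy() is skipped

print "noisy() ran $calls times\n";      # noisy() ran 0 times
```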

Both forms of the AND and OR operators return a false value if the expression is false. If it's true, however, they have the side effect of returning the last value evaluated (which, because it's nonzero or a nonempty string, still counts as true). Although this side effect would initially seem silly if all you care about is the truth value of the expression, it does allow you to choose between several different options or function calls or anything else, like this:

$result = $a || $b || $c || $d;

In this example, Perl will walk down the list of variables, testing each one for “truth.” The first one that comes out as true will halt the expression (because of short-circuiting), and the value of $result will be the value of the last variable that was checked.
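
Here is the same idea with concrete values filled in. The variable names and their contents are stand-ins chosen for this sketch.

```perl
use strict;
use warnings;

my ($first, $second, $third, $fourth) = ('', 0, 'third', 'fourth');

# '' and 0 are false, so the walk stops at the first true value.
my $result = $first || $second || $third || $fourth;
print "result: $result\n";    # result: third

# If nothing is true, you get the last value that was checked.
my $none = $first || $second;
print "none: [$none]\n";      # none: [0]
```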

Many Perl programmers like to use these logical tests as a sort of conditional, as in this next example, which you'll see a lot when you start looking at other people's Perl code:

open(FILE, 'inputfile') || die 'cannot open inputfile';

On the left side of the expression, open is used to open a file, and returns true if the file was opened successfully. On the right side, die is used to exit the script immediately with an error message. You only want to actually exit the script if the file couldn't be opened—that is, if open returns false. Because the || expression is short circuiting, the die on the right will only happen if the file couldn't be opened.
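
The sketch below is one way to try this out safely. It makes several assumptions that are not in the example above: the file name is hypothetical and assumed not to exist, the open call uses a lexical filehandle and an explicit read mode, the message includes Perl's $! error variable, and eval is used to catch the die so you can see the message without the script exiting.

```perl
use strict;
use warnings;

my $file = 'no-such-inputfile.txt';   # hypothetical name, assumed not to exist

# eval traps the die so the script can show the message instead of exiting.
my $ok = eval {
    open(my $fh, '<', $file) or die "cannot open $file: $!\n";
    close($fh);
    1;
};

print $ok ? "opened $file\n" : "failed: $@";
```

This version uses or rather than ||; because the arguments to open are in parentheses, either operator behaves the same way here.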

I'll come back to this on Day 6, when we cover other conditional statements in Perl (and you'll learn more about open on Day 15, “Working with Files and I/O”).

Pattern Matching

One last operator I'd like to introduce today enables you to do pattern matching in Perl. Pattern matching, which is done with regular expressions, is a tremendously powerful feature in Perl that will probably form the core of a lot of scripts you write; in fact, in the middle of this book, on Days 9 and 10, we'll go into pattern matching in mind-boggling detail. But pattern matching is so useful and so essential to Perl that it is worth introducing, even in a very limited capacity, way up here in Day 2.

You've already seen a test that compares two strings for equality using the eq operator, like this:

$string eq 'foo'

That test will only return true if the value contained in the scalar variable $string is exactly equal to the string 'foo'. But what if you wanted to test to see if the value of $string contained 'foo', or if the value of $string contained 123, or if it contained any digits at all, or three spaces followed by three digits, or any other pattern of letters or numbers or whitespace that you can think of? That's pattern matching. If you've used the * to refer to multiple filenames on a command line, it's the same idea. If you've used regular expressions in any form on Unix, it's exactly the same thing (Perl's are slightly different, but follow many of the same rules).

To construct a pattern matching expression, you need two things: a comparison operator and a pattern; like this:

$string =~ m/foo/

This expression tests to see if the value of $string contains the characters foo, and if it does, it returns true. $string could be exactly 'foo' and the test would be true. $string could also be 'fool', 'buffoon', or 'foot-and-mouth disease' and this test would still return true. As long as the characters f o and o, in that order, are contained somewhere inside $string, this test will return true.

The =~ operator is the actual pattern match operator; it says to do pattern matching on the scalar thing on the left side of the operator with the pattern on the right side of the operator. There is also an operator for negated patterns, that is, return true if the pattern doesn't match: !~. In this case, the test would only return true if $string did not contain the characters foo.
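
As a quick sketch, this loop sorts a few sample strings by whether they contain foo. The strings and array names are chosen for illustration, and the if statements are covered properly on Day 6.

```perl
use strict;
use warnings;

my (@matched, @unmatched);

for my $string ('foo', 'fool', 'buffoon', 'foot-and-mouth disease', 'bar') {
    if ($string =~ m/foo/) { push @matched,   $string }   # contains foo
    if ($string !~ m/foo/) { push @unmatched, $string }   # does not contain foo
}

print "matched:   @matched\n";     # matched:   foo fool buffoon foot-and-mouth disease
print "unmatched: @unmatched\n";   # unmatched: bar
```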

The m/…/ operator, to the right of the pattern matching operator, is the pattern itself. The part inside the slashes is the pattern you will match on. Here our pattern is foo. For now we'll stick to matching simple alphabetic and numeric characters; as we progress through the book you'll learn about special characters that match multiple kinds of things. Note that you don't have to include either single or double quotes around the characters you're looking for. If you want to include a slash in your pattern, preface it with a backslash:

$string =~ m/this\/that/

This pattern will match the characters this/that anywhere inside $string.
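
A minimal sketch of the escaped slash in action follows; the sample string is made up. It also shows an alternative this section doesn't cover: with the m in place, Perl lets you use other delimiters such as braces, so the slash needs no backslash.

```perl
use strict;
use warnings;

my $string = 'choose this/that or the other';

# The slash inside the pattern is escaped with a backslash.
my $escaped = ($string =~ m/this\/that/) ? 'match' : 'no match';

# Same pattern with brace delimiters; the slash is now an ordinary character.
my $braces = ($string =~ m{this/that}) ? 'match' : 'no match';

print "escaped slash:    $escaped\n";   # match
print "brace delimiters: $braces\n";    # match
```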

For these patterns, the m part is optional. Most of the time, you'll see patterns written without the m, like this:

$string =~ /foo/

There's a major catch to watch out for with patterns that match specific characters, similar to those you've learned about today: the characters you're matching are case-sensitive. This pattern, /foo/, will only match exactly the characters f o and o; it will not match uppercase F or O. So if $string contains “Foo” or “FOO” the pattern matching test will not return true. You can make the pattern case-insensitive by putting an i at the end of the pattern, which means it will search both upper and lowercase.

/foo/i

You can use case-sensitive or insensitive pattern matching depending on what you're looking for in the test.
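
The difference is easy to see side by side. In this sketch, the two helper functions are hypothetical names wrapping the two forms of the pattern.

```perl
use strict;
use warnings;

# Hypothetical helpers wrapping the two forms of the pattern.
sub matches_exact_case { my ($string) = @_; return $string =~ /foo/  ? 'yes' : 'no'; }
sub matches_any_case   { my ($string) = @_; return $string =~ /foo/i ? 'yes' : 'no'; }

for my $string ('foo', 'Foo', 'FOO') {
    printf "%-4s /foo/: %-3s  /foo/i: %s\n",
        $string, matches_exact_case($string), matches_any_case($string);
}
```

Only the all-lowercase string matches /foo/, while all three match /foo/i.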

Table 2.5 shows a summary of the pattern-related operators and expressions. We'll look more at patterns as the book progresses.

Table 2.5. Operators for Patterns
Operator           What it Means
=~                 match test
!~                 negated match test
m/…/   /…/         pattern
m/…/i  /…/i        case insensitive pattern
