Manipulating Strings

In the bulk of this lesson, we've explored ways you can manipulate lists and list contents with various built-in functions. Perl also has several useful functions for manipulating strings (I summarized a number of these on Day 2). In this section, let's cover a few of these functions, including reverse, index, rindex, and substr.

Each of these functions is used to examine or modify strings. Often it is easier or more efficient to work with strings using other mechanisms: concatenating them using the . operator, building them with variable interpolation, or searching and extracting substrings with patterns (as you'll learn in the next few days). But in many cases, these functions might be conceptually easier to use, particularly if you're used to similar string-manipulation functions from other languages.

reverse

You've already seen the reverse function, as used with lists, which reverses the order of elements in the list. reverse, when used in a scalar context, behaves differently: It reverses all the characters in the number or string.

Note

With a single string like “To be or not to be” or “antidisestablishmentarianism,” you'll end up with new strings that simply have all the characters reversed. Note, however, that if the strings you're reversing have newlines at the end, the reversed strings will have those newlines at the beginning (creating somewhat puzzling results). If you don't want the newline to be reversed, don't forget to chomp it first.
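As a small sketch of the point about newlines (the sample string and variable names here are made up for illustration), this is what reverse does in scalar context with and without a chomp:

```perl
use strict;
use warnings;

# Hypothetical sample input; the trailing newline mimics a line read from a file.
my $line = "To be or not to be\n";

# Without chomp, the newline ends up at the front of the reversed string.
# Assigning to a scalar puts reverse in scalar context.
my $with_newline = reverse $line;

# chomp removes the trailing newline first, so only the visible characters flip.
chomp(my $clean = $line);
my $reversed = reverse $clean;

print "$reversed\n";    # prints "eb ot ton ro eb oT"
```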


The different behaviors of reverse in list and scalar context can sometimes be confusing. Take, for example, this bit of code:

foreach $string (@list) {
   push @reversed, reverse $string;
}

Offhand, that bit of code looks like it takes all the string elements in the @list array, reverses each one, and then pushes the result onto the @reversed array. But if you run that code, the contents of @reversed appear to be exactly the same as the contents of @list. Why? The push function takes a list as its second argument, so reverse is called in a list context, not a scalar context. The string in $string is then interpreted as a list of one element, which, when reversed, still contains the one element. The characters inside that element aren't even touched. To fix this, all you need is the scalar function:

foreach $string (@list) {
   push @reversed, scalar (reverse $string);
}
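To see the two contexts side by side, here is a minimal sketch that runs both versions over the same data. The sample list and the @unchanged array are assumptions added for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my @list = ("abc", "hello");
my (@unchanged, @reversed);

foreach my $string (@list) {
    # List context: reverses a one-element list, so the string is untouched.
    push @unchanged, reverse $string;

    # Scalar context: reverses the characters in the string.
    push @reversed, scalar reverse $string;
}

print "@unchanged\n";   # prints "abc hello"
print "@reversed\n";    # prints "cba olleh"
```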
						

index and rindex

The index and rindex functions are used to find substrings inside other strings. Given two strings (one to look in and one to search for), they return the position of the second string inside the first, or -1 if the substring was not found. Positions are marked between characters, with 0 at the start of the string. So, for example, you could create a grep-like bit of code with index or rindex like this:

foreach $str (@list) {
   if ((index $str, $key) != -1) {
      push @final, $str;
   }
}

The difference between index and rindex is in where the function starts looking. The index function begins from the start of the string and finds the position of the first occurrence of that string; rindex starts from the end of the string and finds the position of the last occurrence of that string.
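A quick sketch of that difference, using a made-up sample string in which the substring occurs twice:

```perl
use strict;
use warnings;

# Hypothetical sample string for illustration.
my $str = "This is a test string.";

my $first = index($str, "is");     # 2: the "is" inside "This"
my $last  = rindex($str, "is");    # 5: the standalone word "is"
my $none  = index($str, "xyz");    # -1: not found

print "$first $last $none\n";      # prints "2 5 -1"
```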

Both index and rindex can take an optional third argument, indicating the position inside the string to start looking for the substring. So, for example, if you had already found a match using one call to index, you could call index again and start looking where you left off.
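For example, a loop along these lines collects the position of every occurrence by restarting each search just past the previous match. This is only a sketch; the sample string, the key, and the @positions array are assumptions for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample data for illustration.
my $str = "This is a test string.";
my $key = "i";

my @positions;
my $pos = index($str, $key);
while ($pos != -1) {
    push @positions, $pos;
    # Resume the search just past the match that was found.
    $pos = index($str, $key, $pos + 1);
}

print "@positions\n";   # prints "2 5 18"

# With rindex, the third argument is where the backward search starts.
my $before = rindex($str, $key, 17);
print "$before\n";      # prints "5"
```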

substr

The substr function is shorthand for substring, and can be used to extract characters from or add characters to a string, although its most common usage is to extract substrings of other strings. The substr function takes up to three arguments:

  • The string to act on

  • The position (offset) of the start of the substring to extract or replace. You can use a negative number to start counting from the end of the string.

  • The length of the substring to extract or replace. If the length isn't included, substr uses the substring from the offset to the end of the string.

The substr function returns the characters it extracted; used this way, it does not modify the original string. So, for example, to extract the three characters that lie between positions 5 and 8 in the string $longstring (that is, the characters at offsets 5, 6, and 7), and store them in $newstr, use this line:

$newstr = substr($longstring, 5, 3);
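Here is a short sketch of the three ways of calling substr to extract text: with an offset and length, with a negative offset, and with the length left off. The sample string and the variable names are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical sample string for illustration.
my $longstring = "antidisestablishmentarianism";

my $middle = substr($longstring, 5, 3);   # three characters starting at offset 5
my $tail   = substr($longstring, -3);     # negative offset counts from the end
my $rest   = substr($longstring, 20);     # no length: offset to end of string

print "$middle $tail $rest\n";            # prints "ise ism arianism"
```

None of these calls changes $longstring itself.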

To replace characters in a string or add characters to it, use the substr function on the left side of an assignment. This form changes the original string in place. The string you use on the right can be longer or shorter than the substring you replace; Perl doesn't care:

substr($longstring, 5, 3) = "parthenogenesis";
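As a sketch of what that assignment does to the string (the sample text here is an assumption for illustration), note that the variable itself changes, and that a length of 0 turns the replacement into an insertion:

```perl
use strict;
use warnings;

# Hypothetical sample string for illustration.
my $longstring = "The quick brown fox";

# Replace the five characters at offset 4 ("quick") with a shorter string.
substr($longstring, 4, 5) = "slow";
print "$longstring\n";    # prints "The slow brown fox"

# A length of 0 replaces nothing, so the new text is inserted at the offset.
substr($longstring, 4, 0) = "very ";
print "$longstring\n";    # prints "The very slow brown fox"
```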

If you wanted to search for and replace a substring everywhere in another string, you might use a while loop, index, and substr like this:

$str = "This is a test string.  The string we want to change.";
$pos = 0;
$key = 'i';
$repl = "*";

while ($pos < (length $str) and $pos != -1) {
    $pos = index($str, $key, $pos);

    if ($pos != -1 ) {
        substr($str,$pos,length $key) = $repl;
        $pos += length $repl;   # skip past the replacement so it is never rescanned
    }
}

Don't become overly attached to this code, however; there are a number of ways to do this sort of operation using much less code. In particular, Perl's regular expressions enable you to compress that entire loop into one line:

$str =~ s/\Q$key\E/$repl/g;   # \Q...\E makes $key match literally, as index does
