Defining and Using Hashes

Arrays and hashes can be created and used in many of the same ways. Hashes, however, do have some peculiarities and extra features that result from the way data is stored in a hash. For example, when you put data into a hash, you'll have to keep track of two scalars for each element (the key and the value). Because hashes are unordered, you'll have to do extra work to extract values from the hash, as well as to sort them and print them. In addition, hashes behave differently from arrays in a scalar context. Read on to learn about all these things.

List Syntax and Hashes

List syntax—enclosing the elements of a list inside parentheses, separated by commas—works to create a hash just as well as it does an array. Just stick a hash variable on the left side of the assignment, rather than an array variable, like this:

%pairs = ('red', 255, 'green', 150, 'blue', 0);

With an array variable on the left side of the assignment, this statement would create an array of six elements. With a hash variable (%pairs), the elements are added to the hash in pairs: the first element is a key and the second its value, the third element is the second key and the fourth its value, and so on down the line. If there is an odd number of elements in the list, the last element will be a key in the hash, and its value will be undef. If you have Perl warnings turned on, you'll also get a warning about this (“Odd number of elements in hash assignment”).
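As a minimal sketch of the odd-number case, the following assigns a five-element list to a hash. The data is hypothetical, and the use strict, my, and no warnings 'misc' lines are additions here so the sketch runs quietly on its own:

```perl
use strict;
use warnings;

my %pairs;
{
    # Assumption: silence "Odd number of elements in hash assignment"
    # inside this block only, so the sketch runs without noise.
    no warnings 'misc';
    %pairs = ('red', 255, 'green', 150, 'blue');   # five elements
}

# 'blue' became a key, but it has no value to go with it.
print "blue is a key\n"         if exists $pairs{blue};
print "blue's value is undef\n" unless defined $pairs{blue};
print scalar(keys %pairs), " keys in all\n";
```

Without the no warnings line, the same assignment still happens, but Perl prints the warning first.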

With this kind of formatting, it's difficult to figure out at a glance which parts of the list are the keys and which are the values, or whether you have an odd number of elements, without counting them. (It only gets worse the larger the lists get.) Many Perl programmers format list syntax for hashes like this, with each key/value pair on its own line:

%temps = (
   'Boston', 32,
   'New York', 24,
   'Miami', 78,
   'Portland', 45,
    # and so on...
);

Even better is the => operator, which separates list elements the same way the comma does, but makes it easier to see the link between the keys and the values. So that first example up there with the colors would look like this:

%pairs = ('red'=>255, 'green'=>150, 'blue'=>0);

And the second, with the cities:

%temps = (
   'Boston' => 32,
   'New York' => 24,
   'Miami' => 78,
   'Portland' => 45,
    # and so on...
);

One other shortcut you can use for hashes: the => operator treats a bare word on its left side as a string, so you can leave the quotes off the key to save yourself some typing. If the key contains a space or any other character that isn't a letter, digit, or underscore, however, you'll have to leave the quotes in place (Perl isn't that smart):

%pairs = (red=>255, green=>150, blue=>0);
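To illustrate where the quotes can and cannot be dropped, here is a short sketch that mixes both kinds of keys in one hash. The city data is borrowed from the earlier example; use strict and my are additions so that it runs on its own:

```perl
use strict;
use warnings;

my %temps = (
    Boston     => 32,   # single word: => quotes it for you
    'New York' => 24,   # contains a space: quotes required
    Miami      => 78,
);

print "$temps{'New York'}\n";   # prints 24
print "$temps{Boston}\n";       # prints 32
```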

In an array or list, any of your elements can be duplicates of any others because the elements are ordered. You can also use the index number to look up those values in the array. With hashes, however, because your keys are used to look up values in the hash, it is important that you use unique keys, with no duplicates. In fact, Perl won't let you store duplicate keys: the value for the key later in the list overwrites the value closer to the beginning, and there will be only one key/value pair for each unique key:

%temps = (
   'Boston' => 32,
   'New York' => 24,
   'Miami' => 78,
   'Portland' => 45,
   'Boston' => 30,  # this value will overwrite 32
    # and so on...
);

Keys must be unique, but values, on the other hand, are entirely independent of each other and can contain as many duplicates as you need.
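A small sketch makes the difference between duplicate keys and duplicate values visible. The Miami temperature is changed here, purely for illustration, so that two cities share a value:

```perl
use strict;
use warnings;

my %temps = (
    'Boston'   => 32,
    'New York' => 24,
    'Boston'   => 30,   # same key again: replaces 32
    'Miami'    => 24,   # same value as New York: perfectly fine
);

print scalar(keys %temps), " keys\n";   # prints "3 keys"
print "Boston: $temps{Boston}\n";       # prints "Boston: 30"
```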

As with lists, () assigned to a hash variable creates an empty hash:

%hash = ();  # no keys or values

Converting Between Arrays, Lists, and Hashes

A second way to create a hash is to use an array or a list for its initial elements. Because both hashes and arrays use lists as their raw form, you can copy them back and forth between each other with no problems:

@stuff = ('one', 1, 'two', 2);
%pairsostuff = @stuff;

In this example, assigning the array @stuff to the hash %pairsostuff causes the array elements to be expanded into a list, and then paired off into two key/value pairs in the hash. It behaves just the same as if you had typed all the elements in list syntax. Watch out for arrays with an odd number of elements, however; you'll end up with a key whose value is undef. (Perl warnings will let you know if this is happening. To avoid the warning, you might want to test an array before assigning it to a hash to make sure that it contains an even number of elements.)
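One way to make that test, sketched here with a deliberately odd-length array, is to check the element count with the % (modulus) operator before assigning. The fifth element is hypothetical, added only to trigger the check:

```perl
use strict;
use warnings;

my @stuff = ('one', 1, 'two', 2, 'three');   # five elements: odd
my %pairsostuff;

# An array used with a numeric operator yields its element count.
if (@stuff % 2 == 0) {
    %pairsostuff = @stuff;
} else {
    print "Not assigning: ", scalar(@stuff), " elements won't pair up\n";
}
```

What to do in the odd case (skip, drop the last element, or stop the script) depends on your data; the sketch simply leaves the hash empty.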

What about converting a hash back into a list? Here's an example where you're assigning a hash to an array:

@stuff = %pairsostuff;

When you put a hash on the right side of a list assignment, or in fact use it in any situation where a raw list is expected, Perl will “unwind” the hash into its component elements (key, value, key, value, and so on). The expanded list is then assigned to the array @stuff.

There is a catch to this nifty unwinding behavior: because hashes are not ordered, the key/value pairs you get out of a hash will not necessarily be in the same order you put them in, or in any kind of sorted order. Hash elements are stored in an internal format that makes them very fast to access, and are unwound in that internal order. If you must create a list from a hash in a certain order, you'll have to build a loop to extract them in a specific order (more about this later).
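As a preview of that kind of loop, here is a minimal sketch that rebuilds a list from a hash with the keys in alphabetical order. It relies on the sort and keys functions and on push; the third pair is added for illustration:

```perl
use strict;
use warnings;

my %pairsostuff = (one => 1, two => 2, three => 3);

# Walk the keys in sorted order, appending each key and its value.
my @ordered;
foreach my $key (sort keys %pairsostuff) {
    push @ordered, $key, $pairsostuff{$key};
}

print "@ordered\n";   # prints "one 1 three 3 two 2"
```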

Accessing Hash Elements

To get at or assign a value in a hash, you need to know the name of the key. Unlike arrays, which just have bare values in a numeric order, hashes have key/value pairs. When you know the key, however, you can then use curly braces ({}) to refer to a hash value, like this:

print $temps{'Portland'};
$temps{'Portland'} = 50;

Note that this syntax is similar to the array access syntax $array[]—you use a scalar variable $ to get at a scalar value inside a hash, but here you use curly braces {} surrounding the key name, as opposed to brackets. The thing inside the braces should be a string (here we used a single-quoted string), although Perl will convert numbers to strings for you. Also, if the key only contains a single word, you can leave off the quotes and Perl will know what you mean:

$temps{Portland} = 50; # same as $temps{'Portland'} = 50;

As with arrays, the variable name in the hash access syntax doesn't interfere with scalar variables of the same name. All the following refer to different things, even though the variable name is the same:

$name         # a scalar

@name         # an entire array

%name         # an entire hash

$name[$index] # a scalar value contained in the array name at $index

$name{key}    # a scalar value contained in the hash name at the key 'key'

Also, as with arrays (sensing the trend, here?), you can assign values to individual hash elements using that same hash element-access syntax with an assignment statement, and the old value at that key is replaced with the new value:

$hash{key} = $newvalue;

If you assign a value to a key that does not exist, that key/value pair is automatically created for you.
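Both cases, replacing an existing value and creating a new pair, look identical in code. A short sketch, in which the Seattle entry is a hypothetical addition:

```perl
use strict;
use warnings;

my %temps = (Portland => 45);

$temps{Portland} = 50;   # key exists: old value 45 is replaced
$temps{Seattle}  = 48;   # key does not exist: new pair is created

foreach my $city (sort keys %temps) {
    print "$city: $temps{$city}\n";
}
```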

Deleting Hash Elements

Use the delete function to delete elements, both keys and values, from a hash. Unlike with arrays, where delete did roughly the same thing as undef—simply undefining a value but leaving it there—with hashes, delete actually does delete every trace of the element from the hash.

The delete function takes a hash element (commonly just the hash access expression such as $hashname{'key'}) and deletes both that key and its value, returning the value that was deleted. So, for example, to move an element from one hash to another (deleting it from one hash and adding it to another), you could use syntax something like this:

$hash2{$key}  = delete $hash{$key};
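Filled out into a complete sketch with hypothetical fruit counts, the move looks like this:

```perl
use strict;
use warnings;

my %hash  = (apple => 3, pear => 5);
my %hash2;
my $key = 'apple';

# delete removes the pair from %hash and returns the value (3),
# which is then stored under the same key in %hash2.
$hash2{$key} = delete $hash{$key};

print "left in first hash: ", join(', ', sort keys %hash), "\n";   # pear
print "moved: $key => $hash2{$key}\n";                             # apple => 3
```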

As with arrays, you can also test to see if a particular key/value pair exists in a hash using the exists function. The exists function tests whether a given key exists in a hash and returns true if it does (note that the value attached to that key could very well be undefined; exists only tests for the actual existence of the key). Use exists like this:

if (exists $hashname{$key}) { $hashname{$key}++; }

This particular statement tests whether the key $key exists in the hash, and if it does, increments the value at that key (assuming, of course, that the value is a number).
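The difference between a key that exists with an undefined value and a key that is not there at all can be sketched with exists and defined side by side. The three key names are hypothetical:

```perl
use strict;
use warnings;

my %hashname = (seen => 1, pending => undef);

foreach my $key (qw(seen pending missing)) {
    my $exists  = exists($hashname{$key})  ? 'yes' : 'no';
    my $defined = defined($hashname{$key}) ? 'yes' : 'no';
    print "$key: exists=$exists defined=$defined\n";
}
```

Here pending reports exists=yes but defined=no, while missing reports no for both.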

Processing All the Values in a Hash

To process all the elements in an array, you use a foreach or a while loop to iterate over all the values, testing each one for some feature and then doing something to that value if the test was true. But how do you do that for hashes? Hashes are unordered, so you can't just start from key zero and go on until the end. There is no key zero, and no end (well, there is, internally, but you can't get at that order).

The most commonly used answer to this problem is to use one of two functions: keys or values. These functions both take a hash as an argument, and then return, respectively, a raw list of all the keys in the hash, or a raw list of all the values in the hash. With either of these lists, you can use foreach or another loop to process each element of the hash without worrying about missing any.

So, for example, let's say you had a hash containing a list of temperatures indexed by city name (as we had in a previous example in this section) and you wanted to print a list of those cities and temperatures, in alphabetical order. You could use keys to get a list of all the keys, sort to sort those keys, and then a foreach loop to print the value of each of those keys, like this:

foreach $city (sort keys %temps) {
    print "$city: $temps{$city} degrees\n";
}

This loop works through the sorted list of keys, assigning each one to the $city variable in turn (or any variable you pick). You can then use that variable in the body of the loop as the key into the hash to get the value of the current element. This is an extremely common group of lines for accessing and processing hash elements; you'll see this a lot as we write examples over the next few days.
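The values function is used the same way when you only care about the values and not which key each belongs to. As a sketch, this averages the temperatures from the earlier example (the variable names $total and $average are chosen here for illustration):

```perl
use strict;
use warnings;

my %temps = (
    'Boston'   => 32,
    'New York' => 24,
    'Miami'    => 78,
    'Portland' => 45,
);

# Add up every value; the order in which values arrive doesn't matter.
my $total = 0;
foreach my $temp (values %temps) {
    $total += $temp;
}

my $average = $total / scalar(keys %temps);
printf "Average: %.2f degrees\n", $average;   # prints "Average: 44.75 degrees"
```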

Hashes and Context

Let's return to context and go over how hashes behave in the various contexts. For the most part, hashes behave just like lists, and the same rules apply, with a couple of wrinkles.

You've seen how to create a hash from list syntax, where the hash will match keys to values, like this:

%pairs = (red=>255, green=>150, blue=>0);

In the reverse case, where you use a hash where a list is expected, the hash will unwind back into its component parts (in some undetermined order), and then follow the same rules for any list.

@colors = %pairs;      # results in an array of all elements
($x, $y, $z) = %pairs; # first three elements of unwound hash assigned to vars,
                       # remaining elements ignored
print %pairs;          # prints unwound hash elements concatenated together

In all these instances, if you use a hash in a list context—for example, on the right side of an assignment—then the hash will be “unwound” back into individual items, and then the list behaves as it does in any list or scalar context. The one peculiar case is this one:

$x = %pairs;

At first glance, this would seem to be the hash equivalent of the way to get the number of elements out of an array ($x = @array). However, Perl has traditionally behaved differently with this one than it does with arrays: the result in $x ends up being a description of the internal state of the hash table (something like 3/8 or 4/16), which is almost never what you want. (Starting with Perl 5.26, a hash in a scalar context returns the number of keys instead, but you can't rely on that if your script might run under an older Perl.) To get the number of elements (key/value pairs) in a hash, use the keys function and then assign it to a scalar variable instead:

$x = keys %pairs;

The keys function returns a list of the keys in the hash, which is then evaluated in a scalar context, and gives the number of elements.
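The contexts discussed in this section can be compared in one short sketch. It uses the %pairs hash from above; the scalar function forces a scalar context in places, such as a print argument list, that would otherwise be a list context:

```perl
use strict;
use warnings;

my %pairs = (red => 255, green => 150, blue => 0);

my $x      = keys %pairs;   # scalar context: number of keys
my @colors = %pairs;        # list context: key, value, key, value, ...

print "keys: $x\n";                                   # prints "keys: 3"
print "unwound: ", scalar(@colors), " elements\n";    # prints "unwound: 6 elements"
print "counted inline: ", scalar(keys %pairs), "\n";  # prints "counted inline: 3"
```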

Note

Curious about just what I mean by “a description of the internal state of the hash”? Okay, then. Using a hash variable in a scalar context gives you two numbers, separated by a slash. The second is the number of slots that have been allocated for the internal hash table (often called “buckets”), and the first is the number of slots actually used by the data. There's nothing you can do with this number, so if you see it, you've probably done something wrong (probably you're trying to get the number of keys in your hash, and what you really wanted was to use $x = keys %hash instead of $x = %hash).

