Chapter 6. Lists

Erlang is great at handling lists, long series of values. List processing makes it easy to see the value of recursion, and offers opportunities to get a lot of work done for very little effort.

List Basics

An Erlang list is an ordered set of elements. Generally you will process a list in order, from the first item (the head) to the last item, though there are times when you may want to grab a particular item from the list. Erlang also provides built-in functions for manipulating lists when you don’t want to go through the entire sequence.

Erlang syntax encloses lists in square brackets and separates elements with commas. A list of numbers might look like the following:

[1,2,4,8,16,32]

The elements can be of any type, including numbers, atoms, tuples, strings, and other lists. When you’re starting out, it’s definitely easiest to work with lists that contain only a single type of element, rather than mixing all the possibilities, but Erlang itself has no such constraint. There is also no limit on the number of items a list can contain, though eventually you may find practical limits of memory.

You can pattern match with lists just as you can with other Erlang data structures:

1> [1,X,4,Y] = [1,2,4,8].
[1,2,4,8]
2> X.
2
3> Y.
8

While it’s possible to use lists instead of tuples, your code will make more sense if you use tuples to handle data structures containing various kinds of data in a known structure, and lists to handle data structures containing less varied data in unknown quantities. (Tuples are expected to come in a certain order and can also contain lists, so if you have a data structure that’s mostly known except for an expanding part or two, including a list inside of a tuple can be a workable solution.)

Lists can contain lists, and sometimes this can produce surprising results. If, for example, you want to add a list to a list, you may end up with more levels of list than you planned:

4> Insert=[2,4,8].
[2,4,8]
5> Full = [1, Insert, 16, 32].
[1,[2,4,8],16,32]

You can fix that (if you want to) with the lists:flatten/1 function:

6> Neat = lists:flatten(Full).
[1,2,4,8,16,32]

This also means that if you want to append lists, you need to decide whether you’re creating a list of lists or a single list containing the contents of the component lists. To create a list of lists, you just put lists into lists.

7> A = [1,2,4].
[1,2,4]
8> B = [8,16,32].
[8,16,32]
9> ListOfLists = [A,B].
[[1,2,4],[8,16,32]]

To create a single list from multiple lists, you can use the lists:append/2 function or the equivalent ++ operator.

10> Combined1 = lists:append(A,B).
[1,2,4,8,16,32]
11> Combined2 = A ++ B.
[1,2,4,8,16,32]

Both produce the same result: a combined and flattened list.

Note

The ++ operator is right associative, which can change the order of the resulting list when you append multiple lists.

If you have a set of lists you’d like combined, you can use the lists:append/1 function, which takes a list of lists as its argument and returns a single list containing their contents:

12> C = [64,128,256].
[64,128,256]
13> Combined4 = lists:append([A,B,C]).
[1,2,4,8,16,32,64,128,256]
Note

If you want to generate a list of sequential integers (or characters), the lists:seq/2 function is handy. Its arguments are the start and end of the list values. For example, lists:seq(-2,8) produces [-2,-1,0,1,2,3,4,5,6,7,8], and lists:seq($A,$z) produces the string (list) "ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz".

Splitting Lists into Heads and Tails

Lists are a convenient way to hold piles of similar data, but their great strength in Erlang is the way they make it easy to do recursion. Lists are a natural fit for the “counting down” style of logic explored in Chapter 4: you can run through a list until you run out of items. In many languages, running through a list means finding out how many items it contains and going through them sequentially. Erlang takes a different approach, letting you process the first item in a list, the head, while extracting the rest of the list, the tail, so that you can pass it to another call recursively.

To extract the head and the tail, you use pattern matching, with a special form of the list syntax on the left:

[Head | Tail] = [1,2,4].

The two variables separated by a vertical bar (|), or cons, short for list constructor, will be bound to the head and tail of the list on the right. In the console, Erlang will just report the contents of the right side of the expression, not the fragments created by the pattern match, but if you work through a list you can see the results:

1> List = [1,2,4].
[1,2,4]
2> [H1 | T1] = List.
[1,2,4]
3> H1.
1
4> T1.
[2,4]
5> [H2 | T2] = T1.
[2,4]
6> H2.
2
7> T2.
[4]
8> [H3 | T3] = T2.
[4]
9> H3.
4
10> T3.
[]
11> [H4 | T4] = T3.
** exception error: no match of right-hand side value []

Line 2 breaks the initial list into two smaller pieces. H1 will contain the first item of the List, whereas T1 will contain a list that has everything except the first element. Line 5 repeats the process on the smaller list, breaking T1 into an H2 and a T2. This time T2 is still a list, as shown on line 7, but contains only one item. Line 8 breaks that single-item list again, putting the value into H3 and an empty list into T3.

What happens when you try to split an empty list, as shown on line 11? Erlang reports an error, “no match…”. Fortunately, this does not mean that recursion on lists is doomed to produce errors. That lack of a match will naturally stop the recursive process, which is probably what you want.

Note

Head and tail work only move forward through a list. If order matters and you really need to go through a list backwards, you’ll need to use the lists:reverse function and then walk through the reversed list.

Processing List Content

The head and tail notation was built for recursive processing. Using this approach, a list arrives as an argument and is then passed to another (usually private) function with an accumulator argument. A simple case might perform a calculation on the contents of the list. Example 6-1, in ch06/ex1-product, shows this pattern in use, multiplying the values of a list together.

Example 6-1. Calculating the product of values in a list
-module(overall).
-export([product/1]).

product([]) -> 0; % in case the list is empty, return zero
product(List) -> product(List,1).

product([], Product) -> Product;  % when list empty, stop, report

product([Head|Tail], Product) -> product(Tail, Product * Head).

In this module, the product/1 function is the gateway, passing the list (if the list has content) plus an accumulator to product/2, which does the real work. If you wanted to test the arriving list to make sure it meets your expectations, it probably makes the most sense to do that work in product/1, and let product/2 focus on recursive processing.

Note

Is the product of an empty list really zero? It might make more sense for an empty list to fail and produce a crash. Erlang’s “let it crash” philosophy is, as you’ll see later, pretty calm about such things. In the long run, you’ll have to decide which cases are better left to crash and which aren’t.

The product/2 function has two clauses. The first matches the empty list, and will get called at the end of the recursive process when there are no more entries to process, or if the list arrives empty. It returns its second argument, the accumulator.

If the arriving list is not empty, the second clause goes to work. First, the pattern match ([Head|Tail]) splits off the first value of the list from the rest of the list. Next, it calls product/2 again, with the remaining (if any) portion of the list and a new accumulator that is multiplied by the value of the first entry in the list. The result will be the product of the values included in the list:

1> c(overall).
{ok,overall}
2> overall:product([1,2,3,5]).
30

That went smoothly, but what happened? After product/1 called product/2, it made five iterations over the list, concluding with an empty list, as shown in Table 6-1.

Table 6-1. Recursive processing of a simple list in product/2
Arriving List Arriving Product Head Tail

[1,2,3,5]

1

1

[2,3,5]

[2,3,5]

1 (1*1)

2

[3,5]

[3,5]

2 (1*2)

3

[5]

[5]

6 (2*3)

5

[]

[]

30 (6*5)

None

None

The last arriving Product, 30, will3 be handled by the clause for the empty list and reported as the return value for product/2. When product/1 receives that value, it will also report 30 as its return value and exit.

Note

Because Erlang strings are lists of characters represented as numbers, you can do some strange things like enter overall:product("funny"). product/1 will interpret the character values as numbers, and return 17472569400.

Creating Lists with Heads and Tails

While there are times you want to calculate a single value from a list, most list processing involves modifying lists or converting a list into another list. Because you can’t actually change a list, modifying or converting a list means creating a new list. To do that, you use the same vertical bar head/tail syntax, but on the right side of the pattern match instead of the left. You can try this out in the console, though it’s more useful in a module:

1> X=[1|[2,3]].
[1,2,3]

Erlang interprets [1|[2,3]] as creating a list. If the value to the right of the vertical bar is a list, it gets appended to the head as a list. In this case, the result is a neat list of numbers. There are a few other forms you should be aware of:

2> Y=[1,2 | [3]].
[1,2,3]
3> Z=[1,2 | 3].
[1,2|3]

In line 2, there isn’t a list wrapped around the now two items in the head, but the constructor still blends the head and the tail together seamlessly. (If you do wrap them in square brackets, the list constructor assumes that you want a list as the first item in the list, so [[1,2] | [3]] will produce [[1,2],3].)

However, line 3 demonstrates what happens if you don’t wrap the tail in square brackets—you get a list, called an improper list, that still contains a constructor, with a strange tail. Until you’ve learned your way quite thoroughly around Erlang, you should avoid this, as it will create runtime errors if you try to process it as a normal list. Eventually you may find reasons to do this, or encounter code that uses it.

More typically, you’ll use list constructors to build lists inside of recursive functions. Example 6-2, which you can find in ch06/ex2-drop, starts from a set of tuples representing planemos and distances. With the help of the drop module from Example 3-8, it creates a list of velocities for the corresponding falls.

Example 6-2. Calculating a series of drop velocities, with an error
-module(listdrop).
-export([falls/1]).

falls(List) -> falls(List,[]).

falls([], Results) -> Results;
falls([Head|Tail], Results) -> falls(Tail, [drop:fall_velocity(Head) | Results]).

Much of this is familiar from Example 6-1, except that the Results variable gets a list instead of a number, and the last line of falls/2 creates a list instead of a single value. If you run it, however, you’ll see one minor problem:

1> c(drop).
{ok,drop}
2> c(listdrop).
{ok,listdrop}
3> listdrop:falls([{earth,20},{moon,20},{mars,20}]).
[12.181953866272849,8.0,19.79898987322333]

The resulting velocities are reversed: the Earth has more gravity than Mars, and objects should fall faster on Earth. What happened? That last key line in falls/2 is reading a list from the beginning to the end, and creating a list from the end to the beginning. That puts the values in the wrong order. Fortunately, as Example 6-3 demonstrates, this is easy to fix. You need to call lists:reverse/1 in the clause of the falls/2 function that handles the empty list.

Example 6-3. Calculating a series of drop velocities, with the error fixed
-module(listdrop).
-export([falls/1]).

falls(List) -> falls(List,[]).

falls([], Results) -> lists:reverse(Results);
falls([Head|Tail], Results) -> falls(Tail, [drop:fall_velocity(Head) | Results]).

Now it works:

4> c(listdrop).
{ok,listdrop}
5> listdrop:falls([{earth,20},{moon,20},{mars,20}]).
[19.79898987322333,8.0,12.181953866272849]
Note

You could instead have put the lists:reverse/1 call in the falls/1 gateway function. Either way is fine, though I prefer to have falls/2 return a finished result.

Mixing Lists and Tuples

As you get deeper into Erlang and pass around more complex data structures, you may find that you’re processing lists full of tuples, or that it would be more convenient to rearrange two lists into a single list of tuples or vice-versa. The lists module includes easy solutions to these kinds of transformations and searches.

The simplest set of tools are the lists:zip/2 and lists:unzip/1 functions. They can turn two lists of the same size into a list of tuples or a list of tuples into two lists.

1> List1=[1,2,4,8,16].
[1,2,4,8,16]
2> List2=[a,b,c,d,e].
[a,b,c,d,e]
3> TupleList=lists:zip(List1,List2).
[{1,a},{2,b},{4,c},{8,d},{16,e}]
4> SeparateLists=lists:unzip(TupleList).
{[1,2,4,8,16],[a,b,c,d,e]}

The two lists, List1 and List2, have different content, but the same number of items. The lists:zip/2 function returns a list containing a tuple for each of the items in the original list. The lists:unzip/1 function takes that list of two-component tuples and splits it out into a tuple containing two lists.

Note

Erlang also provides lists:zip3/3 and lists:unzip3/1, which do the same combining and separating on sets of three lists or tuple values.

You will also likely encounter times when you need to process a different kind of list containing tuples, a collection of values identified by keys. Many languages include associative arrays, where access is provided through key values, but Erlang’s lists are always sequential, and have no built-in concept of retrieving information with a key. However, the lists module provides functions that support treating a list of tuples as if it were a key/value store, such as a hash table, hash tree, or associative array.

These days, more and more Erlang programmers are using maps, described in Chapter 10, for this kind of work. However, you may find old code that requires you to work with tuples inside lists. The key must be in a consistent location in the tuples stored in the list. Because the functions let you specify the location, you can put them anywhere in the tuple as long as you’re consistent, but for this example, they’ll be in the first position.

The first function to explore is lists:keystore/4. It takes a key value, a position, a list that is the previous state of the key/value store (defined in line 1 of the following code sample), and a tuple. If no tuple has that key value, then the new tuple simply gets added to the list, as shown in line 2 of the following code sample. If, as in line 3, a tuple already has that key value, the function will return a list that replaces the matched tuple with the new one:

1> Initial=[{1,tiger}, {3,bear}, {5,lion}].
[{1,tiger},{3,bear},{5,lion}]
2> Second=lists:keystore(7,1,Initial,{7,panther}).
[{1,tiger},{3,bear},{5,lion},{7,panther}]
3> Third=lists:keystore(7,1,Second,{7,leopard}).
[{1,tiger},{3,bear},{5,lion},{7,leopard}]

You can also pass lists:keystore/4 an empty list for the array, and it will just return a list containing the new tuple.

Sometimes you want to replace a value only if it is present, not add a new value to the list. The similar lists:keyreplace/4 will do just that.

4> Fourth=lists:keyreplace(6,1,Third,{6,chipmunk}).
[{1,tiger},{3,bear},{5,lion},{7,leopard}]

There was no item in the previous list with a key value of 6, so lists:keyreplace/4 just returned a copy of the original list.

Note

All of these functions copy lists or create new modified versions of a list. As you’d expect in Erlang, the original list is untouched.

If you want to get information back out of a list, the lists:keyfind/3 argument will report the data that matches a given key:

5> Animal5=lists:keyfind(5,1,Third).
{5,lion}

If the key isn’t present, however, you’ll just get a return value of false, instead of a tuple:

6> Animal6=lists:keyfind(6,1,Third).
false

Building a List of Lists

While simple recursion isn’t too complicated, list processing has a way of turning into lists of lists in various stages. Pascal’s triangle, a classic mathematical tool, is relatively simple to create but demonstrates more intricate work with lists. It starts with a 1 at the top, and then each new row is composed of the sum of the two numbers above it:

         1
       1   1
     1   2   1
   1   3   3   1
 1   4   6   4   1
...

If those numbers seem familiar, it’s probably because they’re the binomial coefficents that appear when you put (x+y) to a power. That’s just the beginning of this mathematical marvel, described in more detail at http://bit.ly/2lFAvGG.

This is easily calculated with Erlang in a number of ways. You can apply the list techniques already discussed in this chapter by treating each row as a list, and the triangle as a list of lists. The code will be seeded with the first row—the top 1—represented as [0,1,0]. The extra zeros make the addition much simpler.

Note

This is not intended to be an efficient, elegant, or maximally compact implementation. At this point, a naive implementation likely explains more about lists. Once this makes sense, and you learn about list comprehensions in Chapter 7, you can explore what a vastly more compact version might look like. See http://bit.ly/2loNXhFp.

For a first step, Example 6-4 calculates rows individually. This is a simple recursive process, walking over the old list and adding its contents to create a new list.

Example 6-4. Calculating a row
-module(pascal).
-export([add_row/1]).
add_row(Initial) -> add_row(Initial, 0, []).

add_row([], 0, Final) -> [0 | Final];

add_row([H | T], Last, New) -> add_row(T, H, [Last + H | New]).

The add_row/1 function sets things up, sending the current row a 0 to get the math started, and an empty list you can think of as “where the results go,” though it is really an accumulator. The add_row/3 function has two clauses. The first checks to see if the list being added is empty. If it is, then it reports back the final row, adding a 0 at the front.

Most of the work gets done in the second clause of add_row/3. When it receives its arguments, the [H | T] pattern match splits the head of the list into the H value (a number) and the tail into T (a list, which may be empty if that was the last number). It also gets values for the Last number processed and the current New list being built.

It then makes a recursive call to add_row/3. In that new call, the tail of the old list, T, is the new list to process, the H value becomes the Last number processed, and the third argument, the list, opens with the actual addition being performed, which is then combined with the rest of the New list being built.

Note

Because the lists in the triangle are symmetrical, there is no need to use lists:reverse/1 to flip them. You can, of course, if you want to.

You can test this easily from the console, but remember that your test lists need to be wrapped in zeros:

1> c(pascal).
{ok,pascal}
2> pascal:add_row([0,1,0]).
[0,1,1,0]
3> pascal:add_row([0,1,1,0]).
[0,1,2,1,0]
4> pascal:add_row([0,1,2,1,0]).
[0,1,3,3,1,0]

Now that you can create a new row from an old one, you need to be able to create a set of rows from the top of the triangle, as shown in Example 6-5, which you can find in ch06/ex4-pascal. The add_row/3 function effectively counted down to the end of the list, but triangle/3 will need to count up to a given number of rows. The triangle/1 function sets things up, defining the initial row, setting the counter to 1 (because that initial row is the first row), and passing on the number of Rows to be created.

The triangle/3 function has two clauses. The first, the stop clause, halts the recursion when enough Rows have been created, and reverses the list. (The individual rows may be symmetrical, but the triangle itself is not.) The second clause does the actual work of generating new rows. It gets the Previous row generated from the list, and then it passes that to the add_row/1 function, which will return a new row. Then it calls itself with the new list, an incremented Count, and the Rows value the stop clause needs.

Example 6-5. Calculating the whole triangle with both functions
-module(pascal).
-export([triangle/1]).

triangle(Rows) -> triangle([[0,1,0]],1,Rows).

triangle(List, Count, Rows) when Count >= Rows -> lists:reverse(List);

triangle(List, Count, Rows) ->
  [Previous | _] = List,
  triangle([add_row(Previous) | List], Count+1, Rows).

add_row(Initial) -> add_row(Initial, 0, []).

add_row([], 0, Final) -> [0 | Final];

add_row([H | T], Last, New) -> add_row(T, H, [Last + H | New]).

Happily, this works.

5> c(pascal).
{ok,pascal}
6> pascal:triangle(4).
[[0,1,0],[0,1,1,0],[0,1,2,1,0],[0,1,3,3,1,0]]
7> pascal:triangle(6).
[[0,1,0],
 [0,1,1,0],
 [0,1,2,1,0],
 [0,1,3,3,1,0],
 [0,1,4,6,4,1,0],
 [0,1,5,10,10,5,1,0]]

Pascal’s triangle may be a slightly neater set of lists than most you will process, but this kind of layered list processing is a very common tactic for processing and generating lists of data.

Note

You can learn more about working with lists in Chapter 2 of Erlang Programming (O’Reilly); Section 3.7 and Chapter 4 of Programming Erlang, 2nd Edition (Pragmatic); Section 2.2.5 of Erlang and OTP in Action (Manning); and Chapter 1 of Learn You Some Erlang For Great Good! (No Starch Press).

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset