CHAPTER 6

image

Scala Collections

The collections framework in Scala is a high-performance and type-parametrized framework with support for mutable and immutable type hierarchies. These distinct and independent mutable and immutable type hierarchies enable switching between mutable and immutable implementations much simpler. Scala’s object-oriented collections also support functional higher-order operations such as map, filter, and reduce that let you use expression-oriented programming in collections.  You can access and use the entire Java collections library from your Scala code because Scala is a JVM language, but this is not recommended because higher-order operations are not available with Java Collections library. Scala has a rich collection library. This chapter gives you a tour of the most commonly used collection types and operations, showing just the types you will use most frequently. The goal of this chapter is to guide you through the myriad of options to find the solutions you need.

Scala Collection Hierarchy

Most collection classes needed by client code exist in three packages, scala.collection, scala.collection.immutable, and scala.collection.mutable. In this section we will explore the three main packages of the collection framework and show you how to use the general and some prevalent features.

package scala.collection

This package is comprised of all high-level abstract classes or traits, which generally have mutable as well as immutable implementations. Figure 6-1 shows all collections in package scala.collection.

9781484202333_Fig06-01.jpg

Figure 6-1. scala.collection package

The types in scala.collection package shown in Figure 6-1 are implemented in two different ways in the Scala libraries based on whether the implementations are immutable or mutable. To keep these different implementations separate, there are packages called scala.collection.immutable and scala.collection.mutable.

Image Note  All types in scala.collections package are implemented in different ways in the Scala libraries based on whether the implementations are immutable or mutable. To keep these different implementations separate, there are packages called scala.collection.immutable and scala.collection.mutable.

Sequences

Sequences  store a number of different values in a specific order. Because the elements are ordered, you can ask for the first element, second element, and so on.

As shown in Figure 6-1, sequences branch off into two main categories: indexed sequences and linear sequences. By default, Seq creates a List as shown in Listing 6-1.

Listing 6-1. By Default, Seq Creates a List

val x = Seq(1,2,3)
scala> val x = Seq(1,2,3)
x: Seq[Int] = List(1, 2, 3)

By default, IndexedSeq creates a Vector as shown in Listing 6-2.

Listing 6-2. By Default, IndexedSeq Creates a Vector

val x = IndexedSeq(1,2,3)

scala> val x = IndexedSeq(1,2,3)
x: IndexedSeq[Int] = Vector(1, 2, 3)

Sets

A Scala Set is a collection of unique elements. The common Set classes are shown in Figure 6-1. By default, Set creates an immutable Set as shown in Listing 6-3.

Listing 6-3. collection.immutable package Is Automatically Added to the Current Namespace

val x = Set(1,2,3)
scala> val x = Set(1,2,3)
x: scala.collection.immutable.Set[Int] = Set(1, 2, 3)

Image Note  While the collection.immutable package is automatically added to the current namespace in Scala, the collection.mutable is not.

Map

Scala Map is a collection of key/value pairs, where all the keys must be unique. The most common map classes are shown in Figure 6-1.When you just need a simple, immutable Map, you can create one without requiring an import as shown in Listing 6-4.

Listing 6-4. Creating an Immutable Map Without Requiring an Import

scala> val map = Map(1 -> "a", 2 -> "b", 3 -> "c")
map: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)

package scala.collection.immutable

The scala.collection.immutable package stores various types of collections that are all immutable. The main classes and traits in this package are shown in Figure 6-2, Figure 6-3, and Figure 6-4 for Seq, Set, and Map, respectively. The top part of the hierarchy looks the same as that shown in Figure 6-1 for the scala.collection package. It begins with Traversable and Iterable, then has three main subtypes in the form of Set, Seq, and Map. The difference is that there are many more subtypes.

9781484202333_Fig06-02.jpg

Figure 6-2. Immutable Seq

9781484202333_Fig06-03.jpg

Figure 6-3. Immutable Set

9781484202333_Fig06-04.jpg

Figure 6-4. Immutable Map

Immutable Sequence

Figure 6-2 illustrates the Seq in scala.collection.immutable package. The top part of the hierarchy looks the same as that shown in Figure 6-1 for the scala.collection package.

If you want an immutable collection that has efficient indexing, your default choice would generally be Vector. An immutable IndexedSeq creates a Vector as shown in Listing 6-5.

Listing 6-5. An Immutable IndexedSeq Creates a Vector

val x = scala.collection.immutable.IndexedSeq(1,2,3)
scala> val x = scala.collection.immutable.IndexedSeq(1,2,3)
x: scala.collection.immutable.IndexedSeq[Int] = Vector(1, 2, 3)

An immutable LinearSeq creates as List as seen in Listing 6-6.

Listing 6-6. An Immutable LinearSeq Creates a Vector

val x = scala.collection.immutable.LinearSeq(1,2,3)
scala> val x = scala.collection.immutable.LinearSeq(1,2,3)
x: scala.collection.immutable.LinearSeq[Int] = List(1, 2, 3)

An immutable Seq creates a List too as shown in Listing 6-7.

Listing 6-7. An Immutable Seq Creates a List

val x = scala.collection.immutable.Seq(1,2,3)
scala> val x = scala.collection.immutable.Seq(1,2,3)
x: scala.collection.immutable.Seq[Int] = List(1, 2, 3)

A collection in package scala.collection.immutable is guaranteed to be immutable for everyone. Such a collection will never change after it is created. This means that accessing the same collection value at different points in time will always yield a collection with the same elements.

Immutable Set

Figure 6-3 illustrates the Set in scala.collection.immutable package. The top part of the hierarchy looks the same as that shown in Figure 6-1 for the scala.collection package.

The difference between root collections and immutable collections is that clients of an immutable collection have a guarantee that nobody can mutate the collection as illustrated in Listing 6-8, through 6-10.

Listing 6-8. Using an Immutable Set

val m = collection.immutable.Set(1,2,3)
m: scala.collection.immutable.Set[Int] = Set(1, 2, 3)

Listing 6-9. An Immutable SortedSet creates a TreeSet

scala> val m = collection.immutable.SortedSet(1,2,3)
m: scala.collection.immutable.SortedSet[Int] = TreeSet(1, 2, 3)

Listing 6-10. Using an Immutable BitSet

scala> val m = collection.immutable.BitSet(1,2,3)
m: scala.collection.immutable.BitSet = BitSet(1, 2, 3)

Immutable Map

Figure 6-4 illustrates the Map in scala.collection.immutable package. The top part of the hierarchy looks the same as that shown in Figure 6-1 for the scala.collection package.

There are base mutable and immutable map classes, Map, that is the base map, with both mutable and immutable implementations as illustrated in Listing 6-11 through 6-13.

Listing 6-11. Using an Immutable Map Without Requiring an Import

scala> val m = Map(1 -> "a", 2 -> "b")

m: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b)

Listing 6-12. Using an Immutable Map With the Prefix

scala> val m = collection.immutable.Map(1 -> "a", 2 -> "b")

m: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b)

Listing 6-13. Using an Immutable SortedMap

scala> val m = collection.immutable.SortedMap(1 -> "a", 2 -> "b")
m: scala.collection.immutable.SortedMap[Int,String] = Map(1 -> a, 2 -> b)

package scala.collection.mutable

As the name implies, this package includes collections that are mutable. The scala.collection.mutable is the most extensive of the three packages. It is worth scanning through the API to look at the various types in this package and to see what they are used for. By default, Scala always picks immutable collections. For instance, if you just write Set without any prefix or without having imported Set from somewhere, you get an immutable set, and if you write Iterable you get an immutable iterable collection, because these are the default bindings imported from the scala package. To get the mutable default versions, you need to write explicitly collection.mutable.Set, or collection.mutable.Iterable.

Image Note  A useful convention if you want to use both mutable and immutable versions of collections is to import just the package collection.mutable.

import scala.collection.mutable

Image Note  Then a word like Set without a prefix still refers to an immutable collection, whereas mutable.Set refers to the mutable counterpart.

Now let’s look at Mutable Sequences, Sets, and Maps. It is worth noting that we will go through only the partial implementations. The scala.collection.mutable is extensive and it is worth going through the Scala docs for a detailed treatment.

Figure 6-5 illustrates the Sequences in scala.collection.mutable package.

9781484202333_Fig06-05.jpg

Figure 6-5. Mutable sequences

As you can see the difference from Figure 6-1 of  the scala.collection package is the Buffer type. Let’s look at that now.

Buffer

The Buffer type is implicitly mutable. There is no immutable Buffer. There are several subtypes of Buffer in the Scala libraries. The two most significant ones are ArrayBuffer and ListBuffer. You can create a buffer as shown in Listing 6-14.

Listing 6-14. Creating a Buffer

val buffer = collection.mutable.Buffer(1,2,3)
buffer: scala.collection.mutable.Buffer[Int] = ArrayBuffer(1, 2, 3)

A mutable Seq creates as ArrayBuffer (see Listing 6-15).

Listing 6-15. A Mutable Seq

val x = scala.collection.mutable.Seq(1,2,3)
x: scala.collection.mutable.Seq[Int] = ArrayBuffer(1, 2, 3)

A mutable LinearSeq creates as MutableList (see Listing 6-16).

Listing 6-16. A Mutable LinearSeq

val x = scala.collection.mutable.LinearSeq(1,2,3)
x: scala.collection.mutable.LinearSeq[Int] = MutableList(1, 2, 3)

A mutable IndexedSeq creates an ArrayBuffer (see Listing 6-17).

Listing 6-17. A Mutable IndexedSeq

scala> val x = scala.collection.mutable.IndexedSeq(1,2,3)
x: scala.collection.mutable.IndexedSeq[Int] = ArrayBuffer(1, 2, 3)

You can use a mutable Set as shown in Listing 6-18.

Listing 6-18. A Mutable Set

scala> val m = collection.mutable.Set(1,2,3)
m: scala.collection.mutable.Set[Int] = Set(1, 2, 3)

A mutable SortedSet creates a TreeSet as seen in Listing 6-19.

Listing 6-19. A Mutable SortedSet

scala> val m = collection.mutable.SortedSet(1,2,3)
m: scala.collection.mutable.SortedSet[Int] = TreeSet(1, 2, 3)

You can use a mutable BitSet as shown in Listing 6-20.

Listing 6-20. A Mutable BitSet

scala> val m = collection.mutable.BitSet(1,2,3)
m: scala.collection.mutable.BitSet = BitSet(1, 2, 3)

You can use a mutable Map as shown in Listing 6-21.

Listing 6-21. A Mutable Map

scala> val m = collection.mutable.Map(1 -> "a", 2 -> "b")
m: scala.collection.mutable.Map[Int,String] = Map(2 -> b, 1 -> a)

The idea of this section was to give you an overview of Scala collection hierarchy. You learned that there are three basic types Seq, Set, and Map, which are then implemented in mutable and immutable packages. Now we will look at the immutable and mutable collection classes.

Using Immutable Collection Classes

Scala has a wide variety of collections classes. Collections are containers of things. Those containers can be sequenced, linear sets of items as shown in Listing 6-22 and Listing 6-23.

Listing 6-22. A List

scala> val x = List(1,2,3,4)
x: List[Int] = List(1, 2, 3, 4)

Listing 6-23. Filtering Through List

scala> x.filter(a => a % 2 == 0)
res14: List[Int] = List(2, 4)
scala> x
res15: List[Int] = List(1, 2, 3, 4)

They may be indexed items where the index is a zero-based Int (e.g., Array) or any other type (e.g., Map) as illustrated in Listing 6-24.

Listing 6-24. Creating Array

scala> val a = Array(1,2,3)
a: Array[Int] = Array(1, 2, 3)
scala> a(1)
res16: Int = 2

Listing 6-25. Creating a Map

scala> val m = Map("one" -> 1, "two" -> 2, "three" -> 3)
m: scala.collection.immutable.Map[String,Int] = Map(one -> 1, two -> 2, three ->3)
scala> m("two")
res17: Int = 2

The collections may have an arbitrary number of elements or be bounded to zero or one element (e.g., Option). Collections may be strict or lazy.

Lazy collections have elements that may not consume memory until they are accessed (e.g., Range). Let’s create a Range (see Listing 6-26).

Listing 6-26. Creating a Range

scala> 0 to 10
res0: Range.Inclusive = Range(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

The nifty thing about Ranges is that the actual elements in the Range are not instantiated until they are accessed. So we can create a Range for all positive Integers but take only the first five elements. This code runs without consuming many gigabytes of RAM because only the elements that are needed are created (see Listing 6-27).

Listing 6-27. Using Range as Lazy Collection

scala> (1 to Integer.MAX_VALUE - 1).take(5)
res57: scala.collection.immutable.Range = Range(1, 2, 3, 4, 5)

Collections may be mutable (the contents of the reference can change) or immutable (the thing that a reference refers to is never changed). Note that immutable collections may contain mutable items.

Vector

You saw earlier that by default, specifying that you want an IndexedSeq creates a Vector (see Listing 6-28).

Listing 6-28. Creating a Vector

scala> val x = IndexedSeq(1,2,3)
x: IndexedSeq[Int] = Vector(1, 2, 3)

Accessing the vector using index:

scala> x(0)
res53: Int = 1

You can’t modify a vector, so you add elements to an existing vector as you assign the result to a new variable:

scala> val a = Vector(1, 2, 3)
a: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala> val b = a ++ Vector(4, 5)
b: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)

Use the updated method to replace one element in a vector while assigning the result to a new variable:

scala> val c = b.updated(0, "x")
c: scala.collection.immutable.Vector[java.lang.String] = Vector(x, b, c)

You can also use all the usual filtering methods to get just the elements you want out of a vector:

scala> val a = Vector(1, 2, 3, 4, 5)
a: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)
scala> val b = a.take(2)
b: scala.collection.immutable.Vector[Int] = Vector(1, 2)
scala> val c = a.filter(_ > 2)
c: scala.collection.immutable.Vector[Int] = Vector(3, 4, 5)

In these examples, we created each variable as a val and assigned the output to a new variable just to be clear, but you can also declare your variable as a var and reassign the result back to the same variable:

scala> var a = Vector(1, 2, 3)
a: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala> a = a ++ Vector(4, 5)
a: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3, 4, 5)

You may have seen that mixing a mutable variable (var) with an immutable collection causes surprising behavior. For instance, when you create an immutable Vector as a var, it appears you can somehow add new elements to it:

scala> var int = Vector(1)
int: scala.collection.immutable.Vector[Int] = Vector(1)
scala>int = int :+ 2
int: scala.collection.immutable.Vector[Int] = Vector(1, 2)
scala>int = int :+ 3
int: scala.collection.immutable.Vector[Int] = Vector(1, 2, 3)
scala>int.foreach(println)
1
2
3

Though it looks like you’re mutating an immutable collection, what’s really happening is that the int variable points to a new collection each time you use the :+. The int variable (var) is mutable so it’s actually being reassigned to a new collection during each step. However, the elements in an immutable collection like Vector cannot be changed. If you want to change the elements in a mutable collection, use ArrayBuffer.

List[T]

Scala’s List[T] is a linked list of type T. That means it’s a sequential list of any type, including Java’s primitives (Int, Float, Double, Boolean, Char) because Scala takes care of boxing (turning primitives into objects) for you. We can build a list using the same syntax we used to build an array with initial values as shown.

List(1, 2, 3)
scala> List(1,2,3)
res0: List[Int] = List(1, 2, 3)

Like the array type, the list type is parametric and Scala will figure out the best type if you use this syntax. There is no syntax for making an uninitialized list. That is because lists are immutable. Once you have created a list, the values in it cannot be changed. Changing it would require making a new list. However, there is another way to put lists together when we do not know initially all the values that will be stored in them. We can efficiently build lists one element at a time if we add elements to the front of the list. To add elements to a list we use the ”cons” operator, ::. Internally, List is made up of a “cons” cell (the scala.:: class [yes, that’s two colons]) with a tail that refers to another cons cell or the Nil object. It’s easy to create a List:

scala> 1 :: 2 :: 3 :: Nil
res20: List[Int] = List(1, 2, 3)

The previous code creates three cons cells, each with an Int in it. Anything that looks like an operator with a : (colon) as the first character is evaluated right to left. Thus, the previous code is evaluated just like the following:

scala> new ::(1, new ::(2, new ::(3, Nil)))
res21: ::[Int] = List(1, 2, 3)

:: takes a “head” which is a single element and a “tail” which is another List. The expression on the left of the ::is the head, and the expression on the right is the tail. To create a List using ::, we must always put a List on the right side. That means that the right-most element has to be a List, and in this case, we’re using an empty List, Nil.

We can also create a List using the List object’s apply method (which is defined as defapply[T](param: T*):List[T], which translates to “the apply method of type T takes zero or more parameters of type T and returns a List of type T”):

scala> List(1,2,3)
res22: List[Int] = List(1, 2, 3)

The type inferencer is pretty good at figuring out the type of the List, but sometimes you need to help it along:

scala> List(1, 44.5, 8d)
res27: List[AnyVal] = List(1, 44.5, 8.0)
scala> List[Number](1, 44.5, 8d)
res28: List[java.lang.Number] = List(1, 44.5, 8.0)

If you want to prepend an item to the head of the List, you can use ::, which actually creates a new cons cell with the old list as the tail:

scala> val x = List(1,2,3)
scala> 99 :: x
res0: List[Int] = List(99, 1, 2, 3)

Note that the list referred to by the variable x is unchanged, but a new List is created with a new head and the old tail. This is a very fast, constant-time, O(1), operation.

You can also merge two lists to form a new List. This operation is O(n) where n is the number of elements in the first List:

scala> val x = List(1,2,3)
scala> val y = List(99, 98, 97)
scala> x ::: y
res3: List[Int] = List(1, 2, 3, 99, 98, 97)

Getting Functional

The power of List and other collections in Scala come when you mix functions with the collection operators. Let’s say we want to find all the odd numbers in a List. It’s easy:

scala> List(1,2,3).filter(x => x % 2 == 1)
res4: List[Int] = List(1, 3)

The filter method iterates over the collection and applies the function, in this case, an anonymous function, to each of the elements. If the function returns true, the element is included in the resulting collection. If the function returns false, the element is not included in the resulting collection. The resulting collection is the same type of collection that filter was invoked on. If you invoke filter on a List[Int], you get a List[Int]. If you invoke filter on an Array[String], you get an Array[String] back. In this case, we’ve written a function that performs mod 2 on the parameter and tests to see whether the result is 1, which indicates that the parameter is odd.

We can also write a method called isOdd and pass the isOdd method as a parameter (Scala will promote the method to a function):

scala> def isOdd(x: Int) = x % 2 == 1
isOdd: (Int)Boolean
scala> List(1,2,3,4,5).filter(isOdd)
res6: List[Int] = List(1, 3, 5)

filter works with any collections that contain any type. For example:

scala> "99 Red Balloons".toList.filter(Character.isDigit)
res9: List[Char] = List(9, 9)

In this case, we’re converting a String to a List[Char] using the toList method and filtering the numbers. The Scala compiler promotes the isDigit static method on Character to a function, thus demonstrating interoperability with Java and that Scala methods are not magic.

Another useful method for picking the right elements out of a List is takeWhile, which returns all the elements until it encounters an element that causes the function to return false. For example, let’s get all the characters up to the first space in a String:

scala> "Elwood eats mice".takeWhile(c => c != ' ')
res12: Seq[Char] = ArrayBuffer(E, l, w, o, o, d)

Transformation

The map method on List (and Seq), transforms each element of a collection based on a function. For example, if we have a List[String] and want to convert it to all lowercase:

scala> List("A", "Cat").map(s => s.toLowerCase)
>res29: List[java.lang.String] = List(a, cat)

We can shorten the function so the code reads:

scala> List("A", "Cat").map(_.toLowerCase)
res30: List[java.lang.String] = List(a, cat)

The number of elements in the returned collection is the same as the number of elements in the original collection, but the types may be different. If the function passed into map returns a different type, then the resulting collection is a collection of the type returned from the function. For example, we can take a List[String] and calculate the length of each String, which will result in a List[Int]:

scala> List("A", "Cat").map(_.length)
res31: List[Int] = List(1, 3)

map provides a very powerful and uniform way to transform data from one type to another. We can transform our Strings to lowercase, to a List of their length, and we can extract data from a collection of complex objects. For example, if we have a database query that returns records of type Person defined as having a first method that returns a String containing the person’s first name, we can create a List of the first names of the people in the List:

scala> trait Person {def first: String }
defined trait Person
scala> val d = new Person {def first = "David" }
scala> val e = new Person {def first = "Elwood"}
scala> val a = new Person {def first = "Archer"}
scala> List(a, d, e).map(_.first)
res35: List[String] = List(Archer, David, Elwood)

Or, if we’re writing a web app, we can create an <li> (an HTML list element) containing the first name of each Person in our List:

scala> List(a,d,e).map(n =><li>{n.first}</li>)
List(<li>Archer</li>, <li>David</li>, <li>Elwood</li>)

We can combine the operations. Let’s update our Person trait:

trait Person {
      def first: String
      def valid: Boolean
}

Now we can write the code shown in Listing 6-29 to find all the valid Person records, sort by age, and return the first names.

Listing 6-29. First Name of Valid Persons

def validByAge(in: List[Person]) =
  in.filter(_.valid).
  map(_.first)

Reduxio

Scala has other abstractions for common collections operations. reduceLeft allows you to perform an operation on adjacent elements of the collection where the result of the first operation is fed into the next operation. For example, if we want to find the biggest number in a List[Int]:

scala> List(8, 6, 22, 2).reduceLeft(_ max _)
res50: Int = 22

In this case, reduceLeft takes 8 and 6 and feeds them into our function, which returns the maximum value of the two numbers: 8. Next, reduceLeft feeds 8 (the output of the last iteration) and 22 into the function, resulting in 22. Next, reduceLeft feeds 22 and 2 into the function, resulting in 22. Because there are no more elements, reduceLeft returns 22.

We can use reduceLeft to find the longest word:

scala> List("moose", "cow", "A", "Cat").
       reduceLeft((a, b) => if (a.length > b.length) a else b)
res41: java.lang.String = moose

Because Scala’s if expression works like Java’s ternary operator, the if in the previous code returns a if it’s longer than b. We can also find the shortest word:

scala> List("moose", "cow", "A", "Cat").
       reduceLeft((a, b) => if (a.length < b.length) a else b)
res42: java.lang.String = A

reduceLeft throws an exception on an Nil (empty) List. This is correct behavior as there is no way to apply the function on the members of the List as a Nil List has no elements.

foldLeft is similar to reduceLeft, but it starts with a seed value. The return type of the function and the return type of foldLeft must be the same type as the seed. The first example is summing up List[Int]:

scala> List(1,2,3,4).foldLeft(0) (_ + _)
res43: Int = 10

In this case, the seed value is 0. Its type is Int. foldLeft feeds the seed and the first element of the List, 1, into the function, which returns 1. Next, foldLeft feeds 1 (the result of the previous iteration) and 2 (the next element) into the function, resulting in 3. The process continues, and the sum of the List[Int] is generated: 10. We can generate the product of the List the same way:

scala> List(1,2,3,4).foldLeft(1) (_ * _)
res44: Int = 24

But because the return type of foldLeft is the type of the seed, not the type of the List, we can figure out the total length of a List[String]:

scala> List("b", "a", "elwood", "archer").foldLeft(0)(_ + _.length)
res51: Int = 14

I find that sometimes I need to work with more than one collection at a time. For example, if we want to generate the List of products of the numbers from 1 to 3:

scala> val n = (1 to 3).toList
n: List[Int] = List(1, 2, 3)
scala> n.map(i => n.map(j => i * j))
res53: List[List[Int]] = List(List(1, 2, 3), List(2, 4, 6), List(3, 6, 9))

We have nested map invocations, and that results in a List[List[Int]]. In some cases, this may be what we want. In other cases, we want the results in a single List[Int]. In order to nest the map operations but flatten the results of nested operations, we use the flatMap method:

scala> n.flatMap(i => n.map(j => i * j))
res58: List[Int] = List(1, 2, 3, 2, 4, 6, 3, 6, 9)

Look Ma, No Loops

So far, we’ve written a bunch of code that manipulates collections without explicit looping. By passing functions, that is, logic, to methods that control the looping, we let the library writers define the looping, and we define the logic in our app. However, syntactically, nested map, flatMap, and filter can get ugly. For example, if we want to find the product of the odd numbers from 1 to 10 times the even numbers from 1 to 10, we could write the following:

scala> def isOdd(in: Int) = in % 2 == 1
scala> def isEven(in: Int) = !isOdd(in)
scala> val n = (1 to 10).toList
scala> n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))
res60: List[Int] = List(2, 6, 10, 14, 18, ... 10, 30, 50, 70, 90)

Scala provides the for comprehension, which provides syntactically pleasing nesting of map, flatMap, and filter. We can convert the nested statements from the previous example into a syntactically pleasing statement:

scala> for {i <- n if isEven(i); j <- n if isOdd(j)} yield i * j
res59: List[Int] = List(2, 6, 10, 14, 18, ... 10, 30, 50, 70, 90)

The for comprehension is not a looping construct but is a syntactic construct that the compiler reduces to map, flatMap, and filter. In fact, the two lines

n.filter(isEven).flatMap(i => n.filter(isOdd).map(j => i * j))

and

for {i <- n if isEven(i); j <- n if isOdd(j)} yield i * j

result in the same bytecode. The for comprehension can be used with any class, including user-generated classes, that implement map, flatMap, filter, and foreach. This means you can create your own classes that work with the for comprehension.

Lists also work well with Scala’s pattern matching and recursive programming. We’ll be exploring pattern matching in depth in Chapter 5. For this example, pattern matching is a lot like Java’s switch statement, but it can be used to compare things that are more complex than Ints, and Scala’s pattern matching allows you to match some elements and extract, or capture, others into variables.

The pattern-matching syntax is the same as List construction syntax. For example, if we are matching against List[Int], case 1 :: Nil =>will match List(1). case 1 :: 2 :: Nil => will match List(1,2). case 1 :: rest => will match any List that starts with 1 and will put the tail of the List into the variable rest.

Our example converts a List[Char] of Roman numerals to their Arabic numeral equivalent. The code matches the List to a series of patterns. Based on the matched pattern, a value is returned. The patterns are matched in order of appearance. However, the compiler may optimize the patterns by eliminating duplicate tests1. The code to convert from Roman numerals to Int is in Listing 6-30.

Listing 6-30. Roman Numerals

def roman(in: List[Char]): Int = in match {
    case 'I' :: 'V' :: rest => 4 + roman(rest)
    case 'I' :: 'X' :: rest => 9 + roman(rest)
    case 'I' :: rest => 1 + roman(rest)
    case 'V' :: rest => 5 + roman(rest)
    case 'X' :: 'L' :: rest => 40 + roman(rest)
    case 'X' :: 'C' :: rest => 90 + roman(rest)
    case 'X' :: rest => 10 + roman(rest)
    case 'L' :: rest => 50 + roman(rest)
    case 'C' :: 'D' :: rest => 400 + roman(rest)
    case 'C' :: 'M' :: rest => 900 + roman(rest)
    case 'C' :: rest => 100 + roman(rest)
    case 'D' :: rest => 500 + roman(rest)
    case 'M' :: rest => 1000 + roman(rest)
    case _ => 0
}

case 'I' :: 'V' :: rest => 4 + roman(rest) tests the first two characters, and if they are IV, the method returns 4 plus the Roman numeral conversion of the rest of the List[Char]. If the test falls through to case _ => 0, there are no more Roman numerals, 0 is returned, and there’s no more recursion—no more calls back into the roman() method. Without explicit looping or length testing or explicit branching logic, we’ve written a concise, readable method.

Scala’s List and other sequential collections provide powerful ways to define business logic in a concise, maintainable way. In the next section, we’re going to explore Tuples, which are fixed-length collections where each element can be a different type.

Range

Ranges are often used to populate data structures, and to iterate over for loops. Ranges provide a lot of power with just a few methods, as shown in these examples:

scala> 1 to 10
res0: scala.collection.immutable.Range.Inclusive =Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
scala> 1 until 10
res1: scala.collection.immutable.Range = Range(1, 2, 3, 4, 5, 6, 7, 8, 9)
scala> 1 to 10 by 2
res2: scala.collection.immutable.Range = Range(1, 3, 5, 7, 9)
scala> 'a' to 'c'
res3: collection.immutable.NumericRange.Inclusive[Char] = NumericRange(a, b, c)

You can use ranges to create and populate sequences:

scala> val x = (1 to 10).toList
x: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

Stream

A Stream is like a List, except that its elements are computed lazily, in a manner similar to how a view creates a lazy version of a collection. Because Stream elements are computed lazily, a Stream can be long ... infinitely long. Like a view, only the elements that are accessed are computed. Other than this behavior, a Stream behaves similar to a List.

Just like a List can be constructed with ::, a Stream can be constructed with the #:: method, using Stream.empty at the end of the expression instead of Nil:

scala> val stream = 1 #:: 2 #:: 3 #:: Stream.empty

stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)

The REPL output shows that the stream begins with the number 1 but uses a ? to denote the end of the stream. This is because the end of the stream hasn’t been evaluated yet.

For example, given a Stream:

scala> val stream = (1 to 100000000).toStream
stream: scala.collection.immutable.Stream[Int] = Stream(1, ?)

you can attempt to access the head and tail of the stream. The head is returned immediately:

scala> stream.head
res0: Int = 1
but the tail isn't evaluated yet:
scala> stream.tail
res1: scala.collection.immutable.Stream[Int] = Stream(2, ?)

The ? symbol is the way a lazy collection shows that the end of the collection hasn’t been evaluated yet.

Tuples

Have you ever written a method that returns two or three values? Let’s write a method that takes a List[Double] and returns the count, the sum, and the sum of squares returned in a three-element Tuple, a Tuple3[Int, Double, Double]:

def sumSq(in: List[Double]): (Int, Double, Double) =
    in.foldLeft((0, 0d, 0d))((t, v) => (t._1 + 1, t._2 + v, t._3 + v * v))

The sumSq method takes a List[Double] as input and returns a Tuple3[Int, Double, Double]. The compiler desugars (Int, Double, Double) into Tuple3[Int, Double, Double]. The compiler will treat a collection of elements in parentheses as a Tuple. We seed the foldLeft with (0, 0d, 0d), which the compiler translates to a Tuple3[Int, Double, Double]. The function takes two parameters: t and v. t is a Tuple3, and v is a Double. The function returns a new Tuple3 by adding 1 to the first element of the Tuple, adding v to the second element of the Tuple, and adding the square of v to the third element of the Tuple. Using Scala’s pattern matching, we can make the code a little more readable:

def sumSq(in: List[Double]) : (Int, Double, Double) =
    in.foldLeft((0, 0d, 0d)){
     case ((cnt, sum, sq), v) => (cnt + 1, sum + v, sq + v * v)}

You can create Tuples using a variety of syntax:

scala> Tuple2(1,2) == Pair(1,2)
scala> Pair(1,2) == (1,2)
scala> (1,2) == 1 -> 2

The last example, 1 -> 2, is a particularly helpful and syntactically pleasing way for passing pairs around. Pairs appear in code very frequently, including name/value pairs for creating Maps.

Map[K, V]

A Map is a collection of key/value pairs. Any value can be retrieved based on its key. Keys are unique in the Map, but values need not be unique. In Java, Hashtable and HashMap are common Map classes. The default Scala Map class is immutable. This means that you can pass an instance of Map to another thread, and that thread can access the Map without synchronizing. The performance of Scala’s immutable Map is indistinguishable from the performance of Java’s HashMap.

We can create a Map:

scala> var p = Map(1 -> "David", 9 -> "Elwood")
p: ... Map[Int,String] = Map(1 -> David, 9 -> Elwood)

We create a new Map by passing a set of Pair[Int, String] to the Map object’s apply method. Note that we created a var p rather than a val p. This is because the Map is immutable, so when we alter the contents on the Map, we have to assign the new Map back to p.

We can add an element to the Map:

scala> p + 8 -> "Archer"
res4: ... Map[Int,String] = Map(1 -> David, 9 -> Elwood, 8 -> Archer)

But we haven’t changed the immutable Map:

scala> p
res5: ... Map[Int,String] = Map(1 -> David, 9 -> Elwood)

In order to update p, we have to assign the new Map back to p:

scala> p = p + 8 -> "Archer"

or:

scala> p += 8 -> "Archer"

And we can see that p is updated:

scala> p
res7: Map[Int,String] = Map(1 -> David, 9 -> Elwood, 8 -> Archer)

We can get elements out of the Map:

scala> p(9)
res12: java.lang.String = Elwood

What happens when we ask for an element that doesn’t exist?

scala> p(88)
java.util.NoSuchElementException: key not found: 88

This is mighty inconvenient. If you try to get an element that’s not in the Map, you get an exception. That’s kind of jarring. So far, we haven’t seen much in Scala that results in exceptions being thrown, but it makes logical sense. If you request something that doesn’t exist, that’s an exceptional situation. Java’s Map classes handle this situation by returning null, which has two drawbacks. First, you have to null-test the result of every Map access. Second, it means you can’t store a null in a Map. Scala has a kinder and gentler mechanism for dealing with this situation. The get() method on Map returns an Option (Some or None) that contains the result:

scala> p.get(88)
res10: Option[java.lang.String] = None
scala> p.get(9)
res11: Option[java.lang.String] = Some(Elwood)

You can return a default value if the key is not found:

scala> p.getOrElse(99, "Nobody")
res55: java.lang.String = Nobody
scala> p.getOrElse(1, "Nobody")
res56: java.lang.String = David

We can also use flatMap with Options to find all the values with keys between 1 and 5:

scala> 1 to 5 flatMap(p.get)
res53: Seq.Projection[java.lang.String] = RangeG(David)

In this case, we create a range of numbers from 1 to 5. We flatMap this collection, passing in a function, p.get. Wait, you say, p.get isn’t a function, it’s a method, but you didn’t include the parameter. Scala is very cool, because if it’s expecting a function with parameters of a particular type and you pass a method that takes those parameters, Scala will promote the method with its missing parameters to a function. We’ll explore Options in the next subsection.

Let’s continue exploring Map. We can remove elements from our Map:

scala> p -= 9
scala> p
res20: Map[Int,String] = Map(1 -> David, 8 -> Archer)

We can test the Map to see whether it contains a particular key:

scala> p.contains(1)
res21: Boolean = true

We can operate on the collection of keys. We get a collection of keys from our Map and use reduceLeft to find the largest key:

scala> p.keys.reduceLeft(_ max _)
res22: Int = 8

And we can use reduceLeft on the collection of values to find the largest String:

scala> p.values.reduceLeft((a, b) => if (a > b) a else b)
res23: java.lang.String = David

We can test whether any of the values contains the letter “z”:

scala> p.values.exists(_.contains("z"))
res28: Boolean = false

You can also add a bunch of elements to a Map using the ++ method:

scala> p ++= List(5 -> "Cat", 6 -> "Dog")
p: Map[Int,String] = Map(1 -> David, 8 -> Archer, 5 -> Cat, 6 -> Dog)

And you can remove a bunch of keys with the -- method:

scala> p --= List(8, 6)
res40: Map[Int, String] = Map(1 -> David, 5 -> Cat)

Maps are Scala collections and have collection manipulation methods. This means we can use methods, including map, filter, and foldLeft. One of the tricky parts of using Java’s immutable collections is iterating over the collection and simultaneously removing elements. In my code, I have to create an accumulator for the keys I’m going to remove, loop over the collection, find all the keys to remove, and then iterate over the collection of keys to remove and remove them from the collection. Not only that, but I frequently forget how brittle Hashtable is and inevitably forget this sequence and get some nasty runtime errors. In Scala, it’s much easier. But there’s a simpler way to remove unwanted elements from a Map:

def removeInvalid(in: Map[Int, Person]) = in.filter(kv => kv._2.valid)

Pretty cool, huh? Map has a filter method that works just like List’s filter method. The kv variable is a Pair representing the key/value pair. The filter method tests each key/value pair by calling the function and constructs a new Map that contains only the elements that passed the filter test.

Mutable Collections

The List, Set, and Map immutable collections we are familiar with cannot be changed after they have been created. They can, however, be transformed into new collections. For example, we can create an immutable map, and then transform it by removing one mapping and adding another.

val immutableMap = Map(1 -> "a", 2 -> "b", 3 -> "c")
scala> val immutableMap = Map(1 -> "a", 2 -> "b", 3 -> "c")
immutableMap: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)
val newMap = immutableMap - 1 + (4 -> "d")
scala> val newMap = immutableMap - 1 + (4 -> "d")
newMap: scala.collection.immutable.Map[Int,String] = Map(2 -> b, 3 -> c, 4 -> d)

Removing “a” and adding “d” gives us a different collection, while the original collection immutableMap remains the same.

scala> println(newMap)
Map(2 -> b, 3 -> c, 4 -> d)
scala> println(immutableMap)
Map(1 -> a, 2 -> b, 3 -> c)

What you end up with is a completely new collection stored in newMap. The original collection immutableMap, remains untouched.

However, there are times when you do want mutable data. For example, creating a mutable data structure that is only used within a function.

You can create mutable collections by directly from immutable collections. The List, Map, and Set immutable collections can all be converted to the mutable collection.mutable.Buffer type with the toBuffer method. Here is an example of converting an immutable map to a mutable one and then changing it back.

scala> val m = Map(1->"a", 2 -> "b")
m: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b)
scala> val b = m.toBuffer
b: scala.collection.mutable.Buffer[(Int, String)] = ArrayBuffer((1,a), (2,b))

The map, containing key-value pairs, is now a sequence of tuples. We can now add a new entry.

scala> b += (3 ->"c")
res56: b.type = ArrayBuffer((1,a), (2,b), (3,c))

After adding new entry, you can now change the buffer to map again.

scala> val newMap = b.toMap
newMap: scala.collection.immutable.Map[Int,String] = Map(1 -> a, 2 -> b, 3 -> c)

The buffer methods toList and toSet can be used in addition to toMap to convert a buffer to an immutable collection. The most straightforward way to modify collections is with a mutable collection type. We will look at some mutable collection classes in the following sections.

Mutable Queue

A queue is a first-in, first-out (FIFO) data structure. Scala offers both an immutable queue and mutable queue. You can create an empty, mutable queue of any data type. Listing 6-31 shows mutable queue for int.

Listing 6-31. Creating a Queue

import scala.collection.mutable.Queue
var ints = Queue[Int]()
scala> import scala.collection.mutable.Queue
import scala.collection.mutable.Queue
scala> var ints = Queue[Int]()
ints: scala.collection.mutable.Queue[Int] = Queue()

Image Note  While the collection.immutable package is automatically added to the current namespace in Scala, the collection.mutable is not. When creating mutable collections, make sure to include the full package name for the type.

Once you have a mutable queue, add elements to it using +=, ++=, and enqueue, as shown in Listing 6-32 through 6-34.

Listing 6-32. Adding Elements to the Queue

scala> ints += 1
res46: scala.collection.mutable.Queue[Int] = Queue(1)

Listing 6-33. Adding Multiple Elements to the Queue

scala> ints += (2, 3)
res47: scala.collection.mutable.Queue[Int] = Queue(1, 2, 3)

Listing 6-34. Using enqueue

scala> ints.enqueue(4)

scala> ints
res49: scala.collection.mutable.Queue[Int] = Queue(1, 2, 3, 4)

Because a queue is a FIFO, you typically remove elements from the head of the queue, one element at a time, using dequeue (see Listing 6-35).

Listing 6-35. Using dequeue

ints.dequeue
scala> ints.dequeue
res50: Int = 1
scala> ints
res51: scala.collection.mutable.Queue[Int] = Queue(2, 3, 4)

A Queue is a collection class that extends from Iterable and Traversable, so it has all the usual collection methods, including foreach, map, and so on.

Mutable Stack

A stack is a last-in, first-out (LIFO) data structure. Scala has both immutable and mutable versions of a stack. The following examples demonstrate how to use the mutable Stack class.

Listing 6-36 creates an empty, mutable stack of int data type.

Listing 6-36. Creating a Stack

import scala.collection.mutable.Stack
var ints = Stack[Int]()
scala> import scala.collection.mutable.Stack
import scala.collection.mutable.Stack
scala> var ints = Stack[Int]()
ints: scala.collection.mutable.Stack[Int] = Stack()

You can also populate a stack with initial elements when you create it as seen in Listing 6-37.

Listing 6-37. Creating a Stack

val ints = Stack(1, 2, 3)
scala> val ints = Stack(1, 2, 3)
ints: scala.collection.mutable.Stack[Int] = Stack(1, 2, 3)

You can now push elements onto the stack with push:

Listing 6-38. Creating a Stack

scala> ints.push(4)
res41: ints.type = Stack(4, 1, 2, 3)

Listing 6-39. Creating a Stack

scala>ints.push(5, 6,7)
scala> ints.push(5, 6 ,7)
res42: ints.type = Stack(7, 6, 5, 4, 1, 2, 3)

To take elements off the stack, pop them off the top of the stack (see Listing 6-40).

Listing 6-40. Creating a Stack

scala> val lastele = ints.pop
lastele: Int = 7

With this we complete the discussion of mutable collections and this chapter. The scala collections framework is broad, rather than deep, and deserves a book of its own. Nevertheless, this chapter tried to show you how to use a variety of Scala’s collection types.

Summary

This chapter gave you the tour of scala collections framework and showed you three main packages of the framework. This chapter has delved deeply into how to use lists. You have seen many ways to work with lists. You have seen the basic operations like head and tail, the higher-order operations like map, and the utility methods in the List object. Lists are just one kind of collection that Scala supports, however. This chapter has given an overview of the Scala collections library and the most important classes and traits in it. With this foundation you should be able to work effectively with Scala collections, and know where to look in Scaladoc when you need more information. For now we’ll turn our attention to the internals of Scala traits in the next chapter.

________________________

1You can see exactly how Scala turns patterns into code by typing scalac -print FileName.scala. This will cause the Scala compiler to emit desugared code that looks strangely like Java code.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset