CHAPTER 4

image

Functional Programming in Scala

In the non-fiction work Old Times on the Mississippi, Mark Twain wrote: “When I was a boy of 14, my father was so ignorant I could hardly stand to have the old man around. But when I was 21, I was astonished at how much the old man had learned in seven years”. Functional programming is the old man that comes to the rescue when writing robust concurrent software. Functional programming treats computation as the evaluation of mathematical functions and avoids state and mutable data. It is a declarative programming paradigm, in which programming is done with expressions. The imperative style of programming emphasizes sequence of operations characterized by iteration with loops, mutating data in place, and methods with side effects where the order of side effects is critical toward the right effect. The basic constructs in an imperative language, such as Java, are imperative statements that change the state of a program, as illustrated here:

x = x + 1

The functional style of programming emphasizes the results; characterized by passing function values into looping methods, immutable data, and methods with no side effects where the order in which operations occur is of no importance. In a functional language such as Scala, the basic constructs are declarative, as seen here and there are no side effects:

f(int x){return x + 1}

In functional language, the computation proceeds primarily by evaluation expressions. As a language that supports functional programming, Scala encourages an expression-oriented programming (EOP) model.

Expression-Oriented Programming

In expression-oriented programming every statement is an expression. To understand EOP, you have to understand the difference between a statement and an expression. A statement executes code, but does not return any value, for example:

customer.computeDiscount()

An expression returns value. Expressions are blocks of code that evaluate to a value as seen here:

val discount = computeDiscount(customer)

Image Note  An expression-oriented programming language is a programming language where every construct is an expression, and thus evaluates to a value.

In Scala, the following expression returns a result:

scala> 2 + 2
res0: Int = 4

The strength of expression-oriented programming is more discernible from an if/else expression, which also returns a value in Scala:

val test = if (3 > 2) "true" else "false"

The preceding if clause checks a conditional expression and returns one expression or another, depending on the value of the conditional expression. An if block in Java does not evaluate to a value. The code as illustrated here is illegal in Java, in contrast to Scala, because the if clause in Java is a statement, not an expression.

boolean test = if (3 > 2) "true" else "false"

To accomplish the same effect shown in the Scala if clause, you have to use the ?: syntax in Java as illustrated here:

boolean test = 3 > 2 ? true : false ;

In Java, there is a difference between if and ?:if is a statement while ?: is an expression. The if clause in Scala is much more like ?: in Java than the if clause in Java. Scala has unified the concept of ?: with its if blocks and so Scala has no ?: syntax.

Image Note  As mentioned earlier, every construct in Scala is an expression where the order in which operations occur is of no importance and therefore these expressions can be executed in any order. This simple concept has a deep implication in concurrency in multicore programming where you can execute expressions in parallel. We explore concurrency in Chapter 10.

A Pure Function

In mathematics, functions are nothing short of pure, in that they lack side effects. Consider the classic functions in(x):

y = sin(x).

Regardless of how many times sin(x) gets called, sin(x) does not modify the state. Such a function is called a pure function and is oblivious to the context. This obliviousness to the surrounding context is known as referential transparency.

Referential Transparency

An expression is referentially transparent if it can be substituted by its resulting value, without changing the behavior of the program, regardless of where the expression is used in the program. For instance, you can assign the expression of two immutable variables x and y to a third variable z, like this:

val z = x + y

Now, anywhere the expression x + y is used throughout the given scope of your program, you can substitute it by z within that scope, without affecting the result of the program. As stated before, functional programming gives you the right foundation to think about concurrency. The three keystones of this foundation are: referential transparency, higher-order function, and immutable value. Understanding these key elements is crucial to understanding functional programming. In functional programming, a pure function with one or more input parameters, does not mutate the input parameters and always returns the same value for the same input.

Image Note  A pure function is referentially transparent and has no side effects.

A pure function is free of side-effects, however, a function that never cause side effects would be useless. A language that does not sanction side effects would be useless, as input and output is essentially the ramification of side effects.

We have introduced sufficient theory for you to begin to explore Scala functions. In Chapter 2, you learned the basic syntax of Scala functions and how you can declare, define, and call functions. Before continuing with the sections that follow in this chapter, we recommend you skim through the Scala functions introduced in Chapter 2.

Now we will get started with a basic functional construct in Scala—the function literal.

Function Literal/Anonymous Function

A literal is the simplest form of expression. A literal is a notation for representing a fixed value in source code. Almost all programming languages have notations for atomic values such as integers, floating-point numbers, strings, and so on. Literals are often used to initialize variables. In the following, 1 is an integer literal:

 int x = 1;

Scala, allows you to express functions as literals. Function literals allow you to have an expression of a function type that you can write in a short format without declaring a name for it. A function type could be one of the following:

  • The type of a variable or parameter to which a function can be assigned
  • An argument of a higher-order function taking a function parameter
  • The result type of higher-order function returning a function

The syntax for a function literal starts with a parenthesized comma−separated list of arguments followed by an arrow and the body of the function. A function literal is also called an anonymous function, that is, a function without any name specified with function literal syntax. Consider an add function:

val add = (x: Int, y: Int) => x + y

Using the function literal you can define the add function as illustrated here:

(x: Int, y: Int) => x + y.

A function literal is instantiated into objects called function values. A function value is a function object and you can invoke the function object in the same manner as you invoke any other function. The function object extends one of the FunctionN traits, such as Function0, Function1, and so on up to Function22.Depending on the number of arguments in the function, the corresponding FunctionN trait is chosen by the compiler. For a function with two arguments, the compiler elects Function2 as the underlying type. For a function with 3 arguments compiler chooses  Function3 , for a function with 4 arguments,  Function4  and so on.

Because the function value is an object, it could be stored in a variable and it could be invoked using the parentheses function-call as illustrated here:

scala> val add = (a: Int, b: Int) => a + b
add: (Int, Int) => Int = <function2>
scala>add(1, 2)
res23: Int = 3

The invocation of this function is converted to a call to the apply method of the function class instance, which is assigned to the variable.

From these kind of function literals the Scala compiler generates a function object that mixes in one of the FunctionN traits—the left side of the → becomes the parameter list and the right side becomes the implementation of the apply method. Every function that you define in Scala becomes an instance of an implementation that features a certain FunctionN Trait ranging from Function1 up to Function22.

Now, to take a deeper look into Function traits we first write a function that calculates the area of a rectangle as illustrated here:

val areaOfRectangle:(Int, Int) => Int = (width:Int, height:Int) => {
width*height
}

When you run this function in REPL you will see that compiler elects and chooses the Function2 Trait for this function. Why? Simply because there are two arguments to this function.

scala> val areaOfRectangle:(Int, Int) => Int = (width:Int, height:Int) => {
     |  width*height
     | }
areaOfRectangle: (Int, Int) => Int = <function2>

You can invoke this function as seen here:

scala> areaOfRectangle(5,3)
res01: Int = 15

Now, let’s take a look at Trait scala.Function2 in the Scala package:

trait Function2[-T1, -T2, +R] extends AnyRef {
    ...
abstract def apply( v1 :T1, v2 :T2 ) : R
    ...
}

This shows only the apply method. The two type parameters T1, T2 in the apply method take the type of the arguments, while type parameter R represents the function's return type.

For every function that you define in Scala, the compiler comes up with an instance of the appropriate Function Trait, where the type parameters are parameterized with the given types of the arguments and the return type of the function.

In the areaOfRectangle function we defined earlier, the type of areaOfRectangle function is

(Int, Int)  =>Int

This is same as illustrated here:

Function2[Int,Int,Int]

So we could have also defined our add function this way:

val areaOfRectangle:  Function2[Int,Int,Int]  = (width:Int, height:Int) => { width*height }

You can test this on REPL as seen here:

scala> val areaOfRectangle:  Function2[Int,Int,Int]  = (width:Int, height:Int) => { width*height }
areaOfRectangle: (Int, Int) => Int = <function2>

Now, you could explicitly call the method apply on a given function as illustrated:

areaOfRectangle.apply(5,3)

You can test this on REPL as illustrated here:

scala> val area = areaOfRectangle.apply(5,3)
area: Int = 15

You could go a step further and define a function by implementing an appropriate Function Trait and define its required apply method. Let’s do this for the areaOfRectangle function:

val areaOfRectangle :(Int, Int) => Int = new Function2[Int, Int, Int]{
        def apply(width:Int, height:Int):Int = {
                width*height
        }
}

You can test this on REPL as seen here:

scala> areaOfRectangle(5,3)
res18: Int = 15

Now that you have learned function values and function types, you will see how you can use a function literal to pass it into a method that takes a function, or to assign it to a variable. Now we will discuss these in detail in the following section.

First Class Function and Higher Order Function

One of the key factors in Scala that beautifully blends functional paradigm into object-oriented paradigm is that functions are objects. In functional programming, functions are first-class citizens. A first-class function is a function that can be

  1. Assigned to variables,
  2. Passed as an argument to the other function, and
  3. Returned as values from the other function.

The other function emphasized in point 2 that takes a function as an argument and the other function emphasized in point 3 that returns a function, are called higher-order functions. In the sections that follow, you will learn about all three aspects of a first class function.

Image Note  In functional programming, functions are first-class citizens, meaning functions can be assigned to variables, functions can be passed to other functions, and functions can be returned as values from other functions. And such functions, which take functions as arguments or return a function, are called higher-order functions.

Function as Variable

Just as you pass String, Int, and other variables around in an OOP, you can pass a function around like a variable. You can define a function literal, and then assign that literal to a variable. The following code defines a function literal that takes an Int parameter and returns a value that is twice the amount of the Int that is passed in:

(i: Int) => { i * 2 }

You can now assign that function literal to a variable:

val doubler = (i: Int) => { i * 2 }
scala> val doubler = (i: Int) => { i * 2 }
doubler: Int => Int = <function1>

The variable doubler is an instance of a function, known as a function value. You can now invoke doubler as illustrated here:

doubler(2)
scala> doubler(2)
res25: Int = 4

Under the hood, doubler is an instance of the Function1 trait, which defines a function that takes one argument. In terms of implementation, doubler is a function created using the keyword val and assigned to a variable. To define the doubler as a method instead of as a function you have to define the doubler method in a class, and use the keyword def to define a method.

Beyond just invoking doubler, you can also pass it to any function (or method) that takes a function parameter. We will discuss this in the following section.

Function as Parameter

You can create a function or a method that takes a function as a parameter. For this, first define a method that takes a function as a parameter.

def operation(functionparam:(Int, Int) => Int) {
println(functionparam(4,4))
}
scala> def operation(functionparam:(Int, Int) => Int) {
     | println(functionparam(4,4))
     | }
operation: (functionparam: (Int, Int) => Int)Unit

The operation method takes one parameter named functionparam, which is a function. The functionparam function takes two Int and returns an Int. The operation method returns a Unit that indicates that operation method returns nothing.

Next, define a function that matches the expected signature. The following add function matches that signature, because it takes two Int arguments and returns an Int:

val add = (x: Int, y:Int) => { x + y }

Now you can pass an add function into the operation method:

operation(add)
scala> operation(add)
8

Any function that matches this signature can be passed into the operation method. Let’s define two new functions named subtract and multiply that take two Int and return an Int:

val subtract = (x: Int, y:Int) => { x - y }
val multiply = (x: Int, y:Int) => { x*y }

Now you can pass these functions into your operation method:

operation(subtract)
scala> operation(subtract)
0
operation(multiply)
scala> operation(multiply)
16

Returning a Function

You can return a function from a function or method. In order to do this, first define an anonymous function.

The following code declares an anonymous function that takes a String argument and returns a String:

(name: String) => { "hello" + " " + name }

Now we will define a method that returns the anonymous function that we just defined.

def greeting() = (name: String) => {"hello" + " " + name}

On the left side of the = symbol you have a normal method declaration:

def greeting()

On the right side of the = is a function literal (an anonymous function):

(name: String) => {"hello" + " " + name}
scala> def greeting() = (name: String) => {"hello" + " " + name}
greeting: ()String => String

Now you can assign greeting() to a variable:

val greet= greeting()
scala>  val greet= greeting()
greet: String => String = <function1>

The greet function is now equivalent to your anonymous function (name: String) => {"hello" + " " + name}.

Because the anonymous function takes a String parameter name, you can pass it a name:

greet("Reader")
scala> greet("Reader")
res26: String = hello Reader

Closure

A closure is a function, whose return value depends on the value of one or more variables declared outside this function. Consider the following multiplier function:

val multiplier = (x:Int) => x * 3

In the multiplier function, i is the variable used in the function body. x is defined as a parameter to the function. Now let’s modify the multiplier function as illustrated here:

val multiplier = (x:Int) => x * y

Because is a formal parameter to the function, it is bound to a new value each time multiplier is called. However, j is not a formal parameter. Let’s further modify the multiplier function as illustrated:

var y = 3
val multiplier = (x:Int) => x * y

Now, y has a reference to a variable outside the function but in the enclosing scope.

scala> var y = 3
y: Int = 3
scala> val multiplier = (x:Int) => x * y
multiplier: Int => Int = <function1>

Now you can invoke the multiplier function as illustrated here:

multiplier(3)
scala> multiplier(3)
res37: Int = 9

The multiplier function references j and reads its current value each time. The Scala compiler creates a closure (closes over) that encompasses the variable in the enclosing scope.

Partially Applied Function

In functional programming languages, when you call a function that has parameters, you are said to be applying the function to the parameters. When all the parameters are passed to the function you have fully applied the function to all the parameters.

A simple add function:

val add = (x: Int, y: Int) => x + y
scala> val add = (x: Int, y: Int) => x + y
add: (Int, Int) => Int = <function2>

<function2> indicates it is a function of two parameters.

scala> add(1,2)
res01: Int = 3

But when you give only a subset of the parameters to the function, the result of the expression is a partially applied function.

val partiallyAdd = add(1, _:Int)

Because you haven’t provided a value for the second parameter, the variable partiallyAdd is a partially applied function. You can see this in the REPL:

scala> val partiallyAdd = add(1, _:Int)
partiallyAdd: Int => Int = <function1>

The output in the REPL shows that partiallyAdd is a function that implements the Function1 trait. Implemeting a Function1 trait indicates that partiallyAdd function takes one argument. When you give partiallyAdd an Int value 2, you get the sum of the Int number passed into the add and partiallyAdd functions:

scala> partiallyAdd(2)
res02: Int = 3

The first argument 1 was passed into the original add function and the new function named partiallyAdd was created, which is a partially applied function; then, the second argument 2 was passed into partiallyAdd. When you provide all the parameters, the original function is executed, yielding the result.

Curried Function

Currying converts a function with multiple parameters creating a chain of function, each expecting a single parameter.

Let’s look at a simple add function that adds two Int parameters, a and b, as illustrated here:

val add = (x: Int, y: Int) => x + y
scala> val add = (x: Int, y: Int) => x + y
add: (Int, Int) => Int = <function2>
scala> add(3,3)
res38: Int = 6

In Scala, curried functions are defined with multiple parameter lists, as follows:

def add(x: Int)(y: Int) = x + y

You can also use the following syntax to define a curried function:

def add(x: Int) = (y: Int) => x + y

As you can see, instead of one list of two Int parameters, you apply the curried add function to two lists of one Int parameter. Thus the curried add function looks like this:

scala> def curriedAdd(a: Int)(b: Int) = a + b
curriedAdd: (a: Int)(b: Int)Int
scala> curriedAdd(2)(2)
res1: Int = 4

You could define more than two parameters on a curried function. Our add function that takes two arguments, is transformed into its curried equivalent.

Function Composition

In functional programming, you can compose functions from other functions. For example, tan(x) = sin(x)/cos(x). An implication of composability is that functions can be treated as values. So far, we’ve created simple functions and manipulated the function instances. However, we can also build functions from other functions. Functional composition provides the basis for a lot of cool things in Scala, including the parser combinator, which we will explore in Chapter 8. But for now, let’s see the difference between interpreting a series of commands and “compiling” a function that interprets them. First, let’s define a grammar. In our grammar, we have expressions, which can be constant values or named variables. Expressions can also be addition or multiplication of other expressions. Here’s a collection of case classes that describes our grammar (recall that we covered case classes in Chapter 2):

sealed trait Expr
case class Add(left: Expr, right: Expr) extends Expr
case class Mul(left: Expr, right: Expr) extends Expr
case class Val(value: Int) extends Expr
case class Var(name: String) extends Expr
We can build expressions like 1 + 1, Add(Val(1), Val(1)), 3 * (1 + 1), Mul(Val(3),
Add(Val(1), Val(1)), and a * 11, Mul(Var("a"), Val(11)).

We can evaluate an expression by interpreting the expression:

def calc(expr: Expr, vars: Map[String, Int]): Int = expr match {
case Add(left, right) => calc(left, vars) + calc(right, vars)
case Mul(left, right) => calc(left, vars) * calc(right, vars)
case Val(v) => v
case Var(name) => vars(name)
}

Let’s look at how this method works. expr is the expression to evaluate, and vars is a Map that contains our variables. We use pattern matching to determine what to do based on the case class. If expr is an Add, we extract the left and right parameters, which are themselves Exprs. We call calc to calculate the value of the left and right parameters and add the results. If expr is Mul, we do the same thing (except we multiply things rather than adding them). If expr is Val, we simply extract the value and return it. If expr is Var, we extract the name and return the lookup of the name in the vars Map. We can turn this from a method call into a function. Having a function allows us to pass around the logic that the expression represents. It also means that we don’t have to interpret the tree of Exprs each time. Let’s see how we can compose a function based on the Expr.

def buildCalc(expr: Expr): Map[String, Int] => Int = expr match {
case Add(left, right) =>
val lf = buildCalc(left)
val rf = buildCalc(right)
m => lf(m) + rf(m)
case Mul(left, right) =>
val lf = buildCalc(left)
val rf = buildCalc(right)
m => lf(m) * rf(m)
case Val(v) => m => v
case Var(name) => m => m(name)
}

The buildCalc method returns a function that can be passed to other functions. Also, the JVM can optimize the composed functions so that they perform better than the interpreted version. The performance of the composed function is better because there is no overhead associated with pattern matching each element. The function is evaluated by repeatedly calling the function’s apply method. Thus, the cost of each node is one or two method dispatches rather than the cost of the pattern matching. Let’s turn to other ways that functions can help us improve performance and readability.

Tail Calls and Tail Call Optimization

A recursive function is one that may invoke itself. Recursion plays a crucial role in functional programming because it offers a way to iterate over data structures using mutable data, since each function call has its own stack for storing function parameters. One classic example of a recursion can be seen in the implementation of factorial as shown:

scala> def factorial(number:Int) : Int = {
     |     if (number == 1)
     |        return 1
     |     number * factorial (number - 1)
     | }
factorial: (number: Int)Int

You can call this function as shown here:

scala> println(factorial(3))
6

One problem associated with using recursive functions is that invoking a recursive function too many times leads to stack-overflow error. The Scala compiler can optimize recursive functions with tail recursion so that recursive calls do not use all the stack space, therefore not running into stack-overflow error. The tail-recursion is a specific kind of recursion that occurs when a function calls itself as its final operation. With tail-recursion-optimized functions, recursive invocation doesn’t create a new stack but instead uses the current function’s stack. Only functions whose last statement is the recursive invocation can be optimized for tail-recursion by the Scala compiler.

Next is the implementation of factorial, calculated with tail-call recursion. To facilitate tail-call optimization, Scala provides an annotation available to mark a function to be optimized for tail-recursion. A function marked with the tail-recursion function annotation causes an error at compilation time if it cannot be optimized for tail-recursion. To mark a function to be optimized for tail-recursion, add @annotation.tailrec before the function definition.

Now mark the factorial function shown earlier with @annotation.tailrec to instruct the Scala compiler that this function must be optimized for tail-recursion and that if annotated function cannot be optimized for tail-recursion, the compiler should treat it as an error.

scala> @annotation.tailrec
     | def factorial(number:Int) : Int = {
     |     if (number == 1)
     |        return 1
     |     number * factorial (number - 1)
     | }
<console>:12: error: could not optimize @tailrec annotated method factorial: it contains a recursive call not in tail positionnumber * factorial (number - 1)^

As you can see, Scala compiler throws an error. The factorial method can’t be optimized because the recursive call is not the last statement in the function; the factorial calls itself and then performs multiplication with the result, so actually multiplication is the last statement in the function, not the recursive call. A function can’t be optimized for tail-recursion if the result of invoking itself is used for anything but the direct return value

The factorial method marked with @annotation.tailrec will not compile successfully until recursion is the final operation. So we need to perform the multiplication operation before invoking the factorial method for which we use accumulator argument to hold the computation in progress. This argument is computed with a multiplication before the recursive call. Thus recursion is a final operation.

scala> @annotation.tailrec
     | def factorial(accumulator: Int, number: Int) : Int = {
     |   if(number == 1)
     |     return accumulator
     |   factorial(number * accumulator, number - 1)
     | }
factorial: (accumulator: Int, number: Int)Int

A successful compile guarantees that the function will be optimized with tail recursion, so that each successive call will not add new stack frames.

scala> println(factorial(1,3))
6

Call-by-Name, Call-by-Value, and General Laziness

In Java programs, when you call a method with parameters, the value of the parameters are all calculated before the method is called. Thus, in

foo(1 + 1, "A String".length());

The expressions 1 + 1 and "A String".length() are both evaluated before the call to foo is made. This is usually what you want. However, there are some cases when you want to parameters to be optionally evaluated or repeatedly evaluated. In these cases, Scala provides the call-by-name mechanism. There’s no syntactic difference to the caller for call-by-name parameters.

The first example for call-by-name is the logging example. It’s computationally costly to calculate log messages simply to discard them if the message is not going to be logged. This is very common in Java code:

if (logger.level().intValue() >= INFO.intValue()) {
logger.log(INFO, "The value is "+value);
}

In this code, we have to push the decision to evaluate logger.log(INFO, "The value is "+value);into the place where we call logger. This means we need to wrap the call to logger in an if statement. It would be much better from a coding perspective if the cost of evaluating the String to be logged were incurred only if the String is going to be logged and if the current log level is known to and tested by the code inside logger rather than in the call to logger. Call-by-name gives us the ability to delay the evaluation of the String to log only if that String will actually be logged.

In Scala, we can define a log method that takes the thing to log as call-by-name:

def log(level: Level, msg: => String) =
if (logger.level.intValue >= level.intValue) logger.log(level, msg)

And you would call this code:

log(INFO, "The value is "+value)

The Scala version passes "The value is "+value as a function that is evaluated each time it is accessed in the log method. The log method will access it only if the log message is going to be printed. Your code is cleaner because you don’t have to repeatedly test the log level, but it performs as well as the previous Java code that has the inline test. In order to make something call-by-name, just put  => before the type. So, foo(s: String) is call-by-reference, and foo(s: => String) is call-by-name.

You may be wondering how the code could possibly perform as well if a function object is being created and handed off to the log method. In the JVM, the cost of creating an object that never escapes the current thread and is very short-lived is zero or very near zero. The JVM may also inline the log method such that the test is performed without an actual method call. The result is that your code will run as quickly with the Scala code as it will with the Java code that has the repeated test for log level. The first use of call-by-name is passing an expression that takes a long time to evaluate that may not be evaluated. The second use for call-by-name is the situation where we want to evaluate the expression many times in the target method, for example, if we want to evaluate an expression until some condition is met. That condition could be until the expression returns false or until the expression returns null. For example, we could collect all the Strings returned from an expression until we encounter a null:

def allStrings(expr: => String): List[String] = expr match {
case null => Nil
case s => s :: allStrings(expr)
}

We can test this method:

scala> import java.io._
import java.io._
scala> val br = new BufferedReader(new FileReader("foo.txt"))
br: java.io.BufferedReader = jaferedReva.io.Bufader@2bfa91
scala> allStrings(br.readLine)
res0: List[String] = List(import scala.xml._, , object Morg {,...)

Each time the call-by-name parameter, expr, is accessed, it is applied. If it is passed as a parameter that is also call-by-name, it will be passed without evaluation. In the previous code, you pattern match against the application of expr. If it’s null, return an empty List, a Nil. If it’s not null, return a List that is the current String and the result of allStrings(expr).

Summary

You saw Scala functions in action in the Chapter 2 where you declared, defined and called functions. As a continuation to the brief introduction in Chapter 2, this chapter provided a detailed treatment on functional programming in Scala, introducing several functional constructs of Scala as a functional programming language. In the next chapter you will learn pattern matching with Scala.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset