Chapter 15

Recursion

Back in chapter 6 we got our first introduction to recursion. At that point we used recursion to provide iteration. In chapter 8 we learned how to produce iteration using loops and have used that more than recursion since that point. If the only capability of recursion was to produce iteration it would not be of much interest in Computer Science because loops would be a complete substitute that would have simpler syntax for most uses. However, that is not the case. Recursion allows us to express a lot more than just simple iteration, and because of this, recursion is a critical tool in writing concise solutions to many different problems.

15.1 Power of Recursion

To understand the real power of recursion and where it comes from, it might help to revisit some code from the end of chapter 6. Early in that chapter we used a recursive function to count down from a specified number using a single argument. The code for doing that looked like this.

def countDown(n:Int) {
  if(n>=0) {
  println(n)
  countDown(n-1)
  }
}

As long as the argument has not gotten below zero, this function prints the number and counts down. We also wrote code to count up using two arguments where one argument was incremented for each subsequent call. At the end of the chapter you were presented with the following code.

def count(n:Int) {
  if(n>=0) {
  count(n-1)
  println(n)
  }
}

You were told to enter this code and run it to see what it does. If you did so, you might have been rather surprised to see that this code counts up. That seems surprising because the argument is clearly decrementing. The reason this can count up has to do with the memory of the computer and in particular the call stack, which was introduced in section 13.5.

To understand how this works, consider figure 15.1. This shows a graphical representation for different frames on the call stack and what happens when you call count(5). This call immediately calls count(4) before it does anything else. The call to count(4) gets a new stack frame that keeps track of where it was in the call to count(5) so that when it returns it can go back to that point. The call to count(4) also goes straight into a call to count(3), which also gets a new stack frame. This continues until we get down to count(0) which does nothing at all, but return. When it returns, control returns to the call to count(1) and resumes right after the call to count(0), which is the line with the print statement. So it prints a 1. After the print, count(1) returns to count(2) which similarly does a print. Because the prints are happening as it pops back up the stack, the numbers get printed in ascending order even though the function only includes decrement. The memory of the stack is essential for this to work because each stack frame remembers its own value of n.

Figure 15.1

Figure showing the call stack for the function count which prints the numbers counting up. When you call the function with 5, it calls itself with 4 before doing anything else. This calls itself with 3 and so on down to 0. The call with 0 does nothing but return. It returns back into the call to count(1) that does the print and continues on back up the stack.

This shows the call stack for the function count which prints the numbers counting up. When you call the function with 5, it calls itself with 4 before doing anything else. This calls itself with 3 and so on down to 0. The call with 0 does nothing but return. It returns back into the call to count(1) that does the print and continues on back up the stack.

This is a very significant point to remember about recursive calls. While the variables in the recursive calls all have the same names, they are not the same variables. It is like multiple people with the same name. Just because two or three people are named "Pat", that does not mean they are the same person. In this case, there were six different versions of n that were created and they took on the values from 5 down to 0. Each was distinct from the others and occupied different parts of memory.

This example shows how the stack can come into play with recursion, but it does not really show the power of recursion. To do that, we need to have a recursive function that can call itself more than once. In the following sections we will look at several different examples of this and see problems that really require the stack and are significantly harder to convert over to loops.

15.2 Fibonacci Numbers

The classic example of recursion is the Fibonacci numbers. This is a sequence of numbers where each number is defined as the sum of the previous two. So in mathematical notation this is written as f(n) = f(n − 1) + f(n − 2). This is not a complete definition, however, because we need to know how the sequence starts. That is to say that we need a base case for the recursion. It is customary to have the first two elements be 1. So the sequence then is 1, 1, 2, 3, 5, 8, 13, 21, ...

We can write this function in Scala with one short function definition.

def fib(n:Int):Int = if(n<3) 1 else fib(n-1)+fib(n-2)

So for n=1 and n=2 we get back 1. For n=3 we get back 1+1=2. For larger numbers, the process is more complex and it is instructive to try to visualize it. We do this by considering what happens on the call stack in a manner similar to figure 15.1. Unlike the code for figure 15.1, this function calls itself twice and that makes things a bit more complex.

For example, consider a call to fib(4). This is equal to fib(3)+fib(2). In order to know what that is, recursive calls have to be made to each of those functions. By convention, we will imagine that the calls are processed from left to right. So it calls fib(3), which gets a stack frame. That in turn calls fib(2) which gets a stack frame. Unlike examples we have seen before, when fib(3) returns, there is another call to fib(2) still waiting. So immediately a new stack frame appears where the one for fib(3) had just been. This process of stack frames returning to be replaced with a new one happens a lot in this calculation of the Fibonacci numbers as well as other recursive functions that call themselves more than once. Instead of drawing this as a vertical stack as was done in figure 15.1, it is customary to draw it in a branching structure called a tree as is drawn in figure 15.2. Each new stack frame is still below the one that called it, but different stack frames at the same depth in the stack are drawn next to one another horizontally. Arrows are used to keep track of things.

Figure 15.2

Figure showing a call tree for fib(4). Boxes represent stack frames. Straight arrows point in the direction of function calls. Curved arrows show returns and are labeled with return values.

This figure shows a call tree for fib(4). Boxes represent stack frames. Straight arrows point in the direction of function calls. Curved arrows show returns and are labeled with return values.

In this representation, boxes represent the stack frames. We have dispensed with putting the function names and left only the argument value. The straight arrows show function calls. The curved arrows show returns and are labeled with return values. These types of diagrams can be very helpful to you when you are trying to figure out what happens in a complex recursive function. When you draw them yourself you can dispense with the boxes and arrows. Values with lines between them will typically suffice.

The Fibonacci numbers are a standard example of recursion, but they are not a great one outside of instructional purposes. You can fairly easily write a function that calculates Fibonacci numbers that uses a loop instead of recursion and it will be a lot faster. That is not true of the examples that follow.

15.3 Permutations

This next example is a bit more practical. We are given a List of values and we want to perform some function on all permutations of this List. A permutation of a collection contains all the same elements, but in potentially different orders. This type of behavior might be in order if you have different tasks and you need to find an optimal order in which to perform the tasks based on some rule.

So the next question is, how could we do this? We have seen previously that sequences in Scala have a method called permutations which will do for us, but we want to see how we could do it ourselves. To figure this out we will employ a common idiom when building recursive functions. It is one that we talked about in chapter 6 with our simpler recursive functions. The idea is to take a large problem and break it down so that we solve one step and then apply the same function to solve what is left. In this case, one step is picking a first element. Unlike what we did in chapter 6, there will generally be more than one option here. Any of the elements in the current List could be the first element in the permutation. What follows that is the permutations of everything else in the List. That gives us a recursive definition. We want to pick an element to go first, then make a recursive call on the rest. After that returns, we pick a different element to go first and recurse again. This should be repeated for all the elements in the List.

To convert this into a function we need to figure out what information has to be passed in. The obvious part is that we have to pass in the List of values we are permuting. We also need to pass in the elements in the permutation that has been built so far. This can be done as a second List. If we made the function such that it returns a List of Lists with all the permutations, these would be the only arguments we would need. However, such a function is not useful for many situations because the number of permutations grows very quickly with List length and the cost in memory would become prohibitive. Instead, we will pass in a function that operates on a List that we will call each time a complete permutation is built. That way the caller can decide what is done with the different permutations.

So we are writing a function that takes three arguments, the List of numbers we want to permute, a function to apply to any finished permutations, and the current List for this permutation. This is done in the following code.

def permute(nums:List[Int],f:(List[Int])=>Unit,p:List[Int]) {
  if(nums.isEmpty) {
 f(p)
  } else {
 var before=List[Int]()
 var after=nums
 while(!after.isEmpty) {
   val perm = after.head :: p
   permute(before ::: after.tail,f,perm)
   before ::= after.head
   after = after.tail
 }
 }
}

When the List is empty, we have gotten a full permutation and we call the function on it. Otherwise we have a loop that runs through all the elements we want to permute and makes recursive calls assuming each one has been added to the head of the permutation. This is done using two variable Lists and a while loop. To start off with, one List is empty and the other is all the elements in the input List. In the loop we not only make the recursive call, we also move elements from one List to the other. We can check to see that this function works by calling it like this.

permute(List(1,2,3),println,Nil)

We are getting the six different permutations of a list with three elements and simply printing them. The first call should always use Nil for the last argument.

This function displays an interesting feature that is generally true of recursive functions. Because it is defined in terms of itself, we write it assuming that it works. When we are done writing it, it will work. This logic seems circular, but it is actually using a method called induction. First we make the function work for the simple base case. Every case above that is defined in terms of a smaller one. Eventually this gets down to the base case which generally works in a trivial way. If the base case works, the case right above it should work. When the one above the base case works, the one above that should work. This logic progresses upward to allow us to solve problems of arbitrary size.

We mentioned earlier that we do not return a List of the permutations because there can be a lot of them and that would require a lot of memory. It is worth taking a second to get a slightly better understanding of what "a lot" means here. Specifically, we would like to know the order of the number of permutations. This will not only tell us how long a list would be if we built one, it will give us a measure of how long this function will take to run for any task that we might give it as it will be the number of times that the function parameter, f, gets called.

As is normally the case for order analysis, we will think of things in terms of the size of our input. In this case, that is the size of the input List which we will call n. The first call to permute makes n calls to itself, each of which is passed a List with n − 1 elements. Those them make n − 1 calls with Lists of size n − 2. This process continues until we get down to 0 elements in the List. If you picture this as a tree similar to that in figure 15.2, we have a top box with n branches off it. Those lead to n boxes that each have n − 1 branches off of them. So at the third level we have n ∗ (n − 1) boxes. At the next level we have n ∗ (n − 1) ∗ (n − 2). In you continue this down to 1 we get the factorial function that we played with way back in chapter 6. Indeed, the order of the permutation function is O(n!). If you recall from our earlier discussion, the factorial function grows very quickly. So quickly in fact, that we had to use the Long or BigInt types if we wanted to use even modestly large inputs. Clearly you do not want to try to get all the permutations of Lists with much more than 10 elements in them. Depending on what you are doing, even lists of 10 elements might take a while.

15.4 Towers of Hanoi

Our next example of these more powerful recursive functions is a bit more playful. You have likely seen and perhaps even played Towers of Hanoi. It is a "game" played with three pegs and a number of disks of different sizes that have holes in them so they can be put on the pegs. Your goal in playing is to move the disks from one peg to another following two rules:

  1. You can only move one disk at a time.
  2. No disk can be placed on a disk smaller than it.

We want to write a program that can solve this. To do so, we need to first analyze the problem a bit more.

If you start by picturing a peg with 7 disks on it stacked from largest at the bottom to smallest at the top and try to picture the solution, you will probably have a hard time. The first step is clearly to move the top disk to one of the other pegs, but which one? It gets more complex from there. Instead of trying to solve a hard problem, we should start with an easy problem and try to build up from there.

The easiest setup to solve is when you have one disk. Simply move the disk from where it started to the peg you want it to finish on. That was a bit trivial, so what about two disks? First you have to move the top disk to the peg you do not want to end on, then you move the second disk to the end peg and move the smaller one over on top of it. It might not be clear at this point, but that points toward a general solution. To see this, think of the situation with three disks. We could run through every single move, but we just saw how to move two disks so we can assume we know how to do that. Using that, we move two disks to the middle peg. Note that we do not literally move two disks at a time as that would violate the rules. Instead, we use the approach we just developed to move two disks in three separate one disk moves. Then the largest disk is moved to the end peg and the two disks are moved on top of it.

This points to a general solution. If we make the assumption that we know how to move N − 1 disks, we can move N disks by first moving the top N − 1 disks to the off peg, then moving the largest one to the destination, then moving the N − 1 disks back on top of the largest one. Note that this is a recursive definition. The solution for N disks is relies on the solution to N − 1 disks. The base case is the trivial one where we move a single disk across.

We now have an approach to solving this problem. Next we need to figure out how we want to represent the problem in the memory of the computer. We need to come up with a way to represent disks and pegs. For the disks, all we really care about is the size of the disk. This can nicely be represented by an Int. A peg is really just a stack of disks so we need something like an Array or a List of Ints. To decide which, we should consider how they are used. The number of disks on a peg changes over time, but they are always added or removed at one location, the top of the stack. This points to using a List where the head of the List is the top disk. We can use :: to add a disk to a peg and the tail is what is left when a disk is removed. The script below shows code that implements the three pegs as an Array[List[Int]]. There is a function to move one disk that includes a check to make sure we are not breaking the second rule.

val num = if(args.length>0) args(0).toInt else 7
val pegs = Array(List.tabulate(num)(i => i+2), List[Int](), List[Int]())


  
def moveDisk(from:Int,to:Int) {
 require(pegs(to).isEmpty || pegs(from).head < pegs(to).head)
 pegs(to) = pegs(from).head :: pegs(to)
 pegs(from) = pegs(from).tail
}


  
def moveNDisks(n:Int,from:Int,to:Int) {
	val other = 3-from-to
	if(n==1) {
   moveDisk(from, to)
	} else {
   moveNDisks(n-1, from, other)
   moveDisk(from, to)
   moveNDisks(n-1, other, to)
	}
}


  
moveNDisks(pegs(0).size, 0, 2)


  
println(pegs.map(a => a.mkString("")).mkString("
"))

The function moveNDisks is recursive and relies on the moveDisk function so we can feel certain the program is not cheating. The line val other = 3-from-to might seem a bit odd at first. The function is only passed the number of the peg to move from and the number of the peg to move to, but we have to refer to the third peg in the function as that is where the top n-1 disks are to go first. That line calculates the number of the other peg using the simple observation that if you add the indexes of all the pegs you get 3. The script ends with a call to the moveNDisks function followed by a print of what is left at the end. This makes it simple to verify that the puzzle was actually solved.

Given the physical and visual nature of the Towers of Hanoi, the text print is a bit unsatisfying. This is a problem that really deserves to have a graphic display. The following script adds a BufferedImage, and Panel, and a Frame so that you can watch the disks move around as the recursion solves the puzzle.

import scala.swing._
import java.awt.image.BufferedImage
import java.awt.{Graphics2D,Color}


  
val num = if(args.length>0) args(0).toInt else 7
val pegs = Array(List.tabulate(num)(i => i+2), List[Int](), List[Int]())


  
val img = new BufferedImage(600, 300, BufferedImage.TYPE_INT_ARGB)
val g = img.createGraphics


  
val panel = new Panel {
  preferredSize = new Dimension(img.getWidth, img.getHeight)
  override def paint(g:Graphics2D) {
    g.drawImage(img, 0, 0, null)
  }
}


  
def renderPegs {
  g.clearRect(0, 0, img.getWidth, img.getHeight)
  g.setPaint(Color.red)
  for(i <- 0 until pegs.length) {
  	 for(j <- 0 until pegs(i).length) {
	    	  val s=pegs(i)(j)*10
  			  g.fillRect(100+i*200-s/2,
  				 img.getHeight-(pegs(i).length-j)*30, s, 20)
  	 }
  }
  panel.repaint()
  Thread.sleep(100)
}


  
def moveDisk(from:Int,to:Int) {
  require(pegs(to).isEmpty || pegs(from).head < pegs(to).head)
  pegs(to) = pegs(from).head :: pegs(to)
  pegs(from) = pegs(from).tail
  renderPegs
}


  
def moveNDisks(n:Int,from:Int,to:Int) {
  val other = 3-from-to
  if(n==1) {
    moveDisk(from, to)
  } else {
    moveNDisks(n-1, from, other)
    moveDisk(from, to)
    moveNDisks(n-1, other, to)
  }
}


  
val frame=new MainFrame {
  title="Hanoi"
  contents=panel
  resizable=false
}


  
frame.visible=true


  
moveNDisks(pegs(0).size, 0, 2)


  
println(pegs.map(a => a.mkString("")).mkString("
"))

If you play with this code a bit you will notice that tall stacks of disks take a very long time to move. The leads to the question of just how many moves does it take to get N disks from one peg to another? The recursive code makes this question fairly easy to answer if we are happy with a recursive function for the answer.

f(N)=11+2f(N1)N=1otherwise

This can be simplified to f (N) = 2N − 1. So the number of moves grows exponentially in the number of disks. Stacks of 20 or 30 disks would indeed take a very long time to complete.

15.5 Mazes

Another class of problems that calls for recursion is those involving mazes. This might also seem to be more for recreation, but mazes are a simplified case of something called graphs which are very important in Computer Science and are used to represent all types of different problems. The same approaches that we will use here for mazes apply to graphs as well. There are also a number of applications where doing things like finding the optimal route through some type of restricted path like a maze is significant.

To keep things simple, we are going to use a rather basic approach to building our mazes. Instead of having a grid with walls between cells, we will have a grid where complete squares can either be open or be occupied by a wall. This representation means that we can use a 2-D Array of values that tell us if there is a wall or not. In theory we could use an Array[Array[Boolean]] for this purpose, but in practice we will have use of being able to put numeric values in the "rooms" so an Array[Array[Int]] will be more practical.

We will define the maze with code like the following.

val maze=Array(Array(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
	 	  Array(-1,-1,-1, 0,-1,-1,-1,-1,-1,-1),
	 	  Array(0, 0, 0, 0,-1, 0, 0, 0,-1, 0),
	 	  Array(0,-1, 0,-1,-1,-1,-1, 0, 0, 0),
	 	  Array(0,-1, 0,-1, 0, 0,-1,-1,-1, 0),
	 	  Array(0,-1,-1,-1,-1, 0, 0, 0, 0, 0),
	 	  Array(0,-1, 0, 0,-1, 0,-1,-1,-1, 0),
	 	  Array(0, 0, 0,-1,-1, 0,-1, 0, 0, 0),
	 	  Array(0,-1, 0,-1,-1, 0,-1, 0,-1,-1),
	 	  Array(0,-1, 0, 0, 0, 0,-1, 0, 0, 0))

This builds a 2-D Array of Ints that uses a 0 to represent an open square and a -1 to represent a wall. The use of -1 for a wall is intentional so that positive numbers can be used to represent other things later. This particular maze is only 10×10 in size. This is a good size to start off with for our purposes though it does not allow for a very complex maze.

Our first task is the thing that most people would probably want to do with a maze. We want to know how to get through it in the shortest number of steps. To start with, we will ask the simple question of how many steps long is the shortest path. This has a significant advantage over trying to actually return a path. Not only is it a simple numeric value, it is also unique. There can be multiple paths all with the same length. That produces ambiguity if we ask for the optimal path. It also happens that having a function to find the length of the shortest path is sufficient as multiple calls to this function can be used to construct the optimal path even though such a construction is less than optimal.

You can write a function to find the length of the shortest path using loops, but it is a fairly complex code. We can write a simpler solution using recursion. As with all recursive functions we can break the solution down into two broad parts, base cases and recursive cases. The base cases should be trivial to solve. There are two good base cases for the maze problem. One base case is when you reach the exit. If you are at the exit it takes zero steps to get out. Another base case is if you are checking a location which is out of the maze or in a wall. For that we want to return a value that could not possibly be a solution. We will come back to that after discussing the recursive cases.

Picture yourself standing at a location somewhere in the maze and you want to know how many steps it takes to get from your current location to the exit on the shortest path. The recursive way to do this is to imagine that you had an oracle that could tell you that answer from any of the neighboring squares and figure out how you would combine those values to determine the distance from where you are. As we said earlier in the chapter, we write a recursive function by assuming we have one that works and writing based on that assumption. Once we are done, assuming we have handled the cases properly, it will work. The oracle is our assumed working function. If you are in the maze and you can ask the oracle to tell you how many steps it takes to get out from one step to the north, south, east, and west, how would you use that information to determine the shortest distance from where you are?

To make this more concrete, picture yourself at a four-way intersection. You ask the oracle about each direction and get values of 7, 13, 4, and 19. So the shortest path lies in the direction that gave you back 4. However, that was the distance from one step in that direction. The minimum distance from where you are is one greater than that, or 5 steps. It is the minimum of the four, plus one step to get to that other location. The solution of taking the minimum also points us in the proper direction for the base case of hitting a wall or going out of bounds. If the best solution is the minimum, we should return a number that can not possibly be the minimum for a bad path. A large number such as one billion will suffice for virtually any maze. You might be tempted to use Int.MaxValue, but remember that we are adding 1 to it. Try doing Int.MaxValue+1 in the REPL and you will see why that is not an ideal choice.

That defines our recursive case. There is only one other factor that needs to be considered. If you have ever been in a hedge maze or some other maze where you can not see over the walls, it can be quite difficult to tell one location from another, and unless you do something to mark your previous locations you can wind up running in circles and never finding your way out. A rather famous solution to this problem was described by the Brothers Grimm in Hansel and Gretel who left breadcrumbs to mark their path. We will choose the same approach. Assuming you have no birds in your computer this should work better for you than it did for the children. If you do have birds in your computer, you should probably deal with that before reading on.

So the last facet that we need for our algorithm is the ability to drop down breadcrumbs to mark our path. If we ever come back upon our breadcrumbs we will treat it just like being in a wall or out of bounds. We will also sweep up our breadcrumbs before we return from a path. This was not required by Hansel and Gretel who simply wanted to be able to find their path. We, however, want to find an optimal path and leaving breadcrumbs all over the place would often cause us to miss such a path. Converting this all to code produces the following.

def shortestPath(maze:Array[Array[Int]],x:Int,y:Int,endX:Int,endY:Int):Int = {
 if(x<0 || y<0 || y>=maze.length || x>=maze(y).length || maze(y)(x)!=0) {
 1000000000
 } else if(x==endX && y==endY) {
 0
 } else {
 maze(y)(x)=1
 val dist=(shortestPath(maze,x+1,y,endX,endY) min
 shortestPath(maze,x-1,y,endX,endY) min
 shortestPath(maze,x,y+1,endX,endY) min
 shortestPath(maze,x,y-1,endX,endY)) + 1
 maze(y)(x)=0
 dist
 }
}

The whole function is built from if expressions. The first option is for going out of bounds, hitting a wall, or coming across one of our breadcrumbs, and it returns one billion, a number so large it can not possibly be a valid path, but one small enough that we also will not overflow it by adding 1 for other steps through the maze. The second case is where we have reached the end. These two can be reversed and would need to be if the exit were outside of the bounds or marked by a special value in the array. The last case in the recursive case which drops a breadcrumb, makes four recursive calls and combines them with min, then picks up the breadcrumb and returns. This code could be invoked with this call.

println(shortestPath(maze,0,0,9,9))

For the maze shown above, the return value from this call is 36. For this particular maze, that is the length of the only path from 0,0 to 9,9. You should play with the maze some to make sure that the function finds the shortest path for other configurations.

This function literally tests every single path through the maze and gives us back the shortest. There are strengths and weaknesses to this approach. The weakness is that if there are a lot of paths this could take a while. We will explore that more and consider ways to address it in chapter 28. The strength is that it takes only very minor modifications of this code to produce other similar functions such as one to find the longest path or to count up how many paths there are. These particular problems are left as exercises for the student, but we will tell you now that the only real changes are to alter the return values of the base cases and the way in which the recursive calls are combined.

15.6 Sorts

In chapter 13 we used loops to provide all of our iterations. We could have used recursion instead. In fact, some of the sorts are expressed very nicely as recursive sorts. Consider a maxSort. The two loops would be written as two recursive functions and one can be nested inside of the other. The inner one finds the index of the maximum value before some location. This is nested in a function that runs back through the array finding the max and swapping it into place before moving back one spot. The code for doing this looks like the following.

def maxSort(a:Array[Double],end:Int) {
  def maxIndex(end2:Int):Int =
  if(end2==0) 0 else {
  val m=maxIndex(end2-1)
  if(a(m) > a(end2)) m else end2
  }


  
  if(end>0) {
  valindex=maxIndex(end)
  if(index!=end) {
  val tmp=a(index)
  a(index)=a(end)
  a(end)=tmp
  }
  maxSort(a,end-1)
  }
}

This is a bit longer than the version that uses a loop, but not by all that much. A wrapper function would be needed if we wanted to have a version we could call with only a single argument.

Perhaps an even better example is the insertion sort. The inner loop of the insertion sort is doing an insert so we write a recursive insert function. This gets called by an outer function that does the sort and simply walks down the Array putting each element in place.

def insertionSort(a:Array[Double],i:Int=1) {
  def insert(v:Double,i:Int) {
  if(i<=0 || v>=a(i-1)) a(i)=v else {
   a(i)=a(i-1)
   insert(v,i-1)
  }
  }


  
  if(i<a.length) {
  insert(a(i),i)
  insertionSort(a,i+1)
  }
}

The insertion sort is quite nice and neat as a recursive function. The fact that it does not have a full swap helps with that.

For the max sort and the insertion sort, the use of recursion is only a change in presentation. It does not really alter how they run or how efficient they are. The recursion in these functions is only used for iteration, nothing more. However, recursion also opens the door to some more efficient sorts. The next two subsections describe two of these.

15.6.1 Divide and Conquer Sorts

Recursion that calls itself multiple times opens up a new style of problem solving called divide and conquer. The idea is exactly what the name sounds like. You take a big problem, divide it up, conquer/solve the pieces, then build a solution to the larger problem from the solutions to the pieces. The divide and conquer approach is very general and can be applied to many types of problems. In this section, we will look at two different sort algorithms that are built on this idea.

15.6.1.1 Merge Sort

The first sort we want to consider is the merge sort. The general idea of a merge sort is that we are given a collection that we break into two even pieces. Each of those pieces is sorted individually and then the sorted results are merged back together.

This is a fast sort algorithm because the merge operation is O(n). The reason for this is that when you want the lowest element from either collection, you do not have to look through all the elements, you only need to consider the elements at the low ends of the two collections. That means you only do one comparison. To find the next lowest element you again do one comparison. So you can make a sorted collection of n elements from two smaller sorted collections with only n − 1 comparisons. The fact that you are repeatedly cutting the collection in half means that you get down to a single element in log2(n) cuts. This gives an overall performance of O(n log(n)). For large values of n, this is much better than the O(n2) performance of earlier sorts.

There is one minor problem with the merge sort, it can not be done in-place. Recall that this means that it takes additional memory at least proportional to n to to complete the sort. A well-crafted merge sort can get everything done with one additional Array of length n. Getting the memory usage down to that level is a bit complex so we will leave that for chapter 28. In this chapter, we will use the inability to do the sort in-place as an excuse for showing you the sort using Lists, which require extra work space anyway because they are immutable.

The first solution that we will look at is all recursive. It has a recursive mergeSort function that takes a List[Int] and returns a List[Int].1 This uses a recursive merge function to put the Lists together.

def merge(lst1:List[Int], lst2:List[Int]):List[Int] = (lst1,lst2) match {
  case (Nil,_) => lst2
  case (_,Nil) => lst1
  case (h1::t1, h2::t2) =>
  if(h1<h2) h1 :: merge(t1, lst2)
  else h2 :: merge(lst1, t2)
}


  
def mergeSort(lst:List[Int]):List[Int] = lst match {
  case Nil => lst
  case h::Nil => lst
  case _ =>
  val (l1, l2) = lst.splitAt(lst.length/2)
  merge(mergeSort(l1), mergeSort(l2))
}

Both of the functions use pattern matching for the different cases in the recursion. This makes them nice, short, and easy to read. Unfortunately, the recursive version of merge has to allocate a new stack frame for every element that is merged. That means that this version of the code can not scale up to large sizes.

To get around this limitation we need to use a more imperative merge that works with a while loop. such a version might look like the following.

def merge(lst1:List[Int], lst2:List[Int]):List[Int] = {
  var l1 = lst1
  var l2 = lst2
  var ret = List[Int]()
  while(l1.nonEmpty && l2.nonEmpty) {
  if(l1.head<l2.head) {
  ret ::= l1.head
  l1 = l1.tail
  } else {
  ret ::= l2.head
  l2 = l2.tail
  }
  }
  if(l1.nonEmpty) ret :::= l1.reverse
  else ret :::= l2.reverse
  ret.reverse
}

This version is not as pretty, and is about twice as long. However, it handles significantly longer Lists making it significantly more useful.

One last thing to note about the merge sort is the way in which it does work. As the recursive calls go down the stack, very little happens. The input List gets split in half and those two halves get passed down. This repeats until we reach a single element. The real work happens as the recursion pops back up the stack. Before a function can return, it calls merge, which does the real work. So if you picture in your head something like figure 15.1, the down arrows are not associated with much work. It is the up arrows where the work is done.

15.6.1.2 Quicksort

Another divide and conquer sort is quicksort. The idea of quicksort is to pick a special element called the pivot, then move it to the correct location with all the elements that are less than it before it and all those that are greater than it after it. Those elements to either side will not yet be sorted so recursive calls are made to sort each of them. The recursion terminates when it gets down to a single element.

Unlike merge sort, quicksort can be done in-place. However, that will be left as an exercise for chapter 28. For now we will make the quicksort like our merge sort and work on a List, for which the idea of being done in-place does not make sense.

The quality of a quicksort is largely dependent on the selection of a pivot. A good pivot will be in the middle of the collection to cut it in half. A really bad pivot would be the minimum or maximum element which does nothing to split up the data and only pulls out that one element from the next level of the recursion. Keeping with the idea of simplicity for the implementations in this chapter, we will use the first element as the pivot. This is not a good way to do this in general and can lead to very bad behavior on sequences that are already sorted. That too will be addressed in chapter 28.

With these simplifications in place, we can write a version of quicksort with the following code.

def quicksort(lst:List[Double]):List[Double] = lst match {
  case Nil => lst
  case h::Nil => lst
  case _ =>
  val pivot = lst.head
  val (less, greater) = lst.tail.partition(_<pivot)
  quicksort(less) ::: (pivot :: quicksort(greater))
}

The pivot is set to the head of the List. We then use partition to split the rest of the elements between those that are less than the pivot and those that are not. Finally, we call quicksort on those two sublists and put the whole thing together into a result. The result is a short and simple sort function that actually works fairly well on random data.

It is worth asking what the order of this sort really is. Unlike the merge sort, which is very stable in the amount of work that it does, the performance of our quicksort can vary dramatically depending on the input. Like the merge sort, each level of the recursion does O(n) work. In this case that work is the partition and sticking Lists together. If the pivot is in the middle of the values, each level of the data is cut in half and we get O(log(n)) levels for the recursion and O(n log(n)) overall performance. For random data, this is the expected behavior. However, our simple pivot selection can lead to very poor behavior and this sort is O(n2) if the input is already sorted.

The last thing to note about quicksort is that unlike merge sort, it does most of its work going down the call stack. The call to partition is where all the comparisons happen. This List version does have to merge Lists on the way back up the call stack. The version we write with Arrays in chapter 28 will not have to do even that.

15.7 Putting it Together

To see another use of recursion we want to solve a problem that we have done before using a different approach. Twice now we have included code that makes suggestions for employee schedules based on data on ridership. The work of finding schedules was done using the permutations and combinations methods on sequences. As we saw earlier in this chapter, recursion can also be used to generate permutations. It can do combinations as well.

Instead of repeating exactly what we did before, we are going to use recursion to push it a bit further. Instead of showing possible groups of employees for each day, we want to show possible schedules for a full week that include a limit on how many days each week any given ride operator is willing/able to work.

We will not repeat the entire code from chapter 14 for the various menu options. This simply shows a function that can be used to build these more complete schedule types. For this to work a daysPerWeek:Int member was added to the EmployeeData case class. You can also modify menu option 5 to call this function:

def recursiveBuildWeeklySchedule {
  val daysInfo = for(y <- years; m <- y.months; d <- m.days) yield d
  val days = daysInfo.map(_.dayOfWeek).distinct


  
  case class WorkerDays(name:String, numDays:Int)
  case class WorkerAssigns(day:String, workerRide:List[(String, String)])


  
  def printSchedule(schedule:List[WorkerAssigns]) {
  println("Possible Schedule:")
  println(schedule.mkString("
"))
}


  
def recurByWorker(daysLeft:List[String], workerAvail:List[WorkerDays],
  schedule:List[WorkerAssigns], workersLeft:List[String],
  ridesNeedingOps:List[String]) {
 if(ridesNeedingOps.isEmpty) {
 recurByDay(daysLeft, workerAvail, schedule)
 } else if(workersLeft.length>=ridesNeedingOps.length) {
 val worker = employeeInfo.filter(_.name == workersLeft.head).head
 for(ride <- worker.rides) {
  ridesNeedingOps.indexOf(ride) match {
  case -1 =>
  case i =>
   val newAvail = (for(w <- workerAvail) yield {
    if(w.name == worker.name) w.copy(numDays = w.numDays-1)
    else w
   }).filter(_.numDays>0)
   val newSchedule = schedule.head.copy(workerRide = (worker.name, ride) ::
    schedule.head.workerRide) :: schedule.tail
   recurByWorker(daysLeft, newAvail, newSchedule, workersLeft.tail,
    ridesNeedingOps.patch(i,Nil,1))
  }
 }
 recurByWorker(daysLeft, workerAvail, schedule, workersLeft.tail,
	 ridesNeedingOps)
  }
}


  
def recurByDay(daysLeft:List[String], workerAvail:List[WorkerDays],
 schedule:List[WorkerAssigns]) {
 if(daysLeft.isEmpty) {
 printSchedule(schedule)
 } else {
 val day = daysLeft.head
 val thisDay = daysInfo.filter(_.dayOfWeek==day)
 val rides = thisDay.map(_.ride).distinct
 val operatorRides = rides.flatMap(ride => {
   val nums = thisDay.filter(_.ride==ride).map(_.numRiders)
   val avg = nums.sum/nums.length
   val rideData = rideInfo.find(_.name==ride).get
   Array.fill(rideData.numberOfOperators+(if(avg>=rideData.heavyCount) 1 else
    0))(ride)
 })
 recurByWorker(daysLeft.tail, workerAvail, WorkerAssigns(day, Nil)::schedule,
   workerAvail.map(_.name), operatorRides)
  }
}


  
recurByDay(days, employeeInfo.map(e => WorkerDays(e.name, e.daysPerWeek)),
 List[WorkerAssigns]())
}

This is a significantly more complex recursive function than what we looked at previously. It is worth taking some time to study what is going on. The recursion is broken into two separate functions that are nested in the primary function. The primary function does little more than pull together the information on the days and then make a call to recurByDay.

As the name implies, the recurByDay function is using days as the primary recursive argument. Each level down the call stack has one fewer elements on the daysLeft List. The base case is when daysLeft is empty. That means we have a full schedule for the week and we are ready to print it. This function borrows code from the earlier version to build up a List of ride names with multiple copies for the number of operators needed.

All of the information that comes into recurByDay, as well as other information that it figures out is passed into recurByWorker. The name here is again meant to give an image of what is going on. This function recurses through the workers, considering one worker on each call. The base case is when there are no more rides that need operators. In the recursive case the code runs through all the riders that worker is trained to operate and if any of them still need operators, one recursive branch is taken with that worker assigned to that ride. There is also a recursive branch at the end where the current worker is not assigned to work that day.

Note that the base case of recurByWorker contains a call back to recurByDay so these are mutually recursive functions. It is reasonably common in more complex situations, when the recursion needs to run through a space that has several different independent parameters, to break the problem up like this into different functions that handle changes in different options and depend on one another to get a complete solution.

15.8 End of Chapter Material

15.8.1 Summary of Concepts

  • Recursion can be used for much more than iteration. The memory of the stack frames on the call stack allows recursive functions to call themselves more than once and try different possibilities for solving a problem.
  • One of the most common examples of a recursive function that calls itself more than once is a function to generate the Fibonacci sequence. Each number in this sequence is the sum of the previous two, so this recursive function calls itself with arguments one less than the current value and two less than the current value.
  • A more interesting example is the Towers of Hanoi. Here we saw that if we knew how to solve the problem with N disks, we could extend it to N + 1 disks. This and a base case constitutes a complete solution for any value of N.
  • Finding the shortest path through a maze works well as a recursive problem. At each point you need to find out the distance from the different options you have. Those can be combined to give you an appropriate value for the current location.
  • Sorts can also be written recursively. Any of the sorts we looked at previously can be converted to recursion, but there is not a strong motivation to do so.
  • Using a recursive divide and conquer approach to sorting leads us to two other sorts that do O(n log(n)) comparisons instead of the O(n2) from our previous sorts.
    • A merge sort repeatedly breaks the collection in two going down the call stack, then merges the sorted results as it comes back up the call stack.
    • Quicksort works by picking a pivot and putting it in the right place, then recursing on the elements that are less than the pivot as well as those that are not.

15.8.2 Exercises

  1. Write functions that calculate Fibonacci numbers using the following approaches.

    • Using a loop with an Array or List.
    • Using a loop with three var declarations.
    • Using a recursion function that takes three arguments, but only calls itself once.
  2. Write a function that will calculate the longest non-self-intersecting path through a maze.

  3. Write a function that will count the number of non-self-intersecting paths through a maze.

  4. Find the size of the biggest completely empty maze one can solve with the recursive search function that finishes in a minute or less.

  5. The following is a function called the Ackermann function, which is significant in theoretical Computer Science. Put this into Scala and play with it a bit.

    f(0,n)=n+1f(m+1,0)=f(m,1)f(m+1,n+1)=f(m,f(m+1,n))

  6. Write a recursive function that will build the power-set of a List. The powerset is the set of all subsets of a given set. The fact that it is a set means that we do not care about order, unlike with permutations. For example, the power set of List(1,2,3) is List(Nil, List(1), List(2), List(3), List(1,2), List(1,3), List(2,3), List(1,2,3).

  7. Write a recursive function that builds combinations of a List instead of permutations. The caller should be able to specify how many elements are desired in the combination. (Hint: All combinations of all sizes are part of the power-set.)

15.8.3 Projects

  1. If you have been doing the ray-tracing and graphics options, this project continues with that. Ray tracers can implement both reflections and transparency using recursion. If you have a function that tells you what color to draw for a particular ray, you can do reflection by recursively calling that function with the reflected ray and combining that color with the color of the reflecting surface. How much of each goes into the color depends on the fractional reflectivity of the object.

    For this project you will give your objects colors, but you still will not worry about lighting. So when a ray hits an object, it gets the full color of that object unless the object is reflective. If it has reflectivity R then it gets (1 − R) of the color of that object and R of the color from the reflected ray. To add colors you need to be able to build your own colors and get the components out of colors. The java.awt.Color class has methods called getRed, getGreen, and getBlue that return the amount of each primary in a color. The values are Ints in the range of 0-255 inclusive. You can also make a new Color object by calling new Color(red:Int,green:Int,blue:Int). The values passed in must be in the same range or you will get an error.

    To calculate reflection, you need the unit-normal vector for the surface you are reflecting off of. For a plane this is easy as the normal is the same everywhere. For a sphere, this is the vector from the center to the point of intersection. In both cases, you want it normalized. If you have that, the direction of the reflected ray is given by

    rreflected=r2(r  ·  n)n,

    where r is the direction of the incoming ray and n is the unit normal. The unit normal will also be significant for calculating lighting in a later project.

  2. For this project you will write a program that solves 9×9 Sudoku puzzles. It will do this through a recursive function.

    The input will be nine lines of text, each with nine characters on it. The characters will either be a space or a single digit number. You should output a board with all the blanks filled in and put spaces between the numbers to keep it readable. It turns out that most puzzles have more than one solution. You only need to provide one.

    For an extra challenge, make it so your program has a Graphical User Interface (GUI). It should load puzzles from a file and use a Table or a GridPanel to display the puzzle. Users should be able to select a solve option to have other spaces filled in.

  3. For this project, you will write a program that recursively parses a string for an arithmetic expression and returns the value of it. Examples would be 5+7 or 9*(5+7.5)/2. Your parser should do proper order of operations (things in parentheses bind highest, and / before + and -, and go from left to right for the same level of priority).

    The approach I want you to take for solving this problem is with a divide and conquer recursive algorithm that breaks the problem into pieces starting with the lowest priority operation. You will write a function parse(exp:String):Double. This function will return the value of the expression in the String exp. First it should find the lowest priority operator (it can not be in parentheses). If it finds one, it recurses twice with the substrings on either side of that operator (use the substring method of String or take and drop). If there is not an operator that is not in parentheses you can check if the String starts with '('and pull the bounding parentheses off and recurse on what is left. If it does not start with '('you know that part of the expression is just a number so use toDouble to get the value.

    The user will give you a formula, that does not include any spaces, at the commandline and you should simply print the value it evaluates to. So a potential invocation of your program might be as follows: scala parser.scala 5+3*(70/5).

  4. If you did project 7 (p.367), you can extend it in this project. If you go further into chapter 1 of "The Algorithmic Beauty of Plants", you will find that you can use L-systems to model grasses and trees. We will not do the 3-D implementation, but you can add handling for '[' and ']' to allow branching structures. This is best done using recursive calls that start on a '[' and return on a ']'. Instead of passing in a String, pass in the Iterator[Char] that you get by calling the iterator method on a String. The advantage of the Iterator is that the elements are consumed when you call next so that when you return from a function call, the code will automatically be working on what remains after the ']'.

  5. Determining if a particular recipe can be made with items from your pantry is not all that hard and does not require using recursion. Planning an entire meal for a large dinner party is another matter. To do this, the recipes need to be annotated with what type of dish they are: entr´ee, dessert, side, etc. When planning a meal, the user must be able to specify how many of each type of dish they wish to make for the meal. You can then use recursion to find the meals that can be made fitting those requirements with ingredients that are on hand.

    You should attach information on how much the user likes certain recipes so that only the top 5 or so meals are shown. If you want a bit of an extra challenge, consider handling the situation where there are not any meals that can be made with what is on hand and then you list top meals that need few additional ingredients.

  6. If you have been doing the schedule building problems, you can now extend the functionality using recursion. Annotate each course with how it fits into your curriculum. Then you can specify not only how many hours you want to take, but also what requirements you want to fulfill.

    This is a problem that you could solve using permutations and combinations, but as the number of course options grows, those options become less efficient. Using recursion, you can cut the recursion short for any combination if you determine that it can not possibly work. For example, if you hit the proper number of hours before you are done with courses, you do not need to consider variations in the remaining courses, just check if you have satisfied the other requirements.

  7. Having the ability to do recursion opens up a lot of possibilities for the motion of computer-controlled characters in games that have barriers. Moving straight to a location is fine if the playing space is open and empty, but when there are obstacles, like in a maze, it is important to have a way to navigate them. Recursion gives you a way to do this. For this project you can implement your choice of simple game with enemies that use recursion to find ways around obstacles. Note that you probably do not want the enemy to have the ability to follow the shortest path to the player unless the player has a significant speed advantage. Instead, you can throw in some randomness to take choices that are somewhat less than optimal.

  8. If you have been doing the text-adventure/text-map project options, you can write some utility functions that can be used to help check out the map. Recursion can let you see things like if you can get from one room to another, how many steps it takes, how many paths there are, or even the paths themselves. You can implement these as new commands when you run the program. For example a "canReach" command could be given a room number/identifier to see if the specified room can be reached from the current room.

  9. If you have been working on the music library, you can throw some recursion into that as well if you give a few hints. You would need to annotate songs with hints as to what are good options for songs to follow it. You can make the recursion only follow from one song to another when the user has recommended that it is worth doing. Those recommendations can be given a "strength" as well. The user should select a starting song and the program should find the playlist with the highest total strength for all the connections.

    Note that this is a problem where you could run into problems if there are lots of songs with lots of connections. Doing this with a simple recursive algorithm could lead to extremely long runtimes. However, assuming that the user does not enter too many connections, recursion should work fine as long as you do not test song combos that have not been marked by the user. This is why you can not simply consider every ordering of songs.

Additional exercises and projects, along with data files, are available on the book’s website.

1Unlike version with Arrays, a sort using a List must return a new List as the original one can not be altered.

..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset