Unit 3: Recursion

Credit: Gavin Taylor for the original version of these notes.

1 Recursion

Recursion is a programming technique which is a Big Deal, and which can make many programming tasks much easier. You've never seen anything like it before, but once you've got it, it is extraordinarily powerful. Briefly, recursion is having a function call itself.

Beginning of Class 7

Function calls

Understanding recursion really depends on a deep understanding of how function calls work. Imagine the following set of functions:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
def f1():
    print("inside f1")
 
def f2():
    print("starting f2")
    f1()
    print("leaving f2")
 
def main():
    print("starting main")
    f2()
    print("leaving main")
 
if __name__ == "__main__":
    main()

If we were to run this program, the output would be:

starting main
starting f2
inside f1
leaving f2
leaving main

To understand why, let's think about the call stack. The call stack works like a stack of plates: plates can only be placed on the top, and can only be removed from the top. For this reason, they're often referred to as LIFO (last-in-first-out). Instead of plates, though, the call stack has the functions that are running; when a function is called, it is placed on the top of the stack. A function ONLY runs when it is at the top of the stack; the rest of the time, it sits and waits for the one above it to return.

So, we start running the program. Obviously, it prints "starting main." At this point, our call stack looks like this:

\[ \array{ \boxed{\texttt{main()}} } \]

Our next line is to call the function f2. So, f2 gets put on the top of the stack:

\[ \array{ \boxed{\texttt{f2()}} \\ \boxed{\texttt{main()}} } \]

We've done enough programming now to know that now f2 is running, not main. The reason, though, is because now f2 is on the top of the call stack, so main just sits and waits until f2 returns.

So, f2 runs, printing "starting f2." It then calls f1, leading to this stack:

\[ \array{ \boxed{\texttt{f1()}} \\ \boxed{\texttt{f2()}} \\ \boxed{\texttt{main()}} } \]

So, f1 runs, and f2 (and main) sits and waits. f1 prints "inside f1," then returns, "popping" off the stack, taking us back to this:

\[ \array{ \boxed{\texttt{f2()}} \\ \boxed{\texttt{main()}} } \]

f2, back on the top of the stack, gladly continues where it left off, printing "leaving f2," before returning and popping off the stack itself:

\[ \array{ \boxed{\texttt{main()}} } \]

Back on the top of the stack, main now finishes, printing "leaving main," and returning, ending the program.

OK! So, how about recursion. Let's consider the standard first-recursion-example of finding a factorial using recursion:

1
2
3
4
5
def fac(n):
    if n == 0:
        return 1
    else:
        return n * fac(n-1)

If fac(3) is run, it runs through happily, until it gets to fac(n-1), and which point, fac(2) is placed on top of the stack. fac(2) begins to run, and fac(3) sits and waits for it to return. fac(1) is then placed on top, and fac(2) and fac(3) sit and wait. Finally, briefly, fac(0) is placed on the stack, just long enough to return 1.

The story so far looks like this:

\[ \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \boxed{\texttt{fac(0)}} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \texttt{return 1} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \]

At this point, fac(1) picks up where it left off, multiplying its n (1) by what fac(0) returned (also 1), and returning that solution. fac(2) then multiplies that returned value of 1 by its n (2), and returning the solution. fac(3) then multiplies that 2 by its n (3), returning a final answer of 6.

The second half of the story looks like this:

\[ \array{ \texttt{return 1} \\ \boxed{\texttt{fac(1)}} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \texttt{return 1} \\ \boxed{\texttt{fac(2)}} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \texttt{return 2} \\ \boxed{\texttt{fac(3)}} } \longrightarrow \array{ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \phantom{\boxed{\texttt{f}}} \\ \texttt{return 6} \\ } \]

Beginning of Class 8

Recursion with lists

We can also use recursion to write functions that deal with lists. As a first example, here's one that multiplies together all the numbers in a list.

1
2
3
4
5
6
7
8
9
10
11
def mul(lst):
    return _mul(lst, 0)
 
def _mul(lst, start):
    """Helper function for mul.
    Multiplies the numbers in the given list, from the given starting point.
    """
    if start >= len(lst):
        return 1
    else:
        return lst[start] * _mul(lst, start+1)

What you see here is really common in recursive methods. We have the primary function mul which doesn't take any extra arguments, and that calls a helper function _mul that takes an extra argument to help with the recursion.

Of course, you would have no problem at all writing this function with a loop instead of recursion. In fact, that would be a good exercise to do, and to compare what's similar and what's different. One thing you should notice is that in the loop version, you probably need to use a temporary variable to hold the product as you go along, and in the recursive version this goes away (it's basically stored in the return values on the stack). That's one of the things that can be easier about using recursion.

Here's another example of recursive list functions, and one that creates a list in the process. It's something you might remember from the first homework: returning a list with every other element from the original.

1
2
3
4
5
6
7
8
9
10
11
12
13
def every_other(lst):
    return _every_other(list, 0)
 
def _every_other(lst, start):
    """Helper function for every_other.
    Returns a list with every other number in lst, starting at the given index.
    """
    if start >= len(lst):
        return []
    else:
        res = _every_other(lst, start+2)
        res.insert(0, lst[0])
        return res

Is this better than the loops version that you wrote in HW1? Probably not. But it illustrates the main idea that a small change in the recursion (here, doing start+2 instead of start+1 in the recursive call) can have a big effect on what the program does. That's really useful when we have more complicated data structures later in the class.

Tips for writing recursive methods

We're going to need a base case (the thing that makes everything stop running; in the above factorial example, it's if n == 0), and at least one recursive case. The base case is usually easier... what's an input that makes this problem really easy? Write that first, then make the recursive case build towards it.

What does "build towards it" mean? Well it means that whatever input you start with, you want to get closer to the base case with each recursive call. With a number, usually the base case is some small constant like 1 or 0, and the recursive call makes the number smaller (maybe decreasing by 1 or dividing by 2) to get closer to the base case. With a list, usually the base case is an empty list or a size-1 list, and the recursive call is going to be an a smaller list (maybe by removing the last entry in the list, or something like that.)

Beginning of Class 9

2 Big-O with Recursion

We have some practice identifying the Big-O of iterative algorithms. However, now that we're experts on recursion, it's important that we be able to identify the Big-O of recursive algorithms also. We're not going to spend a huge amount of time on this, but we'll come up with a general strategy that gets a reasonable big-O bound for the sorts of examples we'll see in this class.

The process of analyzing a loop's running time can be summarized as three steps: (1) find the number of loop iterations, (2) find the number of steps in each loop interation, and (3) multiply 1 and 2 together.

Finding the big-O of a recursive function can be done in much the same way:

  1. Find the number of recursive steps needed to reach the base case.
  2. Find the big-O cost of each individual function call.
  3. Multiply (1) and (2) together.

The (potentially) tricky part here is step 1 (finding the number of recursive steps to reach the base case), but we'll get a lot of practice on it with different kinds of data structures as we go through the class. For now, you should be able to convince yourself that both the fac(n) and mul(lst) functions above are \(O(n)\).

3 Binary Search

Now let's look at a more involved example: searching through a sorted list to find a particular number. This algorithm is called "binary search", and it's an extremely important algorithm for many applications. We'll talk about it repeatedly this semester.

Binary Search with loops

Here's a version of binary search with loops:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
def binary_search(lst, needle):
    """Searches for the given needle in the given list.
    The list must be in sorted order already.
    """
 
    # left and right are the boundaries of where the needle could be
    left = 0
    right = len(lst) - 1
 
    while left <= right:
        # look at the middle index
        middle = (left + right) // 2
        if needle < lst[middle]:
            # go to the left side of the list
            right = middle - 1
        elif needle > lst[middle]:
            # go to the right side of the list
            left = middle + 1
        else:
            # we found it!
            return True
 
    # No place left to look, so it must not be here
    return False

Make sure you understand how this algorihtm works! It's basically cutting the search space in half at each step, until it narrows in on a single position.

The running time is determined by the number of iterations in the main while loop. Observe that the distance (right - left) starts as \(n-1\) and decreases by half at each step. Like we've seen with loops before, this means that the number of iterations, and the overall running time of this algorithm, is \(O(\log n)\).

Binary search with recursion

You saw this one coming, right?

1
2
3
4
5
6
7
8
9
10
11
12
13
def binary_search(lst, needle):
    return _binary_search(lst, needle, 0, len(lst)-1)
 
def _binary_search(lst, needle, left, right):
    if left > right:
        return False
    middle = (left + right) // 2
    if needle < lst[middle]:
        return _binary_search(lst, needle, left, middle-1)
    elif needle > lst[middle]:
        return _binary_search(lst, needle, middle+1, right)
    else:
        return True

Which version you like better is a matter of opinion, but hopefuly you see at least the simplicity and attraction of the recursive way. It's really important to carefully look and see that these two functions are equivalent to each other. Find where each part of one version shows up in the other one.

A note about Python and why this matters

Different programming languages handle the recursion vs. iteration debate differently. As I noted above, some languages (like certain flavors of Lisp) don't have any iteration at all. On the other hand, unfortunately, Python has a relatively large overhead for function calls, so using recursion in Python is not usually the fastest way of solving problems. Depending on the depth of the recursion, this could be a totally imperceptible fraction of a second difference, or it could be a bit more than that. Recursion doesn't change the Big-O cost, but it could make a difference in practice.

From the point of view of this class, this doesn't change anything; we'll still implement many of our data structures using recursion because it makes them much simpler to understand. And, many of the things that we learn in this class will be theoretical, and independent of programming language; so, while it's true an iterative version of an algorithm may be faster than the recursive version in Python, that won't necessarily be true if you switch to C, or Java, or any of the many, many other programming languages out there.

Soon, we will encounter situations (with trees) where the recursive version of a function is three lines, and the loops version is 20 lines. So take the time now to really get into recursion. It's not going away! And it's wonderful!