Unit 3: Advanced Scheme

This is the archived website of SI 413 from the Fall 2012 semester. Feel free to browse around; you may also find more recent offerings at my teaching page.

**Required reading**:These notes

Structure and Intterpretation of Computer Programs (SICP) (available free online), Sections 1.3 and 3.1

Programming Language Pragmatics (PLP), Sections 6.6.1 and 10.7

**Recommended reading**:SICP, Section 5.4.2

PLP, Sections 10.5, 10.6, and 10.8

**Code from these notes**: u03.scm**Slides**: note-taking version**Homework for this unit**: HW 3, HW 4

**Readings for this section**: PLP, Sections 10.7 (required) and 10.8 (recommended).

Remember there are two important characteristics of a "pure" functional programming language:

*Referential Transparency*. This fancy term just means that, for any expression in our program, the result of evaluating it will always be the same.In fact, any referentially transparent expression could be replaced with its value (that is, the result of evaluating it) without changing the program whatsoever.

Notice that imperative programming is about as far away from this as possible. For example, consider the C++ for loop:

for (int i = 0; i < 10; ++i) { /* some stuff */ }

What can we say about the "stuff" in the body of the loop? Well, it had better not be referentially transparent. If it is, then there's no point in running over it 10 times!

*Functions are First Class*. Another way of saying this is that functions are data, just like any number or list. Functions are values, in fact! The specific privileges that a function earns by virtue of being first class include:Functions can be given names.

This is not a big deal; we can name functions in pretty much every programming language. In Scheme this just means we can do

(define (f x) (* x 3))

Functions can be arguments to other functions.

This is how the built-in "higher-order functions" like

`map`

and`filter`

work: one of their arguments is another function. Functions that you write can also take procedures as arguments, just like any other argument.This is neat, but again doesn't make Scheme

*that*special. Passing a function pointer to another function is already possible in C, for example, which basically accomplishes the same thing, albeit with a much more annoying syntax.Functions can be returned by functions.

This means that, when functions are first-class, we can make "function factories", functions that produce other functions. Again, in many languages (even C) it is possible to return other

*pre-existing*functions, but functional languages go one step further to allow the return of*nameless*functions. That is, functions can be created on the fly and returned. We'll see what this is about very shortly.Functions can be stored in data structures.

When functions are first-class, we can make lists or even trees of functions, just like a list of numbers or any other data structure. Again, remember that first-class status means that

*functions are data*. In fact, later in this unit, we'll see how functions can actually represent entire objects like we would in Java or C++.

**Readings for this section**: SICP Section 1.3 (required), and PLP Sections 10.5 and 10.6 (recommended).

The *Lambda Calculus* is a way of expressing mathematics and computation, all with functions, developed by Alonzo Church in the 1930s. It was actually a competing idea with Alan Turing's universal (Turing) machine for the best way to express what is computable. The famous "Church-Turing Thesis", which is one of those things that everyone believes but is still unproven, states that the two models are actually equivalent. Since we now associate all computation with Turing machines, this basically says that every kind of computation can be expressed by some kind of composition of (possibly recursive) functions.

Math geekery aside, the basic principles of Church's Lambda Calculus find their way into modern programming languages like Scheme: Functional recursion is important and useful, functions are data, and not every function needs to have a name.

To honor this mathematical tradition, many programming languages that allow the creation of nameless functions use the keyword (or function name) `lambda`

to do it. For example, let's say we want to create a function that takes an argument `x`

and adds 3 to it. Here's how you would write that in Python:

lambda x: x + 5

Or in Haskell:

x -> x + 5

Or in C#:

x => x + 5

And, finally, in Scheme (or Lisp):

(lambda (x) (+ x 5))

Now if you type this last one into DrScheme, what you get back is just `#<procedure>`

, which is the interpreter telling you that this is a procedure (and it has no name).

To actually call this function, we do it the same way we would any other function in Scheme: by putting it first after an open parenthesis, followed by its argument:

((lambda (x) (+ x 5)) 8)

That function call now produces the answer, 13. I broke it up for you to see the structure, but we could of course also write it on a single line.

Now with this simple example, it's hard to see the utility of `lambda`

and "anonymous" (i.e., nameless) functions in general. After all, the last expression above could certainly be accomplished more easily just by writing `(+ 8 5)`

!

Well, we will see many examples of the power of `lambda`

, but one useful purpose is in combination with higher-order functions like `map`

and `apply`

. For example, we can use the function above to add 5 to every element in a list:

(map (lambda (x) (+ x 5)) (list 3 5 8 10))

Before we would have needed a recursive function (or in another language, a for loop) to accomplish the same thing. Now we can accomplish the same thing, with a very nice separation of the *action* (adding 5) from the *data* (the actual list).

One last note here on `lambda`

: it lies beneath some of the Scheme code you've already been writing! For example, when you define a function like

(define (foo x y) (* (+ x 1) (- y 1)))

this is really equivalent to defining `foo`

as a *constant* lambda-expression:

(define foo (lambda (x y) (* (+ x 1) (- y 1))))

You might be even more startled to discover that a `let`

expression is also just a `lambda`

in disguise. Think about it: `let`

is just saying "associate these names with these values, and then evaluate this expression". Well that's precisely what it means to call a function! For example, this `let`

expression:

(let ((x 8) (y 9)) (+ (sqr x) (sqr y)))

is exactly the same as:

((lambda (x y) (+ (sqr x) (sqr y))) 8 9)

Really! They are equivalent! The argument list in the `lambda`

is just the names of all the `let`

variables, and the actual arguments are the values that each `let`

variable gets assigned to.

These translations are not just for our entertainment, they are actually how the Scheme interpreter works. So function definitions are just a shorthand for a constant definition of a `lambda`

expression, and `let`

statements are a (very convenient) shorthand for a `lambda`

evaluation. There is a term for this sort of thing: *syntactic sugar*. We say that function `define`

s and `let`

s are "syntactic sugar" in Scheme because they are convenient (and sweet!), but don't really provide any new capabilities to the language. In other words, they are empty calories. Delicious, useful, wonderful empty calories.

**Readings for this section**: SICP, Section 3.1 (required).

Remember that one of the key principles of "pure" functional programming is *referential transparency*. What this means specifically is that any function call (or any expression) in our program could be replaced with the value that results from that function call, without changing the program at all.

Informally, what referential transparency means is that there are no side effects to any computation. Every function returns some value, and that is the only thing that happens. Nothing else changes. In particular, if we made the same function call twice, we are just wasting our time because the result will be identical.

Some programming languages, such as Haskell, enforce referential transparency completely. You must do pure functional programming in such languages. As a result, they can perform a number of optimizations to the code that would not be possible if side effects were happening. But on the other hand, doing some seemingly simple tasks suddenly becomes complicated because of the restrictions in the language.

Scheme is quite so pure as that. While Scheme is primarily designed to be a functional programming language, *you can break the rules* if you really want to. Now I'm going to tell you how to do it. This doesn't mean that you *should* break referential transparency all the time in your Scheme programs, just that you *can* if you have a really good reason.

The simplest kind of side-effect is printing some text to the screen. Now it is important to distinguish this from the return value of a function, which DrScheme also displays on the screen. Actually, DrScheme makes this distinction by formatting the output you ask for explicitly in purple, as opposed to the return value of the function which is printed in black.

Here are the important functions for printing in Scheme:

`(display X)`

: Displays`X`

just like the interpreter would print it. Note that`X`

could really be*anything*: a number, a symbol, a list, ...`(newline)`

: Prints a line break`(printf "string" args...)`

: Works similarly to the printf from C that you know and love. The difference is that instead of having to do`%d`

or`%s`

or`%u`

for all the different types of arguments, you only need to do`~a`

for any kind of argument.Try `(printf "Homework ~adue ~a" 4 'tomorrow) for an example.

Now that we can break referential transparency, some other basic aspects of Scheme also get screwed up. To illustrate the difficulty, say we want to write a function that prints every element in a list:

(define (print-list L) (if (null? L) ; WHAT TO DO FOR BASE CASE??? (display (car L)) (newline) (print-list (cdr L)) ; HOW TO GET ALL THIS TO HAPPEN IN THE ELSE??? ))

There are two problems here. First, what should we do in the base case? We don't want to print anything, but Scheme *requires* that there is a return value. In this case, we want to return the same thing that the built-in printing functions above do, which is `(void)`

. This is different than `null`

! One key difference is that `(void)`

is not printed out by DrScheme, where as `null`

is displayed as `'()`

.

The second issue is that we want to do more than one thing in the else case. When we have referential transparency, there's never any cause for this, because we aren't really *doing* anything other than computing the answer. But to allow for sequential execution with side-effects, Scheme has the `begin`

special form. This just evaluates each of its arguments in turn, and returns the value of the last expression.

Now we can actually write the `print-list`

function:

(define (print-list L) (if (null? L) (void) (begin (display (car L)) (newline) (print-list (cdr L)))))

**Readings for this section**: PLP Section 6.6.1 (required), and SICP Section 5.4.2 (recommended).

In any programming language, it is very important to be able to accomplish tasks *efficiently*. In a low-level language like C, this is always going to be possible, although it might require some more programming work to make it happen. But in high-level languages, especially *declarative* languages where the translation to machine code is not as obvious, we still have to make sure that our code is fast.

Consider the classic example of the Fibonacci function:

(define (fib n) (if (<= n 1) n (+ (fib (- n 1)) (fib (- n 2)))))

This implementation is correct, but really *really* slow. For example, try computing `(fib 40)`

. The problem is that there are two recursive calls, resulting in an exponential explosion of the running time.

But think about it - how many recursive calls do we really need to compute `(fib 40)`

? All we really need is the 40 previous Fibonacci values. If we could just remember them when they get computed...

Well that's exactly what the technique called *memoization* gives us. The basic idea (recall from algorithms class) is that we will save the results of every function call in a hash table. Then whenever a new call is made, we first check to see if the result is already stored, and if so just return it and avoid any computation.

Here's how we could do that in Scheme for the Fibonacci function. You'll notice that a global variable is used for the hash table. Oh yeah, there are hash tables in Scheme... did I mention that? Of course, they also destroy referential transparency. Do you see how?

(define fib-hash (make-hash)) (define (fib-memo n) (cond [(not (hash-has-key? fib-hash n)) (hash-set! fib-hash n (if (<= n 1) n (+ (fib-memo (- n 1)) (fib-memo (- n 2)))))]) (hash-ref fib-hash n))

Let's say that, for some reason, we want to sum up the squares of all the numbers from 1 up to n, for any n. Here's a decent way to write that function in Scheme:

(define (ssq n) (if (= n 0) 0 (+ (sqr n) (ssq (- n 1)))))

Now doing something like `(ssq 10)`

or even `(ssq 1000)`

works fine. But if we want to go really big, like `(ssq 4000000)`

, the interpreter runs out of memory.

Now 4 million is a big number, so maybe this is just impossible for a computer. But if you wrote this function in Python, like

def ssq(n): sum = 0 i = 1 while i <= n: sum = sum + i*i i = i + 1 return sum

and then called `ssq(4000000)`

, it would return the result in less than a second and not use more than a few kilobytes of memory. So what's going on? Is Scheme inherently slower than other languages - even other *interpreted* languages like Python - on this kind of problem?

The problem with our Scheme function is that it is recursive, and there is a memory overhead for each recursive call. As each call is made, a new activation frame is pushed onto the stack with information such as the argument values. And this is absolutely necessary, since the computation has to successively pop all these values off the stack and then add `(sqr n)`

to them before returning to the next level.

It would seem that the imperative programming way of storing the running sum in a local variable is the "right" way to do this, but how could we do this in Scheme, with referential transparency? The answer is to build this local *accumulator* variable into the recursive function as an extra argument. In each recursive call, this argument will keep track of the running sum, or whatever else we're computing. Here's how the previous function could work:

(define (ssq-helper n accum) (if (= n 0) accum (ssq-helper (- n 1) (+ (sqr n) accum)))) (define (ssq n) (ssq-helper n 0))

See the difference? In the new helper function, *there is nothing to do after the recursive call returns*. This means that the stack space for this recursive call can be re-used, without having to wait for the whole computation to finish.

More formally, we say that the recursive call in the `ssq-helper`

function is in *tail position*. Tail position means the last statement of a `let`

, `define`

, or `begin`

, or any clause of an `if`

or `cond`

, or a few other things. Since the `ssq-helper`

recursive call is in tail position within the `if`

, and the `if`

is in tail position in the `define`

, the recursive call is in tail position in the entire function. And when *every* recursive call is in tail position, the function is said to be *tail recursive*.

So what? Well this means that the Scheme interpreter can *optimize* this function call so that a bunch of extra stack space is not actually used for the recursive calls. Essentially, the Scheme interpreter will turn our recursive-looking code (tail recursive, that is) into the iterative code that we might write in C++ or Java. This is called *tail call optimization*, and it's actually a requirement of any Scheme interpreter or compiler to do it.

Here are some general guidelines for writing tail-recursive functions:

- Write a helper function with an extra
*accumulator*variable to store all the extra information associated with the recursion. - The base case return value of the helper function will just be the value stored up in the accumulator argument. (If it's a list, you will often have to
`reverse`

it.) - Make sure the helper function is tail recursive! The computation that you used to do outside the recursive call usually goes inside it now, in the accumulator argument.
- The original function now becomes very simple: it just calls the helper function. The initial value of the accumulator is what used to be returned in the base case (usually
`0`

or`null`

).