SI413:

Reading

Section 3.3 of Programming Language Pragmatics

Homework

NO HW.

Single Global Scope (review)

We discussed this in detail in our previous class. This is what we might expect from, for example, a simple calculator program. From an implementation side, this means that we keep a single global symbol table --- i.e. a structure that maps names to values.

Dynamic Scope - Central Reference Table (review)

We discussed this in detail in our previous class and you are implemented (a slightly simplified version) of this for lab. From the semantics side, dynamic scope means that a non-local reference of a name 'x' in function 'f' refers to the binding of 'x' that was in scope when the function 'f' was called. From the implementation side, this means that we keep track of bindings in a single global structure that maps names to stacks of values. This data strucure is called a central reference table. This is actually a very simple mechanism, from an implentation perspective, which is nice. The only complexifying issue is that we need to track when variables come to the end of their scope so that we know to pop them off the stack. This is the thing I did not ask you to do in lab. It requires a step in between constructing the AST and actually eval/exec-ing the AST. This "analysis" stage traverses the AST finding "new" statments with a parent block node, and adding a new "endscope" node to the end of the block that reminds the block which names to pop. I decided this was a bit more work than I wanted for the lab, but maybe it's too bad we didn't do it. First off, then we'd really have a full dynamic scope implementation ... which is cool. Second of all, the idea of analysing code by traversing the AST is really important!

Lexical Scope - (review)

The semantics of lexical scope are that non-local references resolve to the binding that was in scope at the point at which the function was defined (as opposed to when the function was called). Note that the languages you're used to all use lexical (aka static) scoping. So how about implementation? Well ... that's tricky. We'll look at several different scenarios

Lexical Scope - Locals and Globals only

Recall that Standard C allows only local scope, i.e. the local variables in a function, and global scope, i.e. one global symbol table. In other words, the only non-local references that are allowed are global variables. Pragmatically, this means that you cannot define functions within other functions (this is often referred to as "nested functions"). By limiting functions to being defined only in the global scope, C ensures that non-local references can only refer to global variables under the lexical scoping rules. So presto - local and global scope only! But how to implement this?

Easy ... this is the good old call stack you know and love! Each function call results in a new record being pushed to the call stack. Each record on the stack is its own scope - basically its own symbol table. Give a name, either that name is a local binding, i.e. you get the value from the mini-symbol table that is the function call stack record, or it is global. Those are the only options. We can see this in action with the following program:

int y = 10;

int f(int x) { return x + y; }

int g(int z) {
  int x = 5;
  return f(x+1);
}

int main() {
  g(2);
}

Lexical Scope - nested functions

The good old call stack as per the previous section is not quite enough when we allow nested functions, i.e. functions defined within functions. Pascal, for example, allows this, as does gcc's default (and non-standard!) C complier. With nested functions we may have non-local references that are not global. Concretely, consider the following modification of the program from the previous section, where we've nested the defintion of f within the definition of g, and we've changed f to return x + y + z.

int y = 10;

int g(int z) {
  int x = 5;
  int f(int x) { return x + y + z; }
  return f(x+1);
}

int main() {
  g(2);
}

The interesting part to this is that the "z" referred to in f's return statement is not a local binding, nor is it a global binding. The semantics of lexical scope are that we must use the binding that is in scope when f is defined, and f is defined inside g. This means the z binding we are referring to is the parameter to the function g. So our good old call stack doesn't work, since we won't find the binding for z in f's call stack record nor in the global symbol table. We need a little more.

The way lexical scoping is usually handled when nested function definitions are allowed is that the call stack record for a function with a defintion that is nested within another function (like f is nested in g in the above example) is augmented with a pointer to the stack record for the function call in which it was defined. So at the point at which x + z + z is evaluated the picture looks like this:

So now when we evaluate x + y + z, the x is local (i.e. we find it in the call stack record for f), the z is not local, but we follow the parent pointer down and find it in the call stack record for the parent (i.e. g), and y is not local, nor do we find it in the parent, so it must be global --- and we find it in the global symbol table.

So life is good! We can implement lexical scope with nested scopes - which are really functions that get created at runtime, which is cool! However, true first class functions would mean also being able to return the function f from g. Note, not returning the result of calling f, rather actually returning the function f. And what we'll see next class is that this causes serious trouble! So we're not done with lexical scope yet!