SI413: Static vs. Dynamic Typing

Reading

Programming Language Pragmatics, Chapter 7 up through 7.1.1, plus the box at the bottom of p.310.

Homework

Study for exam, finish Project Phase II, work on labs.

Problems with no type checking

Right now our interpreter does some type checking (where?) but not a lot. So we can do some funny and less-funny things:

spl> new f := lambda x { ret := x*x; };
spl> write 0+f;
764787504
spl> write 764787504 = f;
true

This is funny ... I got the interpreter to say that a number and a function are equal. Silly interpreter! How about:

spl> new f := 10;
spl> write f@7;
Segmentation fault (core dumped)

This is decidedly less funny ... I got the interpreter to segfault. We like the idea that systems know how to print or read or otherwise manipulate objects based on their type, rather than having the programmer specify everything. Moreover, we like interpreters to recognize illegal operations and do the print-error-and-return-to-interpreter thing rather than crash. In our compiled code, we like errors like type mismatches to be found so that we don't have mysterious crashes once the code gets run. So type checking is important stuff.

What is "type"
A "Type System" provides:
1. mechanism for defining types and associating them with basic program elements, e.g. variable declarations.
2. rules for type inference - i.e. given an expression and the types of its subexpressions, infer the type of the expression.
3. mechanism for determining "type equivalence"
4. mechanism for determining "type compatability" - i.e. in a given context can two types be used interchangably
Do functions have types? Depends on whether functions are 1st/2nd/3rd class objects.
Do types have type? Dependes on whether types are 1st/2nd/3rd class objects.

Static vs. Dynamic

There are two basic ways to do type checking: static or dynamic. You should recognize those words by now! Dynamic means things that happen at runtime (e.g. as the SPL program executes). Static means things that happen prior to runtime ... that would be compile time in a compiled language and in a dynamic language that would be after parsing but before executing.

Dynamic Typing

Basically every object in the executing program includes a tag that tells you its type. In your SPL interpreter we are actually doing this already:

enum VType { NUM_T, BOOL_T, FUN_T, NONE_T };

class Value {
private:
  // The value is either an int, a bool, or a pointer to a Lambda.
  union { int num; bool tf; Closure func; } val;
  VType type;
  ...

Here the "VType type;" is tagging the value with its type. Now we can go back and make sure that every operation checks that the values it gets as input have the appropriate type, and issue errors if not. (Sounds like a good lab!) This sounds great (and it is!), but is there any downside?

Efficiency!
Type errors don't show up until a program is running (and then it might only show up with certain kinds of inputs). Errors at runtime are a whole lot worse to deal with than errors at compile time (or errors that are discovered during code analysis prior to execution for languages that aren't compiled).

Static Typing

In a statically-typed language the type of every expression is known prior to executing a program. This allows (see above) for faster and safer code. So how is static typing accomplised? Type Inference!

Type inference just means deducing the type of an expression based on the types of its constituent elements and the operators appearing in the expression. For example, in C if we see 3.25 + x and know x is a double, we deduce that because 3.25 and x are doubles, and givfen what we know about "+", the result will be a double. We can actually do type inference in our SPL interpreter. If you see new i := read * 42; we can deduce that the right-hand side is a NUM, whereas if you see new i := x OR false; we can deduce that the right-hand side is a BOOL. But .... we can't always do this! For example, in

new x := 42;	  
if read > 10 { x := lambda z { ret := z*z; } }
write x;

... what about the "x" in "write x"? We can't deduce its type!

So what needs to be added to the mix in order to do static type checking? Typically we require type annotations, which in this casee would mean declaring the type of the variable x, and only allowing assignments to x that we could guarantee would be of that type. Of course Java/C/C++ work this way. Other languages ensure that each variable has a single consistent type throughout its scope without the programmer always stating that type explicitly. In fact, starting with C++11, C++ allows this some times. Consider this example:

vector< stack<int> > V;
auto x = V[1]; // the compiler will deduce that x has type stack<int>

A crucial part of the above is that functions have types. This allows us to deduce types for expressions involving function calls. Type expressions for functions in C, by the way, can get pretty ugly. Below is a simple example:

#include <stdio.h>

int f(double x) { int a = x; return x == a ? a : a+1; }
int g(double x) { int a = x*100; return a; }

int main()
{
  int (*foo)(double); // declares "foo" to be a function that takes a double and returns an int
  int u, v;
  scanf("%d",&u);
  if (u > 0)
    foo = f;
  else
    foo = g;

  printf("%d\n",foo(3.14159));
  
  return 0;
}

Type declarations in C can get quite gruesome. Check out the left to right rule with examples from here. If you take the time to look into that, you should be able to to figure out what type foo has in the declaration below:

double* (*foo[])(int)

Is foo an array? Is foo a function? If foo is an array, what type does foo[0] have? If it's a function, what type does foo(0) have? Yikes! Using the left-to-right rule starting at foo, we get that foo is an array of objects of type "blah", where "blah" is

double* (*blah)(int)

In other words, "blah" is a pointer to a function that takes an int and returns a double*. So, foo is an array of objects, each of which is a pointer to a function that takes an int and returns a double*. So

foo[0](3)[0]

... is a double!

Type expressions the language Julia, which is a much newer language, are much easier to read, and are often deduced for you by the system rather than something the programmer has to explicitly write. Here's an example (note that Julia is interpreted, so what you see here is an interpreter session):

julia> g(x::Int32,y::Float64) = [(1,x*y + 1),(x,y)]
g (generic function with 2 methods)

julia> typeof(g(2,3.3))
Array{Tuple{Int32,Float64},1}

What you should notice here is that the interpreter deduced that the result of the call g(2,3.3) was Array{Tuple{Int32,Float64},1}.