Unit 9: Variables

This is the archived website of SI 413 from the Fall 2012 semester.

This unit is about some more aspects of variables, beyond their scope, that vary between different programming languages. We will see some different options for how things work, including perhaps some surprising revelations about languages you know and love. Remember to always ask yourself, why do you think the language designers made the choices they did? There are probably good reasons, even if you disagree or find it annoying!

Beginning of Class 21

1 Assignments

Every language that has variables must have some way of associating values with those variables. This is called an assignment, and we have seen some varying syntaxes for it:

new x := 10;
x := 5;
(define x 10)
(set! x 5)
int x = 10;
x = 5;

This might seem like a completely obvious thing, and you might be wondering what more you could possibly learn about it. Well, you might be surprised! In different languages, what an assignment does can have different meanings (having to do with how things are stored and when things are copied). Furthermore, what kinds of things are allowed on either side of an assignment varies widely between languages. Why this variation, and what kinds of things are allowed?

1.1 Model of variables

Consider the following C++ program:

int x = 5;
int y = x;
++x;

After executing this program, of course x will equal 6, but y will still equal 5! This is because C++ uses the value model of variables by default, which means that the values are copied in memory every time there is an assignment. So the second line (assigning x to y) causes there to be two "5"s stored in memory, only one of which is changed in the third line.

The same exact thing will happen in Java if we run the same program. But what about this Java code snippet?

ArrayList<String> a = new ArrayList<String>();
ArrayList<String> b = a;
a.add("boo");

After this, a is clearly an ArrayList that contains exactly one String, "boo". But what about b? Does it contain "boo" too, or is it empty? The answer is that, in Java, a and b will share the same memory here, and so they both contain "boo". There is really only a single ArrayList in memory in this program, and a and b are just two references to it. This is because, for Objects like ArrayList, Java follows the reference model of variables, which means that assignments don't cause the values to get copied, but just make the new name point to the same thing as the old name.

We can make C++ follow the reference model too, if we want, by using reference variables. Here's the example above, with reference variables:

int x = 5;
int &y = x;
++x;

After executing this code, both x and y will equal 6, because they're both just references to the same variable in memory!
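The Java ArrayList question can be replayed in C++ to put the two models side by side. The following is only a minimal sketch: the function names value_model_size and reference_model_size are made up for illustration, and std::vector stands in for ArrayList.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Value model: assigning a vector copies all of its contents.
std::size_t value_model_size() {
    std::vector<std::string> a;
    std::vector<std::string> b = a;   // b is a separate (empty) copy of a
    a.push_back("boo");               // changes a only
    assert(a.size() == 1);
    return b.size();                  // b never saw the change
}

// Reference model: b is just another name for the same vector.
std::size_t reference_model_size() {
    std::vector<std::string> a;
    std::vector<std::string>& b = a;  // no copy is made
    a.push_back("boo");               // visible through both names
    assert(&a == &b);                 // one object, two names
    return b.size();
}
```

Calling value_model_size() gives 0 and reference_model_size() gives 1, which matches the difference between the int example and the ArrayList example above.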

1.2 l-values and r-values

Every assignment statement has two parts: a left-hand side and a right-hand side. Any kind of code that could go on the right-hand side of an assignment is called an r-value. Similarly, anything that could appear on the left is called (you got it) an l-value.

R-values are pretty straightforward: in most languages, any expression (i.e., any code that returns a value) can be an r-value.

The more interesting question is about l-values. What can be an l-value? It depends on the language. Here are some options:

We saw more examples than this in class, and there are still more kinds of l-values in C++. Part of your homework is to experiment with, research, and demonstrate a few more of these to me.
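As a starting point for experimenting, here is a small sketch of several kinds of expressions that C++ accepts on the left of an assignment. It is not meant to be a complete list, and the names Point, larger, and lvalue_demo are invented for this illustration.

```cpp
#include <cassert>

struct Point { int x; int y; };

// Returning a reference makes the function call itself usable as an l-value.
int& larger(int& a, int& b) { return (a > b) ? a : b; }

int lvalue_demo() {
    int n = 0;
    int arr[3] = {0, 0, 0};
    Point p = {0, 0};
    int* ptr = &n;
    int lo = 1, hi = 2;

    n = 1;                    // a variable name
    arr[1] = 2;               // an array element
    p.y = 3;                  // a struct field
    *ptr = 4;                 // a dereferenced pointer (this overwrites n)
    larger(lo, hi) = 5;       // a call that returns a reference (this sets hi)
    (lo < hi ? lo : hi) = 6;  // a conditional whose branches are both l-values (this sets lo)

    assert(n == 4 && hi == 5 && lo == 6);
    return n + arr[1] + p.y + lo + hi;
}
```

Each of the six assignments compiles because its left-hand side names a storage location rather than just a value. Something like 3 = n or (n + 1) = 2 would be rejected by the compiler.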

Beginning of Class 22

Looking at how assignments can be used in the context of a program provides a nice way to talk about l-values and r-values. In Scheme, a definition such as

(define x 5)

is purely a statement, not an expression of any kind. So it can't be used as an r-value or an l-value. The same is true in our SPL language; an assignment does not return a value.

In many languages, an assignment is an expression and therefore can be used as an r-value. Typically the returned value is whatever value just got assigned. In Java, for example, we can write:

x = (y = 0);

as a shorthand way of setting y to zero, and then setting x to zero (assuming they have already been declared). Actually, the parentheses in the code above are unnecessary, as the assignment operator = is right-associative. Remember what that means?
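The same idea carries over to C++, where an assignment to a built-in type is also an expression. The sketch below assumes nothing beyond the standard language. The function names chained and digit_sum are made up for illustration.

```cpp
#include <cassert>

// Chained assignment: parsed as x = (y = (z = 7)) because = is right-associative.
int chained() {
    int x = 0, y = 0, z = 0;
    x = y = z = 7;
    assert(x == 7 && y == 7 && z == 7);
    return x + y + z;
}

// Using the value of an assignment inside a larger expression:
// sums the decimal digits of n, storing each digit in d along the way.
int digit_sum(int n) {
    int sum = 0;
    int d = 0;
    while (n > 0) {
        sum += (d = n % 10);  // the assignment's value is the digit just stored
        n /= 10;
    }
    return sum;
}
```

If = were left-associative instead, x = y = z = 7 would group as ((x = y) = z) = 7, which would assign to x three times and leave y and z unchanged.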

It is relatively rare, but not unheard-of, for assignments to be valid l-values in a language. Although the following is not valid in Java or Python, we can do this in C++:

(x = 1) = 2;

(Again, this assumes x has already been declared.) What this does is first assign the value 1 to x, then assign 2 to that same variable x. Why would you ever want to do this? I can't think of a good reason, really. But it's valid in the language because C++ returns a reference from an assignment statement, and allows any reference to be used as an l-value in an assignment. You might not want to write code like this, but if you are writing a C++ compiler, you had better be able to handle it!
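To see that the result of a C++ assignment really is the variable itself and not just a copy of its value, one can bind that result to a reference. This is a minimal sketch with hypothetical function names (assign_twice, assignment_yields_variable).

```cpp
#include <cassert>

// Returns the final value of x after assigning to the result of an assignment.
int assign_twice() {
    int x = 0;
    (x = 1) = 2;       // x = 1 yields x itself, which is then assigned 2
    return x;
}

// Binds the result of an assignment to a reference and writes through it.
bool assignment_yields_variable() {
    int x = 0;
    int& r = (x = 1);  // r refers to x, not to a temporary holding 1
    r = 42;            // changes x through r
    return &r == &x && x == 42;
}
```

Here assign_twice() returns 2, and assignment_yields_variable() returns true because r and x share one address.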

1.3 Constants, immutables, clones

There are two distinct concepts that I want you to understand: constants and immutables.

We started this unit by talking about the reference and value models of variables. Immutables provide a mechanism for treating a reference model as if it is a value model. Again, in the example of Strings in Java, we know that since Strings are objects, Java uses a reference model of variables in assignments of Strings. So for instance, the code

String a = "I only live once";
String b = a;

results in b and a pointing to the same memory location. But since Strings are immutable in Java, the programmer can behave as if a and b each had their own copy of the string, since there won't be any difference either way.
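A rough C++ analogue of this situation is a shared pointer to a const string. This is only a sketch under that assumption, not a description of how Java implements String. The type alias ImmutableString and both function names are invented here.

```cpp
#include <cassert>
#include <memory>
#include <string>

// A shared, read-only string: copying the pointer does not copy the text.
typedef std::shared_ptr<const std::string> ImmutableString;

// Both names point at the same string in memory.
bool shares_storage() {
    ImmutableString a = std::make_shared<std::string>("I only live once");
    ImmutableString b = a;       // reference model: one string, two names
    // *b += "!";                // would not compile: the string is const
    return a.get() == b.get();   // same memory location
}

// Rebinding b to a new string leaves the string that a points to untouched.
std::string rebind_only() {
    ImmutableString a = std::make_shared<std::string>("I only live once");
    ImmutableString b = a;
    b = std::make_shared<std::string>("twice");  // b now names a different string
    return *a;
}
```

Since the only thing a program can do with b is read it or point it somewhere else, no change made through b can ever be observed through a. That is what lets the programmer pretend each name has its own copy.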

Clones provide a mechanism for using a value model of variables against the default of the language when immutables are not an option. Really, this is just a fancy name for copying; since the value model of variables means every assignment makes a copy, we can simulate this by manually copying something when it gets assigned. In Java, the Cloneable interface indicates that an object supports the clone method to make a copy of itself. You can see examples of this in the slides for this unit or in any number of places online.
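The effect of cloning can be sketched in C++ by using shared pointers to get reference-model behavior and then copying by hand. The helper clone_list and the alias Names below are hypothetical names chosen for this example. They are not part of any library, and this is not Java's Cloneable mechanism itself.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

typedef std::vector<std::string> Names;

// Hypothetical helper: make an independent copy of the pointed-to list.
std::shared_ptr<Names> clone_list(const std::shared_ptr<Names>& original) {
    return std::make_shared<Names>(*original);  // copy-constructs a new vector
}

// Without cloning, b aliases a, so the change shows up in a.
std::size_t size_after_alias() {
    std::shared_ptr<Names> a = std::make_shared<Names>();
    std::shared_ptr<Names> b = a;
    b->push_back("boo");
    return a->size();
}

// With cloning, b is a separate copy, so a is unaffected.
std::size_t size_after_clone() {
    std::shared_ptr<Names> a = std::make_shared<Names>();
    std::shared_ptr<Names> b = clone_list(a);
    b->push_back("boo");
    return a->size();
}
```

The aliasing version reports a size of 1 for a, while the cloning version reports 0, which is the value-model behavior recovered by copying manually.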

2 Types

Besides assignments, the other thing that (almost) always comes along with having variables in a programming language is types. We all know informally that a type is something like int or String that indicates what kind of thing some value is in a program. More formally, the type of something is the information about how it can be used and combined with other values in a program. As with assignments, there are a plethora of options when it comes to types, and we will explore a few of them.

It's important to remember that just because types are not declared does not mean that a language has no type safety! For example, the following program is invalid in Python or in C++:

x = 5;
x();

(The same variable x is being assigned to an integer and then accessed like a function.) The difference is that in C++, x would have to have been declared as either an int or a function, and then one of these lines would raise a compile-time error. In Python, this error would not be realized until this part of the code is actually reached by the running program.
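To make the run-time check concrete, here is a rough C++ emulation of a variable whose type is only known while the program runs. It is a sketch under stated assumptions, not how Python is actually implemented. The Value struct and the helpers make_int, make_func, call, and twice are all invented for this example.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical tagged value: holds either an int or a function, and remembers which.
struct Value {
    enum Tag { Int, Func } tag;
    int number;
    int (*func)(int);
};

Value make_int(int n) { Value v = {Value::Int, n, nullptr}; return v; }
Value make_func(int (*f)(int)) { Value v = {Value::Func, 0, f}; return v; }

// The type check happens here, at run time, only when a call is attempted.
int call(const Value& v, int arg) {
    if (v.tag != Value::Func) {
        throw std::runtime_error("value is not callable");
    }
    return v.func(arg);
}

int twice(int n) { return 2 * n; }

// Returns true if calling an int-tagged value is rejected at run time.
bool calling_int_fails() {
    Value x = make_int(5);
    try {
        call(x, 3);
    } catch (const std::runtime_error&) {
        return true;
    }
    return false;
}
```

The C++ compiler accepts call(x, 3) because, as far as the static types go, x is just a Value. The mistake only surfaces when that line actually executes, which is the same timing as the Python error described above.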