Reading

Required: These notes!
Recommended: Java in a Nutshell, TO APPEAR

Separating interface from implementation: supporting vs. enforcing

What we saw last class is that OOP views programs not as a collection of functions, but as a collection of objects, each of which combines data and functions (fields and methods, in Java parlance) in a single bundle. So what is the "interface" and "implementation" we want to separate? The interface is the collection of prototypes for the member functions (methods) we provide for outside use, while the implementation consists of the definitions of those functions, data members (fields), and any helper functions we use to make the object work, but which we don't really intend users of the class to call.

Given what we've seen so far, Java allows us to separate the interface from the implementation, but it doesn't enforce that separation. Programmers who make use of our classes can access data members (fields) and helper functions, even if we don't intend them to. This is an important point. It means that we can hope that folks stay away from our implementation, but we can't rely on it. Why does this matter? Well, for starters, it means reckless programmers can mess up our code — say by setting a data member (field) to an inappropriate value, for example setting a "distance" data member to a distance in miles when you were assuming the value was in kilometers. It also means that you cannot change your implementation without having to worry about breaking the code of folks who rely on your classes. After all, they may have been using elements of the implementation unbeknownst to you.
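To make the danger concrete, here is a minimal sketch. The class OpenTrip and its members are hypothetical names invented for this illustration, and the field is deliberately left without an access modifier so that outside code can reach it.

```java
// Hypothetical class for illustration only; not part of the course code.
class OpenTrip {
    double distance; // the author ASSUMES kilometers, but nothing enforces it

    // Intended interface: travel time at a speed given in km/h.
    double hoursAt(double kmPerHour) {
        return distance / kmPerHour;
    }
}

class OpenTripDemo {
    public static void main(String[] args) {
        OpenTrip t = new OpenTrip();
        // Outside code writes the field directly and happens to store miles.
        // The compiler accepts this, so hoursAt now computes with the wrong unit.
        t.distance = 62.0;
        System.out.println(t.hoursAt(100.0));
    }
}
```

Nothing here is illegal, which is exactly the problem: the author of OpenTrip can no longer rename distance or change its unit without risking every program that touches it.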

To truly realize the benefits of OOP (and, indeed, of separation of interface from implementation) we need to enforce this separation. The mechanism for this is access modifiers.

Access Modifiers: public, private, protected

Access modifiers allow the programmer to indicate what can be used (called, assigned to, read from) by what parts of a program in a way that is enforced not just by the compiler but, more importantly, by the JVM!

There are three basic modifiers: public, private, protected. Within the scope of the language as we know it, only public and private are meaningful. The access modifier (if present) is the very first thing in a declaration.

The golden rule: In most situations the rule to follow is simple. Make the class itself and all member-functions (methods) you intend outsiders to use public, and make all other member-functions (methods) and all data-members (fields) private. If you do this, you have a well-defined interface (the public methods) and a well-defined implementation (everything else), and their separation is enforced by the compiler and the JVM.
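A small sketch of the golden rule follows. The class Trip, its methods, and the validity check are all hypothetical, chosen only to show one private field, one private helper, and a public interface. In its own file the class would be declared public; it is left without the modifier here only so the sketch fits in a single file.

```java
// Hypothetical class; in Trip.java it would read "public class Trip".
class Trip {
    private double kilometers;               // data: private

    public void addKilometers(double km) {   // interface: public
        if (isValid(km)) kilometers += km;
    }

    public double totalKilometers() {        // interface: public
        return kilometers;
    }

    private boolean isValid(double km) {     // helper: private
        return km >= 0;
    }
}

class TripDemo {
    public static void main(String[] args) {
        Trip t = new Trip();
        t.addKilometers(10);
        t.addKilometers(-3);                 // ignored by the private check
        System.out.println(t.totalKilometers()); // prints 10.0
        // t.kilometers = -5;   // would not compile: kilometers has private access in Trip
        // t.isValid(1);        // would not compile: isValid has private access in Trip
    }
}
```

Because outside code can only go through addKilometers, the object can never hold a negative total, whatever callers do.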

Initializing objects

In order to free those using your class from remembering to call initialization routines, and in order to allow you as a class implementor to be sure that your objects never get corrupted or in a bad state (meaning that the values in data-members (fields) are somehow wrong), you need to be able to control the initialization of objects. Java has three different ways to initialize non-static (i.e. the usual kind!) data members:

A properly designed "Batter" class

Since good "data hiding" design says keep data "private", here is a good implementation of the Batter class from last lecture:
Batter.java
public class Batter
{
  private int hits;
  private int atBats;

  public void record(String outcomes)
  {
    for(int i = 0; i < outcomes.length(); i++)
    {
      if (outcomes.charAt(i) == 'h') hits++;   // 'h' records a hit
      if (outcomes.charAt(i) != 'w') atBats++; // a walk ('w') does not count as an at-bat
    }
  }

  public double average()
  {
    return (double)hits / atBats;
  }
}

We can see one benefit of following the data-hiding stricture right away. No matter how anyone else uses this class, no matter what they do with it, no matter how they rely on it, if I change the class to the following, it all still works!

Batter.java
public class Batter
{
  private int hits;
  private int outs;
  private int walks;

  public void record(String outcomes)
  {
    for(int i = 0; i < outcomes.length(); i++)
    {
      if (outcomes.charAt(i) == 'h') hits++;
      if (outcomes.charAt(i) == 'o') outs++;
      if (outcomes.charAt(i) == 'w') walks++;
    }
  }

  public double average()
  {
    return (double)hits / (hits + outs);
  }
}
How do I know that? Because the only things that changed (the fields and the definitions of methods) were things that no code outside the class could ever make use of, access, or touch in any way. So these changes could not possibly affect any outside code!
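As a rough check of that claim, the sketch below puts the two versions side by side under the hypothetical names BatterV1 and BatterV2 (renamed only so that both can live in one file) and drives each one through nothing but its public methods.

```java
// First version: counts hits and at-bats.
class BatterV1 {
    private int hits;
    private int atBats;

    public void record(String outcomes) {
        for (int i = 0; i < outcomes.length(); i++) {
            if (outcomes.charAt(i) == 'h') hits++;
            if (outcomes.charAt(i) != 'w') atBats++;
        }
    }

    public double average() { return (double) hits / atBats; }
}

// Second version: counts hits, outs, and walks separately.
class BatterV2 {
    private int hits;
    private int outs;
    private int walks;

    public void record(String outcomes) {
        for (int i = 0; i < outcomes.length(); i++) {
            if (outcomes.charAt(i) == 'h') hits++;
            if (outcomes.charAt(i) == 'o') outs++;
            if (outcomes.charAt(i) == 'w') walks++;
        }
    }

    public double average() { return (double) hits / (hits + outs); }
}

class BatterDemo {
    public static void main(String[] args) {
        BatterV1 a = new BatterV1();
        BatterV2 b = new BatterV2();
        a.record("hhowo");   // 2 hits, 2 outs, 1 walk
        b.record("hhowo");
        System.out.println(a.average()); // prints 0.5
        System.out.println(b.average()); // prints 0.5
    }
}
```

One caveat: the two versions agree as long as the outcome string contains only 'h', 'o', and 'w'. The first version counts any other character as an at-bat, while the second ignores it.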

"static" methods and fields

Access levels (public/private) control who can access something. We now discuss a different modifier, static, that determines whether a field or method belongs to one particular instantiated object or to the class as a whole, independent of any one object.

Up to this point, we've made a big deal that all member fields in a class have separate copies in each instance of the class, so if we have two variables: Point one,two; then one.x is a different variable from two.x.

There is an exception to this. We can declare a member field as static. In that case there is a single variable that is shared between all instances.

  class Point {
    int x,y;
    static int num;
  }

In this case one.num is the same variable as two.num. If I change one.num, I have changed two.num as well.
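A short sketch of the difference, using the Point class above plus a hypothetical demo class:

```java
class Point {
    int x, y;
    static int num;
}

class SharedDemo {
    public static void main(String[] args) {
        Point one = new Point();
        Point two = new Point();

        one.x = 3;    // changes only the x that belongs to one
        one.num = 7;  // changes the single num shared by every Point

        System.out.println(two.x);   // prints 0
        System.out.println(two.num); // prints 7
    }
}
```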

This is handy for information that should be shared across all objects of the class. Imagine you wanted to assign a unique ID number to each object. Keeping a static field for the next free ID would be handy.

But how do we name num before any Point object exists? We can't write one.num, because there is no object yet to name it through. The answer is to use the name of the class, not the name of a variable holding an instance of the class. A static int field starts out at 0 by default, and we could set it explicitly by saying Point.num=0; Note that this class-name syntax is only valid for static fields.

Stylistically, since we can always access num via the class name instead of a variable name, it has become preferred that we always do. This way it signals to the reader of our code that this is not a regular member field.
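The unique-ID idea and the class-name syntax can be sketched together. Widget, nextId, and id are hypothetical names, and the fields are left non-private only to mirror the Point example; the sketch assumes that initializing a field at its declaration is acceptable here.

```java
// Hypothetical class for illustration.
class Widget {
    static int nextId = 0;  // one shared counter for the whole class
    int id = nextId++;      // each new object takes the current value, then bumps the counter
}

class WidgetDemo {
    public static void main(String[] args) {
        Widget.nextId = 100;   // legal before any Widget exists: named through the class
        Widget a = new Widget();
        Widget b = new Widget();
        System.out.println(a.id + " " + b.id + " " + Widget.nextId); // prints 100 101 102
    }
}
```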

Another use for the static modifier on fields is in conjunction with final. Final means that the value of this thing cannot be changed, which makes it good for a constant:

  class Tools {
    final double SQRT2 = 1.41421356;
  ...}
    

but the problem with this is we couldn't access this thing without creating an object of type Tools:

  Tools t = new Tools();
  double d = t.SQRT2;
    

That seems wasteful, but if static fields exist even when no objects of that type exist, then we can declare it as:

  class Tools {
    static final double SQRT2 = 1.41421356;
  ...}
    

and access it directly using the class name:

      double d = Tools.SQRT2;
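Put together as something that compiles on its own, a stripped-down Tools holding only the constant might look like the sketch below. The demo class is hypothetical.

```java
class Tools {
    static final double SQRT2 = 1.41421356;
}

class ToolsDemo {
    public static void main(String[] args) {
        // No Tools object is ever created.
        double diagonal = 3 * Tools.SQRT2;
        System.out.println(diagonal);
        // Tools.SQRT2 = 1.5;  // would not compile: cannot assign a value to final variable SQRT2
    }
}
```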
    

We've seen the keyword static also applied to methods. A static method still is a member method of that class, but like static fields, it is not associated with any particular object. What this means is that inside this method, you cannot access any member fields that are not static. Consider adding the following to Point:

  public static int foo() {
    return x;
  }
    

If foo were not static, we could happily do one.foo(), and it would access the x field of the object one. But, since this is static, we would access this function as Point.foo(). So which object's x do we access? It's not clear!

The compiler agrees with your confusion. For this reason, we would get a compile time error: "non-static variable x cannot be referenced from a static context."

However, there is no impediment to accessing other static fields or functions:

  public static int foo() {
    return num; // the field 'num' is declared static above
  }
    

Static methods are used in two cases:

  1. When we only want to access static fields in the class, like the example above.
  2. When we don't need to access any of the state of an object because all the information we need is in the arguments. This is usually for utility functions that are best called in a structured/procedural way. The built in math functions are all examples of that: Math.pow(3,2);
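Both cases can be sketched on the Point class. The method foo is the one from above; distance is a hypothetical utility added for this illustration.

```java
class Point {
    int x, y;
    static int num;

    // Case 1: touches only static state.
    public static int foo() {
        return num;
    }

    // Case 2: everything it needs arrives as arguments (hypothetical helper).
    public static double distance(Point a, Point b) {
        int dx = a.x - b.x;
        int dy = a.y - b.y;
        return Math.sqrt(dx * dx + dy * dy);
    }
}

class StaticDemo {
    public static void main(String[] args) {
        Point.num = 2;
        Point one = new Point();   // x and y default to 0
        Point two = new Point();
        two.x = 3;
        two.y = 4;

        System.out.println(Point.foo());              // prints 2
        System.out.println(Point.distance(one, two)); // prints 5.0
        System.out.println(Math.pow(3, 2));           // prints 9.0
    }
}
```

Note that distance does read non-static fields, but only through the objects a and b it was handed. What a static method cannot do is refer to a bare x, because there is no current object for that x to belong to.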