Structs

Compelling introduction

This lesson begins a big new topic for us: user-defined types, or structs as they are called in C and C++. This is something that you have, I hope, already felt the need for. Let's consider a few example problems:

Suppose I want a function midpoint that takes two points and returns their midpoint.
Suppose I want to read in a list of 20 Midshipmen names and alpha codes, and print out the Midshipmen names ordered by class year.
Suppose I want to store a bunch of student names along with their grades on 10 homework assignments.

Each of these is something we can do (think about how), but only with difficulty. The problem is that in each case we are working with "physical" objects that do not have a corresponding built-in type in C++.

It would be natural to write a midpoint function if there were a type called point that encapsulated both the x and y coordinates; its prototype would be point midpoint(point a, point b);.
It would be natural to sort 20 Midshipmen ordered by alpha codes (which would order by class year) if there were a type mid that encapsulated both alpha code and name --- I'd have an array mid *A = new mid[20].
Finally, it'd be natural to store student names along with homework info if there were a type student; I'd just store it in an array of student objects.

Clearly, all of these problems scream out for the ability of the user to wrap up one or more existing types into one package and call it a new type. In C++, struct is the mechanism that allows you to do this.

The simple example of `point`

Let's take the simple example of our midpoint function. We decided that the existence of a type point would make such a function simple and natural. We need to wrap up a double for the x-coordinate and a double for the y-coordinate into a single object of a new type, point. Here's how that's accomplished in C++:


struct point {   // <-- Declares a new type called "point"
  double x, y;   // <-- Says that a "point" contains two doubles named x and y
};               // <-- Don't forget the semicolon!

This struct definition, like function definitions, appears outside of main or of any other function definitions, and it must appear before you try to use an object of type point. From the point of this definition onwards you can use point as a new type. If you want to access the double x within a point object named P, you write P.x. Note that P.x is an object of type double, so anything you can do with a double you can do with P.x! Moreover, it is an l-value: it can be assigned to, passed by reference, etc. The objects packaged together in a new struct are called data members. We'll start off simple by creating an object of type point, reading values into the object, and printing it out:

Quick check

Consider the code below.

What is the type of P?
What is the type of P.x?
Is P an l-value?
Is P.x an l-value?

Answers are given below.

1. point    2. double    3. yes    4. yes


int main() {
  // Creates an object P of type point
  point P;

  // Reads & stores coordinate values
  cout << "Enter x-coord: ";
  cin >> P.x;
  cout << "Enter y-coord: ";
  cin >> P.y;

  // Writes out point P
  cout << "Point is (" << P.x
       << ',' << P.y << ")" << endl;

  return 0;
}

One very important thing to note here is that we can't say cin >> P in order to read into point P. Nor can we say cout << P in order to write out point P. Why? Because cin and cout know nothing about the type point! On the other hand, cin >> P.x works perfectly well, because cin is just reading into a double, which we know it does just fine. The same restrictions apply in C with scanf. Now, let's look at defining the function midpoint:


point midpoint(point a, point b) {
  point m;
  m.x = (a.x + b.x)/2;
  m.y = (a.y + b.y)/2;
  return m;
}

Hopefully this code is pretty much self-explanatory. Notice that by wrapping up two doubles in the type point I can, in a sense, return two objects from a function! Take a look at this complete program that reads two points from the user and prints out their midpoint.
Note: The diagram on the right illustrates the call stack when we make a call like midpoint(P,Q).

Q: What is the type of midpoint(P,Q)?

Answer: point

What does the compiler know how to do with our new types?

cin and cout don't work

As we just saw, when we define a new type using struct, cin doesn't know how to read that type, and cout doesn't know how to write that type. In fact, the compiler doesn't know how to do any of the implicit or explicit conversions with your new types, so neither double(P), where P is an object of type point, nor point(j), where j is an object of type int, will be recognized.

assignment and pass-by-value work

The only things the compiler knows how to do with your user-defined type are assignment using the = operator, and copying for pass-by-value arguments and return values in function calls. These are done by assigning/copying each data member independently.

scoping rules and creation with new

Most importantly, however, objects of user defined type are created and destroyed and passed around just like any other type: scoping rules are the same, creation with new is the same, parameter passing is the same, parameter type matching for function overloading is the same ... all of these things you've already learned still apply.

Note: The string, ifstream, and ofstream objects that we've already been using are structs rather than built-in types.

Heterogeneous Data

Although it would be painful, we could imagine implementing our midpoint function with arrays of two doubles. Believe me, it'd be painful! Where things really get interesting is when we wrap up objects of different types in one object, because there we really can't use arrays like we could for points. Let's think about our example of Midshipmen names and alpha codes. We might define a mid as follows:


struct mid {
  int alpha;
  string first, last;
};

Notice that this struct has three data members, one of type int and two of type string. Let's consider using this type in the following problem: The file Mids.txt contains the names and alpha codes of the Midshipmen in my two sections. I want to write a program that will read that data and store it in an array for later processing.

To test what I've done, we'll simply allow the user to enter an alpha, and we'll return the name of the Mid with that alpha, or an error message if none is found. Creating the array and reading in data from the file is easy:

Quick check: Consider the code below.

What is the type of new mid[41]?
What is the type of A?
What is the type of A[i]?
What is the type of A[i].alpha?
What is the type of A[i].first?
What will be the value of A[0].alpha?
What will be the value of A[0].last?

Answers are below.

1. mid*     2. mid*     3. mid        4. int    
5. string   6. 166030   7. SHERIDAN


// Create an array of 41 Mids
mid* A = new mid[41];

// Open file
ifstream fin("Mids.txt");

// Read in mids
for(int i = 0; i < 41; i++)
  fin >> A[i].alpha >> A[i].last >> A[i].first;

By the way:
C++ has two keywords for user-defined types: struct and class. They are more or less the same, except that in structs members are public by default, and in classes they are private by default. (Of course that distinction won't make any sense at this point, but in case you come back and read this later ...)

The two keywords historically come from different places. C++ gets struct from C, the language that C++ extends. The term class comes from "object-oriented programming", which is a style of programming C++ supports, but which we do not cover in this course.

We are going to use the keyword struct to get you familiar with what you'll see in the context of C programming in your Systems Programming course. Of course this means that in your Object-Oriented Programming course (taught using Java) the keyword class will be familiar.

Problems

Write a program that reads in three points describing the vertices of a triangle and computes the midpoint triangle they define, i.e. the triangle whose vertices are the three midpoints of the previous triangle. A typical run of your program should look like:
```
Enter triangle vertices: (0,0) (0,1) (1,0)
Midpoint triangle verts: (0,0.5)(0.5,0.5)(0.5,0) 
```
Notice how my solution defines functions for writing and reading points!
How about something completely different: organizing data on our congressional representatives. This tab-separated values (tsv) file contains a bunch of information about all 538 members of the current House and Senate (source). Some of the code to read in this data is in this C++ program, but it's right now limited to just sorting by first name. Use a struct to write a program that prints out the 10 youngest congresspeople, first and last names. Here is my solution.