# Structs

## Compelling introduction

This lesson begins a big new topic for us --- user defined types, or structs as they are called in C and C++. This is something that you have, I hope, already felt the need for. Let's consider a few example problems:
• Suppose I want a function `midpoint` that takes two points and returns their midpoint.
• Suppose I want to read in a list of 20 Midshipmen names and alpha codes, and print out the Midshipmen names ordered by class year.
• Suppose I want to store a bunch of student names along with their grades on 10 homework assignments.
Each of these is something we can do (think about how), but only with difficulty. The problem is that in each case we are working with "physical" objects that do not have a corresponding built-in type in C++.
• It would be natural to write a midpoint function if there were a type called `point` that encapsulated both the x and y coordinates --- its prototype would be `point midpoint(point a, point b);`.
• It would be natural to sort 20 Midshipmen ordered by alpha codes (which would order by class year) if there were a type `mid` that encapsulated both alpha code and name --- I'd have an array `mid *A = new mid[20]`.
• Finally, it'd be natural to store student names along with homework info if there were a type `student` --- I'd just store it in an array of `student` objects.
Clearly, all of these problems scream out for the ability of the user to wrap up one or more existing types into one package and call it a new type. In C++, struct is the mechanism that allows you to do this.

## The simple example of `point`

Let's take the simple example of our midpoint function. We decided that the existence of a type `point` would make such a function simple and natural. We need to wrap up a `double` for the x-coordinate and a `double` for the y-coordinate into a single object of a new type - `point`. Here's how that's accomplished in C++:
``````
struct point {   // <-- Declares a new type called "point"
  double x, y;   // <-- Says that a "point" contains two doubles named x and y
};               // <-- Don't forget the semicolon!
``````
This struct definition, like a function definition, appears outside of `main` and of any other function definition, and it must appear before you try to use an object of type `point`. From the point of this definition onwards you can use `point` as a new type. If you want to access the `double x` within a `point` object named `P`, you write `P.x` --- note that `P.x` is an object of type `double`, so anything you can do with a `double` you can do with `P.x`! Moreover, it is an l-value, so it can be assigned to, passed by reference, etc. The objects packaged together in a new `struct` are called data members. We'll start off simple by creating an object of type `point`, reading values into the object, and printing it out:

#### Quick check

Consider the code below.
1. What is the type of `P`?
2. What is the type of `P.x`?
3. Is `P` an l-value?
4. Is `P.x` an l-value?
```
1. point    2. double    3. yes    4. yes
```
``````
int main() {
  // Creates an object P of type point
  point P;

  // Reads & stores coordinate values
  cout << "Enter x-coord: ";
  cin >> P.x;
  cout << "Enter y-coord: ";
  cin >> P.y;

  // Writes out point P
  cout << "Point is (" << P.x
       << ',' << P.y << ")" << endl;

  return 0;
}
``````
One very important thing to note here is that we can't say `cin >> P` in order to read into `point` `P`. Nor can we say `cout << P` in order to write out `point` `P`. Why? Because `cin` and `cout` know nothing about the type `point`! On the other hand, `cin >> P.x` works perfectly well, because `cin` is just reading into a `double`, which we know it does just fine. The same restrictions apply in C with `scanf`. Now, let's look at defining the function `midpoint`:
``````
point midpoint(point a, point b) {
  point m;
  m.x = (a.x + b.x)/2;
  m.y = (a.y + b.y)/2;
  return m;
}
``````
Hopefully this code is pretty much self-explanatory. Notice that by wrapping up two `double`s in the type `point` I can, in a sense, return two objects from a function! Take a look at this complete program that reads two points from the user and prints out their midpoint.
Note: The diagram on the right illustrates the call stack when we make a call like `midpoint(P,Q)`.

Q: What is the type of `midpoint(P,Q)`?

`Answer: point`
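Because a call like `midpoint(P,Q)` is itself an object of type `point`, it can be used anywhere a `point` is expected: its data members can be read directly, or the whole result can be passed straight into another call. The following is only a sketch to illustrate that idea; `quarterPoint` and `midpointX` are hypothetical helpers made up for this example and are not part of the lesson's program.

```cpp
struct point {
  double x, y;
};

point midpoint(point a, point b) {
  point m;
  m.x = (a.x + b.x)/2;
  m.y = (a.y + b.y)/2;
  return m;
}

// Hypothetical helper: the point one quarter of the way from a to b,
// built by passing the result of one midpoint call into another.
point quarterPoint(point a, point b) {
  return midpoint(a, midpoint(a, b));
}

// Hypothetical helper: the call midpoint(a, b) has type point,
// so we can access its data member x directly.
double midpointX(point a, point b) {
  return midpoint(a, b).x;
}
```

For the points (0,0) and (4,8), `midpoint` gives (2,4), so `quarterPoint` gives (1,2) and `midpointX` gives 2.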

## What does the compiler know how to do with our new types?

#### cin and cout don't work

As we just saw, when we define a new type using `struct`, `cin` doesn't know how to read that type, and `cout` doesn't know how to write that type. In fact, the compiler doesn't know how to do any of the implicit or explicit conversions with your new types, so that neither `double(P)`, where `P` is an object of type `point`, nor `point(j)`, where `j` is an object of type `int` will be recognized.
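Since `cin` and `cout` cannot handle a `point` directly, one common workaround is to write small helper functions that read and write the data members one at a time. The sketch below assumes a point is typed as two numbers separated by whitespace; the names `readPoint`, `writePoint`, and `pointToString` are hypothetical, chosen just for this illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

struct point {
  double x, y;
};

// Reads a point given as two numbers, e.g. "1.5 2", one member at a time.
// Taking the stream as a parameter lets this work with cin or a file stream.
point readPoint(istream& in) {
  point p;
  in >> p.x >> p.y;
  return p;
}

// Writes a point in the form (x,y), one member at a time.
void writePoint(point p, ostream& out) {
  out << '(' << p.x << ',' << p.y << ')';
}

// Formats a point as a string by writing it into a string stream.
string pointToString(point p) {
  ostringstream out;
  writePoint(p, out);
  return out.str();
}
```

With helpers like these, `point P = readPoint(cin);` and `writePoint(P, cout);` stand in for the `cin >> P` and `cout << P` that the compiler rejects.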

#### assignment and pass-by-value work

The only things the compiler knows how to do with your user-defined type are assignment using the `=` operator, and copying, both for pass-by-value arguments in function calls and for values returned from functions (as in `midpoint`). These are done by assigning/copying each data member independently.
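To see what member-by-member copying means in practice, here is a small sketch. The function names `shiftedRight` and `copyThenChange` are hypothetical and exist only for this illustration.

```cpp
struct point {
  double x, y;
};

// p is passed by value, so it is a copy of the caller's point.
// Changing p here has no effect on the original.
point shiftedRight(point p, double dx) {
  p.x = p.x + dx;
  return p;
}

// Shows assignment: b = a copies each data member of a into b.
// Afterwards a and b are independent objects.
point copyThenChange(point a) {
  point b;
  b = a;        // copies a.x into b.x and a.y into b.y
  a.x = 100;    // changes a only; b keeps the old value
  return b;
}
```

If `a` is the point (1,2), then after `point s = shiftedRight(a, 3);` the point `s` is (4,2) while `a` is still (1,2).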

#### scoping rules and creation with new

Most importantly, however, objects of user defined type are created and destroyed and passed around just like any other type: scoping rules are the same, creation with `new` is the same, parameter passing is the same, parameter type matching for function overloading is the same ... all of these things you've already learned still apply.
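The sketch below touches a few of these familiar mechanisms with `point` in place of a built-in type: function overloading, creating an array with `new`, and pass-by-reference. The names `length`, `makeDiagonal`, and `reflect` are hypothetical, invented for this example.

```cpp
#include <cmath>

struct point {
  double x, y;
};

// Overloading: the compiler chooses a version by matching argument
// types, exactly as it would for built-in types.
double length(double a, double b) {
  return std::fabs(b - a);
}
double length(point a, point b) {
  double dx = b.x - a.x, dy = b.y - a.y;
  return std::sqrt(dx*dx + dy*dy);
}

// Creation with new: an array of n points, just like an array of doubles.
// The caller is responsible for calling delete [] on the result.
point* makeDiagonal(int n) {
  point* A = new point[n];
  for(int i = 0; i < n; i++) {
    A[i].x = i;
    A[i].y = i;
  }
  return A;
}

// Pass-by-reference: this swaps the coordinates of the caller's point.
void reflect(point& p) {
  double t = p.x;
  p.x = p.y;
  p.y = t;
}
```

Nothing here is special to `point`; each function would look the same with any other type in its place.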

Note: The `string`, `ifstream`, and `ofstream` objects that we've already been using are `struct`s rather than built-in types.

## Heterogeneous Data

Although it would be painful, we could imagine implementing our `midpoint` function with arrays of two `double`s --- believe me, it'd be painful! Where things really get interesting is when we wrap up objects of different types in one object, because there we really can't use arrays like we could for points. Let's think about our example of Midshipmen names and alpha codes. We might define a `mid` as follows:
``````
struct mid {
  int alpha;
  string first, last;
};
``````
Notice that this struct has three data members, one of type `int` and two of type `string`. Let's consider using this type in the following problem: The file `Mids.txt` contains the names and alpha codes of the Midshipmen in my two sections. I want to write a program that will read that data and store it in an array for later processing.

To test what I've done, we'll simply allow the user to enter an alpha, and we'll return the name of the Mid with that alpha, or an error message if none is found. Creating the array and reading in data from the file is easy:

Quick check: Consider the code below.
1. What is the type of `new mid[41]`?
2. What is the type of `A`?
3. What is the type of `A[i]`?
4. What is the type of `A[i].alpha`?
5. What is the type of `A[i].first`?
6. What will be the value of `A[0].alpha`?
7. What will be the value of `A[0].last`?
```
1. mid*     2. mid*     3. mid        4. int
5. string   6. 166030   7. SHERIDAN
```

``````
// Create an array of 41 Mids
mid* A = new mid[41];

// Open file
ifstream fin("Mids.txt");

// Read alpha, last name, first name for each Mid
for(int i = 0; i < 41; i++)
  fin >> A[i].alpha >> A[i].last >> A[i].first;
``````
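The test described above, looking up a Mid by alpha, could be sketched along the following lines. This is one possible approach rather than the course's solution; the function names `findAlpha` and `nameFor` and the wording of the error message are assumptions made for this example.

```cpp
#include <string>
using namespace std;

struct mid {
  int alpha;
  string first, last;
};

// Returns the index in A[0..n-1] of the mid with the given alpha,
// or -1 if no such mid exists.
int findAlpha(mid* A, int n, int alpha) {
  for(int i = 0; i < n; i++)
    if (A[i].alpha == alpha)
      return i;
  return -1;
}

// Returns "first last" for the given alpha, or an error message
// (hypothetical wording) if no mid has that alpha.
string nameFor(mid* A, int n, int alpha) {
  int i = findAlpha(A, n, alpha);
  if (i == -1)
    return "Error: no Mid with that alpha!";
  return A[i].first + " " + A[i].last;
}
```

After the array has been filled from the file, something like `cin >> a; cout << nameFor(A, 41, a) << endl;` (with `a` an `int`) would answer one query from the user.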

By the way:
C++ has two keywords for user-defined types: `struct` and `class`. They are more or less the same, except that in structs members are public by default, and in classes they are private by default. Of course that distinction won't make any sense at this point, but in case you come back and read this later ...

Historically, the two keywords come from different places. C++ gets `struct` from C, the language that C++ extends. The term `class` comes from "object-oriented programming", which is a style of programming C++ supports, but which we do not cover in this course.

We are going to use the keyword `struct` to get you familiar with what you'll see in the context of C programming in your Systems Programming course. Of course this means that in your Object-Oriented Programming course (taught using Java) the keyword `class` will be familiar.

## Problems

1. Write a program that reads in three points describing the vertices of a triangle and computes the midpoint triangle they define, i.e. the triangle whose vertices are the three midpoints of the previous triangle. A typical run of your program should look like:
```
Enter triangle vertices: (0,0) (0,1) (1,0)
Midpoint triangle verts: (0,0.5)(0.5,0.5)(0.5,0)
```
Notice how my solution defines functions for writing and reading points!
2. How about something completely different: organizing data on our congressional representatives. This tab-separated values (tsv) file contains a bunch of information about all 538 members of the current House and Senate (source). Some of the code to read in this data is in this C++ program, but right now it is limited to sorting by first name. Use a `struct` to write a program that prints out the first and last names of the 10 youngest congresspeople. Here is my solution.