#include.
This lecture we'll look in more detail at how that's
done, the organizational benefits of doing it, and a few of the
pitfalls. We considered a program like this:
| point.h | point.cpp | main0.cpp |
|
|
|
~/$ g++ -c point.cpp             compiles point.cpp to produce point.o
~/$ g++ -c main0.cpp             compiles main0.cpp to produce main0.o
~/$ g++ -o ex0 main0.o point.o   links point.o and main0.o to produce executable ex0
... or you can compile and link it all in one go, like this:
~/$ g++ -o ex0 main0.cpp point.cpp
Large programs can take minutes or even hours to compile from scratch. When you compile separately, a change in one .cpp means only that .cpp needs to be recompiled - for the rest, the old .o files can be reused. So separate compilation is the way to go once projects start to grow.
bool f(int); in one
place and string f(int); in another.
Golden rule number 1: put shared declarations in a .h file
and use #include to put them in other files. This is
wisdom, pay attention to it!
Now, there's one problem that can come up, which is illustrated by the following example:
| point.h | point.cpp | main1.cpp |
|
|
|
| triangle.h | triangle.cpp | |
|
|
$ g++ -c point.cpp
$ g++ -c triangle.cpp
$ g++ -c main1.cpp
In file included from triangle.h:1:0,
from main1.cpp:2:
point.h:1:8: error: redefinition of 'struct point'
point.h:1:8: error: previous definition of 'struct point'
The problem is that this causes the struct point to be
defined twice in main1.cpp. Why? Well,
point.h includes the definition, and
point.h is #included in
main1.cpp, so that's the first definition of point
in main1.cpp. However, main1.cpp also #includes
triangle.h, which in turn #includes
point.h, which has the definition of the struct
point, so that's the second definition of point in main1.cpp.
Admittedly, if the programmer was on top of things he could've
avoided this, but it's not always avoidable, and it's expecting
a lot of the guy who uses structs point and
triangle to go back and worry about this stuff.
The way around this problem is to guard your
.h files with #defines.
The #include that we've been using and the
#define that we're about to use are examples
of "preprocessor directives". The #define lets
you give a string x and say "every time you see x
in the source code, replace it with string y." This can
be used and abused in all sorts of ways, but for the moment
we'll simply use #define to determine
whether or not this is the first time the compiler has seen our
.h while compiling a given
.cpp file. We can test whether or not a
certain string has been #defined already
using the #ifndef preprocessor directive,
which we read as "if not defined".
For example
cout << "This is ";
#ifndef SILLYSTRING
cout << "not ";
#endif
cout << "much fun!" << endl;
If SILLYSTRING has been #defined
previously in our program, the word not won't get
printed out --- the #ifndef SILLYSTRING asks
if it's true that SILLYSTRING hasn't been defined. If
it is, then all the text until the #endif is seen by
the compiler. Otherwise, as in our case, it is as if the code in
between was never there.
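Here is a minimal sketch of the same idea that you can compile and poke at. It assumes the opposite situation from ours: SILLYSTRING is left undefined, so the "not " line survives. The function name message is made up for this illustration.

```cpp
#include <string>

// SILLYSTRING is deliberately NOT defined in this file.
// Uncomment the next line to make the "not " disappear.
// #define SILLYSTRING

// Hypothetical helper: builds the text the cout statements would print.
std::string message() {
    std::string s = "This is ";
#ifndef SILLYSTRING
    s += "not ";   // compiled only when SILLYSTRING is undefined
#endif
    s += "much fun!";
    return s;
}
```

With the #define commented out, message() gives "This is not much fun!". Uncomment it and the middle line is never even seen by the compiler.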
How does this help us with our multiple-definition problem?
| point.h | triangle.h | main.cpp |
|
|
|
In main.cpp we first
#include the file point.h. Since this
is the first time the compiler has seen point.h
it'll find that POINTHEADER has not been defined.
Therefore all the code until the #endif, which is
essentially the whole file, will be seen and evaluated by the
compiler. This will include the line that #defines
POINTHEADER. Now, the next time we come across
point.h, which is when main.cpp does the
#include "triangle.h", the compiler will hit that
#ifndef line and will find that
POINTHEADER has been defined, and therefore
everything up to the #endif, which is essentially
the whole file, will be ignored by the compiler. This way we
never get any multiple definitions.
Golden rule number 2: All .h files should
be "protected" by #ifndefs this way!
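To see the guard at work without juggling three files, here is a sketch that pastes the guarded contents of point.h into one file twice, which is roughly what the preprocessor ends up doing. The fields of point and triangle and the helper firstX are assumptions made for the illustration; only the name POINTHEADER comes from the discussion above.

```cpp
// ---- contents of point.h, as pasted in by the first #include ----
#ifndef POINTHEADER
#define POINTHEADER
struct point {
    double x, y;   // assumed fields
};
#endif

// ---- contents of point.h again, pasted in a second time via triangle.h ----
#ifndef POINTHEADER   // already defined, so everything up to #endif is skipped
#define POINTHEADER
struct point {
    double x, y;
};
#endif

// ---- the rest of triangle.h (assumed layout) ----
struct triangle {
    point vert[3];
};

// Hypothetical helper, here only so there is something to call.
double firstX(triangle T) {
    return T.vert[0].x;
}
```

Delete the two #ifndef/#define/#endif wrappers and the compiler should complain about a redefinition of struct point, just like in the error message earlier.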
C++ allows the programmer to make new types that really act like the builtin types, by which I mean that the usual operators, like *, <, ++, work with the new types, and I/O with << and >>, and so on.
There's actually some controversy about this, i.e. about whether or not it's a good idea to let programmers do this. Some people feel that it leads to hard-to-follow code.
In any event, it's not going to be a major
theme for this course, but I'd like to show it to you so that you
understand that we really can build new types in C++ if we want
to. Also, I should note, this is why + and >> and so on
work with C++ string objects. The implementers of the
string library defined all these operators for their nice string
type.
q to quit.
First solution. My first solution is a
pretty simple program.
The meat of the program is:
Operator overloading.
While it may be simple, it would be nice to be able to write the function as if
point were a built-in type, meaning that I could add points
p and m by saying p + m.
So we want to be able to apply + to point objects. Doing
this is quite easy once you understand the following:
a + b is just the same as the function call
operator+(a,b) in C++.
So if you want to tell the compiler what + means for two
point objects, you need to define the function
operator+(point a,point b) --- i.e. overload the
+ operator for points. The prototype is
clear:
point operator+(point a, point b);
... at least I hope it's clear that we should return a point when
we add two points. The function definition is ... just like any
other function definition:
point operator+(point a, point b) {
point S;
S.x = a.x + b.x;
S.y = a.y + b.y;
return S;
}
In general, if you write an expression A Π B,
where "Π" stands for some operator, then that is equivalent
to a function call operatorΠ(A,B).
So, to subtract two points we'd define
point operator-(point A, point B);
... and to compare two points with less than we'd define
bool operator<(point A, point B);
... or to multiply a point (on the left) by a real number (on
the right) we'd define
point operator*(point A, double w);
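As a sketch, here is one way the three prototypes above might be filled in. The struct fields are assumed to be doubles, and the meaning of < for points is purely my assumption (compare x first, then y); pick whatever ordering suits your program.

```cpp
struct point {
    double x, y;   // assumed fields
};

// Subtract componentwise.
point operator-(point A, point B) {
    point D;
    D.x = A.x - B.x;
    D.y = A.y - B.y;
    return D;
}

// ASSUMPTION: "less than" compares x first, then y to break ties.
bool operator<(point A, point B) {
    if (A.x != B.x)
        return A.x < B.x;
    return A.y < B.y;
}

// Scale both coordinates by w (point on the left, number on the right).
point operator*(point A, double w) {
    point P;
    P.x = A.x * w;
    P.y = A.y * w;
    return P;
}
```

Note that operator*(point,double) only covers p * 2.0. Writing 2.0 * p would need a separate operator*(double,point).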
You'll be using the point struct over and over, and
you'll like being able to add points. Wouldn't it be nice to
define the midpoint function like this:
point midpoint(point a, point b) {
return (a + b)/2.0;
}
Now, in addition to defining operator+ for two
point objects, what else would you need?
Well, (a + b) is an object of type
point, and I'm dividing it by an object of type
double, so I need to define
operator/(point,double). What type of object should
be returned here?
point operator/(point P, double z) {
point Q;
Q.x = P.x / z;
Q.y = P.y / z;
return Q;
}
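Putting the pieces together, here is a sketch of everything midpoint needs in one place. It assumes point has two double fields named x and y, which is what the code above suggests.

```cpp
struct point {
    double x, y;   // assumed fields
};

point operator+(point a, point b) {
    point S;
    S.x = a.x + b.x;
    S.y = a.y + b.y;
    return S;
}

point operator/(point P, double z) {
    point Q;
    Q.x = P.x / z;
    Q.y = P.y / z;
    return Q;
}

// (a + b) is a point, and point / double is a point, so this type-checks.
point midpoint(point a, point b) {
    return (a + b) / 2.0;
}
```

For the points (0,0) and (2,4), midpoint returns (1,2).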
istream& operator>>(istream &in, point &A) {
char c;
return in >> c >> A.x >> c >> A.y >> c;
}
ostream& operator<<(ostream &out, point A) {
return out << '(' << A.x << ',' << A.y << ')';
}
The prototypes of these two should actually make some sense.
For example, we've talked before about how cin >> x
actually evaluates back to cin.
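Here is a small sketch showing why returning the stream matters: it lets us chain reads and writes of points. I use string streams instead of cin and cout so the example can be checked, and the helper echoTwoPoints is made up for the illustration.

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

struct point {
    double x, y;   // assumed fields
};

// Reads a point written like (1,2): the char c soaks up '(' ',' and ')'.
istream& operator>>(istream &in, point &A) {
    char c;
    return in >> c >> A.x >> c >> A.y >> c;
}

ostream& operator<<(ostream &out, point A) {
    return out << '(' << A.x << ',' << A.y << ')';
}

// Hypothetical helper: reads two points from text and writes them back out.
string echoTwoPoints(const string &text) {
    istringstream in(text);
    ostringstream out;
    point P, Q;
    in >> P >> Q;          // works because in >> P evaluates back to in
    out << P << ' ' << Q;  // same story for out << P
    return out.str();
}
```

Calling echoTwoPoints("(1,2) (3.5,4)") gives back "(1,2) (3.5,4)".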
The other kind of array is static. A static array is local to the scope in which it is declared, and it goes away when that scope ends. That means a pointer to the array is only good while the scope is alive, so you won't be returning static arrays from functions. Common practice is to use static arrays when you know the size of the array at compile-time. It's easy to create them inside of a struct, if you know you always need 10 ints, for instance.
Dynamic arrays are more general - anything you can do with static arrays you can do with dynamic arrays, and then some. However, in some instances static arrays are simpler - for example if you wanted to hardcode the names of the days of the week into a program. You know the size will be 7, so there's no point in making a dynamic array.
Creating an Array of 6 ints |
|
| Static Array | Dynamic Array |
|
|
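As a sketch of the two ways to create an array of 6 ints, side by side in one function (the function sumBoth and the values stored are made up for the illustration):

```cpp
// Hypothetical function that fills both kinds of array and adds them up.
int sumBoth() {
    int A[6];              // static array: size fixed at compile-time
    int *B = new int[6];   // dynamic array: allocated at run-time

    for (int i = 0; i < 6; i++) {
        A[i] = i;
        B[i] = 10 * i;
    }

    int total = 0;
    for (int i = 0; i < 6; i++)
        total += A[i] + B[i];

    delete [] B;           // dynamic arrays must be released by hand
    return total;          // A disappears by itself when the function ends
}
```

Once created, both are indexed the same way. The difference is in how they come into being and how they go away.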
Consider a program that uses an array to store the vertices of a quadrilateral. Since we know that a quadrilateral always has exactly four vertices, we could use static arrays.
struct Quad {
char label;
point vert[4];
};
If A is a
static array, the pointer A cannot be changed. The
contents of the array to which it points can, of course, be
changed. But not the pointer itself. Other differences are
best illustrated by an example.
To understand the difference between this version of
Quad and the previous version, consider this picture:
You see that in the static array version the array of vertices is
embedded in the Quad object. In the dynamic version,
the pointer is embedded in the object, while the array is outside
of the object, somewhere else in memory.
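For reference, a dynamic array version of Quad might look like the sketch below. This is my assumption about its layout, following the description above, and the helper makeQuad is hypothetical.

```cpp
struct point {
    double x, y;   // assumed fields
};

// Dynamic array version: the object holds only a pointer.
struct Quad {
    char label;
    point *vert;   // points to an array living somewhere else in memory
};

// Hypothetical helper that builds a Quad with its four vertices allocated.
Quad makeQuad(char label) {
    Quad Q;
    Q.label = label;
    Q.vert = new point[4];
    for (int i = 0; i < 4; i++) {
        Q.vert[i].x = 0;
        Q.vert[i].y = 0;
    }
    return Q;
}
```

Whoever calls makeQuad is responsible for a matching delete [] on vert when the Quad is no longer needed.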
Compare the static array version of main
with the dynamic array version of main.
The above picture really tells you all you need to know to understand the difference between using static and dynamic arrays ... when you really can use static arrays.
Let's look at one example to see what consequences arise from this picture.
Quad S;
... // S has 'Q' and (0,0), (1,0), (1,1), and (0,1)
print(S); // It will print Q (0,0) (1,0) (1,1) (0,1)
Quad R;
R = S; // Copying will have different meanings!
R.label = 'P';
for(int i = 0; i < 4; i++)
R.vert[i].x++;
print(S);
print(R);
Q:
Suppose that I have a Quad object S that contains the
label 'Q' and the vertices (0,0) (1,0) (1,1) (0,1). I then print out
S and then R. What will I get?
A:
It depends whether I'm using the static version of
Quad or the dynamic version.
| Static Version | Dynamic Version |
| Q (0,0) (1,0) (1,1) (0,1) | Q (0,0) (1,0) (1,1) (0,1) |
| Q (0,0) (1,0) (1,1) (0,1) | Q (1,0) (2,0) (2,1) (1,1) |
| P (1,0) (2,0) (2,1) (1,1) | P (1,0) (2,0) (2,1) (1,1) |
Why the difference? Look at the picture!
| Dynamic Version | Static Version |
|
|
This is not a reason to use static over dynamic, but it is a good example of how and why they behave differently.
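If you want to check the copying behavior yourself, here is a sketch that puts both versions in one file. The names QuadStatic and QuadDynamic and the two test functions are made up so that the two versions can coexist; in the lecture code both are simply called Quad.

```cpp
struct point {
    double x, y;   // assumed fields
};

struct QuadStatic  { char label; point vert[4]; };
struct QuadDynamic { char label; point *vert;   };

// Returns S's first x-coordinate after a copy of S has been modified.
double staticAfterCopy() {
    QuadStatic S = {};        // all vertices start at (0,0)
    S.label = 'Q';
    QuadStatic R;
    R = S;                    // copies the four vertices themselves
    R.vert[0].x++;
    return S.vert[0].x;       // S is untouched
}

double dynamicAfterCopy() {
    QuadDynamic S;
    S.label = 'Q';
    S.vert = new point[4]();  // all vertices start at (0,0)
    QuadDynamic R;
    R = S;                    // copies only the pointer
    R.vert[0].x++;            // so this changes the one shared array
    double x = S.vert[0].x;
    delete [] S.vert;         // one array, so one delete
    return x;
}
```

The static version reports 0 and the dynamic version reports 1, which matches the table of outputs: in the dynamic case R and S share a single array of vertices.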
Here's another example:
Q:
How does swap(A,B) behave differently for
two Quads, A and B,
with the dynamic versus static array versions of
Quad?
A: Once again, the picture should tell you that while the result is the same, a lot more work gets done in the static case, where the entire contents of the arrays are swapped, rather than simply the pointers.
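A sketch of the dynamic case, using std::swap as a stand-in for whatever swap function you have written. The struct name QuadDynamic and the checking function are hypothetical.

```cpp
#include <utility>   // std::swap

struct point {
    double x, y;   // assumed fields
};

struct QuadDynamic { char label; point *vert; };

// Hypothetical check: after swapping, does A hold the pointer B used to hold?
bool swapExchangesPointers() {
    QuadDynamic A, B;
    A.label = 'A';  A.vert = new point[4]();
    B.label = 'B';  B.vert = new point[4]();
    point *oldA = A.vert;
    point *oldB = B.vert;

    std::swap(A, B);   // exchanges the labels and the two pointers

    bool ok = (A.vert == oldB) && (B.vert == oldA) && (A.label == 'B');
    delete [] A.vert;
    delete [] B.vert;
    return ok;
}
```

No vertex is moved in memory here. Only the two pointers and the two labels trade places, whereas the static version would have to copy all eight points.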
int prime[10] = {2,3,5,7,11,13,17,19,23,29};
Note: this is purely about static arrays, it doesn't
concern structs at all.
Here's my solution.