Consider the following problem: I want to write a program that will read in a file from the Census tables of populations and population densities and simply print out a list of all the towns in the state. (For example, take a look at Maryland's geographic census data.) This is not a completely easy problem, as we'll see, in part because towns' names are not always a single word, take "East New Market", for example. So, the question is this: how do I go about designing a program that solves a non-trivial problem like this? Well, first of all, the basic structure of the program should be clear:
int main()
{
// Open census data file
// Read in (and ignore!) header information
// Loop over each row
{
// Process row
}
return 0;
}
We'll use this as a starting point. For the moment, let's make a
file "tempdata" that contains our data. Eventually
our program should allow the user to specify the data file, but
we'll worry about that later.
Let's start off easy. With this assumption, our program becomes:
int main()
{
// Open census data file
ifstream fin("tempdata");
// Read in (and ignore!) header information
// Loop over each row
{
// Process row
}
return 0;
}
Now to get us a little further, let's try to make the problem
simple by considering simple data! Let's assume that our data
file has no header information, and only a single row. We'll
worry about dealing with headers and multiple rows later.
Looking at a single row of data like
24 04625 530 0.6 0.2 870.3 2254.0 Barton townwe see that there are 7 pieces of information to be read in (and ignored) before we come to the town name, "
Barton".
So our program becomes
int main()
{
// Open census data file
ifstream fin("tempdata");
// Read in (and ignore!) header information
// Loop over each row
{
/***** Process row **************/
// Read (and ignore) 7 entries
// Read name of city/town/CDP/village
// Print name if town
}
return 0;
}
Finally, we arrive at this, which is a
complete working program that we can compile and test on our
simplified input. It seems to work. We can test it on other
rows like:
24 04000 736,014 209.3 80.8 3516.6 9108.0 Baltimore cityand it works like a champ. However, it fails for something like this:
24 05550 8,860 6.7 2.6 1329.1 3442.5 Bel Air townWhy does it fail? It fails because our program tacitly assumed that a town's name would be a single word, and that's not the case here. So we need to keep reading in words, stopping when we read in
town or city or CDP or village.
So we might modify that piece of our code to look like this:
// Read name of city/town/CDP/village
string t;
fin >> t;
while(t != "town" && t != "city" && t != "CDP" && t != "village")
{
fin >> t;
}
// Print name if town
if (t == "town")
cout << ??? << endl;
This is fine, except that when it comes time to print out the town
name, we don't know what to do. We threw away the town name in
the process of looking for city/town/CDP. What we can do is store
the town name in a new string, using the fact that for strings,
the "+" operator sticks strings together,
i.e. concatenates strings. So now we have
this, which is a
complete working program that we can compile and test on our
simplified input ... this time with town names of multiple words.
All right. Next let's remove our simplifying assumption that
says tempdata only contains one row. In other
words, lets deal with several rows of data. The key here is
going to be determining that there is no more data to read.
There's a trick to this: If you read in from an
ifstream object and you come to the end of the
file, the ifstream object goes into an error
state. And remember, if an istream is cast to a
bool, it goes to false if it's in an error state, and true
otherwise. So all we have to do is place the
fin >> s that reads in the first string of a line
in the while loop test-condition, and we'll fall out of the loop
when the file ends and there is no "next line" to read.
Here's a solution along these lines.
Next let's lift the assumption that all this header information
has been stripped out of the file. How can we read through that
garbage to get to the stuff we care about? Well, notice that
the last word in the header is "NAME", and that
it only occurs in that one place. We can keep reading in
strings until we read the string "NAME"! This is
pretty simple, it looks something like this:
// Read in (and ignore!) header information
string kill;
do {
fin >> kill;
} while(kill != "NAME");
Lastly, we need to modify our program so that the file name is
input by the user, as opposed to being hard-coded into the
program. This is done, for example, as follows:
// User enters file name
string file;
cout << "Enter name of input file: ";
cin >> file;
// Open census data file
ifstream fin(file.c_str());
if (!fin)
{
cout << "Error opening file!" << endl;
return 0;
}
And finally, here is a complete program!
John Jeffery Jones m040101 85, 95, 22, 54; Sandra Smith m040201 99, 98; Xavier Xenon Junior m050400 78, 80, 82, 84, 86, 88, 90;For this input, your program would print out
Sandra Smith had the highest average.A good way to approach this problem would be to write three successive programs:
Jones m040101 85 Smith m040201 99 Xenon m050400 78
John Jeffery Jones m040101 85 Sandra Smith m040201 99 Xavier Xenon m050400 78Hint: Remember that if
S is an object
of type string and c is an
object of type char, then c + S
is the string object you get by tacking the
character c onto the beginning of S.