Due: 0800 next Tuesday

Step 1: Word Count and Average Word Length

In this lab, you're going to do some basic file analysis on some famous books. Download this tarball to a lab04 directory, and untar it with the command
tar xzf books.tgz

Write a program in a file called part1.cpp which prompts the user for a filename of a text file, and prints out the total number of words in the file, as well as the average word length.

The length of a string s can be found with the expression s.length(). For example, to read in a string from a user and print out its length, you would write:

string s;
cin >> s;
int lengthOfWord = s.length();
cout << lengthOfWord << endl;

An example run of the program is shown here:

~$ ./part1
Enter a filename: shortExample.txt
Word count: 12
Average word length: 4.3333

~$ ./part1
Enter a filename: prideAndPrejudice.txt
Word count: 124749
Average word length: 4.70193
Warning:

If you get a word count of 13 for shortExample.txt, you are counting the last word twice! Go back to the notes for Class 09 (see the section "when a file ends"), figure out why, and fix your code!

If you finish before the lab ends, demo to your instructor. Also submit with the command

~/bin/submit -c=IC210 -p=lab04 part1.cpp

Step 2: Sentence Length

We'll assume sentences always end with one of the following:
. ! ?
and that only ends of sentences contain those characters (this may not be entirely accurate, but is good enough for our purposes).

Length of a sentence

For example, the following shows a sentence of length 2.
 Hello World!
In other words, the length of a sentence is the number of words in a sentence.

Your Task

In a file called part2.cpp, augment your solution to part 1 to additionally output the average sentence length.

string::find() function

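You can check whether a string s contains some other string using the find() function. For example, to check whether s contains the string "he", call s.find("he").

It returns a position of type string::size_type (an unsigned integer type), not a plain int. In particular:
If s contains the string, find() returns the index at which the first occurrence begins (counting from 0).
If s does not contain the string, find() returns the special value string::npos.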

For example, consider this code:
string s;
cin >> s;
if (s.find("he") == string::npos)
  cout << "Not found..." << endl;
else
  cout << "Found!" << endl;
If string s is "world", the code will output "Not found...". On the other hand, if s is "hello", the code will output "Found!".

So, using this find function, one can check if a string has a "." as follows:

string s;
cin >> s;
if (s.find(".") != string::npos)   // we use != this time 
  cout << "Found!" << endl;
else
  cout << "Not found..." << endl;

Example run

Here is an example run:
~$ ./part2
Enter a filename: prideAndPrejudice.txt
Word count: 124749
Average word length: 4.70193
Average sentence length: 17.1264

If you finish before the lab ends, demo to your instructor. Also submit with the command

~/bin/submit -c=IC210 -p=lab04 part1.cpp part2.cpp

Step 3: Multiple Files

Finally, in a file called part3.cpp, augment your part 2 solution so that it repeats the analysis for each filename entered, stopping when the filename entered is "quit". For example:
~$ ./part3
Enter a filename: prideAndPrejudice.txt
Word count: 124749
Average word length: 4.70193
Avg sentence length: 17.1264

Enter a filename: shortExample.txt
Word count: 12
Average word length: 4.3333
Avg sentence length: 4

Enter a filename: quit
~$

If you finish before the lab ends, demo to your instructor. Also submit with the command

~/bin/submit -c=IC210 -p=lab04 part1.cpp part2.cpp part3.cpp