This lab is a subset of a project developed by Dr. Chambers. You will write a program that allows the user to interactively search through a dataset of over 180,000 Twitter posts. Here is a sample run:
~/$ java Lab06 alltweets.txt
188671 tweets
> filter Potter
37 tweets
> filter Ron 
1 tweets
>  dump
This is a ficticious post about Harry Potter and Ron Weasley.	dbrown88	2013-08-12
1 tweets
> reset
188671 tweets
> filter Timberlake
10 tweets
> filter! Justin
2 tweets
>  dump
@adamaviv J. Timberlake is da bomb! lmcdowell 2015-01-27
Staying at Timberlake Lodge while I ski. adamaviv 2015-02-12
2 tweets
> quit

As you can see, the basic commands are "dump", "quit", "reset", "filter" and "filter!". The distinction between "filter" and "filter!" is that "filter" keeps every tweet that contains the keyword, while "filter!" keeps every tweet that does not contain it.
Note: The filter and filter! commands only look for a match in the tweet message text, i.e. not in the user name or date!
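
The keep-or-discard rule described above can be sketched as a single test on the message text. This is only an illustration: the class name KeywordMatch and the method keeps are hypothetical, and the sketch assumes a case-sensitive substring match (which is how String.indexOf behaves), since the lab does not say otherwise.

```java
class KeywordMatch {
    // Decides whether a tweet with this message text survives a command.
    // negate == false models "filter"; negate == true models "filter!".
    // Only the message text is passed in, so user name and date are never searched.
    static boolean keeps(String message, String keyword, boolean negate) {
        boolean found = message.indexOf(keyword) >= 0; // -1 means "not present"
        return negate ? !found : found;
    }

    public static void main(String[] args) {
        String text = "Staying at Timberlake Lodge while I ski.";
        System.out.println(keeps(text, "Timberlake", false)); // prints true
        System.out.println(keeps(text, "Justin", true));      // prints true
        System.out.println(keeps(text, "adamaviv", false));   // prints false
    }
}
```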

My real goal for you, aside from having some fun, is to make use of inheritance to solve problems, to understand how it allows you to add or modify implementations without access to the underlying implementation, and to face situations where you have to make good decisions about where to put functionality.

Note: this lab is to be done in pairs. In this case, that'll mean two people and one keyboard. You will take turns typing, and all files will have both of your names in them. I want you to think hard about design decisions: what code goes in which class? What functionality (interface) should each class provide? How can I maximize code reuse? Am I following the principles of encapsulation and information hiding? Am I making good use of inheritance? Talk about these things between the two of you as you work. Discuss first, code second.

The Data

We are providing you with almost 200 thousand actual public tweets from Twitter. Your program will read them all and provide a search interface. Dr. Chambers has a continual feed from Twitter that downloads millions of tweets every day. This is just a fraction of a fraction of the data that we have here in the CS department. We use it to do fundamental research in artificial intelligence and information extraction, such as finding correlations with presidential approval polls (some students ahead of you have conducted such research). This short project will let you have some fun poking around the data.
  1. You may not distribute this project's data to anyone beyond the USNA. Our agreement with Twitter prevents copying and distributing.
  2. This is raw, real world data. The standard disclaimer applies as it does whenever we step out onto the Web. You may come across offensive material. Please behave like mature adults and future officers, as appropriate.
We have two files for you, "alltweets.txt" (the whole data set), and "sometweets.txt" (a small subset for testing). Download them to your current directory (hopefully called something like "lab06") with the following commands:
curl -O http://intranet.cs.usna.edu/~nchamber/ic211/sometweets.txt
curl -O http://intranet.cs.usna.edu/~nchamber/ic211/alltweets.txt

Code that you MUST base your program on

You must build your program on two classes that we have written for you: Tweet and TweetQueue. The twist is that I am not giving you access to the source code (.java files) for these classes, only the compiled bytecode (.class files) ... and some really nice documentation. Note, as you look through the documentation, that TweetQueue includes the nice iterator interface we went over in class on Wednesday.

Part 1: Reading and writing

In this section you'll be implementing the commands "dump" and "quit". You will doubtless want to create at least one new class by extending TweetQueue to do this. Here are some requirements:
  1. Your program (which must start from a class named Lab06) must print a prompt (including # of tweets) exactly as shown in the example output.
  2. The user can give the "dump" command over and over again and should see all the data each time.
  3. The input file (given as a command-line argument) can only be read once by the program, and if no argument is given, the program must print a nice usage message and exit. Similarly, if the file can't be opened for some reason, an error message should be printed.
  4. You must use the Tweet and TweetQueue classes. You should not use any linked-list (or array) code other than TweetQueue. You can build off of it, of course.
Note: no lab submission violating these rules will be accepted.
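
Requirement 3 (usage message, error on an unreadable file, single pass over the input) might be handled along the lines of the sketch below. The class name OpenSketch and the method countLines are hypothetical, and the sketch assumes one tweet per line, as the dump output in the sample run suggests. It only counts lines; a real solution would build a Tweet from each line and add it to a TweetQueue using whatever methods the provided documentation describes.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

class OpenSketch {
    // Reads the file once and returns the number of lines,
    // or -1 if the file cannot be opened.
    static int countLines(String path) {
        try (Scanner in = new Scanner(new File(path))) {
            int n = 0;
            while (in.hasNextLine()) {
                in.nextLine(); // a real program would turn this line into a Tweet
                n++;
            }
            return n;
        } catch (FileNotFoundException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: java OpenSketch <tweetfile>");
            return;
        }
        int n = countLines(args[0]);
        if (n < 0) {
            System.out.println("Error: could not open '" + args[0] + "'");
            return;
        }
        System.out.println(n + " tweets");
    }
}
```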

Part 2: Filter and Filter!

For this part you will add the commands "filter", "filter!" and "reset". Note: The filter and filter! commands only look for a match in the tweet message text, i.e. not in the user name or date! Here are some requirements and helpful suggestions:
  1. You may modify Lab06 as much as you like, but do not modify TweetQueue or your Part 1 extension of it (unless you find some bugs). Instead, if you need added functionality, I want you to extend your Part 1 extension of TweetQueue.
  2. To find a keyword in a Tweet, you might find the String method indexOf(String str) useful.
  3. My solution had a line in Lab06 that looked something like this: currQueue = currQueue.filter(...); You don't have to do it this way, but it might help.
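
One way to read the currQueue = currQueue.filter(...) suggestion is that filter builds and returns a new queue and leaves the original untouched, which makes "reset" as simple as pointing back at the full queue. The sketch below illustrates that idea only. MiniQueue is a hypothetical stand-in for the provided TweetQueue (it stores plain message strings and exposes an iterator), FilterQueue and its filter signature are made up for the example, and the real class's method names must come from its documentation.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

class FilterSketch {
    public static void main(String[] args) {
        FilterQueue all = new FilterQueue();
        all.enqueue("J. Timberlake is da bomb!");
        all.enqueue("Justin Timberlake on tour");
        all.enqueue("Nothing to see here");

        FilterQueue curr = all;
        curr = curr.filter("Timberlake", false);    // like "filter Timberlake"
        System.out.println(curr.size() + " tweets"); // prints 2 tweets
        curr = curr.filter("Justin", true);         // like "filter! Justin"
        System.out.println(curr.size() + " tweets"); // prints 1 tweets
        curr = all;                                  // like "reset"
        System.out.println(curr.size() + " tweets"); // prints 3 tweets
    }
}

// Hypothetical stand-in for the provided queue class.
class MiniQueue implements Iterable<String> {
    private static class Node {
        final String msg;
        Node next;
        Node(String msg) { this.msg = msg; }
    }

    private Node head, tail;
    private int size;

    void enqueue(String msg) {
        Node n = new Node(msg);
        if (tail == null) head = n; else tail.next = n;
        tail = n;
        size++;
    }

    int size() { return size; }

    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private Node cur = head;
            public boolean hasNext() { return cur != null; }
            public String next() {
                if (cur == null) throw new NoSuchElementException();
                String m = cur.msg;
                cur = cur.next;
                return m;
            }
        };
    }
}

// Adds filtering by inheritance, using only the parent's public interface.
class FilterQueue extends MiniQueue {
    // negate == false keeps matches; negate == true keeps non-matches.
    FilterQueue filter(String keyword, boolean negate) {
        FilterQueue out = new FilterQueue();
        for (String msg : this) {
            boolean found = msg.indexOf(keyword) >= 0;
            if (found != negate) out.enqueue(msg);
        }
        return out;
    }
}
```

Note that FilterQueue never touches the parent's private fields; it gets by with enqueue and the iterator, which is the situation you are in with a class you only have bytecode for.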

Challenge Step: Undo! (optional)

As an optional challenge if you finish early, see if you can add an "undo" command, so that you can go back to the queue that was current before the previous "filter" or "filter!" command. Ideally, you'd want unlimited undo, meaning that you could undo (in reverse order) all the filter and filter! commands back to either the program start or the last "reset" operation. Unlimited undo can be a bit of a challenge. You want to keep a "stack" of all the old queues. A stack is like a queue, except that instead of enqueue and dequeue you have "push" and "pop". Pop is just like "dequeue": you remove and return the frontmost item in the list. But "push" adds a new item to the front rather than to the back like "enqueue" does. So pop returns the most recently added item (the newest) rather than the oldest item like we get with a queue. As an example, you may look at my implementation of Stacks for Strings, which should be easy to adapt.
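
The push and pop behavior described above can be sketched with a small linked stack. This is not the Stacks-for-Strings implementation mentioned in the text; UndoStack, its methods, and the use of strings in place of saved queues are all assumptions made for illustration.

```java
import java.util.NoSuchElementException;

class UndoSketch {
    public static void main(String[] args) {
        UndoStack<String> history = new UndoStack<>();
        String curr = "all tweets";

        history.push(curr);               // save state before "filter Timberlake"
        curr = "after filter Timberlake";
        history.push(curr);               // save state before "filter! Justin"
        curr = "after filter! Justin";

        curr = history.pop();             // first undo
        System.out.println(curr);         // prints after filter Timberlake
        curr = history.pop();             // second undo
        System.out.println(curr);         // prints all tweets
        System.out.println(history.isEmpty()); // prints true
    }
}

// Hypothetical generic stack: push and pop both work at the front of the list.
class UndoStack<T> {
    private static class Node<T> {
        final T item;
        final Node<T> next;
        Node(T item, Node<T> next) { this.item = item; this.next = next; }
    }

    private Node<T> top;

    // Adds to the front, unlike enqueue, which adds to the back.
    void push(T item) { top = new Node<>(item, top); }

    // Removes and returns the front item, which is the newest one.
    T pop() {
        if (top == null) throw new NoSuchElementException("nothing to undo");
        T item = top.item;
        top = top.next;
        return item;
    }

    boolean isEmpty() { return top == null; }

    // Forgets all saved states, for example when the user types "reset".
    void clear() { top = null; }
}
```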