Linked Lists 2: Traversal

We introduced linked lists as a data structure in the previous class. Today we'll talk about how to write code that can traverse a list for basic functions like search.

Traversing a list

Suppose LIST is a variable of type Node* in our program, and suppose that it points to the list 87, 42, 53, 4 as illustrated in the following picture:

We'll start things off, however, by looking at the simplest of programs: read and store ints in a linked list, then print the values stored in a list. Printing a list, and indeed most things you do with a linked list, involves traversing a list — which means visiting each node in the list, in order. Doing this iteratively (as opposed to recursively, which we'll see in a later lecture) means using a temporary pointer that points to the first node originally, then moves to point to the second node in the list, then moves to point to the third, and so on.


Node* p = LIST;
while(????) {
   cout << p->data << " "; // do whatever you have to do while traversing the list
   p = p->next;
}
The last line of the while-loop body, p = p->next; is what moves us from one node to the next.

The only real difficulty is going to be this:

How do we know when we're done? What do we replace the ???? with?
To address this question, consider the following scenario:
  1. Loop continued, and finally the last iteration has come. That is, p points to the last node (i.e., the node containing 4).
  2. Now, the expression p = p->next is executed, which implies that the value of p becomes NULL. Recall the "slash" in the node means that the value of next field is NULL.
  3. Right before the next iteration starts, the program checks the test condition of the loop, and the condition should be evaluated to be false, since the loop must exit at this moment.
  4. It turns out that the test condition (p != NULL) satisfies our requirements (if you are not convinced, walk the printing code snippet below using the above linked list example, slowly and node by node, and convince yourself).
So, the actual code looks like:

Node* p = LIST;
while( p != NULL ) {
   cout << p->data << " ";
   p = p->next;
}
Note this code works even when the list is empty (i.e., when LIST is NULL).

Now let's look at the concrete example of traversing the list in order to print out its values.

#include <iostream>
using namespace std;

struct Node {
  int data;
  Node* next;
};

void add2front(int val, Node* &L);

int main() {
  // Create list
  Node* L = 0;
  int x;
  while( cin >> x )
    add2front(x,L);

  // Print list
  Node* T = L;
  while( T != NULL ) {
    cout << T->data << ' ';
    T = T->next;
  }
  cout << endl;
  return 0;
}

void add2front(int val, Node* &L) {
  Node* T = new Node;
  T->data = val;
  T->next = L;
  L = T;
}
  
Note that you can write the while-loop for printing nice and compactly as a for loop:

for( Node* T = L; T != NULL; T = T->next )
  cout << T->data << ' ';
cout << endl;
I really prefer writing it this way, because it makes the traversal code (the for(Node* T = L; T != NULL; T = T->next)) totally boiler-plate: that stays the same for any kind of list you ever make, and really no matter what you're doing for the traversal.

In general this way of writing functions for manipulating linked lists is pretty good.

Compute the length

Here's a typical example of a linked list problem: Write a function that determines the length of a linked list. We know that our function's prototype should look like this:
int length(Node*);
It returns an int because, of course, the length of a list is an integer. It takes a pointer to a Node, because we manipulate lists by manipulating pointers, just like with arrays. The process we use to determine the length is straightforward too: move through the list from front to back one node at a time, keeping track of how many you visit in the process.

int length(Node* L) {
  // Initialize: curr points to the node we're about to visit
  Node* curr = L;
  int count = 0;

  while(????) {
     // record that we've visited the node pointed to by curr
     count++;

     // set curr to the next node to visit
     curr = curr->next;
  }

  return count;
}
The only real difficulty is going to be this: How do we know when we're done? What do we replace the ???? with? The easiest way to determine this is to work through our current function as it operates on an example list:
Before the 1st loop iteration
Clearly we want to continue our iteration ... we haven't even started yet (see the picture on the right).

When the 1st loop iteration is executed
Now, curr will be set to curr->next, which leaves it pointing at the second node. In addition, count will be incremented, showing that we've visited one node so far (see the picture below right).

Before 2nd loop iteration
Noticing from the picture that we're pointing to the last node in the list, you might be tempted to think that we ought to stop now. However, curr is pointing to the node we are about to visit, not the one we just finished visiting. So, we haven't visited node 2 yet, and therefore should continue looping!
Before 3rd loop iteration
Now curr is the null pointer (i.e. pointer 0) so there's no nodes left to visit. We've visited them all, as count attests to, we we should exit our loop.
From the preceding example, what was it that told us when we should continue looping versus when it was time to stop? It was the value of curr! Specifically, as long as curr isn't the null pointer we should continue looping! From this, we see that our function's completed definition should be:

int length(Node* L) {
  // Initialize: curr points to the node we're about to visit
  Node* curr = L;
  int count = 0;

  while(curr != NULL) {
     // record that we've visited the node  pointed to by curr
     count++;

     // set curr to the next node to visit
     curr = curr->next;
  }

  return count;
}
Note: Of course we could also write the exact same code in a for-loop, as discussed above.

int length(Node* L) {
  int count = 0;
  for( Node* curr = L; curr != NULL; curr = curr->next )
     count++;

  return count;
}
Convince yourself that our length implementation does work for the empty list!

Search in linked lists

Recall that, earlier in the semester, we discussed that where arrays are concerned, it's "indices not values". With linked lists we have something similar: pointers not values.

As with arrays, when this becomes really important is search. Suppose we want to write a "search" function for linked lists? What should we return if the search is successful? What should we return if a search is unsuccessful? In both cases, the best thing to do is to return a pointer.

Thus, the search prototype will be (in this example for lists with data of type double):
Node* search(dobule value, Node* L);
Recall that it's best to have a function bool match(double a, double b); — where the types of the arguments a and b match the type of the data in the Node — that takes care of determining whether the data in an individual Node matches the value being searched for.

As an exercise, try downloading Ex0.cpp and see if you can provide a definition for the "search" function. If you've done it properly, the program should run like this:

~/$ ./prog
34.2, 54.98, 92.0, 83.9;
search 54.98
Found!
search 54.88
Not Found!
search 12
Not Found!
quit
Check the solution.

Deleting a list

Since nodes in a linked list are each created dynamically with "new", they live on in your program until they are each individually destroyed with "delete". It is often useful to have a function void deletelist(Node* L); that deletes each of the nodes in a list. In good ol' top-down fashion, I'll implement this function "wishing" that other functions were available to me:

void deletelist(Node* L) {
  while( L != NULL )
    deletefirstnode(L);
}
So, you see that the real problem is deleting the first node in a list.

void deletefirstnode(Node*& L) { // Assumes L is not the empty list!!
  Node* T = L;
  L = L->next;
  delete T;
}
Hopefully you see that L = L->next effectively removes the first node from L, but that to actually give back the memory we allocated when creating that node with new, we need to call delete.

Problems

  1. Find the maximum value in a list of double's. Check out my solution.
  2. Construct a reverse copy of a list of double's. Check out my solution.