SI204 Class 36: Pointer basics and intro to linked lists

Reading
Section 17.1 of Absolute C++.

 

Lecture

Pointers to a single object

Up until now, we've used pointers to point to arrays of objects, rather than individual objects. It turns out that pointers to individual objects are very important in a wide range of circumstances.

You can allocate a single object of a given type by just leaving the square brackets off the new statement. For example, to create a single int object and a pointer, p, that points to it, we could write:

int *p = new int;
    

To delete the int that p points to, you use the usual delete statement, except without the square brackets. So to delete the int that p points to, you would write:

delete p;

Now, it turns out that you can treat p as an array containing just one int, in the sense that p[0] does indeed give you the int that p points to. However, when we have a pointer to a single object, we usually use an asterisk (*) to get the object from the pointer. This is called dereferencing the pointer. The syntax is *p, which evaluates to the object p points to. So, for example, to allocate a single int using new, and then to assign that int a value we might do something like this:

Stept 1: int *p, x = 2;
Stept 2: p = new int;
Stept 3: (*p) = x + 1;
    

So in this example, p is a pointer to an int, while *p is the actual integer that p points to. Now, when the objects you allocate are structs, the * for dereference can get a little ugly. For example, consider our ever popular struct point:

struct point
{
  double x, y;
};

We allocate a single object of type point and initialize its x and y coordinates to zero with the following code:

point *ptr = new point;
(*ptr).x = 0;
(*ptr).y = 0;

This syntax is a little bit ugly ... maybe even more than a bit. Therefore, the inventors of the C language (from which C++ is derived) decided to provide a shorthand notation for accessing a data-member of a struct from a pointer: the -> operator. The pointer goes on the left, the name of the data member goes on the right. So the above example becomes:

point *ptr = new point;
ptr->x = 0;
ptr->y = 0;

Remember, this is just a shorthand to make things easier for programmers to write and easier to read.

Note: A pointer can also point to a local variable (i.e. one that was declared rather than allocated with new), although this has very little utility in C++ (There's one exception that comes up in data structures, but for the most part pointers to local variables is a relic of C). In order to get a pointer to an object from the object itself, you use the & operator. Don't let the fact that this symbol is also used for pass-by-reference confuse you! The expression &name evaluates to a pointer to the variable name. The & is called the "address-of operator". So, for example:

int *p, x;
p = &x;
(*p) = 3;
cout << x << endl;

will print out "3". The second line sets p to point to the int x. This third line sets the "int pointed to by p" (which is x) to 3.

Linked List Intro

A lot of the benefit of pointers to a single object, rather than an array, won't be apparent until you get into data structures. However, one big use of them is our next topic: linked lists. Up to this point we've used arrays to store a sequential collection of objects. Arrays are great, but they have a few limitations that cause us some headaches.

  1. The size of an array is fixed from the moment it is allocated with new, so you always have to know how many objects you want to store ahead of time.
  2. Adding an element to the front of an array, or inserting an element somewhere in the middle is a big job - first you need extra "room" in the array, and second you need to move everything else in the array back an element, which is a lot of work.

Linked lists are an alternative to arrays that avoids these limitations. A linked list consists of "nodes", each of which contains a piece of data (in a linked list of ints that data would be an int, in a linked list of strings that data would be a string, etc) and a link to the "next" node in the list. The following picture shows a linked list storing the numbers 87, 42, 53, 4.

You absolutely have to keep this kind of picture in mind if you want to program with linked lists! The idea here is that you move through the collection starting with a pointer to the first node in the list, which I've marked "LIST" in the picture. From there you simply follow the links (arrows) from one piece of data to the next. The last node in a linked list does not have a link to a "next" node - we usually show its next-field with a slash to indicate that.

To realize this picture in C++ code, we see that we have to manipulate "nodes". Therefore, it's natural to make a struct Node. Let's assume that we want to store strings in our list, then a Node will consist of a string, which we might call "data", and a link to the "next" node. The question is, what type of object would the link to the "next" node be? Hopefully you see that what we need is a pointer to the next node in the list, which means we need a pointer to an object of type Node - i.e. we need a data member of type Node*.

struct Node
{
  string data;
  Node* next;
};

Note: by convention, the next field of the last element in the list is set to the value 0 to indicate that it's not pointing to anything. This is often called a "null pointer".

Manipulating a linked list in your program

Remember that we manipulate an array in our programs by manipulating a pointer variable that points to the array. Similarly, we manipulate a linked list by manipulating a pointer to the first node in the list. So, suppose we want to deal with linked lists of integers. We would define a struct Node like:

struct Node
{
  int data;
  Node* next;
};

Now, we might have a variable of type Node* called LIST that points to the first node in our linked list of integers. Our list might look something like this:

We could print out the first integer in our list with the statement

cout << LIST->data;

which says "print the data field of the Node pointed to by LIST". You can simply follow that in the picture, it prints out 87. Now, to print out the second integer in the list, we'd like to say the same thing, except instead of LIST, which points to the first node in the list, we need a pointer to the second node in the list. Then we could write:

cout << (pointer to the second node)->data;

Now, if you look back at the picture, you see that there is a pointer to the second node in the list, it is the next field from the first node in the list, i.e. LIST->next is a pointer to the second node in the list. So, to print out the second node in the list you write:

cout << (LIST->next)->data;

Actually you don't need the parentheses, because the -> is left associative. Anyway, to print out the third element of the list you could write:

cout << LIST->next->next->data;

On the other hand, this way lies madness! It's usually better to simply make a temporary pointer that points to what you want, and move that pointer through the list. For example:

Node* ptr = LIST; // Set ptr to point to first node
ptr = ptr->next;  // Set ptr to point to second node
ptr = ptr->next;  // Set ptr to point to third node

When do we stop?

Obviously, we want to stop once we have reach the last node on the list.  How do we know that we have reached that node?  Looking at the picture above, the last node in the list has a null pointer or a 0 in the next field.

The empty list

So far we have manipulated linked lists but have yet to actually create one.  Everything we have done so far has assumed that the list already existed.  It turns out that the easiest list to create is one that contains no nodes at all.  Once we create that one then all we need to do is learn how to add a new node to that list.  If we repeatedly add more nodes, we will have a linked list similar to the ones we have used earlier.

When you tell people that the ancient Greeks, who were extremely sophisticated mathematicians, had no concept of zero, most people think "What was wrong with them? It's so obvious!" Well, that's not a very fair appraisal, and you may gain a little appreciation of why when we talk about ... the empty list. When speaking of lists, it's really handy to have the concept of an empty list, i.e. a list of zero nodes. If you're thinking "Why would I do that? A list of nothing doesn't even make sense?", then you know how the ancient Greeks felt!

In our programs, we represent an empty list by a null pointer, i.e. with a pointer whose value is 0. Once you accept the empty list, lots of things become a bit easier to discuss. For example, the sequence of steps we will use to add a new element to the front of a list works perfectly well even when the list is empty.

So how do we create the empty list?  Simple - create a new pointer to an object of struct Node and initialize it to the null pointer.

If we again use the struct Node

struct Node
{
 int data;
 Node *next;
};

Node* LIST = 0;

Adding a node to the front of a list

Adding a Node to the front of a list is a pretty straightforward thing: You create a new Node, set it's data and next fields, and set LIST to point to it, since it's supposed to be the new "first node in the list". The following picture shows it all:

 

Initially we have:

Node *temp = new Node;

temp->data = 8;

temp->next = LIST;

LIST = temp;

At the end we have:

Whenever you want to add a new value val to the front of a linked list LIST, the same sequence of steps will do the job:

Node *temp = new Node;
temp->data = val;
temp->next = LIST;
LIST = temp;

 

In the example below, LIST is empty, and the exact same sequence of steps we used above will add the new value 3 to the "front" of the list:

 

Initially we have:

Node *temp = new Node;

temp->data = 3;

temp->next = LIST;

LIST = temp;

At the end we have:

At this point, we should have the idea that, since we have a fixed sequence of steps that add a new element to the front of a list, we might want to make a function! If we want to think of this function as modifying the LIST pointer, then the function better take the pointer to the front of the list by reference! Of course, the function will also need to be passed the value val that's supposed to be the data for the new node.

void add2front(int val, Node* &LIST)
{
  Node *temp = new Node;
  temp->data = val;
  temp->next = LIST;
  LIST = temp;
}

Here is an example program that reads ints from the user and stores them in a linked list using the add2front function.  

Problems

The type of object stored in the data fields of the nodes in a linked list are pretty much irrelevant to the problem of how to manipulate a list. Therefore, we're going to concentrate on basic linked list manipulations, which provide us with a wealth of interesting problems. These typically boil down to functions you'd like to define, like:

  • add2back

  • length

  • print

  • find

  • insertafter

  • insertbefore

  • max

  • min

We will be looking at some of these problems in lab.

 


Assoc Prof Christopher Brown

Last modified by LT M. Johnson 11/26/2007 10:21:22 AM