string *NAME is an array of names of contest
participants, and int *SCORE is an array of scores,
such that participant NAME[i] has score
SCORE[i], then knowing the value of the
largest element in SCORE won't tell me who the winner
is, but knowing the index of the largest element in
SCORE will.
So, with this in mind, let's consider writing a function
indexOfMax that will return the index of the element
with the maximum value in an array A of
N objects of type int.
int indexOfMax(int *A, int N)
{
int imax = 0;
for(int i = 1; i < N; i++)
if (A[imax] < A[i])
imax = i;
return imax;
}
Very simple function! Now, using our above example, the winner of
our contest is NAME[indexOfMax(SCORE,N)].
indexOfMax function:
void mystery(int *A, int N)
{
for(int length = N; length > 1; length--)
{
int k = indexOfMax(A,length);
int temp = A[length-1];
A[length-1] = A[k];
A[k] = temp;
}
}
What does the mystery function do? Well, it starts
by swapping the largest element in the array with the last element
of the array, and then pretends the last array slot isn't there
any more. Hopefully, you see that this puts the largest at the
back of the array, the next largest in the second to last spot,
etc. util the array is in sorted order! This sorting
algorithm is known as Selection Sort, because it
selects the largest of the remaining elements and puts
it in its proper spot in the sorted array.
If you define the swap function (as in
Class 18), you can write a
particularly succinct version of this function:
void selectionsort(int *A, int N)
{
for(int length = N; length > 1; length--)
swap(A[indexOfMax(A,length)],A[length-1]);
}
This is actually a subtle and important function for the way it
uses pass-by-reference on array elements. Make sure you
understand how it works!
indexOfMax and selectionsort functions
this way:
int indexOfMax(int *A, int N)
{
int imax = 0, i;
for(i = 1; i < N; i++)
if (before(A[imax],A[i]))
imax = i;
return imax;
} |
void selectionsort(int *A, int N)
{
for(int length = N; length > 1; length--)
swap(A[indexOfMax(A,length)],A[length-1]);
} |
before rather than
A[imax] < A[i]. Now, if before is
defined like this:
bool before(int a, int b)
{
return a < b;
}
it tells you that you want a to come before
b if a is smaller, i.e. we've got our
same old smallest-to-largest sorted order.
If, on
the other hand, before is
defined like this:
bool before(int a, int b)
{
return a > b;
}
the we'll get largest-to-smallest sorted order.
Think of before as a comes before
predicate. I use before to tell selection sort how to
determine if one value should come before another
element. Perhaps I want to sort numbers by increasing absolute
value, with the negative number coming first if we have opposite
pairs in the array.
I define before
like this:
bool before(int a, int b)
{
if (a != -b)
return abs(a) < abs(b);
else if (a <= 0)
return true;
else
return false;
}
Now we get an ordering by smallest-to-largest absolute values, with
ties (i.e elements with equal absolute value and opposite signs)
putting the positives to the back. For example,
45 32 -12 -32 0 -18 6would be "sorted" using this
before function as:
0 6 -12 -18 -32 32 45This is an important concept. The sorted order produced by a sorting algorithm like Selection Sort depends on the definition of your
before function - the same algorithm works for any
comparison function! In fact, when we get more advanced, we'll
even pass the before function as a parameter to
selection sort! Of course, we don't know how to do that yet!
Note: Most places you see selection sort they
don't break indexOfMax and swap out into
their own functions. Instead, they incorporate the whole thing
inside the loop, like this:
void selectionsort(int *A, int N)
{
for(int length = N; length > 1; length--)
{
// Find imax, the index of the largest
int imax = 0, i;
for(i = 1; i < length; i++)
if (before(A[imax],A[i]))
imax = i;
// Swap A[imax] & the last element
int temp = A[imax];
A[imax] = A[length - 1];
A[length - 1] = temp;
}
}
string rather than int. What has to
change in our selection sort algorithm? Not much!
As you can see from either this program
(separate functions) or this program
(all in one), nothing
really changes except the names of the types we're working with!
Now, by way of excercise, consider this problem: When I input strings with capital letters, capitals are counted as coming before lower-case. For example, here's a run of my program:
Enter number of strings: 4 Enter strings: Xena adroit lefty xylaphone Xena adroit lefty xylaphoneQuite possibly, this is not what I want. How can I change the program so that capitalization is not taken into account when sorting, although when I print the strings they should have their original capitalization? To make it simple, let's assume that the only possible capitals would be the first letter. Hopefully you see by now that all this requires is changeing the fuunction
before.
| old version of before | new version of before |
bool before(string a, string b)
{
return a < b;
} |
bool before(string a, string b)
{
// Remember we pass by value!
if ('A' <= a[0] && a[0] <= 'Z')
a[0] = a[0] + ('a' - 'A');
if ('A' <= b[0] && b[0] <= 'Z')
b[0] = b[0] + ('a' - 'A');
return a < b;
} |
strings a and
b. Note that this is okay because a and
b are passed by value. If they were passed by
reference, we wouldn't be able to print out with
capitalization. Using this version of the
program, we get:
Enter number of strings: 4 Enter strings: Xena adroit lefty xylaphone adroit lefty Xena xylaphone
swap(a,b) does when a and b
are pointers. Hopefully this picture will give you an idea:
search(A,N,x), where we search
for value x in the array A of
N elements, and return the index of an element of
A that matches the value x. If no such
element is found, an index of N may be returned and,
since it is not a valid index, the caller of the function can
determine that no match was found. For example, to search in an
array of objects of type string, I'd define the
following function:
int search(string *A, int N, string x)
{
int i = 0;
while(i < N && A[i] != x)
i++;
return i;
}
As with before in sorting, we may imagine different
ways in which elements are considered to match. The above search,
for instance, does not consider two strings equal if they differ
if one is in capitals and the other in lower case. We can address
in the same way as before, by using a function match
instead of == to determine whether two objects match:
int search(string *A, int N, string x)
{
int i = 0;
while(i < N && !match(A[i],x))
i++;
return i;
}
... and then changing the match function to
correspond to different ideas of what it means for two objects to
match. Meanwhile, the search function is always
the same!
[Note how short-circuit evaluation is an integral part of
search. Without it, the function would not work!]
Tip: It may not seem natural or obvious to return an int equal to the size of the array when a search fails. Why not -1? Why not size-of-the-array + 1? Certainly both of those work, as do many other options. However, experience has shown that returning the array size on fail is a nice way to do it and this approach has become pretty standard. The fact that it is pretty standard is really the most compelling argument of all. In the absence of compelling reasons to do otherwise, shouldn't we write our code to look like what people expect and are used to seeing?
~/$ ./prog Name (or quit): John John Adams John Tyler Name (or quit): James James Madison James Monroe Name (or quit): Steve Name (or quit): George George Washington Name (or quit): quitHere's my solution