In this lecture, we consider sorting numbers in an array.
Selection Sort
One natural way of sorting numbers is as follows:
- The first iteration (with i==0) puts the smallest number at index 0 in the array
- The second iteration (with i==1) puts the next-smallest number at index 1 in the array
- The third iteration (with i==2) puts the next-smallest number at index 2 in the array
- … and on until the entire thing is sorted!
This sorting algorithm is known as Selection Sort, because it selects the first
of the remaining elements and puts it in its proper spot in the sorted array.
I hope the following picture clarifies the idea behind Selection Sort algorithm.
data = 7, 4, 1, 3 iteration 0
| sorted up to index 0 iteration 1
| sorted up to index 1 iteration 2
| sorted up to index 2 (actually, all sorted)
|
#
#
# put
# # the smallest
# # # at index 0
# # #
# # # # (swap 2 -> 0)
---------
0 1 2 3
|
#
#
# put
# # the next smallest
# # # at index 1
# # #
# # # # (swap 3 -> 1)
---------
0 1 2 3
|
#
#
# put
# # the next smallest
# # # at index 2
# # #
# # # # (swap 3 -> 2)
---------
0 1 2 3
|
#
#
#
# #
# # #
# # #
# # # #
---------
0 1 2 3
|
Let's write code for Selection Sort using step-wise refinement:
Function prototype
The input should an array of numbers (along with its size). As for output, the
algorithm sorts the numbers inside the given array by using swaps, and
so the function should output nothing.
// prototype
void selection_sort(int* A, int n);
How many iterations?
We need n-1 iterations. As you see in the picture above, after n-1 iteration,
Selection Sort assures that n-1 smallest numbers are placed in order, which
actually means that all numbers are in order.
What to do for each iteration?
void selection_sort(int* A, int n) {
for (int i = 0; i < n-1; ++i) {
// find the index of the next smallest element
// have variable nexti contain that index
// put A[nexti] at index i
}
}
|
|
We need to do the following:
- Find the index of the next smallest element, which we call
nexti
- Put
A[nexti] at index i.
|
Putting A[nexti] at index i
Let's first take care of this easier task. It is sufficient to simply swap
A[nexti] and A[i].
// put A[nexti] at index i
// that is, swap A[nexti] and A[i]
int temp = A[i];
A[i] = A[nexti];
A[nexti] = temp;
Finding nexti
// A[nexti]: the smallest of A[i],...,A[n-1]
int nexti = i;
for (int j = i + 1; j < n; ++j)
{
if (before(A[j], A[nexti]))
nexti = j;
}
|
|
We observe the following:
- At iteration i,
A[nexti] is the smallest of A[i], ..., A[n-1].
Check the picture above so the observation should make sense to you.
|
|
Function before()?
In the above, we introduce a new function called before(). In fact,
following chunks of code are equivalent in our context.
| direct
|
// A[nexti] should be the smallest
if ( A[j] < A[nexti] )
nexti = j;
| | using before() |
// A[nexti] should be the smallest
if (before(A[j], A[nexti]))
nexti = j;
|
|
|
That is,
before(a, b) is a boolean function that returns true
if a should be smaller than b.
bool before(int a, int b){
return a < b;
}
|
Why before() function, though?
At this moment, it may not be clear to you why we bother to make a function
for a single line of code. Things will be clear soon. Stay tuned!
Copy and paste this code for your HW and Lab
This is the version we recommend you copy and use in your own programs as a
starting point.
void selection_sort(int* A, int n) {
for (int i = 0; i < n - 1; ++i) {
// find nexti, the index of the next element
int nexti = i;
for (int j = i + 1; j < n; ++j) {
if (before(A[j], A[nexti])) {
nexti = j;
}
}
// swap A[i] and A[nexti]
int temp = A[i];
A[i] = A[nexti];
A[nexti] = temp;
}
}
Writing a simple before() function
| Not recommended | Recommended
|
bool before(int a, int b)
{
if( a < b )
return true;
else
return false;
}
|
bool before(int a, int b)
{
return a < b;
}
|
|
|
Note the two versions are equivalent.
- If
a is smaller than b, then expression a
< b evaluates true, so return a < b will return
true.
- If
a is larger than or equal to b, then expression a
< b evaluates false, so return a < b will return
false.
It is always better to write simple code, and we strongly recommend to
use the simpler version (on the right).
|
Sorting numbers in different orders
Sometimes, we may want to sort the data in a different way.
descending order
If before is defined like this, we'll get largest-to-smallest sorted order:
bool before(int a, int b)
{
return a > b;
}
Think of before as a comes before predicate. I use
before to tell selection sort how to determine if one value should
come before another element.
From a bigger designing perspective, by having a dedicated function (i.e.,
before()) that controls the sorting order, your code gains more flexibility to
deal (with much less errors) with a variety of sorting situations.
Another example
Perhaps I want to sort numbers by smallest-to-largest absolute values, with
ties putting the negatives coming first. For example,
45 32 -12 -32 0 -18 6
would be "sorted" using this before function as:
0 6 -12 -18 -32 32 45
|
(-32 and 32 were ties, so the negative -32 should be come first)
|
I define before like this:
bool before(int a, int b)
{
if ( abs(a) != abs(b) ) // if a and b are not ties (having different absolute values)
return abs(a) < abs(b); // then compare them by the absolute value
else // else (i.e., if they are ties)
return a < b; // then the smaller (negative) should be before the larger (positive)
}
This is an important concept. The sorted order produced by a sorting algorithm
like Selection Sort depends on the definition of your before
function - the same algorithm works for any comparison function!
The Same Sorting Algorithm Works on Arrays of Any Type!
We might decide that we want to sort arrays of objects of type
string rather than int. What has to
change in our selection sort algorithm? Not much!
void selection_sort(string* A, int n) { // *** change here: int to string
for (int i = 0; i < n - 1; ++i) {
// find nexti, the index of the next element
int nexti = i;
for (int j = i + 1; j < n; ++j) {
if (before(A[j], A[nexti])) {
nexti = j;
}
}
// swap A[i] and A[nexti]
string temp = A[i]; // *** change here: int to string
A[i] = A[nexti];
A[nexti] = temp;
}
}
Here's a run of my program:
| before() function | sample run
|
bool before(string a, string b)
{
return a < b;
}
|
Enter number of strings: 4
Enter strings: Xena adroit lefty xylaphone
Xena
adroit
lefty
xylaphone
|
Note that the sorting has been performed in the case-sensitive manner. That is,
Xena is before adroit, since the ASCII code for 'X' is smaller than the ASCII
code for 'a'.
- Q: How can I change the program to sort strings in the
case-insensitive manner?
Hopefully you see by now that all this requires is changing the function
before.
| before() function() | sample run
|
string cap(string s)
{
for(int i=0; i < s.length(); i++)
if (s[i] >= 'a' && s[i] <= 'z')
s[i] = s[i] -'a'+'A';
return s;
}
bool before(string a, string b)
{
return cap(a) < cap(b);
}
Enter number of strings: 4
Enter strings: Xena adroit lefty xylaphone
adroit
lefty
Xena
xylaphone
|
|
Search
Searching for values in arrays is another fundamental operation.
The basic format is as follows:
That is,
- Input: We search for value
x in the array A of
N elements.
- Output: The function returns the index of an element of
A that matches the value x.
If no such element is found, an index of N may be returned and,
since it is not a valid index, the caller of the function can determine that no
match was found.
For example, to search in an array of objects of type string, I'd
define the following function:
int search(string* A, int N, string x)
{
for(int i=0; i < N; i++)
{
if( A[i] == x )
return i;
}
return N;
}
As with before in sorting, we may imagine different
ways in which elements are considered to match. The above search,
for instance, does not consider two strings equal if they differ
if one is in capitals and the other in lower case. We can address
in the same way as before, by using a function match
instead of == to determine whether two objects match:
int search(string* A, int N, string x)
{
for(int i=0; i < N; i++)
{
if( match(A[i], x) )
return i;
}
return N;
}
... and then changing the match function to
correspond to different ideas of what it means for two objects to
match. Meanwhile, the search function is always
the same!
Tip:
It may not seem natural or obvious to return an int equal to the
size of the array when a search fails.
- Why not -1? Why not size-of-the-array + 1?
Certainly both of those work, as do many other options.
However, experience has shown that returning the array size on
fail is a nice way to do it and this approach has become pretty
standard. The fact that it is pretty standard is really the
most compelling argument of all. In the absence of compelling
reasons to do otherwise, shouldn't we write our code to look
like what people expect and are used to seeing?
Practice Problems
-
Receive 7 integers from the terminal and print them with positive numbers first
(in ascending order). For simplicity, assume that the user never gives 0 as
input. Below is a sample run.
~$ ./a.out
1 -2 -5 3 7 -4 5
1 3 5 7 -5 -4 -2
Answer:
bool before(int a, int b)
{
if( a > 0 && b < 0 ) // positve(a) should be before negative(b)
return true;
else if ( a < 0 && b > 0 ) // negative(a) should not be before positive(b)
return false;
else // same sign: just compare the values
return a < b;
}
-
Write a program that reads in a list of non-negative integers from the user and
sorts them by their last digit. Numbers with the same last digit should appear
smallest to largest. So, 678 32 67 102 7 18 would appear as 32 102 7 67 18 678.
| |