SI221 Class 23 - Anaysis of algorithms

Homework
The program hw.cpp (which uses Stop_Watch.h) times & tests an implementation of AlgBrown f(n) = ceil(sqrt(1))*(-1)^1 + ceil(sqrt(2))*(-1)^2 + ... + ceil(sqrt(n))*(-1)^n In case you don't know, "ceil(sqrt(i))" is the first integer greater than or equal to sqrt(i). So for example, ceil(sqrt(7)) = 3, and ceil(sqrt(16)) = 4. Your homework is to improve on my algorithm, and make a graph showing the running times in microseconds of my algorithm and your algorith for values of n from 1 to 15. You must answer this question: did you "tune" the algorithm, i.e. improve the running time but not change the growth rate, or did you really "improve" the algorithm, i.e. change the growth rate? Make sure your improved algorithm still gets the right answers!

Turn In: A print out of your source code, a printout of your plot (use gnuplot, or Excel, or whatever you want), and the answer (justified) to "The question"!

Reading
Data Abstraction and Problem Solving with C++ pp 433-443 inclusize.

Lecture
Intro
We're going to lay the groundwork for understanding how the efficiency of algorithms is analyzed, and how different algorithms (solving the same problem!) are compared in terms of efficiency. We'll try to develop analytic tools for these things, but we'd like to keep it grounded by looking at empirical data as well. To aid in this, I give you Stop_Watch.h, which defines a class that makes timing code easy. What Stop_Watch gives you is the time spend executing your code, which means that your timing results shouldn't depend (too much) on how many other people are using the same computer as you or what they're doing. This stop watch has a granularity to it as well, which you need to be conscious of. Imagine a digital stop watch with a minutes and seconds read-out. If you tried to time something that took less than a second, it would read that 0 time had passed. This doesn't mean that the event literally took 0 time, but that the stop watch couldn't measure events to an accuracy under 1 second. Such a stop watch would have a granularity of 1 second. The Stop_Watch class's granularity is machine dependant, but it's probably not capable of measuring events under 1 millisecond ... probably closer to 10 milliseconds, in fact. In this case, you have to run the code you want to time over and over again - say 1000 times - and time the duration of the 1000-run trial. The time per run would be the total divided by 1000.

What do we mean by an algorithm's "efficiency"
In this section we're going to talk about several different algorithms for computing the function f(x,k) defined as:
            x8      x8             x8       x8
f(x,k)  =  ---  +  ---  + ... +  -----  +  ---
            1!      2!           (k-1)!     k!
We're going to be interested in comparing them based on speed - essentially asking which algorithm is fastest. This sounds clear, but isn't really as clear as you think!

I have implemented two algorithms for this problem: Alg1 and Alg2. Experimentally I've determined that to compute f(3.1,8) takes 1.32 microseconds with Alg1 and 0.23 microseconds with Alg2. On the other hand, computing f(0.75,12) takes Alg1 1.81 microseconds and takes Alg2 8.02 microseconds. Which is better? Pretty hard to say, isn't it? When we talk about running time of algorithms, which input are we talking about? Different input takes different amounts of time!

Let's shift the focus and look at one specific algorithm and try to run some experiments with it. The function f_alg0 below implements a pretty straightforward algorithm for computing f:

double f_alg0(double x, int k)
{
  double S = 0.0;
  for(int n = 1; n <= k; n++)
  {
    // Compute x^8
    double x8 = 1.0;
    for(int i = 0; i < 8; i++)    
      x8 = x8*x;
    
    // Compute n!
    double f = 1.0;
    for(int i = 1; i <= n; i++)
      f = f*i;
    
    // Add next term
    S += x8/f;
  }
  return S;
}
	  
If you run some experiments on this, you'll find that the time required does not depend on x at all, only on k. (this program allows you to run some time trials). This tells us important thing number 1: Any single arithmetic operation on the computer takes the same amount of time regardless of the specific values of the operands. In other words, it takes just as long to compute 13.4*5.2 as 30003.8798798*0.00003782. What matters in this case is k, because k affects how many times we go through our main loop - the bigger k is, the longer the computation takes. Thus, no one number gives us "the running time" of f_alg0. Instead, "the running time" is a function of k - let's call it T0(k). Below you see a plot of this function for values of k from 1 to 20.

T0(k) in (microseconds)

So, important thing number 2: "Running time" is a function, not a number - it's a function of the input. Typically, though not always, as the input gets "bigger", the computation requires more time. When we talk about the efficiency of an algorithm, we're talking about this function!

How do we compare running time functions?
When we look at a different algorithm for solving the same problem, we want to compare its running time function to that of the algorithm we already have. Here are implementations of four more algorithms for solving our problem, along with a plot of the running time functions for all 5 algorithms:

Alg1 We notice that x^8 is getting recomputed needlessly every time through the outer loop, so we compute once, before the loop. Alg2 We notice that each time through the loop the new value of the variable f is the old value of f times n, so we simply remember the old value and reuse it rather than recompute it. Alg3 We compute x^8 before the main loop, compute it with only three multiplications (rather than an 8-iteration) loop, and we factor the x^8 out of the sum. Alg4 We use all the best features of the previous three!.
double f_alg1(double x, int k)
{
  // Compute x^8
  double x8 = 1.0;
  for(int i = 0; i < 8; i++)    
    x8 = x8*x;
    
  double S = 0.0;
  for(int n = 1; n <= k; n++)
  {
    // Compute n!
    double f = 1.0;
    for(int i = 1; i <= n; i++)
      f = f*i;
    
    // Add next term
    S += x8/f;
  }
  return S;
}
double f_alg2(double x, int k)
{
  double S = 0.0, f = 1.0;
  for(int n = 1; n <= k; n++)
  {
    // Compute x^8
    double x8 = 1.0;
    for(int i = 0; i < 8; i++)    
      x8 = x8*x;

    // Compute n!
    f = f*n;

    // Add next term
    S += x8/f;
  }
  return S;
}
double f_alg3(double x, int k)
{
  // Compute x^8
  double x2 = x*x;
  double x4 = x2*x2;
  double x8 = x4*x4;

  double S = 0.0, f = 1.0;
  for(int n = 1; n <= k; n++)
  {
    // Compute n!
    double f = 1.0;
    for(int i = 1; i <= n; i++)
      f = f*i;

    // Add next term
    S += 1/f;
  }
  return x8*S;
}

		  
double f_alg4(double x, int k)
{
  // Compute x^8
  double x2 = x*x;
  double x4 = x2*x2;
  double x8 = x4*x4;

  double S = 0.0, f = 1.0;
  for(int n = 1; n <= k; n++)
  {
    // Compute n!
    f = f*n;

    // Add next term
    S += 1/f;
  }
  return x8*S;
}
		  

So, which algorithm is best? Hopefully you'd say Alg4. But how much better? How would you rank them? The key thing to notice, is that Alg0, Alg1 and Alg3 all curve up, whereas Alg2 and Alg4 are straight lines. To really see how important this distinction is, let's plot the computing times (in milliseconds now) over a much wider range of k values - from 100 up to 1000.

What we see is that there are two classes of algorithms: those whose computing time is a straight line, and those whose computing time curves up (actually, they're parabolic), and there's simply no comparison between the two families. Even though Alg3 was faster than Alg2 for small values of k, the fact that Alg2's running time was only growing linearly while Alg3's was curving up, growing fast and faster, meant that Alg3 was doomed. Alg2 was destined to catch up to it and surpass it, and by the time you look at values of k in the thousands ... well just forget about it!

What we're really interested in when we compare algorithms is the shape of the running time function's curve. That tells us how fast the running time grows as the input gets bigger and bigger. The difference between Alg0 and Alg3 for example, is irrelevent when you compare them to Alg2. Alg0 and Alg3's running times differ by a constant multiple - roughly T0(k) = 3/2 T3(k) would be my guess from the earlier graph. That's nice, but they both look equally slow on the last graph when we compare them to Alg2. Similarly, Alg4's running time differs from Alg2 by a constant multiple - roughly T2(k) = 4 T4(k) would be my guess from the earlier graph. That's nice, but that difference is irrelevent when you compare those two to all the other algorithms. Constant multiple differences in running time don't change the shape of the running time function graphs, they don't change the growth rate of the running time functions. So, important thing number 3: what we really care about is the growth rate of an algorithm's run-time function. This is justified by the above discussion comparing several algorithms solving our f(x,k) problem. It's got other justifications as well. The actual time in seconds for an implementation of an algorithm depends on the skill of the programmer, depends on whether or not you're using compiler optimization, depends on the language used, depends on the architecture of the computer ... depends on all sorts of things. The growth rate is independent of these things. All these things change the running time function by a constant multiple, they don't change the growth rate, they don't change the fundamental shape of the curve.

Hierarchy of growth rates
The "fast" algorithms from the previous sections grew like lines, which is to say they grew like n. We say that a function that grows like n is "(n)", so we'd write that "T2(k) = (n)" and "T4(k) = (n)". On the other hand, the other algorithms are "(n2)", which is to say they grow like n2. Clearly, n2 grows faster than n, so we prefer the (n). algorithm. There's a whole hierarchy of growth rates, including:
2^n  <------ Grows fast!
n^3
n^2
n log(n)
n
sqrt(n)
log(n) <---- Grows slow!
This link link demonstrates how each elment of the hierarchy might as well be the constant zero compared to everything above it - just like Alg2 and Alg4 took no time at all compared to the others when you started considering big enough values of k.

Ultimately what we'd like to do is look at an algorithm and place it on the growth rate hierarchy without even having to implement it! That way we can compare all the different algorithms we dream up for a problem, pick the best, and only bother to sit down and program the best

What about when two inputs of the samce "size" take different amounts of time?
In the example we've been considering, we used k to measure the "size" of the input, and for fixed k all of our algorithms took the same amount of time regardless of which particular input of size k we had - i.e. regardless of x. That doesn't always happen. For example, let's consider the problem of finding the index i such that the element A[i] in vector of integers has value x. In other words: the input consists of A and x, and the output is i (we'll assume that you know that x is in A, you simply don't know where). Clearly, the size of the vector, which we'll call NA, is the "size" of our input. If we have an algorithm like this:
int find(vector<int> &A, int x)
{
  int i = 0;
  while(i < A.size() && A[i] != x)
    i++;
  return i;
}
	  
We'd like the running time as a function of NA. Well, now for a given input size NA there's a whole range of possible running times from almost no time at all (if x is sitting in the first cell of A) to however long it takes to look through the whole vecotor. Here's the plot of all the possible running times as NA varies from 10 to 100 in increments of 5, and over all possible x values (remember we're assuming x is in the vector somewhere).

So now, the running time is no longer a function of the input size - because there is not a unique time value for each value of NA. However, if you take the maximum time for each NA, you get a function Tworst(N_A), the worst case running time of the algorithm. If you take the minimum time for each NA, you get a function Tbest(N_A), the best case running time of the algorithm. And if you take the average time for each NA, you get a function Tave(N_A), the average case running time of the algorithm. Here, Tworst(N_A) = (N_A), and Tbest(N_A) = (1), i.e. it grows like the constant 1, which is to say it doesn't grow at all, and Tave(N_A) = (N_A). Typically, we'll talk about the best, worst and average case running times of algorithms as functions of their input size.

Christopher W Brown
Last modified: Fri Apr 18 10:44:39 EDT 2003