
Class 30: Introduction to P and NP


Reading
The introduction to Chapter 34 and Section 34.1 of Introduction to Algorithms.

Homework
We decided that, to compare the difficulty of different problems, we need a standard measure of input size: the "bit length". This means you need to think about how to write a problem instance down as a string of symbols and, ultimately, to determine how many bits are required to represent that string. Thus, you need to develop the skill of representing objects as strings and determining bounds on the lengths of those strings.
Example: Suppose that I have an array of n non-negative integers, all less than d. How many bits would be required to represent this array? Well, you'd have to write down n and d and then you'd have to write down all n of the array elements. It takes roughly lg(x) bits to write down the number x, so this would require lg(n) + lg(d) + n*lg(d) = Θ(n lg(d)) bits. If you want to be picky, you'll have to be a bit clever to make sure that the person can interpret all the bits correctly. For example:
What input does this represent: 11111011011101011000010100 ?

could be:                       111 11  011 011 101 011 000 010 100
                                n=7 d=3  3   3   5   3   0   2   4


could be:                       11  111 0110111 0101100 0010100
                                n=3 d=7   55      44      20
Fortunately, we don't need to worry too much about this. With even a really silly plan, we could encode this without doing worse than, say, quadrupling the number of bits used. Thus, we could still encode our problem in Θ(n lg(d)) bits.
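
To make the count concrete, here is a rough Python sketch (my own illustration, not part of the homework below; the function name naive_bit_count is made up) of the bit count for the naive fixed-width plan from the example:

    def naive_bit_count(n, d):
        # Rough count of bits for the naive plan: write n, write d, then write
        # each of the n elements in a fixed-width field wide enough for any
        # value less than d.
        bits_per_element = max((d - 1).bit_length(), 1)   # elements are in 0..d-1
        return n.bit_length() + d.bit_length() + n * bits_per_element

    # For example, naive_bit_count(1000, 16) is a little over 1000 * lg(16) = 4000,
    # i.e. Theta(n lg(d)) bits.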

Just this once, I want you to worry about precisely what I said you didn't need to worry about in the above example. I want you to describe a scheme with which you could encode an array of n non-negative integers, all of which are less than d, as a string of bits. From your description I should be able to easily figure out how to encode any such array, without any limits whatsoever on n and d ... even if n and d were *HUGE*. This scheme must be unambiguous; in other words, there can't be two different possible interpretations, as in the example above. Turn In: a description of your encoding scheme, an example n, d, and array encoded according to your scheme, and an analysis of how many bits your method requires as a function of n and d.

Other study problems (not to turn in)

  1. Consider the problem of deciding whether or not there is a cycle in an unweighted, directed graph G with n vertices and e edges. The input for such a problem is simply G. Describe a scheme for encoding G as a sequence of bits and give the bit-length of a problem encoding as a function of n and e.
  2. Suppose you want to encode a list of non-negative integers whose sizes can vary wildly. It's tough to encode them so that you know where one integer stops and the next one starts. One scheme is to encode a number n by first giving the number of bits in n in unary notation, then giving the number itself in binary (a code sketch of this scheme appears after this list). For example, to encode 5:
          .--------- terminates the unary field width
          |
          V
    1 1 1 0 1 0 1  ←  represents the number 5
    -----   -----
    Tells   3-bit
    me n    representation
    takes   of the number 5.
    3 bits
    
    What is the list of numbers represented by:
    1 1 0 1 1 1 1 1 0 1 0 1 1 0 1 1 1 1 1 0 1 0 1 0
    
  3. Using the above scheme, write an expression for the number of bits required to represent the list of numbers n1, n2, n3, ..., nk?
  4. Which of the following expressions are polynomials: n^2 - 3*n + 1, n^n + 1, n*k + k^2, n*lg(k), sqrt(n) + 2*n?
  5. Given the following information on the worst case running times of algorithms A, B, C, D, and E, which algorithms are "polynomial time"?
    TA(n) = n^2 - 3*n + 1,
    TB(n) = n^n + 1,
    TC(n,k) = n*k + k^2,
    TD(n,k) = n*lg(k),
    TE(n) = sqrt(n) + 2*n.
  6. If algorithm A runs in time O(n^(3/2)), and s, the number of bits required to write down a "size n" problem, is n^4, what is the running time of A in terms of s rather than n?
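
For problem 2, here is a minimal Python sketch of the unary-length-prefix scheme described there (the scheme is exactly the one in the problem; the function names and the decoder are my own additions):

    def encode_number(n):
        # The scheme from problem 2: the bit-length of n in unary (that many 1s,
        # terminated by a 0), followed by n itself in binary.
        bits = format(n, 'b')
        return '1' * len(bits) + '0' + bits     # e.g. encode_number(5) == '1110101'

    def encode_list(numbers):
        return ''.join(encode_number(n) for n in numbers)

    def decode_list(s):
        numbers, i = [], 0
        while i < len(s):
            width = 0
            while s[i] == '1':                   # read the unary field width
                width += 1
                i += 1
            i += 1                               # skip the terminating 0
            numbers.append(int(s[i:i + width], 2))
            i += width
        return numbers

Running decode_list on the bit string from problem 2 answers that question; counting what encode_number produces for a number n --- twice the bit-length of n, plus one bit for the terminating 0 --- is the key to problem 3.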


Lecture
The difficulty of problems rather than the efficiency of algorithms
Up to this point we have fixed a particular problem and compared the efficiencies of different algorithms for that problem. Now we are considering a more fundamental question: What is the inherent computational complexity of a given problem? We're not analyzing the complexities of algorithms that we already have, we're analyzing the complexity of the best algorithm that anyone could possibly ever invent!

We've already seen this problem addressed in one case: sorting! In Class 14 we proved that any general purpose sorting algorithm, including those that have yet to be invented, has a worst case time complexity that is Ω(n lg n), where n is the number of elements to be sorted. (Remember that mergesort actually meets this bound.) Thus, sorting is inherently n log n ... you can't do better in the worst case.

Why now? Why are we looking at these questions at this point in the semester? We've looked at a public key cryptography system based on integer factorization. In this system, and in other public key systems, security rests on the presumed difficulty of some computational problem. It's not that our current algorithms are slow, it's that we expect that the problem is inherently difficult - i.e. that any algorithm solving it would be slow. Thus, these crypto-systems naturally move us to ask questions about the inherent difficulties of problems.

Comparing Different Problems
You should be able to convince yourself that the max problem, i.e. finding the maximum element in an array of n objects, is a Θ(n) problem. Thus, we can say that sorting is a more difficult problem than max. But how about comparing max and, for example, exponentiation? Which is more difficult?

To make this precise, we'll consider the problem power(3,n), i.e. computing powers of 3. Since our divide and conquer algorithm solves this problem in Θ(lg n) time, we're tempted to say that this must be an easier problem than max. However, I've tricked you a bit. What relationship is there between the "n" from the max problem and the "n" from the power(3,n) problem? None! For each problem, we choose a different parameter to use as our "input size", and we analyze the complexities of our algorithms in terms of these parameters. For different algorithms solving the same problem, this is fine (as long as we analyze our algorithms in terms of the same parameters), but if we're going to compare different problems, we need a universal standard that we can use to measure input size, and we need to analyze the complexities of problems and algorithms in terms of this universal standard. We'll use "bit size" to do this. In other words, the size of a particular instance of a problem is the number of bits required to write down that problem instance.

Input Size = Bit Size
When we analyze the complexity of a problem, it is typically in terms of some parameter that reflects the input size. If we're comparing different problems, however, such arbitrary parameters are insufficient. We need an absolute measure of size. For this we use the bit size, i.e. the number of bits required to write the input down.
Important fact: It takes 1 + ⌊ lg k ⌋ bits to represent a positive integer k.
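
As a quick sanity check (my own illustration), this fact matches Python's built-in bit_length method:

    import math

    # 1 + floor(lg k) bits for a positive integer k, checked against bit_length()
    for k in [1, 2, 7, 8, 1000]:
        assert 1 + math.floor(math.log2(k)) == k.bit_length()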

So, returning to the problem of power(3,n), we have a Θ(lg n) algorithm using divide and conquer, but what is the complexity of this algorithm in terms of the bit size of the input? Well, the input is the number n, and it takes lg(n) bits (approximately) to represent n. So our bit size S = lg n. Therefore, our Θ(lg n) algorithm is a Θ(S) algorithm, i.e. it's linear in the bit size. Of course, no algorithm can be sub-linear in the input size, since it has to read the entire number n in order to determine the answer. So, power(3,n) is linear in the bit-size of its input, just like max, so the two problems are of the same difficulty.

Note: this assumes that all comparisons and arithmetical operations take constant time. To be more precise, we need to pin down our problems precisely enough to factor these costs in as well.
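
For reference, here is a minimal sketch of a divide and conquer algorithm of the kind referred to above, computing 3^n with Θ(lg n) multiplications (assuming, as noted, that each multiplication takes constant time; the exact algorithm from class may differ in details):

    def power3(n):
        # Divide and conquer: 3^n = (3^(n//2))^2, times an extra 3 if n is odd.
        if n == 0:
            return 1
        half = power3(n // 2)
        return half * half * (3 if n % 2 else 1)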

Time out: What's a polynomial?
We'll talk a lot about "polynomial" running time functions --- about algorithms that do or don't run in polynomial time. So it's important that we understand what a polynomial actually is. A polynomial is an expression involving variables, numbers and the operations +, -, and *. That's it. Now, because it's a little bit tedious to write x*x*x - 3*x*x + 1, we allow the shorthand x^3 - 3*x^2 + 1. So exponentiation is allowed in a polynomial expression only if the exponent is a non-negative integer constant --- i.e. only if it is simply a shorthand for writing x*x*x... a bunch of times.
x^3 - 2 polynomial
x^3 - lg(x - 3) not a polynomial, "lg" is not one of the approved operations
-2*(x - 3)^2 + 4*x*(x^3 - 1) polynomial
(3*x^2 - 2)^k, where k = 7 polynomial
2^x - 1 not a polynomial, only constant exponents are allowed

An algorithm is "polynomial time" if it is O(p(n)) for some polynomial p. Thus, even though binary search is Θ(lg n) and lg n is not a polynomial, it is polynomial time because it is also O(n).

Tractable & Intractable
Tractable problems are the ones we can solve "in a reasonable amount of time". Intractable ones are those that cannot be solved "in a reasonable amount of time". We will call a problem tractable if it can be solved in a time that is bounded by some polynomial in the input size. In other words, if there is a constant k such that the problem can be solved in time O(S^k), where S is the bit size of the input, the problem is tractable. This includes nearly every problem we've talked about so far. Anything else is intractable. The 0/1 Knapsack Problem and Integer Factorization are the two problems we've discussed that experts suspect are intractable - although they don't know for sure.

This choice of definitions for tractable and intractable is philosophical, not mathematical. There is logic behind it, however. A problem requiring time Θ(n^100) would certainly be unreasonably difficult to solve, but such problems just don't seem to arise in practice: when a polynomial time algorithm exists at all, the exponent is typically small. This definition has some other nice properties. For example, if an algorithm runs in polynomial time on your desktop computer, it also runs in polynomial time on a Turing machine ... of course the exponent may be quite a bit larger on the Turing machine! [Turing machines are a standard mathematical model of computation.]

Another nice feature of our definition of intractability is that it allows us to be somewhat sloppy about measuring input size. For example: Suppose we have an algorithm that we've analyzed as running in time O(S^k) and ... oops, the actual input size is really the square of S! So the actual input size, call it T, is T = S^2, i.e. S = √T. So the algorithm runs in time O((√T)^k) = O(T^(k/2)), which is still polynomial. So if the actual bit size of the input is a polynomial in what you've used as input size, polynomial time algorithms stay polynomial.

P vs. NP Informally Speaking
The class of problems that can be solved in polynomial time, i.e. in O(n^k) time for some constant k, is denoted by P. Almost all the problems we've discussed are known to be in P. After all, we discussed polynomial time algorithms that solve them!

NP is the class of problems for which solutions can be verified in polynomial time. For example, we don't know of any polynomial time solution for the 0/1 Knapsack Problem. However, we certainly know how to verify that a purported solution is in fact a solution in polynomial time - just add up the weights in the purported solution and see if the result equals the capacity. Actually, there's another requirement too, namely that the bit-length of any purported solution is polynomial in the bit size of the problem (or, equivalently, that the number of possible solutions is at most exponential in the bit length of the problem).
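
A minimal sketch of such a verifier, for the exact-capacity form of knapsack used above (the function name and argument layout are my own):

    def verify_knapsack(weights, capacity, chosen_indices):
        # Polynomial time check: add up the weights of the purported solution
        # and see whether the total equals the capacity.
        return sum(weights[i] for i in chosen_indices) == capacity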

If we can solve a problem in polynomial time, we can certainly verify a solution in polynomial time, so any problem in P is certainly in NP. But is this true the other way 'round? Or are there problems in NP that are not in P? Most everybody believes that there are, but nobody knows.

Brute Force, or Guess and Check
Because we can verify solutions quickly for problems in NP, we can adopt a brute force "Guess and Check" approach: keep generating candidate solutions and checking them until we find one that works or we've tried every possibility. Our restriction on the bit-size of possible solutions guarantees that we have to try at most exponentially many candidate solutions.
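
Here is a sketch of guess and check for the knapsack problem, reusing the verify_knapsack sketch from the previous section; in the worst case it tries all 2^n subsets:

    from itertools import combinations

    def knapsack_by_guess_and_check(weights, capacity):
        # Brute force: enumerate every subset of items (exponentially many
        # candidates) and run the polynomial time verifier on each one.
        n = len(weights)
        for size in range(n + 1):
            for subset in combinations(range(n), size):
                if verify_knapsack(weights, capacity, subset):
                    return subset          # found a verified solution
        return None                        # no subset adds up to the capacity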

Therefore, if a problem is in NP then we know that it can be solved in exponential time. The big question is whether it can be solved faster ... perhaps in polynomial time.


Christopher W Brown
Last modified: Wed Apr 8 11:07:57 EDT 2009