It seems like restricting ourselves to decision problems would be a very big limitation, but in fact that's not really true. Many interesting problems have a decision problem variant. For example, the 0/1 Knapsack problem can be changed from "Which items can I choose to maximize the value of the items in the Knapsack without exceeding the capacity?" to "Is it possible to fill the Knapsack to achieve value k or more (not exceeding the capacity, of course)?" The rephrased problem is a decision problem, and it's clear that the original is at least as hard as the decision problem. Here's another example, the problem "What's the longest path in a given graph G?" can be phrased as "Is there a path of length at least k in G?". Once again it's clear that the decision problem version is easier. After all, if we knew how to find the longest path, we'd know how to answer whether or not there was a path of length k!
Example: Consider the problem of finding a minimum hitting set. If S is a set of n subsets of {1,..,k}, a minimum hitting set for S would be a minimum size subset of {1,...,k} that has non-empty intersection with (i.e. "hits") each of the n elements of S.
- Define a decision-problem variant of this problem. We'll call the decision-problem variant MINHITSET.
- Define a notion of "certificate" for MINHITSET.
Definition: A problem is in NP if it is a decision problem for which we have
- certificates whose length are polynomial in the size of the problem instance, and
- an algorithm that verifies certificates in time polynomial in the size of the problem instance.
To prove that a particular problem is in NP, we'll have to show that it satisfies the above definition - kind of like a lawyer showing that the defendant's actions satisfy the definition of the crime for which he's on trial.
Proposition: The problem LongPath(G,k) is in NP.
Proof: (PARTIAL!)
- Note that LongPath(G,k) is a decision problem, as the definition of NP requires!
- Here's my notion of certificate: A certificate is a list of vertices comprising a path of length at least k
- Here's my algorithm for verifying a certificate:
Verify(G,k,C)
- Read G, k, store graph G in an adjacency matrix
- Read certificate C into an array
- if m < k, where m is the length of C, return FALSE
- for i = 1 to m - 1 do
if G has no edge from vertex C[i-1] to C[i] return FALSE- for i = 0 to m - 1 do
for j = i + 1 to m - 1 do
if C[i] == C[j] return FALSE- return TRUE
Proposition: The problem LongPath(G,k) is in NP.
Proof: The problem is a decison problem, of course. We give the vertices the indices 0,1,...,n-1, where n is the number of vertices in the graph, and we will refer to vertices by index. We will encode a problem instance by writing n-1 0's followed by a 1, which encodes n, then k-1 0's followed by a 1, which encodes k, and then listing out the n^2 bits of the adjacency matrix for G one row at a time from top to bottom. Since k ≤ n, the bit size s of an encoded problem instance is thus:s = Θ(n^2).Now we are ready to show that LongPath(G,k) satisfies the two parts of our definition of NP:
- Certificates. A certificate for this problem is simply a list of at most n vertices. If these vertices form a path in G without repeated vertices and of length is at least k, the certificate is valid and it is invalid otherwise.
The certificate could be encoded by writing out the binary representations of each consecutive vertex in the path using ceiling(lg n) bits. Since the number of vertices in the path is at most n, the bit-length of any certificate is at most n*ceiling(lg n), which is O(n^2). So the certificate length is O(s), i.e. polynomial in the input size.
- Polynomial time algorithm for verifying certificates.
- Read problem instance, store G in an adjacency matrix, and store value k.
- Read the r indices in the Certificate into array P, if r < k return false.
- Let M be an array of n bits, initialized to 0.
- for i from 0 .. r-1 do
- if M[P[i]] != 0 return false (we have a repeated vertex)
- set M[P[i]] = 1.
- if i < r-1 and there is no edge from vertex P[i] to vertex P[i+1] return false.
- return true
It should be clear that each step can be done in time O(s), so the whole algorithm takes time O(s), which is indeed polynomial in s, the bit length of the encoded input, as required.
Hopefully you see that proving that a problem is in NP means nothing more than matching that problem to the above definition, just as we did for LongPath.