If your program asks for a random bit of data, the subsequent computation will depend on whether a 1 or a 0 was returned. However, you'll get one or the other, and the program will proceed with a single computation based on that value. In nondeterministic computation you can request a "nondeterministic bit" and, instead of getting one or the other, the computation branches off into two alternate realities, or "alternate computations", which proceed simultaneously  one branch corresponding getting a 1 and the other corresponding to getting a 0. If you have several of these "nondeterministic bits" in your program, you'll have many "alternate reality computations" proceeding simultaneously. That's nondeterministic computing.
Great, but in the end we need one single answer out of these many simultaneous "alternate reality" computations. What'll it be? Well, recall that we've been restricting ourselves to decision problems in our discussions of P and NP. For nondeterministic computation, we need to do the same thing. So, assuming we're trying to solve a decision problem, if each alternate computation comes up with the result "false", then the result of the nondeterministic computation is "false". As long as at least one of the alternate computations comes up with the result "true" the result of the nondeterministic computation is "true".
Real computers, like the one I'm typing on right now, are deterministic not nondeterministic. The nondeterministic model of computing is not supposed to model real computers, they are supposed to help categorize certain kinds of computations. However, there is this idea of a "quantum computer" out there, and quantum computers may be able to compute like our nondeterministic machines. This kind of "both at once" idea is present in quantum mechanics  Schroedinger's Cat, for example  and is the real basis for quantum computing. However, it's not clear that there will ever be general purpose programmable quantum computers. Still, the possiblity is out there ...
So how does this characterization of NP as the class of problems that can be solved in polynomial time with a nondeterministic machine mesh with our definition of NP as the class of decision problems for which we have certificates that can be verified in polynomial time? Well, suppose I have a definition of certificates for instances of Problem A and a polynomial time algorithm for verifying certificates. Then I could generate a string of nondeterministic bits as long as any certificate, and run the verification algorithm using that sequence of bits as the certificate. Thus, I would effectively be simultaneously verifying all possible certificates. One of them will give me TRUE as an answer if and only if the answer to the original instance is TRUE.
This question of whether there are problems in NP that cannot be solved in polynomial time is the biggest open problem in computer science and one of the biggest in math as well. Truth of the matter is, most of us are pretty convinced that P ≠ NP, even though we can't prove it.
The decision Problem B is reducible to the decision Problem A if there is an algorithm that transforms instance I_{B}of problem B into an instance I_{A} of problem A that is equivalent. In this context, "equivalent" means that the answer to Instance B is true if and only if the answer to Instance A is true.
Problem A  Set Partition: Given positive integer weights w_{1}, ..., w_{n}, Can the weights be partitioned into two sets of equal weight?  
Problem B  Simple Knapsack: Given positive integer weights w_{1}, ..., w_{n} and capacity C, Does some subset of the weights sum to exactly C?  
 

Suppose Problem B is polynomial time reducible to problem A. If there is a polynomial time algorithm for solving A, then we have a polynomial time algorithm for solving B. Here it is:
I_{B} > I_{A} > solution Polynomial Polynomial Time Time Reduction Solution s = size I_{B} time = O(s^{k}) t = size I_{A} time = O(t^{m}) = O(s^{k}) = O(O(s^{k})^{m}) = O(s^{km})So we deduce that the time to solve I_{B} is O(s^{km}), which is polynomial in the size of I_{B}. The only real mystery is why we were able to deduce that t = size I_{A} = O(s^{k}), and that is explained by the following.
Suppose that a program runs in 1/10 of a second, and the machine it runs on executes an instruction every 1/1,000,000 of a second. What's the most memory this program could possibly use? The most memory would be used if every instruction wrote to a new memory location. Since instructions reference at most a word at a time, this would use 1 word = 4 bytes per instruction. In 1/10 of a second, our machine performs 100,000 instructions, so we use at most 400,000 bytes, or 400K.So, you should see how information about runing time translates into information about space usage. We know now that Algorithm X requires only polynomial space, so any output from X will have bitsize that's polynomial in the input size.
Input: Simple Knapsack problem w_{1}, w_{2},
..., w_{n} and capacity C
Output: A Set Partition problem
that's equivalent to
the input Knapsack Problem.
Proposition: Given Simple Knapsack problem w_{1}, w_{2}, ..., w_{n} and capacity C, the above algorithm returns an equivalent Set Partition problem.
Proof: Let S = w_{1} + w_{2} + ... + w_{n}. First, we'll show that if the Knapsack problem is true, so is the set partition problem. Suppose that some subset of weights fills the knapsack to capacity C. The sum of all the wieghts not in the knapsack is S  C. If C ≥ S/2 the partition problem returned by the algorithm adds weight w_{n+1} = 2*C  S, so adding w_{n+1} to the set of weights not in the knapsack gives gives a set of total weight S  C + 2C  S = C, and we have a partition into two sets of weight C. If C < S/2 the partition problem returned by the algorithm adds weight w_{n+1} = S  2*C, so adding w_{n+1} to the set of weights in the knapsack gives a set of total weight C + S  2C = S  C, which is equal to the set of weights not in the knapsack. Thus if the original knapsack problem can be solved, so can the returned partition problem.Next we'll show that if the Partition Problem is true, so is the Knapsack problem. Suppose that the n+1 wieghts in the partition problem can be partitioned into two sets of equal weight. If C ≥ S/2 the partition problem returned by the algorithm adds weight w_{n+1} = 2*C  S, so the sum of all weights in the partition problem is S + 2C  S = 2C. Thus, both partition sets sum to C, and whichever doesn't have the weight w_{n+1} is a solution to the original knapsack problem. If C < S/2 the partition problem returned by the algorithm adds weight w_{n+1} = S  2*C, so the sum of all weights in the partition problem is S + S  2*C = 2*(S  C). Thus, both partition sets sum to S  C, and if we remove weight w_{n+1} from whichever set has it, the remaining weights sum to S  C  (S  2*C) = C and thus provide a solution to the original knapsack problem. Thus if the returned partition problem can be solved, so can the original knapsack problem.
We can't have one problem solvable without the other one being solvable as well, so either both are solvable or neither are solvable. Thus the two problems are equivalent.
Clearly, our algorithm takes polynomial time. Thus, Simple Knapsack is polynomial time reducible to Set Partition, which means among other things that if we can solve Set Partition in polynomial time we can also solve Simple Knapsack in polynomial time. So there's another way to win your million dollars!