next up previous contents index
Next: Basic results on codes Up: Background on information theory Previous: Binary symmetric channel   Contents   Index

Uncertainty

We want to formalize the notion of uncertainty. Consider two experiments. In the first, you flip a fair coin. In the second, you roll a fair dice. It is reasonable to say that the outcome of the second experiment is more uncertain than the outcome of the first, simply because there are more possibilities to choose from. Consider the uncertainty (whatever that is) of a random variable $ X$ which takes on only the distinct values $ x_1$, ..., $ x_n$ with non-zero probabilities $ p_1$, ..., $ p_n$, resp., where $ p_1+...+p_n=1$. The uncertainty of $ X$ should not depend on the values of $ X$ but only on the probabilites $ p_i$, $ 1\leq i\leq n$. Likely events should be less uncertain than rare events. With this motivation, we present Claude Shannon's definition.

Definition 3.2.1   The uncertainty or entropy of the above random variable $ X$ is defined to be

$\displaystyle H(X)=H(p_1,...,p_n)=
-\sum_{i=1}^n p_i\log_2p_i,
$

where $ \log_2$ is the logarithm to the base $ 2$.

Example 3.2.2   Suppose that $ X$ is the signal received by a (noisy) binary symmetric channel. Then

$\displaystyle H(X)=H(p,1-p)=-p\log_2 p-(1-p)\log_2(1-p).
$

The maximum uncertainty is when $ p=1/2$, in which case $ H(X)=1$.

It is intuitively obvious that when the channel creates a lot of errors then there is a limitation to the information which can be sent. The next definition makes this idea more precise.

Definition 3.2.3   The capacity of the channel is $ cap(X)=cap(p)=1-H(X)$.

In the case of the binary symmetric channel, the capacity is $ cap(X)=1+p\log_2 p+(1-p)\log_2(1-p)$. The minimum capacity is when $ p=1/2$. To justify the formula which defines the capacity, we need Shannon's fundamental theorem of information theory (also called the noisy channel theorem). It will be stated in section §3.4.

Exercise 3.2.4   Suppose that we send a sequence of length $ n$ consisting of 0's and $ 1$'s. (a) What is the probability of exactly one error? (Ans: $ np(1-p)^{n-1}$. Why?) (b) What is the probability of exactly two errors? (Ans: $ {\frac{1}{2}}n(n-1)p^2(1-p)^{n-2}$. Why?) (c) What is the probability of exactly $ k$ errors?


next up previous contents index
Next: Basic results on codes Up: Background on information theory Previous: Binary symmetric channel   Contents   Index
David Joyner 2002-08-23