next up previous contents index
Next: The Hamming metric Up: Error-correcting codes Previous: Uncertainty   Contents   Index

Basic results on codes

Before the formal definition is given, let us go straight to an example.

Example 3.3.1   ``Hi Bob''. ``What?'' ``Hi Bob'' (louder). ``What?'' ``Hi Bob'' (even louder). ``Oh, hi.'' This illustrates a repetition code. More precisely, the $ q$-ary repetition code of length $ n$ is the set of all $ n$-tuples of the form $ (x,x,...,x)$, for $ x\in \mathbb{F}_q$. We think of $ x$ as representing information you want to send. It could be the ``greyness'' of a pixel in a picture or a letter (represented in ASCII code) in a word, for example. Since the channel might contain noise, we send $ (x,x,...,x)$ instead, with the understanding that the receiver should perform a ``majority vote'' to decode the vector. (For example, if $ (0,1,0,...,0)$ was received then 0 ``wins the vote'').

This wasn't a very efficient example. Let's try again.

Example 3.3.2   In this example, we will design a code which will send a number in $ \{0, 1, 2, ..., 7\}$ to another person. We will see that in fact our design is not well-thought out. Hopefully, this flaw will motivate us to do a better job later. First, write the number in binary, 0 as $ 000$, $ 1$ as $ 001$, ..., $ 7$ as $ 111$. If $ c_i\in \mathbb{F}_2$ for $ 1\leq i\leq 3$, then define

$\displaystyle c_4=c_1+c_2,\ \ \ \ \
c_5=c_1+c_2+c_3,\ \ \ \ \
c_6=c_3.
$

For example, $ 4$ or $ 100$ is encoded as $ (1,0,0,1,1,0)$. Let $ C\subset \mathbb{F}_2^6$ denote the set of all vectors $ c=(c_1,...,c_6)$ where $ c_i\in \mathbb{F}_2$, for $ 1\leq i\leq 3$, and $ c_4,c_5,c_6$ are defined as above. In other words, define

$\displaystyle C=\{ v\in \mathbb{F}_2^6\ \vert\ Hv={\bf0}\},
$

where

\begin{displaymath}
H=
\left(
\begin{array}{cccccc}
1 & 1 & 0 & 1 & 0 & 0 \\...
... 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 0 & 1
\end{array}
\right).
\end{displaymath}

This is a $ 3$ dimensional vector space over $ \mathbb{F}_2$. Why is this not a ``good'' code? Suppose for instance Alice wants to send the number $ 2$ or $ 010$, so she encodes $ (0,1,0)$ as $ c=(0,1,0,1,1,0)$. Suppose Bob receives $ v=(0,0,0,1,1,0)$. Bob knows that an error occurred in transmission since $ Hv\not= {\bf0}$, so $ v\notin C$. Suppose only one error occurred. (This is more likely than having two or more errors, and it is easier to detect exactly one error that two or more.) Where did the error occur? Bob doesn't know since $ v$ is ``near'' $ (1,0,0,1,1,0)$ and it is ``near'' $ (0,1,0,1,1,0)$. He can't tell which is correct. The problem with this example boils down to the fact that the first and second column of $ H$ are the same. We shall explore this more later.

Enough examples - now for the definition.

Definition 3.3.3   Let $ F$ be a finite field. A subset $ C$ of $ V=F^n$ is called a code of length $ n$. A subspace of $ V$ is called a linear code of length $ n$. If $ F=\mathbb{F}_2$ then $ C$ is called a binary code. If $ F=\mathbb{F}_3$ then $ C$ is called a ternary code. If $ F$ has $ q$ elements then $ C$ is called a $ q$-ary code. The elements of a code $ C$ are called code words or code vectors. Sometimes elements of $ F^n$ which do not necessarily belong to $ C$ are called ``received vectors''. The information rate of $ C$ is

$\displaystyle R={\frac{\log_q\vert C\vert}{n}},
$

where $ \vert C\vert$ denotes the number of elements of $ C$.


next up previous contents index
Next: The Hamming metric Up: Error-correcting codes Previous: Uncertainty   Contents   Index
David Joyner 2002-08-23