next up previous contents index
Next: -codes and error correction Up: Error-correcting codes Previous: Basic results on codes   Contents   Index


The Hamming metric

We've seen so far come simple examples of codes. What is needed is some notion of how to compare codewords. Geormetrically, two codewords are ``far'' from each other if there are ``a lot'' of coordinates where they differ. This notion is made more precide in the following definition.

Definition 3.4.1   If $ {\bf v}=(v_1,v_2,...,v_n)$, $ {\bf w}=(w_1,w_2,...,w_n)$ are vectors in $ V=F^n$ then we define

$\displaystyle d({\bf v},{\bf w})
=\vert\{i\ \vert\ 1\leq i\leq n,\ v_i\not= w_i\}\vert
$

to be the Hamming distance between $ {\bf v}$ and $ {\bf w}$. The function $ d:V\times V\rightarrow \mathbb{N}$ is called the Hamming metric. The weight of a vector (in the Hamming metric) is $ d({\bf v},{\bf0})$.

Note that

$\displaystyle d({\bf v},{\bf w}) =\vert\{i\ \vert\ 1\leq i\leq n,\ v_i-w_i\not= 0\}\vert =d({\bf v}-{\bf w},{\bf0})$ (3.1)

for any vectors $ {\bf v},{\bf w}\in F^n$ (or, more generally, any vectors in a linear code). Using this, it is easy to show that $ d$ satisfies the properties of a metric: Let $ v\in F^n$ and let

$\displaystyle B(v,r,F^n)=\{w\in Fn\ \vert\ d(v,w)\leq r\}.
$

This is called the ball of radius $ r$ about $ v$. Since $ F^n$ is finite, this ball has only a finite number of elements. It is not hard to count them using a little bit of basic combinitorics. Since this count shall be needed later, we record it in the following result.

Lemma 3.4.2   If $ 0\leq r\leq n$ and $ q=\vert F\vert$ then

\begin{displaymath}
\vert B(v,r,F^n)\vert=\sum_{i=0}^r
\left(
\begin{array}{c}
n\\
i
\end{array}
\right)
(q-1)^i.
\end{displaymath}

proof: Let

$\displaystyle B_i(v,r,F^n)
=\{w\in Fn\ \vert\ d(v,w)=i\}.
$

This is called the shell of radius $ i$ about $ v$. It is consists of all vectors with exactly $ i$ coordinates different from $ v$. There are \begin{displaymath}\left(
\begin{array}{c}
n\\
i
\end{array}
\right)\end{displaymath} ways to choose $ i$ out of $ n$ coordinates. There are $ (q-1)^i$ ways to choose these $ i$ coordinates to be different from those in $ v$. Thus,

\begin{displaymath}
\vert B(v,r,F^n)\vert=\sum_{i=0}^r \vert B_i(v,r,F^n)\vert...
...t(
\begin{array}{c}
n\\
i
\end{array}
\right)
(q-1)^i.
\end{displaymath}

$ \Box$

Example 3.4.3   If $ F=\mathbb{F}_{11}$ and $ V=F^{10}$ then

$\displaystyle C=\{(x_1,x_2,...,x_{10})\ \ \vert\ x_i\in F,\
x_1+2x_2+3x_3+...+9x_9+10x_{10}\equiv 0\ ({\rm mod}\ 11)\}
$

is called the ISBN code. This is an $ 11$-ary linear code of length $ 10$. This is the same code used in book numbering except that the number $ 10$ is denoted by $ X$ on the inside cover of a book. For example, $ (1,0,0,0,0,0,0,0,0,1)$ and $ (1,1,1,1,1,1,1,1,1,1)$ are code words. Their Hamming distance is $ 8$.

Example 3.4.4   The U. S. Post Office puts a bar code on each letter to help with its delivery. What are these funny symbols? Translated into digits, they are given in the following table.
number bar code
1 | | | | |
2 | | | | |
3 | | | | |
4 | | | | |
5 | | | | |
6 | | | | |
7 | | | | |
8 | | | | |
9 | | | | |
0 | | | | |
Each ``word'' in the postal bar-code has 12 digits, each digit being represented by short bars (we regard as a 0) and longer bars (which are regarded as a $ 1$), as above. The 12 digits are interpreted as follows: The first 5 digits are your zip code, the next 4 digits are the extended zip code, the next 2 digits are the delivery point digits, and the last digit is a check digit (all the digits must add to 0 mod $ 10$). For example, suppose that after translating the bars into digits, you found that the postal code on an envelope was $ ?62693155913$, where $ ?$ indicates a digit which was smudged so you couldn't read it. Since the sum must be 0 mod $ 10$, we must have $ ?=0$.

Definition 3.4.5   The weight distribution vector of a code $ C\subset F^n$ is the vector

$\displaystyle W(C)=(W_0,W_1,...,W_n)
$

where $ W_i$ denote the number of codewords in $ C$ of weight $ 1$. Note that for an linear code $ C$, $ W_0=1$, since any vector space must contain the zero vector.

Example 3.4.6   In Example 3.3.2, the code $ C$ has weight distribution vector $ W(C)=(1,0,1,3,2,1)$.

Let $ C\subset F^n$ be a code. The number

$\displaystyle d=\min_{{\bf v}\not= {\bf w}}
d({\bf v}, {\bf w}),
$

where the minimum runs over all distinct code words in $ C$, is called the minimum distance of $ C$. This (as we will see) measures the ability for the code to correct errors which might be introduced in transmission.

Theorem 3.4.7   If $ C$ is a linear code and if $ d$ denotes the minimum distance of $ C$ then

$\displaystyle d=\min_{{\bf v}\not= {\bf0}}
d({\bf v}, {\bf0}),
$

where the minimum runs over all non-zero code words in $ C$.

In other words, the minimum distance of a linear code is the length of its shortest non-zero vector. proof: Clearly, $ d\leq \min_{{\bf v}\not= {\bf0}}
d({\bf v}, {\bf0})$. Conversely, suppose $ {\bf v},{\bf w}$ are code words such that $ d=d({\bf v},{\bf w})$. By (3.1), $ d=d({\bf v}-{\bf w},{\bf0})$, so $ \min_{{\bf v}\not= {\bf0}}
d({\bf v}, {\bf0})\leq d$. $ \Box$

Definition 3.4.8   We say that $ C$ is $ e$-error correcting if the following property always holds: given any $ {\bf w}\in F^n$ whose Hamming distance from some $ {\bf c}\in C$ is $ \leq e$ then any $ {\bf c'}\in C$, $ {\bf c}\not= {\bf c'}$, must satisfy $ d({\bf w},{\bf c'})>e$ (in other words, a codeword at most distance $ e$ from $ {\bf w}$ is unique if $ {\bf w}$ has at most $ e$ errors).

Remark 3.4.9   This definition means that if you have sent off a code word $ {\bf c}\in C$ and know the receiver received a vector $ {\bf v}\in F^n$ which has $ e$ errors or less (i.e., $ d({\bf c},{\bf v})\leq e$) then the code word closest to $ {\bf v}$ (in the sense of the Hamming distance) is $ {\bf c}$. In this case, to decode the received vector $ {\bf v}$, all the receiver has to do to determine $ c$ is search $ C$ (which is a finite set) and find a code word $ {\bf c'}$ which is as close as possible to $ {\bf v}$. This strategy is called the nearest neighbor algorithm. If $ C$ is $ e$-error correcting and $ {\bf v}$ has no more than $ e$ errors then $ {\bf c'}={\bf c}$. If $ C$ is ``too large'' (where the definition of ``large'' is a function of the speed of your computer and your level of patience) then this algorithm is too slow for practical purposes.

The following theorem of Shannon is fundamental to the theory of error correcting codes.

Theorem 3.4.10 (fundamental theorem of information theory)   Consider a binary symmetric channel with $ p>1/2$. Let $ \epsilon>0$ and $ \delta>0$ be given. For all sufficiently large $ n$, there is a code $ C\subset \mathbb{F}_2^n$ with information rate $ R$ satisfying $ cap(p)-\epsilon <R<cap(p)$, such that the nearest neighbor algorithm decoding has average probability of incorrect decoding less that $ \delta$.

Shannon's theorem guarantees us that ``very good'' codes (in the sense that they are close to being best possible) exist. They may not be linear and even if they are, the theorem does not suggest that they are practical (i.e., ``fast'' encoding and decoding algorithms exist). However, we shall very briefly discuss in a later chapter ``low density parity check codes'', which can be both ``very good'' and practical. The proof of Shanon's theorem is not easy and goes beyond the scope of this book. For a proof and further discussion, see Ash [Ash], §3.5. Figure 3.1 graphs the capacity for a binary code.
Figure 3.1: The capacity function for binary codes.



\includegraphics[height=5cm,width=5cm]{/home/wdj/applied_algebra/gv_bound.eps}


Subsections
next up previous contents index
Next: -codes and error correction Up: Error-correcting codes Previous: Basic results on codes   Contents   Index
David Joyner 2002-08-23