Greg Coy1

## Contents

in A long standing problem has been to develop good" binary linear block codes, , to be used for error-correction. The length of the block is denoted and the dimension of the code is denoted . So in this notation is a -dimensional subspace.

Another important parameter is the smallest weight of any non-zero codeword, . This is related to error-correction because can correct errors.

We will construct an interesting family of codes which have the property that has a limit and has a limit . We will also investigate how this family violates the fake Goppa conjecture." Moreover, we can show that the Goppa conjecture is incompatible with a conjecture on hyperelliptic curves over finite fields, which we call the small content conjecture."

# Introduction

A linear code, , of length is a subspace of for some prime power and some . The addition on is inherited from the usual coordinate-wise addition on . Let be a linear code of length over . If then the code is called binary. We assume that has been given the standard basis , , ..., . The dimension of is denoted , so the number of elements of is equal to . A subset of without an addition operation is called a non-linear code.

Another important parameter associated to the code is the number of errors which it can, in principle, correct. For this notion, we need to introduce the Hamming metric. For any two , let denote the number of coordinates where these two vectors differ:

 (1)

This is called the Hamming distance or Hamming metric. Define the weight of to be the number of non-zero entries of . Note, because the vector has non-zero entries only at locations where and differ. The smallest distance between distinct codewords in a linear code is the minimum distance of :

 (2)

A linear code of length and dimension over has a basis of vectors of length . If those vectors are arranged as rows of a matrix , then we call the matrix a generator matrix for .

Example 1   The following matrix is a generator matrix for a code of length , dimension , and minimum distance over .

Definition 1   A linear code of length is a cyclic code if whenever is a codeword then so is its cyclic shift .

Example 2   Consider the binary code with generator matrix

Clearly these four rows , , , are obtained from the previous row by a shift to the right. Also note the shift of to the right is equal to . The shift of to the right is . And the shift of is . The shift of is . Therefore, the linear code generated by is invariatant under shifts to the right. Therefore is a cyclic code.

Cyclic codewords are conveniently represented as polynomials modulo . In fact, if then let

denote the associated codeword polynomial. In this notation the cyclic shift of corresponds to the polynomial . In other words cyclic shifts correspond to multiplication by . Since cyclic shifts leave cyclic codes invariant, multiplication by any power of modulo corresponds to a codeword in . Since is a linear code, the sum of any two such codeword polynomials is another codeword polynomial. Therefore, in fact, the product of any codeword polynomial times any polynomial in modulo is another codeword polynomial.

Denote by the ring of polynomials with coefficients in modulo :

 (3)

Define an ideal of to be any subset of closed under addition and multiplication by an arbitrary element of :
• If then , and
• If and then .
In other words an ideal in is simply a subset closed under addition and multiplication by an arbitrary polynomial modulo . In particular, the collection of codeword polynomials associated to a cyclic code is an ideal of .

Lemma 1   There is a natural one-to-one correspondence between cyclic codes of length over and ideals of .

This can be found in any book on coding theory, for example MacWilliams and Sloane [MS].

In fact the algebraic software package GUAVA allows you to easily pass back and forth between codewords as vectors and codewords as polynomials.

In order to define the generator polynomial of a cyclic code we need the following mathematical fact.

Lemma 2   Every ideal of is of the form . In other words every element of is a multiple of for some polynomial in .

Ideals of the form are called principal ideals, and is called a generator of the ideal .

Proof Suppose not. Let be a non-zero element in of smallest degree. Pick an arbitrary non-zero element in . By the division algorithm, we can write where and are polynomials and either or the degree of is strictly less than the degree of . Notice that belongs to by definition. This contradicts the assumption that has smallest degree unless . Therefore every element of is a multiple of . Take .

Definition 2   Let be a cyclic code of length . Let be the ideal corresponding to by Lemma 1. We call a generator polynomial of if it is a generator of .

Example 3   We continue with Example 2. Let . This is the codeword polynomial associated to the top row of the generator matrix. is the generator polynomial of the cyclic code . Note that .

More generally we have the following result:

Lemma 3   The generator polynomial always divides .

Proof By the division algorithm for some quotient and some remainder of degree less than . In this means that . So corresponds to a codeword since is the generator polynomial. This is impossible since has minimal degree of any codeword in unless .

Lemma 4   If the generator polynomial of a cyclic code has degree then .

Proof To prove this it suffices to show that . Note that

has elements.

Let be a prime. We denote by the subset of non-zero elements in the field . For any finite field , is a cyclic group. A generator of that cyclic group is a called a primitive element of the field .

We define to be a quadratic residue if is a square in the finite field . Let denote the subset of quadratic residues and denote the subset of non-quadratic residues.

A simple observation is that is a subgroup of , in particular a quadratic residue times another quadratic residue is a quadratic residue. Similarly, a quadratic residue times a quadratic non-residue is a quadratic non-residue. In particular we can represent the quotient using two elements, the identity and a fixed quadratic non-residue:

where denotes a fixed quadratic non-residue.

Definition 3   The Legendre symbol is defined by

Observe that by the above discussion the Legendre symbol is a multiplicative character on .

Lemma 5   .

Proof Follows from the above observation by a simple counting argument.

Now let us define the family of quadratic residue codes.

Definition 4   Let be a prime such that is a quadratic residue (by quadratic reciprocity, we know ). Choose an integer such that is divisible by and let denote a primitive element of . Let . Let

Define to be the cyclic code of length with generator polynomial and define to be the cyclic code of length with generator polynomial . These define the quadratic residue codes, or QR codes.

Note that these generator polynomials, and both divide . By the lemma above .

Example 4   This is the generator matrix for the QR code of length .

For any binary code of length let denote the extended code defined by

For example if is the code above then belongs to since belongs to .

Recall that ([MS],§1.9), the check matrix of is

Lemma 6   If then . If then and .

One of the most remarkable facts about quadratic residue codes is the following theorem.

Theorem 1 (Gleason-Prange)   If then the automorphism group of contains a group isomorphic to .

Recall that the automorphism group of a binary code of length is the group of all permutations of the coordinates which send codewords to codewords.

We intend to extend the Gleason-Prange theorem to a new class of codes studied here (called QQR codes").

These are some observations on the interesting paper by Bazzi and Mitter [BM].

If , let . Let be the quadratic residue character, which is on the set quadratic residues in , on the set non-quadratic residues, and is 0 on . Let and denotes the polynomial

where . (Note that , where is the set of elements , for . In particular, since is a subgroup, if and only if if and only if (by the quadratic reciprocity law). Moreover, if then .)

Define the QQR code as

where are as above. These are binary codes of length and dimension .

Lemma 7 (Bazzi-Mitter [BM], Proposition 3.3)   Assume and are non-quadratic residues mod (i.e. ).

If is a nonzero codeword of the binary code then the weight of this codeword can be expressed in terms of a character sum as

if is even, and

if is odd.

In fact, looking carefully at their proof, one finds the following result:

Lemma 8   Let be a nonzero codeword of .
(a)
If is even

(b)
If is odd and then the weight is

(c)
If and odd both hold then

Proof If then we claim

 (4)

where . In general it is true that . Indeed,

where counts the , such that . Now

so (4) is true.

Let , then we have

Let

Case 1. If is even and then so odd implies that is even, since 0 is not included in or . Likewise, even implies that is odd. Therefore .

Case 2. If is even and then parity parity . If is even then and if is odd then .

Case 3. is odd. We claim that . (Proof: Let and . Then , which is obviously a contradiction. Therefore , so Replace by to prove the claim.) Also note that

has elements. So

iff is even and

iff is odd.

Conclusion.

From which the lemma follows.

Remark 1
• even: The is equal to plus the number of -rational points on the (smooth projective model of the) hyperelliptic curve . In other words,

• odd: The is equal to plus the number of -rational points on the (smooth projective model of the) hyperelliptic curve . In other words,

• The genus of the curve , when is even, is . Therefore, Weil's estimate gives in this case

which is trivial if .

Remarks on QQR Codes: These codes have very interesting but unexplained behavior. The conjecture of Bazzi-Mitter that they (or a subsequence of them) form a good family'' of codes (where the rate, relative minimum distance pair stays bounded away from the axes and ) seems not to be consistent with the above (very limited) data on values, except possibly in the case .

We now introduce a new code, constructed similarly to the Quasi-Quadratic Residue Codes discussed earlier:

where, for prime and ,

Observe that this code is non-linear with respect to the usual coordinate-wise addition.

For any , let

and let

If denotes the symmetric difference between and then it is easy to check that

 (5)

We know that

by Lemma 8. If then

 (6)

We have the following trivial estimates:

and

therefore

This implies the minimum non-zero weight of satisfies , when .

We now compute the size of . We now prove the claim: if then the map that sends to the codeword is injective. This implies . Suppose not, then there are two subsets that are mapped to the same codeword. Subtracting, the subset satisfies . If is even then . This forces to be the empty set, so . Now if is odd then similar reasoning implies that is the empty set. Therefore, and or vice versa. This proves the claim.

In case , claim: . Again, suppose there are two subsets that are mapped to the same codeword. Then the subset for which . This implies either or . Therefore, either or .

We have proven the following result.

Theorem 2   The non-linear code has length and minimum non-zero weight . It has size if , and size if .

## Duality in the LQR code

First, a few observations.

For any , let denote the set which is if , and if .

By (5), it suffices to find the minimum non-zero weight of , . We first find a duality'' relation between and , and and .

By (6), it is obvious that if then .

If then , so

If then , , so

These facts together with (5) imply: if then ; if then . Consequently, if then

In particular, if then and this can be more compactly re-expressed as

 (7)

provided .

By (5), we have

 (8)

provided . Now that we know this, we can write, if ,

In particular,

 (9)

provided . This is also a consequence of (5) and (8). (So now we have at least three proofs of this fact!)

## Linear version of the LQR code

Let us denote the map by and its inverse (which only exists if ) by .

When , define addition'' on by

 (10)

for arbitrary subsets of . It is easy to check that this operation is well-defined in case .

The surprising fact is the following result.

Lemma 9   In case , (10) is well-defined.

Proof Assume . Recall from the above discussion that is a 2-to-1 map from the set of subsets of to the codewords of . For all , there is a and (for ) if and only if or . We know . This implies does not depend on the choice in , made to compute the right-hand side of (10).

This operation is commutative, since the symmetric difference operation is symmetric, and it is associative since is associative. Each element is the inverse of itself and the element is the identity.

Therefore, is a vector space over with the operation . When , the set of codewords , a singleton subset of , form a basis. When , the set of codewords , a singleton subset of , are linearly dependent (their sum is 0), so do not form a basis. However, if you just omit one (any one - pick your least favorite), you get a basis.

Define the metric as follows. For , let

Of course, the Hamming metric, denoted to be unambiguous, satisfies .

We make a few remarks comparing to . If is written , for polynomials , then

and

By definition of ,

On the other hand, if , ,

and

The difference between and can now by computed using Lemma 8.

We assert that, with this notion of addition, the -Hamming metric'' on is the more natural'' one to use.

This and the previous section prove the following result.

Theorem 3   The linear code has length and minimum -distance . It has size if , and size if .

## The fake Singleton bound

Here is an example from Amin Shokrollahi [S].

This example will construct, in a naive way, a set of vectors, call it , in , and an addition in , such that is a binary vector space under , and such that the weight of is always at least (this construction gets you all the way to the fake Singleton" bound). Here it is.

Let denote the -dimensional vector in which all entries are . Let . For and let . Under this addition" becomes a binary vector space. The weight of any element in is , so the -distance of any two elements in , defined as the weight of their -addition, is . This is a code of length , dimension , and minimum -distance , so the fake singleton bound is almost attained. In particular, we have asymptotic rate and relatively minimum -distance .

# Gilbert-Varshamov

## Background

The volume of a Hamming sphere of radius in is

The binary version of the Gilbert-Varshamov bound asserts that there is a code (possibly non-linear) of length and minimum distance which has at least elements [HP]. A well-known conjecture of Goppa asserts that the binary version of the Gilbert-Varshamov bound is asymptotically exact ([JV],[G]).

Conjecture 1 (Fake Goppa Conjecture)   There is no infinite family of codes , with addition not the usual coordinatewise addition and distance as above, for which the binary version of the -Gilbert-Varshamov bound is asymptotically exact.

Here is sometimes referred to as the entropy of the code.

The Gilbert-Varshamov bound implies that there is a code with relative minimum distance and rate . However our code above yields and .

Therefore we have constructed a linear binary code which beats the -Gilbert-Varshamov bound, thereby disproving the fake Goppa's conjecture.

Goppa's conjecture was investigated recently by Jiang and Vardy [JV], where they proved that there is a code (possibly non-linear) of sufficiently large" length and minimum distance which has at least elements, where is a positive constant which they do not determine. Our result is somewhat different because we show that if then the rate can be taken larger than that predicted either by analogs of the Gilbert-Varshamov bound or the Jiang-Vardy bound.

## Connection with hyperelliptic curves

A hyperelliptic curve over is a polynomial equation of the form where is a polynomial with coefficients in with distinct roots. We will consider the hyperelliptic curve defined by where is defined in section 3. We will let denote the cardinality of the set of all satisfying plus the number of points at infinity on . We define the content of to be .

Conjecture 2 (Small content conjecture for split hyperelliptics)

For an infinite number of primes for which , the content of is less than .

The goal of this section is to prove the following result which relates the theory of error-correcting codes to the theory of hyperelliptic curves over finite fields.

Theorem 4   If the small content conjecture is true then Goppa's conjecture is false.

Proof Recall Goppa's conjecture is that the binary Gilbert-Varshamov is best possible for any family of binary codes. The GV bound states that the rate is greater than or equal to , where . According to Goppa's conjecture if then the best possible is . This means that the minimum distance of our long quadratic residue code with rate satisfies . Recall that the weight of a codeword in this LQR code is the weight of the 4-tuple of polynomials . Let us assume is even for simplicity. Then by Lemma 8 and remark 1 we know that the . Now suppose is odd and . Then By lemma 8 and remark 1 we know that the . By the small content conjecture, . Therefore . This contradicts the estimate above.

Acknowledgements: I thank Prof. Amin Shokrollahi of the Ecole Polytechnique Federale de Lausanne in Switzerland for his helpful remarks.

## Bibliography

BM
L. Bazzi and S. Mitter, Some constructions of codes from group actions, preprint, 2001.

Gap
The GAP Group, GAP - Groups, Algorithms, and Programming, Version 4.4; 2002, (http://www.gap-system.org).

G
V. D. Goppa, Bounds for codes, Dokl. Acad. Nauk. 333(1993)423.

GU
GUAVA home http://www.gap-system.org/Packages/guava.html

HP
W. C. Huffman and V. Pless, Fundamentals of error-correcting codes, Cambridge Univ. Press, 2003.

HV
T. Helleseth and J. F. Voloch, Double circulant quadratic residue codes, IEEE Transactions in Information Theory, to appear, available on the webpage
http://www.ma.utexas.edu/users/voloch/preprint.html

JV
T. Jiang and A. Vardy, Asymptotic improvement of the Gilbert-Varshamov bound on the size of binary codes, IEEE Trans Info Theory, to appear.

M
Jason McGowan, Implementing generalized Reed-Solomon codes and a cyclic code decoder in GUAVA, USNA Honors Thesis 2004-2005.
http://www.usna.edu/Users/math/wdj/_files/documents/mcgowan/index.html

MS
F. MacWilliams and N. Sloane, The theory of error-correcting codes, North-Holland, 1977.

S
A. Shokrollahi, email communication, 2-2006.

vLM
J. van Lint, F. J. MacWilliams, Generalized quadratic residue codes, Proc. IEEE Trans Info Theory 24(1978)730-737.

W
H. N. Ward, Quadratic residue codes and symplectic groups, J. Algebra 29(1974)150-171.