The goal of the project is to examine a polynomial-time probabilistic algorithm to compute the minimum distance of an arbitrary, non-binary, linear error-correcting code. Only the binary case was done in A. Foster's honors project [F]. For the non-binary case a new function was needed in the GAP kernel. This function was added by Steve Linton of St. Andrews, Scotland, one of GAP's kernel maintainers, in August 2004. The programming shall be done in the GAP coding-theory package GUAVA. GAP is a computer algebra package whose open source kernel is written in the C programming language [GAP]. However, most packages (such as GUAVA) and the algorithms described here are written in GAP's own interpreted language. Jointly, with my advisor, I extended Foster's work (Algorithm 4) to the non-binary case but with a new feature making it faster (Algorithm 5). The paper ends with applications to the non-binary McEliece public key cryptosystem (see §4) and trace codes (see §3)
Acknowledgement: The GAP examples were performed on an AMD-64 linux PC with 2G of RAM. I am grateful to the Director of Research Professor Reza Malek-Madani for making this available.
Let
be a finite field with
elements.
We call a subset of
a (block) code.
A linear code is a subspace of the vector space
. The length of the code is the integer
.
A generator matrix of a linear code can be any matrix whose
rows form a basis for the subspace.
Let
be a linear code of length
over
with
generator matrix
, where
is a power of a prime
. In this case if
is a row vector in
, then
is a codeword in C. Moreover, all codewords in C
arise from some message vector
this way. This defines
the encoder from the message space
to the code C.
If
, then the code is called binary. We
assume that
has been given the standard basis
,
, ...,
.
The dimension of
is denoted
, so the number of
elements of
is equal to
. The quantity
is called
the rate of
the code and measures the amount of information that the code can
transmit.
Two codes are equivalent to each other if one can be obtained from the other by a combination of permutations of coordinate positions of the code and multiplication of the symbols appearing in a fixed position by a non-zero scalar. More precisely, two codes are equivalent if the generator matrix of one can be derived from the other using operations of the following types:
Please see [HILL] or MacWilliams and Sloane [MS] for a proof of this fact. A code with generator matrix in the form above is said to be in standard form.
Another important parameter associated to the code is the number of errors
which it can, in principle, correct. For this notion, let us introduce
the concept of distance between codewords.
For any two
,
let
denote the number of coordinates where these
two vectors differ:
| (1) |
If
denotes the smallest distance between distinct
codewords in a linear code
,then there exist
and
such that
.
If
is any matrix with coefficents in
,
then let
denote the minimum distance of the linear code spanned
by the rows of
.
| (2) |
Note, if
is equivalent to
, then
. In general,
this parameter is known to be very difficult to
efficiently determine [CHA]. In fact, computing it in general
is known to be NP-complete
[BMT], [VAR].
Let
be a linear code. We say
is
error-correcting
provided for any vector
in
there is exactly one codeword
for which
.
Proof:
Let
be a codeword. Let
denote a ball of radius
centered at
. Since
is
error correcting, any two such
balls must be disjoint. Therefore, the distance between their centers cannot
be less than or equal to
. This proves ``the only if'' part. For the
``if'' part of the proof, note that
implies that any two such
balls are disjoint. Suppose that
is not
error-correcting, in other
words, the ball of radius
centered at
contains distinct
codewords
and
. By the triangle inequality for the
Hamming metric,
contains
.
This contradicts the fact that they must be disjoint.
Notation: Let
denote a field extension of the finite field
.
First we must explain how each element of
may be represented as
an invertible matrix with entries in
.
Let
denote a generator of the cyclic group
. Let
denote the minimal polynomial of
(the lowest degree monic polynomial which has
as a root). Take the
matrix associated to
, denoted
, to be the companion matrix of
(namely the characteristic polynomial of
is
). If the degree of
is
, then
is an
matrix with coefficients in
(and the degree of
is
). If
denotes any other non-zero element, then we can write
, for some
(because
is a cyclic group). Take the matrix associated to
to be
. The matrix associated to
will be the
zero matrix.
We define the trace of
to be the trace of the
corresponding matrix. Denote this trace by
. Let
denote a linear code over
. The trace code of
is defined to be
The naive method for calculating the minimum distance is to examine
the weight of every possible linear combination of rows in
.
Input: A code
of length n containing
elements over
.
Output: The minimum distance of
.
In the above algorithm
is not required to be linear.
For linear codes the following algorithm is much faster.
By ``proper linear combination'', we mean to exclude the case where all coefficients are equal to 0.
Though this is still an exponential time algorithm problem, the
vectors being compared are of length
rather than length
.
Additionally, this algorithm works for all
Galois fields of order
. (We remark that step 2 in the above
algorithm was missing from the implementation of
MinimumDistance in the last release of GUAVA. This bug is fixed in version 2.0.)
The probabilistic algorithm used to reduce this to polynomial time was initially proposed by J. S. Leon [LEO] and later modified by F. Chabaud [CHA]. One version of the algorithm given in Chabaud's paper is as follows:
| (3) |
However, sometimes step (3) is impossible because Gaussian elimination does not
always produce a matrix in standard form. For example, take any code with
generator matrix in row reduced echelon form which is not of the form
.
Here is a faster version of Leon's algorithm.
Notation as in Algorithm 3.
Chabaud [CHA] and Barg [BARG] have discussed finding optimal values
for
,
, and
.
The previous two algorithms were discussed in A. Foster's thesis [F], although only implemented in GUAVA for binary codes. The next algorithm is original to this thesis and forms the basis of our application to non-binary McEliece public key cryptosystems. This application is discussed in section 4.
Examples of the above algorithm are given in a later section.
The next algorithm is used in this project as well as Lennon's project [L] for testing purposes.
Input: A code
of length n containing
elements over
, an
integer
, a word
Output: The list of all codewords within Hamming Distance
from
We remark that if
then the algorithm still works but may return
an empty list.
Input: A linear code
of length
over
where
is an
extension field of
Output: The Trace Code of
In the example below we compute the minimum distance of several randomly
chosen Trace codes where
is the field with
elements and
is the field with
elements.
gap> F:=GF(4);
GF(2^2)
gap> FF:=GF(16);
GF(2^4)
gap> Read("/home/irons/gapfiles/TraceCode.gap");
gap> C:=RandomLinearCode(15,4,F);
a [15,4,?] randomly generated code over GF(4)
gap> TC:=TraceCode(C,F);time;
a linear [15,4,1..8]6..11 user defined unrestricted code over GF(4)
126
gap> MinimumDistance(C);
7
gap> MinimumDistance(TC);
7
gap> C:=RandomLinearCode(15,4,FF);
a [15,4,?] randomly generated code over GF(16)
gap> TC:=TraceCode(C,F);time;
a linear [15,8,1..4]4..7 user defined unrestricted code over GF(4)
31061
gap> MinimumDistance(C);
10
gap> MinimumDistance(TC);
4
gap> C:=RandomLinearCode(15,4,FF);
a [15,4,?] randomly generated code over GF(16)
gap> TC:=TraceCode(C,F);time;
a linear [15,8,1..6]4..7 user defined unrestricted code over GF(4)
31542
gap> MinimumDistance(C);
10
gap> MinimumDistance(TC);
5
gap> C:=RandomLinearCode(15,4,FF);
a [15,4,?] randomly generated code over GF(16)
gap> TC:=TraceCode(C,F);time;
a linear [15,8,1..4]4..7 user defined unrestricted code over GF(4)
33973
gap> MinimumDistance(C);
10
gap> MinimumDistance(TC);
4
We follow section 1.2 of [CHA].
Here is a brief description of this cryptosystem. Alice is sending a message to Bob.
Eve is trying to read Alice's message. Given is a
non-binary
code with generator matrix
and also given is a decoding algorithm that
corrects up to
errors. Input parameters are an invertible
matrix
and an
permutation matrix
. The triplet
will be the secret key and the pair
will be the public
key. Everyone knows the public key, but Alice and Bob are the only two who know
the secret key. For Alice to transmit a
bit message
, she sends the cipher text
, where
is a random vector of weight at most
. To
decipher the cipher text Bob decodes
using the given algorithm.
To decrypt an encoded message, consider the matrix
The GAP code below illustrates how to use the MinimumDistanceRandom command described in Algorithm 5 to break the McEliece cryptosystem cipher.
gap> F:=GF(16); GF(2^4) gap> C := ReedSolomonCode( 15, 10 ); a cyclic [15,6,10]7..9 Reed-Solomon code over GF(16) gap> G:=GeneratorMat(C);; gap> g:=Random(SymmetricGroup(15));; gap> v:=ShallowCopy(NullWord( 15, F)); [ 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2) ] gap> v[1]:=One(F);; gap> v[2]:=One(F);; gap> v[3]:=One(F);; gap> v[4]:=One(F);; gap> v[5]:=One(F);; gap> e:=Permuted(v,g);; gap> c0:=Random(C);; gap> w:=VectorCodeword(c0+e);; gap> Gnew:=Concatenation(G,[w]);; gap> Cnew:=GeneratorMatCode(Gnew,F); time; a linear [15,7,1..9]6..8 code defined by generator matrix over GF(16) 49109 gap> MinimumDistanceRandom(Cnew,10,13);time; This is a probabilistic algorithm which may return the wrong answer. [ 5, [ 0 0 0 0 1 0 0 1 0 0 1 1 0 0 1 ] ] 158361 gap> e; [ 0*Z(2), 0*Z(2), 0*Z(2), 0*Z(2), Z(2)^0, 0*Z(2), 0*Z(2), Z(2)^0, 0*Z(2), 0*Z(2), Z(2)^0, Z(2)^0, 0*Z(2), 0*Z(2), Z(2)^0 ]Notice the error vector
BruteForceMindist:=function(C)
local n,c,w, mindist, minword, Cwords;
Cwords:=Elements(C);
n:=WordLength(C);
mindist:=n; #initialize
minword:=Cwords[1]; #initialize
for c in Cwords do
w:=WeightCodeword(c);
if w >0 and w < mindist then
mindist:=w;
minword:=c;
fi;
od;
return [mindist,minword];
end;
NearestNeighborDecodeword:=function(r,C,dist)
local Cwords,NearbyWords,c;
Cwords:=Elements(C);
NearbyWords:=[];
for c in Cwords do
if WeightCodeword(r-c) < dist then
NearbyWords:=Concatenation(NearbyWords,[c]);
fi;
od;
return NearbyWords;
end;
C:=NordstromRobinsonCode( );
a (16,256,6)4 Nordstrom-Robinson code over GF(2)
r:=Codeword("1100000000000000");
[ 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ]
r in C;
false
NearestNeighborDecodeword(r,C,3);
NearestNeighborDecodeword(r,C,4);
NearestNeighborDecodeword(r,C,5);
CodeLength:= function (C)
return WordLength(C);
end;
TraceCode:= function (C,F)
#C is defined over an extension of F
#F is a base field
local FF, TC, TrC, i, n, c;
n:= CodeLength(C);
FF:= LeftActingDomain(C);
TC:= List(C,c->Codeword(List([1..n],i->Trace(FF,F,c[i]))));
TrC:=ElementsCode(TC, F);
IsLinearCode(TrC);
return TrC;
end;
###################################################
## MinimumDistanceRandom( <C>, <num>, <s> )
##
## This is a simpler version than Leon's method, which does not put G in st form.
## (this works welland is in some cases faster than the st form one)
## Input: C is a linear code
## num is an integer >0 which represents the number of iterations
## s is an integer between 1 and n which represents the columns considered
## in the algorithm.
## Output: an integer >= min dist(C), and hopefully equal to it!
## a codework of that weight
##
## Algorithm: randomly permute the columns of the gen mat G of C
## by a permutation rho - call new mat Gp
## break Gp into (A,B), where A is kxs and B is kx(n-s)
## compute code C_A generated by rows of A
## find min weight codeword c_A of C_A and w_A=wt(c_A)
## using AClosestVectorCombinationsMatFFEVecFFECoords
## extend c_A to a corresponding codeword c_p in C_Gp
## return c=rho^(-1)(c_p) and wt=wt(c_p)=wt(c)
##
MinimumDistanceRandom:= function(C,num,s)
local A,HasZeroRow,majority,G0, Gp, Gpt, Gt, L, k, n, i, j, m, dimMat, J, d1,M,
arrayd1, Combo, rows, row, rowSum, G, F, zero, AClosestVec, B, ZZ, p, numrow0,
bigwtrow,bigwtvec, x, d, v, ds, vecs,rho,perms,g,newv,pos,rowcombos,v1,v2;
# returns the estimated distance, and corresponding vector of that weight
Print("\n This is a probabilistic algorithm which may return the wrong answer.\n");
G0 := GeneratorMat(C);
p:=5; #this seems to be an optimal value
# it's the max number of rows used in Z to find a small
# codewd in C_Z
G := List(G0,ShallowCopy);
F:=LeftActingDomain(C);
dimMat := DimensionsMat(G);
n:=dimMat[2];
k:=dimMat[1];
if n=k then
C!.lowerBoundMinimumDistance := 1;
C!.upperBoundMinimumDistance := 1;
return 1;
fi; # added 11-2004
if s > n-1 then
Print("Resetting s to ",n-2," ... \n");
s:=n-k;
fi;
arrayd1:=[n];
numrow0:=0; # initialize
perms:=[]; # initialize
for m in [1..num] do
##Permute the columns of C randomly
Gt := TransposedMat(G);
Gp := NullMat(n,k);
L := SymmetricGroup(n);
rho := Random(L);
L:=List([1..n],i->OnPoints(i,rho));
for i in [1..n] do
Gp[i] := Gt[L[i]];
od;
Gp := TransposedMat(Gp);
Gp := List(Gp,ShallowCopy);
##generate the matrix A from Gp=(A|B)
Gpt := TransposedMat(Gp);
A := NullMat(s,k);
for i in [1..s] do
A[i] := Gpt[i];
od;
A := TransposedMat(A);
##generate the matrix B from Gp=(A|B)
Gpt := TransposedMat(Gp);
B := NullMat(n-s,k);
for i in [s+1..n] do
B[i-s] := Gpt[i];
od;
B := TransposedMat(B);
zero := Zero(F)*A[1];
if (s<n-k and A=Zero(F)*A) then
Error("This method fails for these parameters. Try increasing s.\n");
fi;
if (s=n and A=Zero(F)*A) then
return 1;
fi;
## search for all rows of weight p
## J is the list of all triples representing the codeword in C which
## corresponds to AClosestVec[1].
J := []; #col number of codewords to compute the length of
for i in [1..p] do
AClosestVec:=AClosestVectorCombinationsMatFFEVecFFECoords(A, F, zero, i, 1);
### AClosestVec[1]=ZZ*AClosestVec[2]...
v1:=AClosestVec[2]*Gp;
v2:=Permuted(v1,(rho)^(-1));
Add(J,[WeightVecFFE(v2),Codeword(v2,n,F),AClosestVec[2]]);
od;
ds:=List(J,x->x[1]);
vecs:=List(J,x->x[2]);
rowcombos:=List(J,x->x[3]);
d:=Minimum(ds);
i:=Position(ds,d);
arrayd1[m]:=[d,vecs[i],rowcombos[i]];
perms[m]:=rho;
od; ## m
ds:=List(arrayd1,x->x[1]);
vecs:=List(arrayd1,x->x[2]);
rowcombos:=List(arrayd1,x->x[3]);
d:=MostCommonInList(ds);
pos:=Position(ds,d);
v:=vecs[pos];
L:=List([1..n],i->OnPoints(i,perms[pos]^(-1)));
newv:=Codeword(List(L,i->v[L[i]]));
return([d,newv]);
end;
This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.70)
Copyright © 1993, 1994, 1995, 1996,
Nikos Drakos,
Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999,
Ross Moore,
Mathematics Department, Macquarie University, Sydney.
The command line arguments were:
latex2html -t 'W. Irons Math Honors Thesis 2004-2005' -split 0 irons_mathhonors2004-2005.tex
The translation was initiated by David Joyner on 2005-04-22