INSTITUTE OF STATISTICS
BOX 5457
STATE COLLEGE STATION
RALEIGH, NORTH CAROLINA
UNIVERSITY OF NORTH CAROLINA
Department of Statistics
Chapel Hill, N. C.
AN APPLICATION OF CONVEX SETS TO THE CONSTRUCTION OF
ERROR CORRECTING CODES AND FACTORIAL DESIGNS
by
R. C. Burton
June 1964
This research was supported by the United States Air Force
Office of Scientific Research of the Office of Aerospace
Research under Grant No. AF-AFOSR-84-63. Reproduction in
whole or in part is permitted for any purpose of the
United States Government.
Institute of Statistics
Mimeo Series No. [illegible]
TABLE OF CONTENTS

CHAPTER                                                          PAGE

      ACKNOWLEDGMENTS                                              iv
      PREFACE                                                       v
  I   INTRODUCTION                                                  1
      1. The Coding Problem                                         1
      2. Group Codes                                                5
      3. Factorial Designs                                          8
      4. The Parity Check Matrix                                   10
      5. The Fisher-Hamming Codes and Finite
         Projective Geometries                                     12
      6. Error Detection and Correction, Probability
         Considerations                                            21
      7. The Modular Representation                                24
      8. Definitions of Optimality                                 29
 II   CONVEX SETS AND ERROR CORRECTING CODES                       35
      1. Terminology and Basic Theorems                            35
      2. The Inverse of C for General q                            38
      3. The Solution to the Linear Programming Problem            44
      4. Lattice Points and the Geometry of $T_n$                  47
      5. The Extreme Points of $T_n$ for $n < qd/(q-1)$            51
      6. The Generalized MacDonald Codes                           62
      7. $T_n$ for $n > qd/(q-1)$                                  65
III   OPTIMUM CODES AND BOUNDS FOR $n > qd/(q-1)$                  72
      1. Codes with $n = (qd+1)/(q-1)$                             72
      2. Codes with $n = (qd+m-1)/(q-1)$                           79
      3. Existence Proofs                                          88
      4. Optimality Proofs                                        104
      BIBLIOGRAPHY                                                122
ACKNOWLEDGMENTS
The author wishes to express his gratitude to all those who contributed to this research.
I am deeply indebted to my advisor,
Professor R. C. Bose, who helped me when I needed help, and who
patiently let me sweat things out for myself when I could.
I would
like to thank Professor Roy Kuebler and the other members of my committee for their unselfish work.
Professor Dale Mesner has graciously
discussed parts of the research with me, and has contributed a number
of helpful suggestions.
I am indebted to a number of institutions.
First of all, to the
University of North Carolina and its Departments of Statistics and
Mathematics.
Then to the National Bureau of Standards, who got me
started on this problem while I was still an undergraduate.
It was a wonderful place to work.
Finally I am sincerely grateful to the
United States Government and its taxpayers for the support I received
as a student.
I would like to thank
Mrs. Doris Gardner for her expert typing of
this research, and to thank Miss Martha Jordan and Mrs. Betty Donaghy
for help and encouragement.
Last, but above all, I would like to thank Janice Manwaring Burton.
She has been my help-meet, my friend, and the light of my life.
PREFACE¹
This thesis is concerned primarily with the optimal coding of
information, that is with the construction of artificial languages
to be used by digital computers and communication equipment.
We have all had the experience of correcting typographical
errors and realize that whereas even a few errors may hopelessly
garble a very short message, the same proportion of errors in a long
message can almost always be corrected.
It was discovered by Shannon in 1948 [1]² that it is possible
to do this using nothing more sophisticated than table look up, and
without depending on any peculiarly human understanding.
Since long
messages are required, a prohibitive amount of table look up may be
needed in order to correct the errors.
In order to cut down on the amount of table look up, a great
deal of work has been done since 1948 on artificial languages which
have a group structure.
It is known that group languages, or group
codes, exist with efficiency approaching the Shannon limit. (The
Shannon limit is discussed in Chapter I.) However, except for trivial
cases, no constructive method for obtaining such codes is known.
1.
This research was supported by the United States Air Force through
the Air Force Office of Scientific Research of the Office of
Aerospace Research under Grant No. AF-AFOSR-84-63. Reproduction
in whole or in part is permitted for any purpose of the United
States Government.
2.
The numbers in square brackets refer to the bibliography listed
at the end.
A large number of codes which are quite good and which can be
put to practical use have been found.
Work on group codes also
finds application in the construction of factorial designs for scientific experimentation.
Additional general discussion of the problem will be found in
the first part of Chapter I.
The remainder of this preface contains
a summary of the results in this thesis.
In Chapter I we formulate a number of related mathematical
problems which lie at the basis of the construction of error correcting codes and factorial designs.
One of these can be stated as follows.
Let EG(n,q) be the vector space of all n-tuples over the
Galois field GF(q).
If $\mathbf{x}$ belongs to EG(n,q) let $w(\mathbf{x})$ be the number
of non-zero elements of the vector $\mathbf{x}$; this number is called the
weight of $\mathbf{x}$.
If $V_k$ is a k-dimensional subspace of EG(n,q), let $w(V_k)$
be defined to be the weight of the lightest non-zero vector in $V_k$.
If GF(q), n, and a positive integer d are given, we would like to
find a subspace $V_k$ such that $w(V_k) \geq d$, for the largest possible k.
In Section 1.5 (which is section 5 of Chapter I) the classical
Fisher-Hamming solution of the above problem in case d = 3, or d = 4
and q = 2, is given.
The finite projective geometries which are used
throughout the remainder of the thesis are introduced here; indeed
they seem almost inevitable since the null vector plays a special
role and proportional vectors must be regarded as equivalent.
It is proved (Theorem 1.7) that the solution for d = 4, q = 2 is unique,
and generalizations of this theorem play an important role in Chapter
III.
Section 1.6 contains a brief account of decoding methods and
probability considerations, and in Section 1.7 we describe the modular representation which is fundamental to the rest of the thesis.
Let G be a k×n matrix whose rows are a basis for a k-dimensional
subspace $V_k$ of EG(n,q).
$V_k$ is called a code, and the vectors in $V_k$
are called code words.
A column of G can be multiplied by an arbitrary
non-zero element of GF(q) without affecting either word lengths
or the probability of error; therefore the columns of G may be regarded
as coordinates of the points of the projective geometry PG(k-1,q).
Suppose the points of PG(k-1,q) are taken in some arbitrary, but
fixed, order and denoted
$$P_1, P_2, \ldots, P_s, \qquad s = (q^k - 1)/(q - 1).$$
Then except for permutation of columns, which affects neither word
lengths nor probabilities, G may be specified by a vector
$$\mathbf{m} = (m_1, m_2, \ldots, m_s)',$$
where $m_j$ is the number of times that $P_j$ appears as a column of G.
The vector $\mathbf{m}$ is called the modular representation vector.
Let $\mathbf{v}$ be a vector of $V_k$. Then $\mathbf{v} = \mathbf{c}G$ where
$\mathbf{c}$ is a k-vector.
If $a \neq 0$ is a member of GF(q) then $a\mathbf{v} = a\mathbf{c}G$ has
the same weight as $\mathbf{v}$.
In this way we associate the non-zero multiples of $\mathbf{v}$ with the
hyperplane of PG(k-1,q) whose equation is
$$\mathbf{c} \cdot \mathbf{x} = 0.$$
Suppose the hyperplanes of PG(k-1,q) are taken in some arbitrary,
but fixed, order and denoted
$$F_1, F_2, \ldots, F_s, \qquad s = (q^k - 1)/(q - 1).$$
Then the weights of the vectors in $V_k$ may be specified by a vector
$$\mathbf{w} = (w_1, w_2, \ldots, w_s)',$$
where $w_i$ is the weight of the vectors which correspond to the i-th
hyperplane.
It is shown that this formulation leads to the set of
equations
$$C\mathbf{m} = \mathbf{w},$$
where the (i,j)-th element of C is 1 if the i-th hyperplane does not
contain the j-th point, and otherwise is zero.
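As a small numerical illustration of the equations $C\mathbf{m} = \mathbf{w}$, the following Python sketch builds $C$, $\mathbf{m}$ and $\mathbf{w}$ for one ternary code with k = 3. The generator matrix is the one used in Example 1.1 of Chapter I. The helper names (normalize, points, dot), the choice of the representative whose first non-zero coordinate is 1, and the use of one and the same ordering for points and hyperplanes are assumptions of this sketch, not notation from the thesis. It assumes q is prime.

```python
from itertools import product

def normalize(v, q):
    """Scale a non-null vector over GF(q), q prime, so its first non-zero coordinate is 1."""
    lead = next(a for a in v if a % q)
    inv = pow(lead, q - 2, q)          # inverse of lead in GF(q) by Fermat's little theorem
    return tuple((inv * a) % q for a in v)

def points(k, q):
    """One representative vector for each point of PG(k-1, q)."""
    return [v for v in product(range(q), repeat=k) if any(v) and normalize(v, q) == v]

def dot(a, b, q):
    """Dot product over GF(q), q prime."""
    return sum(x * y for x, y in zip(a, b)) % q

q, k = 3, 3
G = [(1, 0, 0, 0, 2, 2),
     (0, 1, 0, 0, 2, 1),
     (0, 0, 1, 2, 1, 2)]

P = points(k, q)      # the same list also indexes the hyperplanes c . x = 0

# m[j] counts how often the point P[j] occurs (up to a scalar) as a column of G.
m = [0] * len(P)
for column in zip(*G):
    m[P.index(normalize(column, q))] += 1

# C[i][j] = 1 when the i-th hyperplane does not contain the j-th point.
C = [[1 if dot(c, p, q) else 0 for p in P] for c in P]
Cm = [sum(cij * mj for cij, mj in zip(row, m)) for row in C]   # ordinary integer arithmetic

# w[i] is the weight of the code word cG attached to the i-th hyperplane.
codewords = [tuple(dot(c, col, q) for col in zip(*G)) for c in P]
w = [sum(1 for a in word if a) for word in codewords]

print(Cm == w, min(w), sum(m))
```

The product $C\mathbf{m}$ is formed over the integers, not over GF(q), since the entries of $\mathbf{m}$ and $\mathbf{w}$ are counts.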
The above equations suggest a set of linear inequalities
$$C\mathbf{m} \geq \mathbf{d}, \qquad \mathbf{d} = (d, d, \ldots, d)'.$$
Then any vector $\mathbf{m}$ of non-negative integers satisfying the above
inequalities corresponds to a vector space $V_k$ having the property
that all its non-null vectors are of weight $\geq d$.
Given GF(q), k, and d, we may seek to minimize
$$n = \sum_{i=1}^{s} m_i$$
subject to the inequalities $C\mathbf{m} \geq \mathbf{d}$, and the requirement that the $m_i$
be non-negative integers.
In Chapter II we seek to solve the integral linear programming
problem.
It is not possible to obtain practically useful results
by computation since the number of equations rapidly becomes too
large.
However the solution set of the linear inequalities, not
insisting on integral solutions, has a simple interpretation in
the real geometry of s dimensions which can be used to prove and
suggest theorems about PG(k-1,q).
Conversely the geometry of
PG(k-1,q) can be used to obtain information about the solution set
in an arbitrary number of dimensions.
For example, the general inverse of the matrix C can be easily
found using the theory of PG(k-1,q), and it can be shown that
$\mathbf{m} = C^{-1}\mathbf{d}$ is the unique solution of the linear programming problem
if $\mathbf{m}$ is integral.
This solution, which amounts to choosing every
point of PG(k-1,q) to be a column of the generator matrix, is well
known and has been discovered in a variety of ways.
If $\mathbf{m} = C^{-1}\mathbf{d}$, then $n = \sum_i m_i = (qd - 1)/(q - 1)$ whether or not
the $m_i$ are integral. If they are not, then it is shown that no
non-negative integral solution of the inequalities is possible if
$n < qd/(q - 1)$.
For $n = qd/(q - 1)$ non-negative integral solutions may exist.
If they do, then the corresponding codes are obtained
by including as columns of the generator matrix the coordinates of
all points in PG(k-1,q) which do not lie on an arbitrary u-flat
(u = 0, 1, 2, ..., k-2).
These codes are a generalization of the codes found in the case
q = 2 by MacDonald [17].
MacDonald found that if the columns of
the matrix C are taken in a special order, then an optimum code is
obtained by omitting the first $2^{u+1} - 1$ columns.
These columns correspond
to the points on a particular u-flat in PG(k-1,q).
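The relation $n = qd/(q - 1)$ for codes obtained by leaving out the points of a u-flat can be checked by brute force for small parameters. The sketch below is only an illustration under stated assumptions: q is prime, the function names are hypothetical, and the u-flat is taken to be the one on which the first k-u-1 coordinates vanish, which is one arbitrary choice among many.

```python
from itertools import product

def points(k, q):
    """Representatives (first non-zero coordinate 1) of the points of PG(k-1, q)."""
    reps = []
    for v in product(range(q), repeat=k):
        lead = next((a for a in v if a), None)
        if lead == 1:
            reps.append(v)
    return reps

def punctured_code_parameters(k, q, u):
    """Length n and minimum weight d when the points of one u-flat are left out."""
    pts = points(k, q)
    # The u-flat used here is x_1 = ... = x_{k-u-1} = 0 (an arbitrary choice).
    kept = [p for p in pts if any(p[:k - u - 1])]
    n = len(kept)
    # Weight attached to the hyperplane c . x = 0: number of kept points not on it.
    d = min(sum(1 for p in kept if sum(a * b for a, b in zip(c, p)) % q)
            for c in pts)
    return n, d

for k, q, u in [(3, 2, 0), (3, 3, 0), (3, 3, 1), (4, 2, 1)]:
    n, d = punctured_code_parameters(k, q, u)
    print(k, q, u, n, d, n * (q - 1) == q * d)
```

Each printed line ends with True when the length and minimum weight satisfy $n(q - 1) = qd$.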
If non-negative integral solutions to the inequalities $C\mathbf{m} \geq \mathbf{d}$
do not exist on the (real) hyperplane whose equation is
$$n = \sum_i m_i = qd/(q - 1),$$
then they may exist on the hyperplane
$$n = \sum_i m_i = (qd + 1)/(q - 1).$$
If so, then the corresponding codes are obtained by including as
columns of the generator matrix the coordinates of all points in
PG(k-1,q) which do not lie on either of two arbitrary disjoint flats.
These codes are all new unless one of the flats is a single point,
in which case they were found by MacDonald by omitting the first
$2^{u+1}$ columns of the matrix C.
Continuing in this way we find that a series of optimum codes
may be obtained for $n = (qd + h - 1)/(q - 1)$, where h is a non-negative
integer.
In terms of the geometry PG(k-1,q) these codes are obtained
by including as columns of the generator matrix the coordinates of
all points which do not lie on any of h disjoint flats.
For h > 2
these flats must be chosen in an optimum way, but a simple rule is
given for so choosing them.
This rule places an upper bound on h.
The advantage of the linear programming approach is that it
provides a more or less automatic way of obtaining new codes and
proving their optimality.
However, since we are interested in values
of the parameters which are too large for calculation, all results
practically have to be translated into theorems about PG(k-1,q).
Once this has been done, it is not only possible but extremely convenient for expository purposes to omit most of the details of the
linear programming procedure.
In Chapter II we give all the details
for h = 0 and 1, and indicate by examples how we first obtained the
codes for $h \geq 2$.
In Chapter III we prove the theorems relating to
the codes with $h \geq 2$ within the framework of PG(k-1,q).
It is important to establish simple necessary and sufficient
conditions for the existence of the disjoint flats.
This is done
by means of the following theorem which is of general interest for
finite projective geometry.
Theorem.
If F is a set of points in PG(k-1,q) which has a non-empty
intersection with every v-flat, then the number of points in F is
greater than or equal to $(q^{k-v} - 1)/(q - 1)$.
Equality holds if, and
only if, F is a (k - v - 1)-flat.
Variations of this theorem are also used to establish a tightened
form of the Plotkin bound [29], which establishes the optimality
of the codes.
This bound is considerably tighter than previously
published bounds for certain values of the parameters.
CHAPTER I
INTRODUCTION
1.1 The Coding Problem
Suppose we have a device, called a channel, which transmits information.
It is a q-ary channel, which means that the information
is written with an alphabet of q symbols, which we might denote by
0, 1, 2, ..., q-1.
A sequence of n symbols is called a code word
or a code vector.
If q=3 and n=6, then
$$\mathbf{x} = (2\ 0\ 1\ 1\ 2\ 2)$$
is an example of a code word.
A collection of code words is called a code.
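Definition 1.1.
If $\mathbf{x}$ and $\mathbf{y}$ are code vectors, then $d(\mathbf{x}, \mathbf{y})$ is
the number of coordinates in which $\mathbf{x}$ and $\mathbf{y}$ differ;
$d(\mathbf{x}, \mathbf{y})$ is called the distance between $\mathbf{x}$ and $\mathbf{y}$.
For example, if
$$\mathbf{x} = (2\ 0\ 1\ 1\ 2\ 2)$$
and
$$\mathbf{y} = (2\ 1\ 1\ 1\ 1\ 2)$$
then $d(\mathbf{x}, \mathbf{y}) = 2$.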
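It is easy to show that $d(\mathbf{x}, \mathbf{y})$ satisfies the abstract properties
of a distance function, that is, $d(\mathbf{x}, \mathbf{y}) \geq 0$ and equality holds if
and only if $\mathbf{x} = \mathbf{y}$, $d(\mathbf{x}, \mathbf{y}) = d(\mathbf{y}, \mathbf{x})$, and
$d(\mathbf{x}, \mathbf{z}) \leq d(\mathbf{x}, \mathbf{y}) + d(\mathbf{y}, \mathbf{z})$.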
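The distance of Definition 1.1 is straightforward to compute. The following Python sketch reproduces the example above and spot-checks the three properties of a distance function; the function name distance and the extra vector z are assumptions introduced only for illustration.

```python
def distance(x, y):
    """Number of coordinates in which the code vectors x and y differ."""
    if len(x) != len(y):
        raise ValueError("code vectors must have the same length")
    return sum(1 for a, b in zip(x, y) if a != b)

x = (2, 0, 1, 1, 2, 2)
y = (2, 1, 1, 1, 1, 2)
z = (0, 0, 0, 0, 0, 0)   # extra vector, used only to exercise the triangle inequality

print(distance(x, y))                                    # the example of Definition 1.1
print(distance(x, x) == 0)                               # zero distance from a vector to itself
print(distance(x, y) == distance(y, x))                  # symmetry
print(distance(x, z) <= distance(x, y) + distance(y, z)) # triangle inequality
```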
Suppose we have a code in which every two code words are a distance at least
d apart.
Then if fewer than d transmission errors
are made, no code word can be transformed into any other.
Hence we
will always be able to detect the error.
One problem then is to maximize the minimum distance d between
any two code words, given the length n of the individual code words
and the number N of code words which are required.
It is known
from the Varshamov-Gilbert lower bound [26] that for large values
of N there exist much better codes than we know how to construct,
except for very special values of d.
Now suppose that instead of just detecting errors we want to
automatically correct them.
One way of doing this, which practically
suggests itself, is contained in the following:
Decision Rule.
If the vector $\mathbf{y}$ is received, choose a code vector $\mathbf{x}$
which minimizes $d(\mathbf{x}, \mathbf{y})$.
Adopt some arbitrary rule for breaking
ties in case there is more than one vector $\mathbf{x}$ such that $d(\mathbf{x}, \mathbf{y})$ is
a minimum.
It is easy to see that if the number of coordinates changed in
transmission is less than d/2, then this decision rule will always
result in choosing the correct code vector.
If some of the code
vectors are more than a distance d apart, then the decision rule
would in certain cases correct errors consisting of more than d/2
changes.
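The decision rule can be sketched in a few lines of Python. This is a minimal illustration, not an efficient decoder: it compares the received vector with every code vector. The tie-breaking rule (keep the earliest code vector in the list) and the binary repetition code used as data are assumptions chosen for the example.

```python
def distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def decode(received, code):
    """Return a code vector at minimum distance from `received`.

    Ties are broken in favour of the vector listed first in `code`,
    because Python's min returns the first minimal element.
    """
    return min(code, key=lambda x: distance(x, received))

# Binary repetition code of length 5: minimum distance d = 5.
code = [(0, 0, 0, 0, 0), (1, 1, 1, 1, 1)]

received = (1, 0, 1, 1, 0)       # (1 1 1 1 1) with two coordinates changed
print(decode(received, code))
```

Since two changes is fewer than d/2 = 2.5, the rule recovers the vector that was sent.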
It is known that the decision rule maximizes the conditional
probability of choosing the correct code vector $\mathbf{x}$ given the
received vector $\mathbf{y}$ if the following three conditions are met:
(a) All the code vectors are equally likely to be sent.
(b) The channel is symmetric in that regardless of which of the
q symbols is sent, there is a fixed probability
P of its being correctly transmitted, and if a symbol is
changed, it is equally likely to become any one of the
other (q-1) symbols, with probability Q where Q < P and
P + (q-1)Q = 1.
(c) The channel is memoryless in the sense that the
probabilities of different coordinates being in error are
independent.
When we have occasion to discuss probabilities we will assume,
unless explicitly stated otherwise, that the three conditions are
met, and that the above decision rule is to be used.
Suppose we have a given channel.
Then for a particular code,
there is an over-all probability P of code vectors being received
in a correctable form.
In principle P may be calculated as follows.
Let $\mathbf{x}$ be any particular code vector.
Then if $\mathbf{y}$ is any vector
there is a specified probability $\Pr(\mathbf{y} \mid \mathbf{x})$ that if $\mathbf{x}$ is sent,
$\mathbf{y}$ will be received.
Now for a certain set of vectors $S(\mathbf{x})$, our
decision rule will give us back $\mathbf{x}$.
Hence
$$p(\mathbf{x}) = \sum_{\mathbf{y} \in S(\mathbf{x})} \Pr(\mathbf{y} \mid \mathbf{x})$$
is the probability of being able to correct $\mathbf{x}$,
and
$$P = \frac{1}{N} \sum_{\mathbf{x}} p(\mathbf{x}),$$
where summation is over all N vectors $\mathbf{x}$ in the code.
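For very small codes the calculation just described can be carried out by direct enumeration. The sketch below assumes conditions (a), (b) and (c), uses the decision rule with ties broken in favour of the earliest code vector, and relies on the fact that the sets S(x) then partition the set of all received vectors. The function name and the repetition code are assumptions made for illustration.

```python
from itertools import product

def distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def probability_correct(code, q, P):
    """Over-all probability that a code vector is received in a correctable form.

    P is the probability that a single symbol is transmitted correctly;
    each of the other q - 1 symbols then has probability (1 - P) / (q - 1).
    """
    Q = (1.0 - P) / (q - 1)
    n = len(code[0])
    total = 0.0
    for y in product(range(q), repeat=n):
        x = min(code, key=lambda c: distance(c, y))   # y belongs to S(x)
        errors = distance(x, y)
        total += P ** (n - errors) * Q ** errors      # Pr(y | x) for a memoryless channel
    return total / len(code)

repetition = [(0, 0, 0), (1, 1, 1)]
print(probability_correct(repetition, 2, 0.9))
```

For the binary repetition code of length 3 this agrees with the closed form $P^3 + 3P^2Q$, the probability of at most one error.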
A second problem then is to maximize the probability P of
being able to correct errors, given a channel, the word length n,
and the number N of words required.
As with the maxi-min distance
problem previously described, asymptotic bounds prove the existence
of much better codes than have yet been constructed, except for
special values of the parameters.
These two problems of maximizing d
and maximizing P are of course related, and we will discuss
some of the interrelations later.
With regard to code construction, let us consider the nature
of the asymptotic bounds.
For simplicity, assume that N is some
power of q, say $N = q^k$.
Then k/n is called the rate of transmission and
$$C = 1 + P \log_q P + (q-1)\, Q \log_q Q$$
is called the channel capacity.
C has the following functional form:
[Figure: graph of the channel capacity C; axis labels not recoverable.]
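The capacity formula is easy to evaluate numerically. The following sketch assumes the symmetric channel of condition (b), with Q determined by P + (q-1)Q = 1, and adopts the usual convention that a term with probability zero contributes nothing; the function name is an assumption.

```python
import math

def capacity(q, P):
    """Capacity of the symmetric q-ary channel, in q-ary digits per symbol.

    P is the probability that a symbol is transmitted correctly.
    """
    Q = (1.0 - P) / (q - 1)
    total = 1.0
    for prob, count in ((P, 1), (Q, q - 1)):
        if prob > 0:                       # convention: 0 * log 0 = 0
            total += count * prob * math.log(prob, q)
    return total

for P in (1.0, 0.9, 0.5):
    print(P, capacity(2, P))
```

The capacity equals 1 for a noiseless channel and falls to 0 when P = Q = 1/q, where the output carries no information about the input.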
Theorem 1.1 (Shannon).
If R < C and $\epsilon > 0$, then for sufficiently
large n there exists a code having rate of transmission $\geq R$ for
which $P > 1 - \epsilon$.
The main idea of Shannon's proof is that if k and n are
sufficiently large, and if k/n < C, then a code which is chosen at
random is likely to have $P > 1 - \epsilon$.
By choosing a code at random
we mean that every coordinate of the $q^k$ n-vectors is chosen by
performing an independent random experiment.
However, there is a difficulty in using random codes.
Since
these codes have no structure, it is actually necessary to compare a
received vector $\mathbf{y}$ with all the code vectors in order to detect or
correct errors.
In order to be efficient, the codes must be long, and
this comparison becomes a serious matter.
In the next section we will introduce the class of group codes
or linear codes, which have received a great deal of attention because they obey the same asymptotic bounds as the random codes but
require only a small fraction of the labor to correct errors.
The asymptotic bounds have been proved for much more general
channels than we are considering.
Shannon considered more general
channels in his original paper [28] and his results have been made
rigorous and extended by Feinstein [13], Khinchin [22], Wolfowitz [32],
and others.
1.2 Group Codes
If q is a prime number or the power of a prime, then there
exists a finite field with q elements, the so-called Galois field,
which we shall denote by GF(q).
In the case where q is a prime,
the elements of the field are just the residue classes modulo q,
and denoting these classes by 0, 1, 2, ..., q-1, the field
operations are equivalent to ordinary addition and multiplication followed
by reduction modulo q.
For example, if q=3 the addition and multiplication tables are
as follows:

     + | 0 1 2          × | 0 1 2
     --+------          --+------
     0 | 0 1 2          0 | 0 0 0
     1 | 1 2 0          1 | 0 1 2
     2 | 2 0 1          2 | 0 2 1

In case q is not prime, the theory is more complicated (cf. Chapter
6 of Peterson [26]).
But for the present we need only the fact that
the field axioms are satisfied.
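For a prime q the tables above can be generated mechanically by reduction modulo q. The sketch below does this for q = 3; the function name is an assumption, and the construction is not valid when q is a proper prime power.

```python
def field_tables(q):
    """Addition and multiplication tables of GF(q) for a prime q.

    Entry [a][b] of each table is a + b (respectively a * b) reduced modulo q.
    """
    add = [[(a + b) % q for b in range(q)] for a in range(q)]
    mul = [[(a * b) % q for b in range(q)] for a in range(q)]
    return add, mul

add3, mul3 = field_tables(3)
for row in add3:
    print(*row)
print()
for row in mul3:
    print(*row)
```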
From now on, unless otherwise stated, we will consider code
vectors as
n-vectors over
GF(q).
Since nearly all the algebraic
and vector operations in this thesis will be over some Galois field
GF(q), we will use the ordinary notation for addition and multiplication for operations in
GF(q), and specially indicate when operations
are over the real field.
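Theorem 1.2.
Let $w(\mathbf{x})$ be the number of non-zero coordinates in $\mathbf{x}$.
Then if $\mathbf{x}$ and $\mathbf{y}$ are code vectors,
$d(\mathbf{x}, \mathbf{y}) = w(\mathbf{x} - \mathbf{y})$.
This theorem is trivially true for vectors over any field, finite
or infinite.
For example, let q=3, $\mathbf{x} = (2\ 0\ 1\ 1\ 2\ 2)$,
and $\mathbf{y} = (2\ 1\ 1\ 1\ 1\ 2)$.
Then $\mathbf{x} - \mathbf{y} = (0\ 2\ 0\ 0\ 1\ 0)$,
$w(\mathbf{x} - \mathbf{y}) = 2 = d(\mathbf{x}, \mathbf{y})$.
$w(\mathbf{x})$ has been called the weight of a vector $\mathbf{x}$.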
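The identity of Theorem 1.2 can be confirmed on the example and, for short vectors, by exhaustive search. The sketch below works over GF(3); the function names are assumptions, and the subtraction is valid only for prime q.

```python
from itertools import product

def weight(x):
    """Number of non-zero coordinates of x."""
    return sum(1 for a in x if a != 0)

def distance(x, y):
    """Number of coordinates in which x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

def subtract(x, y, q):
    """Coordinate-wise difference over GF(q), q prime."""
    return tuple((a - b) % q for a, b in zip(x, y))

q = 3
x = (2, 0, 1, 1, 2, 2)
y = (2, 1, 1, 1, 1, 2)
difference = subtract(x, y, q)
print(difference, weight(difference), distance(x, y))

# Exhaustive check of d(u, v) = w(u - v) for all pairs of 3-vectors over GF(3).
vectors = list(product(range(q), repeat=3))
identity_holds = all(distance(u, v) == weight(subtract(u, v, q))
                     for u in vectors for v in vectors)
print(identity_holds)
```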
Definition 1.2.
Let EG(n, q) be the vector space of n-tuples over
GF(q).
If V is a subspace of rank k of EG(n, q), V is called an
(n, k) linear code or an (n, k) group code.
w(V) is defined to be
the weight of the lightest non-null vector in V.
Example 1.1.
q = 3, n = 6, k = 3.
The rows of
$$G = \begin{pmatrix} 1 & 0 & 0 & 0 & 2 & 2 \\ 0 & 1 & 0 & 0 & 2 & 1 \\ 0 & 0 & 1 & 2 & 1 & 2 \end{pmatrix}$$
are the basis of the code.
To each distinct 3-vector $(c_1\ c_2\ c_3)$
corresponds a different code word $(c_1\ c_2\ c_3)G$.
The code words and
the corresponding 3-vectors are given below.

     3-vector    code word            3-vector    code word
     (0 0 0)     0 0 0 0 0 0
     (1 0 0)     1 0 0 0 2 2          (2 0 0)     2 0 0 0 1 1
     (0 1 0)     0 1 0 0 2 1          (0 2 0)     0 2 0 0 1 2
     (1 1 0)     1 1 0 0 1 0          (2 2 0)     2 2 0 0 2 0
     (1 2 0)     1 2 0 0 0 1          (2 1 0)     2 1 0 0 0 2
     (0 0 1)     0 0 1 2 1 2          (0 0 2)     0 0 2 1 2 1
     (1 0 1)     1 0 1 2 0 1          (2 0 2)     2 0 2 1 0 2
     (1 0 2)     1 0 2 1 1 0          (2 0 1)     2 0 1 2 2 0
     (0 1 1)     0 1 1 2 0 0          (0 2 2)     0 2 2 1 0 0
     (0 1 2)     0 1 2 1 1 2          (0 2 1)     0 2 1 2 2 1
     (1 1 1)     1 1 1 2 2 2          (2 2 2)     2 2 2 1 1 1
     (1 1 2)     1 1 2 1 0 1          (2 2 1)     2 2 1 2 0 2
     (1 2 1)     1 2 1 2 1 0          (2 1 2)     2 1 2 1 2 0
     (1 2 2)     1 2 2 1 2 2          (2 1 1)     2 1 1 2 1 1
We see by inspection that w(V) = 3.
We should point out that if q is not prime, there exist
subgroups of vectors which are not subspaces.
However, as indicated
by Definition 1.2, we will follow the usual practice of
considering only those group codes which are also linear codes, and we
will use the two terms interchangeably.
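The code of Example 1.1 is small enough to enumerate by machine. The following sketch lists the 27 code words generated by G over GF(3) and finds the weight of the lightest non-null vector; the function names are assumptions, and the arithmetic assumes q is prime.

```python
from itertools import product

q = 3
G = [(1, 0, 0, 0, 2, 2),
     (0, 1, 0, 0, 2, 1),
     (0, 0, 1, 2, 1, 2)]

def encode(c, G, q):
    """Code word cG over GF(q), q prime, for the k-vector c."""
    n = len(G[0])
    return tuple(sum(ci * row[j] for ci, row in zip(c, G)) % q for j in range(n))

def weight(x):
    """Number of non-zero coordinates of x."""
    return sum(1 for a in x if a != 0)

# Map every 3-vector c to its code word cG.
code = {c: encode(c, G, q) for c in product(range(q), repeat=len(G))}

# w(V): weight of the lightest non-null code vector.
w_V = min(weight(word) for c, word in code.items() if any(c))

print(len(code), w_V)
print(code[(1, 2, 1)])
```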
A k×n matrix G whose rows are a basis for an (n, k) code is
called a generator matrix.
Since there are $q^k$ distinct k-vectors $\mathbf{c}$,
each of which corresponds to a distinct code vector $\mathbf{c}G$, an
(n, k) code has $q^k$ vectors.
It is easy to see that if V is a
linear code, then the minimum distance d between two code vectors
in V is w(V).
It is a great simplification that with group codes, in order
to compute the minimum distance between code vectors, it suffices
to consider the vectors singly rather than in pairs.
We will see
later in this chapter that it is also possible to obtain an explicit
expression for P.
An excellent general reference on group codes is Peterson's
book [26].
1.3 Factorial Designs
Consider an experiment in which we have n factors: temperature,
pressure, etc.
Each factor can be applied at q different levels.
Then a single treatment can be represented as an n-vector
$\mathbf{x} = (x_1\ x_2\ \ldots\ x_n)$ over GF(q) whose i-th coordinate gives the level
of application of the i-th factor.
The total number of such treatments is $q^n$ and it may be
necessary to divide them into blocks containing $q^k$ treatments,
where k < n.
This is done by choosing a subspace V of rank k,
and letting the blocks be the cosets of this subspace.
It has been
shown by Fisher [15] that if w(V) = d, then the main effects of
individual factors, and interactions involving less than d factors,
are not confounded with the block effects.
In some cases it may be possible to run only a fraction of the
total $q^n$ treatments.
If this fraction consists of the $q^k$ treatments
which belong to a subspace V such that w(V) = d, then it is
known [4] that main effects and interactions of order less than d/2
are not confounded with each other.
Hence, as in coding theory, an important problem is, given n
and k, to find a subspace V of EG(n, q) which is of rank k and
has the property that w(V) = d for the largest possible d.
We should point out that while coding theory and factorial design theory share a common mathematical problem, the ranges of parameters which are of practical interest are different.
In design
theory n = 20 factors is a large number whereas in coding theory
there is interest in values of n as large as 300.
In both cases
q=2 is the most practical value and q=3 is next.
In order to avoid saying everything twice, we will speak in the
language of coding theory when deriving results, but we will comment
on possible applications in both disciplines.
A detailed account of
the connection between codes and factorial designs is given in Bose [4].
1.4 The Parity Check Matrix
In section 1.2 we introduced the concept of an (n, k) linear code.
If $\mathbf{x}$ and $\mathbf{y}$ are two n-vectors, then we say that $\mathbf{x}$ and $\mathbf{y}$ are
orthogonal if their dot product $\mathbf{x} \cdot \mathbf{y} = 0$.
We note that it is possible
for a vector to be orthogonal to itself; over GF(3) for example,
(121) · (121) = 1·1 + 2·2 + 1·1 = 0.
If V is a linear code, then the set $V^*$ of vectors which are
orthogonal to every vector in V is also a linear code.
That is, $V^*$ is also a subspace of EG(n, q).
It is of rank r = n-k.
The (n, k) code V and the (n, r) code $V^*$ are called dual
codes.
If G is a generator matrix for V and $G^*$ is a generator
matrix for $V^*$, then $G^* G' = 0$.
In fact, V is the set of vectors
to which $\mathbf{x}$ belongs if, and only if, $G^* \mathbf{x} = 0$.
This is so because
a vector is orthogonal to an entire subspace if it is orthogonal to
any basis of that subspace.
$G^*$ is called a parity check matrix for V, and
G is called a parity check matrix for $V^*$.
Example 1.2.
As in Example 1.1, let
$$G = \begin{pmatrix} 1 & 0 & 0 & 0 & 2 & 2 \\ 0 & 1 & 0 & 0 & 2 & 1 \\ 0 & 0 & 1 & 2 & 1 & 2 \end{pmatrix};$$
then
$$G^* = \begin{pmatrix} 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 2 & 0 & 1 & 0 \\ 1 & 2 & 1 & 0 & 0 & 1 \end{pmatrix}$$
is a parity check matrix for the code generated by G.
This is so
since $G^*$ has 3 linearly independent rows where 3 = n - k, and,
as is easily verified, $G^* G' = 0$ over GF(3).
Note that in the above example G has the form
$$G = [I_k,\ G_1] \qquad (1.4.1)$$
where $I_k$ is the k×k identity matrix and $G_1$ is a k×(n-k) matrix.
Any generator matrix G can be put in form (1.4.1) by elementary
row operations and permutation of columns.
The row operations have
no effect on the code since they only amount to choosing a new basis.
Column permutations have no effect on word lengths and, since we are
assuming a memoryless channel, they do not change the probabilities
of error detection or correction.
Two matrices which can be transformed
into each other by elementary row operations and permutation
of columns are called combinatorially equivalent.
Let $G = [I_k,\ G_1]$ be a generator matrix.
Then
$$G^* = [-G_1',\ I_r] \qquad (1.4.2)$$
is a corresponding parity check matrix, as can be seen from the fact
that both matrices have linearly independent rows, r + k = n, and
$$G^* G' = -G_1' + G_1' = 0.$$
The parity check matrix of Example 1.2 was chosen in this way.
Keeping
in mind the dual codes generated by G and $G^*$ respectively, we will
say that a generator matrix which is in either of the two forms (1.4.1)
or (1.4.2) has been standardized.
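The passage from form (1.4.1) to form (1.4.2) can be written as a short routine. The sketch below builds the parity check matrix from a generator matrix in standard form and verifies that the product with the transpose of G vanishes over GF(3) for the matrix of Example 1.1. The function names are assumptions, and the arithmetic assumes q is prime.

```python
def parity_check_from_standard(G, q):
    """Given G = [I_k, G1] over GF(q), q prime, return the rows of [-G1', I_r]."""
    k, n = len(G), len(G[0])
    r = n - k
    for i in range(k):
        for j in range(k):
            if G[i][j] % q != (1 if i == j else 0):
                raise ValueError("G is not in the form [I_k, G1]")
    G1 = [row[k:] for row in G]
    rows = []
    for i in range(r):
        negated = [(-G1[t][i]) % q for t in range(k)]     # i-th row of -G1'
        identity = [1 if j == i else 0 for j in range(r)]
        rows.append(tuple(negated + identity))
    return rows

def times_transpose(A, B, q):
    """Matrix product A B' over GF(q), q prime."""
    return [[sum(a * b for a, b in zip(ra, rb)) % q for rb in B] for ra in A]

G = [(1, 0, 0, 0, 2, 2),
     (0, 1, 0, 0, 2, 1),
     (0, 0, 1, 2, 1, 2)]

G_star = parity_check_from_standard(G, 3)
for row in G_star:
    print(*row)
print(times_transpose(G_star, G, 3))
```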
One of the fundamental theorems of code and factorial design construction is the following, first stated by Bose [3] with regard to
factorial designs.
Theorem 1.3.
Let V be a linear code with parity check matrix $G^*$.
Then for each code word of weight w there is a linear dependence
relation between w columns of $G^*$.
Conversely, for each dependence
relation between w columns, there is a code word of weight w.
The proof follows from the fact that $\mathbf{x}$ belongs to V if, and
only if, $G^* \mathbf{x} = 0$.
Hence the equation $w(\mathbf{x}) = w$ means that w
columns of $G^*$ are linearly dependent.
The following corollary is
an immediate consequence.
Corollary.
A linear code with parity check matrix $G^*$ has minimum
weight (and hence minimum distance) at least d if, and only if,
every subset of d-1 columns of $G^*$ is linearly independent.
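The corollary suggests a direct, if slow, way to read the minimum distance off a parity check matrix: search for the smallest set of linearly dependent columns. The sketch below does this by brute force over GF(p) for a prime p. The function names are assumptions, and the search is exponential in the number of columns, so it is suitable only for small examples such as the parity check matrix of Example 1.2.

```python
from itertools import combinations

def rank_mod_p(vectors, p):
    """Rank over GF(p), p prime, of a collection of equal-length vectors."""
    rows = [list(v) for v in vectors]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col] % p), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], p - 2, p)       # inverse by Fermat's little theorem
        rows[rank] = [(inv * a) % p for a in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col] % p:
                f = rows[i][col]
                rows[i] = [(a - f * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def min_distance_from_parity_check(H, p):
    """Size of the smallest linearly dependent set of columns of H over GF(p)."""
    columns = list(zip(*H))
    for size in range(1, len(columns) + 1):
        for subset in combinations(columns, size):
            if rank_mod_p(subset, p) < size:
                return size
    return None   # all columns independent: the code contains only the null vector

H = [(0, 0, 1, 1, 0, 0),
     (1, 1, 2, 0, 1, 0),
     (1, 2, 1, 0, 0, 1)]
print(min_distance_from_parity_check(H, 3))
```

For this matrix the search agrees with the value w(V) = 3 found by inspection in Example 1.1.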
1.5 The Fisher-Hamming Codes and Finite Projective Geometries
As an application of the previous theorem, we will give the
codes first found by Fisher [15], [16] as factorial designs and
later rediscovered by Hamming [20] as codes.
Let $G^*$ be an r×n parity check matrix having as columns all
the non-null r-vectors over GF(2).
Then $n = 2^r - 1$.
For example, if r=3, then
$$G^* = \begin{pmatrix} 0 & 0 & 0 & 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 0 & 0 & 1 & 1 \\ 1 & 0 & 1 & 0 & 1 & 0 & 1 \end{pmatrix} \qquad (1.5.1)$$
is such a matrix.
Since 0 and 1 are the only two scalars in GF(2), no two
columns of $G^*$ are dependent.
On the other hand, there are sets of
three dependent columns.
Hence the code determined by $G^*$ has minimum
distance d=3.
It is quite clear that it is not possible to choose more than
$2^r - 1$ r-vectors over GF(2) such that no two are dependent.
Hence if
we define $n_d(r, q)$ to be the maximum number of r-vectors over GF(q)
which can be chosen such that every subset of d r-vectors is
independent, then
$$n_2(r, 2) = 2^r - 1.$$
In case q > 2, the situation is a little more difficult. For
example, the vectors (2 0 1) and (1 0 2) are dependent over
GF(3) since the second is 2 times the first.
Of the (q-1) non-zero
multiples of a non-null vector, only one can be chosen.
If this is
done we have $(q^r - 1)/(q - 1)$ r-vectors, no two of which are dependent.
Hence
$$n_2(r, q) = (q^r - 1)/(q - 1). \qquad (1.5.2)$$
For example, let q = 3, r = 3, and
$$G^* = \begin{pmatrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 1 & 1 & 2 & 1 & 2 & 0 & 0 & 1 & 1 & 2 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 & 2 & 1 & 2 & 1 & 0 & 0 & 1 \end{pmatrix} \qquad (1.5.3)$$
Here we have chosen as columns the $(q^r - 1)/(q - 1)$ non-null vectors
whose first non-zero coordinate is equal to 1.
The vectors may always
be chosen in this way.
Now suppose we have an r×n parity
check matrix $G^*$, where $n = n_{d-1}(r, q)$ and no
d-1 columns of $G^*$ are dependent.
Then $G^*$
determines an (n, k) code having minimum distance d, where
k = n - r.
If any c columns of $G^*$ are deleted, then it is still true that no
d-1 of the remaining columns are dependent.
Hence the resulting
matrix determines an (n', k') code having minimum distance d,
where n' = n - c and k' = n' - r = k - c.
For example, if the
first seven columns of matrix (1.5.3) are deleted, we have the parity
check matrix of Example 1.2, which determines the code of Example 1.1.
We want to show that if $n' > n_{d-1}(r-1, q)$, then a code obtained
in the way described in the preceding paragraph is, in a certain
sense, optimum.
Let $k_d(n, q)$ denote the maximum k such that
there is a subspace V of rank k in EG(n, q) for which w(V) = d.
Such a code may be said to be of maximum size.
By the existence of
the codes described above, we see that if $n_{d-1}(r, q) \geq n'$, then
$k_d(n', q) \geq n' - r$.
If possible, let $k_d(n', q) = n' - r + b$ where b > 0.
Then there exists a parity check matrix of order (r-b)×n'
such that no set of d-1 columns is dependent, and therefore
$n_{d-1}(r-b, q) \geq n'$.
But this contradicts the assumption that
$n' > n_{d-1}(r-1, q)$,
for it is easily seen that $n_{d-1}(r-1, q) \geq n_{d-1}(r-b, q)$
where b > 0.
Hence we have the following theorem.
Theorem 1.4.
If $n_{d-1}(r, q) \geq n > n_{d-1}(r-1, q)$, then $k_d(n, q) = n - r$.
The preceding theorem, which is of fundamental importance,
does not seem to have been explicitly stated until reference [8].
For example, in the case of Hamming codes, we see that there exists
a maximum size code for every n.
It may be obtained by determining
the smallest r such that $n_2(r, q) = (q^r - 1)/(q - 1) \geq n$ and
deleting the proper number of columns from the parity check matrix.
That the n=6 code of Example 1.1 is of maximum size follows from
the fact that $n_2(3, 3) = 13$ and $n_2(2, 3) = 4$.
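The recipe in the last paragraph amounts to a short calculation. The sketch below finds the smallest r with $n_2(r, q) \geq n$ and returns the resulting rank k = n - r given by Theorem 1.4 for d = 3; the function names are assumptions.

```python
def n2(r, q):
    """Number of points of PG(r-1, q): (q^r - 1) / (q - 1)."""
    return (q**r - 1) // (q - 1)

def maximum_size_distance3(n, q):
    """Return (r, k): the smallest r with n2(r, q) >= n, and k = n - r."""
    r = 1
    while n2(r, q) < n:
        r += 1
    return r, n - r

print(n2(2, 3), n2(3, 3))              # the two values quoted for Example 1.1
print(maximum_size_distance3(6, 3))    # the n = 6 ternary code of Example 1.1
print(maximum_size_distance3(7, 2))    # the binary code of matrix (1.5.1)
```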
Without mentioning it, we have practically had to introduce
finite projective geometries.
To each subspace of rank s in the
vector space EG(m, q) there corresponds a unique flat of dimension
s-1 in the projective geometry PG(m-1,q).
Two flats $F_1$ and $F_2$
are said to intersect in a flat $F_3$ provided that the subspaces
corresponding to $F_1$ and $F_2$ have the subspace corresponding to
$F_3$ as their intersection.
Every statement about the projective geometry PG(m-1,q) can
be translated into a statement about the vector space EG(m, q).
Indeed, following Baer [2], we may regard PG(m-1,q) as the
projective geometry of subspaces of EG(m, q).
However, a problem stated
in terms of the projective geometry is often much more natural.
As the simplest example of this, consider that the multiples
of a non-null vector in EG(r, q) are a subspace of rank 1 and
hence correspond to a flat of dimension 0, or simply a point, in
PG(r-1,q).
Any non-null vector in the subspace uniquely determines
(i.e. is a basis for) the subspace.
Correspondingly, any non-null vector
may be considered as the homogeneous coordinates of the point in
the projective geometry, where if $\mathbf{x} = c\mathbf{y}$, $c \neq 0$, we say that $\mathbf{x}$
and $\mathbf{y}$ are coordinates of the same point.
Thus the columns of the
parity check matrix of a Hamming code are just the points of
PG(r-1,q), and the statement that no two columns are dependent
amounts to saying that two different points are not the same point.
In order to keep our terminology clear, a subspace of rank $s$ in EG(m, q) corresponds to a flat of dimension $s-1$, or simply an (s-1)-flat of PG(m-1, q). Points are 0-flats of PG(m-1, q) and vectors are elements of EG(m, q). Points and vectors do not correspond, but to each point correspond $q-1$ proportional non-null vectors, each called the homogeneous coordinates of the point. Since there are $q^s$ vectors in a subspace of rank $s$, there are $(q^s - 1)/(q - 1)$ points on an (s-1)-flat. A set of $s$ points is said to be dependent if they belong to some (s-2)-flat; this means that the homogeneous coordinate vectors of the points belong to some subspace of rank $s-1$, and hence are dependent. Lines and planes are 1-flats and 2-flats respectively. A hyperplane of PG(m-1, q) is an (m-2)-flat.
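The correspondence between points and classes of proportional vectors can be made concrete with a short sketch. It assumes q is prime, so that field elements may be written as the integers 0, ..., q-1, and it picks from each class the vector whose first non-zero coordinate is 1; the function names are illustrative only and do not come from the text.

```python
from itertools import product

def projective_points(r, q):
    """Points of PG(r-1, q), q prime (an assumption of this sketch).

    Each point is represented by the one vector in its class of
    proportional non-null vectors whose first non-zero coordinate is 1.
    """
    points = []
    for v in product(range(q), repeat=r):
        first = next((x for x in v if x != 0), None)  # None for the null vector
        if first == 1:
            points.append(v)
    return points

def num_points_on_flat(s, q):
    """Number of points on an (s-1)-flat, i.e. in a subspace of rank s."""
    return (q ** s - 1) // (q - 1)

pts = projective_points(3, 3)
print(len(pts), num_points_on_flat(3, 3))
```

For r = 3 and q = 3 both counts agree with the 13 points of PG(2, 3).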
Now let us consider the problem of constructing codes having minimum distance $d = 4$. This means that in the parity check matrix no three columns can be dependent. Obviously two columns cannot be multiples of each other. Hence, without loss of generality, we can consider the columns to be points of PG(r-1, q), and our requirement is that no three points be collinear. Bose [3] proved the following theorem:
Theorem 1.5. Suppose $q = 2$ and F is any hyperplane in PG(r-1, 2). Then the $2^{r-1}$ points of the geometry which do not lie on F have the property that no three are collinear. It is impossible to find more than $2^{r-1}$ points having this property, so $n_3(r, 2) = 2^{r-1}$.
For example, we may take the points which do not lie on the hyperplane $x_1 = 0$, where $x_1$ is the first coordinate. If $r = 4$ this gives the parity check matrix

                 1 1 1 1 1 1 1 1
(1.5.4)   G*  =  0 0 0 0 1 1 1 1
                 0 0 1 1 0 0 1 1
                 0 1 0 1 0 1 0 1

By using Theorem 1.4 we can obtain a binary distance 4 code of maximum size for every n.
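The construction of (1.5.4) can be checked mechanically. The sketch below is only an illustration for q = 2: it lists the points with first coordinate 1 and tests that no three of them are dependent, using the fact that three distinct non-null binary vectors are dependent exactly when their sum is the null vector. The function names are hypothetical.

```python
from itertools import combinations

def hamming_d4_columns(r):
    """Points of PG(r-1, 2) off the hyperplane x_1 = 0, in the order of (1.5.4)."""
    cols = []
    for j in range(2 ** (r - 1)):
        tail = tuple((j >> (r - 2 - i)) & 1 for i in range(r - 1))
        cols.append((1,) + tail)
    return cols

def no_three_dependent(cols):
    """True if no three of the given distinct non-null binary columns sum to zero."""
    for a, b, c in combinations(cols, 3):
        if all((x + y + z) % 2 == 0 for x, y, z in zip(a, b, c)):
            return False
    return True

cols = hamming_d4_columns(4)
for row in zip(*cols):          # print the matrix row by row
    print(*row)
print(no_three_dependent(cols))
```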
In succeeding chapters we will make a great deal of use of the following theorem:

Theorem 1.6. Let r(A) denote the rank of a subspace A. Then if S and T are subspaces,
(a) $S \subseteq T$ and r(S) = r(T) imply S = T, and
(b) $r(S) + r(T) = r(S \cap T) + r(S + T)$.

A proof of this theorem is found on page 18 of [2]. By $S \subseteq T$ we mean that S is a subspace of T, $S \cap T$ is the intersection of S and T, and S + T denotes the subspace spanned by S and T together. Theorem 1.6 remains true if the subspaces are considered as flats, and if for r(A) we write dim(A), the dimension of the flat.
A simple consequence of Theorem 1.6 is that a hyperplane and a line not contained in the hyperplane meet in a point. Since in case $q = 2$ there are only 3 points on a line, it follows that if two points which are not on a hyperplane are on a line, then the third point on the line must be on the hyperplane. Hence the set of points which do not lie on some hyperplane has the property that no three are collinear.

Bose completed the proof of Theorem 1.5 with a combinatorial argument which shows that $n_3(r, 2) \le 2^{r-1}$. The following theorem, so far as I know, has not been published before.
Theorem 1.7. If S is a set of $2^{r-1}$ points in PG(r-1, 2) such that no three are collinear, and F is the set of points not in S, then F is a hyperplane.
In proving the theorem, let us adopt the following notation. If a, b, c, ... are points, let the vectors $\mathbf{a}, \mathbf{b}, \mathbf{c}, \ldots$ be respective coordinate vectors. Two points a and b uniquely determine a line which we denote by ab. In case $q = 2$, a, b, and c are on the same line if, and only if, $\mathbf{a} + \mathbf{b} = \mathbf{c}$.

Lemma. If $a_0$ is any point in S, then the remaining points $a_i$ of S, and the points $b_i$ of F, can be numbered in such a way that

    $\mathbf{a}_0 + \mathbf{a}_i = \mathbf{b}_i$    $(i = 1, 2, \ldots, 2^{r-1} - 1)$.

The lemma is proved as follows. By assumption, the third point on the line $a_0 a_i$ is some point $b_i$ of F, and the third point on the line $a_0 a_j$ is some point $b_j$ of F. If $b_i = b_j$ with $i \neq j$, then $a_0$, $a_i$, and $a_j$ are all on the line $a_0 b_i = a_0 b_j$, which contradicts the assumption that no three points of S are collinear. This proves the lemma.
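Now we will show that the Lemma implies that if $b_i$ and $b_j$ are two points of F, and if $a_0$ is any point in S, then $\mathbf{b}_i + \mathbf{b}_j \neq \mathbf{a}_0$. Note that

    $\mathbf{b}_i + \mathbf{b}_j = (\mathbf{a}_0 + \mathbf{a}_i) + (\mathbf{a}_0 + \mathbf{a}_j) = \mathbf{a}_i + \mathbf{a}_j$, since $\mathbf{a}_0 + \mathbf{a}_0 = \mathbf{0}$.

Moreover, $\mathbf{a}_i + \mathbf{a}_j \neq \mathbf{a}_0$, since the three points are not collinear.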
The result of the previous paragraph may be interpreted geometrically as follows: If F contains two points $b_i$ and $b_j$ of any line, then it contains also the remaining point of the line. It is well known that this implies that F is a flat, and from the fact that F contains $2^{r-1} - 1$ points, we know that F is an (r-2)-flat or hyperplane. Hence Theorem 1.7 is proved.

Theorem 1.7 shows that $n_3(r, 2) \le 2^{r-1}$, for no additional point can be added to S and retain the property that no three points in S are collinear. This is proved by the above lemma.
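For small r, Theorem 1.7 can be confirmed by exhaustive search. The following sketch is an illustration only: it represents the points of PG(r-1, 2) as the integers 1, ..., 2^r - 1 (an encoding chosen here, not taken from the text), so that three points a, b, c are collinear exactly when a XOR b equals c.

```python
from itertools import combinations

def caps_with_complements(r):
    """All sets S of 2^(r-1) points of PG(r-1, 2) with no three collinear,
    each paired with the set F of points not in S."""
    points = range(1, 2 ** r)
    found = []
    for S in combinations(points, 2 ** (r - 1)):
        s = set(S)
        # a ^ b is the third point on the line through a and b
        if any((a ^ b) in s for a, b in combinations(S, 2)):
            continue
        found.append((s, set(points) - s))
    return found

def is_hyperplane(F, r):
    """A set of 2^(r-1) - 1 points closed under 'third point on the line'."""
    return (len(F) == 2 ** (r - 1) - 1
            and all((a ^ b) in F for a, b in combinations(F, 2)))

for r in (3, 4):
    caps = caps_with_complements(r)
    print(r, len(caps), all(is_hyperplane(F, r) for _, F in caps))
```

For r = 3 and r = 4 every such set S turns out to be the complement of a hyperplane, and the number of sets found equals the number of hyperplanes.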
A set S of points which has the property that no $d$ are dependent, but if any point is added there will be $d$ dependent points, is called complete.
The problem of finding the maximum number of points such that no three are collinear has never been solved for $q > 2$. It has received a great deal of attention from the Italian geometers as well as from workers in coding and design theory. Theorem 1.7 is probably known by the Italian school. Bose and Srivastava [10] give an interesting bound on $n_3(r-1, q)$ and references to the Italian papers.

Carmichael [12] gives a readable introduction to finite projective geometries and the following formulae which we will need later.
The number of s-flats in a given t-flat ($s \le t$) is

(1.5.5)    $\dfrac{(q^{t+1}-1)(q^{t}-1)\cdots(q^{t+1-s}-1)}{(q^{s+1}-1)(q^{s}-1)\cdots(q-1)}$.

In a PG(r, q), the number of t-flats which contain a given s-flat ($s \le t$) is

(1.5.6)    $\dfrac{(q^{r-s}-1)(q^{r-s-1}-1)\cdots(q^{r-t+1}-1)}{(q^{t-s}-1)(q^{t-s-1}-1)\cdots(q-1)}$.

By specializing (1.5.6) we find that in a PG(k-1, q), the number of (k-2)-flats which contain a given u-flat ($u \le k-2$) is

(1.5.7)    $\dfrac{q^{k-u-1}-1}{q-1}$.
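The counting formulae (1.5.5) to (1.5.7) are all instances of one product, the number of rank-m subspaces of a rank-n vector space over GF(q). A minimal sketch follows; the function names are chosen here for illustration and are not from the text.

```python
def gaussian_binomial(n, m, q):
    """Number of rank-m subspaces of a rank-n vector space over GF(q)."""
    num = den = 1
    for i in range(m):
        num *= q ** (n - i) - 1
        den *= q ** (m - i) - 1
    return num // den

def flats_in_flat(s, t, q):
    """(1.5.5): number of s-flats in a given t-flat."""
    return gaussian_binomial(t + 1, s + 1, q)

def flats_containing(s, t, r, q):
    """(1.5.6): number of t-flats of PG(r, q) containing a given s-flat."""
    return gaussian_binomial(r - s, t - s, q)

def hyperplanes_containing(u, k, q):
    """(1.5.7): number of (k-2)-flats of PG(k-1, q) containing a given u-flat."""
    return (q ** (k - u - 1) - 1) // (q - 1)

print(flats_in_flat(0, 2, 3))          # points of PG(2, 3)
print(flats_containing(0, 1, 2, 3))    # lines through a point of PG(2, 3)
print(hyperplanes_containing(0, 3, 3)) # the same count via (1.5.7)
```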
1.6 Error Detection and Correction, Probability Considerations

Let V be an (n, k) code with an $r \times n$ parity check matrix G*, where $r = n - k$. Since a vector $\mathbf{x}$ belongs to V only in case $G^*\mathbf{x} = \mathbf{0}$, we know that a transmission error has been made whenever $G^*\mathbf{x} \neq \mathbf{0}$ for a received vector $\mathbf{x}$. This is a much better way of detecting errors than to compare a received vector with all $q^k$ code vectors.

If $G^*\mathbf{x} = \mathbf{s}$, then $\mathbf{s}$ is called the syndrome of $\mathbf{x}$. All n-vectors having the same syndrome are a coset of V, and there is a 1-1 correspondence between the $q^r$ cosets of V and the r-vectors $\mathbf{s}$. (This follows from the fundamental homomorphism theorem, cf. Jacobson [21].)
Now suppose we want to correct a received vector $\mathbf{x}$. By our Decision Rule, this means we want to find a code vector $\mathbf{v}$ such that $d(\mathbf{x}, \mathbf{v}) = w(\mathbf{x} - \mathbf{v})$ is a minimum. Now the vectors which can be written in the form $\mathbf{x} - \mathbf{v}$ are just the vectors in the same coset as $\mathbf{x}$. Hence correction can be effected by subtracting from $\mathbf{x}$ the lightest vector in the same coset as $\mathbf{x}$. We summarize with a theorem.
Theorem 1.8. From each coset let a vector of minimum weight be chosen and called the coset leader. Then when a vector $\mathbf{x}$ is received, it can be corrected by computing its syndrome and then subtracting from $\mathbf{x}$ the coset leader corresponding to the syndrome.
If a digital computer is used for correcting, then the syndrome $\mathbf{s}$ may be regarded as a number written to the base q, and the coset leader obtained from location $\mathbf{s}$. This is especially useful if $q = 2$, since most computers will regard a binary vector in this way without any conversion. For special classes of group codes more efficient correction methods exist than the general method of Theorem 1.8.
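The table look-up described above can be sketched for the binary (7, 4) Hamming code. This is an illustration under stated assumptions, not the author's program: the parity check matrix is taken to have the binary expansion of j as its j-th column, and the syndrome is packed into an integer that serves as the table location.

```python
from itertools import product

# Assumed parity check matrix: column j (1-based) is j written in binary,
# so the columns are the 7 points of PG(2, 2).
H = [[(j >> (2 - i)) & 1 for j in range(1, 8)] for i in range(3)]

def syndrome(x):
    """Syndrome H x over GF(2), packed into an integer (a base-2 number)."""
    s = 0
    for row in H:
        s = (s << 1) | (sum(h * xi for h, xi in zip(row, x)) % 2)
    return s

def coset_leaders(n):
    """Table mapping each syndrome to a minimum-weight vector of its coset."""
    table = {}
    # visiting vectors in order of increasing weight makes the first vector
    # seen for a syndrome a lightest vector of that coset
    for x in sorted(product((0, 1), repeat=n), key=sum):
        table.setdefault(syndrome(x), x)
    return table

LEADERS = coset_leaders(7)

def correct(x):
    """Subtract the coset leader found at the location given by the syndrome."""
    leader = LEADERS[syndrome(x)]
    return tuple((xi - li) % 2 for xi, li in zip(x, leader))

received = (1, 1, 1, 0, 1, 0, 0)   # code vector (1,1,1,0,0,0,0) with one error
print(syndrome(received), correct(received))
```

With this choice of H the syndrome of a single error is simply the position of the error, which is why the coset leader can be read directly from location s.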
It is possible for a code vector to be altered by transmission errors so that it is equal to another code vector. The probability of this is

(1.6.1)    $\Pr(\text{undetected error}) = \sum_{i=1}^{n} W(i)\, P^{n-i} Q^{i}$,

where W(i) is the number of code words of weight i, and P and Q are defined in section 1.

There is a similar formula for the probability P of code vectors being received in correctable form. It is

(1.6.2)    $P = \sum_{i=0}^{n} L(i)\, P^{n-i} Q^{i}$,

where, on the right-hand side, P and Q are again the channel probabilities of section 1, and L(i) is the number of coset leaders of weight i. The weight of a coset is defined as the weight of the coset leader.

These formulae are given by Peterson [26] for the binary case, and the proofs are practically unchanged for the channels we are considering. We might mention that Peterson's Q is our P.
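Formulae (1.6.1) and (1.6.2) are easy to evaluate once W(i) and L(i) are known. The sketch below does this by enumeration for the binary (7, 4) Hamming code. It assumes the binary case, with Q the probability that a digit is changed and P = 1 - Q; this is an assumption of the sketch, since the definitions of section 1 are not repeated here.

```python
from fractions import Fraction
from itertools import product

n = 7
# Assumed parity check matrix: column j is the binary expansion of j.
H = [[(j >> (2 - i)) & 1 for j in range(1, 8)] for i in range(3)]

def syndrome(x):
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in H)

W = [0] * (n + 1)       # W[i]: number of code words of weight i
leader_weight = {}      # syndrome -> weight of the coset leader
for x in product((0, 1), repeat=n):
    s, wt = syndrome(x), sum(x)
    if not any(s):
        W[wt] += 1
    leader_weight[s] = min(wt, leader_weight.get(s, n))

L = [0] * (n + 1)       # L[i]: number of coset leaders of weight i
for wt in leader_weight.values():
    L[wt] += 1

def pr_undetected(P, Q):
    """Formula (1.6.1)."""
    return sum(W[i] * P ** (n - i) * Q ** i for i in range(1, n + 1))

def pr_correctable(P, Q):
    """Formula (1.6.2)."""
    return sum(L[i] * P ** (n - i) * Q ** i for i in range(n + 1))

Q = Fraction(1, 10)
P = 1 - Q
print(W, L)
print(pr_undetected(P, Q), pr_correctable(P, Q))
```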
To find the distribution of word weights for a series of codes is often a very difficult matter. Needless to say, so is determination of the weights of the coset leaders.

For example, Peterson [26], pp. 67-69, starts with Theorem 1.3 and develops a generating function for the word weight distribution of the unreduced Hamming codes. By unreduced we mean that $n = n_{d-1}(r, q)$, i.e., no columns have been deleted from the parity check matrix.

The following generalization of Theorem 1.3 provides a possible starting point for determining the weight distribution of the coset leaders.
Theorem 1.9.
If there is an r-vector which requires
m columns
of the parity check matrix to express it as a linear combination,
then the coset for which this r-vector is the syndrome has weight
m.
The proof follows immediately from the fundamental homomorphism
theorem and the discussion at the beginning of this section.
In the special case of the distance-3 Hamming codes, it is much easier to get the coset weights than it is the word weights. First note that since the minimum word weight is 3, two vectors $\mathbf{x}$ and $\mathbf{y}$ of weight 1 cannot be in the same coset. For this would require that $\mathbf{x} - \mathbf{y}$ be in the code, and $1 \le w(\mathbf{x} - \mathbf{y}) \le 2$. Hence all vectors of weight 1 are coset leaders.
Now suppose that all columns deleted from the unreduced parity check matrix belong to some hyperplane F. This is possible since in constructing the distance-3 Hamming codes we do not delete more columns than belong to a single hyperplane.

If this is done then each of the deleted columns can be expressed as a linear combination of two of the remaining columns, that is, every point in F is on a line with two points not in F. For let a be a point in F and b be a point not in F. Then the line ab contains a third point c which cannot be in F, for if F contains two points of a line, it contains the whole line.

Since every r-vector is a scalar multiple of the coordinates of some point in PG(r-1, q), it follows from the above argument that every r-vector can be expressed as a linear combination of two of the remaining columns of the parity check matrix. Hence, by Theorem 1.9, all the remaining cosets are of weight 2.

A code which for some m has all vectors of weight m or less, some of weight m+1, and none of greater weight, as coset leaders is called quasi-perfect. From (1.6.2) and the fact that $P > Q$ it is easy to see that quasi-perfect codes maximize P for a given n and k. We will call codes which do this optimum-P.

The work of this section is due primarily to Slepian [30], who made the fundamental discovery of coset decoding.
1.7 The Modular Representation

Let G be a generator matrix of an (n, k) code V, and let $\mathbf{x} = \mathbf{c}G$ be a code vector. Then if $\mathbf{c}$ is orthogonal to m columns of G, it is obvious that $w(\mathbf{x}) = n - m$. In other words, a code vector $\mathbf{x}$ is of weight w if, and only if, the corresponding k-vector $\mathbf{c}$ is orthogonal to n-w columns of G.
Suppose $\mathbf{g}_j$ is the j-th column of G. What is the effect on V of replacing $\mathbf{g}_j$ by $a\mathbf{g}_j$, where $a$ is a non-zero scalar? Clearly this has no effect on the word weights, for a vector is orthogonal to $\mathbf{g}_j$ if and only if it is orthogonal to $a\mathbf{g}_j$. If we can show that the alteration has no effect on the distribution of coset weights, then by formula (1.6.2) it has no effect on the probability P.

Let $G_a$ denote the altered generator matrix, and let G* be a parity check matrix corresponding to the original generator matrix G. Then if $G^*_a$ is obtained by multiplying the j-th column of G* by $a^{-1}$, it is obvious that $G^*_a G_a' = 0$. Hence, keeping in mind that ranks of matrices are unchanged by elementary column operations, it follows that $G^*_a$ is a parity check matrix corresponding to $G_a$.

Now if an r-vector $\mathbf{s}$ can be expressed as a linear combination of m columns of G*, it can be expressed as a linear combination of the same columns of $G^*_a$, and conversely. Hence, by Theorem 1.9, the weight of the coset corresponding to $\mathbf{s}$ is unchanged. Therefore, replacing a column of G by some multiple of that column has no effect on the probability P. We have proved the following theorem.

Theorem 1.10. Let G be the generator matrix of an (n, k) code V, and let $a_1, a_2, \ldots, a_z$ ($z = (q^k - 1)/(q - 1)$) be the points of PG(k-1, q). Then if $\mathbf{m} = (m_1, m_2, \ldots, m_z)$ is a vector whose i-th element $m_i$ is equal to the number of columns of G which are coordinates of the point $a_i$, $\mathbf{m}$ is called the modular representation vector of V. The word weights of V and the probability P are uniquely determined by $\mathbf{m}$.

Example 1.3
The code of Example 1.1 has generator matrix

         1 0 0 0 2 2
    G =  0 1 0 0 2 1
         0 0 1 2 1 2

If we name the points of PG(2, 3) by letting $a_j$ be the point whose coordinate vector is the j-th column of the matrix

         1 0 1 1 0 1 1 0 0 1 1 1 1
    M =  0 1 1 2 0 0 0 1 1 1 1 2 2
         0 0 0 0 1 1 2 1 2 1 2 1 2

then

           a_1 a_2 a_3 a_4 a_5 a_6 a_7 a_8 a_9 a_10 a_11 a_12 a_13
    m = (   1   1   0   0   2   0   0   0   0   0    1    1    0  )

is the modular representation vector. The vector $\mathbf{m}$, by definition, is over the real field.
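The bookkeeping of Example 1.3 can be sketched as follows. The code assumes q = 3 and takes G and M to be the matrices of the example; each column of G is scaled so that its first non-zero coordinate is 1 and is then looked up among the columns of M. The helper names are illustrative and not from the text.

```python
q = 3
G = [(1, 0, 0, 0, 2, 2),
     (0, 1, 0, 0, 2, 1),
     (0, 0, 1, 2, 1, 2)]
M = [(1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1),
     (0, 1, 1, 2, 0, 0, 0, 1, 1, 1, 1, 2, 2),
     (0, 0, 0, 0, 1, 1, 2, 1, 2, 1, 2, 1, 2)]

def normalise(v):
    """Scale a non-null vector so that its first non-zero coordinate is 1."""
    lead = next(x for x in v if x % q)
    inv = pow(lead, -1, q)          # inverse modulo the prime q (Python 3.8+)
    return tuple(x * inv % q for x in v)

def modular_representation(G, M):
    """m[i] = number of columns of G that are coordinates of the point a_(i+1)."""
    points = [normalise(col) for col in zip(*M)]
    m = [0] * len(points)
    for col in zip(*G):
        m[points.index(normalise(col))] += 1
    return m

print(modular_representation(G, M))
```

Columns 3 and 4 of G are proportional, so both are counted at the same point, which is where the entry 2 comes from.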
The properties of a modular representation vector depend very decidedly on the particular ordering of the points of PG(k-1, q). That is, if the order is changed, the same vector may represent a code with a completely different set of word weights and probability P.

Let F be the hyperplane in PG(k-1, q) to which a point p belongs only in case

    $\mathbf{c} \cdot \mathbf{p} = 0$,

where $\mathbf{p}$ is a coordinate vector of p. If b is a non-zero scalar, then $\mathbf{c}$ and $b\mathbf{c}$ define the same hyperplane. To each hyperplane there correspond (q-1) code vectors of the form $b\mathbf{x} = b\mathbf{c}G$. These code vectors are of weight w if, and only if, F contains (n-w) of the points whose coordinates are columns of G, for this is the same as saying that $b\mathbf{c}$ is orthogonal to (n-w) of the columns of G. Hence we have the following theorem:

Theorem 1.11. Let G be a generator matrix of an (n, k) code V. Then to each hyperplane F of PG(k-1, q) there correspond (q-1) code words which are scalar multiples of each other. These code words are of weight w, where w is the number of points, not belonging to F, whose coordinates are columns of G.
Corollary. Let S be a set of n points in PG(k-1, q), and let d be the largest integer such that no hyperplane contains more than n-d of the points of S. Then if coordinates corresponding to the points of S are taken as columns of a generator matrix G, the resulting (n, k) code has minimum distance d.
The above theorem may be expressed in algebraic form as follows. Let the points and hyperplanes of PG(k-1, q) be ordered in some way. Let C be the matrix whose (i, j)-th element is 0 if the i-th hyperplane contains the j-th point, otherwise let the element be 1. Then we have the equations

(1.7.1)    $C\mathbf{m} = \mathbf{w}$,

where $\mathbf{m}$ is the modular representation vector of an (n, k) code, and $\mathbf{w}$ is the vector whose i-th element is the weight of the code words corresponding to the i-th hyperplane. These equations are over the real field.
If the conditions of the corollary are met, then the corollary assures us that

(1.7.2)    $C\mathbf{m} \ge \mathbf{d}$,

where $\mathbf{d}$ is the vector whose every element is equal to d.

Conversely, suppose that k is given so that, upon ordering the points and hyperplanes of PG(k-1, q), C is given. Suppose also that d is given. Then we might try to find a vector $\mathbf{m}$ of non-negative integers which satisfies (1.7.2). By Theorem 1.10, such a vector is the modular representation vector of an (n, k) code having minimum distance at least d, where n is the sum of the elements of $\mathbf{m}$.

Among the vectors $\mathbf{m}$ such that the above conditions are satisfied, we might seek a vector which minimizes $\sum m_i = n$. Then we have a standard linear programming problem except for the requirement that the elements of $\mathbf{m}$ be non-negative integers instead of merely non-negative real numbers. This formulation is the starting point of the investigations of the next chapter.
When, as above, we are given d and k and we seek to minimize n, we are in effect minimizing the redundancy $r = n - k$ of the code. A code which accomplishes this minimization will be said to be of minimum redundancy. In the next section we will discuss the relations between the various definitions of "optimality".
The modular representation was introduced independently by Slepian [30], and Burton and Connor [11], the latter for the case of factorial designs. Using the modular representation, MacDonald [24] found a series of maxi-min distance codes which will be described later. Bose and Kuebler [7] used the method to completely solve the coding problem for the cases $k = 3$ and $k = 4$. The $k = 4$ case is already very difficult.

All the above references are restricted to $q = 2$, so that EG(k, q) is essentially equivalent to PG(k-1, q) and Theorem 1.10 is not needed. It is possible that the details of the general q situation have not heretofore been published. The approach is indicated in an abstract by Bose and Burton [5].
1.8 Definitions of Optimality

Most of our discussion of linear codes for a given value of q has been in terms of three parameters n, k, and d. By considering two of these parameters as fixed and optimizing the third, we have three definitions of optimality.

    Fixed Parameters    Optimization    Type of Code
    n, k                maximize d      maxi-min distance
    n, d                maximize k      maximum size
    k, d                minimize n      minimum redundancy
These definitions of optimality are closely related, but they are not equivalent.

For example, let us graph the function $k_d(n, q)$ which, by definition, is the k of the maximum size code for given n and d. From formula (1.5.2) and theorems 1.4 and 1.5 we can construct the graph on the following page for the case $q = 2$ and $d = 3$ and 4.
[Figure: graph of $k_d(n, 2)$ as a function of n, for d = 3 and d = 4.]
We note that $k_d(n, q)$ is not a strictly decreasing function of d for all values of n. For example,

    $k_3(8, 2) = 4 = k_4(8, 2)$,

so that although a code with parameters n = 8, k = 4, and d = 3 is of maximum size, it is not of maxi-min distance.

Similarly, since $k_d(n, q)$ is not a strictly increasing function of n, we have, for example,

    $k_3(7, 2) = 4 = k_3(8, 2)$,

so that although a code with parameters n = 8, k = 4, and d = 3 is of maximum size, it is not of minimum redundancy.
Let $d_k(n, q)$ be the d for a maxi-min distance code with given n, k, and q. It is easy to see that $d_k(n, q)$ is not a strictly increasing function of n. (A counter-example can quickly be constructed with $k = 2$.) Hence a maxi-min distance code is not necessarily of minimum redundancy.

Moreover, $d_k(n, q)$ is not a strictly decreasing function of k. For example, if n = 7, k = 3, and q = 2, it is well known that $d_3(7, 2) = 4$. One can quickly convince himself that $d_2(7, 2) = 4$, so that a code with n = 7, k = 2, q = 2, and d = 4 is of maxi-min distance, but not of maximum size.
Now let $N_d(k, q)$ be the smallest possible n for a code with given d, k, and q. We will show that $N_d(k, q)$ is a strictly increasing function of k. Let V be a minimum redundancy code for given d, q, and k, i.e. $n = N_d(k, q)$. Let $V_i$ be the subspace of V consisting of all vectors in V whose i-th coordinate is zero. We may choose i so that the i-th coordinate is not always zero in V, and then $V_i$ is an (n, k-1) code with minimum weight at least d. Omitting the i-th coordinate of every vector in $V_i$ we obtain an (n-1, k-1) code with minimum weight at least d. Hence

    $N_d(k-1, q) \le n - 1 < N_d(k, q)$.

Since $N_d(k, q)$ is a strictly increasing function of k, a minimum redundancy code must be of maximum size. For if we assume the existence of a minimum redundancy code with parameters $n = N_d(k, q)$, k, and d, which is not of maximum size, then we will have

    $N_d(k+1, q) \le n = N_d(k, q)$,

which contradicts the strict increase just proved.
It is easily seen that $N_d(k, q)$ is a strictly increasing function of d. Let V be a minimum redundancy code for given d, q, and k, i.e. $n = N_d(k, q)$. Then choosing i so that the i-th coordinate is not always zero in V, we omit the i-th coordinate. This gives us an (n-1, k) code with weight at least d-1. Hence

    $N_{d-1}(k, q) \le n - 1 < N_d(k, q)$.

Hence a minimum redundancy code is of maxi-min distance.
Finally, we will show by example that a code can be both of maxi-min distance and maximum size without being of minimum redundancy. Consider the binary code V consisting of the four vectors

    (0 0 0 0 0 0 0 0 0 0 0 0 0)
    (1 1 1 1 1 1 1 1 0 0 0 0 0)
    (0 0 0 0 1 1 1 1 1 1 1 1 0)
    (1 1 1 1 0 0 0 0 1 1 1 1 0)

so that n = 13, k = 2, and d = 8. Since the last coordinate of the vectors is always zero, it can be omitted without affecting k or d. Hence V is not of minimum redundancy. However, it is easily seen that codes with parameters

    n = 13, k = 2, d = 9
or
    n = 13, k = 3, d = 8

do not exist. Hence V is of maxi-min distance and maximum size.
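The non-existence claims can be checked by a small exhaustive search based on the modular representation of Section 1.7. The sketch is restricted to q = 2 and encodes points and hyperplanes of PG(k-1, 2) as the integers 1, ..., 2^k - 1 (an encoding chosen here for convenience): hyperplane c misses point p exactly when the bitwise AND of c and p has odd parity. A code of length n may contain all-zero columns, so the search allows the entries of m to sum to any value up to n.

```python
def compositions(total, parts):
    """All tuples of `parts` non-negative integers summing to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest

def exists_binary_code(n, k, d):
    """True if some binary (n, k) code has minimum distance at least d (d >= 1)."""
    points = range(1, 2 ** k)
    # off[c-1]: the points not on hyperplane c, i.e. those with c . p = 1
    off = [[p for p in points if bin(c & p).count("1") % 2] for c in points]
    for used in range(n + 1):                 # number of non-null columns
        for m in compositions(used, len(points)):
            # every non-zero code word must have weight at least d
            if all(sum(m[p - 1] for p in row) >= d for row in off):
                return True
    return False

print(exists_binary_code(13, 2, 8))   # the code V above
print(exists_binary_code(13, 2, 9))
print(exists_binary_code(13, 3, 8))
print(exists_binary_code(12, 2, 8))   # V with its last coordinate omitted
```

Because every non-zero message is required to give a word of weight at least d >= 1, any solution found automatically has dimension k.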
We may summarize these results with a theorem.

Theorem 1.12. If a code is of minimum redundancy, then it is of maximum size, and of maxi-min distance. The other possible implications between these three properties are not universally valid.

It seems likely that the other possible implications are valid for most values of the parameters. For example, Theorem 1.4 indicates that $k_d(n, q)$ is strictly increasing for most values of n, and this is reflected in the above graph.
Definitions of optimality in terms of the probability of error detection or correction seem less closely related to the three simpler definitions given above. Various counter-examples have been constructed. For example, a code may be quasi-perfect and therefore maximize the probability P of being able to correct a received vector for a given n and k, and not be of maxi-min distance for that n and k. Conversely, a maxi-min distance or minimum redundancy code may not maximize P. A whole series of examples of this sort may be obtained by comparing the quasi-perfect codes described in Problem 5.3 of Peterson [26] with the reduced Hamming d = 4 codes.
On the positive side we can offer the example of several classes of codes which have been constructed with one of the simple definitions in mind and turned out to be optimum in the probability sense. The Hamming codes with d = 3, the unreduced d = 4 codes, and the Bose-Chaudhuri d = 5 codes are examples of this.
CHAPTER II
CONVEX SETS AND ERROR CORRECTING CODES
2.1 Terminology and Basic Theorems
Let $R_m$ denote the vector space of m-tuples over the real field. The segment between two m-tuples $\mathbf{x}$ and $\mathbf{y}$ is the set to which $\mathbf{z}$ belongs only in case

(2.1.1)    $\mathbf{z} = s\mathbf{x} + (1-s)\mathbf{y}$,    $0 \le s \le 1$,

where s is a real number. The vectors which we will consider in this section will be real vectors in $R_m$.

A subset of $R_m$ is convex if whenever it contains two m-tuples, it contains also the segment between them. It is easy to see that the intersection of a finite number of convex sets is convex.

The set of all m-tuples which satisfy a set of linear inequalities is convex. If all the inequalities are of the form $\le$ or $\ge$, such a set has been called a convex polyhedral set or CPS. For example, if $m = (q^k - 1)/(q - 1)$, the set of m-tuples which satisfy the 2m linear inequalities

(2.1.2)    $C\mathbf{t} \ge \mathbf{d}$,    $\mathbf{t} \ge \mathbf{0}$,

where C and $\mathbf{d}$ were defined in Section 1.7, is a CPS. We denote this CPS by T.
Since an equation can always be replaced by two inequalities, one $\ge$ and the other $\le$, the m-tuples which satisfy a set of linear equations, or a set of inequalities of the form $\le$ or $\ge$ and a set of equations, constitute a convex polyhedral set. For example, the m-tuples $\mathbf{t}$ which satisfy the inequalities (2.1.2) and also the equation

(2.1.3)    $\sum_i t_i = n$

are a CPS which we shall call $T_n$. If all the elements $t_i$ are integers, the m-tuple $\mathbf{t}$ is called a lattice point, and we saw in Section 1.7 that $\mathbf{t}$ is then the modular representation vector of an (n, k) code with minimum distance $\ge d$. The major problem of this chapter is to determine whether a set $T_n$ contains a lattice point.
x
S provided that
is an
x
~xtre~~
is in
S, but
of any segment contained in
where
l
and
z
s=l, so that !
theorem in the
Theorem 2.1.
are in
is either
S.
S
~
is not an interior point
~
= sl
+ (l-s)!,
and 0 < s < 1, tbeneither
X
or
z.
s=O
or
The following is a standard
theory of convex sets.
If the m-tuple
x
point of the set if, and only if,
dependent rows
point. or extreme vector of a set
That is, if
set defined by the inequalities
.e
If all
a.
-1.
of A.
belongs to the convex polyhedral
~ ~
E'
a.·x
-1. -
then
= bi
x
for
is an extreme
m linearly in-
37
We should point out that the matrix A is meant to include all the inequalities defining the CPS; in (2.1.2), for example,

    $A = \begin{pmatrix} C \\ I \end{pmatrix}$.

Let $|\mathbf{x}|$ denote the usual Euclidean length of the vector $\mathbf{x}$, that is,

(2.1.4)    $|\mathbf{x}| = \big(\sum_i x_i^2\big)^{1/2}$.

The Euclidean distance between $\mathbf{x}$ and $\mathbf{y}$ is $|\mathbf{x} - \mathbf{y}|$. If S is a subset of $R_m$, and if there exists a real number B such that $|\mathbf{x}| < B$ for every $\mathbf{x}$ in S, then S is said to be bounded.
Let $K = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_j\}$ be a finite set of m-tuples. Then the set to which $\mathbf{x}$ belongs if and only if

(2.1.5)    $\mathbf{x} = \sum_i s_i \mathbf{x}_i$,    where each $s_i \ge 0$ and $\sum_i s_i = 1$,

is called the convex span, or convex hull, of K. If we do not require that the coefficients $s_i$ be non-negative, but retain the requirement that their sum be equal to 1, the resulting set is called the affine span of K.

Convex spans are the generalization of the segment between two m-tuples $\mathbf{x}_1$ and $\mathbf{x}_2$, and may also be called convex polytopes. Affine spans are the generalization of the line through $\mathbf{x}_1$ and $\mathbf{x}_2$.
The following theorem, due to Minkowski and Farkas, is of
fundamental importance.
Theorem 2.2.
A bounded convex polyhedral set is a convex polytope,
being the convex span of its extreme points.
The theory of convex sets and linear programming is fairly
extensive and there are many books on the subject.
The most comprehensive of these is edited by Kuhn and Tucker [23].
We shall use during the subsequent development those theorems
of this theory which are necessary for our purpose.
2.2 The Inverse of C for General q

Let us recall the definition of the matrix C. The points and hyperplanes of PG(k-1, q) are taken in an arbitrary, but fixed, order. Then C is the matrix whose (i, j)-th element is 0 if the i-th hyperplane contains the j-th point, and otherwise the element is 1.

Let $\bar{C}$ be the matrix obtained from C by changing all the 0's to 1's and all the 1's to 0's. $\bar{C}$ is what is usually meant by the incidence matrix of points on hyperplanes. We may call C the modular representation matrix.

Let $h = (q^k - 1)/(q - 1)$; then both C and $\bar{C}$ are $h \times h$ matrices. Let $\mathbf{c}_i$ denote the i-th row of C and $\bar{\mathbf{c}}_i$ denote the i-th row of $\bar{C}$. If A is a matrix, A' denotes its transpose. The following formulae are needed to invert C and $\bar{C}$. All dot products are to be taken over the real field.
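(2.2.1)    $\bar{\mathbf{c}}_i \cdot \bar{\mathbf{c}}_j = (q^{k-1}-1)/(q-1)$ if $i = j$,    and    $(q^{k-2}-1)/(q-1)$ if $i \neq j$.

For if $i = j$ the formula follows from the fact that there are $(q^{k-1}-1)/(q-1)$ points on a hyperplane, which is a (k-2)-flat. If $i \neq j$ we reason that the unit elements of $\bar{\mathbf{c}}_i$ which match unit elements of $\bar{\mathbf{c}}_j$ correspond to points which are on both hyperplanes. The intersection of two hyperplanes is a (k-3)-flat, which has $(q^{k-2}-1)/(q-1)$ points.

Now let $\mathbf{j}$ be the h-tuple all of whose elements are 1. Then

    $\bar{\mathbf{c}}_i \cdot \mathbf{c}_j = \bar{\mathbf{c}}_i \cdot (\mathbf{j} - \bar{\mathbf{c}}_j) = \bar{\mathbf{c}}_i \cdot \mathbf{j} - \bar{\mathbf{c}}_i \cdot \bar{\mathbf{c}}_j$,

and since $\bar{\mathbf{c}}_i \cdot \mathbf{j} = (q^{k-1}-1)/(q-1)$ we find from (2.2.1) that

(2.2.2)    $\bar{\mathbf{c}}_i \cdot \mathbf{c}_j = 0$ if $i = j$,    and    $q^{k-2}$ if $i \neq j$.

Similarly,

    $\mathbf{c}_i \cdot \mathbf{c}_j = \mathbf{c}_i \cdot (\mathbf{j} - \bar{\mathbf{c}}_j) = \mathbf{c}_i \cdot \mathbf{j} - \mathbf{c}_i \cdot \bar{\mathbf{c}}_j$,

and since

    $\mathbf{c}_i \cdot \mathbf{j} = (\mathbf{j} - \bar{\mathbf{c}}_i) \cdot \mathbf{j} = \dfrac{q^k - 1}{q - 1} - \dfrac{q^{k-1} - 1}{q - 1} = q^{k-1}$,

we see from (2.2.2) that

(2.2.3)    $\mathbf{c}_i \cdot \mathbf{c}_j = q^{k-1}$ if $i = j$,    and    $q^{k-1} - q^{k-2}$ if $i \neq j$.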
Let J be the $h \times h$ matrix every element of which is equal to 1, and let I be the $h \times h$ identity matrix. Then, taking all matrix multiplications over the real field, we obtain from (2.2.1) the formula

(2.2.4)    $\bar{C}\bar{C}' = q^{k-2} I + \dfrac{q^{k-2}-1}{q-1} J$.

From (2.2.2) we obtain

(2.2.5)    $\bar{C}C' = q^{k-2}(J - I)$,

and from (2.2.3) we obtain

(2.2.6)    $CC' = q^{k-2} I + (q^{k-1} - q^{k-2}) J$.

Noting that $q^{k-1} - q^{k-2} = q^{k-2}(q-1)$, we multiply (2.2.5) by (q-1) and subtract it from (2.2.6) to obtain

    $CC' - (q-1)\bar{C}C' = q^{k-1} I$,

from which it follows that

(2.2.7)    $C^{-1} = [\,C' - (q-1)\bar{C}'\,] / q^{k-1}$,

where the inverse is to be taken over the real field.

We may also find the inverse of $\bar{C}$ by multiplying (2.2.5) by $z = (q^{k-2}-1)/(q^{k-1}-q^{k-2})$ and subtracting it from (2.2.4) to obtain

    $\bar{C}\bar{C}' - z\,\bar{C}C' = \dfrac{q^{k-1}-1}{q-1}\, I$.

It follows that

(2.2.8)    $\bar{C}^{-1} = \dfrac{q-1}{q^{k-1}-1}\Big[\,\bar{C}' - \dfrac{q^{k-2}-1}{q^{k-2}(q-1)}\, C'\,\Big]$.
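Formula (2.2.7) can be confirmed numerically for small cases. The sketch below assumes q is prime, so that arithmetic modulo q gives the field, and it orders points and hyperplanes by their normalised coordinate vectors; this is one arbitrary but fixed ordering, not the one used in the text. It checks in integer arithmetic that C times C' - (q-1)C̄' equals q^(k-1) I.

```python
from itertools import product

def normalised_vectors(k, q):
    """One coordinate vector per point of PG(k-1, q): first non-zero entry 1."""
    return [v for v in product(range(q), repeat=k)
            if next((x for x in v if x), None) == 1]

def modular_matrix(k, q):
    """C[i][j] = 0 if hyperplane i contains point j, and 1 otherwise (q prime)."""
    vecs = normalised_vectors(k, q)   # used for points and for hyperplanes
    return [[int(sum(a * b for a, b in zip(c, p)) % q != 0) for p in vecs]
            for c in vecs]

def check_inverse(k, q):
    """True if C (C' - (q-1) Cbar') = q^(k-1) I, which is formula (2.2.7)."""
    C = modular_matrix(k, q)
    h = len(C)
    # X = C' - (q-1) Cbar', where Cbar = J - C
    X = [[C[j][i] - (q - 1) * (1 - C[j][i]) for j in range(h)] for i in range(h)]
    for i in range(h):
        for j in range(h):
            entry = sum(C[i][t] * X[t][j] for t in range(h))
            if entry != (q ** (k - 1) if i == j else 0):
                return False
    return True

print([check_inverse(k, q) for k, q in [(3, 2), (3, 3), (4, 2), (2, 5)]])
```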
The inverse of C may be used to obtain necessary and sufficient conditions for the existence of a code with specified weights. From equation (1.7.1) we obtain

(2.2.9)    $\mathbf{m} = C^{-1}\mathbf{w}$,

and the conditions are that the elements of $\mathbf{m}$ be non-negative integers so that $\mathbf{m}$ can be a modular representation vector. Expressing equations (2.2.9) in more detail, we get from (2.2.7)

(2.2.10)    $m_j = \Big(\sum_{i \in O(j)} w_i - (q-1)\sum_{i \in E(j)} w_i\Big) / q^{k-1}$,

where $m_j$ is the number of times a coordinate vector of the j-th point is to be taken as a column of the generator matrix, $w_i$ is the weight of the (q-1) words corresponding to the i-th hyperplane, $i \in O(j)$ means that the i-th hyperplane does not contain the j-th point, and $i \in E(j)$ means that the i-th hyperplane does contain the j-th point.
Example 2.1. Suppose q = 3 and k = 3. Let the points and hyperplanes (in this case lines) of PG(k-1, q) be ordered as below. Each column of the array headed POINTS is a coordinate vector of a point, and each row of the table is labelled by the coefficient vector of a line.

                              POINTS
                  1 1 0 0 1 0 1 1 1 1 1 0 1
                  2 0 0 1 0 1 1 1 2 1 0 1 2
    LINES         0 0 1 0 2 2 2 1 1 0 1 1 2

    (0 0 1)       1 1 0 1 0 0 0 0 0 1 0 0 0
    (0 1 0)       0 1 1 0 1 0 0 0 0 0 1 0 0
    (1 0 0)       0 0 1 1 0 1 0 0 0 0 0 1 0
    (1 0 1)       0 0 0 1 1 0 1 0 0 0 0 0 1
    (1 1 1)       1 0 0 0 1 1 0 1 0 0 0 0 0
    (0 1 1)       0 1 0 0 0 1 1 0 1 0 0 0 0
    (1 2 0)       0 0 1 0 0 0 1 1 0 1 0 0 0
    (1 0 2)       0 0 0 1 0 0 0 1 1 0 1 0 0
    (1 2 1)       0 0 0 0 1 0 0 0 1 1 0 1 0
    (1 2 2)       0 0 0 0 0 1 0 0 0 1 1 0 1
    (1 1 2)       1 0 0 0 0 0 1 0 0 0 1 1 0
    (0 1 2)       0 1 0 0 0 0 0 1 0 0 0 1 1
    (1 1 0)       1 0 1 0 0 0 0 0 1 0 0 0 1

The matrix given above is the incidence matrix $\bar{C}$. $C = J - \bar{C}$, and by (2.2.7)

    $C^{-1} = \dfrac{1}{q^{k-1}}\,[\,C' - (q-1)\bar{C}'\,]$,

that is, $C^{-1}$ is 1/9 times the following matrix, whose columns correspond to the lines and whose rows correspond to the points:

                              LINES
                  0  0  1  1  1  0  1  1  1  1  1  0  1
                  0  1  0  0  1  1  2  0  2  2  1  1  1
                  1  0  0  1  1  1  0  2  1  2  2  2  0

                 -2  1  1  1 -2  1  1  1  1  1 -2  1 -2
                 -2 -2  1  1  1 -2  1  1  1  1  1 -2  1
                  1 -2 -2  1  1  1 -2  1  1  1  1  1 -2
                 -2  1 -2 -2  1  1  1 -2  1  1  1  1  1
                  1 -2  1 -2 -2  1  1  1 -2  1  1  1  1
                  1  1 -2  1 -2 -2  1  1  1 -2  1  1  1
                  1  1  1 -2  1 -2 -2  1  1  1 -2  1  1
                  1  1  1  1 -2  1 -2 -2  1  1  1 -2  1
                  1  1  1  1  1 -2  1 -2 -2  1  1  1 -2
                 -2  1  1  1  1  1 -2  1 -2 -2  1  1  1
                  1 -2  1  1  1  1  1 -2  1 -2 -2  1  1
                  1  1 -2  1  1  1  1  1 -2  1 -2 -2  1
                  1  1  1 -2  1  1  1  1  1 -2  1 -2 -2

Now let us refer to Example 1.1. From the generator matrix G we see that, with the points of PG(2, 3) ordered as above, the modular representation vector of the code is

    m = (0 1 2 1 0 0 1 0 1 0 0 0 0).
Looking at the code vectors, we see that the (q-1) = 2 code vectors which correspond to the same hyperplane are arranged in the same row of the table of code words. For example, the code vectors (1 0 0 0 2 2) and (2 0 0 0 1 1) correspond to the hyperplane which can be denoted by the c-vector (1 0 0), or by the c-vector (2 0 0). These code vectors have weight 3 and therefore, since (1 0 0) is the third hyperplane in the above ordering, $w_3 = 3$.

We can easily verify that, subject to the above ordering of the hyperplanes of PG(2, 3), the weight vector is

    w = (4 3 3 4 6 3 3 4 5 6 5 5 3).

It is also easily verified that $C\mathbf{m} = \mathbf{w}$ and $\mathbf{m} = C^{-1}\mathbf{w}$.
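The verification mentioned at the end of Example 2.1 can be sketched in a few lines. The orderings of the points and lines, and the generator matrix G of Example 1.1 (as given in Example 1.3), are typed in by hand below and should be read as this sketch's transcription of the example; the variable names are illustrative.

```python
q, k = 3, 3
LINES = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 0, 1), (1, 1, 1), (0, 1, 1), (1, 2, 0),
         (1, 0, 2), (1, 2, 1), (1, 2, 2), (1, 1, 2), (0, 1, 2), (1, 1, 0)]
POINTS = [(1, 2, 0), (1, 0, 0), (0, 0, 1), (0, 1, 0), (1, 0, 2), (0, 1, 2), (1, 1, 2),
          (1, 1, 1), (1, 2, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 2, 2)]
G = [(1, 0, 0, 0, 2, 2),
     (0, 1, 0, 0, 2, 1),
     (0, 0, 1, 2, 1, 2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % q

def normalise(v):
    """Scale a non-null vector so that its first non-zero coordinate is 1."""
    lead = next(x for x in v if x % q)
    inv = pow(lead, -1, q)
    return tuple(x * inv % q for x in v)

h = len(POINTS)
C = [[int(dot(line, p) != 0) for p in POINTS] for line in LINES]

# modular representation vector: how often each point occurs as a column of G
m = [0] * h
for col in zip(*G):
    m[POINTS.index(normalise(col))] += 1

# weight of the code word c G, where c is the coefficient vector of each line
w = [sum(1 for col in zip(*G) if dot(line, col) != 0) for line in LINES]

Cm = [sum(cij * mj for cij, mj in zip(row, m)) for row in C]
# (2.2.7) applied to w: m_j = sum_i (C_ij - (q-1)(1 - C_ij)) w_i / q^(k-1)
m_back = [sum((C[i][j] - (q - 1) * (1 - C[i][j])) * w[i] for i in range(h)) // q ** (k - 1)
          for j in range(h)]

print(m)
print(w)
print(Cm == w, m_back == m)
```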
2.3 The Solution to the Linear Programming Problem
The solution to the linear programming problem described in
-e
Section 1.7 follows quickly from (2.2.9) and (2.2.10).
Subject
to the inequalities (2.1.2) we want to find a vector!
= (tl, ••• ,th )
~ti is a minimum.
such that the sum
We will show that
(2.3.1)
is the unique solution to this problem.
From equation (2.2.10) we have
~.
J
where every
d.
~
= d,
=
d
(.E
C(j)
and where
E(j)
i
- (q-l)
.E
E(j)
d. )/(qk-l),
~
indicates summation over
(qk-l_l)/(q_l) summands corresponding to hyperplanes containing
the j-th point, and
h- ( qk-l -1 )/( q-l )
hyperplanes.
.e
=
Hence
O(j)
qk-l
indicates summation over
summands corresponding to the remaining
Since the matrix
l'
C is nonsingular,
is the unique solu-
tion to the equations
=
Ct
Hence if
t
satisfying
>
c .. t
-J.
Since every column of
lies on
PG(k-l), q)
-e
d
c.
of
-1
C
C.
t
elements equal to
elements equal to
0 ( a point
(qk-l_l)/(q_l) hyperplanes), summing
C we obtain
~
k-l .
.:l'
"1 .£i
= q
~ -1c.t = ( ~c.)
"1
~-1
where
qk-l
contains
(qk-l_l)/(q_l)
and the other
up the rows of
t
= qk-l
and
~ t
~
i
is any vector.
Now from
and since each
However, if
!
(2.3.3) and (2.3.5) we have
d
i
=d
it follows that
is any other vector in
q
.e
T (i.e., any other vector
(2.1.2», it follows that
for at least one row vector
in
•
is any other vector in
(2.3.4)
1
d
l-k
ri c .• t
-1 -
T, then from (2.3.4) and
> q l-k
ri d.
1
=
46
•
Hence
,
1-
is the unique solution to the linear programming problem,
iii
as we asserted.
The vector $\hat{\mathbf{t}}$ can be a modular representation vector if and only if its elements are non-negative integers. From (2.3.2) we see that this will be the case if and only if d is a positive integral multiple of $q^{k-1}$. If d is not such a multiple, then the problem still remains of finding the lattice vector in T which minimizes $\sum t_i$.

One way of doing this is to employ a general method for finding integral solutions to linear programs due to Gomory [18], [19]. Suppose $\hat{\mathbf{t}}$ is not a lattice point. Then Gomory tells how to construct a linear inequality which is not satisfied by $\hat{\mathbf{t}}$ but which is satisfied by every lattice point in T. This inequality is added to the defining inequalities for T and a new optimum vector $\hat{\mathbf{t}}_1$ is found. If $\hat{\mathbf{t}}_1$ is not integral the process is repeated, and so on. It has been shown that we converge to an optimum lattice vector of T. For details see McCluskey [25].

Unfortunately, the new inequalities rapidly destroy the highly patterned character of our matrices which makes it possible to get general results, and we are restricted to comparatively small values of k for which we can do numerical computations involving $h = (q^k-1)/(q-1)$ variables. Even the capabilities of a digital computer are quickly exceeded.

In the next section we will present another method of finding integral solutions which will be used throughout the rest of the chapter.
2.4 Lattice Points and the Geometry of T

Recall that $\hat{\mathbf{t}}$ has its h coordinates all equal to $d/q^{k-1}$. In order for $\hat{\mathbf{t}}$ to be a lattice vector, d must be at least $q^{k-1}$, in which case $n = h = (q^k-1)/(q-1)$ is the smallest possible n, as we showed in the last section. For smaller values of d, $\hat{\mathbf{t}}$ is not a lattice vector, but we can hope to find lattice vectors in T for values of n less than h. Such a vector will have at least h-n of its elements equal to zero, more if some of the non-zero elements are greater than one. For example, the modular representation vector of Example 2.1 of Section 2.2 must have at least 13-6 = 7 elements equal to zero since h = 13 and n = 6. In fact, it has 8 elements equal to zero since one of its non-zero elements is equal to two.

Suppose $\mathbf{t}$ is a lattice vector in T, and that the sum of its elements (which must be a positive integer) is equal to n, so that $\mathbf{t}$ belongs to the set $T_n$ defined in Section 2.1. If we can show that for $n' < n$, $T_{n'}$ does not contain a lattice vector, then we have solved our problem and $\mathbf{t}$ is the modular representation vector of a minimum redundancy code. We will sometimes be able to do this by using the following theorem.
Theorem 2.3. In order for $T_n$ to contain a lattice vector, it is necessary that some extreme vector of $T_n$ have at least h-n of its elements equal to zero.
It follows that some extreme vector
> 1.
has an element
First we note that $T_n$ is bounded, for if $\mathbf{t}$ is in $T_n$, then $t_i \le n$ since all the elements of $\mathbf{t}$ are non-negative. Hence, by Theorem 2.2, $T_n$ is the convex span of its extreme points. That is, if $\mathbf{t}$ is any point in $T_n$ and $\mathbf{t}_1, \mathbf{t}_2, \ldots, \mathbf{t}_m$ are the extreme points of $T_n$, then

$\mathbf{t} = \sum s_i \mathbf{t}_i$,  each $s_i \ge 0$,  $\sum s_i = 1$.

Since all the $s_i$ are non-negative, and all the elements of each $\mathbf{t}_i$ are non-negative, $\mathbf{t}$ cannot have h-n of its elements equal to zero unless some $\mathbf{t}_i$ with $s_i > 0$ has at least h-n elements equal to zero. As was pointed out earlier, a lattice vector must have h-n of its elements equal to zero.
It turns out that whenever we find an extreme vector of $T_n$ with h-n coordinates equal to zero, we will be able to find a lattice vector in $T_n$. Often, but not always, the extreme vector is a lattice vector.

We would like to present a figure which has been helpful to us, and will serve as an introduction to the next few sections. Suppose q = 2 and k = 2. Then, for a suitable ordering of the points and hyperplanes of PG(k-1, q), the inequalities $C\mathbf{t} \ge \mathbf{d}$ take the form

$t_2 + t_3 \ge d$
$t_1 + t_3 \ge d$
$t_1 + t_2 \ge d$

We will try to sketch the appearance of T in a real 3-dimensional space having coordinate axes $t_1$, $t_2$, and $t_3$. T is contained in the non-negative orthant since each $t_i \ge 0$. The viewer is to imagine himself sitting inside of T and looking toward the origin.
Figure 2.4.1
In order to draw a figure we have had to choose a very trivial example, and it is surprising how many features in Figure 2.4.1 have close analogues in the general case. For example, we have already found $\hat{\mathbf{t}}$ in the general case to be the point of intersection of the h real hyperplanes $\mathbf{c}_i \cdot \mathbf{t} = d$. In the next section we will find the edges analogous to $E_1$, $E_2$, and $E_3$, where by edges we mean the following. Let S be the convex polyhedral set to which the m-tuple $\mathbf{x}$ belongs only in case $A\mathbf{x} \ge \mathbf{b}$. Then if E is a subset of S consisting of all m-tuples in S which satisfy a particular set of (m-1) linearly independent equations of the form $\mathbf{a}_i \cdot \mathbf{x} = b_i$, where $\mathbf{a}_i$ is the i-th row of A, then E is called an edge of S.

A knowledge of the edges analogous to $E_1$, $E_2$, $E_3$ enables us to find all the extreme points of $T_n$ if n is not too large. However, for larger values of n, the situation is more complicated. This is so because in addition to the edges $E_{i,j}$ which we have labeled, there are, for larger values of k, other edges which do not have analogues in the k = 2 case. The situation is indicated in Figure 2.4.2.
Figure 2.4.2

2.5 The Extreme Points of $T_n$ for $n \le qd/(q-1)$

Let $C_m$ be the h x h matrix which is obtained from C by replacing the m-th row of C by the h-tuple $\mathbf{1}'$ of all 1's. The matrix $C_m$ is non-singular, for as we showed in obtaining formula (2.3.5), the sum of all the rows of C is

$\sum_{i=1}^{h} \mathbf{c}_i = q^{k-1}\,\mathbf{1}'$,

and since the rows of C are linearly independent, this representation of $\mathbf{1}'$ is unique. Hence $\mathbf{1}'$ cannot be represented as a linear combination of h-1 rows of C.

Let $\mathbf{d}^m$ be the vector obtained from $\mathbf{d}$ by replacing the m-th element of $\mathbf{d}$ by the integer n. Let $\mathbf{t}^m$ be the unique solution of the equations

(2.5.2)  $C_m \mathbf{t}^m = \mathbf{d}^m$.

By Theorem 2.1, if $\mathbf{t}^m$ belongs to $T_n$, then it is an extreme point of $T_n$.
We want to derive a formula for the elements $t_j^m$ of $\mathbf{t}^m$, making use of the inverse of C and, in particular, of formula (2.2.10), which is an expression for the inverse of C in terms of the equations $C\mathbf{m} = \mathbf{w}$. Let $\mathbf{f}^m$ be the vector which is obtained from $\mathbf{d}$ by replacing the m-th element of $\mathbf{d}$ by

(2.5.3)  $f_m = q^{k-1} n - (h-1) d$.

We assert that

(2.5.4)  $C\mathbf{t}^m = \mathbf{f}^m$,

where $\mathbf{t}^m$ is the vector defined in (2.5.2). Since C is non-singular, the equations $C\mathbf{t} = \mathbf{f}^m$ have a unique solution which certainly satisfies all the equations (2.5.2) except possibly the equation $\mathbf{1}' \cdot \mathbf{t} = n$. Now summing all the equations $C\mathbf{t} = \mathbf{f}^m$ we obtain

$\sum_{i=1}^{h} \mathbf{c}_i \cdot \mathbf{t} = q^{k-1}\,\mathbf{1}' \cdot \mathbf{t} = \sum_{i=1}^{h} f_i = (h-1)d + f_m$,

and since $f_m = q^{k-1} n - (h-1)d$, this gives us $\mathbf{1}' \cdot \mathbf{t} = n$ as desired.

We can therefore calculate $\mathbf{t}^m$ from (2.5.4) instead of from (2.5.2), and by substituting $\mathbf{t}^m$ for $\mathbf{m}$ and $\mathbf{f}^m$ for $\mathbf{w}$ in (2.2.10) we obtain the following results. If $m \in O(j)$, then

$t_j^m = \Big(\sum{}'_{O(j)} d_i + f_m - (q-1) \sum_{E(j)} d_i\Big)\Big/ q^{k-1}$,

where $\sum'$ indicates that $d_m$ is omitted from the summation. Now

$\sum{}'_{O(j)} d_i + \sum_{E(j)} d_i = (h-1)d$,

so substituting the value of $f_m$ from (2.5.3) we have

$t_j^m = \Big((h-1)d + q^{k-1} n - (h-1)d - q \sum_{E(j)} d_i\Big)\Big/ q^{k-1}$.

Recalling that

$\sum_{E(j)} d_i = (q^{k-1}-1)d/(q-1)$,

we have

(2.5.5)  $t_j^m = \Big(n - \dfrac{q}{q-1} d\Big) + \dfrac{d}{q^{k-2}(q-1)}$  if $m \in O(j)$.
Now if $m \in E(j)$, then

$t_j^m = \Big(\sum_{O(j)} d_i - (q-1)\sum{}'_{E(j)} d_i - (q-1) f_m\Big)\Big/ q^{k-1} = \Big((h-1)d - q \sum{}'_{E(j)} d_i - (q-1) f_m\Big)\Big/ q^{k-1}$.

Noting that

$q \sum{}'_{E(j)} d_i = q \sum_{E(j)} d_i - qd = \dfrac{q(q^{k-1}-1)d}{q-1} - qd = (h-1)d - qd$,

we have

$t_j^m = q^{1-k}\big(qd - (q-1) f_m\big) = qd - (q-1)n$.

Putting this in a form which is more like (2.5.5) we have

(2.5.6)  $t_j^m = (q-1)\Big(\dfrac{q}{q-1} d - n\Big)$  if $m \in E(j)$.

If $\mathbf{t}^m$ belongs to T, then it obviously belongs to $T_n$ and, as we pointed out earlier, is an extreme point of $T_n$. By (2.3.6),

$\hat{n} = \dfrac{hd}{q^{k-1}} = \dfrac{(q^k-1)d}{q^{k-1}(q-1)}$,

and if $\sum_{i=1}^{h} t_i^m = n < \hat{n}$, then it is easily seen that $\mathbf{t}^m$ does not belong to T for all choices of m. Hence we will assume that $n \ge \hat{n}$.
By construction of $\mathbf{t}^m$, $\mathbf{c}_i \cdot \mathbf{t}^m = d$ for $i \ne m$, and since

$\sum_{i=1}^{h} \mathbf{c}_i \cdot \mathbf{t}^m = q^{k-1} n \ge q^{k-1} \hat{n} = hd$

if $n \ge \hat{n}$, we conclude that $\mathbf{c}_m \cdot \mathbf{t}^m \ge d$. Hence $C\mathbf{t}^m \ge \mathbf{d}$ and it follows that $\mathbf{t}^m$ belongs to T if, and only if, $n \ge \hat{n}$.

Let us define

(2.5.7)  $a = t_j^m$  if $m \in O(j)$,

(2.5.8)  $b = t_j^m$  if $m \in E(j)$.

Since $n \ge \hat{n}$, it is easy to show by (2.5.5) and (2.5.6) that

(2.5.9)  $a \ge d/q^{k-1}$

and

(2.5.10)  $b \le d/q^{k-1}$,

with strict inequality in both when $n > \hat{n}$. It is also clear that $b \ge 0$ as long as $n \le qd/(q-1)$. We have proved the following:

Theorem 2.4. The vector $\mathbf{t}^m$, which is defined by formula (2.5.2) and whose coordinates are given explicitly in (2.5.5) and (2.5.6), is an extreme point of $T_n$ if and only if $\hat{n} \le n \le qd/(q-1)$, where

$\hat{n} = \dfrac{(q^k-1)d}{q^{k-1}(q-1)}$.

If $n = \hat{n}$, then $T_n$ contains only the single point $\mathbf{t}^m = \hat{\mathbf{t}}$.

We might remark that if we regard n as a continuous parameter, then we have found parametric equations for the edges $E_m$, where $E_m$ is the set of points of T which satisfy all of the equations $C\mathbf{t} = \mathbf{d}$ except possibly the m-th one.
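The two coordinate values in (2.5.5) and (2.5.6), together with the range of n in Theorem 2.4, are easy to evaluate with exact rational arithmetic. The sketch below is only illustrative; the function names are hypothetical and n is allowed to be a fraction, as when it is treated as a continuous parameter.

```python
from fractions import Fraction

def n_hat(q, k, d):
    """Minimum of the linear program: (q^k - 1) d / (q^(k-1) (q - 1)), cf. (2.3.6)."""
    return Fraction((q**k - 1) * d, q**(k - 1) * (q - 1))

def extreme_point_entries(q, k, d, n):
    """Return (a, b): the entries of t^m given by (2.5.5) and (2.5.6)."""
    n = Fraction(n)
    a = (n - Fraction(q * d, q - 1)) + Fraction(d, q**(k - 2) * (q - 1))
    b = q * d - (q - 1) * n
    return a, b

def is_extreme_point(q, k, d, n):
    """Range condition of Theorem 2.4: n_hat <= n <= q d / (q - 1)."""
    return n_hat(q, k, d) <= Fraction(n) <= Fraction(q * d, q - 1)

a, b = extreme_point_entries(3, 3, 6, Fraction(53, 6))
```

With q = 3, k = 3, d = 6 and n = 53/6 this gives a = 5/6 and b = 1/3, and at n = n̂ = 26/3 both values collapse to d/q^{k-1} = 2/3.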
In concluding this section we want to show that if $\hat{n} < n \le qd/(q-1)$, then the vectors $\mathbf{t}^m$ are all distinct, and are the only extreme vectors of $T_n$. Let B be the h x h matrix whose m-th row is $\mathbf{t}^m$. From (2.5.7) and (2.5.8) we see that

(2.5.11)  $B = aC + b\bar{C} = (a-b)C + bJ$.

Consider the equations

(2.5.12)  $\begin{bmatrix} \mathbf{1}' \\ B \end{bmatrix} \mathbf{x} = \begin{bmatrix} y_0 \\ \mathbf{y} \end{bmatrix}$,

where $y_0$ is defined so that the equations are consistent. Since the sum of the elements in each row and column of B is n, this means that

$n\,\mathbf{1}' \cdot \mathbf{x} = \sum_{i=1}^{h} y_i$,

and hence

(2.5.13)  $y_0 = \dfrac{1}{n} \sum_{i=1}^{h} y_i$.

Multiplying the first equation of (2.5.12) by b and subtracting the product from the other equations we obtain

$\begin{bmatrix} \mathbf{1}' \\ B - bJ \end{bmatrix} \mathbf{x} = \begin{bmatrix} y_0 \\ \mathbf{y} - b y_0 \mathbf{1} \end{bmatrix}$.

Let us divide the last h of these equations by (a-b), which is nonzero by (2.5.9) and (2.5.10). Then we have, according to (2.5.11),

$\dfrac{1}{a-b}(B - bJ)\mathbf{x} = C\mathbf{x} = \dfrac{1}{a-b}(\mathbf{y} - b y_0 \mathbf{1})$.
Hence, by (2.2.10), we have

$x_j = \dfrac{1}{a-b} \cdot \dfrac{1}{q^{k-1}} \Big(\sum_{O(j)} (y_i - b y_0) - (q-1) \sum_{E(j)} (y_i - b y_0)\Big)$,

which, by the argument preceding (2.3.2), can be written

$x_j = \dfrac{1}{q^{k-1}(a-b)} \Big(\sum_{O(j)} y_i - (q-1) \sum_{E(j)} y_i - b y_0\Big)$.

Now from the fact that

$a - b = \Big(n - \dfrac{qd}{q-1}\Big) + \dfrac{d q^{2-k}}{q-1} + (q-1)\Big(n - \dfrac{qd}{q-1}\Big) = \dfrac{q(n-d) q^{k-2}(q-1) - q^{k-1} d + d}{q^{k-2}(q-1)}$,

so that

(2.5.16)  $a - b = \dfrac{q\big(q^{k-1}(q-1)n - (q^k-1)d\big)}{q^{k-1}(q-1)}$,

we have

$x_j = \dfrac{q-1}{q\big(q^{k-1}(q-1)n - (q^k-1)d\big)} \Big\{\sum_{O(j)} y_i - (q-1) \sum_{E(j)} y_i - b y_0\Big\}$,

and using formula (2.5.13) for $y_0$ gives

(2.5.17)  $x_j = \dfrac{q-1}{n\big(q^{k-1}(q-1)n - (q^k-1)d\big)} \Big((n-d) \sum_{O(j)} y_i - d \sum_{E(j)} y_i\Big)$.

Formula (2.5.17) gives the inverse of B, which can evidently be written

(2.5.18)  $B^{-1} = \dfrac{q-1}{n\big(q^{k-1}(q-1)n - (q^k-1)d\big)} \big\{(n-d)\,C' - d\,\bar{C}'\big\}$.
In the next section we will make use of the inverse of B in finding representations of the lattice points of $T_n$ in terms of the extreme points $\mathbf{t}^m$, but for the present we need only note that the vectors $\mathbf{t}^m$ are not only distinct, but also linearly independent. We are now ready to prove the main theorem of this section.

Theorem 2.5. If $\hat{n} < n \le qd/(q-1)$, then $T_n$ is the convex span of the h extreme vectors $\mathbf{t}^m$.

Let us denote the convex span of the vectors $\mathbf{t}^m$ by S. From the definition of convexity, it is easy to show that if a convex set contains a set of vectors, then it contains their convex span. Hence $T_n$ contains S.

Since the h linearly independent vectors $\mathbf{t}^m$ belong to the (h-1)-dimensional affine plane F whose equation is $\mathbf{1}' \cdot \mathbf{t} = n$, it is well known that F is the affine span of the vectors $\mathbf{t}^m$. Since $T_n$ is contained in F, then, if $\mathbf{t} \in T_n$,

(2.5.19)  $\mathbf{t} = \sum_m s_m \mathbf{t}^m$,  where  $\sum_m s_m = 1$.

We will show that, in fact, all the $s_m \ge 0$, so that $T_n$ is contained in S. This, together with $T_n \supseteq S$, establishes the theorem.

If $\mathbf{t} \in T_n$, then

(2.5.20)  $\mathbf{c}_i \cdot \mathbf{t} = \sum_m s_m\, \mathbf{c}_i \cdot \mathbf{t}^m \ge d$

for every row $\mathbf{c}_i$ of C. Recall that $\mathbf{c}_i = (c_{i1}, c_{i2}, \ldots, c_{ih})$ where $c_{ij} = 1$ if $i \in O(j)$ and $c_{ij} = 0$ if $i \in E(j)$. Similarly, $\mathbf{t}^m = (t_1^m, t_2^m, \ldots, t_h^m)$ where $t_j^m = a$ if $m \in O(j)$ and $t_j^m = b$ if $m \in E(j)$. That is, $\mathbf{t}^m = a\,\mathbf{c}_m + b\,\bar{\mathbf{c}}_m$. Hence by (2.2.3) and (2.2.2)

$\mathbf{c}_i \cdot \mathbf{t}^m = q^{k-2}(q-1)\,a + q^{k-2}\,b = d$  if $i \ne m$,

and

$\mathbf{c}_m \cdot \mathbf{t}^m = q^{k-1} a = q^{k-1} n - q(q^{k-1}-1)d/(q-1)$.

Substituting these values in (2.5.20), and recalling from (2.5.19) that $\sum_m s_m = 1$, we obtain

$\mathbf{c}_i \cdot \mathbf{t} = \sum_{m \ne i} s_m d + s_i \big(q^{k-1} n - q(q^{k-1}-1)d/(q-1)\big) = d + s_i \big(q^{k-1} n - (q^k-1)d/(q-1)\big) \ge d$.

But the above inequality implies that $s_i \ge 0$, since $n > \hat{n}$ implies that $q^{k-1} n - (q^k-1)d/(q-1) > 0$. This completes the proof of the theorem.
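Formula (2.5.18) can be checked by direct multiplication on a small case. The sketch below uses the parameters of Example 2.2 (q = 3, k = 3, d = 6, n = 53/6). It assumes, for illustration only, that C is the 13 x 13 circulant whose first row has zeros in positions 1, 2, 4 and 10 (a perfect difference set mod 13); any valid ordering of PG(2, 3) would serve equally well.

```python
from fractions import Fraction

q, k, d = 3, 3, 6
n = Fraction(53, 6)
h = (q**k - 1) // (q - 1)                      # 13 points and 13 hyperplanes

# Assumed ordering: C[i][j] = 0 exactly when (j - i) mod 13 lies in the difference set.
ZEROS = {0, 1, 3, 9}
C = [[0 if (j - i) % h in ZEROS else 1 for j in range(h)] for i in range(h)]

# Entries of the extreme points, formulas (2.5.5) and (2.5.6).
a = (n - Fraction(q * d, q - 1)) + Fraction(d, q**(k - 2) * (q - 1))
b = q * d - (q - 1) * n

# B has t^m as its m-th row: a where C is 1, b where C is 0, cf. (2.5.11).
B = [[a if C[m][j] else b for j in range(h)] for m in range(h)]

# Formula (2.5.18): B^{-1} = coef * ((n - d) C' - d Cbar').
coef = (q - 1) / (n * (q**(k - 1) * (q - 1) * n - (q**k - 1) * d))
B_inv = [[coef * ((n - d) if C[m][j] else -d) for m in range(h)] for j in range(h)]

product_ = [[sum(B[r][j] * B_inv[j][c] for j in range(h)) for c in range(h)]
            for r in range(h)]
identity = [[Fraction(int(r == c)) for c in range(h)] for r in range(h)]
```

Here `coef` evaluates to 4/53, and `product_` equals `identity`, so B multiplied by the matrix given in (2.5.18) is the unit matrix for these parameters.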
Example 2.2. Let q = 3, k = 3, and let the rows and columns of C be ordered as in Example 2.1 of Section 2.2. Let d = 6 and n = 53/6, where we consider n as a continuous parameter in order to illustrate the most general situation. Then

$a = (53/6 - 3 \cdot 6/2) + 6/6 = 5/6$

and

$b = 2(3 \cdot 6/2 - 53/6) = 2/6$.

Hence

$\mathbf{t}^1 = \tfrac{1}{6}(2\ 2\ 5\ 2\ 5\ 5\ 5\ 5\ 5\ 2\ 5\ 5\ 5)$.

It is easy to verify that $\mathbf{1}' \cdot \mathbf{t}^1 = 53/6 = n$ and $\mathbf{c}_j \cdot \mathbf{t}^1 = 6 = d$ for $j \ne 1$.

             2 2 5 2 5 5 5 5 5 2 5 5 5
             5 2 2 5 2 5 5 5 5 5 2 5 5
             5 5 2 2 5 2 5 5 5 5 5 2 5
             5 5 5 2 2 5 2 5 5 5 5 5 2
             2 5 5 5 2 2 5 2 5 5 5 5 5
             5 2 5 5 5 2 2 5 2 5 5 5 5
    B = 1/6  5 5 2 5 5 5 2 2 5 2 5 5 5   = (1/6)(5C + 2C̄).
             5 5 5 2 5 5 5 2 2 5 2 5 5
             5 5 5 5 2 5 5 5 2 2 5 2 5
             5 5 5 5 5 2 5 5 5 2 2 5 2
             2 5 5 5 5 5 2 5 5 5 2 2 5
             5 2 5 5 5 5 5 2 5 5 5 2 2
             2 5 2 5 5 5 5 5 2 5 5 5 2

From (2.5.18) we have

                       -36  17  17  17 -36  17  17  17  17  17 -36  17 -36
                       -36 -36  17  17  17 -36  17  17  17  17  17 -36  17
                        17 -36 -36  17  17  17 -36  17  17  17  17  17 -36
                       -36  17 -36 -36  17  17  17 -36  17  17  17  17  17
                        17 -36  17 -36 -36  17  17  17 -36  17  17  17  17
                        17  17 -36  17 -36 -36  17  17  17 -36  17  17  17
    B^{-1} = 4/53 1/6   17  17  17 -36  17 -36 -36  17  17  17 -36  17  17   = (4/53)(1/6)(17C' - 36C̄'),
                        17  17  17  17 -36  17 -36 -36  17  17  17 -36  17
                        17  17  17  17  17 -36  17 -36 -36  17  17  17 -36
                       -36  17  17  17  17  17 -36  17 -36 -36  17  17  17
                        17 -36  17  17  17  17  17 -36  17 -36 -36  17  17
                        17  17 -36  17  17  17  17  17 -36  17 -36 -36  17
                        17  17  17 -36  17  17  17  17  17 -36  17 -36 -36

and it can be verified that $BB^{-1} = I$.

2.6 The Generalized MacDonald Codes
For $n = qd/(q-1)$, we have from (2.5.5) and (2.5.6),

(2.6.1)  $t_j^m = d q^{2-k}/(q-1) = a$,  if $m \in O(j)$,

and

(2.6.2)  $t_j^m = 0 = b$,  if $m \in E(j)$.

If $d = q^{k-2}(q-1)$, then $\mathbf{t}^m$ is a lattice point and is the modular representation vector of an (n, k) code. It is easy to show that such a code is of minimum redundancy. For given k and d, if

$n < \dfrac{qd}{q-1} = \dfrac{q \cdot q^{k-2}(q-1)}{q-1} = q^{k-1}$,

then by (2.5.5) and (2.5.6), all the coordinates of the vectors $\mathbf{t}^m$ are non-zero. Hence, by Theorem 2.3, there are no lattice points on $T_n$ unless $n \ge h$, and this contradicts the assumption that $n < q^{k-1}$.

Now whether or not $d = q^{k-2}(q-1)$, the above argument shows that if $n < qd/(q-1)$, then $n \ge h$. This clearly leads to a contradiction as long as

$d < (q-1)h/q = (q^k - 1)/q$.

Hence we have the following theorem.

Theorem 2.6. If $d < (q^k-1)/q$, and if there exists a code with $n = qd/(q-1)$, then the code is of minimum redundancy.

We have seen that such a code exists if $d = q^{k-2}(q-1)$, and we can see from Theorem 2.3 that it does not exist if $d < q^{k-2}(q-1)$, since in this case $a < 1$ and $b = 0$.

In case $n = qd/(q-1)$ and $q^{k-2}(q-1) < d < (q^k-1)/q$, we have $a > 1$ and $b = 0$. In this case codes sometimes exist in accordance with the following theorem, which includes the codes with $d = q^{k-2}(q-1)$ as a special case.

Theorem 2.7. If $n = qd/(q-1)$ and $d = q^{k-1} - q^u$, where u = 0, 1, 2, ..., k-2, then there exists a minimum redundancy code having parameters n, k, and d.

Since $q^{k-1} - q^u < (q^k-1)/q$, if such a code exists it is of minimum redundancy by Theorem 2.6. It remains to prove the existence.
In attempting to express lattice vectors in terms of the extreme vectors $\mathbf{t}^m$ we are led to the following solutions. Let F be a u-flat of PG(k-1, q). By (1.5.7) the number of hyperplanes of PG(k-1, q) containing F is

(2.6.3)  $f = (q^{k-u-1} - 1)/(q-1)$.

Define an h-tuple $\mathbf{s} = (s_1, s_2, \ldots, s_h)$ as follows:

(2.6.4)  $s_m = 1/f$ if the m-th hyperplane contains F,  $s_m = 0$ otherwise.

We will show that

(2.6.5)  $\mathbf{t} = \sum_m s_m \mathbf{t}^m$

is a lattice point. By Theorem 2.5, $\mathbf{t}$ belongs to $T_n$, so this will prove Theorem 2.7.

Since b = 0, we have from (2.5.11)

(2.6.6)  $B = aC$  and  $\mathbf{t} = \mathbf{s}B = a\,\mathbf{s}C$,

where $\mathbf{t}$ is defined in (2.6.5). Let $\mathbf{b}_j$ be the j-th column of C, and recall that the m-th element of $\mathbf{b}_j$ is 0 if the j-th point of PG(k-1, q) lies on the m-th hyperplane, and is 1 otherwise. From the definition of $\mathbf{s}$,

(2.6.7)  $t_j = a\,\mathbf{s} \cdot \mathbf{b}_j = 0$  if the j-th point is on F.

Since F is a u-flat, there are $(q^{u+1}-1)/(q-1)$ elements of $\mathbf{t}$ which are equal to 0. If the j-th point of PG(k-1, q) is not on F, then

(2.6.8)  $t_j = a\,\mathbf{s} \cdot \mathbf{b}_j = a p / f$,

where f is defined in (2.6.3) and p is the number of hyperplanes which contain F but do not contain the j-th point.

Now a hyperplane contains F and the j-th point if, and only if, it contains the (u+1)-flat spanned by F and the j-th point. By (1.5.7), there are $(q^{k-u-2}-1)/(q-1)$ such hyperplanes. Subtracting this number from the total number of hyperplanes which contain F, we have

$p = \dfrac{q^{k-u-1}-1}{q-1} - \dfrac{q^{k-u-2}-1}{q-1} = q^{k-u-2}$.

Substituting this value for p in (2.6.8) along with the values of a and f, we obtain

$t_j = \dfrac{(q^{k-1} - q^u)}{q^{k-2}(q-1)} \cdot \dfrac{q^{k-u-2}(q-1)}{(q^{k-u-1}-1)} = \dfrac{(q^{k-1} - q^u)\,q^{-u}}{(q^{k-u-1}-1)} = 1$.

Hence

(2.6.9)  $t_j = 0$ if the j-th point is on F,  $t_j = 1$ if the j-th point is not on F.

There are $(q^{u+1}-1)/(q-1)$ elements of $\mathbf{t}$ equal to 0, and $(q^k - q^{u+1})/(q-1) = n$ elements equal to 1. Since we have exhibited a lattice vector in $T_n$ for the values of n, k, and d hypothesized, this completes the proof of Theorem 2.7.

From formula (2.6.9) we also have the following theorem.
Theorem 2.8. Let G be a generator matrix whose columns are coordinate vectors of all the points in PG(k-1, q) which do not lie on some u-flat (u = 0, 1, 2, ..., k-2). Then the resulting code is of minimum redundancy, with $d = q^{k-1} - q^u$ and $n = qd/(q-1) = (q^k - q^{u+1})/(q-1)$.

These codes are a generalization of the codes found in the q = 2 case by MacDonald [24]. MacDonald used a different method, involving neither convex sets nor finite geometries, which we will indicate in the next chapter.
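The construction of Theorem 2.8 is simple enough to try out by brute force for small parameters. The following is a rough sketch under stated assumptions: q is prime, the u-flat is taken to be the one spanned by the first u+1 unit vectors, and the helper names are hypothetical. It enumerates all q^k code words and tabulates their weights, and it also evaluates the quantity ap/f from (2.6.8).

```python
from fractions import Fraction
from itertools import product

def pg_points(k, q):
    """Points of PG(k-1, q) for prime q, normalised so the first non-zero entry is 1."""
    return [v for v in product(range(q), repeat=k)
            if any(v) and next(x for x in v if x) == 1]

def generalized_macdonald(k, q, u):
    """Delete the points of one u-flat and return (n, weight distribution)."""
    # Assumed flat: points whose last k-u-1 coordinates all vanish.
    kept = [p for p in pg_points(k, q) if any(p[u + 1:])]   # columns of G
    weights = {}
    for msg in product(range(q), repeat=k):
        w = sum(1 for p in kept if sum(x * y for x, y in zip(msg, p)) % q)
        weights[w] = weights.get(w, 0) + 1
    return len(kept), weights

def lattice_entry(k, q, u):
    """a * p / f from (2.6.8) with d = q^(k-1) - q^u; the proof shows this equals 1."""
    d = q**(k - 1) - q**u
    a = Fraction(d, q**(k - 2) * (q - 1))
    f = Fraction(q**(k - u - 1) - 1, q - 1)
    p = q**(k - u - 2)
    return a * p / f

n, W = generalized_macdonald(3, 3, 0)
```

For q = 3, k = 3, u = 0 this gives n = 12 with 18 words of weight 8 and 8 words of weight 9, so d = 8 = q^{k-1} - q^u and n = qd/(q-1), as the theorem states.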
2.7 $T_n$ for $n > qd/(q-1)$

For $n > qd/(q-1)$ it is very difficult to find all the extreme points of $T_n$. As mentioned in Section 2.4, extreme points of different types exist for increasing values of k. However, it is possible to find some of the extreme points, and these lead to the codes described in the next chapter.

One way of finding additional extreme points is through the method of linear programming. The inequality

$\mathbf{c}_i \cdot \mathbf{t} \ge d$

may be replaced by the equation and inequality

$\mathbf{c}_i \cdot \mathbf{t} - x_i = d$,  $x_i \ge 0$,

where $x_i$ is a dummy variable. Doing this for all rows $\mathbf{c}_i$ of C we see that $\mathbf{t}$ belongs to T if and only if there exists an h-tuple $\mathbf{x}$ such that the following equations and inequalities are satisfied:

(2.7.2)  $[C, -I] \begin{bmatrix} \mathbf{t} \\ \mathbf{x} \end{bmatrix} = \mathbf{d}$,  $\mathbf{x} \ge 0$.

If the above equations are satisfied, then certain columns of the h x 2h matrix [C, -I] are multiplied by non-zero elements of $\mathbf{t}$ and $\mathbf{x}$. It is a fundamental theorem of linear programming that $\mathbf{t}$ is an extreme point of T if, and only if, these columns are linearly independent. For example, $\hat{\mathbf{t}}$ is an extreme point of T, for since $C\hat{\mathbf{t}} = \mathbf{d}$, we have $\mathbf{x} = 0$, and the columns of C are linearly independent.

It is possible to express the columns of -I and the vector $\mathbf{d}$ as linear combinations of the columns of C. This may be compactly done as follows:

(2.7.3)  $C[\hat{\mathbf{t}}, -C^{-1}] = [\mathbf{d}, -I]$,

where $C^{-1}$ is given by (2.2.7).

Equations (2.7.3) are essentially an example of what is known in linear programming as a simplex tableau, the matrix $[\hat{\mathbf{t}}, -C^{-1}]$ being the tableau matrix.

The simplex method of linear programming provides us a means of going from one extreme point to the others which are adjacent to it, where two extreme points $\mathbf{t}_1$ and $\mathbf{t}_2$ are said to be adjacent provided that the segment between them,

$s\,\mathbf{t}_1 + (1-s)\,\mathbf{t}_2$,  0 < s < 1,

is an edge. The extreme points adjacent to $\hat{\mathbf{t}}$ are the points $\mathbf{t}^m$ with $n = qd/(q-1)$. The edge between $\hat{\mathbf{t}}$ and $\mathbf{t}^m$ is $E_m$. By taking the intersection of the edges with the real hyperplane $\mathbf{1}' \cdot \mathbf{t} = n$, with $\hat{n} < n \le qd/(q-1)$, we can obtain the points $\mathbf{t}^m$ for all values of n.
It is not difficult to continue this process; however, certain features of the problem should be noted.

First, the number of extreme points and edges is extremely large, so that it is essential to recognize equivalence classes of extreme points rather than dealing with them individually. We first considered doing this by using the idea of combinatorially equivalent codes, due to Fontaine and Peterson [17]. However, since it was necessary to repeatedly prove the equivalence of different sets of extreme points, it seemed more convenient for this problem (which is different from Fontaine and Peterson's) to treat things from the beginning on a coordinate-free basis. For this reason we have not specified the order of the points and hyperplanes of PG(k-1, q) except in examples.

A second feature of the problem is the highly patterned character of the matrices which arise if some care is taken. The simplex method enables us to go from one tableau to another, which usually corresponds to a new extreme point. In some cases the columns which correspond to the non-zero elements of the vector $(\mathbf{t}, \mathbf{x})$ may not be sufficient to span the column space of the matrix of (2.7.2). This is known as degeneracy, and it is necessary to add columns corresponding to zero elements in order to get a tableau. Which columns are added will have a great effect on the symmetry of the resulting matrices.

It may even be desirable, for reasons of symmetry, to add additional (dependent) columns. The situation is similar to that which exists in the analysis of incomplete block designs. The individual treatment effects, for example, may not be estimable, but for reasons of symmetry one prefers to work with these parameters and the resulting singular matrices. The pseudo or conditional inverse idea, introduced by Bose in the analysis of variance, is also useful for the linear programming problem.

Our guiding principle for recognizing what columns will result in symmetry has been to translate each stage of our simplex method results into the language of finite projective geometry. If the translation seems overly messy, we try something else.
Example 2.3. Let q = 2 and k = 3. Let the points and hyperplanes of PG(k-1, q) be ordered as the columns of the matrix

    0 0 0 1 1 1 1
    0 1 1 0 0 1 1
    1 0 1 0 1 0 1

Then corresponding to (2.7.2) we have the equations

(2.7.4)

    t1 t2 t3 t4 t5 t6 t7   x1 x2 x3 x4 x5 x6 x7
     1  0  1  0  1  0  1   -1  0  0  0  0  0  0     d
     0  1  1  0  0  1  1    0 -1  0  0  0  0  0     d
     1  1  0  0  1  1  0    0  0 -1  0  0  0  0     d
     0  0  0  1  1  1  1    0  0  0 -1  0  0  0  =  d
     1  0  1  1  0  1  0    0  0  0  0 -1  0  0     d
     0  1  1  1  1  0  0    0  0  0  0  0 -1  0     d
     1  1  0  1  0  0  1    0  0  0  0  0  0 -1     d

Note that in the upper left-hand corner of C above we have the submatrix

          1 0 1
    C_2 = 0 1 1
          1 1 0

which is just the C matrix for k = 2 if the rows and columns are ordered as follows:

           0 1 1
           1 0 1
    0 1    1 0 1
    1 0    0 1 1
    1 1    1 1 0

This matrix is displayed in (2.7.4) because we have chosen three collinear points, and appropriate hyperplanes, to be first in our ordering.

Now returning to the k = 3 case, consider the extreme vector $\mathbf{t}^4$ with n = 2d. By (2.5.5) and (2.5.6),

$\mathbf{t}^4 = (0\ 0\ 0\ d/2\ d/2\ d/2\ d/2)$,

and, substituting $\mathbf{t}^4$ into equations (2.7.4), we see that $x_4 = d$ and all the other $x_i$ are zero. The columns which correspond to the non-zero elements of $\mathbf{t}^4$ and $\mathbf{x}$ do not span the column space of C, so we have degeneracy. We must add two additional columns in order to get a simplex tableau, and for reasons of symmetry we will actually add the three columns corresponding to $x_5$, $x_6$, and $x_7$. We then have
d tlt2t3x1x2x3
t4t5t6t7x4X5x6~
011
011
011
11111
1
1 1
1
1
d/2 000
d/2 1
1
d/2
1
d/2
d
111
0
- 1 0
1 - 0
- 1
-
1/2 1/2
-1/2 1/2
1/2 -1/2
-1/2 -1/2
0
0
1
1
r:
1/2
-1/2
-1/2
1/2
0
d
d
= d
d
d
1
1 1 1
11
000
1 1
11
1 1
1
-'
For any k, the extended tableau matrix corresponding to any $\mathbf{t}^m$ can be put in the form
ad
-0'
-
I
d
J.'
ad
CYi['
-C -1_
k 1
,
a=
1
0'
0 _ ! C-.i
a k-l
I
-
-
Row vectors are indicated by primes.
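The degeneracy noted in Example 2.3 can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: it assumes the points and hyperplanes of PG(2, 2) are the integers 1 to 7 in binary order, with point j on hyperplane i exactly when the bitwise inner product of i and j is even, and it picks d = 2 so that n = 2d = 4.

```python
from fractions import Fraction

def parity(x):
    """Inner product mod 2 of two binary vectors, given their bitwise AND."""
    return bin(x).count("1") % 2

h = 7
# C[i][j] = 1 if point j+1 is NOT on hyperplane i+1 (assumed binary ordering).
C = [[parity(i & j) for j in range(1, h + 1)] for i in range(1, h + 1)]

d = 2                                              # any positive d works; n = 2d
t4 = [Fraction(0)] * 3 + [Fraction(d, 2)] * 4      # extreme vector t^4 of Example 2.3

# Dummy (slack) variables x_i = c_i . t - d, as in (2.7.2).
x = [sum(c * t for c, t in zip(row, t4)) - d for row in C]

# Columns of [C, -I] that carry a non-zero multiplier.
nonzero_columns = sum(1 for v in t4 + x if v != 0)
degenerate = nonzero_columns < h
```

Only five columns of [C, -I] carry non-zero multipliers (t4 to t7 and x4), fewer than the seven needed to span the column space, which is the degeneracy described in the example.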
It is probably clear from the previous discussion that, as
far as this problem is concerned, we regard convex sets and
linear programming primarily as a means of discovering theorems
in finite projective geometry.
Once the theorems are discovered
it is often easier to present them without explicitly giving the
linear programming approach.
Since computation is impossible,
the proofs almost have to be of an essentially finite geometrical
(or group theoretical) character.
For expository reasons the next
chapter will be much more like Chapter I than like the present
chapter.
There is a good deal of interest in the possible utility of linear programming methods in combinatorial problems (cf. reference [1]). In this regard it might be mentioned that the author had completed the work up to the beginning of this section, in coordinatized form and for q = 2, before MacDonald published his codes. Some of the features of the problem we have considered are common in applications of linear programming to combinatorial problems, for example, the highly patterned constraint matrix which rapidly gets too large for computation as the parameters are increased, and the requirement for integral solutions.
CHAPTER III

OPTIMUM CODES AND BOUNDS FOR $n > qd/(q-1)$

3.1 Codes with $n = (qd + 1)/(q-1)$
In following the program outlined in the last section of the previous chapter, the first codes we encounter can be described as follows (see Theorem 2.8).

Theorem 3.1. Let G be a generator matrix whose columns are coordinate vectors of all distinct points in PG(k-1, q) which do not lie on either of two disjoint flats. Then the following statements are true:

(a) If u and v are the dimensions of the respective flats, then $u + v \le k - 2$.

(b) $n = (q^k - q^{u+1} - q^{v+1} + 1)/(q-1)$.

(c) If W(j) is the number of words of weight j, then

    $W(q^{k-1}) = q^{k-u-v-2} - 1$,
    $W(q^{k-1} - q^u) = q^{k-u-v-2}(q^{u+1} - 1)$,
    $W(q^{k-1} - q^v) = q^{k-u-v-2}(q^{v+1} - 1)$,
    $W(q^{k-1} - q^u - q^v) = q^{k-u-v-2}(q^{u+1} - 1)(q^{v+1} - 1)$,
    $W(0) = 1$,

and for all other values of j, W(j) = 0.

(d) $d = q^{k-1} - q^u - q^v$, so that $n = (qd + 1)/(q-1)$.

(e) The code generated by G is of maxi-min distance.
A necessary and sufficient condition for the existence of a u-flat and a v-flat which do not intersect is that u+v be less than the dimensionality of the space. This proves (a).

To prove (b) we simply subtract the number of points on the u-flat and the v-flat from the total number of points in PG(k-1, q).

To prove (c) we first note from Theorem 1.6 that an m-flat is either contained in a hyperplane or intersects it in an (m-1)-flat (where by a (-1)-flat is meant an empty set). Let U and V denote respectively the u-flat and the v-flat of the theorem. A hyperplane contains both U and V if and only if it contains their span U+V, which is a (u+v+1)-flat. Hence, by formula (1.5.7), there are $(q^{k-u-v-2} - 1)/(q-1)$ hyperplanes which contain both U and V. It follows from Theorem 1.11 that there are $q^{k-u-v-2} - 1$ words of weight $q^{k-1}$, for each hyperplane of PG(k-1, q) fails to contain $q^{k-1}$ points, and if a hyperplane F contains both U and V, then all the points not contained in F have coordinate vectors which are columns of G.

There are $(q^{k-v-1} - 1)/(q-1)$ hyperplanes which contain V, and subtracting the ones enumerated above leaves

$q^{k-u-v-2}(q^{u+1} - 1)/(q-1)$

hyperplanes which contain V and intersect U in a (u-1)-flat. The number of points which are on U but not on such a hyperplane is $(q^{u+1} - q^u)/(q-1) = q^u$, and the weight of the (q-1) words corresponding to the hyperplane is affected by the fact that coordinate vectors of these points are not columns of G. Hence there are $q^{k-u-v-2}(q^{u+1} - 1)$ words of weight $q^{k-1} - q^u$.

Switching U and V in the above argument shows that there are $q^{k-u-v-2}(q^{v+1} - 1)$ words of weight $q^{k-1} - q^v$.

The only remaining possibility is that a hyperplane contains neither U nor V but intersects both, in a (u-1)-flat and a (v-1)-flat respectively. To such a hyperplane correspond (q-1) words of weight $q^{k-1} - q^u - q^v$. From the fact that there are a total of $q^k - 1$ non-null words in the code, we find that there are $q^{k-u-v-2}(q^{u+1} - 1)(q^{v+1} - 1)$ words remaining for the final weight category.

In proving part (d) of the theorem, the value of d follows immediately from part (c), and hence it follows from part (b) that $n = (qd + 1)/(q-1)$.

To prove part (e), recall that the sum of the elements in each column of C is $q^{k-1}$. Hence no matter what columns are chosen by the modular representation vector, the sum of the weights of all the words in the code is

(3.1.1)  $n(q-1)q^{k-1}$.

Since the smallest non-zero weight cannot be greater than the average weight of the non-null words of the code, we have

(3.1.2)  $d \le \dfrac{n(q-1)q^{k-1}}{q^k - 1}$.

This inequality is well known and can be proved in a variety of ways.

Substituting the value of n given in the theorem into (3.1.2) we have

$d \le \dfrac{q^{k-1}(q^k - q^{u+1} - q^{v+1} + 1)}{q^k - 1}$.

The difference between this upper bound and the actual $d = q^{k-1} - q^u - q^v$ is

$\Big[q^u - \dfrac{q^{k-1}(q^{u+1} - 1)}{q^k - 1}\Big] + \Big[q^v - \dfrac{q^{k-1}(q^{v+1} - 1)}{q^k - 1}\Big] = \dfrac{q^{k-1} - q^u}{q^k - 1} + \dfrac{q^{k-1} - q^v}{q^k - 1} = \dfrac{2q^{k-1} - q^u - q^v}{q^k - 1} < 1$.

Since the value of d must be an integer, this shows that d is maximum, and completes the proof of the theorem.
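The final step of the proof compares the bound (3.1.2) with the actual minimum distance. A small sketch of that comparison, with hypothetical function names and exact rational arithmetic, is given here for readers who wish to try other parameters.

```python
from fractions import Fraction

def average_weight_bound(n, k, q):
    """Right-hand side of (3.1.2): n (q-1) q^(k-1) / (q^k - 1)."""
    return Fraction(n * (q - 1) * q**(k - 1), q**k - 1)

def theorem_3_1_parameters(k, q, u, v):
    """Length n and minimum distance d from parts (b) and (d) of Theorem 3.1."""
    n = (q**k - q**(u + 1) - q**(v + 1) + 1) // (q - 1)
    d = q**(k - 1) - q**u - q**v
    return n, d

def bound_gap(k, q, u, v):
    """Difference between the bound (3.1.2) and d; the proof shows it is below 1."""
    n, d = theorem_3_1_parameters(k, q, u, v)
    return average_weight_bound(n, k, q) - d

n, d = theorem_3_1_parameters(5, 2, 2, 1)
gap = bound_gap(5, 2, 2, 1)
```

For k = 5, q = 2, u = 2, v = 1 this gives n = 21, d = 10 and a gap of 26/31, which agrees with the closed form (2q^{k-1} - q^u - q^v)/(q^k - 1).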
By simplifying the logic used in the previous proof, we can show that for the codes given in Theorem 2.8,

    $W(q^{k-1}) = q^{k-u-1} - 1$,
    $W(q^{k-1} - q^u) = q^k - q^{k-u-1}$,
    $W(0) = 1$,

and W(i) = 0 for all other values of i. Similarly if $d = q^{k-1}$, then the generator matrix corresponding to $\hat{\mathbf{t}}$ has as columns the coordinates of all the points of PG(k-1, q), so that every non-null word is of weight $q^{k-1}$.

As we mentioned in Section 1.6, a vector over GF(q) may be regarded as a number written to the base q. For example, if q = 2, the vector (1 0 1 1 0 1 1) represents the number 91. In Example 2.3 we ordered the points and hyperplanes of PG(2, 2) in ascending order if they are regarded as binary numbers. We will do the same thing on the next page in giving the matrix C for PG(4, 2), or what is the same thing except for the presence of the null vector, for EG(5, 2).
Figure 3.1.1
Matrix C for EG(5,2) or PG(4,2), Points and Hyperplanes in Binary Order
MacDonald [24] discovered that when the rows and columns of C were in binary order, then by deleting the first m columns, where $m = 2^u - 1$ or $m = 2^u$ (u = 1, 2, ..., k-1), he got a code which was optimum by (3.1.2). From our point of view, deleting the first $2^u - 1$ columns amounts to deleting a particular (u-1)-flat. Deleting the $2^u$-th column amounts to deleting a disjoint v-flat with v = 0. When v > 0, MacDonald did not discover any codes of the type described in Theorem 3.1. This is because these codes cannot be obtained by deleting columns in order.
Example 3.1. For k = 5, q = 2, let the rows and columns of C be ordered as in Figure 3.1.1. Let U be the 2-flat defined by the equations
    $x_1 = 0$, $x_2 = 0$.
Then U consists of points 1 through 7 of PG(4,2). Let V be the disjoint 1-flat defined by the equations
    $x_3 = 0$, $x_4 = 0$, $x_5 = 0$.
Then V consists of points 8, 16, and 24. It is easily verified from Figure 3.1.1 that
    $W(q^{k-1}) = W(16) = 2^{5-2-1-2} - 1 = 0$,
    $W(q^{k-1} - q^u) = W(12) = 1 \cdot (2^3 - 1) = 7$,
    $W(q^{k-1} - q^v) = W(14) = 1 \cdot (2^2 - 1) = 3$,
    $W(q^{k-1} - q^u - q^v) = W(10) = 1 \cdot 7 \cdot 3 = 21$,
as predicted by Theorem 3.1. It is also easy to illustrate the steps of the proof of the theorem, keeping in mind the definition of C.

If the first 10 columns are omitted from Figure 3.1.1, then words 12 and 28 are of weight 9, so that the resulting code is not of maxi-min distance.
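The counts in Example 3.1 can be checked by brute force. The Python sketch below assumes, as the "binary order" of Figure 3.1.1 suggests, that point p and hyperplane h are identified with the 5-bit binary expansions of p and h, and that coordinate p of word h is their inner product mod 2. The function names are illustrative only and do not come from the text.

```python
from collections import Counter

def dot2(h, p):
    """Inner product mod 2 of the binary expansions of h and p."""
    return bin(h & p).count("1") % 2

def word_weights(k, omitted):
    """Length n and the weights of the non-null words of the binary code
    whose generator columns are the points 1..2^k-1 not in `omitted`."""
    points = [p for p in range(1, 2 ** k) if p not in omitted]
    weights = {h: sum(dot2(h, p) for p in points) for h in range(1, 2 ** k)}
    return len(points), weights

# U = points 1..7 (a 2-flat), V = points 8, 16, 24 (a disjoint 1-flat)
n, w = word_weights(5, set(range(1, 8)) | {8, 16, 24})
dist = dict(Counter(w.values()))

# Deleting the first 10 columns in order instead
n10, w10 = word_weights(5, set(range(1, 11)))
```

Under these assumptions `dist` contains 3 words of weight 14, 7 of weight 12, and 21 of weight 10 (none of weight 16), while `w10[12]` and `w10[28]` both equal 9.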
3.2 Codes with n = (qd+m-1)/(q-1)
When our previous result is stated in the form of Theorem 3.1, it is easy to conjecture that perhaps more than two disjoint flats can be omitted. However, as pointed out in part (d) of the following theorem, the resulting codes are not always optimum. In the following discussion we will use the convention of calling the null set a (-1)-flat.
Theorem 3.2. Let G be a generator matrix whose columns are coordinate vectors of all distinct points in PG(k-1,q) which do not lie on any of m disjoint flats $U_1, U_2, \ldots, U_m$. Let T be the set consisting of the m flats $U_i$, and let S be any one of the $2^m$ subsets of T including the null set and T itself. Then the following statements are true:

(a) $n = (q^k - 1 - \sum_{i=1}^{m} (q^{u_i+1} - 1))/(q-1)$, where $u_i$ is the dimension of $U_i$.

(b) For any choice of S,
    $W(q^{k-1} - \sum_{\bar{S}} q^{u_i}) = (q-1) N[S]$,
where $\bar{S}$ is the set of all $U_i$ not in S, and N[S] is the number of hyperplanes containing all the flats in S and none of the flats in $\bar{S}$. To calculate N[S], see formula (3.2.6) below.

(c) Except for W(0) = 1, W(j) = 0 for all values of j not enumerated in part (b). Hence $d \ge q^{k-1} - \sum_{i=1}^{m} q^{u_i}$. If there exists a hyperplane which does not contain any of the flats $U_i$, then $d = q^{k-1} - \sum_{i=1}^{m} q^{u_i}$ and n = (qd+m-1)/(q-1).

(d) The codes are not of maxi-min distance for all choices of the flats $U_i$.
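The relation between n and d asserted in part (c) follows from part (a) by a short computation. It is spelled out here as a check, using nothing beyond the statement of the theorem:

```latex
\begin{align*}
(q-1)\,n &= q^k - 1 - \sum_{i=1}^{m}\bigl(q^{u_i+1} - 1\bigr) \\
         &= q\Bigl(q^{k-1} - \sum_{i=1}^{m} q^{u_i}\Bigr) + m - 1 \\
         &= qd + m - 1 .
\end{align*}
```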
To prove (a) we simply subtract the number of points on the flats in T from the total number of points in PG(k-1,q).

To prove (b), let N(S) be the number of hyperplanes in PG(k-1,q) which contain all the flats in S. By formula (1.5.7),
(3.2.1)    $N(S) = (q^{k-S^*-1} - 1)/(q-1)$,
where S* is the dimension of the flat spanned by the flats in S. Let $\bar{S}$ be the complementary set of S in T, and let N[S] be the number of hyperplanes in PG(k-1,q) which contain all the flats in S but do not contain any flats in $\bar{S}$. Then if $M(\bar{S})$ is the number of hyperplanes which contain all the flats in S and at least one of the flats in $\bar{S}$,
(3.2.2)    $N[S] = N(S) - M(\bar{S})$.
To calculate $M(\bar{S})$, assume first that $\bar{S}$ contains two flats $U_1$ and $U_2$. Then it is easy to see that
(3.2.3)    $M(\bar{S}) = N(SU_1) + N(SU_2) - N(SU_1U_2)$,
where $N(SU_1U_2)$ is the number of hyperplanes which contain all the flats in S, $U_1$, and $U_2$, $N(SU_1)$ and $N(SU_2)$ being similarly defined. To extend (3.2.3) to the general case of r flats $U_1, U_2, \ldots, U_r$ in $\bar{S}$, let
(3.2.4)    $\Sigma_1 = \sum_{i} N(SU_i)$,  $\Sigma_2 = \sum_{i<j} N(SU_iU_j)$,  $\Sigma_3 = \sum_{i<j<k} N(SU_iU_jU_k)$,
and so on, $\Sigma_p$ (p = 1, 2, ..., r) being the sum of $\binom{r}{p}$ summands corresponding to the distinct choices of p flats out of r. Then
(3.2.5)    $M(\bar{S}) = \Sigma_1 - \Sigma_2 + \Sigma_3 - \Sigma_4 + \cdots + (-1)^{r+1} \Sigma_r$.
The proof of (3.2.5) exactly parallels the derivation of the probability of realizing at least one among r events (cf. Feller [26]). From (3.2.2) and (3.2.5) we have
(3.2.6)    $N[S] = N(S) - \Sigma_1 + \Sigma_2 - \Sigma_3 + \cdots + (-1)^r \Sigma_r$.
N(S) can be calculated from (3.2.1), and the quantities $\Sigma_p$ may be calculated as follows. Let S' be the set consisting of S and the flats $U_{i_1}, U_{i_2}, \ldots, U_{i_p}$ from $\bar{S}$. Then substituting S' for S in formula (3.2.1), we can calculate $N(SU_{i_1}U_{i_2} \cdots U_{i_p})$. We then calculate $\Sigma_p$ from (3.2.4).
Now since N[S] is the number of hyperplanes which contain all the flats in S but do not contain any of the other flats in T, it follows from Theorem 1.11 that there are (q-1)N[S] words of weight $q^{k-1} - \sum_{\bar{S}} q^{u_i}$, where $u_i$ is the dimension of flat $U_i$ in $\bar{S}$. This is so because each of the N[S] hyperplanes intersects each of the flats $U_i$ in $\bar{S}$ in a $(u_i - 1)$-flat. The number $q^{u_i}$ of points which are on $U_i$ but not on the hyperplane must be subtracted from the weight of each of the (q-1) words corresponding to the hyperplane. Since the flats $U_i$ are disjoint, the total subtrahend is the sum of the $q^{u_i}$.

To prove (c), note that by Theorem 1.11, every non-null word of the code corresponds to some hyperplane which determines its weight. Any given hyperplane will contain some subset S of the flats in T, and fail to contain some subset $\bar{S}$ of these flats. Hence all the possibilities have been considered. If there exists a hyperplane which contains none of the flats in T, so that $\bar{S} = T$, then by part (b) the weight of the corresponding words is
    $q^{k-1} - \sum_{i=1}^{m} q^{u_i}$,
which is obviously a minimum. From this value for d and part (a) of the theorem, we calculate
    $n = (qd + m - 1)/(q-1)$.
Hence if n and q are given, d depends on the number m of flats which are omitted.
To prove (d) it is sufficient to exhibit a code, satisfying the conditions of the theorem, which is not of maxi-min distance. Suppose m = 3 and let $U_1$, $U_2$, $U_3$ be three non-collinear points of PG(k-1,2). If $k \ge 3$ it is easy to verify from Figure 3.1.1 that there exists a hyperplane which does not contain any of these points. The word corresponding to the hyperplane is therefore of weight $q^{k-1} - 3$. However if we delete three collinear points, in which case the code falls under Theorem 2.8, then $d = q^{k-1} - 2$. This completes the proof of Theorem 3.2.
Example 3.2. Let k = 5 and, referring again to Figure 3.1.1, let the m = 3 flats $U_i$ be defined as follows:
    $U_1$ = 2-flat consisting of points 1 through 7,
    $U_2$ = 1-flat consisting of points 8, 16, and 24,
    $U_3$ = 1-flat consisting of points 15, 20, and 27.
Then we have
    $n = ((2^5 - 1) - (2^3 - 1) - (2^2 - 1) - (2^2 - 1))/(2 - 1) = 18$.
The $2^3 = 8$ subsets S described in the theorem may be numbered as follows:
    $S_1 = U_1, U_2, U_3$
    $S_2 = U_2, U_3$
    $S_3 = U_1, U_3$
    $S_4 = U_1, U_2$
    $S_5 = U_1$
    $S_6 = U_2$
    $S_7 = U_3$
    $S_8 = \phi$, the null set
Using formulae (3.2.1), (3.2.2), (3.2.6) and the accompanying definitions, we calculate
    $N[S_1] = N(S_1) - M(\bar{S}_1) = N(S_1) - M(\phi) = N(S_1) - 0 = 0$, since $S_1^* = 4$,
    $N[S_2] = N(S_2) - M(\bar{S}_2) = N(S_2) - N(S_2U_1) = 1 - 0 = 1$, since $S_2^* = 3$, $(S_2U_1)^* = 4$,
    $N[S_3] = N(S_3) - N(S_3U_2) = 0 - 0 = 0$,
    $N[S_4] = N(S_4) - N(S_4U_3) = 0 - 0 = 0$,
    $N[S_5] = N(S_5) - (N(S_5U_2) + N(S_5U_3) - N(S_5U_2U_3)) = 3 - (0 + 0 - 0) = 3$,
    $N[S_6] = N(S_6) - (N(S_6U_1) + N(S_6U_3) - N(S_6U_1U_3)) = 7 - (0 + 1 - 0) = 6$,
    $N[S_7] = N(S_7) - (N(S_7U_1) + N(S_7U_2) - N(S_7U_1U_2)) = 7 - (0 + 1 - 0) = 6$,
    $N[S_8] = N(S_8) - (N(U_1) + N(U_2) + N(U_3) - N(U_1U_2) - N(U_1U_3) - N(U_2U_3) + N(U_1U_2U_3))$
            $= 31 - (3 + 7 + 7 - 0 - 0 - 1 + 0) = 15$.
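The inclusion-exclusion computation of N[S] by formulae (3.2.1) and (3.2.6) can be sketched in Python for q = 2. The sketch assumes that points are represented by the binary expansions of their numbers, so that the dimension of a span is a rank over GF(2); the helper names (`span_dim`, `N_round`, `N_square`) are hypothetical and chosen only for this illustration.

```python
from itertools import combinations

def span_dim(points):
    """Projective dimension of the flat of PG(k-1,2) spanned by `points`
    (given as bitmasks); the empty set has dimension -1."""
    basis = []                      # kept sorted, largest leading bit first
    for p in points:
        for b in basis:
            p = min(p, p ^ b)       # clear b's leading bit from p if present
        if p:
            basis.append(p)
            basis.sort(reverse=True)
    return len(basis) - 1

def N_round(k, flats):
    """Formula (3.2.1) with q = 2: hyperplanes containing all given flats."""
    s_star = span_dim([p for flat in flats for p in flat])
    return 2 ** (k - s_star - 1) - 1

def N_square(k, S, S_bar):
    """Formula (3.2.6): hyperplanes containing every flat in S and no flat
    in S_bar, by inclusion-exclusion over subsets of S_bar."""
    total = 0
    for p in range(len(S_bar) + 1):
        for extra in combinations(S_bar, p):
            total += (-1) ** p * N_round(k, list(S) + list(extra))
    return total

U1, U2, U3 = tuple(range(1, 8)), (8, 16, 24), (15, 20, 27)
T = [U1, U2, U3]
counts = {}
for r in range(len(T) + 1):
    for S in combinations(T, r):
        S_bar = [U for U in T if U not in S]
        counts[S] = N_square(5, S, S_bar)
```

With the flats of Example 3.2 the eight values in `counts` are 0, 1, 0, 0, 3, 6, 6, and 15 for the subsets numbered $S_1$ through $S_8$, and they sum to the 31 hyperplanes of PG(4,2).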
From the above calculations we can easily determine the distribution of word weights. We will also give the individual words which correspond to the different sets $S_i$, and by reference to Figure 3.1.1 it is easy to see just how the theorem works in this example.

    Set     Word Weight    No. of Words    Individual Words (Hyperplanes)
    S_2         12              1          3
    S_5         12              3          8, 16, 24
    S_6         10              6          1, 2, 4-7
    S_7         10              6          9, 10, 21, 22, 28, 31
    S_8          8             15          all others

For the other sets, $N[S_i] = 0$. The word weights are obtained from part (b) of the theorem. Corresponding to $S_5$, for example, we have $N[S_5]$ words of weight
    $q^{k-1} - q^{u_2} - q^{u_3} = 2^4 - 2^1 - 2^1 = 12$.
We note that bound (3.1.2) tells us only that
    $d \le 18 \cdot 16/31 = 9.3$,
so that this bound does not establish the optimality of the code. We shall see later that the code is of maxi-min distance.
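The table of Example 3.2 can be reproduced directly from the definitions. The following sketch again assumes the binary-order identification of points and hyperplanes with 5-bit numbers and a mod-2 inner product; the dictionary layout is an illustrative choice, not part of the original construction.

```python
def dot2(h, p):
    """Inner product mod 2 of the binary expansions of h and p."""
    return bin(h & p).count("1") % 2

flats = {"U1": range(1, 8), "U2": (8, 16, 24), "U3": (15, 20, 27)}
omitted = {p for flat in flats.values() for p in flat}
kept = [p for p in range(1, 32) if p not in omitted]   # generator columns

table = {}
for h in range(1, 32):
    # flats lying entirely on hyperplane h
    contained = tuple(name for name, flat in flats.items()
                      if all(dot2(h, p) == 0 for p in flat))
    weight = sum(dot2(h, p) for p in kept)
    table.setdefault((contained, weight), []).append(h)
```

Each key of `table` pairs the set S of flats contained in a hyperplane with the weight of the corresponding word, and the value lists the hyperplanes, which is the grouping used in the table above.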
Now let us return to the general problem of determining when the codes of Theorem 3.2 are optimum. By part (c) of the theorem, if there exists some hyperplane which does not contain any of the flats $U_i$, then n = (qd + m - 1)/(q-1). Let us assume for the moment that such a hyperplane exists. (This assumption will be discussed in the next section.) Then for given n and k, the larger m is, the smaller d is. Hence, if a code is to be of maxi-min distance, the number
    $(q^k - 1)/(q-1) - n$
of points to be omitted must be omitted on the smallest possible number m of flats. For example, by violating this condition, we constructed a non-optimum code in the proof of part (d) of the theorem.

Now according to Theorem 1.12, a code which is not of maxi-min distance is not of minimum redundancy. However, it is not difficult to construct codes of the type defined in Theorem 3.2 which are of maxi-min distance, but which are not of minimum redundancy. The code of Example 3.2 is such a code. It has parameters n = 18, k = 5, and d = 8, whereas a code with parameters n = 16, k = 5, and d = 8 can be constructed by omitting a hyperplane from PG(4,2). Such a code is proved of minimum redundancy in Theorem 2.8.
It is easy to see why the code of Example 3.2 is not of minimum redundancy. If we define
(3.2.7)    $f = q^{k-1} - d$,
then, under the assumption that some hyperplane does not contain any of the flats $U_i$, we have by Theorem 3.2 part (c),
    $d = q^{k-1} - \sum_{i=1}^{m} q^{u_i}$,
(3.2.8)    $f = \sum_{i=1}^{m} q^{u_i}$,
and
    $n = (qd + m - 1)/(q-1)$.
Hence for a given k and d (which by (3.2.7) means for a given f) we must minimize m in order to minimize n. This motivates the following definition.
Definition 3.1. Let V be an (n, k) code with minimum distance d and generator matrix G. Suppose the columns of G are coordinate vectors of all distinct points in PG(k-1, q) which do not lie on any of m disjoint flats. Suppose further that m is the smallest number of positive integers of the form $q^s$ (s = 0, 1, ..., k-2) into which $f = q^{k-1} - d$ can be partitioned. Then V is an m-code.
Note that the definition of the m-codes does not depend on the assumption that some hyperplane does not contain any of the disjoint flats. The m-codes are a subset of the codes defined in Theorem 3.2. The m-codes with m = 0 and m = 1 are the [...] codes and the codes of Theorem 2.8 respectively. The codes of Theorem 3.1 are all m-codes if q > 2. If q = 2, they are m-codes unless the two omitted flats are of the same dimension. This is not permitted since $2^s + 2^s = 2^{s+1}$, so that f could be partitioned into fewer terms. In general, no more than q-1 flats of the same dimension can be omitted.
If q = 3, the following table shows the partitioning of various values of f.

     f   partition          f   partition           f   partition
     1 = 1                 10 = 9+1                19 = 9+9+1
     2 = 1+1               11 = 9+1+1              20 = 9+9+1+1
     3 = 3                 12 = 9+3                21 = 9+9+3
     4 = 3+1               13 = 9+3+1              22 = 9+9+3+1
     5 = 3+1+1             14 = 9+3+1+1            23 = 9+9+3+1+1
     6 = 3+3               15 = 9+3+3              24 = 9+9+3+3
     7 = 3+3+1             16 = 9+3+3+1            25 = 9+9+3+3+1
     8 = 3+3+1+1           17 = 9+3+3+1+1          26 = 9+9+3+3+1+1
     9 = 9                 18 = 9+9                27 = 27
Partitioning f into powers of q is closely analogous to expressing f as a number to the base q. If $q^{s^*}$ is the highest power of q which does not exceed f, then
    $m \le (q-1)(s^* + 1)$,
as may be seen by examining the partition of 26.
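The analogy with base-q notation gives an immediate way to compute the partition. The sketch below is a minimal illustration with a hypothetical function name; it assumes f is a non-negative integer and does not check the restriction $s \le k-2$ of Definition 3.1.

```python
def partition_into_powers(f, q):
    """Partition of f into the fewest powers of q, read off from the
    base-q digits of f: the power q**s is used a_s times, where a_s is
    the digit of f in position s."""
    terms, s = [], 0
    while f > 0:
        f, digit = divmod(f, q)
        terms = [q ** s] * digit + terms   # larger powers first
        s += 1
    return terms
```

For instance, `partition_into_powers(26, 3)` returns `[9, 9, 3, 3, 1, 1]`, the entry for f = 26 in the table, so that m = 6 = (q-1)(s*+1) with s* = 2.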
Let us consider the procedure for actually constructing an m-code. For a given q, k, and d, it is easy to partition $f = q^{k-1} - d$ into the proper summands. But we must then determine whether disjoint flats, with dimensions equal to the exponents of the summands, actually exist in PG(k-1, q). For example, suppose q = 2, k = 5, and d = 6. Then
    $f = 10 = 2^3 + 2^1$,
but a 3-flat and a 1-flat cannot be disjoint in PG(4, q). Hence an m-code does not exist for these values of q, k, and d.
3.3 Existence Proofs
In order to establish the existence of the m-codes we need a number of geometric theorems which are extensions of our Theorem 1.7. These theorems are of independent interest. Let |F| denote the number of points in a set F.

Theorem 3.3. If F is a set of points in PG(k-1,q) which has a non-empty intersection with every line in PG(k-1,q), then
    $|F| \ge (q^{k-1} - 1)/(q-1)$.
Equality holds if, and only if, F is a hyperplane.
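The smallest case of Theorem 3.3 can be checked exhaustively. The following sketch takes k = 3, q = 2, where PG(2,2) has 7 points and the bound is $(2^2 - 1)/(2 - 1) = 3$. It assumes the points are numbered 1 through 7 in binary order, so that the third point on the line through a and b is a XOR b.

```python
from itertools import combinations

points = range(1, 8)                      # the 7 points of PG(2,2)
lines = {frozenset((a, b, a ^ b)) for a, b in combinations(points, 2)}

def meets_every_line(F):
    """True if the point set F has a point in common with every line."""
    return all(F & line for line in lines)

# all point sets of each size up to 3 that meet every line
blocking = {size: [frozenset(F) for F in combinations(points, size)
                   if meets_every_line(frozenset(F))]
            for size in range(0, 4)}
```

No set of fewer than 3 points meets every line, and the sets of exactly 3 points that do so are precisely the 7 lines, which are the hyperplanes of PG(2,2).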
We note first that if F is a hyperplane it intersects every line and $|F| = (q^{k-1} - 1)/(q-1)$.

Now let us suppose only that F intersects every line. Let S be the complementary set of F and let P be any point of S. By formula (1.5.6) there are $(q^{k-1} - 1)/(q-1)$ distinct lines $\ell_i$ on P. By assumption each $\ell_i$ has a point $Q_i$ in common with F. If $Q_i = Q_j$, then $\ell_i = PQ_i = PQ_j = \ell_j$, which contradicts the fact that the lines $\ell_i$ are all distinct. Hence the points $Q_i$ are distinct and $|F| \ge (q^{k-1} - 1)/(q-1)$.

Now suppose that F intersects every line and that $|F| = (q^{k-1} - 1)/(q-1)$. Then the points $Q_i$ described above are all the points of F. Since P was an arbitrary point of S, we conclude that any line which contains a point of S contains but one point of F. In other words, a line which is on two points of F is contained in F. This shows that F is a flat space, and from the number of points it contains F is clearly a (k-2)-flat or hyperplane. This completes the proof of the theorem.

Corollary.
If S is a set of $q^{k-1}$ points in PG(k-1,q) which does not contain all the points of any line, then S is the complementary set of a hyperplane. Any set which contains more points contains at least one line.

This is just a restatement of the theorem in terms of S, the complementary set of F. It clearly implies Theorem 1.7 when [...]
Theorem 3.4. If F is a set of points in PG(k-1,q) which has a non-empty intersection with every v-flat, then
    $|F| \ge (q^{k-v} - 1)/(q-1)$.
Equality holds if, and only if, F is a (k-v-1)-flat.

We note that if F is a (k-v-1)-flat, it has a non-empty intersection with every v-flat (by Theorem 1.6) and
    $|F| = (q^{k-v} - 1)/(q-1)$.

The remainder of the proof is by induction on v. The theorem is obviously true if v = 0, and for v = 1 it reduces to Theorem 3.3. We assume that it is true for v* if $v^* \le v-1$.

Assume that F has a non-empty intersection with every v-flat and that $|F| \le (q^{k-v} - 1)/(q-1)$. Then, by the inductive hypothesis, there exists a (v-1)-flat L having no points in common with F, since $|F| < (q^{k-v+1} - 1)/(q-1)$. By formula (1.5.6) there are $(q^{k-v} - 1)/(q-1)$ distinct v-flats on L which we denote by $\Pi_1, \Pi_2, \ldots, \Pi_m$, $m = (q^{k-v} - 1)/(q-1)$. By assumption each $\Pi_i$ has a point $P_i$ in common with F.

The points $P_i$ are all distinct, for if $P_i = P_j$ then $\Pi_i = \Pi_j$, since $\Pi_i$ and $\Pi_j$ would have the (v-1)-flat L and the point $P_i = P_j$, not on L, in common. Hence F contains at least $(q^{k-v} - 1)/(q-1)$ points, as asserted by the theorem.
Now suppose F intersects every v-flat and that $|F| = (q^{k-v} - 1)/(q-1)$. Then F is just the set of points $P_i$ described above. Let Q be a point on L, and consider the lines $QP_i$. Since $QP_i$ belongs to $\Pi_i$ and $QP_j$ does not, the lines $QP_i$ are all distinct. Hence if G is the set of all points on the lines $QP_i$, then G contains
    $q(q^{k-v} - 1)/(q-1) + 1 = (q^{k-v+1} - 1)/(q-1)$
points.

First assume that G is not a (k-v)-flat. Then, by the inductive hypothesis, there is a (v-1)-flat L' which has no points in common with G. But this implies that the v-flat determined by Q and L' (Q is not on L' since Q is on G) has no points in common with F. To see this, note that if $P_i$ lies on the v-flat QL', then the (v-1)-flat L' has a point in common with the line $QP_i$, which is in G, whereas L' has no point in common with G. Since by assumption F has a point in common with every v-flat, this is a contradiction. Hence G is a (k-v)-flat.

Let Q' be a point on L distinct from Q. Let G' be the set of all points on the lines $Q'P_i$. Then G' is a (k-v)-flat just as G is. The (k-v)-flats G and G' are not identical, since if Q belonged to G' it would lie on a line $Q'P_i = Q'Q$ and $P_i$ would be on L. Hence the intersection of G and G' is a flat space of dimension not greater than k-v-1. Since the $(q^{k-v} - 1)/(q-1)$ points $P_i$ all belong both to G and G', we see that $G \cap G'$ is a (k-v-1)-flat which consists of just the points $P_i$. That is, $G \cap G' = F$. This completes the proof of the theorem.
Corollary I. If S is a set of $(q^k - q^{k-v})/(q-1)$ points in PG(k-1,q) which does not contain all the points of any v-flat, then S is the complementary set of a (k-v-1)-flat. Any set which contains more points contains at least one v-flat.

This is just a restatement of the theorem in terms of S, the complementary set of F. It is in this form that we will most often use the theorem.
Corollary II. If F is a set of points in PG(k-1,q) such that if $\Pi$ is any v-flat then the intersection $F \cap \Pi$ contains a u-flat, then
    $|F| \ge (q^{k-v+u} - 1)/(q-1)$.
Equality holds if, and only if, F is a (k-v+u-1)-flat.

To prove the corollary, let $\Pi$ be any v-flat and let L be a u-flat which is contained in $F \cap \Pi$. By the conditions of the corollary, such an L exists. Since every (v-u)-flat in $\Pi$ has at least a point in common with L, F has a non-empty intersection with every (v-u)-flat in $\Pi$. Since every (v-u)-flat is contained in some v-flat, F has a non-empty intersection with every (v-u)-flat. Hence the corollary follows from the theorem on replacing v by v-u.
The above corollary suggests the following theorem.

Theorem 3.5. If F is a set of points in PG(k-1,q) such that if $\Pi$ is any v-flat then the intersection $F \cap \Pi$ contains at least $(q^{u+1} - 1)/(q-1)$ points, then
    $|F| \ge (q^{k-v+u} - 1)/(q-1)$.
Equality holds if, and only if, F is a (k-v+u-1)-flat.
We note first that if F is a (k-v+u-1)-flat then
    $|F| = (q^{k-v+u} - 1)/(q-1)$,
and if $\Pi$ is any v-flat then $F \cap \Pi$ contains at least a u-flat so that $|F \cap \Pi| \ge (q^{u+1} - 1)/(q-1)$.

The remainder of the proof is by induction on u. For u = 0 the theorem reduces to Theorem 3.4. We assume that it is true for u* if $u^* \le u-1$, and give a proof for u.

Let us assume that either
    (a) $|F| < (q^{k-v+u} - 1)/(q-1)$, or
    (b) $|F| = (q^{k-v+u} - 1)/(q-1)$ and F is not a (k-v+u-1)-flat.
Then, by the inductive hypothesis for u* = u-1, there exists at least one (v-1)-flat L which has fewer than
    $(q^{u^*+1} - 1)/(q-1) = (q^u - 1)/(q-1)$
points in common with F. By formula (1.5.6) there are $(q^{k-v} - 1)/(q-1)$ distinct v-flats on L. No two of these v-flats have any points in common other than the points of L. Since by the assumption of the theorem every v-flat has at least $(q^{u+1} - 1)/(q-1)$ points in common with F, we have
    $|F| > \frac{q^{k-v} - 1}{q-1} \cdot \frac{q^{u+1} - 1}{q-1} - \left[ \frac{q^{k-v} - 1}{q-1} - 1 \right] \frac{q^u - 1}{q-1}$
    $= \frac{q^{k-v} - 1}{q-1} \, q^u + \frac{q^u - 1}{q-1}$
    $= \frac{q^{k-v+u} - 1}{q-1}$.
Hence either of the assumptions (a) or (b) leads to a contradiction, and the proof of the theorem is complete.
Corollary. If S is a set of $(q^k - q^{k-v+u})/(q-1)$ points in PG(k-1,q) which does not have more than $(q^{v+1} - q^{u+1})/(q-1)$ points in common with any v-flat (v > u), then S is the complementary set of a (k-v+u-1)-flat. If $|S| > (q^k - q^{k-v+u})/(q-1)$, then there exists a v-flat $\Pi$ such that $|S \cap \Pi| > (q^{v+1} - q^{u+1})/(q-1)$.

This is just a restatement of the theorem in terms of S, the complementary set of F.

Necessary and sufficient conditions for the existence of m-codes follow from Corollary I of Theorem 3.4.
Theorem 3.6. Suppose $f = q^{k-1} - d \ge 0$, and let f be partitioned into a minimum number of terms of the form $q^{u_i}$ (i = 1, 2, ..., m). Assume, without loss of generality, that $u_i \ge u_{i+1}$ for i = 1, 2, ..., m-1. Then the partition of f is unique, and a necessary and sufficient condition that there exist an m-code for the given values of q, k, and d is that
    $k - 1 > u_1 + u_2$.
(If m = 1 define $u_2 = 0$; if m = 0 define $u_1 = u_2 = 0$.)
To prove that the partition is unique, recall that any positive integer is uniquely expressible as a q-ary number
    $a_s q^s + a_{s-1} q^{s-1} + \cdots + a_1 q + a_0$,
where each $a_i$ is a non-negative integer $\le q-1$. As was pointed out in the previous section, in order that the partition of f contain a minimum number of terms, it is necessary that the same exponent not appear more than q-1 times. Hence the partition is easily seen to be obtained from the q-ary expression of the number simply by writing down $q^i$ $a_i$ times. The numbering of the disjoint flats is clearly arbitrary, hence it is no loss of generality to assume that they are numbered with flats of higher dimension first.

For m = 0 and m = 1, the theorem is trivial. If $m \ge 2$, the condition is necessary since otherwise a $u_1$-flat and a $u_2$-flat cannot be disjoint in PG(k-1, q).
To prove that the condition is sufficient, note that by Theorem 3.4, Corollary I, the existence of the m-code will be proved if we can show that
(3.3.1)    $\frac{q^k - 1}{q-1} - \sum_{i=1}^{p-1} \frac{q^{u_i+1} - 1}{q-1} > \frac{q^k - q^{k-u_p}}{q-1}$
for p = 2, 3, ..., m. The left-hand side is the number of points left in the geometry after the first p-1 flats are deleted, and the right-hand side is the number of points which ensure that a $u_p$-flat remains. The above inequality reduces to
    $\sum_{i=1}^{p-1} (q^{u_i+1} - 1) < q^{k-u_p} - 1$, for p = 2, 3, ..., m.
Since $q^{u_1}$ is the largest power of q which is less than f, we have
    $q^{u_1+1} \ge f + 1 = q^{u_1} + q^{u_2} + \cdots + q^{u_m} + 1$,
    $q \cdot q^{u_1+1} \ge q^{u_1+1} + q^{u_2+1} + \cdots + q^{u_m+1} + q$,
and finally,
    $q(q^{u_1+1} - 1) > (q^{u_1+1} - 1) + (q^{u_2+1} - 1) + \cdots + (q^{u_m+1} - 1)$.
Hence we see from (3.3.1) that an m-code will exist provided that
    $q(q^{u_1+1} - 1) < q^{k-u_2} - 1$,
where the right-hand side is the smallest possible value of $q^{k-u_p} - 1$ for p = 2, 3, ..., m.

Now the condition of the theorem is that $k - 1 > u_1 + u_2$, which may be written
    $k - u_2 \ge u_1 + 1 + 1$.
Hence
    $q \cdot q^{u_1+1} \le q^{k-u_2}$,
so that $q(q^{u_1+1} - 1) < q^{k-u_2} - 1$, and the proof of the theorem is complete.
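The existence test of Theorem 3.6 is easy to mechanize. The sketch below, with a hypothetical function name, reads the exponents $u_1 \ge u_2 \ge \cdots$ off the base-q digits of f and then applies the condition $k - 1 > u_1 + u_2$, using the convention of the theorem for m = 0 and m = 1.

```python
def m_code_exists(q, k, d):
    """Condition of Theorem 3.6: k - 1 > u1 + u2, where u1 >= u2 are the
    two largest exponents in the minimal partition of f = q**(k-1) - d
    into powers of q (taken as 0 when the partition is too short)."""
    f = q ** (k - 1) - d
    if f < 0:
        raise ValueError("the theorem assumes d <= q**(k-1)")
    exponents, s = [], 0
    while f > 0:
        f, digit = divmod(f, q)
        exponents = [s] * digit + exponents   # largest exponents first
        s += 1
    u = exponents[:2] + [0, 0]                # pad when m is 0 or 1
    return k - 1 > u[0] + u[1]
```

For q = 2, k = 5 the test rejects d = 6 (f = 8 + 2, a 3-flat and a 1-flat) and accepts d = 10 (f = 4 + 2, the code of Example 3.1), in agreement with the discussion at the end of Section 3.2.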
Consider the space PG(k-1,q). To each point P of the space we may assign a non-negative integral weight w(P). If we sum over all the points of the space, the number $W = \sum w(P)$ may be said to be the weight of PG(k-1, q).

Consider the set S of all points with non-zero weight. If each point of S is counted with a multiplicity equal to its weight, then the number of points in S is the weight of the whole space. We may write this as |S| = W.

Given a v-flat, $0 \le v \le k-1$, the weight of the v-flat is the sum of the weights of all the points in the v-flat. If V is the set of points in the v-flat, then the weight of the v-flat is the number of points in $S \cap V$, each point of $S \cap V$ being counted with multiplicity equal to its weight.
Theorem 3.7. If the weight of PG(k-1, q) is
    $W \ge \frac{q^k - q^{k-v}}{q-1} + 1$,    $0 \le v \le k-1$,
then there exists a v-flat of weight at least $(q^{v+1} - 1)/(q-1)$ or, in other words, the set S of points with non-zero weight has at least $(q^{v+1} - 1)/(q-1)$ points in common with some v-flat, counting multiplicities.
(i) The theorem is trivially true for the cases v = 0 and v = k-1. When v = 0, $W \ge 1$. Hence there is a point P with weight at least 1, and we may take P to be the required v-flat. When v = k-1,
    $W \ge \frac{q^k - q}{q-1} + 1 = \frac{q^k - 1}{q-1} = \frac{q^{v+1} - 1}{q-1}$,
and the whole (k-1)-space is the required flat.

It follows that the theorem is true whenever k = 1 or 2. If k = 1, then the only allowable value of v is 0, and if k = 2, then v = 0 or v = 1 = k-1.
(ii) Case k = 3. Here $0 \le v \le 2$, and the only new case is v = 1. In this case the hypothesis of the theorem is that $W \ge q^2 + 1$, say
    $W = q^2 + 1 + h$, where $h \ge 0$.
Through each point of PG(2,q) there pass q+1 lines. Hence a point of weight w contributes a weight w to (q+1) lines, and the total weight of all the lines is
    $(q+1)W = (q+1)(q^2 + 1 + h)$.
The average weight of a line is
    $\frac{(q+1)(q^2 + 1 + h)}{q^2 + q + 1} = q + \frac{1 + (q+1)h}{q^2 + q + 1} > q$.
Since weight is integral, there must exist at least one line of weight at least q+1, which proves the theorem for k = 3.

(iii) Case k = 4. Then $0 \le v \le 3$. We shall consider separately the two new subcases v = 1 and v = 2.

(a) Let k = 4, v = 2. By hypothesis
    $W = \frac{q^4 - q^2}{q-1} + 1 + h = q^2(q+1) + 1 + h$,    $h \ge 0$.
Through each point there pass $q^2 + q + 1$ planes. Hence the weight of all the planes is $(q^3 + q^2 + 1 + h)(q^2 + q + 1)$, and the average weight of a plane is
    $\frac{(q^3 + q^2 + 1 + h)(q^2 + q + 1)}{q^3 + q^2 + q + 1} = q^2 + q + \frac{1 + h(q^2 + q + 1)}{q^3 + q^2 + q + 1} > q^2 + q$.
Hence there is at least one plane of weight at least $q^2 + q + 1$, and the theorem is established.

(b) Case k = 4, v = 1. By hypothesis
    $W = q^3 + 1 + h$,    $h \ge 0$.
We shall prove that there is at least one plane of weight $q^2 + 1$. Then by the case k = 3, v = 1, there is a line of weight at least q+1 and the theorem will be established.

The average weight of all planes is
    $\frac{(q^3 + 1 + h)(q^2 + q + 1)}{q^3 + q^2 + q + 1} = q^2 + \frac{q + 1 + h(q^2 + q + 1)}{q^3 + q^2 + q + 1} > q^2$.
Hence there is a plane of weight at least $q^2 + 1$ and the proof is complete.
General case.
Having proved the theorem for
us assume it to be true for
k=t, and prove it for case
is, we assume that if the weight of
t
where
h >0
at least
VIe
and
0
~
v
q-l
~ t~l,
~
4, let
k=t+l.
That
PG(t-l, q) is
t-v
q -q
-
k
+l+h
,
then there exists a v-flat of weight
(qV+l_l)/(q_l).
have to show that if the weight of
t+l t+l-v
q
-q
+l+h
Wt =
q-l
PG( t, q)
is
100
where
h > 0 and 0 < v< t then there exists a v-flat of weight at
least
(qV+l_ l )/ (q-l).
- -
If v=t
the result follows by case (i).
If v
~
t-l, and if
we can show the existence of a (t-l)-flat with weight
t
t-v
q -q
+ 1 ;
q-l
then by the inductive hypothesis there will be a v-flat of weight at
least
(qV+l_ l )/ (q-l), and the theorem will be proved.
Now the number of (t-l)-flats on each point of PG(t, q) is
(qt_l)/(q_l).
Hence the average weight of all the (t-l)-flats is
(qt+l_ qt+l-V) (qt_ l )
(q-l)( qt+l_ 1 )
+
Clearly,
t
(l+h) (q -1)
('1t+l_ l )
h('1t_l)/('1t+l_ l ) ~ 0,
and
t
t-v
q -q
'1-1
if
( '1t+l -'1t+l-v+'1- 1)( '1t - 1) > ('1t -'1t-v)( '1t+l·- 1') ,
that is, if
('1-1 )( '1t -1 ) - (t+l
q -qt+l-V) > - ('1t -qt-v) ,
or
(q-l)('1t -1 ) - ( q-l) (qt -qt-v) > 0
.e
,
101
i.e· 1
But this is always the case since
v < t
and
q
~
2 •
Hence, since the average weight of a (k-l)-flat is greater than
(qt_qt-v)/(q_l), there exists a (t-l)-flat with weight at least
qt +qt-v
+ 1
q-l
and the inductive argument is complete.
This proves the theorem.
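The case k = 3, v = 1 of Theorem 3.7 is small enough to verify exhaustively for q = 2. The sketch below assumes the seven points of PG(2,2) are numbered 1 through 7 in binary order, represents a weighting of total weight W as a multiset of W points, and checks that some line always has weight at least q + 1 = 3 when $W = q^2 + 1 = 5$.

```python
from itertools import combinations, combinations_with_replacement

points = range(1, 8)                      # the 7 points of PG(2,2)
lines = {frozenset((a, b, a ^ b)) for a, b in combinations(points, 2)}

def heaviest_line(multiset):
    """Largest weight of a line, where `multiset` lists each point as
    many times as its weight."""
    return max(sum(multiset.count(p) for p in line) for line in lines)

q = 2
W = q * q + 1                             # hypothesis for k = 3, v = 1
worst = min(heaviest_line(ms)
            for ms in combinations_with_replacement(points, W))

# Weight q*q = 4 is not enough: the complement of the line {1, 2, 3}
below_threshold = heaviest_line((4, 5, 6, 7))
```

Here `worst` is 3, so every weighting of total weight 5 has a line of weight at least q + 1, while `below_threshold` is 2, which shows that the hypothesis on W cannot be weakened in this case.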
The idea of using integral weights to handle the problem of multiple points was used by Bose and Kuebler [7].

Theorem 3.7 is closely related to Theorem 3.4, Corollary I. We have removed the assumption that the points of S are all distinct and have shown that S has the proper number of points in common with some v-flat, counting multiplicities. It is possible to prove a theorem which is similarly related to Theorem 3.5, Corollary. We now do this for the special case where v = k-2, which we will need later.

Theorem 3.8. If the weight of PG(k-1, q) is
    $W \ge \frac{q^k - q^{u+2}}{q-1} + 1$,    $0 \le u \le k-2$,
then there exists a hyperplane of weight at least
    $\frac{q^{k-1} - q^{u+1}}{q-1} + 1$
or, in other words, the set S of points with non-zero weight has at least this many points in common with some hyperplane, counting multiplicities.
To prove the theorem let
    $W = \frac{q^k - q^{u+2}}{q-1} + 1 + h$,    $h \ge 0$.
Through each point of PG(k-1,q) there pass $(q^{k-1} - 1)/(q-1)$ hyperplanes. Hence the total weight of all the hyperplanes is $W(q^{k-1} - 1)/(q-1)$. Since there are $(q^k - 1)/(q-1)$ hyperplanes, the average weight of a hyperplane is
    $\frac{W(q^{k-1} - 1)}{q^k - 1} = \left[ \frac{q^k - q^{u+2}}{q-1} + 1 + h \right] \frac{q^{k-1} - 1}{q^k - 1}$
    $= \left[ \frac{q^k - 1}{q-1} - \frac{q(q^{u+1} - 1)}{q-1} + h \right] \frac{q^{k-1} - 1}{q^k - 1}$
    $= \frac{q^{k-1} - 1}{q-1} - \frac{q^{u+1} - 1}{q-1} \cdot \frac{(q^k - 1) - (q - 1)}{q^k - 1} + h \, \frac{q^{k-1} - 1}{q^k - 1}$
    $= \frac{q^{k-1} - q^{u+1}}{q-1} + \frac{q^{u+1} - 1}{q^k - 1} + h \, \frac{q^{k-1} - 1}{q^k - 1}$.
Hence, since the average weight of a hyperplane is greater than $(q^{k-1} - q^{u+1})/(q-1)$, there exists a hyperplane with weight at least
    $\frac{q^{k-1} - q^{u+1}}{q-1} + 1$
and the theorem is proved.
Theorem 3.9. If V is an (n,k) m-code, then
    $d = q^{k-1} - \sum_{i=1}^{m} q^{u_i}$
and
    $n = (qd + m - 1)/(q-1)$,
where $u_i$ is the dimension of $U_i$, the i-th omitted flat.
To prove the theorem, note that the m-codes are a subset of the codes defined in Theorem 3.2. By part (c) of Theorem 3.2, the assertions of Theorem 3.9 are true if there exists some hyperplane of PG(k-1,q) which does not contain any of the flats $U_i$.

To prove that such a hyperplane exists, let us assign to each hyperplane of PG(k-1,q) a weight which is equal to the number of flats $U_i$ which are contained in that hyperplane. By formula (1.5.7), a flat of dimension $\mu$ is contained in $(q^{k-\mu-1} - 1)/(q-1)$ hyperplanes. The dimensions of the omitted flats lie in the range $0 \le u_i \le k-2$ for i = 1, 2, ..., m. Since V is an m-code, no more than q-1 flats of the same dimension can be omitted. Hence the total weight of all the hyperplanes is not greater than
    $\sum_{i=0}^{k-2} (q^{k-i-1} - 1) = (1 - k) + q^{k-1} \sum_{i=0}^{k-2} q^{-i} = \frac{q^k - q}{q-1} - (k-1)$,
as we see by summing the finite geometric series.

Since the total number of hyperplanes in PG(k-1,q) is $(q^k - 1)/(q-1)$, the average weight of a hyperplane is not greater than
    $\frac{q^k - q - (k-1)(q-1)}{q^k - 1} < 1$.
Since the weight of the hyperplanes is integral, some hyperplane must have weight zero. This completes the proof of the theorem.
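The parameters given by Theorem 3.9 can be tabulated from the dimensions of the omitted flats alone. The sketch below is illustrative, with a hypothetical function name: it computes n from part (a) of Theorem 3.2 and d from Theorem 3.9, and checks that the two agree through n = (qd + m - 1)/(q-1). It assumes that some hyperplane contains none of the omitted flats, which Theorem 3.9 guarantees for m-codes but which must be checked separately for other choices of flats.

```python
def omitted_flat_code_params(q, k, dims):
    """Length n and distance d of the code obtained by omitting disjoint
    flats of dimensions `dims` from PG(k-1,q), assuming some hyperplane
    contains none of the flats."""
    m = len(dims)
    d = q ** (k - 1) - sum(q ** u for u in dims)
    n = (q ** k - 1 - sum(q ** (u + 1) - 1 for u in dims)) // (q - 1)
    assert n * (q - 1) == q * d + m - 1   # n = (qd + m - 1)/(q - 1)
    return n, d
```

For q = 2, k = 5 the dimensions [2, 1] give (21, 10), the code of Example 3.1; [3] gives (16, 8), the code with a hyperplane omitted; and [2, 1, 1] gives (18, 8), the code of Example 3.2.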
We might note that the above inequalities are very loose, and it is actually necessary to omit quite a large number of flats in order that every hyperplane contain at least one of them. Such codes do not appear promising, even though they do have a d somewhat larger than $q^{k-1} - \sum q^{u_i}$. Hence, in the definition of the m-codes, we have omitted a minimum number of flats.

We have also investigated omitting flats which are not disjoint. When no two flats have more than a single point in common this seems to work all right. But when the intersections are larger, the codes we have examined have been bad.
3.4 Optimality Proofs
Given a code with parameters n', k', and d', we have shown (formula (3.1.2)) that
    $d' \le \frac{n'(q-1) q^{k'-1}}{q^{k'} - 1}$.
By rearranging this inequality, we get
(3.4.1)    $q^{k'} \le \frac{qd'}{qd' - (q-1)n'}$,
provided the denominator is strictly positive.

There is a bound due to Plotkin [27] which follows from (3.4.1) and the following lemma.
Step-Down Lemma. If there exists an (n,k) code with minimum distance d, and if c is a positive integer less than k, then there exists an (n - c, k - c) code with minimum distance d.

The Step-Down Lemma is easily seen to be a special case of the following

Jump-Down Lemma. If there exists an (n, k) code V with minimum distance d, and if the generator matrix G of V contains some set of C columns all of which depend on some set of c columns, where c < k and $c \le C$, then there exists an (n - C, k - c) code with minimum distance d.
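The construction behind both lemmas, taking the words that vanish on the chosen c columns and then deleting every column on which all of these words are zero, can be sketched for binary codes as follows. The sketch assumes the binary-order representation used earlier (columns and messages as bitmasks, coordinates as inner products mod 2) and applies it to the code of Example 3.1 with c = 1; the function name `jump_down` is hypothetical.

```python
def dot2(a, g):
    """Inner product mod 2 of the binary expansions of a and g."""
    return bin(a & g).count("1") % 2

def jump_down(columns, k, chosen):
    """Non-zero messages whose words vanish on the `chosen` columns, and
    the columns that survive. A column is dropped when every such word
    is zero there, which includes all columns depending on `chosen`."""
    messages = [a for a in range(1, 2 ** k)
                if all(dot2(a, g) == 0 for g in chosen)]
    kept = [g for g in columns if any(dot2(a, g) for a in messages)]
    return messages, kept

# Code of Example 3.1: points of PG(4,2) outside {1..7} and {8, 16, 24}
columns = [p for p in range(9, 32) if p not in (16, 24)]
messages, kept = jump_down(columns, k=5, chosen=[9])
min_weight = min(sum(dot2(a, g) for g in kept) for a in messages)
```

Starting from n = 21, k = 5, d = 10, this leaves 15 non-null words on 20 coordinates, that is a (20, 4) code, whose minimum weight is still at least 10.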
The Jump-Down Lemma may be proved as follows. A given vector v in V may have zero coordinates in all the positions which correspond to the c columns mentioned in the lemma. Let V' be the set of all such vectors. Then V' is a subspace of V, and it is of rank at least k - c, since we have imposed c linear equations of the form
    $g_i \cdot a = 0$
on the vectors v = aG of V, where a is the k-vector which generates v.

Let the c columns of G be denoted $g_1, g_2, \ldots, g_c$, and let $g_j$ be any one of the C columns which depend on the c columns. If v is any vector in V', we will show that its j-th coordinate is zero.

Since V' is a subset of V, v = aG for some k-vector a. Hence the j-th coordinate of v is equal to
    $a \cdot g_j = a \cdot \sum_{i=1}^{c} \lambda_i g_i = \sum_{i=1}^{c} \lambda_i (a \cdot g_i)$,
where the $\lambda_i$ are the coefficients which express $g_j$ in terms of the c columns, and all the dot products $a \cdot g_i$ (i = 1, 2, ..., c) are zero since v belongs to V'.

If we omit all C coordinates from the vectors in V', then d is not decreased. Hence we have a code with minimum weight at least d, and of rank at least k - c. If the weight is greater than d or the rank greater than k - c, then clearly a code with smaller weight or rank exists. Hence the lemma is established.

To prove the Plotkin Bound, let B(n, d) be the maximum possible value of $q^k$ for given n and d. By the Step-Down Lemma, the existence of a code with parameters n, k, and d, where $q^k = B(n, d)$, implies the existence of a code with parameters n - c, k - c, and d, if c < k. Hence $B(n - c, d) \ge q^{k-c}$, and
(3.4.2)    $q^k = B(n, d) \le q^c B(n - c, d)$.
Now let V' be a code with parameters n' = n - c, k', and d' = d, such that $q^{k'} = B(n - c, d)$. Then from (3.4.1) and (3.4.2) we have
(3.4.3)    $q^k = B(n, d) \le \frac{q^c \, qd}{qd - (q-1)(n-c)}$, where c < k,
provided the denominator is positive, which means provided that c > n - qd/(q-1).

Assuming that c is sufficiently large so that the denominator is positive, we should use that value of c which minimizes the right hand side. Considering the ratio
    $\frac{q^c \, qd}{qd - (q-1)(n-c)} \Big/ \frac{q^{c+1} \, qd}{qd - (q-1)(n-c-1)} = \frac{qd - (q-1)(n-c-1)}{q[qd - (q-1)(n-c)]} = \frac{1}{q} + \frac{q-1}{q[qd - (q-1)(n-c)]}$,
we see that it is greater than 1 provided that
    $qd - (q-1)(n-c) < 1$,
and in this case we should increase the value of c to c + 1. Otherwise we should not.

The inequality qd - (q-1)(n-c) < 1 can be written c < n - (qd - 1)/(q-1), hence to optimize, we take for c the smallest integer $\ge n - (qd - 1)/(q-1)$. This gives us the Plotkin bound.
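The choice of c and the resulting bound can be computed exactly with rational arithmetic. The sketch below is a minimal illustration with a hypothetical function name; it takes c to be the smallest non-negative integer not less than n - (qd - 1)/(q-1) and does not enforce the side condition c < k, which must be checked by the caller.

```python
from fractions import Fraction
from math import ceil

def plotkin_bound(q, n, d):
    """Return (c, bound), where bound is the right-hand side of (3.4.3),
    an upper bound on q**k, with c chosen as described above."""
    c = max(0, ceil(Fraction((q - 1) * n - (q * d - 1), q - 1)))
    denom = q * d - (q - 1) * (n - c)     # at least 1 by the choice of c
    return c, Fraction(q ** c * q * d, denom)
```

For q = 2, n = 21, d = 10 this gives c = 2 and the bound 80. For d = 11 it gives c = 0 and the bound 22, which is less than $2^5 = 32$, so that no code with n = 21 and k = 5 can have d = 11.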
Just as the step-down process was used to prove the Plotkin
Bound, we can use the jump-down process to prove a stronger bound
provided we know that some C columns of the generator matrix of a
code all depend on c columns where C > c. In this case the existence
of a code with parameters n, k, and d, where $q^k = B(n, d)$, implies
the existence of a code with parameters n - C, k - c, and d, if
c < k. Hence $B(n - C, d) \ge q^{k-c}$ and

$$ q^k = B(n, d) \le q^c B(n - C, d). $$

This improves on (3.4.2) in case C > c.
Letting V' be a code with parameters n' = n - C, k', and d' = d,
where $q^{k'} = B(n - C, d)$, we have by (3.4.1)

(3.4.5)    $$ q^k = B(n, d) \le \frac{q^c\, qd}{qd - (q-1)(n - C)}, \quad \text{where } c < k, $$

provided the denominator is positive, which means provided
C > n - qd/(q - 1).
Now considering C as a function of c, say $C = C_c$, we examine
the ratio

$$ \frac{q^c\, qd}{qd - (q-1)(n - C_c)} \Big/ \frac{q^{c+1}\, qd}{qd - (q-1)(n - C_{c+1})} = \frac{qd - (q-1)(n - C_{c+1})}{q[qd - (q-1)(n - C_c)]} = \frac{1}{q} + \frac{q-1}{q} \cdot \frac{C_{c+1} - C_c}{qd - (q-1)(n - C_c)}. $$

We see that this ratio is greater than 1, and hence c should be
increased to c+1, if and only if

$$ C_{c+1} - C_c > qd - (q-1)(n - C_c). $$
By Theorem 3.7, if the columns of a generator matrix G are
considered as points of PG(k-1,q), then if $n > (q^k - q^{k-v})/(q - 1)$,
G contains $(q^{v+1} - 1)/(q - 1)$ points which are on the same v-flat
counting multiplicities. Hence all $(q^{v+1} - 1)/(q - 1)$ columns depend
on at most v + 1 columns of G, and from (3.4.5) we get the inequality

(3.4.7)    $$ q^k \le \frac{q^{v+1}\, qd}{qd - (q-1)[n - (q^{v+1} - 1)/(q - 1)]} $$

provided that $n > (q^k - q^{k-v})/(q - 1)$ and that the denominator is
positive.
As an illustration, let us consider the m-code of Example 3.1
which had parameters q = 2, n = 21, k = 5, and d = 10. Letting
c = n - (qd - 1)/(q - 1) = 2 in (3.4.3), we obtain the Plotkin bound

$$ 2^k \le \frac{2^2 \cdot 2 \cdot 10}{2 \cdot 10 - (21 - 2)} = \frac{80}{20 - 19} = 80. $$
We note that the inequality still holds if k is increased to 6,
hence the Plotkin bound does not prove that the code of Example 3.1
is of maximum size. However, it will prove that the code of Example
3.1 is of maxi-min distance, that is, it will prove that a code with
parameters q = 2, n = 21, k = 5, and d = 11 cannot exist. From
these parameters we get c = n - (2d - 1) = 0, and by (3.4.3)

$$ 2^k \le \frac{2^0 \cdot 2 \cdot 11}{2 \cdot 11 - 21} = 22, $$

so that k can be no greater than 4.
Finally, we might try to prove that the code of Example 3.1
is of minimum redundancy by showing that a code with parameters
q = 2, n = 20, k = 5, and d = 10 cannot exist. From these parameters
we get c = n - (2d - 1) = 1, and by (3.4.3)

$$ 2^k \le \frac{2 \cdot 2 \cdot 10}{2 \cdot 10 - (20 - 1)} = 40. $$

Since the inequality holds with k = 5, we cannot prove that the
code of Example 3.1 is of minimum redundancy by the Plotkin Bound.
Now let us use bound (3.4.7) and see what we can prove with
regard to the code of Example 3.1. We see that $n = 21 > 2^5 - 2^{5-1}$,
hence bound (3.4.7) must hold with v = 1. That is,

$$ 2^k \le \frac{2^2 \cdot 2 \cdot 10}{2 \cdot 10 - (21 - 3)} = \frac{80}{2} = 40. $$

Hence we might be tempted to infer that k cannot be as large as 6,
and the code must be of maximum size. However, there is a logical
difficulty here. Since the right-hand side of (3.4.7) is only
established in case $n > (q^k - q^{k-v})/(q - 1)$, and since 21 is not
greater than $2^6 - 2^{6-1}$, we cannot infer directly that the code is
of maximum size.
We may use (3.4.7) to show that a code with parameters q = 2,
n = 21, k = 5, and d = 11 cannot exist, just as we showed this
with the Plotkin Bound. Finally, we will use (3.4.7) to show that
a code with parameters q = 2, n = 20, k = 5, and d = 10 cannot
exist. Since $20 > 2^5 - 2^{5-1}$ we must have

$$ 2^k \le \frac{2^2 \cdot 2 \cdot 10}{2 \cdot 10 - (20 - 3)} = \frac{80}{3} < 32. $$

Hence the code of Example 3.1 is shown to be of minimum redundancy,
and from Theorem 1.12 we can infer that it is also of maximum size.
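The logical difficulty noted above, that (3.4.7) may be used only for a k satisfying its hypothesis, is easy to overlook in hand computation. The sketch below is an illustration only: `flat_bound` and `excluded` are hypothetical names, and the extra requirement v + 1 < k is taken from the condition c < k of the Jump-Down Lemma. The bound is evaluated only when its hypothesis holds for the k being tested.

```python
from fractions import Fraction


def flat_bound(q, n, k, d, v):
    """Right-hand side of (3.4.7) for a given v, or None if it does not apply.

    It applies only when n > (q^k - q^(k-v))/(q-1), c = v+1 < k, and the
    denominator q*d - (q-1)*(n - C) is positive, C = (q^(v+1)-1)/(q-1).
    """
    if v + 1 >= k or n * (q - 1) <= q**k - q**(k - v):
        return None
    C = (q**(v + 1) - 1) // (q - 1)
    denom = q * d - (q - 1) * (n - C)
    if denom <= 0:
        return None
    return Fraction(q**(v + 1) * q * d, denom)


def excluded(q, n, k, d):
    """True if some admissible v gives q^k > bound, so (3.4.7) rules the code out."""
    bounds = (flat_bound(q, n, k, d, v) for v in range(k))
    return any(b is not None and q**k > b for b in bounds)


print(flat_bound(2, 21, 5, 10, 1))  # the value 40 computed in the text
print(flat_bound(2, 21, 6, 10, 1))  # hypothesis fails for k = 6
print(excluded(2, 20, 5, 10))       # the (20, 5) code with d = 10
```

Returning `None` when the hypothesis fails, rather than a number, keeps the k = 6 case from being mistaken for a proof of maximum size.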
The code of Example 3.2 has parameters n = 18, k = 5, and d = 8.
The Plotkin bound does not establish any sort of optimality for
this code, but from bound (3.4.7) it is easily seen to be of maxi-min
distance. The code is not an m-code, and is not of minimum
redundancy, as was pointed out in section 3.2.
The following theorem is proved essentially by using bound (3.4.7).

Theorem 3.10. The m-codes with $m \le q$ are of minimum redundancy.

To prove the theorem note that (3.1.2) can be written

(3.4.8)    $$ n \ge \frac{(q^k - 1)\, d}{q^{k-1}(q - 1)}. $$

Since for an m-code we have

$$ n = (qd + m - 1)/(q - 1), $$
the difference between n and the bound is

$$ \frac{qd + m - 1}{q - 1} - \frac{(q^k - 1)\, d}{q^{k-1}(q - 1)} = \frac{m - 1 + d/q^{k-1}}{q - 1}, $$

which is less than 1 if and only if

$$ m - 1 + d/q^{k-1} < q - 1. $$

If m = 0, then $d = q^{k-1}$ and the inequality holds, so the code is
of minimum redundancy. If m > 0, then $d = q^{k-1} - \sum_1^m q^{u_i} < q^{k-1}$,
so (3.1.2) will prove minimum redundancy if and only if m < q.
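The gap between n and the bound (3.4.8) can be evaluated exactly. This is a small sketch under stated assumptions: the function names are hypothetical, the value m = 2 for the code of Example 3.1 is inferred from n = (qd + m - 1)/(q - 1), and the ternary parameters (q = 3, k = 3, n = 12, d = 8, m = 1) are an illustrative case, not one taken from the text.

```python
from fractions import Fraction


def gap_to_bound(q, k, n, d):
    """n minus the right-hand side of (3.4.8), as an exact fraction."""
    return n - Fraction((q**k - 1) * d, q**(k - 1) * (q - 1))


def closed_form_gap(q, k, d, m):
    """The same gap written as (m - 1 + d/q^(k-1)) / (q - 1)."""
    return (m - 1 + Fraction(d, q**(k - 1))) / (q - 1)


def proved_by_312(q, k, n, d):
    """True when the gap is below 1, so an (n-1, k, d) code would violate (3.4.8)."""
    return gap_to_bound(q, k, n, d) < 1


# Example 3.1 has m = q = 2, so the gap is not below 1 and (3.1.2) alone is not enough.
print(gap_to_bound(2, 5, 21, 10), proved_by_312(2, 5, 21, 10))
# Illustrative case with m = 1 < q = 3: the gap is below 1.
print(gap_to_bound(3, 3, 12, 8), proved_by_312(3, 3, 12, 8))
```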
To extend the proof to the case m = q we will use an argument
which essentially amounts to using bound (3.4.7). Assume without
loss of generality that the disjoint flats defining an m-code are
numbered in such a way that

$$ u_1 \ge u_2 \ge \cdots \ge u_m, $$

where $u_i$ is the dimension of the i-th omitted flat. Then, as was
shown in the proof of Theorem 3.6,
Since the theorem has been proved if m < q, we may assume that
$m \ge q$, hence
From this inequality, and from Theorem 3.2 part (a), we get

$$ n - 1 = \frac{q^k - 1}{q - 1} - \sum_{i=1}^{m} \frac{q^{u_i+1} - 1}{q - 1} - 1 > \frac{q^k - q^{u_1+2}}{q - 1}. $$
In other words,

$$ n - 1 > \frac{q^k - q^{k-v}}{q - 1}, \quad \text{where } v = k - (u_1 + 2). $$
Hence, if there exists an n-1, k, d code (or in other words
if the original m-code is not of minimum redundancy), by Theorem
3.7 there exists an n-1-C, k-c, d code where

$$ c = v + 1 = k - (u_1 + 1) \quad \text{and} \quad C = \frac{q^{v+1} - 1}{q - 1} = \frac{q^c - 1}{q - 1}. $$

By (3.4.8) this implies that
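$$ n - 1 - C \ge \frac{d\,(q^{k-c} - 1)}{q^{k-1-c}(q - 1)} = \frac{d\,(q^{k+1} - q^{c+1})}{q^k (q - 1)}. $$

Substituting $C = (q^c - 1)/(q - 1)$ and $d = q^{k-1} - \sum_1^m q^{u_i}$, we get

$$ n - 1 \ge \frac{q^c - 1}{q - 1} + \frac{qd}{q - 1} - \frac{d\, q^{c+1}}{q^k (q - 1)} = \frac{qd + q^{c+1-k} \sum_1^m q^{u_i} - 1}{q - 1}. $$

Since $n = (qd + m - 1)/(q - 1)$, we have

$$ n - 1 = \frac{qd + (m - q + 1) - 1}{q - 1} $$

and the above inequality becomes

$$ m - q + 1 \ge q^{c+1-k} \sum_1^m q^{u_i}. $$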
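Hence an m-code is proved of minimum redundancy if

(3.4.9)    $$ \sum_{i=1}^{m} q^{u_i} > (m - q + 1)\, q^{u_1}, $$

where $u_i$ is the dimension of the i-th omitted flat and
$u_1 \ge u_2 \ge \cdots \ge u_m$. If m = q the right-hand side is $q^{u_1}$,
which is less than the sum on the left, so the inequality holds.
This completes the proof of Theorem 3.10.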
Actually bound (3.4.9) (or equivalently bound (3.4.7)) is
somewhat stronger than Theorem 3.10 provided that q > 2 so that
we may have $q^{u_2} = q^{u_1}$. However, none of the bounds or theorems
proved so far really demonstrate the power of the Jump-Down Lemma.
To utilize this Lemma more fully we need the following theorem,
which is an easy extension of Theorem 3.7.
Theorem 3.11. If the weight of PG(k-1,q) is

$$ W \ge \frac{q^k - 1}{q - 1}\, Q + \frac{q^k - q^{k-v}}{q - 1} + 1 $$

where Q and v are non-negative integers and $v \le k - 1$, then there
exists a v-flat of weight at least $(Q + 1)(q^{v+1} - 1)/(q - 1)$. In
other words, the set S of points with non-zero weight has at least
$(Q + 1)(q^{v+1} - 1)/(q - 1)$ points in common with some v-flat counting
multiplicities.
We first prove the case where v = k - 2. Through each point
of PG(k-1,q) there pass $(q^{k-1} - 1)/(q - 1)$ flats of dimension (k-2).
Hence the total weight of all the (k-2)-flats is $W(q^{k-1} - 1)/(q - 1)$.
Since there are $(q^k - 1)/(q - 1)$ (k-2)-flats, the average weight
of a (k-2)-flat is

$$ W\, \frac{q^{k-1} - 1}{q^k - 1} = \left[ \frac{q^k - 1}{q - 1}\, Q + \frac{q^k - q^2}{q - 1} + 1 + e \right] \frac{q^{k-1} - 1}{q^k - 1} $$

where $e \ge 0$. Hence the average weight is

$$ (Q + 1)\, \frac{q^{k-1} - 1}{q - 1} - (q - e)\, \frac{q^{k-1} - 1}{q^k - 1} > (Q + 1)\, \frac{q^{k-1} - 1}{q - 1} - 1, $$

and since the weight is integral, some (k-2)-flat must have weight
at least $(Q + 1)(q^{k-1} - 1)/(q - 1)$.
Now let us regard such a (k-2)-flat as a PG(k-2,q) having
weight

$$ W' \ge (Q + 1)\, \frac{q^{k-1} - 1}{q - 1} = \frac{q^{k-1} - 1}{q - 1}\, Q + \frac{q^{k-1} - q^{k-2} + q^{k-2} - 1}{q - 1} \ge \frac{q^{k-1} - 1}{q - 1}\, Q + \frac{q^{k-1} - q^{k-2}}{q - 1} + 1 $$

provided that $k \ge 3$. Clearly the argument given for the case
v = k - 2 will prove the theorem for the case v = k - 3, and so
on by induction for any v less than k - 2.
The case v = k - 1 is trivial since the v-flat is then all
of PG(k-1,q) and

$$ W \ge \frac{q^k - 1}{q - 1}\, Q + \frac{q^k - q}{q - 1} + 1 = (Q + 1)\, \frac{q^k - 1}{q - 1}. $$

This completes the proof of the theorem.
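Theorem 3.11 can be applied mechanically: write W as a multiple Q of the number of points plus a remainder, then take the largest v whose threshold the remainder reaches. The sketch below is one possible reading, with the hypothetical name `guaranteed_flat`; it assumes W >= 1 and takes the remainder in the range 1 to $(q^k - 1)/(q - 1)$ so that the case v = k - 1 is covered.

```python
def guaranteed_flat(q, k, W):
    """Return (Q, v, weight): some v-flat of PG(k-1, q) has at least this weight.

    W is written as Q*(q^k - 1)/(q - 1) + r with 1 <= r <= (q^k - 1)/(q - 1),
    and v is the largest value with r >= (q^k - q^(k-v))/(q - 1) + 1.
    """
    points = (q**k - 1) // (q - 1)
    Q, r = divmod(W - 1, points)
    r += 1
    best = None
    for v in range(k):
        # The threshold grows with v, so the last v that passes is the largest.
        if r >= (q**k - q**(k - v)) // (q - 1) + 1:
            best = (Q, v, (Q + 1) * (q**(v + 1) - 1) // (q - 1))
    return best


# 205 = (2^5 - 1)*6 + 19 and 19 > 2^5 - 2^4, so some 1-flat has weight at least 21.
print(guaranteed_flat(2, 5, 205))
```

For v = 0 the returned weight is the guaranteed multiplicity of a single point.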
The following example shows how this theorem may be used.

Example 3.3. Consider the m-code with q = 2, m = 4, k = 8, $u_1 = 4$,
$u_2 = 2$, $u_3 = 1$, $u_4 = 0$,

$$ d = q^{k-1} - \sum q^{u_i} = 105, \qquad n = (qd + m - 1)/(q - 1) = 213. $$
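The parameters of Example 3.3 can be recomputed from the dimensions of the omitted flats. In this sketch the function names are hypothetical; `points_left` counts the points of PG(k-1, q) remaining after the flats are removed, which should agree with n.

```python
def m_code_parameters(q, k, dims):
    """n and d of the m-code obtained by omitting disjoint flats of the given dimensions."""
    d = q**(k - 1) - sum(q**u for u in dims)
    n = (q * d + len(dims) - 1) // (q - 1)
    return n, d


def points_left(q, k, dims):
    """Points of PG(k-1, q) minus the points lying on the omitted flats."""
    total = (q**k - 1) // (q - 1)
    return total - sum((q**(u + 1) - 1) // (q - 1) for u in dims)


print(m_code_parameters(2, 8, [4, 2, 1, 0]))  # n and d of Example 3.3
print(points_left(2, 8, [4, 2, 1, 0]))        # the same n by direct count
```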
We will show that this code is of minimum redundancy.
Suppose that it is not, then there exists a (212, 8) code
with d = 105. Since

$$ 212 = n > 2^8 - 2^{8-2}, $$

by Theorem 3.7 the columns of the generator matrix of this code
include all 7 points on a 2-flat. Hence we may take C = 7, c = 3
in the Jump-Down Lemma, and the existence of a (205, 5) code with
d = 105 is implied.
We cannot prove that the (205, 5) code does not exist by bound
(3.4.8), which means that the original m-code cannot be proved of
minimum redundancy by (3.4.7) or (3.4.9). However, if the (205, 5)
code exists, then since

$$ 205 = (2^5 - 1)\,6 + 19 \quad \text{and} \quad 19 > 2^5 - 2^{5-1}, $$

we conclude from Theorem 3.11 that 7 · 3 = 21 columns of the generator
matrix must lie on some 1-flat. Hence the existence of a (184, 3)
code with d = 105 is implied.
We cannot prove that the (184, 3) code does not exist by bound
(3.4.8), but since

$$ 184 = (2^3 - 1) \cdot 26 + 2, $$

some point must be repeated 27 times as a column of the generator
matrix. Hence the existence of a (157, 2) code with d = 105 is implied.
This code may be proved impossible by bound (3.4.8) and
therefore the original m-code is of minimum redundancy.

Instead of using (3.4.8) we may note that since

$$ 157 = (2^2 - 1) \cdot 52 + 1, $$

some point must be repeated 53 times. Hence the existence of a
(104, 1) code with d = 105 is implied, and this is clearly impossible.
This example can be worked in a slightly different way which
corresponds exactly to the proof of the next theorem. If the m-code
is not of minimum redundancy, there exists a code with
parameters

$n - 1 = 212$, $k = 8$, $d = 105$.

Since $n - 1 > 2^k - 2^{k-2}$, we have $c_1 = 3$, $C_1 = 7$,
and there exists a code with parameters

$n_1 = n - 1 - C_1 = 205$, $k_1 = k - c_1 = 5$, $d_1 = d = 105$.

Since

$$ 205 = (2^5 - 1)\,6 + 19 $$

some point must be repeated 7 times. Hence we may take $c_2 = 1$,
$C_2 = 7$, and there exists a code with parameters

$n_2 = n_1 - C_2 = 198$, $k_2 = k_1 - c_2 = 4$, $d_2 = d = 105$.

Continuing in this way we get

$198 = (2^4 - 1)\,13 + 3$, $c_3 = 1$, $C_3 = 14$,
$n_3 = n_2 - C_3 = 184$, $k_3 = 3$, $d = 105$.

$184 = (2^3 - 1)\,26 + 2$, $c_4 = 1$, $C_4 = 27$,
$n_4 = n_3 - C_4 = 157$, $k_4 = 2$, $d = 105$.

$157 = (2^2 - 1)\,52 + 1$, $c_5 = 1$, $C_5 = 53$,
$n_5 = n_4 - C_5 = 104$, $k_5 = 1$, $d = 105$.

Since this is clearly impossible, the m-code must be of minimum
redundancy.
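The chain of jumps in this second working follows a fixed recipe: one jump by Theorem 3.7, then repeated single-point jumps. The following sketch reproduces it. The name `jump_down_chain` is hypothetical, and using the largest admissible v with v + 1 < k for the first jump is an assumption about how the recipe generalizes beyond this example.

```python
def jump_down_chain(q, n, k, d):
    """List the (n, k) pairs reached from an assumed (n, k, d) code by jumping down."""
    steps = [(n, k)]
    # First jump (Theorem 3.7): largest v with n > (q^k - q^(k-v))/(q-1),
    # then c = v + 1 and C = (q^c - 1)/(q - 1).
    v = max(v for v in range(k - 1) if n * (q - 1) > q**k - q**(k - v))
    c = v + 1
    n, k = n - (q**c - 1) // (q - 1), k - c
    steps.append((n, k))
    # Later jumps: some point is repeated at least ceil(n / #points) times, c = 1.
    while k > 1 and n >= d:
        points = (q**k - 1) // (q - 1)
        C = -(-n // points)  # integer ceiling of n / points
        n, k = n - C, k - 1
        steps.append((n, k))
    return steps


chain = jump_down_chain(2, 212, 8, 105)
print(chain)
```

The chain ends at a code of length 104 with k = 1, which cannot have d = 105, so the assumed (212, 8) code does not exist.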
Theorem 3.12. The m-codes are of minimum redundancy.

To prove the theorem, assume without loss of generality that
the disjoint flats defining an m-code with parameters n, k, and d,
are numbered in such a way that

$$ u_1 \ge u_2 \ge \cdots \ge u_m, $$

where $u_i$ is the dimension of the i-th omitted flat. For typographical
convenience we will denote $u_1$ by $u$.
Note that the sum $\sum_1^m q^{u_i}$ may be written

(3.4.10)    $$ \sum_{i=1}^{u+1} N_i\, q^{u+1-i}, $$

where $0 \le N_i \le q - 1$, since no more than q-1 flats of a given dimension can be omitted.
If the m-code is not of minimum redundancy, then there exists
a code with parameters n-1, k, and d. By Theorem 3.2 part (a)
and (3.4.10)

$$ n - 1 = \frac{q^k - 1 - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1)}{q - 1} - 1 = \frac{q^k - q^{u+2}}{q - 1} + \frac{(q^{u+2} - 1) - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1) - (q - 1)}{q - 1}. $$
Now since $N_i \le q - 1$, we have

$$ q^{u+1-a} \ge 1 + \sum_{i=a+1}^{u+1} N_i\, q^{u+1-i} $$

for $0 \le a \le u$, as may be seen by regarding the sum as a q-ary
number. Hence

$$ q^{u+2-a} \ge q + \sum_{i=a+1}^{u+1} N_i\, q^{u+2-i} $$

and

(3.4.11)    $$ q^{u+2-a} - 1 > q - 1 + \sum_{i=a+1}^{u+1} N_i (q^{u+2-i} - 1). $$

In particular, taking a = 0 we have

$$ n - 1 > \frac{q^k - q^{u+2}}{q - 1}, $$
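so by Theorem 3.7, the existence of a code with parameters

$$ n_1 = n - 1 - C_1, \quad k_1 = k - c_1, \quad d_1 = d, $$

where

$$ c_1 = k - (u + 1) \quad \text{and} \quad C_1 = \frac{q^{c_1} - 1}{q - 1}, $$

is implied. Hence we have

$$ n_1 = \frac{q^k - q^{k-u-1} - (q - 1) - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1)}{q - 1}, \quad k_1 = u + 1, \quad d_1 = d. $$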
In order to jump down to still another code, we divide and
see that

$$ n_1 \Big/ \frac{q^{k_1} - 1}{q - 1} = q^{k-u-1} - N_1 - \frac{(q - 1) + \sum_{i=2}^{u+1} N_i (q^{u+2-i} - 1)}{q^{u+1} - 1}. $$

By (3.4.11) with a = 1, the last term on the right is less than 1,
hence

$$ n_1 = \frac{q^{k_1} - 1}{q - 1}\, (q^{k-u-1} - N_1 - 1) + R $$

where $0 < R < (q^{k_1} - 1)/(q - 1)$. It follows that some point of
$PG(k_1 - 1, q)$ appears at least $q^{k-u-1} - N_1$ times as a column of
the generator matrix of the $(n_1, k_1)$ code. This implies the existence
of a code with parameters

$$ n_2 = n_1 - C_2, \quad k_2 = k_1 - c_2, \quad d_2 = d, $$

where $c_2 = 1$ and $C_2 = q^{k-u-1} - N_1$.
Hence we have

$$ n_2 = \frac{q^k - q^{k-u} + N_1 (q - 1) - (q - 1) - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1)}{q - 1}. $$
We may now divide $n_2$ by $(q^{k_2} - 1)/(q - 1)$ and determine from
the quotient Q and remainder R that some point of $PG(k_2 - 1, q)$
appears at least Q + R' times as a column of the generator matrix
of the $(n_2, k_2)$ code, where R' = 0 if R = 0, and otherwise R' = 1.
This implies the existence of a code with parameters

$$ n_3 = n_2 - C_3, \quad k_3 = k_2 - c_3, \quad d_3 = d, $$

where $c_3 = 1$ and $C_3 = Q + R'$.
We assert that by repeating the above process we obtain for
$2 \le p \le u + 1$,

(3.4.12)    $$ n_p = (q - 1)^{-1} \Big\{ q^k - q^{k-u+p-2} + \sum_{i=1}^{p-1} N_i (q^{p-i} - 1) - (q - 1) - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1) \Big\}, $$

$$ k_p = u - p + 2, \qquad d_p = d. $$
Since these formulae are true for p = 2 (and for p = 1 if the summation
$\sum_1^0$ is regarded as 0) it is only necessary to assume that
they are true for p - 1 and show that they are then true for p.
To do this, we divide and see that

$$ n_{p-1} \Big/ \frac{q^{u-p+3} - 1}{q - 1} = q^{k-u+p-3} - \sum_{i=1}^{p-1} N_i\, q^{p-1-i} - \frac{(q - 1) + \sum_{i=p}^{u+1} N_i (q^{u+2-i} - 1)}{q^{u-p+3} - 1}. $$

By (3.4.11) with a = p - 1, the last term is less than 1, hence
some point of $PG(k_{p-1} - 1, q)$ appears at least

$$ C_p = q^{k-u+p-3} - \sum_{i=1}^{p-1} N_i\, q^{p-1-i} $$

times as a column of the generator matrix of the $(n_{p-1}, k_{p-1})$ code.
Hence taking $c_p = 1$ we have

$$ n_p = n_{p-1} - C_p = (q - 1)^{-1} \Big\{ q^k - q^{k-u+p-3} + \sum_{i=1}^{p-2} N_i (q^{p-1-i} - 1) - (q - 1) - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1) - \Big( q^{k-u+p-3} - \sum_{i=1}^{p-1} N_i\, q^{p-1-i} \Big)(q - 1) \Big\} $$

$$ = (q - 1)^{-1} \Big\{ q^k - q^{k-u+p-2} + \sum_{i=1}^{p-1} N_i (q^{p-i} - 1) - (q - 1) - \sum_{i=1}^{u+1} N_i (q^{u+2-i} - 1) \Big\}, $$

$k_p = k_{p-1} - c_p = u - p + 2$, and $d_p = d$.
Hence (3.4.12) is established.
It is easily seen from (3.4.12) that

$$ n_{u+1} = q^{k-1} - \sum_{i=1}^{u+1} N_i\, q^{u+1-i} - 1 \quad \text{and} \quad d_{u+1} = d. $$

By (3.4.10) and Theorem 3.9, this gives

$$ n_{u+1} = d - 1. $$

Since this is clearly impossible, the assumption that the m-code
is not of minimum redundancy leads to a contradiction, and the proof
of Theorem 3.12 is complete.
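The closed form (3.4.12) can be checked against the step-by-step subtraction of the $C_p$ for small cases. The sketch below is a numerical sanity check, not a substitute for the induction; all function names are hypothetical, and the ternary case (q = 3, k = 4, N = [1, 2]) is an illustrative assumption rather than an example from the text.

```python
def flat_counts(dims):
    """N_1..N_{u+1}: N_i counts the omitted flats of dimension u+1-i, u = max(dims)."""
    u = max(dims)
    return [sum(1 for x in dims if x == u + 1 - i) for i in range(1, u + 2)]


def n_closed_form(q, k, N, p):
    """n_p from (3.4.12); N[i-1] holds N_i and u = len(N) - 1."""
    u = len(N) - 1
    total = (q**k - q**(k - u + p - 2)
             + sum(N[i - 1] * (q**(p - i) - 1) for i in range(1, p))
             - (q - 1)
             - sum(N[i - 1] * (q**(u + 2 - i) - 1) for i in range(1, u + 2)))
    return total // (q - 1)


def n_by_jumps(q, k, N, p):
    """n_p obtained from n_1 by subtracting C_2, ..., C_p one at a time."""
    u = len(N) - 1
    n = n_closed_form(q, k, N, 1)
    for j in range(2, p + 1):
        # C_j = q^(k-u+j-3) - sum_{i<j} N_i q^(j-1-i)
        n -= q**(k - u + j - 3) - sum(N[i - 1] * q**(j - 1 - i) for i in range(1, j))
    return n


N = flat_counts([4, 2, 1, 0])  # the flats of Example 3.3
print([n_closed_form(2, 8, N, p) for p in range(1, 6)])
```

For Example 3.3 the last value is 104 = d - 1, matching the contradiction reached at the end of the proof.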
122
BIBLIOGRAPHY
[lJ American Mathematical Society (1960).
Tenth Symposium
in Applied Mathematics - Combinatorial Analysis, Providence.
[2J
Baer, R. (1952).
Linear
Al~bra
and Projective Geometry,
Academic Press Inc., New York.
(3]
Bose, R. C. (194-7).
"Mathematical theory of the symme-
trical factorial designs," Sankhya, vol. 8, 107-166.
[4J
Bose, R. C. (1961).
"On some connections between the de-
sign of experiments and information theory,"
Bull. of the
Internat. Stat. Inst., vol. 38, 257-271.
[5]
Bose, R. C. and R. C. Burton (1957). "On a problem in Abelian
groups and the construction of fractionally replicated designs" (Abstract), Ann. Math. Statist., vol. 28, 533.
[6]
Bose, R. C. and K. A. Bush (1952).
"Orthogonal Arrays of
Strength Two and Three," Ann. Math. Statist., vol. 23, 508-524.
•
[7J
Bose, R. C. and R. R. Kuebler, Jr•. (1960).
"A geometry of
binary sequences associated with group alphabets in information
theory," Ann. Math. Statist., vol. 31, 113-13.9.
[8J
Bose, R. C. and D. K. Ray-Chaudhuri (1960).
error correcting binary group COdes,"
"On a class of
Inf. and Control,
vol. 3, 68-79.
[9J
Bose, R. C. and D. K. Ray-Chaudhuri (1960).
"Further Results
on Error Correcting Binary Group Codes," Inf. and Control,
vol. 3, 279-290.
123
[10] Bose} R. C. and J. N. Srivastava (1963).
On a bound
useful in the theory of factorial designs and error
correcting codes, University of North Carolina, Institute
of Statistics Mimeo Series No. 359.
[11] Burton, R. C. and W. S. Connor (1957).
liOn the identity
n
relationship for fractional factorial designs of the 2
series,"
Ann. Math. Statist., vol. 28} 762-767.
[12] Carmichael, R. D. (1937).
Introduction to the Theory of
Groups of Finite Order, Ginn and Co.} Boston.
[13] Feinstein, A. (1958). Foundations of Information Theory,
McGraw-Hill Book Company, Inc., New York.
[14] Feller, W. (1957). An Introduction to Probability Theory
and Its Applications, John Wiley and Sons, Inc., New York.
[15] Fisher, R. A. (1943).
"The theory of confounding in fac-
torial experiments in relation to the theory of groups. II
~als
of Eugenics, vol xi, 341-353.
[16] Fisher, R. A. (1945).
"A system of confounding for factors
with more than two alternations, giving completely orthogonal cubes and higher powers."
Annals of Eugenics, vol.
xii, 283-290.
[17] Fontaine, A. B. and W. W. Peterson (1959).
"Group Code
Equivalence and Optimum Codes," IRE Trans.} IT-5, Special
Supplement, 60-70.
124
[18]
Gomo:ry, R. E. (1958).
"Outline of an algorithm for integer
solutions to linear programs," Office &f-O:OOnanoe-"'Rea&&rch
•
Symposium on Combinatorial Problems, New York.
[19]
Gomory, R. E. (1960).
"Solving linear programming problems
in integers," Amer. Math. Soc. Symposia in Applied Mathematics, vol. 10, 211-216.
[20]
Hamming, R. W. (1950).
"Error detecting and error
correcting codes," Bell System Tech. J., vol. 29, 147-160.
[21]
Jacobson, N. (1951).
Lectures in Abstract Algebra, D. Van
Nostrand Company, Inc., Princeton.
[22]
Khinchin, A. I. (1957).
Mathematical Foundations of Infor-
mation Theory, Dover Publications Inc., New York.
[23]
KUhn,
H. W. and A. W. Tucker, editors (1956).
Linear Inequali-
ties and Related Systems," Ann. Math. Studies, no. 38, Princeton.
[24]
MacDonald, J. E. (1960).
"Design Method for Maximum Minimum-
Distance Error-Correcting Codes," IBM J. Research Develop.,
vol. 4, 43-57.
[25]
McCluskey, E. J., Jr. (1959).
"Error-Correcting Codes - A
Linear Programming Approach," Bell System Tech. J., vol. 38.
1485-1512.
[26]
Peterson, W. W. (1961).
Error-Correcting Codes, The M.I.T.
Press.
[27]
Plotkin, M. (1960).
"Binary Codes with Specified Minimum
Distance," IRE Trans., IT-6 z 445-450.
125
[28]
Shannon, C. E. (1948).
"A mathematical theory of communi-
cation," Bell System Technical Journal, vol. 35, 203-234•
•
[29]
1
Shohat, J. A. and Tamarkin, J. D. (1943).
The Problem of
Moments, American Mathematical Society.
(30)
Slepian, D. (1956).
"A class of binary signaling alphabets,"
Bell System Tech. J., vol. 35, 203-234.
[31]
Wolfe, Phillip (1962).
Recent Developments in Nonlinear
Programming, Rand Report R-40l-PR, Santa Monica.
[32]
Wolfowitz, J. (1961).
Coding Theorems of Information Theory,
Springer-Verlag, Berlin.
[33]
Zelen, Marvin (1951).
Bounds on a Distribution Function
which are Functions of Moments, M.S. Thesis, Department of
Statistics, University of North Carolina, Chapel Hill.