Vijayaditya, N. (1972). Combinatorial information retrieval schemes.

* This research was supported in part by the U.S. Air Force Office of Scientific Research under Contract No. AFOSR-68-1415.
1 Ph.D. dissertation written under the direction of Professor R. C. Bose.
Reproduction in whole or in part is permitted for any purpose of the United States Government.
COMBINATORIAL INFORMATION RETRIEVAL SCHEMES*,1
N. Vijayaditya
Department of Statistics
University of North Carolina at Chapel Hill
Institute of Statistics Mimeo Series No. 833
July, 1972
N. VIJAYADITYA. Combinatorial Information Retrieval Schemes
(Under the direction of R. C. BOSE.)
The development of an information filing scheme deals not only
with the storage of data but also with the retrieval. The efficiency
of a filing scheme is measured not only in terms of the ease with which
it is possible to retrieve information pertinent to a given task but
also in terms of the retrieval time. In this research, combinatorial
methods are applied to obtain filing schemes that are efficient in
terms of retrieval time. In the beginning, a brief review of the
relevant basic combinatorial techniques along with applications to the
filing schemes is given. Then a few interesting properties, dealing
with the intersections of quadrics and flat spaces, are obtained. Later,
these results are applied to obtain filing schemes. In the later parts
of the research, a mathematical (linear) model (representation) for
multiple-valued attributes is developed. The design of this
(representation) model depends on matrices, over a finite field, with a
certain property. A method for obtaining some of these matrices, applying
the theory of spreads, is given. Finally, using this representation,
a filing scheme for multiple-valued attributes is obtained.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
SUMMARY
CHAPTER
I    INTRODUCTION
     1.1 Terminology
     1.2 Formulation of the problem
     1.3 Finite geometries
     1.4 Balanced filing schemes
     1.5 Combinatorial configurations
     1.6 Second order combinatorial configurations
II   MISCELLANEOUS FILING SCHEMES
     2.1 Orthogonal arrays and partially balanced arrays
     2.2 Composition
     2.3 Configurations of order two with k=1
     2.4 Configurations of order two with k~1
     2.5 Filing schemes based on finite geometries
     2.6 Multiple level attributes
III  QUADRICS AND SOME RELATED FILING SCHEMES
     3.1 Quadrics
     3.2 (N-1)-flat spaces and quadrics
     3.3 (N-2)-flat spaces and quadrics
     3.4 Retrieval schemes based on quadrics
     3.5 Multiple level attributes
IV   A GENERALIZED FILING SCHEME FOR MULTIPLE LEVEL ATTRIBUTES
     4.1 Linear representation of attributes
     4.2 Generalized multiple valued filing scheme
     4.3 Correspondence between Galois fields, vector spaces and projective spaces
     4.4 Spreads
     4.5 A method for obtaining (N×m) matrices
     4.6 General methods
BIBLIOGRAPHY
ACKNOWLEDGEMENTS
It is a pleasure and privilege to acknowledge my sincerest gratitude to Professor R. C. Bose, not only for proposing the problems in
the dissertation and making numerous suggestions and comments during
its progress, but also for his excellent guidance and encouragement.
It has been a pleasure to work under him.
I wish to express my special thanks to Professor T. A. Dowling
for his encouragement and inspiration during my graduate program here.
I also express my thanks to him for reading the manuscript very carefully and suggesting many valuable improvements.
To Professors R. R. Kuebler, Jr., D. G. Kelly, K. J. C. Smith and
G. A. Magó, the other members of my doctoral committee, I express my
thanks for their review of my work and suggestions for improving it.
I would also like to thank Professor I. M. Chakravarti for agreeing
to be a member of my committee.
I wish to express my thanks to Professor N. L. Johnson, Chairman,
and the Department of Statistics for providing the financial assistantship.
To Professor Deal, Chairman, Department of Mathematics, Colorado
State University, Fort Collins, I express my thanks for providing the
research facilities during my stay there.
To the members of the faculty and staff of the Department of
Statistics who have contributed to my graduate training I wish to
express my sincere gratitude.
I owe a deep sense of gratitude to my parents and brothers for
their active cooperation and enthusiasm.
To my grandfather, Dr. N.S.R.
Sastry, I express my thanks for encouraging me into the field of
Statistics.
For her excellent typing of the manuscript I extend my thanks to
Mrs. Susan Drum.
SUMMARY
The development of high-speed computers has produced an "information
revolution" in the sources as well as the types of information. Computers
have provided means to cope with large volumes of data. But the
question naturally arises as to how one may best file information in
computer storage for further use. The efficiency of such a filing
scheme can be evaluated in terms of the time it requires to satisfy a
query. In this research we apply combinatorial methods to obtain
efficient filing schemes.
Until recently, the best known type of filing scheme has been the
inverted filing scheme. It is characterized by a one-one correspondence
between the buckets and the levels of the attributes. These schemes
allow efficient retrieval for queries of size one. But for higher order
queries, they involve numerous cross comparisons.
In Chapter I, we introduce the necessary terminology, along with the
mathematical formulation of the problem, which was motivated by
Ray-Chaudhuri [31]†. Then a brief review of relevant combinatorial
techniques is given. Finally, a brief review of the basic filing schemes
with examples is given. Chapter II introduces the filing schemes that are
derived from orthogonal arrays and partially balanced arrays. Also,
some improved versions of existing filing schemes are given.
Chapter III deals mainly with quadrics and the filing schemes that
are derived from them. A few basic results about the intersections of
quadrics with (N-2)-flats are obtained. These results are later applied
to obtain filing schemes for multiple-valued attributes.
Finally, in Chapter IV, we develop a linear representation for
multiple-valued attributes. This representation enables us to develop
filing schemes for multiple-valued attributes. These filing schemes
depend on matrices with property $R_t(r_1, r_2, \ldots, r_t)$. For $t=2$, we
give a construction procedure for optimal matrices. For higher values
of $t$, it becomes a difficult problem.
† The numbers in square brackets refer to references listed in the
bibliography at the end.
CHAPTER I
INTRODUCTION
"It is unworthy of excellent men to lose hours like slaves in the
labor of calculations which could safely be relegated to anyone else if
machines were used."
Leibnitz 1671
With the advent and advancement of computer technology there has
been a proliferation in information collection - its sources as well as
its types.
With the growing complexity and abundance of information we
are faced not just with the problem of storing and retrieving a few
words and facts but rather of storing and retrieving billions of words
and facts.
In an attempt to retrieve information from such a file we
could go through these billions of facts - be they five or fifty
billions - serially.
But a simple computation will show that if a
person spends his entire life from the day he turned twenty until the
day he died, he still could not read one billion words.
In light of
this problem, the individual does not know where to begin to look. Our
job, then, is to structure the files to assist him. The question
naturally arises as to how one may best file information for further use.
The success of any filing scheme can be measured in terms of the ease
with which it is possible to retrieve information pertinent to a given
task.
With the present high speed computer technology the memory
(storage) can be advantageously utilized to store well-organized files
which can be retrieved rapidly.
This fact provides a basis for the
formulation of problems of designing filing schemes for efficient information retrieval.
The actual construction of such files poses a
variety of questions some of which can be tackled by combinatorial
mathematics.
This is the line of approach which we will follow in
this research.
1.1 Terminology
For the purpose of this dissertation, a file is a collection of
related records. A record, in turn, is comprised of two parts. The
first part consists of an identifier number, which is unique with
respect to a record. This identification is known as the primary key.
Sometimes the records are also assigned a second identifier known as the
secondary key. This is not unique. The second part consists of a
number of related data fields. These data fields correspond to the
values of a number of information variables, also known as attributes.
We will assume that each attribute can take only one of finitely many
different values. Hence the data field of a record consists of a
particular combination of levels of attributes associated with the record.
There may be more than one record associated with the same data field.
Once the records are obtained, they can be stored in some
permanent location (memory). The identifier of this location is called
an accession number. One aspect of the basic problem of file organization
is the definition of the correspondence between accession numbers
and primary keys. This process is called key transformation. This has
been discussed by a number of researchers in computer systems. For
additional details and bibliography, the reader is referred to
Buchholz [16]. We assume that the accession numbers have already been
assigned by some procedure. From now on we deal with accession numbers
alone, as they are smaller in size and easier to handle than the
records.
An address is usually a number compounded of two or more coordinates
that physically select the location. As an example, on a disk
file, various digit groups of an address might specify position of a
track, track, disk side, and module (group). A collection of addresses
is called a bucket. The buckets form a partition of the addresses
assigned for storage. The accession numbers of the records are stored
at the addresses. We assume that in a bucket, exactly one accession
number is stored at one address.
Finally, a query is a request to retrieve records containing
certain information. A query is represented by a vector. The number
of components of the vector indicates the size or order of the query.
1.2 Formulation of the problem
In this section, a mathematical model for filing schemes will be
formulated. The approach used is similar to that of Ray-Chaudhuri [31]
for the case in which retrieval pertains to only one level of the
attributes. Let $A_{ij}$ denote the j-th level of the i-th attribute, where
$j = 0,1,2,\ldots,n_i-1$; $i = 1,2,\ldots,\ell$. Also let
$$A_i = \{A_{ij} : j = 0,1,2,\ldots,n_i-1\} \tag{1.2.1}$$
and
$$\Omega = \{A_{ij} : j = 0,1,\ldots,n_i-1;\ i = 1,2,\ldots,\ell\}. \tag{1.2.2}$$
That is, $A_i$ is the set of all levels of the i-th attribute and $\Omega$
the collection of all levels of all attributes.
A file $F$ is denoted by a triplet $(\Lambda, \Omega, f)$, where $\Lambda$ is the
collection of all records; $\Omega$ the set of all attribute levels; and $f$ is
a mapping from $\Lambda$ to subsets of $\Omega$ such that for any record $I \in \Lambda$,
$$f(I) = (A_{1j_1}, A_{2j_2}, \ldots, A_{\ell j_\ell}) \tag{1.2.3}$$
where $0 \le j_k \le n_k - 1$ for all $k$. That is to say, $f(I)$ indicates the
attribute levels possessed by the I-th record. Since each record
contains exactly one level of each attribute we have
$$|f(I) \cap A_i| = 1 \quad \text{for all } i. \tag{1.2.4}$$
The storage rule $S$ of the file $F$ is a triplet $(\Lambda, M, \sigma)$ where
$\Lambda$ is the collection of all records; $M$ the set of all possible
integers corresponding to the set of possible addresses; $\sigma$ is a
mapping from $\Lambda$ to subsets of $M$. The subset $\sigma(I)$ contains the
addresses where the accession number of the I-th record is stored. We
implicitly assume that a single address is sufficient for storing any
accession number. The number of addresses in $\sigma(I)$ indicates the
number of repetitions of the accession number of the I-th record in the
storage. Define
$$\bar{S} = \sum_{I \in \Lambda} |\sigma(I)| \,/\, |\Lambda|. \tag{1.2.5}$$
$\bar{S}$ is called the redundancy of the file. $\bar{S}$ depends on the storage rule.
The retrieval rule $R$ is a triplet $(\Delta, M, r)$ where $\Delta$ represents
a class of subsets of $\Omega$; $M$ represents the set of addresses available
for storage; and $r$ is a mapping from $\Delta$ to subsets of $M$ such that
$$|\sigma(I) \cap r(\lambda)| = 1 \quad \text{if } f(I) \supseteq \lambda, \text{ where } \lambda \in \Delta. \tag{1.2.6}$$
In other words, only one of the addresses where the accession number
of the I-th record is stored is related to the retrieval of the query $\lambda$.
$\Delta$ is called the set of queries. The filing system is said to be of
order $t$ if for each query $\lambda$ belonging to $\Delta$ the relation $|\lambda| \le t$
holds. The time required to retrieve the records pertaining to a
query $\lambda$ is denoted by $T(\lambda)$. The average retrieval time, $\bar{T}$, is
(1.2.7)
$\bar{T}$ depends on the retrieval rule.
A filing scheme is completely specified by indicating $F$, $S$ and $R$.
The scheme will be called optimal if the redundancy $\bar{S}$ and the average
retrieval time $\bar{T}$ are minimum. But it is not always possible to obtain
optimum filing schemes, as $\bar{S}$ and $\bar{T}$ are somewhat inversely
proportional to each other. So the best way to tackle the problem will be
by arbitrarily fixing one and minimizing the other. In this research,
though we do not specify the values of $\bar{S}$ and $\bar{T}$, we aimed at
minimizing $\bar{T}$ for reasonable values of $\bar{S}$.
As an example let us consider a simple filing scheme, known as the
inverted filing system. This scheme is characterized by a one-one
correspondence between buckets and levels of the attributes. Following the
same notation as above, corresponding to each level $A_{ij}$
($j = 0,1,\ldots,n_i-1$; $i = 1,2,\ldots,\ell$) a bucket $M_{ij}$ (a set of
addresses in the memory) is assigned. Let $W_{ij}$ be the identification
number of $M_{ij}$. $W_{ij}$ can be taken as the first address in $M_{ij}$ if all
the addresses are in a sequence, or we can define the identifier $W_{ij}$ as
$$W_{ij} = \sum_{k=0}^{i-1} n_k + j, \quad \text{where } n_0 = 0. \tag{1.2.8}$$
A record $I$ is stored in the bucket $M_{ij}$ if the record contains the
level $A_{ij}$. Since a record contains $\ell$ levels - one level of each
attribute - it will be stored in $\ell$ buckets. That is
$$\sigma(I) \cap M_{ij} \ne \emptyset \quad \text{if } f(I) \ni A_{ij} \tag{1.2.9}$$
and
$$|\sigma(I)| = \ell. \tag{1.2.10}$$
The retrieval rule for queries of size one is very simple. As an
example, suppose we want to retrieve all records containing $A_{22}$. Then
first, the identification number of the bucket corresponding to $A_{22}$,
namely $W_{22}$, is calculated from (1.2.8). By matching $W_{22}$ with the
identification numbers of the buckets, the bucket $M_{22}$ is determined.
$M_{22}$ gives all the accession numbers of records containing the level
$A_{22}$. So, this retrieval rule involves a simple arithmetic calculation
and a simple matching. The dominant factor, in terms of time, is the
matching of identification numbers. When the queries $A_{ij}$ are equally
likely, we can show that the average retrieval time, $\bar{T}$, using
serial comparison is
$$\bar{T} = \frac{(b+1)\,T_b}{2} \tag{1.2.11}$$
where $b = \sum_{i=1}^{\ell} n_i$ and $T_b$ represents the time required for each
comparison. If we use binary search technique (for matching) instead of
serial comparison, then
(1.2.12)
where $[u]^+$ is the smallest integer greater than $u$.
Though the filing scheme appears simple for queries of size one, in
terms of retrieval procedure, it becomes complicated when higher order
queries are involved. For example the retrieval of records containing
two levels $A_{11}$ and $A_{22}$ involves three steps - i) identify the buckets
$M_{11}$ and $M_{22}$, ii) extract all records contained in the buckets and
iii) find the records common to both $M_{11}$ and $M_{22}$ by making cross
comparisons. So the dominant factor in retrieval time will be the time
required to make cross comparisons. This will depend on the sizes of
the buckets. This fact represents the most striking disadvantage of
the inverted filing system for retrieving records pertinent to two-fold
queries. For higher order queries, this problem becomes progressively
more and more serious.
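The inverted filing system described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary-based "memory" and the sample records are hypothetical and not part of the dissertation. The identifier follows (1.2.8), and a query of size two is answered by the cross comparison (set intersection) of two buckets.

```python
def bucket_id(i, j, n):
    """Identifier W_ij of (1.2.8); n[0] = 0 and n[i] is the number of levels of attribute i."""
    return sum(n[:i]) + j


def build_inverted_file(records, n):
    """Map each identifier W_ij to the set of accession numbers stored in bucket M_ij."""
    buckets = {}
    for accession, levels in records.items():
        # levels[i-1] is the level j that attribute i takes in this record
        for i, j in enumerate(levels, start=1):
            buckets.setdefault(bucket_id(i, j, n), set()).add(accession)
    return buckets


def retrieve(buckets, query, n):
    """Answer a query given as (i, j) pairs; the intersection is the cross-comparison step."""
    sets = [buckets.get(bucket_id(i, j, n), set()) for i, j in query]
    return set.intersection(*sets)


# Hypothetical file: two attributes with n_1 = 2 and n_2 = 3 levels.
n = [0, 2, 3]
records = {101: (0, 2), 102: (1, 2), 103: (0, 1)}
buckets = build_inverted_file(records, n)
size_one = retrieve(buckets, [(2, 2)], n)          # records containing A_22
size_two = retrieve(buckets, [(1, 0), (2, 2)], n)  # records containing A_10 and A_22
```

Each accession number appears in one bucket per attribute, in line with (1.2.10), and the cost of the size-two query is dominated by the intersection, which grows with the bucket sizes.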
The inverted filing system has been generalized to accommodate
higher order queries. This scheme is also characterized by a one-one
correspondence between queries and buckets. But it has enormously large
redundancy. Further, the retrieval time depends on the number of buckets.
These considerations suggest the need to look for new filing schemes.
In the remaining sections we shall consider the problem of construction
of filing schemes by using various methods of combinatorial mathematics.
1.3 Finite geometries
Abraham, Ghosh, and Ray-Chaudhuri [1] and Ghosh and Abraham [23]
have applied combinatorial methods to the construction of some efficient
filing schemes. To obtain these results, properties of finite geometries
are extensively used. We also use these properties in the following
chapters. So, we shall briefly summarize the properties of the two
types of finite geometries: the finite projective geometry, denoted by
PG(N,q), and the finite Euclidean geometry, denoted by EG(N,q), where
q is a prime power.
Projective geometry:
In a finite projective geometry PG(N,q) of dimension N based on
a Galois field GF(q), where q is a prime power, the points can be
taken as (N+1)-tuples $x = (x_0, x_1, \ldots, x_N)$ where $x_0, x_1, \ldots, x_N$
are elements of GF(q). The (N+1)-tuple $\rho x = (\rho x_0, \rho x_1, \ldots, \rho x_N)$
is regarded as the same point as $x$ for any non-zero element $\rho$ of
GF(q). The (N+1)-tuple $(0,0,\ldots,0)$ is not regarded as a point of
PG(N,q). So a point P is uniquely determined by a non-null vector $x$
and conversely a point P uniquely determines a vector $x$ up to an
arbitrary non-zero multiplier $\rho$ of GF(q). An m-dimensional flat
space in PG(N,q) is defined by the set of points which satisfy (N-m)
independent linear homogeneous equations
$$\begin{aligned}
a_{10}x_0 + a_{11}x_1 + \cdots + a_{1N}x_N &= 0 \\
a_{20}x_0 + a_{21}x_1 + \cdots + a_{2N}x_N &= 0 \\
&\;\;\vdots \\
a_{N-m,0}x_0 + a_{N-m,1}x_1 + \cdots + a_{N-m,N}x_N &= 0
\end{aligned} \tag{1.3.1}$$
or
$$A x^T = 0 \tag{1.3.2}$$
where $A$ is a $(N-m)\times(N+1)$ matrix with the elements $a_{ij}$ from GF(q)
and $x^T$ is the transpose of $x$. Thus the points which satisfy one linear
homogeneous equation define an (N-1)-flat in PG(N,q). A point in
PG(N,q) satisfies N independent linear homogeneous equations. Hence
a point is called a 0-flat, a line a 1-flat, a plane a 2-flat and so on.
Let $\phi(N,m,q)$ denote the number of m-flats in PG(N,q); then
$$\phi(N,m,q) = \frac{(q^{N+1}-1)(q^{N}-1)\cdots(q^{N-m+1}-1)}{(q^{m+1}-1)(q^{m}-1)\cdots(q-1)}. \tag{1.3.3}$$
The function $\phi$ satisfies the following property
$$\phi(N,m,q) = \phi(N,N-m-1,q), \qquad \phi(N,-1,q) = 1 \ \text{(by convention).} \tag{1.3.4}$$
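The count in (1.3.3) is easy to evaluate numerically. The following is a minimal sketch, assuming only the formula above; the function name `phi` is chosen here for illustration. It multiplies the m+1 factors of the numerator and denominator and uses the convention $\phi(N,-1,q) = 1$ of (1.3.4).

```python
def phi(N, m, q):
    """Number of m-flats in PG(N, q) as in (1.3.3); phi(N, -1, q) = 1 by convention."""
    if m == -1:
        return 1
    num = den = 1
    for i in range(m + 1):
        num *= q ** (N + 1 - i) - 1  # factors q^(N+1)-1, ..., q^(N-m+1)-1
        den *= q ** (m + 1 - i) - 1  # factors q^(m+1)-1, ..., q-1
    return num // den                # the quotient is always an integer


points_in_plane = phi(2, 0, 2)  # points of PG(2,2)
lines_in_plane = phi(2, 1, 2)   # lines of PG(2,2)
```

For PG(2,2) both values equal 7, which is the instance of the symmetry (1.3.4) with N = 2 and m = 0.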
If
$$A x^T = 0,$$
with $A$ of size $(N-m)\times(N+1)$, $x^T$ of size $(N+1)\times 1$ and the zero
vector of size $(N-m)\times 1$, represents an m-flat then all points whose
corresponding row vectors lie in the vector space generated by the rows
of $A$ constitute an (N-m-1)-flat which is called the dual of the m-flat.
Using the property of duality it can be shown that the number of m-flats
containing (passing through) a given t-flat $(N \ge m \ge t)$ is equal to the
number of (N-m-1)-flats contained in a given (N-t-1)-flat, i.e.
$\phi(N-t-1, N-m-1, q)$.
Let $x_1, x_2, \ldots, x_k$ be k (N+1)-tuples. They are said to be
dependent if there exist elements $c_1, c_2, \ldots, c_k$ in GF(q), not
all simultaneously zero, such that
$$c_1 x_1 + c_2 x_2 + \cdots + c_k x_k = 0. \tag{1.3.5}$$
Otherwise they are said to be independent. A set of points of the
projective geometry are said to be dependent or independent according as
the corresponding row vectors are dependent or independent.
Let $x_0, x_1, \ldots, x_m$ be a set of (m+1) independent solutions of
the equations in (1.3.1). Then any solution of (1.3.1) can be expressed
as a linear combination of the vectors $x_0, x_1, \ldots, x_m$. Hence any
point of $L_m$, an m-dimensional flat, can be expressed as a linear
combination of the points $P_0, P_1, \ldots, P_m$ corresponding to
$x_0, x_1, \ldots, x_m$.
Finally, by suitable linear transformation the row vectors of any
(N+1) independent points in PG(N,q) can be taken as
$$(\underbrace{0, \ldots, 0}_{i},\ 1,\ \underbrace{0, \ldots, 0}_{N-i}), \qquad i = 0,1,\ldots,N.$$
Let $L_m$ and $L_n$ be any two flats of dimension m and n
respectively, in PG(N,q). The set of points common to both form a
k-flat $L_k$, where $-1 \le k \le \min(m,n)$. That is
$$L_k = L_m \cap L_n.$$
$L_k$ is called the intersection of $L_m$ and $L_n$. Let $L_h$ be the
h-flat with lowest dimension containing both $L_m$ and $L_n$. $L_h$ is
unique and is called the join of $L_m$ and $L_n$. Then
$$\max(m,n) \le h \le N \quad \text{and} \quad m+n = k+h.$$
Euclidean geometry:
A point in an N-dimensional finite Euclidean geometry EG(N,q)
based on the Galois field GF(q) is defined to be an ordered N-tuple
$x = (x_1, x_2, \ldots, x_N)$ where $x_i$ is an element of GF(q) for all i. The
N-tuple $(0,0,\ldots,0)$ is also a point of EG(N,q). The m-flats
$(0 \le m \le N-1)$ of EG(N,q) are defined by non-homogeneous equations.
The set of points which satisfy (N-m) independent
and consistent linear equations form an m-flat
$$\begin{aligned}
a_{10} + a_{11}x_1 + \cdots + a_{1N}x_N &= 0 \\
a_{20} + a_{21}x_1 + \cdots + a_{2N}x_N &= 0 \\
&\;\;\vdots \\
a_{N-m,0} + a_{N-m,1}x_1 + \cdots + a_{N-m,N}x_N &= 0
\end{aligned} \tag{1.3.6}$$
or
(1.3.7)
where $A$ is a $(N-m)\times(N+1)$ matrix with elements from GF(q). The
number of points in an m-flat is $q^m$. The Euclidean geometry EG(N,q)
can be obtained from the projective geometry PG(N,q) by deleting the
so-called (N-1)-flat at infinity $x_0 = 0$ and all points and flats
contained in it. Hence the number of m-flats in EG(N,q) is equal to the
number of m-flats in PG(N,q) less the number of m-flats contained in the
(N-1)-flat $x_0 = 0$, i.e.,
$$\phi(N,m,q) - \phi(N-1,m,q) = q^{N-m}\,\phi(N-1,m-1,q). \tag{1.3.8}$$
The various m-flats can be partitioned into parallel bundles by allowing
the associated vectors of constant terms (in 1.3.7) to assume all possible
values. In this way, there are $q^{N-m}$ m-flats in each such parallel bundle
and $\phi(N-1,m-1,q)$ distinct parallel bundles in all. Finally each point
in EG(N,q) lies in exactly one of the m-flats belonging to any
parallel bundle.
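The parallel-bundle counts can be checked by brute force in the smallest non-trivial case, the lines (m = 1) of the plane EG(2,q). The sketch below is illustrative only: the function name is hypothetical, and it assumes q is prime so that arithmetic modulo q realizes GF(q). Lines with proportional $(a_1, a_2)$ and varying constant term $a_0$ form one bundle.

```python
from itertools import product


def eg2_parallel_bundles(q):
    """Lines of EG(2, q), q prime, grouped into parallel bundles.

    A line is the point set {(x1, x2) : a0 + a1*x1 + a2*x2 = 0 (mod q)}.
    Fixing (a1, a2) and letting a0 run over GF(q) gives one parallel bundle.
    """
    points = list(product(range(q), repeat=2))
    bundles = set()
    for a1, a2 in product(range(q), repeat=2):
        if (a1, a2) == (0, 0):
            continue
        bundle = frozenset(
            frozenset(p for p in points if (a0 + a1 * p[0] + a2 * p[1]) % q == 0)
            for a0 in range(q)
        )
        bundles.add(bundle)  # proportional (a1, a2) yield the same bundle
    return points, list(bundles)


points, bundles = eg2_parallel_bundles(3)
lines_per_bundle = {len(bundle) for bundle in bundles}
```

For q = 3 this gives 4 bundles of 3 lines each, matching $\phi(1,0,3) = 4$ bundles and $q^{N-m} = 3$ lines per bundle, for 12 lines in all as in (1.3.8).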
1.4 Balanced filing schemes
Abraham, Ghosh and Ray-Chaudhuri [1] were the first to use finite
geometries to construct combinatorial filing schemes. Their method was
developed for binary-valued attributes. It consisted of forming groups
of records in such a manner that the group containing records pertaining
to a given query could be determined algebraically, thus expediting the
search. Ray-Chaudhuri [31] discussed some further combinatorial
properties of file organization schemes for binary-valued attributes.
Ghosh and Abraham [23] developed the theory for file organization
schemes for multiple valued attributes, where the attributes have an
equal number of possible values. However these schemes were limited to
queries involving two values from two different attributes.
As an example we consider a filing scheme obtained by Abraham,
Ghosh, and Ray-Chaudhuri [1]. Let $A_1, A_2, \ldots, A_\ell$ be $\ell$ attributes each
with two levels. Consider a PG(N,q) such that $\ell = \phi(N,0,q)$. Then
the points of the geometry are identified with the attributes, the
correspondence being one-one. We assume that the retrieval pertains to
only one level of each attribute. The buckets are identified with
the lines of the geometry. So the number of buckets is $b = \phi(N,1,q)$
and the size of the bucket is $\phi(1,0,q) = q+1$. Since any two points
will lie on exactly one line of the geometry, any pair of attributes
will belong to exactly one bucket. In any bucket there are $\binom{q+1}{2}$
pairs of attributes, each pair corresponding to a query of size two.
Theorem (1.4.1). Given that the retrieval pertains to exactly one
level of each attribute, there exists a filing scheme for queries of
size two with $\ell = \phi(N,0,q)$ attributes each with two levels
(i.e. $n_1 = n_2 = \cdots = n_\ell = 2$) and with $b = \phi(N,1,q)$ blocks.
Example (1.4.1). Suppose there are $\ell = 7$ attributes each with
two levels. Consider the projective geometry PG(2,2). The points of
this geometry are triplets. For convenience, we shall write them as
001, 010, 011, 100, 101, 110 and 111. The lines of the geometry are
$x_0 = 0$; $x_1 = 0$; $x_2 = 0$; $x_0+x_1 = 0$; $x_0+x_2 = 0$; $x_1+x_2 = 0$;
and $x_0+x_1+x_2 = 0$.
The attributes, together with the corresponding points of PG(2,2),
are
$A_1 = 001$, $A_2 = 010$, $A_3 = 011$, $A_4 = 100$, $A_5 = 101$, $A_6 = 110$, $A_7 = 111$.
The buckets are constructed by storing in them the accession numbers
of records which have the following pairs of attributes.
Bucket No.    Attribute pairs                       Identification
1             $A_1A_2$; $A_1A_3$; $A_2A_3$          100
2             $A_1A_4$; $A_1A_5$; $A_4A_5$          010
3             $A_2A_4$; $A_2A_6$; $A_4A_6$          001
4             $A_1A_6$; $A_1A_7$; $A_6A_7$          110
5             $A_2A_5$; $A_2A_7$; $A_5A_7$          101
6             $A_3A_4$; $A_4A_7$; $A_3A_7$          011
7             $A_3A_5$; $A_3A_6$; $A_5A_6$          111
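The bucket table of Example (1.4.1) can be regenerated mechanically from the geometry. The following sketch assumes, as in the example, that attribute $A_i$ is the point whose coordinates are the binary digits of i and that a bucket is identified by the coefficient triplet of its line; the helper names are illustrative.

```python
from itertools import combinations, product

# Nonzero triplets over GF(2): they serve both as the 7 points of PG(2,2)
# and as the coefficient triplets (a0, a1, a2) of its 7 lines.
triplets = [t for t in product((0, 1), repeat=3) if any(t)]


def name(point):
    """Attribute A_i whose index i has the point's coordinates as binary digits."""
    x0, x1, x2 = point
    return f"A{4 * x0 + 2 * x1 + x2}"


def on_line(point, line):
    """True when a0*x0 + a1*x1 + a2*x2 = 0 in GF(2)."""
    return sum(a * x for a, x in zip(line, point)) % 2 == 0


buckets = {}
for line in triplets:
    members = sorted(name(p) for p in triplets if on_line(p, line))
    ident = "".join(map(str, line))           # bucket identification number
    buckets[ident] = list(combinations(members, 2))
```

Every one of the 21 attribute pairs lands in exactly one bucket, which is the property that makes queries of size two resolvable by locating a single bucket.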
Each bucket has an identification number, which is taken to be the
triplet of the coefficients of the equation of the line corresponding to
the bucket. Within each bucket the accession numbers of records are
subdivided into subgroups called subbuckets corresponding to each pair
of attributes. The identification number of each subbucket is obtained
by concatenation of the binary representation of the two attributes
corresponding to it. Thus the subbuckets are represented as follows:
Bucket Id.    Subbucket Id.    Accession numbers
100           001 010
              001 011
              010 011
010           001 100
              001 101
              100 101
and so on.
In storing the accession numbers of the records in the subbuckets of any
bucket, it is likely that the same accession number is entitled to be
stored in more than one subbucket, but this is avoided by storing it in
the first subbucket it is entitled to. Suppose a record is represented
by $A_1 A_2 A_3 A_4 \bar{A}_5 \bar{A}_6 \bar{A}_7$ (where $\bar{A}_i$ indicates the absence of
$A_i$); then the accession number of this record is stored in the
subbucket 001 010 under the bucket 100 and not in 001 011 or 010 011.
This accession number, however, is stored again in the subbucket
001 100 of the bucket 010; 010 100 of 001 and 011 100 of 011. Thus it is
obvious that in this scheme any accession number is stored more than
once, and this is the price that has to be paid for fast retrieval.
Suppose the following query was posed, "all records with $A_2$ and
$A_4$ are to be retrieved". Then the search procedure is as follows:
The attributes $A_2$ and $A_4$ are converted to points 010 and 100
respectively and the equation of the line containing these two points
in PG(2,2) is determined by substituting the coordinates of these
two points in the general equation
$$a_0 x_0 + a_1 x_1 + a_2 x_2 = 0$$
and solving for $a_0$, $a_1$ and $a_2$ in GF(2). On substitution, we have
$a_0 = a_1 = 0$, hence the related line is $x_2 = 0$ and the
identification number of the corresponding bucket is 001. Inside this
bucket, the subbucket with identification number 010 100 is searched
and this happens to be the first subbucket number in the bucket 001.
Hence all the accession numbers in this subbucket are the required
accession numbers.
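The algebraic step of this search, finding the line through two points of PG(2,2), can be sketched as follows. The code is an illustration rather than the original procedure: it simply tries all seven nonzero coefficient triplets instead of solving the equations symbolically, and the function names are hypothetical.

```python
from itertools import product


def point_of(i):
    """Coordinates (x0, x1, x2) of attribute A_i: the binary digits of i, 1 <= i <= 7."""
    return (i >> 2) & 1, (i >> 1) & 1, i & 1


def bucket_for(i, j):
    """Identification number of the bucket (line of PG(2,2)) containing A_i and A_j."""
    if i == j:
        raise ValueError("a query of size two needs two distinct attributes")
    pts = (point_of(i), point_of(j))
    solutions = [
        a for a in product((0, 1), repeat=3)
        if any(a) and all(sum(c * x for c, x in zip(a, p)) % 2 == 0 for p in pts)
    ]
    (a,) = solutions  # exactly one line passes through two distinct points
    return "".join(map(str, a))


bucket_a2_a4 = bucket_for(2, 4)  # the query of the text: A_2 and A_4
```

For the query above this yields the bucket 001, i.e. the line $x_2 = 0$, in agreement with the substitution carried out by hand.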
If the query was to find all the accession numbers
of the records which have $A_2$ and $A_6$, then the search procedure would
lead to the second subbucket, namely 010 110 in 001. In that case,
all the accession numbers in this subbucket have to be retrieved and the
accession numbers in the previous subbucket have to be searched to find
$A_2 A_6$. This can be made an easy task by grouping beforehand the
accession numbers in the subbucket 010 100 into two groups, namely,
the accession numbers of $A_2 A_4 A_6$, and the remaining, and then using the
chaining technique between subbucket 010 110 and the group of
accession numbers under $A_2 A_4 A_6$ in the subbucket 010 100.
The retrieval time of the records pertaining to a query for the
above filing scheme can be split up into the following components:
1. $T_1$ = time needed to solve the algebraic equation to determine
   the bucket,
2. $T_2$ = time needed for matching the bucket identification number,
3. $T_3$ = time needed for matching the subbucket identification number,
4. $T_4$ = time needed for chaining the subbuckets, if needed.
Hence the total retrieval time for any query $\lambda$ is
$$T(\lambda) = T_1 + T_2 + T_3 + T_4.$$
1.5 Combinatorial Configurations
A combinatorial configuration $(\Omega,k,\Delta,b)$ consists of a master set
$\Omega$ (the set of all attribute levels), a class of subsets $\Delta$ (the set
of queries) and blocks (buckets) $B_1, B_2, \ldots, B_b$ (which are certain
subsets of $\Omega$) such that
i) [...] for all $h$,
ii) for every $\lambda \in \Delta$, there exists an $h$ such that $\lambda$ is
contained in $B_h$.
If $|\lambda| \le t$ for all $\lambda$ in $\Delta$, then the configuration is said to be of
order $t$, and is known as an $(\Omega,k,t,b)$ configuration. The actual
construction of $(\Omega,k,t,b)$ configurations with minimum $k$ and $b$ is a very
difficult problem in combinatorial mathematics. For the case of
$t = 2$, such configurations are essentially equivalent to certain group
divisible (GD) designs used in statistics.
In most situations, optimal schemes are largely unknown and perhaps can
be found through systematic trial and error.
A combinatorial filing scheme may be based on a combinatorial configuration.
The blocks $B_1, B_2, \ldots, B_b$ are arranged in serial order. For each
$\lambda$ in $\Delta$, define $\gamma(\lambda) = h$ if $\lambda$ is contained in $B_h$ but
is not contained in $B_{h'}$ for $h' < h$. Hence $B_h$ is the first block
which contains $\lambda$. Let $S_h$ denote the collection of all subsets $\lambda$
of $\Omega$ such that $\gamma(\lambda) = h$. To each combination of $\lambda$ and $h$
let there correspond sufficiently large disjoint subsets $M_{h,\lambda}$ of $M$. The
accession number of the I-th record is stored in an element of
the subset $M_{h,\lambda}$ if and only if the largest set which $f(I)$ has in
common with $B_h$ is $\lambda$ in $S_h$; i.e., if
$$f(I) \cap B_h = \lambda.$$
Let
$$M_h = \bigcup_{\lambda \in S_h} M_{h,\lambda}.$$
The sets $M_h$ may be called the buckets of the filing scheme while the
subsets $M_{h,\lambda}$ may be called the subbuckets.
The retrieval procedure for any query simply involves the determination of the appropriate bucket by identifying the first block which
contains the subset specified in the query. Then, all subbuckets
corresponding to subsets which contain the query set are located and the
accession numbers obtained. Thus, the retrieval function may be
formally written as
$$r(\lambda) = \bigcup_{\lambda \subseteq \lambda' \subseteq B_h} M_{h,\lambda'}$$
where $\lambda \in \Delta$, and $\gamma(\lambda) = h$. From the preceding remarks, one can see
that once a combinatorial configuration has been constructed, a reasonable filing scheme may be readily based on it.
In particular, Bose,
Abraham and Ghosh [7] have used a procedure similar to this.
Finally
the concepts of combinatorial configuration and combinatorial filing
schemes that are described here are equivalent to the ones considered
by Ray-Chaudhuri [31] for the situation in which only one level of any
attribute was of interest with respect to retrieval.
1.6 Second order combinatorial configurations
The problem of constructing second order combinatorial configurations is essentially the same as that of constructing certain incomplete
block designs used in statistical research.
Of special interest are
balanced incomplete block designs and group divisible designs.
The
combinatorial properties of these designs have been studied by a number
of researchers.
For further reference we refer to Bose [3], Bose [4],
Bose, Shrikhande and Bhattacharya [12], Rao [28], and Sprott [33].
A balanced incomplete block (BIB) design with parameters $(v,b,r,k,\lambda)$
is an arrangement of $v$ objects into $b$ subsets, called blocks, such
that
i) each block contains exactly $k$ objects,
ii) each object occurs in $r$ distinct blocks,
and iii) each pair of objects occurs together in $\lambda$ distinct blocks.
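Conditions i) to iii) can be verified directly for a concrete family of blocks. The sketch below is illustrative: the function names are hypothetical, and the cyclic blocks {i, i+1, i+3} (mod 7) are one standard way of writing a (7,7,3,3,1) design, chosen here as an assumption. The second function checks the well-known counting identities $bk = vr$ and $r(k-1) = \lambda(v-1)$, which are necessary for a BIB design to exist but are not stated in the text above.

```python
from itertools import combinations


def is_bib(blocks, v, b, r, k, lam):
    """Check conditions i)-iii) of a BIB design on the objects 1..v."""
    objects = range(1, v + 1)
    return (
        len(blocks) == b
        and all(len(set(B)) == k for B in blocks)                        # i)
        and all(sum(x in B for B in blocks) == r for x in objects)       # ii)
        and all(sum(x in B and y in B for B in blocks) == lam            # iii)
                for x, y in combinations(objects, 2))
    )


def counting_conditions(v, b, r, k, lam):
    """Necessary (not sufficient) parameter identities for a BIB design."""
    return b * k == v * r and r * (k - 1) == lam * (v - 1)


# Cyclic blocks {i, i+1, i+3} mod 7, relabelled to the objects 1..7.
fano = [{(i + d) % 7 + 1 for d in (0, 1, 3)} for i in range(7)]
fano_is_bib = is_bib(fano, 7, 7, 3, 3, 1)
```

A parameter set that fails `counting_conditions` can be discarded immediately; one that passes still requires an explicit construction such as the one checked by `is_bib`.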
Assuming that the retrieval pertains to only one level of an
attribute, the problem of construction of a combinatorial configuration
$(\Omega,k,2,b)$ where $\Omega = \{A_1, A_2, \ldots, A_\ell\}$ is exactly the same as the
construction of a BIB design with parameters $(\ell,b,r,k,\lambda=1)$. For any
given value of $k$ the BIB design, if it exists, has the minimum number
of blocks. Hence the corresponding combinatorial configuration is
optimal. There may not exist BIB designs for all possible choices of
$v$, $b$, $r$, $k$ and $\lambda$. Some existing designs are, for $\lambda = 1$:
  v      b      r      k
  7      7      3      3
  13     13     4      4
  13     26     6      3
  15     35     7      3
  16     20     5      4
  19     57     9      3
  25     50     8      4
  25     100    12     3
  27     117    13     3
  28     63     9      4
  40     130    13     4
  66     143    13     6
  91     195    15     7
  113    226    16     8
  145    232    16     10
  145    290    18     9
The above list is by no means complete. We shall delay the description
of GD designs until the next chapter.
Example (1.6.1). Suppose there are seven attributes each with two
levels. We shall assume that the retrieval pertains to only one level
of each attribute. Let $A_1, A_2, A_3, A_4, A_5, A_6$ and $A_7$ be the
attributes. The buckets are identified with the blocks of a BIB design with
parameters (7,7,3,3,1). So the buckets are
$B_1 = \{A_1,A_2,A_4\}$, $B_2 = \{A_2,A_3,A_5\}$, $B_3 = \{A_3,A_4,A_6\}$,
$B_4 = \{A_4,A_5,A_7\}$, $B_5 = \{A_5,A_6,A_1\}$, $B_6 = \{A_6,A_7,A_2\}$,
$B_7 = \{A_7,A_1,A_3\}$.
The subbuckets are formed by considering all subsets in a bucket.
That is,
$M_1$: $\{A_1\}$; $\{A_2\}$; $\{A_4\}$; $\{A_1A_2\}$; $\{A_1A_4\}$; $\{A_2A_4\}$; $\{A_1A_2A_4\}$,
$M_2$: $\{A_3\}$; $\{A_5\}$; $\{A_2A_3\}$; $\{A_2A_5\}$; $\{A_3A_5\}$; $\{A_2A_3A_5\}$,
$M_3$: $\{A_6\}$; $\{A_3A_6\}$; $\{A_3A_4\}$; $\{A_4A_6\}$; $\{A_3A_4A_6\}$,
$M_4$: $\{A_7\}$; $\{A_4A_5\}$; $\{A_4A_7\}$; $\{A_5A_7\}$; $\{A_4A_5A_7\}$,
$M_5$: $\{A_1A_5\}$; $\{A_1A_6\}$; $\{A_5A_6\}$; $\{A_1A_5A_6\}$,
$M_6$: $\{A_6A_7\}$; $\{A_2A_7\}$; $\{A_2A_6\}$; $\{A_2A_6A_7\}$,
$M_7$: $\{A_3A_7\}$; $\{A_1A_7\}$; $\{A_1A_3\}$; $\{A_1A_3A_7\}$.
That is, a subset $\lambda$ of $B_i$ is taken as a subbucket in the corresponding
$M_i$ only if $\gamma(\lambda) = i$. The subset $A_2$ in $B_2$ is not considered as a
subbucket in $M_2$, since $\gamma(A_2) = 1$.
Let $I$ be any record such that
$$f(I) = \{A_1, \bar{A}_2, A_3, \bar{A}_4, A_5, \bar{A}_6, \bar{A}_7\}$$
where $A_i$ indicates the presence of the i-th attribute and $\bar{A}_i$ indicates
the absence of it. Then
$f(I) \cap B_1 = \{A_1\}$,       $\gamma(A_1) = 1$;
$f(I) \cap B_2 = \{A_3A_5\}$,    $\gamma(A_3A_5) = 2$;
$f(I) \cap B_3 = \{A_3\}$,       $\gamma(A_3) = 2$;
$f(I) \cap B_4 = \{A_5\}$,       $\gamma(A_5) = 2$;
$f(I) \cap B_5 = \{A_1A_5\}$,    $\gamma(A_1A_5) = 5$;
$f(I) \cap B_6 = \emptyset$
and
$f(I) \cap B_7 = \{A_1A_3\}$,    $\gamma(A_1A_3) = 7$.
Hence the I-th record accession number will be stored in the subbucket
{AI}
of the 1st bucket. in the subbucket
in the subbucket
{AlAS}
of the seventh bucket.
{A A }
3 S
of the second bucket.
of the fifth bucket and in the subbucket
{A A }
l 3
That is, a record is stored in the subbucket representing the maximal subset common to the record and the bucket. So when retrieving a query we may have to search in more than one subbucket. This can be taken care of by chaining the appropriate subbuckets. For example, in the first bucket the subbucket {A_1} will be chained to {A_1A_2}, {A_1A_4} and {A_1A_2A_4}.

To retrieve the query {A_3}, first we have to determine the bucket that contains the set A_3. Since γ(A_3) = 2, the bucket under consideration is the second bucket. The subbuckets {A_3}, {A_2A_3}, {A_3A_5} and {A_2A_3A_5} contain all the accession numbers that satisfy the query.
Similarly we can retrieve all records that satisfy queries of size one
and two.
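The storage and retrieval rules of Example (1.6.1) can be sketched as follows. This is a minimal illustration, not the author's implementation: attributes A_1, ..., A_7 are written as the integers 1 to 7, and the function γ is assumed to assign a subset to the lowest-numbered bucket containing it, which agrees with every value of γ quoted in the example (for pairs and triples that bucket is unique). The function names are hypothetical.

```python
from itertools import combinations

# Buckets of Example (1.6.1); attribute A_i is written as i.
BUCKETS = {1: {1, 2, 4}, 2: {2, 3, 5}, 3: {3, 4, 6}, 4: {4, 5, 7},
           5: {5, 6, 1}, 6: {6, 7, 2}, 7: {7, 1, 3}}

def gamma(subset):
    """Bucket that owns `subset` (assumed rule: lowest-numbered bucket containing it)."""
    return min(i for i, blk in BUCKETS.items() if set(subset).issubset(blk))

def store(record):
    """Map bucket index to the subbucket that receives the record's accession number."""
    placed = {}
    for i, blk in BUCKETS.items():
        common = frozenset(record) & blk      # maximal subset common to record and bucket
        if common and gamma(common) == i:
            placed[i] = common
    return placed

def chained_subbuckets(query):
    """Bucket to search for `query` and the subbuckets in it that contain the query."""
    i = gamma(query)
    members = sorted(BUCKETS[i])
    subs = [set(s) for n in (1, 2, 3) for s in combinations(members, n)
            if set(query).issubset(s) and gamma(s) == i]
    return i, subs

print(store({1, 3, 5}))
print(chained_subbuckets({3}))
```

For the record with attributes A_1, A_3, A_5 the sketch places the accession number in buckets 1, 2, 5 and 7, and for the query {A_3} it returns the second bucket with the four subbuckets listed in the text.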
Another type of combinatorial configuration is obtained by Ray-Chaudhuri [31] using covers in finite projective spaces. An m-flat Σ_m in PG(N,q) is said to cover a (t-1)-flat Σ_{t-1} if Σ_m ⊇ Σ_{t-1}, where N ≥ m ≥ t-1. A class of b m-flats is defined to be a (b,t,m) cover if every (t-1)-flat in PG(N,q) is contained in at least one of the m-flats belonging to the class. A (b,t,m) cover is called a minimum (b,t,m) cover if it contains the minimum number b_0 = b_0(N,t,m,q) of m-flats required to cover every (t-1)-flat.

An important result in this connection is obtained by Ray-Chaudhuri [31].

Theorem (1.6.1). There exists a (Ω,k,t,b) combinatorial configuration for the case ℓ = (q^{N+1} - 1)/(q - 1), k = (q^{m+1} - 1)/(q - 1), b = b_0(N,t,m,q), where N ≥ m ≥ t-1 and q is a prime power. The retrieval pertains to only one level of each attribute.

Some additional methods for forming covers are obtained by Koch [25].
CHAPTER II
MISCELLANEOUS FILING SCHEMES
In this chapter we obtain miscellaneous combinatorial configurations.
Some of these configurations are improvements over the existing ones.
We give a simple construction procedure for these designs, beginning with
a description of the existing ones.
2.1 Orthogonal arrays and partially balanced arrays

The problem of constructing combinatorial configurations with k = ℓ is equivalent to the problem of forming an array of ordered ℓ-tuples (in which each coordinate corresponds to a unique attribute) in such a way that every possible ordered combination of t coordinates occurs at least once. For the case in which Ω consists of ℓ attributes, each with n (n_1 = n_2 = ... = n_ℓ = n) levels, and in which all t-tuples occur exactly once, such a construction is called an orthogonal array of strength t, constraints ℓ, and index unity. It is represented by (b,ℓ,n,t). Such arrays have been studied by Bose and Bush [9], Bush [17] and many others. For large values of ℓ, it becomes very difficult to construct orthogonal arrays with index unity.

But for our purpose we do not need every t-tuple to be covered exactly once, only at least once. This leads to the concept of a partially balanced array, as defined by Chakravarti [18], [19].
Definition (2.1.1). A partially balanced array of strength t in b blocks, ℓ attributes with n levels each, is equivalent to a (b×ℓ) matrix in which, among the rows of each t column submatrix, every possible permutation of the values in the vector (u_1,u_2,...,u_t) occurs exactly λ(u_1,u_2,...,u_t) times, where the value of λ(u_1,u_2,...,u_t) does not depend upon which t columns are chosen.

For our purpose we need those partially balanced arrays for which a majority of λ(u_1,u_2,...,u_t) are unity. As with orthogonal arrays, the problem of constructing partially balanced arrays for large t with λ's near unity is another very difficult problem.
2.2 Composition

One way of tackling the problem for higher values of t is to use the method of composition, as suggested by Koch [25]. This procedure involves two steps: i) construct efficient partially balanced arrays for small values of t, and then ii) expand these arrays to obtain higher values of t. The arrays so obtained for large t may not be efficient. That is, we may have more blocks than are necessary. Koch [25] applied this procedure to obtain configurations of order two, three, and four when the attributes had q (a prime power) levels.
2.3 Configurations of order two with k = ℓ

Let us assume that there are ℓ attributes each with two levels, i.e. n = 2. A combinatorial configuration of order two, with k = ℓ, can be represented by a (b×ℓ) matrix in which, among the rows of each two column submatrix, each of the four possible ordered 2-tuples (00), (01), (10), (11) occurs at least once. In this section, a method for constructing such matrices will be studied. Consider the following (4×3) matrix

    111
    100
    010          (2.3.1)
    001

This represents an efficient configuration of order two for three attributes. Further λ(00) = λ(01) = λ(10) = λ(11) = 1. Similarly for four attributes an efficient configuration of order two is given by

    1111
    1000
    0100         (2.3.2)
    0010
    0001

It has b = 5; λ(11) = λ(10) = λ(01) = 1 and λ(00) = 2. In fact the above design is a partially balanced array of strength two. The matrices (2.3.1) and (2.3.2) have a systematic structure, namely, the first row consists of ones and every column has exactly [b/2] ones. Generalizing this idea, we have
Theorem (2.3.1). Given ℓ attributes each with two levels, there exists a configuration of order two with b = a blocks of size ℓ, where a is the least integer such that

    \binom{a-1}{[a/2]-1} \ge \ell.          (2.3.3)

Proof. Let G be an (a×ℓ) matrix such that
  i) the first element in each column is "one",
  ii) every column has exactly [a/2] "ones", and
  iii) all columns are distinct,
where a is the number of rows and [a/2] denotes the integral part of a/2. Such a matrix exists by (2.3.3). Let c_1 and c_2 be any two columns of G, and let

                      | 1           1          |
                      | c_{11}      c_{12}     |
    C = [c_1  c_2] =  | c_{21}      c_{22}     |          (2.3.4)
                      |  ...         ...       |
                      | c_{a-1,1}   c_{a-1,2}  |

where c_{ij} = 0 or 1 for all i and j. Clearly (11) occurs at least once in C. The ones in c_1 and c_2 do not coincide, as c_1 and c_2 have exactly the same number of "ones" and are distinct. Hence (01) and (10) will each occur at least once in C. Finally (00) occurs at least once as 2[a/2] - 2 ≤ a - 2. Hence G is a partially balanced array of strength two.

Example (2.3.1). Let ℓ = 9. If a = 6 then \binom{a-1}{[a/2]-1} = \binom{5}{2} = 10 ≥ 9. The (6×9) matrix is

    111 111 111
    111 100 000
    100 011 100
    010 010 011
    001 001 010
    000 100 101

Example (2.3.2). Let ℓ = 12. Then \binom{7-1}{3-1} = 15 > 12. So the design is

    111 111 111 111
    111 110 000 000
    100 001 111 000
    010 001 000 111
    001 000 100 100
    000 100 010 010
    000 010 001 001

The columns are in lexicographic order.
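The construction in the proof of Theorem (2.3.1) can be sketched as below. The sketch assumes that the columns are taken in the lexicographic order produced by `itertools.combinations`, which reproduces the two example matrices; the function names are illustrative.

```python
from itertools import combinations
from math import comb

def least_a(l):
    """Least a with C(a-1, [a/2]-1) >= l, as in (2.3.3)."""
    a = 2
    while l > comb(a - 1, a // 2 - 1):
        a += 1
    return a

def order_two_matrix(l):
    """Rows (as 0-1 strings) of the (a x l) matrix G built in the proof."""
    a = least_a(l)
    ones = a // 2                                   # [a/2] ones per column
    cols = []
    for rest in combinations(range(1, a), ones - 1):  # lexicographic order
        col = [0] * a
        col[0] = 1                                  # first element of each column is one
        for i in rest:
            col[i] = 1
        cols.append(col)
        if len(cols) == l:
            break
    return ["".join(str(col[i]) for col in cols) for i in range(a)]

def covers_all_pairs(rows):
    """True if every two columns show (00), (01), (10) and (11) at least once."""
    needed = {("0", "0"), ("0", "1"), ("1", "0"), ("1", "1")}
    width = len(rows[0])
    return all({(r[i], r[j]) for r in rows} == needed
               for i, j in combinations(range(width), 2))

for row in order_two_matrix(9):
    print(row)
```

Calling `covers_all_pairs(order_two_matrix(12))` checks the second example in the same way.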
In particular, the following table indicates the relative sizes of ℓ and b.

      ℓ     b         ℓ     b         ℓ     b
      3     4        12     7        54     9
      4     5        18     8        81    10
      6     6        27     8       108    10
      9     6        36     9       162    11

In the special case when ℓ = 3^u, we have a = 2u + 2 for u ≤ 5, whereas for the same case the composition procedure gives b = 3u + 1.
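The block counts in the table follow directly from (2.3.3). The sketch below recomputes them and, for ℓ = 3^u, sets them beside the value 3u + 1 quoted for the composition procedure. The function name is illustrative.

```python
from math import comb

def blocks_needed(l):
    """Least a with C(a-1, [a/2]-1) >= l, i.e. the number of blocks in Theorem (2.3.1)."""
    a = 2
    while l > comb(a - 1, a // 2 - 1):
        a += 1
    return a

table = {l: blocks_needed(l) for l in (3, 4, 6, 9, 12, 18, 27, 36, 54, 81, 108, 162)}
print(table)

# For l = 3**u: blocks from (2.3.3) against 3u + 1 from composition.
for u in range(1, 6):
    print(u, 3 ** u, blocks_needed(3 ** u), 3 * u + 1)
```

For u = 1 the two methods need the same number of blocks; from u = 2 onward the direct construction needs fewer.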
2.4 Configurations of order two, with k ≠ ℓ

Construction of configurations of order two with k ≠ ℓ is equivalent to the construction of certain group divisible designs.

Definition (2.4.1). A group divisible design is an arrangement of v = ℓn objects, belonging to ℓ groups of n objects each, into b blocks such that
  i) each block contains k objects,
  ii) each object occurs in r distinct blocks,
  iii) each pair of objects belonging to the same group occurs together in λ_1 blocks, and
  iv) each pair of objects belonging to different groups occurs together in λ_2 blocks.

So, a GD design with λ_1 = 0, λ_2 = 1 represents a combinatorial configuration of order two appropriate to the multi-level attribute case with ℓ attributes, each with n levels. These configurations are optimal in the sense that each pair of levels of different attributes is covered exactly once. Hence for a given value of k, b is minimum. But GD designs may not always exist. One simple method of construction of these designs is given by Bose, Shrikhande and Bhattacharya [12].

Theorem (2.4.1). By omitting a particular treatment a, together with all the blocks containing a, from a BIB design with parameters v*, b*, r*, k*, λ* = 1, we get a GD design with parameters

    v = v* - 1,  b = b* - r*,  r = r* - 1,  k = k*,  ℓ = r*,  n = k* - 1,  λ_1 = 0,  λ_2 = 1.
In particular if we start with a BIB design belonging to the series OS1, with parameters v = q^2, b = q^2 + q, r = q + 1, k = q, λ = 1 (q is a prime power), we get a series of GD designs with v = b = q^2 - 1, r = k = q, ℓ = q + 1, n = q - 1. As an example, take q = 3. Then v = b = 8, r = k = 3, ℓ = 4, n = 2, λ_1 = 0, λ_2 = 1.
20
31
4
1
30
41
1
40
10
20
27
0
n
= 2,
Al
= 0,
A2
=1
.
11
2
0
3
0
2
1
3
0
4
0
1
4
0
11
1
11
2
2
3
1
3
4
1
0
1
1
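The construction of Theorem (2.4.1) can be replayed for q = 3 with the sketch below. It starts from a BIB design with v = 9, b = 12, r = 4, k = 3, λ = 1, here taken to be the affine plane over the integers mod 3 (an assumption; the particular design and labelling are chosen only for illustration), omits one treatment together with its blocks, and checks the GD parameters.

```python
from itertools import combinations, product

q = 3
points = list(product(range(q), repeat=2))

def affine_lines(q):
    """All q*q + q lines of the affine plane over the integers mod q (q prime)."""
    lines = set()
    for m, c in product(range(q), repeat=2):               # lines y = m*x + c
        lines.add(frozenset((x, (m * x + c) % q) for x in range(q)))
    for c in range(q):                                      # lines x = c
        lines.add(frozenset((c, y) for y in range(q)))
    return lines

omitted = (0, 0)
lines = affine_lines(q)
groups = [ln - {omitted} for ln in lines if omitted in ln]   # one group per omitted block
blocks = [ln for ln in lines if omitted not in ln]
objects = [p for p in points if p != omitted]

def together(p, s):
    """Number of remaining blocks containing both p and s."""
    return sum(1 for blk in blocks if p in blk and s in blk)

same_group = {frozenset(pair) for g in groups for pair in combinations(g, 2)}
lam1 = {together(p, s) for p, s in combinations(objects, 2)
        if frozenset((p, s)) in same_group}
lam2 = {together(p, s) for p, s in combinations(objects, 2)
        if frozenset((p, s)) not in same_group}
reps = {sum(1 for blk in blocks if p in blk) for p in objects}

print(len(objects), len(blocks), reps, len(groups), lam1, lam2)
```

The printed values correspond to v = b = 8, r = 3, ℓ = 4, λ_1 = 0 and λ_2 = 1.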
In the rest of this section we develop a method for constructing configurations of order two, when k ≠ ℓ.

Lemma (2.4.1). Given an integer m, there exists an (N×m) zero-one matrix such that in any two columns taken together the pairs (10), (01) occur at least once. Further

    N ≤ minimum { α : \binom{α}{[α/2]} ≥ m }.

Proof. Choose any m of the possible \binom{α}{[α/2]} columns with [α/2] "ones". Since all the columns have exactly the same number of "ones" and are all distinct, the pairs (10), (01) will occur at least once in any two columns taken together.
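The column choice in the proof of Lemma (2.4.1) can be sketched as follows. The helper names are illustrative, and the columns are simply the first m in the order produced by `itertools.combinations`, which is one admissible choice among many.

```python
from itertools import combinations
from math import comb

def rows_needed(m):
    """Least alpha with C(alpha, [alpha/2]) >= m."""
    alpha = 1
    while m > comb(alpha, alpha // 2):
        alpha += 1
    return alpha

def separating_columns(m):
    """m distinct 0-1 columns of length alpha, each with exactly [alpha/2] ones."""
    alpha = rows_needed(m)
    cols = []
    for ones in combinations(range(alpha), alpha // 2):
        cols.append(tuple(1 if i in ones else 0 for i in range(alpha)))
        if len(cols) == m:
            break
    return cols

def has_10_and_01(c1, c2):
    """True if the two columns, read row by row, show both (10) and (01)."""
    pairs = set(zip(c1, c2))
    return (1, 0) in pairs and (0, 1) in pairs

cols = separating_columns(6)
print(rows_needed(6), cols)
```

Because two distinct columns of equal weight cannot contain one another, every pair of columns passes `has_10_and_01`.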
Example (2.4.1)

    1_0 2_0 3_0        1_0 2_1 4_1
    1_1 3_0 4_1        2_1 3_0 4_0
    1_1 2_1 3_1        1_1 2_0 4_0
    1_0 3_1 4_0        2_0 3_1 4_1

This represents a configuration of four treatments each with two levels in blocks of size three.
Lemma (2.4.2). Given 4m treatments each with two levels there exists a configuration of order two with b blocks of size 3m, where

    b ≤ 8 + α + [α/3]^+.          (2.4.1)

Proof. Define

    W_i = [(i-1)m+1, (i-1)m+2, ..., im],   i = 1,2,3,4.

That is, W_1 is an m-tuple consisting of the first m treatments and so on. Denote by W_i^j the j-th level of all the m treatments in W_i. Then the first eight blocks can be taken as

    W_1^0 W_2^0 W_3^0        W_1^0 W_2^1 W_4^1
    W_1^1 W_3^0 W_4^1        W_2^1 W_3^0 W_4^0
    W_1^1 W_2^1 W_3^1        W_1^1 W_2^0 W_4^0
    W_1^0 W_3^1 W_4^0        W_2^0 W_3^1 W_4^1

These eight blocks do not cover the pairs (01), (10) corresponding to the treatments within a W_i (i = 1,2,3,4). We need at most α blocks to cover these pairs (within a W_i), where (from lemma 2.4.1)

    α = minimum { α : \binom{α}{[α/2]} ≥ m }.

Hence

    b ≤ 8 + α + [α/3]^+,

where [α/3]^+ (the least integer greater than or equal to α/3) is the number of extra blocks needed as the block size is 3m.
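The first step of the proof of Lemma (2.4.2) can be checked mechanically. The sketch below expands the eight W-blocks for a given m, lists the level pairs they leave uncovered, and evaluates the bound 8 + α + [α/3]^+. The data layout (a block as a dictionary from treatment to level) and the function names are assumptions made for illustration.

```python
from itertools import combinations, product
from math import ceil, comb

# Each block lists (i, j) meaning W_i at level j.
W_BLOCKS = [
    ((1, 0), (2, 0), (3, 0)), ((1, 0), (2, 1), (4, 1)),
    ((1, 1), (3, 0), (4, 1)), ((2, 1), (3, 0), (4, 0)),
    ((1, 1), (2, 1), (3, 1)), ((1, 1), (2, 0), (4, 0)),
    ((1, 0), (3, 1), (4, 0)), ((2, 0), (3, 1), (4, 1)),
]

def expand(m):
    """Blocks of size 3m over treatments 1..4m, each as {treatment: level}."""
    blocks = []
    for blk in W_BLOCKS:
        levels = {}
        for i, level in blk:
            for t in range((i - 1) * m + 1, i * m + 1):
                levels[t] = level
        blocks.append(levels)
    return blocks

def uncovered_pairs(blocks, m):
    """All (s, t, level pair) with s before t that no block covers."""
    missing = []
    for s, t in combinations(range(1, 4 * m + 1), 2):
        seen = {(b[s], b[t]) for b in blocks if s in b and t in b}
        missing += [(s, t, want) for want in product((0, 1), repeat=2)
                    if want not in seen]
    return missing

def block_bound(m):
    """Right-hand side of (2.4.1)."""
    alpha = 1
    while m > comb(alpha, alpha // 2):
        alpha += 1
    return 8 + alpha + ceil(alpha / 3)

missing = uncovered_pairs(expand(3), 3)
print(len(missing), block_bound(3))
```

For m = 3 every uncovered pair lies inside a single W_i and is of type (01) or (10), as the proof states.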
Example (2.4.2). ℓ = 12, k = 9 and m = 3, so α = 3. Then b ≤ 8 + 3 + 1 = 12.
7
8
9
0 20 3 0
40 50 60
10 20 30
41 51 6 1
10 11 12
1 1 1
11 2 1 31
7
8
0
10 11 12
1 1 1
41 51 6 1
7
8
0
10 11 12
0 0 0
11 2 1 31
41 51 61
11 2 1 3 1
4 0 50 60
10 11 12
0 0 0
1
3
0 20 0
7
8
1
10 11 12
0 0 0
4
6
0 50 0
7
1
10 11 12
1 1 1
0
0
1
1
0
0
8
1
1
9
9
9
9
0
\
7
0
0
81 9 1
8
9
11 20 30
4 1 50 60
1
40 51 61
10 11 12
1 0 0
0 21 31
7
0
10 11 12
0 1 0
4 50 6
1
0
7
1
10 11 12
0 0 1
1
0
2
1
Theorem {2.4.3}.
3
0
Given
0
0
8
8
1
0
4m
9
9
1
0
e~
0
attributes each with two levels there
exists a combinatorial configuration of order two with
size
3m.
If the configuration for
otherwise
30
V
= 2m,
k = 3m
2
b
blocks of
exists then
where
is the number of blocks required for
b*
configuration of order two
and
a
v = 2m
and
is an integer such that
m.
Proof.
v = 2m,
Suppose that the configuration
k = 3m
2
and
b*
blocks exists.
C of order two with
Then
m is clearly even.
We will extend this configuration
C to a new configuration
follows.
Replace every attribute
C by a pair of attributes of
i.e.
i.
if
J
denotes the j-th level of the i-th attribute in
(i , 2m + i ).
j
j
it is replaced by the pair
These 2-tuples can be covered by the following three blocks.
Hence
b
~
b* + 3.
But from lemma (2.4.2), we have
where
~
m.
Hence the result.
31
as
C*;
C then
The blocks so obtained will
cover all 2-tuples except
--
C*
Example (2.4.3).
v = 8,
k = 6 .
From example (2.4.1), we have the configuration of order two with
parameters
v = 4, k = 3, b = 8. So using this configuration we have
10 50
2 6
0 0
3 7
0 0
10 50
2
4
11 51
3
2
3
1
6
1
1
8
1
8
0
70
4 8
0 0
6
3
2
1 0 50
3
2
3
0
1
4
11 51
6
1
7
0
2
0
6
0
11 51
1
0
6
1
1
1
1
7
1
4 8
0 0
0
1
\
4 8
0 0
1
7
1
4 8
1 1
0
3 51
0
6
11 21
4 50
0
6
2
4 6
1 0
7 8
0 0
1
0
2
1
3
1
b
~
1
0
7
1
8
--
1
Hence
11.
2.5 Filing schemes based on finite geometries

Consider a projective geometry PG(N,2) of dimension N. Any (N+1) points in PG(N,2) are either dependent or independent. If a set of (N+1) points is dependent then there exists at least one (N-1)-flat in PG(N,2) such that it contains all the (N+1) points. If they are independent, then we shall show that there exists an (N-1)-flat in PG(N,2) such that its complement contains these (N+1) independent points.

Let P_0, P_1, P_2, ..., P_N be a set of (N+1) independent points in PG(N,2). Let p_i denote the row vector of P_i, i = 0,1,...,N. We can find a non-singular linear homogeneous transformation such that the row vector of P_i is

    p_i = (0,0,...,0,1,0,...,0),   i = 0,1,2,...,N,          (2.5.1)

with i zeros before the one and N-i zeros after it. Consider the (N-1)-flat

    x_0 + x_1 + ... + x_N = 0.          (2.5.2)

None of the points P_i satisfy the equation (2.5.2). Hence they belong to the complement with respect to PG(N,2) of the (N-1)-flat x_0 + x_1 + ... + x_N = 0. So,

Theorem (2.5.1). Any (N+1) points in PG(N,2) are either contained in an (N-1)-flat or contained in the complement of an (N-1)-flat.
Theorem (2.5.2). Given that retrieval pertains to only one level of any attribute, there exists a filing scheme with 2^{N+1} - 1 attributes, which is oriented toward (N+1)-tuple queries. It has 2^{N+2} - 2 buckets.

Proof. Identify the attributes with the points of the geometry PG(N,2). The buckets are identified with (N-1)-flats and their complements with respect to PG(N,2). The number of buckets is

    b = 2 × φ(N, N-1, 2) = 2^{N+2} - 2.

The size of a bucket is either 2^N - 1 or 2^N. Clearly, this filing scheme covers all (N+1)-tuples.

A bucket is subdivided into subbuckets by forming all possible (N+1)-tuples. Some of these subbuckets may be duplicated. Hence in any bucket there are at most \binom{2^N}{N+1} subbuckets. A subbucket occurs exactly once in the filing scheme if the corresponding (N+1)-tuple has at least N independent points.
Example (2.5.1). Consider PG(2,2). Let

    A_1 = 001,  A_2 = 010,  A_3 = 011,  A_4 = 100,  A_5 = 101,  A_6 = 110,  A_7 = 111.

The buckets and their corresponding attributes are

    Bucket No.   Identification   Attributes            Subbuckets
    B_1          1000             A_1,A_2,A_3           A_1A_2A_3
    B_2          1001             A_4,A_5,A_6,A_7       A_4A_5A_6, A_4A_5A_7, A_4A_6A_7, A_5A_6A_7
    B_3          0100             A_1,A_4,A_5           A_1A_4A_5
    B_4          0101             A_2,A_3,A_6,A_7       A_2A_3A_6, A_2A_3A_7, A_2A_6A_7, A_3A_6A_7
    B_5          0010             A_2,A_4,A_6           A_2A_4A_6
    B_6          0011             A_1,A_3,A_5,A_7       A_1A_3A_5, A_1A_3A_7, A_1A_5A_7, A_3A_5A_7
    B_7          1100             A_1,A_6,A_7           A_1A_6A_7
    B_8          1101             A_2,A_3,A_4,A_5       A_2A_3A_4, A_2A_3A_5, A_2A_4A_5, A_3A_4A_5
    B_9          1010             A_2,A_5,A_7           A_2A_5A_7
    B_10         1011             A_1,A_3,A_4,A_6       A_1A_3A_4, A_1A_3A_6, A_1A_4A_6, A_3A_4A_6
    B_11         0110             A_3,A_4,A_7           A_3A_4A_7
    B_12         0111             A_1,A_2,A_5,A_6       A_1A_2A_5, A_1A_2A_6, A_1A_5A_6, A_2A_5A_6
    B_13         1110             A_3,A_5,A_6           A_3A_5A_6
    B_14         1111             A_1,A_2,A_4,A_7       A_1A_2A_4, A_1A_2A_7, A_1A_4A_7, A_2A_4A_7

Notice that none of the subbuckets are duplicated. This is because any two points are independent and we are considering only three points.

Suppose we are interested in retrieving records with attributes {A_1A_2A_4}. The corresponding points of the geometry are A_1 = 001, A_2 = 010, A_4 = 100. The equation of the complement of any 1-flat is

    a_0 x_0 + a_1 x_1 + a_2 x_2 = 1,

as we are dealing with PG(N,2). Hence the equation of any bucket is of the form

    a_0 x_0 + a_1 x_1 + a_2 x_2 = a_3.

So the bucket containing (001), (010), (100) is obtained by solving the following equations:

    a_2 = a_3,   a_1 = a_3,   a_0 = a_3.

Hence a_0 = a_1 = a_2 = a_3 = 1. So the bucket is 1111. Similarly we can find the bucket corresponding to any query of size three.
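The scheme of Example (2.5.1) is small enough to enumerate. The sketch below builds the fourteen buckets from the equations a_0 x_0 + a_1 x_1 + a_2 x_2 = a_3 over GF(2) and looks up the bucket for a query by direct search rather than by solving the equations; the function names and the tuple representation are illustrative assumptions.

```python
from itertools import combinations, product

N = 2
# The 7 points of PG(2,2) as nonzero 0-1 triples (x0, x1, x2).
points = [p for p in product((0, 1), repeat=N + 1) if any(p)]

def bucket(ident):
    """Points with a0*x0 + a1*x1 + a2*x2 = a3 over GF(2); ident = (a0, a1, a2, a3)."""
    *a, rhs = ident
    return frozenset(p for p in points
                     if sum(ai * xi for ai, xi in zip(a, p)) % 2 == rhs)

# One bucket per 1-flat (a3 = 0) and one per complement (a3 = 1).
idents = [a + (rhs,) for a in product((0, 1), repeat=N + 1) if any(a)
          for rhs in (0, 1)]
buckets = {ident: bucket(ident) for ident in idents}

def buckets_containing(query):
    """Identification numbers of all buckets that contain every point of `query`."""
    return [ident for ident, blk in buckets.items() if set(query).issubset(blk)]

print(buckets_containing([(0, 0, 1), (0, 1, 0), (1, 0, 0)]))
```

Running `buckets_containing` over every 3-subset of the points returns exactly one bucket each time, which is the statement that no subbucket is duplicated.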
2.6 Multiple level attributes

The filing schemes described in the previous sections deal mainly with the case n_1 = n_2 = ... = n_ℓ. In this section we shall consider the unequal case.

Consider a partition of the points in EG(N,q) into flat spaces, say π_1, π_2, ..., π_ℓ. Then π_i ∩ π_j = ∅ for all i ≠ j, and the join of π_1, π_2, ..., π_ℓ is the whole space. Also π_1, π_2, ..., π_ℓ are not necessarily of the same dimension. Notice that every point of EG(N,q) belongs to exactly one flat space π_i (i = 1, 2, ..., ℓ). These flat spaces are identified with the attributes. So the number of levels of any attribute will be of the form

    n_i = q^{h_i},   0 ≤ h_i < N,   for all i.

If h_1 = h_2 = ... = h_ℓ then π_1, π_2, ..., π_ℓ form a parallel bundle in EG(N,q). This case is studied by Ghosh and Abraham [23].
Example (2.6.1). As an illustration, consider the case ℓ = 3, n_1 = 4, n_2 = n_3 = 2. The filing scheme for these attributes can be constructed using an EG(3,2). The points of the geometry are triplets and for simplicity we shall write their coordinates without commas, i.e.

    000, 001, 010, 011, 100, 101, 110, 111.

The three flats are

    π_1: x_1 = 0:             000, 001, 010, 011,
    π_2: x_1 = 1, x_2 = 0:    100, 101,
and π_3: x_1 = 1, x_2 = 1:    110, 111.

These flat spaces are constructed as follows: first a parallel bundle of order one is formed, i.e. {x_1 = 0}; {x_1 = 1}; then a parallel bundle of order one in {x_1 = 1} is found, i.e. {x_1 = 1, x_2 = 0}; {x_1 = 1, x_2 = 1}. The correspondence between points and attribute levels is as follows:

    A_11 = 000,  A_12 = 001,  A_13 = 010,  A_14 = 011,
    A_21 = 100,  A_22 = 101,
    A_31 = 110,  A_32 = 111.

The buckets are constructed by identifying them with planes not containing π_1 or π_2 or π_3. So

    Bucket No.   Identification   Equation                 Attribute levels
    1            0010             x_3 = 0                  A_11 A_13 A_21 A_31
    2            0011             x_3 = 1                  A_12 A_14 A_22 A_32
    3            1010             x_1 + x_3 = 0            A_11 A_13 A_22 A_32
    4            1011             x_1 + x_3 = 1            A_12 A_14 A_21 A_31
    5            0110             x_2 + x_3 = 0            A_11 A_14 A_21 A_32
    6            0111             x_2 + x_3 = 1            A_12 A_13 A_22 A_31
    7            1110             x_1 + x_2 + x_3 = 0      A_11 A_14 A_22 A_31
    8            1111             x_1 + x_2 + x_3 = 1      A_12 A_13 A_21 A_32
The identification number attached to each bucket is the 4-tuple (λ_1 λ_2 λ_3 λ_0) of the coefficients of the equation

    λ_0 + λ_1 x_1 + λ_2 x_2 + λ_3 x_3 = 0     (where λ_i ∈ GF(2))

of the plane corresponding to the bucket. Within each bucket, the accession numbers of the records are divided into subsets called subbuckets, corresponding to each relevant triplet of values. The subbuckets may be identified by concatenating the codes of the triplet of values they represent. As an example,

    Bucket Identification   Subbucket Identification   Accession nos.
    0010                    000 100 110
                            010 100 110
    0011                    001 101 111
                            011 101 111
    1010                    000 101 111
                            010 101 111
    1011                    001 100 110
                            011 100 110
    0110                    000 100 111
                            011 100 111
    0111                    001 101 110
                            010 101 110
    1110                    000 101 110
                            011 101 110
    1111                    001 100 111
                            010 100 111

The buckets corresponding to x_2 = 0; x_2 = 1; x_1 + x_2 = 0; x_1 + x_2 = 1 are deleted from the filing scheme, as they do not contain any relevant triplets. The remaining eight flats intersect all the three flats π_1, π_2 and π_3.
Suppose the query request is to retrieve all records pertaining to A_11, A_21 and A_31. Then A_11, A_21 and A_31 are first converted into the points of the geometry. These points are (000), (100), (110). The plane passing through these three points is determined by solving the equation

    λ_0 + λ_1 x_1 + λ_2 x_2 + λ_3 x_3 = 0

in GF(2). On substituting these points in the equation, we have

    λ_0 = 0,   λ_0 + λ_1 = 0   and   λ_0 + λ_1 + λ_2 = 0   =>   λ_0 = λ_1 = λ_2 = 0.

Hence the equation is x_3 = 0. So the bucket containing these three values is 0010. The accession numbers corresponding to the query are obtained from the subbucket 000 100 110 of the bucket 0010.

The retrieval time for a query is the sum of three components, T_1 + T_2 + T_3, where
    T_1 = the time needed to solve the algebraic equation,
    T_2 = the time required for matching identification numbers of the buckets,
    T_3 = the time required for identifying the subbuckets.
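The filing scheme of Example (2.6.1) can be sketched as follows. The sketch enumerates the planes of EG(3,2), keeps those that meet all three attributes, and finds the bucket for a query by search instead of by solving the linear equations. The identification number is taken as the string λ_1 λ_2 λ_3 λ_0, and the names `LEVELS`, `ATTRIBUTES`, `build_buckets` and `find_bucket` are illustrative.

```python
from itertools import product

# Attribute levels and their points (x1, x2, x3) in EG(3,2).
LEVELS = {"A11": (0, 0, 0), "A12": (0, 0, 1), "A13": (0, 1, 0), "A14": (0, 1, 1),
          "A21": (1, 0, 0), "A22": (1, 0, 1), "A31": (1, 1, 0), "A32": (1, 1, 1)}
ATTRIBUTES = [["A11", "A12", "A13", "A14"], ["A21", "A22"], ["A31", "A32"]]

def plane(ident):
    """Levels on the plane l1*x1 + l2*x2 + l3*x3 = l0 over GF(2); ident = (l1, l2, l3, l0)."""
    l1, l2, l3, l0 = ident
    return {name for name, (x1, x2, x3) in LEVELS.items()
            if (l1 * x1 + l2 * x2 + l3 * x3) % 2 == l0}

def build_buckets():
    """Keep only the planes that meet every attribute, i.e. contain a full triplet."""
    buckets = {}
    for ident in product((0, 1), repeat=4):
        if not any(ident[:3]):
            continue                          # not the equation of a plane
        levels = plane(ident)
        if all(levels & set(attr) for attr in ATTRIBUTES):
            buckets["".join(map(str, ident))] = levels
    return buckets

def find_bucket(buckets, query):
    """Identification numbers of the buckets holding all levels in `query`."""
    return [ident for ident, levels in buckets.items()
            if set(query).issubset(levels)]

buckets = build_buckets()
print(sorted(buckets))
print(find_bucket(buckets, ["A11", "A21", "A31"]))
```

Six of the fourteen planes are rejected by `build_buckets`; these are x_1 = 0, x_1 = 1 and the four planes listed in the text as deleted.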
CHAPTER III
QUADRICS AND SOME RELATED FILING SCHEMES
In this chapter we shall describe a method for obtaining filing schemes based on the properties of a homogeneous, non-degenerate quadric in a finite projective geometry. Before describing these procedures we shall describe some important properties of quadrics.

3.1 Quadrics

The following remarks are based on the work of Primrose [27], Ray-Chaudhuri [30] and Bose [6]. Our introductory comments are those of Dowling [21].

Let PG(N,q) denote a projective geometry of order N over GF(q). Let B be an (N+1) × (N+1) matrix with elements from GF(q) and consider the equation

    x B x^T = 0          (3.1.1)

where x = (x_0, ..., x_N) is a row vector with elements x_i ∈ GF(q), i = 0,1,2,...,N, and x^T is the transpose of x. The set of points in PG(N,q) whose coordinate vectors satisfy (3.1.1) is said to constitute a quadric in PG(N,q).
Without loss of generality we can take B to be an upper triangular matrix, i.e.

        | b_00   b_01   b_02   ...   b_0N |
        |  0     b_11   b_12   ...   b_1N |
    B = |  0      0     b_22   ...   b_2N |          (3.1.2)
        |  .      .      .            .   |
        |  0      0      0     ...   b_NN |

for if D is an arbitrary (N+1) × (N+1) matrix with elements from GF(q), then clearly the matrix B defined by

    b_ij = 0,               i > j,
    b_ii = d_ii,
    b_ij = d_ij + d_ji,     i < j,

is upper triangular, and

    x D x^T = Σ_{i=0}^{N} Σ_{j=0}^{N} d_ij x_i x_j = Σ_{0≤i≤j≤N} b_ij x_i x_j = x B x^T.

As both the equations x B x^T = 0 and x D x^T = 0 represent the same quadric, we take the equation x B x^T = 0 (i.e. (3.1.1)) to represent the quadric.
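The reduction to an upper triangular matrix can be checked numerically. The sketch below works over GF(p) for a prime p, using integer arithmetic mod p (prime powers would need a proper finite-field implementation), and the matrix D is an arbitrary illustrative choice.

```python
from itertools import product

def triangularize(D, p):
    """Upper triangular B with b_ii = d_ii and b_ij = d_ij + d_ji for i before j, mod p."""
    n = len(D)
    B = [[0] * n for _ in range(n)]
    for i in range(n):
        B[i][i] = D[i][i] % p
        for j in range(i + 1, n):
            B[i][j] = (D[i][j] + D[j][i]) % p
    return B

def quadratic_form(M, x, p):
    """Value of x M x^T mod p."""
    n = len(x)
    return sum(M[i][j] * x[i] * x[j] for i in range(n) for j in range(n)) % p

p = 3
D = [[1, 2, 0], [1, 0, 2], [2, 2, 1]]   # arbitrary 3 x 3 matrix over GF(3)
B = triangularize(D, p)
same = all(quadratic_form(D, x, p) == quadratic_form(B, x, p)
           for x in product(range(p), repeat=3))
print(B, same)
```

Since the two forms agree on every coordinate vector, they define the same quadric.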
Now let
QN:
be a quadric in
x Bx
PG(N,q),
vector
x*
P
GF(q).
B*
D be a non-singular
and the quadric
x
into the point
p*
with row
QN into the quadric
~*
B* x*
T
= 0
,
is the triangular matrix obtained from
manner described above.
(N+1) x (N+1)
The non-singular linear transformation
with row vector
QN*.•
where
=0
and let
matrix with elements from
carries the point
T
C = DBD
T
in the
This transformation is incidence preserving,
41
i.e.
p*
E
QN* i f and only i f
are called equivaZent.
If
t
exists a non-singular matrix
corresponding
The quadrics
to be non-degenepate; otherwise
Thus if the rank of
t
D such that the last
N + 1 - t.
quadrics is defined to be
QN*
is the largest integer for which there
are null, then the pank of
B*
and
Q
N
and all equivalent
t = 0,
If
columns of the
then
Q
is said
N
Q
is degenepate.
N
Q is
N
N+ 1 - t
appropriate linear transformation
where
-1
D
x*T
x
T
t
~
1,
then by an
we can find an
equivalent quadric
o.
Q*N·.
Notice that in equation (3.1.3), the variables
are missing.
Define the
I*
N-t
~~
N-t+l' ~~
N-t+2'···' ~~
N
(N-t)-flat
~-t+l = ~-t+2 = ... = ~ = 0 .
Then since the rank of
Q~-t:
(3.1. 3)
Q*
N
I
O~i~j~N-t
is clearly non-degenerate in
is
N+l-t,
the quadric
b~. x~ xj = 0, ~_t+l=O, •.. ,~
1J
(3.1.4)
o
(3.1.5)
1
I N*- t .
We can write
QN-t
* = Q*N n LN-t·
\' *
If
\'
LN-t
is the inverse image of
*
\'L N-t
under the transformation
then the quadric
is non-degenerate in
I N- t •
The equivalent quadrics
in
PG(N,q).
If
Q
and
N
p* is a point of the
o
42
Q~
are called cones of opdep t
(t-l) -flat
4i'
*
It-I:
and
p*
1
(p* p*)
°
some
is a point of
Q~.
lies in
1
P~
x~ = 0, x! = 0, .", ~-t =
*
It-I'
E:
P!
vertex of the cone
Q~-t'
The
p*
E:
Q~,
p*
Lt - l
(P~
E:
I*
t- l
(t-l)-flat
under the transformation
\*
Lt-l
is the vertex and
QN-t
on the line
P!)
for
is called the
and the non-degenerate quadric
Q~,
called the base (of course the base is not unique).
inverse image of
p*
then clearly any point
Conversely, if
E:
°
If
x*T
is the base of the cone
is
Q~-t
Lt - l
is the
= D-1 xT
then
QN'
Unless otherwise stated, we shall employ the term cone to refer to a cone of order one. In this case, the vertex consists of a single point and the base is a non-degenerate quadric in (N-1) dimensions.

If Q_N is a non-degenerate quadric in PG(N,q), where N = 2k is even, then Q_N contains flat spaces of dimension k-1 but none of any higher dimension. But if N = 2k+1 (i.e. odd), the non-degenerate quadrics are of two types. An elliptic quadric in PG(2k+1,q) contains flat spaces of dimension k-1 but not of any higher dimension. A hyperbolic quadric in PG(2k+1,q) contains flat spaces of dimension k but none of any higher dimension.

Let us denote by ψ(N,0) the number of points in a non-degenerate quadric Q_N in PG(N,q). Primrose [27] showed that

    ψ(N,0) = f_0(k,0)   if N = 2k,
           = f_1(k,0)   if N = 2k+1 and Q_N is elliptic,
           = f_2(k,0)   if N = 2k+1 and Q_N is hyperbolic,

where

    f_0(k,0) = (q^{2k} - 1)/(q - 1),                     (3.1.6)
    f_1(k,0) = (q^{k+1} + 1)(q^k - 1)/(q - 1),           (3.1.7)
    f_2(k,0) = (q^{k+1} - 1)(q^k + 1)/(q - 1).           (3.1.8)
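The point counts (3.1.6) to (3.1.8) can be compared with a brute-force count in small cases. The sketch below is restricted to prime q so that arithmetic mod q is field arithmetic. The three quadratic forms are standard representatives chosen for illustration and are assumptions of this sketch; in particular the elliptic form is valid only for q = 2, where x^2 + x + 1 is irreducible.

```python
from itertools import product

def count_points(form, n_vars, p):
    """Number of points of PG(n_vars - 1, p), p prime, with form(x) = 0."""
    zeros = sum(1 for x in product(range(p), repeat=n_vars)
                if any(x) and form(x) % p == 0)
    return zeros // (p - 1)      # each point has p - 1 nonzero coordinate vectors

def f0(k, q):
    return (q ** (2 * k) - 1) // (q - 1)

def f1(k, q):
    return (q ** (k + 1) + 1) * (q ** k - 1) // (q - 1)

def f2(k, q):
    return (q ** (k + 1) - 1) * (q ** k + 1) // (q - 1)

def conic(x):                    # N = 2
    return x[0] * x[1] + x[2] ** 2

def hyperbolic(x):               # N = 3
    return x[0] * x[1] + x[2] * x[3]

def elliptic_gf2(x):             # N = 3, q = 2 only
    return x[0] * x[1] + x[2] ** 2 + x[2] * x[3] + x[3] ** 2

print(count_points(conic, 3, 3), f0(1, 3))
print(count_points(hyperbolic, 4, 3), f2(1, 3))
print(count_points(elliptic_gf2, 4, 2), f1(1, 2))
```

Each printed pair agrees, so the formulas hold in these small cases; the sketch is a check, not a proof.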
Ray-Chaudhuri [30] generalized this result by finding the number, ψ(N,r), of r-flats contained in a non-degenerate quadric Q_N of PG(N,q). He showed that

    ψ(N,r) = f_0(k,r)   if N = 2k,
           = f_1(k,r)   if N = 2k+1 and Q_N is elliptic,
           = f_2(k,r)   if N = 2k+1 and Q_N is hyperbolic,

where

    f_0(k,r) = \prod_{m=0}^{r} (q^{2k-2m} - 1)/(q^{r-m+1} - 1)   if r ≤ k-1,
             = 0                                                  if r > k-1,          (3.1.9)
fl(k,r) =
o
if
r >
k-l
(3.1.10)
...
o
if
r>k.
(3.1.11)
44
PG(N,q),
QN is a cone in
If
where
N is even, then
to be an elliptic or a hyperbolic cone according
is elliptic or hyperbolic.
In general i f
called a cone.
N is odd, then
If
r
N = 2k,
the base
k-flats •.
~(N-l,
If
QN-l
QN will simply be
QN-l
an elliptic cone contains
hyperbolic cone contains
k-flats also.
QN-l
of a cone
QN'
(r+l)-flats but none of any higher dimension.
contains
Thus when
as its base
is the dimension of the highest-
dimensional-flat space contained in the base
then
QN is said
0)
of a cone
1
N = 2k+l,
a cone contains
is the number of points contained in
QN'
+
If
(k-l)-flats and a
then the number of points in
Q is
N
q ~(N-l, 0).
(3.1.12)
By a suitable choice of the system of reference, the equation of
"-
any non-degenerate quadric
relatively simple form.
PG(N,q),
where
N
=
2k,
Q in
N
If
PG(N,q)
can be expressed in a
Q is a non-degenerate quadric in
N
then the equation of
Q can be expressed
N
in the canonical form
(3.1.13)
The equation of a non-degenerate hyperbolic quadric
where
N
= 2k+l,
Q in
N
PG(N,q),
can be written as
(3.1.14)
If
q = p
n
where
p
a prime number not equal to two, and
n
integer, then the equation of a non-degenerate elliptic quadric
in
PG(N,q), (N=2k+l), can be written as
45
is an
Q
N
where
"-S"
is a non-square element of
equation of the elliptic quadric
GF(q).
If
p = 2,
the
QN' (N=2k+1), can be written as
2
2
x Ox 1 + x 2x 3 + ••• + x2k-2x2k-1 + A(x 2k + x 2k+ 1 )
(3.1.16)
+ x2kx2k+1 = 0,
where
n
GF(2 ).
is irreducible over
Earlier we defined the rank of a quadric
is odd (i.e.
p
~
2)
then the rank of
of the symmetric matrix
Q in
N
PG(N,q).
If
q
Q is the same as the rank
N
and the equation of
may be
taken as
T
T
.!.(B+B ).!.
= O.
However, this is not generally true for fields of characteristic two
(2) .
T
B+ B
if
QN
If
is
is a non-degenerate quadric in
T
B+ B
(1. e.
N+ 1
N is odd, but
T
B+ B
T
B+ B
T
non-degenerate i f the rank of
The matrix
T
B+ B
N and
I f both
is non-singular.
B+ B
is odd,
q
N and
is
N when
q
are even,
defines a polarity with respect to the non-
coordinate vectors
Rn
T
)fT1 =
is self conjugate.
the only self conjugate points are those of
respect to
and
PI
with
O.
The relation of conjugacy is obviously symmetrical.
PG(N,q)
Po
f 1 are said to be conjugate if and only if
and
Ro(B + B
P
QN is
N.
Q
according to which the points
N
A point
N is even.
QN is non-degenerate
degenerate quadric
then every point of
p f 2)
is non-singular as in the case
is singular of rank
Conversely, i f at least one of
if
n
PG(N,2 ), then the rank of
with row vector
Q
if
N
46
P
If
If
q
q
is even,
is odd then
QN.
is said to be a regular point with
e~
otherwise
is said to be an irregular point.
P
non-singular when at least one of
regualr in these cases.
is singular of rank
vector
C
If both
N
Nand
q
B
+ BT
is
is odd, every point is
q
are even, then
and there exists a unique point
C
B
T
+ B
with row
such that
+
f(B
The irregular point
quadric
Nand
Since
QN'
C
T
B ) = O.
is called the nucleus of polarity of the
All other points of
The polar space
T(P)
all points conjugate to
PG(N,q)
of a point
P
are regular.
is defined to be the set of
P, that is
(3.1.17)
where
P
is the row vector corresponding to the point
regular point, then (3.1.17) is the equation of an
P.
If
(N-l)-flat.
P
is a
If
P
is the nucleus of polarity, however, then every point is conjugate to
and hence
T(P)
is the entire space.
is symmetrical, for any two points
only if
PI
n
PG(2k, 2)
•
E
T(P )'
O
we have
passes through the nucleus of polarity
C
then the row vector
~l
A#
PO' PI
a
and
Po
E
T(P )
l
if and
It follows that the polar space of every point
point distinct from
where
Since the relation of conjugacy
~
and
for
PI
C.
If
Po
is a
is any other point on the line (paC)
PI
has the form
is the row vector of
~l (B + BT)~T
PO'
Then
(f + A~) (B + BT)~T
= A ~(B
47
+
T
T
B )~
P
So it follows that
o
i f and only i f
So
If, however, at least one of
Nand
q
is odd, then every point
is regular and hence there is a one-one correspondence between points
and their polar spaces.
The polar space
T(I )
t
points conjugate to every point of
independent points of
It'
\
L..t'
of at-flat,
It.
T(I )
t
then
If
PO,Pl, .•. ,P t
is the
P.
is the row vector of
--J.
section of the
Ir
Lr
(N-l) - fla ts
P ..
T (P .),
T(L).
t
A t-flat
PO,PI,PZ, ••• ,P t
I
t
\
L
t
are pairwise conjugate.
e-
is clearly the inter-
c T(\)
Lr
is contained in
(t + 1)
(N-t-l)-flat
O,l,Z, ... ,t.
i
1.
are any two flat spaces, then
c
T(I )
t
1.
are
i = O,I,Z, ... ,t,
Q,
where
is the set of all
If
and
if and only if
Q if and only if the points
N
Thus if
LtcQN' then
LtcT(LT ).
3.2 (N-l)-flat spaces and quadrics
Theorem (3.2.1)! Let QN be a non-degenerate quadric in
PG(N,q)
QN-I
and let
= QN
n T(P )
O
Po
be a point on
is a cone in
QN.
T(P O)
t This is a well known result.
D,)wling [21].
48
Then the quadric
with vertex
PO·
The proof given here, is that of
Proof. Let P 2 ,P 3 , .•• ,P N be
(N-2)-flat
Po.
LN- 2 ,
Further let
LN- 2
where
PI ~ T(P O)·
T(P )
O
independent points of an
and
1.
Pi
does not contain
Since
out loss of generality, to be unit points.
point
\LN-2
are independent, and we can take them, with-
P, (i=0,1,2, ••. ,N)
points
c
(N-l)
So the row vector of the
is
(0,0, ..• ,0,1,0, ... ,0),
'----,,-----'
L....- . ,_ _
i = 0,1,2, ... ,N.
(3.2.1)
N-i
i
We can then write
(3.2.2)
(3.2.3)
.-
But, i f
x B x
T
is the equation of the quadric
T(P ):
O
Ro(B
+
T
T
B)~
=
then
o.
(3.2.4)
Comparing (3.2.4) and (3.2.3) we have
b
= b
'0002
= b
03 =
and
Hence, we have
(3.2.5)
where
b.
,X.X,
1.J 1. J
It follows from (3.2.3) and (3.2.5) that
49
•
(3.2.6)
Now since
Q
N
is non-degenerate it is clear from equation (3.2.5)
that the quadric
x
is non-degenerate in
LN- 2 .
n
an integer.
to each
P
E
Let
QN'
T(P), such that
Let
C
Q
2k
Let
of polarity.
C
Since
is a cone in
be a non-degenerate quadric in
be the nucleus of polarity.
without loss of generality
n
PG(2k,2 ),
Then corresponding
T(P).
Q
N
are all distinct.
is non-degenerate and
Q
2k
P,
Further the polar spaces
be the row vector corresponding to
Q
2k
T(P )'
O
(2k-I)-flat, the polar space of
is a point of
corresponding to the points of
Proof.
(3.2.7)
PO'
there exists a
C
=
T(P ) n Q
N
O
Hence
Clearly the vertex of the cone is
Lemma (3 . 2• 2) •
I
q -
n
2 ,
C, the nucleus
we can take
as
(3.2.8)
i.e.
where
o
o
B
=
I 0 0 0
0 000
000 I 0
o0 0 0 0
000
000
000
000
(2k+l)x(2k+l)
(3.2.9)
oI 0
000
o0 I
00000
o0 0 0 0
o0 0 0 0
50
e"
e
Hence,
C
= (0,0,0, ••• ,0,0,1).
Let $P$ be any point in $Q_{2k}$. Then its polar space $T(P)$ is
$$R(B + B^T)x^T = 0, \qquad (3.2.10)$$
where $R$ is the row vector of $P$. Since $C$ is the nucleus of polarity, we have
$$R(B + B^T)C^T = 0. \qquad (3.2.11)$$
Hence $C \in T(P)$.

Let $P_1$ and $P_2$ be any two distinct points in $Q_{2k}$. Then their polar spaces are
$$T(P_1):\ R_1(B + B^T)x^T = 0, \qquad (3.2.12)$$
$$T(P_2):\ R_2(B + B^T)x^T = 0. \qquad (3.2.13)$$
If $T(P_1) = T(P_2)$ then the solution spaces of the equations (3.2.12) and (3.2.13) are equivalent. That is, $R_1 = \lambda R_2$, where $\lambda$ is a scalar belonging to $GF(2^n)$. Hence $R_1$ and $R_2$ represent the same point, i.e. $P_1 = P_2$. But this is a contradiction as $P_1$ and $P_2$ are two distinct points. Hence $T(P_1) \neq T(P_2)$. So the result.
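The two claims of Lemma (3.2.2) can be checked by brute force in the smallest interesting case. The following is only a sketch, assuming the canonical form $x_0x_1 + x_2x_3 + x_4^2 = 0$ in $PG(4,2)$; the helper name `polar_coefficients` is illustrative and not from the text.

```python
from itertools import product

def polar_coefficients(r):
    """Coefficients of R(B + B^T) for the form x0*x1 + x2*x3 + x4^2 over GF(2).

    B + B^T only has ones at positions (0,1), (1,0), (2,3), (3,2), so the
    polar hyperplane of R = (r0, ..., r4) is r1*x0 + r0*x1 + r3*x2 + r2*x3 = 0.
    """
    r0, r1, r2, r3, _ = r
    return (r1, r0, r3, r2, 0)

# Points of the quadric: non-zero vectors satisfying the form (x4^2 = x4 in GF(2)).
quadric = [x for x in product((0, 1), repeat=5)
           if any(x) and (x[0] * x[1] + x[2] * x[3] + x[4]) % 2 == 0]
polars = [polar_coefficients(p) for p in quadric]
nucleus = (0, 0, 0, 0, 1)

# The nucleus satisfies every polar equation, and the 15 polar spaces are distinct.
nucleus_on_all = all(sum(c * x for c, x in zip(h, nucleus)) % 2 == 0 for h in polars)
all_distinct = len(set(polars)) == len(quadric)
```

Here `nucleus_on_all` and `all_distinct` both evaluate to true for the 15 points of the quadric.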
Lemma (3.2.3). If $\Sigma_{2k-1}$ is any $(2k-1)$-flat in $PG(2k, 2^n)$ such that it is not the polar space of any point in the geometry, then the nucleus of polarity does not belong to $\Sigma_{2k-1}$.

Proof. Without loss of generality we can take $Q_{2k}$ as
$$x_0x_1 + x_2x_3 + \cdots + x_{2k-2}x_{2k-1} + x_{2k}^2 = 0.$$
Then $C$ is $(0,0,0,0,\ldots,0,1)$. Now it is easy to notice that a $(2k-1)$-flat will contain the nucleus of polarity, $C$, if and only if the linear equation representing the $(2k-1)$-flat does not involve the variable $x_{2k}$. Further, any equation with $x_{2k}$ cannot be written as $R(B + B^T)x^T = 0$, for some point $P$ in $PG(2k, 2^n)$. Hence the result.

So we can conclude that there exists a one-one correspondence between points of $Q_{2k}$ and $(2k-1)$-flats containing the nucleus of polarity. Further the rest of the $(2k-1)$-flats, not containing the nucleus of polarity, cannot be expressed as polar spaces of points in the geometry.
Theorem (3.2.4)†. If $\Sigma_{2k-1}$ is any $(2k-1)$-flat in $PG(2k, 2^n)$ not containing the nucleus of polarity $C$, then $Q_{2k-1} = \Sigma_{2k-1} \cap Q_{2k}$ is non-degenerate.

† Dowling [22] has shown that, in $PG(2k,2)$, $Q_{2k-1}$ is non-degenerate if and only if $C \notin \Sigma_{2k-1}$.

Proof. As before, let us take $Q_{2k}$ as
$$x_0x_1 + x_2x_3 + \cdots + x_{2k-2}x_{2k-1} + x_{2k}^2 = 0,$$
and so the row vector of the nucleus of polarity $C$ is $(0,0,0,\ldots,0,1)$. By lemma (3.2.3) we have the equation of $\Sigma_{2k-1}$ as
$$x_{2k} = L(x_0, x_1, \ldots, x_{2k-1}), \qquad (3.2.14)$$
where $L(x_0, x_1, \ldots, x_{2k-1})$ is a linear form in $x_0, x_1, \ldots, x_{2k-1}$. So the quadric $Q_{2k-1}$ can be written as
$$x_0x_1 + x_2x_3 + \cdots + x_{2k-2}x_{2k-1} + \{L(x_0, x_1, \ldots, x_{2k-1})\}^2 = 0.$$
Since our base field is $GF(2^n)$, we will not have any cross products in the expansion of $\{L(x_0, x_1, \ldots, x_{2k-1})\}^2$. So the matrix $G$ of $Q_{2k-1}$ will be a $(2k \times 2k)$ matrix of the form
$$G = \begin{pmatrix}
g_{00} & 1 &  &  &  \\
0 & g_{11} &  &  &  \\
 &  & \ddots &  &  \\
 &  &  & g_{2k-2,2k-2} & 1 \\
 &  &  & 0 & g_{2k-1,2k-1}
\end{pmatrix} \qquad (3.2.15)$$
where $g_{ii}$ is either 1 or 0 according as $x_i$ occurs in the linear form $L(x_0, x_1, \ldots, x_{2k-1})$ or not. It is easy to notice that $G + G^T$ is non-singular. Hence $Q_{2k-1}$ is non-degenerate in $\Sigma_{2k-1}$.

Theorem (3.2.5)†. Let $Q_{2k}$ be a non-degenerate quadric in $PG(2k, q = 2^n)$. Define $Q_{2k-1} = Q_{2k} \cap \Sigma_{2k-1}$, where $\Sigma_{2k-1}$ is a $(2k-1)$-flat. Then there are $N_1$ $(2k-1)$-flats such that $Q_{2k-1}$ is a cone of order one; $N_2$ $(2k-1)$-flats such that $Q_{2k-1}$ is a non-degenerate hyperbolic quadric and $N_3$ $(2k-1)$-flats such that $Q_{2k-1}$ is a non-degenerate elliptic quadric; where
$$N_1 = (q^{2k}-1)/(q-1), \quad N_2 = (q^{2k} + q^k)/2, \quad \text{and} \quad N_3 = (q^{2k} - q^k)/2.$$

† This result, when $n = 1$, is obtained by Dowling [22].
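Proof. To determine $N_t$ $(t = 1,2,3)$ we count the number of pairs $(P, \Sigma_{2k-1})$ where $P$ is a point of $Q_{2k}$ and $\Sigma_{2k-1}$ is a hyperplane containing $P$. Since each point is contained in $(q^{2k}-1)/(q-1)$ hyperplanes and there are $(q^{2k}-1)/(q-1)$ points in $Q_{2k}$, the number of pairs is $\left(\frac{q^{2k}-1}{q-1}\right)^2$. Counting these pairs in another way,
$$N_1\,\frac{q^{2k-1}-1}{q-1} + N_2\left(\frac{q^{2k-1}-1}{q-1} + q^{k-1}\right) + N_3\left(\frac{q^{2k-1}-1}{q-1} - q^{k-1}\right) = \left(\frac{q^{2k}-1}{q-1}\right)^2, \qquad (3.2.16)$$
i.e.
$$\frac{q^{2k-1}-1}{q-1}\,(N_1 + N_2 + N_3) + q^{k-1}(N_2 - N_3) = \left(\frac{q^{2k}-1}{q-1}\right)^2.$$
But
$$N_1 + N_2 + N_3 = \frac{q^{2k+1}-1}{q-1}. \qquad (3.2.17)$$
From theorem (3.2.4)
$$N_2 + N_3 = q^{2k}. \qquad (3.2.18)$$
So
$$N_1 = \frac{q^{2k}-1}{q-1}. \qquad (3.2.19)$$
From (3.2.17) and (3.2.16) we have
$$N_2 - N_3 = q^k. \qquad (3.2.20)$$
Hence $N_2 = (q^{2k} + q^k)/2$ and $N_3 = (q^{2k} - q^k)/2$; $N_2$ and $N_3$ are integers as $q = 2^n$.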
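The double count used for Theorem (3.2.5) is easy to check numerically. The sketch below assumes the standard point counts for hyperplane sections in $PG(2k-1,q)$: $(q^{2k-1}-1)/(q-1)$ for a cone of order one, and that value plus or minus $q^{k-1}$ for a hyperbolic or elliptic quadric. The function names are illustrative only.

```python
def section_counts(q: int, k: int) -> tuple[int, int, int]:
    """(N1, N2, N3) as stated in Theorem (3.2.5) for PG(2k, q), q even."""
    n1 = (q ** (2 * k) - 1) // (q - 1)   # sections that are cones of order one
    n2 = (q ** (2 * k) + q ** k) // 2    # hyperbolic sections
    n3 = (q ** (2 * k) - q ** k) // 2    # elliptic sections
    return n1, n2, n3


def double_count_holds(q: int, k: int) -> bool:
    """Check the pair count (point of Q_2k, hyperplane through it) both ways."""
    n1, n2, n3 = section_counts(q, k)
    cone = (q ** (2 * k - 1) - 1) // (q - 1)   # points of a cone of order one
    hyperbolic = cone + q ** (k - 1)
    elliptic = cone - q ** (k - 1)
    by_hyperplane = n1 * cone + n2 * hyperbolic + n3 * elliptic
    by_point = ((q ** (2 * k) - 1) // (q - 1)) ** 2
    all_hyperplanes = (q ** (2 * k + 1) - 1) // (q - 1)
    return by_hyperplane == by_point and n1 + n2 + n3 == all_hyperplanes


if __name__ == "__main__":
    for q in (2, 4, 8):
        for k in (1, 2, 3):
            print(q, k, section_counts(q, k), double_count_holds(q, k))
```

For $q = 2$, $k = 2$ this gives $(N_1, N_2, N_3) = (15, 10, 6)$, the same split of the 31 hyperplanes of $PG(4,2)$ that reappears in Example (3.4.1).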
Theorem (3.2.6). Let $Q_N$ be a non-degenerate quadric in $PG(N,q)$, $N \geq 2$, where at least one of $N$ and $q$ is odd. Then if $P_0 \notin Q_N$ the quadric $Q_{N-1} = T(P_0) \cap Q_N$ is non-degenerate in $T(P_0)$.

This result is obtained by Dowling [21]. We will not give the proof of this theorem.
Theorem (3.2.7). Let $Q_{2k+1}$ be a non-degenerate hyperbolic quadric in $PG(2k+1, q)$. Then there are $(q^{2k+1} - q^k)$ $2k$-flat spaces that intersect $Q_{2k+1}$ in a non-degenerate quadric and $(q^{k+1}-1)(q^k+1)/(q-1)$ $2k$-flat spaces that intersect in hyperbolic cones of order one.

Proof. From theorem (3.2.1) and theorem (3.2.6), as there is one-one correspondence between $2k$-flat spaces and polar spaces with respect to $Q_{2k+1}$, we have $(q^{2k+1} - q^k)$ $2k$-flat spaces intersecting $Q_{2k+1}$ in a non-degenerate quadric and $(q^{k+1}-1)(q^k+1)/(q-1)$ $2k$-flat spaces that intersect in a cone of order one. These cones can be elliptic cones or hyperbolic cones.

Suppose one of these cones is an elliptic cone. Let us denote it by $Q'_{2k}$. Then the base of $Q'_{2k}$, say $Q'_{2k-1}$, is non-degenerate in a $(2k-1)$-flat space and is elliptic. Hence $Q'_{2k-1}$ contains $(k-2)$-flat spaces but no higher flat spaces. That is, $Q_{2k+1}$ will contain only $(k-1)$-flat spaces but no higher. But this is a contradiction as $Q_{2k+1}$ is a hyperbolic quadric. Hence all cones are hyperbolic cones.
Theorem (3.2.8). Let $Q_{2k+1}$ be an elliptic quadric in $PG(2k+1, q)$. Then there are $(q^{2k+1} + q^k)$ $2k$-flat spaces that intersect $Q_{2k+1}$ in a non-degenerate quadric and $(q^{k+1}+1)(q^k-1)/(q-1)$ $2k$-flat spaces that intersect in an elliptic cone of order one.

Proof. The first part of the theorem follows from theorem (3.2.6) and theorem (3.2.1). That is, there are $(q^{k+1}+1)(q^k-1)/(q-1)$ cones of order one. Suppose one of these cones is hyperbolic. Then the base of this cone will contain $(k-1)$-flat spaces but no higher. Hence $Q_{2k+1}$ will contain $k$-flat spaces. But $Q_{2k+1}$ is elliptic and hence it will not contain $k$-flat spaces. This is a contradiction. Hence all cones are elliptic cones.
3.3 (N-2)-flat spaces and quadrics

Lemma (3.3.1)†. The number of points in a degenerate quadric $Q_N$ of order $r$ in $PG(N,q)$ is
$$|Q_N| = \frac{q^r-1}{q-1} + q^r\,|Q_{N-r}|. \qquad (3.3.1)$$

† This is a well known result. Reference Bose [6].

Proof. Since $Q_N$ is a cone of order $r$, there exists a unique $(r-1)$-flat $\Sigma_{r-1}$, called the vertex of $Q_N$, such that the points of $Q_N$ are those of the lines joining points of $\Sigma_{r-1}$ to the points of $Q_{N-r}$; where $Q_{N-r}$ is a non-degenerate quadric in $(N-r)$-dimensions obtained by intersecting $Q_N$ with any $(N-r)$-flat, $\Sigma_{N-r}$, skew to $\Sigma_{r-1}$. Hence
$$|Q_N| = \frac{q^r-1}{q-1} + (q-1)\times\frac{q^r-1}{q-1}\times|Q_{N-r}| + |Q_{N-r}| = \frac{q^r-1}{q-1} + q^r\,|Q_{N-r}|,$$
which is (3.3.1).
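The point count of Lemma (3.3.1) can be compared with a direct enumeration for small prime $q$. The sketch below is illustrative only: it assumes $q$ is prime (so that arithmetic modulo $q$ is field arithmetic) and uses the conic $x_0x_1 + x_2^2 = 0$ as base, with extra free variables producing the cone. The names `cone_points` and `brute_force_points` are not from the text.

```python
from itertools import product

def cone_points(q: int, r: int, base_points: int) -> int:
    """Formula (3.3.1): points of a cone of order r over a base with base_points points."""
    return (q ** r - 1) // (q - 1) + q ** r * base_points

def brute_force_points(q: int, n_vars: int, form) -> int:
    """Count projective points of form(x) = 0 over GF(q), q prime, in n_vars variables."""
    zeros = sum(1 for x in product(range(q), repeat=n_vars)
                if any(x) and form(x) % q == 0)
    return zeros // (q - 1)   # each projective point has q - 1 non-zero representatives

def conic(x):
    # Variables beyond x[2] do not appear, so they only enlarge the vertex of the cone.
    return x[0] * x[1] + x[2] ** 2

if __name__ == "__main__":
    for q in (2, 3, 5):
        base = brute_force_points(q, 3, conic)          # the conic itself, q + 1 points
        for r in (1, 2):
            print(q, r, brute_force_points(q, 3 + r, conic), cone_points(q, r, base))
```

For each pair `(q, r)` the two printed counts agree.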
Theorem (3.3.2). Let
where
(N-r)
= Lr
n Q
QN-l
= QN
n LN- l ·
1.
if
Lr
z.
if
\
~r-l
o~der
r
in
be the nucleus of polarity of
N the vertex.
QN-l
C
Let
is even.
Lr - l
be a cone of
Let
LN- l
Q and
N
be a hyperplane and define
Lr - l ~ LN- l
Then if
is a cone of order
r-l;
is a cone of order
r+l;
LN- l ,
QN-l
C
but
\
~N-l
~ \
~r ~ ~N-l'
\
then either
3.
QN-l
is an hyperbolic cone of order r,
4.
QN-l
is an elliptic cone of order
or
The number
Nt'
is of type
t
t = 1,Z,3,4
r.
of hyperplanes
for which
LN- l
QN-l
is
N
l
N
Z
N
3
= qN-r+l(qr_l)/(q_l)
=
(qN-r_l)/(q_l)
(qN-r+q (N-r)/Z)/2'
and
(3.3.2)
Proof t.•
\
skew to
~N-r
in
\
~N-r'
Lr-l ~ LN- l ,
If
\
~r-l"
and
QN-l
joining points of
QN-l
to
The quadric
QN-r
\
~r-l
and let
\
~r-l
\~N-l
QN-r
contains
= QN
n LN- r
(N-r)-flat
is non-degenerate
clearly consists of all points on the lines
to points of
is a cone of order
Suppose now
then
c
r-l
\
•
~N-l
Lr-z
with vertex
Let
Q
= QN n LN-r·
\
N-r
\N
~ -r
n LN-l.
Hence,
Lr-Z'
be fixed
(N-r)-flat skew
There is one-one correspondence
t This proof is given by Dowling [Zl].
= Lr-l
\
L.N-I
between hyperplanes
of
\L.N-r'
such that
\
L.r-l
containing
and
(N-r-l)-flats
if and only if
LN-r-l '" LN-I
\
L.N-r-l'
\
\
\
L.N-r-I=LN-InL.N-r·
Since Q -r is non-degenerate we can apply Theorem (3.2.5) to obtain the
N
numbers N ,N , ••• ,N
2
3
of (N-r-l)-flats
4
\
in
LN-r-l
\
LN-r
for which
QN-r-l--Q N-r n\LN-r-l is a cone of order one, a non-degenerate hyperbolic
quadric, and a non-degenerate elliptic quadric, respectively.
But if .
LN-I"'LN-r-l' and QN-r-l is a cone of order s in LN- r , then QN-l is a one
of order r+s in
\
Lr
and
\
Ls '
IN-I.
where
The vertex
L.\ s
\
Lr+s-l
is the join of
The proof is
is the vertex of
completed by noting that if
C is the nucleus of polarity of
C
and hence
E
\
LN-r-l
i f and only i f
Theorem (3.3.3). Let $Q_N$ be a cone of order $r$ in $PG(N,q)$ where at least one of $N-r$ and $q$ is odd. Let $\Sigma_{r-1}$ be the vertex of $Q_N$. Let $\Sigma_{N-1}$ be any $(N-1)$-flat and define $Q_{N-1} = Q_N \cap \Sigma_{N-1}$. Then
1. if $\Sigma_{r-1} \not\subset \Sigma_{N-1}$, $Q_{N-1}$ is a cone of order $r-1$;
2. otherwise $Q_{N-1}$ is a cone of order $r$ or $r+1$.

Proof. If $\Sigma_{r-1} \not\subset \Sigma_{N-1}$ then $\Sigma_{N-1}$ contains an $(N-r)$-flat, $\Sigma_{N-r}$, skew to $\Sigma_{r-1}$. The quadric $Q_{N-r} = Q_N \cap \Sigma_{N-r}$ is non-degenerate in $\Sigma_{N-r}$; $Q_{N-1}$ consists of all points on the lines joining points of $Q_{N-r}$ to the points of $\Sigma_{r-2} = \Sigma_{N-1} \cap \Sigma_{r-1}$. Hence $Q_{N-1}$ is a cone of order $(r-1)$ with vertex $\Sigma_{r-2}$. The number of $(N-1)$-flats not containing $\Sigma_{r-1}$ is
$$\phi(N, N-1, q) - \phi(N-r, N-r-1, q) = \frac{q^{N+1}-1}{q-1} - \frac{q^{N-r+1}-1}{q-1} = (q^{N+1} - q^{N-r+1})/(q-1). \qquad (3.3.3)$$

Suppose now $\Sigma_{r-1} \subset \Sigma_{N-1}$. Let $\Sigma_{N-r}$ be a fixed $(N-r)$-flat skew to $\Sigma_{r-1}$ and let $Q_{N-r} = Q_N \cap \Sigma_{N-r}$. Then there is one-one correspondence between hyperplanes $\Sigma_{N-1}$ containing $\Sigma_{r-1}$ and $(N-r-1)$-flats $\Sigma_{N-r-1}$ of $\Sigma_{N-r}$, such that $\Sigma_{N-r-1} \subset \Sigma_{N-1}$ if and only if $\Sigma_{N-r-1} = \Sigma_{N-1} \cap \Sigma_{N-r}$. Since $Q_{N-r}$ is non-degenerate we can apply theorem (3.2.1) and theorem (3.2.6) to show that $Q_{N-r-1}$ is either a cone of order one or non-degenerate, as there is one-one correspondence between $(N-r-1)$-flats and polar spaces with respect to $Q_{N-r}$. Hence we have $|Q_{N-r}|$ $(N-r-1)$-flats for which $Q_{N-r-1}$ is a cone of order one and $\phi(N-r, N-r-1, q) - |Q_{N-r}|$ $(N-r-1)$-flats for which $Q_{N-r-1}$ is non-degenerate. But if $\Sigma_{N-1} \supset \Sigma_{N-r-1}$ and $Q_{N-r-1}$ is a cone of order $s$ in $\Sigma_{N-r}$ then $Q_{N-1}$ is a cone of order $r+s$ in $\Sigma_{N-1}$. Therefore, we have $|Q_{N-r}|$ $(N-1)$-flats that intersect $Q_N$ in a cone of order $(r+1)$; and $\phi(N-r, N-r-1, q) - |Q_{N-r}|$ $(N-1)$-flats that intersect $Q_N$ in a cone of order $r$.

Corollary 1. In theorem (3.3.3) if $N-r = 2s$ and $q$ is an odd prime power then there are $(q^{2s}-1)/(q-1)$ $(N-1)$-flat spaces that intersect $Q_N$ in a cone of order $(r+1)$; $(q^{2s} - q^s)/2$ $(N-1)$-flats in an elliptic cone of order $r$; and $(q^{2s} + q^s)/2$ $(N-1)$-flat spaces in a hyperbolic cone of order $r$.
Proof. From theorem (3.3.3), the number of $(N-1)$-flats that intersect $Q_N$ in a cone of order $(r+1)$ is $|Q_{N-r}|$. Since $N-r = 2s$, we have
$$|Q_{N-r}| = (q^{2s}-1)/(q-1).$$
So the number of $(N-1)$-flats that intersect $Q_N$ in a cone of order $r$ is
$$\phi(N-r, N-r-1, q) - |Q_{N-r}| = \frac{q^{2s+1}-1}{q-1} - \frac{q^{2s}-1}{q-1} = q^{2s}.$$
But out of the $q^{2s}$ polar spaces with respect to $Q_{N-r}$, $q^s(q^s+1)/2$ polar spaces will intersect $Q_{N-r}$ in a non-degenerate hyperbolic quadric and the remaining $q^s(q^s-1)/2$ polar spaces will intersect in a non-degenerate elliptic quadric†. Hence there are $q^s(q^s+1)/2$ polar spaces that intersect $Q_N$ in a hyperbolic cone of order $r$ and $q^s(q^s-1)/2$ polar spaces that intersect $Q_N$ in an elliptic cone of order $r$.

† Dowling [21].

Corollary 2. In theorem (3.3.3), if $N-r = 2s+1$ and $Q_N$ is an elliptic cone of order $r$, then there are $(q^{s+1}+1)(q^s-1)/(q-1)$ $(N-1)$-flat spaces that intersect $Q_N$ in an elliptic cone of order $(r+1)$; and $(q^{2s+1} + q^s)$ $(N-1)$-flat spaces that intersect in a cone of order $r$.

Proof. From theorem (3.2.8), as $N-r = 2s+1$, $Q_{N-r} \cap \Sigma_{N-r-1} = Q_{N-r-1}$ is either an elliptic cone of order one or a non-degenerate quadric. Now the result follows from theorem (3.3.3).
Corollary 3. In theorem (3.3.3), if $N-r = 2s+1$ and $Q_N$ is a hyperbolic cone of order $r$, then there are $(q^{s+1}-1)(q^s+1)/(q-1)$ $(N-1)$-flat spaces that intersect $Q_N$ in a hyperbolic cone of order $(r+1)$ and $(q^{2s+1} - q^s)$ $(N-1)$-flat spaces that intersect in a cone of order $r$.

Proof. From theorem (3.2.7), as $N-r = 2s+1$, $Q_{N-r-1} = Q_{N-r} \cap \Sigma_{N-r-1}$ is either a hyperbolic cone of order one or a non-degenerate quadric. Now the result follows from theorem (3.3.3).
Using these corollaries we obtain the following lemmas. These are nothing but summaries of the preceding results.

Lemma (3.3.4). Let $Q_{2k+1}$ be a cone of order one in $PG(2k+1, q)$, $q$ any prime power. Define $Q_{2k} = Q_{2k+1} \cap \Sigma_{2k}$, where $\Sigma_{2k}$ is a $2k$-flat space. Then we can partition the $2k$-flat spaces into four groups, according to the order and nature of the quadric $Q_{2k}$. Further

    Order of Q_2k                    Number of 2k-flat spaces in the corresponding group
    Non-degenerate                   $q^{2k+1}$
    Hyperbolic cone of order one     $(q^{2k} + q^k)/2$
    Elliptic cone of order one       $(q^{2k} - q^k)/2$
    Cone of order two                $(q^{2k}-1)/(q-1)$.

Proof. If $q$ is an odd prime power, then the result follows from theorem (3.3.3) and corollary 1. If $q$ is an even prime power, then the result follows from theorem (3.3.2).
Lemma (3.3.5). Let $Q_{2k}$ be a hyperbolic cone of order one in $PG(2k, q)$. Then we can partition the $(2k-1)$-flat spaces into three groups according to the nature and order of the quadric $Q_{2k-1} = Q_{2k} \cap \Sigma_{2k-1}$. Further

    Order of Q_(2k-1)                Number of (2k-1)-flats in the corresponding group
    Non-degenerate                   $q^{2k}$
    Cone of order one                $q^{2k-1} - q^{k-1}$
    Hyperbolic cone of order two     $(q^k-1)(q^{k-1}+1)/(q-1)$.

The result follows from theorem (3.3.3) and corollary 3.
Lemma (3.3.6). Let $Q_{2k}$ be an elliptic cone of order one in $PG(2k, q)$. Then we can partition the $(2k-1)$-flat spaces into three groups according to the nature and order of $Q_{2k-1} = Q_{2k} \cap \Sigma_{2k-1}$, where $\Sigma_{2k-1}$ is a $(2k-1)$-flat space. Further

    Order of Q_(2k-1)                Number of (2k-1)-flat spaces in the corresponding group
    Non-degenerate                   $q^{2k}$
    Cone of order one                $q^{2k-1} + q^{k-1}$
    Elliptic cone of order two       $(q^k+1)(q^{k-1}-1)/(q-1)$.

The result follows from theorem (3.3.3) and corollary 2.
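Since each of Lemmas (3.3.4) to (3.3.6) partitions all hyperplanes of the ambient space, the group sizes must add up to the total number of hyperplanes. The following sketch checks this for a few small cases; the function names are illustrative, and the formulas are the group sizes stated in the three lemmas.

```python
def hyperplane_count(n: int, q: int) -> int:
    """Number of (n-1)-flats in PG(n, q)."""
    return (q ** (n + 1) - 1) // (q - 1)

def lemma_3_3_4(q: int, k: int) -> list[int]:
    """Group sizes for a cone of order one in PG(2k+1, q)."""
    return [q ** (2 * k + 1),
            (q ** (2 * k) + q ** k) // 2,
            (q ** (2 * k) - q ** k) // 2,
            (q ** (2 * k) - 1) // (q - 1)]

def lemma_3_3_5(q: int, k: int) -> list[int]:
    """Group sizes for a hyperbolic cone of order one in PG(2k, q)."""
    return [q ** (2 * k),
            q ** (2 * k - 1) - q ** (k - 1),
            (q ** k - 1) * (q ** (k - 1) + 1) // (q - 1)]

def lemma_3_3_6(q: int, k: int) -> list[int]:
    """Group sizes for an elliptic cone of order one in PG(2k, q)."""
    return [q ** (2 * k),
            q ** (2 * k - 1) + q ** (k - 1),
            (q ** k + 1) * (q ** (k - 1) - 1) // (q - 1)]

def partitions_are_complete(q: int, k: int) -> bool:
    return (sum(lemma_3_3_4(q, k)) == hyperplane_count(2 * k + 1, q)
            and sum(lemma_3_3_5(q, k)) == hyperplane_count(2 * k, q)
            and sum(lemma_3_3_6(q, k)) == hyperplane_count(2 * k, q))
```

A call such as `partitions_are_complete(3, 2)` confirms that no hyperplane is missed or counted twice by the three tables.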
Theorem (3.3.7). Let $Q_N$ be a non-degenerate quadric in $PG(N,q)$. Let $\Sigma_{N-m}$ be any $(N-m)$-flat. Then $Q_{N-m} = Q_N \cap \Sigma_{N-m}$ is a cone of order at most $m$, where in particular a non-degenerate quadric is considered as a cone of order zero.

Proof. For $(N-1)$-flat spaces, the result follows from theorem (3.2.1), theorem (3.2.4) and theorem (3.2.6). Suppose the result is true up to $(N-m+1)$-flat spaces. Let $\Sigma_{N-m}$ be any $(N-m)$-flat space. There exists at least one $(N-m+1)$-flat space containing $\Sigma_{N-m}$. Let $\Sigma_{N-m+1}$ be one such $(N-m+1)$-flat space. Define $Q_{N-m+1} = Q_N \cap \Sigma_{N-m+1}$. Then by induction $Q_{N-m+1}$ is a cone of order at most $(m-1)$. That is, $Q_{N-m+1}$ is a cone of order at most $(m-1)$ in $\Sigma_{N-m+1}$ and $\Sigma_{N-m}$ is a $(N-m+1-1)$-flat space in $\Sigma_{N-m+1}$. Hence from theorem (3.3.2) or from theorem (3.3.3) we have that $Q_{N-m}$ is a cone of order at most $m$.

Lemma (3.3.8). Let
$$\sum_{0 \le i \le j \le N} b_{ij}x_ix_j = 0$$
be a non-degenerate quadric in $PG(N,q)$, where at least one of $N$ and $q$ is odd. If $\lambda$ is a non-null element of $GF(q)$, the quadric
$$\sum_{0 \le i \le j \le N} b_{ij}x_ix_j + \lambda x_{N+1}^2 = 0$$
is non-degenerate in $PG(N+1,q)$.

Proof†. Let $B_N$ be the matrix of $Q_N$ and $B_{N+1}$ be the matrix of $Q_{N+1}$. Then
$$B_{N+1} = \begin{pmatrix} B_N & 0^T \\ 0 & \lambda \end{pmatrix},$$
where $0$ is a row vector of $(N+1)$ zeros. Assume first that $q$ is odd. Then $Q_{N+1}$ is non-degenerate if the matrix
$$B_{N+1} + B_{N+1}^T = \begin{pmatrix} B_N + B_N^T & 0^T \\ 0 & 2\lambda \end{pmatrix}$$
is non-singular. Since $Q_N$ is non-degenerate, $B_N + B_N^T$ is non-singular and $2\lambda \neq 0$. Hence $B_{N+1} + B_{N+1}^T$ is non-singular.

Next suppose that $q$ is even. Then $N$ is odd, and $q = 2^n$ implies $2\lambda = 0$. Since $N$ is odd and $Q_N$ is non-degenerate, $B_N + B_N^T$ is non-singular, so that the rank of $B_{N+1} + B_{N+1}^T$ is $N+1$. Since $N+1$ is even, however, the rank of $Q_{N+1}$ is one more than the rank of $B_{N+1} + B_{N+1}^T$. Hence $Q_{N+1}$ has rank $N+2$ and therefore is non-degenerate.

† Dowling [21].
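Theorem (3.3.9). Let $Q_N$ be a non-degenerate quadric in $PG(N,q)$. Let $P_0$ and $P_1$ be any two points on $Q_N$. Define $\Sigma_{N-2} = T(P_0) \cap T(P_1)$ and $Q_{N-2} = Q_N \cap \Sigma_{N-2}$. Then
i) if $P_1 \notin T(P_0)$, $Q_{N-2}$ is non-degenerate in $\Sigma_{N-2}$;
ii) if $P_1 \in T(P_0)$, $Q_{N-2}$ is a cone of order two in $\Sigma_{N-2}$.

Proof.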
If at least one of
are well defined.
are well defined as
PO' PI
E
~ T(P O)
Also,
PI
€
T(P 1)
and
Po
E
T(P )'
O
Suppose
q
is odd then
Also if both are even then
PI
Part i) :
Nand
QN'
=> P
0 ~ T (P 1) •
T(P )
O
T(P )
O
and
and
T(P )
l
Now choose
P 2 ,P , .•. ,P N as
3
the
(N-l)
Without loss of generality we can take the row vectors of
P. = (0,0, ..• ,0,1,0, ••. ,0)
i
...
-:L
p.
1
So,
or
Since,
b·
i.e.
--
OO
= b
T(P O):
Similarly,
T(P )
l
= b
02
03
- ... -
°
Xl =
b
ON =
°
=
0,
is
Since
we have
=
° °
T(P 1):
X
SO,
LN-2
= T(P l ) n T(P O):
QN:
X
Xl
o=°
X
Therefore
Bx
or
(xO,x I '··· ,~)
T
=
°
° b OI ° °
° ° °b b°
° ° 22 2l
°
°b
°° ° °
b
as
0,1,2, •. .,N.
N-i
i
LN- 2 •
independent points of
X
o
Xl
2N
x
NN
~
2
=
°
Le.
Hence
That is
QN-Z
Part ii) : Suppose
= QN
n'LN-Z
P
E
T(P 0)
1
is non-degenerate in
Then
Po
E
T(P )
Also,
P
E
T(P 1)
So,
PO' PI
Let
Pz be a point in
and
P3
1
'N-Z.
L
1
E
and
Po
E
T(P O)
as
PO'P I
IN- z .
be a point in
T(P 0)
LN-Z'
T(P 1) - LN- Z ·
Without loss of generality we can take the row vector of
=
QN .
E
(0,0,0, ... ,0,1,0, ... ,0),
p.
1
as
i = O,l,Z, ... ,N.
'-----..-r--'
e-
N-i
i
Hence
can be written as
bOO = b Ol = b OZ = b 04 = boS = ••• = bON = 0 •
Hence
T (P 0) :
x
3
= 0 •
Similarly we can show that
that is
= b
Hence the quadric
Q
N
can be written as
1N
= 0 •
+ x 3 (b 03 x O + b 33 x 3 + b 34 x 4 +...+
+ Q(x4,x5' ••• '~)
=
b3N~)
0
Hence
can be written as
That is
(N-l)
QN-2
is a cone of order two.
For
variables and by theorem (3.3.8)
QN-2
QN-2
equation has only
can be at most a cone
of order two.
Theorem (3.3.10).
where at least one of
and
Let
Nand
PI
be any point not on
1.
if
PI
LN-2
2.
if
Proof.
Q be a non-degenerate quadric in
N
q
is odd.
QN·
Let
Po
Define
LN-2
=
QN-2
4 T(P O)
then
LN-2
n QN
T(P O)
then
LN- 2
n Q
=
be any point on
T(P O) n T(P l ).
and
Then
'
PI
E
Since at least one of
T(P )
l
are unique.
Part 1.
Also,
Case a):
Q
N
is non-degenerate in
N = QN-2
N and
q
is a cone of order one.
is odd, there is a one-
one correspondence between polar spaces and (N-l)-flat spaces.
T(P )
O
PG(N,q),
q
n
,2,
n
integer.
Hence
Then
P1 ~ T(P 1)
Let
(N-1)
p*
1
be a point in
points in
T(P 1) - LN-Z'
We can choose
P Z,p 3 "",P N
Po,Pt,PZ""'P N form a set of
such that
LN- Z
independent points.
PI ~ QN'
as
With out loss of generality, we can choose the
vectors corresponding to these points as unit vectors, i.e.
(0,0, ... ,0,1,0, ••• ,0),
!:t
= (0,1,0, •.. ,0).
T(P)
0:
i.e.
i = O,Z,3, ... ,N
T
P (B+BT)x
=-0
-
=
°
ZbOOxO + b 01 x 1 + •.• + bONXN =
PO'P Z,P 3 "",P N
°
T(P O)
E
°
T(P 0):
xl =
T(P)
1:
T
P (B+BT)x
-1
-
e-
Also,
T(P 1 ) n T(P O):
QN:
X
B x
= 0, Xl = 0,
Xo
T
=0
as
PO,P!
E
LN-Z'
°
=
x 1 (b ul x O + b n x 1 + ... + blN~) + Q(xZ,x3'''''~) = 0.
Le.
LN- Z'
It is easy to see that in
Case b)
Then
SO, let
q
PI
E
= Zn,
T(P 1 )
is non-degenerate.'
n integer.
PI ~ QN'
but
P 'P 1 'P Z""'P N be
O
P Z,P "",P
belong to
N
3
Q(xZ,x3""'~)
(N+1)
LN- Z'
independent points such that
Without loss of generality, we can
take
P. = (0, ••. ,0,1,0, .•. ,0),
-1.
Hence
Xl
and
-1
=
°
T
P (B+BT)x
=
-
°
i = O,l,Z, •.. ,N.
or
x
°
b ll #
But
PI ~ QN'
as
x B x
°
°
°
=
T
=
Le.
Therefore
Hence the result.
Part Z:
.-
PI
E
Also,
Po
q
Case a):
T(P O) => Po
E
T(P O)
# Zn.
as
n
P
z
be a point in
independent points in
Po
E
QN·
an integer
PI ~ T(P 1)
Let
T(P l ).
E
PI ~ QN'
as
Let
T(P 1) - LN-Z·
PO,P 3 ,P 4"" 'PN be
(N-l)
Without loss of generality we can take
LN-Z·
P. = (0,0, ... ,0,1,0, .•• ,0),
-J.
i
=
O,l,Z, ... ,N.
Then
But
bOO = b Ol = b 03 = b 04 =••• = bON =
That is
T(P O):
X
Similarly
T(P 1):
xl =
Therefore
LN-Z:
xl
QN:
z=
°.
°.
= 0,
xBxT=O
X
° as
z=0
PO,Pl,P3,P4,···,PN
E
T(P O)·
or
That is
But
Q(x3,x4' •.. '~)
(N-3)-flat space.
is non-degenerate in
Case b)
q
=
Zn,
So,
PI
E
T(P1) •
Also
PI
E
QN
That is
b
Also,
Po
E
T(P )
O
Let
P
z
E
T(P )
O
and
P
3
E
T(P )
l
p.
xl
= 0,
X
z=0
PO.
integer
n
(Hypothesis)
&
Po
E
e·
QN
=>
bOO
= O.
- LN- 2
- LN- Z·
LN- Z
P4,P S,···,PN in
~
= 0,
:; 0
ll
are independent.
o
Therefore
is a cone of order one with vertex
Choose points
X
such that
PO,Pl,PZ,P3,P4, .•. ,PN
Without loss of generality we can take
=
(0,0, ••• ,0,1,0, ••. ,0),
i = O,l,Z, ... ,N.
Then
But
= b OZ = b04 = b OS
So,
b Ol
So
T(P ):
O
x
Similarly
T(P ):
l
X
3
=0
z=
0
or
- .•• - bON
=
0, but
b
03
:; O.
That is
+ x3(b03~+b33x3+···+b3N~)
+ Q(x4,x5' ... '~) = 0
Hence
xZ
=
It is easy to see that
is non-degenerate in
O
= 0, xl = 0,
X
z = 0,
O
=
x
= O}
{x
x
3
= O} flat.
Hence, by lemma (3.3.8),
"-
is non-degenerate in
{x
0,
X
z = 0,
3
flat as
N is odd.
Therefore
is a cone of order one in
$\{x_2 = 0,\ x_3 = 0\}$.

Theorem (3.3.11)†. Let $Q_N$ be a non-degenerate quadric in $PG(N,q)$, where $q$ is odd, and let $P_0$ and $P_1$ be two points not on $Q_N$. Define $\Sigma_{N-2} = T(P_0) \cap T(P_1)$ and let $\Sigma_{N-1}$ be the $(N-1)$-flat containing $P_0$ and $\Sigma_{N-2}$. Then
1) if $P_1 \in \Sigma_{N-1}$, $\Sigma_{N-2} \cap Q_N$ is a cone of order one in $\Sigma_{N-2}$ with vertex at the point $P_2$ where the line $(P_0P_1)$ meets $Q_N$;
2) if $P_1 \notin \Sigma_{N-1}$, $\Sigma_{N-2} \cap Q_N$ is non-degenerate in $\Sigma_{N-2}$.

† Dowling [21].

Theorem (3.3.12)†. Let $Q_{2k+1}$ be a non-degenerate quadric in $PG(2k+1, q)$, where $q$ is even, and let $P_0$ and $P_1$ be two points not on $Q_{2k+1}$. Define $\Sigma_{2k-1} = T(P_0) \cap T(P_1)$. Then
1) if $P_1 \in T(P_0)$, $\Sigma_{2k-1} \cap Q_{2k+1}$ is a cone in $\Sigma_{2k-1}$ with vertex at the point where the line $(P_0P_1)$ meets $Q_{2k+1}$;
2) if $P_1 \notin T(P_0)$, $\Sigma_{2k-1} \cap Q_{2k+1}$ is non-degenerate in $\Sigma_{2k-1}$.

† Dowling [21].

Theorem (3.3.13). Let $Q_{2k}$ be a non-degenerate quadric in $PG(2k, 2^n)$, $n$ an integer. Let $P_0$ and $P_1$ be any two points not on $Q_{2k}$. Define $\Sigma_{2k-2} = T(P_0) \cap T(P_1)$. Then
1) if $P_1 \in T(P_0)$, $\Sigma_{2k-2} \cap Q_{2k}$ is a cone of order one in $\Sigma_{2k-2}$;
2) if $P_1 \notin T(P_0)$, $\Sigma_{2k-2} \cap Q_{2k}$ is non-degenerate in $\Sigma_{2k-2}$.

Proof. From lemma (3.2.2) and from lemma (3.2.3) we have that $T(P_0)$ and $T(P_1)$ can be expressed as the polar spaces of some particular points on the quadric $Q_{2k}$. That is, there exist points $P_0^*$ and $P_1^*$ on $Q_{2k}$ such that $T(P_0) = T(P_0^*)$ and $T(P_1) = T(P_1^*)$. Now, the result follows from theorem (3.3.9).
Lemma (3.3.14). Let $Q_N$ be a non-degenerate quadric in $PG(N,q)$. Let $P_0$ and $P_1$ be any two points (not on the same line) in $PG(N,q)$. Define $\Sigma_{N-2} = T(P_1) \cap T(P_0)$, then $\Sigma_{N-2} \cap Q_N$ is non-degenerate in $\Sigma_{N-2}$.

Proof. The result follows from theorems (3.3.9), (3.3.10), (3.3.11), (3.3.12) and (3.3.13).
Theorem (3.3.15). Let $Q_{2k+1}$ be a non-degenerate hyperbolic quadric in $PG(2k+1, q)$. Let $\Sigma_{2k-1}$ be any $(2k-1)$-flat space. Define $Q_{2k-1} = \Sigma_{2k-1} \cap Q_{2k+1}$. Then we can partition the $(2k-1)$-flat spaces into four groups according to the nature and order of $Q_{2k-1}$. Further

    Order of Q_(2k-1)                Number of (2k-1)-flats in the corresponding group
    Hyperbolic                       $q^{2k}(q^{k+1}-1)(q^k+1)/2(q-1)$
    Elliptic                         $q^{2k}(q^{k+1}-1)(q^k-1)/2(q+1)$
    Cone of order one                $q^{k-1}(q^{2k}-1)(q^{k+1}-1)/(q-1)$
    Hyperbolic cone of order two     $(q^{k+1}-1)(q^{k-1}+1)(q^{2k}-1)/(q-1)^2(q+1)$.
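For $k = 1$ and $q = 2$ the four groups of Theorem (3.3.15) are simply the lines of $PG(3,2)$ sorted by how many points they share with a hyperbolic quadric. The sketch below enumerates them directly; it assumes the quadric $x_0x_1 + x_2x_3 = 0$ and encodes a point as a non-zero 4-bit integer whose bit $i$ holds $x_i$ (an encoding chosen here for illustration).

```python
from collections import Counter
from itertools import combinations

def on_hyperbolic_quadric(p: int) -> bool:
    x0, x1, x2, x3 = ((p >> i) & 1 for i in range(4))
    return (x0 * x1 + x2 * x3) % 2 == 0

points = range(1, 16)                                   # the 15 points of PG(3,2)
quadric = {p for p in points if on_hyperbolic_quadric(p)}
# Over GF(2) the line through a and b is {a, b, a + b}; XOR is vector addition.
lines = {frozenset((a, b, a ^ b)) for a, b in combinations(points, 2)}

# 2 common points: hyperbolic section; 0: elliptic section;
# 1: cone of order one (tangent line); 3: cone of order two (line on the quadric).
section_sizes = Counter(len(line & quadric) for line in lines)
```

The enumeration gives 18 lines with two common points, 2 with none, 9 tangent lines and 6 lines lying on the quadric, which are the values of the four formulas at $q = 2$, $k = 1$.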
Proof. For any $(2k-1)$-flat space,
$$Q_{2k-1} = Q_{2k+1} \cap \Sigma_{2k-1} = Q_{2k+1} \cap (\Sigma_{2k} \cap \Sigma_{2k-1}) = (Q_{2k+1} \cap \Sigma_{2k}) \cap \Sigma_{2k-1},$$
where $\Sigma_{2k} \supset \Sigma_{2k-1}$. Since any $(2k-1)$-flat intersects the quadric $Q_{2k+1}$ in a unique section $Q_{2k-1}$, it does not make any difference what $2k$-flat $\Sigma_{2k}$ is considered, as long as $\Sigma_{2k}$ contains $\Sigma_{2k-1}$. Define $Q_{2k} = Q_{2k+1} \cap \Sigma_{2k}$ for any $2k$-flat space $\Sigma_{2k}$. From theorem (3.2.7) it follows that we can partition the $2k$-flat spaces into two groups:
$C_1$: the class of $2k$-flat spaces for which $Q_{2k}$ is non-degenerate;
$C_2$: the class of $2k$-flat spaces for which $Q_{2k}$ is a hyperbolic cone of order one.
Also
$$|C_1| = (q^{2k+1} - q^k) \quad \text{and} \quad |C_2| = (q^{k+1}-1)(q^k+1)/(q-1).$$

Case 1): Let $\Sigma^0_{2k}$ be any $2k$-flat in $C_1$. Define $Q^0_{2k} = Q_{2k+1} \cap \Sigma^0_{2k}$. Let $\Sigma_{2k-1}$ be any $(2k-1)$-flat in $\Sigma^0_{2k}$. Define $Q_{2k-1} = Q^0_{2k} \cap \Sigma_{2k-1}$. $Q^0_{2k}$ is non-degenerate in $\Sigma^0_{2k}$ and $\Sigma_{2k-1}$ is a $(2k-1)$-flat space in $\Sigma^0_{2k}$. Then we can apply theorem (3.2.5) for the case when $q$ is even. Thus, the $(q^{2k+1}-1)/(q-1)$ $(2k-1)$-flat spaces in $\Sigma^0_{2k}$ can be partitioned into three groups according to the nature and order of $Q_{2k-1}$ with respect to $\Sigma^0_{2k}$. That is

    Order of Q_(2k-1)                Number of (2k-1)-flats in the corresponding group
    Cone of order one                $(q^{2k}-1)/(q-1)$
    Hyperbolic                       $(q^{2k} + q^k)/2$
    Elliptic                         $(q^{2k} - q^k)/2$.

When $q$ is odd, we can apply theorem (3.2.1) and theorem (3.2.6). Then, we obtain that there are $(q^{2k}-1)/(q-1)$ $(2k-1)$-flat spaces in $\Sigma^0_{2k}$ that intersect $Q^0_{2k}$ in a cone of order one and the remaining $q^{2k}$ $(2k-1)$-flat spaces in a non-degenerate quadric. Further, out of these $q^{2k}$ $(2k-1)$-flat spaces, $q^k(q^k-1)/2$ $(2k-1)$-flat spaces intersect in an elliptic quadric and $q^k(q^k+1)/2$ $(2k-1)$-flat spaces in a hyperbolic quadric†. This completes the proof for case 1.

Case 2): Let $\Sigma'_{2k}$ be any $2k$-flat in $C_2$. Define $Q'_{2k} = Q_{2k+1} \cap \Sigma'_{2k}$. Let $\Sigma_{2k-1}$ be any $(2k-1)$-flat space in $\Sigma'_{2k}$. Then define $Q_{2k-1} = Q'_{2k} \cap \Sigma_{2k-1}$. $Q'_{2k}$ is a hyperbolic cone of order one in $\Sigma'_{2k}$ and $\Sigma_{2k-1}$ is a $(2k-1)$-flat space in $\Sigma'_{2k}$. Hence by lemma (3.3.5), we can partition the $(q^{2k+1}-1)/(q-1)$ $(2k-1)$-flat spaces of $\Sigma'_{2k}$ into three distinct groups according to the nature and order of $Q_{2k-1}$; i.e.

    Order of Q_(2k-1)                Number of (2k-1)-flats in the corresponding group
    Hyperbolic                       $q^{2k}$
    Cone of order one                $(q^{2k-1} - q^{k-1})$
    Hyperbolic cone of order two     $(q^k-1)(q^{k-1}+1)/(q-1)$.
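Now, we shall count the number of $(2k-1)$-flat spaces for which $Q_{2k-1} = Q_{2k} \cap \Sigma_{2k-1}$ is a cone of order one. Firstly, in class $C_1$, in each $2k$-flat there are $(q^{2k}-1)/(q-1)$ $(2k-1)$-flat spaces for which $Q_{2k-1}$ is a cone of order one. Also, in $C_2$, there are $(q^{2k-1} - q^{k-1})$ such $(2k-1)$-flats in each $2k$-flat. Hence, counted over all $2k$-flats, there are
$$(q^{2k+1} - q^k)\,\frac{q^{2k}-1}{q-1} + \frac{(q^{k+1}-1)(q^k+1)}{q-1}\,(q^{2k-1} - q^{k-1})$$
$(2k-1)$-flats for which $Q_{2k-1}$ is a cone of order one. But each $(2k-1)$-flat occurs in exactly $(q+1)$ $2k$-flat spaces. Hence the number of $(2k-1)$-flat spaces for which $Q_{2k-1}$ is a cone of order one is
$$\left\{(q^{2k+1} - q^k)\,\frac{q^{2k}-1}{q-1} + \frac{(q^{k+1}-1)(q^k+1)}{q-1}\,(q^{2k-1} - q^{k-1})\right\}\Big/(q+1) = q^{k-1}(q^{2k}-1)(q^{k+1}-1)/(q-1).$$
Similarly we can obtain the remaining values. The number of $(2k-1)$-flat spaces for which $Q_{2k-1}$ is hyperbolic is
$$\frac{1}{q+1}\left\{(q^{2k+1} - q^k)\,\frac{q^{2k}+q^k}{2} + \frac{(q^{k+1}-1)(q^k+1)}{q-1}\,q^{2k}\right\} = \frac{q^{2k}(q^{k+1}-1)(q^k+1)}{2(q-1)}.$$
The number of $(2k-1)$-flat spaces for which $Q_{2k-1}$ is elliptic is
$$\frac{1}{q+1}\,(q^{2k+1} - q^k)\,\frac{q^{2k}-q^k}{2} = \frac{q^{2k}(q^{k+1}-1)(q^k-1)}{2(q+1)}.$$
Finally, the number of $(2k-1)$-flats for which $Q_{2k-1}$ is a hyperbolic cone of order two is
$$\frac{1}{q+1}\cdot\frac{(q^{k+1}-1)(q^k+1)}{q-1}\cdot\frac{(q^k-1)(q^{k-1}+1)}{q-1} = \frac{(q^{k+1}-1)(q^{k-1}+1)(q^{2k}-1)}{(q-1)^2(q+1)}.$$

† Dowling [21].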
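The counting argument in this proof (sum the contributions of the $2k$-flats of classes $C_1$ and $C_2$, then divide by the $q+1$ $2k$-flats through each $(2k-1)$-flat) can be replayed with exact rational arithmetic. This is a sketch only; the dictionary keys and function names are illustrative.

```python
from fractions import Fraction

def counts_via_2k_flats(q: int, k: int) -> dict:
    """Double count (2k-1)-flat sections through the classes C1 and C2 of 2k-flats."""
    c1 = q ** (2 * k + 1) - q ** k
    c2 = Fraction((q ** (k + 1) - 1) * (q ** k + 1), q - 1)
    per_c1_flat = {"cone1": Fraction(q ** (2 * k) - 1, q - 1),
                   "hyperbolic": Fraction(q ** (2 * k) + q ** k, 2),
                   "elliptic": Fraction(q ** (2 * k) - q ** k, 2)}
    per_c2_flat = {"hyperbolic": q ** (2 * k),
                   "cone1": q ** (2 * k - 1) - q ** (k - 1),
                   "cone2": Fraction((q ** k - 1) * (q ** (k - 1) + 1), q - 1)}
    return {name: (c1 * per_c1_flat.get(name, 0) + c2 * per_c2_flat.get(name, 0)) / (q + 1)
            for name in ("hyperbolic", "elliptic", "cone1", "cone2")}

def theorem_3_3_15(q: int, k: int) -> dict:
    """Closed forms stated in Theorem (3.3.15)."""
    a = q ** (k + 1) - 1
    return {"hyperbolic": Fraction(q ** (2 * k) * a * (q ** k + 1), 2 * (q - 1)),
            "elliptic": Fraction(q ** (2 * k) * a * (q ** k - 1), 2 * (q + 1)),
            "cone1": Fraction(q ** (k - 1) * (q ** (2 * k) - 1) * a, q - 1),
            "cone2": Fraction(a * (q ** (k - 1) + 1) * (q ** (2 * k) - 1),
                              (q - 1) ** 2 * (q + 1))}
```

For every small pair `(q, k)` tried, `counts_via_2k_flats(q, k)` equals `theorem_3_3_15(q, k)`.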
Theorem (3.3.16). Let $Q_{2k+1}$ be an elliptic quadric in $PG(2k+1, q)$ and $\Sigma_{2k-1}$ be any $(2k-1)$-flat space. Define $Q_{2k-1} = Q_{2k+1} \cap \Sigma_{2k-1}$. Then we can partition the $(2k-1)$-flat spaces into four groups according to the nature and order of $Q_{2k-1}$. Further

    Order of Q_(2k-1)                Number of (2k-1)-flat spaces in the corresponding group
    Hyperbolic                       $q^{2k}(q^{k+1}+1)(q^k+1)/2(q+1)$
    Elliptic                         $q^{2k}(q^{k+1}+1)(q^k-1)/2(q-1)$
    Cone of order one                $q^{k-1}(q^{k+1}+1)(q^{2k}-1)/(q-1)$
    Elliptic cone of order two       $(q^{k-1}-1)(q^{k+1}+1)(q^{2k}-1)/(q-1)^2(q+1)$.

Proof. The proof of this result is exactly the same as that of theorem (3.3.15).
Theorem (3.3.17). Let $Q_{2k}$ be a non-degenerate quadric in $PG(2k, q)$. Let $\Sigma_{2k-2}$ be any $(2k-2)$-flat space. Define $Q_{2k-2} = \Sigma_{2k-2} \cap Q_{2k}$. Then the $(2k-2)$-flat spaces can be partitioned into four distinct groups according to the nature and order of the quadric $Q_{2k-2}$. Further

    Order and nature of Q_(2k-2)     Number of (2k-2)-flat spaces in the corresponding group
    Non-degenerate                   $q^{2k}(q^{2k}-1)/(q^2-1)$
    Hyperbolic cone of order one     $q^{k-1}(q^{2k}-1)(q^{k-1}+1)/2(q-1)$
    Elliptic cone of order one       $q^{k-1}(q^{2k}-1)(q^{k-1}-1)/2(q-1)$
    Cone of order two                $(q^{2k}-1)(q^{2k-2}-1)/(q^2-1)(q-1)$.

Proof. The proof of this result is exactly the same as that of theorem (3.3.15).
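Theorem (3.3.17) can be checked by enumeration for $k = 2$, $q = 2$, where the $(2k-2)$-flats are the planes of $PG(4,2)$. The sketch below assumes the quadric $x_0x_1 + x_2x_3 + x_4^2 = 0$ and encodes points and hyperplane coefficient vectors as 5-bit integers (bit $i$ holds $x_i$); both choices are made here for illustration. A plane section with three points is a conic unless the three points are collinear, in which case it is a line counted twice, i.e. a cone of order two.

```python
from collections import Counter
from itertools import combinations

POINTS = range(1, 32)                      # non-zero vectors of GF(2)^5

def bit(x: int, i: int) -> int:
    return (x >> i) & 1

def dot(a: int, x: int) -> int:
    """Inner product over GF(2) of two bit-encoded vectors."""
    return bin(a & x).count("1") % 2

def on_quadric(x: int) -> bool:
    # x4^2 = x4 over GF(2)
    return (bit(x, 0) * bit(x, 1) + bit(x, 2) * bit(x, 3) + bit(x, 4)) % 2 == 0

QUADRIC = frozenset(x for x in POINTS if on_quadric(x))
# A plane is the common zero set of two distinct (hence independent) hyperplanes.
PLANES = {frozenset(x for x in POINTS if dot(a, x) == 0 and dot(b, x) == 0)
          for a, b in combinations(POINTS, 2)}

def section_type(plane: frozenset) -> str:
    section = sorted(plane & QUADRIC)
    if len(section) == 5:
        return "hyperbolic cone of order one"      # two lines through a vertex
    if len(section) == 1:
        return "elliptic cone of order one"        # the vertex only
    if len(section) == 3 and section[0] ^ section[1] ^ section[2] == 0:
        return "cone of order two"                 # three collinear points
    if len(section) == 3:
        return "non-degenerate"                    # a conic
    raise ValueError("unexpected section size")

TYPES = Counter(section_type(p) for p in PLANES)
```

The resulting counts are 80, 45, 15 and 15 for the four groups, matching the four formulas at $q = 2$, $k = 2$ and summing to the 155 planes of $PG(4,2)$.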
QN be a non-degenerate quadric in
Let
be any
(N-m)-flat space.
be the number of
Define
(N-m) -flats in
cone of order zero.
LN- m n
QN-m =
PG(Ntq)
Let
PG(Ntq) .
Let
QN'
that intersect
LN- m
n (N tN-mt r tq)
QN in a
In particular t by a cone of order zero we mean a
non-degenerate quadric t and by a cone of order
intersection is empty.
That is
n (N ,N t r, q)
r
~
"-1"
we mean that the
Also
-1.
{o1
if
if
r = 1
r:f
°
=0
if
r
m.
From theorem (3.3.7) we have
n(N,N-mtrtq)
-1 ::;; r ::;; m.
Hence
Let
L~-m+l be any (N-m+l)-flat space in
Q~-m+l = QN
any
or
in
>
L~-m+l'
n
Let
or
\0
LN-m+l
r-2+i t
r+l.
Let
Q~-m+l be a cone of order r.
a.(N-m t r)
1.
such that
Define
Then for
L~-m+l' QN-m be either a cone of order
(N-m)-flat space in
r
PG(N,q).
denote the number of
Q
- QO
n \
N-m - N-m+l
LN-m
r-l
(N-m)-flats
is a cone of order
= lt2t3).
(i
Theorem (3.3. 1.!).
I
n(N,N-mtr,q) =
i=l
Proof.
result for
Assume the result up to
m.
Let
m-l.
-1::;; r ::;; (m-l).
QN-m+l
°
If
be a cone of order
r = -1,
then
[q~l ]
q -1
Now we shall prove the
L~-m+l be any (N-mtl)-flat and
Suppose
where
n(N t N-m+l,r-2+i t q) x a 4_ i (N-m.r-2+i)x
QN-m
r
°
QN-m+l
=
in
\0
LN-m+l t
is clearly empty.
e
From the assumption, there are exactly
spaces that intersect
N
and theorem (3.3.3)
Q
N-m
in a cone of order
"'N-m+l
\'
L.N-m'
I~-m+l'
in
e_
r-l
or
r
or
1.
r-2+i, (i = 1,2,3).
Therefore the total
QN in a cone of order
n(N,N-m+l,r-l,q)
x
a (N-m,r-l)
3
+ n(N,N-m+l,r,q)
x
a (N-m,r)
2
+ n(N,N-m+l,r+l,q)
If a
By theorem (2.3.2)
a.(N-m,r) (N-m)-flat spaces that intersect
number of (N-m)-flat spaces that intersect
T
r.
is either a cone of order
I~m+l there are
In
Q-
in a cone of order
(N-m) -flat
Consider any
r+l.
Q
n(N,N-m+l,r,q) (N-m)-flat
r,
a (N-m,r)
l
x
(N-m+l)-flat intersects Q in a cone of o,rder greater than r+l or
N
less than
r-l, then the (N-m)-flats, in these
not intersect
QN in a cone of order
counted
(qm_l)/(q_l)
PG(N,q)
that intersect
times.
r.
In
(N-m+l)-flat spaces, do
Teach (N-m)-flat is
Hence the number of (N-m)-flats in
QN in a cone of order
n(N,N-m,r,q) = T x
r
is
.9.=!....
m
q -1
The values of
a.(N-m,r), i = 1,2,3
theorem (3.3.3).
1.
The result for
are given by theorem (3.3.2) and
m = 1,2
is obtained in sections (3.2)
and (3.3).
3.4. Retrieval schemes based on quadrics

Let $A_1, A_2, A_3, \ldots, A_l$ be $l$ attributes each with two levels. Denote the buckets by $B_1, B_2, \ldots, B_b$. Let $k_i$ be the size of the $i$-th bucket $B_i$.

Theorem (3.4.1). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a non-degenerate quadric in $PG(2s, q = 2^n)$ which is oriented toward $2s$-fold queries. It involves $l = (q^{2s}-1)/(q-1)$ attributes and $b = (q^{2s+1}-1)/(q-1)$ buckets.

Proof. Consider a non-degenerate quadric $Q_{2s}$ in $PG(2s, q = 2^n)$. The quadric has exactly $(q^{2s}-1)/(q-1)$ points. Identify the points of the quadric with the attributes. Further correspond the intersections of the quadric with $(2s-1)$-flats, $Q_{2s-1} = Q_{2s} \cap \Sigma_{2s-1}$, to the buckets. Hence corresponding to each $(2s-1)$-flat we have a bucket, that is $b = (q^{2s+1}-1)/(q-1)$. Since $Q_{2s-1}$ is not necessarily of the same order and nature for each $(2s-1)$-flat, the block sizes are unequal. From theorem (3.2.5) we have the block sizes.

Consider any $2s$ points in $Q_{2s}$. Then there exists a $(2s-1)$-flat, $\Sigma_{2s-1}$, containing the $2s$ points. If all the $2s$ points are independent then there exists exactly one $(2s-1)$-flat; otherwise there exists more than one $(2s-1)$-flat. Hence the filing scheme satisfies all queries of size $2s$.
Now we shall describe a storing rule and retrieval rule for the above filing scheme.

Storing rule: Each bucket is identified with $Q_{2s-1} = Q_{2s} \cap \Sigma_{2s-1}$, where $\Sigma_{2s-1}$ is a $(2s-1)$-flat space. Hence there is a one-one correspondence between $(2s-1)$-flats and buckets. So, the coefficients of the equation of a $(2s-1)$-flat are taken as the identification number of the corresponding bucket. Suppose a bucket contains $k$ attributes $A_1, A_2, \ldots, A_k$. A record will be stored in the bucket if it has $2s$ attributes in common with $A_1, A_2, \ldots, A_k$ $(k \geq 2s)$. A further refinement in the storage can be made by subdividing the buckets into subbuckets such that corresponding to each $2s$-fold query associated with a bucket there corresponds a subbucket. Accession numbers can be assigned to the subbuckets such that there is no duplication within a bucket. In this case more than one subbucket will have to be searched to retrieve records pertaining to a particular query.

Retrieval rule: Retrieval of records pertaining to a given $2s$-fold query is performed in two stages. First the attributes are identified with the corresponding points of the geometry. Then a $(2s-1)$-flat containing these points is obtained. Once the $(2s-1)$-flat is determined, the bucket can be located by matching the identification numbers. Finally, the subbucket is determined and all the accession numbers in the subbucket give the required records. Some additional subbuckets may have to be searched if a record is stored exactly once in a bucket.
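The first stage of the retrieval rule, finding a $(2s-1)$-flat through the query points, amounts to solving a homogeneous linear system for the coefficient vector of the flat. The following is a minimal sketch over $GF(2)$ that simply tries every non-zero coefficient vector; the function name `bucket_ids` and the representation of points as bit strings $x_0x_1\ldots$ are assumptions made for this illustration, and a practical implementation would use Gaussian elimination instead.

```python
from itertools import product

def bucket_ids(query_points: list[str]) -> list[str]:
    """Identification numbers of all hyperplanes c.x = 0 over GF(2) through the query points.

    Each point is a bit string such as "00100"; the returned strings are the
    coefficient vectors c0 c1 ... of the hyperplanes containing every point.
    """
    n = len(query_points[0])
    points = [[int(ch) for ch in p] for p in query_points]
    ids = []
    for c in product((0, 1), repeat=n):
        if any(c) and all(sum(ci * xi for ci, xi in zip(c, p)) % 2 == 0 for p in points):
            ids.append("".join(map(str, c)))
    return ids
```

With four independent points of $PG(4,2)$, for instance `bucket_ids(["00100", "00010", "01111", "10111"])`, exactly one identification number is returned; with dependent points such as `["10000", "00100", "10100", "01000"]` several buckets qualify, as the proof of Theorem (3.4.1) notes.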
Example (3.4.1). Let $s = 2$ and $q = 2$. Then $l = 2^4 - 1 = 15$. Let the attributes correspond to the points of the quadric
$$x_0x_1 + x_2x_3 + x_4^2 = 0 \qquad (3.4.1)$$
in $PG(4,2)$. That is

    A1  = 10000,   A2  = 01000,   A3  = 00100,   A4  = 00010,   A5  = 10100,
    A6  = 10010,   A7  = 01100,   A8  = 01010,   A9  = 11001,   A10 = 11101,
    A11 = 11011,   A12 = 00111,   A13 = 01111,   A14 = 10111,   A15 = 11110.

Hence the blocks and the attributes associated with them are,

    No.   Identification   Linear equation              Attributes
    B1    10000            x0 = 0                       A2, A3, A4, A7, A8, A12, A13
    B2    01000            x1 = 0                       A1, A3, A4, A5, A6, A12, A14
    B3    00100            x2 = 0                       A1, A2, A4, A6, A8, A9, A11
    B4    00010            x3 = 0                       A1, A2, A3, A5, A7, A9, A10
    B5    10100            x0 + x2 = 0                  A2, A4, A5, A8, A10, A14, A15
    B6    01010            x1 + x3 = 0                  A1, A3, A5, A8, A11, A13, A15
    B7    11000            x0 + x1 = 0                  A3, A4, A9, A10, A11, A12, A15
    B8    10010            x0 + x3 = 0                  A2, A3, A6, A7, A11, A14, A15
    B9    01100            x1 + x2 = 0                  A1, A4, A6, A7, A10, A13, A15
    B10   00110            x2 + x3 = 0                  A1, A2, A9, A12, A13, A14, A15
    B11   11100            x0 + x1 + x2 = 0             A4, A5, A7, A9, A11, A13, A14
    B12   11010            x0 + x1 + x3 = 0             A3, A6, A8, A9, A10, A13, A14
    B13   10110            x0 + x2 + x3 = 0             A2, A5, A6, A10, A11, A12, A13
    B14   01110            x1 + x2 + x3 = 0             A1, A7, A8, A10, A11, A12, A14
    B15   11110            x0 + x1 + x2 + x3 = 0        A5, A6, A7, A8, A9, A12, A15
    B16   00001            x4 = 0                       A1, A2, A3, A4, A5, A6, A7, A8, A15
    B17   10001            x0 + x4 = 0                  A2, A3, A4, A7, A8, A9, A10, A11, A14
    B18   01001            x1 + x4 = 0                  A1, A3, A4, A5, A6, A9, A10, A11, A13
    B19   00101            x2 + x4 = 0                  A1, A2, A4, A6, A8, A10, A12, A13, A14
    B20   00011            x3 + x4 = 0                  A1, A2, A3, A5, A7, A11, A12, A13, A14
    B21   10101            x0 + x2 + x4 = 0             A2, A4, A5, A8, A9, A11, A12, A13, A15
    B22   10011            x0 + x3 + x4 = 0             A2, A3, A6, A7, A9, A10, A12, A13, A15
    B23   01101            x1 + x2 + x4 = 0             A1, A4, A6, A7, A9, A11, A12, A14, A15
    B24   01011            x1 + x3 + x4 = 0             A1, A3, A5, A8, A9, A10, A12, A14, A15
    B25   11111            x0 + x1 + x2 + x3 + x4 = 0   A5, A6, A7, A8, A10, A11, A13, A14, A15
    B26   11001            x0 + x1 + x4 = 0             A3, A4, A13, A14, A15
    B27   00111            x2 + x3 + x4 = 0             A1, A2, A10, A11, A15
    B28   11101            x0 + x1 + x2 + x4 = 0        A4, A5, A7, A10, A12
    B29   11011            x0 + x1 + x3 + x4 = 0        A3, A6, A8, A11, A12
    B30   10111            x0 + x2 + x3 + x4 = 0        A2, A5, A6, A9, A14
    B31   01111            x1 + x2 + x3 + x4 = 0        A1, A7, A8, A9, A13
There are 31 blocks: 15 buckets are of size 7, 10 of size 9 and 6 of size 5. The identification numbers of the buckets are the coefficients of the general equation

$c_0x_0 + c_1x_1 + c_2x_2 + c_3x_3 + c_4x_4 = 0$,

where $c_i \in GF(2)$. The subbuckets in a bucket are formed by considering all possible 4-plets in that bucket. The identification of a subbucket is formed by concatenation of the binary representations of the 4 attributes corresponding to it.
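The bucket sizes quoted for this example can be checked by brute force. The sketch below is only an illustration: the quadratic form `x0*x1 + x2*x3 + x4^2` is an assumption inferred from the fifteen listed points (the equation of the quadric is not legible in this copy), and the names `on_quadric` and `bucket` are hypothetical helpers.

```python
from collections import Counter
from itertools import product

def on_quadric(x):
    # Assumed quadratic form x0*x1 + x2*x3 + x4^2 over GF(2); it is inferred
    # from the listed points, not quoted from the text.
    return (x[0] * x[1] + x[2] * x[3] + x[4] * x[4]) % 2 == 0

# Over GF(2) each point of PG(4,2) is exactly one nonzero 5-vector.
vectors = [v for v in product((0, 1), repeat=5) if any(v)]
points = [v for v in vectors if on_quadric(v)]

def bucket(c):
    # Points of the quadric on the 3-flat c0*x0 + ... + c4*x4 = 0.
    return [p for p in points if sum(ci * pi for ci, pi in zip(c, p)) % 2 == 0]

size_counts = Counter(len(bucket(c)) for c in vectors)
print(len(points), sorted(size_counts.items()))
```

Under this assumption the 31 coefficient vectors play the role of the bucket identifications, and `bucket((1, 1, 0, 0, 1))` lists the five points of the bucket with identification 11001.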
For example,

Bucket Id.   Subbucket       Accession number of records
11001        A3A4A13A14      ....
             A3A4A13A15      ....
             A3A4A14A15      ....
             A3A13A14A15     ....
             A4A13A14A15     ....
00111        A1A2A10A11      ....
             A1A2A10A15      ....
             A1A2A11A15      ....
             A1A10A11A15     ....
             A2A10A11A15     ....

And so on.
A record is stored in a bucket if it contains at least one point of the bucket. For example a record with $f(I) = (A_1, A_2, \ldots, A_{15})$ (where $A_i$ indicates the presence or the absence of the i-th attribute) is stored in the bucket $B_{26}$ since $|B_{26} \cap f(I)| \ge 4$. This record is also stored in the buckets 10010 ($B_8$), 10100 ($B_5$), 10001 ($B_{17}$) and 00001 ($B_{16}$). Further, if the record has more than four points in common with a bucket, then it will be stored in more than one subbucket. This duplication, in a bucket, is avoided by ordering the subbuckets. The record is stored in the first subbucket that comes across. But a four tuple may occur in more than one bucket. That is, the same subbucket will occur in more than one bucket. This is avoided by using the chaining procedure between subbuckets. For example, the buckets $B_{16}$ and $B_{17}$ have five points in common, i.e. A2, A3, A4, A7 and A8. Hence the subbuckets

A2A3A4A7,  A2A3A4A8,  A2A3A7A8,  A2A4A7A8,  A3A4A7A8

occur in both these buckets. So instead of storing the accession numbers of the records which have these attributes, the addresses of these subbuckets in $B_{16}$ are stored in $B_{17}$. This avoids a lot of repetitions of the accession numbers.
Given any query of order four the attributes of this query are converted into the points of the geometry. Then a 3-flat containing these points is determined. The rest of the procedure consists of identifying the buckets and subbuckets. As an example, suppose the query is {A1, A3, A4, A5}. The corresponding points, from (3.4.1), are

A1 = 10000,   A3 = 00100,   A4 = 00010,   A5 = 10100.

These points are not independent. Hence there is more than one 3-flat containing these four points. Suppose the equation of the 3-flat is

$c_0x_0 + c_1x_1 + c_2x_2 + c_3x_3 + c_4x_4 = 0$.                    (3.4.2)

Then, $c_0 = c_2 = c_3 = 0$. Hence, the 3-flats containing the points are

$x_1 = 0$;   $x_4 = 0$;   $x_1 + x_4 = 0$.

Without loss of generality we can consider the 3-flat $x_1 + x_4 = 0$, i.e. 01001. Hence the bucket containing these points is $B_{18}$. The subbucket is obtained by matching A1A3A4A5 with the subbucket identification numbers. As the records are stored in the first subbucket that comes across, we have to search all the preceding subbuckets in the bucket to obtain all the records satisfying the query. But in $B_{18}$, the {A1, A3, A4, A5} subbucket is the first subbucket. Hence we do not have to search any other subbuckets to obtain all the accession numbers satisfying the query.
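The first retrieval stage, finding the 3-flats through the query points, amounts to solving a small homogeneous system over GF(2). A minimal sketch by exhaustive search follows; `hyperplanes_through` is a hypothetical helper name, and exhaustive search is chosen only because the space has 31 candidate flats.

```python
from itertools import product

def hyperplanes_through(points, dim=5):
    # All nonzero coefficient vectors c over GF(2) with c . p = 0 for every
    # query point p; each c is a bucket identification.
    result = []
    for c in product((0, 1), repeat=dim):
        if any(c) and all(sum(ci * pi for ci, pi in zip(c, p)) % 2 == 0 for p in points):
            result.append("".join(map(str, c)))
    return result

query = {
    "A1": (1, 0, 0, 0, 0),
    "A3": (0, 0, 1, 0, 0),
    "A4": (0, 0, 0, 1, 0),
    "A5": (1, 0, 1, 0, 0),
}
flats = hyperplanes_through(list(query.values()))
print(flats)
```

For the query {A1, A3, A4, A5} this returns the three identifications 00001, 01000 and 01001, the last being the bucket chosen in the example.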
Theorem (3.4.2). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a hyperbolic quadric in PG(2s+1, q) which is oriented toward (2s+1)-fold queries. It involves $l = \frac{q^{2s+1}-1}{q-1} + q^s$ attributes, $b = \frac{q^{2s+2}-1}{q-1}$ buckets such that $(q^{2s+1} - q^s)$ buckets have size $(q^{2s}-1)/(q-1)$ and the remaining $\frac{(q^{s+1}-1)(q^s+1)}{q-1}$ buckets have size $\frac{q^{2s}-1}{q-1} + q^s$.

Proof. The proof of this result is exactly the same as that of the previous result.
Example (3.4.2). Let $s = 1$ and $q = 2$. Consider the quadric,

$Q_3$: $x_0x_1 + x_2x_3 = 0$.

Then, $l = 9$, $b = 15$ and

A1 = 1000,   A4 = 1010,   A7 = 0101,
A2 = 0100,   A5 = 1001,   A8 = 0001,
A3 = 0010,   A6 = 0110,   A9 = 1111.

The buckets and the corresponding attributes are:

Bucket no.   Identification   Attributes         Subbuckets
B1           1000             A2,A3,A6,A7,A8     all triplets
B2           0100             A1,A3,A4,A5,A8     all triplets
B3           0010             A1,A2,A5,A7,A8     "
B4           0001             A1,A2,A3,A4,A6     "
B5           1001             A2,A3,A5,A6,A9     "
B6           0110             A1,A5,A6,A8,A9     "
B7           0101             A1,A3,A4,A7,A9     "
B8           1010             A2,A4,A7,A8,A9     "
B9           1111             A4,A5,A6,A7,A9     "
B10          1100             A3,A8,A9           A3A8A9
B11          0011             A1,A2,A9           A1A2A9
B12          1110             A4,A6,A8           A4A6A8
B13          1101             A3,A5,A7           A3A5A7
B14          1011             A2,A4,A5           A2A4A5
B15          0111             A1,A6,A7           A1A6A7

Each of the first nine buckets has ten subbuckets. So there are 96 subbuckets. But there are only $\binom{9}{3} = 84$ triplets. The storing and retrieval rules are exactly the same as in example (3.4.1).
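The counts in this example (nine attributes, bucket sizes, 96 subbuckets against 84 triplets) can be reproduced directly from the quadric $x_0x_1 + x_2x_3 = 0$ stated above. The following is a small sketch; the helper name `section` is hypothetical.

```python
from itertools import combinations, product
from math import comb

vectors = [v for v in product((0, 1), repeat=4) if any(v)]
# Points of the hyperbolic quadric x0*x1 + x2*x3 = 0 in PG(3,2).
points = [v for v in vectors if (v[0] * v[1] + v[2] * v[3]) % 2 == 0]

def section(c):
    # Attributes (quadric points) lying on the plane with coefficient vector c.
    return [p for p in points if sum(a * b for a, b in zip(c, p)) % 2 == 0]

buckets = {c: section(c) for c in vectors}
sizes = sorted(len(b) for b in buckets.values())
# One subbucket per triplet of attributes inside a bucket.
subbuckets = sum(len(list(combinations(b, 3))) for b in buckets.values())
print(len(points), sizes, subbuckets, comb(len(points), 3))
```

The gap between 96 and 84 reflects triplets that lie in more than one bucket, which is the duplication the chaining between subbuckets is meant to absorb.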
Theorem (3.4.3). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a nondegenerate elliptic quadric in PG(2s+1, q), oriented toward (2s+1)-fold queries. It involves $l = \frac{q^{2s+1}-1}{q-1} - q^s$ attributes, $b = \frac{q^{2s+2}-1}{q-1}$ buckets such that $(q^{2s+1} + q^s)$ buckets have size $(q^{2s}-1)/(q-1)$ and the remaining have size $((q^{2s}-1)/(q-1)) - q^s$.

When $q = 2$ and $s = 1$, there are five buckets which have size one. These buckets are redundant, when only the queries of size three are considered. So the number of buckets will be less than the value specified by the theorem. This will not arise for higher values of $q$ and $s$.

Example (3.4.3). Let $q = 2$, $s = 1$.

$Q_3$: $x_0^2 + x_0x_1 + x_1^2 + x_2x_3 = 0$.

It has $l = 5$ and $b = 15$.

A1 = 0001,   A2 = 0010,   A3 = 0111,   A4 = 1011,   A5 = 1111.

The buckets and their corresponding attributes are:

Bucket no.   Identification   Attributes
B1           1000             A1 A2 A3
B2           0100             A1 A2 A4
B3           1100             A1 A2 A5
B4           0011             A3 A4 A5
B5           1010             A1 A4 A5
B6           1001             A2 A4 A5
B7           0110             A1 A3 A5
B8           0101             A2 A3 A5
B9           1110             A1 A3 A4
B10          1101             A2 A3 A4
.............................................
B11          0010             A1
B12          0001             A2
B13          1011             A3
B14          0111             A4
B15          1111             A5

In fact we can delete all those buckets whose size is less than the query size. The above filing scheme is optimum, in the sense that each triplet occurs exactly once.
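The optimality claim, that every triplet of attributes lies in exactly one retained bucket, can be verified mechanically for the elliptic quadric given in the example. This is a sketch only; over GF(2) the squares $x_0^2$ and $x_1^2$ reduce to $x_0$ and $x_1$, which the code uses.

```python
from collections import Counter
from itertools import combinations, product

vectors = [v for v in product((0, 1), repeat=4) if any(v)]
# x0^2 + x0*x1 + x1^2 + x2*x3 = 0, using x^2 = x over GF(2).
points = [v for v in vectors
          if (v[0] + v[0] * v[1] + v[1] + v[2] * v[3]) % 2 == 0]

buckets = [tuple(p for p in points if sum(a * b for a, b in zip(c, p)) % 2 == 0)
           for c in vectors]
kept = [b for b in buckets if len(b) >= 3]      # drop buckets below query size
cover = Counter(t for b in kept for t in combinations(b, 3))
print(len(points), len(kept), set(cover.values()))
```

Ten buckets survive the deletion, and each of the ten possible triplets is covered once.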
Theorem (3.4.4). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a nondegenerate hyperbolic quadric in PG(2s+1, q) which is oriented toward 2s-fold queries. It involves $l = \{(q^{2s+1}-1)/(q-1)\} + q^s$ attributes, and $b = \frac{(q^{2s+2}-1)(q^{2s+1}-1)}{(q^2-1)(q-1)}$ buckets such that

$\frac{q^{2s}(q^{s+1}-1)(q^s+1)}{2(q-1)}$ buckets have size $\{(q^{2s-1}-1)/(q-1)\} + q^{s-1}$;

$\frac{q^{2s}(q^{s+1}-1)(q^s-1)}{2(q+1)}$ have size $\{(q^{2s-1}-1)/(q-1)\} - q^{s-1}$;

$\frac{q^{s-1}(q^{2s}-1)(q^{s+1}-1)}{q-1}$ have size $(q^{2s-1}-1)/(q-1)$;

and finally the remaining buckets have size $(q^{2s-1} + q^{s+1} - q^s - 1)/(q-1)$.

Proof. The attributes are identified with the points of a nondegenerate hyperbolic quadric in PG(2s+1, q). The intersections of the quadric with (2s-1)-flat spaces are considered as buckets. So, the number of buckets is equal to the number of (2s-1)-flat spaces in PG(2s+1, q). The actual sizes of these buckets are given by theorem (3.3.15). The storage and retrieval rules can be taken as before.
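As a numerical sanity check on the bucket counts and sizes in Theorem (3.4.4), the formulas can be evaluated for small parameters. The function name `hyperbolic_2s_fold` is hypothetical, and the count of the last class is obtained here simply as what remains of $b$, which is an assumption of this sketch rather than a formula from the text.

```python
def hyperbolic_2s_fold(s, q):
    # Returns b and a list of (number of buckets, bucket size) pairs.
    b = (q**(2*s + 2) - 1) * (q**(2*s + 1) - 1) // ((q**2 - 1) * (q - 1))
    base = (q**(2*s - 1) - 1) // (q - 1)
    counts = [
        (q**(2*s) * (q**(s + 1) - 1) * (q**s + 1) // (2 * (q - 1)), base + q**(s - 1)),
        (q**(2*s) * (q**(s + 1) - 1) * (q**s - 1) // (2 * (q + 1)), base - q**(s - 1)),
        (q**(s - 1) * (q**(2*s) - 1) * (q**(s + 1) - 1) // (q - 1), base),
    ]
    remaining = b - sum(n for n, _ in counts)   # assumed: the rest of the b flats
    counts.append((remaining, (q**(2*s - 1) + q**(s + 1) - q**s - 1) // (q - 1)))
    return b, counts

print(hyperbolic_2s_fold(1, 2))
```

For $s = 1$, $q = 2$ this gives 35 buckets split as 18 of size 2, 2 of size 0, 9 of size 1 and 6 of size 3, so 11 buckets have size at most one and 24 remain.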
Example (3.4.4). Let $s = 1$ and $q = 2$. Then $l = 9$, $b = 35$ (as in example 3.4.2). We can neglect 11 buckets, as their sizes are less than or equal to one. Hence there will be only 24 buckets. These satisfy queries of size two. The buckets and their corresponding attributes are:

Bucket no.   Identification   Attributes   Subbuckets
B1           1000  0100       A3,A8        A3A8
B2           1000  0110       A6,A8        A6A8
B3           1000  0101       A3,A7        A3A7
B4           1000  0111       A6,A7        A6A7
B5           0100  1010       A4,A8        A4A8
B6           0100  1001       A3,A5        A3A5
B7           0100  1011       A4,A5        A4A5
B8           0010  0001       A1,A2        A1A2
B9           0010  1001       A2,A5        A2A5
B10          0010  0101       A1,A7        A1A7
B11          0010  1101       A5,A7        A5A7
B12          0001  1010       A2,A4        A2A4
B13          0001  0110       A1,A6        A1A6
B14          0001  1110       A4,A6        A4A6
B15          1100  1010       A8,A9        A8A9
B16          1100  1001       A3,A9        A3A9
B17          0110  0101       A1,A9        A1A9
B18          1010  1001       A2,A9        A2A9
B19          1000  0010       A2,A7,A8     A2A7; A2A8; A7A8
B20          1000  0001       A2,A3,A6     A2A3; A2A6; A3A6
B21          0100  0010       A1,A5,A8     A1A5; A1A8; A5A8
B22          0100  0001       A1,A3,A4     A1A3; A1A4; A3A4
B23          0110  1001       A5,A6,A9     A5A6; A5A9; A6A9
B24          1010  0101       A4,A7,A9     A4A9; A4A7; A7A9

The total number of subbuckets is 36 and the total number of queries is 36. Hence corresponding to each query there is exactly one subbucket.
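The figures of this example (35 lines, 24 retained buckets, 36 subbuckets) follow from intersecting every line of PG(3,2) with the quadric of example (3.4.2). A short sketch, with `add` as a hypothetical helper for vector addition over GF(2):

```python
from collections import Counter
from itertools import combinations, product

vectors = [v for v in product((0, 1), repeat=4) if any(v)]
quadric = {v for v in vectors if (v[0] * v[1] + v[2] * v[3]) % 2 == 0}

def add(a, b):
    # Componentwise sum over GF(2); a, b and a+b are the three points of a line.
    return tuple((x + y) % 2 for x, y in zip(a, b))

lines = {frozenset((a, b, add(a, b))) for a, b in combinations(vectors, 2)}
sizes = Counter(len(line & quadric) for line in lines)
kept = [line & quadric for line in lines if len(line & quadric) >= 2]
subbuckets = sum(len(list(combinations(sorted(b), 2))) for b in kept)
print(len(lines), dict(sizes), len(kept), subbuckets)
```

Since every pair of attributes spans exactly one line, each of the $\binom{9}{2} = 36$ queries of size two falls in exactly one subbucket.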
The storage scheme can be further simplified by forming super-buckets. For example, we can group all the buckets with "1000" as the first four symbols of their identification in one super-bucket. Also "1000" can be taken as its identification. So, we have

Super-bucket   Bucket   Subbucket Id.         Record accession nos.
1000           0100     A3A8                  ....
               0110     A6A8                  ....
               0101     A3A7                  ....
               0111     A6A7                  ....
               0010     A2A7; A2A8; A7A8      ....
               0001     A2A6; A2A3; A3A6      ....
0100           1010     A4A8                  ....
               1001     A3A5                  ....
               1011     A4A5                  ....
               0010     A1A5; A1A8; A5A8      ....
               0001     A1A3; A1A4; A3A4      ....
0010           0001     A1A2                  ....
               1001     A2A5                  ....
               0101     A1A7                  ....
               1101     A5A7                  ....
0001           1010     A2A4                  ....
               0110     A1A6                  ....
               1110     A4A6                  ....
1100           1010     A8A9                  ....
               1001     A3A9                  ....
1010           1001     A2A9                  ....
               0101     A4A9; A4A7; A7A9      ....
0110           0101     A1A9                  ....
               1001     A5A6; A5A9; A6A9      ....
Theorem (3.4.5). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a nondegenerate elliptic quadric in PG(2s+1, q) which is oriented toward 2s-fold queries. It involves $l = \frac{q^{2s+1}-1}{q-1} - q^s$ attributes.

Proof. The proof of this result is similar to that of theorem (3.4.4). The bucket sizes are given by theorem (3.3.16).

Theorem (3.4.6). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a nondegenerate quadric in PG(2s, q) which is oriented toward (2s-1)-fold queries. It involves $l = (q^{2s}-1)/(q-1)$ attributes, and $b = \frac{(q^{2s+1}-1)(q^{2s}-1)}{(q^2-1)(q-1)}$ buckets. The bucket sizes are obtained from theorem (3.3.17).
Theorem (3.4.7). Given that the retrieval pertains to only one level of each attribute, there exists a filing scheme based on a nondegenerate quadric in PG(2s, q) which is oriented toward 2s-fold queries. It involves $l = q^{2s}$ attributes and $b = (q^{2s+1}-1)/(q-1)$ buckets. Of these $b$ buckets, $(q^{2s}-1)/(q-1)$ have size $q^{2s-1}$; $(q^{2s}+q^s)/2$ have size $(q^{2s-1} - q^{s-1})$; and $(q^{2s}-q^s)/2$ have size $(q^{2s-1} + q^{s-1})$.

Proof. Consider a non-degenerate quadric $Q_{2s}$ in PG(2s, q). Let $\Sigma_{2s}$ denote the whole space, PG(2s, q). Define,

$R_{2s} = \Sigma_{2s} - Q_{2s}$.

$R_{2s}$ denotes the set of points not on the quadric $Q_{2s}$. $R_{2s}$ has exactly $q^{2s}$ points. The attributes are identified with the points of $R_{2s}$. The buckets are identified with the intersections of $R_{2s}$ with (2s-1)-flat spaces. Define

$R_{2s-1} = R_{2s} \cap \Sigma_{2s-1} = \Sigma_{2s-1} - (Q_{2s} \cap \Sigma_{2s-1})$.

The properties of $Q_{2s} \cap \Sigma_{2s-1}$ are given by theorem (3.2.5). Hence $R_{2s-1}$ is known. Hence the result.
Example (3.4.5). Let $s = 2$ and $q = 2$. Then $l = 2^4 = 16$. The points and the corresponding attributes are

A1  = 11000,   A6  = 10101,   A11 = 10110,
A2  = 11010,   A7  = 01101,   A12 = 00011,
A3  = 11100,   A8  = 00001,   A13 = 00101,
A4  = 10001,   A9  = 00110,   A14 = 10011,
A5  = 01001,   A10 = 01110,   A15 = 01011,
A16 = 11111.

There are 31 blocks: 15 have size 8, 10 have size 6 and the remaining 6 blocks have size 10. The same design can be obtained from EG(4,2). It has exactly 30 blocks each of size 8. The design obtained here will be more useful than the design obtained from EG(4,2), if some 4-tuples are retrieved more frequently than others.
3.5 Multiple level attributes
Consider a non-degenerate quadric $Q_N$ in PG(N,q). Let $\Pi_1, \Pi_2, \ldots, \Pi_{q+1}$ be $(q+1)$ (N-1)-flat spaces containing the same (N-2)-flat space, $\Sigma^0_{N-2}$. Define,
i = 1,2, ... ,q+l
= QN
n
I
(3.5.1)
o
(3.5.2)
N-2
also,
(3.5.3)
and
i
l,2, ••. ,q+l
i
RN-l
(3.5.4)
or,
i
(3.5.5)
RN-l
or
i
(QN
RoN-I
n
TI.)
-
(QN
n
I
1
O
(3.5.6)
)
N-2
From the results of section (3.3) and section (3.4) the cardinalities of $R^i_{N-1}$ $(i = 1,2,\ldots,q+1)$ are known. Let

$n_i = |R^i_{N-1}|$,   $i = 1,2,\ldots,q+1$.                    (3.5.7)

The $n_i$'s are not necessarily equal.
Let $\Sigma_{t-1}$ be any (t-1)-flat space in PG(N,q). Define

$R_{t-1} = \Sigma_{t-1} \cap \bigl(\bigcup_{i=1}^{q+1} R^i_{N-1}\bigr)$.                    (3.5.8)

Consider any $t$ points $P_0, P_1, \ldots, P_{t-1}$ from the sets $R^i_{N-1}$ such that no two points are contained in the same (N-1)-flat, $\Pi_i$ $(i = 1,2,\ldots,(q+1))$. There exists at least one (t-1)-flat containing these $t$ points. Let $\Sigma_{t-1}$ be one such (t-1)-flat space. Then there exists an $R_{t-1}$ which contains these $t$ points. Hence, given any $t$ points, there exists an $R_{t-1}$ containing them. That is, there exists a filing scheme for $(q+1)$ attributes with levels $n_1, n_2, \ldots, n_{q+1}$ such that it satisfies queries of size $t$. The buckets of this filing scheme are the intersections of the sets $R^i_{N-1}$ with (t-1)-flat spaces.
Example (3.5.1). Let $q = 2$, $N = 3$ and $t = 2$. The quadric has exactly 9 points;

1000   0100   0010   0001   0110   0101   1010   1001   1111.

Let

$\pi_1$: $x_0 + x_1 = 0$
$\pi_2$: $x_2 + x_3 = 0$
and
$\pi_3$: $x_0 + x_1 + x_2 + x_3 = 0$.

Further

$\pi_1 \cap Q_3$: 0010, 0001, 1111
$\pi_2 \cap Q_3$: 1000, 0100, 1111
$\pi_3 \cap Q_3$: 1010, 1001, 0110, 0101, 1111.

So

$R^1_2$: 0010, 0001
$R^2_2$: 1000, 0100
and
$R^3_2$: 1010, 1001, 0110, 0101.

So the levels of the attributes can be taken as

$n_1 = 2$,   A11 = 0010;  A12 = 0001
$n_2 = 2$,   A21 = 1000;  A22 = 0100
$n_3 = 4$,   A31 = 1010;  A32 = 1001;  A33 = 0110;  A34 = 0101.
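The level sets of this example can be recomputed from the three planes. The sketch below assumes the quadric is $x_0x_1 + x_2x_3 = 0$ (the nine listed points agree with it, but the equation is not restated in the example); `on_plane` is a hypothetical helper.

```python
from itertools import product

# Assumed quadric x0*x1 + x2*x3 = 0 in PG(3,2); consistent with the nine points listed.
quadric = [v for v in product((0, 1), repeat=4)
           if any(v) and (v[0] * v[1] + v[2] * v[3]) % 2 == 0]
planes = {1: (1, 1, 0, 0), 2: (0, 0, 1, 1), 3: (1, 1, 1, 1)}

def on_plane(c, p):
    return sum(a * b for a, b in zip(c, p)) % 2 == 0

# Quadric points common to all three planes lie on the shared line and are dropped.
common = [p for p in quadric if all(on_plane(c, p) for c in planes.values())]
levels = {i: ["".join(map(str, p)) for p in quadric
              if on_plane(c, p) and p not in common]
          for i, c in planes.items()}
pairs = sum(len(levels[i]) * len(levels[j]) for i in levels for j in levels if i < j)
print(levels, pairs)
```

The three attributes get 2, 2 and 4 levels, and there are 20 queries of size two involving levels of two different attributes.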
The buckets are formed by considering lines that intersect in at least
There are ten such lines.
two
For storage convenience the
buckets are grouped into super-buckets.
Super-buckets
(id)
1000
buckets
(id)
subbuckets
0001
A22 A
33
A A
12 22
A
12
A
34
0101
A A
22 34
A A
0110
A
0001
A
ll 34
12
A
--
accession nos.
A A
n 22
A A
ll 33
0010
0100
i.e.
A
0010
A
A
33
A
n 2l
n A 3l
A
2l 3l
A
12 2l
A12 A32
A A
2l 32
0100
0010
0001
1010
A
1001
A
n A32
1001
A
A
0101
A
1010
A22 A3l
A A
2l
A
12 3l
22 32
A
2l 34
0110
33
There are exactly five super-buckets and at most four buckets in any super-bucket. This arrangement greatly reduces the retrieval time (in terms of matching). A record is stored in a bucket exactly once. This requires chaining between subbuckets. The record is stored in the first subbucket that comes across in a bucket.

Suppose we are interested in retrieving all records containing the levels A21 and A32. The corresponding points of these levels in PG(3,2) are

A21 = 1000,   A32 = 1001.

The line satisfying these points is

$c_0x_0 + c_1x_1 + c_2x_2 + c_3x_3 = 0$
$d_0x_0 + d_1x_1 + d_2x_2 + d_3x_3 = 0$

or

$c_0 = c_3 = 0$,   $d_0 = d_3 = 0$.

So the equations of the line are

$x_1 = 0$,   $x_2 = 0$.

The super-bucket identification number is 0100; and that of the bucket is 0010.
The retrieval time for any query involves three components, $T_1$, $T_2$ and $T_3$, where $T_1$ is the time required to solve the linear equations; $T_2$ is the time required to locate the super-bucket, bucket and the subbucket by matching the identification numbers; and $T_3$ is the time required to retrieve the actual records. $T_1$ and $T_3$ are dependent, mainly, on the computer system. $T_2$ depends on the distribution of records. Let $T_1$ and $T_3$ be the maximum possible values of these components for any query. Suppose the records are uniformly distributed. Then the average number of comparisons required is

$(1\times1\times1 + 1\times1\times2 + \ldots + 4\times2\times1)/20 = 4.8$.

So the average retrieval time is $T_1 + 4.8\,\varepsilon + T_3$, where $\varepsilon$ is the machine time required for each single comparison. When we use the simple inverted filing scheme, we need on the average 10.5 comparisons.
Example (3.5.2). Let $N = 4$, $q = 2$, $l = 3$. The quadric has exactly 15 points. The three 3-flat spaces can be taken as

$\Pi_1$: $x_0 = 0$;   $\Pi_2$: $x_1 = 0$;   $\Pi_3$: $x_0 + x_1 = 0$.

Further,

$R^1_3$: 01000; 01100; 01010; 01111
$R^2_3$: 10000; 10100; 10010; 10111
$R^3_3$: 11001; 11101; 11011; 11110

So, we have

$n_1 = 4$;   A11 = 01000;  A12 = 01100;  A13 = 01010;  A14 = 01111
$n_2 = 4$;   A21 = 10000;  A22 = 10100;  A23 = 10010;  A24 = 10111
$n_3 = 4$;   A31 = 11001;  A32 = 11101;  A33 = 11011;  A34 = 11110

The buckets can be identified with 1-flat spaces which intersect at least two of the sets $R^i_3$. The storage and retrieval rules can be taken as in the previous example (3.5.1).
CHAPTER IV
A GENERALIZED FILING SCHEME FOR
MULTIPLE LEVEL ATTRIBUTES
In this chapter we obtain filing schemes for multiple valued
attributes, using linear representation of the attributes.
4.1 Linear representation of attributes

Let $A_1, A_2, A_3, \ldots, A_l$ be $l$ attributes with $n_1, n_2, n_3, \ldots, n_l$ levels respectively. We shall assume, throughout this chapter, that

$n_i = q^{r_i}$,   $i = 1,2,\ldots,l$,

where $q$ is a prime power and $r_i$ is some positive integer. Since $n_i = q^{r_i}$, we can represent the $n_i$ levels of the i-th attribute $A_i$ by $r_i$-tuples over GF(q); i.e.

$A_{ij} = (V^i_{j1}, V^i_{j2}, \ldots, V^i_{jr_i})$,   $V^i_{jm} \in GF(q)$,

where $A_{ij}$ is the j-th level of the i-th attribute and $V^i_j$ is the q-ary representation of the number $j$. The above representation is unambiguous and unique. As an example, when $q = 2$, $l = 2$, $n_1 = 4$, $n_2 = 8$, we have $r_1 = 2$, $r_2 = 3$. Further

A10 = (00),  A11 = (01),  A12 = (10),  A13 = (11),

and

A20 = (000),  A21 = (001),  A22 = (010),  A23 = (011),
A24 = (100),  A25 = (101),  A26 = (110),  A27 = (111).

From now on, in this chapter we use mainly the q-ary representation of the levels of the attributes.
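The q-ary representation of a level can be illustrated with a few lines of code. This sketch treats $q$ as a prime so that the digits 0, ..., q-1 can stand for the field elements; for a general prime power an encoding of GF(q) would be needed, which is left out here. The function name `level_vector` is hypothetical.

```python
def level_vector(j, q, r):
    # q-ary digits of the level number j, most significant first, padded to length r.
    if not 0 <= j < q ** r:
        raise ValueError("level index out of range")
    digits = []
    for _ in range(r):
        j, d = divmod(j, q)
        digits.append(d)
    return tuple(reversed(digits))

# Levels of the two attributes in the example above (q = 2, r1 = 2, r2 = 3).
print([level_vector(j, 2, 2) for j in range(4)])
print([level_vector(j, 2, 3) for j in range(8)])
```

With $q = 2$ the call `level_vector(2, 2, 2)` gives (1, 0), matching A12 = (10).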
Let

$N = \sum_{i=1}^{l} r_i$

and

$p = \max\,(r_{i_1} + r_{i_2} + \ldots + r_{i_t})$

where the maximum is taken over all possible t-tuples $(i_1, \ldots, i_t)$. There are exactly $\binom{l}{t}$ such t-tuples.
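Let

$H = \begin{pmatrix} h_{11} & h_{12} & \cdots & h_{1m} \\ h_{21} & h_{22} & \cdots & h_{2m} \\ \vdots & & & \vdots \\ h_{N1} & h_{N2} & \cdots & h_{Nm} \end{pmatrix}$                    (4.1.1)

be an $(N \times m)$ matrix with elements of GF(q). Also, the rank of $H$ is $m$. Let

$r_{j\cdot} = \sum_{k=0}^{j-1} r_k$,   where $r_0 = 0$.

Let

$H_j = \begin{pmatrix} h_{r_{j\cdot}+1,1} & h_{r_{j\cdot}+1,2} & \cdots & h_{r_{j\cdot}+1,m} \\ h_{r_{j\cdot}+2,1} & h_{r_{j\cdot}+2,2} & \cdots & h_{r_{j\cdot}+2,m} \\ \vdots & & & \vdots \\ h_{r_{j\cdot}+r_j,1} & h_{r_{j\cdot}+r_j,2} & \cdots & h_{r_{j\cdot}+r_j,m} \end{pmatrix}$                    (4.1.2)

for $j = 1,2,\ldots,l$. That is, $H_1, H_2, \ldots, H_l$ are submatrices of $H$. So we can write,

$H = \begin{pmatrix} H_1 \\ H_2 \\ \vdots \\ H_l \end{pmatrix}$                    (4.1.3)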
Definition (4.1.1). The matrix $H$ is said to have the property $R_t(r_1, r_2, \ldots, r_l)$ if for any $t$ of the submatrices $H_{i_1}, H_{i_2}, \ldots, H_{i_t}$ of $H$ the $\bigl[\sum_{j=1}^{t} r_{i_j}\bigr] \times m$ matrix

$G = \begin{pmatrix} H_{i_1} \\ H_{i_2} \\ \vdots \\ H_{i_t} \end{pmatrix}$                    (4.1.4)

has rank $\sum_{j=1}^{t} r_{i_j}$.

If $H$ has the property $R_t(r_1, \ldots, r_l)$ then $m \ge p$. We shall use the notation $R_t(l \times r)$ for $R_t(r_1, \ldots, r_l)$ when $r_1 = r_2 = \ldots = r_l = r$. If $r = 1$, then the property $R_t(l \times 1)$ is called the $P_t$ property. The matrices with property $P_t$ were first considered by Bose (1947) in connection with the problem of confounding symmetrical factorial designs. Later these matrices were used in constructing error correcting codes. (Bose and Ray-Chaudhuri [10] & [11] and Peterson [26].) In later parts of this chapter, a method for obtaining matrices with property $R_t(r_1, r_2, \ldots, r_l)$ will be described.

The i-th attribute $A_i$ $(i = 1,2,\ldots,l)$ will be represented by a set of linear forms

$L^i = H_i\, x^T$,   with $L^i$ of order $(r_i \times 1)$, $H_i$ of order $(r_i \times m)$ and $x^T$ of order $(m \times 1)$,                    (4.1.5)

or in full

$h_{r_{i\cdot}+1,1}\,x_1 + h_{r_{i\cdot}+1,2}\,x_2 + \ldots + h_{r_{i\cdot}+1,m}\,x_m = L^i_1$
$h_{r_{i\cdot}+2,1}\,x_1 + h_{r_{i\cdot}+2,2}\,x_2 + \ldots + h_{r_{i\cdot}+2,m}\,x_m = L^i_2$
$\ldots$
$h_{r_{i\cdot}+r_i,1}\,x_1 + h_{r_{i\cdot}+r_i,2}\,x_2 + \ldots + h_{r_{i\cdot}+r_i,m}\,x_m = L^i_{r_i}$                    (4.1.6)

where $H_i$ is the i-th submatrix of $H$ in (4.1.3) with property $R_t(r_1, r_2, \ldots, r_l)$.
To any query,

$Q = \begin{pmatrix} A_{i_1} & A_{i_2} & \cdots & A_{i_g} \\ V^{i_1}_{j_1} & V^{i_2}_{j_2} & \cdots & V^{i_g}_{j_g} \end{pmatrix}$                    (4.1.7)

where $g \le t$, we shall associate the set of equations,

$L^{i_w} = V^{i_w}_{j_w}$,   $w = 1,2,\ldots,g$.

That is

$h_{r_{i_w\cdot}+1,1}\,x_1 + h_{r_{i_w\cdot}+1,2}\,x_2 + \ldots + h_{r_{i_w\cdot}+1,m}\,x_m = V^{i_w}_{j_w 1}$
$h_{r_{i_w\cdot}+2,1}\,x_1 + h_{r_{i_w\cdot}+2,2}\,x_2 + \ldots + h_{r_{i_w\cdot}+2,m}\,x_m = V^{i_w}_{j_w 2}$
$\ldots$                    (4.1.8)

for $w = 1,2,\ldots,g$, or

$H_{i_1} x^T = V^{i_1}_{j_1}$,   $H_{i_2} x^T = V^{i_2}_{j_2}$,   $\ldots$,   $H_{i_g} x^T = V^{i_g}_{j_g}$.                    (4.1.9)

Let

$G = \begin{pmatrix} H_{i_1} \\ H_{i_2} \\ \vdots \\ H_{i_g} \end{pmatrix}$,   $V = \begin{pmatrix} V^{i_1}_{j_1} \\ V^{i_2}_{j_2} \\ \vdots \\ V^{i_g}_{j_g} \end{pmatrix}$.                    (4.1.10)

So, we can rewrite (4.1.9) as

$G\,x^T = V$                    (4.1.11)

where $G$ is a $\bigl(\sum_{j=1}^{g} r_{i_j}\bigr) \times m$ matrix and $V$ is a $\bigl(\sum_{j=1}^{g} r_{i_j}\bigr) \times 1$ column.

rank $G = \sum_{j=1}^{g} r_{i_j} = r$ (say)                    (4.1.12)

as $H$ has property $R_t(r_1, \ldots, r_l)$ and $g \le t$. There exists at least one $(r \times r)$ submatrix in $G$ with rank $r$. Let $a_1, a_2, \ldots, a_r$ be the columns in this submatrix. In the equation (4.1.8) we can take the values of all the variables other than $x_{a_1}, x_{a_2}, \ldots, x_{a_r}$ equal to zero and solve the resulting equations for $x_{a_i}$, $i = 1,2,\ldots,r$. Let $x_{a_i} = u_{a_i}$. We then get a solution $u$ of (4.1.8), in which all the coordinates other than the $a_1$-th, $a_2$-th, ..., $a_r$-th are equal to zero.

Let $S_0$ be the set of m-vectors over GF(q) for which not more than $p$ of the coordinates are non-null. The number of vectors in $S_0$ is

$\sum_{k=0}^{p} \binom{m}{k} (q-1)^k$.                    (4.1.13)

We have shown that there exists at least one solution for (4.1.8) in $S_0$, as $r \le p$. By using the "Echelon method - canonical solution" we can find a unique solution in $S_0$. So, without loss of generality, we take $u$ to be the unique solution for (4.1.8). For any m-vector $u$, the linear forms $L^i$, given by (4.1.6), may be said to attain the value

$L^i(u) = H_i\,u^T$                    (4.1.14)

at $u$. Further,

$L^{i_w}(u) = V^{i_w}_{j_w}$,   $w = 1,2,\ldots,g$.

So, we have,
Theorem (4.1.1). To any query,

$Q = \begin{pmatrix} A_{i_1} & A_{i_2} & \cdots & A_{i_g} \\ V^{i_1}_{j_1} & V^{i_2}_{j_2} & \cdots & V^{i_g}_{j_g} \end{pmatrix}$

we can make correspond a unique m-vector $u$ of $S_0$, namely the canonical solution of the equations associated to the query, at which the linear forms $L^{i_1}, L^{i_2}, \ldots, L^{i_g}$ attain the values $V^{i_1}_{j_1}, V^{i_2}_{j_2}, \ldots, V^{i_g}_{j_g}$.

The number of queries of size less than or equal to $t$ is

$\sum_{k=1}^{t} \sum q^{\,r_{i_1} + r_{i_2} + \ldots + r_{i_k}}$                    (4.1.15)

where the inner $\sum$ denotes summation over all possible k-tuples. In general, the same vector $u$ may correspond to more than one query, though it is possible that there are vectors in $S_0$ not corresponding to any query. Let $S$ be the subset of $S_0$, such that to any vector of $S$ there corresponds at least one query. Let $b = |S|$.
Example (4.1.1). Let $q = 2$; $t = 2$; $l = 5$; $r_i = 2$ for all $i$. So, $n_i = 2^2$, $i = 1,2,3,4,5$.

$H_{(10 \times 4)} = \begin{pmatrix} 0001 \\ 0010 \\ 0100 \\ 1000 \\ 0110 \\ 1011 \\ 0101 \\ 1010 \\ 0111 \\ 1110 \end{pmatrix}$                    (4.1.16)

$H_1 = \begin{pmatrix} 0001 \\ 0010 \end{pmatrix}$,  $H_2 = \begin{pmatrix} 0100 \\ 1000 \end{pmatrix}$,  $H_3 = \begin{pmatrix} 0110 \\ 1011 \end{pmatrix}$,  $H_4 = \begin{pmatrix} 0101 \\ 1010 \end{pmatrix}$,  $H_5 = \begin{pmatrix} 0111 \\ 1110 \end{pmatrix}$.

The matrix $H$ has the property $R_2(5 \times 2)$, but it does not have the property $P_4$, i.e. not every set of four rows of $H$ is independent.

The linear forms and their corresponding attributes are,

attribute   linear form
1           $L^1_1 = x_4$
            $L^1_2 = x_3$
2           $L^2_1 = x_2$
            $L^2_2 = x_1$
3           $L^3_1 = x_2 + x_3$
            $L^3_2 = x_1 + x_3 + x_4$
4           $L^4_1 = x_2 + x_4$
            $L^4_2 = x_1 + x_3$
5           $L^5_1 = x_2 + x_3 + x_4$
            $L^5_2 = x_1 + x_2 + x_3$

i) Consider the query of size two,

$Q = \begin{pmatrix} A_4 & A_5 \\ V^4_2 & V^5_3 \end{pmatrix}$.                    (4.1.17)

The associated equations are,

$x_2 + x_4 = V^4_{21}$
$x_1 + x_3 = V^4_{22}$
$x_2 + x_3 + x_4 = V^5_{31}$                    (4.1.18)
$x_1 + x_2 + x_3 = V^5_{32}$

Since the matrix of coefficients has rank 4, we have a unique solution

$x_1 = V^4_{21} + V^4_{22} + V^5_{31}$
$x_2 = V^4_{22} + V^5_{32}$
$x_3 = V^4_{21} + V^5_{31}$                    (4.1.19)
$x_4 = V^4_{21} + V^4_{22} + V^5_{32}$

But $V^4_2 = (01)$ and $V^5_3 = (10)$, so the actual solution for (4.1.17) is

$u = (0,1,1,1)$.

ii) Consider the query

$Q = \begin{pmatrix} A_3 \\ V^3_4 \end{pmatrix}$.

Then the corresponding equations are

$x_2 + x_3 = V^3_{41}$
$x_1 + x_3 + x_4 = V^3_{42}$

So

$x_1 + x_3 + x_4 = V^3_{42}$
$x_2 + x_3 = V^3_{41}$

Put $x_3 = x_4 = 0$. Then the solution vector is

$u = (V^3_{42}, V^3_{41}, 0, 0)$

or

$u = (1,1,0,0)$

as $V^3_4 = (1,1)$. The value of the linear forms for the solution vector (1,1,0,0) in (4.1.17) is

$L(u) = (0,0,1,1,1,1,1,1,1,0)$.
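The claims of this example can be checked numerically. The sketch below is a brute-force illustration, not the echelon method referred to in the text: it tests the rank condition $R_2(5 \times 2)$, confirms that $P_4$ fails, and lists all solutions of the two query systems over $GF(2)^4$. The helper names `rank_gf2`, `forms` and `solutions` are hypothetical.

```python
from itertools import combinations, product

H_ROWS = ["0001", "0010", "0100", "1000", "0110",
          "1011", "0101", "1010", "0111", "1110"]
rows = [tuple(int(ch) for ch in r) for r in H_ROWS]
blocks = [rows[2 * i: 2 * i + 2] for i in range(5)]   # H_1 .. H_5, two rows each

def rank_gf2(matrix):
    # Rank over GF(2): rows are reduced against an XOR basis of integers.
    basis = []
    for row in matrix:
        v = int("".join(map(str, row)), 2)
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

has_r2 = all(rank_gf2(blocks[i] + blocks[j]) == 4
             for i, j in combinations(range(5), 2))
has_p4 = all(rank_gf2(list(sub)) == 4 for sub in combinations(rows, 4))

def forms(i, x):
    # Values of the two linear forms of attribute i (1-based) at x = (x1, x2, x3, x4).
    return tuple(sum(h * xi for h, xi in zip(row, x)) % 2 for row in blocks[i - 1])

def solutions(query):
    # query maps attribute index -> required value vector; exhaustive search.
    return [x for x in product((0, 1), repeat=4)
            if all(forms(i, x) == v for i, v in query.items())]

print(has_r2, has_p4)
print(solutions({4: (0, 1), 5: (1, 0)}))   # query i)
print(solutions({3: (1, 1)}))              # query ii)
```

Query i) has the single solution (0,1,1,1). Query ii) has four solutions, of which (1,1,0,0) is the one with $x_3 = x_4 = 0$ chosen in the text; evaluating all five pairs of forms at it gives (0,0,1,1,1,1,1,1,1,0).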
4.2 Generalized multiple valued filing scheme

Let $u$ be any solution vector in $S$. To each $u$ there corresponds an N-vector

$B(u) = (b^1, b^2, \ldots, b^l)$

where $b^i$ is a $(1 \times r_i)$ vector $(i = 1,2,\ldots,l)$, over GF(q) such that

$b^i = L^i(u)$,   $i = 1,2,\ldots,l$.

$B(u)$ is called the block corresponding to $u$. Corresponding to each query there exists a vector $u$, and hence there exists a block, $B(u)$.

We shall show that the correspondence between the block $B(u)$ and the vector $u$ in $S$ is unique. Let $u_1$ and $u_2$ be any two distinct m-vectors in $S$ such that

$B(u_1) = B(u_2)$

i.e.

$L^i_j(u_1) = L^i_j(u_2)$,   for all $i$ and $j$.                    (4.2.1)

Since the rank of $H$ is $m$, we can choose $m$ rows of $H$ such that they are independent. Without loss of generality, let

$L^{i_1}_1, L^{i_1}_2, \ldots, L^{i_1}_{j_1}, \; \ldots, \; L^{i_k}_1, \ldots, L^{i_k}_{j_k}$,   where $\sum_{i=1}^{k} j_i = m$,

be a set of $m$ independent rows in $H$. Then

$L^{i_1}_1(u_1 - u_2) = 0$
$L^{i_1}_2(u_1 - u_2) = 0$
$\ldots$                    (4.2.2)

Since (4.2.2) is a set of $m$ independent homogeneous linear equations in $m$ variables, we must have

$u_1 - u_2 = 0$   or   $u_1 = u_2$.

This is a contradiction, since $u_1$ and $u_2$ are two distinct vectors. Hence, the correspondence between m-vectors of $S$ and the blocks is one-one.

We can now obtain a correspondence between queries and blocks. If, to the query $Q$ there corresponds an m-vector $u$ of $S$, then we say that $B(u)$ is the block corresponding to $Q$. Hence corresponding to any query $Q$ of size less than or equal to $t$ there exists a block, but to each block there corresponds, in general, many queries.
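Theorem (4.2.1). If the block $B(u) = (b^1, b^2, \ldots, b^l)$ corresponds to the query

$Q = \begin{pmatrix} A_{i_1} & A_{i_2} & \cdots & A_{i_g} \\ V^{i_1}_{j_1} & V^{i_2}_{j_2} & \cdots & V^{i_g}_{j_g} \end{pmatrix}$                    (4.2.3)

then

$b^{i_1} = V^{i_1}_{j_1}$
$b^{i_2} = V^{i_2}_{j_2}$                    (4.2.4)
$\ldots$
$b^{i_g} = V^{i_g}_{j_g}$

Proof. From theorem (4.1.1) there exists a unique solution $u$, an m-vector, corresponding to $Q$. Also,

$L^{i_k}(u) = V^{i_k}_{j_k}$,   $k = 1,2,\ldots,g$.                    (4.2.5)

Since,

$b^{i_k} = L^{i_k}(u)$,   $k = 1,2,\ldots,g$,

we have

$b^{i_k} = V^{i_k}_{j_k}$,   $k = 1,2,\ldots,g$.

In the example (4.1.1) when the query is

$Q = \begin{pmatrix} A_3 \\ V^3_4 \end{pmatrix}$   where $V^3_4 = (1,1)$

then

$B(u) = (0,0,1,1,1,1,1,1,1,0)$.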
Let $M$ be the set of addresses reserved for storing the accession numbers of records. $M$ can be partitioned into $b$ subsets, corresponding to the blocks $B(u)$. These subsets of $M$ are called buckets, the bucket corresponding to the block $B(u)$ being denoted by $M(u)$. The vector $u$ will be called the identification of the bucket $M(u)$.

Consider the block $B(u) = (b^1, b^2, \ldots, b^l)$ corresponding to the bucket $M(u)$, where $b^i$ is an $r_i$-vector. In fact $b^i$ represents some level of the i-th attribute. $M(u)$ is subdivided into $2^l - 1$ subsets, corresponding to the $2^l - 1$ nonempty subsets of $\{b^1, b^2, \ldots, b^l\}$. These subsets of $M(u)$ are called subbuckets. The subbucket of $M(u)$ which corresponds to the subset $A^*$ of $\{b^1, b^2, \ldots, b^l\}$ may be denoted by $M(u, A^*)$.

Let $f(I)$ be the attribute vector of the I-th record. Let

$B(u) \cap f(I) = A$                    (4.2.5)

where $A = \{\, b^i : b^i = V^i_{j_i} \,\}$. That is, $b^i \in A$ if and only if the level of the i-th attribute for the I-th record is the same as the value attained by the linear forms $L^i$ at $u$.

Our storage rule can now be stated as follows. The accession number of the I-th record is stored in the bucket $M(u)$ if $B(u) \cap f(I)$ is non-empty. If $B(u) \cap f(I) = A^*$ then the accession number of the I-th record will be stored in the subbucket $M(u, A^*)$ of $M(u)$. By this rule a record will be stored exactly once in a bucket.
We have noted earlier that the redundancy, $\zeta$, in general depends on the distribution of the attribute records. We shall calculate the value of $\zeta$ under the hypothesis of uniform distribution, and corresponding to the storage rule described in the previous paragraph.

There are $q^N$ possible distinct N-vectors over GF(q). Each is a possible attribute vector. Suppose the number of records having any given attribute vector is $\delta$, and is independent of the vector. Then the total number of records is $\delta q^N$. If $f(I)$ is the attribute vector of the I-th record, then the accession number of this record does not appear in the bucket $M(u)$ if and only if $B(u) \cap f(I)$ is empty. $B(u)$ being given there are

$(q^{r_1}-1)(q^{r_2}-1) \times \ldots \times (q^{r_l}-1)$

ways of choosing $f(I)$ such that $B(u) \cap f(I)$ is empty. Hence there are

$\delta \times \prod_{j=1}^{l} (q^{r_j}-1)$

records whose accession numbers do not occur in the bucket $M(u)$. The number of records whose accession numbers occur in the bucket is, therefore

$\delta \times \bigl[\,q^N - \prod_{j=1}^{l} (q^{r_j}-1)\bigr]$.

Since there are $b = |S|$ buckets the total number of the accession numbers stored in the storage is

$b \times \delta \times \bigl[\,q^N - \prod_{j=1}^{l} (q^{r_j}-1)\bigr]$.

Hence,

$\zeta = \frac{b \times \delta \times \bigl[\,q^N - \prod_{j=1}^{l} (q^{r_j}-1)\bigr]}{\delta \times q^N} = b\,\bigl[\,1 - \prod_{j=1}^{l} \bigl(1 - \frac{1}{q^{r_j}}\bigr)\bigr]$                    (4.2.6)

as $N = \sum_{j=1}^{l} r_j$.
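Formula (4.2.6) is easy to evaluate exactly. The sketch below uses rational arithmetic; the function name `redundancy` is hypothetical, and the value $b = 1$ in the usage line is only a placeholder giving the per-bucket factor, since the number of buckets for the running example is not stated in the text.

```python
from fractions import Fraction
from math import prod

def redundancy(b, q, r):
    # zeta = b * [1 - prod_j (1 - 1/q**r_j)], as in (4.2.6).
    return b * (1 - prod(1 - Fraction(1, q ** rj) for rj in r))

# Per-bucket factor for q = 2 and five attributes with r_i = 2 (b = 1 is a placeholder).
per_bucket = redundancy(1, 2, [2, 2, 2, 2, 2])
print(per_bucket)
```

The factor 781/1024 agrees with the direct count: of the $4^5 = 1024$ attribute vectors, $3^5 = 243$ differ from a given block in every attribute, leaving 781 that are stored in the corresponding bucket.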
Example (4.2.1). Continuing the example (4.1.1) consider the block,

$B(u) = (0,0,1,1,1,1,1,1,1,0)$.

We can rewrite as,

$B(u) = (0,3,3,3,2)$.

Also,

$B(u)$ = (A10, A23, A33, A43, A52).

So the bucket $M(u)$ corresponding to the block $B(u)$ has the identification number $u = (1,1,0,0)$. The subbuckets of the bucket are

A10; A23; A33; A43; A52;
A10A23; A10A33; A10A43; A10A52;
A23A33; A23A43; A23A52;
A33A43; A33A52;
A43A52;
A10A23A33; A10A23A43; A10A23A52;
A10A33A43; A10A33A52; A10A43A52;
A23A33A43; A23A33A52; A23A43A52;
A33A43A52;
A10A23A33A43; A10A23A33A52; A10A33A43A52;
A10A23A43A52; A23A33A43A52
Let the attribute vector corresponding to the I-th vector be
=
M(u)
So,
Hence, the accession number of the I-th record is stored in the subbucket
Let

$Q(A) = \begin{pmatrix} A_{i_1} & A_{i_2} & \cdots & A_{i_g} \\ V^{i_1}_{j_1} & V^{i_2}_{j_2} & \cdots & V^{i_g}_{j_g} \end{pmatrix}$,   $g \le t$,

be any query of size less than or equal to $t$. Let $u$ be the unique solution of (4.1.8) corresponding to the query $Q(A)$, and let

$B(u) = (b^1, b^2, \ldots, b^l)$

be the corresponding block. Then the bucket $M(u)$ corresponding to $B(u)$ is defined to be the bucket corresponding to the query. From theorem (4.2.1)

$b^{i_w} = V^{i_w}_{j_w}$,   $w = 1,2,\ldots,g$.

If the I-th record satisfies the query, then the $i_1$-th, $i_2$-th, ..., $i_g$-th components of the attribute vector $f(I)$ of $I$ are $V^{i_1}_{j_1}, V^{i_2}_{j_2}, \ldots, V^{i_g}_{j_g}$. That is,

$B(u) \cap f(I) \supset Q(A)$.                    (4.2.7)

Hence, there exists a subset $A^*$ in $M(u)$ that contains the accession number of the I-th record, where

$B(u) \cap f(I) = A^*$.

Conversely let $A^*$ be any subset such that $A^* \supset Q(A)$, the query. It is clear from our storage rule that all accession numbers of records that are stored in the subbucket $M(u, A^*)$ also satisfy the query $Q(A)$. Hence to find all accession numbers of records satisfying the query we have to search all the subbuckets $M(u, A^*)$, where $A^* \supset Q(A)$. That is

$M(Q) = \bigcup M(u, A^*)$                    (4.2.8)

where the union is taken over all subsets $A^*$ that contain the query set $Q(A)$. There are exactly $2^{l-g}$ subsets $A^*$ that contain $Q(A)$.

So, the retrieval rule may be stated as follows: let $Q$ be any query relating to the attributes of the subset $Q(A)$ of $A$. Determine the bucket $B(u)$ corresponding to $Q$. Then the accession numbers of all records satisfying $Q$ are found in $M(Q)$ given by (4.2.8). Conversely any record whose accession number is in $M(Q)$ satisfies the query $Q(A)$. Once the accession numbers have been found we can retrieve the complete records.
Example (4.2.2). Continuing the example (4.1.1), consider the query (A10, A23). From (4.1.17) we have

$u = (1,1,0,0)$.

So,

$B(u) = (0,0,1,1,1,1,1,1,1,0)$

or

$B(u) = (0,3,3,3,2)$.

So there are $2^{5-2} = 8$ subsets containing (A10, A23) in $B(u)$. These eight subbuckets are located by searching within the bucket $M(u)$.
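The eight subbuckets that have to be searched are simply the subsets of the block that contain the query set. A minimal sketch of that enumeration follows; the labels are those of example (4.2.1) and the variable names are hypothetical.

```python
from itertools import combinations

block = ("A10", "A23", "A33", "A43", "A52")   # levels making up B(u)
query = ("A10", "A23")                        # levels fixed by the query

free = [level for level in block if level not in query]
# Every subset A* of the block that contains the query identifies one subbucket M(u, A*).
to_search = [query + extra
             for k in range(len(free) + 1)
             for extra in combinations(free, k)]
for labels in to_search:
    print("".join(labels))
```

The list has $2^{5-2} = 8$ entries, from A10A23 itself up to the full block A10A23A33A43A52.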
The time required for retrieving the records satisfying a given query will be made up of a number of components:

$T_1$: time required for coding physical attributes into linear forms and the values of the attributes into vectors over GF(q).

$T_2$: time required for reducing the query equations into Echelon form and determining the canonical solution $u$, which gives the bucket identification number.

$T_3$: time needed for identifying the bucket $M(u)$.

$T_4$: time required for computing the labels of the subbuckets entering in $M(Q)$.

$T_5$: time required for identifying the subbuckets.

$T_6$: time required for actual retrieval of the records.

Thus the total time required for a query $Q(A)$ is

$T_1 + T_2 + T_3 + T_4 + T_5 + T_6$.

The components $T_1$, $T_2$, $T_4$ and $T_6$ will depend on the parameters of the computer system used. By using an ordering among bucket labels and among subbuckets within a bucket, the components $T_3$ and $T_5$ can be reduced.

The filing scheme for the case $n_1 = n_2 = \ldots = n_l$ was obtained by Bose [7]. In his method the levels are not represented by vectors.

We shall now describe a method of obtaining $(N \times m)$ matrices with elements from GF(q), which have rank $m$ and property $R_t(r_1, r_2, \ldots, r_l)$. Before describing the method we shall study briefly some properties of spreads.
4.3 Correspondence between Galois fields, vector spaces and projective spaces
Let $q$ be a prime power. Let $(N+1) = \theta(m+1)$ for some positive integers $m$, $N$, $\theta\,(>1)$. Consider the Galois field $GF(q^{N+1})$. It has exactly $q^{N+1}$ elements. Let $\alpha$ be a primitive root of $GF(q^{N+1})$. Then
$$\alpha^{q^{N+1}-1} = 1. \qquad (4.3.1)$$
All the elements of $GF(q^{N+1})$ can be expressed in the form
$$b_0 + b_1\alpha + b_2\alpha^2 + \cdots + b_N\alpha^N,$$
where $b_i \in GF(q)$ for all $i$. Similarly any element of $GF(q^{m+1})$ can be written as
$$c_0 + c_1\eta + c_2\eta^2 + \cdots + c_m\eta^m,$$
where $c_i \in GF(q)$ for all $i$ and $\eta = \alpha^{(q^{N+1}-1)/(q^{m+1}-1)}$ is a primitive element of $GF(q^{m+1})$. Since the representation of elements of a Galois field with respect to a primitive element is unique, the correspondence
$$b_0 + b_1\alpha + \cdots + b_N\alpha^N \leftrightarrow (b_0, b_1, \dots, b_N) \qquad (4.3.2)$$
between the elements of $GF(q^{N+1})$ and $(N+1)$-vectors over $GF(q)$ is one-one. Similarly we can obtain one-one correspondence between the elements of $GF(q^{m+1})$ and $(m+1)$-vectors over $GF(q)$:
$$c_0 + c_1\eta + \cdots + c_m\eta^m \leftrightarrow (c_0, c_1, \dots, c_m). \qquad (4.3.3)$$
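The Galois field $GF(q^{N+1})$ can be viewed as an extension of $GF(q^{m+1})$, as $(N+1) = \theta(m+1)$. So the element of $GF(q^{N+1})$ can be uniquely expressed as
$$a_0 + a_1\sigma + a_2\sigma^2 + \cdots + a_{\theta-1}\sigma^{\theta-1}, \qquad (4.3.4)$$
where $a_i$ belong to $GF(q^{m+1})$ and $\sigma$ is a primitive element of $GF(q^{N+1})$. Now the correspondence,
$$a_0 + a_1\sigma + \cdots + a_{\theta-1}\sigma^{\theta-1} \leftrightarrow (a_0, a_1, \dots, a_{\theta-1}) \qquad (4.3.5)$$
between the elements of $GF(q^{N+1})$ and $\theta$-vectors over $GF(q^{m+1})$ is one-one. By (4.3.3) each $a_i$ can be expressed as an $(m+1)$-vector over $GF(q)$. So we can correspond $\theta$-vectors of (4.3.5) and $(N+1)$-vectors of (4.3.2). That is
$$(a_0, a_1, \dots, a_{\theta-1}) \leftrightarrow (\underline{a}_0, \underline{a}_1, \dots, \underline{a}_{\theta-1}), \qquad (4.3.6)$$
where $\underline{a}_i$ is the $(m+1)$-vector representation of $a_i$ over $GF(q)$.
Conversely an $(N+1)$-vector can be made to correspond to a $\theta$-vector as follows;
$$(b_0, b_1, \dots, b_N) \leftrightarrow (\underline{a}_0, \underline{a}_1, \dots, \underline{a}_{\theta-1}), \qquad (4.3.7)$$
where $\underline{a}_0 = (b_0, b_1, \dots, b_m)$, $\underline{a}_1 = (b_{m+1}, b_{m+2}, \dots, b_{2m+1})$, and so on. That is, there is a one-one correspondence between the $\theta$-vectors over $GF(q^{m+1})$ and the $(N+1)$-vectors over $GF(q)$.
Let $V_{N+1}$ be the $(N+1)$-dimensional vector space over $GF(q)$. Let $\underline{b} = (b_0, b_1, \dots, b_N)$ be any non-zero $(N+1)$-vector in $V_{N+1}$. Then $\underline{b}$ represents a point $P$ in $PG(N,q)$. Also the $(N+1)$-vectors $(cb_0, cb_1, \dots, cb_N)$, where $c$ is any non-zero element in $GF(q)$, represent the same point $P$. Hence there is a one-one correspondence between the points of $PG(N,q)$ and the one dimensional subspaces of $V_{N+1}$. A basis of the one dimensional subspace is taken as a representative of the point $P$, as a subspace is completely specified by a basis.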
The null vector $(0,0,\dots,0)$ may be regarded as corresponding to the $(-1)$-flat (the empty space) of $PG(N,q)$, where the empty space is a subspace of every $k$-flat of $PG(N,q)$, $k = -1, 0, 1, \dots, N$. There is a one-one correspondence between $k$-flats of $PG(N,q)$ and $(k+1)$-dimensional subspaces of $V_{N+1}$.
Lemma (4.3.1). There is a one-one correspondence between $k$-flat spaces of $PG(N,q)$ and $(k+1)$-dimensional subspaces of $V_{N+1}$, the $(N+1)$-dimensional space over $GF(q)$, where $0 \le k \le N$.
Let $\underline{s} = (s_1, s_2, \dots, s_\theta)$ be any $\theta$-vector in $\theta$-dimensional vector space $V_\theta$ over $GF(q^{m+1})$. Let
$$W_\theta(\underline{s}) = \{c\,\underline{s} : c \in GF(q^{m+1})\}. \qquad (4.3.8)$$
Then,
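Lemma (4.3.2)†. The $(N+1)$-vectors of $V_{N+1}$ over $GF(q)$ corresponding to $q^{m+1}$ $\theta$-vectors of $W_\theta(\underline{s})$ over $GF(q^{m+1})$ are the elements of an $(m+1)$-dimensional subspace of $V_{N+1}$. The null element of the subspace corresponds to the null element of $W_\theta(\underline{s})$.
Proof. Let $u_0, u_1, \dots, u_{m+1}$ be any $(m+2)$ non-zero elements of $GF(q^{m+1})$. Since each $u_i$ can be uniquely expressed as an $(m+1)$-vector over $GF(q)$, there exist $c_0, c_1, \dots, c_{m+1}$ in $GF(q)$, not all zero, such that
$$\sum_{i=0}^{m+1} c_i u_i = 0. \qquad (4.3.9)$$
Hence
$$\sum_{i=0}^{m+1} c_i u_i\,\underline{s} = \underline{0}. \qquad (4.3.10)$$
This implies that the $(N+1)$-vectors corresponding to $u_0\underline{s}, u_1\underline{s}, \dots, u_{m+1}\underline{s}$ are dependent. (The zero vector of $V_\theta$ corresponds to the zero vector of $V_{N+1}$.) Hence there are at most $(m+1)$ independent $(N+1)$-vectors corresponding to $W_\theta(\underline{s})$.
Consider the set of $(N+1)$-vectors corresponding to the $\theta$-vectors $\underline{s}, \eta\underline{s}, \dots, \eta^m\underline{s}$ of $W_\theta(\underline{s})$, where $\eta$ is a primitive element of $GF(q^{m+1})$. If these $(N+1)$-vectors are dependent, then there exist $c_0, c_1, \dots, c_m$ in $GF(q)$, not all zero, such that
$$\sum_{i=0}^{m} c_i \eta^i\,\underline{s} = \underline{0}. \qquad (4.3.11)$$
This implies that $\eta$ satisfies a polynomial of degree less than $(m+1)$. But this is impossible. Hence the $q^{m+1}$ $(N+1)$-vectors of $V_{N+1}$, corresponding to $q^{m+1}$ vectors of $W_\theta(\underline{s})$, all belong to the same $(m+1)$-dimensional subspace of $V_{N+1}$.
† Heft [24].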
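Lemma (4.3.3)†. The $(N+1)$-vectors of $V_{N+1}$ over $GF(q)$ corresponding to all the $\theta$-vectors of $V_\theta$ over $GF(q^{m+1})$ belonging to the $k$-dimensional subspace of $V_\theta$, are the elements of a $k(m+1)$-dimensional subspace of $V_{N+1}$, $1 \le k \le \theta$. Also if $\underline{s}_1, \underline{s}_2, \dots, \underline{s}_k$ are $k$ independent $\theta$-vectors of $V_\theta$ then the $(N+1)$-vectors corresponding to $\eta^i\underline{s}_j$, $i = 0,1,\dots,m$; $j = 1,2,\dots,k$, are $k(m+1)$ independent $(N+1)$-vectors of $V_{N+1}$, where $\eta$ is a primitive element of $GF(q^{m+1})$.
Proof. Suppose the $(N+1)$-vectors of $V_{N+1}$ corresponding to $\underline{s}_1, \eta\underline{s}_1, \eta^2\underline{s}_1, \dots, \eta^m\underline{s}_k$ are dependent. Then, without loss of generality, suppose the $(N+1)$-vector corresponding to $\eta^e\underline{s}_h$ is dependent on the remaining $\{k(m+1)-1\}$ $(N+1)$-vectors corresponding to $\eta^i\underline{s}_j$, $(i,j) \ne (e,h)$. Hence there exist $a_{ij}$ in $GF(q)$, not all zero, such that
$$\eta^e\underline{s}_h = \sum_{\substack{i=0,\dots,m;\ j=1,\dots,k \\ (i,j)\ne(e,h)}} a_{ij}\,\eta^i\underline{s}_j \qquad (4.3.12)$$
or,
$$\Bigl(\eta^e - \sum_{i \ne e} a_{ih}\eta^i\Bigr)\underline{s}_h - \sum_{j \ne h}\Bigl(\sum_{i=0}^{m} a_{ij}\eta^i\Bigr)\underline{s}_j = \underline{0}. \qquad (4.3.13)$$
But $\underline{s}_1, \underline{s}_2, \dots, \underline{s}_k$ are independent. So we have
$$\eta^e - \sum_{i \ne e} a_{ih}\eta^i = 0. \qquad (4.3.14)$$
That is, $\eta$ satisfies a non-zero polynomial of degree less than $(m+1)$, which is impossible. Hence the $k(m+1)$ $(N+1)$-vectors corresponding to $\eta^i\underline{s}_j$, $i = 0,1,2,\dots,m$; $j = 1,2,\dots,k$ are independent.
Let $W_k$ be a $k$-dimensional subspace in $V_\theta$ over $GF(q^{m+1})$. Let $\underline{s}_1, \underline{s}_2, \dots, \underline{s}_k$ be $k$ independent $\theta$-vectors in $W_k$. Then any $\theta$-vector $\underline{s}$ of $W_k$ can be expressed as
$$\underline{s} = \sum_{i=1}^{k} u_i\,\underline{s}_i, \qquad (4.3.15)$$
where $u_i$, $i = 1,2,\dots,k$, are elements of $GF(q^{m+1})$. There exist, for any $u_i$, elements $a_{i0}, a_{i1}, \dots, a_{im}$ in $GF(q)$ such that
$$u_i = \sum_{j=0}^{m} a_{ij}\,\eta^j, \qquad (4.3.16)$$
where $\eta$ is a primitive element of $GF(q^{m+1})$. So, from (4.3.15), we have
$$\underline{s} = \sum_{i=1}^{k}\sum_{j=0}^{m} a_{ij}\,\eta^j\,\underline{s}_i. \qquad (4.3.17)$$
That is, every $(N+1)$-vector in $V_{N+1}$ corresponding to a $\theta$-vector in $W_k$ is dependent on the $(N+1)$-vectors in $V_{N+1}$ corresponding to $\eta^i\underline{s}_j$, $i = 0,1,2,\dots,m$; $j = 1,2,\dots,k$. The number of such $(N+1)$-vectors is equal to the number of $\theta$-vectors in $W_k$, i.e. $q^{k(m+1)}$. But this is the number of $(N+1)$-vectors in a $k(m+1)$-dimensional subspace. Hence the result.
† Heft [24].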
4.4 Spreads.
A spread $S_m$ of $m$-flat spaces in $PG(N,q)$ is a set of $m$-flat spaces such that every point in $PG(N,q)$ is contained in exactly one element of $S_m$.
Spreads have been studied by various authors - Bose and Barlotti [8], Bruck [13], Bruck and Bose [14] and [15], Rao [29], Dembowski [20], and Segre [32].
One of the fundamental results for the existence of spreads is
Theorem (4.4.1). A necessary and sufficient condition for the existence of a spread of $m$-flat spaces in $PG(N,q)$ is that $(m+1)$ divides $(N+1)$.
Proof. Suppose there exists a spread $S_m$ of $m$-flat spaces. Then $(q^{m+1}-1)/(q-1)$ divides $(q^{N+1}-1)/(q-1)$. But the necessary and sufficient condition for $(q^{m+1}-1)/(q-1)$ to divide $(q^{N+1}-1)/(q-1)$ is that $(m+1)$ divides $(N+1)$.
We shall prove the sufficient condition by constructing a spread $S_m$.
Let $(N+1) = \theta(m+1)$, where $N$, $m$, $\theta\,(>1)$ are positive integers. Consider any point $P$ in $PG(\theta-1, q^{m+1})$. Then $P$ corresponds to a 1-dimensional subspace $W_1$, of a $\theta$-dimensional vector space $V_\theta$ over $GF(q^{m+1})$. Let $\underline{s}$ be a basis of $W_1$. Then
$$W_1 = \{c\,\underline{s} : c \in GF(q^{m+1})\}.$$
By lemma (4.3.2), the $(N+1)$-vectors over $GF(q)$ corresponding to $W_1$ form a $(m+1)$-dimensional subspace of $V_{N+1}$. But, by lemma (4.3.1), there is a one-one correspondence between $m$-flat spaces of $PG(N,q)$ and the $(m+1)$-dimensional subspaces of $V_{N+1}$. So $P$ is represented by an $m$-flat space in $PG(N,q)$. But $P$ is an arbitrary point. Hence every point of $PG(\theta-1, q^{m+1})$ is represented by an $m$-flat space of $PG(N,q)$.
Let $\underline{s}_1$ and $\underline{s}_2$ be any two points in $PG(\theta-1, q^{m+1})$. Then let
$$W_\theta(\underline{s}_1) = \{c\,\underline{s}_1 : c \in GF(q^{m+1})\} \quad\text{and}\quad W_\theta(\underline{s}_2) = \{c\,\underline{s}_2 : c \in GF(q^{m+1})\}$$
be the corresponding 1-dimensional subspaces in $V_\theta$. Let $S_i$ $(i = 1,2)$ be the corresponding $(m+1)$-dimensional subspace of $V_{N+1}$. Let $\Sigma_m^1$ and $\Sigma_m^2$ be the $m$-flat spaces corresponding to $S_i$ $(i = 1,2)$ in $PG(N,q)$. That is,
$$\underline{s}_1 \leftrightarrow W_\theta(\underline{s}_1) \leftrightarrow S_1 \leftrightarrow \Sigma_m^1 \quad\text{and}\quad \underline{s}_2 \leftrightarrow W_\theta(\underline{s}_2) \leftrightarrow S_2 \leftrightarrow \Sigma_m^2.$$
Suppose $\Sigma_m^1$ and $\Sigma_m^2$ have a common point, say $x$. Then the 1-dimensional vector space corresponding to $x$, say $D(x)$, belongs to both $(m+1)$-dimensional subspaces $S_1$ and $S_2$ of $V_{N+1}$. That is, $D(x) \subset S_i$, $i = 1,2$. But there is one-one correspondence between the points of $V_\theta$ over $GF(q^{m+1})$ and the points of $V_{N+1}$ over $GF(q)$. So the non-zero $\theta$-vectors corresponding to the $(N+1)$-vectors of $D(x)$ can be written as either $c_i u\,\underline{s}_1$ or $c_i v\,\underline{s}_2$ $(i = 1,2,\dots,q-1)$, where $u$ and $v$ are non-zero elements of $GF(q^{m+1})$ and $c_i$ are non-zero elements of $GF(q)$. Then $u\,\underline{s}_1 = v\,\underline{s}_2$, or $\underline{s}_1 = u^{-1}v\,\underline{s}_2$. So, $\underline{s}_1$ and $\underline{s}_2$ represent the same point in $PG(\theta-1, q^{m+1})$. That is, the $m$-flats corresponding to these points are equal, $\Sigma_m^1 = \Sigma_m^2$. Hence the $m$-flat spaces of $PG(N,q)$ representing the points of $PG(\theta-1, q^{m+1})$ are disjoint.
The number of points in $PG(\theta-1, q^{m+1})$ is
$$\frac{q^{\theta(m+1)}-1}{q^{m+1}-1} = \frac{q^{N+1}-1}{q^{m+1}-1},$$
and the number of points in an $m$-flat space is $(q^{m+1}-1)/(q-1)$. So all the $(q^{N+1}-1)/(q-1)$ points of $PG(N,q)$ are accounted for. That is, $S_m$ is the set of $m$-flat spaces corresponding to the points of $PG(\theta-1, q^{m+1})$.
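The divisibility fact used in the necessity part of the proof can be spot-checked numerically. The sketch below is only an illustration on a small grid of parameters, not a proof; the function names `point_count`, `counting_condition_holds` and `divisibility_agrees` are hypothetical and do not come from the text.

```python
def point_count(dim, q):
    """Number of points of PG(dim, q), i.e. (q^(dim+1) - 1) / (q - 1)."""
    return (q ** (dim + 1) - 1) // (q - 1)


def counting_condition_holds(q, m, N):
    """True when the point count of an m-flat divides the point count of PG(N, q)."""
    return point_count(N, q) % point_count(m, q) == 0


def divisibility_agrees(qs=(2, 3, 4, 5, 7), max_N=9):
    """Compare the counting condition with '(m+1) divides (N+1)' on a small grid."""
    for q in qs:
        for N in range(1, max_N + 1):
            for m in range(0, N + 1):
                expected = (N + 1) % (m + 1) == 0
                if counting_condition_holds(q, m, N) != expected:
                    return False
    return True


print(divisibility_agrees())
```

For instance, `counting_condition_holds(2, 1, 4)` is false because the 31 points of $PG(4,2)$ cannot be split into lines of 3 points each.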
Example (4.4.1). Let $q = 2$, $N = 3$ and $m = 1$. Then $\theta = 2$. The elements of $GF(2^2)$ are $0, x, x^2, 1$, where $x^2 + x + 1 = 0$. The points of $PG(1, 2^2)$ are
$P_1$: $\underline{s}_1 = (0, x)$,
$P_2$: $\underline{s}_2 = (1, x^2)$,
$P_3$: $\underline{s}_3 = (1, x)$,
$P_4$: $\underline{s}_4 = (1, 1)$,
$P_5$: $\underline{s}_5 = (x, 0)$.
The corresponding 1-dimensional subspaces $W(\underline{s}_i)$ are:
$P_1 \to W(\underline{s}_1)$: $00$, $0x^2$, $01$, $0x$
$P_2 \to W(\underline{s}_2)$: $00$, $x1$, $x^2x$, $1x^2$
$P_3 \to W(\underline{s}_3)$: $00$, $xx^2$, $x^21$, $1x$
$P_4 \to W(\underline{s}_4)$: $00$, $xx$, $x^2x^2$, $11$
$P_5 \to W(\underline{s}_5)$: $00$, $x^20$, $10$, $x0$
The 2-dimensional subspaces corresponding to $W(\underline{s}_i)$ are:
$S_1$: 0000 0011 0001 0010
$S_2$: 0000 1001 1110 0111
$S_3$: 0000 1011 1101 0110
$S_4$: 0000 1010 1111 0101
$S_5$: 0000 1100 0100 1000
Finally, the 1-flat spaces in $PG(3,2)$ are:
$L_1$: 0011 0001 0010
$L_2$: 1001 1110 0111
$L_3$: 1011 1101 0110
$L_4$: 1010 1111 0101
$L_5$: 1100 0100 1000
A spread of 1-flat spaces in $PG(3,2)$ is
(i) $x_0 = 0$, $x_1 = 0$;
(ii) $x_1 + x_2 = 0$, $x_0 + x_2 + x_3 = 0$;
(iii) $x_0 + x_3 = 0$, $x_1 + x_2 + x_3 = 0$;
(iv) $x_0 + x_2 = 0$, $x_1 + x_3 = 0$;
(v) $x_2 = 0$, $x_3 = 0$.
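The construction in this example can be reproduced mechanically. The following is a minimal sketch, assuming the elements of $GF(2^2)$ are coded as two bits with $1 \to 01$, $x \to 10$, $x^2 \to 11$ (the coding that matches the binary vectors listed in the example); the names `gf4_mul`, `to_bits` and `spread_lines` are hypothetical helpers introduced only for illustration.

```python
def gf4_mul(a, b):
    """Multiply GF(4) elements stored as 2-bit ints: 1 -> 0b01, x -> 0b10, x^2 -> 0b11."""
    r = 0
    for i in range(2):
        if (b >> i) & 1:
            r ^= a << i
    if r & 0b100:          # reduce with x^2 = x + 1
        r ^= 0b111
    return r


def to_bits(vec):
    """Write a vector over GF(4) as a binary string, two bits per coordinate."""
    return "".join(format(c, "02b") for c in vec)


# Representatives of the five points of PG(1, 4), in the order P1, ..., P5.
X, X2 = 0b10, 0b11
POINTS = [(0, X), (1, X2), (1, X), (1, 1), (X, 0)]


def spread_lines(points=POINTS):
    """Map each point s to the set {c*s : c != 0}, read as a line of PG(3, 2)."""
    lines = []
    for s in points:
        line = {to_bits(tuple(gf4_mul(c, coord) for coord in s)) for c in (1, 2, 3)}
        lines.append(line)
    return lines


lines = spread_lines()
covered = set().union(*lines)
for i, line in enumerate(lines, start=1):
    print(f"L{i}:", sorted(line))
print("points covered:", len(covered))
```

Each of the five sets has three points and together they contain all 15 points of $PG(3,2)$, which is what makes them a spread.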
4.5 A method for obtaining $(N \times m)$ matrices.
Let $H = ((h_{ij}))$, where $h_{ij} \in GF(q)$, $i = 1,2,\dots,N$; $j = 1,2,\dots,m$, be an $(N \times m)$ matrix with rank $m$ and property $R_t(r_1, r_2, \dots, r_l)$. Let $H_1, H_2, \dots, H_l$ be disjoint submatrices (that is no rows in common) of $H$ such that
$$H = \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_l \end{bmatrix}, \qquad (4.5.1)$$
where $H_i$ is an $(r_i \times m)$ matrix, $i = 1,2,\dots,l$. Then
$$N = \sum_{i=1}^{l} r_i. \qquad (4.5.2)$$
Also if
$$\underline{h} = (h_1, h_2, \dots, h_m) \qquad (4.5.3)$$
is any row in $H_i$, $i = 1,2,\dots,l$, then the $m$-vectors $c\,\underline{h}$, $c \in GF(q)$, $c \ne 1$, do not occur in the matrix $H$. That is, the row vectors of $H$ are distinct points of $PG(m-1,q)$. So, for any $i$, the $r_i$ row vectors of $H_i$ are independent. So they generate a $(r_i-1)$-flat space in $PG(m-1,q)$. That is, the $r_i$ row vectors of $H_i$ correspond to a basis of a $(r_i-1)$-flat space in $PG(m-1,q)$.
Let $\Sigma^{1}_{r_1-1}, \Sigma^{2}_{r_2-1}, \dots, \Sigma^{l}_{r_l-1}$ be the flat spaces corresponding to the submatrices $H_1, H_2, \dots, H_l$. Let $\Sigma^{i_1}_{r_{i_1}-1}, \Sigma^{i_2}_{r_{i_2}-1}, \dots, \Sigma^{i_t}_{r_{i_t}-1}$ be any $t$ of these flat spaces. Then their join is a $\{(\sum_{j=1}^{t} r_{i_j}) - 1\}$-flat space in $PG(m-1,q)$, since the rank of the submatrix
$$G = \begin{bmatrix} H_{i_1} \\ H_{i_2} \\ \vdots \\ H_{i_t} \end{bmatrix} \qquad (4.5.4)$$
is $\sum_{j=1}^{t} r_{i_j}$. So, to obtain a matrix $H$ with property $R_t(r_1, r_2, \dots, r_l)$, it is sufficient to obtain disjoint flat spaces $\Sigma^{1}_{r_1-1}, \Sigma^{2}_{r_2-1}, \dots, \Sigma^{l}_{r_l-1}$ of $PG(m-1,q)$ such that the join of any $t$ of these flat spaces, $\Sigma^{i_1}_{r_{i_1}-1}, \dots, \Sigma^{i_t}_{r_{i_t}-1}$, is a $\{(\sum_{j=1}^{t} r_{i_j}) - 1\}$-flat space.
Theorem (4.5.1). There exists an $(N \times m)$ matrix $H$ over $GF(q)$ with property $R_2(l \times n)$ if $n$ divides $m$, where $N = l \times n$, $l = (q^m-1)/(q^n-1)$.
Proof. Suppose $n$ divides $m$. Then $l = (q^m-1)/(q^n-1)$ is an integer. Let $\{\Pi_1, \Pi_2, \dots, \Pi_l\}$ be a spread of $(n-1)$-flat spaces in $PG(m-1,q)$. Such a spread exists by theorem (4.4.1). Without loss of generality, we can take for any $i$,
$$\begin{matrix} h_{(i-1)n+1,1}, & h_{(i-1)n+1,2}, & \dots, & h_{(i-1)n+1,m} \\ h_{(i-1)n+2,1}, & h_{(i-1)n+2,2}, & \dots, & h_{(i-1)n+2,m} \\ \vdots & & & \vdots \\ h_{in,1}, & h_{in,2}, & \dots, & h_{in,m} \end{matrix} \qquad (4.5.6)$$
as a set of $n$ independent points in the flat space $\Pi_i$. Then the $(n \times m)$ matrix $H_i$ formed by taking the $m$-vectors of (4.5.6) as row vectors, has rank $n$. Further, for any two $(n \times m)$ matrices $H_i$ and $H_j$ the matrix
$$G = \begin{bmatrix} H_i \\ H_j \end{bmatrix}$$
has rank $2n$, since the join of $\Pi_i$ and $\Pi_j$ is a $(2n-1)$-flat space. Hence the matrix
$$H_{ln \times m} = \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_l \end{bmatrix}$$
has property $R_2(l \times n)$.
Definition (4.5.1). An $(N \times m)$ matrix $H$ of rank $m$ with property $R_t(r_1, r_2, \dots, r_l)$ is said to be complete if it is not possible to adjoin an $(r_{l+1} \times m)$ submatrix $H_{l+1}$ to $H$ such that the $(\sum_{i=1}^{l+1} r_i) \times m$ matrix
$$H^{*} = \begin{bmatrix} H \\ H_{l+1} \end{bmatrix}$$
has property $R_t(r_1, r_2, \dots, r_l, r_{l+1})$, $r_{l+1} \ge 1$.
Lemma (4.5.2). The $(ln \times m)$ matrix $H$ obtained in theorem (4.5.1) is complete.
Proof. The $l$ $(n-1)$-flat spaces, $\Pi_1, \Pi_2, \dots, \Pi_l$ in $PG(m-1,q)$ corresponding to $H_1, H_2, \dots, H_l$ of $H$ form a spread of $PG(m-1,q)$. Hence any $m$-vector is covered by one of the $(n-1)$-flat spaces, $\Pi_1, \Pi_2, \dots, \Pi_l$. That is, any $m$-vector not in $H$ will be dependent on the rows of some $H_i$. Hence the result.
Example (4.5.1). $q = 2$, $n = 2$, $m = 4$. So
$$l = \frac{2^4-1}{2^2-1} = 5.$$
From example (4.4.1), the spread $S_1$ of 1-flat spaces in $PG(3,2)$ is
$x_0 = 0$, $x_1 = 0$: {0011, 0001, 0010}
$x_2 = 0$, $x_3 = 0$: {1100, 0100, 1000}
$x_0 + x_2 = 0$, $x_1 + x_3 = 0$: {1010, 0101, 1111}
$x_0 + x_3 = 0$, $x_1 + x_2 + x_3 = 0$: {1011, 1101, 0110}
$x_1 + x_2 = 0$, $x_0 + x_2 + x_3 = 0$: {1001, 1110, 0111}
Hence the matrix $H$ with property $R_2(5 \times 2)$ is
$$H_{10 \times 4} = \begin{bmatrix} 0001 \\ 0010 \\ 0100 \\ 1000 \\ 1010 \\ 0101 \\ 1011 \\ 1101 \\ 1110 \\ 0111 \end{bmatrix}$$
Notice that it has rank 4. But if we take any 4 rows of $H$ then they may not be independent. For example,
0001
0010
0100
0111
are dependent. Also any three rows may not be independent. For example,
0001
0100
0101.
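The claims of this example can be checked by computing ranks over $GF(2)$. The sketch below assumes that property $R_t$ means "any $t$ of the submatrices, stacked, have rank equal to their total number of rows", which is how the property is used around (4.5.4); `gf2_rank` and `has_property_R` are hypothetical helper names.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of rows given as binary strings."""
    basis = []                      # reduced rows, each with its own leading bit
    for row in rows:
        v = int(row, 2)
        for b in basis:
            v = min(v, v ^ b)       # clears the leading bit of b from v when it is set
        if v:
            basis.append(v)
    return len(basis)


# Blocks H_1, ..., H_5 of the (10 x 4) matrix of Example (4.5.1).
BLOCKS = [
    ["0001", "0010"],
    ["0100", "1000"],
    ["1010", "0101"],
    ["1011", "1101"],
    ["1110", "0111"],
]


def has_property_R(blocks, t):
    """True when every choice of t blocks, stacked, has full row rank over GF(2)."""
    for chosen in combinations(blocks, t):
        stacked = [row for block in chosen for row in block]
        if gf2_rank(stacked) != len(stacked):
            return False
    return True


all_rows = [row for block in BLOCKS for row in block]
print(has_property_R(BLOCKS, 2))                       # pairs of blocks
print(gf2_rank(all_rows))                              # rank of H
print(gf2_rank(["0001", "0010", "0100", "0111"]))      # the four dependent rows
print(gf2_rank(["0001", "0100", "0101"]))              # the three dependent rows
```

The last two calls return ranks smaller than the number of rows supplied, which is the dependence noted in the example.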
Suppose $H$ is a matrix with property $R_2(s_1r_1, s_2r_2, \dots, s_kr_k)$ and rank $m$. That is, there are $s_i$ $(r_i \times m)$ submatrices $(i = 1,2,\dots,k)$. Also
$$l = \sum_{i=1}^{k} s_i \quad\text{and}\quad N = \sum_{i=1}^{k} s_i r_i.$$
Theorem (4.5.3). There exists an $(N \times m)$ matrix $H$ with property $R_2(s_1r_1, s_2r_2, \dots, s_kr_k)$ if
i) $r_1$ divides $m$,
ii) $r_i$ divides $r_{i-1}$, $i = 2,3,\dots,k$,
where
$$0 \le s_i \le \frac{q^m-1}{q^{r_i}-1} - \sum_{j=1}^{i-1} s_j\,\frac{q^{r_j}-1}{q^{r_i}-1}, \quad i = 1,2,\dots,k-1, \qquad (4.5.7)$$
and
$$s_k = \frac{q^m-1}{q^{r_k}-1} - \sum_{j=1}^{k-1} s_j\,\frac{q^{r_j}-1}{q^{r_k}-1}. \qquad (4.5.8)$$
Proof. Let $S^1_{r_1-1} = \{\Sigma^{1}_{r_1-1}, \Sigma^{2}_{r_1-1}, \dots, \Sigma^{h_1}_{r_1-1}\}$ be a spread of $(r_1-1)$-flat spaces of $PG(m-1,q)$, where $h_1 = (q^m-1)/(q^{r_1}-1)$. The spread $S^1_{r_1-1}$ exists as $r_1$ divides $m$. Now take $H_1, H_2, \dots, H_{s_1}$ as the $(r_1 \times m)$ matrices of rank $r_1$ corresponding to $\Sigma^{1}_{r_1-1}, \Sigma^{2}_{r_1-1}, \dots, \Sigma^{s_1}_{r_1-1}$. Delete these $(r_1-1)$-flat spaces from $S^1_{r_1-1}$. Then each of the $(r_1-1)$-flat spaces remaining after the deletion has a spread of $(r_2-1)$-flat spaces as $r_2$ divides $r_1$. Let $S^2_{r_2-1}$ denote the collection of $(r_2-1)$-flat spaces of the spreads corresponding to the $(r_1-1)$-flat spaces, $\Sigma^{s_1+1}_{r_1-1}, \Sigma^{s_1+2}_{r_1-1}, \dots, \Sigma^{h_1}_{r_1-1}$. Without loss of generality, we can take
$$S^2_{r_2-1} = \{\Sigma^{1}_{r_2-1}, \Sigma^{2}_{r_2-1}, \dots, \Sigma^{h_2}_{r_2-1}\}, \quad\text{where}\quad h_2 = (h_1 - s_1)\,\frac{q^{r_1}-1}{q^{r_2}-1}. \qquad (4.5.9)$$
Now take $H_{s_1+1}, H_{s_1+2}, \dots, H_{s_1+s_2}$ as the $(r_2 \times m)$ matrices of rank $r_2$ corresponding to the first $s_2$ $(r_2-1)$-flat spaces in $S^2_{r_2-1}$. Similarly, the rest of the submatrices of rank $r_3, r_4, \dots, r_k$ can be obtained.
Clearly the matrix $H$ so obtained has the property $R_2(s_1r_1, s_2r_2, \dots, s_kr_k)$ as we are dealing with submatrices corresponding to disjoint flat spaces.
The matrix so obtained is complete as each point of the geometry $PG(m-1,q)$ is covered exactly once.
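Example (4.5.2). $q = 2$, $m = 4$, $r_1 = 2$, $r_2 = 1$, $s_1 = 3$. So $l = 9$ and $N = 12$.
$$h_1 = \frac{2^4-1}{2^2-1} = 5.$$
So, $0 \le s_1 \le 5$ and $s_2 = 6$.
We shall construct a matrix $H$ with property $R_2(3 \times 2, 6 \times 1)$. The spread $S^1_1$ is
$x_0 = 0$, $x_1 = 0$
$x_2 = 0$, $x_3 = 0$
$x_0 + x_2 = 0$, $x_1 + x_3 = 0$
$x_0 + x_3 = 0$, $x_1 + x_2 + x_3 = 0$
$x_1 + x_2 = 0$, $x_0 + x_2 + x_3 = 0$
So
$$H_1 = \begin{bmatrix} 0001 \\ 0010 \end{bmatrix}, \quad H_2 = \begin{bmatrix} 0100 \\ 1000 \end{bmatrix}, \quad H_3 = \begin{bmatrix} 0101 \\ 1010 \end{bmatrix},$$
$H_4 = [1011]$, $H_5 = [1101]$, $H_6 = [0110]$,
$H_7 = [1001]$, $H_8 = [1110]$, $H_9 = [0111]$.
So, the desired matrix $H$ with rank 4 is
$$H_{12 \times 4} = \begin{bmatrix} 0001 \\ 0010 \\ 0100 \\ 1000 \\ 0101 \\ 1010 \\ 1011 \\ 1101 \\ 0110 \\ 1001 \\ 1110 \\ 0111 \end{bmatrix} \qquad (4.5.10)$$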
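The counts in this example follow from counting points: a spread of lines in $PG(3,2)$ has 5 lines, and once 3 of them are used the 6 points of the other two lines are left for the $(1 \times 4)$ submatrices. The sketch below checks this, and also that the flat spaces spanned by the nine submatrices are pairwise disjoint and cover all 15 points (which is the completeness argument). The helper `span_points` is a hypothetical name.

```python
from itertools import combinations


def span_points(rows):
    """Non-zero GF(2)-combinations of the given rows (rows are binary strings)."""
    vectors = [int(r, 2) for r in rows]
    points = set()
    for k in range(1, len(vectors) + 1):
        for combo in combinations(vectors, k):
            total = 0
            for v in combo:
                total ^= v
            points.add(total)
    return points


# Blocks of the (12 x 4) matrix in (4.5.10): three (2 x 4) and six (1 x 4) submatrices.
BLOCKS = [
    ["0001", "0010"], ["0100", "1000"], ["0101", "1010"],
    ["1011"], ["1101"], ["0110"], ["1001"], ["1110"], ["0111"],
]

q, m, r1, r2, s1 = 2, 4, 2, 1, 3
h1 = (q**m - 1) // (q**r1 - 1)                       # lines in a spread of PG(3, 2)
points_total = (q**m - 1) // (q - 1)                 # points of PG(3, 2)
points_per_line = (q**r1 - 1) // (q - 1)
s2 = points_total - s1 * points_per_line             # points left for single rows

flats = [span_points(block) for block in BLOCKS]
pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(flats, 2))
covered = set().union(*flats)
print(h1, s2, pairwise_disjoint, len(covered))
```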
We can also obtain matrices $H$ when $n$ does not divide $m$, with property $R_2(l \times n)$. But the matrices so obtained may not be complete.
Example (4.5.3). $q = 2$, $t = 2$, $n = 2$ and $m = 5$.
Consider the geometry $PG(4,2)$. It has 31 points. There does not exist a spread of lines (1-flat spaces) in the geometry. But consider the following 1-flat spaces and their points
$L_1$: 01001 10010 11011
$L_2$: 10001 11010 01011
$L_3$: 11001 01010 10011
$L_4$: 00001 01101 01100
$L_5$: 00010 10101 10111
$L_6$: 00011 11101 11110
$L_7$: 10000 00100 10100
$L_8$: 01000 00110 01110
$L_9$: 11000 00111 11111
(4.5.11)
These nine lines are disjoint and cover 27 points of the geometry. The remaining four points are
00101, 01111, 10110, 11100. (4.5.12)
Any line passing through any one of these four points intersects at least one of the above (4.5.11) nine lines.
Now consider the matrix $H$ obtained from the above nine 1-flats,
$$H = \begin{bmatrix} 01001 \\ 10010 \\ 10001 \\ 11010 \\ 11001 \\ 01010 \\ 00001 \\ 01101 \\ 00010 \\ 10101 \\ 00011 \\ 11101 \\ 10000 \\ 00100 \\ 01000 \\ 00110 \\ 11000 \\ 00111 \end{bmatrix} \qquad (4.5.13)$$
The matrix $H$ clearly has property $R_2(9 \times 2)$. But it is not complete as we can adjoin any one of four points (4.5.12) to have property $R_2(9 \times 2, 1)$.
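The statements about the nine lines of (4.5.11) are easy to confirm by direct enumeration. The following sketch treats each point of $PG(4,2)$ as a non-zero 5-bit integer; the variable names are chosen here for illustration only.

```python
from itertools import combinations

# The nine 1-flat spaces of (4.5.11), each given by its three points.
LINES = [
    ("01001", "10010", "11011"),
    ("10001", "11010", "01011"),
    ("11001", "01010", "10011"),
    ("00001", "01101", "01100"),
    ("00010", "10101", "10111"),
    ("00011", "11101", "11110"),
    ("10000", "00100", "10100"),
    ("01000", "00110", "01110"),
    ("11000", "00111", "11111"),
]

as_ints = [tuple(int(p, 2) for p in line) for line in LINES]

# A line of PG(4, 2) is {a, b, a + b} with a != b, both non-zero.
all_are_lines = all(a ^ b == c for a, b, c in as_ints)

covered = {p for line in as_ints for p in line}
lines_disjoint = len(covered) == 3 * len(LINES)

leftover = sorted(set(range(1, 32)) - covered)
leftover_bits = [format(p, "05b") for p in leftover]

# A tenth disjoint line would have to lie inside the leftover points;
# this fails if the sum of any two leftover points is already covered.
no_tenth_line = all((a ^ b) in covered for a, b in combinations(leftover, 2))

print(all_are_lines, lines_disjoint, len(covered))
print(leftover_bits, no_tenth_line)
```

The leftover points are exactly those that can be adjoined singly, and `no_tenth_line` reflects the remark that every line through one of them meets one of the nine lines.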
This result can be extended for higher values of $m$, when $n$ does not divide $m$. We shall illustrate this method by an example.
Example (4.5.4). $m = 7$, $n = 2$, $q = 2$, $t = 2$.
Adjoin "01" to all the odd rows of the matrix in (4.5.13) and "10" to all the even rows. For example,
0101001
1010010
0110001
1011010
and so on. Let $G_1$ be the resulting $(18 \times 7)$ matrix. Clearly $G_1$ has property $R_2(9 \times 2)$. We can also obtain $(18 \times 7)$ matrices from the matrix (4.5.13) by adjoining (10,11); (11,01) and (00,00). Let $G_2$, $G_3$ and $G_4$ be the resulting matrices. Each of these matrices $G_2$, $G_3$ and $G_4$ has property $R_2(9 \times 2)$.
Consider the first two rows of $G_1$ and the first two rows of $G_2$, i.e.
0101001    1001001
1010010    1110010
These four points are independent. That is, the 1-flat passing through the points 0101001 and 1010010 does not intersect the 1-flat passing through 1001001 and 1110010. Similarly we can show that the remaining pairs are independent. Hence the matrix,
$$\begin{bmatrix} G_1 \\ G_2 \\ G_3 \\ G_4 \end{bmatrix} \qquad (4.5.14)$$
obtained by adjoining $G_1$, $G_2$, $G_3$ and $G_4$, has property $R_2(36 \times 2)$.
Consider the lines
{1000000, 0000101, 1000101},
{0100000, 0001111, 0101111},
{1100000, 0010110, 1110110}.
These lines do not intersect any one of the lines corresponding to the matrix in (4.5.14), as the points 00101, 01111, and 10110 do not lie on the 1-flat spaces given in (4.5.11). Let
$$G_5 = \begin{bmatrix} 1000000 \\ 0000101 \\ 0100000 \\ 0001111 \\ 1100000 \\ 0010110 \end{bmatrix} \qquad (4.5.15)$$
Then the matrix
$$H = \begin{bmatrix} G_1 \\ G_2 \\ G_3 \\ G_4 \\ G_5 \end{bmatrix} \qquad (4.5.16)$$
has property $R_2(39 \times 2)$. But it is not complete, as the corresponding lines do not form a spread of 1-flat spaces in $PG(6,2)$. Also we cannot adjoin more than one point to $H$ in (4.5.16).
4.6 General methods
Consider the matrix
$$H_{(12 \times 6)} = \begin{bmatrix} 000001 \\ 000010 \\ 000100 \\ 001000 \\ 010000 \\ 100000 \\ 010101 \\ 101010 \\ 100111 \\ 111001 \\ 011011 \\ 110110 \end{bmatrix} \qquad (4.6.1)$$
It has property $R_3(6 \times 2)$ and is complete. But the construction of $(N \times m)$ matrices with property $R_t(r_1, r_2, \dots, r_l)$, $(t \ge 3)$, which are complete, is a very difficult problem. We shall give a simple procedure for obtaining $(N \times m)$ matrices with the desired property. But these matrices may not be complete.
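The two properties claimed for the $(12 \times 6)$ matrix in (4.6.1) can be confirmed by exhaustive search, since $PG(5,2)$ has only 63 points. The sketch assumes, as before, that $R_t$ requires any $t$ stacked submatrices to have full row rank, and that completeness can be tested by trying to adjoin a single row (a larger adjoinable block would contain an adjoinable row). The helper names are hypothetical.

```python
from itertools import combinations


def gf2_rank(rows):
    """Rank over GF(2) of rows given as integers."""
    basis = []
    for v in rows:
        for b in basis:
            v = min(v, v ^ b)       # clears the leading bit of b from v when it is set
        if v:
            basis.append(v)
    return len(basis)


# The six (2 x 6) blocks of the matrix in (4.6.1).
BLOCKS = [
    ["000001", "000010"], ["000100", "001000"], ["010000", "100000"],
    ["010101", "101010"], ["100111", "111001"], ["011011", "110110"],
]
INT_BLOCKS = [[int(r, 2) for r in block] for block in BLOCKS]


def has_property_R(blocks, t):
    """True when every choice of t blocks, stacked, has full row rank."""
    for chosen in combinations(blocks, t):
        stacked = [row for block in chosen for row in block]
        if gf2_rank(stacked) != len(stacked):
            return False
    return True


def can_adjoin(blocks, row, t):
    """True when the one-row block [row] keeps property R_t with the existing blocks."""
    for chosen in combinations(blocks, t - 1):
        stacked = [r for block in chosen for r in block] + [row]
        if gf2_rank(stacked) != len(stacked):
            return False
    return True


adjoinable = [format(v, "06b") for v in range(1, 64) if can_adjoin(INT_BLOCKS, v, 3)]
print(has_property_R(INT_BLOCKS, 3), adjoinable)
```

An empty `adjoinable` list means no further row can be added without destroying the property, which is the sense in which the matrix is complete.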
An $(N \times m)$ matrix with rank $m$, is said to have the property $P_t$ if no $t$ rows of the matrix are dependent.
Lemma (4.6.1). If $H$ is an $(N \times m)$ matrix with property $P_{tr}$, then $H$ is a matrix with property $R_t(r_1, r_2, \dots, r_l)$, where $r_i \le r$, $i = 1,2,\dots,l$.
Proof. We can express $H$ as
$$H = \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_l \end{bmatrix}, \qquad (4.6.2)$$
where $H_i$ is a $(r_i \times m)$ submatrix and $\sum_{i=1}^{l} r_i = N$. Consider any $t$ submatrices - say $H_{i_1}, H_{i_2}, \dots, H_{i_t}$. Let
$$G = \begin{bmatrix} H_{i_1} \\ H_{i_2} \\ \vdots \\ H_{i_t} \end{bmatrix}. \qquad (4.6.3)$$
Then the number of rows in $G$ is equal to $\sum_{j=1}^{t} r_{i_j}$. But
$$\sum_{j=1}^{t} r_{i_j} \le t \times r.$$
Hence
$$\text{rank } G = \sum_{j=1}^{t} r_{i_j}.$$
The converse result is not necessarily true.
Example (4.6.1). Consider the $(10 \times 4)$ matrix in the example (4.5.1). It has property $R_2(5 \times 2)$ but any four rows are not independent.
Now we shall describe, in brief, a method for obtaining $(N \times m)$ matrices with elements from $GF(q)$, which are of rank $m$ and have property $P_t$. These matrices were first obtained by Bose [5] in connection with the problem of confounding symmetrical factorial designs.
Let $t < N \le q^\theta - 1$, where $t$, $N$ and $\theta$ are integers. Consider the extension field $GF(q^\theta)$ of $GF(q)$. We can obtain a one-one correspondence between the elements of $GF(q^\theta)$ and $\theta$-vectors of the vector space $V_\theta$ as in the preceding section (4.3). Let $\sigma_1, \sigma_2, \dots, \sigma_N$ be $N$ distinct non-zero elements of $GF(q^\theta)$. In particular we can choose $\sigma_i = \sigma^{i-1}$, where $\sigma$ is a primitive element of $GF(q^\theta)$. Let
$$H_0 = \begin{bmatrix} \sigma_1 & \sigma_1^2 & \sigma_1^3 & \cdots & \sigma_1^t \\ \sigma_2 & \sigma_2^2 & \sigma_2^3 & \cdots & \sigma_2^t \\ \vdots & & & & \vdots \\ \sigma_N & \sigma_N^2 & \sigma_N^3 & \cdots & \sigma_N^t \end{bmatrix} \qquad (4.6.4)$$
The determinant of any submatrix of $H_0$, formed by $i_1$-th, $i_2$-th, $\dots$, $i_t$-th rows is, up to sign,
$$\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_t} \prod_{u < v} (\sigma_{i_u} - \sigma_{i_v}), \quad u, v = 1,2,\dots,t,$$
and is therefore non-zero. Hence no $t$ rows of $H_0$ are dependent. Since $x \to x^q$ is an automorphism of $GF(q^\theta)$, and $\eta^q = \eta$ if $\eta$ is an element of $GF(q)$, it follows that if we delete from $H_0$ the columns headed by $\sigma_1^q, \sigma_1^{2q}, \dots, \sigma_1^{sq}$, where $s = [t/q]$ and $[x]$ indicates the largest integer not exceeding $x$, then any $t$ rows of the resulting matrix $H_1$ will still be independent over $GF(q)$; i.e. no linear combination of $t$ rows of $H_1$, with coefficients from $GF(q)$, will vanish. If we now regard the elements of $H_1$ as row vectors over $GF(q)$, we have a $N \times \theta(t - [t/q])$ matrix $H$ with elements from $GF(q)$, which has property $P_t$. If there are any columns in $H$ that are dependent on others we can delete them and still obtain a matrix with property $P_t$. Let the rank of the matrix $H$ be $m$. Hence $m \le \theta(t - [t/q])$.
Theorem (4.6.2). If $t < N \le q^\theta - 1$, then we can find an $(N \times m)$ matrix $H$ of rank $m$, where $m \le \theta(t - [t/q])$, with elements from $GF(q)$, and having property $P_t$.
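The construction can be carried out by hand for small parameters. The sketch below takes $q = 2$, $\theta = 4$, $t = 4$ and $N = 15$; it builds $GF(2^4)$ from the polynomial $x^4 + x + 1$ (an assumption made here for concreteness, the text does not fix a polynomial) and keeps only the columns $\sigma$ and $\sigma^3$, since the columns headed by $\sigma^2$ and $\sigma^4$ are deleted. All function names are hypothetical.

```python
from itertools import combinations


def gf16_mul(a, b):
    """Multiply in GF(16) = GF(2)[x]/(x^4 + x + 1); elements are 4-bit ints."""
    r = 0
    for i in range(4):
        if (b >> i) & 1:
            r ^= a << i
    for shift in (2, 1, 0):                 # clear bits 6, 5 and 4 in turn
        if r & (0b10000 << shift):
            r ^= 0b10011 << shift
    return r


def gf16_pow(a, e):
    """a raised to the non-negative integer power e."""
    r = 1
    for _ in range(e):
        r = gf16_mul(r, a)
    return r


def bose_matrix(t):
    """Rows (sigma_i^j : 1 <= j <= t, j odd) for q = 2, each entry as 4 bits."""
    kept = [j for j in range(1, t + 1) if j % 2 != 0]
    rows = []
    for i in range(15):                     # sigma_i = x^i: all non-zero elements
        sigma = gf16_pow(0b0010, i)
        rows.append("".join(format(gf16_pow(sigma, j), "04b") for j in kept))
    return rows


def gf2_rank(rows):
    """Rank over GF(2) of rows given as binary strings."""
    basis = []
    for row in rows:
        v = int(row, 2)
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)


def has_property_P(rows, t):
    """True when no t rows are dependent over GF(2)."""
    return all(gf2_rank(chosen) == t for chosen in combinations(rows, t))


H = bose_matrix(4)
print(len(H), len(H[0]), gf2_rank(H), has_property_P(H, 4))
```

Here the bound $m \le \theta(t - [t/q]) = 4(4 - 2) = 8$ is attained: the matrix is $(15 \times 8)$ of rank 8, and any four of its rows are independent.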
BIBLIOGRAPHY
[1]
Abraham, C.T., Ghosh, S.P, and Ray-Chaudhuri, D.K. "File
organization schemes based on finite geometries", IBM
Research Report RC-1459, Yorktown Heights, New York; IBM
Watson Research Center, August 1965.
[2]
Barlotti, A. "Some topics in finite geometrical structures,"
Institute of Statistics Mimeo Series No. 439, University of
North Carolina, Chapel Hill, August 1965.
[3]
Bose, R.C. "On the construction of balanced incomplete block
designs," Annals of Eugenics, Vol. 9 (1939) pp. 353-399.
[4]
Bose, R.C. "Some new series of balanced incomplete block
designs," Bulletin Calcutta Mathematical Society, Vol. 34
(1942) pp. 17-31.
[5]
Bose, R.C. "Mathematical theory of the symmetrical factorial
designs," Sankhya, Vol. 8(1947) pp. 107-166.
[6]
Bose, R.C. Unpublished notes on Combinatorial Mathematics,
University of North Carolina, Chapel Hill.
[7]
Bose, R.C, Abraham, C.T., and Ghosh, S.P. "File organization of
records for multiple-valued attributes for multi-attribute
queries," Proceedings of the Symposium on Combinatorial
Mathematics, Chapel Hill, North Carolina: University of
North Carolina Press, 1967.
[8]
Bose, R.C. and Barlotti, A. "Linear representation of a class of
Projective planes in four dimensional Projective space,"
Annali di Matematica Pura ed Applicata, Vol. 88 (1971) pp. 9-32.
[9]
Bose, R.C. and Bush, K.A. "Orthogonal arrays of strength two and
three," The Annals of Mathematical Statistics, Vol. 23, No.4
(December 1952), pp. 508-524.
[10]
Bose, R.C. and Ray-Chaudhuri, D.K. "On a class of binary error-correcting group codes," Information and Control, Vol. 3
(1960) pp. 68-79.
[11]
Bose, R.C. and Ray-Chaudhuri, D.K. "Further results on error-correcting group codes," Information and Control, Vol. 3 (1960)
pp. 279-298.
[12]
Bose, R.C., Shrikhande, S.S. and Bhattacharya, K.N. "On the
construction of group divisible incomplete block designs,"
The Annals of Mathematical Statistics, Vol. 24, No.2 (June
1953) pp. 167-195.
[13]
Bruck, R.H. "Construction problem of finite projective planes,"
Combinatorial Mathematics and its Applications, University
of North Carolina Press (1969), pp. 426-514.
[14]
Bruck, R.H. and Bose, R.C. "The construction of Translation
Planes from Projective Planes," Journal of Algebra, Vol. 1
(1964) pp. 85-102.
[15]
Bruck, R.H. and Bose, R.C. "Linear representations of projective
planes in projective spaces," Journal of Algebra, Vol. 4
(1966) pp. 117-172.
[16]
Buchholz, W. "File organization and addressing," IBM Systems
Journal, Vol. 2(1963), pp. 86-111.
[17]
Bush, K.A. "Orthogonal arrays of index unity," The Annals of
Mathematical Statistics, Vol. 23, No. 4(1952), pp. 426-434.
[18]
Chakravarti, I.M. "Fractional replication in asymmetrical
factorial designs and partially balanced arrays," Sankhya,
Vol. 17(1956), pp. 143-164.
[19]
Chakravarti, I.M. "On some methods of construction of partially
balanced arrays," The Annals of Mathematical Statistics, Vol.
32, No. 4(1961), pp. 1181-1185.
[20]
Dembowski, P. Finite geometries, Springer-Verlag, Berlin,
Heidelberg, 1968.
[21]
Dowling, T.A. "Two-and three-valued codes for the Gaussian
channel," Institute of Statistics Mimeo Series No. 499,
University of North Carolina, Chapel Hill, (1966).
[22]
Dowling, T.A. "A class of Tri-weight cyclic codes," Institute
of Statistics Mimeo Series No. 600.3, University of North
Carolina, Chapel Hill, (1969).
[23]
Ghosh, S.P. and Abraham, C.T. "Application of finite geometry
in file organization for records with multiple valued attributes," IBM Research Report RC-1561, Yorktown Heights, New
York, IBM Watson Research Center, March 1966.
[24]
Heft, S.M. "Spreads in projective geometry and associated designs," Doctoral dissertation, University of North Carolina,
1971.
[25]
Koch, G.G. "The design of combinatorial information retrieval
systems for files with multiple-valued attributes," University of North Carolina Mimeo Series No. 552, Chapel Hill,
1967.
[26]
Peterson, W.W. Error Correcting Codes, MIT Press, Cambridge,
Mass. (1961).
[27]
Primrose, E.J.F. "Quadrics in finite geometries," Proceedings
Cambridge Philosophical Society, Vol. 47 (1951), pp. 299-304.
[28]
Rao, C.R. "A study of BIB designs with replications 11 to 15,"
Sankhya, Series A, Vol. 24(1962), pp. 203-207.
[29]
Rao, C.R. "Cyclical generation of linear subspaces in finite
geometries," Combinatorial Mathematics and its applications,
University of North Carolina Press (1969), pp. 515-535.
[30]
Ray-Chaudhuri, D.K. "Some results on quadrics in finite projective geometry based on Galois fields," Canadian Journal of
Mathematics, Vol. 14(1962), pp. 129-138.
[31]
Ray-Chaudhuri, D.K. "Combinatorial information retrieval systems
for files," IBM Research Report RC-1554, Yorktown Heights,
New York, IBM Watson Research Center, February 1966.
[32]
Segre, B. "Teoria di Galois Fibrazioni proiettive e Geometrie
Non Desarguesiane," Annali di Matematica Pura ed Applicata,
Vol. 64(1964), pp. 1-76.
[33]
Sprott, D.A. "A study of BIB designs with replications 16 to 20,"
Sankhya, Series A, Vol. 24, (1962), pp. 203-207.