*Part of this work of this author was supported by NSF grants GU-2059 and GU-19568.
¹On leave from Punjab Agricultural University (India).
Paper prepared in connection with the symposium on "Combinatorial Mathematics" at New Delhi (India) from December 22 to December 27, 1972.
APPLICATIONS OF PBIB DESIGNS IN CLUSTER SAMPLING*
D. Raghavarao¹ and Rajinder Singh
Department of Statistics
University of North Carolina at Chapel Hill
and Punjab Agricultural University
Institute of Statistics Mimeo Series No. 855
December, 1972
1. INTRODUCTION AND SUMMARY. The close relationship between sampling techniques and experimental designs has been only partly explored in the literature (see Chakrabarti (1963), Mohanty (1971)). PBIB designs have interesting applications to cluster sampling, and in this paper we discuss these applications.
Definition 1.1. A Balanced Incomplete Block (BIB) design is an arrangement of $v$ symbols into $b$ sets each of $k$ $(< v)$ symbols, satisfying the following conditions:
1. Every symbol occurs at most once in each set.
2. Every symbol occurs in exactly $r$ sets.
3. Every pair of symbols occurs together in exactly $\lambda$ sets.
$v$, $b$, $r$, $k$ and $\lambda$ are called the parameters of the BIB design and they satisfy
$$vr = bk, \qquad \lambda(v-1) = r(k-1). \tag{1.1}$$
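The conditions of Definition 1.1 and the relations (1.1) can be checked mechanically for a small design. The following Python sketch is only an illustration: the helper name `bib_parameters` and the seven sets used (the lines of the Fano plane) are assumptions chosen for the example and do not come from the paper.

```python
from itertools import combinations

def bib_parameters(v, blocks):
    """Return (v, b, r, k, lam) if `blocks` is a BIB design on symbols 0..v-1.

    Raises AssertionError when one of the conditions of Definition 1.1 fails.
    """
    b = len(blocks)
    sizes = {len(B) for B in blocks}
    # Condition 1: no symbol repeated within a set; all sets share one size k < v.
    assert all(len(set(B)) == len(B) for B in blocks) and len(sizes) == 1
    k = sizes.pop()
    assert k < v
    # Condition 2: every symbol occurs in exactly r sets.
    reps = {sum(x in B for B in blocks) for x in range(v)}
    assert len(reps) == 1
    r = reps.pop()
    # Condition 3: every pair of symbols occurs together in exactly lam sets.
    lams = {sum(x in B and y in B for B in blocks)
            for x, y in combinations(range(v), 2)}
    assert len(lams) == 1
    lam = lams.pop()
    # Relations (1.1).
    assert v * r == b * k and lam * (v - 1) == r * (k - 1)
    return v, b, r, k, lam

# Hypothetical example: the seven lines of the Fano plane.
fano = [(0, 1, 2), (0, 3, 4), (0, 5, 6), (1, 3, 5), (1, 4, 6), (2, 3, 6), (2, 4, 5)]
print(bib_parameters(7, fano))
```

For this example the parameters are $v = b = 7$, $r = k = 3$ and $\lambda = 1$.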
Partially balanced incomplete block (PBIB) designs are generalizations
of BIB designs and to define them, we need the concept of association scheme
as given in Definition 1.2.
Definition 1.2. Given $v$ symbols $1, 2, \ldots, v$, a relation satisfying the following conditions is said to be an association scheme with $m$ classes:
1. Any two symbols are either 1st, 2nd, ..., or $m$-th associates, the relation of association being symmetrical; that is, if the symbol $\alpha$ is the $i$-th associate of $\beta$, then $\beta$ is the $i$-th associate of $\alpha$.
2. Each symbol $\alpha$ has $n_i$ $i$-th associates, the number $n_i$ being independent of $\alpha$.
3. If any two symbols $\alpha$ and $\beta$ are $i$-th associates, then the number of symbols that are $j$-th associates of $\alpha$ and $k$-th associates of $\beta$ is $p^i_{jk}$ and is independent of the pair of $i$-th associates $\alpha$ and $\beta$.
Given an association scheme for the $v$ symbols, we define a PBIB design as follows:
Definition 1.3. If we have an association scheme with $m$ classes and given parameters, we get a PBIB design with $m$ associate classes if the $v$ symbols are arranged into $b$ sets of size $k$ $(< v)$ such that
1. Every symbol occurs at most once in a set.
2. Every symbol occurs in exactly $r$ sets.
3. If two symbols $\alpha$ and $\beta$ are $i$-th associates, then they occur together in $\lambda_i$ sets, the number $\lambda_i$ being independent of the particular pair of $i$-th associates $\alpha$ and $\beta$.
The numbers $v$, $n_i$, $p^i_{jk}$ are called the parameters of the association scheme and $v$, $b$, $r$, $k$, $\lambda_i$ are called the parameters of the design. These parameters satisfy
$$vr = bk, \quad \sum_{i=1}^{m} n_i = v-1, \quad \sum_{i=1}^{m} n_i\lambda_i = r(k-1), \quad \sum_{k=1}^{m} p^i_{jk} = n_j - \delta_{ij}, \quad n_i p^i_{jk} = n_j p^j_{ki} = n_k p^k_{ij}, \tag{1.2}$$
where $\delta_{ij}$ is the Kronecker delta taking the value 1 or 0 according as $i = j$ or not.
In this paper we use PBIB designs with group divisible, $L_2$ and rectangular association schemes and for completeness, we introduce these association schemes in the following definitions:
Definition 1.4. A group divisible association scheme has $v = mn$ symbols divided into $m$ groups of $n$ symbols each, such that any two symbols of the same group are first associates and two symbols from different groups are second associates.
The PBIB designs with group divisible association scheme are called group divisible designs and are classified as
(a) Singular (S) if $r - \lambda_1 = 0$,
(b) Semi-regular (SR) if $r - \lambda_1 > 0$ and $rk - v\lambda_2 = 0$, and
(c) Regular (R) if $r - \lambda_1 > 0$ and $rk - v\lambda_2 > 0$.
Definition 1.5. An $L_2$ association scheme has $v = s^2$ symbols arranged in an $s \times s$ square array such that symbols in the same row or column are first associates while other pairs of symbols are second associates.
PBIB designs with $L_2$ association scheme are called $L_2$ designs.
Definition 1.6. A rectangular association scheme is a three-associate-class association scheme with $v = mn$ symbols arranged in a rectangle with $m$ rows and $n$ columns. With respect to each symbol, the first associates are the other $n-1$ symbols of the same row, the second associates are the other $m-1$ symbols of the same column, and the remaining $(m-1)(n-1)$ symbols are third associates.
PBIB designs with rectangular association scheme are called rectangular designs.
For more details of these association schemes, we refer to Raghavarao (1971, Ch. 8).
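The three association schemes of Definitions 1.4 to 1.6 can be written down as short functions, and the parameters $n_i$ and $p^i_{jk}$ obtained by direct counting. The Python sketch below does this under assumptions of our own: symbols are numbered from 0, laid out row by row, and the function names are hypothetical. It checks the scheme identities in (1.2), namely those not involving the design parameters.

```python
from itertools import product

def gd_class(a, b, size):
    """Group divisible scheme: symbol x belongs to group x // size."""
    return 1 if a // size == b // size else 2

def l2_class(a, b, s):
    """L2 scheme: symbol x sits in row x // s and column x % s of an s x s array."""
    same_row, same_col = a // s == b // s, a % s == b % s
    return 1 if (same_row or same_col) else 2

def rect_class(a, b, ncols):
    """Rectangular scheme: symbol x sits in row x // ncols, column x % ncols."""
    if a // ncols == b // ncols:
        return 1
    return 2 if a % ncols == b % ncols else 3

def scheme_parameters(v, m, cls):
    """Count n_j and p^i_{jk}; fail if they depend on the symbols chosen."""
    n, p = {}, {}
    for a, b in product(range(v), repeat=2):
        if a == b:
            continue
        i = cls(a, b)
        for j in range(1, m + 1):
            # n_j must not depend on the symbol a (condition 2 of Definition 1.2).
            nj = sum(1 for x in range(v) if x != a and cls(a, x) == j)
            assert n.setdefault(j, nj) == nj
            for k in range(1, m + 1):
                # p^i_{jk} must not depend on the pair (a, b) (condition 3).
                cnt = sum(1 for x in range(v)
                          if x not in (a, b) and cls(a, x) == j and cls(x, b) == k)
                assert p.setdefault((i, j, k), cnt) == cnt
    return n, p

def check_identities(v, m, cls):
    """Verify the association-scheme identities of (1.2) and return the n_i."""
    n, p = scheme_parameters(v, m, cls)
    assert sum(n.values()) == v - 1
    for i, j in product(range(1, m + 1), repeat=2):
        assert sum(p[i, j, k] for k in range(1, m + 1)) == n[j] - (i == j)
        for k in range(1, m + 1):
            assert n[i] * p[i, j, k] == n[j] * p[j, k, i] == n[k] * p[k, i, j]
    return n

print(check_identities(6, 2, lambda a, b: gd_class(a, b, 2)))    # 3 groups of 2
print(check_identities(9, 2, lambda a, b: l2_class(a, b, 3)))    # 3 x 3 array
print(check_identities(6, 3, lambda a, b: rect_class(a, b, 3)))  # 2 x 3 rectangle
```

The counts agree with the definitions: for the $2 \times 3$ rectangle, each symbol has $n - 1 = 2$ first, $m - 1 = 1$ second and $(m-1)(n-1) = 2$ third associates.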
In sampling from finite populations, it is often advantageous to form suitable clusters of units and to survey all the units or a fraction of the units from selected clusters (see Murthy (1967), Ch. 8, 9). We discuss the problem of estimating the population total or mean when the clusters are of equal size in Section 2 and postpone the discussion on clusters of unequal sizes to Section 3. The clusters could be formed based on two extraneous factors $A$ and $B$ taking $s$ and $t$ levels respectively, there being in all $st$ clusters. The problem of estimating the population total when $s = t$ could be tackled through $L_2$ designs and this will be discussed in Section 4, and when $s \ne t$, we get the required results with the help of rectangular designs as given in Section 5. Finally we indicate some possible generalizations of our results in Section 6.
2. USE OF GD DESIGNS IN CLUSTER SAMPLING WHEN THE CLUSTERS ARE OF EQUAL SIZE. Let the $v = MN$ population units be divided into $M$ clusters each of $N$ units and suppose we are interested in taking a sample of size $n$. Let $y_{ij}$ be the value of the study variable on the $j$-th unit of the $i$-th cluster for $i = 1, 2, \ldots, M$; $j = 1, 2, \ldots, N$. Let
$$Y = \sum_{i=1}^{M}\sum_{j=1}^{N} y_{ij}, \quad \bar{Y}_i = \sum_{j=1}^{N} y_{ij}/N, \quad \bar{Y} = Y/MN, \quad S^2 = \sum_{i=1}^{M}\sum_{j=1}^{N} (y_{ij} - \bar{Y})^2/(MN-1), \quad S_i^2 = \sum_{j=1}^{N} (y_{ij} - \bar{Y}_i)^2/(N-1). \tag{2.1}$$
Let there exist a GD design with parameters $v = MN$, $b$, $r$, $k = n$, $\lambda_1$, $\lambda_2$. Let the population units be identified with the symbols of the design, the units of each cluster forming one group. The sampling procedure we follow will be to select a set of the GD design with equal probability and sample the units corresponding to the symbols of the selected set. Let $\bar{y}_G$ be the sample mean and $s_i^2$ the variance of the study variable for the selected units of the $i$-th cluster $(i = 1, 2, \ldots, M)$.
Then, analogous to the results of Chakrabarti (1963), the following results can be easily established.
Theorem 2.1. $\bar{y}_G$ is an unbiased estimator of $\bar{Y}$, that is, $E(\bar{y}_G) = \bar{Y}$.
Corollary 2.1.1. $\hat{Y} = v\bar{y}_G$ is an unbiased estimator of $Y$.
Theorem 2.2. If $V(\cdot)$ denotes the variance of the estimator in parenthesis, we have
$$V(\bar{y}_G) = (vrn)^{-1}\Big\{N(N-1)(\lambda_2-\lambda_1)\sum_{i=1}^{M} S_i^2 + (v-1)(rn - v\lambda_2)S^2\Big\}. \tag{2.2}$$
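The variance formula (2.2) can be compared with the variance obtained by listing every set of a small GD design. The Python sketch below is a check under assumptions of our own: the four $y$-values, the two clusters of two units, and the eight sets (a regular GD design with $b = 8$, $r = 4$, $k = n = 2$, $\lambda_1 = 2$, $\lambda_2 = 1$) are hypothetical and chosen only to keep the arithmetic small.

```python
from fractions import Fraction as F

def exact_variance(y, blocks):
    """Variance of the set mean when each set is drawn with probability 1/b."""
    means = [F(sum(y[u] for u in B), len(B)) for B in blocks]
    centre = sum(means) / len(means)
    return sum((m - centre) ** 2 for m in means) / len(means)

def variance_2_2(y, groups, r, n, lam1, lam2):
    """Right-hand side of (2.2); `groups` lists the unit labels of each cluster."""
    v, N = len(y), len(groups[0])
    ybar = F(sum(y), v)
    S2 = sum((t - ybar) ** 2 for t in y) / (v - 1)
    sum_Si2 = F(0)
    for g in groups:
        gbar = F(sum(y[u] for u in g), N)
        sum_Si2 += sum((y[u] - gbar) ** 2 for u in g) / (N - 1)
    return (N * (N - 1) * (lam2 - lam1) * sum_Si2
            + (v - 1) * (r * n - v * lam2) * S2) / (v * r * n)

# Hypothetical population: M = 2 clusters of N = 2 units.
y = [1, 2, 4, 8]
groups = [(0, 1), (2, 3)]
# Hypothetical GD design: within-cluster pairs twice, between-cluster pairs once.
blocks = [(0, 1), (0, 1), (2, 3), (2, 3), (0, 2), (0, 3), (1, 2), (1, 3)]

direct = exact_variance(y, blocks)
formula = variance_2_2(y, groups, r=4, n=2, lam1=2, lam2=1)
print(direct, formula)
```

Both computations give 49/16 for this example, and exact rational arithmetic is used so that the comparison does not depend on rounding.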
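Corollary 2.2.1.
$$V(v\bar{y}_G) = v(rn)^{-1}\Big\{N(N-1)(\lambda_2-\lambda_1)\sum_{i=1}^{M} S_i^2 + (v-1)(rn - v\lambda_2)S^2\Big\}. \tag{2.3}$$
Clearly $V(\bar{y}_G)$ given in (2.2) can be reduced by choosing the GD design to be semi-regular, since the term in $S^2$ then vanishes, and in that case we will have the following results.
Theorem 2.3. If the sample is selected as one of the sets of a SRGD design, we have
$$V(\bar{y}_G) = \frac{v-n}{Mnv}\sum_{i=1}^{M} S_i^2. \tag{2.4}$$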
Theorem 2.4. If the sample is selected as one of the sets of a SRGD design and if $n/M \ge 2$, then the estimated variance $\hat{V}(\bar{y}_G)$ of $V(\bar{y}_G)$ is
$$\hat{V}(\bar{y}_G) = \frac{v-n}{Mnv}\sum_{i=1}^{M} s_i^2. \tag{2.5}$$
Corollary 2.4.1.
$$\hat{V}(v\bar{y}_G) = \frac{v(v-n)}{Mn}\sum_{i=1}^{M} s_i^2. \tag{2.6}$$
We easily observe that the method described in this section can be used in stratified sampling, and the estimate $\bar{y}_G$ and its variance obtained by selecting the sample as a set of a SRGD design are identical with the stratified sample mean and its variance under proportional allocation.
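The agreement with stratified sampling can be illustrated numerically. In the Python sketch below, which rests on hypothetical data, the design consists of all sets containing one unit from each of $M = 2$ clusters of $N = 3$ units; this is a semi-regular GD design with $b = 9$, $r = 3$, $k = n = 2$, $\lambda_1 = 0$, $\lambda_2 = 1$, so that $rk - v\lambda_2 = 0$. The exact variance of the set mean is compared with (2.4) and with the usual variance of a stratified mean under proportional allocation.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical y-values for M = 2 clusters (strata) of N = 3 units each.
clusters = [[1, 2, 6], [3, 5, 10]]
M, N = len(clusters), len(clusters[0])
v, n = M * N, 2

# Semi-regular GD design: every set takes exactly one unit from each cluster.
blocks = list(product(*clusters))

def S2(vals):
    """Finite-population variance with divisor (number of values - 1)."""
    mean = F(sum(vals), len(vals))
    return sum((x - mean) ** 2 for x in vals) / (len(vals) - 1)

# Exact variance of the set mean over the b equally likely sets.
means = [F(sum(B), n) for B in blocks]
ybar = sum(means) / len(means)
exact = sum((m - ybar) ** 2 for m in means) / len(means)

# Formula (2.4).
formula_2_4 = F(v - n, M * n * v) * sum(S2(c) for c in clusters)

# Stratified mean with proportional allocation: n_h = n / M units per stratum,
# stratum weight 1 / M, finite population correction 1 - n_h / N.
n_h = F(n, M)
stratified = sum(F(1, M) ** 2 * (1 - n_h / N) * S2(c) / n_h for c in clusters)

print(exact, formula_2_4, stratified)
```

All three quantities equal 10/3 for these data.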
3. SAMPLING FROM CLUSTERS OF UNEQUAL SIZES - USE OF BIB DESIGNS. Let there be $M$ clusters, the $i$-th cluster having $M_i$ population units, and let $\sum_{i=1}^{M} M_i = N$. As in the last section, let $y_{ij}$ be the value of the study variable of the $j$-th unit of the $i$-th cluster $(j = 1, 2, \ldots, M_i;\ i = 1, 2, \ldots, M)$. Let
$$Y_i = \sum_{j=1}^{M_i} y_{ij}, \quad \bar{Y}_i = Y_i/M_i, \quad Y = \sum_{i=1}^{M} M_i\bar{Y}_i, \quad \bar{Y} = Y/N, \quad \bar{\bar{Y}} = Y/M,$$
$$S'^2_{wi} = \sum_{j=1}^{M_i} (y_{ij} - \bar{Y}_i)^2/(M_i-1), \quad S'^2_b = \sum_{i=1}^{M} (Y_i - \bar{\bar{Y}})^2/(M-1). \tag{3.1}$$
Let there exist a BIB design with parameters $v = M$, $b$, $r$, $k$ and $\lambda$.
Let the symbols of the design correspond to the clusters. We select a set of the BIB design with equal probability in the first stage. If $(i_1, i_2, \ldots, i_k)$ is the selected set of the BIB design, then the clusters numbered $i_1, i_2, \ldots, i_k$ enter the sample. If we determine to have a sample size $n$, then from the $i_\alpha$-th cluster we randomly select without replacement
$$n_{i_\alpha} = \frac{M_{i_\alpha}\, n}{\sum_{\alpha=1}^{k} M_{i_\alpha}}$$
units for $\alpha = 1, 2, \ldots, k$. The sampling from different clusters will be independently made.
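Let
$$\bar{y}_B = \frac{b}{rN}\sum_{\alpha=1}^{k} M_{i_\alpha}\bar{y}_{i_\alpha}, \tag{3.2}$$
where $\bar{y}_{i_\alpha}$ is the sample mean of the units of the $i_\alpha$-th cluster.
With a little algebra, analogous to the results of Section 2, we have the following results:
Theorem 3.1. $\bar{y}_B$ is an unbiased estimator of $\bar{Y}$.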
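The allocation rule and the unbiasedness claim can be illustrated on a small case. The Python sketch below makes several assumptions that are ours: the estimator is taken to be $(b/rN)\sum M_{i_\alpha}\bar{y}_{i_\alpha}$, the BIB design consists of all pairs of four clusters ($b = 6$, $r = 3$, $k = 2$, $\lambda = 1$), and the cluster sizes and means are hypothetical. The second-stage sample mean is replaced by its expectation $\bar{Y}_i$, so only the averaging over sets is checked.

```python
from fractions import Fraction as F
from itertools import combinations

def allocation(block, sizes, n):
    """n_i = n * M_i / (total size of the selected clusters); may be fractional."""
    total = sum(sizes[i] for i in block)
    return {i: F(n * sizes[i], total) for i in block}

def expected_estimate(blocks, sizes, cluster_means, r):
    """Average over all sets of (b / (r N)) * sum of M_i * Ybar_i.

    ASSUMPTION: this is the estimator form; the within-cluster sample mean is
    replaced by its expectation, the cluster mean Ybar_i.
    """
    b, N = len(blocks), sum(sizes)
    values = [F(b, r * N) * sum(sizes[i] * cluster_means[i] for i in B)
              for B in blocks]
    return sum(values) / b

# Hypothetical data: four clusters of unequal size and their means.
sizes = [2, 3, 5, 10]
cluster_means = [F(1), F(2), F(3), F(4)]
blocks = list(combinations(range(4), 2))   # all pairs: b = 6, r = 3, k = 2

population_mean = sum(m * yb for m, yb in zip(sizes, cluster_means)) / sum(sizes)

print(allocation((1, 2), sizes, n=6))
print(expected_estimate(blocks, sizes, cluster_means, r=3), population_mean)
```

For the set consisting of clusters 1 and 2 the allocation is 9/4 and 15/4 units, which sums to $n = 6$ but is not integral; how such values are to be rounded is not addressed here. The average of the estimator over the six sets equals the population mean 63/20.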
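Theorem 3.2. If $B_i$ denotes the sum of the cluster sizes of the clusters in the $i$-th set of the BIB design and $P_\alpha$ denotes the sum of the $B_i$'s of the sets in which the $\alpha$-th symbol occurs, we have
$$V(\bar{y}_B) = \frac{b}{r^2N^2}\Big[(r-\lambda)(M-1)S'^2_b + \sum_{\alpha=1}^{M}\Big(\frac{P_\alpha}{n} - r\Big)M_\alpha S'^2_{w\alpha}\Big]. \tag{3.3}$$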
Let $s'^2_{wi_\alpha}$ be the sample variance of the variable for the units of the $i_\alpha$-th cluster. Then the estimated variance $\hat{V}(\bar{y}_B)$ is given by (3.4).
4. USE OF $L_2$ DESIGNS IN CLUSTER SAMPLING. Let the population units be formed into clusters with the help of two factors $A$ and $B$ each at $s$ levels. Thus there are in all $s^2$ clusters and each cluster can be designated by $(ij)$, where $i$ denotes the level of the $A$ factor and $j$ denotes the level of the $B$ factor that the cluster represents. Let there be $M_{ij}$ population units in the $ij$-th cluster. Let $y_{ij\alpha}$ be the observation of the study variable on the $\alpha$-th unit of the $ij$-th cluster $(\alpha = 1, 2, \ldots, M_{ij};\ i = 1, 2, \ldots, s;\ j = 1, 2, \ldots, s)$. Let
$$N_{i\cdot} = \sum_{j=1}^{s} M_{ij}, \quad N_{\cdot j} = \sum_{i=1}^{s} M_{ij}, \quad N = \sum_{i=1}^{s}\sum_{j=1}^{s} M_{ij},$$
$$Y_{ij} = \sum_{\alpha=1}^{M_{ij}} y_{ij\alpha}, \quad \bar{Y}_{ij} = Y_{ij}/M_{ij}, \quad Y_{i\cdot} = \sum_{j=1}^{s} Y_{ij}, \quad Y_{\cdot j} = \sum_{i=1}^{s} Y_{ij},$$
$$Y = \sum_{i=1}^{s}\sum_{j=1}^{s} Y_{ij}, \quad \bar{Y} = Y/N, \quad \bar{\bar{Y}} = Y/s,$$
$$(M_{ij}-1)S'^2_{wij} = \sum_{\alpha=1}^{M_{ij}} (y_{ij\alpha} - \bar{Y}_{ij})^2,$$
$$(s-1)S'^2_{bA} = \sum_{i=1}^{s} (Y_{i\cdot} - \bar{\bar{Y}})^2, \quad (s-1)S'^2_{bB} = \sum_{j=1}^{s} (Y_{\cdot j} - \bar{\bar{Y}})^2,$$
$$(s^2-1)S'^2 = \sum_{i=1}^{s}\sum_{j=1}^{s} \Big(Y_{ij} - \frac{\bar{\bar{Y}}}{s}\Big)^2. \tag{4.1}$$
Let there exist an $L_2$ design with parameters $v = s^2$, $b$, $r$, $k$, $\lambda_1$ and $\lambda_2$ and let the symbols of this design be identified with the $s^2$ clusters. Let a set of the design be selected with equal probability; if $T$ is the selected set, then the sample consists of the clusters $ij \in T$. If the sample size is fixed to be $n$, then from the selected $ij$-th cluster a simple random sample without replacement will be taken of size
$$n_{ij} = \frac{n M_{ij}}{\sum_{ij \in T} M_{ij}}.$$
Sampling from different clusters will be independently made. Let $\bar{y}_{ij}$ be the sample mean of the selected $ij$-th cluster. Let
$$\bar{y}_L = \frac{b}{rN}\sum_{ij \in T} M_{ij}\bar{y}_{ij}.$$
Analogous to the results of previous sections we will have
Theorem 4.1. $\bar{y}_L$ is an unbiased estimator of $\bar{Y}$.
Theorem 4.2.
$$V(\bar{y}_L) = \frac{b}{r^2N^2}\Big[(r - 2\lambda_1 + \lambda_2)(s^2-1)S'^2 + (\lambda_1-\lambda_2)(s-1)(S'^2_{bA} + S'^2_{bB}) + \sum_{i=1}^{s}\sum_{j=1}^{s}\Big(\frac{P_{ij}}{n} - r\Big)M_{ij}S'^2_{wij}\Big], \tag{4.2}$$
where $P_{ij}$ is the sum of the cluster sizes of all the clusters in the sets of the $L_2$ design where symbol $ij$ occurs.
$V(\bar{y}_L)$ can be reduced by choosing the $L_2$ design to satisfy $r - 2\lambda_1 + \lambda_2 = 0$. The estimate of $V(\bar{y}_L)$ can be easily determined through the standard techniques.
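Formula (4.2) can be compared with a direct two-stage computation for $s = 3$. In the Python sketch below all numerical inputs (cluster sizes, totals, within-cluster variances, the sample size) are hypothetical, and two $L_2$ designs of our own choosing are used: the rows and columns of the $3 \times 3$ array ($r = 2$, $\lambda_1 = 1$, $\lambda_2 = 0$, so $r - 2\lambda_1 + \lambda_2 = 0$) and the six transversals ($r = 2$, $\lambda_1 = 0$, $\lambda_2 = 1$). The direct variance is the variance over sets of the conditional mean plus the average conditional variance, the latter using the usual without-replacement formula with the fractional $n_{ij}$ taken at face value.

```python
from fractions import Fraction as F
from itertools import permutations

s, n = 3, 6                                    # hypothetical levels and sample size
cells = [(i, j) for i in range(s) for j in range(s)]
M = dict(zip(cells, [4, 6, 5, 8, 3, 7, 6, 9, 2]))          # cluster sizes M_ij
Y = dict(zip(cells, [10, 25, 14, 30, 9, 21, 18, 40, 5]))   # cluster totals Y_ij
Sw = dict(zip(cells, [2, 3, 1, 4, 2, 5, 3, 6, 1]))         # variances S'^2_wij
N, Ytot = sum(M.values()), sum(Y.values())

rows_cols = ([[(i, j) for j in range(s)] for i in range(s)]
             + [[(i, j) for i in range(s)] for j in range(s)])
transversals = [[(i, p[i]) for i in range(s)] for p in permutations(range(s))]

def lambdas(blocks):
    """Return (r, lam1, lam2), checking constancy within each associate class."""
    r = {sum(c in T for T in blocks) for c in cells}
    lam = {1: set(), 2: set()}
    for a in cells:
        for c in cells:
            if a < c:
                cls = 1 if (a[0] == c[0] or a[1] == c[1]) else 2
                lam[cls].add(sum(a in T and c in T for T in blocks))
    assert len(r) == len(lam[1]) == len(lam[2]) == 1
    return r.pop(), lam[1].pop(), lam[2].pop()

def direct_variance(blocks):
    """Var over sets of E[estimate | set] plus the mean of Var[estimate | set]."""
    b, r = len(blocks), lambdas(blocks)[0]
    w = F(b, r * N)
    est, cond = [], []
    for T in blocks:
        BT = sum(M[c] for c in T)
        est.append(w * sum(Y[c] for c in T))
        # 1/n_ij - 1/M_ij with n_ij = n * M_ij / BT
        cond.append(w ** 2 * sum(M[c] ** 2 * (F(BT, n * M[c]) - F(1, M[c])) * Sw[c]
                                 for c in T))
    mean = sum(est) / b
    return sum((e - mean) ** 2 for e in est) / b + sum(cond) / b

def formula_4_2(blocks):
    b = len(blocks)
    r, l1, l2 = lambdas(blocks)
    tot = sum((Y[c] - F(Ytot, s * s)) ** 2 for c in cells)     # (s^2 - 1) S'^2
    A = sum((sum(Y[i, j] for j in range(s)) - F(Ytot, s)) ** 2 for i in range(s))
    B = sum((sum(Y[i, j] for i in range(s)) - F(Ytot, s)) ** 2 for j in range(s))
    P = {c: sum(sum(M[d] for d in T) for T in blocks if c in T) for c in cells}
    within = sum((F(P[c], n) - r) * M[c] * Sw[c] for c in cells)
    return F(b, r * r * N * N) * ((r - 2 * l1 + l2) * tot + (l1 - l2) * (A + B) + within)

for design in (rows_cols, transversals):
    print(lambdas(design), direct_variance(design) == formula_4_2(design))
```

Working with cluster totals and within-cluster variances as inputs, rather than unit-level data, keeps the sketch short; the formula depends on the population only through these quantities.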
5. USE OF RECTANGULAR DESIGNS IN CLUSTER SAMPLING. As in the previous section we form $st$ clusters of population units with the help of two factors $A$ and $B$ at $s$ and $t$ levels respectively. Let the $ij$-th cluster have $M_{ij}$ elements. Let $y_{ij\alpha}$ be the observation of the study variable on the $\alpha$-th unit of the $ij$-th cluster $(\alpha = 1, 2, \ldots, M_{ij};\ i = 1, 2, \ldots, s;\ j = 1, 2, \ldots, t)$. Let
$$N_{i\cdot} = \sum_{j=1}^{t} M_{ij}, \quad N_{\cdot j} = \sum_{i=1}^{s} M_{ij}, \quad N = \sum_{i=1}^{s}\sum_{j=1}^{t} M_{ij},$$
$$Y_{ij} = \sum_{\alpha=1}^{M_{ij}} y_{ij\alpha}, \quad \bar{Y}_{ij} = Y_{ij}/M_{ij}, \quad Y_{i\cdot} = \sum_{j=1}^{t} Y_{ij}, \quad Y_{\cdot j} = \sum_{i=1}^{s} Y_{ij},$$
$$Y = \sum_{i=1}^{s}\sum_{j=1}^{t} Y_{ij}, \quad \bar{Y} = Y/N, \quad (M_{ij}-1)S'^2_{wij} = \sum_{\alpha=1}^{M_{ij}} (y_{ij\alpha} - \bar{Y}_{ij})^2,$$
$$(s-1)S'^2_{bA} = \sum_{i=1}^{s}\Big(Y_{i\cdot} - \frac{Y}{s}\Big)^2, \quad (t-1)S'^2_{bB} = \sum_{j=1}^{t}\Big(Y_{\cdot j} - \frac{Y}{t}\Big)^2,$$
$$(st-1)S'^2 = \sum_{i=1}^{s}\sum_{j=1}^{t}\Big(Y_{ij} - \frac{Y}{st}\Big)^2. \tag{5.1}$$
Let there exist a rectangular design with parameters $v = st$, $b$, $r$, $k$, $\lambda_1$, $\lambda_2$, and $\lambda_3$ and let the symbols of this design be identified with the $st$ clusters. We follow a similar procedure of drawing the sample as described in the foregoing section. Let $\bar{y}_{ij}$ be the sample mean of the selected $ij$-th cluster and let
$$\bar{y}_R = \frac{b}{rN}\sum M_{ij}\bar{y}_{ij},$$
where the summation is over the selected clusters. Then
Theorem 5.1. $\bar{y}_R$ is an unbiased estimator of $\bar{Y}$.
Theorem 5.2.
$$V(\bar{y}_R) = \frac{b}{r^2N^2}\Big[(r - \lambda_1 - \lambda_2 + \lambda_3)(st-1)S'^2 + (\lambda_1-\lambda_3)(t-1)S'^2_{bB} + (\lambda_2-\lambda_3)(s-1)S'^2_{bA} + \sum_{i=1}^{s}\sum_{j=1}^{t}\Big(\frac{P_{ij}}{n} - r\Big)M_{ij}S'^2_{wij}\Big], \tag{5.2}$$
where $P_{ij}$ is the sum of the cluster sizes of all the clusters in the sets of the design where symbol $ij$ occurs.
$V(\bar{y}_R)$ can be reduced by choosing the rectangular design to satisfy $r - \lambda_1 - \lambda_2 + \lambda_3 = 0$. The estimate of $V(\bar{y}_R)$ can be easily obtained by textbook procedures.
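The between-cluster part of (5.2) can be checked numerically for $s = 2$, $t = 3$. The Python sketch below rests on an assumption that should be stated plainly: first associates are taken to be the clusters sharing a level of $B$ and second associates those sharing a level of $A$, which is the reading under which the coefficients $(\lambda_1 - \lambda_3)$ and $(\lambda_2 - \lambda_3)$ pair with $S'^2_{bB}$ and $S'^2_{bA}$ as written above. The cluster totals, the value of $N$, and the design (pairs sharing a level of $B$ taken twice, together with all third-associate pairs) are hypothetical. Cluster totals are treated as known, so the within-cluster term of (5.2) does not enter.

```python
from fractions import Fraction as F

s, t = 2, 3
cells = [(i, j) for i in range(s) for j in range(t)]
Y = dict(zip(cells, [10, 25, 14, 30, 9, 21]))   # hypothetical cluster totals Y_ij
N = 40                                           # hypothetical population size
Ytot = sum(Y.values())

def assoc(a, c):
    """ASSUMED convention: 1 = same level of B, 2 = same level of A, 3 = neither."""
    if a[1] == c[1]:
        return 1
    return 2 if a[0] == c[0] else 3

same_B = [[a, c] for a in cells for c in cells if a < c and assoc(a, c) == 1]
third = [[a, c] for a in cells for c in cells if a < c and assoc(a, c) == 3]
blocks = same_B + same_B + third                 # b = 12 sets of size k = 2

def parameters(blocks):
    """Return (r, lam1, lam2, lam3), checking constancy within each class."""
    r = {sum(c in T for T in blocks) for c in cells}
    lam = {1: set(), 2: set(), 3: set()}
    for a in cells:
        for c in cells:
            if a < c:
                lam[assoc(a, c)].add(sum(a in T and c in T for T in blocks))
    assert len(r) == 1 and all(len(x) == 1 for x in lam.values())
    return (r.pop(),) + tuple(lam[i].pop() for i in (1, 2, 3))

b = len(blocks)
r, l1, l2, l3 = parameters(blocks)

# Direct first-stage variance of (b / (r N)) * (sum of totals in the selected set).
w = F(b, r * N)
est = [w * sum(Y[c] for c in T) for T in blocks]
mean = sum(est) / b
direct = sum((e - mean) ** 2 for e in est) / b

# Between-cluster part of (5.2).
tot = sum((Y[c] - F(Ytot, s * t)) ** 2 for c in cells)                          # (st-1) S'^2
A = sum((sum(Y[i, j] for j in range(t)) - F(Ytot, s)) ** 2 for i in range(s))   # (s-1) S'^2_bA
B = sum((sum(Y[i, j] for i in range(s)) - F(Ytot, t)) ** 2 for j in range(t))   # (t-1) S'^2_bB
formula = F(b, r * r * N * N) * ((r - l1 - l2 + l3) * tot + (l1 - l3) * B + (l2 - l3) * A)

print((r, l1, l2, l3), direct == formula)
```

Under the opposite convention (first associates sharing a level of $A$) the roles of $S'^2_{bA}$ and $S'^2_{bB}$ in the check would be interchanged.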
6. CONCLUDING REMARKS. Hypercubic designs and extended group divisible designs (see Raghavarao (1971)) could be effectively used to develop sampling schemes when the clusters are formed based on more than 2 factors, and results in this direction will be discussed in a future communication. The relative efficiency of using different designs in a given sampling situation is also under study and we expect to discuss these results in a subsequent paper.
REFERENCES
[1] Chakrabarti, M.C. (1963). On the use of incidence matrices of designs in sampling from finite populations. J. Indian Statist. Assoc., 1, 78-85.
[2] Mohanty (1971). Unpublished Ph.D. thesis, Institute of Agricultural Research Statistics, New Delhi.
[3] Murthy, M.N. (1967). Sampling Theory and Methods. Statistical Publishing Society, India.
[4] Raghavarao, D. (1971). Constructions and Combinatorial Problems in Design of Experiments. John Wiley, New York.