Jour. Ind. Soc. Ag. Statistics
52(2), 1999: 201-209

On Ascertaining Superiority of Without Replacement Sampling Over With Replacement Sampling With Unequal Probabilities

Pranesh Kumar¹ and Nihan Kesim²
University of Transkei, 5117 Umtata, E. Cape, South Africa
(Received: March, 1997)

SUMMARY
When is without replacement sampling superior to with replacement
sampling with unequal probabilities? An attempt is made to answer this
question by suggesting an algorithm to test whether a quadratic form is
positive definite or positive semi-definite. A program in WF77 for the
FORTRAN compiler is written to implement the algorithm. The only
requirement to execute the program is to input a matrix whose elements
are simple functions of the first and second order inclusion probabilities,
derived from a without replacement πPS sampling scheme.
Key words: Quadratic form, Positive definite, Positive semi-definite,
Probability proportional to size, Inclusion probability proportional to size.
1. Introduction
Let there be a finite population $U$ of $N$ identifiable and distinct sampling units, denoted by $i$. This population is to be sampled for estimating the unknown real-valued parameter $Y$, the total of a characteristic $y$ defined on $U$. Given is a set of known probabilities $p_i$, $i \in U$, $\sum_{i=1}^{N} p_i = 1$, with which a probability proportional to size (PPS) sample of size $n$ is selected with replacement (WR) or without replacement (WOR). The comparison of WR and WOR sampling with unequal probabilities has been considered by various researchers, notably by Narain [6] and Des Raj [7]. This comparison for the πPS sampling schemes in which $\pi_i = np_i$, in terms of variances, using the Hansen and Hurwitz [3] estimator and the Horvitz and Thompson [4] estimator of $Y$, results in
$$V(T_{wr}) - V(T_{wor}) = \frac{n-1}{n} \left[ \sum_{i=1}^{N} \sum_{j=1}^{N} A_{ij}\, y_i y_j \right] \tag{1.1}$$

where $T_{wr}$ and $T_{wor}$ are the PPS WR and WOR estimators of the population total $Y$, $V(T_{wr})$ and $V(T_{wor})$ their respective variances and

$$A_{ij} = 1, \ \text{if } i = j; \quad \text{otherwise, } A_{ij} = 1 - \frac{n\,\pi_{ij}}{(n-1)\,\pi_i \pi_j}, \quad i \neq j. \tag{1.2}$$

Here, $\pi_i$ and $\pi_{ij}$ are the first and second order inclusion probabilities of the unit $i$ and the pair of units $(i, j)$ respectively.

¹ Present address: University of Northern British Columbia, 3333 University Way, Prince George, BC, Canada V2N 4Z9
² Bilkent University, 06533 Ankara, Turkey
Equation (1.1) is expressed as a quadratic form $Q_0 = \frac{n-1}{n} Q$, where $Q = y'Ay$, $y' = [y_1, \ldots, y_N]$ is a $1 \times N$ row vector and $A = [A_{ij}]$ is an $N \times N$ symmetric matrix. It may be noted that the quadratic form $Q_0$ is the same as that given by earlier researchers while investigating the conditions for WOR sampling to have smaller variance than WR sampling with unequal probabilities (Gabler [1], [2] and Mukhopadhyay and Bhattacharya [5]). Thus, the superiority of WOR sampling over WR sampling can be established by ascertaining whether the quadratic form $Q$ is positive definite or positive semi-definite. This is accomplished through an algorithm and a computer program in what follows.
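As an illustration of (1.2), the matrix A can be assembled directly from the inclusion probabilities. The following is a minimal Python sketch, not the authors' WF77 program. The function name build_A and its arguments (pi, pi2, n) are hypothetical. The small equal-probability case used to exercise it (N = 4, n = 2, simple random sampling without replacement, where the first order probabilities are n/N and the second order ones n(n-1)/(N(N-1))) is an assumption chosen only because its inclusion probabilities are known in closed form.

```python
def build_A(pi, pi2, n):
    """Assemble the matrix A of (1.2).

    pi  : list of first order inclusion probabilities pi_i
    pi2 : N x N nested list of second order inclusion probabilities pi_ij
          (only the off-diagonal entries are read)
    n   : sample size (n >= 2)
    """
    N = len(pi)
    A = [[1.0] * N for _ in range(N)]  # A_ii = 1 on the diagonal
    for i in range(N):
        for j in range(N):
            if i != j:
                A[i][j] = 1.0 - n * pi2[i][j] / ((n - 1) * pi[i] * pi[j])
    return A


# Hypothetical check case: equal probabilities, N = 4, n = 2.
N, n = 4, 2
pi = [n / N] * N
pi2 = [[n * (n - 1) / (N * (N - 1))] * N for _ in range(N)]
A = build_A(pi, pi2, n)
print(A[0])  # diagonal entry 1, off-diagonal entries equal to -1/(N-1)
```

In this special case every off-diagonal entry reduces to -1/(N-1), which gives a quick sanity check on the sign and scaling of (1.2).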
2. Algorithm to Ascertain the Positive Definiteness or Positive
Semi-Definiteness of a Quadratic Form
Without loss of generality, we deal with the case of positive quadratic forms; the modifications necessary in the case of negative forms are obvious.
Given an $N \times N$ real-valued matrix $A = [A_{ij}]$, denote by $J$ the set $\mathbb{R}^N \setminus \{0\}$, that is, the set containing all real vectors of $N$ components except the vector with all components equal to zero. The quadratic form $Q = y'Ay$ is said to be positive definite if $Q > 0$ for every $y \in J$, and positive semi-definite if $Q \geq 0$ for every $y \in J$ and $Q = 0$ for some vectors $y \in J$. In other terms, if we define the sets

$$B = \{y : y \in J,\ y'Ay = 0\} \quad \text{and} \quad C = \{y : y \in J,\ y'Ay < 0\},$$

$Q$, and equivalently $A$, is said to be positive definite if and only if $B \cup C = \emptyset$, and positive semi-definite if and only if $B \neq \emptyset$ and $C = \emptyset$.
In the first place, the problem of determining the positive definiteness or positive semi-definiteness of a quadratic form can sometimes be solved simply by inspection of the matrix $A$. Precisely, this happens if
(1) an entry of the main diagonal, say $A_{ii}$, is negative. The quadratic form in this case is neither positive definite nor positive semi-definite, since any vector with $y_i \neq 0$ and $y_j = 0$ for $j \neq i$ belongs to $C$;
(2) an entry of the main diagonal, say $A_{ii}$, is zero and there exists a $j$ such that $A_{ij} + A_{ji} \neq 0$. In this case too, the quadratic form is neither positive definite nor positive semi-definite, since $Q$ can be negative. For example, if we take $y_j = A_{ij} + A_{ji}$, $y_i < -A_{jj}$ and $y_k = 0$ for every $k \neq i, j$, then $Q = (A_{ij} + A_{ji})^2 (A_{jj} + y_i) < 0$;
(3) for all $i$ the following inequality holds:

$$A_{ii} > \frac{1}{2} \sum_{j \neq i} \left( |A_{ij}| + |A_{ji}| \right). \tag{2.1}$$

In this case, the quadratic form $Q$ is positive definite since we can write

$$Q = \sum_{i=1}^{N} \Big[ A_{ii} - \frac{1}{2} \sum_{j \neq i} \left( |A_{ij}| + |A_{ji}| \right) \Big] y_i^2 + \frac{1}{2} \sum_{i=1}^{N} \sum_{j \neq i} |A_{ij}|\, t_{ij}^2,$$

where $t_{ij} = y_i - y_j$, if $A_{ij} < 0$; otherwise $t_{ij} = y_i + y_j$, if $A_{ij} \geq 0$.
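The decomposition used in situation (3) can be checked numerically. The sketch below is illustrative only: the helper names quadratic_form and dominance_decomposition, the test matrix and the test vector are all hypothetical. It evaluates $Q = y'Ay$ directly and through the two sums above, with $t_{ij}$ chosen by the sign of $A_{ij}$; the two values should agree up to rounding for any real matrix, symmetric or not.

```python
def quadratic_form(A, y):
    """Direct evaluation of Q = y'Ay."""
    N = len(y)
    return sum(A[i][j] * y[i] * y[j] for i in range(N) for j in range(N))


def dominance_decomposition(A, y):
    """Evaluate Q as the sum of 'slack' terms and weighted squares t_ij^2."""
    N = len(y)
    total = 0.0
    for i in range(N):
        # Bracketed coefficient of y_i^2: positive for all i exactly when (2.1) holds.
        slack = A[i][i] - 0.5 * sum(
            abs(A[i][j]) + abs(A[j][i]) for j in range(N) if j != i
        )
        total += slack * y[i] ** 2
        for j in range(N):
            if j != i:
                t = y[i] - y[j] if A[i][j] < 0 else y[i] + y[j]
                total += 0.5 * abs(A[i][j]) * t ** 2
    return total


# Hypothetical non-symmetric test matrix and vector.
A = [[3.0, -1.0, 1.0], [0.5, 2.0, 0.0], [1.0, -0.25, 4.0]]
y = [1.0, -2.0, 0.5]
print(quadratic_form(A, y), dominance_decomposition(A, y))
```

Every term in the second form is non-negative when (2.1) holds, and the first sum is strictly positive for any non-zero $y$, which is the content of situation (3).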
If the problem is to make sense, all entries of the main diagonal, i.e., $A_{ii}$, must be positive [this holds in (1.2), where $A_{ii} = 1$]. Given any positive integer $h$ ($1 \leq h \leq N$), we can define a hyperplane $H$ by means of the following equation:

$$y_h = -\frac{1}{2A_{hh}} \sum_{i \neq h} (A_{hi} + A_{ih})\, y_i. \tag{2.2}$$

Now, define the set $J^* = J \cap H$, that is

$$J^* = \Big\{ y : y \in J,\ y_h = -\frac{1}{2A_{hh}} \sum_{i \neq h} (A_{hi} + A_{ih})\, y_i \Big\},$$

and prove a lemma which will be useful.
Lemma 1. Given any vector $y_0 \in J$, there is a vector $y^* \in J^*$ such that ${y^*}'Ay^* \leq y_0'Ay_0$.
Proof. The quadratic form $Q$ can be split into two parts as

$$Q = \sum_{i \neq h} \sum_{j \neq h} A_{ij}\, y_i y_j + \Big[ A_{hh}\, y_h^2 + \sum_{i \neq h} (A_{hi} + A_{ih})\, y_h y_i \Big]. \tag{2.3}$$

Since $A_{hh} > 0$, the term in square brackets on the right-hand side of (2.3), viewed as a function of $y_h$, is minimized when equality (2.2) holds. Therefore, the lemma is valid with the vector $y^*$ having all components $y_i$ equal to those of the vector $y_0$, except for $y_h$, which is determined by (2.2).
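The minimization step in the proof can be made explicit by completing the square. Writing $s = \sum_{i \neq h} (A_{hi} + A_{ih})\, y_i$ (a shorthand introduced here only for this remark), the bracketed term of (2.3) is

$$A_{hh}\, y_h^2 + s\, y_h = A_{hh} \Big( y_h + \frac{s}{2A_{hh}} \Big)^2 - \frac{s^2}{4A_{hh}},$$

which, for $A_{hh} > 0$, attains its minimum $-s^2/(4A_{hh})$ exactly at $y_h = -s/(2A_{hh})$, that is, when (2.2) holds.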
Now, we prove the following theorem on which the algorithm is based.
Theorem 2. A quadratic form is positive definite (positive semi-definite) in $J$ if and only if it is positive definite (positive semi-definite) in $J^*$.
Proof. If the set $C$ is void in $J$, it is void in $J^*$ too, since $J^* \subset J$. The converse is true because of the lemma. In the same manner it is seen that, if $C$ is void, the set $B$ is void or not void together in $J$ and its subset $J^*$. This completes the proof.
Corollary 3. If none of the situations (1), (2) or (3) holds, we can consider instead of $Q$ its restriction to $J^*$ by eliminating the variable $y_h$. This means that $Q$ can be replaced by $Q^* = y'A^*y$, where $A^*$ is a square matrix of order $(N-1) \times (N-1)$. If necessary, that is, if still none of the situations (1), (2) or (3) prevails, and if the entries of the main diagonal are still positive, we can reduce further the number of variables, and so on.
In the course of the procedure it can happen that one or more variables are excluded from the quadratic form. A variable $y_i$ is excluded when $A_{ii} = 0$ and, for every $j \neq i$, $A_{ij} + A_{ji} = 0$. The presence of excluded variables obviously means that the quadratic form is not positive definite, because a vector belonging to $B$ is obtained by giving the excluded variables arbitrary non-zero values and putting equal to zero the variables which still remain at this stage. The rows and columns corresponding to the excluded variables should be deleted, because the presence of a zero in the main diagonal would prevent the application of substitution (2.2), and also in order to avoid useless calculations. Strictly speaking, this deletion will not necessarily preserve the equivalence of the quadratic forms; it may happen that the elimination of the excluded variables removes the set $B$. This circumstance, however, is irrelevant, since it has already been ascertained that originally this set was not void.
On the basis of the above considerations, choosing for example to eliminate, by means of formula (2.2), each time the first variable (obviously any other choice would do as well), we formulate the recursive algorithm as follows:
Step 1. Verify if any of the situations (1), (2) or (3) holds. If yes, stop and draw the corresponding conclusion. If not, delete the coefficients of any excluded variables, taking notice that the quadratic form is not positive definite. Proceed to Step 2.
Step 2. Change the entries of the rows and columns from the second up to the $N$-th, according to the formulae which will be given below. Delete the first row and first column. Go to Step 1.
If the procedure does not stop before, we shall end up with a single number, which is the final coefficient of $y_N^2$. Of course, if there has been no exclusion of variables, when this number is positive, the quadratic form is positive definite; when it is zero, the quadratic form is positive semi-definite; and when it is negative, the quadratic form is neither positive definite nor positive semi-definite.
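The two steps can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the authors' WF77 program. The function name classify_quadratic_form and the tolerance argument tol are hypothetical; the input is assumed to be symmetric, so the update uses the symmetric form of the new entries, $A_{11}A_{ij} - A_{1i}A_{1j}$; and the rule that a positive outcome reached after an exclusion is reported as positive semi-definite is an interpretation of the remarks on excluded variables above.

```python
def classify_quadratic_form(A, tol=0.0):
    """Classify y'Ay for a symmetric matrix A given as a nested list.

    Returns "positive definite", "positive semi-definite" or "neither".
    """
    M = [list(row) for row in A]
    excluded_seen = False  # True once an excluded variable has been deleted
    while True:
        n = len(M)
        if n == 0:
            # All remaining variables were excluded: the form vanishes on them.
            return "positive semi-definite"
        # Situation (1): a negative diagonal entry.
        if any(M[i][i] < -tol for i in range(n)):
            return "neither"
        # Situation (2): zero diagonal entry with a non-zero entry in its row.
        zero_diag = [i for i in range(n) if abs(M[i][i]) <= tol]
        for i in zero_diag:
            if any(abs(M[i][j]) > tol for j in range(n) if j != i):
                return "neither"
        # Situation (3): inequality (2.1), written for a symmetric matrix.
        if all(
            M[i][i] > sum(abs(M[i][j]) for j in range(n) if j != i)
            for i in range(n)
        ):
            return "positive semi-definite" if excluded_seen else "positive definite"
        # Step 1, continued: delete excluded variables (zero row and column).
        if zero_diag:
            excluded_seen = True
            keep = [i for i in range(n) if i not in zero_diag]
            M = [[M[i][j] for j in keep] for i in keep]
            continue
        # Step 2: eliminate the first variable; new entry is a 2 x 2 determinant.
        a11 = M[0][0]
        M = [
            [a11 * M[i][j] - M[0][i] * M[0][j] for j in range(1, n)]
            for i in range(1, n)
        ]


print(classify_quadratic_form([[10, 9, 9], [9, 10, 9], [9, 9, 10]]))
print(classify_quadratic_form([[1, 1], [1, 1]]))
print(classify_quadratic_form([[1, 2], [2, 1]]))
```

With integer entries the update stays in integers, which mirrors the motivation for clearing the denominator. With floating-point entries repeated products shrink or grow quickly, so a small positive tol may be preferable to the exact comparison with zero used by default.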
As regards the transformation of the matrix entries, it is seen that, putting in $Q$

$$y_1 = -\frac{1}{2A_{11}} \sum_{j=2}^{N} (A_{1j} + A_{j1})\, y_j,$$

the coefficient of the product $y_i y_j$ becomes $(A_{ij} + A_{ji}) - (A_{1i} + A_{i1})(A_{1j} + A_{j1})/2A_{11}$, if $i \neq j$. This can be split into the sum of two coefficients, $A_{ij} - [(A_{1i} + A_{i1})(A_{1j} + A_{j1})/4A_{11}]$ and $A_{ji} - [(A_{1i} + A_{i1})(A_{1j} + A_{j1})/4A_{11}]$. The coefficient of $y_i^2$ becomes $A_{ii} - [(A_{1i} + A_{i1})^2/4A_{11}]$. However, to avoid fractions when the original entries are integers, it is preferable to multiply these expressions by the positive number $4A_{11}$ (an operation which obviously does not affect the nature of the quadratic form), to obtain as new entries

$$4A_{11}A_{ij} - (A_{1i} + A_{i1})(A_{1j} + A_{j1}); \quad i, j = 2, \ldots, N.$$

Since the first row and column are deleted, the entry written above will be placed in the new matrix in the $(i-1)$-th row and the $(j-1)$-th column. It may be observed further that these entries can be written as the determinant

$$\begin{vmatrix} 2A_{11} & A_{1j} + A_{j1} \\ A_{1i} + A_{i1} & 2A_{ij} \end{vmatrix}.$$

These expressions are simplified when, as can always be arranged, the matrix $A$ of the quadratic form is symmetric: $A_{ij} = A_{ji}$. Since the symmetry is preserved by the above transformations, the calculations are reduced almost by half. Dividing by the common factor 4, we obtain that, in this case, the new entry placed in the $(i-1)$-th row and $(j-1)$-th column can be written as

$$\begin{vmatrix} A_{11} & A_{1j} \\ A_{1i} & A_{ij} \end{vmatrix} = A_{11}A_{ij} - A_{1i}A_{1j}.$$
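For a symmetric matrix, one elimination step amounts to replacing each entry with $i, j \geq 2$ by $A_{11}A_{ij} - A_{1i}A_{1j}$ and dropping the first row and column. The following Python sketch shows a single such step; the function name reduce_symmetric is hypothetical. It is applied to the leading 3 x 3 block of the matrix of Example 1 (entries 1.00, .912, .914, .919), and the rounded results can be compared with the first entries of the first NEW MATRIX in Output-Ex.1.

```python
def reduce_symmetric(A):
    """One elimination step for a symmetric matrix A (nested list).

    Each new entry is the 2 x 2 determinant A11*Aij - A1i*A1j; the result
    has one row and one column fewer than A.
    """
    n = len(A)
    a11 = A[0][0]
    return [
        [a11 * A[i][j] - A[0][i] * A[0][j] for j in range(1, n)]
        for i in range(1, n)
    ]


# Leading 3 x 3 block of the matrix A in Example 1.
block = [
    [1.00, 0.912, 0.914],
    [0.912, 1.00, 0.919],
    [0.914, 0.919, 1.00],
]
reduced = reduce_symmetric(block)
print([[round(x, 3) for x in row] for row in reduced])
```

Because a new entry depends only on the first row and on the entry being replaced, working with a leading block gives the same values as the corresponding entries of the full reduction.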
In symmetric cases, the algorithm requires the calculation of at most

$$\binom{N}{2} + \binom{N-1}{2} + \cdots + \binom{2}{2} = \binom{N+1}{3}$$

different $2 \times 2$ determinants.
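The count can be confirmed for any particular $N$ with a short Python check. The function name determinant_count is hypothetical; the sum runs over the distinct entries of the successive symmetric matrices of orders $N-1, N-2, \ldots, 1$.

```python
from math import comb


def determinant_count(N):
    """Upper bound on the number of distinct 2 x 2 determinants for order N."""
    # A symmetric matrix of order k-1 has comb(k, 2) distinct entries.
    return sum(comb(k, 2) for k in range(2, N + 1))


N = 10  # order of the matrices in the examples of Section 3
print(determinant_count(N), comb(N + 1, 3))
```

For $N = 10$ both expressions give 165, so the work involved is modest even by hand-calculation standards.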
3. Comparison of Without Replacement and With
Replacement Sampling with Unequal Probabilities
To execute the calculations involved in the algorithm, a computer program (which may be obtained from the authors) was prepared in WF77. The two examples below present the application of the algorithm to compare with replacement and without replacement sampling. The main advantage of the algorithm lies in the simplicity of its execution. Users only need to input the matrix whose elements are functions of the first and second order inclusion probabilities derived from the sampling scheme under investigation and run the program under a FORTRAN 77 compiler on their personal computers.
Example 1. We take the example from Sampford [8]. Here, $N = 10$, $n = 2$ and $\pi_1 = .36$, $\pi_2 = .28$, $\pi_3 = .26$, $\pi_4 = .22$, $\pi_5 = \pi_6 = .20$, $\pi_7 = .16$, $\pi_8 = .14$, $\pi_9 = .10$, $\pi_{10} = .08$. Since Sampford's sampling method provides the WOR sample with $\pi_i = np_i$, the symmetric matrix $A$ corresponding to the quadratic form $y'Ay$ is (upper triangle shown)

A =
  1.00  .912  .914  .916  .917  .917  .918  .919  .921  .921
        1.00  .919  .921  .922  .922  .923  .924  .926  .926
              1.00  .922  .923  .923  .925  .925  .927  .928
                    1.00  .925  .925  .927  .927  .929  .930
                          1.00  .926  .928  .928  .930  .931
                                1.00  .928  .928  .930  .931
                                      1.00  .930  .932  .932
                                            1.00  .932  .933
                                                  1.00  .935
                                                        1.00
Output-Ex.1 in the Appendix indicates that, during the execution of the algorithm, when the matrix $A$ is of order $5 \times 5$, the matrix $A$ and hence the corresponding quadratic form is found to be positive definite. Thus, for the situation expressed in Example 1, Sampford's WOR sampling scheme is superior to PPSWR sampling.
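One way to read the stopping point is that the reduced matrix of order 5 satisfies inequality (2.1). The Python sketch below checks this for the entries of the last matrix in Output-Ex.1, taken to two significant digits as printed; the helper name strictly_dominant is hypothetical, and the assumption that the program stopped because of situation (3) is an inference from the output, not a statement in the paper.

```python
def strictly_dominant(M):
    """Inequality (2.1) for a symmetric matrix: A_ii > sum of |A_ij|, j != i."""
    n = len(M)
    return all(
        M[i][i] > sum(abs(M[i][j]) for j in range(n) if j != i)
        for i in range(n)
    )


# Last NEW MATRIX of Output-Ex.1 (order 5), two significant digits as printed.
diag = [0.12e-13, 0.11e-13, 0.11e-13, 0.11e-13, 0.10e-13]
off = 0.21e-14
M = [[diag[i] if i == j else off for j in range(5)] for i in range(5)]
print(strictly_dominant(M))
```

Each row has four off-diagonal entries of about .21E-14, whose sum (.84E-14) is below the smallest printed diagonal entry (.10E-13), so the inequality holds with the printed values.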
Example 2. Consider the same data as in Example 1 but for sample size 5. Thus, $N = 10$, $n = 5$ and $\pi_1 = .90$, $\pi_2 = .70$, $\pi_3 = .65$, $\pi_4 = .55$, $\pi_5 = \pi_6 = .50$, $\pi_7 = .40$, $\pi_8 = .35$, $\pi_9 = .25$, $\pi_{10} = .20$.
The symmetric matrix $A$ corresponding to the quadratic form $y'Ay$ is (upper triangle shown)

A =
   1.00   .232   .229  -.222  -.218  -.218  -.211   .207   .201  -.198
          1.00  -.821  -.160  -.148  -.148   .127  -.117  -.099  -.091
                 1.00  -.143  -.130  -.130  -.106  -.094  -.073  -.063
                        1.00  -.094  -.094  -.063  -.049  -.023  -.011
                               1.00  -.076  -.048  -.027  -.000  -.987
                                      1.00  -.042  -.027  -.000  -.987
                                             1.00  -.987   .957   .943
                                                    1.00   .940  -.923
                                                           1.00  -.887
                                                                  1.00
During the execution of the program, when the algorithm reaches the stage where the matrix $A$ is of order $7 \times 7$, $A$ is found to be neither positive definite nor positive semi-definite. Thus, Sampford's WOR sampling cannot be considered superior to PPSWR sampling for $n = 5$, although it is for $n = 2$, for the population given in the example.
ACKNOWLEDGEMENT
The authors thank the referee for useful suggestions on an earlier version of the paper.
REFERENCES
[1] Gabler, S., 1981. A comparison of Sampford's sampling procedure versus unequal probability sampling with replacement. Biometrika, 68, 3, 725-727.
[2] Gabler, S., 1984. On unequal probability sampling: Sufficient conditions for the superiority of sampling without replacement. Biometrika, 71, 1, 171-175.
[3] Hansen, M.H. and Hurwitz, W.N., 1943. On the theory of sampling from finite populations. Ann. Math. Stat., 14, 333-362.
[4] Horvitz, D.G. and Thompson, D.J., 1952. A generalization of sampling without replacement from a finite universe. J. Amer. Stat. Assoc., 47, 663-685.
[5] Mukhopadhyay, P. and Bhattacharya, S., 1991. On a comparison between Midzuno strategy and PPSWR strategy. Statistics, 22, 2, 291-294.
[6] Narain, R.D., 1951. On sampling without replacement with varying probabilities. J. Ind. Soc. Agric. Statist., 3, 169-174.
[7] Raj, D., 1966. On a method of sampling with unequal probabilities. Ganita, 17.
[8] Sampford, M.R., 1967. On sampling without replacement with unequal probabilities of selection. Biometrika, 54, 499-513.
APPENDIX
OUTPUT-EX. 1
MATRIX A :
  1.00  .912  .914  .916  .917  .917  .918  .919  .921  .921
        1.00  .919  .921  .922  .922  .923  .924  .926  .926
              1.00  .922  .923  .923  .925  .925  .927  .928
                    1.00  .925  .925  .927  .927  .929  .930
                          1.00  .926  .928  .928  .930  .931
                                1.00  .928  .928  .930  .931
                                      1.00  .930  .932  .932
                                            1.00  .932  .933
                                                  1.00  .935
                                                        1.00
NEW MATRIX:
  0.17  .085  .085  .085  .085  .085  .085  .085  .085
        0.16  .085  .085  .085  .085  .085  .085  .086
              0.16  .085  .085  .085  .085  .086  .086
                    0.16  .085  .085  .086  .086  .086
                          0.16  .085  .086  .086  .086
                                0.16  .086  .086  .086
                                      0.15  .086  .086
                                            0.15  .086
                                                  0.15
NEW MATRIX:
  .020  .007  .007  .007  .007  .007  .007  .007
        .020  .007  .007  .007  .007  .007  .007
              .019  .007  .007  .007  .007  .007
                    .019  .007  .007  .007  .007
                          .019  .007  .007  .007
                                .018  .007  .007
                                      .018  .007
                                            .018
NEW MATRIX:
  .35E-03  .93E-04  .93E-04  .93E-04  .93E-04  .93E-04  .93E-04
           .34E-03  .93E-04  .93E-04  .93E-04  .93E-04  .93E-04
                    .34E-03  .93E-04  .93E-04  .93E-04  .93E-04
                             .33E-03  .93E-04  .93E-04  .93E-04
                                      .32E-03  .93E-04  .93E-04
                                               .31E-03  .94E-04
                                                        .31E-03
NEW MATRIX:
  .11E-06  .24E-07  .24E-07  .24E-07  .24E-07  .24E-07
           .11E-06  .24E-07  .24E-07  .24E-07  .24E-07
                    .11E-06  .24E-07  .24E-07  .24E-07
                             .10E-06  .24E-07  .24E-07
                                      .10E-06  .24E-07
                                               .99E-07
NEW MATRIX:
  .12E-13  .21E-14  .21E-14  .21E-14  .21E-14
           .11E-13  .21E-14  .21E-14  .21E-14
                    .11E-13  .21E-14  .21E-14
                             .11E-13  .21E-14
                                      .10E-13
POSITIVE DEFINITE