CHAPTER 2

JACKKNIFING A MULTIVARIATE RATIO ESTIMATOR UNDER A SUPER-POPULATION MODEL
SUMMARY: This chapter deals with the jackknifing of a multivariate ratio estimator due to Srivastava (1967). The case of jackknifing a ratio estimator involving only a single auxiliary variable is discussed in Chapter 1 as well as in Nandi and Aich (1994). It is shown that the jackknifed estimator compares favourably with its competitors, including Olkin's (1958) multivariate ratio estimator. The jackknifed estimator is then examined under a model which is an extension of Durbin's (1959) model to the multi-auxiliary situation. It is shown that any increase in variance due to jackknifing is of the order $1/n^2$, $n$ being the sample size. Finally, the conditions under which jackknifing reduces bias as well as variance are identified. A numerical illustration is also given using real-life data.
1. INTRODUCTION
In Chapter 1, we have dealt only with the case of a single auxiliary variable. A modified ratio estimator incorporating a single auxiliary character is jackknifed, and the resulting estimator is shown to have better precision under Durbin's model, in addition to having reduced bias.

Olkin (1958) has extended the classical ratio estimator involving one auxiliary variable to the multi-auxiliary situation. He has considered a weighted average of several uni-auxiliary ratio estimators, the weights being determined optimally by minimizing the variance. Raj (1965), on the other hand, has considered a weighted average of several difference estimators. His estimator, though computationally simpler, is no more efficient than Olkin's estimator.

The classical ratio estimator has been extended in a number of ways. Srivastava (1967, 1971) has defined a class of estimators which contains both the ratio and product estimators. As is well known, the ratio estimator is useful when the auxiliary variable is positively correlated with the study variable; otherwise a product estimator should be chosen. The multivariate ratio estimator of Olkin can be shown to be a member of Srivastava's class of estimators.
In a recent article, Rao (1991) has compared some commonly used estimators using auxiliary information for four natural populations and found no merit in considering modifications of the classical estimators, e.g., Srivastava's modified ratio estimator. Nandi and Aich (1994), however, have observed that if the modified estimators are jackknifed, then a gain in precision can be achieved in some cases, in addition to having reduced biases. They have considered the case of a single auxiliary variable only in their discussion.

The first idea about the jackknife technique, which originated outside the field of sample surveys, is due to Quenouille (1956), who used the technique to reduce the bias of an estimator in an infinite population context. For finite populations, the jackknife technique was first used by Durbin (1959). The literature on jackknifing in sample survey problems has grown rapidly since then.

The present chapter is concerned with the problem of jackknifing Srivastava's version of the multivariate ratio estimator. The proposed jackknifed estimator is shown to be unbiased up to the order $1/n$, $n$ being the sample size. Also, this estimator is seen to be the minimum variance unbiased estimator (MVUE) in a class of estimators, in some sense. The expression of the minimum variance is obtained. The proposed estimator is then compared with its competitors by using real-life data and found to be better.
Finally, the proposed estimator is examined under Durbin's super-population model. This model is an extension to the multi-auxiliary situation of a model considered by Durbin (1959) for the uni-auxiliary case. In the multi-auxiliary case, it is assumed that each regression, taken individually, is linear passing through the origin while the residual variance is a constant. Also, the auxiliary variables jointly follow a version of a multivariate gamma distribution. The expressions of the bias and the variance of the proposed estimator, under the above model, are obtained. Also, the conditions for the superiority of the proposed jackknifed estimator compared to the unjackknifed version have been identified.
2. NOTATIONS AND PRELIMINARIES

Let $y$ denote the character whose population mean $\bar Y$ is to be estimated using information on $p$ auxiliary characters, which may be denoted by $x_1, x_2, \ldots, x_p$, respectively. Also, let the vector $(Y_i, X_{1i}, X_{2i}, \ldots, X_{pi})$ denote the values of the $i$-th unit of the population, $i = 1, 2, \ldots, N$, which we assume to be positive. It is also assumed that the population means of the $p$ auxiliary characters, namely, $\bar X_1, \bar X_2, \ldots, \bar X_p$, are known.

Let us now consider a random sample $(y_i, x_{1i}, x_{2i}, \ldots, x_{pi})$, $i = 1, 2, \ldots, n$, of size $n$ drawn from the above population without replacement. Also, let $\bar y, \bar x_1, \bar x_2, \ldots, \bar x_p$ be the sample means of $y$ and the $p$ auxiliary variables, respectively.
The following population characteristics are then defined:

$$S_0^2 = (N-1)^{-1} \sum_{i=1}^{N} (Y_i - \bar Y)^2, \qquad S_j^2 = (N-1)^{-1} \sum_{i=1}^{N} (X_{ji} - \bar X_j)^2,$$

$$C_0^2 = S_0^2 / \bar Y^2, \qquad C_j^2 = S_j^2 / \bar X_j^2, \qquad j = 1, 2, \ldots, p,$$

$$f = (N - n)/N, \text{ the finite population correction (fpc)}.$$

Again, let $\rho_{jj'}$ denote the correlation coefficient between the characters $x_j$ and $x_{j'}$ $(j \neq j')$, and $\rho_{0j}$ that between $y$ and $x_j$.
To obtain the large sample approximations for the mean and the variance of the proposed estimator, we shall use the Taylor's series method which is traditionally employed in the context of ratio and regression methods of estimation. Because of the complicated form of the terms of order $1/n^2$ and their dubious values (see Olkin (1958)), only terms of order $1/n$, $n$ being the sample size, will be considered under the following assumptions:

$$y_i = \bar Y + \epsilon_i, \qquad x_{ji} = \bar X_j + \epsilon_{ji} \qquad (i = 1, 2, \ldots, N;\ j = 1, 2, \ldots, p),$$

with

$$E(\epsilon_i) = 0, \quad E(\epsilon_{ji}) = 0, \quad V(\bar\epsilon) = (f/n)\, S_0^2, \quad V(\bar\epsilon_j) = (f/n)\, S_j^2, \quad \mathrm{Cov}(\bar\epsilon, \bar\epsilon_j) = (f/n)\, \rho_{0j} S_0 S_j, \qquad (2.2)$$

where $\bar\epsilon$ and $\bar\epsilon_j$ are the sample means of the respective errors.

For the Taylor's series expansion, it will further be assumed that

$$|\bar\epsilon / \bar Y| < 1, \qquad |\bar\epsilon_j / \bar X_j| < 1, \qquad \text{for each } j. \qquad (2.2a)$$
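The moment conditions in (2.2) can be checked directly for a small population by enumerating every without-replacement sample. The sketch below (pure Python; the toy population values are hypothetical) verifies that the exact variance of the sample mean equals $(f/n)S_0^2$, with $f = (N-n)/N$ and $S_0^2$ defined with divisor $N-1$ as above.

```python
from itertools import combinations

def srswor_mean_variance(pop, n):
    """Exact variance of the sample mean over all C(N, n)
    simple random samples drawn without replacement."""
    means = [sum(s) / n for s in combinations(pop, n)]
    grand = sum(means) / len(means)
    return sum((m - grand) ** 2 for m in means) / len(means)

pop = [3.0, 7.0, 8.0, 12.0, 15.0, 21.0]   # hypothetical Y-values
N, n = len(pop), 2
Ybar = sum(pop) / N
S0_sq = sum((y - Ybar) ** 2 for y in pop) / (N - 1)   # divisor N - 1, as in Section 2
f = (N - n) / N                                       # finite population correction
exact = srswor_mean_variance(pop, n)
formula = (f / n) * S0_sq
print(exact, formula)   # the two values agree
```

The agreement is exact, not asymptotic: $V(\bar y) = (f/n)S_0^2$ is an identity under simple random sampling without replacement.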
Finally, the proposed estimator is examined under the following extension of Durbin's (1959) model:

(i) Each regression, taken individually, is a straight line passing through the origin. In other words, we have
$$E(y \mid x_j) = \beta_j x_j, \qquad j = 1, 2, \ldots, p.$$

(ii) The conditional variance of $y$, for fixed $x_j$, is a constant free of $j$. That is,
$$V(y \mid x_j) = \sigma^2, \qquad j = 1, 2, \ldots, p.$$

Also, the vector $(X_1, X_2, \ldots, X_p)$ follows a version of a multivariate gamma distribution (see Johnson and Kotz (1972), p.217) having the $p+1$ parameters $\theta_0, \lambda_1, \ldots, \lambda_p$, which are all positive. Again, the marginal distribution of $X_j$ is a standard gamma distribution having the parameter $A_j = \theta_0 + \lambda_j$ (say). The covariance between $X_j$ and $X_{j'}$ is $\theta_0$, while the correlation coefficient between $X_j$ and $X_{j'}$ is $\rho_{jj'} = \theta_0 / \sqrt{A_j A_{j'}}$. Similarly, the coefficient of variation (cv) of $X_j$ is $1/\sqrt{A_j}$.

Now, we quote the following asymptotic expansion of the gamma function for later use in our study:

$$\Gamma(na + b + 1) / \Gamma(na) \simeq (na)^{b+1}. \qquad (2.3)$$
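The accuracy of the expansion (2.3) is easy to check numerically with log-gamma values (a minimal sketch; the particular values of $a$ and $b$ are arbitrary illustrations):

```python
import math

def gamma_ratio(n, a, b):
    """Gamma(n*a + b + 1) / Gamma(n*a), computed via log-gamma for stability."""
    return math.exp(math.lgamma(n * a + b + 1) - math.lgamma(n * a))

n, a, b = 200, 1.5, -0.7           # b plays the role of an alpha_j in Section 6
exact = gamma_ratio(n, a, b)
approx = (n * a) ** (b + 1)        # right-hand side of (2.3)
print(exact / approx)              # ratio tends to 1 as n grows
```

Working on the log scale avoids overflow of $\Gamma(na)$ itself, which is astronomically large already for moderate $na$.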
3. JACKKNIFING SRIVASTAVA'S MULTIVARIATE RATIO ESTIMATOR

Srivastava (1967) has defined a class of estimators which can be given by

$$\bar y_S = \sum_{j=1}^{p} w_j\, \bar y_{\alpha j}, \qquad \bar y_{\alpha j} = \bar y\, (\bar x_j / \bar X_j)^{\alpha_j}, \qquad j = 1, 2, \ldots, p. \qquad (3.1)$$

The weights $w_j$ are determined optimally by minimizing the variance of $\bar y_S$ under the restriction that $\sum_{j=1}^{p} w_j = 1$. As in Srivastava (1967), the $\alpha_j$ are known and taken to be equal to $\alpha_j = -\rho_{0j}\, C_0 / C_j$, $j = 1, 2, \ldots, p$.

It is easy to see that Olkin's multivariate ratio estimator, written, say, as $\bar y_{OL}$, is a member of (3.1) with $\alpha_j = -1$ for all $j$.
Our aim here is to jackknife $\bar y_S$ so as to reduce its bias. As the jackknifed estimator $\bar y_{JS}$, we define the weighted average

$$\bar y_{JS} = \sum_{j=1}^{p} w_j\, \bar y_{\alpha j}^{*}, \qquad (3.2)$$

where $\bar y_{\alpha j}^{*}$ is the jackknifed version of $\bar y_{\alpha j}$. It should be noted that the weights $w = (w_1, w_2, \ldots, w_p)'$ are different from those in (3.1) and are to be determined by minimizing the variance of $\bar y_{JS}$. The procedure of calculating $\bar y_{\alpha j}^{*}$ for large samples is shown below.
It is to be noted that the quantity $\bar y_{\alpha j}$ is based on the $n$ pairs of observations, namely, $(y_i, x_{ji})$, $i = 1, 2, \ldots, n$. Next, we obtain $\bar y_{\alpha j}^{(i)}$, which has the same functional form as $\bar y_{\alpha j}$, but is based only on the data that remain after omitting the pair $(y_i, x_{ji})$. The "pseudovalues" $\bar y_{\alpha j}^{(i)*}$ can now be defined as

$$\bar y_{\alpha j}^{(i)*} = n\, \bar y_{\alpha j} - (n-1)\, \bar y_{\alpha j}^{(i)}, \qquad i = 1, 2, \ldots, n. \qquad (3.3)$$

Finally, the jackknifed version $\bar y_{\alpha j}^{*}$ of $\bar y_{\alpha j}$ is obtained as the arithmetic mean of the pseudovalues and is given by

$$\bar y_{\alpha j}^{*} = (1/n) \sum_{i=1}^{n} \bar y_{\alpha j}^{(i)*} = n\, \bar y_{\alpha j} - (n-1)(1/n) \sum_{i=1}^{n} \bar y_{\alpha j}^{(i)}. \qquad (3.4)$$

If the sample size $n$ is large, then the actual computation from (3.4) is cumbersome. However, in this case an asymptotic expression for $\bar y_{\alpha j}^{*}$ can be obtained under some suitable assumptions.
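The exact delete-one computation of (3.3) and (3.4) can be sketched as follows (pure Python; the estimator $\bar y\,(\bar x/\bar X)^{\alpha}$ is the single-auxiliary member of (3.1), while the data values shown are hypothetical). Note that for $\alpha = 0$ the estimator reduces to $\bar y$ and the jackknife returns $\bar y$ unchanged, as it should.

```python
def y_alpha(ys, xs, Xbar, alpha):
    """Srivastava-type estimator  ybar * (xbar / Xbar)**alpha  of (3.1)."""
    n = len(ys)
    return (sum(ys) / n) * ((sum(xs) / n) / Xbar) ** alpha

def jackknife(ys, xs, Xbar, alpha):
    """Jackknifed version (3.4): arithmetic mean of the pseudovalues (3.3)."""
    n = len(ys)
    full = y_alpha(ys, xs, Xbar, alpha)
    loo = [y_alpha(ys[:i] + ys[i + 1:], xs[:i] + xs[i + 1:], Xbar, alpha)
           for i in range(n)]                        # delete-one estimates
    pseudo = [n * full - (n - 1) * v for v in loo]   # pseudovalues (3.3)
    return sum(pseudo) / n                           # their mean, (3.4)

ys = [12.0, 15.0, 9.0, 20.0, 14.0, 11.0]   # hypothetical sample
xs = [10.0, 13.0, 8.0, 18.0, 12.0, 10.0]
Xbar = 12.5                                 # known population mean of x

jk = jackknife(ys, xs, Xbar, alpha=-1.0)    # alpha = -1: classical ratio estimator
print(jk)
```

For the multivariate estimator (3.2), the same routine is simply applied to each auxiliary variable in turn and the results combined with the weights $w_j$.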
With this in mind, let us write $\bar y_{\alpha j}^{(i)}$ as

$$\bar y_{\alpha j}^{(i)} = (1 - 1/n)^{-(\alpha_j + 1)} \left(1 - y_i/(n\bar y)\right) \left(1 - x_{ji}/(n \bar x_j)\right)^{\alpha_j} \bar y_{\alpha j}, \qquad (3.5)$$

where $\bar y = (1/n) \sum_{i=1}^{n} y_i$ and $\bar x_j = (1/n) \sum_{i=1}^{n} x_{ji}$. Next, we put (3.5) in (3.4) and simplify the resulting expression under the assumptions that $|y_i/(n\bar y)| < 1$ and $|x_{ji}/(n\bar x_j)| < 1$.
Thus, up to the order $1/n$, we have

$$\bar y_{\alpha j}^{*} = \left(1 + \alpha_j(\alpha_j + 1)/(2n)\right) \bar y_{\alpha j}. \qquad (3.6)$$

The jackknifed version of Srivastava's multivariate ratio estimator can then be obtained, up to the same order, as

$$\bar y_{JS} = \sum_{j=1}^{p} w_j\, \bar y_{\alpha j}^{*} = \bar y_S + (1/(2n)) \sum_{j=1}^{p} w_j\, \alpha_j(\alpha_j + 1)\, \bar y_{\alpha j}. \qquad (3.7)$$

Since the estimator $\bar y_{\alpha j}^{*}$ is unbiased up to the order $1/n$ (see Section 2 of Chapter 1), and also $\sum_{j=1}^{p} w_j = 1$, it follows from (3.2) that the jackknifed estimator $\bar y_{JS}$ is unbiased up to the order $1/n$. If, further, the weights $w_j$ are so determined as to minimize the variance of $\bar y_{JS}$, then the resulting $\bar y_{JS}$ with these optimum weights can be treated as the minimum variance unbiased estimator (MVUE) for the population mean $\bar Y$, in some sense.
4. OPTIMUM CHOICE FOR THE WEIGHTS

The criterion for optimality of the weight vector $w$, defined as $w = (w_1, w_2, \ldots, w_p)'$, is the minimization of the variance of $\bar y_{JS}$, defined in (3.7), subject to the constraint $\sum_{j=1}^{p} w_j = 1$. To obtain the optimum weights, we make use of the Lagrangian multiplier method.

Now, it can be shown by Taylor's expansion, and also under the assumptions of Section 2, that

$$V(\bar y_{\alpha j}) = (f/n)\, \bar Y^2 \left( C_0^2 + 2 \alpha_j \rho_{0j} C_0 C_j + \alpha_j^2 C_j^2 \right),$$
$$\mathrm{Cov}(\bar y_{\alpha j}, \bar y_{\alpha j'}) = (f/n)\, \bar Y^2 \left( C_0^2 + \alpha_j \rho_{0j} C_0 C_j + \alpha_{j'} \rho_{0j'} C_0 C_{j'} + \alpha_j \alpha_{j'} \rho_{jj'} C_j C_{j'} \right). \qquad (4.1)$$

The variance $V(\bar y_{\alpha j}^{*})$ and the covariance $\mathrm{Cov}(\bar y_{\alpha j}^{*}, \bar y_{\alpha j'}^{*})$ then follow from (3.6) and (4.1), respectively.
Then

$$V(\bar y_{JS}) = V\Big( \sum_{j=1}^{p} w_j\, \bar y_{\alpha j}^{*} \Big) = (\bar Y^2 / n)\, w' A\, w, \qquad (4.2)$$

where $A = (a_{jj'})$ is a $p \times p$ matrix wherein

$$a_{jj'} = \left(1 + (1/(2n))\left(\alpha_j(\alpha_j + 1) + \alpha_{j'}(\alpha_{j'} + 1)\right)\right) f \left( C_0^2 + \alpha_j \rho_{0j} C_0 C_j + \alpha_{j'} \rho_{0j'} C_0 C_{j'} + \alpha_j \alpha_{j'} \rho_{jj'} C_j C_{j'} \right), \qquad (4.3)$$

$(j, j' = 1, 2, \ldots, p)$.

It will be assumed that $A$ is positive definite, so that $A^{-1}$ exists. Let $e = (1, 1, \ldots, 1)'$ be a vector of $p$ elements. Our problem is then as follows:

$$\text{Minimize } w' A w \quad \text{subject to } e' w = 1. \qquad (4.3a)$$

Then, using the Lagrangian multiplier method, the optimum weights are obtained as

$$w^{*} = (A^{-1} e) \,/\, (e' A^{-1} e). \qquad (4.4)$$
It then follows from (4.4) that

$$w_j^{*} = \frac{\text{sum of the elements in the } j\text{-th row of } A^{-1}}{\text{sum of all the } p^2 \text{ elements of } A^{-1}}. \qquad (4.5)$$

The variance of the jackknifed estimator, having these optimum weights, can be obtained as

$$V(\bar y_{JS}) = (\bar Y^2 / n)\, (e' A^{-1} e)^{-1}. \qquad (4.6)$$
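For a given matrix $A$, the optimum weights (4.4) and the minimum value (4.6) can be computed directly. The sketch below uses a small Gaussian-elimination solver so as to stay self-contained; the $2 \times 2$ matrix shown is hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]   # augmented matrix
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            m = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= m * M[k][c]
    x = [0.0] * n
    for k in range(n - 1, -1, -1):                      # back-substitution
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def optimum_weights(A):
    """w* = A^{-1} e / (e' A^{-1} e), as in (4.4); also returns (e' A^{-1} e)^{-1} of (4.6)."""
    e = [1.0] * len(A)
    u = solve(A, e)          # u = A^{-1} e
    s = sum(u)               # s = e' A^{-1} e
    return [ui / s for ui in u], 1.0 / s

A = [[1.0, 0.6],
     [0.6, 1.2]]             # hypothetical positive definite matrix
w, vmin = optimum_weights(A)
print(w, vmin)
```

By construction the weights sum to one, and $w' A w$ evaluated at $w^{*}$ equals $(e' A^{-1} e)^{-1}$, the quantity appearing in (4.6).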
A particular case: To facilitate the comparison of the jackknifed estimator $\bar y_{JS}$ with its competitors, we consider the following situation. Let

$$C_1 = C_2 = \cdots = C_p = C, \qquad \rho_{01} = \rho_{02} = \cdots = \rho_{0p} = \rho_0, \qquad \rho_{jj'} = \rho \ \text{ for all } j \neq j', \qquad (4.7)$$

so that $\alpha_j = -\rho_0 C_0 / C = \alpha$ (say), for all $j$. The optimum weights, under the assumptions of (4.7), can be obtained as $w_j^{*} = 1/p$ for all $j$. Thus, in this case the optimum weights are uniform.

Now, ignoring the fpc, we have the variance of $\bar y_{JS}$, under the above considerations, as

$$V(\bar y_{JS}) = (\bar Y^2 / n) \left(1 + \alpha(\alpha + 1)/n\right) \left( C_0^2 + 2 \alpha \rho_0 C_0 C + (\alpha^2 C^2 / p)\left(1 + \rho\,(p - 1)\right) \right). \qquad (4.7a)$$
The asymptotic relative efficiency of the jackknifed estimator compared to Srivastava's estimator: Let us define the asymptotic relative efficiency of the jackknifed estimator $\bar y_{JS}$ compared to Srivastava's multivariate ratio estimator $\bar y_S$ as $V(\bar y_S) / V(\bar y_{JS})$, which we denote by ARE$(\bar y_{JS})$. Now, from (4.7a) and also using the expression of the variance of $\bar y_S$, as given in Srivastava (1967), we have

$$\mathrm{ARE}(\bar y_{JS}) = \left(1 + \alpha(\alpha + 1)/n\right)^{-1}. \qquad (4.8)$$

In many practical problems, $C_0$ equals $C$, so that we have $\alpha = -\rho_0$. This gives

$$\mathrm{ARE}(\bar y_{JS}) = \left(1 - \rho_0(1 - \rho_0)/n\right)^{-1} \simeq 1 + \rho_0(1 - \rho_0)/n, \qquad \text{for } |\rho_0(1 - \rho_0)| < n. \qquad (4.8a)$$

From (4.8a), it is clear that ARE$(\bar y_{JS})$ is greater than 100% if $\rho_0$ lies in the interval $(0, 1)$. In other words, the jackknifed estimator is more efficient than the unjackknifed estimator if the variable under study is positively correlated with each of the auxiliary variables. We have thus identified a situation where jackknifing reduces not only the bias but also the variance of an estimator. However, the estimators $\bar y_S$ and $\bar y_{JS}$ become equally efficient if $n$ is large, or if $\rho_0$ is close to either zero or one.
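The behaviour of (4.8a) is easy to tabulate (a minimal sketch of the formula only; the $\rho_0$ values are illustrative):

```python
def are_jackknife(rho0, n):
    """ARE of (4.8a): (1 - rho0*(1 - rho0)/n)**(-1), valid when C0 = C."""
    return 1.0 / (1.0 - rho0 * (1.0 - rho0) / n)

n = 10
for rho0 in (0.0, 0.3, 0.5, 0.9, 1.0):
    print(rho0, round(100 * are_jackknife(rho0, n), 2))
```

The gain is largest near $\rho_0 = 0.5$, where $\rho_0(1-\rho_0)$ is maximized, and vanishes at $\rho_0 = 0$ and $\rho_0 = 1$, in line with the remark above.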
5. A NUMERICAL ILLUSTRATION

The following data, taken from Olkin (1958), relate to the number of inhabitants in the 200 largest U.S. cities, with $y$ the 1950 value and $x_1$ and $x_2$ the 1940 and 1930 values, respectively (in thousands).

The following population characteristics are obtained from Olkin's data:

$$\bar Y = 1699, \qquad \bar X_1 = 1482, \qquad \bar X_2 = 1420,$$
$$C_0 = 1.0242, \qquad C_1 = 1.0479, \qquad C_2 = 1.0635,$$
$$\rho_{01} = 0.9867, \qquad \rho_{02} = 0.9695, \qquad \rho_{12} = 0.9942.$$

It is clearly seen that the conditions (4.7) are satisfied approximately by Olkin's data, so that the variance expressions, as developed in Section 4, can be used.

The relative biases and efficiencies of some estimators are obtained, ignoring the fpc, and compiled in Table 1. The sample size is taken to be 10 for computation of the gain in precision due to jackknifing.
From Table 1 given below, it appears that the jackknifed estimator fares better compared to each of the competitors both in terms of bias and efficiency. It is also seen that the mean per unit has the minimum efficiency, though it is unbiased. As $\rho_0$ is close to unity, the gain in efficiency of the jackknifed estimator is small compared to Srivastava's estimator.
TABLE 1. ASYMPTOTIC RELATIVE BIASES AND EFFICIENCIES OF SOME ESTIMATORS
COMPARED TO THE PROPOSED JACKKNIFED MULTIVARIATE RATIO ESTIMATOR

Estimator                                  Relative bias:              ARE compared to $\bar y_{JS}$
                                           (Bias/$\bar Y$) x 100%      (in percentages)
-----------------------------------------------------------------------------------------
Jackknifed estimator ($\bar y_{JS}$)            0.00                        100.00
Srivastava's estimator ($\bar y_S$)             2.75                         99.52
Olkin's estimator ($\bar y_{OL}$)               5.81                         94.92
Mean per unit ($\bar y$)                        0.00                         50.11
-----------------------------------------------------------------------------------------

Note: In the above Table, the ARE of an estimator compared to the jackknifed estimator $\bar y_{JS}$ is defined to be the ratio $V(\bar y_{JS}) / V(\text{the estimator})$, expressed as percentages. These are shown in the last column.
A random sample of size 50 from Olkin's data is given in the APPENDIX.
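The ARE entry for Srivastava's estimator in Table 1 can be reproduced, at least approximately, from the particular-case formulas of Section 4 (the step of averaging the tabulated $C$'s and $\rho$'s to obtain the common values of (4.7) is our assumption; the summary figures themselves are those quoted above from Olkin's data):

```python
# Population characteristics quoted from Olkin's data
C0, C1, C2 = 1.0242, 1.0479, 1.0635
rho01, rho02 = 0.9867, 0.9695
n = 10                           # sample size used in Table 1

C = (C1 + C2) / 2                # common C of condition (4.7), taken as the average
rho0 = (rho01 + rho02) / 2       # common rho_0, likewise an average
alpha = -rho0 * C0 / C           # alpha = -rho0 * C0 / C, as in (4.7)

# From (4.8): V(y_JS)/V(y_S) x 100 = (1 + alpha*(alpha + 1)/n) x 100
are_S = 100 * (1 + alpha * (alpha + 1) / n)
print(round(are_S, 2))           # close to the 99.52 shown in Table 1
```

That the computed figure matches the tabulated 99.52 suggests this is essentially how the table entry was obtained.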
6. JACKKNIFING SRIVASTAVA'S MULTIVARIATE RATIO ESTIMATOR UNDER DURBIN'S MODEL

In this Section, we jackknife Srivastava's multivariate ratio estimator under Durbin's model considered in Section 2.

The expected value of Srivastava's estimator $\bar y_S$, under Durbin's model, can be shown to be

$$E(\bar y_S) = \sum_{j=1}^{p} w_j\, \beta_j\, \bar X_j^{-\alpha_j}\, \frac{\Gamma(n A_j + \alpha_j + 1)}{\Gamma(n A_j)\, n^{\alpha_j + 1}}, \qquad (6.1)$$

provided it exists.

If all the $\alpha_j$ are equal to $-1$, then (6.1) reduces to $\sum_{j=1}^{p} w_j \beta_j \bar X_j$, which is the population mean of $y$. Thus, in this case, $\bar y_S$ is unbiased. Also, for this choice of the $\alpha_j$, $\bar y_S$ becomes Olkin's multivariate ratio estimator. Thus, we have the result that Olkin's estimator is unbiased under Durbin's model (see Olkin (1958)).

Again, if the $\alpha_j$ are different from $-1$, then Srivastava's estimator $\bar y_S$ is asymptotically unbiased, which follows after putting (2.3) in (6.1). Now, assuming $n$ to be large and also utilizing (2.3), it can be shown that

$$E(\bar y_{JS} - \bar y_S) = (1/(2n)) \sum_{j=1}^{p} w_j\, \alpha_j(\alpha_j + 1)\, \beta_j \bar X_j, \qquad (6.2)$$

which is of the order $1/n$.
The relation (6.2) shows that the difference of the biases of $\bar y_{JS}$ and $\bar y_S$ is of the order $1/n$. Also, the expected value of the jackknifed estimator $\bar y_{JS}$, under Durbin's model, can be obtained on using (6.1) in (6.2).

The large sample variance of the jackknifed estimator under Durbin's model: To obtain the large sample variance of $\bar y_{JS}$, we use the relation (3.6). Thus, we have the representation

$$\bar y_{JS} = \sum_{j=1}^{p} w_j'\, \bar y_{\alpha j}, \qquad (6.3)$$

where

$$w_j' = w_j \left(1 + \alpha_j(\alpha_j + 1)/(2n)\right), \qquad j = 1, 2, \ldots, p, \qquad \text{with } \sum_{j=1}^{p} w_j = 1. \qquad (6.3a)$$
It then follows that, up to the order $1/n$,

$$V(\bar y_S) = (\sigma^2/n) + \sum_{j=1}^{p} \sum_{j'=1}^{p} w_j w_{j'}\, \phi(j, j')/(2n), \qquad (6.4)$$

and

$$V(\bar y_{JS}) = \Big( \sum_{j=1}^{p} w_j' \Big)^2 (\sigma^2/n) + \sum_{j=1}^{p} \sum_{j'=1}^{p} w_j w_{j'}\, \phi(j, j')/(2n), \qquad (6.5)$$

where

$$\phi(j, j') = 2\bar Y^2 \left( \alpha_j / A_j + \alpha_{j'} / A_{j'} + \alpha_j \alpha_{j'}\, \rho_{jj'} / \sqrt{A_j A_{j'}} \right). \qquad (6.6)$$
Finally, up to the order $1/n$, we have

$$V(\bar y_{JS}) = V(\bar y_S) + \Big( \big( \sum_{j=1}^{p} w_j' \big)^2 - 1 \Big) (\sigma^2/n). \qquad (6.7)$$
It is easy to see that

$$\Big( \sum_{j=1}^{p} w_j' \Big)^2 - 1 \simeq (1/n) \sum_{j=1}^{p} w_j\, \alpha_j(\alpha_j + 1). \qquad (6.7a)$$
From (6.7) and (6.7a), it then follows that any increase in variance due to jackknifing the estimator $\bar y_S$ is of the order $1/n^2$, and hence negligible in large samples.
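The order-$1/n^2$ claim can be illustrated numerically from (6.3a) and (6.7a) (a minimal sketch; the weights and $\alpha$'s shown are hypothetical):

```python
def variance_inflation(ws, alphas, n):
    """Exact (sum w_j')**2 - 1 built from (6.3a), versus its
    first-order approximation (6.7a)."""
    wprime = [w * (1 + a * (a + 1) / (2 * n)) for w, a in zip(ws, alphas)]
    exact = sum(wprime) ** 2 - 1
    first_order = (1 / n) * sum(w * a * (a + 1) for w, a in zip(ws, alphas))
    return exact, first_order

ws = [0.5, 0.3, 0.2]          # weights summing to one
alphas = [-0.9, -0.5, -0.2]   # each in (-1, 0), so condition (6.8) below holds
for n in (10, 100, 1000):
    exact, fo = variance_inflation(ws, alphas, n)
    print(n, exact, fo, exact - fo)   # the gap shrinks like 1/n**2
```

Here both quantities are negative, i.e. jackknifing actually reduces the variance, and the discrepancy between the exact and first-order values falls off as $1/n^2$.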
It also follows from (6.7) that the jackknifed estimator will have a smaller variance if

$$\sum_{j=1}^{p} w_j\, \alpha_j(\alpha_j + 1) < 0. \qquad (6.8)$$

If, further, the $\alpha_j$ are very small, then the condition (6.8) reduces to $\sum_{j=1}^{p} \alpha_j w_j < 0$. In other words, if the weighted average of the $\alpha_j$ is negative, then jackknifing will result in better precision.
It is to be noted that the condition (6.8) is satisfied, in particular, if each $\alpha_j \in (-1, 0)$ and $w_j > 0$ for all $j$. However, if each $\alpha_j$ is equal to the common value $\alpha$, say, then the restriction that $w_j > 0$ can be dropped, and (6.8) then becomes $\alpha(\alpha + 1) < 0$. This is because the sum $\sum_{j=1}^{p} w_j$ is equal to unity.

A similar result was obtained by Nandi and Aich (1994) in the case of a single auxiliary variable.