Bose, R. C. "Least squares aspects of analysis of variance."

LEAST SQUARES ASPECTS OF ANALYSIS OF VARIANCE

by R. C. Bose

Institute of Statistics
Mimeo. Series 9
For distribution to classes.

CHAPTER I
1. The expectation of a random variable $x$, which takes the values $x_1, x_2, \ldots, x_n$ with probabilities $p_1, p_2, \ldots, p_n$, is defined as $E(x) = p_1x_1 + p_2x_2 + \cdots + p_nx_n$. When $x$ is a continuous variate in the range $(a, b)$, then
$$E(x) = \int_a^b x\,p(x)\,dx,$$
where $p(x)$ is the frequency density. It is easy to see that

(i) $E(cx) = cE(x)$
(ii) $E(x + y) = E(x) + E(y)$
(iii) $E(c_1y_1 + c_2y_2 + \cdots + c_ny_n) = c_1E(y_1) + c_2E(y_2) + \cdots + c_nE(y_n)$.

If $E(x) = m$, then $V(x)$ is defined as $E(x - m)^2$. Hence $V(x) = E(x^2) - m^2$, or $E(x^2) = V(x) + \{E(x)\}^2$.
2. Let $E(x) = m_1$, $V(x) = \sigma_1^2$, $E(y) = m_2$, $V(y) = \sigma_2^2$; then the correlation $\rho$ between $x$ and $y$ is defined by the equation
$$\mathrm{Cov}(x, y) = E\{(x - m_1)(y - m_2)\} = \rho\,\sigma_1\sigma_2.$$
This gives us
$$V(x + y) = V(x) + V(y) + 2\rho\,\sigma_1\sigma_2.$$
When $x_1$ and $x_2$ are not correlated, $\rho = 0$, and in this case $V(x_1 + x_2) = V(x_1) + V(x_2)$.

3. Let us now calculate $V(c_1x_1 + c_2x_2 + \cdots + c_nx_n)$. Let $E(x_1) = m_1, \ldots, E(x_n) = m_n$, $V(x_i) = \sigma_i^2$, and let the correlation between $x_i$ and $x_j$ be $\rho_{ij}$. Then
$$V(c_1x_1 + c_2x_2 + \cdots + c_nx_n) = E\{(c_1x_1 + c_2x_2 + \cdots + c_nx_n) - (c_1m_1 + c_2m_2 + \cdots + c_nm_n)\}^2$$
$$= c_1^2E(x_1 - m_1)^2 + c_2^2E(x_2 - m_2)^2 + \cdots + c_n^2E(x_n - m_n)^2 + 2\sum_{i<j} c_ic_j\,\rho_{ij}\sigma_i\sigma_j$$
$$= \sum_i c_i^2\sigma_i^2 + 2\sum_{i<j} c_ic_j\,\rho_{ij}\sigma_i\sigma_j.$$
In particular, if $x_1, x_2, \ldots, x_n$ are independent variates, then
$$V(c_1x_1 + c_2x_2 + \cdots + c_nx_n) = \sum_i c_i^2\sigma_i^2,$$
and if further $x_1, x_2, \ldots, x_n$ have the same variance $\sigma^2$, then
$$V(c_1x_1 + c_2x_2 + \cdots + c_nx_n) = \sigma^2\sum_i c_i^2.$$

4. Again let us calculate the covariance of
$$Y = c_1x_1 + c_2x_2 + \cdots + c_nx_n, \qquad Y' = c_1'x_1 + c_2'x_2 + \cdots + c_n'x_n;$$
then, proceeding as before,
$$\mathrm{Cov}(Y, Y') = \sum_i c_ic_i'\sigma_i^2 + \sum_{i \neq j} c_ic_j'\,\rho_{ij}\sigma_i\sigma_j.$$
In particular, if $x_1, x_2, \ldots, x_n$ are independent, then
$$\mathrm{Cov}(Y, Y') = \sum_i c_ic_i'\sigma_i^2,$$
and if further the variances of the $x$'s are equal, then the covariance is $\sigma^2\sum_i c_ic_i'$.
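These variance and covariance formulas are easy to verify numerically. The following is a minimal Python sketch (the coefficients, sample size, and variance are hypothetical, chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Independent variates x_1, ..., x_n with common variance sigma^2 = 4.
n, sigma = 5, 2.0
c  = np.array([1.0, -2.0, 0.5, 3.0, -1.0])   # coefficients of Y
cp = np.array([2.0,  1.0, -1.0, 0.0,  4.0])  # coefficients of Y'

# Theoretical values from sections 3 and 4.
var_Y  = sigma**2 * np.sum(c**2)        # V(Y) = sigma^2 * sum c_i^2
cov_YY = sigma**2 * np.sum(c * cp)      # Cov(Y, Y') = sigma^2 * sum c_i c_i'

# Monte Carlo check.
x = rng.normal(0.0, sigma, size=(200_000, n))
Y, Yp = x @ c, x @ cp
print(var_Y, np.var(Y))                 # close agreement
print(cov_YY, np.cov(Y, Yp)[0, 1])
```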
5. Dependence, independence and orthogonality of linear functions.

In order to deal effectively with problems of linear estimation and tests of linear hypotheses, we shall find it convenient to introduce the notions of dependence and independence, and orthogonality, of linear functions and vectors.

Consider three linear functions $Y_1$, $Y_2$, $Y_3$ of the variates $y_1$, $y_2$, $y_3$, for which $Y_3 = Y_1 + 2Y_2$ identically in the $y$'s. Given $Y_1 = 5$, $Y_2 = 7$, what is the value of $Y_3$? The clever student will notice that $Y_3 = Y_1 + 2Y_2$, so that its value must be 19, without going into the question of the actual values of $y_1$, $y_2$, $y_3$. If, however, given $Y_1 = 5$, $Y_2 = 7$, I want you to find the value of another linear function $Y'$ which is not such a combination, then it is obvious that the value of $Y'$ cannot be calculated simply by knowing the values of $Y_1$ and $Y_2$, but depends on the actual values of $y_1$, $y_2$, $y_3$. We, therefore, say that $Y_3$ is dependent on $Y_1$ and $Y_2$, but $Y'$ is not dependent on $Y_1$ and $Y_2$. In general, if
$$Y_1 = a_{11}y_1 + a_{21}y_2 + \cdots + a_{n1}y_n, \quad \ldots, \quad Y_k = a_{1k}y_1 + a_{2k}y_2 + \cdots + a_{nk}y_n,$$
then $Y = l_1y_1 + l_2y_2 + \cdots + l_ny_n$ will be said to be dependent on $Y_1, Y_2, \ldots, Y_k$ if we can find constants $b_1, b_2, \ldots, b_k$ such that
$$Y = b_1Y_1 + b_2Y_2 + \cdots + b_kY_k \quad \text{(identically)}.$$

The null function $0 = 0y_1 + 0y_2 + \cdots + 0y_n$ is always dependent on $Y_1, Y_2, \ldots, Y_k$, since we may take $b_1, b_2, \ldots, b_k$ to be zero.

The linear functions $Y_1, Y_2, \ldots, Y_r$ may be said to be independent if none depends on the rest. The necessary and sufficient condition for this is that it is impossible to find $b_1, b_2, \ldots, b_r$ (not all zero) such that
$$b_1Y_1 + b_2Y_2 + \cdots + b_rY_r = 0 \quad \text{(identically)}.$$

Consider the two linear functions
$$Y = c_1y_1 + c_2y_2 + \cdots + c_ny_n, \qquad Y' = c_1'y_1 + c_2'y_2 + \cdots + c_n'y_n$$
of independent random variates $y_1, y_2, \ldots, y_n$ with a common variance $\sigma^2$. Then
$$\mathrm{Cov}(Y, Y') = (c_1c_1' + c_2c_2' + \cdots + c_nc_n')\,\sigma^2.$$
If the covariance is to vanish, $c_1c_1' + c_2c_2' + \cdots + c_nc_n' = 0$. In this case $Y$ and $Y'$ are said to be orthogonal to each other.
6. Dependence and independence of vectors.

Any $n$-plet of numbers, e.g. $(a_1, a_2, \ldots, a_n)$ or $(l_1, l_2, \ldots, l_n)$, is called a vector and is usually denoted by a Greek symbol. Thus we may write $\alpha = (a_1, a_2, \ldots, a_n)$, $\beta = (b_1, b_2, \ldots, b_n)$. The addition of vectors is defined by
$$\alpha + \beta = (a_1 + b_1,\ a_2 + b_2,\ \ldots,\ a_n + b_n).$$
Multiplication by a number $c$ is defined by
$$c\alpha = (ca_1, ca_2, \ldots, ca_n).$$
The relation $\alpha = \beta$ would mean $a_1 = b_1$, $a_2 = b_2$, …, $a_n = b_n$. Thus if
$$\alpha_1 = (a_{11}, a_{21}, \ldots, a_{n1}), \quad \alpha_2 = (a_{12}, a_{22}, \ldots, a_{n2}), \quad \ldots, \quad \alpha_k = (a_{1k}, a_{2k}, \ldots, a_{nk}),$$
then the relation $\alpha = b_1\alpha_1 + b_2\alpha_2 + \cdots + b_k\alpha_k$ would mean that
$$a_1 = b_1a_{11} + b_2a_{12} + \cdots + b_ka_{1k}, \qquad a_2 = b_1a_{21} + b_2a_{22} + \cdots + b_ka_{2k}, \qquad \text{etc.}$$
Vectors are very closely connected with linear functions and linear equations. Thus, if there is a linear function $Y_1 = a_1y_1 + a_2y_2 + \cdots + a_ny_n$, then we say that its coefficient vector is $\alpha = (a_1, a_2, \ldots, a_n)$.

Linear Function | Coefficient Vector
$Y_1 = a_1y_1 + a_2y_2 + \cdots + a_ny_n$ | $\alpha = (a_1, a_2, \ldots, a_n)$
$Y_2 = b_1y_1 + b_2y_2 + \cdots + b_ny_n$ | $\beta = (b_1, b_2, \ldots, b_n)$
Null function $0 = 0y_1 + 0y_2 + \cdots + 0y_n$ | Null vector $0 = (0, 0, \ldots, 0)$
Sum $Y_1 + Y_2 = (a_1 + b_1)y_1 + (a_2 + b_2)y_2 + \cdots + (a_n + b_n)y_n$ | $\alpha + \beta$
$cY_1 = ca_1y_1 + ca_2y_2 + \cdots + ca_ny_n$ | $c\alpha$

Thus if operations of addition or multiplication with numbers are performed on linear functions, the coefficient vectors also undergo the same operations.

Linear Function | Coefficient Vector
$Y_1$ | $\alpha_1$
$Y_2$ | $\alpha_2$
… | …
$Y_k$ | $\alpha_k$
$c_1Y_1 + c_2Y_2 + \cdots + c_kY_k$ | $c_1\alpha_1 + c_2\alpha_2 + \cdots + c_k\alpha_k$

In defining the dependence and independence of vectors, we keep to this correspondence.
Thus, if the coefficient vectors
$$\alpha_1 = (a_{11}, a_{21}, \ldots, a_{n1}), \quad \alpha_2 = (a_{12}, a_{22}, \ldots, a_{n2}), \quad \ldots$$
correspond to the linear functions
$$Y_1 = a_{11}y_1 + a_{21}y_2 + \cdots + a_{n1}y_n, \quad Y_2 = a_{12}y_1 + a_{22}y_2 + \cdots + a_{n2}y_n, \quad \ldots,$$
then $\alpha$ is said to be dependent on $\alpha_1, \alpha_2, \ldots, \alpha_k$ provided we can find constants $b_1, b_2, \ldots, b_k$ such that
$$\alpha = b_1\alpha_1 + b_2\alpha_2 + \cdots + b_k\alpha_k.$$
You would notice that in this case the linear function $Y$ with coefficient vector $\alpha$ is also dependent on the linear functions $Y_1, Y_2, \ldots, Y_k$ with coefficient vectors $\alpha_1, \alpha_2, \ldots, \alpha_k$.

The vectors $\alpha_1, \alpha_2, \ldots, \alpha_r$ may be said to be independent if none depends on the rest. The necessary and sufficient condition for this is that it is impossible to find $b_1, b_2, \ldots, b_r$ (not all zeros) such that
$$b_1\alpha_1 + b_2\alpha_2 + \cdots + b_r\alpha_r = 0 \quad \text{(null vector)}.$$

The set of all vectors dependent on $\alpha_1, \alpha_2, \ldots, \alpha_k$ is called the vector space generated by $\alpha_1, \alpha_2, \ldots, \alpha_k$. If the generating set is independent it may be called the basis of the vector space. Correspondingly, the set of all linear functions dependent on $Y_1, Y_2, \ldots, Y_k$ constitutes the linear set generated by $Y_1, Y_2, \ldots, Y_k$. If the generating set is independent it may be called the basis of the linear set.

I will now give you some of the well known results relating to vector spaces and linear sets, the proofs for most of which are pretty obvious.
1. The vector space generated by a set of vectors $\alpha_1, \alpha_2, \ldots, \alpha_k$ remains unchanged if
(a) $\alpha_i$ is replaced by $c\,\alpha_i$, where $c \neq 0$;
(b) $\alpha_i$ is replaced by $\alpha_i + \alpha_j$, where $i \neq j$.
Combining (a) and (b) we may replace $\alpha_i$ by $c_1\alpha_1 + \cdots + c_i\alpha_i + \cdots + c_k\alpha_k$ if $c_i \neq 0$.
(c) If the null vector happens to appear in the generating set, it is dropped.

Correspondingly, the linear set generated by a set of linear functions $Y_1, Y_2, \ldots, Y_r$ remains unchanged if
(a) $Y_i$ is replaced by $c\,Y_i$, where $c \neq 0$;
(b) $Y_i$ is replaced by $Y_i + Y_j$, where $i \neq j$.
Combining (a) and (b) we may replace $Y_i$ by $c_1Y_1 + c_2Y_2 + \cdots + c_rY_r$ if $c_i \neq 0$.
(c) If the null function happens to be in the set, it is dropped.
2. For any vector space $V$ there exists a number $r$ such that it is possible to choose $r$ independent vectors in $V$, but not more. Any $r$ independent vectors form a basis of $V$. This number is called the rank of the vector space. Correspondingly, for any linear set there exists a number $r$ such that it is possible to choose $r$ independent linear functions in the set, but not more. The linear set can be generated by any $r$ independent linear functions belonging to the set. $r$ is said to be the number of degrees of freedom carried by the functions of the set.

3. If we consider vectors with $n$ coordinates, then there cannot exist more than $n$ independent vectors. Hence every vector space must have rank $\leq n$. The vector space consisting of all vectors with $n$ coordinates has the rank $n$. Correspondingly, if we consider linear functions of $n$ variates $y_1, y_2, \ldots, y_n$, there cannot exist more than $n$ independent linear functions. Hence the degrees of freedom carried by any linear set $\leq n$. The linear set of all linear functions of $y_1, y_2, \ldots, y_n$ has $n$ degrees of freedom.
The notion of the rank of a vector space, or of the degrees of freedom belonging to a linear set, is also connected with the rank of a matrix. Thus, if the rank of the vector space generated by
$$\alpha_1 = (a_{11}, a_{21}, \ldots, a_{n1}), \quad \alpha_2 = (a_{12}, a_{22}, \ldots, a_{n2}), \quad \ldots, \quad \alpha_k = (a_{1k}, a_{2k}, \ldots, a_{nk}),$$
i.e. the number of degrees of freedom carried by the linear set generated by the corresponding linear functions, is $r$, then $r$ is also the rank of the matrix
$$\begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \cdots & \cdots & \cdots & \cdots \\ a_{1k} & a_{2k} & \cdots & a_{nk} \end{pmatrix},$$
i.e. $r$ is the order of the largest non-vanishing minor (partial determinant).

We may summarize this as follows:

Number of degrees of freedom carried by a set of linear functions
= Rank of the vector space of the coefficient vectors
= Rank of the matrix of the coefficients.
7. Orthogonality of Vectors.

Consider $n$ independent random variates $y_1, y_2, \ldots, y_n$ with a common variance $\sigma^2$; then we have seen that if
$$Y_1 = c_1y_1 + c_2y_2 + \cdots + c_ny_n, \qquad Y_2 = c_1'y_1 + c_2'y_2 + \cdots + c_n'y_n,$$
then
$$\mathrm{Cov}(Y_1, Y_2) = (c_1c_1' + c_2c_2' + \cdots + c_nc_n')\,\sigma^2.$$
Here the coefficient vectors are
$$\gamma = (c_1, c_2, \ldots, c_n), \qquad \gamma' = (c_1', c_2', \ldots, c_n').$$
We define the scalar product of $\gamma$ and $\gamma'$ as
$$(\gamma \cdot \gamma') = c_1c_1' + c_2c_2' + \cdots + c_nc_n'.$$
For $(\gamma \cdot \gamma) = c_1^2 + c_2^2 + \cdots + c_n^2$ we sometimes use the notation $\gamma^2$. Note that the sum of two vectors is a vector, but their scalar product is a pure number. We can now write:
$$\mathrm{Cov}(Y_1, Y_2) = (\gamma \cdot \gamma')\,\sigma^2, \qquad V(Y_1) = (\gamma \cdot \gamma)\,\sigma^2 = \gamma^2\sigma^2.$$
If $Y_1$ and $Y_2$ are uncorrelated, i.e. when
$$c_1c_1' + c_2c_2' + \cdots + c_nc_n' = (\gamma \cdot \gamma') = 0,$$
we have already called $Y_1$ and $Y_2$ orthogonal. In this case we also call $\gamma$ and $\gamma'$ orthogonal. Thus the condition for the orthogonality of two vectors is that the sum of the products of the corresponding coefficients vanishes.
The following theorems about orthogonal vectors and orthogonal linear functions may be stated:

1. If the vector $\beta$ is orthogonal to each of the vectors $\alpha_1, \alpha_2, \ldots, \alpha_m$, then $\beta$ is orthogonal to all vectors dependent on $\alpha_1, \alpha_2, \ldots, \alpha_m$, i.e. to all vectors of the vector space $V$ generated by $\alpha_1, \alpha_2, \ldots, \alpha_m$. In this case we say that $\beta$ is orthogonal to $V$. Correspondingly, if the linear function $Y$ is orthogonal to (uncorrelated with) $Y_1, Y_2, \ldots, Y_m$, then it is orthogonal to (uncorrelated with) all linear functions depending on $Y_1, Y_2, \ldots, Y_m$, i.e. to the functions of the linear set generated by $Y_1, Y_2, \ldots, Y_m$.

2. Given a vector space $V$ of rank $r$ (consisting of vectors with $n$ coordinates), all vectors orthogonal to $V$ constitute a vector space $V'$ of rank $n - r$. Thus
$$\mathrm{Rank}\ V + \mathrm{Rank}\ V' = n.$$
$V'$ is said to be the complete orthogonal space to $V$. Likewise, $V$ is also the complete orthogonal space to $V'$. Correspondingly, given a linear set with $r$ degrees of freedom, all linear functions orthogonal to the linear functions of the set form a linear set with $n - r$ degrees of freedom (considering linear functions of $n$ variates).
The random variates $y_1, y_2, \ldots, y_n$ may themselves be regarded as constituting the vector
$$\eta = (y_1, y_2, \ldots, y_n),$$
and the linear function $Y = c_1y_1 + c_2y_2 + \cdots + c_ny_n$ may be written as $Y = (\gamma \cdot \eta)$, where $\gamma = (c_1, c_2, \ldots, c_n)$. With this notation
$$V\{(\gamma \cdot \eta)\} = (\gamma \cdot \gamma)\,\sigma^2 = \gamma^2\sigma^2, \qquad \mathrm{Cov}\{(\gamma \cdot \eta), (\gamma' \cdot \eta)\} = (\gamma \cdot \gamma')\,\sigma^2.$$
8. Homogeneous equations.

Consider now a system of homogeneous linear equations to be solved. We shall first take a simple example.

Equations | Coefficient vectors
$0y_1 + 4y_2 + 8y_3 - 16y_4 - 12y_5 = 0$ | $\alpha_1 = (0, 4, 8, -16, -12)$
$3y_1 + 2y_2 - 5y_3 - 17y_4 - 3y_5 = 0$ | $\alpha_2 = (3, 2, -5, -17, -3)$
$4y_1 + 2y_2 - 8y_3 - 20y_4 - 2y_5 = 0$ | $\alpha_3 = (4, 2, -8, -20, -2)$
$y_1 - 3y_2 - 7y_3 + 13y_4 + 8y_5 = 0$ | $\alpha_4 = (1, -3, -7, 13, 8)$

Dividing the first equation by $-4$, and then using it to remove $y_2$ from the other equations, we get:

$0y_1 - y_2 - 2y_3 + 4y_4 + 3y_5 = 0$ | $(0, -1, -2, 4, 3)$
$3y_1 + 0y_2 - 9y_3 - 9y_4 + 3y_5 = 0$ | $(3, 0, -9, -9, 3)$
$4y_1 + 0y_2 - 12y_3 - 12y_4 + 4y_5 = 0$ | $(4, 0, -12, -12, 4)$
$y_1 + 0y_2 - y_3 + y_4 - y_5 = 0$ | $(1, 0, -1, 1, -1)$

Dividing the second and third equations by 3 and 4 respectively, both become $y_1 - 3y_3 - 3y_4 + y_5 = 0$; replacing the third by the difference (which is null), changing the sign of the second, and subtracting the second (before the sign change) from the fourth, we get:

$0y_1 - y_2 - 2y_3 + 4y_4 + 3y_5 = 0$ | $(0, -1, -2, 4, 3)$
$-y_1 + 0y_2 + 3y_3 + 3y_4 - y_5 = 0$ | $(-1, 0, 3, 3, -1)$
$0y_1 + 0y_2 + 0y_3 + 0y_4 + 0y_5 = 0$ | $(0, 0, 0, 0, 0)$
$0y_1 + 0y_2 + 2y_3 + 4y_4 - 2y_5 = 0$ | $(0, 0, 2, 4, -2)$

The null row is dropped. Halving the last equation and using it to remove $y_3$ from the others (changing signs where convenient), we get:

$0y_1 - y_2 + 0y_3 + 8y_4 + y_5 = 0$ | $(0, -1, 0, 8, 1)$
$-y_1 + 0y_2 + 0y_3 - 3y_4 + 2y_5 = 0$ | $(-1, 0, 0, -3, 2)$
$0y_1 + 0y_2 - y_3 - 2y_4 + y_5 = 0$ | $(0, 0, -1, -2, 1)$

Hence
$$y_1 = -3y_4 + 2y_5, \qquad y_2 = 8y_4 + y_5, \qquad y_3 = -2y_4 + y_5.$$
Hence the general solution of the equations is, putting $y_4 = l$, $y_5 = m$,
$$(y_1, y_2, y_3, y_4, y_5) = (-3l + 2m,\ 8l + m,\ -2l + m,\ l,\ m) = l\beta_1 + m\beta_2,$$
i.e. the solutions constitute the vector space generated by
$$\beta_1 = (-3, 8, -2, 1, 0), \qquad \beta_2 = (2, 1, 1, 0, 1).$$
The right hand side of the picture shows us how to obtain the basis of a vector space $V$, and to determine the rank: the vectors ultimately left are independent. The connection between the rank of vector spaces and matrices also becomes obvious. The left hand side teaches us how to solve a system of linear homogeneous equations, and obtain the basis of the vector space completely orthogonal to $V$. The relation
$$\mathrm{Rank}\ V + \mathrm{Rank}\ V' = n$$
is also exemplified. This may be expressed by saying that the rank of the vector space of solutions and the rank of the vector space of coefficient vectors add up to $n$.
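The rank relation can be checked numerically. Below is a minimal Python sketch using the example system above; `scipy.linalg.null_space` returns an orthonormal basis of the solution space rather than the particular basis $\beta_1$, $\beta_2$ found by hand, but the ranks agree:

```python
import numpy as np
from scipy.linalg import null_space

# Coefficient matrix of the homogeneous system in section 8.
A = np.array([
    [0,  4,   8, -16, -12],
    [3,  2,  -5, -17,  -3],
    [4,  2,  -8, -20,  -2],
    [1, -3,  -7,  13,   8],
], dtype=float)

r = np.linalg.matrix_rank(A)           # rank of the coefficient vector space: 3
N = null_space(A)                      # orthonormal basis of the solution space V'
print(r, N.shape[1], r + N.shape[1])   # 3, 2, and 3 + 2 = 5 = n

# The hand-computed basis beta_1, beta_2 should satisfy A @ beta = 0.
beta1 = np.array([-3, 8, -2, 1, 0], dtype=float)
beta2 = np.array([ 2, 1,  1, 0, 1], dtype=float)
print(np.allclose(A @ beta1, 0), np.allclose(A @ beta2, 0))
```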
9. Non-homogeneous Equations.

Next let us consider non-homogeneous equations. Suppose we consider the same equations, but the right hand sides are now $0, -3, -4, -1$. Carrying the right hand sides through the same reduction, the result is ultimately obtained as
$$y_1 = -3y_4 + 2y_5 - 1, \qquad y_2 = 8y_4 + y_5, \qquad y_3 = -2y_4 + y_5.$$
The general solution is now given by
$$(y_1, y_2, y_3, y_4, y_5) = l\beta_1 + m\beta_2 + \beta_3,$$
where
$$\beta_3 = (-1, 0, 0, 0, 0).$$
For the general non-homogeneous linear equations, write
$$a_{11}y_1 + a_{21}y_2 + \cdots + a_{n1}y_n = l_1$$
$$a_{12}y_1 + a_{22}y_2 + \cdots + a_{n2}y_n = l_2$$
$$\cdots$$
$$a_{1m}y_1 + a_{2m}y_2 + \cdots + a_{nm}y_n = l_m.$$
Then it is clear that if
$$A = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \cdots & \cdots & \cdots & \cdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{pmatrix}, \qquad \bar{A} = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} & l_1 \\ a_{12} & a_{22} & \cdots & a_{n2} & l_2 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} & l_m \end{pmatrix},$$
then $\mathrm{Rank}\ \bar{A} \geq \mathrm{Rank}\ A$. If, however, the system is to be solvable, whenever a null vector appears on the left hand side during the reduction, a zero must appear on the right hand side, otherwise there will be an inconsistency. (Make this clear by considering the example when the right hand side numbers are $0, -3, -3, -1$.) Hence the necessary and sufficient condition for the solvability of the system is that
$$\mathrm{Rank}\ A = \mathrm{Rank}\ \bar{A},$$
i.e. that the rank of the vector space of the coefficient vectors of the homogeneous portion does not increase by the adjunction to each vector of a new coordinate corresponding to the non-homogeneous portion.
10. Projections.

The length of the vector $\alpha = (a_1, a_2, \ldots, a_n)$ is defined to be $\sqrt{(\alpha \cdot \alpha)}$. Then the square of the length $= (\alpha \cdot \alpha) = \alpha^2$. If we confine the coordinates to real numbers only, then it is seen that the length cannot vanish unless the vector is null. Since the vanishing of the length is the condition of self-orthogonality, we may say that a vector cannot be self-orthogonal unless it is null.

A vector with unit length is usually called a unit vector. A vector can always be converted into a unit vector by suitable multiplication with a constant, e.g., if $\alpha = (a_1, a_2, \ldots, a_n)$, then $c\alpha = (ca_1, ca_2, \ldots, ca_n)$ is a unit vector if we take
$$c = \frac{1}{\sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}}.$$

I shall now give you a few theorems on the orthogonality of vectors, and their connection with independence. It must be remembered that we are dealing with real constants as coefficients.
(1) If $\alpha_1, \alpha_2, \ldots, \alpha_n$ are mutually orthogonal non-null vectors, they form an independent set. Correspondingly, if $Y_1, Y_2, \ldots, Y_n$ are mutually orthogonal non-null linear functions, they form an independent set.

Cor. There cannot exist more than $n$ mutually orthogonal vectors (with $n$ coordinates); correspondingly, there cannot exist more than $n$ mutually orthogonal linear functions of $n$ variates.

(2) If $\alpha_1, \alpha_2, \ldots, \alpha_m$ and $\beta_1, \beta_2, \ldots, \beta_{m'}$ are two sets of independent vectors such that any $\alpha_i$ is orthogonal to any $\beta_j$, then the set $\alpha_1, \ldots, \alpha_m, \beta_1, \ldots, \beta_{m'}$ is an independent set. Correspondingly, if $Y_1, Y_2, \ldots, Y_m$ and $Y_1', Y_2', \ldots, Y_{m'}'$ are two sets of independent linear functions such that any $Y_i$ is orthogonal to any $Y_j'$, then the set $Y_1, \ldots, Y_m, Y_1', \ldots, Y_{m'}'$ is an independent set.
Proof of (1). If possible let
$$c_1\alpha_1 + c_2\alpha_2 + \cdots + c_n\alpha_n = 0.$$
Then
$$c_i\alpha_i \cdot (c_1\alpha_1 + c_2\alpha_2 + \cdots + c_n\alpha_n) = 0, \qquad \text{i.e.} \quad c_i^2\alpha_i^2 = 0,$$
whence $c_i = 0$ ($i = 1, 2, \ldots, n$), since the $\alpha_i$ are non-null. This shows that there cannot exist a relation $c_1\alpha_1 + c_2\alpha_2 + \cdots + c_n\alpha_n = 0$ in which the $c$'s are not all zero.

Proof of (2). If possible let
$$c_1\alpha_1 + c_2\alpha_2 + \cdots + c_m\alpha_m + d_1\beta_1 + d_2\beta_2 + \cdots + d_{m'}\beta_{m'} = 0,$$
or, putting $\gamma_1 = c_1\alpha_1 + \cdots + c_m\alpha_m$ and $\gamma_2 = d_1\beta_1 + \cdots + d_{m'}\beta_{m'}$,
$$\gamma_1 + \gamma_2 = 0.$$
Since every $\alpha_i$ is orthogonal to every $\beta_j$, $(\gamma_1 \cdot \gamma_2) = 0$. Multiplying the relation scalarly by $\gamma_1$ we get $\gamma_1^2 = 0$, so that $\gamma_1 = 0$; similarly $\gamma_2 = 0$. By the independence of the $\alpha$'s and of the $\beta$'s,
$$c_1 = c_2 = \cdots = c_m = d_1 = d_2 = \cdots = d_{m'} = 0.$$
Hence the result.

(3) If $\alpha$ is a non-null vector and $\gamma$ is any vector, then we can uniquely express $\gamma$ in the form
$$\gamma = \beta_1 + \beta_2,$$
where $\beta_2$ is orthogonal to $\alpha$ and $\beta_1$ is dependent on $\alpha$, i.e. $\beta_1 = c\alpha$. The vectors $\beta_1$ and $\beta_2$ are said to be the components of $\gamma$ along and orthogonal to $\alpha$; $\beta_1$ may be said to be the projection of $\gamma$ on $\alpha$. Correspondingly, if $Y_0$ is a non-null linear function and $Y$ is any linear function, we can always express $Y$ uniquely in the form
$$Y = Y_1 + Y_2,$$
where $Y_1$ is orthogonal to $Y_0$ and $Y_2$ is dependent on $Y_0$. (Thus $Y$ is uniquely decomposed into two parts, one dependent on $Y_0$ and perfectly correlated with it, and one orthogonal to $Y_0$ and therefore completely uncorrelated with it.)
(4) If $\beta_1, \beta_2, \ldots, \beta_m$ are any system of mutually orthogonal non-null vectors, then any vector $\gamma$ can be uniquely expressed in the form
$$\gamma = \beta + \beta_{m+1},$$
where $\beta$ depends on $\beta_1, \beta_2, \ldots, \beta_m$ and $\beta_{m+1}$ is orthogonal to them. Correspondingly, if $Y_1, Y_2, \ldots, Y_m$ are any system of mutually orthogonal non-null linear functions, then any linear function $Y$ can be uniquely expressed in the form
$$Y = Y' + Y_{m+1},$$
where $Y'$ depends on $Y_1, Y_2, \ldots, Y_m$ and $Y_{m+1}$ is orthogonal to them.
Proof of (3). Suppose $\gamma = \beta_1 + \beta_2 = c\alpha + \beta_2$, where $\beta_1$, $\beta_2$ satisfy the above properties. Then
$$(\alpha \cdot \gamma) = c(\alpha \cdot \alpha) = c\alpha^2, \qquad \text{or} \quad c = \frac{(\alpha \cdot \gamma)}{\alpha^2}.$$
Thus
$$\beta_1 = \frac{(\alpha \cdot \gamma)}{\alpha^2}\,\alpha, \qquad \beta_2 = \gamma - \frac{(\alpha \cdot \gamma)}{\alpha^2}\,\alpha.$$
Conversely, if $\beta_1$ and $\beta_2$ are as above, they satisfy the required properties. Hence the result.
Proof of (4). Suppose $\gamma = \beta + \beta_{m+1}$, where $\beta$ and $\beta_{m+1}$ satisfy the above properties; then
$$\beta = c_1\beta_1 + c_2\beta_2 + \cdots + c_m\beta_m,$$
$$\therefore\quad \gamma = c_1\beta_1 + c_2\beta_2 + \cdots + c_m\beta_m + \beta_{m+1}.$$
Taking the scalar product with $\beta_i$,
$$(\beta_i \cdot \gamma) = c_i\beta_i^2, \qquad \text{or} \quad c_i = \frac{(\beta_i \cdot \gamma)}{\beta_i^2}, \qquad i = 1, 2, \ldots, m.$$
Hence
$$\beta = \frac{(\beta_1 \cdot \gamma)}{\beta_1^2}\,\beta_1 + \frac{(\beta_2 \cdot \gamma)}{\beta_2^2}\,\beta_2 + \cdots + \frac{(\beta_m \cdot \gamma)}{\beta_m^2}\,\beta_m,$$
$$\beta_{m+1} = \gamma - \frac{(\beta_1 \cdot \gamma)}{\beta_1^2}\,\beta_1 - \frac{(\beta_2 \cdot \gamma)}{\beta_2^2}\,\beta_2 - \cdots - \frac{(\beta_m \cdot \gamma)}{\beta_m^2}\,\beta_m.$$
Conversely, if $\beta$ and $\beta_{m+1}$ are as above, they satisfy the required properties. Hence the result.
(5) Given a vector space $V$ of rank $r$, we can always choose $r$ mutually orthogonal vectors $\beta_1, \beta_2, \ldots, \beta_r$ forming a basis of $V$. If $\alpha_1, \alpha_2, \ldots, \alpha_r$ is a basis of $V$, then this choice can always be made in such a way that $\beta_i$ depends only on the first $i$ vectors of the basis, $\alpha_1, \alpha_2, \ldots, \alpha_i$. Correspondingly, given a linear set of functions with $r$ degrees of freedom, we can always choose $r$ mutually orthogonal linear functions $Y_1', Y_2', \ldots, Y_r'$ which generate the set. If $Y_1, Y_2, \ldots, Y_r$ are independent linear functions generating the set, then the choice can always be made in such a way that $Y_i'$ depends only on $Y_1, Y_2, \ldots, Y_i$.

Cor. We can always find $n$ mutually orthogonal vectors (with $n$ coordinates); correspondingly, we can always find $n$ mutually orthogonal linear functions of $n$ variates.
Proof of (5). Let a basis of $V$ be $\alpha_1, \alpha_2, \ldots, \alpha_r$. We replace $\alpha_1$ by $\beta_1 = \alpha_1$. Now we can express $\alpha_2$ in the form
$$\alpha_2 = c\beta_1 + \beta_2,$$
where $c\beta_1$ depends on $\beta_1$ and $\beta_2$ is orthogonal to $\beta_1$ (cf. Theorem 3). Since $\beta_1 = \alpha_1$, $\beta_2$ depends on $\alpha_1$ and $\alpha_2$; also $\beta_2$ is orthogonal to $\beta_1$. We replace $\alpha_2$ by $\beta_2$. Now we can express $\alpha_3$ in the form
$$\alpha_3 = c_1\beta_1 + c_2\beta_2 + \beta_3,$$
where $\beta_1, \beta_2, \beta_3$ are mutually orthogonal and $\beta_3$ depends on $\alpha_1, \alpha_2, \alpha_3$. We replace $\alpha_3$ by $\beta_3$. Continuing this process we get $r$ mutually orthogonal vectors $\beta_1, \beta_2, \ldots, \beta_r$ lying in $V$ and forming a basis of $V$, $\beta_i$ depending only on $\alpha_1, \alpha_2, \ldots, \alpha_i$.
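The orthogonalization process in this proof is what is now usually called the Gram-Schmidt process. A minimal Python sketch follows (the starting vectors are hypothetical, chosen only for illustration):

```python
import numpy as np

def gram_schmidt(alphas):
    """Orthogonalize a list of independent vectors as in the proof of (5):
    beta_i is alpha_i minus its components along beta_1, ..., beta_{i-1},
    so beta_i depends only on alpha_1, ..., alpha_i."""
    betas = []
    for a in alphas:
        b = a.astype(float).copy()
        for prev in betas:
            # component of a along prev (Theorem 3): ((prev . a)/prev^2) prev
            b -= (prev @ a) / (prev @ prev) * prev
        betas.append(b)
    return betas

alphas = [np.array([1., 1., 0., 0.]),
          np.array([1., 0., 1., 0.]),
          np.array([1., 0., 0., 1.])]
betas = gram_schmidt(alphas)
# All pairwise scalar products vanish.
print([round(betas[i] @ betas[j], 10) for i in range(3) for j in range(i + 1, 3)])
```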
(6) Given a vector space $V$ of rank $r$, any vector $\gamma$ can be uniquely expressed in the form
$$\gamma = \alpha + \beta,$$
where $\alpha$ lies in $V$ and $\beta$ is orthogonal to $V$. The vectors $\alpha$ and $\beta$ are called the components of $\gamma$ lying in and orthogonal to $V$; $\alpha$ may be called the projection of $\gamma$ on $V$. Clearly $\beta$ is the projection of $\gamma$ on $V'$, the space completely orthogonal to $V$. Correspondingly, given a linear set with $r$ degrees of freedom, any linear function $Y$ can be uniquely expressed in the form
$$Y = Y_1 + Y_2,$$
where $Y_1$ belongs to the set and $Y_2$ is orthogonal to the functions of the set; $Y_1$ and $Y_2$ may be called the components of $Y$ lying in and orthogonal to the set.

Cor. $\gamma^2 = \alpha^2 + \beta^2$.

Proof of (6). From Theorem 5, we can find a basis $\beta_1, \beta_2, \ldots, \beta_r$ of $V$ such that $\beta_1, \beta_2, \ldots, \beta_r$ are orthogonal. The result follows from Theorem 4.
(7) Let $V_1$ be a sub-space of the vector space $V$. Let $\alpha$ be the projection of $\gamma$ on $V$, and $\alpha_1$ the projection of $\alpha$ on $V_1$. Then $\alpha_1$ is also the projection of $\gamma$ on $V_1$. Correspondingly, let the linear set $L_1$ be a subset of the linear set $L$, let $Y_0$ be the component of the linear function $Y$ lying in $L$, and let $Y_1$ be the component of $Y_0$ lying in $L_1$; then $Y_1$ is the component of $Y$ lying in $L_1$.

Proof of (7).
$$\gamma = \alpha + \beta,$$
where $\beta$ is orthogonal to $V$, and thus to $V_1$. Also
$$\alpha = \alpha_1 + \beta_1,$$
where $\beta_1$ is orthogonal to $V_1$. Hence
$$\gamma = \alpha_1 + (\beta + \beta_1),$$
where $\beta + \beta_1$ is orthogonal to $V_1$. Hence $\alpha_1$ must be the projection of $\gamma$ on $V_1$.
CHAPTER II
1. Consider $n$ independent random variates or observables
$$y_1, y_2, \ldots, y_n \tag{1.1}$$
with a common variance $\sigma^2$, whose expectations are linear functions, with known coefficients, of $m$ unknown parameters $p_1, p_2, \ldots, p_m$. Thus
$$E(y_i) = a_{i1}p_1 + a_{i2}p_2 + \cdots + a_{im}p_m, \qquad i = 1, 2, \ldots, n. \tag{1.2}$$
A linear function
$$Y = c_1y_1 + c_2y_2 + \cdots + c_ny_n$$
of $y_1, y_2, \ldots, y_n$ will be called an unbiased linear estimate of the parametric function
$$\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$$
if $E(Y) = \Pi$ independently of the parameters. Now
$$E(Y) = (c_1a_{11} + c_2a_{21} + \cdots + c_na_{n1})p_1 + (c_1a_{12} + c_2a_{22} + \cdots + c_na_{n2})p_2 + \cdots + (c_1a_{1m} + c_2a_{2m} + \cdots + c_na_{nm})p_m. \tag{1.5}$$
Thus a necessary and sufficient condition for $Y$ to be an unbiased linear estimate of $\Pi$ is
$$c_1a_{11} + c_2a_{21} + \cdots + c_na_{n1} = l_1$$
$$c_1a_{12} + c_2a_{22} + \cdots + c_na_{n2} = l_2$$
$$\cdots$$
$$c_1a_{1m} + c_2a_{2m} + \cdots + c_na_{nm} = l_m. \tag{1.6}$$
A linear function $\Pi$ of the parameters is said to be estimable if there exists a linear function $Y$ of the variates which is an unbiased linear estimate of $\Pi$. In this case there must exist $c_1, c_2, \ldots, c_n$ satisfying (1.6). Hence we get:

Theorem (1). The parametric function $\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$ is estimable if and only if the matrices
$$A = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} \\ a_{12} & a_{22} & \cdots & a_{n2} \\ \cdots & \cdots & \cdots & \cdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} \end{pmatrix}, \qquad \bar{A} = \begin{pmatrix} a_{11} & a_{21} & \cdots & a_{n1} & l_1 \\ a_{12} & a_{22} & \cdots & a_{n2} & l_2 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ a_{1m} & a_{2m} & \cdots & a_{nm} & l_m \end{pmatrix} \tag{1.7}$$
have the same rank.

Corollary. If the rank of $A$ is $m$, then every parametric function is estimable.

Proof: $\mathrm{Rank}\ \bar{A} \geq \mathrm{Rank}\ A$, but $\mathrm{Rank}\ \bar{A}$ cannot exceed $m$ since it has $m$ rows. Hence $\mathrm{Rank}\ \bar{A} = \mathrm{Rank}\ A$.
The column vectors in the equations of expectation may be denoted by
$$\alpha_1 = (a_{11}, a_{21}, \ldots, a_{n1}), \quad \alpha_2 = (a_{12}, a_{22}, \ldots, a_{n2}), \quad \ldots, \quad \alpha_m = (a_{1m}, a_{2m}, \ldots, a_{nm}),$$
and we may denote the observables by the vector
$$\eta = (y_1, y_2, \ldots, y_n),$$
which may be called the observation vector. The equations of expectation can then be simply written as
$$E(\eta) = p_1\alpha_1 + p_2\alpha_2 + \cdots + p_m\alpha_m. \tag{1.8}$$
The linear function $Y$ can then be written $Y = (\gamma \cdot \eta)$, where $\gamma = (c_1, c_2, \ldots, c_n)$ is the coefficient vector, and
$$E(Y) = E(\gamma \cdot \eta) = p_1(\gamma \cdot \alpha_1) + p_2(\gamma \cdot \alpha_2) + \cdots + p_m(\gamma \cdot \alpha_m), \tag{1.9}$$
which is the result (1.5) in a compact form.
2. When $\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$ is estimable, there will exist in general an infinity of solutions for (1.6), so that an infinity of unbiased linear estimates of $\Pi$ is possible. Out of these we have to pick out the one whose variance is the least. This may be called the best unbiased linear estimate. Before proceeding to this, we shall establish the notions of error and estimation spaces.

A linear function $Y = c_1y_1 + c_2y_2 + \cdots + c_ny_n$ may be said to belong to 'error' if
$$E(Y) = 0$$
independently of the parameters. Hence for $Y$ to belong to error
$$(\gamma \cdot \alpha_1) = 0, \quad (\gamma \cdot \alpha_2) = 0, \quad \ldots, \quad (\gamma \cdot \alpha_m) = 0.$$
Thus the coefficient vector $\gamma$ of $Y$ lies in the vector space $V'$ completely orthogonal to the vector space $V$ generated by the vectors $\alpha_1, \alpha_2, \ldots, \alpha_m$. We may call $V'$ the error space. The space $V$ is called the 'estimation' space, for a reason which will presently appear.
Theorem (2). If $\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$ is any estimable parametric function, then there exists a unique linear function $Y_0$, whose coefficient vector $\gamma_0 = (c_{10}, c_{20}, \ldots, c_{n0})$ lies in the estimation space, for which
$$E(Y_0) = \Pi.$$
This function $Y_0$ is the best estimate of $\Pi$.

Since $\Pi$ is estimable, there exists a linear function
$$Y = c_1y_1 + c_2y_2 + \cdots + c_ny_n = (\gamma \cdot \eta)$$
such that $E(Y) = \Pi$. Now let $\gamma_0 = (c_{10}, c_{20}, \ldots, c_{n0})$ and $\gamma' = (c_1', c_2', \ldots, c_n')$ be the components of $\gamma$ along and orthogonal to $V$. Then $\gamma = \gamma_0 + \gamma'$ and
$$Y = (c_{10}y_1 + c_{20}y_2 + \cdots + c_{n0}y_n) + (c_1'y_1 + c_2'y_2 + \cdots + c_n'y_n) = Y_0 + Y'.$$
$$\therefore\quad \Pi = E(Y) = E(Y_0),$$
since $E(Y') = 0$, as $Y'$ belongs to error. This shows that there exists a linear function $Y_0$ whose expectation is $\Pi$ and whose coefficient vector lies in the estimation space. If possible let there exist another such function $Y_0^*$, with coefficient vector $\gamma_0^* = (c_{10}^*, c_{20}^*, \ldots, c_{n0}^*)$. Then the expectation of the linear function
$$(c_{10} - c_{10}^*)y_1 + (c_{20} - c_{20}^*)y_2 + \cdots + (c_{n0} - c_{n0}^*)y_n,$$
with coefficient vector $\gamma_0 - \gamma_0^*$, is zero. Hence $\gamma_0 - \gamma_0^*$ belongs to error and is orthogonal to $V$. But it lies in $V$. This is impossible unless $\gamma_0 - \gamma_0^* = 0$, i.e. $\gamma_0 = \gamma_0^*$, i.e. $Y_0 = Y_0^*$. This proves the uniqueness of $Y_0$.

Also
$$V(Y) = \sigma^2\gamma^2 = \sigma^2(\gamma_0 + \gamma')^2 = \sigma^2\gamma_0^2 + \sigma^2\gamma'^2 = V(Y_0) + V(Y').$$
$$\therefore\quad V(Y_0) \leq V(Y),$$
the equality holding when and only when $\gamma' = 0$, i.e. when $Y$ coincides with $Y_0$. This completes our proof.

Corollary. Between the estimable parametric functions and their best estimates there is a (1, 1) correspondence, such that if $Y_1, Y_2, \ldots, Y_k$ are the best estimates of $\Pi_1, \Pi_2, \ldots, \Pi_k$, then $Y = b_1Y_1 + b_2Y_2 + \cdots + b_kY_k$ is the best estimate of $\Pi = b_1\Pi_1 + b_2\Pi_2 + \cdots + b_k\Pi_k$.

Proof: Clearly $E(Y) = \Pi$, and since the coefficient vectors of $Y_1, \ldots, Y_k$ lie in the estimation space, the same is true for the coefficient vector of $Y$.
3. The previous theorem may be put in a slightly different form: If
$$Y_1 = (\alpha_1 \cdot \eta), \quad Y_2 = (\alpha_2 \cdot \eta), \quad \ldots, \quad Y_m = (\alpha_m \cdot \eta),$$
and $\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$ is an estimable parametric function, then there exists one and only one linear function of the form
$$Y_0 = q_1Y_1 + q_2Y_2 + \cdots + q_mY_m \tag{3.1}$$
for which $E(Y_0) = \Pi$. This linear function is the best estimate of $\Pi$.

To actually determine the best estimate we have to find the $q$'s. Now
$$E(Y_0) = q_1E(\alpha_1 \cdot \eta) + q_2E(\alpha_2 \cdot \eta) + \cdots + q_mE(\alpha_m \cdot \eta) = l_1p_1 + l_2p_2 + \cdots + l_mp_m.$$
Hence, using (1.9), we have
$$q_1(\alpha_1 \cdot \alpha_1) + q_2(\alpha_2 \cdot \alpha_1) + \cdots + q_m(\alpha_m \cdot \alpha_1) = l_1$$
$$q_1(\alpha_1 \cdot \alpha_2) + q_2(\alpha_2 \cdot \alpha_2) + \cdots + q_m(\alpha_m \cdot \alpha_2) = l_2$$
$$\cdots$$
$$q_1(\alpha_1 \cdot \alpha_m) + q_2(\alpha_2 \cdot \alpha_m) + \cdots + q_m(\alpha_m \cdot \alpha_m) = l_m \tag{3.2}$$
Solving (3.2) for the $q$'s and substituting in (3.1), we get the best estimate.

We shall now prove the following fundamental theorem:

Theorem: If $\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$ is an estimable parametric function, then its best estimate is obtained by substituting for the $p$'s in $\Pi$ any solution of the normal equations
$$\hat{p}_1(\alpha_1 \cdot \alpha_1) + \hat{p}_2(\alpha_1 \cdot \alpha_2) + \cdots + \hat{p}_m(\alpha_1 \cdot \alpha_m) = (\alpha_1 \cdot \eta) = Y_1$$
$$\cdots$$
$$\hat{p}_1(\alpha_m \cdot \alpha_1) + \hat{p}_2(\alpha_m \cdot \alpha_2) + \cdots + \hat{p}_m(\alpha_m \cdot \alpha_m) = (\alpha_m \cdot \eta) = Y_m \tag{3.3}$$
Let $\hat{p}_1, \hat{p}_2, \ldots, \hat{p}_m$ be any solution of (3.3). Substitute these in (3.3), multiply by $q_1, q_2, \ldots, q_m$ and add. Using (3.2) we get
$$l_1\hat{p}_1 + l_2\hat{p}_2 + \cdots + l_m\hat{p}_m = q_1(\alpha_1 \cdot \eta) + q_2(\alpha_2 \cdot \eta) + \cdots + q_m(\alpha_m \cdot \eta) = \text{best estimate}.$$

Corollary: It should be noted that the left hand side in (3.3) is simply the expectation of the right hand side. Since any estimating function is of the form (3.1), it follows that any estimable parametric function must be a linear combination of the parametric functions occurring on the left side of (3.3).
4. Markoff's Theorem.

If $\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m$ is estimable, then its best estimate is obtained by substituting for the $p$'s those values which minimize the sum of squares of the deviations of the observations from expectation, i.e. the sum of squares
$$S^2 = \sum_{i=1}^n (y_i - a_{i1}p_1 - a_{i2}p_2 - \cdots - a_{im}p_m)^2.$$
For
$$\tfrac{1}{2}\frac{\partial S^2}{\partial p_i} = -a_{1i}(y_1 - a_{11}p_1 - a_{12}p_2 - \cdots - a_{1m}p_m) - a_{2i}(y_2 - a_{21}p_1 - a_{22}p_2 - \cdots - a_{2m}p_m) - \cdots - a_{ni}(y_n - a_{n1}p_1 - a_{n2}p_2 - \cdots - a_{nm}p_m).$$
Hence, equating the partial differential coefficients with respect to $p_1, p_2, \ldots, p_m$ to zero, we get
$$(\alpha_i \cdot \eta) = p_1(\alpha_i \cdot \alpha_1) + p_2(\alpha_i \cdot \alpha_2) + \cdots + p_m(\alpha_i \cdot \alpha_m), \qquad i = 1, 2, \ldots, m,$$
which are identical with the normal equations (3.3) already deduced. Hence the theorem.
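The computation prescribed by the normal equations (3.3), and its equivalence with direct minimization of the sum of squares, can be sketched as follows in Python (the design matrix and observations are hypothetical, chosen only for illustration):

```python
import numpy as np

# Columns of A are the vectors alpha_1, ..., alpha_m; eta is the observation vector.
A = np.array([[1., 1.],
              [1., 2.],
              [1., 3.],
              [1., 4.]])            # hypothetical design, n = 4, m = 2
eta = np.array([1.1, 1.9, 3.2, 3.9])

G = A.T @ A                          # matrix of scalar products (alpha_i . alpha_j)
Y = A.T @ eta                        # right hand sides (alpha_i . eta)
p_hat = np.linalg.solve(G, Y)        # solution of the normal equations (3.3)

# Markoff's theorem: the same values minimize sum((eta - A p)^2).
p_ls, *_ = np.linalg.lstsq(A, eta, rcond=None)
print(p_hat, p_ls)                   # agree
```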
5. Variance of the Best Estimate.

We have seen that the best estimate depends on the linear functions $Y_i = (\alpha_i \cdot \eta)$ in all cases. We therefore start by writing down the variances and covariances of these:
$$V\{(\alpha_i \cdot \eta)\} = (\alpha_i \cdot \alpha_i)\,\sigma^2 = \alpha_i^2\sigma^2, \qquad \mathrm{Cov}\{(\alpha_i \cdot \eta), (\alpha_j \cdot \eta)\} = (\alpha_i \cdot \alpha_j)\,\sigma^2.$$
Hence
$$V(Y_0) = \big[q_1^2(\alpha_1 \cdot \alpha_1) + \cdots + q_m^2(\alpha_m \cdot \alpha_m) + 2q_1q_2(\alpha_1 \cdot \alpha_2) + \text{etc.}\big]\sigma^2 = \sigma^2\sum_{i,j} q_iq_j(\alpha_i \cdot \alpha_j),$$
where the $q$'s satisfy (3.2). Thus
$$V(Y_0) = \sigma^2\big[q_1\{q_1(\alpha_1 \cdot \alpha_1) + q_2(\alpha_2 \cdot \alpha_1) + \cdots + q_m(\alpha_m \cdot \alpha_1)\} + q_2\{q_1(\alpha_1 \cdot \alpha_2) + q_2(\alpha_2 \cdot \alpha_2) + \cdots + q_m(\alpha_m \cdot \alpha_2)\} + \cdots + q_m\{q_1(\alpha_1 \cdot \alpha_m) + \cdots + q_m(\alpha_m \cdot \alpha_m)\}\big]$$
$$= (l_1q_1 + l_2q_2 + \cdots + l_mq_m)\,\sigma^2.$$

Suppose a solution of the normal equations (3.3) is
$$\hat{p}_1 = C_{11}Y_1 + C_{12}Y_2 + \cdots + C_{1m}Y_m$$
$$\hat{p}_2 = C_{21}Y_1 + C_{22}Y_2 + \cdots + C_{2m}Y_m$$
$$\cdots$$
$$\hat{p}_m = C_{m1}Y_1 + C_{m2}Y_2 + \cdots + C_{mm}Y_m;$$
then a solution of (3.2) would be
$$q_1 = C_{11}l_1 + C_{12}l_2 + \cdots + C_{1m}l_m$$
$$q_2 = C_{21}l_1 + C_{22}l_2 + \cdots + C_{2m}l_m$$
$$\cdots$$
$$q_m = C_{m1}l_1 + C_{m2}l_2 + \cdots + C_{mm}l_m.$$
Then $V(Y_0) = \sigma^2\sum_{i,j} l_il_jC_{ij}$. We can therefore express $V(Y_0)$ in these forms:
$$V(Y_0) = \sigma^2\sum_{i,j} q_iq_j(\alpha_i \cdot \alpha_j) = \sigma^2(q_1l_1 + q_2l_2 + \cdots + q_ml_m) = \sigma^2\sum_{i,j} l_il_jC_{ij}. \tag{5.2}$$
To obtain the coefficients $C_{ij}$, Fisher has suggested the following procedure: $(C_{i1}, C_{i2}, \ldots, C_{im})$ is a solution of the auxiliary equations obtained from the normal equations by putting the right hand sides zero, except in the $i$-th equation where the right hand side is put equal to 1.
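Fisher's procedure amounts to solving the normal equations with unit right hand sides, i.e. to inverting the matrix of scalar products $(\alpha_i \cdot \alpha_j)$. A minimal Python sketch, with a hypothetical design and parametric function:

```python
import numpy as np

# V(Y0) = sigma^2 * sum_ij l_i l_j C_ij, where column i of C solves the
# auxiliary equations G c = e_i, i.e. C is the inverse of G.
A = np.array([[1., 1.],
              [1., 2.],
              [1., 3.],
              [1., 4.]])             # same hypothetical design as above
sigma2 = 1.0
l = np.array([0., 1.])               # Pi = p_2, say

G = A.T @ A                          # (alpha_i . alpha_j)
C = np.linalg.inv(G)                 # Fisher's coefficients C_ij
var_Y0 = sigma2 * l @ C @ l
print(var_Y0)
```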
6. Example.

Let $y_1, y_2, \ldots, y_n$ be the values of the dependent variate corresponding to the values $x_1, x_2, \ldots, x_n$ of the independent variate. If we want to find the linear regression we may take
$$y_i = a + b(x_i - \bar{x}) + \varepsilon_i,$$
where $\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$, and $\varepsilon_i$ is a random variate with mean zero. Then the equations of expectation are
$$E(y_1) = a + b(x_1 - \bar{x})$$
$$E(y_2) = a + b(x_2 - \bar{x})$$
$$\cdots$$
$$E(y_n) = a + b(x_n - \bar{x}).$$
Then the estimation space is generated by
$$\alpha_1 = (1, 1, \ldots, 1), \qquad \alpha_2 = (x_1 - \bar{x},\ x_2 - \bar{x},\ \ldots,\ x_n - \bar{x}),$$
with
$$(\alpha_1 \cdot \alpha_1) = n, \qquad (\alpha_1 \cdot \alpha_2) = 0, \qquad (\alpha_2 \cdot \alpha_2) = \sum_i (x_i - \bar{x})^2.$$
The normal equations are
$$n\hat{a} = \sum_i y_i, \qquad \hat{b}\sum_i (x_i - \bar{x})^2 = \sum_i (x_i - \bar{x})y_i,$$
or
$$\hat{a} = \frac{1}{n}\sum_i y_i = \bar{y}.$$
Hence
$$\hat{b} = \frac{\sum_i (x_i - \bar{x})y_i}{\sum_i (x_i - \bar{x})^2}.$$
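A minimal Python sketch of this example, with hypothetical data (the estimates $\hat{a}$ and $\hat{b}$ are computed directly from the normal equations just derived):

```python
import numpy as np

# y_i = a + b(x_i - xbar) + eps_i, with a = 2.0 and b = 0.5 used to simulate.
rng = np.random.default_rng(1)
x = np.array([1., 2., 3., 4., 5., 6.])
y = 2.0 + 0.5 * (x - x.mean()) + rng.normal(0, 0.1, x.size)

a_hat = y.mean()                                        # from n a = sum(y)
b_hat = ((x - x.mean()) @ y) / ((x - x.mean())**2).sum()
print(a_hat, b_hat)
```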
7. The sum of squares corresponding to a single degree of freedom.

The quantity
$$S^2 = \frac{(c_1y_1 + c_2y_2 + \cdots + c_ny_n)^2}{c_1^2 + c_2^2 + \cdots + c_n^2}$$
is called the sum of squares corresponding to the single degree of freedom carried by the linear function $Y = c_1y_1 + c_2y_2 + \cdots + c_ny_n$. Now
$$E(Y^2) = V(Y) + \{E(Y)\}^2 = (c_1^2 + c_2^2 + \cdots + c_n^2)\,\sigma^2 + \{E(Y)\}^2.$$
$$\therefore\quad E(S^2) = \sigma^2 + \frac{\{E(Y)\}^2}{c_1^2 + c_2^2 + \cdots + c_n^2} = \sigma^2 + S_E^2,$$
where $S_E^2$ is obtained from $S^2$ by substituting for the observations their expectations; $S_E^2$ is an essentially positive quantity.

If $Y$ belongs to error, we say that the degree of freedom carried by it belongs to error. In that case $E(S^2) = \sigma^2$. Thus the expectation of the s.s. corresponding to any linear function which carries a d.f. belonging to error is always $\sigma^2$.

Let $\alpha$ be the projection of the observation vector $\eta$ on the coefficient vector $\gamma$ of $Y$. We shall show that the s.s. corresponding to $Y$ is the square of the length of $\alpha$, i.e. $S^2 = \alpha^2$. We have
$$\eta = \alpha + \beta,$$
where $\alpha = c\gamma$ and $\beta$ is orthogonal to $\gamma$.
$$\therefore\quad (\gamma \cdot \eta) = (\gamma \cdot \alpha) = c\gamma^2, \qquad \therefore\quad c = \frac{(\gamma \cdot \eta)}{\gamma^2}, \qquad \therefore\quad \alpha = \frac{(\gamma \cdot \eta)}{\gamma^2}\,\gamma,$$
$$\therefore\quad \alpha^2 = \frac{(\gamma \cdot \eta)^2}{\gamma^2} = \frac{Y^2}{\gamma^2} = S^2.$$
8. The sum of squares corresponding to a set of k degrees of freedom.

Consider a linear set of functions with $k$ d.f. We can always find $k$ mutually orthogonal linear functions $Y_1, \ldots, Y_k$ belonging to the set. If $S_1^2, \ldots, S_k^2$ are the corresponding s.s., then the s.s. corresponding to the $k$ d.f. carried by the functions of the set is defined as
$$S^2 = S_1^2 + S_2^2 + \cdots + S_k^2.$$
It will in general be possible to choose $Y_1, \ldots, Y_k$ in an infinity of ways. For our definition to be unambiguous, we must show that it is independent of any particular choice.

Let $\gamma_1, \ldots, \gamma_k$ be the coefficient vectors of $Y_1, \ldots, Y_k$, and let $\alpha$ be the projection of $\eta$ on the vector space $L$ generated by $\gamma_1, \ldots, \gamma_k$. Let
$$\alpha = c_1\gamma_1 + c_2\gamma_2 + \cdots + c_k\gamma_k;$$
then $\alpha_i = c_i\gamma_i$ is also the projection of $\eta$ on $\gamma_i$. We thus have
$$\alpha^2 = \alpha_1^2 + \alpha_2^2 + \cdots + \alpha_k^2 = S_1^2 + S_2^2 + \cdots + S_k^2 = S^2.$$
Hence the s.s. defined before is the square of the projection of $\eta$ on the vector space $L$, and is therefore unique.

Cor. 1.
$$E(S^2) = k\sigma^2 + S_E^2,$$
where $S_E^2$ is the quantity obtained from $S^2$ by replacing the observations by their expectations. In particular, if the $k$ d.f. belong to error, $E(S^2) = k\sigma^2$.

$s^2 = S^2/k$ is defined as the mean square for the $k$ d.f. in question.
Corollary 2. It is clear that if there are two linear sets carrying $k_1$ and $k_2$ d.f., such that the functions of the first set are orthogonal to the functions of the second set, and if $S_1^2$ and $S_2^2$ are the s.s. corresponding to those $k_1$ and $k_2$ d.f., then the s.s. corresponding to the $k_1 + k_2$ degrees of freedom carried by the two sets taken together is given by $S^2 = S_1^2 + S_2^2$. In general, if $k$ d.f. are partitioned into mutually orthogonal sets of $k_1, k_2, \ldots, k_m$ d.f., the sum of squares $S^2$ belonging to the $k$ d.f. can also be partitioned into the corresponding orthogonal components $S_1^2, S_2^2, \ldots, S_m^2$.

The sum of squares corresponding to the $n$ degrees of freedom carried by all the linear functions of the observations is $\eta^2 = y_1^2 + y_2^2 + \cdots + y_n^2$.

We have seen that there is a (1, 1) correspondence between estimable parametric functions and their best estimates, such that if $k$ of the parametric functions are independent then $k$ of the best estimates are independent. Thus if $k$ d.f. are carried by a linear set of parametric functions, then their best estimates will carry $k$ d.f. These $k$ d.f. are spoken of as belonging either to the parametric functions or to the estimates, but the corresponding s.s. is always calculated from the best estimates.
9. Analytical formula for the sum of squares belonging to any number of d.f.

Let $Y_1, Y_2, \ldots, Y_k$ be $k$ linear functions given by
$$Y_1 = c_{11}y_1 + c_{21}y_2 + \cdots + c_{n1}y_n$$
$$Y_2 = c_{12}y_1 + c_{22}y_2 + \cdots + c_{n2}y_n$$
$$\cdots$$
$$Y_k = c_{1k}y_1 + c_{2k}y_2 + \cdots + c_{nk}y_n,$$
not necessarily orthogonal to one another. To find the s.s. corresponding to these, we have to find the square of the projection of $\eta$ on the vector space generated by the coefficient vectors $\gamma_1, \gamma_2, \ldots, \gamma_k$. Let this projection be
$$\alpha = t_1\gamma_1 + t_2\gamma_2 + \cdots + t_k\gamma_k;$$
then
$$\eta = t_1\gamma_1 + t_2\gamma_2 + \cdots + t_k\gamma_k + \beta,$$
where $\beta$ is orthogonal to $\gamma_1, \gamma_2, \ldots, \gamma_k$. Hence the $t$'s are determined by
$$t_1(\gamma_1 \cdot \gamma_1) + t_2(\gamma_1 \cdot \gamma_2) + \cdots + t_k(\gamma_1 \cdot \gamma_k) = (\gamma_1 \cdot \eta)$$
$$t_1(\gamma_2 \cdot \gamma_1) + t_2(\gamma_2 \cdot \gamma_2) + \cdots + t_k(\gamma_2 \cdot \gamma_k) = (\gamma_2 \cdot \eta)$$
$$\cdots$$
$$t_1(\gamma_k \cdot \gamma_1) + t_2(\gamma_k \cdot \gamma_2) + \cdots + t_k(\gamma_k \cdot \gamma_k) = (\gamma_k \cdot \eta) \tag{9.2}$$
The required sum of squares is
$$S^2 = (t_1\gamma_1 + t_2\gamma_2 + \cdots + t_k\gamma_k)^2 = t_1(\gamma_1 \cdot \eta) + t_2(\gamma_2 \cdot \eta) + \cdots + t_k(\gamma_k \cdot \eta), \tag{9.3}$$
where $t_1, t_2, \ldots, t_k$ satisfy (9.2). It should be noticed that even if $Y_1, Y_2, \ldots, Y_k$ are not independent, the formula (9.3) for the sum of squares is valid.

Cor. 1. To find the s.s. due to all the estimates, we have to take the linear functions $(\alpha_1 \cdot \eta), (\alpha_2 \cdot \eta), \ldots, (\alpha_m \cdot \eta)$. The equations (9.2) now become the normal equations. Hence the s.s. due to all estimating functions (or all estimable parametric functions) is
$$S_0^2 = \hat{p}_1(\alpha_1 \cdot \eta) + \hat{p}_2(\alpha_2 \cdot \eta) + \cdots + \hat{p}_m(\alpha_m \cdot \eta).$$

Cor. 2. Let $n_0$ be the rank of the estimation space; then the rank of the error space is $n_e = n - n_0$. Thus $n_0$ d.f. belong to the estimates and $n_e$ to the error. These two sets are mutually orthogonal, and if $S_e^2$ is the s.s. due to error, then
$$S_0^2 + S_e^2 = \eta^2 = y_1^2 + y_2^2 + \cdots + y_n^2,$$
or
$$S_e^2 = \eta^2 - \hat{p}_1(\alpha_1 \cdot \eta) - \hat{p}_2(\alpha_2 \cdot \eta) - \cdots - \hat{p}_m(\alpha_m \cdot \eta).$$
Since the $n_e$ d.f. belong to error, $E(S_e^2) = n_e\sigma^2$. Hence $\sigma^2$ is estimated by $S_e^2/n_e$.

Consider now the sum of the squares of the deviations of the observations from their graduated values,
$$\sum_i (y_i - a_{i1}\hat{p}_1 - a_{i2}\hat{p}_2 - \cdots - a_{im}\hat{p}_m)^2 = \eta^2 - 2\hat{p}_1(\alpha_1 \cdot \eta) - 2\hat{p}_2(\alpha_2 \cdot \eta) - \cdots - 2\hat{p}_m(\alpha_m \cdot \eta) + \sum_{i,j}\hat{p}_i\hat{p}_j(\alpha_i \cdot \alpha_j)$$
$$= \eta^2 - \hat{p}_1(\alpha_1 \cdot \eta) - \hat{p}_2(\alpha_2 \cdot \eta) - \cdots - \hat{p}_m(\alpha_m \cdot \eta),$$
on using the normal equations. Hence $S_e^2$ is also given by the sum of the squares of the deviations of the observations from their graduated values.
10. The generalized t-test.

So far we have not assumed anything about the nature of the universe of the $y$'s, except that their expectations were given by (1.2) and that they had a common variance $\sigma^2$. In what follows it will be assumed that the $y$'s are normally distributed variates. Suppose we want to test whether the estimable parametric function
$$\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m \tag{10.1}$$
is significantly different from an assigned value $a$.

Let $Y_0$ be the best estimate of $\Pi$. Then the coefficient vector $\gamma_0 = (c_{10}, \ldots, c_{n0})$ of $Y_0$ lies in the estimation space. If $n_0$ is the rank of the estimation space, we choose another $n_0 - 1$ unit vectors $\gamma_1, \ldots, \gamma_{n_0-1}$ of the estimation space, mutually orthogonal to one another and orthogonal to $\gamma_0$. Let $Y_1, \ldots, Y_{n_0-1}$ be the corresponding linear functions. Also, if $n_e$ is the rank of the error space ($n_0 + n_e = n$), we can choose $\gamma_1', \ldots, \gamma_{n_e}'$ mutually orthogonal unit vectors in the error space, and let $Y_1', \ldots, Y_{n_e}'$ be the corresponding linear functions. Then $Y_0, Y_1, \ldots, Y_{n_0-1}, Y_1', \ldots, Y_{n_e}'$ will be independently and normally distributed. The mean of $Y_0$ is $a$ under the hypothesis, and its variance is $\gamma_0^2\sigma^2 = \sigma_0^2$. The means of $Y_1, \ldots, Y_{n_0-1}$ are unknown; let them be $M_1, \ldots, M_{n_0-1}$, whereas $Y_1', \ldots, Y_{n_e}'$ are error functions and have therefore zero means. Since the coefficient vectors of $Y_1, \ldots, Y_{n_0-1}, Y_1', \ldots, Y_{n_e}'$ are of unit length, their variances are $\sigma^2$.

Hence the joint distribution of the $Y$'s is given by
$$\mathrm{const} \times e^{-(Y_0 - a)^2/2\sigma_0^2}\, e^{-\sum_{i=1}^{n_0-1}(Y_i - M_i)^2/2\sigma^2}\, e^{-\sum_{j=1}^{n_e} Y_j'^2/2\sigma^2}\ dY_0\,dY_1 \cdots dY_{n_0-1}\,dY_1' \cdots dY_{n_e}'. \tag{10.2}$$
We can at first integrate out for $Y_1, \ldots, Y_{n_0-1}$. Next we note that each of $Y_j'/\sigma$ is a normal variate with mean zero and unit variance. Hence
$$\chi^2 = \sum_{j=1}^{n_e} Y_j'^2/\sigma^2 = S_e^2/\sigma^2 \tag{10.3}$$
obeys the $\chi^2$ distribution with $n_e$ d.f. Hence the joint distribution of $Y_0$ and $\chi^2$ is
$$\mathrm{const} \times e^{-(Y_0 - a)^2/2\sigma_0^2}\,(\chi^2)^{(n_e-2)/2}\,e^{-\chi^2/2}\ dY_0\,d\chi^2. \tag{10.4}$$
The estimate of $\sigma^2$ is $s_e^2 = S_e^2/n_e$. Hence an estimate of the variance of $Y_0$ is $s_e^2(c_{10}^2 + c_{20}^2 + \cdots + c_{n0}^2)$. Let us take
$$t = \frac{Y_0 - a}{s_e\sqrt{c_{10}^2 + c_{20}^2 + \cdots + c_{n0}^2}} \tag{10.5}$$
$$= \frac{\text{Best estimate of } \Pi\ -\ \text{Value of } \Pi \text{ under the hypothesis}}{\text{square root of the estimate of the variance of the best estimate}};$$
then from (10.4) it can be proved in the usual manner that $t$ obeys Student's t-distribution with $n_e$ d.f. For let
$$\frac{Y_0 - a}{\sigma_0} = R\sin\phi, \qquad \chi = R\cos\phi;$$
then the joint distribution of $R$ and $\phi$ is
$$\mathrm{const} \times e^{-R^2/2}\,R^{n_e}\cos^{n_e-1}\phi\ d\phi\,dR.$$
Integrating out for $R$, we get
$$\mathrm{const} \times \cos^{n_e-1}\phi\ d\phi, \qquad \text{where} \quad \tan\phi = \frac{(Y_0 - a)/\sigma_0}{\chi}.$$
Now
$$t = \frac{Y_0 - a}{s_e\,\gamma_0} = \sqrt{n_e}\,\tan\phi.$$
Hence the sampling distribution of $t$ is
$$C\,dt \,\Big/\, \left(1 + \frac{t^2}{n_e}\right)^{\frac{n_e+1}{2}},$$
where $C$ is a constant.
This is the well known t-distribution with $n_e$ d.f. Find the 5% and the 1% values for this distribution with $n_e$ d.f. If the hypothesis is true, then the 5% value will be exceeded by the observed $t$ given by (10.5) only in 5% of cases. When the observed $t$ exceeds this value, we may, therefore, think the observed value to be too large to have occurred by chance, and reject the hypothesis. We say in this case that $t$ is significant at the 5% level. In so doing we shall, however, be rejecting a true hypothesis in 5% of cases. This is expressed by saying that we shall commit a mistake of the first kind in 5% of cases. If we do not want to commit a mistake of the first kind in so many cases, we may work at the 1% level, and reject the hypothesis only where the observed $t$ exceeds the 1% value of $t$. In this case we shall be on safer ground so far as the unwarranted rejection of a true hypothesis is concerned, and shall commit a mistake of the first kind in only 1% of cases. But there is another side of the picture. When the hypothesis is not true, the observed value of $t$ will exceed the 5% value of $t$ in many more cases than it will exceed the 1% value. Hence, working at the 5% level, we shall be rejecting the hypothesis when false in many more cases than when working at the 1% level. We shall, therefore, be committing an error of the second kind (non-rejection of a false hypothesis) in a smaller number of cases. Thus, in substituting the 1% level, what is gained in the first kind of error is lost on the second kind. This is made clear by the following considerations.
If the hypothesis to be tested is not true, and as a matter of fact
$$\Pi = l_1p_1 + l_2p_2 + \cdots + l_mp_m = a',$$
then $E(Y_0) = a'$, and the joint distribution of $Y_0$ and $\chi$ is
$$C\,e^{-(Y_0 - a')^2/2\sigma_0^2}\,\chi^{n_e-1}\,e^{-\chi^2/2}\ dY_0\,d\chi,$$
where
$$C = \frac{1}{\sqrt{2\pi}\,\sigma_0} \cdot \frac{1}{2^{(n_e-2)/2}\,\Gamma(n_e/2)}.$$
Putting
$$\delta = \frac{a' - a}{\sigma_0}, \qquad \frac{Y_0 - a}{\sigma_0} = R\sin\phi, \qquad \chi = R\cos\phi,$$
we now get for the distribution of $R$ and $\phi$
$$\mathrm{const} \times e^{-\delta^2/2}\,e^{-R^2/2 + \delta R\sin\phi}\,R^{n_e}\cos^{n_e-1}\phi\ dR\,d\phi.$$
We have to integrate out for $R$, which varies from 0 to $\infty$. Now
$$\int_0^\infty R^n\,e^{-(R - x)^2/2}\,dR = n!\,\sqrt{2\pi}\,I_n(-x),$$
where $I_n$ is the function introduced by Fisher in the British Association Tables, Vol. I, and defined by
$$I_0(x) = \frac{1}{\sqrt{2\pi}}\int_x^\infty e^{-t^2/2}\,dt, \qquad I_n(x) = \int_x^\infty I_{n-1}(t)\,dt.$$
(The function $Hh_n(x) = \sqrt{2\pi}\,I_n(x)$ is tabulated in the British Association Tables, Vol. I.) Writing
$$e^{-R^2/2 + \delta R\sin\phi} = e^{\delta^2\sin^2\phi/2}\,e^{-(R - \delta\sin\phi)^2/2}$$
and integrating out for $R$, we get as the distribution of $\phi$
$$\frac{\Gamma(n_e + 1)}{2^{(n_e-2)/2}\,\Gamma(n_e/2)}\;e^{-\frac{1}{2}\delta^2\cos^2\phi}\,\cos^{n_e-1}\phi\;I_{n_e}(-\delta\sin\phi)\ d\phi.$$
Finally, since $t = \sqrt{n_e}\tan\phi$, the distribution of $t$ comes out as
$$\frac{\Gamma(n_e + 1)}{2^{(n_e-2)/2}\,\Gamma(n_e/2)}\;e^{-\frac{\delta^2}{2}\frac{n_e}{n_e + t^2}}\;\frac{n_e^{n_e/2}}{(n_e + t^2)^{(n_e+1)/2}}\;I_{n_e}\!\left(-\frac{\delta t}{\sqrt{n_e + t^2}}\right)dt.$$
Remembering that
$$\sqrt{\pi}\,\Gamma(n_e + 1) = 2^{n_e}\,\Gamma\!\left(\frac{n_e + 1}{2}\right)\Gamma\!\left(\frac{n_e + 2}{2}\right),$$
we can write this distribution as
$$f(n_e, t)\,\phi(n_e, t, \delta)\,dt,$$
where
$$f(n_e, t)\,dt = \frac{\Gamma\!\left(\frac{n_e+1}{2}\right)}{\sqrt{\pi n_e}\,\Gamma\!\left(\frac{n_e}{2}\right)}\left(1 + \frac{t^2}{n_e}\right)^{-\frac{n_e+1}{2}} dt$$
is the distribution of $t$ on the null hypothesis with $n_e$ d.f., and
$$\phi(n_e, t, \delta) = 2^{\frac{n_e+2}{2}}\,\Gamma\!\left(\frac{n_e + 2}{2}\right)\,e^{-\frac{\delta^2}{2}\frac{n_e}{n_e + t^2}}\;I_{n_e}\!\left(-\frac{\delta t}{\sqrt{n_e + t^2}}\right),$$
so that when $\delta = 0$, $\phi(n_e, t, \delta) = 1$, and the distribution of $t$ reduces to the familiar distribution on the null hypothesis with $n_e$ d.f.
If we want to test our hypothesis at the $\alpha\%$ level, then we shall not reject the hypothesis if the observed $t$ lies between $-t_\alpha$ and $+t_\alpha$, where $t_\alpha$ is the table value of $t$ (null hypothesis) at the $\alpha\%$ level with $n_e$ d.f. Hence when the hypothesis is wrong, the chance of our not rejecting it is given by
$$P_2 = \int_{-t_\alpha}^{+t_\alpha} f(n_e, t)\,\phi(n_e, t, \delta)\,dt,$$
which is therefore the magnitude of the second kind of error. When $\alpha$ decreases, $t_\alpha$ increases, and hence $P_2$ increases. The quantity $1 - P_2$ is the power of the test, namely the probability with which the test enables us to reject a wrong hypothesis. Of course the power depends on both $n_e$ and $\delta$, and for a fixed $\alpha$ can be shown to be a monotonic increasing function of both.
11. Example (i).

Let $x_1, x_2, \ldots, x_n$ be a random sample of $n$ observations from a normal population with mean $m$ and variance $\sigma^2$. We want to test the hypothesis $m = a$. Now
$$E(x_1) = m, \quad E(x_2) = m, \quad \ldots, \quad E(x_n) = m.$$
The estimation space is generated by the single vector $\alpha_1 = (1, 1, \ldots, 1)$; hence the rank of the estimation space is 1, and $n - 1$ d.f. belong to error. The normal equations reduce to the single equation
$$n\hat{m} = x_1 + x_2 + \cdots + x_n.$$
Thus $m$ is estimated by $\bar{x} = (x_1 + x_2 + \cdots + x_n)/n$. The graduated value of each observation is $\bar{x}$. Hence
$$S_e^2 = \sum_i (x_i - \bar{x})^2,$$
and the estimate of $\sigma^2$ is
$$s_e^2 = \frac{\sum_i (x_i - \bar{x})^2}{n - 1} = \text{mean square due to error}.$$
Since $V(\bar{x}) = \sigma^2/n$, its estimate is $s_e^2/n$. Thus to test our hypothesis we have to take
$$t = \frac{\bar{x} - a}{s_e/\sqrt{n}},$$
and, working at the $\alpha\%$ level, we reject the hypothesis if the observed $t$ exceeds in absolute value the $\alpha\%$ value with d.f. $n - 1$ found from the tables.
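A minimal Python sketch of Example (i), with a hypothetical sample:

```python
import numpy as np
from scipy import stats

# One-sample t-test of the hypothesis m = a.
x = np.array([5.1, 4.8, 5.6, 5.0, 4.7, 5.3])   # hypothetical sample
a = 5.0
n = x.size

xbar = x.mean()
se2  = ((x - xbar)**2).sum() / (n - 1)          # mean square due to error
t    = (xbar - a) / np.sqrt(se2 / n)

# Two-sided 5% point of t on n-1 d.f.
t05 = stats.t.ppf(0.975, df=n - 1)
print(t, t05, abs(t) > t05)
```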
Ex. (ii). Let $x_{11}, x_{12}, \ldots, x_{1n_1}$ and $x_{21}, x_{22}, \ldots, x_{2n_2}$ be two random samples of sizes $n_1$ and $n_2$ from normal populations with a common variance $\sigma^2$ and means $m_1$ and $m_2$. It is required to test the hypothesis $m_1 = m_2$. Here
$$E(x_{11}) = m_1, \quad E(x_{12}) = m_1, \quad \ldots, \quad E(x_{1n_1}) = m_1,$$
$$E(x_{21}) = m_2, \quad E(x_{22}) = m_2, \quad \ldots, \quad E(x_{2n_2}) = m_2.$$
The estimation space is generated by the vectors
$$\alpha_1 = (1, 1, \ldots, 1, 0, 0, \ldots, 0), \qquad \alpha_2 = (0, 0, \ldots, 0, 1, 1, \ldots, 1),$$
so that the estimation space has rank 2, and $n_1 + n_2 - 2$ d.f. belong to error. The observation vector is
$$\eta = (x_{11}, x_{12}, \ldots, x_{1n_1}, x_{21}, x_{22}, \ldots, x_{2n_2}).$$
The normal equations are
$$(\alpha_1 \cdot \alpha_1)\hat{m}_1 + (\alpha_1 \cdot \alpha_2)\hat{m}_2 = (\alpha_1 \cdot \eta), \qquad \text{i.e.} \quad n_1\hat{m}_1 = \textstyle\sum_j x_{1j},$$
$$(\alpha_2 \cdot \alpha_1)\hat{m}_1 + (\alpha_2 \cdot \alpha_2)\hat{m}_2 = (\alpha_2 \cdot \eta), \qquad \text{i.e.} \quad n_2\hat{m}_2 = \textstyle\sum_j x_{2j},$$
so that $\hat{m}_1 = \bar{x}_1$, $\hat{m}_2 = \bar{x}_2$, where $\bar{x}_1$, $\bar{x}_2$ are the means of the two samples. Further,
$$S_e^2 = \sum_{j=1}^{n_1}(x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2}(x_{2j} - \bar{x}_2)^2,$$
and an estimate of $\sigma^2$ is
$$s_e^2 = \frac{S_e^2}{n_1 + n_2 - 2}.$$
Since
$$V(\bar{x}_1 - \bar{x}_2) = \sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right),$$
the t-statistic which we have to use is
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_e\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}.$$
Working at the $\alpha\%$ level, we reject the hypothesis if the observed value of $t$ exceeds the $\alpha\%$ value of $t$ with d.f. $n_1 + n_2 - 2$.
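A minimal Python sketch of Ex. (ii), with hypothetical samples; the last line checks the statistic against scipy's pooled-variance t-test:

```python
import numpy as np
from scipy import stats

# Two-sample t-test of m1 = m2 (common variance).
x1 = np.array([12.1, 11.8, 12.5, 12.0, 11.6])
x2 = np.array([11.2, 11.9, 11.4, 11.0, 11.7, 11.3])
n1, n2 = x1.size, x2.size

se2 = (((x1 - x1.mean())**2).sum() + ((x2 - x2.mean())**2).sum()) / (n1 + n2 - 2)
t = (x1.mean() - x2.mean()) / np.sqrt(se2 * (1/n1 + 1/n2))

print(t, stats.t.ppf(0.975, df=n1 + n2 - 2))
print(stats.ttest_ind(x1, x2, equal_var=True).statistic)   # same statistic
```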
Ex. (iii). In the example considered in paragraph 6, let it be required to test if the regression coefficient $b$ can be considered to have a particular value $B$. Here
$$S_e^2 = \sum_{i=1}^n \{y_i - \bar{y} - \hat{b}(x_i - \bar{x})\}^2.$$
Hence an estimate of $\sigma^2$ is given by
$$s_e^2 = \frac{\sum_{i=1}^n \{y_i - \bar{y} - \hat{b}(x_i - \bar{x})\}^2}{n - 2}.$$
Hence the t-statistic required is
$$t = \frac{(\hat{b} - B)\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2}}{s_e},$$
and we have to test for significance on d.f. $n - 2$.
12. The generalized z-test.

To test whether $k$ independent estimable parametric functions can be simultaneously regarded as zero, i.e. to test the null hypothesis
$$\Pi_1 = l_{11}p_1 + l_{21}p_2 + \cdots + l_{m1}p_m = 0$$
$$\Pi_2 = l_{12}p_1 + l_{22}p_2 + \cdots + l_{m2}p_m = 0$$
$$\cdots$$
$$\Pi_k = l_{1k}p_1 + l_{2k}p_2 + \cdots + l_{mk}p_m = 0 \tag{12.1}$$
let $Y_1, Y_2, \ldots, Y_k$ be the best estimates of $\Pi_1, \Pi_2, \ldots, \Pi_k$. Then their coefficient vectors $\gamma_1, \gamma_2, \ldots, \gamma_k$ must lie in the estimation space and be independent. In the vector space of rank $k$ generated by $\gamma_1, \gamma_2, \ldots, \gamma_k$ we now choose $k$ mutually orthogonal vectors of unit length, $\gamma_{10}, \gamma_{20}, \ldots, \gamma_{k0}$, and let $Y_{10}, Y_{20}, \ldots, Y_{k0}$ be the corresponding linear functions. It is clear that if $\Pi_1 = \Pi_2 = \cdots = \Pi_k = 0$, i.e. if the expectations of $Y_1, Y_2, \ldots, Y_k$ are zero, then also the expectations of $Y_{10}, Y_{20}, \ldots, Y_{k0}$ are zero, and conversely. Also $S_k^2$, the s.s. due to $Y_1, Y_2, \ldots, Y_k$, is the same as the s.s. due to $Y_{10}, Y_{20}, \ldots, Y_{k0}$. Thus
$$S_k^2 = Y_{10}^2 + Y_{20}^2 + \cdots + Y_{k0}^2.$$
Let $\gamma_{k+1}, \gamma_{k+2}, \ldots, \gamma_{n_0}$ be $n_0 - k$ mutually orthogonal vectors of unit length in the estimation space, so that they are at the same time orthogonal to $\gamma_{10}, \gamma_{20}, \ldots, \gamma_{k0}$, and let $Y_{k+1}, Y_{k+2}, \ldots, Y_{n_0}$ be the corresponding linear functions. Let their means be $M_{k+1}, M_{k+2}, \ldots, M_{n_0}$. Finally, let $\gamma_1', \gamma_2', \ldots, \gamma_{n_e}'$ be, as before, mutually orthogonal vectors of unit length in the error space, and let $Y_1', Y_2', \ldots, Y_{n_e}'$ be the corresponding linear functions.

The joint distribution of $Y_{10}, Y_{20}, \ldots, Y_{k0}, Y_{k+1}, \ldots, Y_{n_0}, Y_1', \ldots, Y_{n_e}'$ can, on the null hypothesis, be written
$$\mathrm{const} \times e^{-\sum_{i=1}^k Y_{i0}^2/2\sigma^2}\, e^{-\sum_{i=k+1}^{n_0}(Y_i - M_i)^2/2\sigma^2}\, e^{-\sum_{j=1}^{n_e} Y_j'^2/2\sigma^2}\ \prod dY.$$
We can at first integrate out for $Y_{k+1}, Y_{k+2}, \ldots, Y_{n_0}$. Next we note that $Y_{i0}/\sigma$ and $Y_j'/\sigma$ are normal variates with zero mean and unit variance. Hence if we put
$$\chi_1^2 = \sum_{i=1}^k \frac{Y_{i0}^2}{\sigma^2} = \frac{S_k^2}{\sigma^2}, \qquad \chi^2 = \sum_{j=1}^{n_e} \frac{Y_j'^2}{\sigma^2} = \frac{S_e^2}{\sigma^2}, \tag{12.2}$$
then $\chi_1^2$ and $\chi^2$ obey the $\chi^2$ distribution with $k$ and $n_e$ d.f. respectively. Hence their joint distribution is
$$\mathrm{const} \times (\chi_1^2)^{(k-2)/2}\,e^{-\chi_1^2/2}\,(\chi^2)^{(n_e-2)/2}\,e^{-\chi^2/2}\ d\chi_1^2\,d\chi^2. \tag{12.3}$$
Let
$$s_k^2 = \frac{S_k^2}{k} = \text{mean square due to hypothesis}, \qquad s_e^2 = \frac{S_e^2}{n_e} = \text{mean square due to error},$$
and let us take
$$F = \frac{s_k^2}{s_e^2}, \qquad z = \tfrac{1}{2}\log F. \tag{12.4}$$
Putting $\chi_1 = R\sin\theta$, $\chi = R\cos\theta$, we get the joint distribution of $R$ and $\theta$ as
$$\mathrm{const} \times e^{-\frac{1}{2}R^2}\,R^{n_e+k-1}\,\sin^{k-1}\theta\,\cos^{n_e-1}\theta\ dR\,d\theta,$$
whence, integrating out for $R$, we have as the distribution of $\theta$
$$\mathrm{const} \times \sin^{k-1}\theta\,\cos^{n_e-1}\theta\ d\theta.$$
Now
$$F = \frac{\chi_1^2/k}{\chi^2/n_e} = \frac{n_e}{k}\tan^2\theta, \tag{12.5}$$
which gives as the distribution of $F$
$$\mathrm{const} \times \frac{F^{(k-2)/2}\,dF}{\left(1 + \dfrac{kF}{n_e}\right)^{(n_e+k)/2}}.$$
This, as we know, is the F distribution with $k$, $n_e$ d.f. If
$$z = \tfrac{1}{2}\log F = \tfrac{1}{2}\log\frac{s_k^2}{s_e^2},$$
then
$$F = e^{2z}, \qquad dF = 2e^{2z}\,dz,$$
whence the distribution of $z$ is
$$\mathrm{const} \times \frac{e^{kz}\,dz}{(n_e + k\,e^{2z})^{(n_e+k)/2}}. \tag{12.6}$$
Observe carefully the structure of $F$. The expectation of the denominator $s_e^2$ is $\sigma^2$, independently of any hypothesis. On the other hand, $E(s_k^2) = \sigma^2$ if the hypothesis is true, but exceeds $\sigma^2$ if the hypothesis is wrong. In fact, if $E(Y_{i0}) = M_{i0}$ ($i = 1, 2, \ldots, k$), then for the hypothesis to be true it is necessary and sufficient that $M_{10} = M_{20} = \cdots = M_{k0} = 0$. When the hypothesis is wrong,
$$E(s_k^2) = \sigma^2 + \frac{M_{10}^2 + M_{20}^2 + \cdots + M_{k0}^2}{k}.$$
If we work at the $\alpha\%$ level, we shall reject the hypothesis if the observed $F$ or $z$ given by (12.4) exceeds the corresponding $\alpha\%$ value on d.f. $k$, $n_e$. This will happen in $\alpha\%$ of cases when the hypothesis is true, so that we shall be committing an error of the first kind in $\alpha\%$ of cases. We can obtain a better control over this by diminishing $\alpha$. This will however increase the $\alpha\%$ value of $F$ or $z$, and when the hypothesis is wrong, will lead to its rejection in fewer cases. Thus a reduction of the first kind of error will involve an increase of the second kind of error. The usual conventional levels are 5% and 1%. With given $\alpha$, it is clear that rejection of a wrong hypothesis will occur in a larger and larger percentage of cases as $\Delta^2 = (M_{10}^2 + M_{20}^2 + \cdots + M_{k0}^2)/\sigma^2$ increases, so that $\Delta^2$ may be taken as a measure of the departure of the actual state of affairs from the hypothesis.
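A minimal Python sketch of the F-statistic and Fisher's $z$, with hypothetical sums of squares and degrees of freedom:

```python
import numpy as np
from scipy import stats

# Compare the mean square due to the hypothesis (k d.f.)
# with the mean square due to error (n_e d.f.).
Sk2, k  = 42.0, 3      # hypothetical s.s. and d.f. due to hypothesis
Se2, ne = 60.0, 20     # hypothetical s.s. and d.f. due to error

F = (Sk2 / k) / (Se2 / ne)
z = 0.5 * np.log(F)                      # Fisher's z = (1/2) log F

F05 = stats.f.ppf(0.95, dfn=k, dfd=ne)   # 5% point of F on (k, n_e) d.f.
print(F, z, F > F05)
```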
13. Ex. (i). To distinguish between group means.

Let there be $k$ samples

Sample 1: $x_{11}, x_{12}, \ldots, x_{1n_1}$; sample mean $\bar{x}_1$
Sample 2: $x_{21}, x_{22}, \ldots, x_{2n_2}$; sample mean $\bar{x}_2$
…
Sample $k$: $x_{k1}, x_{k2}, \ldots, x_{kn_k}$; sample mean $\bar{x}_k$

of sizes $n_1, n_2, \ldots, n_k$, supposed to have come from normal populations with a common variance. We wish to test whether the means of these populations can be regarded as identical.

Suppose the means of the $k$ populations to be $m_1, m_2, \ldots, m_k$, and the common variance to be $\sigma^2$. Then we have
$$E(x_{1j}) = m_1 + 0m_2 + \cdots + 0m_k, \qquad j = 1, 2, \ldots, n_1$$
$$E(x_{2j}) = 0m_1 + m_2 + \cdots + 0m_k, \qquad j = 1, 2, \ldots, n_2$$
$$\cdots$$
$$E(x_{kj}) = 0m_1 + 0m_2 + \cdots + m_k, \qquad j = 1, 2, \ldots, n_k.$$
The number of variates is now
$$n_1 + n_2 + \cdots + n_k = N.$$
The estimation space is generated by the column vectors $\alpha_1, \alpha_2, \ldots, \alpha_k$, where for $\alpha_u$ the first $n_1 + n_2 + \cdots + n_{u-1}$ coordinates are zero, the next $n_u$ unity, and the rest zero. Clearly $\alpha_1, \ldots, \alpha_k$ are orthogonal and hence independent. The rank of the estimation space is $k$, and $N - k$ d.f. belong to error. The observation vector is
$$\eta = (x_{11}, \ldots, x_{1n_1};\ x_{21}, \ldots, x_{2n_2};\ \ldots;\ x_{k1}, \ldots, x_{kn_k}).$$
The normal equations are
$$(\alpha_1 \cdot \alpha_1)\hat{m}_1 = (\alpha_1 \cdot \eta), \quad \text{or} \quad n_1\hat{m}_1 = n_1\bar{x}_1,$$
$$(\alpha_2 \cdot \alpha_2)\hat{m}_2 = (\alpha_2 \cdot \eta), \quad \text{or} \quad n_2\hat{m}_2 = n_2\bar{x}_2,$$
$$\ldots, \qquad (\alpha_k \cdot \alpha_k)\hat{m}_k = (\alpha_k \cdot \eta), \quad \text{or} \quad n_k\hat{m}_k = n_k\bar{x}_k. \tag{13.1}$$
Thus $m_1, m_2, \ldots, m_k$ are estimated by $\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_k$. The s.s. due to the estimates is
$$S_k^2 = n_1\bar{x}_1^2 + n_2\bar{x}_2^2 + \cdots + n_k\bar{x}_k^2 \tag{13.2}$$
and the s.s. due to error is
$$S_e^2 = \sum_{i,j} x_{ij}^2 - (n_1\bar{x}_1^2 + n_2\bar{x}_2^2 + \cdots + n_k\bar{x}_k^2)$$
$$= \sum_{j=1}^{n_1}(x_{1j} - \bar{x}_1)^2 + \sum_{j=1}^{n_2}(x_{2j} - \bar{x}_2)^2 + \cdots + \sum_{j=1}^{n_k}(x_{kj} - \bar{x}_k)^2. \tag{13.3}$$
Now we want to test the hypothesis
$$m_1 = m_2 = \cdots = m_k. \tag{13.4}$$
Any linear function of the $m$'s shall be called a contrast between the $m$'s if the sum of the coefficients is zero. Thus
$$c_1m_1 + c_2m_2 + \cdots + c_km_k$$
is a contrast if $\sum_j c_j = 0$. Evidently then our hypothesis is that all the contrasts vanish, and we have to find the s.s. due to the $k - 1$ d.f. carried by the contrasts.

The contrast $c_1m_1 + c_2m_2 + \cdots + c_km_k$ is estimated by
$$c_1\bar{x}_1 + c_2\bar{x}_2 + \cdots + c_k\bar{x}_k,$$
and it is readily seen that this is orthogonal to $\bar{x}$, representing the general mean,
$$\bar{x} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \cdots + n_k\bar{x}_k}{N} = \frac{\sum_{i,j} x_{ij}}{N}.$$
The s.s. due to $\bar{x}$ is
$$N\bar{x}^2. \tag{13.5}$$
Subtracting it from (13.2), the s.s. due to the estimates, we get the s.s. due to all the contrasts. Hence the s.s. due to the contrasts is
$$S_{k-1}^2 = n_1(\bar{x}_1 - \bar{x})^2 + n_2(\bar{x}_2 - \bar{x})^2 + \cdots + n_k(\bar{x}_k - \bar{x})^2.$$
Hence the mean square due to the contrasts between the group means is
$$s_{k-1}^2 = \frac{S_{k-1}^2}{k - 1} = \frac{\sum_j n_j(\bar{x}_j - \bar{x})^2}{k - 1},$$
and the mean square due to error is
$$s_e^2 = \frac{S_e^2}{N - k} = \frac{\sum_{i,j}(x_{ij} - \bar{x}_i)^2}{N - k}.$$
To test the hypothesis we use the F-statistic
$$F = \frac{s_{k-1}^2}{s_e^2},$$
and, working at the $\alpha\%$ level, reject the hypothesis if the observed value of $F$ exceeds the $\alpha\%$ value of $F$ on d.f. $k - 1$, $N - k$.

The result may be presented in the form of an analysis of variance table:

Due to | d.f. | sum of squares | mean square
Hypothesis (between groups) | $k-1$ | $S_{k-1}^2 = \sum_j n_j(\bar{x}_j - \bar{x})^2$ | $s_{k-1}^2$
Error (within groups) | $N-k$ | $S_e^2 = \sum_{i,j}(x_{ij} - \bar{x}_i)^2$ | $s_e^2$
Total | $N-1$ | $\sum_{i,j} x_{ij}^2 - N\bar{x}^2$ |

with $F = s_{k-1}^2/s_e^2$.

In the usual presentation of the analysis of variance, the degree of freedom due to the grand mean is always omitted, in the guise of the correction for the mean. Adding the s.s. $N\bar{x}^2$ corresponding to this degree of freedom, we get the total sum of squares as $\sum_{i,j} x_{ij}^2$ and the number of degrees of freedom as $N$.
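A minimal Python sketch of this one-way analysis of variance, with hypothetical samples; the last line checks against scipy:

```python
import numpy as np
from scipy import stats

# One-way analysis of variance for k group means.
groups = [np.array([6.1, 5.8, 6.4, 6.0]),
          np.array([5.2, 5.5, 5.1, 5.6, 5.3]),
          np.array([6.8, 6.5, 7.0])]          # hypothetical samples

k = len(groups)
N = sum(g.size for g in groups)
grand = np.concatenate(groups).mean()         # general mean x-bar

S_hyp = sum(g.size * (g.mean() - grand)**2 for g in groups)   # between groups
S_err = sum(((g - g.mean())**2).sum() for g in groups)        # within groups

F = (S_hyp / (k - 1)) / (S_err / (N - k))
print(F, stats.f.ppf(0.95, dfn=k - 1, dfd=N - k))
print(stats.f_oneway(*groups).statistic)      # same F
```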
14. Ex. (ii). Two way classification.

Suppose there are $mn$ individual readings $x_{ij}$ ($i = 1, \ldots, m$; $j = 1, \ldots, n$), subject to two way classification and arranged in a scheme of $m$ rows and $n$ columns, with row means $\bar{x}_{1\cdot}, \ldots, \bar{x}_{m\cdot}$, column means $\bar{x}_{\cdot 1}, \ldots, \bar{x}_{\cdot n}$, and grand mean $\bar{x}$.

The reading $x_{ij}$, belonging to the $i$-th row and the $j$-th column, belongs to the $i$-th A-class and the $j$-th B-class. The variates $x_{ij}$ are supposed to be normal with common variance $\sigma^2$, and the mean of $x_{ij}$ is supposed to be $p_i + q_j$, the portion $p_i$ being supposed contributed by the $i$-th A-class, and the portion $q_j$ by the $j$-th B-class.

It is required to test the hypothesis
$$p_1 = p_2 = \cdots = p_m$$
independently of any hypothesis regarding the $q_1, q_2, \ldots, q_n$ (i.e., we wish to test whether the A-class means can be regarded as identical). Now
$$E(x_{ij}) = p_i + q_j.$$
The estimation space is generated by the vectors
$$\alpha_1, \alpha_2, \ldots, \alpha_m, \quad \alpha_1', \alpha_2', \ldots, \alpha_n',$$
where the coordinates of $\alpha_i$ are
$$(0, 0, \ldots, 0\ |\ \ldots\ |\ 1, 1, \ldots, 1\ |\ \ldots\ |\ 0, 0, \ldots, 0),$$
the unities being only in the $i$-th row of the above scheme. Similarly the coordinates of $\alpha_j'$ are
$$(0, \ldots, 1, \ldots, 0\ |\ 0, \ldots, 1, \ldots, 0\ |\ \ldots\ |\ 0, \ldots, 1, \ldots, 0),$$
the unities being only in the $j$-th column of the above scheme. Clearly
$$\alpha_1 + \alpha_2 + \cdots + \alpha_m - \alpha_1' - \alpha_2' - \cdots - \alpha_n' = 0,$$
and it is readily seen that this is the only connecting relation between them. For if
$$c_1\alpha_1 + c_2\alpha_2 + \cdots + c_m\alpha_m + c_1'\alpha_1' + c_2'\alpha_2' + \cdots + c_n'\alpha_n' = 0,$$
we get
$$c_i + c_j' = 0, \qquad i = 1, 2, \ldots, m; \quad j = 1, 2, \ldots, n,$$
from which it follows that
$$c_1 = c_2 = \cdots = c_m = -c_1' = -c_2' = \cdots = -c_n'.$$
Thus exactly $m + n - 1$ of the vectors $\alpha_1, \alpha_2, \ldots, \alpha_m, \alpha_1', \alpha_2', \ldots, \alpha_n'$ are independent, so that the estimation space has rank $m + n - 1$. Consequently the number of d.f. belonging to error is $mn - m - n + 1 = (m-1)(n-1)$.

Setting
$$\bar{p} = \frac{p_1 + p_2 + \cdots + p_m}{m}, \qquad \bar{q} = \frac{q_1 + q_2 + \cdots + q_n}{n}, \tag{14.5}$$
we have $E(\bar{x}_{i\cdot}) = p_i + \bar{q}$, so that any contrast $c_1p_1 + c_2p_2 + \cdots + c_mp_m$ (with $\sum_i c_i = 0$) between the $p$'s is estimated by the corresponding function of the row means,
$$c_1\bar{x}_{1\cdot} + c_2\bar{x}_{2\cdot} + \cdots + c_m\bar{x}_{m\cdot}, \tag{14.6}$$
and since there are $m - 1$ independent contrasts, there are $m - 1$ such linear functions. The s.s. due to the $m$ linear functions
$$\bar{x}_{1\cdot},\ \bar{x}_{2\cdot},\ \ldots,\ \bar{x}_{m\cdot} \tag{14.7}$$
is clearly
$$n(\bar{x}_{1\cdot}^2 + \bar{x}_{2\cdot}^2 + \cdots + \bar{x}_{m\cdot}^2).$$
It is also readily seen that the functions (14.6) are all orthogonal to the grand mean
$$\bar{x} = \frac{1}{mn}\sum_{i,j} x_{ij} = \frac{1}{m}(\bar{x}_{1\cdot} + \bar{x}_{2\cdot} + \cdots + \bar{x}_{m\cdot}),$$
for which the sum of squares is $mn\bar{x}^2$.

Hence the $m$ d.f. carried by the linear functions (14.7) can be split up into two orthogonal sets, viz., the $m - 1$ d.f. carried by the functions (14.6) which estimate the contrasts between the $p$'s, and the one d.f. carried by the grand mean $\bar{x}$. Thus $S_{m-1}^2$, the s.s. due to the contrasts between the $p$'s, is given by
$$S_{m-1}^2 = n(\bar{x}_{1\cdot}^2 + \bar{x}_{2\cdot}^2 + \cdots + \bar{x}_{m\cdot}^2) - mn\bar{x}^2 = n\sum_i (\bar{x}_{i\cdot} - \bar{x})^2.$$
Likewise the s.s. due to the $n - 1$ independent contrasts between the $q$'s is
$$m\sum_{j=1}^n (\bar{x}_{\cdot j} - \bar{x})^2.$$
To find the s.s. due to error, two ways are open to us.

Method I. The graduated value of $x_{ij}$ is the best estimate of $p_i + q_j$. This from the normal equations is seen to be $\bar{x}_{i\cdot} + \bar{x}_{\cdot j} - \bar{x}$. Hence the s.s. due to error is
$$S_e^2 = \sum_{i,j}(x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x})^2.$$

Method II. It is easy to see that the linear functions estimating contrasts between the $q$'s are orthogonal to $\bar{x}$, as well as to the linear functions estimating contrasts between the $p$'s. Hence the total s.s. due to the estimates is
$$n\sum_i(\bar{x}_{i\cdot} - \bar{x})^2 + m\sum_j(\bar{x}_{\cdot j} - \bar{x})^2 + mn\bar{x}^2 = n\sum_i \bar{x}_{i\cdot}^2 + m\sum_j \bar{x}_{\cdot j}^2 - mn\bar{x}^2.$$
Hence the s.s. due to error is
$$S_e^2 = \sum_{i,j} x_{ij}^2 - n\sum_i \bar{x}_{i\cdot}^2 - m\sum_j \bar{x}_{\cdot j}^2 + mn\bar{x}^2 = \sum_{i,j}(x_{ij} - \bar{x}_{i\cdot} - \bar{x}_{\cdot j} + \bar{x})^2.$$
To test our hypothesis we have therefore to put
$$s_{m-1}^2 = \frac{S_{m-1}^2}{m - 1} = \text{mean sq. due to hypothesis},$$
$$s_e^2 = \frac{S_e^2}{(m-1)(n-1)} = \text{mean sq. due to error},$$
and take $F = s_{m-1}^2/s_e^2$ (or $z = \tfrac{1}{2}\log F$). Then, working at the $\alpha\%$ level, we reject the hypothesis if the observed $F$ (or $z$) exceeds the $\alpha\%$ value of $F$ (or $z$) on d.f. $m - 1$, $(m-1)(n-1)$.

The results may be presented in the form of an analysis of variance table.
Page 46
Due to
Hypothesis
(Contrasts
between
the p'S)
Contrasts
between
the q 's
..!
d.• l' •
•
!,
m-l
n-l
I
Error
• (m-I) (n-I)
Total
mn-l
Inean square
s.s •
. 2
Sm_l
::
32 _
n I
::
n
~
i
m Z
j
(x
; 2
2
IS
: m-I
::
s;_I/(m-l5.
.-X) 2
s2
: n-I
::
s~_l/(n-I)
(xi -X>
J
s; ::
F ::
s~_l/s;
To complete the ma degrees of ~reedom, we have to add the
degree of freedom f'or the grand mean, to which corresponds a sum
of' squares mnx 2 • 1he completed s.s. is then Z X~j.
i,j
15.
Conditional error
The calculation of the s.s. due to the hypothesis is in some
cases simplified by using the concept of conditional error. Suppose we want to test the hypothesis
i 21 P2
+ J22 P2
ill :: fllPl +
il2
"-.
:: i
+ • • • + ,fmlPm= 0
+ ••• +
•••
••
•••
11k :: IlkPl + 12k P2 +
+
12 Pl
...
·..
f m2 Pm ::
0
• ••
•
~i1kPm
:: 0
Then any linear f'unction Y of the observations may be said to belong to error conditionally if E(Y) :: 0, in virtue of the hypothesis. Linear functions belonging to error also belong to error
. conditionally, but other linear functions, namely those whose expectations are of the form bInI + b 2il2 + ••• + bk~ conditionally belong to error. In the equations of expectation, if we make
~phange of parameters, and substitute from the hypothesis, then
we can reduce the number of parameters 'to m-k. If we now calculate the s.s. due to error, then this will exceed tho s.s. due to
L.
IN STITMTE OF S1ATlSTICS
Page 47
ordinary error, the difference being exactly equal to the s.s.
due to the hypothesis.
s.s. due to hypothesis =
= s.s. dUG to conditional error - s.s. due to error
== s.s. duo to parameters - s.s. due to conditional parameters
We may illustrate by using the example in para. 13. The
hypothe~is to be tested is
n1.==m2 == •••
==~
which is equivalent to the vanishing of the (k-l) independent
contrasts. We can therefore reduce the number of parameters to
one only, by putting
ml == m2 == ••• == mk == m
The equations of expectation now become
E(Xlj ) _om
j == 1,2, ••• , n
l
E(X 2j ) = m
j = I, 2, ••• , n 2
•••
•
E(xkj ) == m
~
The estimate of 8 is
S~2
•••
j == I, 2, •• ', n k
and so the conditional error is
•
x,
== Z
X~j
- Nx
2
Hence from (13.3), the s.s. due to the hypothesis is
2
-2
-2
-2-2
Sk-l == nlxl + n 2x 2 + ••• + nkxk - Nx
which is the result we otherwise derived.
Page 48
CHAPTER III
~
Up to this time we have been considering a case when the
2
obse~vations Yl' Y2' ••• , Y have a given variance o.
V~en
n
1.
I
however we design an experiment, we have some choice, and by
using what has been called nlocal control" we may succeed in
reducing 0 2 , and improving the efficiency of our experiment.
This will be first illustrated by a simple example:
Suppose it is required to test whether two drugs, A
and B, are of equal value in,producing sleep, or whether one of
them can be regarded significantly better. We could conduct
the experiment in one of two ways.
Method I. We choose 2n individuals at random from
the universe of individuals for which we want our results to be
true, and to £ of these administer the drug A, and to the other
B administer the drug B, and note the number of hours of sleep
produced.
Suppose the values are as follows:.
Drug A
•••
x' 1
Drug B
Xl
2
•••
If the mean effect, of the drug A is m, and the mean effect'of
the drug B is -,
m' then we have to test the hypothesis m = mt •
Here
Xi
=m +
£1'
x'i
= m'
+ e'i'
i
= I,
2~
••• , n
Where Ei , eli are random variables with mean zero and variance,
say 0 2 • Hence
2
E(X i ) = m, E(x'i) = ru l , V(x i )
V(x'i) = 0 , i = l, ••• ,n.
=
To test our hypothesis, we have to use the t-statistic
t
where
2
sa
=
f ~X.i
-r;::'7.x'
= x~--,--­
S
ei.2/n
-
X> 2
+ Z ex' i - x l ) 2
2n - 2
and reject the hypothesis on the C( "10 level if the observed value
i exceeds the ~i~ value of twith d.f •. 2n - 2.
Page 49
Method II. We pick out !l individuals at random, and
to each administer both drugs A and B, (with a suitable interval
to eliminate any carry over effect~), and note the results.
We now have the following scheme:
Individuals
1
2
•••
y' 2
•••
n
Drug A
Drug B
The variability in the responses to the two drugs may
be supposed to be compounded of two parts; the individual response,and the variability due to the other residual factors.
Thus,
.
Pi + 6 i , Y'i = m' + Pi + 6'i' (i = 1, 2, ••• , n)
where Pi is the personal response of the i~th individuals, and
Yi
=m +
6 i , 6'i are random variables with mean zero and variance say 01 2 •
We may write
Zi
and
E(Zi)
= Yi
= E(Yi
= (m Y'i) = m -
- Y'i
-
m') + (6 i - O'i)
m'; V(Zi) = V(Yi - Y'i)
= 201 2 •
To test our hypothesis, we now have to use the tstatistic
t -
where
8,2
e
=
i:(Zi
s~.
y - y'
Z
/':rn = s'e Ih-.,.
•. Z).2 = Z(Yi -
J.J.
2
y1 - Y - yt)
n - 1
n - 1
and reject the hypothesis on the o(~~ level if the observed value
of' 1 exceeds the c:x al~ value of t with n - 1 d.!.
Now let us compare the two methods. If the hypothesis
is true, then of course both tests would lead to the rejection
of the true hypothesis in ~ ~ocases, so that there is nothing to
choose between them. vVhat happens when the hypothesis is wrong?
The expectation of the numerator is in both cases the same,
viz.,.m - m'. The expectation of the square of the denomi~ator
is in the first case 20 2/n Whereas, in the second case it is
2
.
20l /n. If' we denote the variance in personal response by V(p),
Pafe 50
we have
o
2
= V(p)
+
0
2
•
1
If V(p) is at appreciable, thon
0
2 exceeds
oi.
'!hus
t in the second method will in the long run deviate .from zero
more, than in the first method. On the other hand the deviation
of t from zero will be considered significant only if t exceeds
the ~~_value on n - 1 d.f., whereas, in the first method, it
is considered significant if t exceeds the ~~ value on 2n - 2
d.f. For example, if n = 10, then the 1 'c> and 5 % values of tare
as follows
d.f.
570
i_
9
2.262
3.250
18·
2.101
2.878
The ~1G value on n - 1 d.f. is always larger than
the corresponding value on 2n - 2 d.f., thOUgh the difference
tends to decrease as U increases. Hence, the larger deViation
of t from zero in the second method is compensate.d by its being
required to exceed a larger ,"X ""leo value, for significance. The
physical reason' for this is that in the second method ffilr estimate of error is relatively uncertain, being based only on
n - 1 d.f. as against 2n - 2 d.f. in the first method, and
for the same margin of safety as regards the first kind of error,
the deviation of i from zero should be relatively larger in
order to be considered Significant.
It is therefore clear that if the variance of the personal response is appreciable enough to compensate for the loss
of n" 1 d.f. in tho estimation of error, tho second method
will give a better result. The elimination of personal response
in the second method is an example of local control. It is a
general principle in the design of oxperiments, that if by
instituting suitable local control we can eliminate significant
causes of variation, then we shall usually gain in procision~
We must, however, be careful not to leave too few degrees of
freedom for the estimation of error by over elaborating the
experiment.
As another examplc, suppose there are m treatments in
an agricultural experiment, and we want to test whether all
Page 51
~
the treatments arc equally efficacious so far as the yield is
concerned, or whether they are significantly different. Let ron
plots of land be available for the experiment. In this case we
could again use two methods.
First Method.
In this case each treatment is applied
to ~ randomly selected plots from all tho plots available. We
can apply the method of para, 13. 'Taking all the yields from
the same treatment as forming one group, we have simply to test
for the equality of the group means, If tho yields for the i-th
treatment are
Xil , x i21 ••• , x in
then the analysis of variance is
due to
d.f.
sum of squares
mean square
Treatment
2
m- 1
Sm-l
Error
ron - m
S2
Total
mn -
1
EJ
i
,
=~
( xi. - -x. )2
J
~
i,j
~
2
F
t
-2
~ X)
•
L
t
= Z(x.
i' ~
sm-l
= s~
The expectation 0 2 of the mean square in the denominator of F
includes, besides other residual factors, the variance due to
the different fertility of the different plots.
Second Method.
In this case the land is divided
into £ compact blocks, each consisting of ~ plots. Within any
block each of the ~ treatments is applied to one plot chosen at
random. This is known as the randomized block design. Let Yij
be the yield of the i-th treatment, for the j-th block. To
distinguish between the treatment means, we can apply the method
of para. ,.; and exhibit the analysis of variance as
,\;
(
JNSTJT&JTE OF STATISTICS
Page 52
due to
!
mean square
sum. of squares
d.f.
Treatments:
m-l
Blocks
n-l
Sf 2 =mZ (-::'\2
Y .-y/
n-l j • J
mn-l
-2
Z (x ..-x)
. . 1. J
1.,J
2
2
s'n-l=3'n-l I{n-l)
Error
Total
F=
.e
s,2
m-l
;;r
e
To compare the two methods we observe that in the second method we have eliminated from the error differences'of
fertility between blocks. This may be exhibited by writing
=ti
+ e ij , Yij = t i + b j + 5 1j
where t Is the effect of i-th treatment, and b j the mean effect
1
of the j-th block. We shall denote by t,the mean effect of all
the treatments. Thus
x ij
a
2
n
\
= E(s~)
= V(e)"
e
'
at
2
= E(s'~)
= V(5)
.
n
0
But since V(z) = V(b) + V(e), where V(b) denotes the
component of tbe variance due to thG difforences between the
blocks, a2 .} at 2 • 'rhus, if' plots vd thin tbe sa:tr'e block tend to
be relatively more aliJ.w in their response to treatments, than
plots in different blocks, then a 2 will approciably exceed cr,2.
Now the ratio of the expectation of the numerator and denominator
of F
•
1 +
n~ ( t i
- t) 2
nZ ( t i - t) 2
2 and 1 +
(m - ~)a
{m - 1)a t2
in the two methods, when the hypothesis of equality of the treatment effects is false. Thus, F in the second method will deviate
Page 53
moro from unity than in tho first method. on the other hand, for
signi;f'icance on the 0( %level l E in the second method ntustexceed
the ~?ovalue of E on d.f. m - 1, (m - 1) (n - 1). This value
will be somewhat larger than the ~~value of E on d.f. m - 1,
mn -m, which has to be exceeded. by the observed F in the first
method. For example, note 1% and 5~values of E shown below for
the case m = 5.
54].
d.f.
4.43
4.18
Hence, as in the previous example it is clear that if the variance between blocks is appreciable enough to compen~ate for the
loss of n - 1 d.f. in the estimation of error, the second
method will give a better result. As a matter of fact, the randomized block design has been found to be of great value and is
now in almost universal use.
2.
The randomized block design
discussed above is efficient
,
only when the number·of trea.tments is not very large, for the
variability in response of plots within the same block, always
goes to swell the error. ~hen the block size is largo, the design tends to become inefficient. To overcome this, "inconlplete
block designs" have been introduced. I shall first discuss the
general theory of analysis of incomplete block designs before
coming to concrete 'incomplete block designs'.
Let us consider! observations on heterogeneous material to which u treatments
(whose expected
effects are given by,
.
the parameters t l , t 2 , ••• , t u ) have to be applied in order to
-
test their relative efficacy. It would be advantageous to divide the material into 2. relatively homogeneous parts, before
applying the treatments. Thus in a field experiment we would
divide tho whole land on which the experiment is to be made into
b compact blocks. Each block is then divided into a number of
plots to which treatmonts are applied, and the yield (or what·
ever other effect we are interested in) is observed. In a biological experiment the animals on whom the experiment is to be
performed may be divided into £ relatively homogeneous groups.
-
1
~~~~~~~~~~------"
Page 54
Eacn such group corresponds to a block, and the animals
group to a plot.
We have thu~ in tho general case ~ treatments
blocks. Let the number of plots in the j-th block be k
i
tho number of replications of the i-th treatment be r i •
r l +.r2 + ••• + r u = k l + k 2 + ••• + kb
N
=
within a
and £
, and let
Clearly
(2.1)
In the usual experiments, the number of plots within a block is
kept constant, so that k l = k 2 = ••• = k b , but due to accidental
circumstances such as a missing plot, these numbers may become
unequal. Hence it is advisable so fer as the general theor'J is
concerned to keep open the possibility of their being different.
Suppose the i-th treatment has been applied to n ij plots in the
j-th block. Then n ij is either 1 or 0, according as tho j-th
treatment has or has not been applied to some plot in the j-th
block. (The same treatment is neyer applied t~ more than one
plot, in the same block). The numbers n ij can be represented in
the matrix form
Total
Total
nll
n l2
•••
n lb
r1
n 21
n 22
• ••
n
r2
•••
•••
~l
•••
nu2
• ••
•••
nub
ru
k1
k2
•••
kb
N
2b
..
(2.2)
If n ij = 1, then the yield of the i-th treatment in the
j-th block may be denoted by Yij. We then have
E(Y1j) = t~ + b j
the part t 1 <being due to the i-th treatment~ and the part b j to
the j-th block.
The plots being assumed to come in a certain o:t:'der, the
equation~ of expectation can now be written in full as
E(Yij~
= ot l
+ •• + t
i + •• + otu +
Ob 1
+ •• +
b j + •• + Obb(2.3)
Let T}be the observation vector whose coordinates are the yl~lds
of the N plots; and let ''(1' 12' ... , 1'u, ("3l'(~2' ••• ,/3b' be the
· lN STIT&1TE OF STATlSrlCS
Page 55
column vectors, corresponding to the parameters t l , t 2 , ••• , t u '
b l , b 21 •• " bb' It is readily seen that
(Y1''1 )'
i
= Ti
;
(T')'(3 )
j
= Bj
(2.4)
where Ti is the tot~l yield of the i-th treatment, and B j is the
total yield of the j-th block, Since any best estimate of a
linear function of the param.eters, must be com.pounded of ('rI.T!)
= 1,
and ('rI' (3j)' (i
2, .'"
u), (j
= 1,
2, " ' , b) we a. t once
get the following result:
Theorem 1.
Corresponding to any estimable function TI, of the
treatment and block effects t l , t 2 , .'" t u ' b l , b 2 , " ' , Db
there exists a unique linear function Y,
Y
= qlTl
+ CLzT2 + .,. + eauTu + qiBl + q2 B2 + '"
of the yield a.nd block totals for which E(Y)
function Y is the best estimate of ll.
= n.
+ qbBb (2. 5)
This linea.r
We are however usually more interested in those parametric
functions which do not contain the block effects. Let us find
out the general form of the best estimates of functions of the
treatment effects only. Now
E(T i )
nil(t i + b l ) + n i2 (t i + b 2 ) + ••• + nib(t i + bb)
3,
=
E{B j )
= rit i +
= nlj(t l
= kjb j +
(3.1)
(nilb l + n i2b 2 + ••• + nibb b )
+ b j ) + n 2j (t 2 + b j } + ." + nuj(tu + b j)
(3.2)
(n1jt l + n 2j t 2 + ••• + ~jtu)
Hence the coefficient of b j in
E (Y) :;: E{ qlTl + CLzT 2 + ." + C1uTu + qiBl +
<i2 B2
+ .,. + qbBb)
is given by
kjqj + qln1j + %n 2j + ... + Clunuj
(3.3)
Honce if E(Y) is to be free from block effects, we must have
qj
=- ~
SUbs.tituting in
!
(nlj q1 +
n2j~
+ .... + nUj'lu)
(3.4)
we find that it must be of the form
ql Q1 + ca:a ~ + ••• +
eau ~
(3.5)
Page 56
where
-
~ = Ti
n i2 B2
k2
nilE l
k
l
•••
nibBb
kb
Conversely it is ea.sy to verify that E(~) docs not contain a.ny
block effects.
~
The quantities
are called the adjusted yield
because ~ is obtained by deducting from Til the average yield of
the blocks in which the i-th treatment occurs.
We can now state the following theorem:
Theorem 2. Corresponding to any estimable parametric function
I I t 1 + ~2t2 + ••• + f u t u
of the treatment effects only, there exists a unique linear
function of the adjusted yields, viz:
Clu ~
q1 Q1 + ~ ~.+ ••• +
such that
E{ql~l"" Cle~ + ••• + ~~) =
This linear function
function zit.
4.
Zq~
-,\t l
i 2t 2
0 t
+ • • • .". -ru
U
is the best estimate of the parametric
+
Let us now find E(~i)' V{Qi) and Cov(Q,i' ~,,).
n
nil
n
i2
E(Qi) = E(T i ) - ~E(B1) - ~E(B2) - ... - k ibE(Bb)
1
2
b
SUbstituting from (3.1) and (3. 2) I we g'a t
E(~) = Oilt l + 0i2 t 2 ~ ••• + ciut u
where
/nilnifl
I
\
k
1
+
.2
I nil
= r i - i ~.".
\ 1
(4.1)
...
(4.2)
......
(4.3)
.".
When as is usually the case in field experiment designs,
the block size is constant and equal to k, we have the simple
relations
Aii ,
(4.4)
Oii' = - ~, i I i '
( 4.5)
where).!!, is the number of blocks in which the treatments i and
~ occur together.
Pare 57
2
VeTil = riO' , Cov(T i , Ti , ) =
V(B j ) = lr JO' 2 , COV(B j , B j') =
Again
0
(4.6)
0
Cov(T i , B j) = nijO'2
Hence
nil
V(Q,i) = V(T i - ~Bl
1
-
n' 2
-LB
k
2
n2
.)
= V(T i ) + ~:b1.V(B
. k2
J
j
J
=
,
•••
2
- knbibBb )
In ..
2~i ~~COV(Ti'
J\
J
\
B o7})
.J/
2
n 2.. ,
n.....bl..
.. _
\
2l: 21.
0'2 (r i + z k
k. }
j
j
j
J/
= CiiO'
2
(4.7)
Cov( Qi' ~ ,) =
- ~Bl
nil
=Cov(T i
=-
,.
rt-
~'.
,
••
nib
~Bb' Til
b
n i '2
- -ni'l
k1B1- k
2
n i 'b
b
B 2 -·- ~b)
ni
I •
ni .
n ..n. , .
k JCoV(T i , B .,} - ~ :..rf.c0v(T il , B .} + E ~~2~ JV(B )
j
J
J
j
J
J
j
l:
j
=-
1
n' 2
2.":'B
k2 2
2
0' E
j
n ..n., .
~J
~
.1
kj
.
2
= Ciilo
Corollary.
(4.8)
= Cov(B J.,
= nijO'
2
n ..
Ti } - ..2:.J.V(B.)
k.
J
J
n ij
2
k. kjO'
J
=0
(4.9)
This shows that any block total is orthogonal to any
adjusted yield.
5.
and
The constituents of the matrix . «C 1~
.. ,»
V(~i).
occur both in E{Q.)
1
This matrix plays a very important part in our theory.
(i) The matrix is sYmmetrical, i.e., Cii , = Cili
(ii) Each row and column adds up to zero.
(iii) It follows from {ii} that the determinant of the matrix
vanishes.
Page 58
(iv) We shall see later that the rank of the matrix is equal
to the number of independent estimable linear functions
of the treatment effects.
Since any estimate of a linear function of troatment effects
is of the form
ql QI + ~ ~ + ••• + ~ ~
it follows that any estimable linear function of the treatment .
effects must be of the form
L = ql E ( ~) + ~E ( ~) + ••• + CluE ( '<u)
-ql(C II t l + Cl2 t 2 + ••• + Clut u >
+ ~(C2Itl + C22 t 2 + ••• + C2u t u )
+
•••
• ••
•••
...
+ Clu{Cult l + Cu2 t 2 + ••• + Cuut u )
Hence the sum of the coefficients of t l , t 2 , ••• , t u in L, must
vanish in consequence of (ii).
A linear function of the treatment effects, in which the sum
of~the :coefficients vanishes, m~y be called a treatment contrast.
Thus we get:
If any linear function of the treatment effects is estimable
it must be a contrast. In particular the sum of treatment
effects
is non-estimable.
To answer the question whether every treatment contrast is
estimable, we have to bring in the concept of connectedness.
A treatment and block may be said to be associated if the
treatment is contained in the block. Two treatments, two blocks,
or a block and a treatment are said to be connected if it is
possible to pass from one to tho other by means of a chain consisting alternately of blocks and treatments such that any two
members of the chain are associated. Thus in a design if i o and
in are connected treatments wo must have a chain
•
(5.1)
such that the block jp is associated to the treatments i p _ l ' i p
for p= 1,2, ••• , n.
JNSTIT~TE
OF STATISTICS
Page 59
design is said to be a connected design if every block and
treatment of the design is connected to every other. Likewise, a
portion of the design may be said to be a connected portion if
every block and treatment in the portion is connected to every
other. Any general design must break up into a number of connected parts, such that a block or treatment belonging to one part,
is unconnected with a block or a treatment belonging to the other
part.
A
6.
Let us first study the properties of connected designs. All
the usual designs for varietal trials, viz., randomized blocks,
balanced incomplete blocks, lattices (inclUding rectangular lattices) and the more general class of designs known as partially
balanced incomplete block designs, are connected designs in the
sense defined above.
We shall first show that for a connected design any treatment contrast is estimable.
Since a general treatment contrast
Ilt l + -f2 t 2 + ••• + f u t u ' Lli = 0
can be written in the form
i1(t l - t u > + J.2 {t 2 - t u ) + ••• + fn-l(t n - l - t u >
it is sufficient for us to show that t. - t i is estimable,.
~o
n
where i o and in are any two treatments. Since the design is connected there exists a chain (S.l),.showing that i p_ l and i p both
occur in the block jp.
Hence t i
- t i is estimated by the
p-l
.P
difference of the yields of the plots in this block to which i p _l
and i p have been applied. Since
-
-
-
-
= (t i 0 t.J. l > + •• + (t.J. p_1 t i ) + • • + (t.~n-l t i n )
n
P
t i is estimable.
it is clear that t i
n
0
The linear function E(~) , E(~) ,
, E(~) of the' treat-
t
io
ti
-
...
ment effects are not independent since their sum vanishes. But
at least u - 1 of them are independent since every treatment
contrast being estimable is of the form
J
Pago 60
qlE(~) + ~E(~) + ••• +
<1uE ('\1)
and there are u - 1 independent treatment contrasts. This shows
that in the case of a connected design the rank of the matrix
({C ii ,» is u - 1. It also follows that of the linear function
~,
~,
.'"
~
of the yields, just
u - 1
are independent.
Of
course the relation
Ql+~+"'+~=O
is easy to verify.
Since the linea~ function
~
is orthogonal to Bj , and since
Bj and Bjf (j I j') are orthogonal to each other, it is clear ~
that in the case of a connected design, the estimation space is
at least of the rank b + u - 1, as it contains the vectors
corresponding to ~, ~I . ' " ~, B1I B2 , ••• ,~. On the other
hand, the identity
Bl + B2 + ••• + Bb
= Tl
+ T 2 + ••• + Tu
shows that the vectors (3 1 , (32' "" ¢lb' 1'1.1 12' " ' . 1 1"u generating the estimation space cannot be all independent, as we must
have
Ch + (5-2 + ••• + @b = + or'2 + ••• + 1""u
1"1
Accordingly the rank of the estimation space is exactly b + u ~ 1.
Thus b + u - 1 d.f. belong to the estimates, and N - b - u + 1
degrees of freedom belong to error.
7.
Let us now turn to the problem of estimation of any contrast
between treatment effects. For this purpose we can enunciate the
following theorem.
Theorem.
The best estimate of the contrast
L
= f 1t l
+ !2 t 2 + ••• + j"U t u
(7.1)
-
is obtained by substituting in L, any set of values of t obtained by solving the following system of normal equations:
•
cllt l + c12 t 2 + • • • + clutu
c 21 t l + c 22 t 2 + • • • + c 2u t u
=
=
Ql
Q.:,
,::,
•••
• ••
•••
• ••
c
t
+
cult l + u2 2
• •• + cuut u = Gu
•••
I,
P;opo,f:
If EqC;:, is the best estimate of
L~
Page 61
we ha,ve
•••
which gives rise to the following equations for determining the
qls.
.J
Cll ql + C2l~ + • • • + CUl~. = "/'1
C12 Ql + C22~ + • •• + Cu2 Ciu =
•••
•••
·..
•••
i
2
••
r1
cluql + C2u Q2 + • • • + Cuu~ = .{u
Now if tIl t 21 ••• " t u is any particular solution of (7.2), then
sUbstituting these values in (7.2), multiplying these equations
by ql' ~I . . . . " ~ respectively" adding and using (7.3)" we find
that
t
Rl t l
+ -f2 2 + ••• + tutu
=
ql ~ + ~~ + ... + ~~
= bost
estimate
Corollary 1. Only u-l of tho e qua t ions (7 • 2 ) arc indopendent.
Since the general solution of tho homogeneous equations corresponding to (7.2) is
(Q, Q" ••• , Q)
So tho general solution of (7.2) is
=tl
(7.4)
2 = t 2 + Q, ••• " t u = 'tu + Q
TIIUS (as is to be expected)" the difference between any two
tIs and in general the contrast
t
l
+ Q"
t
•••
z:..f = 0
is uniquely determined. In order to render the solution of the
normal equations unique" we may if we like take the equations
(7.2) together with some arbitrary restraint
clt l + C2 t 2 +. ~ •• + Cut u = 0
(7.5)
where Cl + C2 + ••• + eu 10. The unique solution thus obtained
is bound to satisfy the normal equations, and on substitution in
,z.!it will give its best estimate. The reason why in the restraining condition (7.5) we must not take EC = 0 is that in this case
(C l ' C2 , ••• , Cu)' will depend on the vectors (C il " Ci2 " •• J Ciu )'
-.....'
~
.
and so the imposition of tilis condition will not load to a unique
solution of the normal equations.
INSTIT&JTE OF 5TATI5TfCS
.~
Page 62
Corollary 2. Let us find the variance of the best estimate'of
Zit. We have seen that the best estimate will be ZqQ, where the
Q1s are given by (7.3).
Now
V (ql QI + ~ ~ + ... + ~ ~)
~,Cov ( %' Qi
=i
I )
,1.
Z .,
= 02.1.,1.
qiqi'C ii ,
=rql (Cll ql
+ C12~ + ••• + Clu '1u)
'.
+ ~ (C 21 ql + C22~ + ••• + C2u~)
•••
+ ~ (CuI ql +
••
= (qli'l
.e
+
...
CU2~
• ••
+ ... +
•••
Cuu~)J
~.P2 + ••• + ~.fu)o2
2
0
(7.6)
On comparing (7.2) and (7.3) it appears that if the normal equations lead to
t l = Cll~ + Cl2~ + • • • + Clu~
t 2 = C21 QI + C22~ + • • • + C2u Ou
(7.7)
•••
•••
• ••
• ••
••
t u = Cul '-il + CU2~ + • • • + CuuOu
then ql'
, Clu ..may be obtained from tIl t 2 ,
, t u in
%'
...
...
(7.7) , on replacing
qi
QI
s by It s.
= .Rl Cil
+
i 2 Ci2
Thus
+
eo.
+
RuCiu
Substituting in (7.6) we seo that the variance of the best estimate is
2
<1
()"
i ~/i-fi,Ciit
(7.8)
,1.
Thus the variance of the best estimate of z;t, involves only the
coefficients in the algebraic solution of tho normal equations.
In particular the varianc'e of the estimate of t i - t I is
i
2
(C ii - Ciit - Citi + Cifit)O
(7.9)
Continuing our study of the connected design, let us now
turn to the problem of partitioning the total swn of squares into
2
its various constituents, and to the estimation of tho error (J •
The s.s. due to all the observations is
8.
,
L
Page 63
If G is tho grand total of all tho observations, then G estimates
rlt l + r 2 t 2 + ••• + rut u + klb l + k 2 b 2 + ••• + kbb b
The s.s. clue to
G = i: Yij
1s then
2
G /N
which is known as the 'correction for the mean'.
f rom 3 12 , we ge t th e quan.tir.'" y
2
S
Subtracting it
0. 2
.,
= S''''
-
N
which in the language of the analysis of variance is called the
total swn of squares. It corresponds to tho N-I d.f. belonging
to the contrasts between the observations. S2 is the sum of
squares of the deviations of the individual observations from the
general TIman, and we can write S 2 = dov 2 y.
The total sum of squar68 cnn bo partitionod into throe orthogonal components.
(i)
The s. s. duo to tho contrasts botvleon the ·block totals,
which carry b-l d.f.
(ii) Tho s.s. belonging to tho adjusted yields, which carry
u-l d.f.
(iii) The s.s. duo to error, carrying N-b-u+l d.f.
(i) The block totals B l , B2 , . " , Bb are orthogonal to one another, the s.s. due to B. being B~/k .• Now
J
J
J
BI + B2 + ••• + Bb
=G
Honce from the sum of squares
b
i:
due to the b
subtract the
the b-l d.f.
This s.s. is
2
B ./k
.
j=l J J
degrees of froedom belonging to the B.IS, we have to
.
J
s.s. dUG to G, in ordor to obtain the s.s. due to
carried by the contrasts between the block totals.
therefore
n12
Sb = k
2
1
+
••• +
In the language of the analysis of variance, tho above is
called the 8.S. due to tho blocks (ignoring treatments).
We can write Sb 2 in tho form
Page 64
2
nThus Sb is tho sum of squares of ·the deviations of tho
block avera.ges i'rom tho general average, weighted by tho number
oi' plots in the block.
For the designs in current usc, tho block
size is constant and wo get
2
R,,2
1 ~ (B
G)2
dov B
-b
=
K~
k
=.
j - b
J
(i1) Let us next find the s.s. due to the adjusted yields, ~,
~, ••• , ~~ i.e., the s.s. due to the estimates of the treatment
contrasts.
Let 'Y'i be the vector corresponding to ~i. Then
2
2
o (Y1..· y 1..,) == Cov(Q.,
Qi') == a C1.' ,1.Of
1.
or
(Yi.Yi')
°
= Ci,i"
i,i'
= 1,
2, ••• ,u
Ii'
is the projection of the observation vGctor on the vector
space generated by the y's then ~ = 0' + 01 whore
°
= d1 'Y'l
oc
+ d2 "(2 + ••• + du Yu
and 0' is orthogonal to Yl' Y2' ... , Yu •
Qi
or
=
=
(~·'Y'i)
Honce
(Oo·'Y'i) == dl(Yl·Yi) + d 2 (Y2· Yi) + ••• + du(YU·Yi)
.
Cildl + Ci2di + ••• + Ciudu == Qi" i = 1,2" ... " u
Thus dl , d2 , ••• , du satisfy the normal equations" and we
may take them to be
t I " t 21
°0 = tl'Y'l +
••• ,~.
Hence
,,,,,""-
-
"
t 2 'Y'2 + ••• + tu'Y'u
Consequently the s.s. due to the adjusted yields is
~
A
-
(t l 'Y'l +t 2 'Y'2 + ••• + t u 'Y'u)
./.
.'
2
......
= Z til t 1 ("(1 "fi) + "t 2 ("(2 "(1) + ••• +
1
Z """
t (C
=i
....
i
= t l Q,l
.....
il t l + Ci2 t 2 +
+ t2~ +
.. .
A..
A.
••• + Ciu t u >
+ t u~
t u ("(u
"fir,."
Page 65
(iil) Finall:f, the s. s. due to error is obtained by subtracting
from the total sum of squares, the s.s. corresponding to the b-l
d.f. carried by the block contrasts, and the s.s. corresponding
to the u-l d.f .. carried by the adjusted yields. Thus. the s.s.
due to error is
2
s~ = 3 - s~2 - s~
The estimate of the per lot variance
0
2
, based on the
N-b-u+l d.f. belonging to error is
s =
3
2
e
N - b - u + 1
The general scheme of the ~nalys~s of variance for any connected design can therefore be given as
Due to
d.f..
s.s.
moan square
Troatments
Eliminating
Blocks
j
J.. .
s~
u - 1
+2
Blocks
ignoring
treatments
b - 1
Error
l N-b-u+l
I
Total
Sb
=2
= i2
ti~
Bi
ki k
i
s~
= S~/{U-l)
_
G· 2
- N
(dev 2B)/k when the
block sizG is constant
=
s;
iS~ ,= S~/(N-b-U+l)
(by sUbtraction)
N- 1
To-test for the hypothesis that the treatments are significantly differentiated, we have to use the F-statistic
F
=. St21S e2
with u-l, N-b-u+l d.f.
Suppose this result comes out significant. Wo can thon procoed to tost whether two troatments i and if are significantly
;'
different. Now the estimate of t i - t if is t i - t if , where
/\..
"....
.A.
2
V(t i - t i ,) = (C ii - Cii , - Ci'i + ci1i,)0
Hence the required difference is tested by using ~he t-statistic
= (t i
- ti,)/{se/Cii - Cii , - Ci'i + Ci '1 1 )
with N-b-u+l d.f.
t
lNSTJTliJr£ OF STATISTICS
Page 66
9ti
Ex. (1) Balanced Incomplete Block Designs
In a. balanced incomplete block design, there are u trea.tmonts, arranged in b blocks, such that each block has It plots;
and each treatment is replicated ~ times. Further, each pair
of treatments must occur together in exactly A blocks. It is
easy to sec that
bk = ur, l(u - 1) = r(k - 1)
Also, Fisher has proved the i~oquality
b ~ u or k::: r
The actual designs are listed in Fisher and Yates tables when
r :: 10.
Let us consider the problem of analysis of those 6osigns.
Now we have
Ci i
= r (1
- 11k)"
Cii'
= -)!k"
(i lit).
.'
Hence thu normal equations are
- ktl -
~t2
,- •• + r(l... ~)ti - •• -
~tu = (i l
(i
= 1"
2" ••• " u)
Taking these together with the restriction
t l + t 2 + ••• + t u = 0
we get
(I' - ~ + ft)t
i = ~Y (i = 1, 2, ••• , u)
NoW'
_ Eo + .2 _ r (k-l) + ).
A,(u-l) + ,{
u:\
I'
k
=
k -
'k
• . t.1
k
= u)..,.- Q.."
•
kr Qi
1 Qi
= -u>., -r -- -E -r
whero
E
= :k
k
1
_U)._Utk-l~_I-k
- kr - k u-l
<
- I _ 1
I
u
is defined as the efficiency factor for a reason which will presently appoar,~'
Tho contrast betweon the i-th and il-th treatment is esti-
mated by
A..
t i
- til
1 Qi
=E
-
%
I
....;;.;-r-~
~
2
2a2
V(t i - til) = cr (C ii ~ Ciil - Cil1 + Ci'i ') = rE
In ordinary randomized
block design, the corresponding
var,
n
G
iance, for the same nunilier of replications would be 2a /r. Hence
if there is no reduction in the pOl" plot varianco due to tho
A
Page 67
.....
reduction of the block sizo, the variance of t i - tir is increased in tho ration liE or the information (which is defined as the
reciprocal of the variance) is! decreased in the ratio: E. Hence,
E is callod the efficiency factor. Of course this thooretical
loss of efficiency will in senoral be lnore than offsot by tho
reduction in the error variance por plot. The analysis of
varianco is
Duo to
mean square
d.f.
s.s •
•
•
Treatments
2
2
2
Eliminating
3 = dev q/rE
u - 1
St = 3~l(u~1)
t
blocks
/'-
•
f
Blocks
ignoring
treatments
3 br2 = dev
b - 1
"N;"b~\1*l
4It
Error
where
N = bk = ur
Total
N ... 1
3
2
0
2B/k
;s2
(by subtraction) : e
b...u+l)
= s2/(N...
0
dev2y
Hence to test whether tho treatments are significantly
differentiated we have to use tho F-statistic
p
= s~/s;
with u-l, N-b-u+1 d.f.
To test for the significance of the difference botween any
two treatments, we have to use tho t-statistic
with N-b-u+l d.f.
10.
The Lattice Design
Consider a kxk two-dimensional square. The k 2 colls corres2 -pond to ~ (=k ) treatments. We can form k blocks by taking sets
of k treatments oecuring in tho same rewa Similarly ~ blocks may
be obtained by taking sets of treatments in the same column. For
example from the 5x5 square
Page 68
1
2
3
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
~
5
4 •
I
24
I
.
I•
l!
j
25
•
we get the two sets of blocks
( 1,
( 6,
(11,
(16,
(21,
Sot I
2, 3, 4, 5)
7, 8, 9, 10)
12, 13, 14, 15)
17, 18, 19, 20)
22, 23, 24, 25)
(1,
(2,
(3,
(4 ..
(5,
.
Set II
6, 11, 16,
7, 12, 17,
8, 13, 18,
9, 14, 19,
1·0 , 15, 20,
21)
22)
23)
24)
25)
Each set of blocks gives one complete replication. The two replications are orthogonal in the sense that the treatments in any
block of one replication are distributed enG each among the
blocks of the second replication. We can get another orthogonal
replica.tion by taking a 5x5 Latin square, and taking those varioties which correspond to the letters of the Latin square. ThUS,
if we take the Latin square (Ll )" shown below
C
E
D
B
A
A
C
B
E
D
B
A
D
G
D
C
A
E
A
E
C· : B
B
D
•
•
•
•
then tho blocks of the third replication are given by
Set III
(5, 6, 13, 19, 22)
(4, 8, 12, 16, 25)
(1 .. 7 .. 15, 18, 24)
(3, 10 .. 14, 17, 21)
(2 .. 9, 11, 20, 23)
If there exists a Latin square orthogonal to (L2 ), then another
Page 69
replication may be t~~en corresponding to this Latin square. It
is known that when k is a primo or a power of a primo, there always exists a set of k-l mutually orthogonal Latin squares. The
practically useful values of k are k~9, so that only in tho case
k=6, there does not exist a Latin square orthogonal to a given
one. Vfuen k=6, we can go up to three replications only. Of
course, we may stop at any number of replications, but in case k is a prime or a prime power, and we go on to the maximum possible
number of orthogonal replications, viz., k+l, then every pair
will occur once, and we get a 'balanced lattice', which is identical with the balanced incomplete design with parameters,
2
u = k, b = k{k + I), r = k + 1, k = k, -1\= 1
A lattice design with ~ orthogonal replications may be called an
-m-ple lattice. Of course, we may extend tho number of blocks by
repeating the whole design, say £ times, so that we get ~ replications, but in goneral it will be preferable to add new orthogonal replications, instead of duplicating or triplicatinG the
old ones. But in tho case k=6, not more than throe orthogonal
replications are available, and if it is desired to llave more replications, we may duplicate or triplicate each replication getting 6 or 9 replications in all.
These remarks may bo illustrated by taking the case k=5. In
this case the complete set of orthogonal Latin squares, consists
of four squares. Thus, three other squares orthogonal to (L 1 )
and mutually orthogonal to one another are
(L )
3
(L )
2
E
B
D
A
0
I!
; :
,
B
D
0
A
d,
A
0
B
E
~
r
D
c
iI
E
B
0
A
it.
I!
I
I
E
A
IE
I
l
D I C
L
•
I
D
B I A
I
B
D
i
t
A
I
j
l
0
D
E
B
E
11
n-ilI
I1
I
I
A
II
If,
C
D
A I- E
B II
i;--"----'----+---:..-0--1',•.
t·
I:
I
, E l B ! A D
'
__ .
L
I
-=--~_-:"_~_-"
and the corX'esponding sets 01' blocks are
A
j
E
I
•
c
I
E
c
j
B !
""' I D
,
B
I
A
D
B
I0
I
B
;
A
E
A! D
IC
i
---------+i--+- A
1E t D t B t~
. C -=--...,l-'---.....
1
__ ' - ' ,
E
D
----!
-.....1,•. - - - - - '
(3,
e
(2 "
(5"
(4,
(1,
Set IV
6, 12, 20,
8" 15, 19,
7, 14, 16,
10, II, 18,
9, 13, 17,
24)
21)
23)
22)
25)
(4,
(1,
(3,
(2,
(5;
Set V
6, 15, 17,
8, 14 , 20,
7, 11, 19,
10, 13, 16 ,
9, 12, 18,
23)
22)
25)
24)
21)
Page 70
Set VI
(2, 6, 14, 18,
(5, 8, II, -17"
(4, 7, 13, 20,
(1, 10, 12, 19,
(3, 9, 15, 16,
25)
2~)
21)
23)
22)
If we take all the six replications possible, we shall Cet a
balanced lattice , bu.t we can stop at any staGe. Thus, by taking
only tho first fou.r sots, we would get four replications (quadruple lattice with 25 treatments).
Let us now consider the analysis for a lattice design with
2
k treatnwnts, laid" out in mk blocks, consisting of ~ orthogonal
replications. We the,n have
2
u = k, b = km, r = m.
Also, Aiit = I, if the treatments i and it corrospond to the same
row, same column, or to the same letter of one of the Latin
squares.
Otherwise ':\ii t
Gii
= m(l
=
O.
- 11k},
Then
Cii t
= - Aii ,Ik
Denote by Sr{t i ), Sc{t i ), Sj{t i ), respectively tho sum of
the effects of the tree.tments" in the same row, in the same column, corresponding to the same letter of the j-th Latin square.
Similar meanings are given to S (Co), S (0.), S.(Q.), with refer ~
c ~
J
~
renee to the adjusted yields. Thus with example (k=5) considered
before
Sr(t l ) = t l + t 2 + t 3 + t 4 + t s , Sr{ Ql) = Ql + ~+ %+ Q4 + %
Sc{t l )
Sl(t l )
= tl+
t 6+ t l1 + t 16+ t 2l ,
t l + t 7+ t 15+ t lS+ t 2il ,
SC (S)
3 1 (Ql)
= Q+
1
= (;,1+
Q6+ ~+
'1 Q16+ ~l
~+ 0,15+ Q18+
~4
It should be noted that'S (t,) could also be denoted by
r ..L
B {t ), Sr(t ), Sr{t i1 ), or Sr{t ); and similarly fer the other
3
5
r 2
symbols. Now the i-th normal equation is
•
or
Page 71
Also, let us impose the restriction,
t l + t 2 + ••• + t u = 0
Summing up over the row containing the i-th treatment, we
get
rnSr(t i ) - Sr(t i ) = Sr"(Qi)
Sr(t i )
Sr(Qi)/(m-l)
Similarly
Sc(t i ) = Sc(~)/(m-l)
Also, if we sum up over the treatments, which correspond to
the same letter as the i-th treatment in the j-th Latin square,
we have
Sj(t i ) = Sj(~)/(m-l)
=
Hence finally we get
1\
ti
+
1
=m
Qi
I
m(m-l)k Sr(~i) + Sc(~i) + Sl(Qi) + ••• + Sm_2(Q )
From this we get
For i
-I
1
= in
i
1
+ k(m-1T
we ge t
1
Ciil
1
= m(m-l)k
or 0
"according as the treatments i and ~ occur in a common block or
do not occur in a common block. We may thus write
Aii l
Ciil = m(m-l)k
Now
V(ii -
Hence if i and
"If
!
and
~
~~I)
~
= (C ii
do not occur tog8ther in the samo block
ft
= 20
2 1 +
2
=
•
- Cili + C11il )02
occur together in the same block
V( t "i - iAt)
f
f
e iil
-
m
20,
I
k(m-l)
m
ill . . . + (m-l)k
Notice that in the second case the variance will be slightly
larger. Tests of significance can easily be carried. out by
applying the general formula.
INSTITlWTE OF STATISTICS
•
Page. 72·
N.B.
If each of morthogonal replications in an m-ple lat2
tice on k troatments is laid out U times, then we have
2
u == k , b == mnk, r == mn
Also, Aii t is U if the i-th and it-th treatment correSpond to the
smae row, column, or the same letter of one of th8 Latin squares;
otherwise it is O. Thus in effect. c..
and CoJ. i· tare multinlied
by
J.J.
•
u, so that the new normal equations can be obtained from the old
on replacing Qi by Qi/n for i == 1, 2, •• ~, u. Hence our estimates are givGn by
f\
1
1
(
,
'1
t i == iiiiiQi + mrdm-l)k~Sr{Q,i) + Sc('li) + Sl(Q.i) + ••• + Sm-2{Qi)J
Also we now b.s,ve
1
;
k(m-I)J'
h
t.J. t)
V(t.J. -
2
== ~
mn
(1 +
==
&2mn
(I +
m)
==
1
mn(m-l)k or
0
1) when i and it occur
k ' together inIi' blocks
wheh i and it do not oc-
(m-I)k ' cur togother=in U blocks
The Partailly Balanced Incomplete Block Designs
The Partially Balanced Designs are a general class of designs, which include as a special case, both the Balanced Incomplete Block designs and tho lattice designs. Here, as in the
case of lattice designs, two treatments are not always compared
with the same accuracy.
'l'ho conditions for a partiallJT balanced ,:!esign are:
(i)
Every treatment is replicated £ timos, and each block
contains k plots.
(ii) With rospect to any given treatment, tho rest fall into ill groups of n , n 2 , ••• , nrn each, such that any treatment of
l
the i-th group occurs with the given treatmont Ai times, tho numbers Ai and n being independent of the treatment with which we
i
start. The treatments of the i-th group may be called i-associates ef the given treatment.
(iii) If ~ is i-associato of 0, then ~ is an i-associate of
~, and the number of treatments which arc at the sarno time j-as~·
sociates of C( 1 and k-associates of (3 I is P~k and is independont
of ~ and {3.
11.
•
2
A
eiit
Page 73
If !l be the total number of treatments, and b the number of
blocks, then tho following conditions are satisfied by the parameters occuring in tho c10sign.
bk = ur
u - 1 = n l + n 2 + ••• + nm
1
I
= nl
I + n 2 2 + ••• + n mm
i
i
i
nl - I
Pil + Pi2 + ••• + Pim
i
i
Pjl + Pj2 + ••• + P~m
n j ' (i -I j)
r(k - 1)
I
!
=
(ll.l)
=
i
n P
i jk
j
= n j Pik
Tho parameters u, b, r , k, All },.2 1
AmI nIl n 2 , ••• , n m
i
are called parameters of the first kind, and the Pjk'S
are called
parameters of tho second kind.
It shall now be shovm that if a design so.tisi'ies the conditions of a partially balanced incomplote block design, then the
analysis can be carried through by solving sots of m+l linear
equations. For the practically useful designs, Dl would be 2 or
a.t most 3 (when m=l, we get a balanced incomplete block design).
Lot us denote by S_ (t) the Sunl of the effects of the i-o.sso1
ciates of the treatment t, with a similar convention for Qi(t).
Then our normal equations arc
r(k-l)t - AlSl{t) - )2S2(t) - ••• - AmSm(t) = kQ
(11.2)
••• ,
Sunuuing up over i-o.ssociates we got
kQjt)
t
+ ••• +
2
S0
+ P-2
(t) +
1
...
•••
...
( 1 S (t)
2 S (t)
- ,"\m iPil11 1
+ Pim 2
+
·.. +
·..
·.. + P-
- Aini t
•
Now impose the restriction
t l + t 2 + ••. + t u
which may bo written as
m S ( t )i\
TIl
P-2
1
...
I
m S m(t)}
1 rn .,
(11.3)
=0
t + Sl(t) + S2(t) + ••• + Sm(t)
=0
Page 74
Thon (11.3) can bc written as
tAini -
AIPil - ~2Pi2 - ••• - ~mpim331(t)
, 2
2
+ ~ Aini - A.IPil - A2 Pi2 - ••• - AmpimfS2(t)
•••
•••
•••
••
••
•••
i
1 i}
+{r(k-1) + ~ni - AIPil - A2 P i2 - ••• - ~Pim)Si(t)
••
•••
•••
•••
•••
••
! .
\ m
OJ
m
"
~1 ')
(t)
(11.0
+t -{ini - '''''lPil - il.2 P i2 - ••• - I\:mPim~Sm
Lot us sot
.
kQ. (t)
1.
D.
=
1
1
1 ( L n _... P R
kt/1. i
/1.1 il
-
il. 1
a ii = iC':.r
Then we have
a
i
_ 110"
D
I
2" 12
_
•••
_ " P~
(k 1 ) ' n
'I
i
..
i
+ /I.i i - "lPil - A,2 P i2 -
3 (t) + a 3 (t) +
11 1
12 2
• •• + aIr-Pm (t)
Solving these wo can express Sl(t), S2(t), ""
= All~(t)
i-L.l
r=.
)
i)
... - f\mPilllJ
a 21S1 (t) + 0.22S2 (t) + • •• + a 21nSm. (t)
•••
•••
•••
•••
~nlSl(t) + a m2 s 2 (t) + ••• + anmPm(t)
3 1 (t)
i(
t'm im)'
=
)
Ql
=~
..
(11.6)
= Sm
Sm(t) as
..
.
• ••
• ••
Sm (t) ;;: Alm~ (t) + A2ll1~ (t) + • • • + A111m Q111 (t)
anc~
11.5
+ A21(~(t) + ••• + ~nl~a(t)
•••
•••
(
finall'JT
r(k··l)t ;;: kQ +
t;tlAll
+
•• •
iA1 Aru
-10
+ A2 A12 + ••• + ).mAl o1l ••·\Ql
)
•••
•••
•••
(11.7)
(11.8)
+:A2 Am2 + ... + ,Ami\mn \(~
Thus our main task in solving the normal equations is to
evaluate the constituents of «a i ; ) , and then solvo (11.6).
It appoars from (11.8) that if tho treatment i' is tho j-th
associate of tho treatment i, then
C•• = k/r(k-l)
1.l
Cii ,
v(t.1.
t'i
=
,)
F\.lA jl + )'2 Aj2 + ••• + AmAjm\/r(k-l)
202 r
= r(lc-Ink
t
- ;'lA jl - )2 Aj2 -
... - AmAjm"!
Tests of significance can bo carried out in tho usual way_
IN STITliJTE OF STATISTICS
Page 75
•
Two way elimination of heterog?neity.
In this case the blocks are classified into two ways, so
the.t each experhwntal unit is a part of one block of either system. Thus, in the case of a field experiment tho design is laid
out in tho form of a square or a rectangle, with 'rows' and 'coIUl1m.s I • We shall use the names 'rows' and 'colUl1'.ns I even in the
general caso. Lot there bo k rows and k' columns. Then there
are besidos the treatment effects
12.
... ,
t
u
the k parameters correspondinG to tho row effects, viz.,
b , b ,
2
l
••• , b
k
and the !E.:. parameters corresponding to tho colman effects
bi, b
2, ... , b kl
There are now kk' plots. Let the yield in the plot j,j', i.e.,
the plot occuring :1n the j-th l~OW and j' -th column, be denoted by
yjj'. If thei-th treatment occurs in this plot, then the equations of expectation are
E (Yj . ,)
J
=
t. + b. + b J~ "
1
J
fl. ,
Let ~l' ••. , a'~k' '''I'
•.• ,
f"
Pk"
;-
L
l'
of expectation arc written out in full.
the observed yields, clearly
( ~. (1.)
J
= B.,
J
( ~. ("'. ,)
J
= B J~ , '•
... , t u be tho coefficiIf
~
( ~ .... )
1
is tho vector of
= T.1
where D. is the total yield of tho j-th row, B~, is the total
J
yield ef the j' -th colurnn, and T. is tho total yield of tho i-th
1
treatme:.nt. Honce from tho r;onoral theory we get tho follovling
theorem: .
Theorem: Corresponding to any estimable parametric function
n of tho row, column and treatmont effects, there exists a unique
linear function Y of tho row, colUinn and treatment totals, such
o
'that
E(Y ) = IT
o
This linoar function Y is tho best estimate of TI.
o
Ex. Before proceeding to the general theory let us apply
the above to a Latin square design. Such a design is formed by
taking a kxk Latin square, and assigning k treatments corres-
Page 76
Of course, there must be proper randomi-
ponding to the letters.
zation.
Here k'=k, as there are k rows and k columns.
Let
S(o)
= bl
= b'1
= tl
S (b ,)
set)
B
D
C
A
E
A
C
B
, E
D
C
E
,
D : B
f D
A
E
f
C !I ' B
,
B
A
D
+ b 2 + • •• + b k ·
+ b 2' + • • • -I- b'k
+ t 2 + • •• + t k
.
,
I
~
.....
i
I
I
Now
Also, u=k.
E
!
C
E(T i ) _. kt.1. + S(b) + s (b I)
Also if
= Tl
+ ••• + Tk = Bl + • • • + B
'k = 13'1 + • •• + I1~
is the grancl total, thon
G
B(G) =
•
• •
E11.
'1"
C:..
(T i
k~S(t) + S(b) +
S (b ' ) }
••
- kG) 1J =
'I t .
~ 1-
S(t) }
- ~}"
)
i
11
If we multiply those equations by Xl I
~X = 0, and add, we at once see that
E{(~lTl +f 2 T2 + ••• + i k Tk )/k1
=
(12.1)
= 1, f..J,
I)
l·"
G
... , k
• • • I
.ikl
where
11 t 1 + ~)2t2 + ••• + ·Y.k t k(12.2)
Thus the treatment contrast )Itl + ••• + iktk is estimated by
(111 Tl +
In particular
t
R2 T 2
+ ... + .fkTk)/k
i - til is estimated by
,
~ (T i -
•
T i ,)
Similar estimates can be obtained for row contrasts in terms
of row totals, and the colunm contrasts in terms of column totals.
We have hitherto accounted for 3(k-l) degrees of freedom, ,
viZ., those belongine to the contrasts between the T's, the contrasts between the Bls, and tho contrasts between the B"s. Due
to the relation (12.1), there is only one other independent linear function of tho T J D nne: D t 's J viz., G. :Ienco tho estimation space is of ranI:: 3k-2 ancl k 2 -3k+2 = (k-l) (k-2) dogrees of
freedom belonging to error •
Clearly T 11 ••• , Tk ~ro orthOGonal. The s.s. belonging to
the treatment contrasts is therofore
with similar oxpressions for tho s.s. due to row contrasts~ and
the column contrasts. Tho s.s. duo'to G is of course G2/k ,
which is the correction for the mean. The s.s. due to error is
therefore 2
2
G2
dev 2 T
dev2 B
dev 2 Bf
Se = Z Yj j ' - ~ k
k
k
k
:: d
ov
2
dev T
k
2y .
Hence we have the following analysis of variance
Due to
Mean square
d.f.
s.s.
2
Treatment
2
2
1
3
k
St = S~/(k-l)
Contrasts
t = dev T/k
Row
ContI'asts
k - 1
Colwml
Contrasts
k
Error
Total
1>.n
-
S2
b
2
2
Sb' = doV Bf/k
2
sb' = S~I/(k-l)
3 2e (by subtraction)
So = 3~/(k-l) (k-2)
1
l{2- 3k+ 2
=
..,
2
=
Sb!(k-l)
sb
dev 2B/k
2
2
dov 2y
Ie -1
estimate of tho per:"plot error variance
s2 = s2/(k 2 -3k+2)
e
0
2
is given by
e
To test whether the treatlnents are significantly differentiated,
we use the F-statistic
F = St2/se2
with k-l, (k-l)(k-2) dogrees of freedom.
To test for the significanco of the difference between the
i-th and il-th treatments we use the t-statistic
t
=
(T.:L - T.:L I )/(s e
flk)
with (k-l) (k-2) degroes of freedom.
13.
•
Normal equations for two way elimination in the general case.
k
kl
E (T.)
= rit i + L; nijb j + l: n!j Ib j,
:L
j=l
j'=l
where n ij = 1 or 0 according as tho i,,:,th treat:nent does or does
not occur in tho j-th row. Likewiso nl jl = 1 or 0 a.ccording as
•
Page 78
the i-th treatment does or does not occur in tho i'-th colwnn •
Now
u
E (B.) = k 'b . + b' + b' + ••• + bl~' + l: nijt
1
2
i
J
J
i=l
u
E(Bj,) = kb j, + b l + b 2 + ••• + b k + ~ nij' t i
i=l
The adjusted yield Q is now defined to be
i
k'
l:
j '=1
A littlo calculation shows that E{~) is given by
1
... + ,Aiu t u ) ~ k{Ailtl + ••• +
1
+ kk' (rl t l +
where Aii' is tho nlli~ber of rows in which the i-th and i'-th
treatments come together, and ~ii' is the number of columns in
~lich the i-th and i'-th treatments como tOGether, and ~ii =
",
•
/\1i = r.~
• • E (01.) = Cilt l + • • • + Ciut u
where
r. r. ,
~ii' .- A!D.• , )
~ ~
i ;L i'
=
~,
( kk t
Cii '
~
,
1
1
ri
Cii = ri(l- k - kT+ kk')
These are tho coefficients which take tho place of tho coefficients in the theory of one way elimination.
By calculating tho expectation of an arbitrary linear function Y of Tl " ••• , TU1 B1I ••• IBk' Bi, ... , Bl~11 we can show
that in order that Y may bo an osti~ate of pure functions of
treatment effects. Y must be of the form
.Also, a little calculation shows that
V(~)
•.
'
= Cii a2
Page 79
•
The idea of connectedness will now be slightly generalized•
In finding whether two treatments are connected, we have to allow
connection both through rows and through colUl1ll1s. If any two
treatments are connected, then the design may be said to be a
connected design. For such a design any treatment contrast is
estimable, and the estimate is obtained by substituting in the
contrast solution of the normal equations
+ ••. + ciut u = ~
The s.s. duo to the u-l d.f. belonging to the adjusted
yields can be proved to be
cilt l +
ci2 t 2
-\.
/'-
t 1 0.1 + t 2 ~ + ••• + t u Ou
It is readily soen that the row and colunm contrasts are orthogonal to the Q's. Hence we get the folloVling scheme for the
analysis of variance
Due to
•
d.f.
s.s.
mean square
•
Treatments
elininating
U - 1
rows & cols.
Row cont.
(ignoring
treatments)
Col. cont.
(ignoring
treatments)
k' - 1
Error
kk'-u-k-k'+2
Total
kk' - 1
14.
••
k - 1
3
2
e
(by
subtraction)
kk '-u-k-k '+2
Youden'sSquares.
Sometimes in a design the position wi thin the block is im...··
portant. The classical example is due to Youden, in his studies
on the tobacco mosaic virus. He found that the responso to treatments also depends on the position of the leaf on the plant. If
the number of leaves is sufficient so that every treatment can be·
applied to one leaf, then wo got an ordinary Latin square, in
which the trees are columns and tho leaves belonging to the same
JNSTITlWTE OF STATISTICS
•
Page 80
position constitute the rows. But if the number of treatments is
larger than the nunmer of leaf positions available, then we must
have incomplete columns. Youden used a design in which the columns constituted a balanced incomplete block design, whereas the
rows were complete. These designs are kno~~ as Youdon's squares.
For exan~le, consider the following, in which thoro are eleven treatments but rows are incomplete, each containing only five
treatments.
5 ,6 7
1
2
9 10 11
34
8
10
11
1
2
3
5'
6
7
8
9
10
4
5
6
7
8
1
2
3
4
5
9
5
6
7
8
1
2
3
4
9
11
10
11
1
2
6
7
8
9
10
3
11
Of course, here the design is given in a schematic form and
we must randomize rows and columns. Also, tho actual physical
treatments have to bo assigned numbers 1 to II in a random order.
It will be seen that every treatment is replicated 5 times, and
any pair of treatments occur in the samecolurmn (i.o., on the
same tree) twice.
In the soneral case, if there are £ treatments, we have b
columns, each with k treatments, so tha,t each treatment +s replicated £ times, and each pair of treatments occurs in the samo
colu.'I'JU1 A times. Since the rows are complete, oa~h conto.ins u
treatments.
Thus we must have b = u, and consequently r = k. Hence
the balanced incomplete block design formed by tho colwnns is a
symnletrical balanced incomplete block design. It can be shown
that every symmetrical balanced incomplete design can be converted into a Youden1s square, i.e., the blocks can bo so arranged
that in a particular position within tho block, each treatmont
just occurs once.
Let us now consider tho problem of analysis. Here
k
= k,
kl
= u,
Hence
.'
Cii,
=
=
=
r i = r,
~
A
riri
)1i'
( kk'
k""'
r2
r
(ku
u
k-)
- 11k since
-
= r,
ii ,
-
.',
,
"\1i I)
u
- -
r = k
-, I
.Aii I
=
,'"
A
Page 81
"
1
= r{l - k)
These coefficients are the Same as in the case of an ordinary balanced incomplete block design, and we find as before that
t i - til is estimated by
.At i - t i , = (~ .. Oi)/rE , whore
~
..\ , , \
and that
2
= 20/rE
vet! - tit)
The only difference is in the estimation of
0
2•
The number
of d.f. belonging to error is
ur - 2u - r + 2
The s.s. clue to error is
2
dev 2B
A
dev y - z tQ -
u
k
Thus thero is a loss of r-l d.f. in the esti~ation of error.
this error is offset by· tho reduction in 0 2 (the per unit variance). If this reduction is large enough, i.e., the positions
within the block are sUfficiently differentiated, then we shall
gain by using a Youden's s~are in place of an ordinary balanced
incomplete block design.
15.
Missing variates
Consider the problem of linear estimation with variates
2
Yl' ••• , Yn ; xl' ••• ,' xk~ with a conrrnon variance 0 , and expectations given by
E(Yl)
= allPl
+ a 12 P2 + ••• + almPm
E(y2 )
•••
= a 21 Pl
+ 8,22 P2 + ••• + a 2mPm
• ••
•••
=
E(Xk )
Let 'x' j = ( a l j ' a 2 j
~j
= (0,
0
I
•
•
•
0
+ b 1mPrl
•••
•••
•••
bk1 Pl + bk2P2 + •• + bkmPm
....
.0.
•
•
••• ,
•
•
0
0
an j
I
Q,
(15.1)A
0, ••• , 0)
0, ••• , 0, b 1jl b 2j , ••• , b kj )
(l5.1)B
Page 82
Yj
•
=
,,=
.1
(alj , a 2j ,.··, anjl
(Yl' Y2' ..... Yn ,
= (0,
0, .'"
b ljl b 2jl ..... bkj )
= LXj
+ (~j
0, 0 .. ••• ,0)
0.. Xl' x 2 '
~)
""
f= (Yll Y2 , '''' Yn , Xl' x 2 ' " ' ,
Xk )
= 'r)
+
The normal equations are
(Yl'Yl)Pl + (Yl'Y2)P2 + ••• + (YI''Ym)Pm
=
(Y2'Yl) PI + (V2' Y2)P2 + • • • + (Y2' 'Ym) Pm
,
,
•••
+
(Ym·Ym)p~
(Ym'Yl) PI + (Ym'Y2)P2 + • ••
= (v2,1)
..
..
(Yl.:t)
., .
• -e •
=
(15.2)
( Ym<~')
Now suppose that the variates ~ are missing. We make the
analysis as is they wero present, and minimize the sum or squares
due to error, which is
s~
= j2
- (Yl'1)Pl - (Y2·f)P2 - ••• - (ym.j)Pm
Where PI' ..... Pm are solutions of the normal equations
dS 2
dP1
dX~
= 2x i
-
(Yl,r) dX
i
- ... -
Dirferentiating (15.2) we get
dPl
.
dPm
(Yl'Yl)aX + ••• + (Yl.rm)dX = b i1 , (i = 1, 2, "', m)
i
i
Multiplying the equation (15.4) by PI' "'1 Pm' and adding, we
find on using (15,2), that
dP
1
(Y1· j ) dX
Hence
1
i
dS 2e
2 dX7
i
dp "
+ .,. + (ym<f) dX;
= PIb i1
= Xi
+ ••• + Pmb im
- PIbiI - P2 b i2 - ••• - Pmb im
Thus to minimize S2e we must put
Xi = P1b i1 + P2b i2 + ••• + Pmb im , (i
The normal equations (15.2)
C~D
= 1,
be written as
2, "'" k)
(15.5)
'.
Page 83
[(''<1 0;1) + «(~1 ·(31)J P1 +
0
I(tJt 2 0 d,1)+(02·{31)] P1 +
•
0
0
•
0
•
C( )+((51
+ 1«(1l·
._
m
+
I( 2
Q
"
.. +
(3m.
}II·
P.m
=
(C(1·1')+«(.31·'~)
J
m' + «(:}2 °r'm) Pm = (0: 2 • 1') + «(32· ~)
•••
• •• (15.6)
X
0 t
.
• ••
o • •
0
L'G(m·Xm)+ «(3m·{3mY{ Pm = {cx'm°1"}}+ «(3m o %)
But from (15.5), if xl' ••• , xk are the values of ~ which minimize
then
s;,
~)
= P1 ('3 10()1
( t32 ° t) = P1 «'2"(\)
({31 • t)
••
«(.3m·f:>
Hence
+ P2 «()1..,3 2 ) + .0 • + Pm·
«(3lo~ill )
+ P2«32°~2) + o" • + Pm «(32e(3m)
"
0" •
• ••
• ••
Pl (t3m·(3l) + P2 «3m'(32) + ••• + Pm (Pm·t'm)
I
=
..
(15.7)
(C(l ·0(1) P1 + (~1·~)P2 + • •• + <0<.1 ·!tm) Pm = (,'.ll • 1')
(0(2·/Xl) Pl + (~·ty2)P2 + .0. + (~2·a.ill) Pm = (£t m01')
o•
"
•••
•••
) pm = (Q:m e.Tl)
+ (Q·CC
(ttm• (1"'1) P1 + (~·~2) P2 +
ill
ill
•
•
0
·...
..
(15 0 8)
This shows that if those values of ~ are taken which mini2
mize Se' then the solutions of the normal equations would be the
same as if we had formed them by neglecting the missing variates.
Hence we get Fisher's principle: If the variates Xl' x 2 ' ••
• , xk are missing, we can proceed as if they are present, providing that we put for them values which vvould minimi.ze the sum
of squares. The procedure would not change the estirnat·es at all.
16. Let us now consider what would bo the effect on tests of
significance. We can write
2
Se
n
= igl
(Y i - ailPl-···-aimPm)
2
k
2
+ i~l (Xi - bilPl-···-bimPm)
where Pl' ••• , P are solutions of (15.2). But if values of Xi
2
m
minimizing Se be taken, then (15.5) shows that
2
=Z
2
(Yi - a il P1 - a i2 P2 - ••• - airnPm)
which is tho actual sum of squares due to error when the variates
are missing. Hence
The sum of squares due to error in the case of missing variates can be obtained by proceeding as if the variates were present, and taking those values of them which minimize this sum of
squares.
Se
•
Page 84
I~ now we had to test any hypothesis, then the
(s.s. duo to hypothesis) = (s.s. due to conditionat error)
- (s.s. duo to' error)
But our previous reasoning is valid for conditional error. Hence
the actual s.s. due to conditional error can be obtained by proceeding as if the missing variates were present and minimizing
the s.s. due to conditional error. O~ course, the values o~ xl'
x 2 ' ••• , ~ which minimize tho a.s. due to conditional error will
not be the samo as those which minimized the s.s. due to error.
Hence, to get the s.s. due to hypothesis we have the following
rules:
(a.ctual s.s. due to hypothesis) = Min~Js.s. due to hypothesis) +
'-,
(a.s. due to error)! Mint(s.s. due to error)}
the s.s. on tho right hand side being calculated as if the variates are present.
17. Missing Plot in a Randomized Block Experiment.
Suppose in a rando~nized block experiment one plot is missing.
Let the missing plot occur in the j-th block, and suppose that
the i-th treatment had beon administered to it. If we write down
the s~~ of squares due to error as i~ there had beon no missing
2
2
2
2
plot, then
Bl + • • • + Bn
Tl + • • • + Tm
2
2
(17.1)
+
3 e = :z y m
n
r.m
where thore are m treatments and n blocks
)
!t
dS;
~
cy ij
2T i
2B.
2G
--1:. + _
= 2Yi'J - --m
mn
n
Now the Ti , Bjl and G as given above also contain
y~~.
Hence if
.J.,)
we denote by Til B J~, G.1 the 8.ctual yield total, the actual block
total and actual grand total, then we may write
"J' ~ • (1 _ ~ .. 1
... J
..n
n
•
• •
To obtain the actual 8.S.
value of Yij in (17.1).
+ L)
Tl
Bt
=.2:.+..:..J.
n
m
mn
mTJ + nEt - G'
J.
J
-----'-......- - =
Gf
__
mn
0(
(m-l) (n-l)
duo to error , we have to put this
JNSTJTfWTE OF STATISTICS
•
Page 85
Again the conditional error is obtained by adding tho s.s.
uue to error and the s.s. due totreatmonts. Thus tho conditional
error is
B22 + ••• + Bn2
+
3,2 = ~ y2
(17.3)
e
m
dS,2
2B.
e
•
--l
• • ay:-:- = 2yi j
m
--Bi
-
-
~J
We then have to put
Bt
Yij (1 ...
1
iii) =-1.
rn
B'
or
y ij
= m~l =(3
(17.4)
in (17.2), to, get the actual conditional error, i.e., the actual
sum or squares due to treatments and to error. If we have already
carried out the analysis or variance, by using the value (17.2)
tor Yij we would have over estimated the s.s. due to treatments,
by an amount which ls equal to the dirferonce of the values of
(17.3), when putting in it Yij as given by (17.2) and Yij as given by (17.4). This is called the bias in the s.s. due to treatments. Now
2
,2
2
(Bt + Yij)
3
= y
.1
_
+ other terms not containing y
e
ij ...
m
= m-l
_ y2
m ij
2Bt
_ -al
y .
m ij
Hence bias is given by
2B t
m;l(o? - ~2) - -nf(~- (3)
=
(<<.- Q) ~\( m-l ) I\,:)',,+(3
' )
~'
m-l (0( - (J. . ) 2
... 2pt}
In
'I
JIm
=-
-"
Thus we get the rol1owing rule: Estimate the missing value by
the tormula
Yij = (mTi + nBj ... G1)! [(m-l) (n-1U
Carry out the analysis of variance, with this value, but correct
the s.s. duo to the treatments by sUbtracting the amount
•
.
B t,2
...J.....:
m-i lY ij
m-lJ
but remember that only (m-l) (n-l) - 1 d.f. belong to error.
If i l and i 2 are two treatments for which the blocks are
complete, then t. - t. is estimated by
m
~l
f.
...
(T.
... T
~2
~l
i2
)/n
•,
,
Page 86
and the variance of this comparison is clearly
20
2
/n
It is of interest to compare this with the variance of a
contrast between the i-th treatment for which a plot is missing
and any other treatment say the i o -th~ In this case the estimate
will be
which may be written as
(raT! + nB t -
I)
nt
1.
J
G
(m-l)(~-l)
'"
l
+ Tf - Tio \
Some algebraical calculation shows that the va.ria.nce of
this estimate is
2,
o }2 +
n(
m
"I
(m-l) ~n-l) J
Hence the contrast between these two treatments is measured by
somewhat greater inaccuracy.