Corral, Ada R.; (1980)Repeatability of Spatial Configurations Defined By Interaction Matrices."

REPEATABILITY OF SPATIAL
CONFIGu~TIONS
DEFINED BY INTERACTION MATRICES
by .
ADA RAY
CORR.~
A thesis submitted to the Graduate Faculty of
North Carolina State University
in partial fulfillment of the
requirements for the Degree of
Doctor of Philosophy
DEPARTMENT OF STATISTICS
RALEIGH
1 9 8 a
APPROVED BY:
iv
TABLE OF CONTENTS
Page
..
LIST OF TABLES .
v
vii
LIST OF FIGURES
1.
INTRODUCTION
1
2.
LITERATURE REVIEW .
5
3.
HETHODS AND MATERIALS
3.1
Description of Interaction Effects - Model
and Assumptions . . . . . . . .
Measuring the Agreement Between the True
and the Estimated Location Points
Monte Carlo Study of Repeatability of
Spatial Configuration . . . . . . . . . . .
3.3.1 Definition of parametric A matrices
3.3.2 Generation of A, the matrL~ of
Estimated interaction effects
3.3.3 Analysis of the Amatrices
3.2
3.3
4.
REPRODUCIBILITY OF SPATIAL CONFIGURATION DISTRIBUTIONAL PROPERTIES OF m2 .
4.1
4.2
5.
v x
6.1
6.2
INTERACTION
60
OF EIGENVALUES EST~~TED FROM
COVARIAl~CE MATRICES
62
62
76
8.
LIST OF REFERENCES
9.
.~PENDIX
-
78
CONCLUSIONS
Su~Y &~
~~
22
23
52
Results
Discussion.
7•
15
15
52
Results
Discussion
~
13
25
47
EFFICACY OF USING REDUCED DL~NSIONALITY TO
CHARACTERIZE SPATIAL CONFIGU~~TIONS .
~1E~~S ~~ VARI~~CES
10
25
Results
Discussion
5.1
5.2
6.
10
84
?
UPPER BOUND FOR THE EXPECTATION OF m-
86
v
LIST OF TABLES
Page
3.1
when
1
=
v
when
4.1
4.2
4.3
5.1
5.2
5.3
5.4
6.1
and standardized
21.
2
a i = ai/~=l 0i ' used for
=
~
8 . . . . . . . . . . . . . .
........
(a~), and standardized
1.
used for ~
a i = ai!~=l a~1.
17
Values of eigenvalues
eigenvalues,
3.3
(a.),
+
eigenvalues,
3.2
2
Values of eigenvalues
~
+
2
= 4, v = 8
17
2
Values of the error variance,
,used to generate
the random error e.. , for each choice of
parameters used
.1.J.•. . • . . . • . . . . .
22
?
Mean value of the statistic m~ for different
interaction matrices A (v x '9..) , and different
ratios r
...•...
. •...
Estimates of the upper bound of
from the linear regression of
where possible . . . .
26
2
m , b(A), obtained
m2 on (l-r),
27
2
Variance of the statistic m
(multiplied by 100)
for the different interaction matrices A (v x 9..),
and for different ratios r . . . . . . . . •
35
2
Mean values of m calculated using the first p
eigenvalues of the estimated interaction covariance
matrix, AA, for the different interaction
matrices A, and different ratios r . . . .
54
2
Variances of m calculated using the first p
eigenvalues of the estimated interaction covariance
matrix, AA, for the different interaction
matrices A and different ratios r . . . . . . . . .
55
Values of the 95th percentiles of the distribution of
m2 (p), p = 1,2,3, and k, for the different A
matrices and different ratios r . . . . . . . . .
56
Values of dI!(cr2vi) for the different interaction
matrices A and different ratios r .
.....
58
2
Mean values of 5i (for those values calculated) for
the different interaction matrices A, and
different ratios r . . . . . . . . . . . . . .
63
vi
LIST OF TABLES (continued)
Page
6.2
Mean values of
interaction,
1- and v
.
S2
I in the case where there is no
A = 0, for different choices of
.....
66
. .........
....
6.3
Values of the variance of S2 in the null case,
for different choices of
and v
.
66
6.4
Variances of S~ (for those values calculated
for the diffe~ent interaction matrices A, and for
different ratios r . . • . . . . • . . . • .
72
i
vii
LIST OF FIGURES
Page
3.1
3.2
4.1
Configuration of the location points given by the
columns of the v x t interaction matrix A,
when t = 8, v = 8 and A A has at most three
non-zero eigenvalues, 01' 02 and 0 3 , o~ = 0i/21Z . . . . •
Configuration of the four location points given
by the columns of the v x ~ interaction matrix A,
when t = 4, v = 8 and A A has at most three
non-zero eigenvalues, 01' 02' and 03' o~ = 0i/2
Th e
8
19
21
. . m2 aga~nst
.
1 - r, f or eac h 0 f t h e
stat~st~c
x 8 interaction matrices A
• • ••
30
4.2
Th e
. . m2
.
1 - r, f or eacn
. 0 fh
aga~nst
t e
8 x 8 interaction matrices A
.....
32
4.3
2
The variance of m ex 100) against the ration (r),
for the 8 x 4 and 8 x 8 interaction matrices A
37
4.4
The empirical conditional distribution of the
statistic m2 , shown as box plots, conditional
stat~st~c
\,m
2 (\'m
2
on the value of r = Li=l Oil Li=l 0i + (2-1) (v-1)cr
2}
Each rectangular box axtends from the first to the
third quartiles of the distribution of m2 for
that value of r. The middle horizontal line is
the median; the pair of horizontal lines above
the box represent the 95th and 97.5th percentiles
and the pair of horizontal lines below the box
represent the 5th and 2.5th percentiles. The (.)
represents the extreme values in the sample of
2000
38
4.4a
v = 8, 2., = 8 and AA~ -has three equal
non-zero roots,
= o~ = 8~ > 0
39
4.4b
v = 8, 2., = 8 and AA~ has two large equal
roots and one small,
= o~ = l6o~ > 0
40
4.4c
v = 8, = 8 and AA~ has one large and two
small roots, 02 = l68~ > 0
1
4.4d
v = 8, = 8 and AA; has two almost equal
non-zero roots, 98 2 = 160 22 > 0
1
or
or
..
41
.
42
viii
LIST OF FIGURES (Continued)
Page
v = 8, t = 8 and
root, 0'21 > 0
4.4e
4.4f
4.4g
4.4h
6.1
.
AA"
····
43
v = 8, t = 4 and AA" has three equal
non-zero roots, 15 12 = 15 22 = c~ > 0 • •
.····
44
v - 8, t = 4 and AA" has two large zqual
roots and one small, 15 12 = 15 22 = 160 3 > 0
.····
45
v
= 8,
t = 4 and
small roots,
AA"
has one
1ar~e
cr = 16c~ = 16C~
and two
46
>°0
eigenvalue of
(standardized by 0'2) ,
v At A"v
(1.-1) (v-1) S2 against t - 2 for the case
The~largest
/.
d12 /
2
=
1
where there is no interaction
different choices of v
.
6.2
has only one non-zero
.·.··
A
= 0,
·.··
and for
The largest eigenvalues of AA
(standardized) ,
against the ratio r, for each of the 8 x 8 and
8 x 4 interaction matrices A ·
. .
··.
····
67
·· ·
69
2
51'
.
1.
INTRODUCTION
Interaction effects, in general, have been studied to get a better
understanding of the model representing the data.
For example,
Mandel (1971), in order to simplify the model, partitions the interaction into a significant and a non-significant part, according to the
significance or non-significance of the eigenvalues of the matrix
where
d..
~J
D~D
are the interaction effects, and (') denotes the transpose.
In particular, genotype by environment interactions have been studied
because of their importance in developing improved varieties and in
variety testing programs.
Knowledge of the magnitude and pattern of the
genotype by environment interaction has become essential in helping the
plant breeder reach many decisions concerning his breeding programs.
For
a~ample,
Eberhart and Russell (1966) reduced genotype by environment
interaction by selecting stable genotypes, those that interact less with
the environment, and proposed a model to provide the criteria necessary
to rank cultivars for stability.
Abou-El-Fittouh, et ale (1969) defined
cultivar testing regions for cotton by using as a criterion the reduction in the within region genotype by location (environment) interaction.
By reducing the within region interaction, the precision for comparing
cultivars was increased and relative difference in performance of
varieties will be more stable over the environment within the region.
While this particular application provided the motivation for this
research and the language is in terms of "cultivars" and "locations",
the formulation of the problem
and the results obtained are generally
applicable to any interaction analysis.
Z
The
g~notype
by environment interaction can be controlled by classi-
fying locations according to the similarity of their interactions with
a fixed set of cultivars (genotypes).
Such a procedure would result in
dividing a set of locations into not necessarily adjacent regions) such
that the interaction of cultivars and locations within each region would
be small.
Any classification of locations based on cultivar by location
interaction will depend to some extent on the set of cultivars used.
Therefore, to do a classification of locations based on the cultivar by
location interaction which would be useful, the set of cultivars chosen
must be a representative set of the cultivars to be used in the future;
that is, the cultivar effects either must be considered to be random
effects or the classification must be considered to be conditional on
the set of cultivars used.
Different methods have been used to measure the similarity of locations.
Let each location be represented by a vector
estimated cultivar by location interaction effects,
. ", ~ ,), i = 1, ... , t,
v~
where
t
(v x 1)
t~
-~
of
A
/'.
= (aS l " aS Z"
~
~
is the number of locations, and
is the m.nnber of cuI tivars, and Ie teach "true" loca tion be represen ted
by a vector of the parametric cultivar by location interaction effects,
~i
= (aSli,aS Zi ' ... , aS vi ) , i=1,2, ... , i .
cultivars being used, the
ith location.
as. ,
~J
Conditional on the set of.
are fixed interaction effects for the
One measure of similarity is the set of Euclidean dis-
tances between all pairs of locations
(t,).
~
Another measure of
similarity used is the product moment correlation coefficient between
2.
~
and
Z..
J
The classification is then done by applying a clustering
algorithm on the similarity measure.
v
~
3
Although classification of locations and of cultivars has been done
in this manner, little is known about the reliability of the classification.
That is, little is known about the relation of this classification
to the "best" classification, where the "best" classification is defined
to be that which is obtained when the same clustering algorithm is
applied to the true interaction effects.
The problem of how close the
classification of locations, based on the estimated location vectors, is
to the "best" classification, is equivalent to the problem of determining
how similar the configuration of the estimated location points,
to the configuration of the true location points,
A..
-~
!L • ,
-~
is
The two problems
are equivalent because any consistent clustering procedure depends only
on the relative spatial arrangement of both sets of points, and not on
the absolute distances between the points of each set.
Therefore, the
impact of random error in any clustering procedure can be studied by
analyzing the agreement between the configurations of the two sets of
points.
The purpose of this thesis is to measure the repeatability of this
spatial configuration, where the location points are defined by the vectors of location by cultivar interaction effects, for various levels of
random error, sizes of interaction-matrices, and true configurations of
location points.
To do this, a Monte Carlo experiment was carried out
using 2000 independent sets of random errors to generate the estimated
interaction effects for each parametric combination.
A closely related problem that will be discussed is the relative
gain or loss of precision in the classification when the
of the space of
estL~ated
dL~ensionality
location points is reduced to include only the
more important. dimensions as reflected in the eigenvalues of
D~D
where
4
d
ij
=
~
aS ij . A reduction in the dimensionality of the space of estimated
location points according to the importance of the eigenvalues, by
eliminating the smallest eigenvalues, corresponds to a partitioning of
the interaction effects (Mandel (1971)).
If, in fact, certain components
of the interaction effects which are essentially random error can be
eliminated then a gain in precision might be expected by this reduction
in dimensionality of the space.
Since the eigenvalues were used as the basis for reducing the dimensionality of the space of the estimated location points, the reliability
of the estimation of the eigenvalues is basic to the reliability of the
classification using less than full dimension.
Because of this, a small
section was dedicated to the discussion of some distributional properties
of the eigenvalues of the singular matrix
D~D
for different sizes of
the matrices, different true interaction effects and different choices of
random errors.
~
5
2.
LITERATL~
REVIEW
Interactions of the cultivars with the environments in which they
are grown are of major importance to the plant breeder.
Because of this,
different methods have been proposed to evaluate the cultivar performance
in several locations.
Plaisted and Peterson (1959) suggested that the portion of the cultivar by location variance component contributed by a single cultivar
could be used as a measure of cultivar dependability.
Finlay and
Wilkinson (1963) computed a linear regression of yield on the mean yield
of all cultivars for each location and season, for each cultivar.
Eberhart and Russell (1966), based on Finlay and Wilkinson (1963), used
the regression of each cultivar on an environmental index, and a function
of the squared deviations from this regression to provide estimates of
cultivar stability.
The environmental index was defined as the mean of
all cultivars at the jth environment minus the grand mean.
Okuno,
~
al.
(1971) proposed two other methods for measuring the performance level of
the environment.
The first is a version of Finlay and Wilkinson's
method, which uses one or more independent measures of the environments
in place of the average performance of a large number of cultivars for
each location and season.
The second method is to extract one or more
hypothetical factors for evaluating the responses of cultivars to
different environments by applying principal components analysis to the
elements of genotype by environment interaction.
The classification of environments (locations) into more or less
hanogeneous groups to control the within group genotype by environment
interaction is another procedure of interest used in cultivar testing.
6
Abou-El-Fittouh, !! al. (1969) used cluster analysis as a tool for
classifying locations in order to minimize the within cluster genotype
by environment interaction.
cotton was used.
Data for lint yield per hectare in upland
Each location was represented by a vector (v x 1) of
:L ~
estimated interaction effects,
-~
~
/'.
A
= (a S1 . ,a ~2 . , •.• , as
~
~
.),
v~
where, v
is
Two similarity measures were computed, the
the number of cultivars.
k
d . = [:L~~./V]2 , which was found to be the more
-~-~
iJ
efficient for the, purpose of defining regions when the criterion is the
distance coefficient,
minimization of within region genotype by environment interaction, and
the product-moment correlation coefficient,
= corr(l.,I.).
-~ -J
r ..
~J
These
~NO
measures of similarity were computed for all pairs of locations,
1 ..
Using both sets of similarity measures they proceeded to do the
-~
classification using a clustering algorithm.
modifications in the
curr~~tly
The results suggested some
recognized zones of adaptation for cotton.
Most methods of classification operate on a
ties
be~Neen
all pairs of objects.
Fitting a dendogram to a
given set of distances might be regarded as
f(d .. , d~.),
:neasure and
l.J
+
d ..
J.J
J.J
·..;here
d..
optimizi~g
function have been suggested.
is any C:uclidean sir:J.ilarity
J.J
Several possible forms of this
Gower (1971) discussed some preliminary
'Nork. on one of these criteria, a statistic
?
R-
':vhicn compares t";,;o
different sets of distances Nithout using the dependent
measures of distance.
Y of
n
The vertices of
some goodness of
is apparent Euclidean distance read from the resulting
dendogram (Gower and Banfield, 1975).
tion
of the similari-
Then a criterion is used to fit a
dendogram (Clifford and Stephenson, 1975) .
fit criterion,
matrL~
?e discussed the problem of fitting a coniigura-
points, to another configuration
A, Pi'
pai~vise
are fixed, while chose of
A,
also of
Y, Qt' are
n
points.
7
·e
allowed to translate, rotate, and dilate to cope with possible changes of
scale between the two configurations.
2
R ,
The sum of squares,
of the distances between corresponding
vertices is minimized after the centroids of both sets of points are made
to coincide.
Assuming
dent centroids,
and
H
2
R
P
and
Q have been translated to have coinci-
= tr(P-oQH)(~-oQH)~,
is a rotation matrix.
where
0
is a dilation factor
Minimization with respect to
0
and
H,
gives
where
H
= ZW~,
vectors of
Z
Q~PP~Q,
is the
Z;Z = ZZA
= I,
orthonormalized.eigenvectors of
and
0
=
tr
P~QH/tr QQ~.
matrL~
txt
and
P~QQ;P,
of the orthonormalized eigen-
W is the
WAW
txt
= \~~ = I,
After substitution for
matrix of the
H~
= BRA
=
I
0,
Gower and Banfield (1975) studied the behavior of this criterion, with
six other criteria, when dendograms are computed using the single
linkage algorithm (Gower and Ross, 1969).
They examined the distribu-
tions of the seven criteria when fitting single linkage clusters to
n
samples drawn from a spherical multinormal distribution in a varying
number of dimensions
v.
?
The
.
R-min
Banfield (1975) differed from the
criterion used by Gower and
2
R min
discussed by Gower (1971) in
that only rotation, not dilation, was permitted.
and Banfield (1975),
2
R min
defined as in Gower (1971).
=
Therefore, in Gower
tr PP~ - 2tr P~QH + tr QQ~
where
H is
For this statistic a regression through the
origin, weighted by the inverse of the statistic's variance, gave
8
?
meaneR-min)
= 1.022
2
mean (R min)
= nv.
nv,
which was consistent with the simple relation
The conclusions found by Gower and Banfield (1975) were that
2
R min
although
is not easily computed, its mean and variance are easily
2 .
R
" Q1~n
interpolated from a table of means and variances of
v.
nand
2 .
R m~n
Also
the other criteria.
is not a "function of dependent measures as are
All criteria were found to approach normality as
2
R min
increased., The statistic
v
and
for different
n
seems best if metric depen-
dencies are to be avoided, and it also has good distributional properties.
Mandel (1970) developed some theorems on the eigenvalues of the
interaction covariance
is an
interac tion
m x n
with mean zero, and
ma tri..'{,
cov(vec
D)
where
= B
~J
?
0
T..;here
Acr-
Q1atr~~
of rank
r,
and
then the non-zero eigenvalues of
JlatrL'{
TT
r
T..;here
J
(t .. )
=
dependently d.istributed variates,
2
is a mn
x 1
matrix of rank
,-i ~ ,
~,;here
A
is
s,
of :1or:nally and in-
Jlatr~<
having- zero
~ean
and cornmon
a ,..Q
--
So he proved that the non-zero eigenvalues of
che same' as those of
s
D
one under the ocher,
D
r x s
-..;
variance
= D
are the same as those of the
DD~
is a
~J
vee
is an n x n
B
~J
is normally distributed
d ..
vector formed by stacking the columns of
an m x m
Cd .. )
one of which states that if
matr~~,
1.S
distriouted as a ;-ashart '..;ich
degrees of freedom,
Later,
~3
.. , i
1J
of
i
(1971) did a partitioning of
~ndel
= 1, ... ,'1,
and
= 1, .. . ,1,
J~
j,~;3iJ'
=
-
I~:1
;;0...
-
3.,
:\,.
are the.
is the matrix of true interaction
J::<.
is the eigenvec tor
to
and
..J"'"
L, .. _,5,
(~
~.:::.
:L3. ,),
(v;v)
~ =
v ..
effects,
1.:'<'
?
3;,
where
f~nctions
:.1.,
~J~
(u .,)
interaction effects,
i:1to the sum of multiplicative
non-zero eigenvalues of
~J
t~e
is the eigenvector of
:.<.
0
f
:!M ~
~:1
corresponding
corresponding to
4It
9
Only a few multiplicative terms of the form
e
v.u.
J
J-
(generally one or
two) are retained in the model, the remaining terms are pooled together
and regarded as
a~perimental
error.
The same partitioning is done on the
~ij = L~:l 8k uik vjk · He had shown in
estimated interaction effects
a previous paper, Mandel (1969), that there is a partitioning of the sum
of squares of interaction of the usual analysis of variance that
parallels the partitioning of the interaction effects.
He developed an
empirical criterion by which a decision could be made as to where the
partitioning should be ended.
For the sum of all terms involving
sum of squares and a "degrees of freedom" were obtained.
sum of squares associated with the kth non-zero root,
~2
be
\),
K.
ok
.
6,K.
a
The interaction
6"K.
was shOwLl to
The clue to an appropriate definition of degrees of freedom,
, was that in the absence of true systematic interaction effects, the
1, .... ,s,
quotients
Therefore, if
error.
(J
2
.,?
=~
freedom,
')
2
occurs in the real model, then
for each
= ~(~2
::. K. /c 2)
j~
k
ex~erimental
is the e."Cperimental error variance, and if
none of the terms
E(e;/v,)
K.
i:(.
should be estimates of only
I
k,
providing a definition for the degrees of
To obtain estimates of
.
a
Carlo
~onte
experiment was carried out using 625 independent sees of identically and
independently distributed (iid)
of
v
and
L
and
v,K.
empirical estimates of
no~al
was calculated as
v,
f:.
and
obse~Ting
partitioned interaction relative to
can be
~ade
(0,1) deviates for each combination
on how far the
2
?ar:itioni~g
~?
mean(e;).
i:(.
Using
the magnitude of
~1andel' s
~
-3
2I
1
f:.
,/
~}..
l<.
in the
the error variance, a judgement
?rocess should be carried.
10
3•
3.1.
METHODS AND MATERIALS
Description of Interaction Effects -- Model and Assumptions
To provide the interaction effects needed to accomplish the objec-
tives described in the introduction, a two-way linear model with interaction was adopted;
= 1, ... ,
i
where
a.
a. S
is the ith cultivar effect,
i
is the jth location effect,
S.
J
is the interaction effect of the ith cultivar and the jth location,
ij
and
v, j = 1, ... , fL, k.= 1, ... , n,
e, 'k.
are independently and identically distributed (iid) normal
l.J-
with zero mean and variance
. 2
cr
The
e
conditional on the set of
a.S , , ,
l.J
cultivars being used, are regarded as fixed effects and
..J
as
+ e
/'
The estimated interaction effects,
>a.!3 ,. = O.
;
....
" ..J
a.:3 . , ,
1.J
were calculated
A
a. i3 .. =
l.J
)
V
J
-
ij .
where
+
. . i .. - y . J. .
'1
(.)
=
Y
+ (e ..
!:LS, ,
l.J
l.J .
represents the average taken over that sub-
scr~pt.
Let
A
=
(a. ,)
i. ~. , . a ij =-'6
- ij
a
k x 1
be the
l.J
and therefore
vector of ones and
columns, a. ,
-'..J
v x
0,
-;\.
~
matrL~
\ ~',
. ='7
:i.
is a
action effects.
\'
.~-,
-.v
k x 1
= -'I
0 ,
",,;here
1.5
vector of zeroes.
and
v x 1
The
vector of true inter-
Therefore, the s?atial configuration of the
location points is contained in the A
i
!"\
= '"
J'
-.v
of the matrix A are regarded as the coordinates of the
true location points as defined by the
locations
of true interaction effects,
j
is gi"en by
matrL~.
(a, - a.)
-1.
-J
The distance
~(a.
-1.
L
true
oe~Neen
',.;hich is equal
11
I~=l
to
2
°k(T ik
A~A
values of
-
L
jk
and
)
(T
A;A
eigenvectors of
2
2
where
lk , .• "
ok' k = 1, .. . ,m t
are the non-zero eigen-
= 1, ... ,m,
are the orthononnalized
, k
T x.k)
corresponding to the non-zero roots.
There are,
of course, an infinite number of possible parametric spatial configurations as defined by the A matrix which might be considered.
The specific
configurations that were considered in this study are described in
Section 3.4.
(a.. )
A
A=
Let
be the
~J
effects, Le.,
a
&6'..
= as ~J
..
~J
=
ij
A;~v =
21 ' A~Q. = ~v'
where
I
is a
k
k x k
E = ( e.. ),
and
e ..
•
~J
~J
v x 1
.
matrix of estimated interaction
+ e ~J
.. - e.
•
~.
identity matrix,
is
iid
A
J
is a
k
and
k x k
matrix of ones,
2
2
N(O,cr In = cr ) .
e
SS(aS) , which
A gives
A
A
A
=
is equal to
UDV;,
AA;
tr
D2
where
diagonal matrix containing the non-zero eigenvalues,
2
d.,
is a
of
~
A
A;A,
U is a
AA;
corresponding to the non-zero eigenvalues,
a
+ e
can also be expressed in matrix notation as
A singular value decomposition of
k x k
•j .
Therefore, in matrix notation
The interaction sum of squares,
L~=l L~=l ~i~
- e
•
v x k
matrix of the orthonormalized eigenvectors of
A~A
matrix of the eigenvectors of
1 x k
V~V = I.
zero eigenvalues,
Then
ss(aS)
=
U~U = I,
and
V
is
corresponding to the nontr(A;A) = tr(VDU~UDV~)
tr D2 = ,k
d2
L.i=l i
When
as assumed herein, ss(aS)/cr
with
A
=
(t-l) (v-I)
'i'
~ L'
A
are fixed effects and
.
~,J
?
as ~ . I cr
~J
2
as ..
~J
is normally distributed,
is distributed as a non-central Chi-square
degrees of freedom and non-centrality parameter
2
; Le.,
ss(aS)/cr
2
=
is distributed as
12
XC;-l) (v-1) (A)
(Searle (1971)).
(X~2(A)) = 2v + SA
(see Searle (1971)),
v
k
L
E(
i=l
and variance
Since
d7/0 2) = (9.-1)(v-1) +
~
9.
v
2
L L ~S, ./0
2
~J
i=l j=l
m
= (t-1) (v-1) +
L
5=1
and
Var(
k
L
i=l
2
2
2(9.-1) (v-1) + 4
d,/o) =
~
t
v
L L
. 1 . 1
2
~S . . /0
~J
J=
~=
m
= 2(9.-1) (v-1) +
L
4
5.=1
Scaling the
d~
~
1/t0
by
2
gives
k
3.1
E(
L
i=l
2
5.)
~
=1
and
k
Var (L
3.2
i=l
2
5 .)
~
4
1!:
2 2
2(1-1)(v-1)0 + 4 L 8. 0
~
= -------~..;...=....::1'----::2:-"
r
U - 1) (v-1)02 +
=
m
where
r =
L
i=l
2(1-r 2)
(1-1) (v-I)
~
i=l
8:1
~I
J
2
13
er~or,
In all studies reported herein, the magnitude of
relative to true interaction is controlled by choice of
r, 0 < r
<
1 .
The relative spatial configuration is controlled by choice of the relaO••
tive magnitudes of the non-zero roots,
3.2
1.
Xeasuring the Agreement Between the True and the Estimated Location
Points.
Since the classification of locations depends only on the relative
spatial configuration of the estimated location points, a standardized
2
R.
version of the
statistic (Gower, 1971) was used to measure the
m1.n
.... ,
degree to which the estimated location points,
",...,.
a6V:I..),
reproduced the spatial configuration of true location points,
. . .,
squared distances
a8 .).
Vl.
be~~een
2
Rm1.n
.
is the minimum possible sum of
corresponding vertices of
~~o
configurations
X and Y after allOWing translation, rotation and dilation of one set of
points to "match" as closely as possible the other.
2
R
~ith
respect to
0
and
=
H,
:finimization of
tr (A -oAR) (A -c.\H) ~ .
a dilation scaler and rotation
~atr~<,
respectively, gives
?
R-
min
= ......
.....
AA~
-
= tr
A.i..~
- ... -..... (A
Str(A
'-
'~here
natr~(
HA#)
a..c\.~
C
)
~
~
?
... I .....
,
"-~
is a (1
AA~
x~)
diagonal
containing the positive square rooes of the eigenvalues of
14
(A ,1A) .. (A ,1~\),
orthonormalized eigenvec tors of
H = ZW~, H,IH
= HH,I
= I,
0
=
A H A"/tr A A,I
tr
For all results reported herein,
ing by
tr AA,I
ZZ·" = Z"'2 = I,
R2 .
m~n
and
(see Gow·er, 1971).
was standardized by divid-
to facilitate comparisons over different
A matrices.
Thus,
1 - m2
"Note that
.
t h e square
~s
0
f t h e corre l
'
b etween the e 1 emen t s
at~on
A
of
A and the elements of
are zero.
A H,
Also note that
since, in both cases, the centroids
tr AA,I
true location points defined by
standardization puts all
is the sum of squared distances of the
A from their centroid so that this
A matrices on a common basis in terms of the
dispersion of the true location points, or equivalently in terms of the
sums of squares of true interaction effects.
Thus,
2
is the stan-
m
dardized minimum sum of squares of the distances between corresponding
vertices,
a.
and
-~
A
dilation of
A or
a.,
that can be obtained by allowing rotation and
-~
2
1 - m
is the maximum correlation that can be ob-
A and the elements of
tained between the elements of
Clearly,
a~
coefficient, or
when
A
?
1 - m-
tion.
tr C
Therefore,
The lower bound of
tr AA~
= Ec.
~
2
2
=
O.
finite.
2
m , zero, is obtained
The other limit of one is
Tr
that is, if and only if
AHA,I
=0
(using the definition or
= 0
or that
m
(J
(tr AHP.,1)2/ tr AA' = 0;
tr AHA~ = 0, assuming
A,IA)
because it is the square of a correlation
i. e., when
reached if and only if
sition of
1,
2
0 < m <" 1.
equals
tr WCZ,I ZW·
~
c.
~
Ali.
=
0
for all
i
since
implies
H and the decompoc. > 0
~-
by defini-
~
=
1
i f and only i f
A,IA = 0,
which means that
15
every column of
A must be orthogonal to every column of
~
A.
But
prob(m 2=1) = probCA~A = 0)
1
1
= prob{A"CA+(I-y J)E(I- J)] = O}
i
since
e ..
is an absolutely continuous random variable and
J.) •
ii-xed numbers.
2
Therefore,
2
E(m ).
2
ECm ) < 1
appendix) that
- r,
was sought, and it was proven (see
'"here
r =
Lm
~ 2/
3.3
2
< 1 - 0
~
. A,AR
~egative,
.
~2 ,
i=lc i "" ( 1.-1) (v-l) C1
(Lm
i=loi
From this proof, the bias in the statistic
determined to be
J.J
is less than one '"ith probability one.
m
A better upper bound for
are
a ..
:n
2
2
, if any, was
(see appendi.x) .
Monte Carlo Study of Repeatability of Spatial Configuration
3.3.1
Definition of Parametric A Matrices.
ability ''''ith which
A,
To assess the repeat-
the matrix of estimated interaction effects,
A matrix of true
reproduces the spatial configuration defined by the
interaction effects, a Monte Carlo study was run using several ?ara-
A.
metric definitions of
The
A
~atrices
were defined in
singular value decomposition of
and
r"'''' = r.
t:te
~
1
te~s
of the
co~?onents
of :ne
\
=:
e~'
!be rows ofT O. x m) may be regarded as coordinates of
•
~oca t~ons
•
~n
~m
in a standardized coordinate system; standar-
..'"\, ,
dized in the sense that the centroid is zero, the
~~es
are orthogonal,
and the sum of squares of the values of the itn coordinate over
location
I
?oints equals one for all i,
non-zero eigenvalues of
........
\ .\
~
Choice of
~~~"=
~,
I,
the
m is the number Jf
the :n x
::l
diagonal ma tri:{
)•
16
containing the positive square roots of the non-zero eigenvalues of
determines the final scaling of each axis.
~T~
Thus,
AA~
completely
defines the spatial configuration, including all pairwise distances, of
the
Q,
m
R .
location points in
The matrix
one-to-one transformation carrying the
Rv , m < v.
r(z
x m)
serves as a
Z location points in
R
m
to
The spatial configuration, including all pairwise distances,
is unaffected since
r ~r
= I; '~.e.,
.
t h e 1 ocat~on
.
po~nts
.
~n
=
m
V
lie on
an m-dimensional hyperplane in Rv.
The following
1.
£
= 8,
v
r~
T~
A matrices,
= 8,
A
=
r~T~,
were used:
m < 3
1
-1
1
-1
Q
Q
0
= ~ 1
1
0
Q
-1
-1
Q
0
0
1
1
Q
0
-1
1
-1
1
-1
1
-1
1
1
1 -1
-1
1
1
-1
-11
1
1
1
1
-1
-1
-1
-lJ
1
=--
_:1
:.J
-1
212
6
= diag(ol,o2,o3)
for all choices of
8~ shown in Table 3.l.
~
See Figure 3.1 for the spatial relation of the eight locations.
e
17
Table 3.1.
Values of eigenvalues (01), and standardized
eigenvalues, ot = or/L~=l oi, used for ~ when
i = v = 8.
2
3
0+
1
+
O
2
+
0
3
128
128
.333
.333
.333
.02
.02
.02
.333
.333
.333
(iii)
128
128
8
.485
.485
.030
(iv)
128
8
8
.889
.056
.056
(v)
1.28
.72
0
.640
.360
0
(vi)
2
0
0
1.0
a
a
02
1
0
(i)
128
(ii)
Case
2.
2
2
0
= 4, v = 8, m2.,3.
i
r'"
is as shown in 1, above.
T" =
~r:
-1
1
-1
1
-1
-1
Ll
-1
-1
1
6.= diag(01,02,03) for all choices of
o~
shown in Table 3.2.
See Figure 3.2 for the spatial relation of the four locations.
Table 3.2.
?
Values of eigenvalues (oi), and standardized
eigenvalues,
= 0r/IT=l
used for D. when
v = 8.
ot
ot,
02
02
2
02
°3
0+
1
0+
2
0+
3
(i)
64
64
64
.333
.333
.333
(ii)
64
64
4
.485
.485
.030
(iii)
64
4
4
.889
.056
.056
Case
1
t
=
4,
Figure 3.1.
Configuration of the location points given by the
columns of the v x t
interaction matrix A, when
t = 8, v = 8 and A~A has at most three non-zero
eigenvalues, 01' 02' and 03' 5~ = 0i/ 2!2
19
(-0 1'* ,
A*2 ' 6*)
3,
-v
*
/12
~------.:..-------::
A
°2
*
*
°:\:
/i2 3)
: (-01' 0 02'
*
A-
1
I
~(~8~1~,_-_O":2:..,_5.:3:"*)
-r:----1 (° 1 ,
:I:
I
I
t
I
I
-(0, 0, 0)
I
I
I
I
I
8 3 /12
I
I
I
(\-v"
*
l'
:I:
I
I
:I:
\J
-~2'
-°3)
------;
(
------:\:.,
'I,
... ) ...
-° 1 ,
•
.,
'"?'
-0
3
...
"' ...
...
"",
:I:
{', 3 1 , :;2* '
Figure 3.2.
Configuration of the four location points given by
the columns of the v x ~ interaction matrL~ A,
when ~ = 4, v = 8 and A~A has at most three
non-zero eigenvalues, 01' 02' and 03' o~ = 0i /2 .
21
°2
* -0 2* , 15 3*) - - - - - - - - - . . . : . - - - - - : ::
( -0 l'
I
I
5 *)
,I
3
I
I
I
I
I
I
I
I
-(0,0,0) :
I
I
I
I
--
I
I
..
,
----- '.. - -,(,
------ ( -~*
~, -8 ) ""
-, -0 , O ,
,
2
1
J
,,
,,
,
.22
3~
A
= O.
~,v
4.
A
All possible combinations of
= 3,4,5,6,7,
= (I
-
8 J 8)
1
and 8
for
~
L and
v
for
were used.
= 8,
v
= 8.
There are seven equal non-
zero eigenvalues for a matrix of this form.
3.3.2.
effec ts.
Generation of
For each
A
A,
the matrix of estimated interaction
matrix, the
(v x 1.) ma tr ix
E = (e .. )
J.J .
erated using a randon number generator called Super Duper, developed at
McGill University (Dickey, 1978), so the
e .. , i
J.J •
=
l, ... ,v, j
were iid normal random variables with mean zero and variance
= 1, ... ,2.
~~
2,
,In =
e
.~
2
.
was chosen so that the
The variance of the random error,
ratio
m
L
i=l
5~J. + (1-1) (v-l)a 2
was equal to .001, 1., .25, .5, .75 in case 1, and
case 2.
r
is always zero in case 3, and
The values of
r
~s
.,
.5, .75 1:1.
~,
always .001 in case -.
are given in Table 3.3.
:'able 3.3.
2
Val '.2.es of the er-rcr variance, -- , :..lsed :0 generace the
••
, for each choice of ?aramecers used.
random error .::>-:'J
.
CA.se
rati.o
1-i
.001
.1
.25
.50
75
l-ii
7828.90
70.53
23.31
~
i
.~
I
.0....
2.61
1. 22
.011
.0037
.0012
.00041
I-iii
I-vi
1-\7 , ~li_
5382.37
48 .49
16.16
5.39
1. 80
2935.84
26.i5
3.82
2.94
.98
iO.78
.37
1 ,.,
.~-
.0"'1
.014
..,
2-i
2-ii.
.;;..-1..1.':'
32.29
7""' .43
56.57
18.86
6.29
2.10
30.36
_I
9.143 .05
. . u . .;.."J
1
"'"
., f"\
3~~J
":".J..-
.23
A
A,
The matrix of estimated effects,
= A + (Iv-vI
A
A
where
I
k
is the
k x k
J )E(I
v
was then calculated as
i-I1
J )
Q.
identity matrix and
J
is a
k
k x k
matrix
of ones.
All parametric combinations having the same
and
Q,
v
pared using the same set of 2000 random error matrices.
were com-
This was done
to increase the precision of the comparison of results from different
A matrices.
3.3.3.
A
Analysis of the
A matrices.
The reproducibility of the
A
A matrix by the
Ai matrix was measured by computing
A
previously, for each
Ai matrix,
= 1,2, ... ,2000,
i
2
m,
as defined
where 2000 is the
number of independent sets of random errors generated.
In addition, to
determine the loss or gain of precision in reproducing the true spatial
configuration one might realize by considering only the "dominane'
dimensions of
Ai'
the
matrL~
Ai
was decomposed according to the
singular value decomposition (SVD) , using the SVD decomposition of
IMSL (1975), i.e.,
A.
~
= U.D.V:
~
~
~
where
A
eig envec tors
V:V. = I,
=
UiUi
0
and
f
A:A.
~
Vi
I,
is the
is a
(k x k)
zero eigenvales of
A.A:
(d
~
~
~
Then the matrices
A. (p)
~
.
= U. (p)
~
v x k
matrix of
A
corresponding to the non-zero
(i x k)
matrix of the orthonormal
corresponding to the non-zero eigenvalues,
~
D:
~
is the
~
A.A:
~ ~
the orthonormalized eigenvectors of
eigenvalues,
U.
~
2
1i
A. (p)
~
were defined as
Di (p) . Vi (p)"', P
diagonal matrix of the
p
diagonal matrix containing the non-
=
1,2,3, and k,
largest roots of
A.A: ,
~
~
where
Di (p)
is the
and
U. (p)
and
~
-
24
A
Vi(p)
are the matrices of the eigenvectors of
A
A.A~
~
~
and
A
respectively, corresponding to the
For each
Ai(P), Ri(P)
= A~Ai(P)
calculated, and the diagonal
(1
x
square root of the eigenvalues of
SVD subroutine of
L~L
largest roots;
p
(1 x 1), p
matrix,
1)
Ri(.P)~Ri(P)
note
A.(k) =A ..
~ .
~
1,2,3, and k, was
~
C. (p)
~
of the
positive
were found, using the
(1975).
Then the statistic
2
m
was calculated as
?
m-:-).
~
is calculated for each of the
l.(p)
~
matrices so they can oe
compared to determine the gain or loss of precision in the fitting of
the true spatial configuration of location points when the dimensionality of the space of the estimated location points has been reduced.
The process described above was repeated 2000 times (2000 was taken
to be a sufficiently large number to give accepted
est~ates
of tne
~ean
and variance of the statistics), and the first four ::loments around the
origin were
obtai~ed for m7(p), p=1,2,3, and ~, and as wei: as for each
of· the .esti:nated eigenvalues',
Also, the histograms of
j=1,2, ... , 1 - 1,
were obtained.
calculated using a set of
FORT~~~
j :-1 , 2, . . . ,1 - 1.
p=1,2,3, and k,
and of
The histograms and the
d .. ,
~oments
subroutines for generating and
summarizing large Monte Carlo studies (Jickey, 19i8).
j:L
were
25
4.
REPRODUCIBILITY OF SPATIAL CONFIGulRATION 2
m •
DISTRIBUTIONAL PROPERTIES OF
4.1
Results.
The statistic
2
m
expresses the sum of squared distances of the
observed location points, represented by the columns of the
A,
"interaction matrix,
v x i
after rotation and dilation, from the corres-
ponding true location points, defined by the columns of
A,
as a pro-
portion of the sum of squared distances of the true location points from
their centroid.
2
Alternately,
1 - m
is the square of the product
moment correlation be~Neen the elements of
the rotation
In these senses,
matr~~.
2
m
.~,
and
A
where
is the
H
reflects the degree to which
an observed configuration of location points reproduces the original or
An assessment of the reproducibility of a spatial
"true" configuration.
configuration and, consequently, the degree to which a classification of
locations according to cultivar by location interaction
be success-
~ight
ful is obtained by studying the distributional properties of
:n
2
for
different parametric situations.
In Table 4.1 is presented the
2000 ~amp1as,
obse~Jed
~atri~es
for di££erenc A
:neans of
2
:n ,
(spatial configurations) and
.,
5 ~, i=l, ... ,::1,
,.here
are the non-zero eigenvalues of
limit of
r,
A= A
Table 4.1 that
.,
(0- goes to zero)
0 < r < 1.
and
m.
2
=
Clearly, at the upper
o.
!t is clear rram
m is a monotonically decreasing function toward zero
from seme upper linit,
tion defined by
.~~~,
based on
::1,
A
b(A)
<
1,
(Tabla 4.2).
which depends on the parametric situaAlso, rram section 3.2
2
m
is
..,
strictly less than one with probability one, and from the appendix
£(m-)
26
Table 4.1
v,~
2
Mean values of the statistic m for different interaction
matrices A (v x ~), and different ratios r.
Relative sizes of the
non-zero roots of A~A
.001
2
01
.597
= oZZ = 03Z
>
0
= 160 Z3
>
0
= 160 2Z = 160 32
>
2 _ oZ
01
Z
8,8
2
0
1
90
Z
1
2
0
1
0
e
8,4
2
1
2
8
1
?
81
Z
1
>
= 160 ZZ
>
0
0
.721
.723
0
.848
.Z
= °z =
0
2
3
>
b
.664
b
b
b
b
a
2
>
b
4,8
52 > 0
.668
4,4
.2 > 0
°1
.664 d
a
b
C
d
e
1
Standard error of mean
= 0.01.
Standard error of mean
=
O.OZ.
Standard error of mean
=
0.03.
Standard error of mean
= 0.05.
Results for
r =
.634
.65Z
b
b
b
b
b
.761
c
c
.694 c
a
.856
>
.596
.675
a
8
.S44
.652
2
= 82 = 160 3
Z
= 160 Z = 160 3Z
ratio (r)
.25
.5
.1
1.0 are exact.
C
b
.457
.490
b
.511b
.541
.630
.579
.S82
b
b
c
c
.Sal c
.300
.31S
.32Z
.354
.413
.413
b
b
b
b
b
c
a
.145
a
.150
.lS1
a
0.0
0.0
0.0
.1na 0.0
.20Z
.Z09
.401 c
.204
c
.Z03
.393
l.oe
.75
.219
.171
a
b
b
b
b
a
.176 C
0.0
0.0
0.0
0.0
27
2
Estimates of the upper bound of m , b(A), obtained from
the linear regression of m2 on (l-r), where possible.
Table 4.2.
Relative sizes of the
non-zero roots of AA"'
i=4, v=8
1.=8, v=4
.602
.759
.286
.657
.769
2
2
2
0 = 160 2 = 160 > 0
1
3
.698
.776
90 2 = 1602. > 0
1
2
.721
2
0 > 0
1
.843
2
2
01 = 02 =
...
2
= 0 > 0
7
2
02 = 0 = 0; > 0
1
2
0
2
= 02 =
1
2
160~
~alues for
>
0
r = 0.001.
i=8, v=8
.249
i=4,v=4
a
.856
a
.668
a
a
.664
a
e
28
is less than or equal to
1 - r,
In Figures 4,1 and 4,2 are shQwn a
-- and
m
nearly linear relationship between
1 - r,
for each of the
matrices.
In all cases, a linear regression through the origin of
on
gave a good representation of the change in
ing
(l-r)
r,
m
= b(A) . (l-r)
coefficient
bound of
b(A)
.
b(A)
= 0.001
r
A, (Table 4.2).
2
R
= 0.997
The smallest coefficient of
and the greatest difference
calculated from the regression and the values of
m
2
is functionally dependent on
AA~
the relative sizes of the non-zero roots of the
results obtained for
AA~
2
matrix.
were changed, while keeping their relative sizes constant.)
v,
and
the mean of
2
m
has only one non-zero eigenvalue and smallest when
equal eigenvalues.
2
m
The mean of
is largest when
and
for fixed
has
AA~
also depends on the number of non-zero roots.
roots of
\~en
from
1
=
8
v
to
appear to have greater impact on m
~
= 4,
keeping
little, from 0.848 to 0.856 (r
= 8 to v = 4,
keeping
from 0.848 to 0.668
(r
v
kept constant at
2
m
has only one non-zero eigenvalue and
AA~
1 = 4
~
v
but it
AA~,
constant,
decreased.
than do changes
i
was changed
m changed very
= 0.001), but when v was decreased from
=
8
= 0.001).
and
=8
- 1
~
As the number of non-zero
increased, the value of the mean of
Changes in
AA~
is not only
v
dependent on the relative sizes of the non-zero roots of
i.
(The
were not changed when the values of the roots
m
For a given value of
in
m
was 0.023.
From Table 4.1, the mean value of
of
m for chang-
For a given A matrix, the regression
determination obtained was
for
m
is the ordinary least squares estimate of the upper
m for that
between the
A
v
constant,
m changed appreciably
The same effect was seen when
was changed from 8 to 4 (;;:
for
2
was
Figure 4.1.
The statistic m . against
8 x 8 interaction matrices
1 - r,
A.
for each of the
30
.90
·80
a
3
.70
4
.SOL
.50
I
.40l
I
..30l
i
I
I
I
l-
Si
i
2.
9c~
I
3
.2cl!
I
i
> 0
?
..,
= 1602:
>
f"
."
..,
.,cl
4.
I
5
I
!
i
..
,2
°1
.75
·50
I
1-
r'
I
t"
",,-
,~
,2
..,
= l~ 2 = '"'3
"
,~
I
.25
.,
.
, .- "2
... = .l.OO2 = 16c~.J
·i
?
,2
51: = 2 = .LOO..,.J >
5~
.30
.0
> u
"
>
Q
."\
.)
Figure 4.2.
m
The statistic
against
8 x 4 interaction matrices
1 - r,
A.
for each of the
32
·90
·so
.70
I
2
3
.60
·~o
·40
i
.JClI
2C~
?
1-
5;
,.,
~-
'" .
3.
?
"1
,2
"\
~
?
'-'
= 16°3
= 1_00
=
... =
1
.le
·7'5
I-
r
·?C
,,- = 16°3:
?
~,.,
'"
,2
:J
2
=
?
,-
"\
~3
>
>
> "v
0
0
33
r = 0.001
changed from .856 to .664), compared to when
from 8 to 4 keeping
v = 4
constant
(i
for
r = 0.001
was changed
i
changed from
.668 to .664).
After trying, unsuccessfully, to obtain
2
(tr A~AH)2
E(m ) = 1 - E --'-~~=-
2
E(m )
it was decided to approximate
by
E(tr A~AH)2
l--~~"';;':;"~~--
tr
AA~
E(tr
M~)
This was accomplished for the case where
AA~
has only one non-zero
eigenvalue, because in this case
Therefore, when there is only one non-zero root,
2
E (m ) - 1 -
E(tr A~AA~A)
--,...;;.,.::-~--=--
= 1 -
tr AA~ E(tr .tA~)
tr(AA~)2 + (£_1)cr 2 tr AA~
tr AA~(tr AA' + (Z-l) (v-1)cr 2 )
---"-""-="'---'---"'--~-----=-
2
4
2
0 + 0 (£-1) cr
1
1
v - 2
= 1 - ----------- =
(l-r)
0i(oi
+ (£-1) (v-1)cr 2)
This approximation for the expected value of
observed values of
(Table 4.1,
and 0.666 for the cases
2
m
with the approximation,
(Z,v)
was very close to the
~ = 0.843, 0.856, and 0.664
m; compare
r = 0.001)
.
v - 1
=
2
E(m ) ~ 0.856, 0.856,
(8,S), (4,S), and (4,4),
It also can be proved that this approximation for the
respectively.
?
E(m-)
when
·34.
AA~ has only one non-zero root is an upper bound to E(~2)
A matrices.
other cases of
The variances of
ratios r,
m
2
are presented in Table 4.3 for the different
2
A matrices and Figure 4.3 represents
and for the various
2
2
r.
m , V(m ) , against
the plot of the variance of
m
for the
The variances of
reach a maximum value when the ratio is in the vicinity of 0.1 and
0.25, depending on the
A
matrL~.
This interval for which the largest
variance is obtained also is the interval where the skewness of the dis2
tribution of
changes from negative to positive (skewness
m
coefficients are not shown).
number of cultivars
1
were decreased, the variances of
v
V(m 2)
.
.
.Th e :Lncrease
:Ln
is more drastic when
is decreased; compare
the cases
As the number of locations
v
and the
i
:It
2
increased.
is decreased than when
2
V(m ) : 0.538, 1.136, 1.937, and 4.168
for
(v, 1) : (8,8), (8,4), (4,8), and (4,4), respec tively.
In Figure 4.4 a-h are given box plots of the :mpirical distributions of
:It
2
•
The rectangular box extends from the first to the third
quartile with the median
show~
by the
~iddle
horizontal line.
The
~NO
horizontal lines above the box represent the 95th, and 9i.5th ?ercentiles, respectively; the
~NO
horizontal lines
the 5th and 2.3th ?ercentiles, respectively.
a~~reme
obser7ations from the sample of 2000.
spaced according to the ratio
2
m
tribution of
r
finite boundaries on
m •
v
~atrices
are
f~~ed,
for which
very s i . .n ilar .
From these,
the box represent
!he (.) represents the
The box plots have been
i~
is clear that the dis-
changes from negative skeT..m ess ,.;hen
positive ske<Nness as
and
-...
~elow
the
A~A
r
is small, to
increases which is reflecting the impact o£ the
2
Also, as
a~pirical
r
increases toward one, and
distributions of
for those
A
have an equal number of non-zero roots become
35
2
Variance of the statistic m (multiplied by 100) for the
different interaction matrices A (v x 2), and for
different ratios r.
Table 4.3.
v,2
8,8
.001
.1
2
2
0 = 0 =
1
2
.954
1.011
.899
o~
>
a
2
2
2
0 = 0 = 160
1
2
3
>
2
2
2
0 = 160 = 160
1
2
3
2
2
90 = 160
1
2
2
0
1
8,4
r
Relative sizes of the
non-zero roots of A;A
>
a
>
a
a
2
2
0 = 0 =
1
2
°32
,>
a
2
2
8 = 0 =
2
1
160~
>
0
2
2
2
81 = 160 = 160
3
2
>
.5
.75
Loa
.950
.550
.140
0.0
1.045
1.011
.581
.151
0.0
.836
1.113
1.083
.600
.152
0.0
.807
1.001
1.010
.607
.166
0.0
.538
.891
1.029
.670
.190
0.0
1.437
1.612
1.429
.507
0.0
1. 713
1.954
1.442
.467
0.0
1.962
2.154
1.441
.446
0.0
0
.25
2
0 > 0
1
1.136
.507
4,8
2
8
1
4,L..
81:
?
>
0
1.937
.401
>
0
4.163
.959
aResu1ts for
r
=
1.0 are exact.
Figure 4.3.
.
_
2
The var~ance or m (x 100) against the ratio (r),
for the 8 x 4 and 8 x 8 interaction matrices A.
37
-
~~
,
.,- I
,.\
I
I
\
I
",
I
/
1.8
II
"
\'
I
"
1.
" ,_
2
I
I
1.7
"
I
\,
,
2.
\
,/
\
I
1.0
;'
I
fA
1..3
1.2
v (rTf)
j.j
(X 1001"
-
"
"
3
"
~,
....
"
,
""
\\
3.
2. = 4, v = 8
,)
,\
"Ill', \\
'<,.
,
1. = v = 8
~
.
"
1
"'I
I
\
\
\
\
\
\
\
I
I
I
II
"
i
~
,I
I
"
'.\
"
"
"
"1\
II
II
I·
~\__
,
3
-,,,,
\ \
,,
'I
" ,,
,,
,
\\
\
'~
Figure 4.4 a-h.
The empirical conditional distributions of the
statistic m2 , shown as box plots, conditional on
-m
? -m?
2
the value of r = ).
1 5-:/ (). 1 o~+(l-l) ('11-1)0 ).
-1.=
1.
-1.=
l.
Each rectangular box extends from the fi~st to the
third quartiles of the distribution of m2 for that
value of r. The middle horizontal line is the
median; the pair of horizontal lines above the box
represent the 95th and 97.5th percentiles and the
pair of horizontal lines below the box represent
the 5th and 2.5th percentiles. The C·) represents
the extreme values in the sample of 2000.
39
. :;0
•
•
--,.I
I
----I
I
•
I
I
I
I
I
I
I
m2
L
I
\\
I
I
.4°1 i
---I
I
\
--,.-
\
\\
I
I
I
I
I
I
\
I
I
\:
\'
I
\
I
I
I
'30f
--,.-
\
I
\
-'--
I
~
.2J I
r
,
I
--r--
.-.-
•
\
\r
iI
!
!
.IC~
•
~
~
\
•
0
.1
.25
·5
.75
\
'\
1.0
ratio (r)
Figure 4.4a.
v = 8, ~ = 8 and .~~~ has three equal non,7 > a .
zero roots, u~2l = 02? -- 03
•
.aor
.
i~
~
I
I
I
I
I
--..I
•
I
I
-
\
-
I
I
I
I
I
I
I
I
II
·3d
I
-,
J
•
Ij
i
I
..
.2-d
II
I
I
.: o~
I
I
i
i
. I)
iatlOIJ
Figure 4.6.0.
v=8,.~=a
roots and one
~rge eq~al
:. =
2
1::".'::.L
~,-,
... 3
>
...
J.
41
•
.90
•
j
.70
I
I
- rI I
I
I
I
I
I
I
I
.50
.
I
Z
m
I
-J...-
- rI I
...-..-
:\
.40
\
I
I
I
I
\:
\:
\
----- \
I
I
-I--
.30
\
\
I
I
I
I
I
I
-..--
·20
--'--
\
\
~ \i
I
I
1"-
I
•
.iel
'"."
\
..~ \;
i
,
I
:::r:=
\
.1
·25
·5
·75
"\
1·0
ratio(r)
Figure 4.4c.
and AA" has one large and
,;2 =
two small roots, "1
l6<5~ = l6o~ > O.
v
= 8,
L
=
8
42
·90
-,...I
·80
--l.I
I
I
I
I
I
I
\ :-,-
·50
·40
-
I
I
:\
::
I
~
\
I
\::
\
\1
\
I
I
\
\\
~
•
·30
:
:1"
~
,
·2C
'"
•
.10
\-;\-+-
,,~'
'""
\
<I
I
~\
----:'"
\\
~'", \ \
.
~\
\, \
'\\,\
'\
o
.\
·25
.5
.15
1.0
ratio (r)
Figure 4.4d.
v = 8, Z = 8 and AA~ has two almost
equal non-zero roots, 90i = 16o~ > O.
43
•
.-orI
I
.-orI
I
I
I
I
I
I
-
---I
I
I
I
I
I
I
I
·50
-
2
m
\ I
,I
I
I
......l-
•
\
·40
.
-------
.30
I
\r
\R
\:
\
\
.20
\
~\
I
=\'\
.10
\\, \
\~
'\
~
0
.25
.J
..5
.75
I·G
ratio(r)
Figure 4.4e.
= 8, 9- = 8 and AA"
zero root, 52 > O.
v
1
has only one non-
44
•
·90
--,....I
-...I
I
I
I
I
·80
I
I
-,-
-
I
I
I
I
I
I
I
I
I
I
I
I
I
.70
-.-I
I
I
I
·60
I
I
I
I
I
I
I
I
I
I
I
·50
I
I
I
I
.-...l.-
I
I
I
I
I
I
I
,
I
I
I
I
--....-
-
I
1
·30
I
I
1
1
I
I
I
•
2C
.
1
-r-I
r
--'--
I
I
I
I
I
.10l-
I
•
-or--'--
·5
.75
I
o
.1
.25
ratio (r)
Figure 4.4f.
v = 8, i = 4 and
nonzero roots,
0i
AA~
has three equal
= o~ = o~
>
o.
45
·90
---,
I
I
I
.so
I
I
I
I
I
I
I
-,-I
I
,
~
I
I
I
I
I
I
I
I
.10
I
I
I
I
I
-,--
.so
·50
I
m2
I
I
I
- - 'I - I
I
I
4C
I
I
-
I
r
1
\,
I
\\
,
I
I
·3C~
,
-"-\
,I
,I
I
\
I
I
'\'\,
\\
I
.2C~
\
:
j~
I
'. I
!.
I,
\il\
!\
I
I \
~\
!
I
i
:'\
,I
\
~
-,- ,,\
"Or'
I
I
'
\
"
\..\\,\
I --7"---~"5-----~-------":;::-----"'~';~.\~-
~
...
Figure
~
. 2
\ '.
..5 5 :
.7~
ratio (..)
,I I
.ig.
v = 8, ~ = 4
roo ts and one
equal
(\
u
.
46
--
.-,.I
·90
I
I
•
I
I
I
I
I
I
I
~o
.70
I
1
I
.-...I
I
I
1
I
I
I
I
..5C
m%
--rI
1
04cr
I
......0-
-
.3ol
,
I
--!..-
I
I
I!
---l
.<;-
i
i
.IC~
i
I
I
I
.~
.
.25
.1
1.0
.::;
ratio (r)
Figure 4.4h.
v
= 8,
t~NO
1 = 4 and
small roots,
_~~~
-.,Ji
=
has? one large
and
., -1.602: = 168 > O.
3
47
4.2
Discussion
With the characterization of the distribution of
2
m
given in
section 4.1, it is now possible to determine the effect of the various
parameters, within the limits of the parameter values used, on the
degree to which the configuration of the true points is reproduced by
the configuration of the observed points.
2
When the number of locations was decreased the mean of
is
m
changed very little (Table 4.1), but its variance increased (Table 4.3).
When the number of cultivars was decreased,
m decreased, but the
2
V(m )
increase in its variance was twice as big as the increase in
when
t
was changed by the same amount.
Therefore, the number of
cultivars used affects more the degree to which the true location
points are reproduced by the observed location points, than does the
number of locations.
From the box plots presented in Figure 4.4 a-h and from Tables
4.1 and 4.3, the distribution of
2
m
is seen to depend on the
A matrix, that is, the repeatability of the location points is seen to
depend on the configuration of the true points.
As the dimension of
(~
the space in which the true location points lie is increased
fL~ed),
and
v
the degree of agreement between both sets of points increases.
Although it is clear from Figure 4.4 a-h that the distribution of
2
m
depends on the configuration of the true location points, it is also
cl ear tha t as
Also, as
r
r
approaches one, the distributions become very similar.
approaches one, both the mean and variance of
2
m
decrease;
hence, the reproducibility of the true location points by the estimated
location points is improved.
Therefore, knowledge on the magnitude of
48
r
in a particular trial, used in conjunction with the conditional dis2
m
tributions of
would provide a basis for forming a judgment on how
"successful" any classification would be.
Recalling that
nitude of
2
CT e
CT
2
2
= CTf?/n,
Two approaches might be taken.
prior information, if available, on the mag-
and the magnitude of the interaction effects can be used
in the designing of the experiment to make the ratio,
desired to one (by choice of n).
r,
as close as
In the absence of prior information,
or the lack of opportunity to choose n,
the experimental results them-
selves provide information on the magnitude of
Information on the magnitude of
r
r.
in anyone experiment can be
obtained from the f-ratio,
f =
SS(aB)/(~-l)(v-l)
SS(ERROR) In
from the analysis of variance where
error.
and
n
This
f
n
is the degrees of freedom for
is distributed as a non-central
F
(~-l)(v-l)
with
degrees of freedom and non-centrality parameter
A = k\m
2Li=1 6~/CT2
~
I
From Searle (1971),
(v-I)
(~-1)f
(~-l)(v-l)
+ 2)"
is approximately distributed as a central
S
= [(1-1) (v-l)
+ 2)..J 2 /[(1-l) (v-l) + 4AJ
The non-centrality parameter can be
A=
~r(2-1)(v-l)/(1-r)
a central
F
,"ith
S
so that
F
and
a~pressed
f(l-r)
= U-l) (v-l)/(1-r 2)
with
n
degrees of freedom.
as a function of
r
as
is approximately distributed as
and
Using this approximation, the null hypothesis
n
H :
o
degrees of freedom.
r = r
o
against
49
r > r
can be tested or alternatively, an approximate confidence
o
interval estimator for
r
would be
F1_a.
P{F
where
a
< (l-r)f < F
1
a
1
and
respectively.
Ct.
2
1-a
}
f 2 < r < .1 -
1 - a
Z
1 - a2
are the lower and upper tail confidence coefficients,
Now, this confidence interval gives information on
2
which in turn gives information on
E(m );
r,
that is, on the degree to
which the estimated points reproduce the true points.
Thus, a decision
rule as to whether or not one should attempt classification of locations
according to the cu1tivar by location interaction could be based on the
magnitude of the i-ratio of the cu1tivar by location interaction
~ean
square to the error mean square once a decision is made as to what constitutes an acceptable value ot
crt
2
•
In the absence of direct information on the probability of "correct"
classification conditional on
2
TIl,
which in turn would oe dependent on
the clustering algorithm used and on a somewhat arbitrary definition of
what constitutes a "correct'l classification, one ::luse resort to gecmetrical considerations in forming a judgment on ehe desired
2
m.
The
L
d~ensional
being
"m
L.i=l
~agnitude
of
m ~ (v-I)
true location points are distriouted in a
space with the average squared distance to their centroid
The estimated location points are distributed acout
che same centroid :",ieh the average squared distance of a given estimated
point to its corresponding true poine being
::1
2
=
:n 2 I'm
,.
L.i=l
8
~
2/,
" ,t
1.
Thus,
.25, say, implies that the average squared distance of an
obse~led
point to its true point is one-fourth the average squared distance of the
50
true points to their centroid.
Letting the terrIL "standard distance"
mean the square root of the average squared distance, and scaling so that
standard distance to the centroid is the unit of measure, the standard
distance between an observed and its true point is
m
=;;!
maximum diameter of the configuration of true points is 2.
m2
= .25,
or
m
= .5,
and the
Thus,
would be sufficiently small to clearly differen-
tiate the more dominant axes in the location space with a high degree of
success.
Classification applied to situations where
2
m
is of the
order 0.25 or less would be expected to be successful and repeatable for
the dominant dimensions.
Joint consideration of the mean and variance of
=
=
Carlo studies f..Then
v
=
.5
and
When
v
= 8,
bility .95 if
if
r
=
2
m < .61
.75.
r
2.
m
2.
2
<
=4
.25
~he
r
>
.75
for
~
box ploes) for
The minimum value of
expec ted to be successful
1
=~
with approximate probability .95
=~
(m
= 0.75.
= 8 and
2.
r
r
needed so
2
r
r
= a.5
Thus, it would appear
.8 (linear inter?olation
>
<
.25)
~hat
che claSSification can Je
for the cases of
~men
2. = 8
for
L
=.:.
can be used
how much smaller.
r, or
to be conser;ative.
is greater than 3 , it is clear that the minimum v""alue of
smallar than
and
the number of locations is
4 and 8, linear interpolation can 'Je used to decer:line
the value of
~e
2
m < .33
and
would suffice.
were .75 and .80, respectively.
~et"'..Teen
f..Tith approximate proba-
;)
with approximate probability 0.95 if
thac requiring
for the Monte
the corresponding results were
with approximate probability .95 if
from
.
2
m <
shows
8
2
m
.
~
needed will
75, but from this swdy ic is not possible to deter:;tine
Sl
The one-sided confidence interlal of
P (r
1 -
>
f1
FCL) = 1 -
=
distribution with s
Since
s
depends on
able to determine
1
1 -
~
in
s
!
F
CL
FCL
is given by
is the critical value of the F?
(i-l)(v-l)/(l-r-) and
r,
CL
where
r
a point estimate of
n degrees of freedom.
r
must be used to be
A biased point estimate of
~
r, r,
It should be noted that the effect of the adjustment for
r
in
is to increase the apparent degrees of freedom for the numerator.
Consequently, conservative results are obtained by using
for
is given by
s.
C~-l)(v-l)
52
5.
EFFICACY OF USING REDUCED DL.'1ENSIONALITY TO
CHARACTERIZE SPATIAL CONFIGURATIONS
5.1
Results
2
m
The
statistic as defined in the previous sections used the
full dimensionality (k) of the estimation space with no consideration
given to the relative contributions of the different dimensions to the
estimated interaction sum of squares.
It is conceivable that reproduci-
bility of the true location configuration might be improved if some of
the minor dimensions of the estimated space which, presumably represent
only random error, are ignored.
This is equivalent to considering, for
classification purposes, only the significant components of the interaction matri."'< as detected by, .for example, Mandel's (1971) interaction
analysis.
To this end,
2
m (p)
was defined as the standardized sum of squared
distances between the estimated location points in the reduced space
(9f
p
dimensions,
p
= 1,
2, 3, and k),
and the true location points.
Any improvement in reproducing the configuration of true location points
will be reflected in a decrease in
m(p),
compared to
~u.
)
with a
concomitant decrease, or at least not a.drastic increase, in the
variance of
2
m (p).
Such a result would suggest that greater precision
in classification of locations on the basis of interactions would be
realized by using in the classification only the "dominant" dimensions
of the space of location points, which are represented by the columns of
the
interaction matrix.
If there is shown to be some advantage in
using reduced dimensionality, a second consideration
is to define a
53
reasonably consistent method of determining the "appropriate" dimensionality in anyone case.
2
m (p)
The mean values of
the minimum to occur when
AA~.
eigenvalues of
p
=3
p
corresponds to the number of dominant
Thus, for example, when
;Cp)
zero roots, the minimum
at
(;able 5.l)~ showed a clear tendency for
for both
r
AA~
had three equal non-
(for those values of
= 0.5
and
0.75.
p
studied) occurs
In a few cases where this
pattern does not seem to hold it could be argued that the error variance
is small enough that the secondary, but non-zero, roots carry some
information; for example, when
two small roots, and when
m(p)
occurs at
p
r
AA~
= 0.75
had one large and
equal to the total number of non-zero roots.
r = 0.5
and
AA ~
2
m ,
The
ranges from trivial, when
has three equal non-zero roo ts, to
approximately a 75% decrease in
zero eigenvalue.
AA~
had two almost equal roots, the minimum
potential gain, in terms of reduced
1. = v = 8,
and
m(p)
when
AA
has only one non-
It should also be noted that the potential loss in re-
ducing dimensionality too much is much greater than any potential gain;
compare
mel)
with
m(k;.
The variances of
of change, when the
2
m
did not appear to have any distinct pattern
dL~ensionalitywas changed
CTable 5.2).
A fairly
definite pattern did emerge, however, when the mean and the variances
were combined by using the value of the 95 th percentile (Table 5.3). For
the three cases studied with
Z
= 4,
there was no advantage to be
gained in reducing the dimensionality below three.
When
Z = 8,
however, there were distinct advantages of reduced dimensionality in
several cases.
These results suggest that knowledge of the relative
54
2
Mean values of m calculated using the first p eigenvalues of the estimated interaction covariance matrix,
AA, for the different interaction matrices A, and
different ratios r.
Table 5.1.
AA ..
v, !L
8,8
Relative sizes of the
non-zero roots of AA"
2
01
= 0 22 = 0 23
2
01
= 022 = 160 32
2
01
=
2
90
1
2
01
2
81
8,4
2
160 2
=
>
=
=
2
160
2
>
>
160
p
r
1
2
3
k
.5
.722
.476
.293
.300
.75
.688
.388
.107
.145
.5
.602
.274
.301
.315
.75
.547
.112
.131
.150
.5
.247
.304
.317
.322
.75
.155
.151
.141
.151
.5
.521
.244
.320
.354
.75
.424
.080
.144
.172
.5
.127
.308
.380
.413
.75
.042
.134
.179
.202
.5
.752
.545
.413
.75
.708
.437
.209
'"
.:)
.658
.414
.401
.75-
.579
.199
.204
.5
.377
.397
.393
.75
.255
.215
.203
0
>
2
3
0
>
0
0
0
2
02
=
2
83
>
0
,2
o1
=
2
0
1
= 160 22 = 168-3
52
2
2
= 168 3
>
?
0
>
0
.
e
55
Table 5.2.
v,i
Relative sizes of the
non-zero roots of AA~
2
0
1
8
8,8
2
81
.2
°1
8
2
2
= 82 = 83
2
1
98
8,4
2
of m calculated using the first p eigenthe estimated interaction covariance matrix,
the different interaction matrices A and
ratios r.
Variances
values of
AA~, for
different
2
2
= 82 = 160 3
=
2
1
>
168
2
2
=
>
2
168
3
,2
>
1 = °2 = 160;
,2
°1
2
3
.5
.00146
.00470
.01293
.00550
.75
.00029
.00061
.00147
.00140
.5
.00299
.01352
.00795
.00581
.75
.00039
.00117
.00173
.00151
.5
.00707
.00674
.00737
.00600
.75
.00063
.00122
.00214
.00152
.5
.00898
.01321
.00737
.00607
.75
.00257
.00110
.00144
.00166
.5
.00591
.00639
.00675
.00670
.75
.00057
.00121
.00168
.00190
.5
.00292
.00774
.01429
.75
.00080
.00209
.00507
.5
.00647
.02196
.01442
.75.
.00146
.00532
.00467
.5
.02786
.01547
.01441
.75
.00328
.00406
.00446
0
>
0
0
.2
1 = °2 = 523
8-
k
1
0
?
= 160-2
?
?
>
p
r
0
>
= 160~ = 160;
0
>
0
56
Table 5.3.
v,i
8,8
Values of the 95th percentile of the distribution of
p = 1,2,3, and k, for the different A matrices and
different ratios r.
Relative sizes of the
non-zero roots of AA"
2
01
= 022
2
0
1
2
0
2
=
=
2
2
0 = 160 2
1
2
90 1 =
2
0
1
0
8,4
=
>
2
0
3
>
2
160
3
=
2
m (p)
p
r
1
2
3
k
.5
.798
.604
.514
.435
.75
.717
.433
.176
.214
.5
. 705
.507
.463
.450
.75
.585
.175
.207
.220
.5
.402
.447
.466
.454
.75
.203
.215
.225
.222
.5
.699
.477
.474
.488
.75
.530
.142
.212
.244
.5
.273
.447
.523
.550
.75
.087
.195
.253
.279
.5
.859
.708
.620
.75
.764
.518
.336
.5
.812
.681
.611
.75
.652
.330
.325
.j
.709
.624
.608
.75
.310
.331
.327
0
>
168;
0
>
0
160~
0
,2
2
2
0
1 = 2 = °3
>
?
8- = 0 2 = 160 2
2
1
3
a
>
2
2
8 = 160 = 168;
2
1
0
>
0
e
57
magnitudes of the roots of
would provide a basis for determining
AA~
the appropriate dimensionality.
The dimensionalities suggested by the above were compared to those
obtained by applying Mandel's (1971) partitioning of the interaction
sum of squares,
SS(aS) ,
for the different cases of A matrices.
The
results obtained applying Mandel's (1971) partitioning to the average
results are given in Table 5.4.
The numbers in the table -are what
Mandel calls the f'mean squares"for -the partitioned interaction effects
?
?
di/ v i cr - ,
divided by the error variance, namely
Mandel's
'Nhere
i.:\~
estimate of f'degrees of freedom", is the ith eigenvalue of
A
=0
and
cr
2
=1
in Mandel (1971».
(these values of
partitioni~g
= 0.05)
(~
with
L('v-l), corresponding to
7 ..,.'
.-4- I"
.tithough
v.
1.
and
n degrees of
...
'... .: I
-/ •
o.J
~
n
=
2
1
in a randomized com-
is not di5tributed as an
=,
~he
Only the resultS :.:lr t::e cases
is greater than or equal to .5 were presented, because,
as 'Nas reported in section
I
,.,
~'''"'
a classification of environments
i::lporCan t.
Table 5.4
To determine the cut-off point, the critical
?artition .:latained ':vas acceptable.
r
i~
Since the error degrees of freedom were noc known
plece block design.
in which
The vertical lines
which was done; only the eigenvalues to the left
values of the F-distribution-
;vas taken to be
error.
rar~om
of the lines will be used.
freedom was used.
can be obtained from a table
~
Thus, the values should be approximately 1.0 if that
root is reflecting only
mark the
v.
,.;hen
there :,;ould 'Je littla
u~less
i~terest i~
doing
i:lteractions ,:.;ere relatively
Table 5 .f••
v,9-
Values of
ratios r.
2
2
d/
(0 "i)
Relative sizes of the
non-zero roots of AA"
2
2
2
0 = 0 = 0 > 0
1
2
3
2
2
2
° 1 = ° 2 = 16° 3 > 0
8,8
,2
2
2
1 = 16° 2 = 166 3 > 0
°
2
. 2
98 = 166
2
1
2
°1 > 0
>
for the different interaction matrices
A and different
i
r
1
2
.5
2.031
2.102
.75
3.968
4.591
.5
2.230
2.177
.75
4.763
5.334
.5
2.76fl
4
5
6
7
I 2.056
1.769
1.597
1.614
0.135
5.023
I 2.140
1. 835
1.808
1.673
I
1.628
1.496
1.425
1.434
1.335
I
2.011
1. 743
1.610
1.614
1.507
I 1.510
1. 396
1.356
1.300
1.343
1.269
.75
6.896 I 2.003
1.806
1.651
1.533
1.580
1.425
.5
2.340
2.091 I 1.532
1.414
.75
5.708
5.172
I 1.803
1.626
.5
3.001
I
.75
2.553
I 0.443
0
3
1.312
V1
00
i
L
e
e
e
e
e
Table 5.4.
v,l
Continued.
Relative sizes of the
non-zero roots of AA~
2
2
2
01 = 02 = 03
8,4
0
e
> ()
2
= 02 = 160 2
1
2
3
i
r
1
2
3
.5
11. 868
2.081
2.343
.75
3.360
4.365
6.158
.5
11. 981
2.071
1.824
.75
3.850
4.709
.5
2.261
I 1.604
1.528
.75
5.139.1 2 . 212
2.170
> ()
ci = 160; = 160;
I
4
5
6
7
2.677
Ul
'"
60
5.2
Discussion
2
m (p) ,
The mean, together with the variance of
2
the 95th percentile of the distribution of
mine the "best"
ing to
AA~
p
p,
.~~
When
t
=8
and
AA~ because both the mean and the variance of m2 el)
By going to one dimension in such cases,
reduced to about 1/3 to 1/5, depending on
variance was reduced only slightly when
r
used.
had only one non-zero root, it was advantageous to use only the
were the smallest.
of
were used to deter-
and the gain in precision to be realized by reduc,-
the number of eigenvalues of
first root of
r
m (p)
and the value of
= 0.75.
AA~,
= 0.75,
2
m (2)
r,
r
;(1) was
of the original
= 0.5,
m.
The
but by 2/3 when
Therefore, precision is gained by using only the first root
in this case.
When
AA~
had two almost equal eigenvalues and
there is a similar result; both the mean and variance of
are the smallest,
2m(2)
=m
.
Comparing the values of the 95th percentile of the distribution of
2
m (p), p
= 1,
2, 3, and k, the "best" p coincided, in general, with the
number of dominant roots of
AA~.
Therefore, in general, the reproduc-
tion of the spatial configuration is improved by reducing the dimension
of the estimated location space to the number of "dominant ff dimensions
of the true space when
Z = 8.
wnen
Z= 4
there seems to be no advan-
tage in reducing the dimension of the space.
Mandel's (1971) approach gave a partitioning of the interaction
effects which was similar to the above.
tfhi1e Mandel's (1971) partition-
ing was done on the average of 2000 runs, not on the individual trials,
the results suggest that this criterion for partitioning the interaction
sum of squares into a significant part, would be useful for gaining precision in the classification of locations by reducing the dimensionality
61
of the estimated location space.
How reliable the procedure is when
Mandel (1971) is applied to a single trial is a problem that will need
further investigation.
62
6.
ME&~S
FROM
6.1
AND
VARI_~~CES
OF
INTE~~CTION
v x i
ESTIMATED
EIG~~ALUES
MATRICES
COV.~L~~CE
Results
Since the eigenvalues were used as the basis for reducing the
dimensionality of the estimated location space, the reliability in the
estimation of the eigenvalues is basic to the decision as to how much
the dimensionality should be reduced.
Because of this, the means and
AA~
variances of the eigenvalues of
A matrices were
for different
studied.
AA
A~
AA~
The means of the eigenvalues of
?
S~
standardized by
E(tr
.~~),
estimated from the Monte Carlo study are presented in Table 6.1.
1.
The i-I"n eigenvalue
~
(normalized) of all
goes to zero, to a point
A, v
and
determi~ed
converges, as the ratio
r
only by the dimensions of the matr;x
d~1. give the sample values
These limiting values of
2..
~
AA~
the distribution of the eigenvalues of a
(v-i) x (v-i).
of
'hshart :lat=i:<
'J
TNieh
degrees of freedom and associated covariance mat::ix
(2,-1)
(:1ande1 (1970»).
. 2 I cr 2. ( TNnen
.
Therefore,
.J..
tables for c:he la=gest eigenvalues of the
(1971), Schuurmann, et a1.
--
= 0)
r
0.., I
~';ishart
1:,'"
were compa=ed to
matrix ( Cl er:m,
et
a1.
(l973)Y, and they ·,.;ere found to agree up to
decimals.
t:'"No
The 'Talues of
?
S1:
for the case of no interaction,
given in Table 6.2 for several
These values, multiplied by
?1otted against
?
.\
. l'J ( V-1)
( ~...
~~l
c~oices
(~-l)
('1-1),
(2,-2) (Figu::e 6.1).
,'
.!
- (iV-~i
• )
+.J..
on
of
:v-L:
L
and
v
;.lp
?
(2,-1) ('1-1) S:
J.
=
to
?
,..J
_ ..
L
_:~
a'" 0-
= «1 = 3.
2.
1-'oJ
~.;are
.1
A regression through the origin of
~
,
()
~r
A
~
• )
-.J..
,
r (A)
1
-~,
'/
(dA) - 1) -
-
= 0,
r
gave an appro:d..::J.at:'.on or the char.ge in
(.~-1)
ana•
.,
(v-1) 5i
a.s
e
e
e
')
Table 6.l.
Mean values 01' S~1. (for those values culcu1ated) fm:- the different interaction nmtrices
A, und different ratios r.
._---
-----_._--_.-----, . _ _ _ ,._, _ _ _ _ • • • • ' . . • ___ . _ . _ . , . _ _ • _ _ • _ _ _ _ _ ••• _____ n
v,JL
___ ' ,
. _ . , _ _ . __ • _ _ _ _ _ _ • ____ • _ _ _ . ' __ n
RelatIve sizes of the
non-zero roots of AA'-
_ _ _ _ _ _ _ , _•• _ _
r
8
S2
2
S3
. 161f,
.1615
.1611
.1660
.2027
2
2
1
8
2
4
8
2
5
8
2
6
S2
7
E8
2
i
_._-_.~----_._--_._--~.--_._._-._--_._------
U,8
222
6'1. = 6 j > 0
l\1 =
.t,311
.4))7
·/,237
.2615
.2619
.2633
.2748
.3001
a
1
.33))
. 3J33
.3333
.f,272
.f, )03
.f,f,2S
.2616
.2616
.2626
.00]
·]
222
6 = 6 = 166] > 0
1
2
.25
· ')
.75
]
a
.001
.1
6
22
. '1.
= 166 2 = 166 > 0
1
3
.t,273
.(lOl
•1
.25
.5
.75
.f,2:87
.0913
.0911
.0897
.0807
.04273
.03558
.02045
.01/,66
.01211
.00678
.00209
.0016j
.00093
1.0048
1.0045
1.0034
0
0
0
0
1.0000
.t1718
.508S
.28f,6
. ]l,87
.1614
.160U
.1561
.1310
.01U2
.09123
.09071
.08613
.06823
.03975
.04110
.03175
.01794
.01397
.01076
.00606
• 00196
• O~it 8
.00083
1.00M•
1.0040
1.0030
.f.Uf,8
.f,Uf,8
.030]
0
0
0
0
1.0000
.f,!. 7J
.f13]6
.f,711
.161A
.1592
.1l,70
.1127
.09125
.08912
.OU186
.061U4
.03766
.03805
.0289H
.(n708
. ()l:n8
.01008
.00593
.00191
.00140
.00079
1. 0033
1. 0027
1.0015
0
0
0
0
1.0000
.25
.5
.5902
.7')
.7]63
.2616
.2607
.2/191
.1975
.1310
ti.l
.Baa9
.05S')() .0556
.()129
.Of,88
0W
Tahle 6.1.
v_Ji
Continued.
HelatJve ::;lze::; of the
lion-zero root::; of AA~
r
.001
1()Ol > 0
.)
')
0"" ... 0I
l
2
8] > 0
')
e
I "-
2.
t\2
2
1M]
>
0
0
ES
2
i
0
.6055
.5981
.5121
.2903
.2921
.3023
.3111
.1108
.1131
.1290
.1695
1.0066
1.0039
1.0007
.9987
.33D
.3333
.3333
1.0000
.2903
.2906
.3009
.75
.6090
.6082
.60:W
.5867
.3 LtH
.1100
.1089
.1004
.073/
1.0092
1.0077
1.00 Lj7
1.0025
Ia
. L.HLI 8
.48L,8
.0303
1.0000
.1
t\
0
1
1
.25
.5
0
S2
.2616
.2601
.569L.
0
2
S6
.4270
./,360
.4850
.MI0
.8179
.25
.5
.75
ld
.067 L,
.0912
.0906
.0850
.OM5
.03/.L•
2
S5
.001
.'.271
.4308
.4/178
.1614
.1603
.1531
.1231
2
S4
.36
.1
8,11
3
.6L.
.25
.5
.75
1<'1
1
S2
la
. j
(';2 > 0
2
S2
.L.991
.56 L,9
.25
.5
.75
,l
%2
I
2
1
.2616
.2611
.2612
.2734
. 3138
.1
8_8
S
0
1.0000
.1716
.0871
e
e
(J\
.p-
-
e
Table 6.1.
Continued.
--_._---,--_._- .._
V.JL
..
_--._..
..
_---,-------------~._--_
Relatlve sizes of the
non-zero roots of AA-
8. II
e
2
.2"~ \(\)
0,1. "'" 1602
") 3 "» 0
_.---"--------------"--------
Sl
S2
2
S3
.1
.25
.50
.75
.6119
.6289
.6893
.7832
.2889
.2767
. 2330
.1607
.1092
.1033
. 08LI 1
.0597
1.0LDl
1.0089
1.0065
1.0032
1<1
.3889
.0556
.0556
1.0000
2
I"
2
52
4
52
5
52
6
52
7
1:S2
i
--------_.. _._-----------.,'-a",
I~xact
results.
0">
Vl
Figure 6.1.
The largest eigenvalue of
vAl A~
2
?
2
a ), di/a
against
=
2
(l-l)(v-l)Sl
case where there is no interaction
different choices of v.
(standardized by
~
- 2
for the
A
= 0,
and for
67
o
2
3
1- 2
4
5
6
Figure 6.2.
The largest eigenvalue of .~~ (standardized),
against the ratio, r, for each of the 8 x 8 and
8 x 4 interaction matrices A.
si,
69
·4
?;
= 1602
2.
901
3
",= '6<5-2 =
~l
0
>
) )
J..
..,
:'6o~
I
..;.
>
?
5
.1
,..
>
1
= 4,
1
= '7 =
~
- 11"-'.
•....J:
• ' •.i
0
= .3
v
,J .....
A
i
0;
-
.-
> :)
.)
.~
0
Table 6.4.
si
Variances of
(for tllose values calculated) for the different interaction
matrices A. and for different ratios r.
e
Table 6.4.
e
Continued.
e
72
Table 6.2.
1
values of
interaction, A
and v.
~!ean
si
in the case where there is no
= 0, for different choices of· L
}\\v
1
3
4
5
6
7
8
2
1
1
1
1
1
1
1
3
1
.9098
.8551
.8029
.7802
.7501
.7310
4
1
.8551
.7563
.6981
.6596
.6345
.6096
5
1
.8029
.6981
.6434
.5960
.5649
.5388
6
1
.7802
.6596
.5960
.5474
7
1
.7501
.6345
.5649
.4822
.4553
8
1
.7310
.6096
.5388
.4553
.4271
and
v
2
R
changed,
= .9998.
?
The regression equation obtained was
2.83(r(A)-1)( .081!v-Ll + .029(r(A)-1) + 1}
(l-l)(v-l)S~
1
+
(I v-L I
+ 1)
~
is the rank of
A.
The limiting results as
r
approaches one are the
values of the ?arametric eigenvalues of the
AA~
matr~~.
illustrated in Figure 6.2, where the'mean values of
dii=erent
si
are ,obtained '..;hen
and the smallest values are 00 tained
~"hen
This is
~2
;::1 ' for che
A matrices are plotted against the ratio
larges t values of
sta~dardized
r.
Clearly, the
A~ A has only one non-zero roo t,
AA ~
has
(2,-1)
equal 7.'1on-
zero roots, as is expected.
?
Table 6.3 presents the values of che variances of
(r
= 0).
As
1
and
tonically decreasing,
v
increased, the variances of
1, v
~
3,
as did the mean.
5;
51
?
when
A
=
were mono-
From a regression
0
73
through the origin of
2
22
(t-l) (v-l) v(Sl)
obtained ehae
2
R
=
.988.
The variance of
and values of
increased.
wnen
2.
[(1-2)(v-2)]
As
2
si
2.
and
v
=4
,
A matrices
The same effeces as in
decreased from 8 eo 4 ehe variances of
had three equal roots, two
AA~
~~o
small, ehe variances of
varied from 2.4 to 3.6 times as large as for
are fixed, the variances of
maximum in the region of
decrease to zero as
it was
for ehe different cases of
For ehe cases where
for
~
22.
~
= v(dl/cr
) = 5.26[(Z-2)(v-2)]
large and one small, or one large and
and
on
rare preseneed in Table 6.4.
Table 6.3 are seen .
si
2
2
2
(1-1) (v-l) VeSt)
r =.1
to
r = .25
increases eo 1.0.
s~1. appear to reach a
and then monotonically
It is also clear that
always has the largest variance.
Table 6.3.
«
. t.e
h nu 11 case,
Values of the variance of 5 12 l.n
for: different choices of 1. and v.
,.
2
3
4
5
0
7
8
'"
0
0
0
0
0
0
0
3
0
.41iO
.2.606
.1692
.1242
.1013
.0859
4
0
.2606
.1441
.0940
.0576
.0466
.0363
.0284i
.,
.
S
0
.1692
.0940
.0636
.QilQ
.0452
6
0
.1242
.0710
.0452
.0312
7
0
.1013
.0363
.0180
'"1
• 01_J_
8
0
.0859
.0576
.0466
.028i.
.0152
.0119
--
74
The last column of Table 6.1 shows the sum of the means of the
that is, the mean of
~-=. 1
2
S.~
From
over the 2000 runs.
f~rmu1a
2
S. ,
~
3.1 it
is known that the expectation of this quantity is one.
In the cases presented in Table 6.1, the observed means are slightly
greater than one, but not significantly so.
The fact that almost all
are greater than 1.0 reflects the fact that the same random error samples
were used for all cases having the same values of
6.2
and
i
v.
Discussion.
AA~
It is clear that the means of the eigenvalues of
relative sizes of the non-zero eigenvalues of
true interaction matrix, and within each case of
.,..2
( or r).
v
on th e error
var~ance
where
AA~,
A
In the case when
depend on the
A is the
matrix, it depends
r
approaches zero,
an approximate relation to characterize the mean and variance of
'Nas
0
b tained .
The importance of these limiting values of
di/a2
is
seen in papers like Mandel (1971), who uses these values to test the
hypothesis that
2
E(d.)/v.
.~.
~
= a2
wh~re
in the null case, and
then partitions the interaction according to the result of the test.
Only those
for which the hypothesis is rejected are kept.
When the
values of
v.
obtained from the Monte Carlo study were compared to the
values of
v.
presented in Mandel (1971), they agreed to one or two
~
~
decimal digits depending on
Z and
v.
As explained in section 5.1,
Mandel's partitioning can be used to reduce the dimensionality of the
space where the estimated location points lie.
75
The eigenvalues are used to determine the partitioning of the interaction effects, that is, to reduce the dimensionality of the estimated
location space, hence the precision with which they are calculated is
basic to the precision in the decision made in reducing the dimensionality.
Therefore, since the eigenvalues are estimated with greater pre-
cision when the ratio
r
is close to one, or equivalently when the ratio
of the interaction mean square to the error mean square is large, the
closer
r
is to one, the greater the precision in the decision made in
reducing the dimensionality of the estimated space.
interval estimate of
r
is given in section 4.2.)
(The confidence
76
7.
SUMMARY
fu~
CONCLUSIONS
Analysis of two-way interaction matrices is of interest for many
reasons.
In this particular study the motivating objective was to assess
the degree to which the true spatial configuration of locations, as
defined by the cultivar by location interaction effects, is reproduced by
the estimated interaction matrix.
conditons under
~hich
The ultimate purpose was to define
it would be reasonably consistent and successful
to classify locations
so as to minimize within cluster cultivar by
location interactions.
A Monte Carlo experiment was carried out to study the repeatability
of the configuration of the
(v x 1)
w~s
1
location points each represented by a
vector of estimated interaction errects.
2
measured by the statistic
(Gower (1971)),
~hich
m ,
This repeatability
a standardized version of
2.
R .
m~n
,
measures the degree to <..;hich the estimated inter-
action effects in any trial reproduces the crue, or ?arametric, spatial
coni igura tion .
distances of the
from the
This statistic,
obse~led
~orrespo~ding
2.
m
is the standardized sum of squared
location points, after rotation and dilation,
true points.
The rationale for allowing =otacion
and dilation in the assessment of repeatability Nas chat any consistenc
classification procedure would be dependent only on relative distances
be~;een
locations, not on the
absol~te
distances, nor on the orientation
of the configuration.
The
of
2.
m
t:'"..70
aspec ts of the problem 'Ni th po tential ::npac t on the ,ahavior
were the
~ature
of the true spatial configuration of the location
poincs and the relative inportance of cultivar by
co the
experi~ental
loca~ion
interaction
error in the estimation of the interaccion etrects.
The true spatial configurations
,
-.
•
~vere ·c.e!~nec
oy the matrix of true
1
77
interaction effects,
A(v x i).
The three dimensional figures used for
the configurations of the true location points were rectangular parallelepipeds.
The relative sizes of the sides of the rectangular paral-
lelepiped were controlled by choice of the eigenvalues of
and
2
2
01' 0Z'
AA~,
Such a definition allows for spatial configurations ranging
from all locations being "equally" spaced in three dimensions, i f
..,2 =u. . =, 2
. . , to
2 .~ncreasing degrees
u '
2
l
3
u
c;
lets of points as
0
f simiI arity of pairs or qua d rup-
The
decrease in size relative to
relative importance of the interaction effects to the error variance was
controlled by specification of the ratio
m
r
2 [mL c.2 + (i-I) (v-I) cr 2]
= L c./
i=l
where
cr
2
= cr e2 In
location means
~
i=l
~
is the variance of the estimates of the cultivar(y .. ).
~J
.
In any specific classification problem,
i
completely under the control of the researcher.
and
v
are known and
The ratio
r
is not
known a priori but is subject to some control by the researcher in the
choice of
n
and may be estimated from the data.
The true spatial con-
figuration, however, is completely beyond the control of the experimenter
and all information derives from the data.
The statistic
2
m
measured the degree to which the data reproduced
the true spatial configuration in our Monte Carlo studies but, in any
set of real data,
2
m
cannot be computed.
Therefore, since
:n
2
was not
uniformly small enough over the cases studied to provide a uniformly high
degree of success of classification, criteria must be developed for judging whether or not classification would be reliable.
78
2
The study of
showed that
m
tonically decreasing function of
b(A) < 1,
dependent on
A.
m
the mean of
r
was a mono-
t
toward zero from an upper limit,
In all cases of A matrices studied, a
linear regression through the origin of
m= b(A)' (l-r).
2
m on
2
m
The distribution of
(1-r)
gave a good fit,
depended on the number and
relative sizes of the non-zero roots of
However, as
AA~.
r
increased
toward one these differences in the distributions decreased to the point
Z and
that all cases studied, for a given
2
m
values of
for nearly the same values of
2
an upper bound on
m
gave satisfactorily low
Thus, specification of
r.
can be translated t for practical purposes, into
specification of a lower bound on
and
v,
r,
L
conditional on the values of
v.
A confidence interval estimate of
r
can be calculated by
~sing
the approximation of a non-central F distribution by a cent=al F given
by Searle (1971).
The ratio
f
outed as a non-central F with
(0
= ~S(inte=action)/~S(er=or)
(L-1) ('1-1)
and
is distri-
degrees of freedom
n
are the error degrees of freedom) and non-centrality ?arameter
and
(1-1) C'T-l)f/[(J.-1) ('1-1)
2J\J
..l.
[(~-1)(v-1) ~ 2...-\.;
central
I
degrees of freedom.
This can be
,,",!
.\,
is approximately distributed as a
2 /L.(l-l) (v-J..)
l"'"
~
..'
a~pressed
,~,
:"\.i
in ter.ns or
.,
and
r
as
f(1-r)
')
is approximately distributed as a central F with
and
degrees of freedom.
1
lated as
~s
~
r = 1 -
1
-
(L-1) (v-l)/(l-r-)
can be calcu-
A point est±nate of
The approximate confidence interval esti::1ate of
given by
P 1 -
< '" < 1
f
= 1 -
:1
1
79
2
m
To determine what constitutes a desirable value for
a geome-
tric argument was given in section 4.2, and it was decided that an
2
m < .25
was "sufficiently" small.
2
The distribution of
changed.
m
was seen to change as
The effect of varying
v
and
the ratios of interest were compared.
constant, an4
2
m
Z=4)
r
>
.75,
was almost negligible.
the minimum value of
For
1.
be~Neen
1.
When
v
2
m
v
for
Z
was changed keeping
the change in the value of the 95th percentile
r
studied
For the cases of
necessary so that
mate probability .95 was .75 for
of
were
Z was greater than that of varying
when the values of the 95th percentiles of the distribution of
of
?
m-
1.=8
and
2
m < .25
.80 for
1.=4.
(1.=8
and
wi th approxiFor values
8 and 4 a linear interpolation can be used to determine
greater than 8 it is clear that a smaller value of
r
r.
would be
per.nissible, although, ftom the cases studied, it is not possible to
make any inference on how much smaller the value of
r
will be for
1
greater than 8 .
.~other problem discussed was the relati'le gain or loss of precision
in the classification when the dimension of the space of estimated locacion poincs is reduced to include only the more important dimensions as
retlec!:ed i:l the eigenvalues of. the interaction cO'lariance llacri.."'{.
The
~onte
Carlo results clearly indicate that there are situations
when repeatability of the spatial coniiguration is
L~proved by
a reduction in dimensionality on che estimation space.
dete~~nea,
ioposing
It was
in general tor the cases studied, Chat when the
dL~ension
of che space of estimated points was reduced so that it coincided with
the number of dominant
dL~ensions
of the space of true location points,
there was an improvement in the reproducibility of the true points by
80
the observed points.
But it was also found that it is better to keep
more dimensions than optimal than to keep fewer.
To reduce the dimen-
sionality in a specific problem requires information on the eigenvalues
of the interaction covariance matrix, and it appears that comparison of
the estimated eigenvalues to their averages in the null case (no interaction effects), provides a basis for judging what dimensionality should
be used.
This approach is explained in Mandel (1971) where he also
provides a table of expected values of the first three largest eigenvalues of the interaction covariance matrix for the null case.
To make
the decision as to whether or not the dimensionality should be reduced
the hypothesis
2
E( d. / v .) = cr
~
2
~
is the ith eigen-
is tested where
value of the interaction covariance matrix and
value of the ith eigenvalue in the null case,
is the expec ted
v,
~
cr
2
is the error variance.
If the hypothesis is not rejected, then that component of the interaction
sum of squares
now
(L~=l d~)
k~
2
SS(aS) = \.
1 d.~
L~=
is assumed to be random error.
for some
k~ < k,
Therefore,
and since the number of non-
zero eigenvalues of the covariance matrix reflects the dimensionality of
the location space, the dimension of the estimated space has been
reduced.
Mandel (1971) uses as a cut-off point to determine the parti-
tioning that point in which there is a 'jump' in
below this point are declared as random error.
2
d,/v.,
~
~
and the
?
di/vi
In section 5.1, the
table of the F-distribution was used to test these hypotheses, and the
conclusions obtained using this F table agreed with the conclusions
obtained by using the statistic
2
m (p),
p = 1, 2, 3, and k.
Recognizing that only a small part of the parametric space has been
studied, the results suggest that, when presented with a real problem
the steps to follow to determine whether or not a classification of
~
81
locations according to the cultivar by location interaction and a reduction in dimensionality should be attempted are:
1.
Perform the analysis of variance.
2.
Using
MS(AB)
Ho :
from 1 calculate
against
f (l-r) ~ F~
where
::. ,11
HI:
s
=
r
>
f
ro
=
MS(AB)/MSE
by using the approximation
(1..-1) ('1-1) / (1_r 2 )
or compute an
approximate confidence interval estimate for
(
l
and test
r
F1-0.
Pl----=.2<r<1-
3.
If
r
is sufficiently large, say
1 - a and
step 4.
4.
f
1.. > 8
or
Othe~Nise
r
>
.i5
r > .7S
and
wit~
4 < L < 8,
probability
then proceed to
stop.
Calculate the interaction covariance
matrL~
and decompose it
according to the singular value decomposition.
5.
Test the "significance" of the eigenvalues of the interaction
covariance mac:ix oy using :!andel (1971) or any si::1.ilar ;:rocedure.
6.
Recalcula. te
-~,=l.
'_.,,-
inte~action
covariance
mac=~~
jy
~sing
only
"significant" eigenvalues, and proceed to do the classification
using this
matr~~.
82
8.
LIST OF REFERENCES
Abou-El-Fittouh, H. A., J. O. Rawlings, and P. A. Miller. 1969.
Classification of environments to control genotype by environment
interaction with an application to cotton. Crop Sci. 9:135-140.
Clemm, D. C., P. R. Krishnaiah, and V. B. Waikar. 1971. Tables for
the extreme roots of the Wishart Matrix. Aerospace Research
Laboratories, ARL 71-0264, December 1971.
Clifford, H. T. and W. Stephenson. 1975. An Introduction to Numerical
Classification. Academic Press, New York.
Dickey, D. A. 1978. A program for generating and analyzing large Monte
Carlo studies. Unpublished manuscript, N. C. State University at
Raleigh.
Eberhart, S. A. and W. A. Russell. 1966. Stability parameters for
comparing varieties. Crop Sci. 6:36-40.
Finlay, K. W. and G. N. Wilkinson. 1963. The analysis of adaptation in
a plant-breeding programme. Aust. J. Agric. Res. 14:742-754.
Gower, J. C. 1971. Statistical methods of comparing different multivariate analyses of the same data. Mathematics in the Archaeological and Historical Sciences. pp. 138-149.
Gower, J. C. and C. F. Banfield. 1975. Goodness-of-fit criteria for
hierarchical classification and their empirical distributions.
Proc. 8th Int. Biometric Conf.
Gower, J. C. and G. J. S. Ross. 1969. Minimum spanning trees and single
linkage cluster analysis. App1. Statist. 18:54-64.
Hoe1, P. G., S. C. Port, and C. J. Stone. 1971. Introduction to
Probability Theory. Houghton Mifflin Company, Boston.
DlSL, Computer Subroutine Libraries- in Mathematics and Statistics.
International Mathematical and Statistical Libraries, Inc.,
Houston, Texas.
1975.
Mandel, J. 1970. The distribution of eigenvalues of covariance matrices
of residuals in analysis of variance. Journal of Research of the
National Bureau of Standards - B. ~lathematical Sciences. Vol. 74B,
No.3.
Mandel, J.
data.
1971. A new analysis of variance model for non-additive
Technometrics 13(1) :1-18.
Okuno, T., F. Kikuchi, K. Kumagai, C. Okuno, M. Shiyomi, and H. Tabuchi.
1971. Evaluation of varietal performance in several environments.
The Bulletin of the National Institute of Agricultural Sciences,
Series A18, 93-147.
83
Plaisted, R. L. and L. C. Peterson. 1959. A technique for evaluating
the ability of selection to yield consistently in different
locations or seasons. Am. Potato J. 36:381-385.
Schuurmann, F. J., P. R. Krishnaiah, and A. K. Chattopadhay. 1973.
Tables for the distributions of the ratios of the extreme roots
to the trace of Wishart Matrix. Aerospace Research Laboratories,
ARL 73-0010, February 1973.
Searle, S. R. 1971.
New York.
Linear Models.
John Wiley and Sons, Inc.,
84
9.
APPENDIX
&~ UPPER BOUND FOR THE ~~ECTATION OF m2
In the following it is proven that
2
expected value of the statistic
m ,
1 - r
is an upper bound for the
and moreover that the bias in
2
m ,
if any, is negative.
m2
Given that
E(A)
(L
x
W~W
A(v x L),
=
- (tr A~AR)2/(tr AA~tr AA~)
E(tr AA~)
and
0,
>
A~A = WCZ~
WW~
= I,
(A~A)~
(A.~A)
,
Z~Z
=
= T-,
ZZ~
diagonal rna trLx containing the eigenvalues of
~~
=
ZP~
t'I
where
. r
,
=
tr
A.A~/E(tr
W is the
(A~A)(A~A)~,
matri:<: of the orthonormalized
is the (1- x 1-)
Z
~
eigenvec tors of
H
where
matrLx of the orthonormalized eigenvectors of
2.)
=
=1
..u.~), then
2
E(m ) < 1
and
2
C
is a
(2. x 2.)
(A ~ A) ~ (A ~ A) , and
-r
Proof:
From Schwarz's inequality (see Hoel, et al. (1971»,
2
?
E (XY) -:. E(X-)
?
. E(Y-)
and equality holds if and only if
for some constant a.
Let
x :: ,,.--Itr AA~
Then,
and equality holds
~~
and only ii
and
?(X = aY)
=1
85
for some a.
Therefore,
E2 (tr A"AH)
tr AA"
=
where
is the correlation coefficient
;JA,AH
0
2 ~
A,AH
be~~enn
the elements of
~
and
But now the
Ali.
2
1 - E(m ).
to
left hand side term of this inequality is equal
Therefore,
2 ~
. A ,.-lli
> 0
if and only if
2
pel - m
if
P{cr A"Ali
=b
tr AA")
)Tow, to prove the t
. . 2 ..
>,..
·~A,AH--·
=a
=1
and equality holds
=1
tr AA'"
or equivalently if and only
for some constant b.
E (m 2) .:. 1
'~e need to prO~Te tha t
-!."
The proof of this is as follows:
Proof.
A
I:: ,.;as given that
H
=
and
'N\r
= \;~\.; =
I
= Z~z.
::
ZZ~,
ZW~
tr(A~AH) =
Let
all
= \.;CZ~,
. ~~A
1,
T
=
Z'Ttl.
which
cr C
Then
L~plies
=
\' .G
:'T = I,
that
and
,~
and therefore
; i-~ik; < 1
for
~k:=l
for all
k.
and
So we have
,
.l"
i""'l
tr CZ;.;' = tr CT =
;
i=l
I
;
•
_ Co.;
i=l'-
'to
i
'-..;.;
.ioo~
,
i
<
86
Therefore,
tr(A;A) < tr C
=
tr A;Ali
which implies
All terms are positive, so both sides can be squared,
E
2
A
(tr A;AH)
> 1
(tr A;A) 2
E
2
A
(tr A;AH)
(tr A;A) 2
So
Tl"lerefore
,
,
E (m -) < 1 - c - A, < 1 - r
A,A.H -
Also, the bias of
t~an
2
1 - :n ,
or equal to zero.
?
1<
_
,
(1 -"" -) - ,,~
-,...
,..-
~
A
~
. ,-
,can ""'e '=een to Je ,zrea :2r
.I
_
_