Antille, Gerard (1979). "Information Matrix for Optimal Design."

INFORMATION MATRIX FOR OPTIMAL DESIGN

Gerard Antille*
University of Geneva, Switzerland

Institute of Statistics Mimeo Series #1247
August 1979

* This work was supported in part by the National Science Foundation under grant MCS 78-01434.
In the study of optimal design of experiments the information matrix plays a very important part. This paper, divided into two distinct sections, gives a characterization of the structure of the information matrix for G-optimal design and for $\Phi$-optimal design.
§ 1. Information matrix for G-optimal design
I. Introduction and summary
A G-optimal design is a design which minimizes the maximum of the variance of the response. In this section we prove that the G-optimal information matrix can be expressed in terms of a discrete uniform measure defined on the space of the controllable variables.

We also give a new and short proof of one part of the equivalence theorem of Kiefer and Wolfowitz, for the general and the truncated case.
II. Design and information matrix

Let $y$ be a random variable related to a controllable variable $v$ by a relation of the form
$$y_v = f(v)'\theta + e_v \, , \qquad v \in B \subset \mathbb{R}^k \, ,$$
with $f : B \to \mathbb{R}^m$ a continuous application, $\theta = (\theta_1, \ldots, \theta_m)'$ a vector of unknown parameters, and
$$E[e_v] = 0 \, , \qquad \mathrm{Var}[e_v] = \sigma^2 \, .$$
The covariances between the $e_v$'s, for distinct $v$, are all zero.
If $\theta$ is estimable, it is well known that the best linear unbiased estimator for $\theta$ is
$$\hat{\theta} = (X'X)^{-1} X' y \, ,$$
where $X$ is the $N \times m$ design matrix whose $i$-th row is $f(v_i)'$.
Then $\frac{1}{N} X'X$ is called the information matrix of the design:
$$\frac{1}{N} X'X = \frac{1}{N} \sum_{l=1}^{N} f(v_l) f'(v_l) \, ,$$
or, if we write $f(v_l)' = (x_{1l}, \ldots, x_{ml}) = x_l'$, we have
$$\frac{1}{N} X'X = \frac{1}{N} \sum_{l=1}^{N} x_l x_l' \, .$$
If for each $v_l$ we have $n_l$ observations, a concrete design of experiment is given by $\{(v_l, n_l)\}_{l=1,\ldots,n}$. Let $D = \frac{1}{N} \mathrm{diag}(n_1, \ldots, n_n)$; then $X'DX$ is called the information matrix of the design $\{(v_l, \frac{n_l}{N})\}_{l=1,\ldots,n}$, with $N = \sum_l n_l$.
A discrete design is given by $\{(v_l, p_l)\}_{l=1,\ldots,n}$; the information matrix of this design is
$$X' \begin{pmatrix} p_1 & & 0 \\ & \ddots & \\ 0 & & p_n \end{pmatrix} X \, , \qquad \text{with } \sum_{l=1}^{n} p_l = 1 \, .$$
Let $\mathcal{E}$ be a class of probability measures defined on $B$, with the following assumption: all the discrete measures belong to $\mathcal{E}$. A probability measure $\xi \in \mathcal{E}$ such that
$$m_{ij}(\xi) = \int_B f_i(v) f_j(v) \, d\xi(v) \, , \qquad i, j = 1, \ldots, m \, ,$$
exists and is finite is a continuous design of experiment, and the matrix $M(\xi) = (m_{ij}(\xi))$ is called the information matrix of $\xi$.
By noting $\Lambda = \{\lambda \mid \lambda = \xi f^{-1} ;\ \xi \in \mathcal{E}\}$ a class of probability measures defined on $\mathcal{X} = f(B)$, we have for the information matrix
$$M(\lambda) = \int_{\mathcal{X}} x x' \, d\lambda(x) \, .$$
III. Representation and properties of the information matrix with probability measure on finite support

The application
$$H : \mathcal{X} \to \mathbb{R}^{m(m+1)/2} \, , \qquad x \mapsto (x_i x_j)_{i \le j} \, ,$$
is a one-to-one correspondence between $H(\mathcal{X})$ and $\mathcal{A} = \{ x x' \mid x \in \mathcal{X} \subset \mathbb{R}^m \}$.

If $\mathcal{X}$ is a compact set then, since $H$ is a continuous application, $H(\mathcal{X})$ is also compact and the convex hull of $H(\mathcal{X})$ is compact; then the convex hull $\mathcal{R}(\mathcal{A})$ of $\mathcal{A}$ is compact also. Then $M(\lambda) \in \mathcal{R}(\mathcal{A})$ for all $\lambda \in \Lambda$.
Then for every matrix $M(\lambda)$ there exists a representation of the form
$$M(\lambda) = \sum_{i=1}^{n} \lambda_i x_i x_i' \qquad \text{with } \sum_{i=1}^{n} \lambda_i = 1 \, , \ x_i \in \mathcal{X} \, ,$$
and we have $n \le \frac{1}{2} m(m+1) + 1$ by a theorem of Caratheodory (Rockafellar 1970, pp. 155-156).
Lemmas: 1) $M(\lambda)$ is a symmetric positive-semidefinite matrix (Fedorov 1972, p. 66).

2) If $\Lambda$ is the class of all the probability measures defined on $\mathcal{X}$, then $\mathfrak{M} = \{M(\lambda) \mid \lambda \in \Lambda\}$ is a convex set. The proof is obtained by using the convexity of $\Lambda$.
3) Let $\lambda \in \Lambda$; we have
$$\int_{\mathcal{X}} x' M^{-1}(\lambda) x \, d\lambda(x) = m$$
(Fedorov 1972, pp. 68-69).
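Lemma 3 can be checked numerically, since $\int x' M^{-1}(\lambda) x \, d\lambda(x) = \mathrm{Tr}(M^{-1}(\lambda) M(\lambda)) = m$. A minimal sketch, assuming NumPy; the discrete design is a random example:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 7
xs = rng.normal(size=(n, m))       # support points x_i in R^m
w = rng.dirichlet(np.ones(n))      # weights lambda_i, summing to 1

M = sum(wi * np.outer(x, x) for wi, x in zip(w, xs))   # M(lambda)
Minv = np.linalg.inv(M)
avg = sum(wi * x @ Minv @ x for wi, x in zip(w, xs))

print(avg)   # equals m = 3 up to rounding error
```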
4) Let $\lambda \in \Lambda$ be such that $x' M^{-1}(\lambda) x \le m$ for all $x \in \mathcal{X}$. Let $\sum_{i=1}^{n} \lambda_i x_i x_i'$ be a representation of $M(\lambda)$. Let
$$\mathcal{X}_\lambda = \{ x \in \mathcal{X} \mid x \text{ is an element of the given representation of } M(\lambda) \} \, ,$$
$$\mathcal{X}_{\lambda, m} = \{ x \in \mathcal{X} \mid x' M^{-1}(\lambda) x = m \} \, .$$
We have $\mathcal{X}_\lambda \subset \mathcal{X}_{\lambda, m}$.

Proof: If there exists a $k$ such that $x_k' M^{-1}(\lambda) x_k < m$ and $x_i' M^{-1}(\lambda) x_i \le m$, $i \ne k$, we have
$$\sum_i \lambda_i \, x_i' M^{-1}(\lambda) x_i < m \, ,$$
which is a contradiction with the preceding lemma.
Proposition: Let $\lambda \in \Lambda$ be such that $x' M^{-1}(\lambda) x \le m$. There exists a representation of $M(\lambda)$ of the form $\frac{1}{n} \sum_{i=1}^{n} x_i x_i'$.

Proof: We know that there exists a representation of the form $\sum_{i=1}^{n} \lambda_i x_i x_i'$, and for $x_i \in \mathcal{X}_\lambda$ we have $x_i' M^{-1}(\lambda) x_i = m$, hence
$$\sum_{i=1}^{n} x_i' M^{-1}(\lambda) x_i = nm \, , \qquad \text{i.e.} \qquad \mathrm{Tr}\Big( M^{-1}(\lambda) \sum_{i=1}^{n} \tfrac{1}{n} x_i x_i' \Big) = m \, .$$
A solution of this last equation is given by $M(\lambda) = \frac{1}{n} \sum_{i=1}^{n} x_i x_i'$.
Example: If we choose a Hadamard matrix for the design matrix $X$, in the case $n = m$ we have $\frac{1}{n} X'X = I$. If $n > m$, the design matrix $X$ is given by the first $m$ columns of a Hadamard matrix of order $n$.
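This construction can be carried out with SciPy's `hadamard` (a sketch; `scipy.linalg.hadamard` requires the order to be a power of 2):

```python
import numpy as np
from scipy.linalg import hadamard

n, m = 8, 3
H = hadamard(n)      # n x n Hadamard matrix, entries +/-1
X = H[:, :m]         # design: first m columns, n design points in {-1, +1}^m
M = X.T @ X / n      # information matrix (1/n) X'X

print(np.allclose(M, np.eye(m)))   # True: uniform weights 1/n on n points
```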
IV. Applications

1. G-optimal design

A design $\lambda^* \in \Lambda$ is G-optimal if
$$\min_{\lambda \in \Lambda} \max_{x \in \mathcal{X}} x' M^{-1}(\lambda) x = \max_{x \in \mathcal{X}} x' M^{-1}(\lambda^*) x \, .$$
By using lemma 3 we obtain
$$m = \int_{\mathcal{X}} x' M^{-1}(\lambda) x \, d\lambda(x) \le \max_{x \in \mathcal{X}} x' M^{-1}(\lambda) x \int_{\mathcal{X}} d\lambda(x) = \max_{x \in \mathcal{X}} x' M^{-1}(\lambda) x \, ,$$
so this minimal value $m$ is attained by a G-optimal design, and $x' M^{-1}(\lambda^*) x \le m$ for all $x \in \mathcal{X}$.

Then the information matrix of a G-optimal design defined on a compact set $\mathcal{X} \subset \mathbb{R}^m$ can be expressed in the form
$$\frac{1}{n} \sum_{i=1}^{n} x_i x_i' \, , \qquad x_i \in \mathcal{X} \, , \quad n \le \tfrac{1}{2} m(m+1) + 1 \, .$$
When the possible choices of the controllable variables are points in an $m$-dimensional cube of side 2, the points of the G-optimal design can be chosen in the set of the vertices of the cube.
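For the first-order model $f(v) = v$ on the cube $[-1, 1]^m$ this is easy to check numerically (a sketch assuming NumPy; the model choice is an illustrative assumption): the design uniform on the $2^m$ vertices has $M = I$, and $x' M^{-1} x \le m$ over the whole cube, with equality exactly at the vertices.

```python
import numpy as np
from itertools import product

m = 3
vertices = np.array(list(product([-1.0, 1.0], repeat=m)))   # 2^m vertices
M = vertices.T @ vertices / len(vertices)                    # information matrix

rng = np.random.default_rng(2)
interior = rng.uniform(-1, 1, size=(10000, m))               # random cube points
d = np.einsum('ij,jk,ik->i', interior, np.linalg.inv(M), interior)

print(np.allclose(M, np.eye(m)))                  # True: M = I
print(d.max() <= m)                               # True: x'M^{-1}x <= m inside
print(np.allclose((vertices**2).sum(axis=1), m))  # equality at the vertices
```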
2. Remark on the Kiefer-Wolfowitz equivalence theorem

Lemma: Let $\lambda \in \Lambda$ be such that $x' M^{-1}(\lambda) x \le m$. We have
$$\mathrm{Tr}\,[M^{-1}(\lambda) M(\xi)] \le m \, , \quad \forall \, \xi \in \Lambda \, , \qquad \det\,[M^{-1}(\lambda) M(\xi)] \le 1 \, , \quad \forall \, \xi \in \Lambda \, .$$

Proof:
$$\mathrm{Tr}\,[M^{-1}(\lambda) M(\xi)] = \mathrm{Tr}\Big[ M^{-1}(\lambda) \int_{\mathcal{X}} x x' \, d\xi(x) \Big] = \int_{\mathcal{X}} x' M^{-1}(\lambda) x \, d\xi(x) \le m \int_{\mathcal{X}} d\xi(x) = m \, ,$$
and by using the inequality between the geometric and the arithmetic mean we have
$$\det\,[M^{-1}(\lambda) M(\xi)]^{1/m} \le \frac{1}{m} \mathrm{Tr}\,[M^{-1}(\lambda) M(\xi)] \le 1 \, .$$

A design $\lambda^*$ is D-optimal if
$$\min_{\lambda} \det\,(M^{-1}(\lambda)) = \det\,(M^{-1}(\lambda^*)) \, .$$
If we note $\widetilde{N}$ the information matrix of a G-optimal design, we have
$$\det\,[\widetilde{N}^{-1} M] \le 1 \, , \qquad \forall \, M \in \mathfrak{M} \, .$$
Then
$$\det\,(\widetilde{N}^{-1}) \le \frac{1}{\det(M)} = \det\,(M^{-1}) \, ,$$
and the matrix $\widetilde{N}$ is D-optimal.
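As a numerical illustration of this remark (a sketch assuming NumPy; the competing designs are random examples, not an exhaustive search), the vertex design of the previous example attains the maximal determinant among designs on the cube:

```python
import numpy as np
from itertools import product

m = 3
rng = np.random.default_rng(3)

def info_matrix(points, weights):
    return sum(w * np.outer(x, x) for w, x in zip(weights, points))

# Information matrix of a G-optimal design: uniform on the 2^m vertices
vertices = np.array(list(product([-1.0, 1.0], repeat=m)))
M_star = info_matrix(vertices, np.full(len(vertices), 1 / len(vertices)))

# Competing designs supported on random points of the cube
for _ in range(5):
    pts = rng.uniform(-1, 1, size=(10, m))
    w = rng.dirichlet(np.ones(10))
    assert np.linalg.det(info_matrix(pts, w)) <= np.linalg.det(M_star) + 1e-12

print("det M* =", np.linalg.det(M_star))   # 1.0, maximal over the cube
```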
V. The truncated case

If the experimenter is only interested in $s$ parameters in a regression with $m$ parameters, we will speak of truncated design and truncated optimality.

In this case, for $x \in \mathcal{X} \subset \mathbb{R}^m$, $s < m$, $\lambda \in \Lambda$, and if we note $x^{(1)} = (x_1, \ldots, x_s)'$, $x^{(2)} = (x_{s+1}, \ldots, x_m)'$, we have
$$M(\lambda) = \int_{\mathcal{X}} x x' \, d\lambda(x) = \begin{pmatrix} M_{11}(\lambda) & M_{12}(\lambda) \\ M_{21}(\lambda) & M_{22}(\lambda) \end{pmatrix}$$
with
$$M_{ij}(\lambda) = \int_{\mathcal{X}} x^{(i)} x^{(j)\prime} \, d\lambda(x) \, ;$$
$M_{11}(\lambda)$ is an $s \times s$ matrix.

If $M^{-1}(\lambda)$ exists, we can make a partition of $M^{-1}(\lambda)$ in the same way as for $M(\lambda)$:
$$M^{-1}(\lambda) = \begin{pmatrix} M^{11}(\lambda) & M^{12}(\lambda) \\ M^{21}(\lambda) & M^{22}(\lambda) \end{pmatrix}$$
with
$$M^{11} = (M_{11} - M_{12} M_{22}^{-1} M_{21})^{-1} \, , \qquad M^{22} = (M_{22} - M_{21} M_{11}^{-1} M_{12})^{-1} \, ,$$
$$M^{12} = -M_{11}^{-1} M_{12} M^{22} \, , \qquad M^{21} = -M^{22} M_{21} M_{11}^{-1} \, .$$
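These are the standard block-inverse (Schur complement) identities and can be verified numerically (a sketch assuming NumPy, with a random positive-definite matrix):

```python
import numpy as np

rng = np.random.default_rng(4)
m, s = 5, 2
A = rng.normal(size=(m, m))
M = A @ A.T + m * np.eye(m)     # random positive-definite M

M11, M12 = M[:s, :s], M[:s, s:]
M21, M22 = M[s:, :s], M[s:, s:]
Minv = np.linalg.inv(M)

S11 = np.linalg.inv(M11 - M12 @ np.linalg.inv(M22) @ M21)   # M^{11}
S22 = np.linalg.inv(M22 - M21 @ np.linalg.inv(M11) @ M12)   # M^{22}

print(np.allclose(Minv[:s, :s], S11))                               # True
print(np.allclose(Minv[s:, s:], S22))                               # True
print(np.allclose(Minv[:s, s:], -np.linalg.inv(M11) @ M12 @ S22))   # True
```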
Lemmas: 1) Let $\widetilde{\mathcal{X}} = \{ \tilde{x} \mid \tilde{x} = x^{(1)} - M_{12} M_{22}^{-1} x^{(2)} \, , \ x \in \mathcal{X} \}$ for a fixed $\lambda \in \Lambda$. Then
$$M_s(\lambda) = \int_{\mathcal{X}} \tilde{x} \tilde{x}' \, d\lambda(x) \, .$$

Proof:
$$\tilde{x}\tilde{x}' = x^{(1)} x^{(1)\prime} - x^{(1)} x^{(2)\prime} M_{22}^{-1} M_{21} - M_{12} M_{22}^{-1} x^{(2)} x^{(1)\prime} + M_{12} M_{22}^{-1} x^{(2)} x^{(2)\prime} M_{22}^{-1} M_{21} \, ;$$
by integration the two last terms are the same, with opposite signs, hence
$$\int_{\mathcal{X}} \tilde{x} \tilde{x}' \, d\lambda(x) = M_{11} - M_{12} M_{22}^{-1} M_{21} = M_s(\lambda) \, .$$

2) $$\int_{\mathcal{X}} \tilde{x}' M_s^{-1}(\lambda) \tilde{x} \, d\lambda(x) = s \, .$$

Proof: Since $x' M^{-1}(\lambda) x = \tilde{x}' M_s^{-1}(\lambda) \tilde{x} + x^{(2)\prime} M_{22}^{-1}(\lambda) x^{(2)}$,
$$\int_{\mathcal{X}} \tilde{x}' M_s^{-1}(\lambda) \tilde{x} \, d\lambda(x) = \int_{\mathcal{X}} x' M^{-1}(\lambda) x \, d\lambda(x) - \int_{\mathcal{X}} x^{(2)\prime} M_{22}^{-1}(\lambda) x^{(2)} \, d\lambda(x) = m - \mathrm{Tr}\,[M_{22}^{-1} M_{22}] = m - (m - s) = s \, .$$

Corollary:
$$\max_{x \in \mathcal{X}} \tilde{x}' M_s^{-1}(\lambda) \tilde{x} \ge s \, .$$
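The identity of lemma 2 can be checked numerically (a sketch assuming NumPy; the discrete design is a random example):

```python
import numpy as np

rng = np.random.default_rng(5)
m, s, n = 4, 2, 9
xs = rng.normal(size=(n, m))
w = rng.dirichlet(np.ones(n))

M = sum(wi * np.outer(x, x) for wi, x in zip(w, xs))
M12, M22 = M[:s, s:], M[s:, s:]
Ms = M[:s, :s] - M12 @ np.linalg.inv(M22) @ M12.T   # Schur complement M_s
Ms_inv, M22_inv = np.linalg.inv(Ms), np.linalg.inv(M22)

def x_tilde(x):
    return x[:s] - M12 @ M22_inv @ x[s:]

avg = sum(wi * x_tilde(x) @ Ms_inv @ x_tilde(x) for wi, x in zip(w, xs))
print(avg)   # equals s = 2 up to rounding error
```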
A design $\lambda^*$ is called $G_s$-optimal if
$$\min_{\lambda \in \Lambda} \max_{x \in \mathcal{X}} \tilde{x}' M_s^{-1}(\lambda) \tilde{x} = \max_{x \in \mathcal{X}} \tilde{x}' M_s^{-1}(\lambda^*) \tilde{x} \, .$$
A design $\lambda^*$ is called $D_s$-optimal if
$$\min_{\lambda \in \Lambda} \det\,(M_s^{-1}(\lambda)) = \det\,(M_s^{-1}(\lambda^*)) \, .$$

A short proof that $G_s$-optimal implies $D_s$-optimal: Let $\lambda^*$ be a $G_s$-optimal design; as in the general case we can prove that
$$\mathrm{Tr}\,[M_s^{-1}(\lambda^*) M_s(\lambda)] \le s \, , \quad \forall \, \lambda \in \Lambda \, , \qquad \det\,[M_s^{-1}(\lambda^*) M_s(\lambda)] \le 1 \, , \quad \forall \, \lambda \in \Lambda \, .$$
Hence $\lambda^*$ is $D_s$-optimal.
§ 2. Information matrix for $\Phi$-optimal design

I. Introduction and summary

A $\Phi$-optimal design is a design which minimizes a real-valued function defined on the space of the information matrices for a given regression problem. In this section we prove that the $\Phi$-optimal information matrix satisfies the inequality constraint $x' M^{-1} x \le m$ for $x$ belonging to the space of the controllable variables. It can be expressed in the form $\frac{1}{n} \sum_i x_i x_i'$.
II. $\Phi$-optimality

Let $\Phi$ be a convex real-valued function defined on $\mathfrak{M}$. A design $\lambda^* \in \Lambda$ is $\Phi$-optimal if
$$\Phi(M(\lambda^*)) = \min_{\lambda} \Phi(M(\lambda)) \, .$$
The most common examples of optimality criteria are:

D-optimality: $\Phi_0(M) = \det M^{-1}$
L-optimality: $\Phi_{1,C}(M) = \mathrm{tr}(C M^{-1})$, $C \ge 0$
A-optimality: $\Phi_1(M) = \mathrm{tr}(M^{-1})$
E-optimality: $\Phi_\infty(M) =$ maximum eigenvalue of $M^{-1}$.

These are particular cases or limiting cases of $\Phi_p(M)$ (see Kiefer 1974).
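These criteria are straightforward to evaluate from an information matrix (a sketch assuming NumPy; the matrices `M` and `C` are arbitrary examples):

```python
import numpy as np

def d_criterion(M):
    return np.linalg.det(np.linalg.inv(M))        # det M^{-1}

def l_criterion(M, C):
    return np.trace(C @ np.linalg.inv(M))         # tr(C M^{-1})

def a_criterion(M):
    return np.trace(np.linalg.inv(M))             # tr M^{-1}

def e_criterion(M):
    return np.linalg.eigvalsh(np.linalg.inv(M)).max()   # max eigenvalue of M^{-1}

M = np.array([[2.0, 0.5], [0.5, 1.0]])
C = np.eye(2)    # with C = I the L-criterion reduces to the A-criterion
print(d_criterion(M), l_criterion(M, C), a_criterion(M), e_criterion(M))
```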
If $\Lambda$ is convex and if $\Phi$ is a convex mapping, then for two $\Phi$-optimal designs $\lambda_1^*$ and $\lambda_2^*$ and $0 < \alpha < 1$ we have
$$\Phi\big(M(\alpha \lambda_1^* + (1-\alpha) \lambda_2^*)\big) \le \alpha \, \Phi(M(\lambda_1^*)) + (1-\alpha) \, \Phi(M(\lambda_2^*)) = \min_{\lambda} \Phi(M(\lambda)) \, ,$$
so $\alpha \lambda_1^* + (1-\alpha) \lambda_2^*$ is also $\Phi$-optimal. Hence under the above conditions we have proved that the set of $\Phi$-optimal measures is convex.
III. An optimisation problem

We will note $M_0$ the solution of the following optimisation problem: for a given regression problem, for $\Phi$ a convex mapping, and for a finite subset $V$ of $\mathcal{X}$ with cardinality of $V \ge m$, we consider
$$\min_{M \in \mathfrak{M}} \Phi(M) \qquad \text{subject to} \quad x' M^{-1} x \le m \, , \ \forall \, x \in V \, .$$
Let
$$L(M, v) = \Phi(M) + \sum_i v_i \, (x_i' M^{-1} x_i - m) \, , \qquad v_i \ge 0 \, ,$$
be the Lagrange function of the minimization problem. By a Kuhn-Tucker theorem for convex optimisation (Appendix A) there exists $v^0 \ge 0$ such that
$$L(M_0, v) \le L(M_0, v^0) \le L(M, v^0) \, , \qquad \forall \, M \in \mathfrak{M} \, , \ \forall \, v \ge 0 \, .$$
Then we have $\sum_i v_i^0 \, (x_i' M_0^{-1} x_i - m) = 0$, and in the case of $v_i^0 \ne 0$ we obtain
$$x_i' M_0^{-1} x_i = m \, ,$$
hence
$$\mathrm{Tr}\Big( M_0^{-1} \sum_i \lambda_{0i} \, x_i x_i' \Big) = m \, , \qquad \text{where } \lambda_{0i} = v_i^0 \Big/ \sum_j v_j^0 \, .$$
A solution of this last equation is given by $M_0 = \sum_i \lambda_i^0 x_i x_i'$.
Hence we have seen that the matrix solution of our minimization problem can be expressed as $\sum_i \lambda_i^0 x_i x_i'$ with $x_i \in \mathcal{X}_\lambda$, so that, as in § 1, we have for this representation of $M(\lambda)$ the form $\frac{1}{n} \sum_i x_i x_i'$.
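This constrained problem can be explored numerically. The sketch below is an illustration, not the paper's method: it runs the classical multiplicative weight-update algorithm for the D-criterion $\Phi = -\ln \det$ on a finite candidate set $V$ (assuming NumPy; the candidate points are random), then checks the Kuhn-Tucker inequality $x' M^{-1} x \le m$.

```python
import numpy as np

rng = np.random.default_rng(6)
m, n_cand = 3, 40
V = rng.uniform(-1, 1, size=(n_cand, m))   # finite candidate set V in R^m

w = np.full(n_cand, 1 / n_cand)            # initial uniform weights
for _ in range(2000):
    M = (V * w[:, None]).T @ V             # M = sum_i w_i x_i x_i'
    d = np.einsum('ij,jk,ik->i', V, np.linalg.inv(M), V)
    w *= d / m                             # multiplicative update; sum(w) stays 1

M = (V * w[:, None]).T @ V
d = np.einsum('ij,jk,ik->i', V, np.linalg.inv(M), V)
print(d.max(), "~ m =", m)                 # max_x x'M^{-1}x close to m
print((w > 1e-4).sum(), "support points")
```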
Proposition: A class of solutions of our minimization problem is given by the set $\{ M \mid M = \alpha M_0 + (1 - \alpha) N \}$ for all $N \in \mathfrak{M}$ satisfying $x' N^{-1} x \le m$, $x \in V$.

Indeed, using the fact that $M_0$ and $N$ are positive-definite matrices, which implies for $0 < \alpha < 1$
$$[\alpha M_0 + (1 - \alpha) N]^{-1} \le \alpha M_0^{-1} + (1 - \alpha) N^{-1}$$
(Appendix B), we have
$$x' M^{-1} x \le \alpha \, x' M_0^{-1} x + (1 - \alpha) \, x' N^{-1} x \le m$$
and
$$\Phi(M) \le \alpha \, \Phi(M_0) + (1 - \alpha) \, \Phi(N) \, ;$$
then $M$ is a solution of our minimization problem.
Proposition: A solution of our minimization problem satisfies the inequality constraint over all the controllable space $\mathcal{X}$ if $\Phi$ is strongly convex.

Proof: Assume there exists $x \in \mathcal{X}$ such that $x' M_0^{-1} x > m$. Let $W = V \cup \{x\}$ and let $\widetilde{M}$ be a solution of our minimization problem on $W$. We have $L(\widetilde{M}, v^0) \le L(M_0, v^0)$, but $\widetilde{M}$ satisfies $x' \widetilde{M}^{-1} x \le m$ for $x \in V$; then $\Phi(M_0) \le \Phi(\widetilde{M})$, and therefore $\Phi(M_0) = \Phi(\widetilde{M})$. As $\Phi$ is strongly convex we have $M_0 = \widetilde{M}$, but $x' M_0^{-1} x \ne x' \widetilde{M}^{-1} x$, which is a contradiction. Hence
$$x' M_0^{-1} x \le m \, , \qquad \forall \, x \in \mathcal{X} \, .$$
Corollary: Let $M^*$ be the $\Phi$-optimal matrix, $\Phi$ strongly convex. Then $M^*$ is a solution of our minimization problem and satisfies the inequality constraint over $\mathcal{X}$.
Proof: Since $M_0$ is a solution of our minimization problem we have
$$L(M_0, v^0) \le L(M^*, v^0) \, ,$$
then $\Phi(M_0) \le \Phi(M^*)$; but $M^*$ is $\Phi$-optimal, so $\Phi(M^*) = \Phi(M_0)$, and as $\Phi$ is strongly convex we have $M_0 = M^*$.
Concluding remarks:

1) In the first section we proved that there exists a representation of the form $\frac{1}{n} \sum_i x_i x_i'$ for the matrices satisfying $x' M^{-1} x \le m$, $\forall x \in \mathcal{X}$. Now we know that the $\Phi$-optimal information matrix satisfies $x' M^{-1} x \le m$, $\forall x \in \mathcal{X}$. Then there exists a representation of the form $\frac{1}{n} \sum_i x_i x_i'$ for the $\Phi$-optimal matrix.

2) The duality theorem of Sibson (1972, 1974) is a particular case of our result, obtained by choosing $\Phi = -\ln \det$. Thus we have a form of the general equivalence theorem of Kiefer and Wolfowitz.
3) Our result is also valid for truncated designs in the following sense: if we define an application $\Phi_s$ on the set of the matrices $M_s(\lambda)$, then the $\Phi_s$-optimal information matrix satisfies $\tilde{x}' M_s^{-1}(\lambda) \tilde{x} \le s$.
Acknowledgments

This paper is based on part of the author's Ph.D. dissertation written at the University of Geneva under the supervision of Professors F. Streit and I.M. Chakravarti. I wish to thank the Department of Statistics of the University of North Carolina at Chapel Hill, and especially Professor I.M. Chakravarti, for having allowed me to spend some time there.
Appendix A

Modified Kuhn-Tucker theorem: $M_0$ is a solution of our minimization problem if and only if there exists $v^0 = (v_1^0, \ldots, v_n^0)$, $v_i^0 \ge 0$, such that
$$L(M_0, v) \le L(M_0, v^0) \le L(M, v^0) \, , \qquad \forall \, M \in \mathfrak{M} \, , \ \forall \, v \ge 0 \, .$$

Proof: 1) Let $M_0$ be a solution of our minimization problem. Let
$$A = \{ (y_0, \ldots, y_n) \mid y_0 \ge \Phi(M) \, , \ y_i \ge x_i' M^{-1} x_i - m \, , \ i = 1, \ldots, n \, , \ \text{for at least one } M \}$$
and
$$B = \{ (y_0, \ldots, y_n) \mid y_0 < \Phi(M_0) \, , \ y_i < 0 \, , \ i = 1, \ldots, n \} \, .$$
By the separation theorem for convex sets there is $v = (v_0, \ldots, v_n)$, $v \ne 0$, such that
$$v'a \ge v'b \qquad \text{for } a \in A \, , \ b \in B \, .$$
But this last inequality is only valid if $v \ge 0$, and it is still true for $b = (\Phi(M_0), 0, \ldots, 0)$; then
$$v_0 \, \Phi(M) + \sum_i v_i \, (x_i' M^{-1} x_i - m) \ge v_0 \, \Phi(M_0) \qquad \text{for all } M \, .$$
If $v_0 = 0$, taking $M = M_0$ gives $\sum_i v_i \, (x_i' M_0^{-1} x_i - m) \ge 0$, which is in contradiction with $x_i' M_0^{-1} x_i - m \le 0$; hence $v_0 > 0$. Now set $v_i^0 = v_i / v_0$ and $v^0 = (v_1^0, \ldots, v_n^0)$, so $v^0 \ge 0$ and
$$L(M_0, v^0) \le L(M, v^0) \qquad \text{for all } M \in \mathfrak{M} \, .$$
For $M = M_0$ it follows that $\sum_i v_i^0 \, (x_i' M_0^{-1} x_i - m) \ge 0$, but from the inequality constraint we obtain
$$v_i^0 \, (x_i' M_0^{-1} x_i - m) \le 0 \, ,$$
also true for $v_i^0 = 0$. And at last
$$\sum_i v_i^0 \, (x_i' M_0^{-1} x_i - m) = 0 \, ,$$
so that $L(M_0, v) \le L(M_0, v^0)$ for all $v \ge 0$.

2) Conversely, using $L(M_0, v) \le L(M_0, v^0)$ we obtain
$$\sum_i (v_i - v_i^0)(x_i' M_0^{-1} x_i - m) \le 0 \qquad \text{for all } v \ge 0 \, .$$
As $v \ge 0$ is arbitrary, this inequality is only true if $x_i' M_0^{-1} x_i - m \le 0$. Thus $M_0$ satisfies the inequality constraint, and for $v = 0$ we obtain $\sum_i v_i^0 \, (x_i' M_0^{-1} x_i - m) \ge 0$; then
$$\sum_i v_i^0 \, (x_i' M_0^{-1} x_i - m) = 0 \, .$$
Hence, for every $M$ satisfying the constraints,
$$\Phi(M_0) = L(M_0, v^0) \le L(M, v^0) = \Phi(M) + \sum_i v_i^0 \, (x_i' M^{-1} x_i - m) \le \Phi(M) \, ,$$
and $M_0$ is a solution of our minimization problem.
Appendix B

Theorem (cited in Fedorov, p. 19): If $A$ and $B$ are positive-definite matrices, then
$$\alpha A^{-1} + (1 - \alpha) B^{-1} \ge [\alpha A + (1 - \alpha) B]^{-1} \, , \qquad 0 < \alpha < 1 \, .$$
Proof: Since $A$ and $B$ are positive definite, the harmonic-arithmetic mean inequality for positive-definite matrices gives
$$\big( \alpha A^{-1} + (1 - \alpha) B^{-1} \big)^{-1} \le \alpha A + (1 - \alpha) B \, .$$
But using the fact that $M \ge N$ implies $-M^{-1} \ge -N^{-1}$ for positive-definite matrices, we have
$$-\big( \alpha A + (1 - \alpha) B \big)^{-1} \ge -\Big[ \big( \alpha A^{-1} + (1 - \alpha) B^{-1} \big)^{-1} \Big]^{-1} = -\big( \alpha A^{-1} + (1 - \alpha) B^{-1} \big) \, .$$
Then
$$\alpha A^{-1} + (1 - \alpha) B^{-1} \ge \big( \alpha A + (1 - \alpha) B \big)^{-1} \, .$$
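The inequality of Appendix B can be checked numerically (a sketch assuming NumPy, with random positive-definite matrices):

```python
import numpy as np

rng = np.random.default_rng(7)
k, alpha = 4, 0.3

def random_pd(k):
    X = rng.normal(size=(k, k))
    return X @ X.T + np.eye(k)      # positive-definite by construction

A, B = random_pd(k), random_pd(k)
lhs = alpha * np.linalg.inv(A) + (1 - alpha) * np.linalg.inv(B)
rhs = np.linalg.inv(alpha * A + (1 - alpha) * B)

# lhs - rhs must be positive semidefinite: all eigenvalues >= 0
print(np.linalg.eigvalsh(lhs - rhs).min() >= -1e-10)   # True
```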
BIBLIOGRAPHY

FEDOROV, V.V. (1972). Theory of Optimal Experiments. Academic Press, New York.

KIEFER, J. (1974). General equivalence theory for optimum designs. Ann. Statist. 2, 849-878.

ROCKAFELLAR, R.T. (1970). Convex Analysis. Princeton University Press.

SIBSON, R. (1972). Discussion on the papers by Wynn and Laycock. J.R. Statist. Soc. Ser. B 34, 181-183.

SIBSON, R. (1974). $D_A$-optimality and duality. Colloquia Mathematica Societatis Janos Bolyai 9, European Meeting of Statisticians, Budapest, 1972.