Parish, R.G. (1969). "Minimum bias approximation of models by polynomials of low order."

"
.
- --... --.-._,_.:---,I
-~._
.
'.
.~
..
''''---'
MINIMUM BIAS APPROXIMATION OF MODELS
BY POLYNOMIALS OF LOW ORDER
by
Robert George Parish
Institute of Statistics
Mimeograph Series No. 627
May 1969
TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
1. INTRODUCTION
2. REVIEW OF THE LITERATURE
   2.1 General Review of Response Surface Literature
   2.2 Criteria Used for Fitting Response Surfaces
   2.3 Minimum Bias Designs
3. DERIVATION OF THE ESTIMATOR
   3.1 Introduction
   3.2 Terminology and Notation
   3.3 The Bias Criterion
   3.4 Variance Criterion
   3.5 Some Properties of the Estimator
4. THE LINEAR ESTIMATOR (d = 2)
   4.1 Introduction
   4.2 Bias for the Linear Estimator
   4.3 Variance for the Linear Estimator
   4.4 Comparison with the Linear Least Squares Estimator
5. THE QUADRATIC ESTIMATOR (d = 3)
   5.1 Introduction
   5.2 Bias for the Quadratic Estimator
   5.3 Variance for the Quadratic Estimator
   5.4 Comparison with the Quadratic Least Squares Estimator
6. DERIVATION OF THE ESTIMATOR WHICH PROTECTS SIMULTANEOUSLY AGAINST HIGHER ORDER POLYNOMIALS AND EXPONENTIALS
   6.1 Introduction
   6.2 The Bias Criterion
   6.3 The Variance Criterion
7.
8. LIST OF REFERENCES
9. APPENDICES
   9.1 Some Algebra Leading to Equation (4.49)
   9.2 Algebra Leading to Equation (5.59)
   9.3 Algebra Leading to Equation (5.63)
   9.4 Proof of Lemma 4
LIST OF TABLES

4.1 Bias and deviations for various grids
4.2 Designs for minimizing V2
5.1 Bias and deviations for various grids
5.2 Designs for minimizing V3
LIST OF FIGURES

4.1 An illustrative example of the deviations of bias, Bγ, from Min B, B(γ), where γ1, γ2 and γ3 are the grid points
4.2 Bγ for the estimator defined by (2,4;.6,.3;.340514,.834771)
4.3 The integrated MSE for the least squares estimator and Min V | Min B estimator
5.1 Bγ for the estimator defined by (3,4;.2,.5,1;.863306,.32789)
5.2 Integrated MSE for the quadratic and linear estimators
5.3 Integrated MSE for the least squares estimator and Min V | Min B estimator
1. INTRODUCTION
Response surface analysis may be considered as a statistical technique which aids the detection and description of unknown functional relationships between a dependent variable and several independent controllable variables. When the number of independent variables is two, this relationship is conveniently represented by drawing contours of equal response on a graph whose co-ordinates denote the levels of the controllable variables. The estimation procedure and the levels of the independent variables are factors which affect the exploration of the true response surface. It is usual to assume, during the investigation of a response surface, that a true underlying functional relationship

    η = f(ξ1, ξ2, ..., ξt; a1, a2, ...)                           (1.1)

exists, where the ai are unknown constants relating the response η and the t independent variables ξi over some region of operability in the ξ-space.
In situations where the estimation of the parameters ai is either difficult, or expensive, or both, a convenient approach is to approximate the function in equation (1.1) by a polynomial of low order. In such cases the error of approximation stems from two sources: the experimental, or sampling, error and the bias due to the approximating function failing to represent exactly the true underlying functional relationship, η.

This thesis is concerned with the problem of approximating a true functional relationship of the form

    η = α + β* e^(γ*ξ) ,                                          (1.2)

where α, β* and γ* are unknown parameters, and ξ is a controllable, independent variable, by a low order polynomial in the independent variable, over some region of interest, [ξ1, ξ2], in the ξ-space.
Suppose that the ξ-variable is standardized by the transformation

    x = [2ξ - (ξ1 + ξ2)] / (ξ2 - ξ1) ,                            (1.3)

so that -1 ≤ x ≤ 1. The true response in equation (1.2) may then be written

    η = α + β e^(γx) ,                                            (1.4)

and let R denote the scaled region of interest, -1 ≤ x ≤ 1.
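The standardizing transformation can be sketched directly; the numerical region-of-interest endpoints below are assumed purely for illustration:

```python
def standardize(xi, xi1, xi2):
    # maps the region of interest [xi1, xi2] onto [-1, 1]
    return (2.0 * xi - (xi1 + xi2)) / (xi2 - xi1)

# assumed region of interest [2, 10]
print(standardize(2.0, 2.0, 10.0))   # -1.0 (lower endpoint)
print(standardize(6.0, 2.0, 10.0))   # 0.0 (midpoint)
print(standardize(10.0, 2.0, 10.0))  # 1.0 (upper endpoint)
```

The endpoints of the interval map to the endpoints of the scaled region, as required.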
Over this region of interest, N observations, y, of the response η are taken at a set of N levels of the x-variable. These levels, which need not all be distinct, constitute an experimental design. The data are then used to fit the approximating polynomial, of degree d - 1, ŷ(x), to the true response η, over the scaled region R, where the coefficients bi (i = 0, 1, ..., d - 1) are to be chosen to satisfy certain specified criteria. As a basic measure of goodness of the approximation, the integrated mean square error (MSE) was chosen.
The integrated MSE (J) can be considered in two parts, namely bias (B) and variance (V), where

    J = ∫R ε[ŷ(x) - η(x)]² dμ ,

    B = ∫R {ε[ŷ(x)] - η(x)}² dμ ,                                 (1.6)

and

    V = ∫R var ŷ(x) dμ ,                                          (1.7)

and where the measure μ is to be chosen such that it represents the distribution of interest over R. Since both integrands in equations (1.6) and (1.7) are non-negative, the integrals always exist.
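The decomposition J = B + V can be checked numerically. In the sketch below, all numerical values (model parameters, design levels, error variance) are assumed for illustration, and a least squares straight line is used only because its linearity makes ε[ŷ(x)] easy to write down; it is not the estimator derived in this thesis:

```python
import math

# All numerical values below are assumed for illustration.
alpha, beta, gamma, sigma2 = 0.0, 1.0, 1.0, 0.04   # true model and error variance
design = [-1.0, -0.5, 0.0, 0.5, 1.0]               # N = 5 design levels

def eta(x):
    # true exponential response
    return alpha + beta * math.exp(gamma * x)

# Least squares is linear in y, so E[b] = (X'X)^{-1} X' eta(design),
# computed here for the straight-line model b0 + b1*x.
s0 = float(len(design))
s1 = sum(design)
s2 = sum(x * x for x in design)
det = s0 * s2 - s1 * s1
t0 = sum(eta(x) for x in design)
t1 = sum(x * eta(x) for x in design)
Eb0 = (s2 * t0 - s1 * t1) / det
Eb1 = (s0 * t1 - s1 * t0) / det

def bias_sq(x):
    # integrand of B: (E[yhat(x)] - eta(x))^2
    return (Eb0 + Eb1 * x - eta(x)) ** 2

def var_yhat(x):
    # var yhat(x) = sigma^2 * (1, x)(X'X)^{-1}(1, x)'
    return sigma2 * (s2 - 2.0 * s1 * x + s0 * x * x) / det

def integrate(f, a=-1.0, b=1.0, n=2000):
    # midpoint rule over R = [-1, 1]
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

B = integrate(bias_sq) / 2.0                       # uniform measure: d(mu) = dx/2
V = integrate(var_yhat) / 2.0
J = integrate(lambda x: bias_sq(x) + var_yhat(x)) / 2.0
print(B, V, J)                                     # J = B + V
```

Since the integrand of J splits pointwise into the squared bias plus the variance, the two computed parts sum to J to machine precision.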
As in Karson, Hader and Manson (1967), where the problem of approximating polynomials by lower order polynomials was discussed, the primary criterion chosen to estimate the coefficients, bi, is the minimization of the integrated squared bias, B. Subject to this "Min B" requirement, the secondary criterion to be satisfied is the minimization of the integrated var ŷ(x), V.
It was found not possible to satisfy the primary criterion of minimization of B, given in equation (1.6), for all values of the parameter γ. For this reason, it was decided to derive an estimator, b' = (b0, b1, ..., b(d-1)), which simultaneously satisfies the primary criterion for a number of chosen γ values, called a grid, or lattice, of γ-values. In the selection of the grid of γ values, it has been necessary to assume that some a priori knowledge exists concerning the approximate range of γ: the a priori knowledge being either experimental or theoretical, or both. In addition, in the sections devoted to deriving designs, γ is assumed positive. However, the procedures for finding designs when γ is negative are exactly the same as described in these sections on designs. From a search of applied and theoretical publications (see references) which discussed the exponential response model, in some form or another, it was found that γ > 3 was considered large, and that most applications gave γ ∈ (0, 1). Therefore, where it has been necessary to investigate certain functions involving γ, γ is allowed to vary over the interval (0, 6).
The exponential functional relationship, equation (1.4), frequently occurs in many branches of science. Engineers and physicists use equation (1.4) to represent Newton's law of cooling, giving the temperature (η) of a cooling body as a function of time (the independent variable), where α represents the environmental temperature. In the field of agriculture it is often known as Mitscherlich's law, and is used to give the relation between crop yield (η) and the rate of application of fertilizer. In biology, this exponential model is useful for representing the growth of an organism as it approaches maturity. In physics, it is used to represent gamma decay and shielding models. Economists use it for predicting price changes as a function of demand.
The difficulty in obtaining very satisfactory estimators of α, β and γ has prompted this thesis. There is no transformation of equation (1.4) to a form which will allow the least squares estimators of the unknown parameters, α, β and γ, to be represented in closed form. The arithmetical labor involved in the numerical iteration techniques used to obtain least squares estimators of these parameters is great, and unsatisfactorily inefficient methods have been adopted instead; see Stevens (1951) and Inkson (1964).
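One partial simplification illustrates the point: for any fixed γ, equation (1.4) is linear in α and β. The sketch below (design levels, data, and the crude grid search are all assumed for illustration; this is not the method of Stevens or Inkson) profiles out γ by a one-dimensional search, with a closed-form 2 × 2 least squares solve at each trial value:

```python
import math

xs = [-1.0, -0.5, 0.0, 0.5, 1.0]            # assumed design levels
ys = [0.5 + math.exp(0.8 * x) for x in xs]  # noiseless data: alpha = 0.5, beta = 1, gamma = 0.8

def rss(g):
    # for fixed gamma = g the model is linear in (alpha, beta):
    # solve the 2x2 normal equations, return the residual sum of squares
    zs = [math.exp(g * x) for x in xs]
    n, sz, szz = float(len(xs)), sum(zs), sum(z * z for z in zs)
    sy, szy = sum(ys), sum(z * y for z, y in zip(zs, ys))
    det = n * szz - sz * sz
    a = (szz * sy - sz * szy) / det
    b = (n * szy - sz * sy) / det
    return sum((a + b * z - y) ** 2 for z, y in zip(zs, ys))

# crude grid search over the a priori range (0, 6) discussed above
gamma_hat = min((g / 1000.0 for g in range(1, 6000)), key=rss)
print(gamma_hat)  # recovers 0.8
```

Even this crude scheme requires thousands of linear solves, which suggests the scale of the arithmetical labor involved before electronic computation was routine.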
Imposing criteria on the estimator obtained in Chapter 3 in this order, i.e., Min B and, subject to Min B, attain Min V, is a different approach from the traditional one. Variance criteria have usually been preferred for choosing estimators (e.g., least squares). However, the assumption here is that the underlying true functional relationship has a different model structure than the fitted model. Therefore, variance criteria should not be the major objective, since the bias error, due to certain inadequacies in the fitted model, will usually be dominant.
Chapter 2 consists of a survey of the relevant literature, and includes some extensions to the above points. In Chapter 3, the estimator satisfying the proposed criteria is derived. In Chapter 4, designs which minimize V when ŷ(x) is linear (i.e., d = 2) are found, and the integrated bias is plotted for these designs. Chapter 5 is concerned with the quadratic estimator ŷ(x) (i.e., d = 3), and a comparison of the linear and quadratic estimators is made. In both Chapters 4 and 5 the estimators are compared, using MSE, with the corresponding least squares estimators. In Chapter 6, it is shown that one polynomial estimator can simultaneously protect (i.e., minimize B) against the true model being either a higher order polynomial or an exponential model, i.e., simultaneously satisfy conditions specified in this thesis and those given in Karson, Manson and Hader (1967).
2. REVIEW OF THE LITERATURE

This chapter consists of a survey of the relevant literature. In Section 2.1 a review of response surface literature, of a general form, is given. Section 2.2 is concerned with the different criteria of optimality for fitting response surfaces. Section 2.3 consists of a review of the literature concerned with the primary criterion, given in equation (1.6), i.e., minimizing the integrated squared bias.
2.1 General Review of Response Surface Literature

Hotelling (1941) considered the problem of designing an experiment for determining the level of an independent variable x for which an unknown functional relationship attains either a maximum or a minimum. An experiment is performed earlier, to provide reasonably good knowledge as to the approximate neighborhood of the maximum (or minimum). In addition, it is necessary to have some idea of the higher order parameters, β2, β3, β4 and β5; these may also be obtained by earlier experimentation. Now, using this information, a second order polynomial in the x variate may be fitted in the approximate region of the maximum (or minimum). This quadratic polynomial is fitted by least squares. Two kinds of error are considered: experimental error and the error due to bias. The experimental error is due to the inaccuracy in the observations, and the bias error results from the failure of the fitted equation to represent exactly the true response.
The chosen criterion of optimality was the minimization of

    ε(b1 - β1)² = σ²(b1) + (ε(b1) - β1)² ,

where σ²(b1) is the variance of b1 (σ² assumed fixed) and (ε(b1) - β1)² is the squared bias. Hotelling discovered that it was possible to find a set of designs which made the component associated with β3 in (ε(b1) - β1) zero, and, within this set of designs, how to find that subset which minimized the component associated with β4 in (ε(b1) - β1). Three distinct levels of x are required, and they must satisfy certain spacing conditions. Hotelling also found that it was possible to make the component associated with β4 zero for designs having only three distinct levels of x, but for these designs it was not possible to have the component associated with β3 zero.
Box and Wilson (1951) laid the foundation for response surface analysis. They defined the problem in the following manner: The response, η, is assumed dependent on the levels of k variables x1, x2, ..., xk, which are capable of exact measurement and control. For the u-th combination of levels (u = 1, 2, ..., N), ηu could not be measured without error, and so, over repeated trials, the observed response, yu, varied, where yu = ηu + eu, with eu a random variable. The mean of the observed responses was ηu and the variance was σ².
In the entire k-dimensional factor space, there is a bounded region R, called the experimental region. The experimentation is to be performed in R, to find that combination of factors yielding the maximum response. The problem is to find this combination of factors (x1⁰, x2⁰, ..., xk⁰), within R, in the smallest number of experiments. In the investigations with which Box and Wilson were concerned, the experimental error was usually fairly small, and the experiments were conducted sequentially. The method described in this paper exploited these advantages. Since the experimental error was small, small changes were determined accurately, and the experimenter could explore adequately a small sub-region of R with only a few experiments. In addition, the results obtained in one sub-region may be used to move to a second sub-region in which the response is higher. In this way, a region containing the maximum point of response can be reached. Box and Wilson develop, in their paper, a procedure which enables the experimenter to move from one sub-region to another where the response is improved. This procedure involves varying more than one factor at a time. The region containing the point of maximum response may be explored by determining the main effects and the interaction effects. These are determined by performing experiments in the sub-region and fitting an equation of suitable degree. When first order effects are small, it is suggested that a near stationary region has been reached, and a full second order model is fitted using a new design.
When it is desired to estimate first order terms only, a full, or fractional, 2^k factorial design is sufficient. For higher order effects, i.e., when the second order model is fitted, other types of designs are proposed, e.g., composite designs, which are used to determine the effects up to the second order. Composite designs are formed by adding further points to the two-level designs.

Box and Wilson were also concerned with the bias due to the fact that the response function was not represented exactly by the chosen polynomial. A matrix, called the alias matrix, was introduced to reflect the bias in the least squares estimators of the unknown parameters.
2.2 Criteria Used for Fitting Response Surfaces

Prior to 1958, model inadequacies were seldom considered and, therefore, variance criteria were the only type investigated. Hotelling's (1941) paper, discussed in Section 2.1, was a notable exception. It was mainly after 1958 that bias type criteria were introduced.
Elfving (1952) considered the problem of estimating two unknown parameters, β1 and β2, in the model

    yij = β1 x1i + β2 x2i + eij ,

where x1i and x2i denote observations on two independent variables X1 and X2, and where a total of n observations are taken, with nPi replicates of the i-th experiment, i = 1, 2, ..., r. The problem is to determine the Pi, where Pi > 0, Σi Pi = 1, such that certain criteria are satisfied, the criteria depending on the particular estimation problem. For the problem of estimating θ = a1β1 + a2β2, the estimator θ̂ = a1β̂1 + a2β̂2 was chosen. By the Gauss-Markov theorem, for any fixed set of Pi's the estimator θ̂ is the minimum variance unbiased estimate of θ. Since β̂1 and β̂2 minimize the weighted sum

    Σi nPi (ȳi - x1iβ1 - x2iβ2)² ,   where ȳi = (1/nPi) Σj yij ,

they depend on the Pi's. Elfving introduces the criterion of choosing the set of Pi's such that var θ̂ is minimized. The solution to this problem was found using a simple geometric argument. Elfving discovered that, for estimating a single quantity θ, experimentation need only be carried out at two points. For estimating both parameters β1 and β2, the criterion chosen was the minimization of var β̂1 + var β̂2. For this problem it was found that only three experimental points are needed. In general, for estimating s β's, at most s(s + 1)/2 experimental points are required for optimality.
Chernoff (1953) considered the problem of estimating s parameters θ1, θ2, ..., θs, and generalized the work of Elfving (1952). He showed that, under weak conditions, locally optimal designs, for large n, could be approximated by selecting a certain set of the experimental points available, and by repeating each of these chosen experimental points in certain specified proportions. The criterion of optimality chosen was one that involved Fisher's information matrix. For the case where it was desired to estimate one of the parameters, the optimality criterion corresponded to minimizing the variance of the asymptotic distribution of the maximum likelihood estimate of that parameter.
de la Garza (1954) investigated the problem of spacing observations in certain applications of polynomial regression. It is shown that, for a polynomial of degree m, the variance-covariance matrix of the estimated polynomial coefficients, given by a spacing of more than (m + 1) distinct values of the independent variate, can always be attained by using the same number of observations, but spacing them at only (m + 1) distinct points, the new (m + 1) distinct values being bounded by the minimum and the maximum of the previous values of the independent variate.

de la Garza pointed out that the results of his paper find application in experimental design. The determination of a spacing which optimizes some criteria involving the variance-covariance matrix is made simpler; an example quoted was from interpolation, where the criterion is minimizing the maximum of a quadratic form, the matrix of which was the variance-covariance matrix.
Box and Hunter (1957) considered the problem of estimating the response η = Φ(ξ1, ..., ξk), where ξ1, ..., ξk are the levels of k quantitative factors, and where the only assumption made is that Φ may be adequately represented by a d-th degree polynomial in a bounded region of interest. The problem of selecting designs was discussed. Box and Hunter introduced the concept of rotatable designs and gave reasons for preferring these designs in this setup. A rotatable design is one in which the variance of an estimated response, at a given point in the factor space, is dependent only on the distance of the point from the origin of the design. Rotatable designs having satisfactory variance functions are given for d = 1, 2, and k = 2, 3, ... . Blocking arrangements are derived.

The estimated response, ŷ, is fitted by least squares, and the criterion of rotatability requires that var ŷ(ξ1) = var ŷ(ξ2), where ξ1 and ξ2 are two points in the design space equi-distant from the origin. This gives two conditions on the design moments, which are derived in the Box and Hunter (1957) paper.

The concept of bias, due to the inadequacy of the fitted model, is discussed for the case when least squares estimators are used.
Hoel (1958) and (1961) discussed the problem of using the generalized variance as a criterion for the efficiency of least squares estimators in the general linear model with uncorrelated errors. It was considered that the best designs were those which minimized the generalized variance. Using this criterion, some results were obtained on the increased efficiency arising from doubling the number of equally spaced design points, when 1) the total interval for estimation is fixed, and 2) when this interval is doubled.
Williams (1958) considered fitting a polynomial of degree p in the independent variable X. Interest lay in the coefficient of X^p. The n values of X were allocated such that maximum precision was given the estimator of the coefficient of X^p. It was shown that optimum allocation required exactly (p + 1) distinct levels of X, with repetition allowed. If the experimenter restricted the allocation, then a restricted optimum allocation may be determined.
Folks (1958) considered the problem of approximating true polynomial responses with a lower degree polynomial, over an experimental region R. He used least squares estimators and derived conditions on the designs such that certain optimality criteria were satisfied. Eight different optimality criteria were considered:

1. Min(X) Max(x) var ŷ(x)
2. Min(X) ∫R var ŷ(x) dx
3. Min(X) Max(x) B²(x)
4. Min(X) ∫R B(x) dx
5. Min(X) Max(x) B(x)
6. Min(X) ∫R B²(x) dx
7. Min(X) ∫R MSE(ŷ(x)) dx
8. Max(X) (power of detecting higher order terms),
where X is the design matrix, ŷ(x) is the estimated response at the point x, B(x) is the bias at x due to the inadequacy of the fitted model to represent the true model, and MSE(ŷ(x)) is the mean square error of ŷ(x) at the point x. The vector of least squares estimators of the coefficients is b̂. A design satisfying any one of the above was considered optimal.

Folks showed that the optimality of a design was invariant under simple linear transformations. This enabled Folks to restrict attention to cubical regions, -1 ≤ xi ≤ 1, and spherical regions, Σ xi² ≤ 1, and still be able to generalize the results from these to rectangular and ellipsoidal regions. Optimal designs with respect to the variance and bias criteria were obtained for the p-dimensional case, where the approximating function is linear and the true relationship is quadratic.
David and Arens (1959) considered the problem of fitting a straight line to a response which may be considered approximately linear, but which may contain a quadratic component. Mean square error type criteria were used to find an optimal spacing of a single controllable factor. It was shown that no improvement is obtained by spacing x at more than 2 distinct levels. For the allocation of n/2 observations at each of the two distinct values of x, x1, x2, two optimality criteria were considered:

1. minimize the integrated expected squared error, the integration taken over [-1, 1], and
2. minimize the maximum expected squared error.
Criterion 1 leads to the symmetric spacing x1 = -x2, where x1 is the root of the equation

    x1⁴(3x1² - 1) = 2σ²/(9nc2²) ,

where c2 is the coefficient of the Legendre polynomial P2(x) in f(x) = c0 + c1P1(x) + c2P2(x). Criterion 2 also leads to the symmetric spacing x1 = -x2, but with a different value of x1.
Kiefer (1961) considered the construction of experimental designs which satisfy certain specified criteria. Several such criteria were given by Kiefer:

1. D-optimality: minimizing the generalized variance;
2. A-optimality: minimizing the average variance, trace V;
3. E-optimality: minimizing the largest eigenvalue of V;
4. minimizing the maximum diagonal element of V;
5. maximizing the average efficiency, this average being shown to be proportional to trace V;
6. L-optimality: for the normal case, maximizing the minimum power on spheres Σj ψj² = c² as c → 0, for testing the hypothesis ψ = 0, where ψj = Σi cij θi, the θi being parameters of the model.

In the above, V is the variance-covariance matrix of best linear estimators of the contrasts ψ. Kiefer gives conditions which, if satisfied by experimental designs, make these designs optimal according to any one of the above criteria.
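For a concrete sense of criteria 1-3, the sketch below (the three-point design and the simple-linear-regression setting are assumed purely for illustration) computes the quantities the D-, A- and E-criteria act on, for a design on [-1, 1]:

```python
# Illustrative only: a three-point design for straight-line regression,
# with M = X'X for rows (1, x) and V = M^{-1}.
design = [-1.0, 0.0, 1.0]
m00 = float(len(design))
m01 = sum(design)
m11 = sum(x * x for x in design)

det_M = m00 * m11 - m01 * m01                  # the D-criterion quantity
V = [[m11 / det_M, -m01 / det_M],
     [-m01 / det_M, m00 / det_M]]              # V = M^{-1}
trace_V = V[0][0] + V[1][1]                    # the A-criterion quantity

# eigenvalues of the symmetric 2x2 matrix V; E-optimality looks at the largest
mean = (V[0][0] + V[1][1]) / 2.0
disc = ((V[0][0] - V[1][1]) / 2.0) ** 2 + V[0][1] ** 2
eig_max = mean + disc ** 0.5

print(det_M, trace_V, eig_max)
```

Different designs would then be ranked by whichever of these scalars the chosen criterion minimizes.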
2.3 Minimum Bias Designs

Box and Draper (1959) considered the general problem of fitting a functional relationship η(x) = g(x) by a polynomial ŷ(x) of degree d1, over some standardized region of interest, R. The assumptions made on the response function are:

1. The true functional relationship η can be represented exactly in R as a polynomial of degree d2 > d1.
2. The fitted polynomial, ŷ(x), adequately represents, for all practical purposes, the true response.

Subject to these assumptions, they considered the problem of choosing designs which minimized J, the expected mean square error averaged over R, when ŷ(x) was fitted by least squares. The expected mean square error, J, is defined as

    J = (NΩ/σ²) ∫R ε[ŷ(x) - η(x)]² dx ,

where Ω⁻¹ = ∫R dx, and σ² is the experimental error variance.
R
quantity J, can be broken down into two cOllq)onents, the integrated
A
var y(tf), and the integrated squared bias, !.,!.,
J = V
+ B,
where
and
---
- ...
---------~-~~--~._--
------~._-~-----=->----~--~-~-~-~--'---'---~
•
B _ Nil
-
a
2
Box and Draper showed that, even though the designs which minimized J depended on certain unknown parameters, the bias contribution, B, was a far more important factor than the variance contribution, V. Conditions on the design moments were given such that, if these were satisfied, the designs would minimize B. The design moment conditions are that the moments of the design up to order d1 + d2 should be equated to the corresponding moments of a uniform distribution over R. Particular examples of designs are given for fitting linear functions to true quadratic responses, where R is some specified spherical region.
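The moment conditions can be checked directly. In the sketch below (the two-point design is an assumed illustration for d1 = 1, d2 = 2 on R = [-1, 1], not an example taken from the paper), the design moments of order one through three match those of the uniform distribution:

```python
# Illustrative check for d1 = 1, d2 = 2 on R = [-1, 1]: the assumed
# two-point design at +-1/sqrt(3) matches the uniform moments of order 1-3.
design = [-3.0 ** -0.5, 3.0 ** -0.5]

def design_moment(k):
    # k-th moment of the design, (1/N) * sum of x**k
    return sum(x ** k for x in design) / len(design)

def uniform_moment(k):
    # (1/2) * integral of x**k over [-1, 1]
    return 0.0 if k % 2 else 1.0 / (k + 1.0)

for k in (1, 2, 3):
    print(k, design_moment(k), uniform_moment(k))
```

Symmetry handles the odd moments, and the level 1/sqrt(3) is exactly what equates the second design moment to the uniform value 1/3.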
In a further paper, Box and Draper (1963) extended the work of their earlier paper by applying their development to fitting quadratic polynomials when the true model is cubic. They found various design moment conditions so that the design would minimize B, defined above. Only second order rotatable designs were considered.
Manson (1966) applied the Box and Draper all-bias development to the problem of fitting

    ŷ = a + Σ(j=1,...,d) bj e^(jZ)

to a true exponential response, η = α + β e^(γZ), where γ was restricted to be a positive integer, and Z was a bounded controllable variable. Transformations, to facilitate the solution for minimum bias designs, were made so that η = α + βx^γ and

    ŷ = Σ(j=0,...,d) bj x^j ,

with x = e^Z and b0 = a.
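The effect of the transformation can be checked numerically. The sketch below (coefficient values assumed for illustration, and assuming the substitution x = e^Z) verifies that each exponential term e^(jZ) in the approximator is an ordinary power x^j, and the true-model term e^(γZ) becomes x^γ:

```python
import math

a, bs, gamma = 0.5, [1.0, -0.25, 0.1], 2   # assumed coefficients, integer gamma
for Z in (-0.5, 0.0, 0.7):
    x = math.exp(Z)
    # each exponential term exp(j*Z) is the power x**j under x = exp(Z)
    lhs = a + sum(b * math.exp(j * Z) for j, b in enumerate(bs))
    rhs = a + sum(b * x ** j for j, b in enumerate(bs))
    assert abs(lhs - rhs) < 1e-12
    # the true-model term exp(gamma*Z) becomes x**gamma
    assert abs(math.exp(gamma * Z) - x ** gamma) < 1e-12
print("ok")
```

This is why the minimum bias machinery for polynomial responses carries over to the exponential model.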
Minimum bias designs were obtained for specific values of N, and an approximating function of degree d. These "Min B" designs contained several degrees of freedom in the choice of the design levels of the z, or Z, variate, which could be used to satisfy additional design requirements. It was shown that, for a given N, the same designs which minimize bias for approximating polynomials of degree one also minimize bias for general degree, d. In deriving these Min B designs, Manson applied some aspects of numerical analysis connected with polynomial equations.
Karson, Hader and Manson (1967) considered a problem which was set in the framework of the Box and Draper (1959) paper, except that there was no restriction requiring that least squares estimation be used. The problem was one of exploring polynomial response surfaces. Over the entire region of operability, it was assumed that the true response was a polynomial of degree d + k - 1. In addition, within some region of interest, R, it was assumed that a polynomial of degree d - 1 could be used adequately as an estimator of the true response. The fitted polynomial was written as

    ŷ(x) = x'b ,

where the vector of coefficients, b' = (b0, b1, ..., b(d-1)), is a linear combination of the observations. These estimators, b, are chosen such that the integrated squared bias, B, as defined in the Box and Draper (1959) paper with R = [-1, 1], is minimized. The estimators which
satisfy this Min B criterion form a class of minimum bias estimators. From this class, that estimator with the smallest integrated var ŷ(x), V, was chosen. In this way, for any fixed design, minimum B is obtained, and, subject to this, minimum integrated var ŷ(x). Min V | Min B estimators were found for the case of several independent variables xi (i = 1, 2, ..., p).

Conditions on the design moments were found such that the same estimator of degree (d - 1) would simultaneously protect, in the Min V | Min B sense, against polynomials of degree d, d + 1, d + 2, where d = 2 or 3; also this simultaneous protection was obtained for the cubic estimator protecting against polynomials of degree 4 and 5. Designs were found which satisfied the derived design moment conditions for this simultaneous protection.
In a paper by Karson, Hader and Manson, given at the VPI Eastern Regional meetings of the IMS and Biometrics Society (1968), the criterion of simultaneous protection, as mentioned above, was replaced by minimization of the integrated var ŷ(x), over designs. This work has been submitted for publication.
In the work which follows, and in Karson, Hader and Manson, the restriction of only using least squares estimators has been discarded, since the Box-Draper results indicate that the bias contribution to mean square error dominates the variance contribution, and least squares estimation essentially optimizes with respect to variance. An estimation procedure is used which, for a fixed design, minimizes the integrated squared bias, and subject to this minimizes the integrated variance of the estimator. The response is assumed exponential, as in Manson (1966). In certain sections of the work which follows it has been necessary to use a numerical range of values of one of the parameters (the parameter in the exponent) in the exponential response model. In order to choose a realistic range, a search was made of some applied publications which were concerned with the exponential model, and these publications are given in the reference list.
3. DERIVATION OF THE ESTIMATOR

3.1 Introduction

In this chapter the bias and variance criteria, i.e., equations (1.6) and (1.7), are developed, and these are applied to deriving the general estimating polynomial of degree (d - 1). First, sufficient conditions on the estimator to be derived in this chapter are found which minimize the bias given in equation (1.6). These conditions define a class of estimators which minimize this bias. Secondly, the estimator in this class which minimizes the integrated var ŷ(x), given in equation (1.7), is derived.
3.2 Terminology and Notation

Throughout the entire region of operability, the true model η, at the point x, is assumed to be an exponential, given by

    η = α + β e^(γx) = f'β ,                                       (3.1)

where f' = (1, e^(γx)) and β' = (α, β). Over some region of interest R, scaled to the closed interval -1 ≤ x ≤ 1,
it is desired to fit
n by
taking N observations on
n,
which will be
represented by
where the
£
are independent random variables, with zero mean, and
common variance
1\
y(x) =
2
CI.
The approximating function is given by
!{:Eo ,
where the (1 x d) vector
!{
•
= (1, x,
!{
is defined as
x 2 , . .. , x d-l)
I
,
and the (d x 1) vector, £, is 'to b'e determined.
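As a concrete sketch of this setup, the model of equation (3.1) and the observation scheme above can be written as follows; the values of α, β, γ, σ and the design points here are hypothetical illustrative choices, not values used in this work:

```python
import math
import random

# True exponential model of equation (3.1): eta(x) = alpha + beta * exp(gamma * x).
alpha, beta, gamma = 1.0, 2.0, 0.6      # illustrative values only

def eta(x):
    return alpha + beta * math.exp(gamma * x)

# N observations y_u = eta(x_u) + eps_u, with eps_u independent, zero mean,
# common variance sigma^2 (normal errors used here purely for illustration).
random.seed(0)
sigma = 0.1
design = [-0.8, -0.3, 0.3, 0.8]          # design points in the scaled region [-1, 1]
y = [eta(x) + random.gauss(0.0, sigma) for x in design]

print(len(y) == len(design))             # one observation per design point
```

The vector y is the raw material from which the linear estimators b = T'y of this chapter are built.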
As mentioned in Chapter 1, it will be assumed that some a priori
knowledge of the true range of γ is available: a priori knowledge which
is either theoretical, or experimental, or both. Recall that the need
for this a priori knowledge on γ was due to the necessity of placing a
grid of γ values over this range of γ, since it was not possible to
satisfy the stated criteria for all γ. From the a priori range of γ,
(r − 1) values will be chosen to form the grid

    γ₁ < γ₂ < ⋯ < γ_{r−1} .                                       (3.3)

If γ = γᵢ, i = 1, 2, ..., r − 1, then the true functional relationship,
at the point x, is

    η(x) = x_{γᵢ}' β ,                                            (3.4)

where

    x_{γᵢ}' = (1, e^{γᵢx}) .

Let the N observations on η be denoted by

    y' = (y₁, y₂, ..., y_N) ,                                     (3.5)

and, therefore,

    y = η + ε .                                                   (3.6)

The (N × r) matrix E, whose columns are the unit vector and the vectors
of exponentials at the grid values, is defined as

    E = (1 : x̃₁ : x̃₂ : ⋯ : x̃_{r−1}) ,                             (3.7)

where the (N × 1) vectors x̃ᵢ have elements e^{γᵢx_u}, u = 1, 2, ..., N.
It is proposed to restrict attention to those b's which are linear
combinations of the observations, i.e.,

    b = T' y ,                                                    (3.8)

where the (N × d) matrix T is to be determined. It remains, now, to
determine T such that the primary and secondary criteria are satisfied,
i.e., minimization of the integrated squared bias B, defined by
equation (1.6), for all γᵢ (i = 1, 2, ..., r − 1) of the grid, and,
subject to attaining this Min B, minimization of the integrated
variance V, defined by equation (1.7). Deriving T will define, in
terms of T, the estimating polynomial ŷ(x), which may be written
(from equations (3.2) and (3.8)) as

    ŷ(x) = x_1' T' y .                                            (3.9)
3.3 The Bias Criterion

Bias at the point x is defined as

    B(x) = E[ŷ(x)] − η(x) .                                       (3.10)

From equation (3.9), the expected value of ŷ(x) is given by

    E[ŷ(x)] = x_1' T' E[y] .                                      (3.11)

Using equation (3.6), and assuming γ is the i-th point on the grid,
E[y] = η = E_i β, where the (N × 2) matrix E_i is defined as
    E_i = [ 1   e^{γᵢx₁}
            1   e^{γᵢx₂}
            ⋮       ⋮
            1   e^{γᵢx_N} ] .                                     (3.12)

In terms of the model, given in equation (3.4), the fitted equation in
(3.2), and using equation (3.12), equation (3.11) becomes

    E[ŷ(x)] = x_1' T' E_i β ,   i = 1, 2, ..., r − 1 ,            (3.13)

when γ is one of the points on the grid. Therefore, equation (3.10)
becomes, when γ is the i-th point on the grid,

    B_i(x) = x_1' T' E_i β − x_{γᵢ}' β ,   i = 1, 2, ..., r − 1 .   (3.15)

Let

    v_i = T' E_i β ,   i = 1, 2, ..., r − 1 ,                     (3.16)

and

    r_i(x) = x_{γᵢ}' β ,   i = 1, 2, ..., r − 1 ,                 (3.17)

where v_i is a (d × 1) vector and r_i(x) is a scalar. Then, from
equations (3.15), (3.16) and (3.17),

    B_i²(x) = [x_1' v_i − r_i(x)]' [x_1' v_i − r_i(x)] .           (3.18)
In equations (1.6) and (1.7) the measure μ will be taken to be a
uniform distribution over the closed interval [−1, 1], assuming that
the experimenter's interest is evenly distributed over the region R.
Substituting equation (3.18) in equation (1.6) gives

    B_i = ½ ∫₋₁¹ [x_1' v_i − r_i(x)]' [x_1' v_i − r_i(x)] dx ,
          i = 1, 2, ..., r − 1 ,                                  (3.19)

that is,

    B_i = v_i' W₁ v_i − 2 v_i' w_{2i} + w_{3i} ,   i = 1, 2, ..., r − 1 ,   (3.20)

where

    W₁ = ½ ∫₋₁¹ x_1 x_1' dx ,                                     (3.21)

    w_{2i} = ½ ∫₋₁¹ x_1 r_i(x) dx ,   i = 1, 2, ..., r − 1 ,       (3.22)

and

    w_{3i} = ½ ∫₋₁¹ r_i²(x) dx ,   i = 1, 2, ..., r − 1 .          (3.23)

(In Chapters 4 and 5, where B_i, given by equation (3.20), is
investigated for various grids with d = 2 and 3, NB/σ² is the function
evaluated. This is in keeping with the criteria given by Box and Draper
(1959).) Where an integrand is a matrix, in equations (3.21) and
(3.22), the integration is to be performed element by element. The
(d × d) matrix W₁ is independent of T, and so are the (d × 1) vectors
w_{2i} and the scalars w_{3i}.
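The moments in equations (3.21)–(3.23) are simple to evaluate numerically. The sketch below (illustrative values of α, β and γᵢ; Simpson's rule for the averages ½∫₋₁¹ · dx) builds W₁, w_{2i} and w_{3i} for d = 2 and checks that the residual w_{3i} − w_{2i}'W₁⁻¹w_{2i}, which will turn out to be the minimized bias, is non-negative:

```python
import math

def avg(f, m=2000):
    """Simpson's rule for (1/2) * integral of f over [-1, 1]."""
    h = 2.0 / m
    s = f(-1.0) + f(1.0)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * s * h / 3.0

alpha, beta, gi = 1.0, 1.0, 0.6                 # illustrative parameter values
d = 2
r_i = lambda x: alpha + beta * math.exp(gi * x)  # r_i(x) = x_gamma_i' beta

# W1, w2i, w3i of equations (3.21)-(3.23), evaluated element by element
W1  = [[avg(lambda x, a=a, b=b: x ** (a + b)) for b in range(d)] for a in range(d)]
w2i = [avg(lambda x, a=a: x ** a * r_i(x)) for a in range(d)]
w3i = avg(lambda x: r_i(x) ** 2)

# For d = 2, W1 = diag(1, 1/3), so W1^{-1} = diag(1, 3)
v_min  = [w2i[0], 3.0 * w2i[1]]                  # W1^{-1} w2i
min_Bi = w3i - (w2i[0] * v_min[0] + w2i[1] * v_min[1])
print(min_Bi >= 0.0)
```

Since w_{3i} − w_{2i}'W₁⁻¹w_{2i} is the squared L² distance from r_i(x) to its best linear approximation, it must be non-negative, which the check confirms.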
The (d × d) matrix W₁ is symmetric, positive definite (Manson, 1966),
and therefore has an inverse. Hence B_i, i = 1, 2, ..., r − 1, may be
expressed as

    B_i = (v_i − W₁⁻¹ w_{2i})' W₁ (v_i − W₁⁻¹ w_{2i})
          + w_{3i} − w_{2i}' W₁⁻¹ w_{2i} .                         (3.24)

Since the only term involving T is v_i, B_i given by equation (3.24) is
minimized (for every i = 1, 2, ..., r − 1) with respect to T when

    v_i = W₁⁻¹ w_{2i} ,   i = 1, 2, ..., r − 1 .                   (3.25)

Using equations (3.16) and (3.22), equation (3.25) becomes

    T' E_i β = W₁⁻¹ w_{2i} ,   i = 1, 2, ..., r − 1 .              (3.26)

Using equation (3.17), equation (3.26) becomes

    T' E_i β = W₁⁻¹ · ½ ∫₋₁¹ x_1 (x_{γᵢ}' β) dx ,   i = 1, 2, ..., r − 1 .   (3.27)

The (N × 2) matrix E_i may be written

    E_i = (1 : x̃ᵢ) ,   i = 1, 2, ..., r − 1 ,                      (3.28)

where the (N × 1) vector 1 is the unit vector, that is,

    1' = (1, 1, ..., 1) ,                                          (3.29)
and the (N × 1) vectors x̃ᵢ are defined as

    x̃ᵢ' = (e^{γᵢx₁}, e^{γᵢx₂}, ..., e^{γᵢx_N}) ,   i = 1, 2, ..., r − 1 .   (3.30)

Therefore, equation (3.27) becomes, for every i = 1, 2, ..., r − 1,

    T'1 α + T'x̃ᵢ β = W₁⁻¹ [ α · ½ ∫₋₁¹ x_1 dx + β · ½ ∫₋₁¹ e^{γᵢx} x_1 dx ] .   (3.31)

Thus, sufficient conditions on T', such that equation (3.31) is
satisfied, are

    T'1 = W₁⁻¹ · ½ ∫₋₁¹ x_1 dx                                     (3.32)

and

    T'x̃ᵢ = W₁⁻¹ · ½ ∫₋₁¹ e^{γᵢx} x_1 dx ,   i = 1, 2, ..., r − 1 .   (3.33)
Lemma 1:

    W₁⁻¹ · ½ ∫₋₁¹ x_1 dx = (1, 0, ..., 0)' ,   a (d × 1) vector.

Proof: This lemma will be proved for d odd; however, the proof is the
same for d even. From the definition of W₁, equation (3.21), the (j, k)
element of W₁ is ½ ∫₋₁¹ x^{j+k−2} dx, which is 1/(j + k − 1) when j + k
is even and zero otherwise, so that

    W₁ = [ 1      0      1/3     ⋯    1/d
           0      1/3    0       ⋯    0
           1/3    0      1/5     ⋯    1/(d+2)
           ⋮                           ⋮
           1/d    0      1/(d+2) ⋯    1/(2d−1) ] ,

and

    ½ ∫₋₁¹ x_1 dx = (1, 0, 1/3, 0, ..., 1/d)' .

Therefore, premultiplication of both sides of equation (3.32) by W₁
gives

    W₁ (T'1) = (1, 0, 1/3, 0, ..., 1/d)' .                         (3.34)

Since equation (3.32) defines a unique T'1 (since W₁⁻¹ exists), and
since the right-hand side of equation (3.34) is exactly the first
column of W₁, equation (3.34) holds if and only if the (d × 1) vector
T'1 satisfies

    T'1 = (1, 0, ..., 0)' .
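Lemma 1 can be checked exactly for any particular d. The sketch below (d = 5 is an arbitrary illustrative choice) builds W₁ and ½∫₋₁¹ x₁ dx from the moment formula ½∫₋₁¹ xᵐ dx = 1/(m + 1) for even m (zero for odd m) and solves the system in exact rational arithmetic:

```python
from fractions import Fraction

d = 5   # any degree works; d = 5 is illustrative

def half_moment(m):
    """(1/2) * integral of x^m over [-1, 1]: 1/(m+1) for even m, 0 for odd m."""
    return Fraction(1, m + 1) if m % 2 == 0 else Fraction(0)

W1  = [[half_moment(a + b) for b in range(d)] for a in range(d)]
rhs = [half_moment(a) for a in range(d)]        # (1/2) * integral of x1 dx

def solve(M, v):
    """Exact Gauss-Jordan elimination over the rationals."""
    M = [row[:] + [vi] for row, vi in zip(M, v)]
    n = len(M)
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[p] = M[p], M[c]
        M[c] = [e / M[c][c] for e in M[c]]
        for r in range(n):
            if r != c and M[r][c] != 0:
                M[r] = [e - M[r][c] * f for e, f in zip(M[r], M[c])]
    return [row[-1] for row in M]

t = solve(W1, rhs)
print(t == [1, 0, 0, 0, 0])   # Lemma 1: the solution is (1, 0, ..., 0)'
```

The right-hand side is exactly the first column of W₁, which is why the solution must be the first unit vector.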
Therefore, using Lemma 1, and the r conditions in equations (3.32) and
(3.33), the matrix T' must satisfy

    T'1 = (1, 0, ..., 0)'                                          (3.35)

and

    T'x̃ᵢ = W₁⁻¹ w_{Ei} ,   i = 1, 2, ..., r − 1 ,                  (3.36)

where the (d × 1) vectors w_{Ei} are defined as

    w_{Ei} = ½ ∫₋₁¹ e^{γᵢx} x_1 dx ,   i = 1, 2, ..., r − 1 .       (3.37)

The conditions in equations (3.35) and (3.36), which may be written as

    T'E = A ,                                                      (3.38)

where the (N × r) matrix E is defined by equation (3.7), and the
(d × r) matrix A is defined as

    A = ( (1, 0, ..., 0)' : W₁⁻¹ w_{E1} : ⋯ : W₁⁻¹ w_{E,r−1} ) ,

are the sufficient conditions for satisfying the primary criterion.
The condition in equation (3.38) defines a class of T's which, in turn,
through equation (3.8), defines a class of estimators. Any estimator in
this class minimizes the bias B given in equation (1.6) if γ is one of
the grid values. In addition, the value of minimum B_i, for each i, is,
using equations (3.24) and (3.25),

    Min B_i = w_{3i} − w_{2i}' W₁⁻¹ w_{2i} ,   i = 1, 2, ..., r − 1 .   (3.39)

3.4 Variance Criterion

The conditions in equation (3.38) represent a set of sufficient
conditions on the elements of T which generate a minimum B (Min B)
estimator for any γ on the grid. Subject to these conditions, the
secondary criterion, a variance criterion, is introduced to uniquely
specify T, as a function of an arbitrary design. This defines the
Min V | Min B estimator.
Using equations (3.2), (3.6), (3.8), and the assumptions on ε,

    var ŷ(x) = x_1' T' T x_1 σ² .                                  (3.40)

It is now intended to minimize the integrated var ŷ(x) over the T'
satisfying the conditions in equation (3.38), the integration being
with respect to the uniform distribution over [−1, 1]. That is, subject
to minimum B_i (i = 1, 2, ..., r − 1), minimum V is obtained by
minimizing

    V = ½ ∫₋₁¹ var ŷ(x) dx = (σ²/2) ∫₋₁¹ x_1' T' T x_1 dx ,         (3.41)

with respect to T and subject to the conditions in equation (3.38).
It is intended to minimize V, given in equation (3.41), with respect to
T, subject to the conditions in equation (3.38), using the Lagrangian
multiplier technique. It will be useful to write the (d × N) matrix T'
as

    T' = [ t₁'
           t₂'
           ⋮
           t_d' ] ,                                                (3.42)

where each t_j' is a (1 × N) vector, for j = 1, 2, ..., d. From
equations (3.40) and (3.42),

    var ŷ(x) = σ² Σ_{j=1}^{d} Σ_{k=1}^{d} x^{j+k−2} t_j' t_k .      (3.43)

Integrating equation (3.43) with respect to x over [−1, 1], the terms
with j + k odd vanish, so that

    ∫₋₁¹ var ŷ(x) dx = σ² Σ_{j=1}^{d} Σ_{k=1}^{d} c_{jk} t_j' t_k ,
        where c_{jk} = 2/(j + k − 1) for j + k even,
        and c_{jk} = 0 otherwise.                                  (3.44)
It is desired to minimize this V subject to the conditions in equation
(3.38), which may be written as

    t_j' E = a_j' ,   j = 1, 2, ..., d ,

where a_j' is the j-th row of A. The Lagrangian equations are

    L_j = V − 2 t_j' E λ_j ,   j = 1, 2, ..., d ,                   (3.45)

where the λ_j are (r × 1) vectors of Lagrangian multipliers.
Differentiating and setting the derivatives equal to zero gives, in
matrix form,

    2 (T W₁ − E Λ) = 0   (N × d) ,                                  (3.46)

where the (r × d) matrix Λ is defined as

    Λ = ( λ₁ : λ₂ : ⋯ : λ_d ) ,

and the matrix T is given by equation (3.42). Using equations (3.46)
and (3.38), and assuming (E'E) is non-singular,

    W₁⁻¹ Λ' (E'E) = A ,   that is,   Λ' = W₁ A (E'E)⁻¹ .             (3.47)
Substituting Λ into equation (3.46) gives T = E Λ W₁⁻¹, that is,

    T' = A (E'E)⁻¹ E' .                                             (3.48)

Similarly for d even, i.e., equations (3.44), (3.46), and (3.48) are of
the same form. If the second derivative of equation (3.45) is taken
with respect to t_j, j = 1, 2, ..., d, then the matrix of second
derivatives is

    2 W₁ .                                                          (3.49)

Since W₁ is positive definite, T', as specified in equation (3.48), is
the matrix which minimizes V defined by equation (3.41). Hence,
equation (3.48) gives the T matrix which defines the (Min V | Min B)
estimating polynomial, which is given by

    ŷ(x) = x_1' A (E'E)⁻¹ E' y .                                    (3.50)
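The construction of equations (3.38) and (3.48) can be sketched numerically. In the sketch below the grid, the design and the degree are illustrative choices only, not values used in this work; the code builds E and A by quadrature, forms T' = A(E'E)⁻¹E', and verifies the sufficient condition T'E = A, whose first column is the condition T'1 = (1, 0, ..., 0)' of Lemma 1:

```python
import math

def avg(f, m=2000):
    """Simpson's rule for (1/2) * integral of f over [-1, 1]."""
    h = 2.0 / m
    s = f(-1.0) + f(1.0)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * s * h / 3.0

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def mat_inv(M):
    """Gauss-Jordan inverse with partial pivoting, for small matrices."""
    n = len(M)
    A = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [e / A[c][c] for e in A[c]]
        for r in range(n):
            if r != c:
                A[r] = [e - A[r][c] * f for e, f in zip(A[r], A[c])]
    return [row[n:] for row in A]

d = 2                                   # linear estimating polynomial (degree d - 1)
grid = [0.5, 1.5, 2.5]                  # r - 1 = 3 grid values of gamma, so r = 4
design = [-0.8, -0.3, 0.3, 0.8]         # N = 4 design points in [-1, 1]

# E (N x r): columns 1 and exp(gamma_i * x_u), equation (3.7)
E = [[1.0] + [math.exp(g * x) for g in grid] for x in design]

# W1 of equation (3.21) and the vectors w_Ei of equation (3.37)
W1 = [[avg(lambda x, a=a, b=b: x ** (a + b)) for b in range(d)] for a in range(d)]
W1inv = mat_inv(W1)
wE = [[avg(lambda x, a=a, g=g: math.exp(g * x) * x ** a) for a in range(d)]
      for g in grid]

# A (d x r): first column (1, 0, ..., 0)', remaining columns W1^{-1} w_Ei
cols = [[1.0] + [0.0] * (d - 1)]
cols += [[sum(W1inv[a][b] * w[b] for b in range(d)) for a in range(d)] for w in wE]
A = [[c[a] for c in cols] for a in range(d)]

# T' = A (E'E)^{-1} E', equation (3.48)
Et = [list(r) for r in zip(*E)]
Tt = mat_mul(mat_mul(A, mat_inv(mat_mul(Et, E))), Et)

# Check the sufficient condition T'E = A, equation (3.38)
TE = mat_mul(Tt, E)
err = max(abs(TE[i][j] - A[i][j]) for i in range(d) for j in range(len(cols)))
print(err < 1e-6)
```

Once T' is available, the estimator is simply b = T'y, equation (3.8), for any observation vector y taken at the design points.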
It should be noted that, since equation (3.48) requires that (E'E) be
non-singular, a requirement is

    N ≥ r ,                                                         (3.51)

where r − 1 is the number of points in the grid.
3.5 Some Properties of the Estimator

Using the estimator given by equation (3.50), minimum B is obtained
when γ is any point in the grid in equation (3.3). Subject to this
minimum B, minimum V is obtained, for any given experimental design.
In Chapters 4 and 5 the bias is investigated for γ values not on the
grid, and a comparison of this bias with minimum bias is made; minimum
bias, for a given γ value, is defined to be the value of equation
(3.39) evaluated at this given value of γ. The ŷ(x) obtained depends
on the chosen grid of gammas, as shown in the A and E matrices.

From equation (3.8),

    E[b] = T' η ,                                                   (3.52)

that is, using equations (3.12), (3.32), (3.33) and Lemma 1,
    E[b] = (1, 0, ..., 0)' α + W₁⁻¹ w_{Ei} β ,                       (3.53)

when the true γ is the i-th grid value. When the true value of γ is not
one of the grid points, from equation (3.52),

    E[b] = T' E_γ β ,                                               (3.54)

where the (N × 2) matrix E_γ is defined by

    E_γ = (1 : x̃_γ) ,   with rows (1, e^{γx_u}) ,   u = 1, 2, ..., N ,   (3.55)

and β is given by equation (3.1). Using the expression in equation
(3.48), equation (3.54) becomes

    E[b] = A (E'E)⁻¹ E' E_γ β .                                      (3.56)
The variance of b is given by

    var b = var (T'y) ,                                             (3.57)

that is, var b = T' T σ². Using the definition of T', this becomes

    var b = A (E'E)⁻¹ A' σ² .                                        (3.58)

In terms of the estimator ŷ(x), equations (3.52), (3.54) and (3.58)
become

    E[ŷ(x)] = x_1' [ (1, 0, ..., 0)' α + W₁⁻¹ w_{Ei} β ] ,            (3.59)

when γ is the i-th grid value. For γ not a point on the grid, equation
(3.59) becomes

    E[ŷ(x)] = x_1' A (E'E)⁻¹ E' E_γ β .                              (3.60)

The variance of ŷ(x) is defined as

    var ŷ(x) = x_1' A (E'E)⁻¹ A' x_1 σ² .                            (3.62)

It should be noted that the matrix E is independent of the degree of
the approximating polynomial, and therefore, once (E'E) has been
inverted, for a particular grid, it may be used in the approximating
polynomial for any value of d.

In Chapters 4 and 5 the linear and quadratic approximating functions
are investigated.
4. THE LINEAR ESTIMATOR (d = 2)

4.1 Introduction

In this chapter, a linear approximating polynomial

    ŷ(x) = b₀ + b₁x = x_1' b                                         (4.1)

is used to fit the exponential response

    η(x) = α + β e^{γx} ,                                            (4.2)

where the (1 × 2) vector x_1' is defined as

    x_1' = (1, x) ,                                                  (4.3)

and the (1 × 2) vector b' is given by

    b' = (b₀, b₁) .                                                  (4.4)
Throughout this chapter and the next, the estimator used is the
Min V | Min B estimator, as defined in Chapter 3, i.e., b as given by
equations (3.8) and (3.48), with d = 2 or 3. Over the ranges of γ
investigated, a grid of 3 values of γ (i.e., r = 4) was considered
sufficient, since at the 3 grid points minimum bias is obtained, and
this forces the bias curve to attain minimum bias at these values of γ.
For γ values between grid points, the difference between actual bias
and minimum bias will depend on the value of γ and the distance between
grid points. At a grid point, minimum bias is attained, and as γ
increases away from this grid point, the rate of increase for
the actual bias is slightly greater than that for minimum bias. This
increase continues until a maximum is attained, after which the bias
begins to approach minimum bias again, which is attained at the next
grid point. If the minimum bias curve, at γ, is denoted by

    B(γ) = Min B at γ ,                                              (4.5)

and the actual bias curve is denoted by

    B_γ = actual bias at γ ,                                         (4.6)

then

    B_γ − B(γ) ≥ 0   for every γ .                                   (4.7)

Further, the differences given by the inequality in equation (4.7) are
very small for the ranges of γ considered. These points will be shown,
in the following sections, by graphs of the function B_γ for the
various grids used, and by tables of B_γ and B_γ − B(γ) (see Table 4.1
and Figure 4.1).

In this chapter an additional criterion, i.e., a criterion other than
the criteria satisfied by the Min V | Min B estimator, is introduced.
This additional criterion is satisfied by the choice of designs. The
criterion chosen in this chapter, and in Chapter 5, is the
minimization, over designs, of V, the integrated var ŷ(x), for certain
grids and values of N. The class of designs over which V is minimized
is the class of four-level designs characterized by
Table 4.1 Bias and deviations for various grids

(For γ = 0.2, 0.4, ..., 4.0, the table lists the bias B_γ and the
deviation B_γ − B(γ) under each of the four defining vectors a–d below.
The deviations remain small throughout; at γ = 4.0 the biases range
from 244.31639 to 244.39782 and the deviations from .01044 to .0928.)

a (2,4; .6, .3; 0.340514, 0.834771)
b (2,4; 2, 1; 0.846623, 0.309123)
c (2,4; 1.5, 1; 0.840539, 0.315348)
d (2,4; 1.3, 0.5; 0.83885, 0.319857)

where the notation (d, r; γ_c, δ; t₁, t₂) is used.
Figure 4.1 An illustrative example of the deviations of bias, B_γ,
from Min B, B(γ), where γ₁, γ₂ and γ₃ are the grid points.
the levels ±t₁ and ±t₂, t₁, t₂ ≥ 0, where n₁ observations are taken at
each of ±t₁ and n₂ observations at each of ±t₂. The total number of
observations is N = 2(n₁ + n₂). The reason for choosing four-level
designs was that this is the minimum number of levels necessary to
ensure the non-singularity of (E'E).

To represent an estimator with its grid of γ values, and the design
used, the following notation has been adopted:

    (d, r; γ_c, δ; t₁, t₂) ,                                         (4.8)

where d − 1 is the degree of the approximating polynomial, r − 1 is the
number of grid points, γ_c is the mid-point of the grid, and δ is the
distance between grid points. The design levels are given by ±t₁ and
±t₂.

The integrated mean square error is computed, in this chapter, for the
fitted model

    ŷ*(x) = α̂₀ + α̂₁ x ,                                             (4.9)

where (α̂₀, α̂₁) are the least squares estimators, and where the true
functional relationship is given by equation (4.2). This integrated
mean square error is compared with that of the linear Min V | Min B
estimator (d = 2), for 0 < γ ≤ 6.
4.2 Bias for the Linear Estimator

Integrated squared bias (B_γ) was calculated, for various grid
protection systems on γ, and was compared with minimum bias (B(γ)) for
0 < γ ≤ 6. The calculation of this bias, B_γ, depends on the designs
used. The designs chosen for this calculation were those designs which
satisfied the additional criterion mentioned above. These designs are
given in Table 4.2.

Since (E'E) was assumed non-singular in the derivation of T in
Chapter 3, it is necessary that the number of design levels be greater
than, or equal to, r, where r − 1 is the number of distinct, non-zero
points in the grid of γ values. Therefore, this requires that the total
number of observations, N, satisfy the inequality

    N ≥ r .                                                          (4.10)

In addition, one requirement initially imposed, and which is satisfied
by the designs mentioned in section 4.1, is that all odd design moments
be zero. This requirement of design symmetry is made primarily because
of algebraic convenience. Therefore, since in this work r = 4, the
values of N used are greater than, or equal to, four; the values
actually considered are N ∈ {4, 6, 8, 10} for the four-level symmetric
design.

From equation (3.24), with T' satisfying equations (3.35) and (3.36),
if γ is any one of the (r − 1) grid points, then minimum integrated
squared bias, B(γᵢ), is attained, and is given by
Table 4.2 Designs for minimizing V₂

(Four-level symmetric designs, with n₁ observations at each of ±t₁ and
n₂ at each of ±t₂, for N = 4, 6, 8 and 10; entries are N, n₁, t₁, n₂,
t₂ and Min V₂. The N = 4 designs are:)

Grid (.3, .6, .9):     n₁ = 1, t₁ = 0.340514, n₂ = 1, t₂ = 0.834771, Min V₂ = 1.888936
Grid (1, 2, 3):        n₁ = 1, t₁ = 0.846623, n₂ = 1, t₂ = 0.309123, Min V₂ = 1.890820
Grid (0.5, 1.5, 2.5):  n₁ = 1, t₁ = 0.840539, n₂ = 1, t₂ = 0.315348, Min V₂ = 1.892177
Grid (.8, 1.3, 1.8):   n₁ = 1, t₁ = 0.83885,  n₂ = 1, t₂ = 0.319857, Min V₂ = 1.893265
    B(γᵢ) = w_{3i} − w_{2i}' W₁⁻¹ w_{2i} ,   i = 1, 2, 3 ,            (4.11)

where w_{2i}, w_{3i} and W₁ are defined, using the definitions given in
Chapter 3, as follows:

    w_{3i} = ½ ∫₋₁¹ (α + β e^{γᵢx})² dx ,   i = 1, 2, 3 .             (4.12)

Let the vector w_{2i} be expressed as

    w_{2i} = (w_{1i}, w̄_{2i})' ,   i = 1, 2, 3 ,                      (4.13)

where

    w_{1i} = ½ ∫₋₁¹ (α + β e^{γᵢx}) dx
           = α + β (e^{γᵢ} − e^{−γᵢ}) / (2γᵢ) ,   i = 1, 2, 3 ,        (4.14)
and

    w̄_{2i} = ½ [ α ∫₋₁¹ x dx + β ∫₋₁¹ x e^{γᵢx} dx ]
            = β [ (e^{γᵢ} + e^{−γᵢ}) / (2γᵢ) − (e^{γᵢ} − e^{−γᵢ}) / (2γᵢ²) ] ,
              i = 1, 2, 3 .                                           (4.15)

For d = 2, W₁⁻¹ is given by

    W₁⁻¹ = [ 1  0
             0  3 ] .                                                 (4.16)

Let

    g_{1i} = e^{γᵢ} − e^{−γᵢ}   and   g_{2i} = e^{γᵢ} + e^{−γᵢ} ,
    i = 1, 2, 3 .                                                     (4.17)

Then, using equations (4.13), (4.14), (4.15), and the definitions of
g_{1i} and g_{2i}, w_{2i} becomes

    w_{2i} = ( α + β g_{1i}/(2γᵢ) ,
               β [ g_{2i}/(2γᵢ) − g_{1i}/(2γᵢ²) ] )' ,   i = 1, 2, 3 .   (4.18)
From equations (4.12) and (4.17), w_{3i} may be expressed as

    w_{3i} = α² + αβ g_{1i}/γᵢ + β² g_{1i} g_{2i}/(4γᵢ) ,   i = 1, 2, 3 .   (4.19)

Therefore,

    w_{2i}' W₁⁻¹ w_{2i} = [ α + β g_{1i}/(2γᵢ) ]²
        + 3β² [ g_{2i}/(2γᵢ) − g_{1i}/(2γᵢ²) ]² ,   i = 1, 2, 3 .      (4.20)

Therefore, using equations (4.11), (4.19) and (4.20), minimum
integrated squared bias, B(γᵢ), with the true γ value equal to the i-th
grid point, is given by

    B(γᵢ) = β² [ g_{1i} g_{2i}/(4γᵢ) − g_{1i}²/(4γᵢ²)
            − 3 ( g_{2i}/(2γᵢ) − g_{1i}/(2γᵢ²) )² ] ,   i = 1, 2, 3 .   (4.21)

If the true γ is actually at any one of the three grid points, equation
(4.21) will give the value of the integrated squared bias, B(γᵢ), which
will be, in fact, the minimum value, i.e., Min B for γ = γᵢ. However,
suppose the true value of γ is not one of the values on the grid. Then,
from equation (3.24), it can be seen that the value of the integrated
squared bias is equal to B(γ), given in equation (4.21) with γᵢ
replaced by γ, plus the positive quantity defined by the expression

    (v_γ − W₁⁻¹ w_{2γ})' W₁ (v_γ − W₁⁻¹ w_{2γ}) ,                       (4.22)

where v_γ, W₁ and w_{2γ} are defined in Chapter 3, with γ not a grid
value. It is possible to express equation (4.22) in terms of the matrix
T, which defines the (Min V | Min B) estimator. To do this, the
following expression is required. Recall the definition of the vector
v_γ, in Chapter 3:

    v_γ = T'1 α + T' x̃_γ β ,                                          (4.23)

where the (N × 1) vector x̃_γ is defined by

    x̃_γ' = (e^{γx₁}, e^{γx₂}, ..., e^{γx_N}) .                         (4.24)
(4.26)
51
therefore, using equations (4.2,3), (4.24) and (4.26), the expression,(4.22) may be written as
where
That is, when the true value. of .., is not one of the grid points,
the integrated square bias of the linear Min V
I Min
B estimator
is given by
B..,
= B(..,)
+ ~2(T'E
- W- l -J!;
w-)' Wl(T'E
- W- l -e
w)
-y
-..,
l
l
,
(4.28)
where the function B(..,) is given by equation (4.21), with the argument
of the function at the value'of..,.
This function in equation (4.28)
has been evalup.ted for various grids.
From the definition, given in
equation (1.6), the primary criterion is N2 B(..,).
The graphs shown
C1
in this section, and the next, show N2 B(..,) plotted against .." with
2
2
C1
.
.
~ = 1, since ~ acts as a scale factor in equation (4.28), changes
C1
a
2
in the value of ~ will only cause corresponding increa~e~ or
C1
decreases in the various effects shown (Figures 4.2), (4.3) and Table
(4.1».
It can be shown that equation (4.28) reduces to
-'.
-
..
'
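The closed form (4.21) can be cross-checked against a direct numerical evaluation of w₃ − w₂'W₁⁻¹w₂ from equations (4.12)–(4.16). In the sketch below the values of α, β and γ are illustrative; as noted above, B(γ) does not involve α, so the check also confirms that α cancels:

```python
import math

def avg(f, m=2000):
    """Simpson's rule for (1/2) * integral of f over [-1, 1]."""
    h = 2.0 / m
    s = f(-1.0) + f(1.0)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * s * h / 3.0

def B_closed(g, beta=1.0):
    """Minimum integrated squared bias, equation (4.21)."""
    g1 = math.exp(g) - math.exp(-g)      # g_{1i} of equation (4.17)
    g2 = math.exp(g) + math.exp(-g)      # g_{2i} of equation (4.17)
    return beta ** 2 * (g1 * g2 / (4 * g) - g1 ** 2 / (4 * g ** 2)
                        - 3.0 * (g2 / (2 * g) - g1 / (2 * g ** 2)) ** 2)

def B_quadrature(g, alpha=1.0, beta=1.0):
    """w3 - w2' W1^{-1} w2 computed directly, with W1^{-1} = diag(1, 3)."""
    r = lambda x: alpha + beta * math.exp(g * x)
    w1 = avg(r)                          # first component, equation (4.14)
    w2 = avg(lambda x: x * r(x))         # second component, equation (4.15)
    w3 = avg(lambda x: r(x) ** 2)        # equation (4.12)
    return w3 - (w1 ** 2 + 3.0 * w2 ** 2)

gamma = 0.6
print(abs(B_closed(gamma) - B_quadrature(gamma)) < 1e-9)
```

At γ = 0.6 the minimum bias is small (of order 10⁻³ for β = 1), in line with the small deviations reported in Table 4.1 for grids in this range.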
Figure 4.2 NB_γ/σ² for the estimator defined by
(2,4; .6, .3; .340514, .834771), where the notation
(d, r; γ_c, δ; t₁, t₂) is used.
Figure 4.3 The integrated MSE for the least squares estimator and the
Min V | Min B estimator, for the defining vector
(2,4; .6, .3; .340514, .834771).
In using a linear estimator (a Min V | Min B estimator), it cannot be
expected to perform well for relatively large values of γ (γ > 3, say).
For example, for γ = 3.6, using a linear approximating polynomial, the
minimum value of NB/σ² exceeds 106 (with β²/σ² = 1), and this is the
best that can be achieved; i.e., if one of the grid points was, in
fact, the true value of γ, say γ = 3.6, then the minimum integrated
squared bias would still be greater than 106 σ²/N. Therefore, if the
a priori information indicates that γ will be greater than 3, then the
linear approximating polynomial should not be considered.

The bias function B_γ, for 0 < γ ≤ 6, was evaluated (see Table 4.1) for
the following situations:

    (2,4; 0.6, 0.3; 0.340514, 0.834771) ,                             (4.30)
    (2,4; 2, 1; 0.846623, 0.309123) ,                                 (4.31)
    (2,4; 1.5, 1; 0.840539, 0.315348) ,                               (4.32)
    (2,4; 1.3, 0.5; 0.83885, 0.319857) .                              (4.33)

The four-level symmetric designs given in the vectors (4.30) through
(4.33) are the designs which minimize V₂, the integrated var ŷ(x)
(d = 2), for the corresponding grids. These "Min V₂" designs are listed
in Table 4.2, and are for the situation N = 4.

The grid given in vector (4.30) was selected for the linear estimator
since, if γ was expected to lie in the interval [0, 1], then for any
such γ the difference between B_γ and Min B (i.e., B(γ)) is zero (to 10
decimal places). In addition, the variance of the estimator, when using
this grid system, compared very favourably with
the variance of the estimator using other grid systems. The grids given
in vectors (4.31) through (4.33) are shown for comparative reasons only
(see Figure 4.2).

4.3 Variance for the Linear Estimator

From equation (3.62), the variance of the linear Min V | Min B
estimator, ŷ(x), is given by

    var ŷ(x) = x_1' A (E'E)⁻¹ A' x_1 σ² .                              (4.34)

Note that the order of the matrix E is dependent on the sample size N
and the grid size (r − 1), and not on the degree of the approximating
polynomial. Therefore, no matter how large the degree of the polynomial
used, if the grid size is 3 (i.e., r = 4), only a (4 × 4) matrix need
be inverted. Let

    A (E'E)⁻¹ A' = [ a₁₁  a₁₂
                     a₁₂  a₂₂ ] ,                                      (4.35)

where the diagonal elements a₁₁ and a₂₂ are easily calculated. Then the
integrated var ŷ(x) is given by

    V₂ = N (a₁₁ + a₂₂ / 3) .                                           (4.36)
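The reduction of the integrated variance to N(a₁₁ + a₂₂/3), with the normalization (N/2σ²)∫₋₁¹ var ŷ(x) dx used in Lemma 2 below, can be checked against direct integration of x₁'A(E'E)⁻¹A'x₁. The sketch below uses the grid and Min V₂ design of vector (4.30), with σ² = 1:

```python
import math

def avg(f, m=2000):
    """Simpson's rule for (1/2) * integral of f over [-1, 1]."""
    h = 2.0 / m
    s = f(-1.0) + f(1.0)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * s * h / 3.0

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def mat_inv(M):
    n = len(M)
    A = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [e / A[c][c] for e in A[c]]
        for r in range(n):
            if r != c:
                A[r] = [e - A[r][c] * f for e, f in zip(A[r], A[c])]
    return [row[n:] for row in A]

grid = [0.3, 0.6, 0.9]
design = [-0.834771, -0.340514, 0.340514, 0.834771]   # Min V2 design of (4.30)
N, d = len(design), 2

E = [[1.0] + [math.exp(g * x) for g in grid] for x in design]
wE = [[avg(lambda x, g=g: math.exp(g * x)),
       avg(lambda x, g=g: x * math.exp(g * x))] for g in grid]
cols = [[1.0, 0.0]] + [[w[0], 3.0 * w[1]] for w in wE]  # A columns; W1^{-1} = diag(1, 3)
A = [[c[a] for c in cols] for a in range(d)]

Et = [list(r) for r in zip(*E)]
M = mat_mul(mat_mul(A, mat_inv(mat_mul(Et, E))), [list(r) for r in zip(*A)])

V2 = N * (M[0][0] + M[1][1] / 3.0)                     # equation (4.36)
V2_direct = N * avg(lambda x: M[0][0] + 2.0 * M[0][1] * x + M[1][1] * x * x)
print(abs(V2 - V2_direct) < 1e-8)
```

For this grid and design, V₂ should come out near the Min V₂ value 1.888936 listed in Table 4.2.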
Lemma 2: Suppose another point is added to the grid of values of γ, so
that there are now r points in the new grid. Then

    V_d* ≥ V_d ,

where V_d* is the integrated var ŷ(x) for the case where r points are
used in the grid, and V_d is the integrated var ŷ(x) when there is one
less point in the grid. The estimator ŷ(x) is a polynomial of degree
d − 1.

Proof: Let

    A₂ = (A : a)                                                       (4.37)

and

    E₂ = (E : q̃) ,                                                     (4.38)

where the (d × 1) vector a is the new column in the A-matrix, due to
adding the extra grid point, and q̃ is the corresponding new column in
the E-matrix. Let

    (E₂'E₂)⁻¹ = [ D   f
                  f'  u ] ,                                             (4.39)

where

    D = (E'E)⁻¹ [ I − (E'q̃) f' ]                                        (4.40)
and

    f = −u (E'E)⁻¹ E' q̃ ,                                              (4.41)

with

    u⁻¹ = q̃' [ I − E (E'E)⁻¹ E' ] q̃ .                                   (4.42)

Let

    V_d*(x) = x_1' A₂ (E₂'E₂)⁻¹ A₂' x_1 σ²

for the case with r points in the grid, and

    V_d(x) = x_1' A (E'E)⁻¹ A' x_1 σ²                                   (4.43)

for the case with r − 1 points in the grid. Using equations (4.37),
(4.38) and (4.39),

    V_d*(x) = x_1' ( A D A' + A f a' + a f' A' + u a a' ) x_1 σ² .      (4.44)

Substituting for D and f, from equations (4.40) and (4.41), into
equation (4.44), and defining

    c(x) = x_1' A (E'E)⁻¹ E' q̃ ,                                        (4.46)

a scalar, and

    k(x) = x_1' a = a' x_1 ,                                            (4.47)

also a scalar, equation (4.44) reduces to

    V_d*(x) = V_d(x) + u σ² [ c²(x) − 2 k(x) c(x) + k²(x) ] ,

that is,

    V_d*(x) = V_d(x) + u σ² [ c(x) − k(x) ]² .                           (4.48)

Now, from equation (4.42), the (N × N) matrix I − E(E'E)⁻¹E' is a
symmetric idempotent matrix of rank N − r ≤ N. Therefore, this matrix
is positive semi-definite (Graybill, 1961), and thus u is non-negative.
This gives, from equation (4.48), that V_d*(x) ≥ V_d(x).
59
Therefore,
*
N
20'2
V =d
*
f 1-1 Vd(X)
dx
N
- 20'2
>-
J1-1 Vd(X)
dx = V ,
d
that is,
Therefore, in choosing the grid size Lemma 2 should be considered,
.!.~.,
whether the increase in ":protection" from the bias viewpoint is
worth the corres:ponding increase in the variance of the estimator.
(By "protection" it is meant that min B is obtained for several :possible
., values.)
A grid of size 3 seems sufficient for most purposes.
Table
(4.2) gives the "Min V " four-level symmetric designs for the ·three
2
point grids, given in the vectors (4.30) through (4.33), with N = 4, 6,
8, and 10.
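Lemma 2 can be illustrated numerically: enlarging the grid by one point cannot decrease the integrated variance. The sketch below uses an illustrative grid and six-point design (not thesis values; N must satisfy N ≥ r for the enlarged grid) and computes V = N(a₁₁ + a₂₂/3) for the linear estimator with and without the extra grid point:

```python
import math

def avg(f, m=2000):
    """Simpson's rule for (1/2) * integral of f over [-1, 1]."""
    h = 2.0 / m
    s = f(-1.0) + f(1.0)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * s * h / 3.0

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def mat_inv(M):
    n = len(M)
    A = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [e / A[c][c] for e in A[c]]
        for r in range(n):
            if r != c:
                A[r] = [e - A[r][c] * f for e, f in zip(A[r], A[c])]
    return [row[n:] for row in A]

def integrated_V(grid, design):
    """V = N (a11 + a22/3) for the linear (d = 2) estimator, sigma^2 = 1."""
    d, N = 2, len(design)
    E = [[1.0] + [math.exp(g * x) for g in grid] for x in design]
    wE = [[avg(lambda x, g=g: math.exp(g * x)),
           avg(lambda x, g=g: x * math.exp(g * x))] for g in grid]
    cols = [[1.0, 0.0]] + [[w[0], 3.0 * w[1]] for w in wE]  # W1^{-1} = diag(1, 3)
    A = [[c[a] for c in cols] for a in range(d)]
    Et = [list(r) for r in zip(*E)]
    M = mat_mul(mat_mul(A, mat_inv(mat_mul(Et, E))), [list(r) for r in zip(*A)])
    return N * (M[0][0] + M[1][1] / 3.0)

design = [-0.9, -0.5, -0.15, 0.15, 0.5, 0.9]          # illustrative N = 6 design
V_three = integrated_V([0.5, 1.5, 2.5], design)        # 3-point grid (r = 4)
V_four = integrated_V([0.5, 1.5, 2.5, 3.5], design)    # one extra grid point (r = 5)
print(V_four >= V_three)                               # Lemma 2
```

The price of the extra "protection" at γ = 3.5 is exactly the non-negative term uσ²[c(x) − k(x)]² of equation (4.48), integrated over [−1, 1].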
4.4 Comparison with the Linear Least Squares Estimator

Suppose that the true functional relationship in equation (3.1) is
incorrectly assumed to be linear, and is fitted by least squares:

    ŷ*(x) = α̂₀ + α̂₁ x ,                                               (4.49)

where α̂₀ and α̂₁ are the least squares estimators, and the (N × 2)
matrix X is given by

    X = [ 1   x₁
          1   x₂
          ⋮   ⋮
          1   x_N ] .                                                  (4.50)

Therefore,

    (X'X) = [ N     0
              0   Σxᵢ² ] ,                                              (4.51)

since only symmetric designs are considered (Σxᵢ = 0), and

    ŷ*(x) = x_1' (X'X)⁻¹ X' y ,                                         (4.52)

where the (1 × 2) vector x_1' is (1, x). It is easily shown (see
Appendix 9.4) that the squared bias, at the point x, due to fitting
equation (4.49) when the true functional relationship is as in equation
(3.1), is given by equation (4.55),
    B*²(x) = β² [ (1/N) Σ e^{γxᵢ} + x (Σ xᵢ e^{γxᵢ}) / (Σ xᵢ²) − e^{γx} ]² ,   (4.55)

where all summations are over i = 1, 2, ..., N. Therefore, with
g₁ = e^{γ} − e^{−γ} and g₂ = e^{γ} + e^{−γ},

    (N/σ²) · ½ ∫₋₁¹ B*²(x) dx
        = (Nβ²/σ²) [ S₀² + S₁²/3 − S₀ g₁/γ − S₁ (g₂/γ − g₁/γ²)
                     + (e^{2γ} − e^{−2γ}) / (4γ) ] ,                    (4.56)

where S₀ = (1/N) Σ e^{γxᵢ} and S₁ = (Σ xᵢ e^{γxᵢ}) / (Σ xᵢ²). Also,

    var ŷ*(x) = x_1' (X'X)⁻¹ x_1 σ² .

Therefore,

    V_L = (N / 2σ²) ∫₋₁¹ var ŷ*(x) dx = N ( 1/N + 1/(3 Σxᵢ²) ) ,

that is,

    V_L = 1 + N / (3 Σ xᵢ²) .                                           (4.57)
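For a symmetric design, equation (4.57) is immediate to evaluate. For the Min V₂ design of vector (4.30) (levels ±0.340514 and ±0.834771, N = 4) it reproduces the value 1.820218 quoted below:

```python
# Integrated variance of the least squares linear fit, equation (4.57):
# V_L = 1 + N / (3 * sum of x_i^2), for any symmetric design.
design = [-0.834771, -0.340514, 0.340514, 0.834771]   # Min V2 design of (4.30)
N = len(design)
sum_sq = sum(x * x for x in design)
V_L = 1.0 + N / (3.0 * sum_sq)
print(round(V_L, 6))   # 1.820218
```

This is slightly smaller than the Min V₂ value 1.888936 of Table 4.2, as expected: the Min V | Min B estimator trades a little variance for its large reduction in bias.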
For γ in the range [0, 6], equation (4.56) was evaluated for that
four-level design which gave Min V₂ for the estimator defined by vector
(4.30). The integrated variance V_L, given in equation (4.57), was also
evaluated for the same design, and gave a value of 1.820218. The
results are shown in Figure 4.3. It is clear that the integrated bias
(defined by equation (4.56)) of the least squares estimator is much
greater than that of the Min V | Min B estimator (as defined by
equation (4.29)) for any γ. Figure 4.3 shows the integrated mean square
error (MSE) for the Min V | Min B estimator, and the integrated MSE for
the least squares estimator, plotted for γ over the range [0, 3.5],
with β²/σ² = 1. The Min V | Min B linear estimator is obviously better
from both the bias and the MSE viewpoints.

In Chapter 5, a comparison is made of the linear Min V | Min B
estimator and the quadratic Min V | Min B estimator.
5. THE QUADRATIC ESTIMATOR (d = 3)

5.1 Introduction

This chapter is concerned with approximating the exponential response,
defined in equation (3.1), by a quadratic polynomial, where the fitted
equation is given by

    ŷ(x) = b₀ + b₁x + b₂x² = x_1' b .                                  (5.1)

In equation (5.1) the (3 × 1) vector b is the vector of estimators of
the coefficients, where the estimators are the Min V | Min B estimators
derived in Chapter 3. Recall the reasons, given in Chapter 4, for
adopting a grid of size 3. These reasons involved the results of
Lemma 2, and the fact that the actual bias curve is forced to attain
Min B at the middle and end points of the grid. This restricts the
magnitude of the deviations of the actual bias from minimum bias, while
not adding too greatly to the variance of the estimator.
In this chapter the minimum integrated squared bias, B₃(γ), for d = 3,
will be expressed in terms of the analogous quantity for the case
d = 2, B₂(γ). It will also be shown that the integrated var ŷ(x) for
d = 3 is greater than or equal to the integrated var ŷ(x) for d = 2,
where the same grid, of size (r − 1), is used in both estimators. The
integrated mean square error (MSE) is computed for the case when the
response is fitted by

    ŷ*(x) = α̂₀ + α̂₁ x + α̂₂ x² ,                                       (5.2)

where (α̂₀, α̂₁, α̂₂) are the least squares estimators. The integrated
MSE of the estimator ŷ*(x) is compared with that of the quadratic
Min V | Min B estimator given in equation (5.1).

The integrated variance of ŷ(x), V₃, where ŷ(x) is the Min V | Min B
estimator in equation (5.1), is derived in section 5.3. Note that V₃
depends on the particular grid used, and on the experimental design. As
in Chapter 4, a four-level, symmetric design is used, since this is the
minimum number of levels required by the non-singularity requirement of
the matrix (E'E). The four levels of this symmetric design are denoted
by ±t₁ and ±t₂ (t₁, t₂ > 0), with n₁ observations at each of ±t₁ and
n₂ observations at each of ±t₂. Note that N = 2(n₁ + n₂). For various
specified grids, the four-level design which minimizes V₃ is found, for
different values of N. These "Min V₃" designs are found by a computer
search procedure.
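The claim that the integrated variance grows with the degree, for a fixed grid and design, can be checked numerically. The sketch below computes V_d = N · ½∫₋₁¹ x₁'A(E'E)⁻¹A'x₁ dx (σ² = 1) for d = 2 and d = 3, using the grid (.3, .6, .9) and the Min V₂ design of vector (4.30); note that (E'E) is the same matrix for both degrees, as remarked at the end of Chapter 3:

```python
import math

def avg(f, m=2000):
    """Simpson's rule for (1/2) * integral of f over [-1, 1]."""
    h = 2.0 / m
    s = f(-1.0) + f(1.0)
    for k in range(1, m):
        s += (4 if k % 2 else 2) * f(-1.0 + k * h)
    return 0.5 * s * h / 3.0

def mat_mul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def mat_inv(M):
    n = len(M)
    A = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [e / A[c][c] for e in A[c]]
        for r in range(n):
            if r != c:
                A[r] = [e - A[r][c] * f for e, f in zip(A[r], A[c])]
    return [row[n:] for row in A]

def integrated_V(d, grid, design):
    """V_d = N * (1/2) * integral of x1' A (E'E)^{-1} A' x1 dx, sigma^2 = 1."""
    N = len(design)
    E = [[1.0] + [math.exp(g * x) for g in grid] for x in design]
    W1 = [[avg(lambda x, a=a, b=b: x ** (a + b)) for b in range(d)] for a in range(d)]
    W1inv = mat_inv(W1)
    wE = [[avg(lambda x, a=a, g=g: math.exp(g * x) * x ** a) for a in range(d)]
          for g in grid]
    cols = [[1.0] + [0.0] * (d - 1)]
    cols += [[sum(W1inv[a][b] * w[b] for b in range(d)) for a in range(d)] for w in wE]
    A = [[c[a] for c in cols] for a in range(d)]
    Et = [list(r) for r in zip(*E)]
    M = mat_mul(mat_mul(A, mat_inv(mat_mul(Et, E))), [list(r) for r in zip(*A)])
    poly = lambda x: sum(M[j][k] * x ** (j + k) for j in range(d) for k in range(d))
    return N * avg(poly)

grid = [0.3, 0.6, 0.9]
design = [-0.834771, -0.340514, 0.340514, 0.834771]   # Min V2 design of (4.30)
V2 = integrated_V(2, grid, design)
V3 = integrated_V(3, grid, design)
print(V3 >= V2)   # the quadratic estimator carries the larger integrated variance
```

The gap between V₂ and V₃ is the variance price paid for the quadratic estimator's broader bias protection; the Min V₃ values of Table 5.2 (near 2.8) versus the Min V₂ values of Table 4.2 (near 1.9) show the same ordering.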
5.2 Bias for the Quadratic Estimator

Adopting the notation defined in the previous chapter, an estimator may
be uniquely represented by the row vector

    (d, r; γ_c, δ; t₁, t₂)                                             (5.3)

when a four-level design is used, where d − 1 is the degree of the
approximating polynomial, r − 1 is the grid size, γ_c is the center
point of the grid, δ is the distance between grid points, and ±t₁, ±t₂
are the design points. In this chapter, the defining vectors considered
are the following:

    (3, 4; 0.6, 0.3; 0.30111, 0.86535) ,                               (5.4)
    (3, 4; 2.5, 1; 0.863306, 0.32789) ,                                (5.5)
    (3, 4; 1.5, 1; 0.85517, 0.300379) ,                                (5.6)
    (3, 4; 1.3, 0.5; 0.29191, 0.85111) .                               (5.7)

For each one of these vectors the deviations of actual bias, B_γ, from
minimum bias, B(γ), are calculated for 0 < γ ≤ 6; that is,

    B_γ − B(γ) ,   0 < γ ≤ 6 ,                                         (5.8)

is calculated, where B_γ and B(γ), for d = 3, are to be derived in this
section.

The estimator given by vector (5.5) was used for the quadratic
Min V | Min B estimator, in the comparison of this estimator with the
linear Min V | Min B estimator, and also with the quadratic least
squares estimator. The reasons for the choice of the defining vector
(5.5) were that, for the grid contained in this vector and for the
range of γ considered, the deviations (5.8) were very small, and, in
addition, the integrated var ŷ(x) for the estimator using this grid was
smaller than the variance obtained for the estimator using other grid
systems with comparable bias curves (see Tables 5.1 and 5.2). The
estimator specified by vector (5.4) may be directly compared with the
linear estimator, using the same grid, of Chapter 4. The grids given in
the vectors (5.6) and (5.7) are used for comparative and illustrative
purposes only. The designs given in vectors (5.4) through (5.7) are
those four-level, symmetric designs which minimize V₃ for the given
grids, with N = 4 (see Table 5.2).
Table 5.1 Bias and deviations for various grids

(For γ = 0.2, 0.4, ..., 4.2, 5.0 and 6.0, the table lists the bias B_γ
and the deviation B_γ − B(γ) under each of the four defining vectors
a–d below. For the quadratic estimator the biases remain small over a
wider range of γ than in Table 4.1; for example, at γ = 3.0 the biases
lie between 5.46440 and 5.47175, with deviations at most 0.0073.)

a (3,4; .6, .3; .30111, .86535)
b (3,4; 2.5, 1; .863306, .32789)
c (3,4; 1.5, 1; .85517, .300379)
d (3,4; 1.3, .5; .29191, .85111)

where the notation (d, r; γ_c, δ; t₁, t₂) is used.
e·
e
-
..
e
.
e
.
Table 5.2  Designs for minimizing V_3

Each design is symmetric, with n_1 observations at each of ±ℓ_1 and n_2 observations at each of ±ℓ_2, so that N = 2(n_1 + n_2). Designs are given for the grids (.3, .6, .9), (.5, 1.5, 2.5), (.8, 1.3, 1.8) and (1.5, 2.5, 3.5).

    N   n_1   ℓ_1        n_2   ℓ_2        Min V_3
    4   1     0.855170   1     0.300379   2.76790194
    6   1     0.892210   2     0.323503   2.93355096
    6   2     0.301110   1     0.865369   2.79877629
    6   1     0.920110   2     0.367461   2.78512078
    8   1     0.920110   3     0.370110   3.03876388
    8   1     0.920110   3     0.370110   3.21539070
    8   2     0.301110   2     0.865357   2.79877629
    8   2     0.855170   2     0.300379   2.76790194
    4   1     0.291910   1     0.851110   2.77734982
    4   1     0.863306   1     0.327890   2.64028376
    6   1     0.905572   2     0.352794   2.91381729
    6   1     0.870228   2     0.334404   2.82916433
    8   1     0.920110   3     0.368750   3.17118532
    8   1     0.875496   3     0.341407   3.23704859
    8   2     0.291910   2     0.851110   2.77734982
    8   2     0.863306   2     0.327890   2.64028376
To derive the expression relating B_2(γ) and B_3(γ), it will be necessary to define some terms. The scalar w_{3i} is defined as

    w_{3i} = [(γ_i² + 2) g_{1i} − 2 γ_i g_{2i}] / (2 γ_i³) ,    i = 1, 2, 3 ,

where

    g_{1i} = e^{γ_i} − e^{−γ_i} ,    i = 1, 2, 3 ,

    g_{2i} = e^{γ_i} + e^{−γ_i} ,    i = 1, 2, 3 ;

that is, w_{3i} is the third element of the vector w_{Ei} of Chapter 3, so that

    w_{Ei}' = ( h_i' : w_{3i} ) ,

where the (2 × 1) vector h_i is

    h_i = ( g_{1i}/(2γ_i) , (γ_i g_{2i} − g_{1i})/(2γ_i²) )' ,    i = 1, 2, 3 .
From the definition of W_1, in Chapter 3,

    W_1 = [ L   ℓ ; ℓ'  1/5 ] ,        (5.13)

where

    L = [ 1  0 ; 0  1/3 ] ,

that is, L corresponds to W_1 for the case d = 2, in Chapter 4, and the vector ℓ is

    ℓ = ( 1/3 , 0 )' .

Define

    W_1^{-1} = [ A   B ; B'  f ] ,

where A is a (2 × 2) matrix given by

    A = L^{-1} [ I − ℓ B' ] ,        (5.17)

the (1 × 2) vector B' is given by

    B' = −f ℓ' L^{-1} ,        (5.18)

and it can easily be shown that f = 45/4.
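The value f = 45/4 can be confirmed directly from the partitioned form of W_1: f is the (3,3) element of W_1^{-1}, the reciprocal of the Schur complement 1/5 − ℓ'L^{-1}ℓ. The following numerical sketch (not part of the thesis) checks this.

```python
import numpy as np

# W1 = (1/2) * Integral over [-1, 1] of x3 x3' dx, with x3' = (1, x, x^2)
W1 = np.array([[1.0, 0.0, 1/3],
               [0.0, 1/3, 0.0],
               [1/3, 0.0, 1/5]])
L = W1[:2, :2]                 # upper-left block; W1 for the case d = 2
l = W1[:2, 2]                  # the vector (1/3, 0)'

# f is the (3,3) element of W1^{-1}, i.e. the inverse Schur complement
schur = W1[2, 2] - l @ np.linalg.inv(L) @ l    # equals 1/5 - 1/9 = 4/45
f = 1.0 / schur                                # equals 45/4
assert np.isclose(f, 45/4)
assert np.isclose(np.linalg.inv(W1)[2, 2], 45/4)
```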
We are now in a position to derive B_3(γ) in terms of B_2(γ). From equation (3.24), with the matrix T' satisfying equations (3.35) and (3.36), if the true γ is the i-th grid value, then Min B is attained and is given by

    B_3(γ_i) = β² [ (1/2) ∫_{−1}^{1} e^{2γ_i x} dx − w_{Ei}' W_1^{-1} w_{Ei} ] ,    i = 1, 2, 3 .        (5.19)

Using the definitions of W_1^{-1} and w_{Ei}, the second term in equation (5.19) is given, for i = 1, 2, and 3, by

    w_{Ei}' W_1^{-1} w_{Ei} = h_i' A h_i + 2 k_i B' h_i + f k_i² ,        (5.21)

where the scalar k_i is given by

    k_i = w_{3i} ,    i = 1, 2, 3 .
Using the definitions (5.17) and (5.18), this expression becomes, for i = 1, 2, and 3,

    w_{Ei}' W_1^{-1} w_{Ei} = h_i' L^{-1} h_i + f ( h_i' L^{-1} ℓ − k_i )² .        (5.23)

Let the scalar c_i be defined as

    c_i = h_i' L^{-1} ℓ − w_{3i} ,    i = 1, 2, 3 .        (5.24)

Substituting c_i into equation (5.23) gives the following expression:

    w_{Ei}' W_1^{-1} w_{Ei} = h_i' L^{-1} h_i + f c_i² ,    i = 1, 2, 3 .        (5.25)

Therefore, from the definition of minimum bias, given by equation (5.19), B_3(γ) may be expressed as

    B_3(γ_i) = β² [ (1/2) ∫_{−1}^{1} e^{2γ_i x} dx − h_i' L^{-1} h_i − f c_i² ] ,    i = 1, 2, 3 ,        (5.26)

when γ takes on the value of the i-th grid point. However, recalling the definition of B_2(γ) given in Chapter 4, equation (5.26) becomes

    B_3(γ_i) = B_2(γ_i) − β² f c_i² ,    i = 1, 2, 3 .        (5.27)

Using the definition of c_i in equation (5.24), and substituting for w_{3i}, h_i and L^{-1}, it is possible to express B_3(γ_i) in terms of B_2(γ_i) and the
scalars g_{1i} and g_{2i}. Substituting for k_i, f and c_i, equation (5.27) becomes

    B_3(γ_i) = B_2(γ_i) − (5 β² / 4 γ_i⁶) [ 3 γ_i g_{2i} − (γ_i² + 3) g_{1i} ]² .        (5.28)

Equation (5.28) expresses Min B for the quadratic estimator, B_3(γ_i), in terms of Min B for the linear estimator, B_2(γ_i).
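The relation between the linear and quadratic minimum biases can be checked numerically. The sketch below (not part of the thesis) recomputes the half-range moment integrals of e^{γx} directly and verifies that Min B for d = 3 equals Min B for d = 2 less f c² with f = 45/4; the closed form used for c here is assembled from the definitions of h_i, ℓ and w_{3i} above and is an illustrative reconstruction of equation (5.28).

```python
import numpy as np

def min_bias(gamma, d):
    # Minimum integrated squared bias (beta = 1) of the best degree-(d-1)
    # polynomial approximation to e^{gamma x} over [-1, 1]:
    #     (1/2) Int e^{2 gamma x} dx  -  w' W1^{-1} w
    g1 = np.exp(gamma) - np.exp(-gamma)
    g2 = np.exp(gamma) + np.exp(-gamma)
    # w_k = (1/2) Int x^k e^{gamma x} dx, k = 0, 1, 2
    w = np.array([g1 / (2 * gamma),
                  (gamma * g2 - g1) / (2 * gamma ** 2),
                  ((gamma ** 2 + 2) * g1 - 2 * gamma * g2) / (2 * gamma ** 3)])[:d]
    W1 = np.array([[1, 0, 1/3], [0, 1/3, 0], [1/3, 0, 1/5]])[:d, :d]
    return np.sinh(2 * gamma) / (2 * gamma) - w @ np.linalg.inv(W1) @ w

for gamma in (0.5, 1.0, 2.5):
    g1 = np.exp(gamma) - np.exp(-gamma)
    g2 = np.exp(gamma) + np.exp(-gamma)
    c = (3 * gamma * g2 - (gamma ** 2 + 3) * g1) / (3 * gamma ** 3)
    # quadratic Min B = linear Min B - f c^2, with f = 45/4
    assert np.isclose(min_bias(gamma, 3), min_bias(gamma, 2) - (45 / 4) * c ** 2)
    assert min_bias(gamma, 3) <= min_bias(gamma, 2)
```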
Obviously, from equation (5.28),

    B_3(γ_i) ≤ B_2(γ_i) ,    for i = 1, 2, and 3 .        (5.29)

If, in equation (5.29), γ_i is replaced by γ (0 < γ ≤ 6), then this gives the function Min B for the quadratic approximating polynomial. It is this function which is compared with the actual bias B_γ(γ) for the various estimators. Obviously, from the construction of the estimator,

    B_γ(γ) = Min B(γ) ,

when γ is any one of the grid points. From the definition of B_γ, in Chapter 3, it can be shown that the actual bias for any γ is given by

    B_γ = B(γ) + β² ( T' e_γ − W_1^{-1} w_γ )' W_1 ( T' e_γ − W_1^{-1} w_γ ) ,        (5.30)
where B(γ) is the Min B for the case d = 3, given by equation (5.28). The vector e_γ is defined as

    e_γ = ( e^{γx_1} , e^{γx_2} , ... , e^{γx_N} )' ,        (5.31)

and the vector w_γ is defined as

    w_γ = (1/2) ∫_{−1}^{1} x_3 e^{γx} dx ,        (5.32)

where x_3' = (1, x, x²). Substituting B(γ), from equation (5.19), into equation (5.30), and rearranging terms, gives equation (5.33). For the grid systems given in vectors (5.4) through (5.7), equation (5.33) was evaluated, for 0 < γ ≤ 6, and compared with Min B (see Figure 5.1 and Table 5.1). In Table 5.1, for each one of these grids, N B_γ/σ² is given, and the deviation of B_γ from Min B, that is, N[B_γ(γ) − B(γ)]/σ²,
Figure 5.1  N B_γ/σ² for the estimator defined by (3, 4; 2.5, 1; .863306, .32789)
is also given, where β²/σ² = 1. To give a representative impression of the bias function, N B_γ/σ², for the various grids, β²/σ² was chosen to be equal to unity, since β²/σ² is a multiplicative factor in B_γ; different values of β²/σ² merely magnify or decrease the various effects shown in Table 5.1 and Figures 5.1 and 5.2.

In the following section, the designs which minimize the integrated var ŷ(x) are derived; they are given in Table 5.2.
5.3 Variance for the Quadratic Estimator

From the definition of ŷ(x), the quadratic Min V | Min B estimator,

    var ŷ(x) = x_3' A (E'E)^{-1} A' x_3 σ² .        (5.35)

The (4 × 4) matrix (E'E)^{-1} is the same as that used in the linear case (i.e., Chapter 4), and depends on the experimental design and grid used. The matrix A is (d × r), where d − 1 is the degree of the fitted equation and r − 1 is the grid size; thus in this case A is (3 × 4), as opposed to (2 × 4) in the linear situation. Let the (3 × 3) matrix

    A (E'E)^{-1} A' = [ a_11  a_12  a_13 ; .  a_22  a_23 ; symmetric  .  a_33 ] .        (5.36)
Figure 5.2  Integrated MSE for the quadratic(a) and linear(b) estimators

(a) Defined by the vector (3, 4; 2.5, 1; .863306, .32789), where the notation (d, r; γ_c, δ; ℓ_1, ℓ_2) is used.
(b) Defined by the vector (2, 4; .6, .3; .340514, .834771), where the notation (d, r; γ_c, δ; ℓ_1, ℓ_2) is used.
Then, using equations (5.35) and (5.36),

    V_3 = N [ a_11 + (a_22 + 2 a_13)/3 + a_33/5 ] .        (5.37)

For each of the grids given in the defining vectors (5.4) through (5.7), the minimum of V_3, in equation (5.37), among all symmetric, four level designs, was found, for various values of N. These designs, which minimize V_3 for the situations considered, are listed in Table 5.2. These designs are called Min V_3 designs.

The following Lemma gives the relationship between V_3 and the integrated var ŷ(x) for d = 2, V_2.
Lemma 3:

Let V_3 be the integrated variance for the quadratic estimator (i.e., V_3 given by equation (5.37)), and V_2 the integrated variance for the linear estimator (V_2 given in Chapter 4). Let both estimators have the same grid for a fixed design (i.e., fixed ±ℓ_1 and ±ℓ_2). Suppose the grid system used has r − 1 distinct, non-zero points. Then, if (E'E) is non-singular,

    V_3 > V_2 .
Proof:

Denote by A_3 the associated A-matrix for the quadratic estimator, and by A_2 the associated A-matrix for the linear estimator. Then, from the definition of the A-matrix, A_3 may be built up from A_2 and its own third row. Denote the third row of A_3 by the (1 × r) vector a',

    a' = ( 0 , e_{11} , ... , e_{1,r−1} ) ,

where e_{1i} is the third element of W_1^{-1} w_{Ei}, i = 1, 2, ..., r − 1. Define the (2 × 1) vector u as

    u = ( −1/3 , 0 )' ,

so that the (3 × 3) matrix, partitioned as shown,

    [ I_2  u ; 0'  1 ] ,

where 0' is the (1 × 2) null vector and I_2 is the (2 × 2) identity matrix, satisfies

    A_3 = [ I_2  u ; 0'  1 ] [ A_2 ; a' ] .

From the definition of the var ŷ(x) when d = 3, and the partition above,

    var ŷ(x) = z' [ I_2  u ; 0'  1 ] [ A_2 ; a' ] K ( A_2' : a ) [ I_2  0 ; u'  1 ] z σ² ,        (5.44)

where the (r × r) matrix K is defined as (for E'E non-singular)

    K = (E'E)^{-1}        (5.45)

and

    z' = ( x_2' : x² ) .        (5.46)
Expanding equation (5.44), and noting that z' [ I_2  u ; 0'  1 ] = ( x_2' , x² − 1/3 ), since x_2' u = (1  x)(−1/3, 0)' = −1/3,

    var ŷ(x) = x_2' A_2 K A_2' x_2 σ² + [ 2 x_2' A_2 K a (x² − 1/3) + a' K a (x² − 1/3)² ] σ² ,        (5.47)

that is,

    var ŷ(x) = V_2(x) + [ 2 x_2' A_2 K a (x² − 1/3) + a' K a (x² − 1/3)² ] σ² ,        (5.49)

where V_2(x) is the variance of the linear Min V | Min B estimator at the point x.
Integrating equation (5.49) over [−1, 1] and applying the normalization used in equation (5.37), the term in (x² − 1/3) integrates to zero, and

    V_3 = V_2 + (4N/45) a' K a .        (5.50)

Since (E'E) is assumed non-singular, K is a positive definite matrix (Graybill, 1961) and therefore

    a' K a > 0 .

Thus, V_3 > V_2.
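The positive-definiteness step can be illustrated numerically: for any E of full column rank, K = (E'E)^{-1} is positive definite, so a'Ka > 0 for every nonzero a. The design points and grid below are illustrative values only.

```python
import numpy as np

rng = np.random.default_rng(0)

# A hypothetical (N x r) E-matrix: columns 1, e^{g x} for grid values g
x = np.array([-0.86, -0.33, 0.33, 0.86])          # illustrative design points
grid = np.array([0.5, 1.5, 2.5])                  # illustrative 3-point grid
E = np.column_stack([np.ones_like(x)] + [np.exp(g * x) for g in grid])

K = np.linalg.inv(E.T @ E)                        # positive definite when E has rank r
assert np.all(np.linalg.eigvalsh(K) > 0)

a = rng.standard_normal(E.shape[1])               # any nonzero vector a
assert a @ K @ a > 0                              # hence V3 exceeds V2
```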
Table 5.2 gives the minimum values of V_3, for various N values, and for the grids specified in vectors (5.4) through (5.7). Table 4.2 lists the minimum values of V_2 for the grids of Chapter 4, and for the same values of N as in Table 5.2. The result of Lemma 3 is evident from a comparison of these two tables. In addition, Figure 5.2 shows a comparison of the integrated MSE for the linear and quadratic estimators. It is not until γ exceeds 1.5 that the smaller bias of the quadratic estimator balances its larger variance, and the integrated MSE for the quadratic estimator becomes smaller than that of the linear estimator. This suggests that the linear estimator should only be used when it is anticipated that γ ∈ [0, 1.5].
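The bias side of this trade-off is easy to reproduce. The sketch below is illustrative only: it uses an ordinary least squares polynomial fit on a dense grid as a stand-in for the exact moment calculations of this chapter, and shows that the quadratic approximation to β e^{γx} always has smaller integrated squared bias than the linear one.

```python
import numpy as np

def integrated_sq_bias(gamma, degree, beta=1.0):
    # Best L2 polynomial fit of beta * e^{gamma x} on a dense grid over [-1, 1];
    # a numerical stand-in for the exact integrals used in the thesis.
    x = np.linspace(-1.0, 1.0, 2001)
    y = beta * np.exp(gamma * x)
    coef = np.polynomial.polynomial.polyfit(x, y, degree)
    resid = y - np.polynomial.polynomial.polyval(x, coef)
    return (resid ** 2).mean()        # ~ (1/2) Integral of the squared bias

for gamma in (1.0, 2.0, 3.0):
    # quadratic (d = 3) is always less biased than linear (d = 2)
    assert integrated_sq_bias(gamma, 2) < integrated_sq_bias(gamma, 1)
```

The variance comparison runs the other way (Lemma 3), which is why the integrated MSE curves cross near γ = 1.5.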
5.4 Comparison with the Quadratic Least Squares Estimator

In this section, it is supposed that the response is incorrectly assumed to be quadratic in the x-variate, given by equation (5.51), when in fact the response is the exponential of equation (3.1). Further, suppose that the model (5.51) is fitted by least squares, so that the approximating function becomes

    ŷ*(x) = â_0 + â_1 x + â_2 x² ,        (5.52)

where the (3 × 1) vector of estimators â' = (â_0, â_1, â_2) is the usual least squares estimator,

    â = (X'X)^{-1} X' y ,        (5.53)
where, for a symmetric design, (X'X) is given by

    (X'X) = [ N  0  Σx_i² ; 0  Σx_i²  0 ; Σx_i²  0  Σx_i⁴ ] .        (5.54)

The summations are over i = 1, 2, ..., N. For the four-level symmetric design, with N = 4 observations, one at each of ±ℓ_1, ±ℓ_2, the summations in the elements of (X'X) become

    C_1 = Σx_i² = 2(ℓ_1² + ℓ_2²)        (5.55)

and

    C_2 = Σx_i⁴ = 2(ℓ_1⁴ + ℓ_2⁴) .        (5.56)
Therefore, (X'X)^{-1} may be written as

    (X'X)^{-1} = [ C_2/(N C_2 − C_1²)    0      −C_1/(N C_2 − C_1²) ;
                   0                     1/C_1   0                  ;
                   −C_1/(N C_2 − C_1²)   0       N/(N C_2 − C_1²)   ] .        (5.57)
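The closed form (5.57) is easy to verify numerically; the levels ℓ_1, ℓ_2 below are the illustrative values from vector (5.5).

```python
import numpy as np

l1, l2 = 0.863306, 0.327890                 # design levels from vector (5.5)
x = np.array([l1, -l1, l2, -l2])            # four-level symmetric design, N = 4
N = len(x)
X = np.column_stack([np.ones(N), x, x**2])

C1, C2 = np.sum(x**2), np.sum(x**4)         # C1 = 2(l1^2 + l2^2), C2 = 2(l1^4 + l2^4)
D = N * C2 - C1**2
closed_form = np.array([[ C2/D, 0.0,  -C1/D],
                        [ 0.0,  1/C1,  0.0 ],
                        [-C1/D, 0.0,   N/D ]])
assert np.allclose(np.linalg.inv(X.T @ X), closed_form)   # equation (5.57)
```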
Now, it has been assumed that the true functional relationship is the exponential, given in equation (3.1). Thus, the squared bias at the point x, due to fitting the model (5.51) by the least squares estimation procedure, is B²(x), given by equation (5.58); its elements involve the sums Σ e^{γx_i}, Σ x_i e^{γx_i} and Σ x_i² e^{γx_i} over the design points. (For the derivation of equation (5.58), see Appendix 9.1.) The integrated squared bias (see Appendix 9.2),

    B* = (N/2σ²) ∫_{−1}^{1} B²(x) dx ,

is given by equation (5.62),
where

    g_1 = e^{γ} − e^{−γ}

and

    g_2 = e^{γ} + e^{−γ} .

Equation (5.62) was evaluated for the designs which gave minimum V_3 for the estimator defined by vector (5.5) (see Table 5.2), with β²/σ² = 1, as before, and N = 4. The integrated MSE for this least squares estimator (equation (5.52)) is compared with that of the quadratic Min V | Min B estimator (see Figure 5.3). In order to obtain the integrated MSE for the least squares estimator, the integrated variance, V*, for this estimator is necessary; V* is given by

    V* = N [ C_2/(N C_2 − C_1²) + 1/(3 C_1) − 2 C_1/(3(N C_2 − C_1²)) + N/(5(N C_2 − C_1²)) ] .        (5.63)
Figure 5.3  Integrated MSE for the least squares estimator and the Min V | Min B estimator(a)

(a) Defined by the vector (3, 4; 2.5, 1; .863306, .32789), where the notation (d, r; γ_c, δ; ℓ_1, ℓ_2) is used.
For the Min V_3 designs mentioned in the previous paragraph, equation (5.63) gave

    V* = 2.741005 .

Figure 5.3 shows the integrated MSE for the quadratic Min V | Min B estimator, and for the quadratic least squares estimator (denoted by IMSE*). Then

    IMSE* = B* + 2.741005 ,

where B* is given by equation (5.62). The integrated MSE for the Min V | Min B estimator, using the grid in vector (5.5) and the corresponding Min V_3 design, is always less than that of the least squares estimator, the difference increasing as γ becomes larger. Therefore, in both cases, linear and quadratic, the Min V | Min B estimator is superior, from the integrated MSE point of view, to the corresponding least squares estimator for the particular design chosen. The linear Min V | Min B estimator should be used when 0 < γ ≤ 1.5, and the quadratic estimator otherwise, for γ ≤ 3. For larger values of γ, higher order polynomial approximating functions should be considered.
6. DERIVATION OF THE ESTIMATOR WHICH PROTECTS SIMULTANEOUSLY AGAINST HIGHER ORDER POLYNOMIALS AND EXPONENTIALS

6.1 Introduction

In this chapter, it is assumed that the true functional relationship is either a polynomial in x of degree d + k − 1, as in Karson, Hader and Manson (1967), or an exponential with a grid of possible values on the parameter, γ, in the exponent, as in Chapter 3. The polynomial model may be expressed as

    y = x_1' β_1 + x_2' β_2 + ε ,        (6.1)

where

    x_1' = (1, x, x², ..., x^{d−1}) ,

    x_2' = (x^d, x^{d+1}, ..., x^{d+k−1}) ,

and β_1, β_2 are the corresponding (d × 1) and (k × 1) vectors of coefficients. The exponential is given in equation (3.1), and the grid of values in the expression (3.3).
It is intended to derive an estimator which satisfies the two criteria given by equations (1.6) and (1.7) when the true functional relationship is either the polynomial, given by equation (6.1), or the exponential, given in equation (3.1). This estimator will then give simultaneous protection, in the sense of Min B, against either of the two models. In other words, this estimator will satisfy the conditions derived in Karson, Hader and Manson (1967), and those specified in Chapter 3. Using

    ŷ(x) = x_1' b

as the approximating polynomial, it is proposed to restrict attention to those b's which are linear combinations of the observations, and therefore, let

    b = T' y .        (6.4)

It remains now to determine T' such that the primary and secondary criteria, equations (1.6) and (1.7) respectively, are satisfied. Deriving T' will define, in terms of T, the estimator, which may be written as

    ŷ(x) = x_1' T' y .        (6.5)
6.2 The Bias Criterion

This is the primary criterion. From equations (3.35) and (3.36), the matrix T' must satisfy

    T' 1 = (1, 0, ..., 0)'        (6.6)

and

    T' e_{γi} = W_1^{-1} w_{Ei} ,    i = 1, 2, ..., r − 1 ,        (6.7)

for a grid of size (r − 1), where 1 is the unit vector, and e_{γi} and w_{Ei} are defined in section (3.3). Any matrix T' satisfying equations (6.6) and (6.7) will give the estimator b the minimum bias property, when the true functional relationship is given in equation (3.1) and the true γ is any one of the (r − 1) grid points. In addition to T' satisfying equations (6.6) and (6.7), suppose T' also satisfies the two conditions derived in Karson, Hader and Manson (1967), namely

    T' X_1 = I_d        (6.8)

and

    T' X_2 = W_1^{-1} W_2 ,        (6.9)

where the (N × d) matrix X_1 is defined as
    X_1 = [ 1  x_1  x_1²  ...  x_1^{d−1} ;
            1  x_2  x_2²  ...  x_2^{d−1} ;
            ...                          ;
            1  x_N  x_N²  ...  x_N^{d−1} ] ,        (6.10)

the (N × k) matrix X_2 as

    X_2 = [ x_1^d  x_1^{d+1}  ...  x_1^{d+k−1} ;
            x_2^d  x_2^{d+1}  ...  x_2^{d+k−1} ;
            ...                                ;
            x_N^d  x_N^{d+1}  ...  x_N^{d+k−1} ] ,        (6.11)

and the (d × k) matrix W_2 as

    W_2 = (1/2) ∫_{−1}^{1} x_1 x_2' dx .        (6.12)

The (d × d) matrix W_1 was defined in Chapter 3.
Therefore, in order that the estimator b may simultaneously protect against the true functional relationships defined by either equation (3.1) or equation (6.1), the matrix T' must simultaneously satisfy equations (6.6), (6.7), (6.8) and (6.9). However, condition (6.8) implies condition (6.6), since the matrix X_1 may be written as

    X_1 = ( 1 : X* ) ,        (6.13)

where the N × (d − 1) matrix X* is defined as
    X* = [ x_1  x_1²  ...  x_1^{d−1} ;
           x_2  x_2²  ...  x_2^{d−1} ;
           ...                       ;
           x_N  x_N²  ...  x_N^{d−1} ] .        (6.14)

Using equation (6.13), condition (6.8) may be written as

    T' ( 1 : X* ) = I_d ,        (6.15)
that is,

    T' 1 = (1, 0, ..., 0)'        (6.16)

and

    T' X* = [ 0  0  ...  0 ; 1  0  ...  0 ; 0  1  ...  0 ; ... ; 0  0  ...  1 ] .        (6.17)

Equation (6.16) is condition (6.6). Therefore, in summary, the conditions which must be satisfied, to obtain the simultaneous protection mentioned, are

    T' X_1 = I_d ,    T' X_2 = W_1^{-1} W_2 ,        (6.18)

and

    T' e_{γi} = W_1^{-1} w_{Ei} ,    i = 1, 2, ..., r − 1 .        (6.19)
These conditions may be written as

    T' X = A ,        (6.20)

where the N × (d + k + r − 1) matrix X is defined as

    X = ( X_1 : X_2 : e_{γ1} : ... : e_{γ,r−1} ) ,        (6.21)

and the d × (d + k + r − 1) matrix A is defined by

    A = ( I_d : W_1^{-1} W_2 : W_1^{-1} w_{E1} : ... : W_1^{-1} w_{E(r−1)} ) .        (6.22)
6.3 The Variance Criterion

Condition (6.20) does not uniquely define the (d × N) matrix T', which in turn, therefore, does not uniquely define the estimator b. It is intended to choose that estimator b, given by equation (6.4) with T' satisfying equation (6.20), which has the smallest integrated variance. From equation (6.5),

    var ŷ(x; T) = x_1' T' T x_1 σ² .        (6.23)

Let

    V = (N/2σ²) ∫_{−1}^{1} var ŷ(x; T) dx = (N/2) ∫_{−1}^{1} x_1' T' T x_1 dx .        (6.24)
When the (d × N) matrix T' is expressed in terms of its rows,

    T' = ( t_1 : t_2 : ... : t_d )' ,        (6.26)

then, whether d is odd or even, term-by-term integration of equation (6.24) gives

    V = N Σ [ 1/(i + j − 1) ] t_i' t_j ,

the sum extending over i, j = 1, ..., d with i + j even, the terms with i + j odd integrating to zero.
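The term-by-term integration above can be checked numerically: for an arbitrary T', the double-sum form agrees with direct quadrature of (N/2) ∫ x_1' T'T x_1 dx over [−1, 1]. The dimensions below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)
d, N = 4, 8
Tp = rng.standard_normal((d, N))             # an arbitrary T' (d x N), rows t_i'

# Closed form: V = N * sum over i + j even of t_i' t_j / (i + j - 1), 1-based;
# with 0-based indices the weight is 1 / (i + j + 1)
V_closed = N * sum(Tp[i] @ Tp[j] / (i + j + 1)
                   for i in range(d) for j in range(d) if (i + j) % 2 == 0)

# Direct quadrature of (N/2) Integral x_1' T'T x_1 dx over [-1, 1]
x = np.linspace(-1.0, 1.0, 20001)
xd = np.vstack([x ** p for p in range(d)])   # columns are x_1 = (1, x, ..., x^{d-1})'
M = Tp @ Tp.T                                # the (d x d) matrix T'T
integrand = np.einsum('in,ij,jn->n', xd, M, xd)
V_quad = N * integrand.mean()                # mean over [-1, 1] ~ (1/2) Integral
assert np.allclose(V_closed, V_quad, rtol=1e-3)
```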
It is intended to minimize V, given by equation (6.24), with respect to T', subject to conditions (6.20), which may be expressed as

    T' X = ( t_1' X ; t_2' X ; ... ; t_d' X ) = ( a_1' ; a_2' ; ... ; a_d' ) = A ,        (6.28)

where a_i' is the i-th row of A. Using the Lagrangian multiplier technique, the equation to be differentiated is

    L = V + Σ_{i=1}^{d} λ_i' ( X' t_i − a_i ) ,        (6.29)

where λ_i, i = 1, 2, ..., d, are the Lagrangian multipliers. The function L is differentiated with respect to t_i, i = 1, 2, ..., d. This leads to equation (6.30),

    T W_1 = X Λ ,        (6.30)

which holds whether d is odd or even, and where the (d + k + r − 1) × d matrix Λ is defined as

    Λ = ( λ_1 : λ_2 : ... : λ_d ) .

Next, equation (6.29) is differentiated with respect to the Lagrangian multipliers, λ_i. The resulting equations may be expressed as

    T' X = A .        (6.32)
The T' which minimizes equation (6.24) is that T' which simultaneously satisfies equations (6.30) and (6.32). From equation (6.30),

    X' T W_1 = (X'X) Λ ,

that is,

    Λ = (X'X)^{-1} X' T W_1 ,        (6.34)

assuming (X'X) is non-singular. But using equation (6.32), equation (6.34) leads to

    Λ = (X'X)^{-1} A' W_1 .        (6.35)

Substituting Λ into equation (6.30) gives

    T W_1 = X (X'X)^{-1} A' W_1 .        (6.36)

Postmultiplication of both sides of equation (6.36) by W_1^{-1}, and transposition, gives

    T' = A (X'X)^{-1} X' .        (6.37)

As in Chapter 3, the matrix of the second derivatives is 2W_1, which is positive definite, and so this is the T' which minimizes V given by equation (6.24), subject to conditions (6.20).
That is, the estimator b defined by equations (6.4) and (6.37) gives simultaneous protection against the true model being either the polynomial given in equation (6.1), or the exponential given by equation (3.1) with γ allowed to be any one of the (r − 1) grid points. Subject to this simultaneous protection, the integrated variance of ŷ(x; T) is minimized, for any fixed design.

Since (X'X) was assumed non-singular, it is necessary that the number of levels of the independent variable x be greater than, or equal to, d + k + r − 1. Therefore, the sample size, N, must also satisfy this inequality.
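The closed form T' = A(X'X)^{-1}X' can be checked numerically: it satisfies the constraint T'X = A, and any other feasible T' (obtained by adding a perturbation Z with ZX = 0) can only increase the W_1-weighted variance criterion. The matrices below are random stand-ins, not a real design.

```python
import numpy as np

rng = np.random.default_rng(2)
d, k, r, N = 2, 2, 4, 8                      # so d + k + r - 1 = 7 <= N
m = d + k + r - 1
X = rng.standard_normal((N, m))              # stand-in for (X1 : X2 : e_g's)
A = rng.standard_normal((d, m))              # stand-in for the constraint matrix

W1 = np.array([[1, 0], [0, 1/3]])            # (1/2) Integral x1 x1' dx for d = 2

Tp = A @ np.linalg.inv(X.T @ X) @ X.T        # equation (6.37)
assert np.allclose(Tp @ X, A)                # primary constraint T'X = A holds

V = lambda Tq: np.trace(Tq.T @ W1 @ Tq)      # integrated variance, up to N sigma^2
# Any feasible perturbation T' + Z with ZX = 0 increases V:
P = X @ np.linalg.inv(X.T @ X) @ X.T
Z = rng.standard_normal((d, N)) @ (np.eye(N) - P)
assert np.allclose(Z @ X, 0, atol=1e-8)      # (T' + Z)X = A still
assert V(Tp + Z) >= V(Tp)                    # so (6.37) is the minimizer
```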
Lemma 4:

Let V_1 be the integrated variance of ŷ(x; T) when the true model is either a polynomial of degree d (i.e., k = 1) or an exponential (the model in equation 3.1), and let V_2 be the integrated variance of ŷ(x; T) when the true model is either a (d + 1)th degree polynomial (i.e., k = 2) or an exponential. In either case the same grid of size (r − 1) is used, and the approximating polynomial is of degree (d − 1). Then

    V_2 ≥ V_1 .

For the proof of this lemma see Appendix (9.3).
This simultaneous protection, in the sense of Min B for the two alternative types of models, is an indication of the flexibility of the technique. Further work may be done with the estimator defined by equations (6.4) and (6.38). It is easy to see that, if the true response model were the exponential (equation 3.1) with the true γ at the i-th grid point, then the integrated squared bias of the estimator derived in this chapter is equal to that of the estimator of Chapter 3, namely Min B(γ_i), equation (3.39). Similarly, if the true model is a polynomial of degree d + k − 1 (equation 6.1), then the integrated squared bias of the estimator defined in this chapter is the same as that of the estimator derived in Karson, Hader and Manson (1967), namely Min B as defined in their paper. However, one pays for this increase in protection, as can be seen from Lemma 5.
Lemma 5:

Let V* be the integrated variance of the estimator ŷ(x) derived in Chapter 3, and let V denote the integrated variance of the estimator derived in this chapter. In both cases the approximating polynomial is of degree (d − 1), and the same grid, of size (r − 1), is used. In the situation described in this chapter, i.e., for the estimator protecting simultaneously against higher order polynomials and exponentials, the higher order polynomial is of degree d + k − 1. Then

    V ≥ V* .

Proof:

Rearranging the X matrix given in equation (6.21),

    X = ( E : X_m ) ,

where the N × (d + k − 1) matrix X_m is defined as
    X_m = [ x_1  x_1²  ...  x_1^{d+k−1} ;
            x_2  x_2²  ...  x_2^{d+k−1} ;
            ...                         ;
            x_N  x_N²  ...  x_N^{d+k−1} ] ,        (6.39)

and, for the corresponding arrangement of the A matrix,

    A = ( A_1 : A_m ) ,        (6.40)

where the d × (d + k − 1) matrix A_m is given by

    A_m = ( [ 0' ; I_{d−1} ] : W_1^{-1} W_2 ) ,        (6.41)

0' here denoting the 1 × (d − 1) null vector. The matrices A_1 and E are as defined in Chapter 3. Let the variance of the estimator defined in this chapter be V(x), and that of the estimator of Chapter 3 be V*(x). Then

    V(x) = x_1' A (X'X)^{-1} A' x_1 σ² .        (6.42)

Using the above definitions, this becomes
    V(x) = x_1' ( A_1 : A_m ) [ E'E    E'X_m ; X_m'E  X_m'X_m ]^{-1} ( A_1 : A_m )' x_1 σ² ,        (6.43)

where the partitioned inverse is written as

    [ M_11   M_12 ; M_12'  M_22 ] ,

with

    M_11 = (E'E)^{-1} ( I − E'X_m M_12' ) ,        (6.44)

    M_12 = −(E'E)^{-1} E'X_m M_22 ,        (6.45)

and

    M_22 = [ X_m'X_m − X_m'E (E'E)^{-1} E'X_m ]^{-1} .        (6.46)

Then, substituting the definitions of M_11, M_12 and M_22 into equation (6.43), V(x) becomes

    V(x) = σ² x_1' A_1 (E'E)^{-1} A_1' x_1
           + σ² [ x_1' A_m − x_1' A_1 (E'E)^{-1} E'X_m ] M_22 [ x_1' A_m − x_1' A_1 (E'E)^{-1} E'X_m ]' ,        (6.47)
that is,

    V(x) = V*(x) + σ² z' M_22^{-1} z ,        (6.50)

where

    z' = [ x_1' A_m − x_1' A_1 (E'E)^{-1} E'X_m ] M_22 ,

and where, from equation (6.46), the matrix M_22 is square and symmetric; since it has been assumed that (X'X)^{-1} exists, M_22^{-1} must exist. The matrix M_22^{-1} may be written as

    M_22^{-1} = X_m' [ I − E (E'E)^{-1} E' ] X_m .        (6.51)

Since, in the derivation of the estimator of Chapter 6, (X'X) was assumed non-singular, X_m must be of full rank (i.e., rank of X_m = d + k − 1 ≤ N). Therefore, since [ I − E (E'E)^{-1} E' ] is a symmetric idempotent matrix, M_22^{-1} is positive semidefinite (Graybill, 1961).
Therefore, the quadratic form in equation (6.50) satisfies

    z' M_22^{-1} z ≥ 0 .

Therefore,

    V(x) ≥ V*(x) ,    for every x ∈ [−1, 1] ,

and so,

    V ≥ V* .
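The key matrix fact in this proof, that I − E(E'E)^{-1}E' is symmetric idempotent and hence positive semidefinite, can be checked numerically; E and X_m below are random full-rank stand-ins.

```python
import numpy as np

rng = np.random.default_rng(3)
N, r, m = 10, 3, 4
E  = rng.standard_normal((N, r))             # stand-in for the (N x r) E-matrix
Xm = rng.standard_normal((N, m))             # stand-in for the N x (d+k-1) X_m

P = np.eye(N) - E @ np.linalg.inv(E.T @ E) @ E.T
assert np.allclose(P, P.T)                   # symmetric
assert np.allclose(P @ P, P)                 # idempotent

M22inv = Xm.T @ P @ Xm                       # as in equation (6.51)
assert np.all(np.linalg.eigvalsh(M22inv) >= -1e-10)   # positive semidefinite

z = rng.standard_normal(m)
assert z @ M22inv @ z >= -1e-10              # so V(x) >= V*(x) pointwise
```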
Similarly, it may be shown that V is greater than, or equal to, the integrated variance of the estimator derived in Karson, Hader and Manson (1967).
7. SUMMARY

The problem considered was that of fitting the exponential response,

    η = α + β e^{γx} ,    −1 ≤ x ≤ 1 ,

by the approximating function

    ŷ(x) = b_0 + b_1 x + ... + b_{d−1} x^{d−1} .

The vector of estimators b was selected to satisfy certain criteria.
The primary criterion of optimality was that of minimizing the integrated squared bias, and subject to this, the secondary criterion of minimizing the integrated var ŷ(x) was satisfied. It was found not possible to satisfy the primary criterion for all values of γ. Therefore, an alternative procedure was developed, which led to the estimator derived in Chapter 3, called the Min V | Min B estimator. This alternative procedure was to assume that some a priori knowledge existed as to the approximate range of γ. Over this range of γ, a grid of values was chosen, γ_1, γ_2, ..., γ_{r−1}, and minimum integrated squared bias was attained, simultaneously, at each one of these grid values. Subject to this Min B criterion, the minimum integrated var ŷ(x) was attained (for any γ).
In Chapters 4 and 5, the integrated squared bias was evaluated for various grids, when the true value of γ was not a grid point, for the linear (d = 2) and quadratic (d = 3) approximating polynomials. It should be noted that this procedure does not restrict the design moments in order to obtain Min B, as do other methods. In addition, there is no restriction on γ, except that its approximate range should be known.
Having derived the estimation procedure, an additional criterion was chosen: minimizing V over designs, for specified values of N, and specified grids on γ. For the linear estimator, the integrated var ŷ(x), V_2, was expressed as an explicit function of the design points and the grid. It was shown in Lemma 2, of Chapter 4, that the more grid points added, the larger V becomes. With Lemma 2 in mind, it was found that a grid of size 3 was adequate to cope with the problem of bias for any γ in the ranges considered. For the various grids of size 3 used, the designs which minimize V_2 were found, using a computer direct search routine. The designs considered were those which had all odd design moments equal to zero (i.e., symmetric designs). The (N × r) matrix E, defined in Chapter 3, where (r − 1) is the number of points in the grid and N is the number of observations, was restricted to be of rank r, so that the (r × r) matrix (E'E) was non-singular. This restriction on the matrix E requires that the number of design levels be greater than, or equal to, r.
The coefficients b̂_i, in the approximating function ŷ(x), are really linear combinations of the least squares estimators of the parameters in a linear model with (r − 1) controllable variables; this may be seen from the definition of the Min V | Min B estimator b. The minimization of V_2, over symmetric designs, was restricted to four level designs, since this is the minimum number of levels required for a 3-point grid. These "Min V_2" designs are given in Table (4.2).
In Chapter 4 the linear Min V | Min B estimator was compared with the linear least squares estimator for γ > 0, by comparing their respective integrated mean square errors (MSE). Using a Min V_2 design, the linear Min V | Min B estimator was superior, for all γ > 0, from this integrated MSE point of view.
In Chapter 5 the quadratic (d = 3) Min V | Min B estimator was developed. The integrated var ŷ(x), V_3, was derived as a function of the design and the grid. In Lemma 3, it was shown that V_3 > V_2. For various 3-point grids, the designs which minimize V_3 were found; the same class of designs was considered as in Chapter 4, as described above. These "Min V_3" designs are listed in Table 5.2. This quadratic Min V | Min B estimator was compared with the quadratic least squares estimator; again the comparison was made with respect to the basic measure of goodness of approximation used in this thesis, the integrated MSE. For all values of γ, using a Min V_3 design, the quadratic Min V | Min B estimator was superior from this integrated MSE viewpoint.
In addition, in Chapter 5, the quadratic Min V | Min B estimator was compared with the linear Min V | Min B estimator, for γ > 0, and for the grids chosen: (.3, .6, .9) for the linear and (1.5, 2.5, 3.5) for the quadratic. These two grids were each found to be best, from the point of view of bias and variance of ŷ(x), amongst those considered, for the linear and quadratic estimators. From this comparison, it was suggested that the linear Min V | Min B estimator should be used if γ ∈ [0, 1.5], and the quadratic Min V | Min B estimator if γ ∈ [1.5, 3]. It is not until γ exceeds 1.5 that the larger variance of the quadratic estimator is balanced by its smaller bias. For γ values greater than 3, polynomials of order higher than 2 should be considered.
It was shown, in Chapter 6, that it is possible, using a polynomial approximating function, to protect, in the Min V | Min B sense, simultaneously against the true model being either a higher order polynomial, as in Karson, Hader and Manson (1967), or an exponential model with the parameter γ taking on any value in some specified grid. This shows the flexibility of the technique of minimum bias estimation, since if the true model is the higher order polynomial, then this estimator, derived in Chapter 6, gives the Min B obtained in Karson, Hader and Manson (1967). On the other hand, if the true response is exponential, with γ equal to the i-th grid point, then this estimator gives Min B for each i = 1, 2, ..., (r − 1), as given in Chapter 3 of this thesis. However, one pays for this extra protection by an increase in the integrated var ŷ(x), V, as shown in Lemma 5. But, if there exists no theoretical knowledge as to what the true response model is, then this increase in V may be justifiable.
In summary, the development presented in this thesis has utilized a more flexible approach to an estimation problem involving the estimation of parameters in the true functional relationship, where, prior to this development, this estimation was computationally difficult and unsatisfactory. The experimenter may use a simple polynomial approximating model and protect, in the sense of Min B, against the true functional relationship being exponential, with the exponential parameter γ allowed to be any one of the values in a specified grid.
This thesis suggests some areas for future research. The following is a list of the more obvious of these:

1. The extension of this work to the case where the true model and the approximating polynomial involve two or more controllable variables.

2. Investigation of the properties of the estimator derived in Chapter 6; e.g., find designs which satisfy additional criteria, such as minimizing, over designs, the integrated var ŷ(x).

3. Change the measure of integration in equations (1.6) and (1.7), i.e., the bias and variance criteria. For example, instead of uniform measure distributed over R, it may be of more value to weigh more heavily those areas in R which are of more specific interest than other areas.

4. Change the form of the true model from exponential to other types of functions, for example, a function used in time series analysis, or perhaps a ratio of two polynomials. It would also be of interest to try different types of approximating functions and regions of interest.

5. The introduction of other forms of criteria, such as those given in the literature review, in particular those mentioned by Folks (1958).

6. Apply the procedure of simultaneous protection to other combinations of functional relationships.
8. LIST OF REFERENCES
Box, G. E. P., and N. R. Draper. 1959. A basis for the selection of
a response surface design. J. Am. Stat. Assoc. 54:622-654.
Box, G. E. P., and N. R. Draper. 1963. The choice of second order
rotatable designs. Biometrika 50:335-352.
Box, G. E. P., and J. S. Hunter. 1957. Multi-factor experimental
designs for exploring response surfaces. Ann. Math. Stat.
28:195-241.
Box, G. E. P., and K. B. Wilson. 1951. On the experimental attainment of optimum conditions. J. Roy. Stat. Soc. (B) 13:1-45.
Bronfenbrenner, M. 1944. Production functions: Cobb-Douglas, interfirm, intrafirm. Econometrica 12:35-44.
Brown, E. H. Phelps. 1957. The meaning of the fitted Cobb-Douglas function. Quart. Jour. Econ. 71:546-560.
Chernoff, H. 1953. Locally optimal designs for estimating parameters. Ann. Math. Stat. 24:586-602.
David, H. A., and B. E. Arens. 1959. Optimal spacing in regression analysis. Ann. Math. Stat. 30:1072-1081.
de la Garza, A. 1954. Spacing of information in polynomial regression. Ann. Math. Stat. 25:123-130.
Elfving, G. 1952. Optimum allocation in linear regression theory. Ann. Math. Stat. 23:255-262.
Folks, J. L. 1958. Comparisons of designs for exploration of response relationships. Unpublished Ph.D. thesis, Department of Statistics, Iowa State College, Ames, Iowa.
Graybill, F. A. 1961. An Introduction to Linear Statistical Models. Vol. 1. McGraw-Hill Book Co., Inc., New York, New York.
Heady, E. O., and J. L. Dillon. 1961. Agricultural Production Functions. Iowa State University Press, Ames, Iowa.
Hoel, P. G. 1958. Efficiency problems in polynomial regression.
Ann. Math. Stat. 29:1134-1145.
Hoel, P. G. 1961. Some properties of optimal spacing in polynomial
estimation. Ann. Inst. Stat. Math. (Tokyo, Japan) 13:1-8.
Hotelling, H. 1941. The experimental determination of the maximum of a function. Ann. Math. Stat. 12:20-45.
Inkson, R. H. E. 1964. The precision of estimates of the soil content of phosphate using the Mitscherlich response equation. Biometrics 20:873-882.
Karson, M. J., R. J. Hader and A. R. Manson. 1967. Bias and variance criteria for estimators and designs for fitting polynomial responses. Unpublished Ph.D. thesis, Institute of Statistics, Mimeograph Series No. 510, North Carolina State University, Raleigh, North Carolina.
Kiefer, J. 1961. Optimum experimental designs V, with applications to systematic and rotatable designs, pp. 381-405. In J. Neyman (ed.), Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1, University of California Press, Berkeley and Los Angeles, California.
Manson, A. R. 1966. Minimum bias designs for an exponential response. Oak Ridge, Tennessee, ORNL-3873.
Schmitz, A. 1967. Production function analysis as a guide to policy in low-income farm areas. Canadian Journal of Agricultural Economics 15:100-111.
Stevens, W. L. 1951. Asymptotic regression. Biometrics 7:247-267.
Williams, E. S. 1958. Optimum allocation for estimation of polynomial regression. Biometrics 14:573-574.
9. APPENDICES
9.1 Algebra Leading to Equation (5.59)
The (3 x 3) matrix (X'X)^{-1} is defined by equation (5.44), X' is defined as

    X' = [ 1      1     ...   1     ]
         [ x_1    x_2   ...   x_N   ]                           (9.2)
         [ x_1^2  x_2^2 ...   x_N^2 ]

and the (N x 2) matrix E is given by equation (9.3):

    E = [ 1   e^(γx_1) ]
        [ .      .     ]                                        (9.3)
        [ 1   e^(γx_N) ]

The (1 x 2) vector θ' is defined as

    θ' = (α, β) .                                               (9.4)

Now

    X'E = [ N              Σ_i e^(γx_i)       ]
          [ 0              Σ_i x_i e^(γx_i)   ]
          [ Σ_i x_i^2      Σ_i x_i^2 e^(γx_i) ]

Then,

    B(x) = x_1'(X'X)^{-1} X'E θ − (α + β e^(γx))
         = α (B11 + B31 x^2) + β (B12 + B22 x + B32 x^2) − (α + β e^(γx)) ,

where B12, B22 and B32 are defined by equations (5.48), (5.49) and (5.50), respectively. Using equations (9.1), (9.2) and (9.7), the elements B11 and B31 may be written out explicitly; from the definitions of B11 and B31,

    B11 = 1    and    B31 = 0 .

Therefore,

    B(x) = β (B12 + B22 x + B32 x^2 − e^(γx)) ,

which is equation (5.59).
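The key step above, that B11 = 1 and B31 = 0 so the α terms cancel, is easy to confirm numerically: for any design symmetric about zero, the first column of (X'X)^{-1} X'E is (1, 0, 0)'. A minimal sketch, assuming a hypothetical five-point symmetric design:

```python
import numpy as np

# Hypothetical symmetric design (the x_i sum to zero).
x = np.array([-1.0, -0.6, 0.0, 0.6, 1.0])
gamma = 0.8
X = np.column_stack([np.ones_like(x), x, x**2])            # N x 3 quadratic basis
E = np.column_stack([np.ones_like(x), np.exp(gamma * x)])  # N x 2, as in (9.3)

B = np.linalg.solve(X.T @ X, X.T @ E)   # (X'X)^{-1} X'E, a 3 x 2 matrix

# First column is (B11, B21, B31)' = (1, 0, 0)', so the alpha terms cancel.
print(np.allclose(B[:, 0], [1.0, 0.0, 0.0]))   # -> True

# Hence the bias depends only on beta, for any alpha.
alpha, beta = 2.0, 0.5
xg = 0.3
x_row = np.array([1.0, xg, xg**2])
bias = x_row @ B @ np.array([alpha, beta]) - (alpha + beta * np.exp(gamma * xg))
beta_only = beta * (x_row @ B[:, 1] - np.exp(gamma * xg))
print(np.isclose(bias, beta_only))   # -> True
```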
9.2 Algebra Leading to Equation (5.63)
The integrated squared bias B is defined as

    B = (N / 2σ^2) ∫_{-1}^{1} B^2(x) dx ,                       (9.11)

where B(x) is defined by equation (5.47), which, if substituted into (9.11), leads to equation (9.12); that is,

    B = (β^2 N / 2σ^2) ∫_{-1}^{1} (B12 + B22 x + B32 x^2 − e^(γx))^2 dx

      = (β^2 N / σ^2) [ B12^2 + (1/3) B22^2 + (1/5) B32^2 + (2/3) B12 B32
            − B12 (e^γ − e^(−γ)) / γ
            − B22 ( (e^γ + e^(−γ)) / γ − (e^γ − e^(−γ)) / γ^2 )
            − B32 ( (e^γ − e^(−γ)) / γ − 2(e^γ + e^(−γ)) / γ^2 + 2(e^γ − e^(−γ)) / γ^3 )
            + (e^(2γ) − e^(−2γ)) / 4γ ] .

This is equation (5.63).
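The expansion can be checked by quadrature: for any values of B12, B22, B32 and γ, the bracketed expression should equal one half of the integral of the squared bias. A sketch with purely illustrative values (in the thesis, B12, B22 and B32 would come from equations (5.48)-(5.50)):

```python
import numpy as np

def trapezoid(f, xs):
    """Simple trapezoidal rule, avoiding version-specific numpy names."""
    return float(np.sum((f[1:] + f[:-1]) * np.diff(xs)) / 2.0)

# Illustrative values only, not computed from (5.48)-(5.50).
B12, B22, B32, g = 0.9, 1.1, 0.6, 1.3

xs = np.linspace(-1.0, 1.0, 100001)
integrand = (B12 + B22 * xs + B32 * xs**2 - np.exp(g * xs)) ** 2
half_integral = 0.5 * trapezoid(integrand, xs)

# Moments of exp(g*x) over [-1, 1] appearing in the expansion.
I0 = (np.exp(g) - np.exp(-g)) / g
I1 = (np.exp(g) + np.exp(-g)) / g - (np.exp(g) - np.exp(-g)) / g**2
I2 = ((np.exp(g) - np.exp(-g)) / g - 2 * (np.exp(g) + np.exp(-g)) / g**2
      + 2 * (np.exp(g) - np.exp(-g)) / g**3)

closed_form = (B12**2 + B22**2 / 3 + B32**2 / 5 + 2 * B12 * B32 / 3
               - B12 * I0 - B22 * I1 - B32 * I2
               + (np.exp(2 * g) - np.exp(-2 * g)) / (4 * g))

print(np.isclose(half_integral, closed_form))   # -> True
```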
9.3 Proof of Lemma 4
Let

    …                                                           (9.13)

for the case k = 1, and

    …                                                           (9.14)

for the case k = 2. The [N x (d + r)] matrix Z_1 is given by

    Z_1 = … ,

where the matrix X_1 is defined by (6.10) and the vector x_2 is defined by (6.11) with k = 1. The vectors w_i (i = 1, 2, ..., r − 1) are defined in section (3.3). The [N x (d + r + 1)] matrix Z_2 is given by

    Z_2 = [ Z_1   x_{d+1} ] ,                                   (9.16)

where the (N x 1) vector x_{d+1} is given by

    x_{d+1} = (x_1^{d+1}, x_2^{d+1}, ..., x_N^{d+1})' .

The [d x (d + r)] matrix A_1 is given by

    A_1 = … ,

where W_1 and w_i, i = 1, 2, ..., r − 1, have already been defined in section (3.3). The matrix W_2 is defined by equation (6.12) with k = 1. The matrix A_2 is given by

    A_2 = … ,

where a_1 is defined as

    a_1' = ( 1/(d+2), 0, 1/(d+4), 0, ..., 0, 1/(2d+1) )   if d is odd,

    a_1' = ( 0, 1/(d+3), 0, 1/(d+5), ..., 0, 1/(2d+1) )   if d is even.
115
Using the partition method for finding an inverse, given in section (4.3), the inverse of (Z_2'Z_2) may be expressed as

    (Z_2'Z_2)^{-1} = … ,                                        (9.20)

with

    …                                                           (9.21)

and

    … .                                                         (9.22)

Therefore, using (9.14) and (9.20),

    v_2(x) = … .                                                (9.23)

Using the definitions of α and β' given by (9.21) and (9.22), respectively, (9.23) may be rewritten. Let

    a_1' x_1 = d(x) .                                           (9.24)

Then, (9.23) becomes

    … .                                                         (9.25)

Using equation (9.13), equation (9.25) may be expressed in terms of v_1(x) and S. Using the definition of S,

    S^{-1} = [ x_k'( I − Z_1 (Z_1'Z_1)^{-1} Z_1' ) x_k ]^{-1} .

Since I − Z_1 (Z_1'Z_1)^{-1} Z_1' is a symmetric idempotent matrix,

    S^{-1} ≥ 0    for every −1 ≤ x ≤ 1.
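The final step, that the quadratic form is nonnegative because I − Z_1(Z_1'Z_1)^{-1}Z_1' is symmetric and idempotent, can be illustrated directly: for any full-rank Z_1, the residual projection matrix M satisfies M = M', M^2 = M, and hence z'Mz = (Mz)'(Mz) ≥ 0. A sketch with a hypothetical Z_1:

```python
import numpy as np

rng = np.random.default_rng(0)
Z1 = rng.standard_normal((10, 4))     # hypothetical N x (d + r) matrix

# M = I - Z1 (Z1'Z1)^{-1} Z1' is the residual projection matrix.
M = np.eye(10) - Z1 @ np.linalg.solve(Z1.T @ Z1, Z1.T)

print(np.allclose(M, M.T))       # symmetric  -> True
print(np.allclose(M @ M, M))     # idempotent -> True

# Hence z'Mz = (Mz)'(Mz) >= 0 for every vector z.
z = rng.standard_normal(10)
print(z @ M @ z >= -1e-12)       # -> True
```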
9.4 Some Algebra Leading to Equation (4.49)

From equation (4.47), the bias at the point x, B(x), when the true functional relationship is given by equation (4.2), may be expressed as

    B(x) = x_1'(X'X)^{-1} X' E(y) − (α + β e^(γx)) ,            (9.27)

where E(y) is the (N x 1) vector of expected responses, with i-th element α + β e^(γx_i).
Using (4.46) and the definition of E(y), (9.27) becomes

    B(x) = (1, x) [ 1/N        0           ] [ Σ_i (α + β e^(γx_i))     ]
                  [ 0          1/Σ_i x_i^2 ] [ Σ_i x_i (α + β e^(γx_i)) ]  − (α + β e^(γx))

         = (1, x) [ α + (β/N) Σ_i e^(γx_i)           ]
                  [ (β / Σ_i x_i^2) Σ_i x_i e^(γx_i) ]  − (α + β e^(γx)) ,

since Σ_i x_i = 0 for the symmetric design.
That is,

    B(x) = β [ (Σ_i e^(γx_i)) / N + x (Σ_i x_i e^(γx_i)) / (Σ_i x_i^2) − e^(γx) ] .    (9.29)

When B(x), given by (9.29), is squared, equation (4.49) results.
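Equation (9.29) can be checked against a direct least-squares computation: fit a line to the exact (error-free) exponential responses at the design points and compare fitted-minus-true with the formula. A sketch, assuming a hypothetical symmetric design and arbitrary illustrative parameter values:

```python
import numpy as np

# Symmetric design (sum of the x_i is zero); true response alpha + beta*exp(gamma*x).
xd = np.array([-1.0, -0.4, 0.0, 0.4, 1.0])
alpha, beta, gamma = 1.5, 0.7, 0.9
y = alpha + beta * np.exp(gamma * xd)        # exact responses at the design points

# Least-squares line fitted to the design points.
X = np.column_stack([np.ones_like(xd), xd])
b0, b1 = np.linalg.lstsq(X, y, rcond=None)[0]

x = 0.25
lsq_bias = (b0 + b1 * x) - (alpha + beta * np.exp(gamma * x))

# Equation (9.29).
e = np.exp(gamma * xd)
formula = beta * (e.mean() + x * (xd * e).sum() / (xd**2).sum()
                  - np.exp(gamma * x))

print(np.isclose(lsq_bias, formula))   # -> True
```

As the algebra shows, α drops out entirely: the bias of the fitted line depends only on β and the design moments of e^(γx).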