, n,l
~,I,·.
l,
ESTIMATION OF THE INTERCEPT OF A LnmAR REGRESSION
Fm~CTION
by
1!Vl1. JACKSON HALL
Institute of Statistics, University of North Carolina
Special report to the Office of Naval Research
of work at Chapel Hill under Project NR 042 031
for research in probability and statistics.
Institute of Statistics
Mimeograph Series No. 79
August 1953
I'~
,I,
(".,.It.I'/,
By performing experiments at the levels xl ,x 2' ••• ,
we obtain observations Y • a + ~xi + 6 (i=1,2, ••• ), the 6's being indei
i
2
pendent N(O,cr ) random variables, and a and 8 the unknown linear regression
1. Introduction.
coefficients.
The x's are at the disposal of the experimenter, subject to
the restriction of being within a given range, say x'
~
x
~ Xli.
It is
desired to design an experiment--i.e., to specify the levels at which the
experi~ent
is to be performed--to estimate that x, say e, for Which E(y)
equals some specified value; we choose this value to be zero without loss
of generality.
Then, we are to estimate 9 =
-a/~.
We shall assume that
x' < e < x"; moreover, we transform the x-axis so that x' = -1, x" ... +1.
(See the diagrams, pg. 13 .)
propose t N = -aN/bN as an estimate of e, aN and b being the
N
least squares estimates of a and ~, respectively, based on N observations,
v~e
assu~ing
the
XIS
are determined before experimentation begins.
In this
is the maximum likelihood estjJnate of e since aN and b are maximum
N
N
likelihood estimates. If the x's are chosen sequentially so that they are
case, t
random variables, we use the estimate tN' computing aN and bN from the same
formulas. From regression theory, we have
-2-
(Sometimes we shall omit the subscripts on
~,
b , and t without fear of
N
N
confusion. )
We shall give designs of experiments for estimating
Q
based on cer-
tain lfoptii'11um" criteria in both the non-sequential and sequential cases.
Properties of both the non-sequential and sequential estimates t
are disN
cussed, including, in the non-sequential case, an approxlination to the distribution of t N and a confidence interval for S. Finally, some examples have
been constructed to illustrate the character of the designs.
2. Nor.-:::.e_q'len"j,ial estimation.
l,ve assume the
experimenter in advance of the experimentation.
XIS
to be fixed by the
From regression theory, we
know that (a,b) has a bivariate normal distribution with means a and
.
varJ.ances
0'a 2 em d
2
0':10
a
l.X
2
(lb'
2
2
:-20',
NZ(x-x)
and covari ance
=
0'ab'
I
z(x-x)
~,
h
Here
2
20',
0'
2
-1:x
_)~
ab = NZ ( x-x
0'
2 2 2
For later reference, if we replace 0' above by s , on estim.:>.te of (J on N-2
2
2
d.f., we obtain the estimates sa ' sb ' and sab' respectively, which are
independent of a and b.
It may be shown that the ratio of two normally distributed variables
has no finite moments (except in special easeL): hence, t =
finite moments.
-alb
has no
This would not be so if b could not take on values in a
small interval about zero; hence, if the coeffic:tent of variation of b, Vb'
is sufficiently small, such an event would not occur in practice.
Therefore,
-)-
we may suppose that if we give expansions in
p~Ners
of vb for the mean and
variance of t, that the first few terms in the expansions will give, in
practice, reasonable measures of location and
s~ale,
respectively.
symbols for mean and variance of t used beloiT are to be
unde~stood
The
in this
Ugh:',
U8i~1g
the series expansion for the expectat,ion of the rat.io of ti-10
normally dist:cibuted va~iables given by Rao
E t.
where vb
=
£1,
e _ (x-g) ~o (2'!..U V 2v
v
b
, v=l 2 v 1
= 0'1/8 = a/r3l(X-X) 2 • By
a development similar to t~at in Rao, us
obtain
Var t
where c v
0)
(4)
:~
t
pp. lSJ-!L7, we have
= e _ _ i ..e
Z(x_x)2
0'2 +
~
o( v,
I)
t~)
-4(Further justification may be given for these expansions:
If we
(1)
assume b to have a truncated (at b=O) nor.mal distribution, the variance of
t is finite and the first two terms in the expansion (4) may be shown to
be correct up to the proper order
pand t as a function of
~
..
(~1,
1:2,
pp.
353-4, 358_7.
(2)
If we ex-
... ,6N) in a r1aclaurin series and take
expectation and variance t,erm-wise, we obtain the same first two terms as
given in (3) and (4), even if the 6 1 s are not assumed normally distributed.)
lve can reduce the bias of t and the variance of t, as given by (3)
and (4), simultaneously by choosing the x's so that x-e is small and Z(x_~)2
is large. We shall choose the
XIS
so that Z(x_x)2 will be maximized subject
to i being fixed close to e; this is accomplished by performing all experiments at one of the two extreme levels"
n at xI("-l) and N-n at xlI( ..+l)
where n is chosen so as to make the following
approx~nation
as close as pos-
sible: ~ .. (N-2n)/N ~ e; i.e., n is the closest integer to ~J(l-e). These
will be termed the optimum criteria in the non-sequential case.
Thus a
good design will require some! priori knowledge of Q; if none is available,
it would appear reasonable to retain the two-level design with
(As further
(1)
justifi~ation
n~N/2.
for the optimum criteria, we note that:
Maximizing the term corresponding to e in the inverse of the informa-
tion matrix for
(~,e)
leads to the same criteria.
(2)
According to
Fieller (see Section 5), we may obtain a confidence interval for
e with
length
-,where to is a constant.
If we replace (a,b,s2) by (a,~,a2), minimization
of the length of the confidence interval is essentially accomplished by
the above criteria.)
If n observations are taken at x • -1 and N-n at x ... +1, it may be
shown that
aU
III
1(-) ,
y+z
~
b
N
1(-) ,
... -~
y-z
where here y denotes the average of the observations at x..-l and z the
average of those at x-+l.
3. Pro2erties of non-sequential
estimate~.
For designs in which
x-Q
is very small and/or z(x_i)2 is large (in particular, for optimum nonsequential desic:ns), Var t :, a2/N~2. This may be estimated by s2/Nb2. lrle
find that a 2jNb2 has no finite moments, but by arguments similar to those
above, we obtain
.,. s
2
;!.rNb~·· ==
Hence, s2/Nb2 is a consistent estimate of a2~T~2 if z(x-i)2 tends to infinity
with N; moreover, it is a conservative estimate in that the bias is positive.
tN' being the maximum likelihood estimate of
Q,
has all the well-
known properties of such estimates, in particular, consistency and asymptotic
normality, assuming Z(x_x)2 tends to infinity with N. The asymptotic variance
is a2/N~2.
-6-
4.
The distribution of the.rt~n-saquentialestimate.
Geary
["3_7
gives
an approximation to the distribution of the ratio of two normal variables,
the error being small if the coefficient of variation of the denominator
variable is small.
t
= -alb,
Applying his work, with some refinement, to the variable
we have, denoting the c.d.f. of til by F ,
N
where ~ is the standard normal cod.f...
=
z( _)2
I'\1
8
(t-Q)- -:2
x-x
----2-
t - 2'Xt + Zx
IN
,
and
00
,
R(t)
•
(fIR \
)
e -x
2
/2
dx-
f)/ab
Now' R( t) is a monotone increasi. ng function of t and varies from - ~ ( -l/v )
b
to +~(-l/Vb)' Hence, for small vb'
-1-
In particular, if
v~
0.430,
:R(t)i~
0.01.
For two-level designs,
u(t)
=
(t-Qt=B
f
of
4!i(N-n)
- --Nt 2_ 2(N-2n)t
+ N
and the nomal approximation is valid within 1 %
if vb:: 0.430 or
n(N-n)/N ~ 1.353 o2/~2.
5.
Confidence interval for
e
(non-sequential cas!l.
Fieller ~4_7
develops exact confidence intervals for the ratio of two normal variables
based on the Student-Fisher t-distribution. He reasons as follows:
Since a + Qb is normal and sa 2+ 2Qsab+ Q2 Su 2 is an independent estimate of
its variance on N-2 d.f., it follows that
has a t-distribution with
so that
Pr(lzj~
to)
= 0,
n~2
d.f.
Therefore, for given 0, if to is chosen
we have
Fieller shows that if b is sufficiently different from zero (specifically,
-8if b
222
ISb > to , which is likely if vb is small), then we have a confidence
interval for 9:
5
=
6. Sequential estimation. We now suppose the experiments to be
perfonned sequentially, the level of each
the basis of the previous observations.
~xperiment
being determined on
Specifically, we suppose observa-
tions to be taken in groups of k (a positive integer), the levels in the
(~l)st group being determined on the basis of the observations in the first
m groups; thus
say.
Hence, the y's are no longer independent, and the previous reBults do
not necessarily hold.
We propose the same estimate, t N = -aN/bN' where aN
and b are defined by (1) and (2) though now they may not be normally
N
distributed.
as a function of ~ = (6 , ••• ,6 ), from equations (1), (2),
N
1
N
and (5), in a I5 aclaurin series, we obtain:
Expanding t
t
N
=
+ •••
-9"
Assuming that we may take expectation and variance term-wise (though they
may not be existent as in the non-sequential case), we have
(6)
+ •••
ci l:( atN lI) 2 +
(7)
i
I
06i
•••
6=0
where
~i
:: xi]
•
6=0
Consider a sequential plan such that
x :: t f
for some
f <N; we call
all such plans in which observations are taken in groups of k "designs of
type D • If Then
k
;r:: x] = t f J
6=0
6=0
=
Q,
and Var t
N
=
0'2/Np 2 + ....
By
evaluating the second term in equation (6), we find it to be zero for
designs of type Dk • Hence, for such designs, the bias and the variance,
as given by (6) and (7), are simultaneously reduced. By evaluation of
some of the higher order terms, it may be shown that l:(~_~)2 always appears
with a negative exponent.
Hence, as in the non-sequential case, a design
in which all but the last group of observations are taken at x :: :1, and
the last group is so allocated that
sense.
x = t N_k,
is optimum in the above
We call such a design a "truncated two-level design".
Thus, for
-10-
k
= 1,
the first N-l observations are to be taken at -1 and +1, keeping the
average close to the previous estimate of
e so
that the last observation
may be taken at some x between -1 and +1 in such a way that
x•
tN_I.
Explicitly, the sequential design is :
\I
m-2n) ,1<m<N-2 (after the mth stage, we
xm+l • sgn(tm- 'iii+I
suppose n observations have been taken at -1 and m-n at +1)
/
(8)
)
l
+1)
N-l
xN = NtN_l - i:lXi
where sgn(u) = +1 if u
~
,
0, -1 if u <
o.
(If we wish to add further single observations with the possibility
of terminating the experiment at any step but yet retaining a design of
type Dl' we can take
(n+l)tn - ntn-1
1 n+l
Then, at any stage, we have n+l i:lXi = tn.
(n~N)
•
Noreover, such observations
will permit a check on the linearity of the regression line, if desired.)
7. Properties of sequential estimates. In the two-level sequential,
FN(t) -- the distribution of non-sequential t in Section
4 -- is the condi-
tional distribution of t given that n observations were taken at -1 and
N-n at +1.
-11-
In the sequential oase, we have no assuranoe that t
likelihood estimate; however, we do prove consistency:
is a maximum
N
Consider an arbi-
trary sequential design in which Z(x-x)2 tends in probability to infinity
with N.
(For two-level designs, this assumption is that nand N...n tend in
probability to infinity with N.)
Now xi is independent of 6 j for i':; j, so
that E(XiAi)(XjA j ) ... E(XiAiXj)EA j ...
EZA = 0,
EZxA .. 0,
° (i<j); thus
E(i:A)2 .. EZA 2 = Nd 2 ,
22222
E(ZxA) .. EZx A ~ EZA = No ;
moreover, I i:xj <N and ZX2< N.
By application of Tchebycheff l s Inequality on
jZA and izxA and ,slutsky's Theorem
["2, pg. 255_7, the consistency of aN and
b as estimates of a and ~ follow's from equations (1) and (2) and the above
N
remarks, irrespective of the distribution of the A's. Further application
of Slutsky's Theorem proves the consistency of t
N
as an estimate of
e under
the stated conditions.
8. Examples.
~~o
samples have been constructed by assigning the
values a • 1, ~ .. 4 (Q = ...1/4),
fl.
1, x' .....1, x" ... +1, and taking values
of the 6's from Nahalanobis' "Tables of a random sample from a normal
population II
£5_7. (Sample 1 consists of the first 20 values and Sample 2
the second 20 in Plate L) Four designs l'Tere used with each sample, two
non-sequential and two sequential, and estimates were computed on the first
10 observations
~s
well as on the total of 20.
The designs used were:
-12Design 1:
Design 2:
two-level non-sequential design with n .. N/2, xi- (_l)i.
Design):
xl = -1, x 2- +1, xn.. ntn- 1- (n-l)tn- 2 (n > 2)
(see the last paragraph in Section 6).
Design 4:
"optimum" sequential design, given by (8).
(The
designs for the samples of 10 were not truncated.)
Before presentation, the data were transformed by the transformation
x ~~= 1 + 4x so that a*.. 0,
an estimate of
e*= o.
~
*.. 1, (]2= 1, x~~ I
..
-),
(See the diagrams below.)
x*"'= +5; hence t
N
The estimates of
e*
from each design and variance estimates from Designs 1 and
4 are
* is
given in
Table I below; the levels of the designs are given in Table II; confidence
intervals for
9.
e* are given in Table III.
Acknowledge~.
Acknowledgement is hereby made to Professor
H. Hotelling who proposed this problem and gave many helpful suggestions,
and to Professor S. N. Roy for suggestions after reading the manuscript.
-13DIAGRAMS
y
y
I
/
,
i
,~
lEy = a + Bx
./ * .
Ey • a. +' ~ x
.
~~
.~.'
I
1
I
I
i
I
I
/
1/
II
,
II
x =-l /!
x"=+l - x
-----.-. g.; 14--._t1
./
I'
~*'
l
/
/
I
/
I
...,x. ,.,."=+5
. -..... x~*'
!
.,
.:f-
a=l, .B=4
a =0,
~
~fo
=1
i
I
/
!
TABLE I
t
~}
N
"i~
(Estimate of g'. 0) and Variance Estimates
(after the transformation x*= 1 + 4x)
.
I
Design
1
2
I
3
4
I
!N=10
I samp1a~.1!
IN=20
+.158 -.070 +.001 -.004
N=10
-.901 ... 640 -.619 -.578
+.199 -.011 +.016 +.008
I
Sample
I
jN=20 i -.349
21
,I
I
I,
s/1Nb
s/rNb
Design 1 i Design
.316
.214
.224
.224
.316
.441
.
I
-.218 -.239 -.228
.224
.277
I
I
f
!
I
I
41
.355
I
I
.201
I
I
.404
~
I
I
(J/.m~
i
.261
j
I
-14-
TABLE II
Levels of the Experiments (x.*, i=I, ••• ,20)
1*
(after the transformation x = 1 + 4x)
Design 3
Design 4
Design 1 Design 2
Sample 1 Sample 2 Sample 1 Sample 2
i
1
2
3
I 4
I
i 5
6
7
8
9
10
_{f-
-3
-3
-1
-1
+1
+1
+)
+3
+5
+5
I
I
I
I
I
-3
+5
-3
+5
-3
+5
-3
+5
-3
+5
x
+1.000
+1.000
11
12
13
14
15
16
11
18
19
20
-3
-3
-1
-1
+1
+1
+.3
+3
+5
-3
+5
+'
-3
+5
-3
+5
-3
+,
, 'f*
+1.000
+1.000
(
-3
+5
-3.000
+5.000
-0.803
+0.598
+1.827
i
I
I +0.793
-0.347
I -1.840
+0.922
-1.828
I
-3.000
+5.000
-4.792
-0.745
-1.749
-0.446
-2.4 (5
-1.).).22
+2.062
+1.855
r
I
I
-3
+5
-3
+5
-3
+5
-3
-3
-3
+5
-3
-3
-3
+5
-3
I
+5
-3
-3
+,
I
I
-3
+0.132
\
-0.571
+0 ..200
-1.142
-0.190
+1.385
-0.516
+0.855
,
I -0.134
+0.119
, +0.676
II -0.500
-0.984
-1.093
+1.901
-3
-3
+5
+0.977 I
-3
+,
-1.284
-1.185
-3
+0,301
+5
+2.534
-3
+0.010
-3
-0.887
+1.745
I
-0.260
I
+0.045
-1.271
!
II
I
i
!
i -0.600
i
I
-3
+5
-3
+5
I
I
I
i
-3
-3
+5
II
I
I
I
I!
-3
-3
+4.274
i
I
+0.031 I -0.236
J
I
-15TABLE III
~~
Confidence Interval for g"
(after the· transt'l'mation. x-l!-= 1 + 4x)
I
!--------,-------.....,
Sample 1
Sample 2
I
Design 2
N
I
,
10
I
0.95 I
I 20
-0.782, +1.027
-0.342, +1.827 ,i
-0.375, +0.540
-0.335, +0.8 23/
10
I -1.170, +1.541
-0.772, +2.5181
20
-0.536, +0.724
0.99
I
I
;
;
-0.526, +1.048·
REFERENCES
Advanced Statistical Methods in Biometric
Research, John Wiley and Sons, New York (1952).
["2_7 Harald Cramer, Mathematical Methods of Statistics, Princeton
University Press, Princeton (1946).
"The frequency distribution of the quotient
of two normal variates", Journal of the R0i$l
Stutistical Society, Vol. 93 (i~30), pp. 4 -6.
~4_7
E. C. Fie11er,
£5_7 P.
c.
r~ fundamental formula in the statistics of
biological assay, and some applications",
Quarterly Journal of Pharmacy, Vol. 17 (1944),
pp. 117-23.
Maha1anobis, et.al., "Tables of a random sample from a
normal population, II Sankytza, Vol. 1 (1934),
pp. 289-328.
© Copyright 2026 Paperzz