An empirical Bayes approach to polynomial regression
under order restrictions

BY LEONARD W. DEATON

SUMMARY
An unknown polynomial is to be estimated over a finite interval from $N$ independent, normally distributed observations. A prior distribution is placed on the polynomial coefficients expressing the opinion that the coefficients decrease in absolute value as the degrees of the corresponding terms increase. The data are used to estimate the parameters in the prior distribution of the coefficients. A Monte Carlo study is presented which compares the proposed method with the lack-of-fit procedure. This study indicates that the proposed method performs well in terms of minimizing a squared error loss as well as in yielding the correct degree of the polynomial being estimated.
Some key words: Bayes rule; Empirical Bayes procedure; Isotonic regression; Least squares estimate; Maximum likelihood procedure; Monte Carlo study; Order restriction; Orthonormal; Polynomial regression.
1. INTRODUCTION
There are many methods for polynomial regression. Most classical methods are well known. Bayesian approaches have been devised by Guttman (1967), Halpern (1973) and Young (1977). Hager & Antle (1968) studied Guttman's method for determining the degree of a polynomial and concluded that it was not of practical value. They also recommended that future approaches to this problem be compared to the lack-of-fit procedure. Halpern made such a comparison which indicated that his method, using a vague prior on the parameters, was of practical value. However, unless a vague prior is used on the parameters, Halpern's method appears to be computationally cumbersome.

Young's procedure is not designed for determining the correct degree of the polynomial, but is designed only for optimal prediction. Hence, Young's procedure always yields a polynomial of maximal degree. Young's procedure requires numerical methods to approximate a mode.
The initial assumptions of the method proposed here closely resemble those of Young. However, we also attempt to determine the correct degree of the polynomial being estimated. As recommended by Hager & Antle, we have compared our procedure with the lack-of-fit procedure. The results of this comparison are presented in § 4. The computations required for our method are simple and exact. A numerical example is given in § 5.
2. THE MODEL

We are to estimate the polynomial function $P$. We have $N$ independent observations $Y_i$ of $P$ at the points $x_i$ of the form

$$Y_i = P(x_i) + \epsilon_i \quad (i = 1, \ldots, N),$$

where the $\epsilon_i$ are normally distributed with mean zero and unknown variance $\sigma^2$. We write $P$ as the sum of orthonormal polynomials, that is

$$P(x) = \sum_{j=0}^{m} \theta_j \psi_j(x),$$

where $\psi_j$ is a polynomial of degree $j$ such that

$$\sum_{k=1}^{N} \psi_i(x_k)\psi_j(x_k) = \delta_{ij} \quad (i, j = 0, \ldots, m);$$

the coefficients $\theta_j$ are unknown. This assumes that we have at least $m+1$ distinct values of the $x_i$. By defining the $N \times (m+1)$ matrix $Q$ with the element of the $i$th row and $j$th column as $\psi_j(x_i)$, we obtain $Q'Q = I$, where $I$ is the identity matrix.
Our assumptions thus far may be expressed in matrix notation as

$$Y \mid \theta \sim N(Q\theta, \sigma^2 I).$$

That is, we observe an $N$ dimensional random vector $Y$ which, given the $m+1$ dimensional vector $\theta$, has a multivariate normal distribution with mean $Q\theta$ and covariance matrix $\sigma^2 I$. The least squares estimator $\hat\theta$ for $\theta$ and the error sum of squares $s$ are independent sufficient statistics for the problem. Hence, the components $\hat\theta_i$ given $\theta$ are independently normally distributed with mean $\theta_i$ and variance $\sigma^2$, and are independent of $s$ given $\theta$, which is such that $s/\sigma^2$ is chi-squared with $n$ degrees of freedom. That is

$$\hat\theta_i \mid \theta \sim N(\theta_i, \sigma^2), \qquad s/\sigma^2 \mid \theta \sim \chi^2_n, \tag{2.1}$$

where $n = N - m - 1$.
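As a concrete illustration of how $Q$, $\hat\theta$ and $s$ can be obtained in practice (a sketch only; NumPy and the function and variable names are our own, not the paper's), a QR decomposition of the Vandermonde matrix yields the orthonormal design directly:

    import numpy as np

    def sufficient_stats(x, y, m):
        # Columns of V are 1, x, ..., x^m evaluated at the observation points.
        V = np.vander(np.asarray(x, dtype=float), m + 1, increasing=True)
        # QR gives Q with Q'Q = I; its columns are the orthonormal
        # polynomials psi_0, ..., psi_m evaluated at the x_i.
        Q, _ = np.linalg.qr(V)
        theta_hat = Q.T @ np.asarray(y, dtype=float)   # least squares estimates
        s = float(np.sum((y - Q @ theta_hat) ** 2))    # error sum of squares
        return Q, theta_hat, s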
We put a prior distribution on $\theta$ and assume that its components $\theta_i$ are independently normally distributed with mean 0 and variance $\sigma_i^2$. We allow $\sigma_i^2 = 0$, that is, some components of $\theta$ may be degenerate at zero.

If a zero mean would contradict our prior opinion for some of the $\theta_i$ we may use something other than zero. Then, the procedures given here would only require a slight adjustment. If one has a vague opinion about $\theta_i$, then a sufficiently large value of $\sigma_i^2$ will negate the effect of assigning a prior mean of zero to it.
Hence,

$$\theta_i \mid \hat\theta_i \sim N\{(1 - z_i)\hat\theta_i, \; (1 - z_i)\sigma^2\}, \tag{2.2}$$

where

$$z_i = \sigma^2/(\sigma^2 + \sigma_i^2).$$

Also, the marginal distributions of the $\hat\theta_i$ are independent normal distributions with mean 0 and variance $\sigma^2/z_i$. Hence

$$\hat\theta_i \sim N(0, \sigma^2/z_i). \tag{2.3}$$
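For completeness, (2.2) is the standard normal-normal update: combining the likelihood from (2.1) with the prior and completing the square in $\theta_i$ gives

$$\pi(\theta_i \mid \hat\theta_i) \propto \exp\Big\{-\frac{(\hat\theta_i - \theta_i)^2}{2\sigma^2} - \frac{\theta_i^2}{2\sigma_i^2}\Big\} \propto \exp\Big[-\frac{\{\theta_i - (1 - z_i)\hat\theta_i\}^2}{2\sigma^2(1 - z_i)}\Big],$$

since the posterior precision is $\sigma^{-2} + \sigma_i^{-2}$ and $1 - z_i = \sigma_i^2/(\sigma^2 + \sigma_i^2)$.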
As a result of (2.1) and (2.3) the joint marginal distribution of the $\hat\theta_i$ and $s$ is proportional to

$$\left\{\prod_{i=0}^{m} \frac{z_i^{1/2}}{\sigma} \exp\left(-\frac{z_i \hat\theta_i^2}{2\sigma^2}\right)\right\} \frac{s^{\frac{1}{2}n-1}}{\sigma^{n}} \exp\left(-\frac{s}{2\sigma^2}\right). \tag{2.4}$$
The mean of the distribution (2.2) will provide us with the Bayes rule for estimating $\theta_i$ when $\sigma^2$ and $\sigma_i^2$, and hence $z_i$, are known. We proceed in a manner somewhat similar to that of Efron & Morris (1973) and use the data to estimate the $z_i$. However, this procedure differs from theirs in that we shall not assume the $z_i$ are all equal nor shall we use a loss function in obtaining our estimates. Indeed, the process of selecting an appropriate model for regression is accomplished by estimating certain $z_i$ to be 1.
We eventually express complete vagueness in our prior opinion of $\theta_0$, the constant term of the polynomial, by taking $\sigma_0^2 = \infty$. Hence, we use the least squares estimator $\hat\theta_0$ to estimate $\theta_0$. The theory at this point will not allow such an assignment. So, temporarily we assume $\sigma_0^2$ is fixed at some large positive value. We also assume that

$$\sigma_0^2 \ge \sigma_1^2 \ge \cdots \ge \sigma_m^2. \tag{2.5}$$
Young (1977) also assumes (2.5) and uses a vague prior on $\theta_0$. The constraint (2.5) reflects a prior opinion, which becomes increasingly stronger as the index $i$ increases, that $\theta_i$ is near zero. The assumptions in (2.5) can be relaxed in varying degrees to the point of being eliminated entirely. However, we believe that (2.5) is appropriate for most practical problems.
In terms of the $z_i$ our assumptions are

$$z_0 = \epsilon \le z_1 \le \cdots \le z_m \le 1, \tag{2.6}$$

where $\epsilon$ is a known positive number near zero.

Although estimates of the $z_i$ are enough to give us an estimate of $\theta$, it is both practical and convenient also to estimate $\sigma^2$. One way to do this is to use a maximum likelihood procedure and select $\sigma^2$ and the unknown $z_i$ to maximize (2.4) subject to the restrictions in (2.6).
First, we make the transformation

$$V_i = z_i \sigma^{-2}, \qquad V_{m+1} = \sigma^{-2} \tag{2.7}$$

for $i = 1, \ldots, m$. Then, apart from a factor not involving the parameters, (2.4) may be rewritten as

$$\prod_{i=1}^{m} \left\{V_i^{1/2} \exp\left(-\tfrac{1}{2} V_i \hat\theta_i^2\right)\right\} V_{m+1}^{\frac{1}{2}(n+1)} \exp\left\{-\tfrac{1}{2} V_{m+1}\left(s + z_0 \hat\theta_0^2\right)\right\}. \tag{2.8}$$
3. ESTIMATING THE HYPERPARAMETERS
We could estimate the hyperparameters $V_i$ by selecting them to maximize (2.8) subject to the restrictions in (2.6). In terms of the $V_i$, (2.6) becomes

$$\epsilon V_{m+1} \le V_1 \le V_2 \le \cdots \le V_m \le V_{m+1}. \tag{3.1}$$

Thus (2.8) could be maximized subject to (3.1). But we prefer to put a prior on $(z_1, \ldots, z_m, V_{m+1})$ which is proportional to

$$V_{m+1}^{\gamma_{m+1}-1} \exp\left(-V_{m+1} \beta_{m+1}\right) \prod_{i=1}^{m} z_i^{\gamma_i - 1} \tag{3.2}$$

for the $z_i$ satisfying the restrictions in (2.6) and $V_{m+1} > 0$.
Apart from the restrictions in (2.6) we see that (3.2) is a product of independent beta distributions and a gamma distribution. The choice $\gamma_i = 1$ gives a uniform distribution of $z_i$, and larger values of $\gamma_i$ express stronger opinions that the $z_i$ are near 1. In testing the hypothesis that $\theta_i = 0$ in the classical sense, we essentially express a prior opinion that $\theta_i = 0$ and will stay with that opinion unless sampling evidence is sufficiently strong to reject the hypothesis. Apart from the restrictions in (2.6), we believe that selecting values of $\gamma_i$ larger than 1 is in spirit similar to selecting significance levels less than 50% in testing the hypothesis that $\theta_i = 0$.
The posterior distribution of $(z_1, \ldots, z_m, V_{m+1})$ given $\hat\theta$ and $s$ is proportional to the product of (3.2) and (2.8), which can be written as

$$\prod_{i=1}^{m} \left\{V_i^{\gamma_i - \frac{1}{2}} \exp\left(-\tfrac{1}{2} V_i W_i\right)\right\} V_{m+1}^{\frac{1}{2} n'} \exp\left(-\tfrac{1}{2} V_{m+1} W_{m+1}\right), \tag{3.3}$$

where

$$n' = n + 1 + 2\Big\{(\gamma_{m+1} - 1) - \sum_{i=1}^{m} (\gamma_i - 1)\Big\}, \qquad W_{m+1} = s + 2\beta_{m+1} + z_0 \hat\theta_0^2, \qquad W_i = \hat\theta_i^2 \quad (i = 1, \ldots, m),$$

provided that (3.1) is satisfied.
satisfied.
(3-3) subject
subject to the restrictions (3'1).
(3*1). If
If (3'3)
(3-3) is
Our problem is to select V
~t to maximize (3'3)
maximized by taking P
Vothen
thenwe
wewould
woulduse
use(2·7)
(2-7)to
tosolve
solvefor
for the
theestimates
estimatesz(£tof
ofz(zi so
sothat
that
~<== ~,
of 8(
6i are
are 0((
#((1
Ourestimate
estimateofof8( 6ist zero
is zero
provided
our
estimate
is one.
our final estimates of
I -—
z().zt).
Our
provided
our
estimate
z( iisi one.
This occurs provided %
~ == 'Pmfm+1
+1. .
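In code, the back-transformation from $\hat V$ to the final coefficient estimates is immediate (a sketch with assumed names, taking $z_0 = 0$ as in § 4):

    import numpy as np

    def final_estimates(theta_hat, V):
        # V holds (V_hat_1, ..., V_hat_{m+1}); theta_hat holds (theta_hat_0, ..., theta_hat_m).
        z_hat = np.asarray(V[:-1], dtype=float) / V[-1]   # z_hat_i = V_hat_i / V_hat_{m+1}
        theta = np.asarray(theta_hat, dtype=float).copy()
        theta[1:] *= 1 - z_hat    # theta_i estimated by theta_hat_i (1 - z_hat_i)
        return theta              # theta_0 kept at its least squares value (z_0 = 0)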
We now define $g_i$ by

$$g_{m+1} = n'/W_{m+1}, \qquad g_i = (2\gamma_i - 1)/W_i \quad (i = 1, \ldots, m). \tag{3.4}$$
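Equation (3.4) transcribes directly into code (a sketch; the argument names are ours, with gamma holding $(\gamma_1, \ldots, \gamma_{m+1})$ and n the residual degrees of freedom):

    import numpy as np

    def g_and_w(theta_hat, s, n, gamma, beta_m1, z0):
        theta_hat = np.asarray(theta_hat, dtype=float)   # (theta_hat_0, ..., theta_hat_m)
        gamma = np.asarray(gamma, dtype=float)           # (gamma_1, ..., gamma_{m+1})
        m = len(theta_hat) - 1
        n_prime = n + 1 + 2 * ((gamma[-1] - 1) - np.sum(gamma[:-1] - 1))
        W = np.append(theta_hat[1:] ** 2,                        # W_i = theta_hat_i^2
                      s + 2 * beta_m1 + z0 * theta_hat[0] ** 2)  # W_{m+1}
        g = np.append((2 * gamma[:-1] - 1) / W[:m],              # g_i = (2 gamma_i - 1)/W_i
                      n_prime / W[m])                            # g_{m+1} = n'/W_{m+1}
        return g, W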
If we ignore the restrictions in (3.1) and if the $g_i$ are positive, then taking $\hat V_i$ to be $g_i$ for $i = 1, \ldots, m+1$ maximizes (3.3). If such $\hat V_i$ satisfy the restrictions in (3.1) our problem is solved. The following theorem gives us a more general solution.
THEOREM 1. If the $g_i$ in (3.4) are positive, then (3.3) is maximized by taking $\hat V_i$ to be the isotonic regression of $g_i$ with weights $W_i$ for $i = 1, \ldots, m+1$.

The proof is too long to be included here. For an account of isotonic regression, see Barlow et al. (1972, p. 9).
The next theorem gives a formula for computing the isotonic regression.

THEOREM 2. Let $\hat V_i$ be the isotonic regression of $g_i$ with weights $W_i$ for $i = 1, \ldots, m+1$. Then

$$\hat V_i = \max_{s \le i} \min_{t \ge i} \mathrm{Av}(s, t) \quad (i = 1, \ldots, m+1),$$

where

$$\mathrm{Av}(s, t) = \Big(\sum g_r W_r\Big)\Big/\Big(\sum W_r\Big),$$

where both sums are from $r = s$ to $r = t$.

The formula for $\hat V_i$ in Theorem 2 is called a max-min formula and is given by Barlow et al. (1972, p. 19).
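The max-min formula can be evaluated by brute force; the following sketch (our own, adequate for the small number of coefficients arising here) runs in $O(k^3)$ for $k = m+1$ values:

    import numpy as np

    def isotonic_maxmin(g, w):
        g, w = np.asarray(g, dtype=float), np.asarray(w, dtype=float)
        k = len(g)
        def av(s, t):  # weighted average Av(s, t) of g[s], ..., g[t]
            return np.dot(g[s:t + 1], w[s:t + 1]) / np.sum(w[s:t + 1])
        return np.array([max(min(av(s, t) for t in range(i, k))
                             for s in range(i + 1))
                         for i in range(k)])

Applied to the $g_i$ and $W_i$ of Table 2 in § 5, this returns (0.0973, 0.0973, 0.2083, 0.7485, 0.7485, 0.7485, 0.7485), matching the $\hat V_i$ row there.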
It is rather difficult to make any simple statement as to how many coefficients $\hat\theta_i$ will be eliminated by this procedure. If a given coefficient is eliminated, then all coefficients of higher degree orthonormal polynomials will also be eliminated. As for those coefficients which are left in, the shrinkage factors increase with the degree of the polynomial. Roughly speaking, a coefficient $\hat\theta_i$ is eliminated provided $g_i$ is 'significantly' larger than $g_{m+1}$. The previous statement is an oversimplification since all $g_i$ and their weights $W_i$ must be taken into account. For maximum elimination, we would want $g_{m+1}$ small and all other $g_i$ large. As shown in the Monte Carlo study in § 4, it seems difficult to overeliminate when some elimination is required. In that study $g_{m+1}$ was essentially zero and excellent results were obtained.
The assumption in Theorem 1 that the $g_i$ are positive is met provided that

$$\gamma_i > \tfrac{1}{2} \quad (i = 1, \ldots, m), \qquad \sum_{i=1}^{m} \gamma_i < \tfrac{1}{2}\{n - 1 + 2(m + \gamma_{m+1})\}.$$

The requirement that $\gamma_i > \tfrac{1}{2}$ for $i = 1, \ldots, m$ does restrict our ability to express a strong prior opinion that any $z_i$ is near zero. However, if we have a very strong prior opinion that $z_i$ is very near zero, we can simply take $z_i = 0$ as we do with $z_0$. It would seem rather rare that Theorem 1 could not be applied in any practical case.
By comparing (3.3) to (2.8) we see that the problem of maximizing one is of the same form as the problem of maximizing the other. In fact, if in (3.3) and (3.4) we take $\beta_{m+1} = 0$ and the $\gamma_i = 1$, the problems are equivalent.
We now consider the problem of expressing complete vagueness in our prior opinion of $\theta_0$. We note that there is no mathematical difficulty encountered with simply taking $z_0 = \epsilon = 0$ in (3.1), (3.3) and (3.4). An appropriate continuity result justifies the process of taking $z_0 = 0$ when selecting the $V_i$ to maximize (3.3) subject to (3.1). This procedure can be applied to express vagueness for any $\theta_i$. For example, if we wanted our estimate to be at least a quadratic, we could take $z_0 = z_1 = z_2 = 0$.
4. MONTE CARLO STUDIES

The Monte Carlo study consisted of adding a $N(0,1)$ deviate to a polynomial $P(x)$. Two observations were made at each of the seven points $x = 0, \pm 1, \pm 2$ and $\pm 3$. With these 14 observations the following estimates of $P(x)$ were computed:

(i) Hager & Antle's (1968) lack-of-fit test using a 5% level of significance for each $F$ test.

(ii) The isotonic regression rule proposed in this paper, in which the assumed restrictions on the $V_i$ were $V_1 \le \cdots \le V_7$. The parameters in (3.2) were selected to express as strong an opinion as possible that the $z_i$ were near 1 while still keeping the $g_i$ positive. In fact

$$\gamma_1 = \gamma_2 = 1, \quad \gamma_3 = 1.5, \quad \gamma_4 = \gamma_5 = 2.0, \quad \gamma_6 = 2.49999999, \quad \gamma_7 = 1, \quad \beta_7 = 0, \quad z_0 = 0.$$

The strange selection for $\gamma_6$ is because, once the others were selected, $g_7$ is positive provided $\gamma_6 < 2.5$; the arithmetic behind this bound is spelled out after this list. This selection of the parameters will almost certainly eliminate the sixth degree term of the polynomial unless the sixth degree fit is essentially perfect.
(iii) This is the same as (ii) except that null values for the parameters were used. That is, all $\gamma_i = 1$, $\beta_7 = 0$ and $z_0 = 0$. This expresses a vague prior opinion on the $z_i$.

Thus the rules (ii) and (iii) provide two extreme prior opinions on the $z_i$.
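Explicitly, with $N = 14$ observations and $m = 6$ we have $n = N - m - 1 = 7$, and the positivity condition of § 3 with $\gamma_7 = 1$ reads

$$\sum_{i=1}^{6} \gamma_i < \tfrac{1}{2}\{n - 1 + 2(m + \gamma_7)\} = \tfrac{1}{2}\{7 - 1 + 2(6 + 1)\} = 10;$$

since $\gamma_1 + \cdots + \gamma_5 = 7.5$ under rule (ii), positivity of $g_7$ requires $\gamma_6 < 2.5$.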
To compare the accuracy of each estimate $\hat P(x)$ of $P(x)$, the loss function $L(\hat P, P)$ was used, where $L(\hat P, P) = \int \{\hat P(x) - P(x)\}^2 \, dx$ and the limits on the integral were $-3$ and 3. Hence, for each estimate, a loss was observed. The process was repeated for a total of 121 observations of the loss for each rule. Also observed was the number of times in the 121 trials each rule yielded the correct degree of $P(x)$.
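The paper does not describe how the integral was evaluated; a minimal sketch, assuming ordinary polynomial coefficients stored in increasing order of degree and a simple grid quadrature:

    import numpy as np

    def squared_error_loss(p_hat, p, a=-3.0, b=3.0, npts=2001):
        # p_hat, p: polynomial coefficients, lowest degree first.
        x = np.linspace(a, b, npts)
        diff = np.polyval(p_hat[::-1], x) - np.polyval(p[::-1], x)
        return np.trapz(diff ** 2, x)   # approximates the integral of the squared error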
The entire process was repeated for nine different polynomials $P(x)$ which ranged from degree two to degree four. For each rule it was assumed that the degree of the polynomial was known to be between one and six inclusive. The results are summarized in Table 1. For example, in case 3 the coefficient of the orthonormal polynomial of degree zero was 4, of degree one was 10, of degree two was 20 and zero for the others. Thus case 3 dealt with a quadratic polynomial. The smallest average loss was obtained by the rule (ii) and was 0.1874. The lack-of-fit rule yielded the correct degree 116 times in 121 trials. Although not shown in the table, the average loss for the least squares estimator of the full model, a sixth degree polynomial, was also computed and was 0.8069 for every case.

One can see that in terms of loss, (ii) did better than lack-of-fit in every case. In terms of degree, (ii) beats lack-of-fit in 6 out of the 9 cases. Method (iii) beats lack-of-fit in 4 cases in terms of degree and in 4 cases in terms of loss.

The five cases numbered 3, 4, 5, 7, 9 have coefficients that appear rather unlikely on the basis of the prior assumptions for the rules (ii) and (iii). Of those, only in two cases did lack-of-fit outdo rule (ii) in terms of degree.
Table 1. Monte Carlo results

                                     Number of times correct degree     Average loss × 10^4
    Case   Actual θ                     (i)     (ii)    (iii)          (i)     (ii)    (iii)
     1     5, 6, 2, 1, 0, 0, 0            6      63      21          2592    2110    3370
     2     15, 0.5, 0.05, 0, 0, 0, 0      1      30      12          1683    1320    2484
     3     4, 10, 20, 0, 0, 0, 0        116      95      36          2067    1874    3306
     4     10, 10, -20, 30, 0, 0, 0     117     109      47          2655    2392    3932
     5     10, 0, 0, 3, 0, 0, 0          68     108      47          6300    2738    3999
     6     10, 1, 2, -2, 0, 0, 0         35     100      47          7102    3371    4188
     7     1, 1, 2, 3, 4, 0, 0          108     116      62          5330    3643    4691
     8     1, 5, 4, 3, 2, 0, 0           44      93      59          9075    5109    5090
     9     1, 50, 25, 50, 10, 0, 0      116     116      62          3810    3207    4759

(i) Lack-of-fit; (ii) and (iii), isotonic regression rules.
The rules (ii) and (iii) might have been improved by using something other than vague knowledge on $\sigma^2$ or $V_{m+1}$. Since there were repeat measurements available, we could substitute for the error sum of squares $s$ the sum of squares for pure error (Draper & Smith, 1966, p. 26). This may improve the rules (ii) and (iii), since $g_{m+1}$ as given in (3.4) is used as an initial estimate for $V_{m+1} = \sigma^{-2}$. These ideas have not been tested in Monte Carlo studies.
5. A NUMERICAL EXAMPLE

As a particular example we consider one of the results from case 6 of the Monte Carlo study. The computations required to compute the estimate of $\theta$ using rule (ii), as described in § 4, are given in Table 2. The table is composed of two parts. In the top part the index on the variables runs from zero to six. In the bottom part, the index on the variables runs from one to seven. For example, the estimate of $\theta_2$ given by rule (ii) is 2.5259 while the value of $\hat V_2$ is 0.0973.
Table 2. Numerical example for results from case 6 of Table 1, application of method (ii)

                               i=0       i=1       i=2       i=3       i=4       i=5       i=6
    Actual $\theta_i$           10        1         2        -2         0         0         0
    Least squares $\hat\theta_i$ 9.8422    1.3619    2.9031   -2.1913    1.0629   -0.5344    0.2356
    $\hat z_i$                   0        0.1299    0.1299    0.2782    1         1         1
    Rule (ii)                   9.8422    1.1850    2.5259   -1.5816    0         0         0

                               i=1       i=2       i=3       i=4       i=5       i=6       i=7
    $W_i$                       3.7096   16.8556    9.6034    2.2596    0.5712    0.1110   10.4187
    $g_i$                       0.2696    0.0593    0.2083    1.3277    5.2525   36.0380   19 × 10^-10
    $\hat V_i$                  0.0973    0.0973    0.2083    0.7485    0.7485    0.7485    0.7485
The value $W_7$ is the residual sum of squares for the full model, or $s$. For $i = 1, \ldots, 6$, $W_i = 2\hat\theta_i^2$. This is different from formula (3.3). The doubling of the weights is a direct result of making two observations at each point instead of one. Thus, the assumption on $\hat\theta_i$ in (2.1) must be modified by replacing $\sigma^2$ with $\tfrac{1}{2}\sigma^2$. Similar modifications must be made for those parts of the remaining formulae in which $\sigma^2$ appears through an assumption regarding the $\hat\theta_i$. However, where $\sigma^2$ appears as a direct result of the assumptions on $s$, the formula should be left unchanged. For example, in (2.4), we replace $\sigma$ by $\sigma/\sqrt{2}$ when and only when it occurs after the product sign. The definitions of the $z_i$ are changed to $z_i = \sigma^2/(\sigma^2 + 2\sigma_i^2)$, but (2.7) would be left as it is.
The actual computation of the $\hat V_i$ from the $g_i$ and $W_i$ can be done in under five minutes with a hand calculator using the minimum violator algorithm of Barlow et al. (1972, p. 19).
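A sequential alternative to the max-min formula of Theorem 2 is the familiar pool-adjacent-violators scheme, also described by Barlow et al. (1972); the following sketch (ours, not the minimum violator algorithm itself, though it produces the same isotonic regression) merges neighbouring blocks while the increasing order is violated:

    def pool_adjacent_violators(g, w):
        # Each block carries (weighted mean, total weight, number of points pooled).
        blocks = []
        for gi, wi in zip(g, w):
            blocks.append([gi, wi, 1])
            # Merge backwards while an adjacent pair violates the increasing order.
            while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
                m2, w2, c2 = blocks.pop()
                m1, w1, c1 = blocks.pop()
                wt = w1 + w2
                blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
        return [m for m, _, c in blocks for _ in range(c)]

On the data of Table 2 the first two values pool to 0.0973 and the last four pool to 0.7485, reproducing the $\hat V_i$ row.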
The paper is based primarily on my Ph.D. thesis supervised by Professor H. D. Brunk at Oregon State University. I thank Professor Brunk and the referees for their invaluable assistance with the paper.
REFERENCES

BARLOW, R. E., BARTHOLOMEW, D. J., BREMNER, J. M. & BRUNK, H. D. (1972). Statistical Inference under Order Restrictions. New York: Wiley.

DRAPER, N. R. & SMITH, H. (1966). Applied Regression Analysis. New York: Wiley.

EFRON, B. & MORRIS, C. (1973). Stein's estimation rule and its competitors - an empirical Bayes approach. J. Am. Statist. Assoc. 68, 117-30.

GUTTMAN, I. (1967). The use of a concept of a future observation in goodness-of-fit problems. J. R. Statist. Soc. B 29, 83-100.

HAGER, H. & ANTLE, C. (1968). The choice of the degree of a polynomial model. J. R. Statist. Soc. B 30, 469-71.

HALPERN, E. F. (1973). Polynomial regression from a Bayesian approach. J. Am. Statist. Assoc. 68, 137-43.

YOUNG, A. S. (1977). A Bayesian approach to prediction using polynomials. Biometrika 64, 309-17.