
VARIANCE FUNCTIONS AND THE MINIMUM DETECTABLE CONCENTRATION IN ASSAYS
R.J. Carroll
M. Davidian
University of North Carolina at Chapel Hill
W. Smith
Eli Lilly & Company
Acknowledgement
The research of Carroll and Davidian was supported by the Air Force
Office of Scientific Research AFOSR-F-49620-85-C-0144.
Key Words and Phrases:
Weighted least squares, extended least squares,
heteroscedasticity, calibration, prediction.
Abstract
Assay data are often fit by a nonlinear regression model incorporating heterogeneity of variance, as in radioimmunoassay, for example. Typically, the standard deviation of the response is taken to be proportional to a power θ of the mean. There is considerable empirical evidence suggesting that for assays of a reasonable size, how one estimates the parameter θ does not greatly affect how well one estimates the mean regression function. An additional component of assay analysis is the estimation of auxiliary constructs such as the minimum detectable concentration, for which many definitions exist; we focus on one such definition. The minimum detectable concentration depends both on θ and the mean regression function. We compare three standard methods of estimating the parameter θ due to Rodbard (1978), Raab (1981a) and Carroll and Ruppert (1982b). When duplicate counts are taken at each concentration, the first method is only 20% efficient asymptotically in comparison to the third, and the resulting estimate of the minimum detectable concentration is asymptotically 3.3 times more variable for the first than for the third. Less dramatic results obtain for the second estimator compared to the third; this estimator is still not efficient, however. Simulation results and an example are supportive of the asymptotic theory.
1. Introduction
The analysis of assay data has long been an important problem in clinical chemistry and the biological sciences; see, for example, Finney (1964) and Oppenheimer, et al. (1983). The most common method of analysis is to fit a nonlinear regression model to the data. Much recent work suggests that these data can be markedly heteroscedastic; in radioimmunoassay, for example, this characteristic has been observed repeatedly and incorporated into the analysis as discussed by Finney (1976), Rodbard (1978), Tiede and Pagano (1979), Raab (1981a, b) and Butt (1984). Such analyses are for the most part special cases of the heteroscedastic nonlinear regression model.
Specifically, we observe independent counts Y_ij at concentrations x_i for i = 1, ..., N and j = 1, ..., M_i, with mean and variances given by

(1.1) $E\,Y_{ij} = f(x_i,\beta), \qquad \operatorname{var}(Y_{ij}) = \{\sigma g(x_i,\beta,\theta)\}^2,$

where β is the unknown regression parameter vector of length p and θ is the structural variance parameter. A fairly standard model for the mean in a radioimmunoassay is the four parameter logistic model

(1.2) $f(x,\beta) = \beta_1 + \frac{\beta_2 - \beta_1}{1 + (x/\beta_3)^{\beta_4}}.$
Almost without exception, the variances have been modeled as functions of the mean response, usually either as a quadratic or as a power of the mean, e.g.,

(1.3) $\text{Standard deviation of } Y_{ij} = \sigma g(x_i,\beta,\theta) = \sigma f^{\theta}(x_i,\beta).$

The fundamental contribution of Rodbard and other workers has been to incorporate the heterogeneity into the analysis; the result has been a great improvement in the quality of their statistical analysis.
Methods of estimating β and θ are discussed in Section 2. The most common method of estimating β is generalized least squares. By various devices, one forms estimates σ̂_i of the variances σ_i and then estimates β by weighted least squares. As discussed by Jobson and Fuller (1980) and Carroll and Ruppert (1982b), under quite general circumstances for large enough sample sizes how one estimates the variance does not matter, and β̂ is asymptotically normally distributed with mean β and variance (σ²/N_S)S_G^{-1}, where N_S is the total sample size and

(1.4) $S_G = N_S^{-1}\sum_{i=1}^{N}\sum_{j=1}^{M_i} f_\beta(x_i,\beta) f_\beta(x_i,\beta)^{T}/g^{2}(x_i,\beta,\theta),$

f_β being the derivative of f with respect to β. It has been shown that for estimation of β, how one estimates the variance function, in particular the parameter θ, has only a second order effect asymptotically; see, for example, Rothenberg (1984). The general asymptotic result (1.4) can be optimistic, but in our experience for RIA and ELISA assays the asymptotics are often rather reasonable.
What the previous discussion suggests is that if our only interest is to estimate β, then in many assays the method of estimating the variance function may not be crucial. However, the assay problem does not always stop with estimating β, but rather also addresses issues of calibration. These issues include confidence intervals for a true x* given a new Y*, the classic calibration problem. Also of interest is determining the sensitivity of the assay using such concepts as the minimum detectable concentration of Rodbard (1978) and the critical level, detection level and determination limit of Oppenheimer, et al. (1983). A unique feature of these calibration problems is that the efficiency of estimation is essentially determined by how well one estimates the variance parameter θ; the purpose of this paper is to justify this claim.
To the best of our knowledge, our paper is one of the first which shows explicitly that how one estimates the structural variance parameter θ can be important in determining the behavior of estimates of interesting quantities. Far from being only a nuisance parameter as it is often thought to be, θ is a quantity which has an important role in the analysis of calibration and prediction problems. In addition, θ can be important in itself, as for example in off-line quality control; see Box and Meyer (1986). In this latter application, one might want to find the levels of x which give minimum variance subject to a constraint on the mean.
Our general qualitative conclusion is that how well one estimates θ really matters. Instead of pursuing a fully general theory, we focus on the determination of the minimum detectable concentration. There is no unique definition of this concept, and for illustration we pick one of the possible candidates.
Definition. Let Ȳ(x,M) be the mean response based on M replicates at concentration level x, taken independently of the calibration data set {Y_ij}. Let f(0,β) be the expected response at zero concentration based on the calibration data set. The minimum detectable concentration x_c at level (1-α) is the smallest concentration x for which

(1.5) $\Pr\{\bar Y(x,M) \ge f(0,\beta)\} \ge 1 - \alpha.$
Qualitatively, the minimum detectable concentration is arrived at as illustrated in Figure 1. One first constructs the estimated regression function f(x,β̂), and then attaches to it an estimated lower (1-α) confidence line for the new response Ȳ(x,M) at concentration x based on M replicates. Starting from the estimated zero concentration mean f(0,β̂), one does a standard calibration by drawing a horizontal line until it intersects the lower confidence line, the value of x at which this intersection occurs being the minimum detectable concentration. In Figure 1, we illustrate why getting a good handle on the variance function is important. Assuming as is natural that the variance is smallest where the mean count is smallest, we see that the prediction interval based on an unweighted analysis is much too conservative for low concentrations. This translates immediately into a large bias in the estimated minimum detectable concentration. Even if the heterogeneity of variance is taken into account, Figure 1 makes clear that a poor estimate of θ can have considerable impact on the estimated minimum detectable concentration.
We now outline the standard method for estimating the minimum detectable concentration. To be more precise and follow the outline given in the preceding paragraph, one would replace the t-percentage point to follow with an asymptotically negligible correction based on the limit distribution of the estimates of (θ,β,σ). This program has not been followed in practice since the latter limit distribution has been unknown, and in any case the effect is asymptotically unimportant. If t(α,N_S-p) is the (1-α)th percentile of the t-distribution with N_S - p degrees of freedom, the usual estimate x̂_c satisfies

(1.6) $\{f(\hat x_c,\hat\beta) - f(0,\hat\beta)\}^2 = \{t(\alpha,N_S-p)\}^2 \big[\hat\sigma^2 g^2(\hat x_c,\hat\beta,\hat\theta)/M + \widehat{\operatorname{var}}\{f(0,\hat\beta)\}\big],$

where var̂{f(0,β̂)} is an estimate of the variance of f(0,β̂) and σ̂² is the usual mean squared error from the weighted fit.
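For a given fit, (1.6) defines x̂_c through one-dimensional root finding. The sketch below is our own illustration rather than the authors' code: the power variance function g = f^θ, the bracketing interval and all numerical values (chosen in the spirit of the simulation in Section 4) are assumptions.

```python
import numpy as np
from scipy.optimize import brentq
from scipy.stats import t


def f(x, b1, b2, b3, b4):
    # four parameter logistic mean function, as in (1.2)
    return b1 + (b2 - b1) / (1.0 + (x / b3) ** b4)


def mdc(beta, theta, sigma2, var_f0, M, df, alpha=0.05, upper=5.0):
    """Smallest x solving (1.6) when g(x, beta, theta) = f(x, beta)**theta.
    Assumes the defining equation changes sign once on (0, upper)."""
    tq = t.ppf(1.0 - alpha, df)

    def gap(x):
        lhs = (f(x, *beta) - f(0.0, *beta)) ** 2
        rhs = tq ** 2 * (sigma2 * f(x, *beta) ** (2.0 * theta) / M + var_f0)
        return lhs - rhs

    return brentq(gap, 1e-8, upper)   # first sign change above zero


# hypothetical values echoing Section 4: theta = 0.7, sigma = 0.0872
beta = (29.5274, 1.8864, 1.5793, 1.0022)
print(mdc(beta, theta=0.7, sigma2=0.0872 ** 2, var_f0=0.0, M=2, df=88))
```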
In Section 2 of this paper, we discuss three standard methods for estimating the variance parameter θ. Under relatively general conditions, two of these can be quite a bit less efficient than the third, and we discuss in Section 3 how this difference translates theoretically to the minimum detectable concentration problem. In Sections 4 and 5, we present a small Monte-Carlo study and an example to illustrate the results. The key conclusion is that how one estimates θ can affect the relative efficiency of estimated quantities useful in the calibration of assays.
2. Methods of estimating mean and variance parameters
The problem of estimating θ in models (1.1) and (1.3) has been discussed in many places in the literature. A nice introduction is given by Judge, et al. (1985, Chapter 11). More specialized and formal treatments include those by Rodbard (1978), Jobson and Fuller (1980), Raab (1981a), Carroll and Ruppert (1982b) and Davidian and Carroll (1986), although these only represent a sampling of the possible selection. We focus our attention on three methods with quite different motivations. For simplicity, we will discuss only the case of equal replication M_i = M ≥ 2. The first two methods require some replication.
2.1 Log-linearized estimation

Model (1.3) implies upon taking logarithms that the log standard deviation is linear in the log mean with slope θ. Letting (Ȳ_i, s_i²) be the within-concentration sample means and variances, this suggests that one estimate θ as the slope from regressing log s_i on log Ȳ_i. If we denote this estimate as θ̂_LL, Rodbard (1978) suggests forming estimated standard deviations as the sample mean raised to the power θ̂_LL and then applying weighted least squares to estimate β.
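In outline, θ̂_LL is an ordinary least squares slope. A minimal sketch of the computation as just described (our illustration, assuming the power model (1.3)):

```python
import numpy as np


def log_linearized_theta(Y):
    """Rodbard's log-linearized estimate: the slope from regressing
    log s_i on log Ybar_i.  Y is an (N, M) array of replicate
    responses, one row per concentration; rows with zero sample
    variance must be dropped (compare Section 5)."""
    ybar = Y.mean(axis=1)
    s = Y.std(axis=1, ddof=1)
    keep = s > 0
    slope, _ = np.polyfit(np.log(ybar[keep]), np.log(s[keep]), 1)
    return slope

# The fitted weights are then 1 / ybar**(2 * theta_hat), and beta is
# re-estimated by weighted (nonlinear) least squares.
```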
2.2 Modified maximum likelihood

Raab (1981a) suggests a method for estimating θ using normal theory maximum likelihood but without making any assumptions about the form of the mean function. Raab assumes independence and proposes estimation of θ by joint maximization of the "modified" normal likelihood

(2.1) $\prod_{i=1}^{N} \{2\pi\sigma^2 g^2(\mu_i,\theta)\}^{-(M-1)/2} \exp\Big[-\sum_{j=1}^{M} (Y_{ij}-\mu_i)^2/\{2\sigma^2 g^2(\mu_i,\theta)\}\Big]$

in the parameters σ, θ, μ_1, ..., μ_N, where we have written g(x_i,β,θ) as g(μ_i,θ) to emphasize the dependence of the variance function on the mean response. The modification serves to make the estimator of σ² unbiased. Estimation of β may now proceed via weighted least squares in a fashion analogous to the log-linearized method.
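A direct, if crude, way to carry out the joint maximization of (2.1) is generic numerical optimization over (log σ, θ, log μ_1, ..., log μ_N), started at the sample means. This sketch assumes the power model g(μ,θ) = μ^θ and is not Raab's own algorithm:

```python
import numpy as np
from scipy.optimize import minimize


def raab_mml_theta(Y):
    """Joint maximization of the modified likelihood (2.1) with
    g(mu, theta) = mu**theta; parameters are (log sigma, theta,
    log mu_1, ..., log mu_N)."""
    N, M = Y.shape
    ybar = Y.mean(axis=1)

    def negloglik(par):
        logsig, theta, logmu = par[0], par[1], par[2:]
        mu = np.exp(logmu)
        g2 = np.exp(2.0 * (logsig + theta * logmu))        # sigma^2 g^2
        rss = ((Y - mu[:, None]) ** 2).sum(axis=1)
        return np.sum(0.5 * (M - 1) * np.log(2.0 * np.pi * g2)
                      + rss / (2.0 * g2))

    start = np.concatenate(([np.log(Y.std(ddof=1)), 1.0], np.log(ybar)))
    fit = minimize(negloglik, start, method="Nelder-Mead",
                   options={"maxiter": 50000, "fatol": 1e-10})
    return fit.x[1]
```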
2.3 Pseudo-likelihood

For given β, the pseudo-likelihood estimator of θ is the normal theory maximum likelihood estimate, maximizing in (σ,θ)

(2.2) $\prod_{i=1}^{N}\prod_{j=1}^{M} \{2\pi\sigma^2 g^2(x_i,\beta,\theta)\}^{-1/2} \exp\big[-\{Y_{ij}-f(x_i,\beta)\}^2/\{2\sigma^2 g^2(x_i,\beta,\theta)\}\big];$

see Carroll and Ruppert (1982b). One can devise many ways to estimate θ and β jointly. For example, one can

(i) set β̂ = unweighted least squares;
(ii) estimate θ by pseudo-likelihood;
(iii) form estimated variances g²(x_i,β̂,θ̂);
(iv) re-estimate β̂ by weighted least squares;
(v) iterate (ii) - (iv) one or more times.

The number of cycles C of this algorithm is the number of times one hits step (iv); a compact sketch of the full scheme is given below. One can do step (ii) by direct maximization or by weighted least squares as in Davidian and Carroll (1986). A key point to note is that pseudo-likelihood requires no replication and hence easily copes with unequal replication.
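In the sketch below (ours, not the authors' code) σ² is profiled out of (2.2) analytically, the power model g = f^θ is assumed, and the bounds on θ mirror the constraint used in the simulation of Section 4:

```python
import numpy as np
from scipy.optimize import curve_fit, minimize_scalar


def pl_gls(x, Y, f, p0, cycles=2):
    """Steps (i)-(v): unweighted fit for beta, then alternate
    pseudo-likelihood estimation of theta with weighted least squares
    (cycles >= 1).  f(x, *beta) is the mean function and g = f**theta."""
    xx = np.repeat(x, Y.shape[1])              # long form of the design
    yy = Y.ravel()
    beta, _ = curve_fit(f, xx, yy, p0=p0)      # (i) unweighted LS

    for _ in range(cycles):
        mu = f(xx, *beta)

        def neg_pl(theta):                     # (ii) -2 log (2.2), profiled
            g2 = mu ** (2.0 * theta)
            s2 = np.mean((yy - mu) ** 2 / g2)  # profiled sigma^2
            return np.sum(np.log(s2 * g2))

        theta = minimize_scalar(neg_pl, bounds=(0.0, 1.5),
                                method="bounded").x
        sd = mu ** theta                       # (iii) estimated sd's
        beta, _ = curve_fit(f, xx, yy, p0=beta, sigma=sd)  # (iv) WLS
    return beta, theta
```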
2.4 Other methods

Various other methods have been proposed; see, for example, Jobson and Fuller (1980) and Box and Hill (1974). Robust variance function estimation methods have also been developed; see Carroll and Ruppert (1982b) and Giltinan, Carroll and Ruppert (1986). A final method of estimating (β,θ) jointly is normal theory maximum likelihood. There are important issues of robustness which complicate routine use of this method; see McCullagh (1983), Carroll and Ruppert (1982a) and Davidian and Carroll (1986) for further discussion. For assay data, pseudo-likelihood and maximum likelihood estimates of θ have similar asymptotic behavior, and we will use the former largely for its ease of calculation.
3. Asymptotic theory
The asymptotic theory of the log-linearized estimator θ̂_LL is complicated because regressing log s_i on log Ȳ_i is not a standard linear regression problem. Likewise, the asymptotic theory for the modified maximum likelihood estimator θ̂_MML is complicated because the dimension of the parameter space increases with N. Both of these problems are nonlinear functional errors-in-variables problems of the kind addressed by Wolter and Fuller (1982), Amemiya and Fuller (1985) and Stefanski and Carroll (1985). The error in estimating μ_i by Ȳ_i in θ̂_LL, or by the joint estimator μ̂_i in θ̂_MML, causes these estimators to be biased asymptotically. This bias is typically negligible, because in most of the assays we have seen the parameter σ in (1.1) is quite small.

Thus, empirically it makes sense to define an asymptotic theory where the sample size N_S = NM becomes large and σ simultaneously is small. Because in most assays the number of replicates M is small, we shall let N → ∞ and σ → 0 while keeping M fixed; Raab (1981a) suggests that M = 2 is the most common case. It is important for the reader to understand that letting N → ∞ and σ → 0 simultaneously is dictated by the problems of studying the log-linearized estimator θ̂_LL and the modified maximum likelihood estimator θ̂_MML; the pseudo-likelihood estimator θ̂_PL has a routine asymptotic theory even for fixed σ.
The asymptotic distribution of these estimates of θ can be obtained from the general theory of Davidian and Carroll (1986). Define the standardized errors ε_ij = {Y_ij - f(x_i,β)}/{σg(x_i,β,θ)}, let q_i² = (M-1)^{-1} Σ_{j=1}^M (ε_ij - ε̄_i)² be the within-concentration sample variances of the {ε_ij}, and set v_i = log f(x_i,β), v̄ = N^{-1}Σv_i and σ_v² = lim N^{-1}Σ(v_i - v̄)².

Theorem 1. As N → ∞ and σ → 0 simultaneously with N^{1/2}σ = O(1), if the random variables {ε_ij} are symmetric and independent and identically distributed, then

$N^{1/2}(\hat\theta_{LL} - \theta) \to_{\mathcal L} N(0,\ \operatorname{var}\{\log q_i^2\}/(4\sigma_v^2)),$

$N^{1/2}(\hat\theta_{MML} - \theta) \to_{\mathcal L} N(0,\ \operatorname{var}(q_i^2)/(4\sigma_v^2)),$

$N^{1/2}(\hat\theta_{PL} - \theta) \to_{\mathcal L} N(0,\ \operatorname{var}(\varepsilon_{ij}^2)/(4M\sigma_v^2)).$
Under these asymptotics, the symmetry condition is necessary to ensure that the asymptotic distributions of θ̂_LL and θ̂_MML have zero mean; symmetry is unnecessary for the result for θ̂_PL. Sadler and Smith (1985) note that the estimator obtained by replacing μ_i by Ȳ_i in (2.1) and maximizing in σ² and θ is virtually indistinguishable in practice from the full modified maximum likelihood estimator of θ. It can be shown that these two estimators are asymptotically equivalent under the above asymptotics, and that this estimator is equivalent to the pseudo-likelihood estimator in (2.2) with f(x_i,β̂) replaced everywhere by Ȳ_i; thus, as an approximation to θ̂_MML, one could compute the pseudo-likelihood estimator using Ȳ_i.
From Theorem 1, the asymptotic relative efficiencies of the log-linearized method and the modified maximum likelihood method relative to pseudo-likelihood can be computed. Note that

(3.1) $\operatorname{var}(q_i^2) = \operatorname{var}(\varepsilon_{ij}^2)/M + 2/\{M(M-1)\},$

so that θ̂_MML has uniformly larger asymptotic variance than θ̂_PL for all M ≥ 2 regardless of the distribution of the {ε_ij}. We have tabulated the relative efficiencies of θ̂_LL and θ̂_MML to θ̂_PL for various numbers of replications M, assuming normally distributed data.
Asymptotic relative efficiencies of estimators of θ for small σ

M     θ̂_LL to θ̂_PL    θ̂_MML to θ̂_PL    θ̂_LL to θ̂_MML
2     0.203            0.500             0.405
3     0.405            0.667             0.608
4     0.535            0.750             0.713
9     0.783            0.889             0.881
10    0.804            0.900             0.893
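For normally distributed data these entries follow from chi-squared moments: (M-1)q_i² has a χ²_{M-1} distribution, so var(ε_ij²) = 2, var(q_i²) = 2/(M-1) and var{log q_i²} = ψ′{(M-1)/2}, the trigamma function. The following check is our own arithmetic, not part of the original paper:

```python
from scipy.special import polygamma   # polygamma(1, .) is trigamma

for M in (2, 3, 4, 9, 10):
    var_eps2 = 2.0                                 # var of a squared N(0,1)
    var_q2 = 2.0 / (M - 1)                         # var of chi2_{M-1}/(M-1)
    var_logq2 = float(polygamma(1, (M - 1) / 2))   # var of a log chi-squared
    pl = var_eps2 / M                              # PL variance, up to 4*sigma_v^2
    print(M, round(pl / var_logq2, 3),             # LL to PL
             round(pl / var_q2, 3),                # MML to PL
             round(var_q2 / var_logq2, 3))         # LL to MML
```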
In our experience, these numbers slightly exaggerate the inefficiency of θ̂_LL relative to pseudo-likelihood, especially when the assay is rather small. For duplicates M = 2, in two assay simulations we have found that the variance efficiency of the log-linearized method was 0.35 and 0.31; the former number is reported in the next section. The asymptotic relative efficiencies of θ̂_LL relative to θ̂_MML agree quite well with the efficiencies of 39%, 62% and 77% and 39%, 64% and 74% for M = 2, 3 and 4 reported by Raab (1981a) and Sadler and Smith (1985), respectively, in two Monte-Carlo studies. While modified maximum likelihood represents an improvement over the log-linearized method, the theory clearly points to the inefficiency of both the log-linearized and modified maximum likelihood methods when the number of replicates is small and the data are nearly normally distributed.
For the minimum detectable concentration, note that in (1.6) the term var̂{f(0,β̂)} is of the order (NM)^{-1} and is hence rather small relative to all the other terms. Of course, for normally distributed data, the limiting solution to (1.6) is the quantity x*_c, where

(3.2) $0 = \{z(\alpha)\}^2\sigma^2 f^{2\theta}(x_c^*,\beta)/M - \{f(x_c^*,\beta) - f(0,\beta)\}^2,$

and z(α) is the (1-α)th percentile point of the standard normal distribution.
Here is the major result, the technical details for which are given in the appendix. Define

(3.3) $d_0 = \log f(0,\beta) - \lim_{N\to\infty} N^{-1}\sum_{i=1}^{N} \log f(x_i,\beta).$

Theorem 2. Let x̂_c(LL), x̂_c(MML) and x̂_c(PL) denote the estimated minimum detectable concentrations using the log-linearized estimate θ̂_LL, the modified maximum likelihood estimate θ̂_MML and the pseudo-likelihood estimate θ̂_PL, respectively. Then under regularity conditions, there is a constant A₀ and a sequence b_N with b_N → A₀ for which

(3.4) $b_N N^{1/2}\{\hat x_c(LL) - x_c^*\}/\sigma \to_{\mathcal L} N(0,\ \operatorname{var}(\varepsilon_{ij}^2) + \operatorname{var}(\log q_i^2)\, d_0^2 M/\sigma_v^2),$

(3.5) $A_0 N^{1/2}\{\hat x_c(MML) - x_c^*\}/\sigma \to_{\mathcal L} N(0,\ \operatorname{var}(\varepsilon_{ij}^2) + \operatorname{var}(q_i^2)\, d_0^2 M/\sigma_v^2),$

(3.6) $A_0 N^{1/2}\{\hat x_c(PL) - x_c^*\}/\sigma \to_{\mathcal L} N(0,\ \operatorname{var}(\varepsilon_{ij}^2)\{1 + d_0^2/\sigma_v^2\}).$
From (3.1), the asymptotic relative efficiency of the modified maximum likelihood estimate of the minimum detectable concentration relative to the pseudo-likelihood estimate is

(3.7) $\operatorname{var}(\varepsilon_{ij}^2)(\sigma_v^2 + d_0^2)\big/\big\{\operatorname{var}(\varepsilon_{ij}^2)(\sigma_v^2 + d_0^2) + 2d_0^2/(M-1)\big\},$

which is less than 1 for all M regardless of the value of var(ε_ij²). Similarly, the asymptotic relative efficiency of the log-linearized estimate of the minimum detectable concentration to the pseudo-likelihood estimate is

(3.8) $\operatorname{var}(\varepsilon_{ij}^2)(\sigma_v^2 + d_0^2)\big/\big\{\sigma_v^2\operatorname{var}(\varepsilon_{ij}^2) + M d_0^2 \operatorname{var}(\log q_i^2)\big\}.$
It follows from Theorem 1 and (3.7) and (3.8) that the ordering in efficiency of estimated minimum detectable concentration is the same as the ordering for estimating θ, and thus will favor pseudo-likelihood for normally distributed data in the case of the log-linearized estimator and for all distributions in the case of the modified maximum likelihood method. For distributions other than the normal, calculations with other symmetric distributions such as the double exponential and various contaminated normal distributions show very few cases where θ̂_LL is more efficient than θ̂_PL; see Davidian (1986). The numerical efficiencies depend on the logarithm of the true means through d₀² and σ_v². For example, in the simulation discussed in the next section, the asymptotic relative efficiency of the log-linearized estimate is 27% for M = 2 and 63% for M = 4. The asymptotic theory thus suggests that inefficiencies in estimating the variance parameter θ translate into inefficiencies for estimating the minimum detectable concentration.
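Once d₀², σ_v², M and the error distribution are fixed, (3.7) and (3.8) are immediate to evaluate; under normality the q-moments are the chi-squared quantities noted after the table above. A small helper (ours; the trailing call uses illustrative values, not the exact design constants of the paper):

```python
from scipy.special import polygamma


def mdc_efficiencies(d0sq, sigv2, M, var_eps2=2.0):
    """Asymptotic relative efficiencies (3.8) and (3.7) of the
    log-linearized and MML estimates of the MDC relative to
    pseudo-likelihood, assuming normal errors for the q-moments."""
    var_logq2 = float(polygamma(1, (M - 1) / 2))
    num = var_eps2 * (sigv2 + d0sq)
    mml = num / (num + 2.0 * d0sq / (M - 1))              # (3.7)
    ll = num / (sigv2 * var_eps2 + M * d0sq * var_logq2)  # (3.8)
    return ll, mml


print(mdc_efficiencies(d0sq=3.5, sigv2=0.7, M=2))   # hypothetical inputs
```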
4. A simulation
To check the qualitative nature of the asymptotic theory, we ran a small simulation. We restrict our focus here to the log-linearized method and pseudo-likelihood. The responses Y_ij were normally distributed with mean and variance satisfying (1.2) and (1.3), where β₁ = 29.5274, β₂ = 1.8864, β₃ = 1.5793, β₄ = 1.0022, θ = 0.7 and σ = 0.0872. The 23 concentrations chosen are given in Table 4. We studied the case M = 2, or duplicates, and M = 4, or quadruplicates. For each situation, there were 500 simulated data sets. A limited second simulation was run with the larger value σ = 0.17, but there did not appear to be significant qualitative differences from the case reported here.
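For concreteness, one replicate data set of this design can be generated directly from the stated parameter values (a sketch; the random seed is arbitrary and the mean function is the four parameter logistic as written in (1.2)):

```python
import numpy as np

rng = np.random.default_rng(1)
beta = (29.5274, 1.8864, 1.5793, 1.0022)      # Section 4 values
theta, sigma, M = 0.7, 0.0872, 2

def f(x, b1, b2, b3, b4):                     # four parameter logistic (1.2)
    return b1 + (b2 - b1) / (1.0 + (x / b3) ** b4)

x = np.array([0.0, 0.075, 0.1025, 0.135, 0.185, 0.25, 0.4, 0.55,
              0.75, 1.0, 1.375, 1.85, 2.5, 3.25, 4.5, 6.0, 8.25,
              11.25, 15.0, 20.25, 27.5, 37.0, 50.0])   # Table 4 design

mu = f(x, *beta)
Y = rng.normal(mu[:, None], sigma * mu[:, None] ** theta, size=(x.size, M))
```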
The estimators chosen were unweighted least squares for β, the Rodbard log-linearized method and the pseudo-likelihood/generalized least squares combination, which we report only for C = 1 and 2 cycles of the algorithm. The methods of estimating the minimum detectable concentration are as discussed in Section 1. The estimates of θ were constrained to lie in the interval 0 ≤ θ ≤ 1.50.
In Table 1, we compare the estimators of θ on the basis of bias and variance. The biases are large relative to the standard error, so that mean-squared error comparisons are artificial and dramatic. The bias in the pseudo-likelihood estimate of θ when doing only C = 1 cycles of the algorithm has been observed by us in other problems. One sees here that the effect of doubling the replicates from two to four for a given set of concentrations is to improve the Monte-Carlo efficiency of the log-linearized estimate of θ from 35% to 58%, compared to the theoretical asymptotic increase from 20% to 54%. This example indicates that pseudo-likelihood estimation of θ can in some circumstances be a considerable improvement over the log-linearized method.
For the minimum detectable concentration we chose α = 0.05. For all of the methods used in the study, the probability requirement (1.5) was easily satisfied; rather than 95% exceedance probability, every case was more than 97%.
The mean values of the minimum detectable concentrations are reported in Table 2, with variances given in Table 3. Note that in both of these tables, we give results for the case that θ is known as well as estimated. The relatively poor behavior of unweighted least squares is evident. To quote from Oppenheimer, et al. (1983): "Rather dramatic differences have been observed depending on whether a valid weighted or inappropriate unweighted analysis is used." When the variance parameter θ is known, there is little difference between any of the weighted methods.
When θ is unknown, there are rather large proportional differences. The figures in Table 2 show that the mean minimum detectable concentration for the log-linearized method is 10% larger than for the pseudo-likelihood method based on C = 2 cycles; whether the raw numerical difference is of any practical consequence will depend on the context.

For M = 2 replicates, the pseudo-likelihood estimate of minimum detectable concentration with unknown θ has mean 3.934 × 10⁻² and standard deviation 0.05 × 10⁻²; the corresponding figures for M = 4 are 2.722 × 10⁻² and 0.028 × 10⁻². Proportionately, when θ is unknown, the method of estimating it seems to have important consequences for the estimate of minimum detectable concentration, particularly in the variability of the estimate.
For the case of duplicates, the Monte-Carlo variance of pseudo-likelihood is only 37% as large as that based on the log-linearized estimate, while the asymptotics suggest 27%, increasing to 71% and 63% respectively for quadruplicates.

The point here is that the relative efficiency of the estimated minimum detectable concentration can be affected by the algorithm used to estimate the variance parameter θ.

5. An example
Differences among the three estimators of θ and the subsequent estimators of minimum detectable concentration which are reminiscent of the qualitative implications of the asymptotic theory and the simulation can be seen in the following example. The data are from a radioimmunoassay and are presented in Table 4. The analysis presented here is for illustrative purposes only; we do not claim to be analyzing these data fully. Our aim is to exhibit the fact that the three methods of analysis considered in this paper can lead to nontrivially different results. We assumed in all cases the model (1.2) and (1.3).
For the full data set and reduced data sets considering all possible permutations of duplicates (except one data set for which an s_i² ≈ 0, complicating the application of the log-linearized method), we computed the estimates of θ, σ² and x_c using the pseudo-likelihood and log-linearized methods, and the estimate of Sadler and Smith (1985) as described in Section 3 in place of the more computationally difficult modified maximum likelihood estimate. We also computed the estimate of minimum detectable concentration based on ordinary least squares. The results are given in Table 5.
An investigation of both the full and reduced data sets suggests that there are no massive outliers and that design points 1, 22 and 23 are possible high leverage points. For our purposes of illustration we do not pursue this point; see Davidian and Carroll (1986) for discussion on accounting for leverage in estimation of θ.

The results of Table 5 show that the three estimates can vary greatly. As a crude measure of this, consider the means and standard deviations of θ̂ and x̂_c for the five data sets obtained by considering duplicates (ignoring the fact that these data sets are not strictly independent). Below we list "relative efficiencies" for the estimators based on these crude measures:
"Relative efficiencies" for estimators of
e
and Xc
for data in example when M =2
LL to PL
x
c
Qualitatively,
the
MML to PL
LL to MML
.222
.351
.632
.529
.659
.802
Qualitatively, the estimates exhibit the type of behavior predicted by the asymptotic theory; quantitatively, the values compare favorably with what the theory would predict given the crudity of the comparison.

This example shows that there can be wide differences among the various estimation methods for θ and minimum detectable concentration in application, and that the qualitative way in which the differences manifest themselves is predicted by the asymptotic theory of Section 3.
6. Incorporating unknowns and standards
In many assays, along with the known standards {Y_ij, x_i} there is an additional set {Y*_ij} of proportional size at unknown concentrations {x*_i}, i = 1, ..., N*, j = 1, ..., M*_i. It is common to assume that these unknowns satisfy

(6.1) $\text{Standard deviation of } Y_{ij}^* = \sigma f^{\theta}(x_i^*,\beta).$

The power of the log-linearized or modified likelihood methods is that they can incorporate the responses at unknown concentrations to obtain a better estimate of θ; pseudo-likelihood and other similar techniques cannot easily incorporate this information because they rely on knowing the concentrations {x*_i}. A simple way to improve pseudo-likelihood to take into account the unknowns is to incorporate the additional information about the variances in the unknowns by exploiting an estimator that does not depend on the form of the mean response.
For the exposition here, consider the log-linearized estimator; we could equally well employ the same idea using the modified maximum likelihood estimator. Let θ̂_u and θ̂_{s,u} denote the log-linearized estimates of θ based on the unknowns alone or on the full data, respectively, and let θ̂_PL denote the pseudo-likelihood estimate based on the standards alone. Let S_u² denote the variance estimate of θ̂_u produced by the linear least squares program, and let S_PL² be the estimated variance of θ̂_PL, where following Theorem 1

$S_{PL}^2 = (4N_S)^{-1}\,\frac{\text{sample variance of the standardized squared residuals } r_{ij}}{\text{sample variance of the log predicted values } \log f(x_i,\hat\beta)}.$

Then a weighted estimate of θ is simply

(6.2) $\hat\theta_w = \frac{S_u^{-2}\hat\theta_u + S_{PL}^{-2}\hat\theta_{PL}}{S_u^{-2} + S_{PL}^{-2}};$

we prefer to replace θ̂_u by θ̂_{s,u} in (6.2). The weighted estimate (6.2) will improve upon the log-linearized estimate θ̂_{s,u} based on all the data, although the degree of improvement will be smaller than that found in Sections 3 or 4. For example, if there are exactly as many unknowns as standards and duplicates are used, then θ̂_{s,u} has an asymptotic relative efficiency of 34%, versus 20% when only standards are available.
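In code, (6.2) is a two-line inverse-variance combination once S²_PL has been formed from the weighted fit. The sketch below is our reading of the recipe above; the scaling of the standardized squared residuals by σ̂² and the use of per-observation predicted values are simplifying assumptions:

```python
import numpy as np


def combined_theta(theta_u, s2_u, theta_pl, y, mu, theta_hat):
    """Weighted estimate (6.2).  y and mu are the responses and fitted
    means for the standards; s2_u is the variance estimate of the
    log-linearized estimate based on the unknowns."""
    r = (y - mu) / mu ** theta_hat          # standardized residuals
    r2 = r ** 2 / np.mean(r ** 2)           # scale out sigma-hat^2
    ns = y.size                             # N_S, the total sample size
    s2_pl = np.var(r2) / (4.0 * ns * np.var(np.log(mu)))
    w_u, w_pl = 1.0 / s2_u, 1.0 / s2_pl
    return (w_u * theta_u + w_pl * theta_pl) / (w_u + w_pl)
```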
7. Discussion
We have addressed the general issue of estimating calibration quantities in assays which exhibit large amounts of heterogeneity. We have shown that not weighting at all leads to large decreases in the efficiency of analysis. Even when weighting is used, we have shown that changes in relative efficiency occur depending on the method of estimating the variances, especially the parameter θ in (1.3). The key point is that while for estimation of β the effect of how one estimates the variance function is only second order, for estimation of other quantities such as the minimum detectable concentration the effect is first order. We have had success using the idea of pseudo-likelihood of Carroll and Ruppert (1982b); this method applies in general heteroscedastic models and is easy to compute as shown in the appendix, although the reader should be aware that it is not robust against outliers.

One can also consider data transformation rather than weighting. The transform-both-sides idea in Carroll and Ruppert (1984) applies to the assay problem.
REFERENCES

Amemiya, Y. and Fuller, W. A. (1985). Estimation for the nonlinear functional relationship. Unpublished manuscript.

Box, G. E. P. and Hill, W. J. (1974). Correcting inhomogeneity of variance with power transformation weighting. Technometrics 16, 385-389.

Box, G. E. P. and Meyer, R. D. (1986). Dispersion effects from fractional designs. Technometrics 28, 19-28.

Butt, W. R. (1984). Practical Immunoassay. Marcel Dekker, Inc., New York.

Carroll, R. J. and Ruppert, D. (1982a). A comparison between maximum likelihood and generalized least squares in a heteroscedastic linear model. Journal of the American Statistical Association 77, 878-882.

Carroll, R. J. and Ruppert, D. (1982b). Robust estimation in heteroscedastic linear models. Annals of Statistics 10, 429-441.

Carroll, R. J. and Ruppert, D. (1984). Power transformations when fitting theoretical models to data. Journal of the American Statistical Association 79, 321-328.

Davidian, M. (1986). Variance Function Estimation in Heteroscedastic Regression Models. Unpublished Ph.D. dissertation, University of North Carolina at Chapel Hill.

Davidian, M. and Carroll, R. J. (1986). Variance function estimation. Preprint.

Finney, D. J. (1964). Statistical Method in Biological Assay. Griffin, London.

Finney, D. J. (1976). Radioligand assay. Biometrics 32, 721-740.

Giltinan, D. M., Carroll, R. J. and Ruppert, D. (1986). Some new methods for weighted regression when there are possible outliers. Technometrics 28, 000-000.

Jobson, J. D. and Fuller, W. A. (1980). Least squares estimation when the covariance matrix and parameter vector are functionally related. Journal of the American Statistical Association 75, 176-181.

Judge, G. G., Griffiths, W. E., Hill, R. C., Lutkepohl, H. and Lee, T. C. (1985). The Theory and Practice of Econometrics, Second Edition. John Wiley and Sons, New York.

McCullagh, P. (1983). Quasi-likelihood functions. Annals of Statistics 11, 59-67.

Oppenheimer, L., Capizzi, T. P., Weppelman, R. M. and Mehta, H. (1983). Determining the lowest limit of reliable assay measurement. Analytical Chemistry 55, 638-643.

Raab, G. M. (1981a). Estimation of a variance function, with application to radioimmunoassay. Applied Statistics 30, 32-40.

Raab, G. M. (1981b). Letter on 'Robust calibration and radioimmunoassay'. Biometrics 37, 839-841.

Rodbard, D. (1978). Statistical estimation of the minimum detectable concentration ("sensitivity") for radioligand assays. Analytical Biochemistry 90, 1-12.

Rothenberg, T. J. (1984). Approximate normality of generalized least squares estimators. Econometrica 52, 811-825.

Sadler, W. A. and Smith, M. H. (1985). Estimation of the response-error relationship in immunoassay. Clinical Chemistry 31, 1802-1805.

Stefanski, L. A. and Carroll, R. J. (1985). Covariate measurement error in logistic regression. Annals of Statistics 13, 1335-1351.

Tiede, J. J. and Pagano, M. (1979). The application of robust calibration to radioimmunoassay. Biometrics 35, 567-574.

Wolter, K. M. and Fuller, W. A. (1982). Estimation of nonlinear errors-in-variables models. Annals of Statistics 10, 539-548.
Appendix

The analysis of the minimum detectable concentration is complicated by the behavior of the derivative of f(x,β) with respect to β at x = 0, especially for the standard model (1.2). We will write f(x,β) = h(η,β), where η = e(x,β), η*_c = e(x*_c,β) and η̂_c = e(x̂_c,β). In the model (1.2), η = e(x,β) = exp(β₄ log x). We assume throughout that f(0,β) > 0 and that all functions are sufficiently smooth. Assume further that

(A.1) e(0,β) = 0 ;

(A.2) ∂h(0,β)/∂η ≠ 0 ;

(A.3) e_β(0,β) = 0 ;

(A.4) if w ≠ 0 and v is a random variable such that v/w → 1 in probability, then e(v,β)/e(w,β) → 1 in probability and sup{ |e_β(αv + (1-α)w, β)/e_β(w,β) - 1| : 0 ≤ α ≤ 1 } → 0 in probability.

These assumptions are satisfied for the model (1.2) if β₄ > 0.
We need the following results; the proofs of Lemmas A.2 and A.3 are given at the end of the appendix.

Lemma A.1. Let c = {z(α)}². As σ → 0, η*_c = a₀σ + o(σ), where

$a_0 = (c/M)^{1/2} f^{\theta}(0,\beta)\{\partial h(0,\beta)/\partial\eta\}^{-1}.$

Proof: A Taylor series expansion of (3.2) in η*_c around zero. □

Lemma A.2. Assume that as N → ∞ and σ → 0, (η̂_c/η*_c) - 1 → 0 in probability, N^{1/2}(θ̂ - θ) = O_p(1) and (β̂ - β) = O_p(σN^{-1/2}). Define A₀ = 2Ma₀{∂h(0,β)/∂η}²/{cf^{2θ}(0,β)}. Then as N → ∞ and σ → 0, we have the asymptotic expansion

(A.5) $A_0 N^{1/2}(\hat\eta_c - \eta_c^*)/\sigma = (NM)^{1/2}(\hat\sigma^2 - \sigma^2)/\sigma^2 + 2\{\log f(0,\beta)\}(NM)^{1/2}(\hat\theta - \theta) + o_p(1).$

Lemma A.3. For f(0,β) > 0, if N^{1/2}(θ̂ - θ) = O_p(1) and (β̂ - β) = O_p(σN^{-1/2}), then

$(NM)^{1/2}(\hat\sigma^2 - \sigma^2)/\sigma^2 = (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1) - 2\bar v\,(NM)^{1/2}(\hat\theta - \theta) + o_p(1),$

so that by (A.5)

$A_0 N^{1/2}(\hat\eta_c - \eta_c^*)/\sigma = (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1) + 2d_0\,(NM)^{1/2}(\hat\theta - \theta) + o_p(1),$

where d₀ is defined in (3.3).
Proposition 1: The limit results (3.4)-(3.6) hold with x̂_c replaced by η̂_c, where η̂_c = η̂_c(LL), η̂_c(MML) or η̂_c(PL).

Proof of Proposition 1: From Davidian and Carroll (1986), using Theorem 1 we have that

$N^{1/2}(\hat\theta_{LL} - \theta) = (1/2) N^{-1/2}\sigma_v^{-2}\sum_{i=1}^{N}\{\log q_i^2 - E(\log q_i^2)\}(v_i - \bar v) + o_p(1),$

$N^{1/2}(\hat\theta_{MML} - \theta) = (1/2) N^{-1/2}\sigma_v^{-2}\sum_{i=1}^{N}\{q_i^2 - E(q_i^2)\}(v_i - \bar v) + o_p(1),$

$N^{1/2}(\hat\theta_{PL} - \theta) = (1/2) N^{-1/2} M^{-1}\sigma_v^{-2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1)(v_i - \bar v) + o_p(1),$

so that by Lemmas A.1 - A.3 and equation (A.5), we have

$A_0 N^{1/2}\{\hat\eta_c(PL) - \eta_c^*\}/\sigma = (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1)\{1 + d_0 (v_i - \bar v)/\sigma_v^2\} + o_p(1),$

which with the central limit theorem gives the same limit distribution as in (3.6).
We also have that

$A_0 N^{1/2}\{\hat\eta_c(LL) - \eta_c^*\}/\sigma = (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1) + d_0 M^{1/2} N^{-1/2}\sigma_v^{-2}\sum_{i=1}^{N}\{\log q_i^2 - E(\log q_i^2)\}(v_i - \bar v) + o_p(1),$

$A_0 N^{1/2}\{\hat\eta_c(MML) - \eta_c^*\}/\sigma = (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1) + d_0 M^{1/2} N^{-1/2}\sigma_v^{-2}\sum_{i=1}^{N}\{q_i^2 - E(q_i^2)\}(v_i - \bar v) + o_p(1).$

Simple central limit theorem calculations yield the same limit distributions as in (3.4) and (3.5). □

Remark: Result (3.5) is based on a σ̂ obtained from the residuals of the final fit of the mean response function, as for the log-linearized method and pseudo-likelihood, so that Lemma A.3 holds. The modified maximum likelihood method also provides a joint estimate of σ along with the estimate of θ. If one considers this estimator in place of the σ̂ in Lemma A.3, it can be shown that the resulting estimator of minimum detectable concentration has even larger asymptotic variance than that in (3.5). For reasons of space we do not prove this, but it certainly has interesting implications for practice.
Proof of Theorem 2: By Proposition 1 and (A.5), for any of the estimators x̂_c we have the limit result that for some Δ,

(A.6) $N^{1/2}\{e(\hat x_c,\beta) - e(x_c^*,\beta)\}/\sigma \to_{\mathcal L} N(0,\Delta).$

Thus, for x̃_c between x̂_c and x*_c, a Taylor expansion gives

$N^{1/2}\, e_x(\tilde x_c,\beta)(\hat x_c - x_c^*)/\sigma \to_{\mathcal L} N(0,\Delta),$

where e_x(v,β) is the derivative of e(v,β) with respect to its first argument. It thus suffices through (A.4) to prove that e_x(x̃_c,β)/e_x(x*_c,β) → 1 in probability. But this follows from (A.6) and (A.4), since η*_c/σ → a₀ by Lemma A.1. The result now follows from Proposition 1. □

Proof of Lemma A.2:
By a series of Taylor expansions and using Lemma A.1,

(A.7) $N^{1/2}\{h(\hat\eta_c,\beta) - h(0,\beta)\}^2/\sigma^2 = N^{1/2}\{h(\eta_c^*,\beta) - h(0,\beta)\}^2/\sigma^2 + 2\{h(\eta_c^*,\beta) - h(0,\beta)\}\{\partial h(\eta_c^*,\beta)/\partial\eta\}\, N^{1/2}(\hat\eta_c - \eta_c^*)/\sigma^2 + o_p(1)$
$= N^{1/2}\{h(\eta_c^*,\beta) - h(0,\beta)\}^2/\sigma^2 + 2 a_0 \{\partial h(0,\beta)/\partial\eta\}^2 N^{1/2}(\hat\eta_c - \eta_c^*)/\sigma + o_p(1).$

Similar calculations, taking into account that (β̂ - β) = O_p(σN^{-1/2}), yield

(A.8) $N^{1/2}\{c\,\hat\sigma^2 h^{2\hat\theta}(\hat\eta_c,\hat\beta)/M - c\,\sigma^2 h^{2\theta}(\eta_c^*,\beta)/M\}/\sigma^2 = (c/M)\, h^{2\theta}(0,\beta)\big\{N^{1/2}(\hat\sigma^2 - \sigma^2)/\sigma^2 + 2(\log h(0,\beta))\, N^{1/2}(\hat\theta - \theta)\big\} + o_p(1).$

Combining (1.6), (3.2), (A.7) and (A.8) yields (A.5). □
Proof of Lemma A.3: Define

$\hat\sigma^2 = (NM)^{-1}\sum_{i=1}^{N}\sum_{j=1}^{M}\big[\{Y_{ij} - f(x_i,\hat\beta)\}/f^{\hat\theta}(x_i,\hat\beta)\big]^2.$

Then

$(NM)^{1/2}(\hat\sigma^2 - \sigma^2)/\sigma^2 = (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}\big[\{Y_{ij} - f(x_i,\hat\beta)\}^2/\{\sigma^2 f^{2\hat\theta}(x_i,\hat\beta)\} - 1\big]$
$= (NM)^{-1/2}\sum_{i=1}^{N}\sum_{j=1}^{M}(\varepsilon_{ij}^2 - 1) - 2\bar v\,(NM)^{1/2}(\hat\theta - \theta) + o_p(1),$

the second step following from Taylor expansions of $f^{-2\hat\theta}(x_i,\hat\beta)$ about (θ,β) and the assumption (β̂ - β) = O_p(σN^{-1/2}), completing the proof. □
FIGURE 1

Schematic Representation of Estimated MDC for Small Concentrations
[Figure: count versus concentration; curves compare the weighted and unweighted MDC.]
Table 1
Three Estimates of the Variance Parameter θ

Monte-Carlo Bias
                              M = 2     M = 4
Log-linearized                0.15      0.001
Pseudo-likelihood, C = 1      0.045     0.022
Pseudo-likelihood, C = 2      0.000     0.001
Pseudo-likelihood, C = 3      0.004     0.000

Variance Relative to Pseudo-likelihood with C = 2
                              M = 2     M = 4
Log-linearized, Monte-Carlo   2.85      1.71
Log-linearized, Asymptotic    4.93      1.87
Pseudo-likelihood, C = 1      0.99      0.99
Pseudo-likelihood, C = 3      1.00      1.00
Table 2
100 × Mean Minimum Detectable Concentrations

                            M = 2 Replicates            M = 4 Replicates
                            θ Known    θ Estimated      θ Known    θ Estimated
Unweighted Least Squares    13.106     13.106           9.173      9.173
Log-linearized              3.937      4.346            2.718      2.785
Pseudo-likelihood, C = 1    --         4.216            --         2.809
Pseudo-likelihood, C = 2    3.927      3.934            2.715      2.722
Table 3
Ratio of Monte-Carlo Variance of the Estimate of Minimum Detectable
Concentration Relative to Pseudo-likelihood with C = 2 Cycles

                            M = 2 Replicates            M = 4 Replicates
                            θ Known    θ Estimated      θ Known    θ Estimated
Unweighted Least Squares    14.90      8.46             18.83      13.72
Log-linearized              1.02       2.72             1.00       1.41
Pseudo-likelihood, C = 1    --         1.19             --         1.10
Table 4
Data for Example of Section 5

Concentration (x)    Response (Y)
0.000                1.700, 1.660, 1.950, 2.070
0.075                1.910, 2.270, 2.110, 2.390
0.1025               2.220, 2.250, 3.260, 2.920
0.135                2.800, 2.940, 2.380, 2.700
0.185                2.780, 2.640, 2.710, 2.850
0.250                3.540, 2.860, 3.150, 3.320
0.400                3.910, 3.830, 4.880, 4.210
0.550                4.540, 4.470, 4.790, 5.680
0.750                6.060, 5.070, 5.000, 5.980
1.000                5.840, 5.790, 6.100, 7.810
1.375                7.310, 7.080, 7.060, 6.870
1.850                9.880, 10.120, 9.220, 9.960
2.500                11.040, 10.460, 10.880, 11.650
3.250                13.510, 15.470, 14.210, 13.920
4.500                16.070, 14.670, 14.780, 15.210
6.000                17.340, 16.850, 16.740, 16.870
8.250                18.980, 19.850, 18.750, 18.510
11.250               21.666, 21.218, 19.790, 22.669
15.000               23.206, 22.239, 22.436, 22.597
20.250               23.922, 24.871, 23.815, 24.871
27.500               25.748, 25.874, 24.907, 24.871
37.000               24.441, 25.874, 25.748, 27.270
50.000               29.580, 26.698, 26.536, 27.181
Table 5
Estimates of θ and x_c based on example of Section 5

                  Least Squares   Pseudo-likelihood, C = 2   Log-linearized     Modified Max. Likelihood
Data set          x_c             θ        x_c               θ        x_c       θ        x_c
Full              .1554           .4750    .0790             .4757    .0793     .4500    .0822
Duplicates 1 & 2  .2230           .7000    .0728             .9404    .0476     .7500    .0659
Duplicates 2 & 3  .2385           .3500    .1535             .1950    .1870     .2500    .1739
Duplicates 3 & 4  .2513           .5750    .1324             .6940    .1112     .7000    .1104
Duplicates 1 & 4  .1593           .5500    .0612             .5931    .0601     .5000    .0695
Duplicates 1 & 3  .1859           .4500    .0938             .4233    .0981     .4750    .0909
Mean              .2116           .5250    .1031             .5692    .1008     .5350    .1021
SD                .0763           .1183    .0357             .2511    .0491     .1997    .0439

Note: Means and SDs are based only on the five reduced permutations of the data with duplicates.