Fitting Brendan`s Five-Parameter Logistic Curve - Bio-Rad

Software
Bio-Plex suspension array system
®
tech note 3022
Fitting Brendan’s Five-Parameter Logistic Curve
Paul G Gottschalk, PhD, and John R Dunn II, PhD, Brendan Scientific
Corporation, 2236 Rutherford Road, Suite 107, Carlsbad, CA 92008 USA
Introduction
Biologists perform immunoassays to determine the
concentration of an analyte in a sample. In Bio-Plex assays,
this is typically done by measuring a response in the form of
a signal that is proportional to the amount of analyte bound to
the antibody on beads that have been incubated with
the sample.
In order to quantitate the concentration of the analyte, the
response must be compared to a calibration curve commonly
called the standard curve. The unknown concentration of an
analyte may then be determined by finding the concentration
on the standard curve that produces the same response
as that obtained from the unknown sample (Wild 1994,
Dudley et al. 1985).
Ideally, the standard curve would be the true curve: the curve
that expresses the concentration vs. response relationship
without any distortion by errors. If an infinite number of
concentrations were used, each with an infinite number of
replicates, the resulting curve would be the true curve. Since,
for practical reasons, only a limited number of samples can be
run in an assay, the true curve must be estimated from a
limited number of noisy responses.
From these noisy responses, the true response at each
concentration must be estimated. Because there cannot be a
standard at every concentration, a means of interpolating
between standards is necessary. This is done by selecting a
mathematical function that does a good job of approximating
the true curve. This approximating function is called a curve
model. Many functions have been employed as curve models
of immunoassays, including lines, cubic splines, logistic
functions, and lines in logit-log space.
Curve Fitting
Two steps must be taken to find a function that gives a good
approximation of the true curve. First, the mathematical curve
model must be selected. Then, the particular curve out of the
entire family of possible curves that best explains the data
must be determined by fitting the curve. This means that the
parameters in the function must be adjusted until the function
approximates the assay’s true curve as well as possible.
In statistics, one of the most common ways to determine how
well a candidate standard curve is fitting the true curve is to
determine how likely (probable) it is for the candidate standard
curve to have yielded the observed standard data under the
assumption that the candidate standard curve is actually the
true curve. The best-fitting curve is therefore the curve most
likely to have given rise to the observed data. This curve is
often called the maximum likelihood estimate of the true curve.
Statistical regression theory shows that finding the parameters
of the maximum likelihood curve is equivalent to finding the
curve whose parameters generate the smallest weighted sum
of squared errors (wSSE) (Draper and Smith 1981, Bates and
Watts 1988). The weighted sum of squared errors is the sum
of all of the squares of the differences (D2) between the
observed standard responses (yi ) and the response predicted
by the curve model (ŷi ), weighted by the inverse variance
(1/variance) of the standard responses at that concentration.
N
N
i =1
i =1
wSSE = S wi [ yi – ŷi ]2 = S wi [ Di ]2
Regression, or fitting, is the process of minimizing the weighted
sum of squared errors (wSSE). Therefore, the best-fitting curve
is the one whose parameters produce the smallest wSSE.
Note that the differences (D) are being called errors here, but
the word “residuals” is often used, and wSSE could just as well
be called the weighted sum of squared residuals.
Figure 1 diagrams the computation of wSSE, showing the
errors and weights that factor into the computation.
14,000
12,000
Response
Once the curve model has been fitted by adjusting its
parameters to minimize wSSE, a means of assessing the
quality of the curve fit is necessary because a poor curve fit
will generate unreliable concentration estimates.
Perhaps the first quantity one might consider using as a fit
metric is wSSE, since this is precisely what is being minimized
by the curve-fitting algorithm to find the best fit. For any given
set of standard data, a curve fit with a smaller wSSE is better
than one with a larger wSSE.
10,000
8,000
6,000
4,000
2,000
0
Residual Variance vs. r2
0.05
0.1
0.5
Dose
1
5
Fig. 1. Graphical rendition of the computation of wSSE. Shown are the
residuals, how the weights are applied to them, and the summation of the
weighted squared residuals to obtain wSSE. The direction and length of the line
from the curve to the data point indicate the sign and magnitude of the residual.
This residual is then squared, multiplied by a weighting factor, and summed to
give wSSE.
Weighting
Choosing proper weights for the squared residuals is crucial
for obtaining the best curve fit. According to regression theory,
the weights should be set equal to the inverse variance of
the responses at that concentration. This keeps the fitting
procedure tighter around those standard responses with the
smallest variance (error), that is, those with smallest responses,
and looser around those standard responses with the largest
variance, that is, those with largest responses. This yields the
optimal use of information from the noisier and less noisy
standards, and leads to the most accurate concentration
estimates. Sample concentrations computed from unweighted
curve-fitting procedures can differ from properly weighted
curves by hundreds of percent.
The variance of a standard is a function of the magnitude of its
response. It is common for the variance of a standard at the
high-response end of a curve to be 3 or 4 orders of magnitude
larger than those at the low-response end of the curve. There
are two reasons for this response dependence. First, most
signal detectors produce noise with a standard deviation that
is proportional to the magnitude of the response. Second,
the kinetics associated with antibody binding are nonlinear
(Wild 1994), with the result that the kinetic variations in the
reaction change with the magnitude of the response.
The variance of the standards can usually be approximated
by a power function of the response:
variance = A(response)B
where A is a function of the magnitudes of the responses and
B falls in the range 1.0–2.0 for most immunoassays (Finney
1978). Data such as immunoassay responses in which the
variances are not constant are called heteroscedastic data.
Since it is impractical to run enough replicates to reliably
estimate the true variance function from a single assay, a pool
of historical assay data can be used to compute this variance
function (Finney and Phillips 1977, Raab 1981).
The problem with using wSSE is that it cannot be compared
directly with the wSSE of other assays if the number of data
points is not the same. A quantity that is based on wSSE but
that can be compared across assays that have differing
amounts of data is called the residual variance. The residual
variance is the wSSE divided by the number of degrees of
freedom in the assay. The number of degrees of freedom in
an assay is the number of data points in the standard curve
beyond the number of parameters in the curve model. For
example, a line has two parameters, and so requires two data
points to uniquely determine it. If a line were being fitted to
eight standard points, there would be 8 – 2 = 6 degrees of
freedom. A five-parameter logistic (5PL) has five parameters
and requires five data points to uniquely determine it. If a 5PL
were being fitted to eight standard points, there would be
8 – 5 = 3 degrees of freedom. The residual variance normalizes
the wSSE to properly account for differences in the number of
data points.
The statistic r2 is another measurement that is commonly
used as a fit metric, especially for linear regression. Residual
variance and r2 are related, but they behave quite differently.
If the squares of each of the mean-corrected responses are
weighted and summed together, this total sum of squares
(total SS) can be divided into two parts: the regression sum of
squares (regression SS) is that portion of the total SS that is
explained by the regression model, and the residual sum of
squares (SSE) is that portion of the total SS that is left over.
The value of r2 is the fraction regression SS/total SS.
The problem with r2 is that it is not a sensitive metric for
assessing the quality of a curve fit. Even with bad curve fits,
the vast majority of the total SS will be accounted for in the
regression SS fraction. This leads to r2 values above 0.95
even for a bad fit. Further, because it is not weighted, the
majority of r2 will have been contributed by the standards
with the highest responses. The residual variance, on the other
hand, is very sensitive to imperfections in the curve fit, when
properly weighted.
Fit Probability
Residual variance allows the quality of fit to be measured.
However, it provides no information about how good or bad a
fit is. Under the assumption that the responses of the individual
standard concentrations are approximately normally
distributed, it can be shown that wSSE obeys a chi-square
distribution with the number of degrees of freedom present in
the assay (Draper and Smith 1981, Bates and Watts 1988).
This distribution allows us to determine how likely it is that a
The Advantages of a Good Fit
The
Advantages
a Gooddesigned
Fit
The
goal of anofassay
to determine the concentration
The goal of an assay designed to determine the concentration
The
of an assay
designed
determine
the as
concentration
of goal
an analyte
in a sample
is totobe
as accurate
possible.
of an analyte in a sample is to be as accurate as possible.
of Improving
an analytethe
in astandard
sample is
to befitas
curve
willaccurate
improve as
thepossible.
accuracy of all
Improving the standard curve fit will improve the accuracy of
Improving
the standard
curve
fit willimproving
improve the
concentration
estimates.
In turn,
theaccuracy
accuracyofofall
the
all concentration estimates. In turn, improving the accuracy
concentration
estimates.
turn,
improving
the accuracy
concentration
estimatesInwill
extend
the dynamic
range of the
of the concentration estimates will extend the dynamic range
concentration
estimatesorwill
extend the
dynamic
assay. The dynamic,
reportable,
range
of anrange
assayof
is the
the
of the assay. The dynamic, or reportable, range of an assay
assay.
dynamic,
or reportable,
of an assay
is the stay
rangeThe
over
which the
errors of therange
concentration
estimates
is the range over which the errors of the concentration
range
over
the errors
of the concentration
below
thewhich
maximum
acceptable
error. Typically,estimates
a good fitstay
will
estimates stay below the maximum acceptable error. Typically,
below
thethe
maximum
Typically,
a good
cause
dynamicacceptable
range to beerror.
extended
at the
low fit will
a good fit will cause the dynamic range to be extended at
cause
the dynamic
be extended
at the
low
concentration
endrange
of thetoassay,
permitting
lower
the low concentration end of the assay, permitting lower
concentration
endtoofbe
theaccurately
assay, permitting
lower
concentrations
determined.
concentrations to be accurately determined.
concentrations to be accurately determined.
To improve the quality of the fit of the standard curve, the
To improve the quality of the fit of the standard curve, the
Tosources
improveofthe
of the
of the standard
curve, the
fit quality
error must
befitidentified.
In any regression,
sources of fit error must be identified. In any regression,
sources
of
fit
error
must
be
identified.
In
any
regression,
regardless of what curve model is used, there are two reasons
regardless of what curve model is used, there are two reasons
regardless
of what
is used,
thereThe
are first
tworeason
reasonsis
that the curve
willcurve
not fitmodel
the data
perfectly.
that the curve will not fit the data perfectly. The first reason is
that
the
curve
will
not
fit
the
data
perfectly.
The
first
reason
the presence of random variation in the data. This kind ofiserror
the presence of random variation in the data. This kind of error
theis presence
of error
random
in the data.
This kind ofthe
error
called pure
andvariation
can be reduced
by increasing
is called pure error and can be reduced by increasing the
is number
called pure
error
and
can
be
reduced
by
increasing
the
of replicates of each standard. The second reason is
number of replicates of each standard. The second reason is
number
of curve
replicates
of may
eachnot
standard.
The second
that the
model
approximate
the truereason
curve isvery
that the curve model may not approximate the true curve very
that
the
curve
model
may
not
approximate
the
true
curve
verybe
well. This kind of error is called lack-of-fit error and cannot
well. This kind of error is called lack-of-fit error and cannot be
well.
This
kind
of
error
is
called
lack-of-fit
error
and
cannot
reduced by increasing the number of standard replicates.be
For
reduced by increasing the number of standard replicates.
reduced
by much
increasing
the number
of standard
replicates.
For
example,
immunoassay
data
has a sigmoidal,
or “S”,
For example, much immunoassay data has a sigmoidal,
example,
immunoassay
has
a sigmoidal,
or “S”,
shape ifmuch
data are
taken over data
a wide
enough
range of
or “S”, shape if data are taken over a wide enough range of
shape
if
data
are
taken
over
a
wide
enough
range
of
concentrations. If a line were used as the curve model to fit
concentrations. If a line were used as the curve model to fit
concentrations.
If a of
linethe
were
used
as the
model
to fit error.
such data, much
wSSE
would
becurve
due to
lack-of-fit
such data, much of the wSSE would be due to lack-of-fit error.
such
data,
much
of
the
wSSE
would
be
due
to
lack-of-fit
error.
This is because a straight line simply cannot fit the sigmoidal
This is because a straight line simply cannot fit the sigmoidal
This
is
because
a
straight
line
simply
cannot
fit
the
sigmoidal
shape of the data. In other words, the shape of the assay’s
shape of the data. In other words, the shape of the assay’s
shape
of the isdata.
In straight
other words,
true curve
not a
line. the shape of the assay’s
true curve is not a straight line.
true curve is not a straight line.
To fit the data as well as possible, a curve model must be
To fit the data as well as possible, a curve model must be
Tochosen
fit the data
wella as
possible,
a curve model the
must
becurve.
that as
does
good
job of approximating
true
chosen that does a good job of approximating the true curve.
chosen
a good
job of of
approximating
true process
curve.
If this that
is notdoes
done,
no amount
improving thethe
assay
If this is not done, no amount of improving the assay process
If this
is not done,
no amount
of improve
improvingthe
thefit assay
process
by reducing
random
noise will
beyond
the limit
by reducing random noise will improve the fit beyond the limit
byallowed
reducing
noise will
improve the
fit beyond
byrandom
the lack-of-fit
component
of the
error. the limit
allowed by the lack-of-fit component of the error.
allowed by the lack-of-fit component of the error.
The Four- and Five-Parameter Logistic Curve Models
The Four- and Five-Parameter Logistic Curve Models
The
and Five-Parameter
Logistic
Models
TheFourfour-parameter
logistic (4PL) model
hasCurve
historically
The four-parameter logistic (4PL) model has historically
The
four-parameter
logistic
(4PL)
model
has
historically
been more successful at fitting immunoassay data than its
been more successful at fitting immunoassay data than its
been
more successful
at fitting
data than
its
predecessors,
particularly
theimmunoassay
logit-log and mass
action
predecessors, particularly the logit-log and mass action
predecessors,
particularly
the
logit-log
and
mass
action
models. However, the 4PL model has a weakness: It is a
models. However, the 4PL model has a weakness: It is a
models.
However,
the 4PL
a weakness:
It isare
a not
symmetrical
function,
andmodel
most has
immunoassay
data
symmetrical function, and most immunoassay data are not
symmetrical
function,
and
most
immunoassay
data
are
not
symmetrical. To rectify this shortcoming, the 4PL model was
symmetrical. To rectify this shortcoming, the 4PL model was
symmetrical.
Toadding
rectify athis
shortcoming,
the controls
4PL model
extended by
fifth
parameter that
the was
degree
extended by adding a fifth parameter that controls the degree
extended
by
adding
a
fifth
parameter
that
controls
the
degree
of asymmetry of the curve (Prentice 1976, Rodbard et
al.
of asymmetry of the curve (Prentice 1976, Rodbard et al.
of 1978).
asymmetry
curve
(Prentice
1976,by
Rodbard
et al.
With of
thethe
extra
flexibility
afforded
its g parameter,
1978). With the extra flexibility afforded by its g parameter,
1978).
Withmodel
the extra
flexibility
afforded
its g parameter,
the 5PL
is able
to eliminate
thisbylack-of-fit
error from
the 5PL model is able to eliminate this lack-of-fit error from
theimmunoassay
5PL model iscurves.
able toFigure
eliminate
this
lack-of-fit
from the
2 shows the resulterror
of fitting
immunoassay curves. Figure 2 shows the result of fitting the
immunoassay
curves.
Figure
2
shows
the
result
of
fitting
the a
same set of asymmetric data with a 5PL curve model and
same set of asymmetric data with a 5PL curve model and a
same
set
of
asymmetric
data
with
a
5PL
curve
model
and
a data
4PL curve model. It is apparent that the 5PL curve fits the
4PL curve model. It is apparent that the 5PL curve fits the data
4PL curve model. It is apparent that the 5PL curve fits the data
better and will provide more accurate concentration estimates
better and will provide more accurate concentration estimates
better
will provide
more
accurate
concentration
thanand
the 4PL
fit. Table
1 shows
the fit
statistics for estimates
the two
than the 4PL fit. Table 1 shows the fit statistics for the two
than
thefits
4PL
Table2:1wSSE,
showsdegrees
the fit statistics
for the
two
curve
in fit.
Figure
of freedom,
residual
curve fits in Figure 2: wSSE, degrees of freedom, residual
curve
fits in and
Figure
2: wSSE, degrees
of freedom, in
residual
variance,
fit probability.
The fit probabilities
Table 1
variance, and fit probability. The fit probabilities in Table 1
variance,
andthe
fit probability.
The
fit probabilities
in Table
1 of
show that
quality of the
5PL
fit is good, while
the fit
show that the quality of the 5PL fit is good, while the fit of
show
that the
quality
of the 5PL fit is good, while the fit of
the 4PL
is very
poor.
the 4PL is very poor.
the 4PL is very poor.
The 5PL and 4PL models are described by the
The 5PL and 4PL models are described by the following
The
5PL and
4PL models are described by the
following
equation:
equation:
following equation:
Where:
a–d
a = estimated response at
Where:
y=d+
zero concentration
a–d b g
a = estimated
response at
x
y=d+
bzero
= slope
factor
concentration
1+ b g
xc
= mid-range
b =c slope
factor concentration
1+
c
= estimated
response at
c =dmid-range
concentration
[[
()
( ) ]]
infinite concentration
d = estimated
response at
g infinite
= asymmetry
factor
concentration
g = asymmetry factor
In this equation, x is the concentration, y is the response,
In
this
equation,
x isthe
concentration,
theof
response,
In and
this
equation,
is
concentration,
y yisisthe
response,
a,
b, c, d, xand
gthe
are
the five parameters
the 5PL
and
and
arethe
fiveparameters
parameters
5PL
and
a, a,
b,b,
c,c,d,d,
and
g gare
five
ofofthe
model.
The
4PL
model
isthe
obtained
by setting
gthe
= 5PL
1.
model.
The
4PL
model
obtained
settingg g= =1.1.
model.
The
4PL
model
is is
obtained
bybysetting
A
30,000
30,000
30,000
25,000 A
25,000
25,000
20,000
20,000
20,000
15,000
15,000
15,000
10,000
10,000
10,000
5,000
5,000
5,000
0
0 0
Response, mean fluorescence units
Response, mean fluorescence units
Response, mean fluorescence units
Response, mean fluorescence units
particular value of wSSE will occur. The chi-square probability
particular value of wSSE will occur. The chi-square probability
particular
value of
willif occur.
The under
chi-square
probability
is the fraction
of wSSE
assays,
performed
exactly
the same
is the fraction of assays, if performed under exactly the same
is conditions,
the fraction that
of assays,
exactly
thecurve
samefit,
would ifbeperformed
expected under
to have
a worse
conditions, that would be expected to have a worse curve fit,
conditions,
that would
bethan
expected
to have
worse
curve
fit,
that is, a larger
wSSE,
the curve
fit ofathe
assay
under
that is, a larger wSSE, than the curve fit of the assay under
that
is, a larger wSSE,
than the curve
fit offrom
the assay
under
consideration.
This probability
obtained
the chi-square
consideration. This probability obtained from the chi-square
consideration.
probability
obtained from
chi-square the
distribution isThis
called
the fit probability.
Beingthe
a probability,
distribution is called the fit probability. Being a probability, the
distribution
the fit probability.
Being
a probability,
values of is
thecalled
fit probability
range from
1 (perfect
fit) to 0the
(no fit).
values of the fit probability range from 1 (perfect fit) to 0 (no fit).
values
of
the
fit
probability
range
from
1
(perfect
fit)
to
0
(no
fit).
The Advantages of a Good Fit
B
A
100
100
100
30,000
30,000
B
30,000
25,000
25,000 B
25,000
20,000
20,000
20,000
15,000
15,000
15,000
10,000
10,000
10,000
5,000
5,000
5,000
00
0 100
100
100
500 1,000
500 1,000
500 1,000
500
500 1,000
1,000
500 1,000
5,000 10,000
50,000 100,000
5,000 10,000 50,000100,000
5,000 10,000
50,000 100,000
5,000
5,00010,000
10,000
5,000
Dose 10,000
Dose
50,000
100,000
50,000100,000
50,000 100,000
Dose
Fig. 2. Comparison
Comparisonof
of5PL
5PLand
and4PL
4PLcurve
curve
fitting.A,A,the
theresult
result
a 5PL
fitting.
of of
a 5PL
fit
asymmetric
immunoassay
data;
B,
the
result
4PLfitfitof
the
on
asymmetric
immunoassay
data;
B,
the
result
a a4PL
onon
Fig. 2. Comparison of 5PL and 4PL curve fitting.
A,ofof
the
result
athe
5PL
same
data.
data.
fit on
asymmetric
immunoassay data; B, the result of a 4PL fit on the
same data.
Table 1. Fit statistics for the 5PL and 4PL fits shown in Figure 2.
Table 1. Fit statistics for the 5PL and 4PL fits shown in Figure 2.
Statistic
5PL
4PL
Statistic
wSSE
5PL
2.41
4PL
120.0
120.0
10
wSSE
Degrees of freedom
2.41
9
Degrees of freedom
Residual variance
9
Residual variance
Fit probability
0.269
0.983
12.0
<0.001
Fit probability
0.983
<0.001
0.269
10
12.0
All curves
generated by
by the
All
curves generated
the 5PL
5PL equation
equation are
are either
either
All
curves generated
by the
5PL equationdepending
are either on the
monotonically
increasing
or decreasing,
decreasing,
monotonically
increasing
or
depending on
the
monotonically
increasingb,orand
decreasing,
depending on the
the
choice of
of parameters
choice
parameters a,
a, b, and d.
d. Table
Table 22 summarizes
summarizes the
choice
of
parameters
a,
b,
and
d.
Table
2
summarizes
the
effect of
of the
effect
the parameters
parameters a,
a, b,
b, and
and dd on
on the
the slope
slope of
ofthe
the
effect
of
the
parameters
a,
b,
and
d
on
the
slope
of
the
logistic function.
logistic
function.
logistic function.
Figure
3 shows
Figure 3
shows semi-log
semi-log plots
plots of
of aa family
family of
of curves
curves that
that can
can
Figure
3 shows
semi-log
plots
of a family
of
curves
that can
be
generated
from
the
5PL
equation.
The
figure
makes
be generated from the 5PL equation. The figure makes several
several
be
generated from
the 5PLfunction
equation. The figure
makes several
characteristics
of the
characteristics of
the 5PL
5PL function apparent.
apparent. The
The function
function
characteristics
of the 5PLasymptote
function apparent.
Theapproaches
function
approaches
a
horizontal
as
the
dose
approaches a horizontal asymptote as the dose approaches
approaches
a horizontal another
asymptote
as the dose
approaches
zero,
and
it
approaches
horizontal
asymptote
zero, and it approaches another horizontal asymptote as
as the
the
zero,
and
it approaches
another
horizontal
asymptote
as the
dose
approaches
infinity.
Between
the
asymptotic
regions
of
dose approaches infinity. Between the asymptotic regions of
dose
approaches
infinity.region.
Between
theisasymptotic
regionspoint
of
the
curve
is
a
transition
There
a
single
inflection
the curve is a transition region. There is a single inflection point
the
curve is a transition
region. There
is athe
single inflection
point
in
in the
the transition
transition region.
region. On
On either
either side
side of
of the inflection
inflection point,
point,
in
the
transition
region. On
either
side
of the
inflectionatpoint,
the
curve
will
approach
the
left
and
right
asymptotes
the curve will approach the left and right asymptotes at
the
curverates
will approach
the left and right asymptotes at
different
unless g
g=
different rates
unless
= 1.
1.
different rates unless g = 1.
Each parameter
Each
parameter of
of the
the 5PL
5PL function
function has
has aa different
different effect
effect on
on
Each
parameter 3ofsummarizes
the 5PL functioneffect
has a different
effect on
the curve.
the
curve. Table
Table 3 summarizes the
the effect of
of each
each parameter
parameter
the
curve. Table
3 summarizes the effect of each parameter
on the
on
the 5PL
5PL function.
function.
on the 5PL function.
Table
Table 3.
3. Geometric
Geometric interpretation
interpretation of
of the
the5PL
5PLfunction’s
function’sparameters.
parameters.
Table 3. Geometric interpretation of the 5PL function’s parameters.
Parameter
Parameter
a
a
b
b
c
c
d
d
g
g
Effect on Curve
Effect on Curve
Controls the position of the asymptote. Order relative
Controls
the position
of the Along
asymptote.
relative
to d controls
sign of slope.
with b,Order
magnitude
to d controls
sign of slope.
Alongofwith
b, magnitude
relative
to d controls
magnitude
slope.
relative to d controls magnitude of slope.
Magnitude controls the rapidity of the transition region;
Magnitude
controls
thecurve.
transition
region;
sign
controls
sign ofthe
therapidity
slope ofofthe
Solely
controls
signrate
controls
sign of the
slope
of the curve.
controls
the
of approach
to the
asymptote,
and Solely
jointly with
g
the
rate of
controls
theapproach
approachtotothe
theasymptote,
asymptote.and jointly with g
controls the approach to the asymptote.
Controls the position of the transition region.
Controls the position of the transition region.
Controls the position of the asymptote. Order relative to
Controls
asymptote.
OrderAlong
relative
to b,
a
controlsthe
theposition
sign of of
thethe
slope
of the curve.
with
a
controls the
sign to
of athe
slope of
the curve.ofAlong
magnitude
relative
controls
magnitude
slopewith
of b,
magnitude
the curve. relative to a controls magnitude of slope of
the curve.
Jointly with b controls the rate of approach to
Jointly
with b controls the rate of approach to
the asymptote.
the asymptote.
12,000
12,000
12,000 A
10,000 A
10,000
10,000
8,000
8,000
8,000
6,000
6,000
6,000
Increasing
a a
Increasing
Increasing a
4,000
4,000
4,000
2,000
2,000
2,000
0
0
B
0
10 10
10
100100
100
1,000
1,000 1,000
10,000
10,000 10,000
100,000
100,000
100,000
10,000
10,000
10,000 B
B
8,000
8,000
8,000
6,000
6,000
6,000
Increasing
b b
Increasing
Increasing b
4,000
4,000
4,000
2,000
2,000
2,000
0
0
C
D
E
0
10 10
10
10,000
10,000
10,000 C
Response, mean fluorescence units
Note that when g = 1 and the curve is actually a 4PL curve, pairs of cases
that
when g =into
1 and
the
curve
is actually
4PLcase
curve,
of cases
Note
and
thecases,
curve so
actually
curve,
can be
combined
single
that for a4PL,
#1pairs
and case
#2,
single
cases,the
sosame
that for
for
can
be combined
into#3,
single
cases,
so
that
4PL, case
#1 and
#2,
or case
#1 and case
generate
functional
forms.
For case
the 5PL
or
case
#1
and
case
#3,
generate
the
same
functional
forms.
For
the
5PL
function, all cases produce distinct functionalfunctional
forms. forms. For
function, all cases
cases produce
produce distinct
distinct functional
functionalforms.
forms.
function,
A
Response,
Response,
mean
mean
fluorescence
fluorescence
units
units
Table 2. The relationship between the order of a and d, the sign of b,
Table
2.
The
between
the
order
Tablethe
2. slope
The relationship
relationship
between
thefunction.
orderof
ofaaand
andd,d,the
thesign
signofofb,b,
and
of the monotonic
5PL
and
and the
the slope
slope of
of the
the monotonic
monotonic5PL
5PLfunction.
function.
Case #
Order of a and d
Sign of b
Slope
Case #
Order of a and d
Sign of b
Slope
1
a>d
b>0
Down
1
a>d
b>0
Down
2
a>d
b<0
Up
2
a>d
b<0
Up
3
a<d
b>0
Up
3
a<d
b>0
Up
4
a<d
b<0
Down
4
a<d
b<0
Down
100100
100
1,000 1,000
1,000
10,000 10,000
10,000
100,000
100,000
100,000
C
8,000
8,000
8,000
6,000
6,000
6,000
Increasing
Increasing
c c
Increasing c
4,000
4,000
4,000
2,000
2,000
2,000
0 0
0 10 10
10
10,000
10,000
10,000 D
8,000D
8,000
8,000
6,000
6,000
6,000
4,000
4,000
4,000
2,000
2,000
2,000
0
0
10
0 10
10
10,000
10,000
10,000 E
8,000E
8,000
8,000
6,000
6,000
6,000
4,000
4,000
4,000
2,000
2,000
2,000
0
0
10
0 10
10
100100
100
1,000 1,000
1,000
10,000 10,000
10,000
100,000
100,000
100,000
100
100
100
Increasing
Increasing
d d
Increasing d
1,000 10,000 1,000
10,000
1,000
10,000
100,000
100,000
100,000
Increasing g
Increasing g
Increasing g
100
100
100
1,000 10,000 100,000
1,000
10,000
100,000
1,000
10,000
100,000
Dose
Fig. 3. Effects of varying the parametersDose
of a 5PL function. Panels A–E,
effects
of varying
parameters
a, b, c, d, of
and
respectively.
In this example,
Fig. 3. Effects
of the
varying
the parameters
a g,
5PL
function. Panels
A–E,
a
> d3.and
b < 0.ofthe
Fig.
Effects
varying
the parameters
a g,
5PL
function. Panels
A–E,
effects
of
varying
parameters
a, b, c, d, of
and
respectively.
In this example,
effects
of varying
a > d and
b < 0. the parameters a, b, c, d, and g, respectively. In this example,
a > d and b < 0.
Fitting
FittingWith
Withthe
theFive-Parameter
Five-ParameterLogistic
LogisticModel
Model
Asymmetric
AsymmetricCurve
CurveFitting
Fitting
To
Tohandle
handledata
datathat
thatwere
weretoo
tooasymmetric
asymmetricfor
forthe
the4PL,
4PL,many
many
researchers
were
forced
to
use
methods
such
as
spline
researchers were forced to use methods such as spline
interpolations.
interpolations.The
Thespline
splineinterpolation
interpolationmethod
methoddoes
doesnot
not
attempt
to
find
a
minimum
parameter
model
that
attempt to find a minimum parameter model thatapproximates
approximates
the
thetrue
truecurve.
curve.Instead,
Instead,the
thespline
splineinterpolation
interpolationmethod
methodputs
putsa
cubic
spline
through
the
data
points
using
a
parametric
cubic
a cubic spline through the data points using a parametric cubic
function.
function.Since
Sinceaaspline
splinepasses
passesexactly
exactlythrough
througheach
eachdata
data
point,
the
number
of
parameters
in
a
spline
curve
point, the number of parameters in a spline curveisisalways
always
equal
equaltotothe
thenumber
numberofofdata
datapoints.
points.The
Theresult
resultisisthat
thatthere
thereare
are
always
zero
degrees
of
freedom
in
a
spline
fit.
This
means
always zero degrees of freedom in a spline fit. This meansthat
that
aaspline
splinefitfitperforms
performsno
noaveraging
averagingofofthe
thedata
datatotoreduce
reduce
random
variation.
Furthermore,
a
spline
is
random variation. Furthermore, a spline isnot
notaagood
good
interpolating
interpolatingfunction
functionfor
forimmunoassay
immunoassaydata,
data,because
becauseititoften
often
does
not
approximate
the
curve
between
the
data
does not approximate the curve between the datapoints
pointswell.
well.
Splines
Splinesare
arenot
notalways
alwaysmonotonic,
monotonic,and
andcan
canoscillate
oscillateup
upand
and
down
downbecause
becauseofofthe
therandom
randomvariation
variationininevery
everydata
datapoint.
point.
Spline-based
standard
curves
have
substantial
curve
Spline-based standard curves have substantial curveerror
error
because
becausenone
noneofofthe
therandom
randomvariation
variationininthe
thedata
datapoints
pointsisis
averaged
averagedout,
out,and
andtherefore
thereforethe
theconcentration
concentrationestimates
estimates
contain
a
greater
amount
of
error
contain a greater amount of errorthan
thanaacurve
curvemodel
modelwith
with
fewer
parameters
that
can
average
out
the
random
fewer parameters that can average out the randomvariation.
variation.
Another
Anothermethod
methodsometimes
sometimesused
usedininlieu
lieuofofaa5PL
5PLisisthe
thelog
log
4PL
method.
This
method
takes
advantage
of
the
fact
4PL method. This method takes advantage of the factthat
that
taking
takingthe
thelogarithm
logarithmofofthe
theresponse
responseofofsome
someasymmetric
asymmetric
curves
can
make
the
data
more
symmetrical,
curves can make the data more symmetrical,and
andtherefore
therefore
better
suited
to
a
4PL
fit.
This
approach
works
when
better suited to a 4PL fit. This approach works whenthe
the lowlow-response
a sigmoidal
curve
a shorter
“tail”
response endend
of aofsigmoidal
curve
hashas
a shorter
“tail”
(approaches
(approachesthe
theasymptote
asymptotefaster)
faster)than
thanthe
theupper
upperend
endofofthe
the
curve
(parameter
g
between
1.1
and
1.5).
However,
curve (parameter g between 1.1 and 1.5). However,taking
takingthe
the
log
logofofthe
thesigmoidal
sigmoidaldata
datawhen
whenthe
thelow-response
low-responseend
endofofthe
the
curve
curvehas
hasaalonger
longertail
tailthan
thanthe
thehigh-response
high-responseend
endmakes
makesthe
the
data
more
asymmetric.
Since
this
type
of
behavior
is
data more asymmetric. Since this type of behavior is
encountered
encounteredininimmunoassay
immunoassaydata
datamore
moreoften
oftenthan
thanthe
theformer,
former,
this
approach
is
not
satisfactory
for
routine
data
reduction.
this approach is not satisfactory for routine data reduction.
Even
Evenwhen
whenggfits
fitsinto
intothis
thisoptimal
optimalrange,
range,a a5PL
5PLmodel
modelwill
will
almost
always
result
in
a
better
fit
and
more
accurate
almost always result in a better fit and more accurate
concentration
concentrationestimates.
estimates.
When
Whenthe
the5PL
5PLmodel
modelwas
wasfirst
firstproposed
proposedas
asthe
themeans
meanstoto
obtain
good
fits
to
asymmetric
assay
data,
many
obtain good fits to asymmetric assay data, manyresearchers
researchers
recognized
recognizedits
itspotential
potentialand
andattempted
attemptedtotointegrate
integratethe
the5PL
5PL
into
their
assay
analyses.
They
found
that
fitting
a
5PL
into their assay analyses. They found that fitting a 5PLmodel
model
totoassay
assaydata
dataisismuch
muchmore
moredifficult
difficultthan
thanfitting
fittingthe
the4PL
4PLmodel.
model.
The
traditional
numerical-fitting
algorithms
used
for
the
The traditional numerical-fitting algorithms used for the4PL
4PL
failed
failedatatan
anunacceptably
unacceptablyhigh
highrate;
rate;that
thatis,is,they
theyeither
eitherreturned
returned
an
error
or
never
finished.
Worse
still,
in
many
of
the
an error or never finished. Worse still, in many of theassays
assaysforfor
which
whichthe
thealgorithm
algorithmreturned
returnedaafit,
fit,the
thefitfitwas
wasclearly
clearlynot
notthe
the
best
fit.
best fit.
Why
Whythe
the5PL
5PLModel
ModelIsIsDifficult
DifficulttotoFit
Fit
One
Onereason
reasonthat
thatthe
the5PL
5PLmodel
modelisisso
sodifficult
difficulttotofitfitisisthat
thatit itisis
possible
possibletotowildly
wildlyadjust
adjustits
itsparameters
parametersinintandem
tandemso
sothat
thatthe
the
5PL
5PLcurve
curveitself
itselfhardly
hardlychanges
changesininthe
theplaces
placeswhere
wherethe
thedata
data
are.
are.When
Whenone
onecan
canchange
changethe
theparameters
parametersofofthe
thecurve
curveso
sothat
that
the
thecurve
curvehardly
hardlychanges
changesrelative
relativetotothe
thedata
datapoints,
points,then
thenthe
the
residuals
residualsalso
alsowill
willhardly
hardlychange.
change.This
Thisresults
resultsininthe
thevalue
valueofof
wSSE being largely unaffected by these tandem parameter
wSSE
being
largely
unaffected
by theseoftandem
parameter
changes.
Figure
4 shows
an example
a 5PL curve
where
changes.
4 shows
ancexample
of a 5PL curve
changingFigure
the values
of the
and g parameters
of thewhere
5PL
changing
the has
values
the c on
andthe
g parameters
of the 5PL
substantially
littleofeffect
curve and therefore
on the
substantially
hasThis
little can
effect
the by
curve
therefore
oninthe
value of wSSE.
beon
seen
the and
wide,
flat valley
the
value
of wSSE.
This can be
seen
by the
valley
in the
plot where
an algorithm
could
easily
bewide,
fooledflat
into
stopping
plot
where
benear
fooled
early.
Also,an
thealgorithm
flat plaincould
on theeasily
far left
theinto
g =stopping
0 axis could
early.
Also,
the flatinto
plainstopping
on the far
leftanear
theworse
g = 0value
axis could
fool an
algorithm
with
much
of
fool
an algorithm
stopping
with aofmuch
worse
of
wSSE
than in theinto
valley.
The values
c and
g thatvalue
minimize
wSSE
thanofinwSSE
the valley.
The
values are
of ccand
g that minimize
the
the value
in this
example
= 16,748
and
value
of wSSE in this example are c = 16,748 and g = 2,899.
g = 2,899.
wSSE
40,000
40,000
5,000
5,000
20,000
20,000
10,000
0
15,000
2,000
g
20,000
4,000
4,000
6,000
c
25,000
25,000
Fig. 4. A plot of the wSSE surface of a 5PL curve fit against the
Fig. 4. A plot of the wSSE surface of a 5PL curve fit against the parameters
parameters c and g. The color of the plot corresponds to the slope of the
c and g. The color of the plot corresponds to the slope of the surface. Blue hues
surface. Blue hues are low slope, or “flat”. The values c = 16,748 and g = 2,899
are low slope, or “flat”. The values c = 16,748 and g = 2,899 minimize wSSE.
minimize wSSE. This minimum falls in the middle of a flat plain that can fool
This minimum falls in the middle of a flat plain that can fool algorithms into
algorithms into stopping early or going haywire, or can cause other problems.
stopping early or going haywire, or can cause other problems. Note also the
Note also the plain on the far left near the g = 0 axis, where an algorithm could
plain on the far left near the g = 0 axis, where an algorithm could easily stop
easily stop without realizing that the valley where the true answer lies even exists.
without realizing that the valley where the true answer lies even exists.
To facilitate the interpretation of this geometric view, imagine
that wSSE is the result of fitting a two-parameter curve model
To facilitate
the
interpretation
of this
geometric
imagine
instead
of the
5PL
curve model.
A region
in theview,
parameter
that
wSSE
is
the
result
of
fitting
a
two-parameter
curve
model
space where wSSE hardly changes even across wide changes
instead
of
the
5PL
curve
model.
A
region
in
the
parameter
in the parameter values would be like a wide flat plain. A plot of
spacecan
where
changes
even sides
acrossbut
wide
changes
wSSE
alsowSSE
have hardly
“ravines”
with steep
very
level
in
the
parameter
values
would
be
like
a
wide
flat
plain.
A plot of
bottoms. Unlike a person standing on a landscape scanning
wSSE
canfor
also
have
“ravines”
with
steep sides
the
scene
a low
spot,
a fitting
algorithm
has but
onlyvery
locallevel
bottoms.
Unlike
a
person
standing
on
a
landscape
scanning
information about wSSE and the history of where it has been
the
scene
a low
spot,
fitting
algorithm
only local
to
guide
itsfor
next
guess
forathe
location
of thehas
minimum
of
information
about
wSSE
and
the
history
of
where
it has been
wSSE. It has no way to “see” where to go.
to guide its next guess for the location of the minimum of
The
point
in the
space
thattominimizes
wSSE
wSSE.
It has
noparameter
way to “see”
where
go.
provides the parameters of the best-fitting curve. Such a point
point
in theofparameter
that surface.
minimizes
isThe
at the
bottom
a valley onspace
the wSSE
If awSSE
wide plain
provides
the
parameters
of
the
best-fitting
curve.
Such
surrounds such a valley, an algorithm that does not
usea point
is
at
the
bottom
of
a
valley
on
the
wSSE
surface.
If
a
wide
plain
advanced methods can be fooled into thinking that the
plain
surrounds
such
a
valley,
an
algorithm
that
does
not
use
is actually the floor of the best-fitting valley, causing it to stop
advanced
be fooled
intoonthinking
thesecond
plain
well
short ofmethods
the truecan
minimum.
Also,
such athat
plain,
is
actually
the
floor
of
the
best-fitting
valley,
causing
it
to
derivative-based methods such as Gauss-Newton and stop
well short of the true minimum.
on such
a plain,
secondLevenberg-Marquardt,
the fittingAlso,
methods
most
commonly
derivative-based
methods
such
as
Gauss-Newton
and
used to fit 4PL curves, can be fooled into going in an entirely
Levenberg-Marquardt,
thethe
fitting
methods
mostderivatives
commonlyis
wrong
direction because
matrix
of second
used
to
fit
4PL
curves,
can
be
fooled
into
going
in an entirely
very nearly singular. Such algorithms may hop around
the
wrong
direction
because
the
matrix
of
second
derivatives
is
parameter space forever, or stop and report an error after their
very
nearly
singular.
Such
algorithms
may
hop
around
the
iteration limits are reached. Also, the near singularity of the
second-derivative matrix can cause many numerical problems
since it must typically be inverted to be used in algorithms
that employ it. This can cause an algorithm to have
unpredictable problems.
Another problem that happens to algorithms on such flat plains
in parameter space is that they can be going somewhere,
probably not even in the right direction, very, very slowly.
The result of this is that the algorithm may also never return
because it never thinks it is “finished”.
Lastly, the wSSE surface is often not one big “bowl” such that
all a fitting algorithm needs to do is “slide” down to the bottom.
Almost any fitting algorithm can handle such a simple scenario.
Real wSSE surfaces have many possible valleys, the bottoms
of which may be the correct best minimum but more likely are
not. A successful algorithm must be able to find the valley with
the lowest bottom, and then successfully find the bottom.
This requires that the algorithm be able to find a starting point
for the fitting algorithm so that the starting point is within the
region of convergence to the correct minimum of wSSE of the
fitting algorithm being used. This is perhaps the most difficult
task in developing a reliable algorithm for fitting the 5PL model.
Many algorithms will end up in the wrong valley, and will report
that they have found the best fit, even though they are
nowhere near it.
n
Is numerically stable and robust due to careful selection
of appropriate numerical algorithms, careful customization
and optimization of them, and carefully designed logic for
handling exceptions
Summary
Weighted 5PL regression produces the most accurate
concentration estimates of any method currently in use for
asymmetric immunoassay data. Because of difficulties with
fitting the 5PL curve model, the 5PL model is only now
becoming widely used in conjunction with software like the
Bio-Plex system’s Brendan logistic module and StatLIA.
Since no immunoassay data are perfectly symmetric,
fitting with a 5PL model will almost always result in better
concentration estimates than using a 4PL model. This in turn
allows greater dynamic ranges to be used, and produces
greater accuracy in the results. The Brendan logistic module
is able to fit the 5PL model better than other software.
References
Bates DM and Watts DG, Nonlinear Regression Analysis and Its Applications,
New York: Wiley (1988)
Draper NR and Smith H, Applied Regression Analysis, New York: Wiley (1981)
Dudley RA et al., Guidelines for immunoassay data processing, Clin Chem 31,
1264–1271 (1985)
Finney DJ, Statistical Methods in Biological Assays, 3rd ed, London:
Charles Griffin (1978)
The Brendan Logistic Module Fits the 5PL Better Than
Other Software
Finney DL and Phillips P, The form and estimation of a variance function, with
particular reference to immunoassay, Appl Stat 26, 312–320 (1977)
A better 5PL fit means that concentration estimates will be
more accurate. More accurate concentration estimates lead
o wider dynamic ranges for assays. The Brendan logistic
module, used in Bio-Plex Manager™ version 3.0 software,
does a better job of fitting the 5PL than other data reduction
software. The Brendan logistic module performs better
because it:
Prentice RLA, A generalization of the probit and logit methods for dose
response curves, Biometrics 32, 761–768 (1976)
n
n
n
n
Uses optimal weighting with the 5PL curve model to
correctly model the random error and thereby get the best
possible fit
Employs a number of sophisticated fitting algorithms that are
able to traverse the “plains” and “level valleys” of the wSSE
surface that can fool other algorithms
Uses sophisticated logic to determine which algorithms are
most effective to use during each phase of the fitting
Uses advanced methods to ensure that the true global
minimum of wSSE is obtained, not a false local minimum
Raab GM, Estimation of a variance function, with application to immunoassay,
Appl Stat 30, 32–40 (1981)
Rodbard D et al., Radioimmunoassay and Related Procedures in Medicine 1,
Vienna: International Atomic Energy Agency, 469–504 (1978)
Wild D (ed), The Immunoassay Handbook, New York: Stockton Press (1994)
About Brendan Scientific
Brendan Scientific develops and markets StatLIA software for
advanced computational analysis and workflow management
of immunoassay-related technologies. Brendan’s core focus
has been to combine proven scientific methods and robust
mathematics with the processing speed and automation of
software. This combination provides the highest level of
accuracy and reliability with the workflow flexibility and
automation required for today’s laboratory.
StatLIA is a trademark of Brendan Scientific.
The Bio-Plex suspension array system includes fluorescently labeled microspheres
and instrumentation licensed to Bio-Rad Laboratories, Inc. by the Luminex
Corporation.
Information in this tech note was current as of the date of writing (2004) and
not necessarily the date this version (rev A, 2007) was published.
Bio-Rad
Laboratories, Inc.
Web site www.bio-rad.com USA 800 4BIORAD Australia 61 02 9914 2800 Austria 01 877 89 01 Belgium 09 385 55 11 Brazil 55 21 3237 9400
Canada 905 712 2771 China 86 21 6426 0808 Czech Republic 420 241 430 532 Denmark 44 52 10 00 Finland 09 804 22 00 France 01 47 95 69 65
Germany 089 318 84 0 Greece 30 210 777 4396 Hong Kong 852 2789 3300 Hungary 36 1 455 8800 India 91 124 4029300 Israel 03 963 6050
Italy 39 02 216091 Japan 03 5811 6270 Korea 82 2 3473 4460 Mexico 52 555 488 7670 The Netherlands 0318 540666 New Zealand 0508 805 500
Norway 23 38 41 30 Poland 48 22 331 99 99 Portugal 351 21 472 7700 Russia 7 495 721 14 04 Singapore 65 6415 3188 South Africa 27 861 246 723
Spain 34 91 590 5200 Sweden 08 555 12700 Switzerland 061 717 95 55 Taiwan 886 2 2578 7189 United Kingdom 020 8328 2000
Life Science
Group
Bulletin 3022
US/EG
Rev A
07-0749
0907
Sig 1106