Software Bio-Plex suspension array system ® tech note 3022 Fitting Brendan’s Five-Parameter Logistic Curve Paul G Gottschalk, PhD, and John R Dunn II, PhD, Brendan Scientific Corporation, 2236 Rutherford Road, Suite 107, Carlsbad, CA 92008 USA Introduction Biologists perform immunoassays to determine the concentration of an analyte in a sample. In Bio-Plex assays, this is typically done by measuring a response in the form of a signal that is proportional to the amount of analyte bound to the antibody on beads that have been incubated with the sample. In order to quantitate the concentration of the analyte, the response must be compared to a calibration curve commonly called the standard curve. The unknown concentration of an analyte may then be determined by finding the concentration on the standard curve that produces the same response as that obtained from the unknown sample (Wild 1994, Dudley et al. 1985). Ideally, the standard curve would be the true curve: the curve that expresses the concentration vs. response relationship without any distortion by errors. If an infinite number of concentrations were used, each with an infinite number of replicates, the resulting curve would be the true curve. Since, for practical reasons, only a limited number of samples can be run in an assay, the true curve must be estimated from a limited number of noisy responses. From these noisy responses, the true response at each concentration must be estimated. Because there cannot be a standard at every concentration, a means of interpolating between standards is necessary. This is done by selecting a mathematical function that does a good job of approximating the true curve. This approximating function is called a curve model. Many functions have been employed as curve models of immunoassays, including lines, cubic splines, logistic functions, and lines in logit-log space. Curve Fitting Two steps must be taken to find a function that gives a good approximation of the true curve. First, the mathematical curve model must be selected. Then, the particular curve out of the entire family of possible curves that best explains the data must be determined by fitting the curve. This means that the parameters in the function must be adjusted until the function approximates the assay’s true curve as well as possible. In statistics, one of the most common ways to determine how well a candidate standard curve is fitting the true curve is to determine how likely (probable) it is for the candidate standard curve to have yielded the observed standard data under the assumption that the candidate standard curve is actually the true curve. The best-fitting curve is therefore the curve most likely to have given rise to the observed data. This curve is often called the maximum likelihood estimate of the true curve. Statistical regression theory shows that finding the parameters of the maximum likelihood curve is equivalent to finding the curve whose parameters generate the smallest weighted sum of squared errors (wSSE) (Draper and Smith 1981, Bates and Watts 1988). The weighted sum of squared errors is the sum of all of the squares of the differences (D2) between the observed standard responses (yi ) and the response predicted by the curve model (ŷi ), weighted by the inverse variance (1/variance) of the standard responses at that concentration. N N i =1 i =1 wSSE = S wi [ yi – ŷi ]2 = S wi [ Di ]2 Regression, or fitting, is the process of minimizing the weighted sum of squared errors (wSSE). Therefore, the best-fitting curve is the one whose parameters produce the smallest wSSE. Note that the differences (D) are being called errors here, but the word “residuals” is often used, and wSSE could just as well be called the weighted sum of squared residuals. Figure 1 diagrams the computation of wSSE, showing the errors and weights that factor into the computation. 14,000 12,000 Response Once the curve model has been fitted by adjusting its parameters to minimize wSSE, a means of assessing the quality of the curve fit is necessary because a poor curve fit will generate unreliable concentration estimates. Perhaps the first quantity one might consider using as a fit metric is wSSE, since this is precisely what is being minimized by the curve-fitting algorithm to find the best fit. For any given set of standard data, a curve fit with a smaller wSSE is better than one with a larger wSSE. 10,000 8,000 6,000 4,000 2,000 0 Residual Variance vs. r2 0.05 0.1 0.5 Dose 1 5 Fig. 1. Graphical rendition of the computation of wSSE. Shown are the residuals, how the weights are applied to them, and the summation of the weighted squared residuals to obtain wSSE. The direction and length of the line from the curve to the data point indicate the sign and magnitude of the residual. This residual is then squared, multiplied by a weighting factor, and summed to give wSSE. Weighting Choosing proper weights for the squared residuals is crucial for obtaining the best curve fit. According to regression theory, the weights should be set equal to the inverse variance of the responses at that concentration. This keeps the fitting procedure tighter around those standard responses with the smallest variance (error), that is, those with smallest responses, and looser around those standard responses with the largest variance, that is, those with largest responses. This yields the optimal use of information from the noisier and less noisy standards, and leads to the most accurate concentration estimates. Sample concentrations computed from unweighted curve-fitting procedures can differ from properly weighted curves by hundreds of percent. The variance of a standard is a function of the magnitude of its response. It is common for the variance of a standard at the high-response end of a curve to be 3 or 4 orders of magnitude larger than those at the low-response end of the curve. There are two reasons for this response dependence. First, most signal detectors produce noise with a standard deviation that is proportional to the magnitude of the response. Second, the kinetics associated with antibody binding are nonlinear (Wild 1994), with the result that the kinetic variations in the reaction change with the magnitude of the response. The variance of the standards can usually be approximated by a power function of the response: variance = A(response)B where A is a function of the magnitudes of the responses and B falls in the range 1.0–2.0 for most immunoassays (Finney 1978). Data such as immunoassay responses in which the variances are not constant are called heteroscedastic data. Since it is impractical to run enough replicates to reliably estimate the true variance function from a single assay, a pool of historical assay data can be used to compute this variance function (Finney and Phillips 1977, Raab 1981). The problem with using wSSE is that it cannot be compared directly with the wSSE of other assays if the number of data points is not the same. A quantity that is based on wSSE but that can be compared across assays that have differing amounts of data is called the residual variance. The residual variance is the wSSE divided by the number of degrees of freedom in the assay. The number of degrees of freedom in an assay is the number of data points in the standard curve beyond the number of parameters in the curve model. For example, a line has two parameters, and so requires two data points to uniquely determine it. If a line were being fitted to eight standard points, there would be 8 – 2 = 6 degrees of freedom. A five-parameter logistic (5PL) has five parameters and requires five data points to uniquely determine it. If a 5PL were being fitted to eight standard points, there would be 8 – 5 = 3 degrees of freedom. The residual variance normalizes the wSSE to properly account for differences in the number of data points. The statistic r2 is another measurement that is commonly used as a fit metric, especially for linear regression. Residual variance and r2 are related, but they behave quite differently. If the squares of each of the mean-corrected responses are weighted and summed together, this total sum of squares (total SS) can be divided into two parts: the regression sum of squares (regression SS) is that portion of the total SS that is explained by the regression model, and the residual sum of squares (SSE) is that portion of the total SS that is left over. The value of r2 is the fraction regression SS/total SS. The problem with r2 is that it is not a sensitive metric for assessing the quality of a curve fit. Even with bad curve fits, the vast majority of the total SS will be accounted for in the regression SS fraction. This leads to r2 values above 0.95 even for a bad fit. Further, because it is not weighted, the majority of r2 will have been contributed by the standards with the highest responses. The residual variance, on the other hand, is very sensitive to imperfections in the curve fit, when properly weighted. Fit Probability Residual variance allows the quality of fit to be measured. However, it provides no information about how good or bad a fit is. Under the assumption that the responses of the individual standard concentrations are approximately normally distributed, it can be shown that wSSE obeys a chi-square distribution with the number of degrees of freedom present in the assay (Draper and Smith 1981, Bates and Watts 1988). This distribution allows us to determine how likely it is that a The Advantages of a Good Fit The Advantages a Gooddesigned Fit The goal of anofassay to determine the concentration The goal of an assay designed to determine the concentration The of an assay designed determine the as concentration of goal an analyte in a sample is totobe as accurate possible. of an analyte in a sample is to be as accurate as possible. of Improving an analytethe in astandard sample is to befitas curve willaccurate improve as thepossible. accuracy of all Improving the standard curve fit will improve the accuracy of Improving the standard curve fit willimproving improve the concentration estimates. In turn, theaccuracy accuracyofofall the all concentration estimates. In turn, improving the accuracy concentration estimates. turn, improving the accuracy concentration estimatesInwill extend the dynamic range of the of the concentration estimates will extend the dynamic range concentration estimatesorwill extend the dynamic assay. The dynamic, reportable, range of anrange assayof is the the of the assay. The dynamic, or reportable, range of an assay assay. dynamic, or reportable, of an assay is the stay rangeThe over which the errors of therange concentration estimates is the range over which the errors of the concentration range over the errors of the concentration below thewhich maximum acceptable error. Typically,estimates a good fitstay will estimates stay below the maximum acceptable error. Typically, below thethe maximum Typically, a good cause dynamicacceptable range to beerror. extended at the low fit will a good fit will cause the dynamic range to be extended at cause the dynamic be extended at the low concentration endrange of thetoassay, permitting lower the low concentration end of the assay, permitting lower concentration endtoofbe theaccurately assay, permitting lower concentrations determined. concentrations to be accurately determined. concentrations to be accurately determined. To improve the quality of the fit of the standard curve, the To improve the quality of the fit of the standard curve, the Tosources improveofthe of the of the standard curve, the fit quality error must befitidentified. In any regression, sources of fit error must be identified. In any regression, sources of fit error must be identified. In any regression, regardless of what curve model is used, there are two reasons regardless of what curve model is used, there are two reasons regardless of what is used, thereThe are first tworeason reasonsis that the curve willcurve not fitmodel the data perfectly. that the curve will not fit the data perfectly. The first reason is that the curve will not fit the data perfectly. The first reason the presence of random variation in the data. This kind ofiserror the presence of random variation in the data. This kind of error theis presence of error random in the data. This kind ofthe error called pure andvariation can be reduced by increasing is called pure error and can be reduced by increasing the is number called pure error and can be reduced by increasing the of replicates of each standard. The second reason is number of replicates of each standard. The second reason is number of curve replicates of may eachnot standard. The second that the model approximate the truereason curve isvery that the curve model may not approximate the true curve very that the curve model may not approximate the true curve verybe well. This kind of error is called lack-of-fit error and cannot well. This kind of error is called lack-of-fit error and cannot be well. This kind of error is called lack-of-fit error and cannot reduced by increasing the number of standard replicates.be For reduced by increasing the number of standard replicates. reduced by much increasing the number of standard replicates. For example, immunoassay data has a sigmoidal, or “S”, For example, much immunoassay data has a sigmoidal, example, immunoassay has a sigmoidal, or “S”, shape ifmuch data are taken over data a wide enough range of or “S”, shape if data are taken over a wide enough range of shape if data are taken over a wide enough range of concentrations. If a line were used as the curve model to fit concentrations. If a line were used as the curve model to fit concentrations. If a of linethe were used as the model to fit error. such data, much wSSE would becurve due to lack-of-fit such data, much of the wSSE would be due to lack-of-fit error. such data, much of the wSSE would be due to lack-of-fit error. This is because a straight line simply cannot fit the sigmoidal This is because a straight line simply cannot fit the sigmoidal This is because a straight line simply cannot fit the sigmoidal shape of the data. In other words, the shape of the assay’s shape of the data. In other words, the shape of the assay’s shape of the isdata. In straight other words, true curve not a line. the shape of the assay’s true curve is not a straight line. true curve is not a straight line. To fit the data as well as possible, a curve model must be To fit the data as well as possible, a curve model must be Tochosen fit the data wella as possible, a curve model the must becurve. that as does good job of approximating true chosen that does a good job of approximating the true curve. chosen a good job of of approximating true process curve. If this that is notdoes done, no amount improving thethe assay If this is not done, no amount of improving the assay process If this is not done, no amount of improve improvingthe thefit assay process by reducing random noise will beyond the limit by reducing random noise will improve the fit beyond the limit byallowed reducing noise will improve the fit beyond byrandom the lack-of-fit component of the error. the limit allowed by the lack-of-fit component of the error. allowed by the lack-of-fit component of the error. The Four- and Five-Parameter Logistic Curve Models The Four- and Five-Parameter Logistic Curve Models The and Five-Parameter Logistic Models TheFourfour-parameter logistic (4PL) model hasCurve historically The four-parameter logistic (4PL) model has historically The four-parameter logistic (4PL) model has historically been more successful at fitting immunoassay data than its been more successful at fitting immunoassay data than its been more successful at fitting data than its predecessors, particularly theimmunoassay logit-log and mass action predecessors, particularly the logit-log and mass action predecessors, particularly the logit-log and mass action models. However, the 4PL model has a weakness: It is a models. However, the 4PL model has a weakness: It is a models. However, the 4PL a weakness: It isare a not symmetrical function, andmodel most has immunoassay data symmetrical function, and most immunoassay data are not symmetrical function, and most immunoassay data are not symmetrical. To rectify this shortcoming, the 4PL model was symmetrical. To rectify this shortcoming, the 4PL model was symmetrical. Toadding rectify athis shortcoming, the controls 4PL model extended by fifth parameter that the was degree extended by adding a fifth parameter that controls the degree extended by adding a fifth parameter that controls the degree of asymmetry of the curve (Prentice 1976, Rodbard et al. of asymmetry of the curve (Prentice 1976, Rodbard et al. of 1978). asymmetry curve (Prentice 1976,by Rodbard et al. With of thethe extra flexibility afforded its g parameter, 1978). With the extra flexibility afforded by its g parameter, 1978). Withmodel the extra flexibility afforded its g parameter, the 5PL is able to eliminate thisbylack-of-fit error from the 5PL model is able to eliminate this lack-of-fit error from theimmunoassay 5PL model iscurves. able toFigure eliminate this lack-of-fit from the 2 shows the resulterror of fitting immunoassay curves. Figure 2 shows the result of fitting the immunoassay curves. Figure 2 shows the result of fitting the a same set of asymmetric data with a 5PL curve model and same set of asymmetric data with a 5PL curve model and a same set of asymmetric data with a 5PL curve model and a data 4PL curve model. It is apparent that the 5PL curve fits the 4PL curve model. It is apparent that the 5PL curve fits the data 4PL curve model. It is apparent that the 5PL curve fits the data better and will provide more accurate concentration estimates better and will provide more accurate concentration estimates better will provide more accurate concentration thanand the 4PL fit. Table 1 shows the fit statistics for estimates the two than the 4PL fit. Table 1 shows the fit statistics for the two than thefits 4PL Table2:1wSSE, showsdegrees the fit statistics for the two curve in fit. Figure of freedom, residual curve fits in Figure 2: wSSE, degrees of freedom, residual curve fits in and Figure 2: wSSE, degrees of freedom, in residual variance, fit probability. The fit probabilities Table 1 variance, and fit probability. The fit probabilities in Table 1 variance, andthe fit probability. The fit probabilities in Table 1 of show that quality of the 5PL fit is good, while the fit show that the quality of the 5PL fit is good, while the fit of show that the quality of the 5PL fit is good, while the fit of the 4PL is very poor. the 4PL is very poor. the 4PL is very poor. The 5PL and 4PL models are described by the The 5PL and 4PL models are described by the following The 5PL and 4PL models are described by the following equation: equation: following equation: Where: a–d a = estimated response at Where: y=d+ zero concentration a–d b g a = estimated response at x y=d+ bzero = slope factor concentration 1+ b g xc = mid-range b =c slope factor concentration 1+ c = estimated response at c =dmid-range concentration [[ () ( ) ]] infinite concentration d = estimated response at g infinite = asymmetry factor concentration g = asymmetry factor In this equation, x is the concentration, y is the response, In this equation, x isthe concentration, theof response, In and this equation, is concentration, y yisisthe response, a, b, c, d, xand gthe are the five parameters the 5PL and and arethe fiveparameters parameters 5PL and a, a, b,b, c,c,d,d, and g gare five ofofthe model. The 4PL model isthe obtained by setting gthe = 5PL 1. model. The 4PL model obtained settingg g= =1.1. model. The 4PL model is is obtained bybysetting A 30,000 30,000 30,000 25,000 A 25,000 25,000 20,000 20,000 20,000 15,000 15,000 15,000 10,000 10,000 10,000 5,000 5,000 5,000 0 0 0 Response, mean fluorescence units Response, mean fluorescence units Response, mean fluorescence units Response, mean fluorescence units particular value of wSSE will occur. The chi-square probability particular value of wSSE will occur. The chi-square probability particular value of willif occur. The under chi-square probability is the fraction of wSSE assays, performed exactly the same is the fraction of assays, if performed under exactly the same is conditions, the fraction that of assays, exactly thecurve samefit, would ifbeperformed expected under to have a worse conditions, that would be expected to have a worse curve fit, conditions, that would bethan expected to have worse curve fit, that is, a larger wSSE, the curve fit ofathe assay under that is, a larger wSSE, than the curve fit of the assay under that is, a larger wSSE, than the curve fit offrom the assay under consideration. This probability obtained the chi-square consideration. This probability obtained from the chi-square consideration. probability obtained from chi-square the distribution isThis called the fit probability. Beingthe a probability, distribution is called the fit probability. Being a probability, the distribution the fit probability. Being a probability, values of is thecalled fit probability range from 1 (perfect fit) to 0the (no fit). values of the fit probability range from 1 (perfect fit) to 0 (no fit). values of the fit probability range from 1 (perfect fit) to 0 (no fit). The Advantages of a Good Fit B A 100 100 100 30,000 30,000 B 30,000 25,000 25,000 B 25,000 20,000 20,000 20,000 15,000 15,000 15,000 10,000 10,000 10,000 5,000 5,000 5,000 00 0 100 100 100 500 1,000 500 1,000 500 1,000 500 500 1,000 1,000 500 1,000 5,000 10,000 50,000 100,000 5,000 10,000 50,000100,000 5,000 10,000 50,000 100,000 5,000 5,00010,000 10,000 5,000 Dose 10,000 Dose 50,000 100,000 50,000100,000 50,000 100,000 Dose Fig. 2. Comparison Comparisonof of5PL 5PLand and4PL 4PLcurve curve fitting.A,A,the theresult result a 5PL fitting. of of a 5PL fit asymmetric immunoassay data; B, the result 4PLfitfitof the on asymmetric immunoassay data; B, the result a a4PL onon Fig. 2. Comparison of 5PL and 4PL curve fitting. A,ofof the result athe 5PL same data. data. fit on asymmetric immunoassay data; B, the result of a 4PL fit on the same data. Table 1. Fit statistics for the 5PL and 4PL fits shown in Figure 2. Table 1. Fit statistics for the 5PL and 4PL fits shown in Figure 2. Statistic 5PL 4PL Statistic wSSE 5PL 2.41 4PL 120.0 120.0 10 wSSE Degrees of freedom 2.41 9 Degrees of freedom Residual variance 9 Residual variance Fit probability 0.269 0.983 12.0 <0.001 Fit probability 0.983 <0.001 0.269 10 12.0 All curves generated by by the All curves generated the 5PL 5PL equation equation are are either either All curves generated by the 5PL equationdepending are either on the monotonically increasing or decreasing, decreasing, monotonically increasing or depending on the monotonically increasingb,orand decreasing, depending on the the choice of of parameters choice parameters a, a, b, and d. d. Table Table 22 summarizes summarizes the choice of parameters a, b, and d. Table 2 summarizes the effect of of the effect the parameters parameters a, a, b, b, and and dd on on the the slope slope of ofthe the effect of the parameters a, b, and d on the slope of the logistic function. logistic function. logistic function. Figure 3 shows Figure 3 shows semi-log semi-log plots plots of of aa family family of of curves curves that that can can Figure 3 shows semi-log plots of a family of curves that can be generated from the 5PL equation. The figure makes be generated from the 5PL equation. The figure makes several several be generated from the 5PLfunction equation. The figure makes several characteristics of the characteristics of the 5PL 5PL function apparent. apparent. The The function function characteristics of the 5PLasymptote function apparent. Theapproaches function approaches a horizontal as the dose approaches a horizontal asymptote as the dose approaches approaches a horizontal another asymptote as the dose approaches zero, and it approaches horizontal asymptote zero, and it approaches another horizontal asymptote as as the the zero, and it approaches another horizontal asymptote as the dose approaches infinity. Between the asymptotic regions of dose approaches infinity. Between the asymptotic regions of dose approaches infinity.region. Between theisasymptotic regionspoint of the curve is a transition There a single inflection the curve is a transition region. There is a single inflection point the curve is a transition region. There is athe single inflection point in in the the transition transition region. region. On On either either side side of of the inflection inflection point, point, in the transition region. On either side of the inflectionatpoint, the curve will approach the left and right asymptotes the curve will approach the left and right asymptotes at the curverates will approach the left and right asymptotes at different unless g g= different rates unless = 1. 1. different rates unless g = 1. Each parameter Each parameter of of the the 5PL 5PL function function has has aa different different effect effect on on Each parameter 3ofsummarizes the 5PL functioneffect has a different effect on the curve. the curve. Table Table 3 summarizes the the effect of of each each parameter parameter the curve. Table 3 summarizes the effect of each parameter on the on the 5PL 5PL function. function. on the 5PL function. Table Table 3. 3. Geometric Geometric interpretation interpretation of of the the5PL 5PLfunction’s function’sparameters. parameters. Table 3. Geometric interpretation of the 5PL function’s parameters. Parameter Parameter a a b b c c d d g g Effect on Curve Effect on Curve Controls the position of the asymptote. Order relative Controls the position of the Along asymptote. relative to d controls sign of slope. with b,Order magnitude to d controls sign of slope. Alongofwith b, magnitude relative to d controls magnitude slope. relative to d controls magnitude of slope. Magnitude controls the rapidity of the transition region; Magnitude controls thecurve. transition region; sign controls sign ofthe therapidity slope ofofthe Solely controls signrate controls sign of the slope of the curve. controls the of approach to the asymptote, and Solely jointly with g the rate of controls theapproach approachtotothe theasymptote, asymptote.and jointly with g controls the approach to the asymptote. Controls the position of the transition region. Controls the position of the transition region. Controls the position of the asymptote. Order relative to Controls asymptote. OrderAlong relative to b, a controlsthe theposition sign of of thethe slope of the curve. with a controls the sign to of athe slope of the curve.ofAlong magnitude relative controls magnitude slopewith of b, magnitude the curve. relative to a controls magnitude of slope of the curve. Jointly with b controls the rate of approach to Jointly with b controls the rate of approach to the asymptote. the asymptote. 12,000 12,000 12,000 A 10,000 A 10,000 10,000 8,000 8,000 8,000 6,000 6,000 6,000 Increasing a a Increasing Increasing a 4,000 4,000 4,000 2,000 2,000 2,000 0 0 B 0 10 10 10 100100 100 1,000 1,000 1,000 10,000 10,000 10,000 100,000 100,000 100,000 10,000 10,000 10,000 B B 8,000 8,000 8,000 6,000 6,000 6,000 Increasing b b Increasing Increasing b 4,000 4,000 4,000 2,000 2,000 2,000 0 0 C D E 0 10 10 10 10,000 10,000 10,000 C Response, mean fluorescence units Note that when g = 1 and the curve is actually a 4PL curve, pairs of cases that when g =into 1 and the curve is actually 4PLcase curve, of cases Note and thecases, curve so actually curve, can be combined single that for a4PL, #1pairs and case #2, single cases,the sosame that for for can be combined into#3, single cases, so that 4PL, case #1 and #2, or case #1 and case generate functional forms. For case the 5PL or case #1 and case #3, generate the same functional forms. For the 5PL function, all cases produce distinct functionalfunctional forms. forms. For function, all cases cases produce produce distinct distinct functional functionalforms. forms. function, A Response, Response, mean mean fluorescence fluorescence units units Table 2. The relationship between the order of a and d, the sign of b, Table 2. The between the order Tablethe 2. slope The relationship relationship between thefunction. orderof ofaaand andd,d,the thesign signofofb,b, and of the monotonic 5PL and and the the slope slope of of the the monotonic monotonic5PL 5PLfunction. function. Case # Order of a and d Sign of b Slope Case # Order of a and d Sign of b Slope 1 a>d b>0 Down 1 a>d b>0 Down 2 a>d b<0 Up 2 a>d b<0 Up 3 a<d b>0 Up 3 a<d b>0 Up 4 a<d b<0 Down 4 a<d b<0 Down 100100 100 1,000 1,000 1,000 10,000 10,000 10,000 100,000 100,000 100,000 C 8,000 8,000 8,000 6,000 6,000 6,000 Increasing Increasing c c Increasing c 4,000 4,000 4,000 2,000 2,000 2,000 0 0 0 10 10 10 10,000 10,000 10,000 D 8,000D 8,000 8,000 6,000 6,000 6,000 4,000 4,000 4,000 2,000 2,000 2,000 0 0 10 0 10 10 10,000 10,000 10,000 E 8,000E 8,000 8,000 6,000 6,000 6,000 4,000 4,000 4,000 2,000 2,000 2,000 0 0 10 0 10 10 100100 100 1,000 1,000 1,000 10,000 10,000 10,000 100,000 100,000 100,000 100 100 100 Increasing Increasing d d Increasing d 1,000 10,000 1,000 10,000 1,000 10,000 100,000 100,000 100,000 Increasing g Increasing g Increasing g 100 100 100 1,000 10,000 100,000 1,000 10,000 100,000 1,000 10,000 100,000 Dose Fig. 3. Effects of varying the parametersDose of a 5PL function. Panels A–E, effects of varying parameters a, b, c, d, of and respectively. In this example, Fig. 3. Effects of the varying the parameters a g, 5PL function. Panels A–E, a > d3.and b < 0.ofthe Fig. Effects varying the parameters a g, 5PL function. Panels A–E, effects of varying parameters a, b, c, d, of and respectively. In this example, effects of varying a > d and b < 0. the parameters a, b, c, d, and g, respectively. In this example, a > d and b < 0. Fitting FittingWith Withthe theFive-Parameter Five-ParameterLogistic LogisticModel Model Asymmetric AsymmetricCurve CurveFitting Fitting To Tohandle handledata datathat thatwere weretoo tooasymmetric asymmetricfor forthe the4PL, 4PL,many many researchers were forced to use methods such as spline researchers were forced to use methods such as spline interpolations. interpolations.The Thespline splineinterpolation interpolationmethod methoddoes doesnot not attempt to find a minimum parameter model that attempt to find a minimum parameter model thatapproximates approximates the thetrue truecurve. curve.Instead, Instead,the thespline splineinterpolation interpolationmethod methodputs putsa cubic spline through the data points using a parametric cubic a cubic spline through the data points using a parametric cubic function. function.Since Sinceaaspline splinepasses passesexactly exactlythrough througheach eachdata data point, the number of parameters in a spline curve point, the number of parameters in a spline curveisisalways always equal equaltotothe thenumber numberofofdata datapoints. points.The Theresult resultisisthat thatthere thereare are always zero degrees of freedom in a spline fit. This means always zero degrees of freedom in a spline fit. This meansthat that aaspline splinefitfitperforms performsno noaveraging averagingofofthe thedata datatotoreduce reduce random variation. Furthermore, a spline is random variation. Furthermore, a spline isnot notaagood good interpolating interpolatingfunction functionfor forimmunoassay immunoassaydata, data,because becauseititoften often does not approximate the curve between the data does not approximate the curve between the datapoints pointswell. well. Splines Splinesare arenot notalways alwaysmonotonic, monotonic,and andcan canoscillate oscillateup upand and down downbecause becauseofofthe therandom randomvariation variationininevery everydata datapoint. point. Spline-based standard curves have substantial curve Spline-based standard curves have substantial curveerror error because becausenone noneofofthe therandom randomvariation variationininthe thedata datapoints pointsisis averaged averagedout, out,and andtherefore thereforethe theconcentration concentrationestimates estimates contain a greater amount of error contain a greater amount of errorthan thanaacurve curvemodel modelwith with fewer parameters that can average out the random fewer parameters that can average out the randomvariation. variation. Another Anothermethod methodsometimes sometimesused usedininlieu lieuofofaa5PL 5PLisisthe thelog log 4PL method. This method takes advantage of the fact 4PL method. This method takes advantage of the factthat that taking takingthe thelogarithm logarithmofofthe theresponse responseofofsome someasymmetric asymmetric curves can make the data more symmetrical, curves can make the data more symmetrical,and andtherefore therefore better suited to a 4PL fit. This approach works when better suited to a 4PL fit. This approach works whenthe the lowlow-response a sigmoidal curve a shorter “tail” response endend of aofsigmoidal curve hashas a shorter “tail” (approaches (approachesthe theasymptote asymptotefaster) faster)than thanthe theupper upperend endofofthe the curve (parameter g between 1.1 and 1.5). However, curve (parameter g between 1.1 and 1.5). However,taking takingthe the log logofofthe thesigmoidal sigmoidaldata datawhen whenthe thelow-response low-responseend endofofthe the curve curvehas hasaalonger longertail tailthan thanthe thehigh-response high-responseend endmakes makesthe the data more asymmetric. Since this type of behavior is data more asymmetric. Since this type of behavior is encountered encounteredininimmunoassay immunoassaydata datamore moreoften oftenthan thanthe theformer, former, this approach is not satisfactory for routine data reduction. this approach is not satisfactory for routine data reduction. Even Evenwhen whenggfits fitsinto intothis thisoptimal optimalrange, range,a a5PL 5PLmodel modelwill will almost always result in a better fit and more accurate almost always result in a better fit and more accurate concentration concentrationestimates. estimates. When Whenthe the5PL 5PLmodel modelwas wasfirst firstproposed proposedas asthe themeans meanstoto obtain good fits to asymmetric assay data, many obtain good fits to asymmetric assay data, manyresearchers researchers recognized recognizedits itspotential potentialand andattempted attemptedtotointegrate integratethe the5PL 5PL into their assay analyses. They found that fitting a 5PL into their assay analyses. They found that fitting a 5PLmodel model totoassay assaydata dataisismuch muchmore moredifficult difficultthan thanfitting fittingthe the4PL 4PLmodel. model. The traditional numerical-fitting algorithms used for the The traditional numerical-fitting algorithms used for the4PL 4PL failed failedatatan anunacceptably unacceptablyhigh highrate; rate;that thatis,is,they theyeither eitherreturned returned an error or never finished. Worse still, in many of the an error or never finished. Worse still, in many of theassays assaysforfor which whichthe thealgorithm algorithmreturned returnedaafit, fit,the thefitfitwas wasclearly clearlynot notthe the best fit. best fit. Why Whythe the5PL 5PLModel ModelIsIsDifficult DifficulttotoFit Fit One Onereason reasonthat thatthe the5PL 5PLmodel modelisisso sodifficult difficulttotofitfitisisthat thatit itisis possible possibletotowildly wildlyadjust adjustits itsparameters parametersinintandem tandemso sothat thatthe the 5PL 5PLcurve curveitself itselfhardly hardlychanges changesininthe theplaces placeswhere wherethe thedata data are. are.When Whenone onecan canchange changethe theparameters parametersofofthe thecurve curveso sothat that the thecurve curvehardly hardlychanges changesrelative relativetotothe thedata datapoints, points,then thenthe the residuals residualsalso alsowill willhardly hardlychange. change.This Thisresults resultsininthe thevalue valueofof wSSE being largely unaffected by these tandem parameter wSSE being largely unaffected by theseoftandem parameter changes. Figure 4 shows an example a 5PL curve where changes. 4 shows ancexample of a 5PL curve changingFigure the values of the and g parameters of thewhere 5PL changing the has values the c on andthe g parameters of the 5PL substantially littleofeffect curve and therefore on the substantially hasThis little can effect the by curve therefore oninthe value of wSSE. beon seen the and wide, flat valley the value of wSSE. This can be seen by the valley in the plot where an algorithm could easily bewide, fooledflat into stopping plot where benear fooled early. Also,an thealgorithm flat plaincould on theeasily far left theinto g =stopping 0 axis could early. Also, the flatinto plainstopping on the far leftanear theworse g = 0value axis could fool an algorithm with much of fool an algorithm stopping with aofmuch worse of wSSE than in theinto valley. The values c and g thatvalue minimize wSSE thanofinwSSE the valley. The values are of ccand g that minimize the the value in this example = 16,748 and value of wSSE in this example are c = 16,748 and g = 2,899. g = 2,899. wSSE 40,000 40,000 5,000 5,000 20,000 20,000 10,000 0 15,000 2,000 g 20,000 4,000 4,000 6,000 c 25,000 25,000 Fig. 4. A plot of the wSSE surface of a 5PL curve fit against the Fig. 4. A plot of the wSSE surface of a 5PL curve fit against the parameters parameters c and g. The color of the plot corresponds to the slope of the c and g. The color of the plot corresponds to the slope of the surface. Blue hues surface. Blue hues are low slope, or “flat”. The values c = 16,748 and g = 2,899 are low slope, or “flat”. The values c = 16,748 and g = 2,899 minimize wSSE. minimize wSSE. This minimum falls in the middle of a flat plain that can fool This minimum falls in the middle of a flat plain that can fool algorithms into algorithms into stopping early or going haywire, or can cause other problems. stopping early or going haywire, or can cause other problems. Note also the Note also the plain on the far left near the g = 0 axis, where an algorithm could plain on the far left near the g = 0 axis, where an algorithm could easily stop easily stop without realizing that the valley where the true answer lies even exists. without realizing that the valley where the true answer lies even exists. To facilitate the interpretation of this geometric view, imagine that wSSE is the result of fitting a two-parameter curve model To facilitate the interpretation of this geometric imagine instead of the 5PL curve model. A region in theview, parameter that wSSE is the result of fitting a two-parameter curve model space where wSSE hardly changes even across wide changes instead of the 5PL curve model. A region in the parameter in the parameter values would be like a wide flat plain. A plot of spacecan where changes even sides acrossbut wide changes wSSE alsowSSE have hardly “ravines” with steep very level in the parameter values would be like a wide flat plain. A plot of bottoms. Unlike a person standing on a landscape scanning wSSE canfor also have “ravines” with steep sides the scene a low spot, a fitting algorithm has but onlyvery locallevel bottoms. Unlike a person standing on a landscape scanning information about wSSE and the history of where it has been the scene a low spot, fitting algorithm only local to guide itsfor next guess forathe location of thehas minimum of information about wSSE and the history of where it has been wSSE. It has no way to “see” where to go. to guide its next guess for the location of the minimum of The point in the space thattominimizes wSSE wSSE. It has noparameter way to “see” where go. provides the parameters of the best-fitting curve. Such a point point in theofparameter that surface. minimizes isThe at the bottom a valley onspace the wSSE If awSSE wide plain provides the parameters of the best-fitting curve. Such surrounds such a valley, an algorithm that does not usea point is at the bottom of a valley on the wSSE surface. If a wide plain advanced methods can be fooled into thinking that the plain surrounds such a valley, an algorithm that does not use is actually the floor of the best-fitting valley, causing it to stop advanced be fooled intoonthinking thesecond plain well short ofmethods the truecan minimum. Also, such athat plain, is actually the floor of the best-fitting valley, causing it to derivative-based methods such as Gauss-Newton and stop well short of the true minimum. on such a plain, secondLevenberg-Marquardt, the fittingAlso, methods most commonly derivative-based methods such as Gauss-Newton and used to fit 4PL curves, can be fooled into going in an entirely Levenberg-Marquardt, thethe fitting methods mostderivatives commonlyis wrong direction because matrix of second used to fit 4PL curves, can be fooled into going in an entirely very nearly singular. Such algorithms may hop around the wrong direction because the matrix of second derivatives is parameter space forever, or stop and report an error after their very nearly singular. Such algorithms may hop around the iteration limits are reached. Also, the near singularity of the second-derivative matrix can cause many numerical problems since it must typically be inverted to be used in algorithms that employ it. This can cause an algorithm to have unpredictable problems. Another problem that happens to algorithms on such flat plains in parameter space is that they can be going somewhere, probably not even in the right direction, very, very slowly. The result of this is that the algorithm may also never return because it never thinks it is “finished”. Lastly, the wSSE surface is often not one big “bowl” such that all a fitting algorithm needs to do is “slide” down to the bottom. Almost any fitting algorithm can handle such a simple scenario. Real wSSE surfaces have many possible valleys, the bottoms of which may be the correct best minimum but more likely are not. A successful algorithm must be able to find the valley with the lowest bottom, and then successfully find the bottom. This requires that the algorithm be able to find a starting point for the fitting algorithm so that the starting point is within the region of convergence to the correct minimum of wSSE of the fitting algorithm being used. This is perhaps the most difficult task in developing a reliable algorithm for fitting the 5PL model. Many algorithms will end up in the wrong valley, and will report that they have found the best fit, even though they are nowhere near it. n Is numerically stable and robust due to careful selection of appropriate numerical algorithms, careful customization and optimization of them, and carefully designed logic for handling exceptions Summary Weighted 5PL regression produces the most accurate concentration estimates of any method currently in use for asymmetric immunoassay data. Because of difficulties with fitting the 5PL curve model, the 5PL model is only now becoming widely used in conjunction with software like the Bio-Plex system’s Brendan logistic module and StatLIA. Since no immunoassay data are perfectly symmetric, fitting with a 5PL model will almost always result in better concentration estimates than using a 4PL model. This in turn allows greater dynamic ranges to be used, and produces greater accuracy in the results. The Brendan logistic module is able to fit the 5PL model better than other software. References Bates DM and Watts DG, Nonlinear Regression Analysis and Its Applications, New York: Wiley (1988) Draper NR and Smith H, Applied Regression Analysis, New York: Wiley (1981) Dudley RA et al., Guidelines for immunoassay data processing, Clin Chem 31, 1264–1271 (1985) Finney DJ, Statistical Methods in Biological Assays, 3rd ed, London: Charles Griffin (1978) The Brendan Logistic Module Fits the 5PL Better Than Other Software Finney DL and Phillips P, The form and estimation of a variance function, with particular reference to immunoassay, Appl Stat 26, 312–320 (1977) A better 5PL fit means that concentration estimates will be more accurate. More accurate concentration estimates lead o wider dynamic ranges for assays. The Brendan logistic module, used in Bio-Plex Manager™ version 3.0 software, does a better job of fitting the 5PL than other data reduction software. The Brendan logistic module performs better because it: Prentice RLA, A generalization of the probit and logit methods for dose response curves, Biometrics 32, 761–768 (1976) n n n n Uses optimal weighting with the 5PL curve model to correctly model the random error and thereby get the best possible fit Employs a number of sophisticated fitting algorithms that are able to traverse the “plains” and “level valleys” of the wSSE surface that can fool other algorithms Uses sophisticated logic to determine which algorithms are most effective to use during each phase of the fitting Uses advanced methods to ensure that the true global minimum of wSSE is obtained, not a false local minimum Raab GM, Estimation of a variance function, with application to immunoassay, Appl Stat 30, 32–40 (1981) Rodbard D et al., Radioimmunoassay and Related Procedures in Medicine 1, Vienna: International Atomic Energy Agency, 469–504 (1978) Wild D (ed), The Immunoassay Handbook, New York: Stockton Press (1994) About Brendan Scientific Brendan Scientific develops and markets StatLIA software for advanced computational analysis and workflow management of immunoassay-related technologies. Brendan’s core focus has been to combine proven scientific methods and robust mathematics with the processing speed and automation of software. This combination provides the highest level of accuracy and reliability with the workflow flexibility and automation required for today’s laboratory. StatLIA is a trademark of Brendan Scientific. The Bio-Plex suspension array system includes fluorescently labeled microspheres and instrumentation licensed to Bio-Rad Laboratories, Inc. by the Luminex Corporation. Information in this tech note was current as of the date of writing (2004) and not necessarily the date this version (rev A, 2007) was published. Bio-Rad Laboratories, Inc. Web site www.bio-rad.com USA 800 4BIORAD Australia 61 02 9914 2800 Austria 01 877 89 01 Belgium 09 385 55 11 Brazil 55 21 3237 9400 Canada 905 712 2771 China 86 21 6426 0808 Czech Republic 420 241 430 532 Denmark 44 52 10 00 Finland 09 804 22 00 France 01 47 95 69 65 Germany 089 318 84 0 Greece 30 210 777 4396 Hong Kong 852 2789 3300 Hungary 36 1 455 8800 India 91 124 4029300 Israel 03 963 6050 Italy 39 02 216091 Japan 03 5811 6270 Korea 82 2 3473 4460 Mexico 52 555 488 7670 The Netherlands 0318 540666 New Zealand 0508 805 500 Norway 23 38 41 30 Poland 48 22 331 99 99 Portugal 351 21 472 7700 Russia 7 495 721 14 04 Singapore 65 6415 3188 South Africa 27 861 246 723 Spain 34 91 590 5200 Sweden 08 555 12700 Switzerland 061 717 95 55 Taiwan 886 2 2578 7189 United Kingdom 020 8328 2000 Life Science Group Bulletin 3022 US/EG Rev A 07-0749 0907 Sig 1106
© Copyright 2026 Paperzz