HEMATOPATHOLOGY Original Article

Exponentially Adjusted Moving Mean Procedure for Quality Control: An Optimized Patient Sample Control Procedure

FREDERICK A. SMITH, MD, AND STEVEN H. KROFT, MD

The idea of using patient samples as the basis for control procedures elicits a continuing fascination among laboratorians, particularly in the current environment of cost restriction. Average of normals (AON) procedures, although little used, have been carefully investigated at the theoretical level. The performance characteristics of Bull's algorithm have not been thoroughly delineated, however, despite its widespread use. The authors have generalized Bull's algorithm to use variably sized batches of patient samples and a range of exponential factors. For any given batch size, there is an optimal exponential factor to maximize the overall power of error detection. The optimized exponentially adjusted moving mean (EAMM) procedure, a variant of AON and Bull's algorithm, outperforms both parent procedures. As with any AON procedure, EAMM is most useful when the ratio of population variability to analytical variability (standard deviation ratio, SDR) is low. (Key words: Quality control; Patient samples; Bull's algorithm; Average of normals) Am J Clin Pathol 1996;105:44-51.

In 1974, Bull and colleagues1 published a procedure designed to assess quality control (QC) of erythrocyte indices measured by automated hematology analyzers using patient data rather than control materials. This has since become known as "Bull's algorithm." Quality control indices using patient data are particularly desirable in hematology because of the expense and instability of available manufactured control materials.2 In addition, commercial controls may possess very different physical properties than the patient samples that they should ideally mimic within the analytical environment.
This limits their effectiveness in the detection of analytical errors that may significantly affect patient results.2

The simplest quality control procedure using patient samples is the average of normals (AON) procedure, as described by Hoffman and Waid in 1965.3 This procedure signals an error condition if the average of a selected number of patient samples falls beyond predetermined control limits set around the mean of a designated "normal" patient population. Because abnormal samples may be present within any given patient population, it is generally necessary to set truncation limits, outside of which results are excluded from analysis. The appropriate position of the truncation limits depends on the desired sensitivity of the procedure as well as the percentage of abnormals within the population.4 In addition, it has been demonstrated that AON performs best when the standard deviation of the patient population (s_pop) is not much larger than the analytical standard deviation (s_meas).4 For example, AON is much more sensitive to detection of analytical bias for an assay with a standard deviation ratio (SDR = s_pop/s_meas) of two than for an assay with an SDR of eight (Table 1).5 Note that s_pop is a resultant of the underlying between-patient biologic standard deviation (s_bio) and the s_meas. This relationship is expressed as:

    s_pop = sqrt(s_bio^2 + s_meas^2)   (1)

Bull's algorithm is a variation of the AON procedure developed specifically for application to red cell indices on automated analyzers. Although it is widely incorporated into commercial hematology analyzers, it has historically resisted clear explanation of how and why it works. This may be partially due to its considerable notational complexity, which in fact has led to misprinting by reputable authors.2 The algorithm as described by Bull and colleagues1 is as follows:

From the Department of Pathology, Northwestern University Medical School, Chicago, Illinois.
Manuscript received June 15, 1995; revision accepted August 30, 1995. Address reprint requests to Dr. Kroft: Department of Pathology, Passavant Pavilion Rm. 316, 303 E. Superior, Chicago, IL 60611.

    X_B,i = X_B,i-1 + d   (2)

where X_B,i is the current Bull's mean and X_B,i-1 is the previous Bull's mean. Thus, the new Bull's mean after a batch or run of n samples of analyte X is computed by adding a calculated "signed function" d to the previous Bull's mean. The function d is defined as:

    d = sgn(S) * |S/n|^(1/P),
    where S = sum over j = 1 to n of sgn(X_j - X_B,i-1) * |X_j - X_B,i-1|^P   (3)

Traditionally, the exponential factor (P) = 1/2 and n = 20. Thus, d is calculated for a batch of 20 values of analyte X by taking the square, with sign maintained, of the average of the square roots, with sign maintained, of the differences between X_1 through X_20 and X_B. It is important to realize that the bulk of the formula is present simply to maintain the sign of the function d, so that X_B is incremented in the same direction as the predominant deviation of the twenty values of X in the batch. A critical feature of the formula is that it averages the square roots of the deviations of individual points from the previous Bull's mean and then squares that average. Because larger numbers are "reduced" more by taking their square root than smaller numbers, this formulation serves to reduce the effect of points that are far from the previous Bull's mean compared to points that are close to it. This effectively narrows the distribution of the X_B signal compared to AON by dampening both the random variation of the normal population and the effect of abnormal patient outliers. Finally, the effect of calculating d relative to the previous Bull's mean is to maintain maximum sensitivity to deviations around the current operating position of the method, thus making the measure particularly responsive to stable bias or progressive increases in bias in one direction. This imparts a trend-like behavior to the procedure. However, because of the dampening inherent in the process, Bull's algorithm would be expected to respond slowly to analytical bias compared to an AON procedure, as on a given run the Bull's mean does not completely shift to the position of the true mean of that run. The error detection would be expected to "pick up speed" over multiple runs, however, because of the trend effect. Note that Bull's algorithm will not detect increases in random error.

TABLE 1. TYPICAL STANDARD DEVIATION RATIO (SDR) VALUES FOR REPRESENTATIVE CLINICAL ASSAYS

    Assay               SDR
    MCHC                1.7
    Calcium             2.6
    Sodium              2.7
    Chloride            2.9
    Prothrombin time    3.0
    Creatinine          4.0
    Potassium           5.4
    Glucose             7.0
    MCH                 7.4
    CO2                 7.8
    Cholesterol         9.1
    MCV                11.0
    Urea nitrogen      13.0
    Hemoglobin         23.0

MCHC = mean corpuscular hemoglobin concentration; MCH = mean corpuscular hemoglobin; MCV = mean corpuscular volume.

To date, Bull's algorithm has been evaluated only for the quality control of the red cell indices for which it was originally designed,1,6-9 whereas we believe it has potential applications beyond hematology analyzers. Furthermore, although in their original article Bull and colleagues state that a value of P = 1/2 performed best, no data on the effect of varying P were presented. Also, the use of n other than 20 has never been investigated. We view Bull's algorithm, as currently used, as a specific example of a more generalized quality control procedure, which we call the Exponentially Adjusted Moving Mean (EAMM), which in turn is a modified AON procedure. In fact, when P = 1, EAMM is identical to an untruncated AON. Our goal in this study was to describe the general performance characteristics and behavior of the EAMM over a range of values of n, P, and SDR over multiple runs. We used a QC simulation software program written by one of the authors (F.A.S.)
to determine the rates of rejection (error detection) of simulated analytical runs with varying degrees of bias imposed on a normally distributed population. To ensure valid comparisons of different values of P, it was necessary to normalize the probability of falsely rejecting a run. We chose a fixed rate of 0.3% per run of n patient samples, inasmuch as the EAMM test statistic is not calculated until n patient samples have accumulated. The sample run thus constitutes the unit of decision (to accept or reject an analytical run) of the EAMM procedure. We deemed this very low rate of false rejection to be a reasonable target for the EAMM procedure in order to make the procedure acceptable for routine use with high-volume analyses. Although the EAMM procedure can be adjusted to yield false rejection at any level, a rate of 0.003 per run, regardless of run length, is analogous to setting the limit of an AON rule at three times the standard error of the mean, as has typically been reported.2-4 Like AON, we expected the EAMM to be most effective at low SDR. We also expected the optimal values of P and n to be dependent on several factors, including the SDR, the desired performance characteristics of the procedure, and the percentage of abnormals in the population. The last factor we will evaluate in a future study, the current analysis being limited to normally distributed populations.

MATERIALS AND METHODS

The quality control simulation program is written in QuickBASIC 4.0 (Microsoft, Redmond, WA) for use on MS-DOS-compatible microcomputers. The simulations were run on a variety of personal computers powered by i486 and Pentium microprocessors (Intel, Santa Clara, CA). The simulator represents an extension of a previously described simulation system10-12 to include a variety of patient sample-based control procedures as well as classical, known-value control algorithms.
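The generalized update of equations 2 and 3 that the simulator evaluates can be sketched in a few lines. The following is our illustration only (the actual program was written in QuickBASIC, and the function name is ours); P = 0.5 with batches of 20 reproduces the traditional Bull's algorithm, while P = 1 reduces to an untruncated AON mean:

```python
import math

def eamm_update(prev_mean, batch, p=0.5):
    """One generalized Bull's / EAMM update (equations 2 and 3).

    Averages the signed P-th powers of the deviations of the batch
    values from the previous moving mean, then raises that average
    (sign maintained) to the 1/P power and adds it to the mean.
    """
    # Signed sum of |deviation|^P terms (the quantity S in equation 3).
    s = sum(math.copysign(abs(x - prev_mean) ** p, x - prev_mean)
            for x in batch)
    avg = s / len(batch)
    # d = sgn(avg) * |avg|^(1/P); sign preserved throughout.
    d = math.copysign(abs(avg) ** (1.0 / p), avg)
    return prev_mean + d
```

Note the dampening: with P = 0.5, a batch whose deviations are +1 and +9 moves the mean by 4, whereas a plain average would move it by 5, so distant points are down-weighted relative to nearby ones.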
In the EAMM procedure simulation, the following parameters were under our control: the number of patient samples per run (n); the exponential factor (P); the truncation/exclusion limit for outliers; the flagging limit for the EAMM signal; and the analytical and population variances. The simulator generates simulated patient data points using a file of pre-generated gaussian integers with a mean of 0 and a standard deviation of 1,000. These are scaled to the user-defined s_bio and added to the patient population mean to yield the "true" patient sample values. To each sample value is added analytical bias, if any, and analytical variation generated analogously from a separate file of gaussian integers. The simulator allows generation of abnormal samples distributed randomly among normal patient samples, but for this study, only a normal population was used. The use of pre-generated random numbers to simulate both biological and analytical variations ensures that each QC algorithm is exposed to identical datasets, and thus comparisons between different algorithms are strictly fair, if arguably not strictly random. Flagging limits that set a false rejection rate of 0.003 were determined by a simple simulation of the EAMM procedure using 200,000 sequential gaussian data points at different n and P. After each batch of n points, the magnitude of the EAMM signal was tabulated and the 99.7th percentile determined. Output from the simulation program was incorporated into a Lotus 1-2-3 spreadsheet (Lotus Development, Cambridge, MA) for analysis and graphic display. Each statistical power curve was fitted empirically to the equation:

    y = 1 - e^(-(bx)^m)   (4)

The parameters b and m were determined by a least squares fit of y on x using the Lotus 1-2-3 solver function. This equation is useful to produce smoothed curves that allow for interpolation at any point, as well as for economical transfer of the curves for use (eg, in the design of control systems for routine use).
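The flagging-limit determination described above can be sketched as a small Monte Carlo routine. This is a simplified illustration of our method, not the original code: it uses a standard gaussian population (mean 0, sd 1) with no truncation, so the returned limit would be scaled by s_pop for a real assay, and the function name and defaults are ours:

```python
import math
import random

def flagging_limit(n, p, n_points=200_000, fr=0.003, seed=1):
    """Estimate the EAMM flagging limit giving false-rejection rate fr.

    Simulates in-control batches of n standard-gaussian values, applies
    the generalized EAMM update after each batch, tabulates the
    magnitude of the EAMM signal (deviation of the moving mean from the
    target of 0), and returns its (1 - fr) percentile -- e.g. the 99.7th
    percentile for fr = 0.003.
    """
    rng = random.Random(seed)
    mean, signals = 0.0, []
    for _ in range(n_points // n):
        batch = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Inline EAMM update (equations 2 and 3).
        s = sum(math.copysign(abs(x - mean) ** p, x - mean) for x in batch)
        avg = s / n
        mean += math.copysign(abs(avg) ** (1.0 / p), avg)
        signals.append(abs(mean))
    signals.sort()
    idx = min(len(signals) - 1, int((1.0 - fr) * len(signals)))
    return signals[idx]
```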
Simulations were performed for n = 20, 30, 40, 60, 80, and 90 for P of 0.3 through 0.7 at intervals of 0.05, with error conditions ranging from zero to at least 4.5 times the s_meas. For each n, P, and error level, a total of 1,000 series of 9 runs each were simulated to calculate probabilities of rejection. After this preliminary analysis, for each n the range of P values within which the procedure was most sensitive was reanalyzed at intervals of 0.01 for P. The optimal value of P for each n was determined by summing the area over the power function curves for runs 1 through 9 over values of analytical bias ranging from 0 to the value that yielded 100% rejection on the first run. Greater sensitivity to persistent analytical bias is demonstrated by a smaller summed area over the power curve set.

RESULTS

A representative set of power curves for n = 60 and P = 0.69 is demonstrated in Figure 1. These demonstrate the probability of rejection on the first run over a range of error conditions (expressed as multiples of s_meas) for various values of SDR. As predicted, sensitivity for a given level of systematic error increases rapidly as SDR is reduced. This was true for all values of n and P, as well as over multiple runs. When the families of curves demonstrated in Figure 1 are normalized for SDR, it is seen that they superimpose exactly (see below). Note that when the systematic error expressed as multiples of s_meas (systematic error/s_meas) is divided by SDR (s_pop/s_meas), the s_meas term cancels out and the resultant is systematic error in terms of s_pop (systematic error/s_pop). Therefore, the remainder of the analyses will be presented in terms of composite curves normalized for SDR, with error expressed as multiples of s_pop (see below). These composite curves may be closely approximated by a least squares fitted curve according to equation 4 above (see below).
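The smoothed power-curve fit and the area-over-the-curve sensitivity metric can be sketched as follows. The functional form shown for equation 4 is our reconstruction (a Weibull-type curve with the two fitted parameters b and m), and the trapezoidal integration is an illustration of the area criterion, not the Lotus 1-2-3 implementation:

```python
import math

def fitted_power(x, b, m):
    """Smoothed power curve (equation 4 as reconstructed here):
    probability of rejection at systematic error x (multiples of s_pop)."""
    return 1.0 - math.exp(-((b * x) ** m))

def area_over_curve(b, m, x_max, steps=1000):
    """Area above the power curve (and below 1) from 0 to x_max,
    by the trapezoidal rule; a smaller area indicates greater
    cumulative sensitivity to persistent bias."""
    dx = x_max / steps
    xs = [i * dx for i in range(steps + 1)]
    ys = [1.0 - fitted_power(x, b, m) for x in xs]
    return dx * (sum(ys) - 0.5 * (ys[0] + ys[-1]))
```

A steeper fitted curve (larger b) rejects errors sooner and therefore leaves a smaller area above the curve, which is how the optimal P is ranked in the analysis above.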
The effect of increasing n is as predicted, with dramatic increases in sensitivity with increases in n over the range we investigated (Fig. 2). It should be noted that the increase in sensitivity is not arithmetic. For instance, 3 runs of 30 is not equivalent to 1 run of 90. In fact, 1 run of 90 is more sensitive to error than 3 runs of 30, even though it has a lower false rejection rate (0.003 compared to approximately 0.009). The effect of varying P is demonstrated cumulatively over multiple runs in Figure 3 for n = 60 and a fixed level of systematic error.

FIG. 1. Probability of rejection as a function of systematic error for n = 60, P = 0.69, over a range of SDR: (1) first-run rejection; (2) cumulative third-run rejection.

Note that the best performance is obtained on the first run for P = 1 (equivalent to untruncated AON). However, this advantage is quickly lost over multiple runs, and by run nine all values of P demonstrated except 0.3 are performing better than AON. This was seen over all values of n except 20. Interestingly, at this level of n, values of P less than 1 were almost exactly as sensitive as AON, even on the first run, as demonstrated in Figure 4. The optimal P levels for various values of n are shown in Table 2. These vary between 0.63 and 0.70. Families of curves for optimized P for different levels of n are shown in Figure 5. Curves for P = 1 and P = 0.5 are given for comparison. Fitted curves are superimposed on observed curves in these figures.

DISCUSSION

As expected, the EAMM, like the closely related AON, is most sensitive to systematic error (bias) for assays in which the ratio of s_pop to s_meas is small.
The EAMM is highly responsive to quite small biases for SDR = 3, but this sensitivity falls off quickly as SDR increases toward 7. However, it should be pointed out that for diagnostic use, higher degrees of bias, expressed as multiples of s_meas, may be more tolerable for assays with larger SDR. This is because when s_pop/s_meas is large, small biases may be insignificant compared to the intrinsic variability present within the population, and may therefore not affect diagnostic decisions. However, the relative importance of various amounts of analytical error depends on both the particular characteristics of the assay and analyte as well as the specific application of the test. For example, consider an analyte with a large between-patient variability, but which is under very tight physiologic control in an individual patient. Small analytical biases would be unimportant if the assay were to be used as a screening procedure, but may assume profound importance if used to follow an important parameter in an individual patient over time in a critical care setting. Examples of SDR values for some common assays are shown in Table 1.5 An analysis of the curves for a range of SDR at a given n and P revealed that when the magnitude of error is corrected for differences in SDR (expressed as fractions or multiples of s_pop), the curves are exactly superimposable. These composite curves can be closely approximated by an equation of the form given in equation 4. Thus, the ability of the EAMM to detect a given level of error can be predicted directly from the ratio of the error to the s_pop. This represents the mathematical basis for the observed increased sensitivity at lower SDR. It can be explained intuitively in the following way: Consider that we are monitoring differences in the mean of our normal population to detect drift in our analytical process.
In effect, the s_pop represents the "noise" against which we are attempting to detect the "signal" (ie, the analytical error): the smaller the signal-to-noise ratio, the lower the power of the procedure. This signal-to-noise ratio represents the sole determinant of the sensitivity of the control procedure, regardless of the value of SDR.

The effect of increasing n, the number of samples analyzed per run, was not surprising in that larger values of n markedly increased the sensitivity to bias. Also, as pointed out earlier, large values of n provide better error detection, with less false rejection, than equal numbers of samples analyzed in smaller batches. However, the trade-off of using large values of n is fewer QC data points at less frequent intervals. In addition, at very large n (eg, n > 100), the procedure may actually be too sensitive, as medically insignificant analytical error might signal error conditions at an unacceptably high rate, thus increasing the functional false rejection rate.

FIG. 2. Probability of rejection for n = 20, 30, 40, 60, and 80 for a fixed P (0.7): (A) first-run rejection; (B) cumulative third-run rejection.

FIG. 3. Probability of rejection as a function of P for runs 1, 2, 3, 4, 5, 7, and 9 for n = 60 at a fixed level of systematic error (0.3 x s_pop).

Thus, the ideal n for the EAMM depends on the desired sensitivity for detecting various amounts of error as well as the number of samples analyzed daily in a given lab.
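The non-arithmetic gain in sensitivity with increasing n can be illustrated analytically for the simpler untruncated AON with control limits at three standard errors of the mean. This sketch is our addition, not part of the original analysis; it uses the gaussian tail probability to show that one run of 90 detects a given bias better than three independent runs of 30:

```python
import math

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def aon_power(bias, n):
    """Single-run rejection probability for an untruncated AON with
    control limits at +/-3 standard errors of the mean.  bias is the
    systematic error expressed in multiples of s_pop."""
    shift = bias * math.sqrt(n)  # bias measured in SEM units
    return (1.0 - norm_cdf(3.0 - shift)) + norm_cdf(-3.0 - shift)
```

For a bias of 0.5 x s_pop, one run of 90 rejects more often than the cumulative probability over three runs of 30, even though the single larger run also has the lower total false-rejection rate, in line with the behavior reported above for the EAMM.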
An advantage of patient data QC procedures over traditional control procedures is that they allow the accrual of large numbers of QC data points per day for minimal cost. For low-volume assays, high values of n may prohibit the generation of sufficient data points to effectively monitor an analyzer's operation. However, a high-volume assay may easily accommodate large values of n for the EAMM calculation.

FIG. 4. Comparison of probability of rejection versus error for optimized EAMM and AON at n = 20 over runs 1, 2, and 5 (right to left).

FIG. 5. Probability of rejection versus systematic error (experimental and fitted) for P = 0.5 (Bull's algorithm), P = optimal level, and P = 1 (AON) over runs 1, 2, 3, 5, and 9 (right to left): (A) n = 20; (B) n = 30; (C) n = 60; and (D) n = 80.

Varying the value of P in the EAMM had profound effects on the sensitivity of the analysis, as predicted (Fig. 3). High values of P (eg, 0.9 or 1.0) showed the best early-run performance. Intermediate values of P (eg, 0.5-0.7) performed less well in early runs, but showed better performance over multiple runs than higher values. At very low values of P (eg, 0.3) the performance remained poor both in early runs and over multiple runs.

TABLE 2. OPTIMAL VALUES OF P FOR VARIOUS n

    n      Optimal P
    20     0.66
    30     0.64
    40     0.63
    60     0.69
    80     0.70
    90     0.63
Thus, it appears that two opposing factors influenced the performance of the EAMM: as P is lowered, there is a progressive increase in the dampening of errors, resulting in a loss of information on early runs; however, there is a simultaneous increase in the trend effect, producing an advantage over multiple runs. It turns out that for each value of n there is a point at which these two effects balance out and produce a maximal sensitivity to error, as given in Table 2. These values always fell between 0.63 and 0.70. Thus, the optimized EAMM detected error better than either the traditional Bull's algorithm or the AON. An interesting exception to the previously described behavior was that the optimized EAMM had essentially equal first-run sensitivity to AON for low n (n = 20). This may be explained by the fact that the dampening effect on random error imparted by the exponential function allows narrower error limits than AON for the same level of false rejection, thus improving sensitivity. However, this first-run equivalence is lost for higher values of n, such that AON is superior to EAMM in sensitivity on the first run after error is introduced. This reflects the fact that when n is high (eg, >30), the mean of a given run will very accurately and reliably reflect the mean of the entire population. This will occur regardless of the standard deviation of the population values, such that the advantage of symmetrical dampening of random error, with resultant narrower error limits, imparted by the EAMM is lost. To reiterate, the cumulative sensitivity of EAMM over multiple runs will be superior to AON even for high values of n because of the trend effect in EAMM. It must be realized that our current analysis is limited to normally distributed populations lacking abnormal components. Therefore, our results as described are strictly applicable only to assays and patient populations that have a very small percentage of abnormals.
We expect that as the percentage of abnormals within a population increases, the sensitivity for a given level of false rejection will decrease. The AON procedure handles outliers by applying truncation limits outside of which data are excluded from analysis. For populations without outliers, the procedure performs best without any truncation. The optimal truncation limits progressively narrow as the percentage of outliers increases. Analogously, we would expect the optimal P value for the EAMM to decrease (increased "trim") as the percentage of outliers increases. Alternatively, one could apply truncation limits to the EAMM procedure. The effects of outliers on performance and the optimal methods of correcting for them are currently under investigation and will be the subject of a subsequent report. Ultimately, we imagine that a customized EAMM procedure could be easily incorporated into computerized laboratory quality control systems. Once installed, it would provide essentially cost-free quality control, although it will likely not replace traditional control materials. In fact, perhaps the most promising use for these procedures is as an event gauge, or signal, to run a known-value control. For example, the sensitivity of the procedure could be set at a more stringent level than clinically necessary. An error signal, then, would not precipitate rejection of a run, but would rather initiate the running of traditional controls. This could serve to decrease the use of expensive control materials while maintaining good control of the system. For instance, using an n of 60 with false rejection set at 0.3%, one false positive result would occur every 20,000 specimens, thus necessitating the use of traditional control materials only at very infrequent intervals in the absence of true bias. At an SDR of 5, a systematic error of magnitude two times s_meas would be detected on the first run after introduction of error 54% of the time.
This increases to 88% after two runs and 98% after three runs. However, the acceptable false rejection rate in such a system could be set much higher, as the cost of a false positive result would not be the rejection of an entire analytical run, but simply the cost of a sample of control material. Therefore, the sensitivity of the assay could be markedly increased. In addition, a multi-rule version of EAMM could be used to improve performance, as described for Bull's algorithm.7

REFERENCES

1. Bull BS, Elashoff RM, Heilbron DC, Couperus J. A study of various estimators for the derivation of quality control procedures from patient erythrocytic indices. Am J Clin Pathol 1974;61:473-481.
2. Cembrowski GS, Carey RN. Quality control in hematology. In: Laboratory Quality Management. Chicago: ASCP Press, 1989, pp 186-212.
3. Hoffman RG, Waid NE. The "average of normals" method of quality control. Am J Clin Pathol 1965;43:134-141.
4. Cembrowski GS, Chandler EP, Westgard JO. Assessment of "average of normals" quality control procedures and guidelines for implementation. Am J Clin Pathol 1984;81:492-499.
5. Cembrowski GS. Use of patient data for quality control. Clin Lab Med 1986;6:715-733.
6. Cembrowski GS, Westgard JO. Quality control of multichannel hematology analyzers: Evaluation of Bull's algorithm. Am J Clin Pathol 1985;83:337-345.
7. Levy WC, Hay KL, Bull BS. Preserved blood versus patient data for quality control: Bull's algorithm revisited. Am J Clin Pathol 1986;85:719-721.
8. Lunetzky ES, Cembrowski GS. Performance characteristics of Bull's multirule algorithm for the quality control of multichannel hematology analyzers. Am J Clin Pathol 1987;88:634-638.
9. Tramacere P, Marocchi A, Gerthoux P, et al. Inefficacy of moving average algorithm as principal quality control procedure on Technicon system H6000. Am J Clin Pathol 1991;95:218-221.
10. Smith FA.
The effects of long-term components of variance on the performance of rules for statistical quality control (Abstr). Clin Chem 1987;33:210.
11. Smith FA, Cossitt NL. Simulated comparison of multi-rule protocols for statistical quality control (Abstr). Clin Chem 1987;33:909.
12. Smith FA. Statistical power functions of multi-point rules (Abstr). Clin Chem 1986;32:1183-1184.