SEQUENTIAL EVALUATION OF QNDE DEVICES FOR UNDERGROUND STORAGE TANKS Y. H. Michlin Quality Assurance and Reliability, Technion - Israel Institute of Technology, Haifa, 36812, Israel ABSTRACT. Data showing that QNDE devices for tightness testing of storage tanks require periodic precision checks under maximum reproduction of the field conditions. The measurement error was larger than the accuracy prescribed by standards, and much larger than that claimed by the manufacturer. In the paper, the algorithm based on the sequential approach for such tests, and the probability distributions of the number of measurements up to a positive/negative serviceability decision -are substantiated. INTRODUCTION Underground storage tanks (UST's) for fuels and liquid chemicals are a major source of environmental pollution. As a protective measure against leaks from these tanks, the procedure developed by the U.S. Environmental Protection Agency (EPA) as per [1] was adopted in Israel. According to it, selected UST's for petroleum products are required to undergo volumetric tank tightness testing (VTTT) with the aid of portable devices. In seeking to apply these devices in Israel, the following factors have to be borne in mind: (1) The tank population involved is relatively small compared with its North American counterpart. (2) Such devices are not manufactured here, and there is no infrastructure for their calibration and checking. (3) The tightness tests are requested by the owners, who also receive the reports; subsequently, at their discretion, they submit the reports (usually only the favorable ones) to the Ministry of the Environment. The first factor, in view of the high cost of the equipment, limits its inventory in Israel to one or two devices. This, in conjunction with the second factor, dictates extreme caution regarding the quality of the tests. Because of the third factor, the information received by the Ministry of the Environment derives from "truncated" samples. In these circumstances, it is difficult to assess the accuracy of the test equipment and the correctness of its application - with the attendant unreliability in predicting the state of the tank population as a whole. A similar problem is being encountered in the U.S.: according to an EPA document [2], a site survey failed to detect a single leaking tank, in spite of the existence of such cases. Statistical reliability of the decisions as to the serviceability of the device is conditional on a large number of repetitions, which may prove economically prohibitive. This number can reduced by recourse to sequential testing [3,4,5], but this principle has not yet been worked out for such equipment. The present paper discusses particular locally-obtained data on filling-station UST's. Means are proposed for improved operation and periodic control of the portable test CP657, Review of Quantitative Nondestructive Evaluation Vol. 22, ed. by D. O. Thompson and D. E. Chimenti © 2003 American Institute of Physics 0-7354-0117-9/03/S20.00 907 devices. An approach is presented for their sequential testing. Probability distributions of the necessary number of measurements prior to the decision are obtained. STATISTICS OF UST TESTING AND ITS ANALYSIS The basic portable VTTT device used in Israel is the UST 2000/P (USTest, Inc.). Measurement results from June '98 to February '01 for the rate of change of the fuel volume in tanks with different fuel types are presented in the form of distributions in Fig. 1. The designations 95, 96, 98 refer to gasolines with corresponding octane number. Table 1 summarizes the results, and lists the following: - The total population of inspected tanks for each fuel type; - The rejected percentage; - The loss percentage - the rejected tanks in which the volume decreased; - The gain percentage - the rejected tanks in which the volume increased; - The mean and standard deviation of the leak rate. The total population of inspected tanks was 830 - definitely a representative sample. The proportion of tanks which cannot be declared tight is improbably high - 36%, so we preferred to designate them as "rejected" rather than "not tight". It should be borne in mind that according to [1] the probability of false alarm for the VTTT device should not exceed 5%, while according to the work characteristic of the UST 2000/P [6], obtained on the basis of the EPA standard [7], it does not exceed 0.1%. The high percentage of rejected tanks (46%) was found for those with the 95 gasoline. The distinctive feature of the latter is that about one half of it is imported and the other half locally produced, while all the others are exclusively local and their composition shows less fluctuation. All tanks except those with the 95 gasoline show distributions with three clearly defined peaks. These trivariate distributions can be explained by the special features of the mode of allowance for the thermal expansion of the liquid and by errors in determining the coefficient of this expansion. As is known [8], there is a close correlation link between the density of a fuel and its thermal expansion coefficient (TEC). Table 2 shows that the densities of locally-used fuels are stable, hence so are the TEC's, which indicates a systematic error in their determination. The excessive STD found for the 95 gasoline is attributable to the already mentioned circumstance that about half of it is imported, with a density somewhat lower (on the average) than that of the local product. The STD of the fuel density and the percentage of rejected tanks are linked, with a correlation coefficient of 0.82. Thus the tanks of this fuel have a flattened distribution of the measured rate of volume change, and an especially high percentage of rejections. TABLE 1. Distribution of tanks ace. to measured rate of change of fuel volume. Overall Gasoline 95 Gasoline 96 Gasoline 98 Kerosene Diesel fuel Total, items 830 210 184 148 40 248 Rejected, % 37 46 36 33 20 35 Loss, % 15 18 17 12 10 15 Gain, % 22 28 19 21 10 21 Mean, ml/h 21 31 7 26 25 18 STD, ml/h 186 206 181 176 163 184 908 TABLE 2. Statistical characteristics of density of fuels used in Israel. 15°C, Aug.l999-July 2000. Gasoline 95 Gasoline 96 Gasoline 98 Diesel fuel Kerosene Mean, g/cm 0.760 0.754 0.774 0.849 0.805 STD, g/cm3 0.012 0.008 0.007 0.003 0.002 3 -315 | -252 -126 -63 j ~3?9 I -315 -189 426 I -63 0 126 i 189 1 252 | 315 189 1 252 379 I 315 FIGURE 1. Distributions of tanks by volume change rate. Accordingly, the disparity between the scatter of the data obtained with the UST 2000/P under field conditions on the one hand, and the characteristic obtained under standard test conditions [6,7] on the other - confirms the need for periodic checking of the device under field conditions, as is the practice for other measurement techniques [9], FIXED-SAMPLE-SIZE TEST FOR VTTT DEVICE As a VTT test takes several hours, it should be ascertained how many such tests (measurements) under controlled change of the tank content are needed for determining the accuracy of the equipment. As was shown by Table 1 the mean change rate is very close to zero. In these circumstances the systematic error of the device can be disregarded, and it suffices to characterize the accuracy by the standard deviation from the actual rate (standard error). The specification for the VTTT in [1] is a leak rate Z=0.1 gph with a detection probability P/r=0.95 and a false-alarm probability P^=0.05, and for the VTTT device the 909 specified leak threshold Lth is such that "a tank system should not be declared tight if the test result indicates a loss or gain that equals or exceeds this threshold" [6]. As the statistical data analyzed in this work do not cover the "tails" of the distributions, we shall - in accordance with the recommendation of [11,10,7] - assume a normal pattern, the same irrespective of the presence or absence of a leak. In these circumstances the standard error G should not exceed a limit a//,, defined as the smaller of two values: one based on the probability P& of a leak L being detected and the other on that of a false alarm PFA* For example, the specification for the UST 2000/P being Lth = 0.05 gph (189 ml/h), its standard error as per [1] should not exceed Grt = 97ml/h (1) According to its technical characteristic, however [6], P/>=0.999 and P/^-0.001, whence a,A = 58ml/h (2) which is less by a factor of 3 than the actual value reported in Table 1. Let us now, following the model of [12] formulate the problem of construction of a verification criterion for a null hypothesis (H<j) regarding the standard error of the VTTT device: namely, the technical parameters of the latter justify the assumption for the standard deviation (standard error) of its measurements G < GO (3) We first find the sample size (number of measurements) sufficient for our construction problem - a verification criterion with probability a for an error of the first kind (level of significance) and (3 for an error of the second kind for detection of a given excess v of the hypothetic limit, G > VGO (4) where v>l is the excess coefficient. The necessary sample size TV is determined from XYa,AM=vYWl (5) where %2i-a,#-i is the 100(1- a)-percentile point of the chi-square distribution. The calculated value of the necessary N according to a, (3 and v, are given in Fig. 2. The necessary N is very high, which may render the procedure economically unfeasible in the case of time-consuming measurements. 50 i 40- i 1 3° $> Z& 20I 10 1 o ! 2 1,5 Eieess eoeffletett *? FIGURE. 2. Necessary number of measurements N vs. excess coefficient v. l-OF=p=0.05; 2-CF=p=0.1. 910 SEQUENTIAL TESTING FOR VTTT DEVICE The average TV can be reduced by resorting to the methodology of sequential testing [3,4,5]. The decision limits for the mean-square measurement error a, assuming a normaldistributed error with zero mean as per Wald [3] are given by kn-c< <kn + c (6) where n is the serial number of the measurement; #/ is the deviation (measurement error) of the leak reported by the device in the z-th measurement from its real value; £ = <7 0 2 (2v 2 /(v 2 -l))lnv; (7) ca =<T02(2v2/(v2 -l))(-ln£); cr = <72 (2v2/(v2 -l))ln^ ; A = (l-p)/<x; B = p/(l-a). (8) If the sum in (6) is below the left-hand bound, hypothesis HQ is accepted, the measurements are discontinued and the device is recognized as serviceable; if it is above the right-hand bound, hypothesis H\ is accepted, the measurements are likewise discontinued and the device is rejected. If it falls between the bounds, another measurement is called for. For sequential tests of any type, the operating characteristic (OC) can be formulated in parametric terms [3]: = (Bh-\)l(Bh-Ah) (9) (T(h) = C7 O A (V 2 A -1 where h is the parameter. Fig. 3 shows characteristics obtained for specific combinations of the test parameters. A program modeling the sequential testing process was developed. It estimates the probability of acceptance of HQ, i.e. the actual OC of the test. As expected [4,5] these probabilities differ positively from the values yielded by (9). 1.0 • *mm as - 0.95 <*« o\ *& & <»' 0,6 8 ^ e a4£» s s^ 0.2 • ON 0,0 - J g 40 \\ % o% %> 0 Truncated \ .^^^^ test 0.0S ™™™T™™- 60 80 100 120 140 Standard error, ml/h FIGURE 3. Operating characteristic curves for sequential testing: 1. oc=p=0.05, °o=97 ml/h, v=1.5; 2 - cc=P=0.05, a0=58 ml/h, v=2; Truncated test for 1: anoc=0.99; Truncated test for 2: ana=0.96; 911 160 ISO 200 Distribution of Number of Measurements Prior to a Decision, Modification of Sequential Tests The modeling program referred to above yields the numbers of measurements necessary for an acceptance/rejection decision on the null hypothesis, as well as the distributions of these numbers and their average values. In Fig.4, the above average is plotted against the normalized standard error on=a/Go. The term "untruncated" refers to tests which are continued until the cumulative errorssquared cut the straight lines bounding the decision domain (6). In the cases where an falls between the limits l...v, the average number of measurements increases significantly, reaching the values N plotted in Fig. 2 for fixedsample-size tests. As at an>l the parameters of the tested device fall within the rejection domain, the following stopping rule was proposed for reduction of the number of measurements: On reaching a certain NS9 the test is discontinued, the HQ hypothesis rejected and the device categorized as unserviceable (truncated test). Ns is found by iterative approximation, using the above program, so as to yield the specified a, (3 and v. The specified a is obtained for aw=ana (<!)• For example, in an experiment with a=|3=0.05 and v^l.5, Ns=52 and a«a=0.99. In an experiment with a=(3=0.05 and v=2, Nj=l7 and ana-0.96. Fig. 5 presents estimated probability distributions of the number of measurements prior to the acceptance/rejection decision. All curves are seen to be heavily skewed, and the distributions - especially the most decisionwise-problematic intervals around the maxima of Fig.4 - have long "tails" extending far beyond the value of NS9 in which the truncated test is terminated. Compared with the untruncated version, the truncated tests have a smaller average number of measurements (see Fig. 4), especially in the interval between 1 and v. Compared with the fixed-sample-size tests, the economy is quite significant and opens the way to application of these tests to the VTTT devices. v~L5, imtnineated | 35& | 30- v—1,5, tsuncatod I 25 - v-29 untnmcated , truncated 1 JOJ ® m 5f 2 & 5 " 0 0 0.5 1 LS 2 Normalized standard error o« FIGURE 4. Average number of measurements in sequential tests: cc=p-0.05. 912 15 3 20 0 20 20 Number of measurements FIGURE 5. Estimated probability distributions of number of measurements prior to acceptance/rejection decision, for a-|3=0.05, v-2. N=ll - threshold of rejection due to exceeded critical number of measurements. DISCUSSION: RECOMMENDATIONS ON CHECKING OF VTTT DEVICE The results of the project indicate that VTTT devices have to be checked for compliance with the [1] requirements regarding UST's. The checks should be carried out under the closest possible approximation of the actual field conditions and include control of the proficiency of the equipment operators. The optimal milieu for these checks are active filling stations, with groups of five tanks and more. The fuel content must be changed artificially at the prescribed rates. To ensure statistical independence of the tests and minimize the effect of uncontrolled factors, repeat tests should be run at other stations, or even at the same one but after topping up of the tanks. Another important consideration is the economic feasibility of the procedure, in view of the large number of measurements involved. This choice is based on the OC and the necessary numbers of measurements. When the UST 2000/P is checked according to its technical characteristic [6], the recommended combination is aa=ath=58 ml/h, a=(3=0.05, v-2, awa=0.96. The operating characteristic (curve 2, Fig. 3) shows that the probabilities of rejection of a serviceable device [1] and acceptance of an unserviceable one [6] are both of them low. The average necessary number of measurements for the latter case (Fig. 4) will be much smaller than for the former one, and does not exceed 9.4 at aw=1.2. The number is limited here to 17. Another means of independent control of the devices is statistical analysis of the tank inspections carried out with them. By this means, there will be no need for too frequent inspections, and accuracy will be improved. The reported results can be used for estimation of the indices of stationary automatic tank gauging systems currently widespread in filling stations. CONCLUSIONS 1. The unique statistical data presented in this paper on the results of volumetric tightness tests on filling station tanks - permitted estimation of the accuracy of the 913 2. 3. 4. 5. equipment under field conditions, and outline control techniques for its state. The standard error of the leakage measurements was larger than the accuracy level prescribed by national standards, and much larger than that claimed by the manufacturer. VTTT devices must be regularly checked for compliance of their accuracy with national standards under closest possible approximation of the field conditions, including control of operator proficiency. For the case of a fixed-sample-size test, a criterion was established for acceptance/rejection of an accuracy hypothesis. The necessary number of measurements is very large, and for a time-consuming measurement the method may prove economically prohibitive. A modified algorithm is proposed for a reduced number of measurements, with a criterion for early termination of the procedure. The dependences of the average number of measurements were obtained. On the basis of the operating characteristics and the average necessary number data, the basic parameters of the sequential procedure are recommended. It is also suggested that the tests be supplemented with the statistics of volumetric tightness tests carried out on the tanks in question. This would permit improved estimate accuracy and reduction of the number of measurements. ACKNOWLEDGEMENTS This research project was supported by the Israel Ministries of Environment and of Absorption. The author is indebted to Mr. E. Goldberg for editorial assistance. REFERENCES 1. EPA (U.S. Environmental Protection Agency), "Rule 40 Part 280 — Technical Standards and Corrective Action Requirements for Owners and Operators of Underground Storage Tanks (UST)", 1988, 55 p. 2. Farahnak, Sh., D., Drewry M., M., "Are Leak Detection Methods Effective In Finding Leaks in Underground Storage Tank Systems?" (Report) California EPA, 1998. URL:http://www.swrcb.ca.gov/cwphome/ust/docs/leak_report.html, 3. Wald, A., Sequential Analysis, John Wiley & Sons, NY, 1947. 4. Handbook of sequential analysis, Ed. by Ghosh, B. K., Marcel Dekker, NY, 1991. 5. Siegmund, D., 1985. Sequential Analysis: Tests and Confidence Intervals, SpringerVerlag, NY. 6. EPA, "List of Leak Detection Evaluations for UST Systems", 8th Ed., 2001, 311 p. 7. EPA, "Standard Test Procedures for Evaluating Leak Detection Methods: Volumetric Tank Tightness Testing Methods", (EPA/530/UST-90/004). 1990 8. API (American Petroleum Institute), ASTM D 12550-80. "Petroleum Measurement Tables: Volume Correction Factors". API standard: 2540. Volume IX. 1980. 9. WELMEC (European Cooperation in Legal Metrology), WELMEC information, 2001. URL: http://www.welmec.org/countries.htm. 10. EPA, "List of Integrity Assessment Evaluations for Underground Storage Tanks" Third Ed., 1999. URL: http://www.epa.gov/swerustl/ustsystm/ialist3.pdf 11. Maresca, J. W. et al, J. of Hazardous Materials. 26, 261-300 (1991). 12. Bendat, J. S., Piersol, A. G., Random Data: Analysis and Measurement Procedures, Wiley-Interscience, New York, 1986, 566 pp. 914
© Copyright 2025 Paperzz