
SEQUENTIAL EVALUATION OF QNDE DEVICES FOR
UNDERGROUND STORAGE TANKS
Y. H. Michlin
Quality Assurance and Reliability, Technion - Israel Institute of Technology,
Haifa, 36812, Israel
ABSTRACT. Data are presented showing that QNDE devices for tightness testing of storage tanks
require periodic accuracy checks under conditions reproducing the field conditions as closely as
possible. The measurement error was larger than the accuracy prescribed by standards, and much
larger than that claimed by the manufacturer. An algorithm based on the sequential approach is
substantiated for such checks, together with the probability distributions of the number of
measurements up to a positive/negative serviceability decision.
INTRODUCTION
Underground storage tanks (UST's) for fuels and liquid chemicals are a major source
of environmental pollution. As a protective measure against leaks from these tanks, the
procedure developed by the U.S. Environmental Protection Agency (EPA) as per [1] was
adopted in Israel. According to it, selected UST's for petroleum products are required to
undergo volumetric tank tightness testing (VTTT) with the aid of portable devices.
In seeking to apply these devices in Israel, the following factors have to be borne in
mind: (1) The tank population involved is relatively small compared with its North
American counterpart. (2) Such devices are not manufactured here, and there is no
infrastructure for their calibration and checking. (3) The tightness tests are requested by
the owners, who also receive the reports; subsequently, at their discretion, they submit the
reports (usually only the favorable ones) to the Ministry of the Environment.
The first factor, in view of the high cost of the equipment, limits its inventory in Israel
to one or two devices. This, in conjunction with the second factor, dictates extreme
caution regarding the quality of the tests. Because of the third factor, the information
received by the Ministry of the Environment derives from "truncated" samples. In these
circumstances, it is difficult to assess the accuracy of the test equipment and the
correctness of its application - with the attendant unreliability in predicting the state of the
tank population as a whole.
A similar problem is being encountered in the U.S.: according to an EPA document
[2], a site survey failed to detect a single leaking tank, in spite of the existence of such
cases.
Statistical reliability of the decisions as to the serviceability of the device is conditional
on a large number of repetitions, which may prove economically prohibitive. This number
can be reduced by recourse to sequential testing [3,4,5], but this principle has not yet been
worked out for such equipment.
The present paper discusses particular locally-obtained data on filling-station UST's.
Means are proposed for improved operation and periodic control of the portable test
CP657, Review of Quantitative Nondestructive Evaluation Vol. 22, ed. by D. O. Thompson and D. E. Chimenti
© 2003 American Institute of Physics 0-7354-0117-9/03/$20.00
devices. An approach is presented for their sequential testing. Probability distributions of
the necessary number of measurements prior to the decision are obtained.
STATISTICS OF UST TESTING AND ITS ANALYSIS
The basic portable VTTT device used in Israel is the UST 2000/P (USTest, Inc.).
Measurement results from June '98 to February '01 for the rate of change of the fuel
volume in tanks with different fuel types are presented in the form of distributions in Fig.
1. The designations 95, 96, 98 refer to gasolines with corresponding octane number.
Table 1 summarizes the results, and lists the following:
- The total population of inspected tanks for each fuel type;
- The rejected percentage;
- The loss percentage - the rejected tanks in which the volume decreased;
- The gain percentage - the rejected tanks in which the volume increased;
- The mean and standard deviation of the leak rate.
The total population of inspected tanks was 830 - definitely a representative sample.
The proportion of tanks which cannot be declared tight is improbably high - 37% (Table 1), so
we preferred to designate them as "rejected" rather than "not tight". It should be borne in
mind that according to [1] the probability of false alarm for the VTTT device should not
exceed 5%, while according to the performance characteristic of the UST 2000/P [6], obtained
on the basis of the EPA standard [7], it does not exceed 0.1%.
The highest percentage of rejected tanks (46%) was found for those with the 95 gasoline.
The distinctive feature of this fuel is that about one half of it is imported and the other half
locally produced, while all the others are exclusively local and their composition shows
less fluctuation.
All tanks except those with the 95 gasoline show distributions with three clearly
defined peaks. These trimodal distributions can be explained by the special features of the
mode of allowance for the thermal expansion of the liquid and by errors in determining the
coefficient of this expansion. As is known [8], there is a close correlation between the
density of a fuel and its thermal expansion coefficient (TEC). Table 2 shows that the
densities of locally-used fuels are stable, hence so are the TEC's, which indicates a
systematic error in their determination. The excessive STD found for the 95 gasoline is
attributable to the already mentioned circumstance that about half of it is imported, with a
density somewhat lower (on the average) than that of the local product. The STD of the
fuel density and the percentage of rejected tanks are linked, with a correlation coefficient
of 0.82. Thus the tanks of this fuel have a flattened distribution of the measured rate of
volume change, and an especially high percentage of rejections.
TABLE 1. Distribution of tanks according to measured rate of change of fuel volume.
                Overall   Gasoline 95   Gasoline 96   Gasoline 98   Kerosene   Diesel fuel
Total, items        830           210           184           148         40           248
Rejected, %          37            46            36            33         20            35
Loss, %              15            18            17            12         10            15
Gain, %              22            28            19            21         10            21
Mean, ml/h           21            31             7            26         25            18
STD, ml/h           186           206           181           176        163           184
TABLE 2. Statistical characteristics of density of fuels used in Israel. 15°C, Aug. 1999 - July 2000.
                Gasoline 95   Gasoline 96   Gasoline 98   Diesel fuel   Kerosene
Mean, g/cm3           0.760         0.754         0.774         0.849      0.805
STD, g/cm3            0.012         0.008         0.007         0.003      0.002
FIGURE 1. Distributions of tanks by volume change rate.
Accordingly, the disparity between the scatter of the data obtained with the UST
2000/P under field conditions on the one hand, and the characteristic obtained under
standard test conditions [6,7] on the other, confirms the need for periodic checking of the
device under field conditions, as is the practice for other measurement techniques [9].
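As a rough plausibility check of this disparity, the sketch below computes the share of readings on perfectly tight tanks that would fall outside the rejection threshold, using the overall mean and STD of Table 1 and the 189 ml/h threshold of the UST 2000/P. The normal error model and the function name `rejected_fraction` are assumptions made here for illustration; they are not part of the original analysis.

```python
from statistics import NormalDist

def rejected_fraction(mean, std, threshold):
    """Share of normally distributed readings with |rate| >= threshold."""
    dist = NormalDist(mean, std)
    upper = 1.0 - dist.cdf(threshold)   # readings at or above +threshold
    lower = dist.cdf(-threshold)        # readings at or below -threshold
    return lower + upper

# Overall mean and STD from Table 1 (ml/h); threshold 0.05 gph = 189 ml/h.
fraction = rejected_fraction(mean=21.0, std=186.0, threshold=189.0)
print(f"{fraction:.1%}")
```

Under these assumptions roughly 30% of readings on tight tanks would be rejected, which is of the same order as the rejected share in Table 1 and far above the 5% false-alarm level of [1].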
FIXED-SAMPLE-SIZE TEST FOR VTTT DEVICE
As a VTT test takes several hours, it should be ascertained how many such tests
(measurements) under controlled change of the tank content are needed for determining
the accuracy of the equipment.
As Table 1 shows, the mean change rate is very close to zero compared with its standard
deviation. In these circumstances the systematic error of the device can be disregarded, and it
suffices to characterize the accuracy by the standard deviation from the actual rate (standard error).
The specification for the VTTT in [1] is a leak rate $L = 0.1$ gph with a detection
probability $P_D = 0.95$ and a false-alarm probability $P_{FA} = 0.05$, and for the VTTT device the
specified leak threshold $L_{th}$ is such that "a tank system should not be declared tight if the
test result indicates a loss or gain that equals or exceeds this threshold" [6].
As the statistical data analyzed in this work do not cover the "tails" of the distributions,
we shall - in accordance with the recommendation of [7,10,11] - assume a normal pattern,
the same irrespective of the presence or absence of a leak. In these circumstances the
standard error $\sigma$ should not exceed a limit $\sigma_{th}$, defined as the smaller of two values: one
based on the probability $P_D$ of a leak $L$ being detected and the other on that of a false alarm
$P_{FA}$.
For example, the specification for the UST 2000/P being $L_{th} = 0.05$ gph (189 ml/h), its
standard error as per [1] should not exceed
\[ \sigma_{th} = 97\ \mathrm{ml/h} \tag{1} \]
According to its technical characteristic, however [6], $P_D = 0.999$ and $P_{FA} = 0.001$, whence
\[ \sigma_{th} = 58\ \mathrm{ml/h} \tag{2} \]
which is less by a factor of 3 than the actual value reported in Table 1.
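The limits (1) and (2) can be reproduced approximately with the short sketch below. It rests on assumptions that are not spelled out in the text: the error is normal with zero mean, a false alarm is counted on both sides of zero (a loss or a gain beyond the threshold), and detection means that a tank leaking at the rate L gives a reading at or beyond the threshold. The function name `sigma_limit` and the gallon conversion constant are introduced here for illustration.

```python
from statistics import NormalDist

ML_PER_GALLON = 3785.41  # one US gallon in millilitres

def sigma_limit(leak_gph, threshold_gph, p_detect, p_false_alarm):
    """Largest standard error (ml/h) meeting both probability requirements."""
    z = NormalDist().inv_cdf
    leak = leak_gph * ML_PER_GALLON
    threshold = threshold_gph * ML_PER_GALLON
    # False alarm: a tight tank (true rate 0) reads beyond +/- threshold.
    # Assumption: two-sided, since both loss and gain lead to rejection.
    sigma_fa = threshold / z(1.0 - p_false_alarm / 2.0)
    # Detection: a tank leaking at `leak` must read at or beyond the threshold.
    sigma_d = (leak - threshold) / z(p_detect)
    return min(sigma_fa, sigma_d)

print(round(sigma_limit(0.1, 0.05, 0.95, 0.05), 1))    # requirements of [1]
print(round(sigma_limit(0.1, 0.05, 0.999, 0.001), 1))  # characteristic in [6]
```

Under these assumptions both results fall within 1 ml/h of the values in (1) and (2), and in both cases the false-alarm requirement is the binding one.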
Let us now, following the model of [12], formulate the problem of constructing a
verification criterion for a null hypothesis ($H_0$) regarding the standard error of the VTTT
device: namely, that the technical parameters of the latter justify the assumption for the
standard deviation (standard error) of its measurements
\[ \sigma < \sigma_0 \tag{3} \]
We first find the sample size (number of measurements) sufficient for our construction
problem - a verification criterion with probability $\alpha$ for an error of the first kind (level of
significance) and $\beta$ for an error of the second kind for detection of a given excess $v$ of the
hypothetic limit,
\[ \sigma > v\sigma_0 \tag{4} \]
where $v > 1$ is the excess coefficient.
The necessary sample size $N$ is determined from
\[ \chi^2_{1-\alpha,\,N-1} = v^2 \chi^2_{\beta,\,N-1} \tag{5} \]
where $\chi^2_{1-\alpha,\,N-1}$ is the 100(1-$\alpha$)-percentile point of the chi-square distribution
with $N-1$ degrees of freedom.
The calculated values of the necessary $N$ as functions of $\alpha$, $\beta$ and $v$ are given in Fig. 2.
The necessary $N$ is very high, which may render the procedure economically unfeasible in
the case of time-consuming measurements.
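A minimal sketch of how the necessary N could be computed from (5) follows. Since N is an integer, the sketch takes the smallest N for which the upper percentile point no longer exceeds v squared times the lower one; this reading of (5), the helper names, and the series-plus-bisection evaluation of the chi-square percentile points (used only to stay within the Python standard library) are assumptions of this illustration.

```python
import math

def chi2_cdf(x, dof):
    """Chi-square CDF via the series for the regularized lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, half_x = dof / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= half_x / (a + n)
        total += term
    return total * math.exp(a * math.log(half_x) - half_x - math.lgamma(a + 1.0))

def chi2_quantile(p, dof):
    """Percentile point x with chi2_cdf(x, dof) = p, found by bisection."""
    lo, hi = 0.0, dof + 20.0 * math.sqrt(2.0 * dof) + 50.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, dof) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def necessary_n(alpha, beta, v, n_max=1000):
    """Smallest N with chi2_{1-alpha,N-1} <= v^2 * chi2_{beta,N-1}."""
    for n in range(2, n_max + 1):
        dof = n - 1
        if chi2_quantile(1.0 - alpha, dof) <= v * v * chi2_quantile(beta, dof):
            return n
    raise ValueError("no N <= n_max satisfies the condition")

print(necessary_n(0.05, 0.05, 2.0), necessary_n(0.05, 0.05, 1.5))
```

The required N grows quickly as v approaches 1, which is the behavior that makes the fixed-sample-size test costly for measurements lasting several hours each.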
FIGURE 2. Necessary number of measurements N vs. excess coefficient v:
1 - $\alpha=\beta=0.05$; 2 - $\alpha=\beta=0.1$.
SEQUENTIAL TESTING FOR VTTT DEVICE
The average $N$ can be reduced by resorting to the methodology of sequential testing
[3,4,5].
The decision limits for the mean-square measurement error $\sigma$, assuming a normally
distributed error with zero mean, are given as per Wald [3] by
\[ kn - c_a < \sum_{i=1}^{n} x_i^2 < kn + c_r \tag{6} \]
where $n$ is the serial number of the measurement; $x_i$ is the deviation (measurement error)
of the leak reported by the device in the $i$-th measurement from its real value;
\[ k = \sigma_0^2 \frac{2v^2}{v^2-1} \ln v; \tag{7} \]
\[ c_a = \sigma_0^2 \frac{2v^2}{v^2-1} (-\ln B); \quad c_r = \sigma_0^2 \frac{2v^2}{v^2-1} \ln A; \quad
   A = (1-\beta)/\alpha; \quad B = \beta/(1-\alpha). \tag{8} \]
If the sum in (6) is below the left-hand bound, hypothesis $H_0$ is accepted, the
measurements are discontinued and the device is recognized as serviceable; if it is above
the right-hand bound, hypothesis $H_1$ is accepted, the measurements are likewise
discontinued and the device is rejected. If it falls between the bounds, another
measurement is called for.
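The decision rule of (6)-(8) can be sketched as follows. The function names and the example error sequences are hypothetical; the constants follow the expressions for k, the acceptance offset and the rejection offset given above, with the cumulative sum of squared errors compared against the two straight lines after each measurement.

```python
import math

def sprt_constants(sigma0, v, alpha, beta):
    """Slope k and offsets (c_a, c_r) of the decision lines in (6)-(8)."""
    scale = sigma0 ** 2 * 2.0 * v ** 2 / (v ** 2 - 1.0)
    a_bound = (1.0 - beta) / alpha
    b_bound = beta / (1.0 - alpha)
    return scale * math.log(v), -scale * math.log(b_bound), scale * math.log(a_bound)

def sequential_decision(errors, sigma0, v, alpha, beta):
    """Return ('accept' | 'reject' | 'continue', number of measurements used)."""
    k, c_a, c_r = sprt_constants(sigma0, v, alpha, beta)
    total = 0.0
    n = 0
    for n, x in enumerate(errors, start=1):
        total += x * x                 # cumulative sum of squared errors
        if total <= k * n - c_a:
            return "accept", n         # device recognized as serviceable
        if total >= k * n + c_r:
            return "reject", n         # device rejected
    return "continue", n               # data exhausted, no decision yet

# Hypothetical error sequences in ml/h, for sigma0 = 58 ml/h and v = 2.
print(sequential_decision([10.0] * 10, 58.0, 2.0, 0.05, 0.05))
print(sequential_decision([200.0] * 10, 58.0, 2.0, 0.05, 0.05))
```

With these parameters the acceptance line stays below zero for the first four measurements, so even a device with negligible error cannot be accepted before the fifth measurement, whereas a single very large error can already cause rejection.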
For sequential tests of any type, the operating characteristic (OC) can be formulated in
parametric terms [3]:
\[ \ldots = \frac{B^h - 1}{B^h - A^h} \tag{9} \]
\[ \sigma(h) = \sigma_0 \ldots v^{2h} - 1 \ldots \]
where h is the parameter.
Fig. 3 shows characteristics obtained for specific combinations of the test parameters.
A program modeling the sequential testing process was developed. It estimates the
probability of acceptance of $H_0$, i.e. the actual OC of the test. As expected [4,5], these
probabilities differ positively from the values yielded by (9).
FIGURE 3. Operating characteristic curves for sequential testing (horizontal axis: standard error, ml/h):
1 - $\alpha=\beta=0.05$, $\sigma_0=97$ ml/h, $v=1.5$; 2 - $\alpha=\beta=0.05$, $\sigma_0=58$ ml/h, $v=2$.
Truncated test for 1: $\sigma_{n\alpha}=0.99$; truncated test for 2: $\sigma_{n\alpha}=0.96$.
Distribution of Number of Measurements Prior to a Decision, Modification of
Sequential Tests
The modeling program referred to above yields the numbers of measurements
necessary for an acceptance/rejection decision on the null hypothesis, as well as the
distributions of these numbers and their average values.
In Fig. 4, the above average is plotted against the normalized standard error $\sigma_n=\sigma/\sigma_0$.
The term "untruncated" refers to tests which are continued until the cumulative squared
errors cross the straight lines bounding the decision domain (6).
In the cases where $\sigma_n$ falls between the limits 1 and $v$, the average number of
measurements increases significantly, reaching the values $N$ plotted in Fig. 2 for
fixed-sample-size tests. As at $\sigma_n>1$ the parameters of the tested device fall within the rejection
domain, the following stopping rule was proposed for reduction of the number of
measurements:
On reaching a certain $N_s$, the test is discontinued, the $H_0$ hypothesis rejected and the
device categorized as unserviceable (truncated test).
$N_s$ is found by iterative approximation, using the above program, so as to yield the
specified $\alpha$, $\beta$ and $v$. The specified $\alpha$ is obtained for $\sigma_n=\sigma_{n\alpha}$ (<1).
For example, in an experiment with $\alpha=\beta=0.05$ and $v=1.5$, $N_s=52$ and $\sigma_{n\alpha}=0.99$. In an
experiment with $\alpha=\beta=0.05$ and $v=2$, $N_s=17$ and $\sigma_{n\alpha}=0.96$.
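The modeling program itself is not reproduced in the paper. A minimal Monte Carlo sketch of a truncated sequential test, written in normalized units (the hypothesized standard error set to 1), might look as follows. The function name, the pseudo-random generator, the number of runs, and the choice to check the acceptance bound before applying truncation at the last allowed measurement are all assumptions of this illustration.

```python
import math
import random

def truncated_test(sigma_n, v, alpha, beta, n_stop, runs=20000, seed=1):
    """Simulate the truncated sequential test for normalized standard error sigma_n.

    Returns (estimated probability of accepting H0, average number of measurements).
    """
    rng = random.Random(seed)
    scale = 2.0 * v ** 2 / (v ** 2 - 1.0)          # sigma0 = 1 in normalized units
    k = scale * math.log(v)
    c_a = -scale * math.log(beta / (1.0 - alpha))
    c_r = scale * math.log((1.0 - beta) / alpha)
    accepted = 0
    measurements = 0
    for _ in range(runs):
        total = 0.0
        for n in range(1, n_stop + 1):
            total += rng.gauss(0.0, sigma_n) ** 2
            if total <= k * n - c_a:
                accepted += 1
                break
            if total >= k * n + c_r or n == n_stop:
                break  # rejected, by the upper bound or by truncation at n_stop
        measurements += n
    return accepted / runs, measurements / runs

for sigma_n in (0.5, 0.96, 1.2, 2.0):
    p_accept, avg_n = truncated_test(sigma_n, v=2.0, alpha=0.05, beta=0.05, n_stop=17)
    print(f"sigma_n={sigma_n}: P(accept)={p_accept:.3f}, average n={avg_n:.1f}")
```

Repeating such runs for a range of stopping numbers and selecting the smallest one that still meets the specified error probabilities is one possible way to carry out the iterative approximation mentioned above.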
Fig. 5 presents estimated probability distributions of the number of measurements prior
to the acceptance/rejection decision. All curves are seen to be heavily skewed, and the
distributions - especially in the most decisionwise-problematic intervals around the maxima
of Fig. 4 - have long "tails" extending far beyond the value of $N_s$ at which the truncated
test is terminated.
Compared with the untruncated version, the truncated tests have a smaller average
number of measurements (see Fig. 4), especially in the interval between 1 and v.
Compared with the fixed-sample-size tests, the economy is quite significant and opens the
way to application of these tests to the VTTT devices.
FIGURE 4. Average number of measurements in sequential tests vs. normalized standard error $\sigma_n$,
for $\alpha=\beta=0.05$. Curves: $v=1.5$, untruncated; $v=1.5$, truncated; $v=2$, untruncated; $v=2$, truncated.
FIGURE 5. Estimated probability distributions of number of measurements prior to acceptance/rejection
decision, for $\alpha=\beta=0.05$, $v=2$. $N=17$ - threshold of rejection due to exceeded critical number of
measurements.
DISCUSSION: RECOMMENDATIONS ON CHECKING OF VTTT DEVICE
The results of the project indicate that VTTT devices have to be checked for
compliance with the [1] requirements regarding UST's. The checks should be carried out
under the closest possible approximation of the actual field conditions and include control
of the proficiency of the equipment operators.
The optimal setting for these checks is an active filling station with a group of five or
more tanks. The fuel content must be changed artificially at the prescribed rates. To ensure
statistical independence of the tests and minimize the effect of uncontrolled factors, repeat
tests should be run at other stations, or even at the same one but after topping up of the
tanks.
Another important consideration is the economic feasibility of the procedure, in view
of the large number of measurements involved. The choice of the test parameters is based on
the OC and the necessary numbers of measurements.
When the UST 2000/P is checked according to its technical characteristic [6], the
recommended combination is $\sigma_0=\sigma_{th}=58$ ml/h, $\alpha=\beta=0.05$, $v=2$, $\sigma_{n\alpha}=0.96$. The operating
characteristic (curve 2, Fig. 3) shows that the probabilities of rejection of a serviceable
device [1] and acceptance of an unserviceable one [6] are both low. The average
necessary number of measurements for the latter case (Fig. 4) will be much smaller than
for the former one, and does not exceed 9.4 at $\sigma_n=1.2$. The number is limited here to 17.
Another means of independent control of the devices is statistical analysis of the tank
inspections carried out with them. By this means, there will be no need for too frequent
inspections, and accuracy will be improved.
The reported results can be used for estimation of the indices of stationary automatic
tank gauging systems currently widespread in filling stations.
CONCLUSIONS
1. The unique statistical data presented in this paper on the results of volumetric
tightness tests on filling station tanks permitted estimation of the accuracy of the
equipment under field conditions, and outlining of control techniques for its state. The
standard error of the leakage measurements was larger than the accuracy level
prescribed by national standards, and much larger than that claimed by the
manufacturer.
2. VTTT devices must be regularly checked for compliance of their accuracy with
national standards under closest possible approximation of the field conditions,
including control of operator proficiency.
3. For the case of a fixed-sample-size test, a criterion was established for
acceptance/rejection of an accuracy hypothesis. The necessary number of
measurements is very large, and for a time-consuming measurement the method may
prove economically prohibitive.
4. A modified algorithm is proposed for a reduced number of measurements, with a
criterion for early termination of the procedure. The dependences of the average
number of measurements on the normalized standard error were obtained.
5. On the basis of the operating characteristics and the average necessary number data,
the basic parameters of the sequential procedure are recommended. It is also
suggested that the tests be supplemented with the statistics of volumetric tightness
tests carried out on the tanks in question. This would permit improved estimate
accuracy and reduction of the number of measurements.
ACKNOWLEDGEMENTS
This research project was supported by the Israel Ministries of Environment and of
Absorption. The author is indebted to Mr. E. Goldberg for editorial assistance.
REFERENCES
1. EPA (U.S. Environmental Protection Agency), "Rule 40 Part 280 - Technical
Standards and Corrective Action Requirements for Owners and Operators of
Underground Storage Tanks (UST)", 1988, 55 p.
2. Farahnak, Sh. D., Drewry, M. M., "Are Leak Detection Methods Effective In Finding
Leaks in Underground Storage Tank Systems?" (Report), California EPA, 1998.
URL: http://www.swrcb.ca.gov/cwphome/ust/docs/leak_report.html
3. Wald, A., Sequential Analysis, John Wiley & Sons, NY, 1947.
4. Handbook of Sequential Analysis, ed. by Ghosh, B. K., Marcel Dekker, NY, 1991.
5. Siegmund, D., Sequential Analysis: Tests and Confidence Intervals, Springer-Verlag, NY, 1985.
6. EPA, "List of Leak Detection Evaluations for UST Systems", 8th Ed., 2001, 311 p.
7. EPA, "Standard Test Procedures for Evaluating Leak Detection Methods: Volumetric
Tank Tightness Testing Methods" (EPA/530/UST-90/004), 1990.
8. API (American Petroleum Institute), ASTM D 1250-80, "Petroleum Measurement
Tables: Volume Correction Factors", API Standard 2540, Volume IX, 1980.
9. WELMEC (European Cooperation in Legal Metrology), WELMEC information, 2001.
URL: http://www.welmec.org/countries.htm
10. EPA, "List of Integrity Assessment Evaluations for Underground Storage Tanks", Third Ed., 1999.
URL: http://www.epa.gov/swerust1/ustsystm/ialist3.pdf
11. Maresca, J. W. et al., J. of Hazardous Materials 26, 261-300 (1991).
12. Bendat, J. S., Piersol, A. G., Random Data: Analysis and Measurement Procedures,
Wiley-Interscience, New York, 1986, 566 pp.